Most of my working life has been focused on machine learning and artificial intelligence research. I work in an area called recommender systems. These are the services that tell you, if you bought this book, then you might like that book. Or if you liked this movie, then how about watching that movie? That's a recommendation. And we make those recommendations by mining lots of data about your past habits and your purchases and trying to figure out what people like you have done in the past, and maybe some of those things are things you'd like too. So that's what I've been doing for most of my life. And then I had a little bit of a midlife crisis and I thought, 25 years of doing this, is there not more to life than recommending books and music and movies? And then I thought, well, I could either buy a sports car or I could do something else. And I got into running. And then I thought, wow, running is quite interesting. Maybe I could start to apply some of what I've been doing in recommender systems to the space of running. After all, one of the great things about running, especially when I started, was that I was genuinely struck by the data that goes with our activity. We're all wearing these watches and carrying our phones with us. And almost everything we're doing these days is recorded. The data is generated and stored somewhere. And at the same time, people are very motivated about running. Even a casual runner wants to know, how long should I go for? How often should I run? They're interested in finding out more about what they should be doing with their running. So there's a huge amount of recommendation opportunities, lots of different ways that we can suggest to runners things they might want to think about. I like running marathons. I really enjoy it.
And marathon running is very interesting, because there's a number of different stages you go through that most of you will know, and a variety of different ways that runners are looking for advice, especially when they're not very experienced. Whether it's the training that you do in the 12 to 16 weeks before the race, when you're wondering: how much training should I be doing? How should I increase my training load? I can't afford a personal coach, so I'm following this one-size-fits-all training plan from the internet. Should I stick with that? Am I doing well? Is my fitness changing? What's my predicted finish time? And am I at risk of injury? And then one of the weird things about running is that just before the race you go through this period called your taper period, where apparently you're supposed to not run as much as you were, and that's not as easy as it sounds. How should I taper? Do I stop running? Or do I wind down? If I wind down, how should I wind down? And then on race day, what do I do? Do I rock up to the start line and just go as fast as I can, and then try to hold on after that? Is that the best strategy? A lot of people say don't start too fast. Go out easy. And then when I finish the race, how should I recover? How soon after the race should I be running again? These are some of the things that I've been interested in. And I thought for this talk, I'd pick a number of different case studies that I've done over the years. And these case studies draw on two different data sets. One is the race data set, which is really just about marathon race times and split times every five kilometers in the race. And then separately, we have a data sharing agreement with Strava, and they provided us with a lot of data about people's training practices. So that's the Strava data.
So the race data set is my sort of side project that I've been doing over the last five or 10 years. We started looking at marathon websites that publish their results data, and over the years wrote lots and lots of code and started gathering lots and lots of race data. And now there's more than 4 million, I think it's about 5 million, race results that I've collected over lots of big city marathons, not just the finish times, but also the 5K split times. Separately, we have the Strava data set, which is really about the sort of raw activity data you'd imagine getting through Strava. So it's very fine grained, and it's obviously not just people's race data, but also their training data as well. So they're the two data sets. I'm going to pick up on a number of studies, and some later speakers will actually pick up on some of this as well. I'm just going to go through the studies in a fairly high level way, focusing more on the results than on the technical details, but hopefully that's the right way to do it for this audience. So let me look at one of the early projects that I was interested in, and it was estimating training load, and this idea of be careful how you increase your training. Don't go too far too fast. It's often captured in the form of this 10% rule: only increase your weekly mileage by at most 10% week on week. One of the measures of training load that people use, and it's a little bit of a controversial measure these days, is the acute chronic ratio. And the acute chronic ratio is essentially the ratio of this week's training load to your recent training load. So for the purpose of this study, think about mileage: this week's mileage as a fraction of the average of the previous four weeks' mileage.
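As a rough illustration of the acute chronic ratio as just defined (this week's mileage divided by the average of the previous four weeks' mileage), here is a minimal Python sketch. The function name, the input layout, and the sample mileages are assumptions made for illustration; they are not taken from the study.

```python
from statistics import mean

def acute_chronic_ratio(weekly_km, chronic_weeks=4):
    """Return the ACR for every week that has a full chronic window behind it.

    weekly_km: weekly mileage totals, oldest week first (hypothetical layout).
    The first result corresponds to the week at index `chronic_weeks`.
    None marks weeks where the chronic average is zero (ratio undefined).
    """
    ratios = []
    for i in range(chronic_weeks, len(weekly_km)):
        chronic = mean(weekly_km[i - chronic_weeks:i])   # previous four weeks
        acute = weekly_km[i]                             # this week
        ratios.append(acute / chronic if chronic > 0 else None)
    return ratios

# Illustrative values only: four steady weeks, a 10% step up, then a big jump.
weekly_km = [40, 40, 40, 40, 44, 60]
print(acute_chronic_ratio(weekly_km))
```

In this made-up example the fifth week sits right on the 10% rule, while the sixth week pushes the ratio well above one, which is the pattern the talk associates with greater risk.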
And if your acute chronic ratio is a lot greater than one, it means that you've ramped up your mileage significantly. And the hypothesis is, and it's been shown in a number of other sports, that if your acute chronic ratio is higher than one, you're at greater risk of injury. So that sounds like a fairly simple measure that recreational runners could keep track of. What does it tell us about injury risk in recreational runners? So I looked at using the Strava dataset, looked at runners who were competing in the Dublin, London and New York City marathons during 2014 to 2017. Now, we don't know if runners are injured. We don't have that data in the Strava dataset. We just know the training they were doing and the races that they did. We do know if there was gaps in their training, however. So what we were interested in doing was estimating this acute chronic ratio based on weekly mileage alone. We didn't look at intensity. That's for a future study, but just using mileage, so training volume. Look at how that varied runner by runner and look at the relationship between the acute chronic ratio and training breaks, gaps in their training. So here's a runner who hasn't run for 10 days, two weeks, three weeks, and maybe some of those longer training gaps are indicative of injuries. You could imagine a runner taking maybe three or four days off during their training because life just catches up or they're feeling a bit tired, or maybe they're traveling, but taking more than a couple of weeks off is probably unusual. It might be a sign that something else is going on. So we had a lot of data to look through. We had over 30,000 runners and all of their training activities in the weeks before their marathon, and we looked at that and we calculated their ACR, their weekly acute chronic ratio, and we looked at the relationship between that and their training breaks. And we can see, is there a pointer here? Yeah. So you can see there's their weekly ACR. 
So an ACR just above one is probably where most people would be during marathon training, increasing their training gradually, but not too much, and this is the number of days on average for their longest training break. Most people had a longest training break of between six and seven days, just under a week, but you can see as the acute chronic ratio went up, the number of days in the longest break also went up. And we were particularly interested in very long breaks, so the percentage of runners that experienced more than two weeks of training break, and we can see here again that as the acute chronic ratio increased, so as their training load increased more than it should, the chances of people having a two-week break before race day went up quite significantly. Okay, so about 10 to 15% of runners ended up with a two-week or longer training break during their training. So that seems to suggest that, yes, as you might expect, if you increase your weekly mileage too much you're at greater risk of injury. The optimum seems to be somewhere around an ACR of 1.0 to 1.1. Not too much difference between males and females. We found that older runners tended to suffer more than younger runners, so they need to be a little bit more careful with their training load. And slower runners more than faster runners; presumably there's an experience thing going on there as well. For slower and faster here, it was sub four-hour finishers versus slower than four-hour finishers. So this data on recreational runners, which hadn't really been done before, seems to suggest that a high ACR means you're at greater risk of injury. Your optimal ACR should be somewhere around the one mark, so somewhere between 0.9 and 1.1. Older and slower runners are at greater risk than younger or faster runners. We were interested in predicting whether we could identify these disruptions.
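The training-break measure used above (the longest gap between activities, with gaps of more than two weeks treated as a possible sign of injury) could be computed along these lines. This is only a sketch: the function names, the use of calendar dates, and the sample dates are assumptions, and the study's actual processing of the Strava data is not described in the talk.

```python
from datetime import date

def longest_break_days(activity_dates):
    """Longest run of consecutive days with no activity, counted between
    the first and last recorded activity."""
    days = sorted(set(activity_dates))
    # Days strictly between two consecutive activity days are rest days.
    gaps = [(later - earlier).days - 1 for earlier, later in zip(days, days[1:])]
    return max(gaps, default=0)

def has_long_break(activity_dates, threshold_days=14):
    """True when the longest break exceeds two weeks (assumed cut-off)."""
    return longest_break_days(activity_dates) > threshold_days

# Illustrative dates only.
runs = [date(2017, 3, 1), date(2017, 3, 3), date(2017, 3, 20), date(2017, 3, 22)]
print(longest_break_days(runs), has_long_break(runs))
```

Grouping runners by their weekly ACR and averaging `longest_break_days` within each group would give the kind of curve described in the talk.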
We weren't able to predict these disruptions reliably enough, but we were able to estimate the cost of some of these training disruptions. And Kira later will speak to this. A second case study: tapering for the marathon. I'm about to go into my marathon taper for the London marathon, so this is very close to my heart. How do people taper for the marathon? What is a taper? Well, it's a period of reduced training load in the weeks before race day. So typically about two to three weeks, maybe as many as four weeks before race day. Most marathon programs will recommend a reduction in training load, not a cessation of training, a gradual reduction in training load. And the idea is that it helps runners recover from high volume, high intensity training over the previous three or four months and be at their best for race day. And there's lots of advice given to runners about how they should and shouldn't taper, but what do we actually do as recreational runners? Do people follow the advice? Are they good at following the advice? Are certain forms of taper better than other forms of taper when it comes to finish time? So again, we looked at Strava data for this. We identified about 150,000 Strava marathon runners this time for this study. They all had up to 25 weeks of training prior to race day. We identified the four weeks immediately before marathon day as the taper period. So that's the period we were paying particular attention to, to see if their training was reducing. We used the average of weeks five and six before race day as a baseline. So that's kind of the high point in training. And we expected them to reduce training volume after that. And we were interested in understanding two things about the taper. First of all, the duration of the taper. This is the number of down weeks, as we call them: the number of weeks of reduced training during that taper period.
Maybe there's two weeks, maybe there's three weeks. And then the discipline. So if it's a two week taper, did both of those down weeks occur directly before race day? Or were you punctuating them with a higher week of training because you were getting a bit nervous that you were losing fitness? So strict versus relaxed tapers. If you have a two week strict taper, it means both of those down weeks occur directly before race day. If you have a relaxed two week taper, it means that you might have had a down week, maybe three weeks before the marathon. Then your training went back up two weeks before the marathon and then it went down a week before the marathon again. So it's a little bit more disorganized. So you can see, there we are. The pointer's a bit dead. Anyway, you can see there, the red line there at the top, the red solid line, is a strict two week taper. You can see a steady decline in training volume up to race day. The green line below it is a relaxed taper. There's a down week and then they go up a bit and another down week. Now we found that most runners engaged in two or three week tapers. So they had two or three down weeks during that taper period. You can see about 36% of runners had a relaxed two week taper. So there were two weeks somewhere in that four week period where their training reduced. It just wasn't the two weeks directly before the marathon. A much smaller set of about 9% of runners had a strict two week taper. They followed the instructions for two week tapering and tapered directly before race day. Similarly, three week tapers were popular. Most of them were relaxed forms of tapers rather than strict forms. A few people were doing a four week taper and some limited number of people were doing a one week taper. There were even a few people that didn't taper at all, probably because the race wasn't their main race. They were just using it as a kind of training race.
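To make the taper definitions above concrete, here is a small Python sketch that counts down weeks in the four-week taper period and labels the taper as strict or relaxed. It assumes that a "down week" is any taper week whose volume is below the baseline (the average of weeks five and six before race day); the talk does not give the exact rule, so that threshold, the function name, and the sample volumes are all illustrative assumptions.

```python
from statistics import mean

def classify_taper(weekly_km):
    """Classify a taper from the six weekly volumes before race day.

    weekly_km is ordered [week 6, week 5, week 4, week 3, week 2, week 1],
    where week 1 is the final week before the marathon.
    Returns (number_of_down_weeks, "strict" | "relaxed" | "none").
    """
    if len(weekly_km) != 6:
        raise ValueError("expected six weekly volumes")
    baseline = mean(weekly_km[:2])            # weeks 6 and 5
    taper = weekly_km[2:]                     # weeks 4, 3, 2, 1
    # Assumption: a down week is any taper week below the baseline volume.
    down = [volume < baseline for volume in taper]
    n_down = sum(down)
    if n_down == 0:
        return 0, "none"
    # Strict: the down weeks are exactly the final n_down weeks before race day.
    strict = all(down[-n_down:])
    return n_down, "strict" if strict else "relaxed"

# Illustrative volumes only.
print(classify_taper([60, 60, 62, 61, 45, 30]))   # two down weeks right before race day
print(classify_taper([60, 60, 62, 45, 63, 30]))   # down, back up, down again
```

Under this reading, a runner whose only down week is three weeks out counts as a relaxed one week taper, which matches the baseline category mentioned later in the talk.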
And we found, as you might expect, that there were differences in finish times by taper type. So people who tapered for one week tended to have slower finish times on average than people with longer tapers, two or three week tapers in particular. People who did a relaxed form of the taper had a slower finish time than people who did a strict form of the taper, given the same number of down weeks in that taper. That's not a great analysis because you could find that maybe it's just that the faster runners are dominating a particular taper type. So that could be a bit misleading. So we came up with an alternative measure, which is something called a finish time benefit. I don't have time to go into it in the talk now, but it's discussed at length in the paper that's cited here. But suffice it to say it gives us a more normalized version of the effect of a longer taper relative to a one week relaxed taper, which was our baseline. And we found that if you do a two week or a three week taper, you're looking at improvements in finish time of a little over 2% for the strict form and a little over 1% for the relaxed form of taper. So two or three week tapers are good tapers. The four week taper worked out pretty well, but there wasn't as much statistical significance there. So we talk about that in the paper as well. We also found that the benefits of these longer tapers were greater for females than for males, interestingly. I don't know why that is, but that's just an interesting result there. So longer, more disciplined tapers, strict two week and strict three week in particular, offer greater finish time gains than shorter tapers when we control for a whole host of factors. One of the first studies that I did was this idea of what do you do at the start line? Everyone was saying, don't go out too fast.
So is it the case that people who start too fast tend to have slower finish times than other runners? It's a really common piece of advice. And we looked at this using the race data set. So we have five kilometer split times. So we were interested in how fast they went in that first five kilometers of the race relative to their overall race pace. Okay, so that's what we're measuring here. So a relative start pace greater than one means they've gone faster than their average race pace, so they've gone a bit too fast. Much greater than one means they've gone a lot faster than their average race pace. In other words, they've slowed during the rest of their race. And we used our race data set. We had about 1.7 million runners in this data set at the time. And when we looked at it, we found that sure enough, there's people who start pretty much even: their first 5K pace is pretty much the same as their race pace. But look, well, this is the number of runners. So we've a lot of runners, a great percentage of runners, who are starting faster than their average race pace. So there's a bunch of runners here, about 20% of runners, starting about 10% faster than their average race pace. Far fewer runners start slower than their average race pace. Not surprising: the start line is pretty exciting and everyone just sort of follows the crowd and maybe goes a bit too fast. But when we then looked at the average finish time for these runners, we found that the faster they started, the slower their finish times. Those who started slower than their average race pace also had slower finish times. But starting faster is worse than starting slower, if you know what I mean, on a like for like basis. So don't start too fast. And again, we found that the cost of a fast start tended to be greater for male runners than for female runners as well.
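The relative start pace measure used in this study can be sketched in a few lines of Python. The orientation of the ratio is an assumption chosen so that, as in the talk, a value greater than one means the first 5 km was run faster than the runner's average race pace; the function name and sample times are illustrative.

```python
MARATHON_KM = 42.195

def relative_start_pace(first_5k_seconds, finish_seconds):
    """Ratio of mean race pace to first-5K pace.

    Paces are in seconds per km, where smaller means faster, so dividing
    the mean pace by the start pace gives a value above 1 for a fast start
    and below 1 for a slow start.
    """
    start_pace = first_5k_seconds / 5.0
    mean_pace = finish_seconds / MARATHON_KM
    return mean_pace / start_pace

# Illustrative runner: 4:00:00 finish with a 25:51 first 5 km.
ratio = relative_start_pace(25 * 60 + 51, 4 * 3600)
print(round(ratio, 2))
```

A runner who covers the first 5 km at exactly their mean race pace scores 1.0, which is the pacing the talk ends up recommending.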
So for female runners, there's lots of evidence to suggest that female runners are more disciplined and more even pacers. So that might be playing into that. Now, we did control for different levels of ability as well. So we can see that at all levels of ability, three to four hour finishers, four to five hour finishers, five to six hour finishers, we found a similar sort of effect. Interestingly, we also found this type of effect when we looked at the final 2.2K in the marathon. We found that those runners who finished faster, so their pace in the last 2.2 kilometers of the race was faster than their overall race pace, had longer finish times, a greater finish time cost, than runners who finished more slowly. So maybe that's because these were runners who still had something in the tank, and possibly too much in the tank. They'd raced too conservatively and then couldn't spend what they had left in the last 2.2 kilometers of the race. So to sum up, not surprisingly, we found that most runners did start fast, arguably too fast, and starting fast was associated with a slower overall finish time when we controlled for all sorts of things like gender and ability levels, even when we looked at their PB times. We found that runners who started too fast rarely achieved a PB. And there's some evidence that starting too fast is worse than starting too slow, but overall your best bet is to do your first 5K at your mean race pace. And some evidence as well that finishing too fast is not a good sign, because maybe you've paced too conservatively. We also found that runners who started too fast were much more likely to hit the wall later in the race. Speaking of hitting the wall, some of you may have seen this video of the London marathon, I think it might have been back in 2018, where this poor chap hit the wall with about 300 meters to go. He was a very fast runner. I think he was a sub three hour finisher.
And when you see someone hit the wall, you know they've hit the wall. It's quite striking. So it's this iconic hazard in the marathon where people find they have a significant late race slowdown. So they're just out of fuel, out of energy. It can be exacerbated by poor fueling and poor pacing during the race. It wasn't clear how often this thing happened. You had lots of post-race surveys among recreational runners and they'd all say, yeah, I hit the wall. And some studies report 60% of people hitting the wall, which couldn't really be true. I think most people just decide after the race that hitting the wall is like a rite of passage. If only I hadn't hit the wall, I would have got a better finish time. Damn. But anyway, what's it really like? And again, we looked at this. We had about 2.7 million runners this time in our race data set when we started looking at that. We needed an operational definition of hitting the wall. How do we know if someone hits the wall? What we're really talking about is late race slowdown. So I think the nutritionists and the sports scientists might say that hitting the wall is much more nuanced than just slowing down late in the race. And I'd sort of take that as given. But a kind of practical signal that someone might be hitting the wall is to look at how they slow in the second half of the race relative to how they were running in the first part of the race. We ignored the first 5K, because as we know, people go out too fast. So that's not representative. So we looked at, as a base pace, the 5 to 20K section, and then looked at how much they slowed down in the second half of the race and for how long. And we were particularly interested in people who slowed down by at least 25% for at least five kilometers. And we did lots of sensitivity analysis on this definition to get a sense of whether adjusting the definition radically changed the results, and it didn't. So here's some of the findings.
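The operational definition above (base pace from the 5 to 20K section, then a slowdown of at least 25% for at least five kilometers in the second half) can be approximated from 5K split data as follows. This is a simplified sketch, not the study's code: it assumes one pace value per 5 km segment, measures slowdown as the relative increase in seconds per km, treats a single full 5 km segment as satisfying "at least five kilometers", and ignores the final 2.195 km.

```python
from statistics import mean

def hit_the_wall(segment_paces, threshold=0.25):
    """Flag a late-race slowdown from eight 5 km segment paces.

    segment_paces: mean pace in seconds per km for the segments
    0-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35 and 35-40 km.
    """
    if len(segment_paces) != 8:
        raise ValueError("expected eight 5 km segment paces")
    base_pace = mean(segment_paces[1:4])      # 5-20 km; first 5 km ignored
    second_half = segment_paces[4:]           # 20-40 km
    # Relative slowdown of each second-half segment against the base pace.
    slowdowns = [(pace - base_pace) / base_pace for pace in second_half]
    return any(s >= threshold for s in slowdowns)

# Illustrative runners only.
steady = [290, 300, 300, 300, 305, 310, 330, 360]
faded = [290, 300, 300, 300, 305, 310, 380, 400]
print(hit_the_wall(steady), hit_the_wall(faded))
```

The `threshold` parameter is exposed so the 25% cut-off can be varied, in the spirit of the sensitivity analysis mentioned in the talk.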
We found that many more males hit the wall than females. Nearly twice as many males hit the wall as females. And by this definition, is that really surprising? I think possibly not. There's a sort of, I told you so. Age plays a little bit of a factor. Ability, so PB time, has an impact as well. Far fewer fast runners hit the wall than slower runners. But again, more males than females. This is an interesting one. We were able to look at the years from a runner's PB. So in the years a runner was chasing a PB, they were much more likely to hit the wall, in the three years before they got their PB. Again, more males than females. And again, more likely to hit the wall in the three years after their PB. Maybe they think, oh, I've still got another PB in me. I'm going to chase that. So again, it's all down to presumably poor pacing management as they're chasing an unattainable goal in this particular race. And when something goes wrong, it goes terribly wrong. We were also interested in figuring out what's the finish time cost of hitting the wall? Because it's not enough to just look at your finish time when you hit the wall. We know that's going to be slower. But what is the cost relative to what that race could have been if you hadn't hit the wall? Now, that's a tricky thing to estimate. So what we did was we looked at the time you finished in when you hit the wall and your recent PB time, so your recent personal best. And the difference between these two times is a proxy for the cost of hitting the wall. And we found that age didn't really have much of an influence there, but ability levels did. So interestingly, faster runners had a much greater cost when they hit the wall. They lost more minutes. Now, you might think, does that mean they're hitting the wall worse, or harder, if that's the right adjective to use, than slower runners?
Probably not. I think it's more likely to be that the PB time of faster runners is just so much better than their hitting-the-wall time. Whereas for slower runners, we're kind of always almost hitting the wall. We're always on the edge of it. Even in our PB, we're probably just flirting with disaster. So the difference between our PB and when we hit the wall is not as great. That was kind of interesting. So anyway, guys, don't go out too fast and be really careful of your pacing because you're much more likely to hit the wall, especially if you're chasing a PB and especially if you're just coming off a PB. So just be careful. Okay. I'm going to kind of skip this section because I know the guys want to make up some time, but we did some other work on PB prediction and I'll just cut to the chase. We used a machine learning technique called case-based reasoning, where we were estimating your PB time by looking at the PBs of similar runners, based on a whole host of different races that you had run, recent marathons. And we were able to generate very accurate PB predictions, with errors down around 2%. Just to give you a sense of that, you find that with the classic Riegel formula for estimating marathon time, the error can be somewhere between 7% and 10%. So we're able to significantly reduce the error associated with these predictions. And that can be really important. If you're able to predict someone's likely PB time, it has a whole load of positive benefits in terms of training and the paces they should be targeting in training. And obviously on race day, in terms of the pace they should be targeting on race day itself. And because of the nature of what we do, we're trying to build these into apps and provide runners with useful tools to manage their training, to learn from their training, to go beyond the simple, here's a report of what you've done in your training, to here's what you might think about doing next. And if you do this, then you might get this.
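For readers unfamiliar with the two prediction approaches mentioned above, the sketch below shows the Riegel formula, T2 = T1 × (D2 / D1)^1.06, next to a toy nearest-neighbour predictor in the spirit of case-based reasoning. The nearest-neighbour part is a deliberate simplification and an assumption: the talk does not describe the features, similarity measure, or case base actually used, and the case values below are invented purely for illustration.

```python
def riegel_time(known_seconds, known_km, target_km, exponent=1.06):
    """Riegel's formula: scale a known race time to a new distance."""
    return known_seconds * (target_km / known_km) ** exponent

def predict_pb(recent_seconds, case_base, k=3):
    """Toy case-based predictor (hypothetical, not the study's method).

    case_base holds (recent_finish_seconds, pb_seconds) pairs for other
    runners. The k cases with the most similar recent finish time are
    retrieved and their PB times are averaged.
    """
    if not case_base:
        raise ValueError("case base is empty")
    nearest = sorted(case_base, key=lambda case: abs(case[0] - recent_seconds))[:k]
    return sum(pb for _, pb in nearest) / len(nearest)

# Riegel: marathon estimate from a 1:45:00 half marathon.
print(round(riegel_time(6300, 21.0975, 42.195)))

# Invented cases: (recent marathon time, PB), both in seconds.
cases = [(14400, 13800), (14700, 14100), (15000, 14500), (12600, 12000), (16200, 15600)]
print(round(predict_pb(14600, cases)))
```

The design difference is that Riegel applies one fixed scaling law to everyone, while the case-based approach lets the prediction be driven by what comparable runners actually achieved.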
So that's what we're kind of interested in. So to conclude then, I think this is such an interesting area. It's so much better than recommending Netflix movies. There's the amount of data that exists, and the willingness of runners to be open to advice and to suggestions, and their eagerness to find out more about their running. And I'm sure it's the same for cyclists and other sports as well. I think that makes this a really interesting data science area. I think that the current batch of training apps, the Stravas, the RunKeepers, they're great, really, really useful and getting better all the time. But they're still largely focused on recording your history and telling you about what you've done and allowing you to share that with other people. They're starting now to look to the future and give you some advice, but I think there's a lot more to be done in that regard. So adding prediction and forecasting, not just to help us train to become faster runners, but to train more safely and become less injured runners, and you'll hear a lot more about that in the next few talks. So I've just scratched the surface of all of the different places that we think data science can improve the life of runners, recreational runners in particular. I think there's a lot more to be done, and hopefully in future years we may even have an opportunity to tell you some more about that. But I'll leave it there for now. Thank you very much.