So that's a quote by George Box that's been around for a long time, and I always use it when I'm teaching predictive analytics. Students always ask me, "I can build a model, but how good does it have to be?" That's a really hard question to answer, because it depends. I drive students demented by saying that: whenever they ask the question, I say, "Oh, it depends." And the reason it depends is exactly that: all models are approximations. Any predictive model that we build is an approximation to some real process or other, but some of those approximations are useful for us.

What I want to do is go through a series of lessons that I've picked up from building predictive models over the years, and hopefully they might be useful to people. I'm going to go through six lessons, to make sure that we all get home at a reasonable hour. So these are the six lessons, and I'll just jump straight into them.

The first one is around what we mean by prediction: when we build predictive models, what exactly are we talking about? I guess Connor was talking about it there, where he was talking about decision making. All of predictive modeling, in my mind at least, should start from the point of wanting to make better decisions. This is a kind of process diagram that we use all the time. Over here, someone is going to make a decision, and we're going to try to extract some insight from some data to help them do that in a better way. In particular, if we're thinking about predictive models, the insight is going to be the output from a predictive model.

But I think a useful thing to think about is: what can that prediction be? What's the range of things that we can talk about? I've used this, and I don't know if people have come across it before, but it's Kaggle.com. Anyone who's into predictive modeling
will probably have used Kaggle.com. Kaggle.com is kind of a marketplace for crowdsourced predictive modeling. If you have a predictive modeling project, rather than hiring someone you can put that project up on Kaggle.com with a cash prize, and data scientists all around the world will try to build their best predictive model. There's a leaderboard, and whoever has the best model at the end wins the prize. You can see some of them are worth a go: there are prizes of a hundred thousand dollars up there, and they go all the way down to competitions that you just do for fun or for smaller money. But what they're useful for is to see the range of things that we can look at when we think about predictive modeling, and I just want to go through a few here.

The first one is the bike sharing demand forecast. This is a Canadian city that has a scheme just like Dublin Bikes here, and what they would like is a nice predictive model that tells them how many bikes they're going to need next week on Tuesday. So what's the demand going to be? Essentially we're predicting an unknown value at some time in the future, and that's very comfortably a forecast: we have some kind of time series and we're going to forecast off into the future. One of the things that I see a lot when I talk about predictive modeling to people is that that's what they think about: they have a notion of someone doing something at a point in time in the future.

That applies to some of the predictive modeling tasks we do, like the second one here, which is a marketing response challenge, where essentially you're trying to predict the propensity of somebody to respond to a particular offer. Again, you're going to predict the likelihood of someone doing something in the future, and that sits very comfortably as a ranking problem: you're going to take all of your customers and you want to put them in a
ranking from most likely to respond to least likely to respond to this marketing ad that you're going to put out to them. That'll tell you who's going to do something in the future, and you can make your decision about who you should send this ad to and who you should ignore. So both of those sit very comfortably in the idea of making a prediction into the future.

But I think the definition of prediction we should think about is much broader than that. If we look at these two challenges on Kaggle, there's the ultrasound nerve segmentation challenge, which is an image processing challenge where there are ultrasound scans and the job is to recognize whether those ultrasound images contain nerves or not. The other one down here is a job salary prediction problem: given a text description of a job, can you stamp that with an expected salary? Can you essentially label it with an expected salary? We can think about these as labeling or classification-type tasks, and I think that's a key thing to keep in mind: when we think about predictive modeling, we don't always have to think about it having a temporal effect. It doesn't always have to be a prediction into the future.

If I skip ahead, the definition that I think is useful is this: in data analytics, a prediction is an assignment of a value to an unknown variable. I think keeping that broad definition of predictive modeling is useful, and if you think back over all the talks that we've seen in the last couple of days, hopefully most of them will sit in there. I think it's more useful than always thinking about a temporal effect behind that.

So I think that's the key thing to think about first: remembering that prediction and predictive modeling allow us to do a lot of different things, but always focusing back on the decision and asking what kind of prediction is going to help me most in making a better decision, and a better data-driven
decision, which might turn into a data-driven discussion at some point on the back of that.

The second thing, and I really like this, is this thing called the no free lunch theorem