 Okay, this short video is going to be about using linear regression in Excel So to briefly recap if you've read the blurb around this You should know that if we have some data that is described by a straight line We need two values to describe it and predict where that straight line is going We need the slope just how steep it is and the intercept which is how far up and down it moves The equations on screen here label it as alpha and beta sometimes it's there M and C sometimes that B Doesn't really matter what you call them as long as you know that there's a slope and an intercept and those equations are Quite difficult to get your head around. I've really quite involved. There's a lot of looping over Variables and subtracting variables from our average and so on So writing that down on paper is difficult and practice you want a computer to do it And there's not one correct approach for how to do this You could learn some programming do it up in Python or whatever you prefer You can get a copy of Mathematica. I don't believe that's free at the moment and that will have some functions built in You could use R which is a stats package specifically designed for doing regression So if you do more complicated things R will probably be the one that you want But this video is going to focus on Excel for a number of reasons So certainly purists will say that you shouldn't use spreadsheets for anything But they're actually do have some genuine advantages for one They're a lot more flexible than you might expect they're not just for doing your finances month on month and adding up numbers You can do a lot of really crazy things with them But most importantly there is a blower barrier to entry if you try to learn how to do this with a programming language You might be stuck for a couple of days trying to figure out how to do two plus two They're not the easiest things to get into but with spreadsheets Everything's laid out for you. You know where your data goes, you know where it's the outputs are it's all arranged quite nicely So it's actually what a spreadsheet really is is what's known as an integrated development environment It genuinely is that if you know that term already you've just laughed at genuinely is it's an ID and it genuinely works You know where everything is you can organize your projects really easily with it in a way that's slightly more challenging when you're coding and It's really ubiquitous. It's pretty much in most offices Most places will use it Because you can you can prototype things really rapidly using Excel in a way that it might take programming a lot longer to do So probably no matter where you go in the future You'll have to either build a spreadsheet or fix someone else's spreadsheet to use it. So It's good to get to grips with and if you want to do more advanced stuff there is Always the more advanced options that you can pick up in your own time so let's do Some Excel work Right, so I've got my spreadsheet open here and we want to correlate some data now I haven't got any data. So I'm just gonna make it up. I'm gonna do an X column and a Y column Yeah, I'm gonna zoom into this There you go and I'll Tell you what you can also do. Let's put some cell styles on this and just give it a header so we know That that is our heading and I'm just going to type in some random numbers Let's assume we went into the lab and we measured zero one Two three four five we did six data points there and the Y value that we measured Maybe it's a response to something. I'll keep it simple. I'm just gonna put the same numbers in Now that means that this should be a perfectly straight line starting from zero going up so We know in advance that Y equals MX plus C should be One plus zero really So let's have a look at it Let's highlight all of these and then we go to insert Chart and what we want is inserting a scatter chart not a bar chart not a line chart a Scatter because this is going to correlate two variables together X and Y and We don't really want anything too fancy here. We just want the data points in fact for now I'm going to delete the things we don't need such as the title and The grid I would probably recommend getting rid of the grid in the background the defaults on this tend to be Anyway, what we've got here now is our straight line and you can see it's a straight line That's the exact same numbers and traditionally the way that you would get our your Intercept and slope from here is to go to add trend line Had that in and what we've got of various options down the right here that appear We could do a logarithmic scale, which it won't do Without crashing because it doesn't quite fit But add trend line we just want a linear one and down here Which you should get used to doing if you're gonna use this to generate anything to stay on chart and to stay the R squared on chart And this is exactly what we expected. It says Y equals X and R squared equals one R squared is a measure of how well Correlated the data is the closer to one more perfected is so this is obviously going to be one because it's straight here now let's Let's let's randomize this a little bit. Let's put actually some random numbers in there So Code for that is equals RAND and if you want to just fill your data down automatically you can double-click That little square down in the right hand corner of your cell and it just shoots it down And you can see now we're randomized a little bit these numbers are now a lot more complicated We can read Y equals MX plus C from that and we can see because it's actually random data the R squared's gone down a lot as well and Normally you would just say okay, let's write those numbers down but let's Let's Write some numbers down that I think we'll actually break this a little bit. I'm gonna go zero zero point zero zero one zero point zero zero Two What we've got there is this is now getting towards zero point zero zero one three It's actually half this again. Let's put a Put some more zeros in here What we've got zero zero zero one and the thing is if we start then changing this slightly What we find is that all we've got is zero zero one now that should not be the case The status is no longer perfect. What I'm doing here is editing the numbers to be slightly off Change that to 45 maybe and yet Our square our intercept value is not really changing a great deal And that's because there is a bit of a sensitivity issue with this You would have to go into some quite advanced settings to try and get more decimal places out of it So instead of doing that We're gonna do something with a formula instead and we're gonna get the slope and the intercept instead And the neat thing about Excel is it's really obvious. It's really easy to use if you want the slope your type equals And then slope and if you're working in Excel and you you will be able to see My screen recording software doesn't pick this up I learned that the hard way, but if you type in SLO You get the word slope underneath and a list of other things that might match and then you hit the tab button to auto complete it Open your brackets because that's how an Excel formula works and it's telling you with this look Little pop-up which you can't see on the screen recording software This is known wise. So I want to know what my known y values are They're here Right there. That's that range there And then I put a comma to move to the next one and it's asking me what my known x values are. Well, that's these here I don't really need to close the bracket. It'll do it automatically. But when I press return I Get 8.37 times 10 to the minus 5. I'm getting more Significant figures out more decimal places than I am from Putting the trend line on there so that slope function is getting me that automatically that's actually automatically dealing with that formula That's really complicated. Let me just jump back to it for a second So what you can see on the screen there is that complicated formula Excel is doing that for you Automatically almost instantaneously. In fact, what I want to the plus see what I want the intercept. Well, again Excel is really dumb It's called Intercept and it's the exact same syntax it asking for my known y values Which I'm going to put in and known x values Again, I don't need to close the brackets for just a formula like this. It'll do it automatically. So I hit return gets me 1.9 times 10 to the minus 5 Which is more to significant figures than here Now a couple of other tricks you can do here before we Move on what I'm going to do is I'm going to highlight all of that data And I'm going to go to the formulas tab. This is a very underused feature When people first iron out in Excel, but it's actually extremely useful. So I'm going to cover it a couple of times What I want to do is this section here called name manager So the name manager Let's you name rangers So you don't need to go back and keep typing them in or selecting them manually I click the word str create from selection What I get is a dialogue that asks me to create names from values And I'm going to say that the values are in the top row So all the names are in the top row now. So if I highlight x here What I can see in this top left-hand corner here is it's called x It's labeled it if I highlight all the y's I can see in the top right here. It says y So what happens if I actually go to that drop-down menu and select x? There it is at the automatic highlights all my x values Why it's also my or air having all my y values Now if I go to this slope and intercept section down here and type an equal slope Well, no one wise I don't need to highlight it or find it anymore. I can just type in y There it is and x no next values Return that exactly the same number as before and then intercept Finish that off with a tab no one wise. I just need to type in y comma x return my x and y Intercepts and There we go. That's the first inroads into how to use Excel to do some linear regression And hopefully in the future so much more complicated stuff as well