 In this video I want to continue our discussion of linear regression to help us cement all the concepts and processes that we're going to need when we look at proper deep neural networks. So I hope you looked at some of the videos just describing the basic linear algebra, the basic video on derivatives and then the one who looked slightly deeper at gradient descent and have a good understanding of what is going on here. And remember these videos, the video series on deep learning is for people interested in healthcare solving problems with healthcare so I don't expect of you to have this deep understanding of mathematics or computer science. It is about using it to solve problems and I want you to be able to use deep neural networks to understand deep learning to solve these problems. So I've got another document here created with R Markdown in R Studio and it's published on R-Pubs. Remember that these files will be available on R-Pubs, they're available on GitHub. I'll talk about them on Twitter and on LinkedIn. So follow me on YouTube, subscribe to me at least on YouTube, follow me on Twitter, connect with me on LinkedIn, have a look at these R-Pubs pages and also GitHub. All the links will be everywhere. I want everyone to have access to this. So let's move down. We can use multi-variable linear regression. So whereas before, all we had was a single feature variable predicting a target variable, we are now going to have three feature variables predicting a target variable. And I'm going to import a CSV file that will also be on GitHub in case you want to do that now, that was just some random values. So we're going to have very poor prediction of the target value, just four columns and just random values and we can still build a model from that. So I've imported it here in R, don't worry about it, it's not what this is about. Let's just remind ourselves of where we are. We've now got these three unknowns. I'm going to call them X sub 1, X sub 2 and X sub 3. They might three feature variables and there will be multiple rows of data describing these patients. If they are patients, for instance, we can see they're 10-year in actual fact and we have a target variable Y. So we need these parameters, beta sub 0, beta sub 1, sub 2 and beta sub 3, so that we can create this. I'm going to take beta sub 0 remember plus beta sub 1 times X sub 1, beta sub 2, that should say 2, X sub 2 and beta sub 3, X sub 3 and that's going to give me roughly Y. Now we say this roughly Y because what it actually gives me remember is Y hat with a little hat on top and that's going to be different from the actual Y and that's why these squiggly lines means approximately equal to. So that's still going to give me a loss function. You'll see this notation written elsewhere where there's a superscript I and the superscript I just means every sample. So that first row will be the I1, the second row would be I2, the third row would be I3. It just means the loss function here was going to be this addition of all these things multiplied by each other. So in this first one it will be beta sub 1 times 20.1, beta sub 2 times 39.3 and beta sub 3 times 1.3 add to that beta sub 0 and that's going to give me a value Y hat for that Ith row, that first row and from that I subtract the actual ground truth value which in this first instance was going to be 394 and I square that and that's my loss function. And I can add up and divide by how many there are so that I eventually get this average of all the loss and call that specifically my cost function. So that'll be a long equation. Remember previously through algebra we just condensed that to a simple one but it's still going to have beta sub 0, beta sub 1, beta sub 2, beta sub 3's in there and I have to solve for those to give me the lowest possible cost function. Now once again I'm going to cheat a little bit an R and I'm going to use the LM linear model function there and I say Y is given by X sub 1 plus X2 plus X3 and the data comes from the data frame. This is how I imported it. I saved it in this computer variable this object called DF and I can cheat a little bit and it gives me these values. It says beta sub 0 should be there, 342.7, beta sub 1 should be there, there's beta sub 2 and there's beta sub 3 and you've seen, see I've written it here, those are the values that it should take. Note though that if I look at the multiple R squared it's only 0.016, very poor model and as I said those values were randomly just created so we expected they'll be poor so don't worry about that too much. What I want to show you is just this because that's the idea, the aim of this video lecture. We've got to get used to this little diagram that we've drawn here because I've taken this multi-variable linear regression and I have represented it in this way. If I scroll down I'm going to decrease the size just a little bit so it all should fit in the same page. There we go, almost there, bear with me, let's go, let's go, there it's better. There we go. Here I have what is called an input layer, x sub 1, x sub 2 and x sub 3 so I'm actually turning this on its side whereas if I look at, let's go back up, there's x sub 1, x sub 2 and x sub 3, they lie horizontally for the first patient, there they are, I'm turning that on its side so x sub 1, that 20.1 and then the 39.3 and the 1.3, they are now this way along and that's called my input layer so I can put those three values in there and you can see they each connected to this layer here and it's called a hidden layer, a hidden layer of nodes, see the three nodes and we see I've put a little tooltip there, it's multiply and there's my beta 1, beta 2 and beta 3 so I'm going, these are called, in deep neural networks these are called weights so I'm going to take each input value and multiply it by its weight, each input value multiplied by its weight, this input value multiplied by its weight and that gives me the values in the three nodes of the hidden layer and they each only have a single connection with the input layer, later on we're going to see it gets much more hectic than this. So I have this beta sub 1, x sub 1, beta sub 2, x sub 2, beta sub 3, x sub 3 in this layer here and then I add all of these three, remember it was this one plus this one and I add beta 0 and that gives me this y hat and there we have sort of a neural network, we have an input layer, we have a hidden layer here and we have an output layer and that is exactly how we construct deep neural networks, we're going to have many more hidden layers, one layer after another, we're going to call each of these a node but we can also see them as a little neuron, a brain cell that's connected to all sorts of other brain cells but we're just going to add on adding these layers and that's a deep neural network, here we just have this single hidden layer. Now so much so that if I plug in those, the first patient that was 20.139.3, 1.3 I multiplied by the weights that we stole from our linear model, it gives me these interim values in my hidden layer and I've got my bias node there, I add all of these and that one and I get 241.7 which is my predicted but the actual one was 394.5 so quite a bit of difference but I told you through that adjusted R squared this is going to be a bad model but anyway I think this should give you a meaningful deep understanding of what a network, a neural network is all about, we're going to have this layer of our input values, one for each row in our data, we put them there, we're going to have these hidden layers and we calculate their values by multiplying the input times this weight and we'll see later on we can add a bias right here and add a bias even into this, don't worry about that, for now we have these layers and then they pass to some output layer and that output layer can calculate a value for us or a state even depending on what our problem is. That is the basic structure of a neural network and you know now that this was forward propagation, we call this forward propagation to get to this and now through gradient descent we're going to go backwards in the opposite direction doing back propagation whereby we update these values of our weights, we update them to better values as you saw in the previous video and I hope you watched the previous video in this playlist whereby we update those, now we have new values for beta sub 0, beta sub 1, 2 and 3 and we can go forward again, get a new value, update that back and forth, back and forth. Initially these three values that we have here when we do proper deep neural, create a proper deep neural network we're just going to take stabs in the dark at these, we'll just ask the computer to generate random numbers so that we start somewhere on our gradient descent graph, we have some point where we can stand and we start to look at how we can go downhill so our cost function can be as low as possible and it will run backwards and forwards, backwards and forwards, forward propagation, back propagation, forward propagation, back propagation, updating those weights every time through this process of gradient descent until we get them as good as possible so that the cost function is at its minimum, the slope is going to get to zero that's how we know it's going to approach zero at least and that's where we know we've got the best value for these weights so I really hope this has been a good introduction for you that you have this a deep understanding of what neural networks are all about and we used a very simple thing like linear regression to do all of this for us. If you come across this video and this is the first one you find please go back, I always create my playlist so one follows the other when you look at our pubs there'll be a series of these these notebooks, these markdown files, so please do them in order so that you know which one follows the next. I am a surgeon that means I'm a clinician but I'm very interested in deep learning and I want everyone to be interested in it because it is going to be used to solve so many problems as it already has for us as human beings. Get everyone involved in deep learning, I maintain my argument that it's time for clinicians and those involved in healthcare to learn about deep learning and we should not just leave it to our colleagues in mathematics and computer science to do this, we should get on board as well. I'll speak to you again.