Hello, everyone. Today I'm going to give an introduction to deep learning by presenting an algorithm that I consider to be its building block. We'll see how it is the building block, and by the end you'll see how you can take this algorithm and expand it out to make deeper predictions. So who am I? I'm Jonathan Farrow. I started life as an electronics engineer, initially working on radios and circuit boards. Then I moved to Rolls-Royce, which is really where my product journey began, as a technical product manager there. From there I've moved on through different companies and different products, and I think I've worked on some interesting products in my life: parts for submarines, APIs to help with communications, the PlayStation 5, and more recently accounting software. A very varied set of products, each with its own challenges, and that tends to make products interesting. The agenda today is a very small one, but it's everything we need for us to begin our journey as product managers into deep learning. So what are we going to cover? Something called the perceptron algorithm. This is considered the simplest neural net: a single perceptron that makes a prediction. The way I want to cover it is this: once you understand it, you can layer the concept to build more and more complex algorithms and neural nets, so it truly is the foundation. And why do I think it's important for product managers to have an understanding of deep learning, AI, models, and data in general? Well, as product managers we are focused on the user experience, focused on the business, but also focused on the technology.
So we need to understand what technology can do, and see if there are opportunities for us to apply such technologies, or situations where we can say: yes, maybe this algorithm could apply here, or we have a problem, maybe we can solve it in such a manner. Understanding the fundamentals enables those conversations and those thoughts to happen. Something key I said there is data: you need to be able to look at data and say, yes, this algorithm can be applied to it. That tends to be the case for most algorithms; they suit a particular type of data. The perceptron algorithm, for example, suits data with a binary output. What do I mean by that? If you look at the data here, we have what we consider a pass and a fail. This could relate to anything: houses that sold versus houses that didn't sell, a passing grade versus a failing grade. That doesn't matter. What we should concern ourselves with is that the output is a one or a zero, that we have two classifications here. That tells us the perceptron algorithm is suitable. Something else to note when we're looking at this: the y-axis and the x-axis are predictors. We have the pass/fail outcome, and it seems to correspond to these two axes, so they are predictors. This is important to know. So what is the perceptron algorithm trying to do? If we wanted to make a prediction, to say whether a new point plotted here would pass or fail, what would we do? Well, we'd draw a line across the data and say anything above that line is a pass and anything below it is a fail. So if I come along with a new point such as (9, 9) or (9, 10), I can plot it, see that it's above the line, and say: yes, I'm very confident this is a pass. That, in essence, is what our algorithm is going to try to do.
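Just to make the kind of data I'm describing concrete, here is a tiny hypothetical pass/fail dataset in code. All values are made up for illustration; the two columns play the role of the two predictor axes on the chart, and the labels are the binary outcome the perceptron suits.

```python
import numpy as np

# Hypothetical pass/fail data: two predictors per point (think of them as
# the x-axis and y-axis of the chart) and a binary outcome label.
X = np.array([[8.0, 9.0],
              [7.5, 8.0],
              [2.0, 3.0],
              [3.0, 1.5]])
y = np.array([1, 1, 0, 0])  # 1 = pass, 0 = fail

# The perceptron suits exactly this shape of data: two classes, 0 and 1.
assert set(np.unique(y)) == {0, 1}
```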
It's going to try to establish this line, so that when we feed in new data points it can predict: look, this is above the line, so I predict a pass. So how do we draw that line? Well, this is probably an equation many of you have seen before: y = mx + b. If you've done linear regression before, or just plotted a line on a graph, this is the equation you'll be familiar with. For those who aren't, let me just go through it. y is your target, your prediction variable. It's often called y-hat, but for simplicity's sake here I'm calling it just y. This is the target, the pass or fail: is the point above the line or below the line? This equation is what's going to establish the line for us. So y is our target, and x is the predictor. If you remember, I said that on the previous graph we had two predictors. We'll come to that in a second, because we seem to have only one x here. How do we put two x's in there? It's very, very simple. But before we get to that, we have m and b. What are they in this equation? We'll consider them model parameters. They're what our algorithm is going to change and manipulate to try to alter the line. m is the slope, and we'll see that the larger it is, the greater the rate of change, so the steeper the slope. b is the intercept, which is where the line crosses the y-axis. I'll show some graphical representations of that in a second, but that is all this really pertains to: finding these variables so we can then plot the line. So what's the equation below it? Well, that's just the machine learning representation for our neural net, and you can see it has x1 and x2. What I've done here, because I have two predictors, two x variables, is convert the y-axis into a second x.
I'm just adding that onto the equation. So what we're saying here is that w is essentially m: w1 times x1. The next predictor gets weight two: w2 times x2. If we had a third one, we'd add w3 times x3, and we always add on the bias. That's all we're doing. It's a very simple equation, and just to reiterate: the more predictors you have, the more terms you add on. Then we can begin making our predictions. Now, just to represent this in graphical form. Here, when b is zero, the line crosses at the zero intercept point, and w is two, so you can see the rate of change going up quite steeply; it's doubling. If we move on to here, where the bias is 500, the line now intercepts at 500. You can see how these parameters affect the line. This is just a representation to show what the parameters are really doing. So how does this all fit into deep learning? Well, this is the perceptron. This is the first neural net we'll be constructing. Taking our equation, we build our linear model. The graph with our line on it now exists inside what we call our perceptron node. We then feed in our inputs, x1 and x2, plus a constant 1 that goes into the bias; we don't want to change the bias by the input, so we just multiply it by one. These feed into the linear model, which says: based on this x1 and this w1, this x2 and this w2, where does this point sit on the graph?
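The weighted-sum equation I've just described can be sketched in a few lines of code. This is a minimal illustration with made-up weights and inputs, not any particular library's API:

```python
def linear_model(x, w, b):
    # y = w1*x1 + w2*x2 + ... + b: the slope/intercept equation with one
    # weight per predictor, plus the bias (the intercept).
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# With hypothetical values: 0.5*2.0 + (-1.0)*3.0 + 4.0 = 2.0
result = linear_model([2.0, 3.0], [0.5, -1.0], 4.0)
print(result)  # 2.0
```

Adding a third predictor is just a matter of passing a longer x and w, exactly as described above.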
And then it's going to output that value, and a step function will convert it into a one or a zero: say, above zero outputs one, below zero outputs zero, something like that. This is what says it's a pass or a fail; that's what the step function is doing. So that's all the perceptron algorithm is. Conceptually it's very, very simple. So how do we get started with this? What needs to happen? Does the perceptron automatically have this model in there? No, it doesn't. So how do we get it to have that model, to learn it? What do we have to do? Well, first of all, it's going to randomize those parameters, the w's and the b. It picks some random values to begin with and says, right, I'll plot this line. That line will usually have an accuracy of about 50/50 at best: when I put these points through, it might get some right and some wrong. So it's not great right now. We have our net, and we've got some randomly initialized model. We need to somehow make it change. We need to say: no, this line is incorrect, please change it. So what we do is feed the data through, and a misclassified point, as it goes through, will say: I'm misclassified. I am true, and you've predicted me as false. For example, this point here would go through and say: you've got me below the line, but I'm true, and you're predicting false. So what do we do? We correct it by this error, using an error function. And how does this work for a perceptron? What can we do? We can add or subtract the value of x from the weight: depending on whether the point is above or below the line, depending on how it was misclassified, we add or subtract its value from the weight.
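The linear model followed by the step function is the whole perceptron node. A minimal sketch, using hypothetical parameters that happen to describe the line x1 + x2 = 8:

```python
def step(z):
    # Step activation: on or above the line -> 1 (pass), below -> 0 (fail).
    return 1 if z >= 0 else 0

def predict(x, w, b):
    # One perceptron node: the weighted sum, then the step function.
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Made-up parameters w = [1, 1], b = -8, i.e. the boundary x1 + x2 = 8:
print(predict([9.0, 9.0], [1.0, 1.0], -8.0))  # 1: above the line, a pass
print(predict([2.0, 3.0], [1.0, 1.0], -8.0))  # 0: below the line, a fail
```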
Then we adjust that variable accordingly, and we do this for each point we feed through. Now, you're probably wondering what this α-looking symbol is. Well, this is the learning rate. It's how we say: this is how much we want you to change. Because if we just subtracted the raw input value, we'd get huge swings of that line. Say the input value is nine; subtracting that from w is a huge swing. So instead we apply a learning rate, saying we only want a small change to happen. That α might be something like 0.01, so we multiply 0.01 by the predictor and subtract that from the weight. Now the change is much smaller, and the line moves less drastically. This is good, because we only want to make small changes as we keep feeding data through until we get a good line. If we had huge swings, we could get stuck, with the line constantly moving up and down between points that have large values, never making the small, gradual changes it needs to finally find its fit. That's why we introduce this learning rate. And we apply the same learning-rate-scaled correction to the bias as well. We do this for every point we feed through this simple network. Just to illustrate how this happens: this point here says, hey, you've misclassified me, come closer to me, and the line does. That's what's nice about this function: as you feed data through, it moves the line closer to the misclassified points. Now this point says, hey, you've also misclassified me, please move closer as well, and the line moves closer again.
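The correction step described above can be sketched as a small function. This is an illustrative version of the update rule, with made-up example values; the sign of the correction falls out naturally from the difference between the true label and the prediction:

```python
LEARNING_RATE = 0.01  # the alpha on the slide: keeps each correction small

def perceptron_update(x, y_true, y_pred, w, b, lr=LEARNING_RATE):
    # If the point is misclassified, nudge each weight by +/- lr * x_i
    # (the sign depends on which way the error went) and the bias by +/- lr.
    # If the prediction was right, error is 0 and nothing changes.
    error = y_true - y_pred  # +1, -1, or 0
    w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    b = b + lr * error
    return w, b

# A true point (label 1) predicted as 0: the weights move toward the point,
# but only by lr times its values, so the line shifts gradually.
w, b = perceptron_update([9.0, 9.0], 1, 0, [0.0, 0.0], 0.0)
print(w, b)  # weights near [0.09, 0.09], bias near 0.01
```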
And we keep doing that until we have a line that fits, a line that will make good predictions for us. Now, we can estimate how good the predictions are by looking at the mean squared error, and when the mean squared error gets small enough, we can say our model is now good. So what is the mean squared error? What it does is take each error, the difference between the prediction and the actual value, square it, sum those squared errors, and divide by the number of points. What this gives you is a single value that smooths out small errors and penalizes you for large errors: if a point is very far off, you'll be penalized more. So as the mean squared error reduces, what you're learning is this: when it gets to a small enough value, we're happy with the model. The error is small enough that it's giving a good enough prediction for us. That's where this comes into play: it gives us an estimate of how good our model is, and tells us when to stop training. Okay. So just to visualize everything that's going on here: we have our data, and this might be multidimensional data, as we've got here with x1 and x2; there could even be an x3, so a multidimensional plane. We build a model from it; that's the equation we first put in. So now we've got a model. Then we apply some error correction to it: where is the line? Is it correct? Let's move it, let's change these parameters. And then we ask: is the model good? Are we happy with the model now?
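The mean squared error calculation I've just walked through is short enough to write out directly. A minimal sketch with made-up predictions:

```python
def mean_squared_error(y_true, y_pred):
    # Average of squared differences between actual values and predictions.
    # Squaring penalizes one large error much more than several small ones.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Three predictions, one of them wrong: (0 + 1 + 0) / 3
mse = mean_squared_error([1, 0, 1], [1, 1, 1])
print(mse)  # one third

# A perfect set of predictions gives an error of zero.
print(mean_squared_error([1, 0], [1, 0]))  # 0.0
```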
Let's see whether our error function is small enough to say: yes, we're happy with that line fit. This is really the beginning of the whole process. Once we have this, we can start making predictions, feeding new points in and saying: yes, it's above the line, or it's below the line. And that's the very beginning of deep learning, this very simple neural net. Now you can take this and expand it a level further; I just want to show that. You can do it by building a multi-perceptron model. Now we have more nodes: two model nodes here, two output nodes, and this is our input layer. As you start adding more and more perceptrons, your models get more and more complex, and you can start predicting greater outputs with more complex models. This is how a very simple perceptron model starts to build up into something much richer. Each of these inputs feeds into these perceptron nodes, which then give outputs, and those connections can be changed as well; that is an option. How you model this, and which inputs feed into which node, is where the next levels of complexity come in. But this is the basis of it. This is how the perceptron algorithm is the beginning of deep learning. Once you understand this, once you can grasp that it first picks that line, we can make it more complex. So I hope this has helped everyone here grasp what the first step of deep learning is. What I'd advise everyone to do is find a data set out there; there are sites that will provide them. Have a look at it and ask: is this data suitable? Is it of the binary format, with two classes? Then can I build a very simple perceptron?
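To make that exercise concrete, here is a minimal sketch of the whole loop in plain NumPy, on a tiny hypothetical pass/fail dataset (all values made up). It pulls together everything above: random starting parameters, the linear model plus step function, and the small learning-rate corrections applied at every point until the line fits. The core really is only a handful of lines.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pass/fail data: two predictors, binary labels.
X = np.array([[8.0, 9.0], [7.5, 8.0], [2.0, 3.0], [3.0, 1.5]])
y = np.array([1, 1, 0, 0])

w = rng.normal(size=2)  # random starting parameters
b = 0.0
lr = 0.01               # learning rate: small, gradual corrections

for _ in range(5000):                       # repeated passes over the data
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b >= 0 else 0  # linear model + step function
        w += lr * (yi - pred) * xi          # move the line toward errors
        b += lr * (yi - pred)

preds = [1 if xi @ w + b >= 0 else 0 for xi in X]
print(preds)  # the trained line should now separate the two classes
```

Because this toy data is linearly separable, the updates eventually stop (the error becomes zero on every point) and the line settles into a fit.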
And I say go and do this because there are loads of code examples out there showing how, and the code is only six or seven lines to build. Then you can start making predictions. It really is very simple to build in the beginning. And another thing, if you want to learn even more: the next stage of this is more complex models, where we can no longer divide the data with a straight line; maybe it's now a curved line, so we have to apply some more advanced concepts. Well, I actually teach for Product School, where we go into deeper aspects of machine learning and how to apply these ideas, and if you are interested, please do check those classes out. And if you'd then like me to give a further lecture on non-linear models, where the classification boundaries are no longer straight, I'd be happy to make that the next lecture. But first, please do have a go at coding this with a data set, and get comfortable with it before we move on to the next concept. So, to summarize: the perceptron algorithm works on binary data. We randomize the parameters, the algorithm then uses an error function to learn the parameters and adjust the line, and it outputs a binary one or zero for its prediction: one for above the line, zero for below. That's how we define which class it's stating. I hope that was clear for everyone. As I said, please do try to code this, and reach out if you'd like the next lecture, or look into what Product School offers in terms of courses. And once again, why do I think it's important for product people to really begin this journey of learning?
Because when you can look at a data set and say, right, we could potentially build a model to make a prediction here, or when you have a question or a problem you'd like to understand, you can ask: can we apply a model or an algorithm to make that prediction for us and automate it? This is where these conversations come from. It will also enable you to say, yes, I think this is a viable route for us to take, and to have conversations with ML engineers and software engineers about feasibility. So that is all from me, and I hope that was useful. Enjoy the rest of your day. Thank you.