Good morning everyone. This is the last day of EuroPython, and in this session we're dealing with some very hot topics that I'm really interested in: neural networks and deep learning. To begin with, we have our first speaker, Tariq Rashid, who will give a talk on a gentle introduction to neural networks. So please give a big round of applause.

Hello, hi, can you hear me? Does this work? Yeah, cool. Right, first of all, thank you very much for coming to my talk. For me it's really great to be at an open community, open source conference. I always learn a lot, I always have great conversations, and there's always a very generous spirit. So I want to thank everyone, and also the organisers.

I'm going to talk about neural networks, and I just want to be really clear: my talk is intended to be very introductory. It's for people who perhaps don't know what neural networks are or how they work, or maybe you studied them a long time ago and have forgotten. So if you already know what they are, if you're already an expert, you might be bored, and I don't mind if you want to go to another talk.

My name is Tariq Rashid; I forgot to say that. I'm one of the co-organisers of London Python, and if you want to come and do something with us, please come along and have a chat with me. We really want to do more broad things, everything from computer art to teaching people to code.

As I said, this is an introductory talk. We'll talk a little bit about the background: what is artificial intelligence, and why is there a lot of interest in neural networks at the moment? Then we'll get into the ideas, and that's the meat of this talk really.
What are the concepts used in neural networks? What I'll do is use some very simple, toy examples, which may not seem that interesting, but they illustrate the key points that make neural networks work. So I hope you'll stick with them, and that will help us understand what's going on inside a neural network. We'll also apply them: I'll give you an example of applying neural networks to quite an interesting challenge, recognising handwritten numbers. I'll give some pointers around how you might code your own, and, I might regret this, I'll do a live demo at the end. If it goes wrong, that's going to be really embarrassing, but we'll give it a go. I'm not going to talk about libraries; there's lots of cool stuff out there, there's Theano, there's TensorFlow, and there will be lots of talks today covering things like that. Mine is really about the concepts: what's really going on, and how you might do your own.

Just to get us into the right frame of mind, let's start with two questions. I have a seven-year-old daughter, and she likes challenges, so I set her this challenge. I said, can you look at this picture and point out where the people are? As a seven-year-old child, she found that quite exciting and very easy, and she counted the people in the picture, and that was fine. She can do numbers, she can add, she can subtract, so I said to her, can you add those numbers? And she found that very difficult. But you know that with coding, with computers, with Python, doing a calculation such as the one on the right is actually very easy, but instructing a computer to find people in a photo is not so easy.
So that's interesting. Finding people in a picture is easy for us but hard to code for computers, while the calculation is hard for her but easy to code. There's something there, and we would like to be able to solve these kinds of problems, because they're interesting: find me a picture of a cat; work out what the words are in this sound sample, this audio file. Those are really interesting problems, and we want to be able to solve more of them. Terminology like "artificial intelligence" means different things to different people. For me, it means being able to solve the kinds of problems that traditionally have not been that straightforward. That's what this is about.

There's a lot of hype at the moment, a lot of buzz, lots of stuff going on that you won't have missed: autonomous cars, health data being used to improve outcomes. Google has been very active recently with being able to play Go, which is amazing; we thought that would take another twenty years, and they used neural networks as part of their solution. So that interests people, and that's what we'll talk about today.

Let's go right back to the beginning, really assuming nothing. We want to ask a computer a question, and we want an answer, and in between it does some kind of thinking. Well, clearly it can't think; it's just metal and wires. So it has to calculate, it has to process, and those are words I guess programmers like ourselves understand. We have input, we have some kind of calculation, and we have output. Neural networks and artificial intelligence, that's all it is. There's nothing mysterious about it; it's just calculations, just cleverly done.

So let's set ourselves a very, very simple example just to get started. Imagine that the conversion from kilometres to miles is a difficult problem.
Just imagine. I know it's not, but let's just imagine it is, and imagine we didn't know how to do it. So we invent a model in our mind. We say: maybe one is the other one multiplied by a number. That's a model we've come up with. We think it might be right, we might be wrong, and we're going to try it. Let's start with a number we don't know: it could be miles is kilometres times a hundred, or miles is kilometres times two. We'll start with 0.5, and if we compare the output with a real example of truth, we know 100 kilometres should be 62.137 miles, but our model calculated 50. That's not that bad; there's an error of about 12. It's okay, but not great. Let's tweak the parameter to 0.6. That gives a better answer, still not exactly right, but the error is much smaller now. Let's try again with 0.7. We've gone too far, and it's much worse. So let's not be so enthusiastic about jumping, and try 0.61, and that's actually getting quite close.

This idea of using a model, tweaking a parameter inside it, and then comparing the output with what we know should be true, is how neural networks work, and many other machine learning methods too. We take the error that pops out the other end and use it to tweak and guide the refinement of a parameter inside the model. I hope that's clear. It's a super easy example, but that's what a neural network is doing. If you just replace that circle with a neural network, that's really what's happening: you're training it, you're looking at the error, and you're tweaking parameters inside it to try and get a better answer at the other end. You can go home now.
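That refinement loop can be sketched in a few lines. This is my own toy code, not the speaker's: compare the model's output with a known true example and nudge the multiplier a little in the direction of the error.

```python
# Toy sketch of the kilometres-to-miles refinement just described.
TRUE_MILES_PER_KM = 0.621371   # the answer we pretend not to know

def model_error(factor, km=100.0):
    # Known truth minus what our "miles = km * factor" model says.
    return km * TRUE_MILES_PER_KM - km * factor

factor = 0.5                    # first guess, as in the talk
for _ in range(50):
    e = model_error(factor)
    factor += 0.001 * e         # small nudge in the error's direction

print(round(factor, 3))         # → 0.621, close to the real conversion
```

The constant 0.001 keeps each nudge small; a bigger value would overshoot, just like the 0.7 guess did.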
That's it. Okay, so those are the key points: if we don't know how something really works, if we haven't got an exact mathematical model, we can invent one, a model we think might be true, and we can try it, and we can have parameters that we can adjust. And the important point there is that the error is used to refine the model.

So let's take our daughter into the garden, where she likes to pick up bugs, and she's picked up some caterpillars and ladybirds. Imagine that we've plotted them on a graph of width and length. Caterpillars are thin and long, and ladybirds are short and wide, and if we plot them, we can see there are two clusters, two groups, which is interesting. Some of you will recognise this as clustering.

What we did with our first example was have a linear predictor: we thought the relationship between kilometres and miles was a straight line, and we changed the parameter, the slope. What we're going to try here is to apply that same simple model and see if we can come up with a way of predicting, or classifying, what a bug should be. So that line, instead of being a prediction line, could be a separating line: things on one side of the line are one type of object, caterpillars, and things on the other side might be ladybirds. That first line isn't a good one, because it doesn't separate the two kinds of bugs. This one doesn't either. And that one does.

So you can see that learning to classify is not that different from the very first simple example we looked at, and you can also see, through this naive animation, that changing the slope is a way of learning to find a good separation line between those two clusters. Once we've learned a good separating line, if we then find an unknown bug, we can say, well, that falls in that half of the space.
So it must be a caterpillar. Classifying things is kind of like predicting things.

We apply these methods when we don't really know what the model should be, but what we do have is real data. So we learn from data: we invent a model, we think it's a good one, and we try to refine it to match the data we've collected. It might be data from space, the microwave background radiation we heard about earlier in the week; it might be voice data; it might be sentiment. We're going to stick with a super simple data set consisting of two items, just the widths and lengths of two bugs.

Here we are plotting them. We start again with a randomly chosen parameter for that line, a randomly chosen gradient, and we say, okay, that's not good, because it doesn't separate the two bugs. So let's look at the first example there, and we shift the line up towards that point. Okay, we've improved it, and that does a reasonable job. Now, what could we learn from the second example? Well, we look at it and say, all right, the separator must keep that example on that side of the line, and that kind of works. If you're interested in the maths, it's really simple: we've got straight lines, so it's very simple linear algebra.
You can rearrange the terms to work out what the change in gradient should be if you want the line to go through a certain point. But actually, we've made a kind of mistake here, because what we've done is look at an example and ignore all the previous ones before it. We don't want to do that; we want to learn from all the data, not just the last example we looked at. If we did this, we would work through all the examples and just end up with an answer we could have got by looking at the last example alone.

One way of fixing that is not to be so enthusiastic about the amount you jump by. Instead of changing the line by, say, five, we apply a factor, a learning rate, so we only jump a little bit. If the first example wants me to go over there, I just move in that direction a little bit; if the next example wants me to go over there, I move a little bit. With lots of data, you can see that you eventually get better and better, but you're not overly influenced by any individual data point. And that's good, because data is noisy and there can be outliers, there can be errors in the data, and you don't want to give too much importance to any one individual data point. That learning rate is quite an important idea in neural networks, and we'll understand why in a minute. So let's dial up the complexity a little bit.
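Sticking with the learning rate for a moment, the idea can be sketched in a few lines. This is my own toy example, not the talk's: each training example "wants" a different slope, and without moderation only the last example matters.

```python
# What each example alone would suggest the slope should be.
desired_slopes = [0.4, 0.6, 0.35, 0.65]

# Without a learning rate: jump all the way to each example's answer.
slope = 0.1
for d in desired_slopes:
    slope = d
print(slope)                 # 0.65 — only the last example survives

# With a learning rate: move only a fraction of the way each time.
slope = 0.1
lr = 0.3
for d in desired_slopes * 5:     # several passes over the data
    slope += lr * (d - slope)    # partial step towards this example
print(round(slope, 2))           # ≈ 0.52, near the middle of all examples
```

The moderated version settles close to the average of all the examples, so a single noisy outlier can't drag the line around.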
So imagine we have data which is something to do with the real world, and it's causal. Maybe I'm measuring the amount of smiling in this room, and the other factors I'm measuring are whether it's sunny and whether it's a weekend. If it's sunny and it's the weekend, there might be more smiles; or if the sun is shining and there's no cloud, the temperature might go up. In the real physical world, data can have causal links, and we want to be able to model that, to predict or classify data that comes from the real world.

Here are some simple examples: the Boolean relations, the AND relation and the OR relation. The AND is only true if both inputs are true. Some data can be like this. Can we model it with the very simple classifier that we've got? Just to say, as with the picture at the start, there's nothing wrong with having two inputs into a calculation. We can visualise that data by plotting the two inputs as coordinates and colouring the output, so for an AND relationship we colour a point green if both inputs are on. And yes, the dividing line still works: we can still have a linear classifier to separate off data which has an AND kind of causal link in it, and the same with OR. So that's cool: we could use that very simple, naive classifier to learn data which has logical AND or OR in it. That's looking hopeful. But actually, in history,
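A single linear node really can represent AND and OR. The weights and thresholds below are my own illustrative choices, not values from the talk's slides:

```python
# One linear node: fire (output 1) when the weighted sum of the two
# inputs reaches a threshold.
def linear_node(x1, x2, w1, w2, threshold):
    return 1 if w1 * x1 + w2 * x2 >= threshold else 0

# AND: the sum only reaches 1.5 when both inputs are on.
and_outputs = [linear_node(a, b, 1.0, 1.0, 1.5)
               for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(and_outputs)   # [0, 0, 0, 1]

# OR: a threshold of 0.5 means either input alone is enough.
or_outputs = [linear_node(a, b, 1.0, 1.0, 0.5)
              for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(or_outputs)    # [0, 1, 1, 1]
```

Geometrically the threshold line is exactly the dividing line in the coloured plots the talk describes.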
I think it was probably in the seventies, people became sad, because somebody wrote a paper saying that these simple classifiers are very limited. It's because those simple linear classifiers can't learn data which has the XOR relationship. If I have two variables which are related to the answer with an XOR, which is only true if either of the inputs is true but not both, a linear classifier can't do it, and you can see why: visually, no single line correctly separates those two classes. That led to a bit of a slowdown in research in neural networks.

But if you look at this, you think, well, the answer is kind of there already: we have two lines. And this is an important point. I know this is a very simple example, but what it suggests is that we need more than one of those classifiers to help us with data that's more complex. That is actually why neural networks have many, many nodes rather than just one node. I think that's an important point. Some problems can't be solved with just a simple linear classifier, but that's the motivation for why we might want to explore using multiple nodes.

Let's shift a little bit and look at nature. We started right at the beginning with the example of my daughter's brain being able to find people in a photograph, but me not being able to code that very easily. Human brains are doing something, and working in a way that's different from how my laptop here works, and throughout history people have tried to understand: what is it about the way biological brains work that makes them so good? What can we learn from that and replicate in new kinds of algorithms? And just have a look: this computer's got, what, 16 gig of RAM, and however many million instructions per second.
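The "two lines" observation can be made concrete: two thresholded nodes plus a third combining node do solve XOR. This is my own sketch, not code from the talk:

```python
# One linear threshold node, as before.
def node(x1, x2, w1, w2, t):
    return 1 if w1 * x1 + w2 * x2 >= t else 0

def xor(a, b):
    h1 = node(a, b, 1, 1, 0.5)       # first line: behaves like OR
    h2 = node(a, b, 1, 1, 1.5)       # second line: behaves like AND
    return node(h1, h2, 1, -1, 0.5)  # combine: OR but not AND

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0, 1, 1, 0]
```

No single `node` can produce that truth table, but a small network of them can — which is exactly the motivation for multiple nodes.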
It's quite chunky, and yet a pigeon, which has a brain of 0.4 grams, can fly, it can learn to eat, it can communicate, it can learn to do new tasks. That's really important. A snail has got 11,000 neurons; that's not a lot, really. With big data technologies we can store huge amounts of data, and these things have just 11,000 neurons. And this worm has 302 neurons; I'm sure I could fit that in a micro:bit. And this is interesting: there is a species of whale which has 37 billion neurons, but we humans have about 20 billion. And it's using them, because if it wasn't using them, they would have evolved away, since they're a cost. It makes you think that maybe we're not the most superior things on the planet. But anyway, the point here is that nature is doing something with brains that we can learn from: with apparently such small resources, they're able to do tasks which we think are quite complicated.

Those neurons that biologists know are inside our brains and nervous systems, if we look at them, what they do is transmit a signal along to another one. Those are just the names for the various elements, but what they don't do is pass a signal on without any kind of resistance. They only pass a signal on once that signal has passed a threshold. It's like turning up a dial: the light only goes on after I've reached a certain number. So maybe the computing neurons that we model should do the same, and some people think, oh, I could use a step function to do that: if the input is past a certain point, then it switches on. Actually, you could do that, but in nature we know that things aren't always black and white and hard-edged; things are softer. So we might try a softer kind of function.
That's the sigmoid function, and there are others you could use. We know in nature these things are connected in a kind of network, a mesh, and you can see the signals going along, so maybe that's what we should try to model when we want to do interesting tasks like recognising pictures. And again, going back to what we saw right at the start, there's nothing wrong with having more than one input coming into a computing node. What we've just said here is that we collect the inputs, just as they do in nature, and we apply a threshold function, so that we only have an output if that combination is big enough. That becomes our node in a neural network.

So after however many minutes of talking — oh my god, I'm running out of time — we've finally got a neural network. That's all it is: an artificial neural network is our attempt to recreate what those biological brains are doing, and each of those circles is doing what we saw here: collecting the signals, applying the threshold function, and passing on the output. It is convention that we call these layers: we have an input layer, a middle layer, and an output layer, and there we have some connections.

So let's pause a little bit and think. With our very first example, where we wanted to convert kilometres to miles, we had a straight line with an adjustable slope, a parameter. That's what did the learning: the learning was the changing of that slope, that multiplication factor. What's learning in a neural network? What do we change? What do we need to tweak so that the outputs are better?
There are probably lots of answers to that. You might say: that threshold function, that curve — maybe we need to change its slope in each of those nodes. That's not a bad idea, actually. But where history has taken us, what people do, is adjust the links, the strength of the links between those nodes. If a link is strong, a signal is amplified; if a link is weak, it's reduced; and if the weight, the strength, is zero, you effectively break the link. So that's one approach, and it's the one that's become popular, probably because it's easier as well.

So when we feed signals forward, let's imagine we've got signal one at the top there, and we have a link between node one and node one, and you can see it's got a weight, a strength, of 0.9. What we do is take that signal, one, multiply it by 0.9, and that's what feeds into the next node. Same here: 0.5 times 0.3 is what goes up there. That's really easy; that's not complicated at all. That's what is happening inside a neural network: just multiplying signals through connections, feeding them on to the next node and collecting them.

That's just a reminder of what we're doing: we've got these signals coming in, and we add them up, but this time you can see that we're weighting them. We're using the weights of those links to either boost or reduce those signals. You can see the slides afterwards; you don't have to do the calculation now, but if you want to, you can verify that this times this, plus that times that, gives you that answer. That really is as simple as it gets, and that's what a neural network is doing. There's nothing very complicated there at all.

So we had a very simple network here with just four nodes. If we wrote out with pen and paper what is happening at each node?
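The weighted-sum-then-threshold step just described fits in a tiny function. The 1 × 0.9 and 0.5 × 0.3 numbers echo the talk's example; the rest is my own sketch:

```python
import math

# A single artificial neuron: sum the weighted inputs, then squash
# the result with a sigmoid rather than a hard step.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights):
    combined = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(combined)

# Signal 1.0 through a 0.9 link, signal 0.5 through a 0.3 link.
print(round(neuron([1.0, 0.5], [0.9, 0.3]), 3))   # → 0.741
```

The combined input is 1.05, and the sigmoid squashes that into the 0-to-1 range before it's passed on.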
So at node number one in layer two, if we wrote out what's actually happening, we'd say it's input one times that weight, plus input two times that weight. And if we wrote it out for this one, and again for all the nodes, you start to see a pattern, and that pattern is really helpful, because it allows us to write the calculation as a matrix multiplication: the weights matrix times the input signals becomes the signals that go into the next layer. That's really valuable to us for two reasons. First, it allows us to write the calculation in a much more concise way, so we don't have to write pages and pages for big networks; we just write weights times inputs equals the signals into the next layer. The other reason it's really important is that computers — numpy — can accelerate matrix multiplications, and we want to take advantage of that, whether it's numpy, whether it's the Fortran libraries we heard about earlier, or whether it's hardware acceleration, using your graphics card to multiply matrices. If we can formulate our calculations in terms of matrices, then we can take advantage of that acceleration. You might say, oh man, matrices are so boring, why do we have to do this again? But this is the reason.

So that's cool: we're feeding a signal forward through each layer of the network, and we get an answer at the other end. We know that we're actually likely to be wrong, just like we were at the start, so we have an error, and going back to that very first example again, we use that error to refine and improve the parameters inside the model. How do we do that here? Let's break it down. Let's have a simple network here, a simple picture, just to see what might happen. Well, we know that we need to change the weights; we've already agreed that we're going to change the strength of those links in order to try and improve the answer.
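One layer's feed-forward step as a matrix multiplication looks like this in NumPy. The weight and input values are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# weights[i, j] is the strength of the link from input j to node i.
weights = np.array([[0.9, 0.3],
                    [0.2, 0.8]])
inputs = np.array([1.0, 0.5])

layer_inputs = weights @ inputs        # combine all the signals: W · x
layer_outputs = sigmoid(layer_inputs)  # threshold every node's sum at once
```

One line replaces the per-node pen-and-paper sums, and NumPy (or a GPU) can accelerate exactly this operation for networks of any size.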
That's what we're playing with; that's the parameter we're tuning. So what's the error? We know what the error is right at the end of the network: if the answer should be five and we get three, the error is two. But what's the error inside the network? We need to know the error in order to change the weights, so that's quite an interesting question, and actually lots of the guides and books gloss over it a little bit.

The first thing to say is that there's probably no mathematically perfect answer, so what we do — the word is heuristics — is think: what would be an intuitive way of working out the error inside the network? One intuitive idea is to split it: if the error is five, maybe I push two and a half this way and two and a half that way. That's one idea. Another idea is to say that the top link, with weight three, contributed more to the error, because it was a bigger, stronger link; it magnified the signal. Maybe I should push more of the error in that direction. So you split the error in proportion to the link weights: if I've got links of strength three and one, three quarters of the error goes to the top node and a quarter goes the other way. That kind of makes sense. I'm sure there are more sophisticated things you could do, but we want to keep it simple, especially if it works.

So you can see there that the error from that node is being split and pushed back, and the same here, and at the internal nodes you collect the several fractions of error from the links that connect to them. It sounds complicated, but when you see it as a picture, you can see the errors flowing backwards: back-propagation of the error. That's where the term comes from, error back-propagation. So we feed signals forward and back-propagate errors. Oh, it's too many calculations.
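The proportional split just described can be written out directly; the strength-three and strength-one links are from the talk, the error value is mine:

```python
import numpy as np

# One output node fed by two hidden nodes, via links of strength 3 and 1.
weights = np.array([[3.0, 1.0]])
output_error = np.array([4.0])

# Each hidden node gets a share proportional to its link's weight.
proportions = weights / weights.sum(axis=1, keepdims=True)
hidden_errors = proportions.T @ output_error
print(hidden_errors)   # [3. 1.] — three quarters and one quarter of 4.0
```

Note that the split is a multiplication by the transposed weights matrix. In practice the normalising denominator is usually dropped, leaving simply `weights.T @ output_error`, which is the form the talk's matrix expression uses.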
You can look at it afterwards. The point here is that you're summing up the errors: if the error coming from the top right is 0.6, and this link contributes 0.1, the error is then 0.7. It's 0.6 plus 0.1; you just collect the errors. And again, it's really fortunate that if we write out what's really happening in terms of the variables, it becomes a matrix multiplication again, which is really nice, because we can accelerate it and write it in a very concise way, without worrying about the actual size of the network. It's only slightly different: the weights matrix is transposed, flipped diagonally. Again, super simple.

Okay, so we've got the errors now at each node. How do we change the weights? So that's one of the output nodes, and those w's are the weights of the links into it. I'm not going to be able to untangle that expression, and if you can, well done; it's horrible. What we need to say here is that we're not going to be able to untangle it in any nice, mathematically clean way, so let's find other mathematical methods, which are perhaps approximate but good enough.

So let's go on a bit of a journey. Imagine this landscape is a complicated function, like the one we saw. Say it's an error function, because what we had here is the output, and the error function is simply that minus what the actual target should be. If this horrible, complicated, lumpy landscape is a really complicated function which we can't work out analytically, with nice clean algebra, another way to work with it, and maybe work out where the minimum error is, is to say: well, if this was a landscape, and I didn't have a map of everything, and it was dark,
I couldn't understand the whole function. But if I did have a torch, what I could do is point it down near my feet and say: well, the slope is going in this direction, take a few steps; it's going in that direction, take a few steps; and eventually you would work your way down to a minimum. Some of you will put your hands up and say it might not be the best minimum, but we'll come to that.

So this approach, which is not mathematically clean — it's an approximate method — works really well, and you can see it working with the x-squared function. Let's pretend the x-squared function is really difficult; let's just pretend that. We start at a point, we see what the gradient is locally, and we move in that direction, and we keep doing that, and we get to the minimum. That works, and that's nice. You might even be more sophisticated and say: as the slope gets smaller, take smaller steps, because you're getting closer and closer to the real minimum, and you don't want to overstep it. That's an idea that's actually used in neural networks as well.

So if that complicated function was the error function, then we have a way of finding its minimum. I have a picture to show you: there it is. So we have the weights, which are what we want to improve, and we have an error function, which is complicated.
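The "pretend x-squared is difficult" example runs in a few lines of my own sketch code. Notice that the step size shrinks naturally, because the slope 2x gets smaller near the minimum:

```python
# Gradient descent on y = x squared, starting away from the minimum.
x = 3.0
learning_rate = 0.1

for _ in range(50):
    gradient = 2 * x               # the local slope of x² — our "torch"
    x -= learning_rate * gradient  # step downhill

print(round(x, 4))   # very close to the true minimum at x = 0
```

Each step moves against the gradient, which is exactly the torch-in-the-dark strategy: no map of the whole landscape, only the local slope.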
We want to use this gradient descent method to find the minimum of that error, and we'll then know what the right weights should be. If we're over here with the wrong weights, we have high error, and we try to improve our position and move down the error function to somewhere where the error is smaller, and the weights there will tell us what the right weights in the network should be. This is called gradient descent, and it's a way of working with that horrible expression that we couldn't do analytically before.

If you did write this out with pen and paper, and worked out the gradient locally, it's not that hard. I won't do it here; I have a blog post if you want to look. It's very simple calculus, the kind you do at school, just using the chain rule, nothing more complicated than that. So if you're interested, have a look; there's what I hope is a very clear blog post on that. What we've done now is work out a way of improving the weights based on the gradient of that error function, and you've seen this kind of equation many times, where we iterate and keep improving.

Okay, I've only got a bit of time left, so I'm going to zoom through how you might do it yourself. I'm not an expert Python coder — there are people here who are — but broadly speaking, if you wanted to do this yourself, you might think: what would a Python program or class look like? Well, we know we've got to initialise this data structure, this network, and it's really simple: all we really have to do is set the size and initialise those weights to random initial values. We know we've got to have some way of training the network, so we're doing the learning, and then we've got to have a method of querying the network.
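The chain-rule update for one layer of weights can be sketched as follows. This is my own minimal version of the standard sigmoid-output gradient step, with made-up numbers, not the speaker's code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.3                                   # learning rate
weights = np.array([[0.5, 0.5]])           # hidden -> output links
hidden_outputs = np.array([[0.6], [0.4]])  # signals from the hidden layer
target = np.array([[0.99]])

final_outputs = sigmoid(weights @ hidden_outputs)
errors = target - final_outputs

# Chain rule: dE/dw involves the error, the sigmoid's derivative
# out*(1-out), and the signal that came through the link.
weights += lr * (errors * final_outputs * (1.0 - final_outputs)) @ hidden_outputs.T
```

After the update, both weights have grown slightly, pulling the output towards the 0.99 target: exactly the "step downhill on the error surface" picture from the slides.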
So we ask it a question and get an answer back. If you go on to make your own neural network library or class, it doesn't have to be more complicated than that at all. I had a go, just for fun, to learn, and there are some very useful Python libraries. NumPy is great for matrix multiplications, as you've heard all this week. SciPy has some nice functions in there, including that curve, the threshold function; it's built in, so you can use that yourself. For plotting things there's matplotlib, and you can use other things too. I started programming a long time ago, and then I stopped; I was coding Python in 1998, 1999 — Python 1.5, 1.6, the Numeric library rather than NumPy. Notebooks didn't exist then, so when I came back to Python, notebooks were fantastic. Excellent.

This is an example of a function which initialises the network. It looks complicated, but don't be put off: all it's really doing is setting the size in terms of the input nodes, hidden nodes and output nodes, and you can see here I'm using a NumPy function to randomise the weights, which are a matrix. That's it, nothing more complicated than that. You can see the SciPy function there as well: expit, oddly named, but that's the logistic function, the curve.

We'll do the training next, but querying, again, is really easy. We take the inputs — I've turned them into an array; it was a list here — and you can see there's a matrix multiplication, numpy dot, to do the calculation. That's it. We apply the activation function to the outputs of that multiplication, simple as that. You've then got the signal at the next layer, and then you do it again. It's as simple as four lines: propagating the signal through a neural network, as simple as that.
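A class along those lines might look like this. The names and the weight scale are my guesses, not the speaker's exact code (he uses SciPy's `expit` for the sigmoid; a NumPy version is defined here to stay self-contained):

```python
import numpy as np

def sigmoid(x):
    # The logistic curve; scipy.special.expit does the same job.
    return 1.0 / (1.0 + np.exp(-x))

class NeuralNetwork:
    def __init__(self, n_inputs, n_hidden, n_outputs):
        # Set the sizes and give both weight matrices small random values.
        self.wih = np.random.normal(0.0, 0.5, (n_hidden, n_inputs))
        self.who = np.random.normal(0.0, 0.5, (n_outputs, n_hidden))

    def query(self, inputs):
        # Feed the signal forward: weights times inputs, then squash.
        inputs = np.array(inputs, ndmin=2).T
        hidden = sigmoid(self.wih @ inputs)   # into the hidden layer
        return sigmoid(self.who @ hidden)     # on to the output layer
```

For the MNIST task mentioned shortly, something like `NeuralNetwork(784, 100, 10)` would fit: 28 × 28 = 784 input pixels and ten output nodes (the hidden size is a tunable choice).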
I'm sure I could make it, you know, even more concise, but I just want to let you know, by doing this, that it's not mysterious or scary or complicated. It really is as simple as just that.

And the training, again, will look scary, but it isn't really. The top half is exactly the same as what we've just had — we're feeding the signals forward, exactly the same code as before. Then what we're doing is we're saying the output error is the target, which is from our training data, minus what we've worked out. Then we use another set of matrix multiplications to work out the errors internal to the network, and then we change the weights using that expression that we worked out with calculus. That's it. That's how you train a neural network. I'm sure I could make it even more beautiful and clean, but I just wanted you to get a feel for what's really going on in the network. It's not that complicated; you could do it yourself.

Okay, with a few minutes left, I'm just going to show you that with just those very simple ideas that we looked at — I mean, people will have many more sophisticated methods and optimisations, and you can read quite a lot about neural networks — but just with the very, very simple ideas that we've looked at, you can do some powerful things. We can train a network to learn to recognise human handwritten numbers. There's a famous challenge, a data set called the MNIST data set. It's got 60,000 training examples. It's all free, open data; you can get it yourself, and my blog can point you to it. There's a test set as well, so you can compare your results with others. If you look at the data, you'll see the numbers there — actually, if you plot one as an image using, what did I use, matplotlib, you can see that's a five, twenty-eight by twenty-eight pixels. So if we feed that data into a network and train it — actually, I missed something there.
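The training step being described — forward pass, output error as target minus actual, errors propagated back with a matrix multiplication, then the calculus-derived weight update — can be sketched as a standalone function (names and the standard sigmoid-gradient update are my reconstruction, not the slide's exact code):

```python
import numpy as np
from scipy.special import expit  # logistic activation

def train_step(w_ih, w_ho, inputs_list, targets_list, lr=0.3):
    """One training update for a three-layer network.
    w_ih, w_ho are the input->hidden and hidden->output weight matrices."""
    inputs = np.array(inputs_list, ndmin=2).T
    targets = np.array(targets_list, ndmin=2).T

    # Forward pass — exactly the same as querying the network.
    hidden_outputs = expit(np.dot(w_ih, inputs))
    final_outputs = expit(np.dot(w_ho, hidden_outputs))

    # Output error = target minus what the network worked out;
    # the internal (hidden) errors come from another matrix multiplication.
    output_errors = targets - final_outputs
    hidden_errors = np.dot(w_ho.T, output_errors)

    # Change the weights using the chain-rule (gradient descent) expression.
    w_ho += lr * np.dot(output_errors * final_outputs * (1.0 - final_outputs),
                        hidden_outputs.T)
    w_ih += lr * np.dot(hidden_errors * hidden_outputs * (1.0 - hidden_outputs),
                        inputs.T)
    return w_ih, w_ho
```

Looping this over the training examples, a few times over the whole data set, is the entire training procedure.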
We have to choose what the output looks like. What I've chosen is to say we have ten nodes at the output, and if the answer should be, say, nine, the node for nine has the biggest value. So you can see here, if it's five, then it's the node for five that should have a high value, and everything else should be low. That's what I'm using to train the network. That last example is interesting, because in that one the network thinks the answer is probably nine, but it might also be four.

And you can get some really good results just with those simple ideas. 96% accuracy is what I got on the first go. That's not bad — sort of 20 lines of code, and on handwritten human numbers you can get over 90% accuracy. That's not bad at all, is it? I think if you have a play, you can tune things like the learning rate or the number of hidden nodes, and you can see that you get improved performance — and then at some point it might not improve so much. I've deliberately put in a bit of a wonky graph there, just to remind us that neural network training is a random process. We're starting off with random initial weights, and sometimes it can go wrong. For the scientists among us, it reminds us that we should do this many, many times and take the best, to make sure that we've not got an anomalous kind of answer. Remember that gradient descent from before — you might end up at the wrong minimum, not the best minimum — so it's worth doing this many times.

If you rotate your original data set, you can get 98% accuracy. This is actually really good, because if you look at the academic papers, they get sort of, you know, 99%, 99.5%, but they're using really advanced techniques, and I think with just 20 lines of code that's not bad. I'll just skip this, and what I'll do now is I'll just say: you can actually do everything I've done with a Raspberry Pi Zero — you can look at the blog as well.
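The output encoding just described — ten nodes, the node for the correct digit high and everything else low — might be built like this (the 0.01/0.99 values are a common choice to keep targets away from the logistic function's unreachable extremes of 0 and 1; an assumption here, not stated in the talk):

```python
import numpy as np

def make_target(label, output_nodes=10):
    """One output node per digit 0-9: the node for the correct
    label is set high, all the others low."""
    targets = np.full(output_nodes, 0.01)
    targets[label] = 0.99
    return targets

make_target(5)  # the node for digit 5 is high, the rest are low
```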
You can do it with a Raspberry Pi Zero, which costs about four or five euros. You don't need an Nvidia graphics card to do any of this.

So for the last few minutes I'm going to try and do a live demo, and I might regret it. So let's think of a number to classify. Let's think of a number... seven. Okay, it has to be one digit. That's seven. Let's write it. Here's one I did earlier, actually, from a newspaper — so that's the number two, and it got that right. This is a network I trained last night. So let's — oops — okay, all right, let's do three. If it doesn't work, it's not my fault. Okay, let's resize that to 28 by 28, let's save that PNG — all this code is on GitHub if you want to have an explore. All right, if I press go, let's see... three! Yes, it works, it says three. Phew. I did a few last night and it didn't work every time. Phew. I'll stop there. I'd love to have a chat about this afterwards. I don't know how much time I've got left for questions — do I have any, or did I use it all up? Okay, all right. Yeah, okay. Thanks for listening, by the way. Thanks.

So, any questions?

Hi. Hello. So we have seen the output nodes of the numbers. What are the input nodes — are those the individual pixels?

Those are, yeah, those are the individual pixels. So if we have an image of 28 by 28, that's 784 pixels, I think, so you would have an input layer of 784. You can choose other ideas: you can say, I want to rescale everything, or I might want to have different features as inputs — you can do things like that. But this is a very simple example, a very naive example, which takes the raw pixels, and it works. But people will do other things: if they know something about the data they're working with, they might say, I think another feature is more likely to be a factor in the answer, and they might use that to train the network instead. So they might use colour, they might use alpha values, they might use something else.

Any other questions? Okay, thanks a lot. It was a great talk.
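Turning a 28-by-28 image into those 784 raw-pixel inputs is a one-liner reshape plus a rescale. The scaling range of 0.01 to 1.0 is a common convention for sigmoid networks (an assumption here), keeping inputs away from zero so the weight updates don't vanish:

```python
import numpy as np

def image_to_inputs(pixels_28x28):
    """Flatten a 28x28 greyscale image (pixel values 0-255) into the
    784 input values for the network, rescaled into 0.01-1.0."""
    flat = np.asarray(pixels_28x28, dtype=float).reshape(784)
    return (flat / 255.0) * 0.99 + 0.01
```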