Hello everybody, and welcome to Deep Learning for Coders, lesson one. This is the fourth year we've done this course, but it's a very different and very special version, for a number of reasons. The first reason it's different is that we are bringing it to you live from day one of a nearly complete shutdown of San Francisco. We're going to be recording it over the next two months in the midst of this global pandemic, so if things seem a little crazy sometimes in this course, I apologize, but that's why.

The other reason it's special is that we're trying to make this our definitive version. We've been doing this for a while now, and we've finally gotten to the point where we almost feel like we know what we're talking about, to the point that Sylvain and I have actually written a book. We've also written a piece of software from scratch, the fastai library version 2, and we've written a peer-reviewed paper about that library. So this is designed to be the version of the course that will hopefully last a while.

The syllabus is based very closely on the book, so if you want to read along properly as you go, please buy it. I say "please buy it" because the whole thing is also available for free in the form of Jupyter notebooks, and that is thanks to the huge generosity of O'Reilly Media, who have let us do that. You'll be able to see on the website for the course how to access all of this, but here is the fastbook repo, where you can read the whole thing. At the moment, as you can see, it's a draft, but by the time you see this it won't be.

So we have a big request. The deal is this: you can read the book for free as Jupyter notebooks, but that is not as convenient as reading it on a Kindle or as a paper book. So please don't turn it into a PDF.
Please don't turn it into a form designed for reading, because the whole point is that we hope you'll buy it. Don't take advantage of O'Reilly's generosity by creating the thing that they're not giving you for free; that's actually explicitly the license under which we're providing this as well. Mainly, though, it's a request to be a decent human being. If you see somebody else not being a decent human being and stealing the book version, please tell them not to. It's not nice; don't be that person.

Either way, you can read along with the syllabus in the book. There are a couple of different versions of these notebooks. There's the full notebook, which has the entire prose, pictures, everything. Now, we actually wrote a system to turn notebooks into a printed book, and sometimes that looks kind of weird. For example, here's a weird-looking table; if you look in the actual book, it looks like a proper table. So sometimes you'll see little weird bits. They're not mistakes; they're bits where we add information to help turn our notebooks into a proper, nice book. Just ignore them.

Now, when I say "we", who is "we"? One important part of the "we" is Sylvain. Sylvain is my co-author on the book and on the fastai version 2 library, so he is my partner in crime here. The other key "we" is Rachel Thomas, so maybe Rachel can come and say hello. She's the co-founder of fast.ai. Well, yes, I'm the co-founder of fast.ai.
I'm also lower, sorry, taller than Jeremy, and I'm the founding director of the Center for Applied Data Ethics at the University of San Francisco. I'm really excited to be a part of this course, and I'll be the voice you hear asking questions from the forums.

Rachel and Sylvain are also the people in this group who actually understand math. I am a mere philosophy graduate; Rachel has a PhD, and Sylvain has written ten books about math. So if math questions come along, it's possible I may pass them along, but it's very nice to have the opportunity to work with people who understand this topic so well.

As Rachel mentioned, the other area where she has real, world-class expertise is data ethics; she is the founding director of the Center for Applied Data Ethics at the University of San Francisco. We're going to be talking about data ethics throughout the course, because we happen to think it's very important. For those parts, although I'll generally be presenting them, they will on the whole be based on Rachel's work, because she actually knows what she's talking about, although thanks to her I know a bit about what I'm talking about too.

So: should you be here? Is there any point in you attempting to learn deep learning? Or are you too stupid, or lacking in vast resources, or whatever? Because that's what a lot of people are telling us: they're saying you need teams of PhDs and massive data centers full of GPUs, otherwise it's pointless. Don't worry; that is not at all true.
It couldn't be further from the truth. In fact, a lot of world-class research and world-class industry projects have come out of fast.ai alumni and fastai-library-based projects, and elsewhere, created on a single GPU using a few dozen or a few hundred data points, by people with no graduate-level technical expertise. In my case, I have no undergraduate-level technical expertise; I'm just a philosophy major. As we'll see throughout the course, there is lots and lots of clear empirical evidence that you don't need lots of math, you don't need lots of data, and you don't need lots of expensive computers to do great stuff with deep learning. So just bear with us; you'll be fine.

To do this course, you do need to be able to code, preferably in Python. If you've done other languages, you can learn Python. If the only language you've used is something like MATLAB, where you've used it as more of a scripting tool, you will find this a bit heavier going, but that's okay; stick with it, and you can learn Python as you go.

Is there any point in learning deep learning? Is it any good at stuff? If you are hoping to build a brain, that is, an AGI, I cannot promise we're going to help you with that. And AGI stands for artificial general intelligence.
Thank you. What I can tell you, though, is that in all of these areas, deep learning is the best-known approach to at least many versions of these problems. So it is not speculative at this point whether this is a useful tool; it's an extremely useful tool in lots and lots of places, and in many of these cases it is equivalent to or better than human performance, at least according to some particular, narrow definition of the things humans do in these areas. So deep learning is pretty amazing. If you want to, pause the video here, have a look through, try to pick out some things that look interesting to you, and type that keyword plus "deep learning" into Google; you'll find lots of papers and examples.

Deep learning comes from a background of neural networks. As you'll see, deep learning is just a type of neural network learning, a deep one; we'll describe exactly what that means later. Neural networks are certainly not new. They go back at least to 1943, when McCulloch and Pitts created a mathematical model of an artificial neuron and got very excited about where that could lead. Then in the fifties, Frank Rosenblatt built on top of that. He made some subtle changes to that mathematical model, and he thought that with these subtle changes we could "witness the birth of a machine that is capable of perceiving, recognizing and identifying its surroundings without any human training or control". He oversaw the building of an extraordinary thing, the Mark I Perceptron, at Cornell; I think this picture is from 1961. Thankfully, nowadays we don't have to build neural networks by running actual wires between artificial neurons, but you can see the idea: a lot of connections going on. You'll hear the word "connection" a lot in this course, because that's what it's all about.

Then we had the first AI winter, as it became known. To a strong degree, it happened because an MIT professor named Marvin Minsky, together with Papert, wrote a book called Perceptrons about Rosenblatt's invention, in which they pointed out that a single layer of these artificial neuron devices couldn't learn some critical things; it was impossible for them to learn something as simple as the Boolean XOR operator. In the same book, they showed that using multiple layers of the devices would actually fix the problem, but people didn't notice that part of the book and only noticed the limitation. People basically decided that neural networks were going nowhere, and they largely disappeared for decades, until, in some ways, 1986. A lot happened in the meantime, but there was a big event in 1986: MIT released a two-volume book called Parallel Distributed Processing. In it, the authors described systems where you have a bunch of processing units, each with some state of activation, an output function, a pattern of connectivity, a propagation rule, an activation rule, and a learning rule, all operating in an environment; and they described how things that met these requirements could, in theory, do all kinds of amazing work. This was the result of many, many researchers working together as a group, and it resulted in a very, very important book. The interesting thing to me is that, as you go through this course, you can come back and look at this picture, and you'll see we are doing exactly these things; everything we're learning about is really how to do each of these eight things. It's interesting that they include the environment, because that's something data scientists very often ignore: you build a model, you've trained it, it's learned something, but what's the context it works in? We'll be talking about that quite a bit over the next couple of lessons as well.

So in the 80s, during and after this book was released, people started building in a second layer of neurons, avoiding Minsky's problem. In fact, it was shown to be mathematically provable that adding that one extra layer of neurons was enough to allow any mathematical function to be approximated to any level of accuracy with these neural networks. So that was the exact opposite of the Minsky result: provably, there's nothing these networks can't do. That was around when I started getting involved in neural networks. I was a little bit later, I guess, getting involved in the early 90s, and they were very widely used in industry then. I was using them for very boring things, like targeted marketing for retail banks; it tended to be big companies with lots of money that were using them. It was certainly true, though, that the networks were often too big or slow to be useful. They were useful for some things, but they never felt to me like they were living up to the promise.

Now, what I didn't know, and nobody I personally knew did either, was that researchers had actually shown thirty years earlier that to get practical, good performance, you need more layers of neurons. Even though, mathematically and theoretically, you can get as accurate as you want with just one extra layer, to do it with good performance you need more layers. When you add more layers to a neural network, you get deep learning. So "deep" doesn't mean anything mystical.
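The "one extra layer is enough" result mentioned above is the universal approximation theorem. Stated informally (this formulation is my gloss, not a quote from the lecture):

```latex
% Universal approximation (informal): for any continuous function f on a
% compact set K and any tolerance \varepsilon > 0, there exist N and
% parameters a_i, w_i, b_i such that a network with a single hidden layer
% is uniformly close to f:
\left| \, f(x) - \sum_{i=1}^{N} a_i \, \sigma\!\left( w_i \cdot x + b_i \right) \, \right| < \varepsilon
\qquad \text{for all } x \in K,
% where \sigma is a fixed non-polynomial activation function, such as a
% sigmoid.
```

The practical caveat is exactly the point made here: the theorem says such a network exists, not that it is efficiently sized or easy to train, which is why depth matters in practice.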
"Deep" just means more layers, more than just the one extra layer. Thanks to that, neural nets are now living up to their potential, as we saw in that earlier look at what deep learning is good at. I think we can now say that Rosenblatt was right: we have a machine that is capable of perceiving, recognizing and identifying its surroundings without any human training or control. I don't think there's anything controversial about that statement, based on the current technology.

So we're going to learn how to do that, and we're going to learn it in exactly the opposite way to probably all the other math and technical education you've had. We are not going to start with a two-hour lesson on the sigmoid function, or a study of linear algebra, or a refresher course on calculus. The reason is that people who study how to teach and learn have found that's not the right way to do it for most people. We work a lot from the research of Professor David Perkins of Harvard, and others working on similar ideas, who talk about "playing the whole game". It's based on a sports analogy: if you're going to teach somebody baseball, you don't take them into a classroom and start teaching them the physics of a parabola, how to stitch a ball, and a three-part history of a hundred years of baseball politics, and then ten years later let them watch a game, and twenty years later let them play one. That's kind of like how math education is done. Instead, with baseball, step one is to say: hey, let's go and watch some baseball. What do you think? That was fun, right? See that guy there? You try to run there before the other guy throws the ball over there.
Hey, do you want to try having a hit? Okay, so you're going to hit the ball, then I'll try to catch it, and then you run over there. So from step one, you are playing the whole game. Yeah, and just to add to that: when people start, they often may not have a full team or be playing a full nine innings, but they still have a sense of what the game is and the big-picture idea. Yeah. So there are lots and lots of reasons this helps most human beings. Not everybody: there's a small percentage of people who like to build things up from the foundations and the principles, and not surprisingly they are massively over-represented in a university setting, because the people who get to be academics are the people who thrive with the, to me, upside-down way things are taught. But outside of universities, most people learn best in this top-down way, where you start with the full context.

Step number two in these seven principles (I'm only going to mention the first three) is to make the game worth playing. If you're playing baseball, you have a competition: you score, you try to win, you bring together teams from around the community and have people try to beat each other, and you have leaderboards showing who's got the highest number of runs, or whatever. This is all about making sure that the thing you're doing, you're doing it properly: you're making it the whole thing, you're providing the context and the interest.

For the fast.ai approach to learning deep learning, what this means is that today we're going to train models end to end. We're going to actually train models.
And they won't just be crappy models; they will be state-of-the-art, world-class models, from today. And we're going to try to have you build your own state-of-the-art, world-class models, from either today or next lesson, depending on how things go.

Then number three in the seven principles from Harvard is to work on the hard parts, which is the idea of, as I say, deliberate practice. Working on the hard parts means that you don't just swing a bat at a ball every time you go out and muck around; you train properly. You find the bit you're least good at, you figure out where the problems are, and you work damn hard at it. In the deep learning context, that means we do not dumb things down. By the end of the course, you will have done the calculus, you will have done the linear algebra, you will have done the software engineering of the code. You will be practicing these things, which are hard, so it requires tenacity and commitment. But hopefully you'll understand why it matters, because before you start practicing something, you'll know why you need it: you'll be using it, say, to make your model better, so you'll have to understand the concept first.

For those of you used to a traditional university environment, this is going to feel pretty weird. A lot of people say that, after a year of studying fast.ai, they regret having spent too much time studying theory and not enough time training models and writing code. That's the number one piece of feedback we get from people who say they wish they'd done things differently. So please, since you're here, try as best you can to follow along with this approach.

We're going to be using a particular software stack. Sorry, I just want to say one more thing about the approach. Since so many of us spent so many years with a traditional, bottom-up educational approach, this can feel very uncomfortable at first. I still feel uncomfortable with it sometimes, even though I'm committed to the idea. Some of it is having to catch yourself and be okay with not knowing the details, which can feel very unfamiliar or even wrong when you're new to it: "Wait, I'm using something and I don't understand every underlying detail." But you kind of have to trust that we're going to get to those details later.

I can empathize with that. But I will tell you this: teaching this way is very, very, very hard. I very often find myself jumping back into a foundations-first approach, because it's just so easy to say "you need to know this, and this, and then you can know that". That's so much easier to teach. So I do find this much more challenging to teach, but hopefully it's worth it. We've spent a long, long time figuring out how to get deep learning into this format.

One of the things that helps us here is the software we have available. If you haven't used Python before, it's a ridiculously flexible, expressive, easy-to-use language. There are plenty of bits of it we don't love, but on the whole we love it, and, most importantly, the vast majority of deep learning practitioners and researchers are using Python. On top of Python, there are two libraries that most folks are using today: PyTorch and TensorFlow. There's been a very rapid change here. TensorFlow is what we were teaching until a couple of years ago, and what nearly everyone was using until a couple of years ago, but it got super bogged down. This other software, PyTorch, came along that was much easier to use and much more useful to researchers, and within the last twelve months the percentage of papers at major conferences that use PyTorch has gone from 20% to 80%, and vice versa: those that use TensorFlow have gone from 80% to 20%. So basically all the folks who are actually building the technology we're all using are now using PyTorch. Industry moves a bit more slowly, but in the next year or two you'll probably see a similar thing there.

Now, the thing about PyTorch is that it's super flexible; it really is designed for flexibility and developer-friendliness, and certainly not for beginner-friendliness. It doesn't have higher-level APIs, by which I mean there isn't really anything to make it easy to build stuff quickly. So to deal with that issue, we have a library called fastai that sits on top of PyTorch. fastai is the most popular higher-level API for PyTorch. Because our courses are so popular, some people are under the mistaken impression that fastai is designed only for beginners or for teaching. It is designed for beginners and teaching, as well as for practitioners in industry and for researchers. The way we make sure it's the best API for all of those people is that we use something called a layered API; there's a peer-reviewed paper that Sylvain and I wrote describing how we did that. For those of you who are software engineers, it will not be at all unusual or surprising: it's totally standard software engineering practice, basically lots of refactoring and decoupling, but these were practices that were not followed in any deep learning library we had seen. By using that approach, we've been able to build something with which you can do super low-level research, you can build state-of-the-art production models, and you can build super easy, beginner-friendly, but still world-class, models.

So that's the basic software stack. There are other pieces of software we'll learn about along the way, but the main thing to mention here is that it actually doesn't matter if you learn this software stack and then at work you need to use, say, TensorFlow and Keras; you'll be able to switch in less than a week. Lots and lots of students have done that, and it's never been a problem. The important thing is to learn the concepts, so we're going to focus on those concepts, and by using an API which minimizes the amount of boilerplate, you can focus on the bits that are important: the actual lines of code will correspond much more closely to the actual concepts you're implementing.

You are going to need a GPU machine. A GPU is a graphics processing unit, and specifically you need an NVIDIA GPU; other brands of GPU just aren't well supported by any deep learning library. Please don't buy one, and if you already have one, you probably shouldn't use it. Instead, use one of the platforms that we have already got set up for you. It's just a huge distraction to spend your time doing system administration on a GPU machine, installing drivers, and so on. And run it on Linux, please; that's what nearly everybody is doing. Make life easy for yourself: it's hard enough to learn deep learning without also having to learn all kinds of arcane hardware support issues. There are a lot of free options available, so please, please use them. If you're using an option that's not free, don't forget to shut down your instance. What's going to happen is that you'll spin up a server that lives somewhere else in the world, connect to it from your computer, and train and run and build models there. Just because you close your browser window doesn't mean your server stops running.
So don't forget to shut it down, because otherwise you're paying for it. Colab is a great system which is free; there's also a paid subscription version of it. Be careful with Colab, though: most of the other systems we recommend save your work for you automatically, and you can come back to it any time, but Colab doesn't, so be sure to check out the Colab platform thread on the forums to learn about that.

So, I mentioned the forums. The forums are really, really important, because that is where all the discussion and setup and everything happens. For example, if you want help with setup, there's a setup help thread where you can find out how best to set up Colab, see discussions about it, and ask questions. And please remember to search before you ask your question, because it's probably been asked before, unless you're one of the very earliest people doing the course.

So step one is to get your server set up by following the instructions from the forums or from the course website; the course website has lots of step-by-step instructions for each platform. They vary in price, speed, availability, and so forth. Once you've finished following those instructions, the last step will show you something like this: the course-v4 folder, version 4 of our course. By the time you see this video, it's likely to have more stuff in it, but it will have an "nbs" folder, standing for notebooks. You can click on that, and it will show you all of the notebooks for the course. What I want you to do is scroll to the bottom, find the one called app_jupyter, and click on it. This is where you can start learning about Jupyter Notebook. What is Jupyter Notebook?
Jupyter Notebook is something where you can start typing things, press Shift+Enter, and it will give you an answer. The thing you're typing is Python code, and the thing that comes out is the result of that code. You can put in anything in Python: x = 3 * 4, then x + 1, and as you can see, it displays a result any time there's a result to display. For those of you who have done a bit of coding before, you will recognize this as a REPL: a read-evaluate-print loop. Most languages have some kind of REPL. The Jupyter Notebook REPL is particularly interesting because it has things like headings, graphical outputs, and interactive multimedia. It's a really astonishing piece of software; it's won some really big awards, and I would guess it's the most widely used REPL outside of shells like Bash. It's a very powerful system, and we love it. We've written our whole book in it, we've written the entire fastai library with it, and we do all our teaching with it.

It is extremely unfamiliar to people who have done most of their work in an IDE. You should expect it to feel as awkward as, perhaps, the first time you moved from a GUI to a command line. It's different. So if you're not familiar with REPL-based systems, it's going to feel super weird, but stick with it, because it really is great.

The model here is that this web page I'm looking at lets me type in things for a server to run, and shows me the results of the computations the server is doing. The server is off somewhere else; it's not running on my computer. The only thing running on my computer is this web page. But as I do things, for example if I say x = x * 3, I'm updating the server's state: there's a current value of x on the server, and I can find out what it is. Now x is something different. And you can see that when I ran this line here, it didn't change the earlier x + 1 output.
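The behaviour being demonstrated can be sketched in plain Python (this is my illustration, not code from the course notebooks): each evaluation updates shared state, and results that were already printed are not retroactively updated.

```python
# A minimal sketch of REPL-style evaluation: state is shared across cells,
# and each printed result reflects the state at the moment it was computed.
x = 3 * 4
print(x + 1)   # prints 13: the value of x + 1 right now

x = x * 3      # updates the shared state; x is now 36
print(x)       # prints 36; the earlier "13" output does not change
```

Running these lines as four notebook cells behaves the same way: re-running the second cell after the third would print 37, because only the state matters, not the order the cells appear on screen.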
So that means that when you look at a Jupyter notebook, it's not showing you the current state of your server; it's just showing you what that state was at the time you printed that thing out. It's just like using a shell such as Bash: if you type ls and then delete a file, the earlier ls output doesn't go back and change. That's how REPLs generally work, including this one.

Jupyter Notebook has two modes. One is edit mode, which is when I click on a cell and get a flashing cursor that I can move left and right while typing. There are not very many keyboard shortcuts in this mode; one useful one is Ctrl+/ (or Cmd+/), which comments and uncomments code. The main one to know is Shift+Enter, which actually runs the cell. At that point, there's no flashing cursor any more, and that means I'm now in command mode, not edit mode. So as I go up and down, I'm selecting different cells. In command mode, as we move around, we're selecting cells, and there are now lots of keyboard shortcuts you can use; if you hit H, you get a list of them. You'll see that, on the whole, they're not Ctrl or Cmd combined with something; they're just a letter on its own, so if you use something like Vim, you'll be more familiar with this idea. For example, if I hit C to copy and V to paste, it copies and pastes the cell, or X cuts it, and A adds a new cell above. Then I can press the various number keys to create a heading, so 2 will create a heading level two. And as you can see, I can actually type formatted text, not just code; the formatted text I type is in Markdown, like so. There you go. If you haven't used Markdown before, it's a super useful way to write formatted text that is used very, very widely, so learn it, because it's super handy, and you need it for Jupyter. So when you look at our book notebooks, for example, you can see an example of all the kinds of formatting
and code here. So you should go ahead and work through the app_jupyter notebook. You can see there how to create plots, for example, and create lists of things, import libraries, display pictures, and so forth. If you want to create a new notebook, you can just go New, then Python 3, and that creates a new notebook, which by default is just called "Untitled". You can then rename it to give it whatever name you like, and you'll see it in the list here under its new name. The other thing to know about Jupyter is that it's a nice, easy way to jump into a terminal, if you know how to use one; you certainly don't have to for this course, at least for the first part. If I go New, then Terminal, you can see here that I have a terminal.

One thing to note: the notebooks are attached to a GitHub repository. If you haven't used GitHub before, that's fine; basically, they're attached to a server where, from time to time, we will update the notebooks, and you'll see on the course website and in the forums how to make sure you have the most recent versions. When you grab our most recent version, you don't want it to conflict with or overwrite your changes. So as you start experimenting, it's not a bad idea to select a notebook, click Duplicate, and then do your work in the copy. That way, when you get an update of our latest course materials, it won't interfere with the experiments you've been running.

So there are two important repositories to know about. One is the fastbook repository, which we saw earlier; that's the full book, with all the outputs and prose and everything. The other one is the course-v4 repository, and here is the exact same notebook from the course-v4 repository. For this one, we remove all of the prose, all of the pictures, and all of the outputs, and just leave behind the headings and the code. In this case, you can see some
outputs, because I just ran that code, but for most of it there won't be any. Oh, no, I guess we have left the outputs in. I'm not sure if we'll keep that or not, so you may or may not see the outputs. The idea is that this is probably the version you want to be experimenting with, because it kind of forces you to think about what's going on as you do each step, rather than just reading and running without thinking. We want you to work in this more bare environment, thinking about what the book said and why each thing is happening. And if you forget, you can go back to the book.

The other thing to mention is that both the course-v4 version and the fastbook version have a questionnaire at the end. Quite a few folks have told us, among the reviewers and others, that they actually read the questionnaire first. Sylvain and I spent many, many weeks writing the questionnaires, and the reason is that we tried to think hard about what we want you to take away from each notebook. So if you read the questionnaire first, you can find out which things we think are important: what you should know before you move on.
So rather than having a summary section at the end saying "by the end of this you should know blah blah blah", we instead have a questionnaire that does the same thing. Please make sure you do the questionnaire before you move on to the next chapter. You don't have to get everything right, and most of the time answering a question is as simple as going back to that part of the notebook and reading the prose. But if you've missed something, do go back and read it, because these are the things we're assuming you know; if you don't know them before you move on, it could get frustrating. Having said that, if you get stuck after trying a couple of times, do move on: do two or three more chapters and then come back. Maybe by the time you've done a couple more chapters you'll have some more perspective, and we try to re-explain things multiple times in different ways. So it's okay if you've tried and you're stuck; you can try moving on.

All right, let's try running the first part of the notebook. Here we are in 01_intro, so this is chapter one, and here is our first cell. I click on the cell; by default there will actually be a header and a toolbar, as you can see. You can turn each of them on and off, and I always leave them off myself. To run the cell you can either click the Run button or, as I mentioned, hit Shift+Enter. For this one I just click, and as you can see this star appears, which says "I'm running", and now a progress bar pops up; it's going to take a few seconds. As it runs, it's going to print out some results. Don't expect to get exactly the same results as us: there's some randomness involved in training a model, and that's okay. Don't expect to get exactly the same time as us either. But if this first cell takes more than five minutes, then unless you have a really old GPU, that's probably a bad sign.
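(On that randomness point: the reason two runs differ is that training draws on pseudo-random number generators. Here is a minimal pure-Python sketch of the idea, using only the standard library's `random` module; the "results" are made-up numbers, not real training output:)

```python
import random

def make_results(seed):
    """Simulate a 'training run': after seeding the generator,
    the pseudo-random draws it makes are completely repeatable."""
    rng = random.Random(seed)
    return [round(rng.random(), 4) for _ in range(3)]

# Same seed: identical "results". Different seeds: different results.
assert make_results(42) == make_results(42)
assert make_results(0) != make_results(1)
```

Deep learning frameworks have several generators in play at once (Python's, NumPy's, the GPU's), which is why results usually vary from run to run unless you go out of your way to seed all of them; for this course, close is good enough.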
You might want to hop on the forums and figure out what's going wrong; or maybe you're trying to use Windows, which really doesn't work very well for this at the moment. Don't worry that we don't know what all the code does yet; we're just making sure that we can train a model. So here we are, it's finished running, and as you can see it's printed out some information: in this case it's showing me an error rate of 0.005 at doing... something. What is the something it's doing? What it's doing here is grabbing a dataset we call the Pets dataset, which is a dataset of pictures of cats and dogs, and it's trying to figure out which ones are cats and which ones are dogs. And as you can see, after less than a minute, it's able to do that with a 0.5 percent error rate, so it can do it pretty much perfectly. So we've trained our first model. We have no idea how; we don't know what we were doing; but we have indeed trained a model, and that's a good start. And as you can see, we can train models pretty quickly on a single computer, many of which you can get for free.

One more thing to mention: it doesn't matter whether you have Windows or Mac or Linux in terms of what's running in the browser, but if you have a Mac, please don't try to use its GPU. Apple doesn't even support NVIDIA GPUs anymore, so that's really not going to be a great option; stick with a Linux server, and it'll make life much easier for you.

Well, actually, the first thing we should do is try it out. I claim we've trained a model that can pick cats from dogs; let's make sure it can. Check out this cell. This is interesting, right? We've created a widgets FileUpload object and displayed it, and it's showing us a clickable button. As I mentioned, this is an unusual REPL: we can even create GUIs in it.
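(A quick aside on the number the training printed: the error rate is just the fraction of held-out images the model labels wrongly. fastai computes this metric for you; here is a minimal pure-Python sketch with made-up predictions, just to show what the number means:)

```python
def error_rate(preds, targets):
    """Fraction of predictions that don't match the labels."""
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)

# 1 mistake out of 200 held-out images gives the 0.005 seen in the lesson.
preds   = ["cat"] * 199 + ["dog"]
targets = ["cat"] * 200
print(error_rate(preds, targets))  # → 0.005, i.e. 0.5 percent
```

Now, back to that upload button.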
So if I click on this file upload button, I can pick a cat (there we go), and I can now turn that uploaded data into an image. There's a cat. And now I can call predict, and: it's a cat, with 99.96 percent probability. So we've just uploaded an image we picked and classified it. You should try this: grab a picture of a cat, find one on the internet, or go and take a photo of one yourself, and make sure you get a photo of a cat. This is something that can recognize photos of cats, not line drawings of cats. As we'll see in this course, these kinds of models can only learn from the kinds of information you give them, and so far we've only given it, as you'll discover, photos of cats. Not anime cats, not drawn cats, not abstract representations of cats. Just photos.

So we're now going to look at what's actually happened here. You'll see that at the moment I'm not getting some great information here; if you see this in your notebooks, go File, then Trust Notebook. That tells Jupyter it's allowed to run the code necessary to display things, after checking that there aren't any security problems, and you'll then see the outputs. Sometimes you'll also see some weird code like this. It's code that creates outputs; sometimes we hide it, sometimes we show it. Generally speaking, you can just ignore stuff like that and focus on what comes out.

I'm not going to go through these cells one by one; instead, let's look at the same thing over on the slides. What we're doing here is machine learning. Deep learning is a kind of machine learning. What is machine learning? It's, just like regular programming, a way to get computers to do something. But in this case, it's pretty hard to see how you would use regular programming to recognize dog photos from cat photos. How do you create the loops and the variable assignments and the conditionals to make a program that recognizes dogs versus cats in photos? It's super hard. So hard that, until the deep learning era, nobody really had a model that was remotely accurate at this apparently easy task, because we can't write down the steps necessary. Normally we write a function that takes some inputs, goes through our program, and gives us some results; this general idea, where the program is the steps we write down, doesn't seem to work for things like recognizing pictures.

So back in 1949, somebody named Arthur Samuel started trying to figure out a way to solve problems like recognizing pictures of cats and dogs, and in 1962 he described a way of doing it. First, he described the problem: programming a computer for these kinds of computations is "at best a difficult task", because of the need to spell out every minute step of the process in exasperating detail. Computers are giant morons, as all of us coders totally recognize. So, he said, let's not tell the computer the exact steps; let's give it examples of the problem and have it figure out how to solve it itself. And by 1961 he had built a checkers program that beat the Connecticut state champion, not by telling it the steps to take to play checkers, but by doing this: arranging "an automatic means of testing the effectiveness of a weight assignment in terms of actual performance, and a mechanism for altering the weight assignment so as to maximize the performance".

That sentence is the key thing, and it's a pretty tricky sentence, so let's spend some time on it. The basic idea is this. Instead of inputs going into a program and producing outputs, let's have inputs going into a model (let's call the program a "model" now, but it's the same basic idea) along with a second thing called weights, producing results. So the model is something that creates outputs based not only on, for example, the state of a checkers board, but also on some set of weights, or parameters, that describe how that model is going to work. The idea is this: suppose we could enumerate all the possible ways of playing checkers, and describe each of those ways using some set of parameters (or, as Arthur Samuel called them, weights). Then suppose we had a way of checking how effective "a current weight assignment is in terms of actual performance"; in other words, does that particular strategy for playing checkers end up winning or losing games? And then a way to "alter the weight assignment so as to maximize the performance": try increasing or decreasing each one of those weights a little, one at a time, to find out if there's a slightly better way of playing checkers, and do that lots and lots of times. Then, eventually, such a procedure "could be made entirely automatic", and "the machine so programmed would learn from its experience".

This little paragraph is the thing. This is machine learning: a way of creating programs such that they learn, rather than being programmed. If we had such a thing, we'd have something that looks like this: inputs and weights going into a model, creating results (you won, or you lost), then a measurement of performance (remember, that was the first key step), and then the second key step: a way to update the weights based on the measure of performance. Loop through that process, and you can train a machine learning model. So that's the abstract idea. After we've run that for a while, it's come up with a set of weights that's pretty good, and we can now forget the way it was trained. We have something that's just like the original diagram, except the word "program" is replaced with the word "model": a trained model can be used just like any other computer program. So the idea is that we've built a computer program not by writing out the steps necessary to do the task, but by training it to learn to do the task, at the end of which it's just another program. Using a trained model as a program to do a task, such as playing checkers, is what's called inference.

So machine learning is training programs, developed by allowing a computer to learn from its experience rather than through manually coding the steps. Okay; how would you do this for image recognition? What is the model, and the set of weights, such that as we vary them, it gets better and better at recognizing cats versus dogs? For checkers, it's not too hard to imagine how you could enumerate strategies: depending on how far away the opponent's piece is from your piece, what should you do in that situation? How should you weight defensive versus aggressive play? And so on. It's not at all obvious how you would do that for image recognition. What we really want is some function in there which is so flexible that there's a set of weights that could cause it to do anything: the world's most flexible possible function. And it turns out there is such a thing. It's a neural network. We'll be describing exactly what that mathematical function is in the coming lessons; to use it, it doesn't really matter exactly what the mathematical function is. It's a function which is, as we say, parameterized by some set of weights: as I give it a different set of weights, it does a different task. And it can do essentially any task. Something called the universal approximation theorem tells us that, mathematically provably, this functional form can solve any solvable problem to any level of accuracy, if you just find the right set of weights. Which is kind of restating what we described earlier, about how we deal with the Marvin Minsky problem: neural networks are so flexible that, if you can find the right set of weights, they can solve any problem, including "is this a cat, or is it a dog?".

So that means you need to focus your effort on the process of training them: that is, finding good weights, good "weight assignments", to use Arthur Samuel's terminology. How do you do that? We want a completely general way to update the weights based on some measure of performance, such as how good the model is at recognizing cats versus dogs. And luckily, such a thing exists: it's called stochastic gradient descent, or SGD. Again, we'll look at exactly how it works, and we'll build it ourselves from scratch, but for now we don't have to worry about it. I will tell you this, though: neither SGD nor neural nets are at all mathematically complex. They are nearly entirely addition and multiplication. The trick is just that they do a lot of them, like billions, so many more than we can intuitively grasp, and that's how they can do extraordinarily powerful things. But they're not rocket science at all.
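To make that "addition and multiplication" point concrete, here is a tiny from-scratch sketch of gradient descent, the simpler deterministic cousin of SGD, finding the weight that minimizes a made-up loss. The quadratic loss, the starting weight, and the learning rate are all invented for illustration; real training does the same kind of update over millions of weights at once:

```python
def loss(w):
    # A made-up loss: squared distance from the best value, 3.0,
    # which the optimizer does not know in advance.
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss above: d/dw (w - 3)^2 = 2 * (w - 3).
    return 2.0 * (w - 3.0)

w = 0.0    # start from an arbitrary weight
lr = 0.1   # learning rate: how big a step to take each time
for _ in range(100):
    w -= lr * grad(w)   # nudge the weight downhill on the loss

print(round(w, 3))  # → 3.0: it has found the loss-minimizing weight
```

Each update is just a multiply and a subtract; the "stochastic" in SGD simply means the gradient is estimated from a random mini-batch of data instead of computed exactly.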
They're not complex things, and we'll see exactly how they work. Right, so that's the Arthur Samuel version. Nowadays we don't use quite the same terminology, but we use exactly the same idea. That function that sits in the middle, we call an architecture: the architecture is the functional form, the function whose weights we adjust to get it to do something. (Sometimes people say "model" to mean the architecture, so don't let that confuse you too much, but really the right word is architecture.) We don't call them weights; we call them parameters. "Weights" has a specific meaning: it's quite a particular kind of parameter. The things that come out of the model (that is, the architecture with its parameters) we call predictions, and the predictions are based on two kinds of inputs: the independent variables, which are the data, like the pictures of the cats and the dogs; and the dependent variables, also known as labels, which are the things saying "this is a cat", "this is a dog". So the results are predictions. The measure of performance, to use Arthur Samuel's phrase, is known as the loss; the loss is calculated from the labels and the predictions; and then there's the update back to the parameters. So this is the same picture as we saw before, just with the words we use today. If you forget what I mean when I say "these are the parameters used with this architecture to create a model", you can go back and remind yourself: what are the parameters? What are the predictions? What is the loss? The loss is some function that measures the performance of the model, in such a way that we can update the parameters. It's important to note that deep learning and machine learning are not magic, right?
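In fact, the whole vocabulary above (architecture, parameters, independent and dependent variables, predictions, loss, update) fits in a few lines of code. Here is a toy sketch with a deliberately trivial "architecture", a straight line; the data and every number in it are made up for illustration:

```python
def architecture(x, params):
    """A toy architecture: a straight line, parameterized by slope and bias."""
    slope, bias = params
    return slope * x + bias

def loss(preds, labels):
    """Mean squared error: measures performance from predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)

# Independent variable (the data) and dependent variable (the labels):
# the labels happen to follow y = 2x + 1, which the model must discover.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

params = [0.0, 0.0]   # the parameters start out knowing nothing
lr = 0.05
for _ in range(2000):
    preds = [architecture(x, params) for x in xs]
    # Gradients of the loss with respect to each parameter...
    g_slope = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    g_bias  = sum(2 * (p - y)     for p, y    in zip(preds, ys))     / len(xs)
    # ...and the update step that nudges each parameter to reduce the loss.
    params[0] -= lr * g_slope
    params[1] -= lr * g_bias

print([round(p, 2) for p in params])  # → [2.0, 1.0]: slope 2, bias 1 recovered
```

Swap the straight line for a neural network and the hand-written gradients for automatic differentiation, and this loop is, conceptually, the training loop inside calls like `fine_tune`.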
The model can only be created where you have data showing examples of the thing you're trying to learn about, and it can only learn to operate on the patterns seen in the inputs used to train it. So if we don't have any line drawings of cats and dogs, there's never going to be an update to the parameters that makes the architecture (and the architecture plus the parameters together is the model, so we'll just say "the model") better at predicting line drawings of cats and dogs. It never gets those weight updates, because it never receives those inputs.

Notice also that this learning approach only ever creates predictions. It doesn't tell you what to do about them, and that's going to be very important when we think about things like recommendation systems. What product do we recommend to somebody? Well, we don't do that: we can predict what somebody will say about a product after we've shown it to them, but we're not creating actions, we're creating predictions. That's a super important difference to recognize.

It's also not enough just to have examples of input data, like pictures of dogs and cats; we can't do anything without labels. Very often, organizations that say "we don't have enough data" mostly mean "we don't have enough labeled data". If a company is trying to do something with deep learning, it's often because they're trying to automate or improve something they're already doing, which means, by definition, that they have data about that thing, or a way to capture data about it, because they're doing it. But often the tricky part is labeling it. For example, in medicine, if you're trying to build a model for radiology, you can almost certainly get lots of medical images of just about anything you can think of, but it might be very hard to label them according to the malignancy of a tumor, or whether or not a meningioma is present, or whatever, because those kinds of labels are not necessarily captured in a structured way, at least in the US medical system. So that's an important distinction that really impacts your strategy here.

Then, as we saw from the PDP book, a model operates in an environment: you roll it out and you do something with it, and this piece of that PDP framework is super important. Say you have a model that's actually doing something. For example, you've built a predictive policing model; this is something a lot of jurisdictions in the US are using. It doesn't recommend actions; it predicts where an arrest might be made. Now, it's predicting that based on data, based on labeled data, and in this case that's US data in which, depending on whether you're Black or white, arrest rates differ: Black people in the US, I think, get arrested something like seven times more often for, say, marijuana possession than white people, even though the actual underlying rate of marijuana use is about the same in the two populations. So if you start with biased data and build a predictive policing model, its prediction will say "you will find somebody you can arrest here", based on that biased data. Law enforcement officers might then decide to focus their police activity on the areas where those predictions occur, as a result of which they'll find more people to arrest there, and they'll feed that back into the model, which will now find even more people to arrest in the Black neighborhoods, and thus it continues. This would be an example of a model interacting
with this environment to create something called a positive feedback loop, where the more a model is used, the more biased the data becomes, making the model even more biased, and so forth. So one of the things to be super careful about with machine learning is recognizing how a model is actually being used, and what kinds of things might happen as a result.

I was just going to add that this is also an example of proxies, because here arrest is being used as a proxy for crime, and I think that in pretty much all cases, the data that you actually have is a proxy for some value that you truly care about. That difference between the proxy and the actual value often ends up being significant.

Thanks, Rachel; that's a really important point. Okay, so let's finish off by looking at what's going on with this code. The code we ran is basically six lines of code. The first line is an import line: in Python, you can't use an external library until you import from it. Normally in Python, people import just the functions and classes they need from a library, but Python does provide a convenient facility where you can import everything from a module, by putting a star there. Most of the time this is a bad idea, because by default, when you say import star, Python doesn't only import the interesting and important things in the library you want something from; it also imports things from all the libraries it used, and all the libraries they used, and you end up exploding your namespace in horrible ways and causing all kinds of bugs. Because fastai is designed to be used in this REPL environment, where you want to do a lot of quick, rapid prototyping, we actually spent a lot of time figuring out how to avoid that problem, so that you can import star safely. Whether you do this or not is entirely up to you, but rest assured that if you import star from a fastai library, it has been explicitly designed so that you only get the bits you actually need.

One thing to mention: in the video you'll see it called fastai2. That's because we recorded using a pre-release version; by the time you're watching the online MOOC version, the "2" will be gone. Something else to mention: as I speak, there are four main predefined applications in fastai, namely vision, text, tabular, and collaborative filtering. We'll be learning about all of them, and a lot more. For each one, say vision, you can import from the `.all` module (a kind of meta-module, I guess we could call it), and that gives you all the stuff you need for the most common vision applications. So if you're using a REPL system like a Jupyter notebook, it gives you everything you need right there, without having to go back and figure it out.

One of the issues with this is that a lot of Python users, when they see something like `untar_data`, would figure out where it comes from by looking at the import line, and if you import star you can't do that anymore. The good news is that in a REPL you don't have to: you can literally just type the symbol, press Shift+Enter, and it will tell you exactly where it came from, as you can see. So that's super handy. In this case, for example, to actually build the dataset, we called `ImageDataLoaders.from_name_func`, and I can call this special `doc` function to get the documentation for it. As you can see, it tells me exactly what to pass in, what all the defaults are, and, most importantly, not only what it does: "Show in docs" pops me over to the full documentation, including an example. Everything in the fastai documentation has an example, and the cool thing is that the entire documentation is written in Jupyter notebooks. That means you can actually open the Jupyter notebook for any documentation page, run the lines of code yourself, see them actually working, look at the outputs, and so forth. Also in the documentation you'll find a bunch of tutorials. For example, the vision tutorial covers lots of things, and one of them, as you can see, is pretty much the same kind of stuff we're looking at in lesson one. So there's a lot of documentation in fastai; taking advantage of it is a very good idea, and it's fully searchable. And as I mentioned, perhaps most importantly, every one of these documentation pages is also a fully interactive Jupyter notebook.

Looking through more of this code: the first line after the import uses `untar_data`, which will download a dataset, decompress it, and put it on your computer. If it's already downloaded, it won't download it again; if it's already decompressed, it won't decompress it again. And as you can see, fastai has predefined access to a number of really useful datasets, such as this Pets dataset. Datasets are, as you can imagine, a super important part of deep learning; we'll be seeing lots of them, and they're created by lots of heroes who basically spend months or years collating data that we can use to build these models.

The next step is to tell fastai what this data is. We'll be learning a lot about that, but in this case we're basically saying: okay, it contains images, and they're in this path (`untar_data` returns the path where the data was decompressed to, or, if it was already decompressed, where it was previously decompressed to). We have to tell it things like which files in that path are actually the images. And one of the really interesting arguments is the label function: how do you tell, for each file, whether it's a cat or a dog? If you look at the README for the original dataset, it uses a slightly quirky convention: anything where the first letter of the filename is uppercase is a cat. That's what they decided. So we just created a little function here called `is_cat` that returns whether or not the first letter is uppercase, and we tell fastai that's how you tell if it's a cat. (We'll come back to the other two arguments in a moment.)

So now we've told it what the data is; we then have to create something called a learner. A learner is the thing that learns: it does the training. So you have to tell it what data to use, and then what architecture to use. I'll be talking a lot about this in the course, but basically there are a lot of predefined neural network architectures with various pros and cons, and for computer vision, the architectures called ResNets are just a super great starting point, so we're going to use a reasonably small one of them. These are all predefined and set up for you. Then you can tell fastai what you want it to print out as it trains; in this case we're saying, "tell us the error rate, please, as you train". And then we call this really important method called `fine_tune`, which we'll be learning about in the next lesson, and which actually does the training.

`valid_pct` does something very important: it grabs, in this case, 20 percent of the data (the proportion 0.2) and does not use it for training the model. Instead, it uses it for telling you the error rate of the model. So in fastai, this metric, the error rate, will always be calculated on a part of the data that the model has not been trained with. The idea here, and we'll talk a lot more about this in future lessons, is that we want to make sure we're not overfitting. Let me explain.
Let me explain Overfitting looks like this Let's say you're trying to create a function that fits all these dots Right a nice function would look like that right But you could also fit you can actually fit it much much more precisely with this function Look, this is going much closer to all the dots and this one is right So this is obviously a better function except as soon as you get outside where the dots are especially if you go off the edges It's it's obviously doesn't make any sense. So this is what you would call an overfit function So overfitting happens for all kinds of reasons. We use a model that's too big or we use not enough data We'll be we'll be talking all about it right but really The the craft of deep learning is all about creating a model That has a proper fit and the only way you know if a model has a proper fit is by seeing whether it works Well on data that was not used to train it And so we always set aside some of the data to create something called a validation set But the validation set is the data that we use not to touch it at all when we're training a model But we're only using it to figure out whether the model's actually working or not um One thing that sylvan mentioned in the book Is that one of the interesting things about studying fast ai is you learn a lot of Interesting programming practices. So I've been programming. 
I mean since I was a kid so like 40 years And sylvan and I'd both work really really hard to make python do a lot of work for us and to use You know programming practices which make us very productive and allow us to come back to our code years later and still understand it And so you'll see In our code we'll often do things that you might not have seen before And so we a lot of students who have gone through previous courses say they learned a lot about Coding and python coding and software engineering from the course Um, so yeah check, you know when you see something new check it out and feel free to ask on the forums if you're curious about why something was done that way um One thing to mention is uh, just like they mentioned like import star is something most python programmers don't do because most libraries Don't support doing it properly Um, we do a lot of things like that. We do a lot of things where we don't follow a traditional approach to python programming um Because I've used so many languages over the years I code not in a way that's specifically pythonic, but incorporates like ideas from lots of other languages and lots of other notations um And heavily customize our approach to python programming based on what works well for data science um That means that the code you see in fast ai is Not probably not gonna fit with the kind of style guides and normal approaches at your workplace if you use python there um, so obviously Uh, you should make sure that you fit in with your organizations programming practices rather than following ours Um, but perhaps in your own hobby work, you can follow ours and see if you find that Interesting and helpful or even experiment with that in your company if you're a manager and you're interested in doing so Okay, so to finish I'm going to show you something pretty interesting Which is have a look at this code untar data Image data loaders from name funk learner fine tune Untar data segmentation data loaders from label 
funk learner fine tune Almost the same code and this is built a model that does something Whoa totally different It's something which has taken images This is on the left. This is the label data It's got images with color codes to tell you whether it's a car Or a tree or a building or a sky or a line marking or a road And on the right is our model and our model has successfully figured out for each pixel Is that a car a line marking a road now it's only done it in Under 20 seconds, right? So it's a very small quick model. So it's made some mistakes like it's missing this line marking And some of these cars that thinks is House right, but you can see so if you train this for a few minutes, it's nearly perfect But you can see the basic idea is that we can very rapidly with almost exactly the same code create something Not that classifies cats and dogs, but does what's called segmentation figures out what every pixel image is Look, here's the same thing from import star text loaders from folder learner learn fine tune same basic code This is now something where we can give it a sentence And it can figure out whether that is expressing a positive or negative sentiment And this is actually giving a 93 accuracy on that task In about 15 minutes On the imdb dataset, which contains thousands of full length movie reviews Like 1000 to 3000 word movie reviews And this number here that we got with the same three lines of code Would have been the best in the world for this task in a very very very popular academics dataset in like 2015 I think So we are creating world-class models in our browser Using the same basic code Here's the same basic steps again from import star and tar data tabular data loaders from csv learner fit This is now Building a model that is predicting Let's find out salary Based on a csv table containing These columns so this is tabular data Here's the same basic steps from import star and tar data collab data loaders from csv learner learn fine tune This is 
now building something Which predicts for each combination of a user and a movie What rating? Do we think that user will give that movie based on What other movies they've watched and liked in the past? This is called collaborative filtering and used to recommendation systems So here you've seen some examples of each of the four applications in fast ai And as you'll see throughout this course the same basic code and also the same basic mathematical and software engineering concepts Allow us to do vastly different things Using the same basic approach And the reason why is because of Arthur Samuel. It's because of this basic description of What it is you can do if only you have a way to Parameterize a model and you have an update procedure which can update the weights to make you better at Your loss function And in this case we can use neural networks which are totally flexible functions so um That's it for this first lesson. It's a little bit shorter than Other lessons are going to be and the reason for that is that we are as I mentioned at the Start of a global pandemic here or at least in the west in other countries. They are much further into it So we spend some time talking about that at the start of the course and you can find that video elsewhere So in future lessons there will be more time on deep learning So what I suggest you do over the next week before you Work on the next lesson Is just make sure that you can spin up a gpu server that you can shut it down when it's finished that you can run all of the The code here and as you go through it see, you know, is this using python in a way you recognize? 
Use the documentation; use that `doc` function. Do some searching of the fastai docs, see what's there, and see if you can grab the fastai documentation notebooks themselves and run their cells. Just try to get comfortable, so that you know your way around, because the most important thing with this style of learning, this top-down learning, is to be able to run experiments, and that means you need to be able to run code. So my recommendation is: don't move on until you can run the code. Read the chapter of the book, and then go through the questionnaire. We've still got some more stuff to do about validation sets and test sets and transfer learning, so you won't be able to do all of it yet, but try to do all the parts you can, based on what we've seen in the course so far.

Rachel, anything you wanted to add before we go? Okay. So, thanks very much for joining us for lesson one, everybody, and I'm really looking forward to seeing you next time, when we will learn about transfer learning, and then move on to creating an actual production version of an application that we can put out on the internet, so you can start building apps that you can show your friends, and they can start playing with them. All right. Bye, everybody!