I'm Jeremy, this is Sylvain who helped me develop this course, and as a trainer I tend to talk about my own things. So hopefully you're here to learn about deep learning, otherwise you're in the wrong place. Don't ask me much about logistics because I don't know anything about them; that's somebody else's job. This is the basic deal for course logistics.

I wanted to say a couple of things about prerequisites. It really is a course for coders. If you're not a strong coder you can still do it, you'll just have to work harder. We're not going to teach you how to code in Python. So if you've never coded in Python but you've done two or three other languages, you'll pick it up super fast, it'll be fine. If you've done, I don't know, a little bit of Matlab scripting and that's about it, you're going to have to put in more effort. Have a look at the recordings of some of the previous classes to get a sense of the amount of coding chops expected, but really, you'll get out of it what you put into it.

The location: we're at PG&E right now, just a couple of blocks away from USF, but don't turn up here for the class because it's not here. The class is in a much bigger auditorium, because it's quite a popular course. However, there is a study group at USF every day throughout the course. It's not office hours, it's not tutoring, it's a study group, so you can come and hang out at USF. We particularly set it up because a lot of people fly in from overseas and are here the whole time, so it's a nice place for them to work from. I'll show you some examples of student projects that have come out of the course, and most of them have happened in the study group. It's a particularly good thing to do if you've put your job on hiatus during the course to focus on it full time; if that's you, you probably want to come along to the study group and get involved. It's really about doing projects and hanging out with other like-minded students.

Why learn deep learning? Well, deep learning is quite good at quite a lot of things. These are all things that deep learning is the best in the world at right now, and for many of them it's superhuman as well. I won't go through all of them, but basically for complex problems, particularly those involving some amount of pattern recognition and analogy making, deep learning tends to work very well, and it's used very widely in industry, scientific research and so forth. I saw this coming a while ago and got pretty excited about it, and also kind of nervous about it, because to me, when a big technology comes along which changes what's possible and changes how much people can do, it gives big opportunities, but it can also be a bit of a threat if it all ends up in the hands of a small enough group of people. So our mission is to get this tool into the hands of as many people as possible.

One of the things that stops people from getting into deep learning is a view that they can't do it, or that deep learning is not right for them. This is a list of reasons people tell me that they're not doing deep learning. None of them are true. As you'll see in the course, you can get great results with 30 or 40 images. We do nearly all of our work on a single computer. You can use deep learning for a really wide range of applications. And we're not going to be building a brain.
We're not claiming this is artificial general intelligence; it's not something we're going to talk about at all. We're just talking about deep learning as a tool to get stuff done.

There's a strong connection between the University of San Francisco and fast.ai. The University of San Francisco is the oldest university in San Francisco. The main campus is on the other side of town; this downtown campus is where the Data Institute lives, which is where all the data science stuff happens. How many of you are familiar with fast.ai? The courses you see on fast.ai all got recorded here at USF or at PG&E; they're all USF courses that get turned into MOOCs. One obvious question, then, is why do the in-person course when it'll be online in July? Well, A, you're going to be the first to see the material by quite a bit. And you're going to be doing it with a bunch of like-minded people: being in an in-person group of people who are all studying the same thing at the same time is a pretty different experience. It's interesting to see how many of the best students, who go on to do the most high-impact work, were in the in-person course. When you consider that two or three hundred people do the in-person course versus two or three hundred thousand who do the online course, that's striking.

fast.ai does a few things. One of them is education, but all of the things fast.ai does are about making deep learning accessible, as I described. As well as the education, there's an online community that we build and help develop, we do research, and we build software. All of these things are very connected to this course, so I'm going to bring them all together, because the stuff we do at USF is all deeply connected to the mission of fast.ai.

The community all happens on the forums. One of the really interesting things is that during the period the live course is going on, there's a whole other level of activity on the forums. One of the reasons is that this year we actually invited the top 800 participants from the forums to participate live in the course with you through a live stream. It's an invite-only thing where the best people from the community get to participate, and most of those folks are expert practitioners; many of them have published papers or have PhDs or whatever. The quid pro quo is that they help during the course, answering questions and expanding on things that people are interested in and so forth. So there's a huge uptick in activity on the forums during the in-person course. The category where that's happening is private, just for the people in the course, either the invited live-stream folks or the in-person folks, until the online version comes out in July. So it's kind of like your private study group of a thousand people around the world.

As I mentioned, the courses that get recorded here at USF get remixed into an online course. The online courses we've developed have been super popular: nearly a million hours of time spent around the world watching this material, over three million views. One of the reasons the course has been so popular is that it's kind of upside down compared to most technical teaching: you learn how to do things before you learn why those things work.
We describe that as the difference between a bottom-up teaching approach and a top-down teaching approach. Bottom-up is what most university technical material looks like: you start with something like addition, then subtraction, and gradually build up until, in your PhD program, you learn to do something useful. A lot of people who study math, for example, say they didn't actually get to appreciate the beauty of the subject until their PhD. There's another way to learn, which is top-down, which is how we learn music or baseball: you put an instrument in somebody's hands, get them playing music, and then gradually over the next ten years they learn about harmony and theory and history and whatever. So we teach deep learning more like how people teach music or sports. You get started using stuff right away, and this means you avoid the problems of the bottom-up approach. You have motivation from the start, because you're building useful stuff. You have context, and human brains love context, so you know why things are being done. And you understand which bits are important, because we don't teach the lower-level pieces until we need them.

One misunderstanding of the top-down approach is that some people think it's dumbed down, or has less theory and foundations. That couldn't be further from the truth, because with the top-down approach, as we peel the layers away we do eventually get to that core, so we actually end up seeing the math and the theory and so forth. Having said that, the math is not taught with Greek letters and math notation; the math is taught with code. Our view is, well, there are a couple of things. First, all the math ends up being turned into code anyway to actually get the computer to do something, so you may as well see it in the form that's going to be used. When it's shown in the form of code, you can experiment with it: you can put inputs into it and see the outputs come out, you can see what's going on. And also, why learn two whole separate languages and notations? If you know how to code, let's use that. So we teach the math that's necessary to actually understand the foundations. We introduce a bit of notation, because sometimes you just have to read a paper to see how something works, so we try to show how to make sense of papers. But the vast majority of the explanation is as code. I will say, here's a piece of math; now let's look at the code for that math and see how it maps.
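As a hypothetical illustration of that math-to-code mapping (this particular example isn't from the lecture): mean squared error, written first as notation, then as PyTorch code, where each symbol maps onto one piece of the expression.

```python
# MSE = (1/n) * sum_i (y_hat_i - y_i)^2
import torch

def mse(y_hat, y):
    # (y_hat - y) is the elementwise difference, **2 squares it,
    # and .mean() does the (1/n) * sum_i part
    return ((y_hat - y) ** 2).mean()

y_hat = torch.tensor([2.0, 3.0, 5.0])
y     = torch.tensor([2.5, 3.0, 4.0])
print(mse(y_hat, y))  # tensor(0.4167)
```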
One thing you might find interesting is to look at some student work. A while ago I put up a post on the forum saying, after lesson one, if you've made something interesting, let us know. I did this quite a while ago now, when there were a thousand replies; there are probably more like 2,000 now. Lots of people have posted to say, here's something I built. It's been really cool, because there are people all around the world doing the course, even the live course, because of the 800 I mentioned we're also live streaming it to, so you get all these interesting projects going on. One person built a recognition program to distinguish different types of Trinidad and Tobago masqueraders from regular islanders. Somebody did zucchinis versus cucumbers, and one of the interesting things there is it was only 47 cucumbers and 39 zucchinis, and they got, I think, 100% accuracy. So we don't need lots of data, even for things that are pretty subtle, like cucumbers versus zucchinis. This was an interesting one: predicting what city a satellite photo is of, which is something I doubt any of us humans could do, but a computer was doing it to 85% accuracy across 110 cities, which is amazing. There were people looking at Panamanian buses, batik cloth patterns, Tanzanian building conditions.

It turns out it's not at all uncommon, like every course there are lots of examples, for people to discover that they have a new state of the art, because deep learning still has not been applied to more things than it has been applied to. So whatever you come in here with an interest in, as a hobby or as your vocation or whatever, hopefully you can try these techniques on that thing. Suvash got a state-of-the-art result on Devanagari character recognition literally using the lesson one course material. Ethan got a state-of-the-art result on environmental sound classification. One of the interesting things here is that lesson one is all about image classification, but you can turn a lot of things into images: in this case he converted sounds into images representing those sounds, called spectrograms, then compared against the relevant paper to see what the state of the art was, and got the best result.

Elena is an example of somebody who took it to a whole other level. During the course she wrote three posts, all in the area of genomics. She's one of the top scientists at Human Longevity, and in every case she showed a significant advance over the state of the art in different genomics areas. You'll see there's a lot of writing here. You don't have to write, but a lot of students do. We encourage it, because trying to write the material down is a great way to develop your understanding of it. So we do talk a bit about writing, and a lot of students try their hand at writing about, as in Elena's case, a combination of deep learning and something you know about.

Some of the student projects go big. A good example is Jason Antic, who during the course created a project he called DeOldify. In lesson seven last year we showed a new approach to generative image models, where you take a picture as an input and create a new picture as an output. Jason thought, I wonder what would happen if I used that to take black-and-white input images and turn them into color output images, and as you can see, it worked amazingly. As of last week he's quit his job and has a new company; he's just sold the technology to the world's largest online ancestry site, and in the first week they had, I think, over a million people use his system to colorize photos of their relatives.

It's even created new communities of practitioners. A lot of radiologists have taken the course, folks like Alex, who went on to get together with a bunch of other folks and create a well-regarded paper. He also won a Kaggle competition on pneumonia detection, a very significant Kaggle competition. He's now widely regarded as one of the experts in the field of deep learning and medical imaging, even though he only recently finished being a resident.
It's been cool to see that there are lots of radiologists now, particularly younger folks like Alex, who are expert practitioners in deep learning and also deeply understand their field of radiology, and are bringing the two together to do some really powerful stuff.

Melissa Fabros did some super exciting work. She really pioneered the study of how facial recognition algorithms perform on people of color, and it turned out they didn't work real well. This was important because she was helping Kiva, which does microlending mainly in markets where there aren't that many white folks, so when they tried to use the algorithms that are out there, they didn't work very well. She won a million-dollar "AI for Everyone" challenge round. A lot of these people don't have traditional machine learning backgrounds: at the time, I think Melissa was doing a PhD in English literature, and as I mentioned, Alex was in radiology. Another alum was a lawyer, and he built a super impressive system for GitHub showing how you can search for code in English and get back code snippets.

Christine Payne wrote a neural net that created this music generator. After doing the fast.ai course, and she was actually in the in-person USF course, she went to OpenAI and became a resident there, and she wrote this, which went on to be performed by the BBC Philharmonic. Her background: that's her as a pianist. So here's a great example of people bringing their domain expertise together with their deep learning skills. It's important to realize that for folks like Christine, it's not like she did a single course and was instantly an expert. She worked very hard over a significant period of time, and is also a genius, which helps. But the point is that you absolutely can bring together domain expertise and a deep understanding of deep learning, which you can get through this USF course and the other fast.ai online courses and so forth, and put them together into something super cool.

A lot of our research ends up in the course, and a lot of that research actually happens during the course, particularly in the study group. For example, MIT Technology Review wrote about how a small team of student coders beat Google. They didn't just beat Google but also Intel, to create the fastest-ever training of ImageNet and CIFAR-10, two of the most important computer vision benchmarks in the world, in a competition called DAWNBench. It really was just like that: in the study group a few of us decided, hey, let's see if we can take a crack at this competition, and spent a couple of weeks giving it our best shot. Really the trick was that we didn't have the resources. Intel entered by combining a thousand servers into a mega-cluster and running on that. We had one computer, so we had to think: how do we beat a thousand computers with one computer? We had to be creative and thoughtful. We've had lots of examples of this, both from our own research projects and from our students: again and again, state-of-the-art results with far fewer resources than the big guys.

Another interesting example of research that came out of a course: a couple of years ago I thought it would be great to show the students how transferable these ideas are.
The first two lectures had been mainly about computer vision and images, so I wanted to show what would happen if you took those same ideas and applied them to text. It turned out nobody had really done that before, and I didn't know much about natural language processing, but I thought I'd give it a go, and within a few hours of trying it, I had a new state-of-the-art result on one of the most widely studied text classification datasets. Somebody who saw that course, who was then doing his PhD in this area, got in touch and offered to write it up into a paper, and we ended up publishing a paper together in the top venue for computational linguistics. It actually went on to help kickstart a new era in NLP, and even got written up in the New York Times. Today, this idea of using computer-vision-style transfer learning techniques in NLP is probably the most important NLP development going on at the moment.

Another area we've really focused on is making deep learning more accessible: less compute, less data, less specialist knowledge. One of the things that's been holding people back from using deep learning is that there are a lot of parameters to tweak and settings to get just right, like learning rates and regularization and optimization parameters and whatever. So one of the things we do is look for research that's already been done, but overlooked, that solves these problems. For example, the most important parameter to set is something called the learning rate, and we discovered there was a paper showing how to set it really well in about a minute, whereas previously people were using vast compute clusters to try out lots of different values. So we popularized that, put it into the software, and made it easy to use; it's called the learning rate finder. The other thing we do: these settings that we tweak are called hyperparameters, and we tried lots of different hyperparameters across lots of different datasets and found a set that just works nearly all the time, and we made them the defaults. So one of the things we show in the course is how to not waste your time and money doing stuff that you just don't have to do, because there are already well-known good settings, or easy ways to figure them out pretty quickly. It's a very practical approach.
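To make the learning rate finder concrete, here's a minimal sketch of what using it looks like in fastai v2 (assuming you already have a `learn` object, like the ones shown later in this overview; `lr_max=3e-3` is just an illustrative value you'd read off the plot):

```python
# Runs a short mock-training sweep over learning rates and plots loss vs. lr.
learn.lr_find()
# Pick a rate a bit below where the loss curve falls most steeply, then train:
learn.fit_one_cycle(3, lr_max=3e-3)
```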
One of the really interesting things about the course is the use of the fastai library. The fastai library sits on top of another library called PyTorch. In deep learning there are two libraries that pretty much everyone uses, PyTorch and TensorFlow. TensorFlow came out of Google, with vast resources, a few years ago, and we used to teach it in this course, but we got to a point where it wasn't flexible enough for what we wanted to show students, or for the research we wanted to do. Luckily, at that time a couple of years ago, a new library came out called PyTorch, which really just a couple of people wrote, so far fewer resources. Interestingly, it had a bit of a fastai feel to it: because they didn't have the resources, they had to be super careful and curate the best approach to each thing. We thought it was amazingly great, and we switched everything over to PyTorch. What's happened since then is that PyTorch has been taking over the world. The first place you take over the world is in research, because when the researchers flock to your software, all the new developments come out in your software, and anybody who wants to use them has to use your software. If we look at the last few academic conferences, in each case the percentage of papers using PyTorch is over 50%, and you can see that happened basically in one year. So you can clearly see that PyTorch is going to take over research, and then industry in the next year or two. There's no point learning what people used to use, and there's a reason this is happening: it's just much better. So we focus on that.

The only issue is that PyTorch is kind of a low-level plumbing library: you have to write everything yourself, which is no good for a top-down approach to getting stuff done. So we wrote our own library on top of it, called fastai, which just makes a lot of things much easier. fastai is now super popular in its own right: it's available on every major cloud platform, there's lots and lots of research coming out of it, and lots of Fortune 500 companies are using it. We often get messages like this one, from somebody saying: I just started using fastai, I used to use TensorFlow, and with the first thing I tried, everything was so much better; how could it be this much better, I thought deep learning was deep learning. And somebody else, the founder of a company, replied: yep, that's what we found; we used to use TensorFlow and spent months tweaking it, then we switched to fastai and immediately got better results. Now, the main thing we teach is the concepts, the understanding of what we're doing; having great software doesn't matter as much. But it sure is nice that when you do something correctly, you get a world-class result, rather than having to spend months fiddling around with things.

For example, when we compared the previous version of fastai to Keras, which is the main equivalent API on top of TensorFlow, using the code that Keras makes available for a particular problem, we found our error was about half the Keras error, our speed was about twice the Keras speed, and our lines of code were about one-sixth of the Keras lines of code. Lines of code really matter: those extra 26 lines are 26 things you have to tweak, change and choose, which is cognitive overhead you have to deal with, and if it doesn't work, which one of those extra lines was the problem, where did you make the mistake? When we used to teach Keras in the course, this would happen all the time: I would keep finding that things didn't work as well as I hoped, it would be days of figuring out why, and it would be one of those lines of boilerplate where something was true rather than false. Our view is that you shouldn't have to type more lines of code than necessary, but by the same token, everything should be infinitely customizable.

To get to that point, this course will be the first ever to show how to use fastai version 2, which officially comes out in July, so we'll be using the pre-release version. fastai version 2 is a rewrite from scratch. It's described fully in a peer-reviewed paper in the journal Information, and will be covered in an O'Reilly book, which you'll all be able to access for free during the course. It's a huge advance over anything that's come before. As I said, it's a rewrite from scratch, very much designed to be, well, we describe it in the paper as a layered API.
It's all about what happens when you take a coder's mentality to a deep learning library, and think hard about things like refactoring and separation of concerns and the stuff that software engineers really care about. So with fastai version 2 there's a lot of interesting stuff that you'll be the first to learn about and experiment with. For example, there are diagnostics that print out after training, creating a picture, a unique new kind of picture, showing you what's going on inside a network as it trains. I won't describe it fully here, but basically what this is showing is the first 1, 2, 3, 4 layers of a deep neural network, and the activations in that neural network are growing exponentially and then crashing, which is really bad. So there are these pictures you can get out of the training process to actually look inside and see what's going on. One of the nice things about this picture is that it was actually developed during the last course, in the study group: Stefano, one of the international visitors, from Italy, helped work through all the different ways we could build this visualization, and it's ended up in the library.

As I mentioned, it's a layered API, and the course will focus initially on the top layer, which we call the applications. These are the four things which are pretty well established as things deep learning is very good at; we know how to do them, we really know how to do them properly, and they should work out of the box each time. So we start there, then gradually delve into the mid-layer API, which is the flexible components the applications are built on, and then eventually we get to the foundation, the lowest-level plumbing that everything is built on top of.

For the vision application, for example, this is all the code you need with fastai to create a world-class classifier for recognizing pet breeds. As you can see, it takes 35 seconds to run, on a single computer, and it's 1, 2, 3, 4 lines of code. The lines of code don't matter so much, other than to point out that if you wanted to switch from an image classifier to something that can handle segmentation (segmentation is where you have a photo and you want to color-code every pixel to say what it's a pixel of, so here green is road, red is sidewalk, orange is a light pole and so forth), you can see it's again 1, 2, 3, 4 lines of code, nearly the same four lines, to do segmentation. And this show_batch, which visualizes the contents, is going to be the same each time too.
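For reference, here is roughly what those four lines for the pet-breed classifier look like in fastai v2. This is a sketch based on fastai's public examples, not the exact notebook shown on screen, so details may differ slightly:

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)                    # download and unpack the pets dataset
dls = ImageDataLoaders.from_name_re(            # build DataLoaders, labels from filenames
    path, get_image_files(path/"images"),
    pat=r'(.+)_\d+.jpg$', item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)  # transfer learning from ResNet-34
learn.fine_tune(1)                              # the quick training step described above
```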
Question: the color selection for those, is it random, chosen by the machine?

The color selection comes from something called a colormap. The plotting library we use is matplotlib, and matplotlib has a variety of colormaps; we just have a default colormap that tries to select colors that are nicely distinctive.

Same thing for text. This is how to get world-class results on sentiment analysis, and again it's the same lines of code, and again show_batch will show us what the text is and what the labels are. Tabular data, the stuff that's in spreadsheets or database tables, is something a lot of people don't realize deep learning actually works great on, and again it's basically the same lines of code. Here's one to predict who's going to be a high-income earner versus a low-income earner based on socioeconomic data. There's one extra step, which is that you have to say which columns are categorical and which are continuous, which we'll learn all about, but other than that it's the same steps. Then, very related, is collaborative filtering. Collaborative filtering is a really important technique for recommendation systems: figuring out who's going to be interested in buying this product or watching this movie, based on the past behavior of similar customers. And again, it's the same basic lines of code.

So we'll study all of those applications, and we'll learn how to use them in practice and make them work well. But they're built on a mid-tier API that you can mix and match. I won't show you the whole thing right now, but an example of this is something called the data block API. Say you want to do digit recognition. You've probably heard that for machine learning, for deep learning, a lot of the work is the data processing: getting the data into a form you can model. We realized there are basically four things you have to do to make that happen, so we created this data block API, where you list the four things separately. In this case, to do digit recognition, we say: the input type is a black-and-white image, as you can see; the output type is a category, which digit is it; they are image files; here's how you split into training and validation; and here's how you get the label. You just say each of those things, and these are all plain Python functions, so you can write your own Python code to replace any of them. Once you've done that, you have something called a data loader, which is a PyTorch concept; it's basically a thing you can train a model from. You can use a very similar-looking data block to do custom labeling, for example to label with multiple labels for satellite classification. Segmentation looks almost the same: instead of an image input and a category output, we have an image input and a mask output. If you're doing key points, in this case looking for the center of people's faces, it's almost the same thing again, but now we have an image input and a point output. So this mid-tier API is really about basic software engineering principles: building APIs with a nice, decoupled separation of concerns, which ends up in a situation where people can build what they need in a fast, customized, easy way.
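As a concrete sketch of that digit-recognition example, here is roughly what the four pieces look like in the fastai v2 data block API. This is based on fastai's public examples, using the small MNIST sample dataset bundled with the library, so it may differ from the notebook shown in class:

```python
from fastai.vision.all import *

path = untar_data(URLs.MNIST_TINY)   # small bundled sample of handwritten digits
dblock = DataBlock(
    blocks=(ImageBlock(cls=PILImageBW), CategoryBlock),  # input: B&W image; output: category
    get_items=get_image_files,                           # they are image files
    splitter=GrandparentSplitter(),                      # train/valid split from folder names
    get_y=parent_label)                                  # label = name of the parent folder
dls = dblock.dataloaders(path)       # the resulting data loaders you can train from
dls.show_batch()
```

Each argument is a plain Python function or object, which is why any of the four pieces can be swapped out for your own code.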
I'll show you one more example of this mid-tier API, which is something called optimizers. Optimizers are the things that actually train your model, and you'll be learning all about them. It turns out that optimizers are a big area of research interest right now, and in the last 12 months people have built some much better optimizers, ones that work much better than the basic approach called SGD. When researchers do that, they release papers and they release code. One of the important recent optimizers is something called AdamW. It's actually not that recent, it was a couple of years ago, but it took about two years for AdamW to get implemented in PyTorch, and in PyTorch it's all this code, because the software engineering work of refactoring had never happened. We realized, when we looked at lots and lots of papers, that you could refactor all of them into a small basic framework using callbacks. So this is the same thing: all this code gets turned into these three lines plus this little gray bit. That's the equivalent for us, and we had AdamW implemented the day after the paper came out. One of the cool things about working with fastai is that you often get to be the first to try out new research techniques, because they're so easy to implement; either we or somebody in the community will implement them.

A really cool example of this: Google implemented a new optimizer to reduce the time it took to train an important NLP language model from three days to 76 minutes, and they created this thing called the LAMB optimizer. In their paper, this is their algorithm, and in fastai, this is the algorithm, and you can map them basically line to line. One of the nice things, if you're not a math person, and I am not a math person, is that seeing the code next to the math helps you get more comfortable with the math. I actually presented this, fastai v2, to Google a few weeks ago, and it turned out one of the people in the room was one of the authors of the paper, and he was just so happy to see his ideas so nicely expressed in code. A lot of people are funny about the idea of having a small amount of code, as if that somehow decreases readability, but a small amount of code means you're expressing the actual thing you want to express. When the paper says compute this thing, there should be a line of code computing that thing; if it's 50 lines of code computing that thing, my brain can't cope. So fastai v2 is kind of designed for people with brains like mine, that can't cope with too much complexity.
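The lecture shows fastai's actual refactoring on screen; as a toy illustration of the idea only (hypothetical code, not fastai's implementation), here is how an optimizer can be assembled from small "stepper" callbacks, so that something like AdamW's decoupled weight decay becomes one extra function rather than a whole new optimizer class:

```python
import torch

def weight_decay(p, lr, wd, **kwargs):
    # decoupled weight decay, the key idea in AdamW: shrink the weights directly
    p.data.mul_(1 - lr * wd)

def sgd_step(p, lr, **kwargs):
    # plain SGD: move the parameters against the gradient
    p.data.add_(p.grad, alpha=-lr)

class ComposedOptimizer:
    "Applies a list of small 'stepper' callbacks to every parameter."
    def __init__(self, params, steppers, **hypers):
        self.params, self.steppers, self.hypers = list(params), steppers, hypers
    def step(self):
        for p in self.params:
            if p.grad is not None:
                for stepper in self.steppers:
                    stepper(p, **self.hypers)
    def zero_grad(self):
        for p in self.params:
            p.grad = None

# "SGD with decoupled weight decay" is now just a new composition of existing pieces:
model = torch.nn.Linear(2, 1)
opt = ComposedOptimizer(model.parameters(), [weight_decay, sgd_step], lr=0.1, wd=1e-2)
loss = model(torch.randn(4, 2)).pow(2).mean()
loss.backward()
opt.step(); opt.zero_grad()
```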
Okay, so that's the quick overview, and we've got 15 minutes for questions.

Question: I've watched several of the courses, and by the way, they're really good, but one of the parts I've struggled with is the math; trying to understand it is sometimes really hard. Is there anything you would recommend for the math?

For understanding the math, the best thing is actually part two, which I assume will be in October, but we already have a part two online. In the part two we already have on fast.ai, which was recorded as the USF part two, we implement, I don't know how many, several papers from scratch. For me, as I say, I'm not a math person; I studied philosophy, and even then not in any great depth, so my understanding of the math has basically come from this process of implementing papers. I find that when I read somebody else's code, and I read their paper, and I compare them, and then I implement it myself, I kind of get there. There is a kind of language to it that takes a while to pick up, just like programming, but it's less well defined: people change the way things are described, so sometimes you just have to stare at it for a while, or go and ask people. But it's the same ideas that are in the code, so it takes a certain amount of tenacity as well.

Question: how is this year's course different from the most recent course, and if you do the online course, do you get a head start, and if so, which videos should you do?

Each year the course is 100% new material, but each year it's trying to teach you the same stuff, which is how to be a good deep learning practitioner, and also to prepare you for the part two course, which is about becoming a world-class researcher, or at least getting the foundations for that. But each year the world changes enough that we think we can do a much better job with the things that have happened in the last 12 months. So doing the previous year's course helps a lot. Of the 800 people doing it live, probably most of them have done all of our previous courses; a lot of people do every year's course, partly because you get a different angle on the material. So yes, it's super helpful, particularly if you haven't got lots of Python experience, or if you haven't played around much with NumPy and the scientific programming libraries. The thing I'll say is that this course is unusual in the variety of people that do it. There are plenty of people who turn up two hours a week, don't do any of the assignments, and that's it, and they still come out of it learning stuff. They probably can't train a model much themselves and make it work, and if it breaks they wouldn't know how to fix it, because they haven't practiced, but they have a good sense, if they're a product manager or a CTO or whatever, of what the capabilities are, what the approach looks like, where people get stuck, where the constraints are. But a lot of people study parts one and two full time for a year; a lot of people take a year off to just do that, and you can certainly get into that depth as well. So you can decide how deep to go, and if you are interested in going deep, then studying the previous courses is certainly useful. One of the things we're doing differently this year is that we're also incorporating the key material from the Introduction to Machine Learning course, so we'll be learning in more detail about things like training versus validation sets, random forests, feature importance, stuff like that. That's one key difference: in the past, those were two separate courses.

Question: do you have any recommended readings we might want to look at prior to the course?

The best recommended reading would be the previous videos.
There isn't much else; I was trying to think what other readings I'd recommend. Actually, people have taken really great notes about previous courses, and they're linked from the courses; people have gone to a lot of effort to turn the lessons into notes. But there's not a lot of other great material out there on top-down, practical deep learning that we've found.

Question: are we going to use PyTorch3D?

PyTorch3D, no, there won't be any 3D stuff in Part 1; in Part 2 there will be. We do have a medical research initiative here which I chair, called WAMRI, so we have a lot of 3D medical data. It turns out that the vast majority of the time, the best thing to do is what we call 2.5D, which is where you basically treat the images largely separately and then combine them at the very end. The basic techniques to do that we'll learn in the Part 1 course, but actually putting them together would be more of a project you could do.

Question: I actually have two questions. The first one: I really love what you did with the abstraction in fastai, I think that's brilliant. However, generally when you do something like that, you have to give up something, and what I'm thinking is you give up fine-grained control. Is that fair to say?

No, it's not. Just like we have layers of abstraction in all the software we use, creating another layer of abstraction doesn't mean the other ones go away. In the previous version of fastai we didn't have this mid-layer, and that was a big problem, because the applications were written directly on the low-level foundations, so if you wanted to create a new application, for audio or 3D medical imaging or whatever, you would have to dive way down into those weeds. By adding this extra tier, and we've had the pre-release version available for a few months now, it's been amazing to see how much more stuff the community is already building. The other thing is that when you provide the applications tier, that makes everything more concise: you can get involved more quickly, you can understand what's going on by seeing, oh, these are the things you actually have to change from application to application and dataset to dataset, and then you can customize a little bit at a time. So it makes it much more gradually learnable, gradually extensible.

Question: second question. You mentioned the code versus the mathematics, how there's an almost one-to-one correspondence between the two, but I'm thinking you can't prove stuff using code, whereas you can with math.

So, we don't do any proofs, and I don't find that's a problem. I've published very highly cited research papers in top venues, and I don't have proofs: it's, hey, here's this thing in computer vision that works, here's an understanding of why it works, here are the similar ideas in natural language processing, so we would expect the same thing to happen, let's try it out, and so on. Proofs are a controversial topic in deep learning. For many years they were pretty much demanded for every paper at the big conferences, and Geoffrey Hinton, who is one of the fathers of deep learning, complained regularly about this: we build things that are billion-parameter networks with all these layers, and any proof requires simplifying the math down to a point where it's not accurate anymore. There's still a bit of that that happens, honestly.
There are a lot of papers published today that end up in top journals because they have proofs in them, but it's kind of like economics: let's set up some premises that have nothing to do with the reality of training a real neural network, because we've simplified it so much. The main thing people try to prove is convergence bounds, which basically say that regardless of how you initialize this thing, you'll always end up getting better and better. When I started out in this area, twenty-plus years ago, everything was always about proving convergence, and a lot of people in operations research only looked at algorithms where you could prove you'd get the optimum, and it really set the field back, because it meant that all the techniques that worked in practice, but that we couldn't prove things about, got ignored. So I'm very cautious about proofs in general, and it's certainly not something we look at in the course.

Question: is there anything we should be doing, in terms of expectations you may have for projects or whatever, between now and when the course starts?

Yes, absolutely. Your greatest impact will come if you can combine what you learn in this course with stuff that you're deeply passionate about right now. That might be stuff you do at work, or stuff you do outside of work. So if you can come with ideas about problems you want to solve and datasets you want to explore, that's super helpful, and if you can start curating a dataset you might be interested in learning more about, that would be super helpful too. There's always a delicate balance between following the material in each lesson versus exploring your project, and there are mistakes to be made on either side. If you only look at your project, you're going to miss out on actually understanding the stuff being presented each week. But if you do nothing but read and listen to what's being presented, and never experiment with your own stuff, you don't get to find out where the hard edges are, and you can't jump on the forums and say, hey, Jeremy thought this would work, and I tried it on this dataset and it's totally not working at all, and have that conversation and try to figure it out.

More generally, for code: write as much Python as possible. The Python we use for deep learning is a particular kind of Python, where we almost never use loops; instead we do things on whole arrays, or as we call them, tensors, at once. So learning to use the NumPy library would be super helpful, because PyTorch is nearly identical, and getting some experience working with matrices and vectors, adding them together, modifying them and so on, is useful. You don't need to study linear programming, sorry, linear algebra; we have a linear algebra course, but you don't need it, and a lot of people come in having spent too much time on the math stuff. I would come in as good at coding as you can be, because that's the language you'll be talking to the computer in.
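As a tiny illustration of that loop-free array style (a hypothetical example, not from the lecture), here is NumPy operating on whole arrays at once; PyTorch tensors behave almost identically:

```python
import numpy as np

a = np.array([10., 6., -4.])
b = np.array([2., 8., 7.])

print(a + b)          # [12. 14.  3.]  elementwise addition, no loop needed
print((a * b).sum())  # 40.0           elementwise multiply then sum: a dot product
print(a > 0)          # [ True  True False]  elementwise comparison
```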
Yes? So no, you don't need to set up your own deep learning machine, and thanks for the good question. If you go to course.fast.ai and click on server setup, you'll see all these different options, and if you scroll down you'll find a description of each one. If you pick, say, Gradient, it basically tells you: create an account, click on the fast.ai button, decide which machine you want, click create notebook, and you start it. All the major cloud companies have fast.ai built into their environments. There's always a group of people who are tempted to build a new computer and set up all that stuff themselves; those are the people who spend the entire course installing Linux drivers, so don't be that person. Google Cloud comes with $300 of free credits, which is more than enough to get you through the course many times over. There's also something called Colab, which is free, and Kaggle kernels, which are free. If you're interested in exploring a little bit of basic Linux setup stuff, Google Cloud is your best option; if you don't want to touch that at all, Paperspace Gradient has free one-click Jupyter notebooks where you can get started straight away, with all the datasets and notebooks ready to go.