There's quite a lot to do, so hopefully, if you've got a laptop, you can follow along. In particular, you should have two sets of files. One will be called "presentation" and has an index file in it, so it will look something like this; if you click index, you'll get to this. You'll also have an OVA file, which is a virtual appliance for Oracle VirtualBox. So there's a bunch of different stuff, plus the main presentation. OK, so this is a deep learning talk, and I'm going to assume you've seen deep learning talks before, because this whole thing has been listed with them. Typically, at talks at FOSSASIA, you're either sleeping or you're learning stuff. If it's a machine learning talk, you'll be learning to learn. And this one, because we're going meta... So, about me: if you're a member of the TensorFlow group, which Sam and I run out at Google, or you've just been around, you'll have seen me. I've been in Singapore since September 2013. During 2014 I essentially converted myself from finance guy back to PhD guy by doing machine learning, deep learning, natural language processing, robots and drones, having a good year out. But since 2015 I've been serious about natural language processing and deep learning, producing papers which have been accepted to conferences, and running a developer course for deep learning; I'll maybe talk about that later. We also do deep learning consulting and prototyping. Does anyone else need one of these sticks? Do you have a laptop? OK. In terms of products, which is behind the scenes at the moment, we're interested in conversational computing. If you've got a Google Assistant, you know how limited it is: it's interesting, but it's very limited. We think there's got to be a next generation of that.
And for that we're going to need natural voice generation and also knowledge-based interaction, so we're playing with those things. So the idea here is: you're going to be learning to learn to learn. I'm going to talk about the basic ideas of learning, how you would learn from a lot of data, how you would learn from some data, and then how you would learn from just a little data. And if you've got VirtualBox, you'll be installing the OVA. The first thing, though, doesn't require VirtualBox to be installed. If you click on the TensorFlow link in the presentation, which is here, you'll get this very simple thing. How many people here have seen the TensorFlow Playground? Virtually everyone. Do you have a laptop? Do you need one of these sticks? OK. So this is a quick neural network demo. What we have on the left side is some training data, which consists of these orange and these blue dots, and it's going to try to optimize so that if you give it some test points from the same kind of distribution, it will classify them. It starts off in a random state, and fortuitously, in this state it's actually classifying the blue ones with the blue background and the orange ones with the orange background. But if I start it in some other random state, it doesn't know anything. If I press this Play button, though, it will adjust the parameters so that it forms a nice boundary between these two things. So this is the first part of the talk, in that you have now learned how to make a machine learn, OK? That's one level of learning, and a fairly simple one. I just wanted to show that you have a task which has some input data and some test data, and we're going to try to do our best to figure it out. So let me keep going.
If you're interested in the basics of neural networks and you haven't seen this before, this is an interesting thing, because you can see that with these features, which are kind of lefty-righty and uppy-downy, you can actually synthesize a diagonal line, OK? Now, that's fine if I know that a diagonal line will work. But if I go to a distribution like this, you can tell there's no diagonal line which will cut this into oranges and blues, right? It will just fail. The only way to make this work is by adding additional layers, what we call hidden layers. What we can do with these intermediate things is that they will each produce diagonal lines, and if I train this sufficiently, it will work out what diagonal lines it needs to make an isolated region. So here we're doing another training run, and it's now figured out that if it picks the right internal features, it can actually draw a triangle around the blue dots, and so it's now learnt, OK? If you play around with this, you'll find there are limits to how much the simple model can learn, but you can expand it by adding layers or by adding neurons or whatever. It's a whole little playground, beautifully done by the team at Google. OK, so this is training a neural network to learn. So what we've understood: the goal here was to learn to predict these regions. You've got the input features. Single neurons can only make a single straight line. The way in which it learns is by assigning blame, but we won't get into the whole backpropagation thing here. But these deep networks can also create the features that they need: I didn't have to tell it to make a triangle; it figured out that a triangle would solve this problem. And if I gave it different features, it might solve something else, right?
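As a toy sketch of what's happening under the hood (this is an illustrative numpy version, not the Playground's actual code): one hidden layer of tanh units, each acting as one "diagonal line", trained by backpropagation on a circle-shaped dataset that no single straight line can separate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy version of the Playground's "circle" dataset:
# label 1 for points inside a disc, 0 for points outside.
n = 400
X = rng.uniform(-2, 2, size=(n, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.5).astype(float)

# One hidden layer of 16 tanh units: each unit draws one "diagonal
# line", and the output layer combines them into a closed region.
W1 = rng.normal(0, 1, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 1, (16, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    return h, p.ravel()

lr = 0.5
for step in range(5000):
    h, p = forward(X)
    d_out = ((p - y) / n)[:, None]          # grad of cross-entropy wrt logit
    d_h = (d_out @ W2.T) * (1 - h ** 2)     # "assigning blame" backwards
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

_, p = forward(X)
print("training accuracy:", ((p > 0.5) == y).mean())
```

With no hidden layer this problem is unlearnable (no straight line separates a disc from its surroundings); with the hidden layer, the network invents the enclosing region by itself, which is the point being made above.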
But in particular, if I have a problem where I don't understand the domain, I might not be able to work out what features it should be creating. The whole point of deep learning is it can do this for itself, OK? If you're lucky. So, neural networks next: now we're on to a thing called ImageNet. This was a huge competition. How many people here have heard of ImageNet? A few? Some? OK. So ImageNet was a huge competition, organized, I think, since the 1990s even, but certainly through the 2000s, where computer vision people would essentially battle with each other to try to recognize images of, let's say, a hot dog. The whole image set had 22,000 categories, but the basic game of ImageNet was a thousand different categories. What happened is that over time, up until about 2010, people would engineer special features. They would make a bread detector and a meat detector, and they'd say, well, if the bread's going around the meat, then this is a hot dog, right? Or if it's got ears, it's probably not a hot dog. So feature engineering was all the rage. And then came the neural net people: suddenly these deep networks started to work on this image stuff, partly because people discovered that GPUs could be applied to it. So suddenly everything changed. And this is where we started off: this is 2014, a Google network, now quite a big network compared to the TensorFlow Playground, and it got more and more complicated. Before the neural networks came along, error rates were around 25%, and progress had ground to a halt back there. When the networks came along, the error rate fell to 16%, then 11%, then 7%; it very rapidly converged to this, which is now almost three years old. This is now better than humans.
So one of the limitations in learning this task is that you need more and more humans to label the data, because an individual human's error rate will now be higher than the network's, so you have to have all of these samples labelled by committees. The same goes for, say, radiology, where these radiology programs are better than a radiologist. But there's no metric to say whether you're better than all radiologists, because all of them will be making errors at about the same rate as you do; it's a battle of the radiologists at that point. Anyway, this is the point at which you say, OK, I wish I had installed the OVA. There's a virtual machine in there: hopefully you can do Import Appliance in your Oracle VirtualBox, and it will boot into Linux. From your host machine, on localhost port 8080, there's a Jupyter notebook, so you'll be able to just open Jupyter in a tab. The image will also do other things: you can SSH into it, and all of these presentations are in your presentations folder anyway, so you can see exactly what's going on. You can get a console; the username and password are, in fact, "user" and "password". OK, it's got a whole bunch of stuff in here. I'd say this one is not as fully featured as it has been in previous years, in that I'm really interested in just a few of these examples. I'll be updating this as time goes on, because the frameworks have moved on significantly since last year, and so have the datasets and models, for reasons I'll explain: the model sizes have got much smaller, and I'll explain how that happened. But also a new framework has come along, PyTorch, which is a very interesting framework.
So I, of course, need to cover that. One of the older frameworks, Theano, has been retired. So there's quite a lot of movement in this whole space. So: hands-on ImageNet. I've got one of these here; this will be under CNN, Transfer Learning, Keras. I'll just go back and refresh. Under this notebooks folder there should be a 2 for CNNs; within the CNNs folder there's a 5 for Transfer Learning, and then there's this one, Image Classifier Keras. Hopefully I can find it again. OK. I will do this in two stages. First stage: ImageNet classification. This is basically a network which is pre-trained to do ImageNet. These pre-trained networks have been created by, say, Google; lots of other people are interested in doing this too. NVIDIA, mainly NVIDIA, because they want to sell chips, right? They will use vast numbers of images and huge computational resources, and the network will learn exactly the classes it was told to, from all this data. So if I go through this: basically I'm going to import Keras from TensorFlow, where it's now a first-order thing, and we're going to use one of the models in the Keras model zoo. Hopefully you can see this: there's a whole bunch of different architectures for doing this ImageNet problem, and the very earliest one is this VGG16. What you can see here is that VGG16, which is more like a 2012-era network, has got 140 million parameters and scores 70-odd percent. But what happened as time went on is that we quickly moved over here to this Inception V1, which scored not quite the same, but with a hugely smaller number of parameters. And then people started to do trade-offs: bigger models, much better performance.
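To make the parameter-count comparison concrete, here's roughly how you'd poke at the Keras model zoo. Note this sketch passes `weights=None` so the architectures are built without downloading the pre-trained weights (on the VM they're already cached, and the notebook loads the real weights):

```python
# Build two of the model-zoo architectures and compare their sizes.
# weights=None constructs the network without fetching the ImageNet
# weights, so this runs anywhere; pass weights='imagenet' to actually
# load the pre-trained parameters.
from tensorflow.keras.applications import VGG16, NASNetMobile

vgg = VGG16(weights=None)
nas = NASNetMobile(weights=None)

print(f"VGG16        : {vgg.count_params():,} parameters")
print(f"NASNetMobile : {nas.count_params():,} parameters")
```

VGG16 comes out at well over a hundred million parameters; the NASNet mobile variant is a small fraction of that, which is exactly the frontier trade-off being described.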
And so there's this kind of frontier, as different generations go past, of better and better performance. One of the nice things about Keras, which is a framework that now runs predominantly on top of TensorFlow, is that it has all of these things in a model zoo. Let's just skip over this for a minute. Inside the virtual machine, these models are all pre-loaded, so they're sitting there ready to be picked up. This NASNet Mobile model has got 24 million parameters, and it takes a little while to load because it's structurally complicated. The VGG had many, many more parameters, but it's a very simple network: just do this stuff, then this stuff, then this stuff. This NASNet is many layers of very interconnected, intricate stuff. OK. From here, having loaded this ImageNet model, I have some helper functions to convert an image on disk into something I can feed into the network, and a function which gets a single prediction: it takes the image input, passes it through the model to get the predictions, decodes the predictions, and gives them back to me. So if I have a picture of a cat, it will do this and say it's 67% confident that it's a tabby cat or a tiger cat. There have been very, very few lines of code, but I've been using Google's pre-made Keras model, and it works. The VM's got all of these images in there, so here's a bunch of others. This is a Siamese cat. It doesn't know about white owls, so it thinks this is an Arctic fox; it doesn't really have much idea what this is. It thinks this is a dingo; I don't think it has a Shiba Inu category. There's a tabby again. OK. So this pre-trained model has these categories, the thousand it was originally trained on.
And so this is a restriction. I mean, a thousand is quite a large number of categories. It has a few cats; it's got many, many dogs; but it doesn't know much about faces, because it's only got a few varieties of face, hardly any. I think even Google realized there might be a problem getting it to recognize faces. On the other hand, they also have an equal number of gorillas, which has been a problem for them. OK, so what have we just done? We've shown this ImageNet classification, trained from zero using a vast number of images and huge computational resources. There are instructions on there; if anyone has got a spare stick and needs one, please swap. It does exactly what we told it to, because we gave it huge amounts of data: maybe each of these categories had a thousand images or something. This is kind of typical, and it's why NVIDIA and the cloud providers are all over this. Now let's do something a bit more. A thousand images is a lot, particularly if you've got a corporate need for doing something: you probably don't have a thousand images, but maybe we can do it with twenty images or something. Maybe my product idea is good enough if I have twenty different samples, if only there were some way of understanding images. So here's a nice little trick called transfer learning. I'll load this other network again. Hopefully you can see it: this is the ImageNet network. We have this input image coming in; we have a black box, which is whatever Keras model has been loaded, with its own weights; and out comes a bunch of probabilities, or log-probs or whatever, that this is the right class, and this vector is a thousand long. We then take a maximum over it, and the maximum is the answer, so this will be "tabby cat" or whatever it is. So this is the way the ImageNet network has been trained.
It's been trained so that if it gets an image of a tabby cat, it should say tabby cat; if it says Eskimo dog or whatever instead, then it's wrong, you penalize the network, and it should be saying tabby cat. You train this and it gradually converges. What we do for transfer learning is use the exact same trained model. We're not going to touch that trained model, but we're going to take off this top layer and just use the answers here, and we're going to pass those answers into an SVM, which is a standard, like 1990s, kind of machine learning model. We then train that on our own classes. So suppose we pass in something which this network has never seen: we've seen a parrot, we've seen a baboon, OK? Baboons and orangutans. It may not have that much knowledge of those. But when it sees a baboon, it will have a certain kind of error pattern. It will misclassify it, because it doesn't know what baboons are, but it will light up on different things for the baboons than it would for the orangutans. So the pattern of errors will be different depending on whether it's a baboon or an orangutan, and you can take that pattern of errors and use it as input to our own classifier. So that's what we're going to do. So: a little cropping function, and getting logits. This is the thing which, having taken off the top layer, produces a vector about a thousand long, made of these error terms; it's not just the one answer, it's all of the answers. And we're now going to try it out. I've got two sets of cars in here, in directories called classic and modern. What this does is go through every class and every directory and just predict what the network says.
Then it stores out those predictions for when I run an SVM recognizer between the two sets. So, hopefully: it took 5.8 seconds each. Basically this is my training set: I've got a classic car, another classic car, another classic car; I've got about 10 of these. Then I've got modern car, modern car; these are sports cars. What I want to know is: can it recognize the difference between classic cars and modern sports cars? These are images it hasn't seen, and it hasn't been trained on this distinction. But there are things about these images that it might pick up on. In particular, it knows about dinner plates and flowers, so it might notice that the classic cars tend to have these nice circular wheels and these smooth lines, and so it will confuse the classic cars with particular types of flowers; it knows quite a lot about flowers. Whereas the modern cars are much more like Lotus things or whatever: a different kind of flower, so the errors it makes will be similarly consistent; they've got different kinds of lines and grilles and a very different feel to them. So even though it doesn't know about these cars, there is stuff it does know about in images which will make it make the same mistakes again and again. So that's my training set, the 20-odd images. Instead of taking a week or something to train my ImageNet (the time's coming down, but it's still a very large number of GPUs), I can train my SVM like that: it took less than a second. I've got very few examples; I'm only trying to tell the difference between 10 things on each side. But then I can go through a test set: load in each one, find what the ImageNet model says, and then use the classifier I just made to decide, is this a modern or a classic one?
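The SVM step itself is tiny. Here's a sketch of just that part, using made-up 1000-dimensional stand-ins for the network's outputs (the real notebook feeds in the logits extracted from the car photos; the "signature" below just mimics each class having its own consistent error pattern):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

def class_features(n=10, dim=1000, noise=2.0):
    # Each class gets its own "signature" across the 1000 outputs,
    # standing in for the consistent pattern of mistakes the
    # pre-trained network makes on that kind of image.
    signature = rng.normal(0, 1, dim)
    return signature + rng.normal(0, noise, (n, dim))

X = np.vstack([class_features(), class_features()])  # classic, modern
y = np.array([0] * 10 + [1] * 10)                    # 0=classic, 1=modern

svm = LinearSVC().fit(X, y)    # trains in well under a second
print("train accuracy:", svm.score(X, y))
```

With only 10 examples per class in a 1000-dimensional feature space, a linear SVM separates them almost trivially, which is why this step is so fast compared to training the network itself.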
So it can go through this and say, well, here's an image it's never seen before, and it says this is a modern sports car, and this is a classic sports car, modern sports car. And, well, it's a Prius; it thinks it's a modern sports car, so it can't tell, because I haven't given it the lame-hybrid category. So it's got these mostly right, but this one it thinks is a modern car, which I don't think it is. It's also kind of impenetrable how this is working, but it works pretty well, and I would say if you've got a product categorization problem, this is what you should use. And it's not just me saying this: Google has got this whole AutoML product which they're now selling, where you give it tons of images and they mulch them through their existing networks and try to classify the stuff as best they can. A very, very similar principle to what they're using. If you want to play around with this, it's quite easy to create another folder and change the class list; I describe how to do it. So, I did this, I think maybe at FOSSASIA last year or the year before, and some guy actually used it to classify members of his family, which is amazing that it works, because ImageNet doesn't know much about faces. This is also where the Google dataset comes in: they may have a much broader set than ImageNet; they may have chosen stuff which is relevant to lots of different things. They don't tell you what they're doing, but they may have chosen their examples much more evenly than ImageNet, whose actual test concentrates on dogs, for some reason, and various things. Anyway, maybe it was recognizing members of his family based on what they typically wore, or the backgrounds they were in, or something, which is a bit of a trick, but it does work. Yes: before the support vector machine, or before the softmax? Yes. So in Keras you'll see that there are a couple of ways of loading the model, one of which is with include_top, which will
actually give you the classification out; without it you've got just the dangling ends of whatever the top would feed on, and you can repurpose those for yourself. So, transfer learning: we've used the pre-trained ImageNet model, leveraging it to classify these new classes, but we need hardly any training data, because the network itself knows enough about vision that it gives us a good lead-in to what we're going to do. Now let's go a bit further; we're about halfway through. The previous methods learn, fundamentally, from large amounts of data. But humans learn from very little data; we don't need that much input. What we would like is models which also learn from very little data, ideally very, very little data. And we don't want to have to construct these models by hand: what we'd want is models which learn how to learn. It would make much more sense if they figured out how to do that learning, rather than me trying to construct something which learns, because I don't necessarily understand how humans do it; I want a model to learn how to do that learning. This is called meta-learning, and there are two main streams of meta-learning now coming to the fore. One is to learn how to build the best model, and this is called structure meta-learning. There are lots of different ways of building these models, but the architecture is really tough to get right, and people used to say that this is done by graduate student descent: you'd have graduate students, you'd apply them to the problem, and they'd fiddle around until it worked. That was expensive, but it did work, and that's been the flow of graduate students into industry. We call this structure meta-learning; a good graduate student will get to the right structure much quicker than a bad graduate student, so there's a meta-learning process to train that graduate student. Ideally we want to train a network to do
the same thing. The other type of meta-learning, and the one I'm going to focus on, is where I want to build a model which learns how to solve things quickly. The model itself will be preconditioned to want to learn stuff: it's not just set up to understand one problem, to be fitted to the problem; it comes to understand problems as quickly as possible. I'll take these in sequence, and the first one I'm not going to talk much about, and I'll show you why. Structure meta-learning: basically, I want to enable computers to build models. They could do that by searching all the different architectures efficiently, and then, hopefully, guiding their own search by predicting what architectures might work well, because they've got a whole trail of architectures and how well each one worked. They should be able to say, well, every time I added a layer it worked better, so I'll add a few more layers until it stops working so well. But the thing is, you've already seen the results of this, in that... is it still open?
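As a toy illustration of that search loop (a hypothetical random search over depth and width using scikit-learn, nothing like the scale of Google's actual controller): keep a trail of (architecture, score) pairs and pick the best.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# A small synthetic problem to evaluate candidate architectures on
X = rng.uniform(-2, 2, size=(600, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.5).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

history = []                                 # the "trail" of tried architectures
for trial in range(8):
    depth = int(rng.integers(1, 4))          # 1-3 hidden layers
    width = int(rng.choice([4, 8, 16]))      # neurons per layer
    net = MLPClassifier(hidden_layer_sizes=(width,) * depth,
                        max_iter=2000, random_state=0).fit(X_tr, y_tr)
    history.append(((depth, width), net.score(X_te, y_te)))

best_arch, best_score = max(history, key=lambda t: t[1])
print(f"best architecture {best_arch} -> test accuracy {best_score:.2f}")
```

A real system replaces the random proposals with a learned controller that uses the history to predict which architectures are worth trying next, which is the guided search being described.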
It's not still open. There you go, sorry, this thing again. So, as I said: here's the original one, built by some guy actually in his bedroom; here's a whole bunch of Google engineers, and Microsoft engineers with these ResNets. But now the efficient frontier is these NASNets, and a NASNet looks like this, and this was not designed by humans. Google has had this process which rearranges these blocks again and again: they've given it, essentially, the ability to look at the layer before and the layer before that, and then just let it go to town. So they have these normal cells and these reduction cells, and they let it go to town. The criterion might be: I want this to have as few parameters as possible, or to do as few ops on my mobile phone as possible, while being as good as possible; for other models it would be just the highest performance you can give me, and that's a big network. So there's a NASNet Large, which Keras has, with 340 million parameters, but I'm just using the mobile version, with its 24 million parameters. So this is something you can run. There are also even smaller models, but there are trade-offs; clearly Google likes this one, so it's probably quite a good set of trade-offs at the moment. OK, so we've already run one of these models produced by architecture search: the NASNet structure is created by searching over these architectures, and you can make these trade-offs better than humans can, because, well, it's now at the efficient frontier. I'm not really going to talk more about this; it isn't the focus here, but we have talked about it at the TensorFlow meetup with Sam. Let's talk instead about something from a paper; essentially this strand... there was a paper published on the 8th of March, so, as I said in the write-up, this is fairly modern content. So,
one-shot learning: humans can learn from very few examples. Let me explain: a friend of mine devised a kitchen implement called the hammer-whisk. On one end there's a hammer, to pulverize things, and on the other end there's a whisk, to make delightful meringue. OK, now everyone understands this implement, and you could probably make one, right? And I haven't had to give you a thousand example images, or even ten example images, of it. Everyone has in their mind a pretty good idea of what a hammer-whisk is about. So that's your preconditioning to accept these new ideas, and that's a powerful thing. What we really want is a model that can also learn tasks quickly. And how do we make something which learns tasks quickly? Well, the thing is, I could do lots of different tasks: it's not that I have a lot of different classes; each task might only have a few classes and only a small amount of data, but I want the model to be flexible enough to learn new tasks again and again and again. OK, so let's backtrack a little. Regular learning is: I have a training set which has a bunch of different classes, and each class has sample images that the model needs to learn from; the test is whether the model can classify previously unseen images. That's just ImageNet, or MNIST, or standard classification: regular learning. Now, in meta-learning, the training set is a bunch of different tasks: each task is a different problem, and each of those problems has a small amount of data. What I'll say is that it's a successful meta-learner if it can learn each task, each task being a separate training example, right? A training example is a set of different classes to learn, a whole bunch of little images, and it should learn it as well as possible. And the test set is not a set of individual images; it's a set of those individual tasks: here's another task, can you learn it quickly? If it can learn that task quickly, then the
whole thing, the meta-learning, has been a success. So this encourages you both to learn a single task as well as possible, and also to be generic enough that you can learn many, many tasks quickly. OK, so: hands-on meta-learning. There's a directory in there called 8 Meta-Learning, on the same level as 2 CNN, and in there is a notebook called Reptile Sines. What I'll do is hopefully pull this up: notebooks, meta-learning, reptile sines. What we've got here (I'll run this twice) is a PyTorch program. The previous one was using TensorFlow and Keras; this is now PyTorch. PyTorch is an up-and-coming framework; it's like NumPy, a very generic toolbox. Google is also interested in this kind of dynamic approach, but they feel like they're coming from behind right now. PyTorch has been going for just over a year, and it's now kind of dominating research, people implementing papers: if an interesting paper comes out, within about two days there'll be a PyTorch version available, not from the original researchers, but someone will just be validating it using PyTorch. It's also a good way to learn. So what this does: there are two layers to it, because this is meta-learning. There's an inner optimization: if I have a single task, how many steps, how many epochs, how many batches, whatever, for that internal task. The outer loop is how many tasks I'm going to learn. Each task in this case will be... let me just show you a task, this gen_task function. This is the key thing: it's a random task, and I can generate as many of these random little tasks as I like. The task I'm trying to solve is a sine wave of random amplitude and random phase: I'm going to give you 10 points on that sine wave, and I want you to be able to predict where other points would be within that same
sine wave. So if I learn this task well, I will be able to say, well, this is a big sine wave, and it's shifted by this much, and here are good answers for the rest of it; and if I can do that, I will score well on that task. I'm going to do many, many of these tasks. So here I have a little model, some Linear layers, some Tanhs, whatever: this is my sine wave predictor. I have a function which trains the current model on the current problem, and a little plotting function. And then there's an outer loop which basically takes a copy of the model, runs a training on a random task, and then uses that random training to nudge the entire model, so that it gets more and more sensitive to learning other tasks. This is a process from an OpenAI paper, and they call it Reptile; it builds on an earlier thing called MAML. Anyway, what we have here is the first step, and you can see this is a task from the test set; I'm only going to use one task from the test set. The training set is a whole bunch of tasks; this is a task I'm meta-learning on. It hasn't seen this task before, and it's been given these 10 points on this sine curve. The network starts out hardly knowing anything, but it gets trained for, like, 8 steps; if I train it for 8 steps it gets slightly better, slightly closer to the sine wave. But this model doesn't actually produce very good answers; it doesn't learn this very quickly, and that would be typical for model training. What I then do is run this Reptile process again. So now, after a few more iterations on lots more tasks, you can see it's got much more sensitive to learning to fit this thing, even though it has never seen this particular sine wave before. Similarly for this one: now it's getting quite quick to fit, and as the training progresses, this thing learns to fit the sine wave as quickly as
So this is meta-learning: it has learnt the task of learning sine waves as quickly as possible. The sine waves aren't the most beautiful example, but you can see that a network which starts off knowing hardly anything, when presented with a new test example, in a sense starts off more neutral than before, because it's prepared to jump as quickly as it can; and as soon as it sees the data, it jumps to learning as much as it can about this sine wave, so it can give you answers on this new task. So that was meta-learning, with this Reptile method. What we've seen is that we're trying to learn tasks quickly: each task in the meta-training creates a sine wave with a random amplitude and phase, the training data is just a few points within it, and the test within a task is whether it learns that sine wave; but the meta-test is whether a single new task gets learnt quickly. There's a paper and a blog post on this. It's kind of nice, but note: you're now learning to learn to learn.

Now there's another demo, and this one works even if you don't have the virtual machine going, because it's just a JavaScript thing inside the presentation. It's called "three boxes". So here we have three boxes, and this is my input; let's clear it all out. This is a model which has been trained to learn to learn, and it all runs in JavaScript: the OpenAI group built the model in TensorFlow and exported it to a JavaScript front end, so it can run in any browser, anywhere, and it learns to learn things quickly. For argument's sake I'll draw an a, a b and a c. This is a task: I've just given it one example of each of three different classes, and I'll now ask it to classify. If I write half of a c, it thinks it might be an a; but if I complete the c, it thinks it's 99% a c. So this model has learnt, one-shot, from these three examples. If we start with something slightly different: it likes a, doesn't it, so this has learnt quite well. Or we can do: here is a fish, here is a cat, and here is a mouse. That's another set of three, and this is my new task. If I draw this: oh, it thinks fish. Wow. What if I draw this for a mouse? Mouse. So the pre-trained network has learnt to be ready to learn these things super quickly. This is one-shot learning which learns in your browser, and you can play with it; it's quite a nice little thing.

So let's see how they do that. The question is: how do they learn this task of learning tasks, the meta-level? What they've trained on is a thing called the Omniglot dataset, and there's a nice image of it, though it was quite difficult to find. Basically it's lots of little symbols from lots of different writing systems: they got people to draw characters, and it could be the Latin alphabet, it could be Hiragana, it could be Tamil, Greek, Cyrillic, all these different scripts. They got lots of different people to write these out, and there are over sixteen hundred different symbols, each drawn by twenty people. So it's a very, very broad set of characters, each drawn by rather few people. This is the opposite of MNIST, where you've got just ten characters, the digits zero to nine, each drawn thousands of times; MNIST is the standard task that people do. With Omniglot you've got lots of symbols drawn by only a few people, so if you can learn to do Omniglot well, you're probably able to learn the difference between other characters drawn by people, and that's exactly what's being done here.
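The mechanics of such a one-shot, three-way task can be mimicked with something far simpler than the OpenAI demo: store the single support example per class and classify a query by nearest cosine similarity. Everything below (the 32-dimensional "drawings", the labels, the similarity choice) is invented for illustration; the real demo uses an embedding network meta-trained on Omniglot rather than raw vectors:

```python
import numpy as np

def one_shot_classify(support, query):
    """support maps label -> a single example vector (one shot per class).
    Return the label whose stored example is most similar to the query,
    measured by cosine similarity."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return max(support, key=lambda label: cos(support[label], query))

rng = np.random.default_rng(0)

# Toy 3-way task: each class has an underlying "prototype drawing",
# and the support set holds exactly one noisy example of each class.
prototypes = {c: rng.normal(size=32) for c in ("a", "b", "c")}
support = {c: p + 0.1 * rng.normal(size=32) for c, p in prototypes.items()}

# A new, noisy drawing of "c" should land nearest the stored "c".
query = prototypes["c"] + 0.1 * rng.normal(size=32)
predicted = one_shot_classify(support, query)
```

What meta-training buys you is essentially a good embedding: run this same nearest-example rule on learned features instead of raw pixels, and brand-new classes separate cleanly after a single example each.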
So, to recap what we're doing: one-shot classification, where each task trains on just one example for each of the three classes; the model is pre-meta-trained on Omniglot; the actual model runs in JavaScript; you provide a meta-test task it has never seen before; and you should play with this, it works pretty well. So that kind of explains what this whole game has been.

The wrap-up: this field is advancing extremely rapidly. Models which I had last year or the year before are now just missing from this virtual machine; they're hardly worth including because they'd just bulk it up for no gain. NASNet is extremely small, designed by computer, and it's what you'll all have in your phones with the next Android update or whatever. On the other hand, this stuff is still within the grasp of individuals: the OpenAI thing you saw learning the sines and learning Omniglot is probably an overnight job on a single GPU. You don't need Google scale to learn this stuff, so it's still entirely within the grasp of individuals, and while Sam and I have been playing with voices, it's entirely possible to at least get your feet wet with this stuff. It's very exciting.

The other thing is that open source really applies to the research field here as well. In all of these papers, even though Google has real commercial interests in this stuff, they're publishing all the time. One of the criticisms is that their papers don't exactly explain what they're doing, and the DeepMind papers are kind of notorious for claiming quite a lot without really giving you the details. On the other hand, if you're persistent and you corner someone at one of these conferences, you can probably find out what they're really doing. At least it gives the impression that it's all out there, and you can replicate these things, or fail to replicate them and then criticize, and the researchers will say, oh, you're doing it wrong, why don't you try this, or: this involves a hyperparameter search using 800 GPUs, which you may not want to do.

So, some adverts. We have a deep learning meetup group, a TensorFlow group; the next meeting I think is the 19th. We typically have a talk for people who are starting out, something from the bleeding edge (this talk would be pretty bleeding edge), and lightning talks. We had a meetup on Friday night with some Googlers, because they're in town, someone interested in implementing stuff on a Raspberry Pi, and someone interested in implementing stuff on the cloud, with Horovod and InfiniBand and so on; there's a whole wide range of interesting stuff going on. We've got 2,400 people in that group, which means Singapore now has the largest TensorFlow group in the world, which is kind of excellent.

We've also got a deep learning jumpstart workshop which we're doing with SG Innovate; if you go on the SG Innovate site you'll eventually find it. The cost is about 600 bucks, but there are deep discounts for Singaporeans and PRs. It involves a full day at the weekend, possibly starting from zero, where you get to play with real models, kind of like what we've been doing here, but you'll be forced to do it rather than just sitting and listening, and at the end of the day you pick a project to do yourself. The whole point is that we then have two more follow-up sessions where you should have done your homework, and if you haven't been able to do it, we'll fix it so it works. Because it's one thing hearing about it and clicking through a notebook, and another thing actually making it do something; the frustration in programming is the real learning experience. In a MOOC it's kind of handed to you, but banging your head against the wall is where you learn.

We've also previously held an eight-week deep learning developer course, which went well, and we hope to do another one, but that's pending the funding situation, because Singapore is extremely generous with funding for Singaporeans, but it's a double-edged sword: you also have to deal with the government, which has its pluses and minuses. Okay, I'm open for questions, and we're just about on time. Thank you. Fire away.

Yes, right: so I believe they've also done this with datasets bigger than Omniglot, but you're not going to be downloading those; an ImageNet-style network would naturally be much too big for a JavaScript thing. I believe that in their Reptile repo they have an ImageNet version as well, so given three ImageNet images or whatever, it will learn to do that classification as quickly as it can; it learns to classify arbitrary images as quickly as possible. I think there may even be pre-trained networks for that, but bear in mind this work is less than three weeks old, so it's fairly new, and if you're playing with it, you're one of very few people playing with it.

I'll probably hang around outside for a bit, very happy to take questions. If you've got one of the USB keys, can I have it back please? We've had excellent return rates previously; don't let Singapore down. So there we go: hopefully you've learned to learn to learn.