LinkedIn is a value exchange: you're giving up your privacy to all of these things, right? I think LinkedIn is worthwhile; Facebook, not so much. And I'm not a Twitter person, so you could follow me on GitHub instead.

Okay, so I'm going to talk about TensorFlow Extended. I'm also going to do a bit of TensorFlow Lite, because I know that's what you're doing this afternoon.

The reason I want to do this is because of the machine learning stuff you did this morning. Most people long to do the machine learning part, but the reality is that most of your time is not spent on that. It's gathering data. It's making sure the whole thing works.

As per the introduction, I'm Martin Andrews. I'm a Google Developer Expert, which basically means I talk about stuff for Google, but they don't pay me. I also run the Red Dragon AI company with Sam, the previous speaker. We like the conversational AI thing, we like knowledge bases, and we also teach deep learning courses here; I'll have a little advert at the end.

So what I'm going to talk about is machine learning for production, and how the machine learning model is part of a bigger picture. Google recognized this ages ago and came up with a whole framework, which they've now released. A lot of people have experienced this problem, and I'm one of them: I built a whole pipeline before this was released, so I've got a very painful solution made of scripts which does all of this stuff. The Google way is the better way to do it. I'll explain how the components are joined together, and then what each of the components does. I've got tons of slides, so I'll try to move quickly. One of the endpoints will be TensorFlow Lite, which is what you're doing in the afternoon.

I apologize for the large number of slides. In particular, these come from Robert Crowe's talk on TFX, which was last weekend, and the Google I/O talk on TF Lite. How many people were here last weekend for AI Day? A few. Okay, so those people will have seen Robert's talk; hopefully you will recognize some of the slides, but these are in a very different order.

So the first thing is this: you've been learning to train a model, and the actual model turns out to be a very small part of the whole thing you're going to be building, because really you want to make something that takes some kind of client request and makes the client happy. In between those two things there is a machine learning model, but the rest of it is a whole process. That is what TensorFlow Extended is designed for; the ecosystem around TensorFlow is almost as important as TensorFlow itself. Other frameworks could include, say, PyTorch, or MXNet, or CNTK, but the entire ecosystem around TensorFlow is what makes TensorFlow the biggest deal. And this thing powers all kinds of Alphabet properties and customers.

So here's the workflow that one would go through.
Of all of these pieces, taking in examples and training is the Keras model thing you've been doing this morning, and then there's all this other stuff. The pipeline contains lots of components, these boxes. I'm going to explain how it's all put together, and then I'll explain what the boxes do.

So what is one of these components? Basically, it's a little block of code. Look at this ModelValidator component: it's got an executor, the thing which actually does the work, and surrounding that is something which decides when the inputs it needs are ready, and something which deals with the outputs when it's done. (In TFX terms these are the driver, the executor, and the publisher.) Given such components, we can slot them all together: we take a configured component and connect it to a metadata store, which stores all of the inputs and outputs, so components can fish their inputs out and deliver their results back. Having done that, one component does its thing, and then flow passes on to the next component.

So what's in this metadata store? We take the definitions of these components, and then we execute them; but because they all link together, the metadata store records who told what to whom for every run of our model. That's important if you want to understand "why did my model go wrong?" or "what happened last time, when I got a great model?"

Given this metadata, you can do some other useful things. You can find out what data a model was trained on. Suppose customers are giving you data every week, or, say you're an e-commerce site, a new batch every day, and suddenly rogue customers from Russia send you a bunch of nonsense data to mess your stats up. You want to be able to isolate the bad data and say: okay, I should stop recommending Russian products to people in Vietnam, or stop my site deciding it wants to be in Russian because the model tells it to. The metadata lets you isolate how everything happened.

Another thing you might want to do is compare: what was the model I shipped last time, and what's the next model? You can do all of this comparison in the metadata store, because it's storing everything; it's one honking big data store. You can also do things like take the model from last week and just add the new data on. Not a big retraining, but a tiny retraining, or take the data from the last hour and just enhance the model. But I still have to keep track of what data this model knows. What has it seen? Suppose I have a lawsuit about one piece of data; I need to know where that data has gone. This is the bigger picture of what a machine learning model does, and people in the real world, the lawyers, will want to know. The metadata store is what makes all of this reuse possible.
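To make this concrete, here's a minimal sketch of defining a TFX pipeline against a metadata store. The paths and pipeline name are invented for illustration, and exact argument names have shifted a little between TFX releases, but the Pipeline object and the metadata configuration are the real TFX API:

```python
from tfx.components import CsvExampleGen
from tfx.orchestration import metadata, pipeline

# Ingest and split a directory of CSV files (path is illustrative).
example_gen = CsvExampleGen(input_base='data/')

my_pipeline = pipeline.Pipeline(
    pipeline_name='demo_pipeline',
    pipeline_root='pipeline_root/',   # where component artifacts land
    components=[example_gen],         # further components get added here
    # The ML Metadata store: every run's inputs and outputs are recorded
    # here, which is what enables the lineage queries and comparisons.
    metadata_connection_config=metadata.sqlite_metadata_connection_config(
        'metadata/metadata.db'),
)
```

An orchestrator (next section) then takes this pipeline object and runs the components in dependency order.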
So how is this stuff orchestrated? My guess is that Google have their own fancy orchestrator internally; they probably have, historically. But now they've basically said: okay, orchestration is difficult, let people bring their own thing. The two key ones they support are Kubeflow, which is, I think, Google open-sourcing whatever they can without talking about their own infrastructure too much, and Apache Airflow. These things let you wire all the components together as a directed acyclic graph.

So now we understand what the components are, how they record their results, and how it's all wired together. Let's talk about the little boxes. Here the pipeline is hooked together through Airflow, but it could be hooked together by any of the other orchestrators. We have some training data; we pass it through ExampleGen, StatisticsGen, SchemaGen, then the Trainer, which is in some ways the part we love to focus on, and then evaluation, validation, and pushing to serving. This whole thing is the big picture, and I'll explain why you want each of these boxes.

For ExampleGen: you may have a CSV of data or some TFRecords, and you want to consistently split this into training, test, and validation data. Crucially, if you run it again, the split should always be the same. You don't want examples to drift over into the validation data while you continue training a model, because suddenly you're contaminating yourself. It's important to get this right; better to build it once and reuse it.

StatisticsGen takes all the data flowing in and works out overall statistics: what are the means? What are the deviations? How many examples did I get, and at what time of day? It stores big summary statistics, which can be very revealing when someone asks why the model went wrong. Suddenly the taxis are quoting the prices they charged in pennies, not dollars, so my statistics have moved. Nobody told me about the data ingestion change, but in the real world this is very common: API changes are just not announced, you're simply given different data. Having collected these statistics, you can do a drill-down process on them. It's a bit like a scikit-learn-style model of what the data is, letting you drill down. It can also cope with big streaming data and all the other nasties the real world forces on you: by the time data reaches the model it will be batched and packaged into a nice training set, but this lets you model the data much more as it arrives.

Next is SchemaGen. Clearly, if you're getting dollars and cents, that's one thing, but you may also be getting dates, and dates are a nasty data type; you probably have a schema for how you want those dates delivered. This component will actually guess the schema from the first runs of data, when you've got clean data. That inferred schema then feeds the next component, which validates that incoming data actually matches it.
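As a sketch, here's how those components chain together in TFX, each one consuming the typed outputs of the one before it. The API is the real tfx.components one, though output key names have varied slightly across versions:

```python
from tfx.components import (CsvExampleGen, StatisticsGen, SchemaGen,
                            ExampleValidator)

example_gen = CsvExampleGen(input_base='data/')        # ingest + split
statistics_gen = StatisticsGen(                        # full-pass stats
    examples=example_gen.outputs['examples'])
schema_gen = SchemaGen(                                # infer the schema
    statistics=statistics_gen.outputs['statistics'])
example_validator = ExampleValidator(                  # flag anomalies
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
```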
Suppose I have lots of dates coming in, and then suddenly someone starts passing me nil, or "Friday". That does not match the schema, and I want alarm bells going off when it happens. You might say: well, of course the data is going to match the schema. But maybe not. You may have a field full of addresses, and since you only have a certain number of factories, the addresses can only come from a certain set. Then a new factory comes online, and you need to make sure you have more categories in your input space. Things like this will happen, and this validation is useful to have, because otherwise your model ends up in an unknown state.

Okay, so now we've got nicely validated data coming in; maybe some images. As you may have seen with, say, an ImageNet model, we now want to transform this data. Images arrive as RGB bytes; I want to transform them into values between zero and one, three channels, get rid of the alpha channel, maybe resize. Various things just to clean up the data, done in a rigorous way. What tends to happen is that, just like with the Inception models you may have seen, there's an external Keras preprocess-image function you have to call before giving data to the model. In some ways it's amazing that the preprocessing step isn't in the model to begin with. It's a thing you have to remember, and equally, that preprocessing has effectively been trained: you've got a trained component outside your model. People take very careful care of their model and save everything, but this transformation, if you don't keep it, is a bunch of special float numbers you might be copying out of a paper in order to feed the model. So this is another step where you have a little piece of TensorFlow code, TensorFlow Transform code. You can learn the transform during the process, freeze it, and it gets remembered as part of the whole model state, whereas typically these things fall by the wayside and become special artifacts. Having trained the transform, with its means and quantiles, and then fixed it, training and serving work consistently.
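Here's a minimal sketch of what a tf.Transform preprocessing function looks like. The feature names ('fare', 'pixels', 'city') are invented for illustration; the analyzer functions are the real tensorflow_transform ones. The key idea is that the analyzers make a full pass over the training data, and the learned constants are frozen into the graph so training and serving share exactly the same transform:

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    """Transform raw features; invoked by TFX's Transform component."""
    return {
        # Full-pass analyzer: the mean/stddev are computed over the whole
        # training set, then baked into the graph as constants.
        'fare_z': tft.scale_to_z_score(inputs['fare']),
        # Rescale raw pixel values into [0, 1].
        'pixels_01': tft.scale_to_0_1(inputs['pixels']),
        # Build a vocabulary over the training data and map strings
        # to integer ids with it.
        'city_id': tft.compute_and_apply_vocabulary(inputs['city']),
    }
```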
So here's where we've got to: we've covered the examples, the statistics, the schemas, the transforms, and now we're training our model. Training the model is essentially the prototyping process you were shown this morning. The Trainer takes in all of this stuff and comes out with several key outputs, one of which is this thing called a SavedModel. This is a TensorFlow-land thing: any framework will let you save your model, but TensorFlow has glommed onto this term, and a SavedModel is a specific thing. It isn't just the numbers of the weights; it's how the model is wired together, a complete description of the model. The Keras SavedModel is a special kind of asset that we can move to another system and use anywhere, because it's completely descriptive of the model. That's what lets us pass it along to other components.

So you take your module file, you train the thing, and TensorFlow gives you a nice SavedModel. From this you can also track how your training is going using TensorBoard: all the things you'll have seen this morning, the stats you might want, the training curves, the regularization, the learning rates, whatever. And because we've got the metadata store, we can actually track training across different runs.
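As a tiny sketch, here is that export step for a Keras model. The model and paths are stand-ins, but tf.saved_model.save is standard TF 2.x, and the numbered subdirectory is the layout TensorFlow Serving (below) uses for versioning:

```python
import tensorflow as tf

# A stand-in model; in a TFX pipeline this comes out of the Trainer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# Writes the full graph, weights, signatures, and assets: a complete
# description of the model, not just the weight values.
tf.saved_model.save(model, 'serving_model/1')
```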
Having done the training, what happens on the evaluation and serving side? The Evaluator takes the trained model, passes validation or test examples through it, and shows how well the model does. But because we've surrounded the model with all this other machinery, we can ask: how well does my model do on examples from the morning, or on a particular new customer class? You can do this kind of sub-selection to drill down on why your model is doing badly, or doing great. Typically, when we're prototyping, the validation set is just "the set". But in the real world things won't go according to plan, and the bosses will say: it may be beautiful, your F1 score may be great when you're validating, but all of my customers are being overcharged when it's raining. Who knew? You need to drill down into why that might be; maybe when it rains, riders pile up at the airport. There are all sorts of things that could be going on which are difficult to tease out at the model level, because the model is just a bunch of weights, but with the big statistics on either end, you can divide the problem up.

Another thing you want to do is just make sure the model is any good, and validate that it won't do something surprising, because you also have to decide: is my model better than the last model I had? You always want to understand whether your customers, and the business, will benefit from pushing the new model to serving. There are even components in serving that let you A/B test models in front of real customers to find out which one they prefer. So there's a bunch of ways to decide: do I want to push this model, and where do I want to push it?

So now we've gone through the TFX process end to end, and I can't overstate how the model is just one small part of it. In reality, the production side is tough.

One of the endpoints you might push to, if you've got a cloud service or even an on-prem service and you want to serve over a REST API, is TensorFlow Serving: you hand it the SavedModel. TensorFlow Serving has been used for four years at Google, with millions of queries served. It can scale dynamically; it can ask, how much am I being used, do I need more instances of myself? It also lets you version the model: models stay live as long as they need to, then get replaced dynamically, which is kind of cool. And it's high performance, designed for low latency.

In particular, when you've got a GPU server: GPUs love large batches, so serving one request at a time is very inefficient for your GPU. You would love to serve 32 requests at a time. On the other hand, if I make a client wait for 31 other clients to come along before I give them their result, they may be having a bad day. So there's a decision process: how quickly do I have to respond, and how many other queries can I group into the same batch before going through the GPU? This is all handled by TensorFlow Serving.

Traffic isolation is a cool-sounding bullet point; my guess is it lets you partition who I am serving, and which model goes to whom. I currently have a recommendation system ongoing, being A/B tested (actually A/B/C/D tested) by a publication you would know well here. They have to give people cookies, and every time the same person comes back, they need to be served from the same model: it's not really A/B testing if everyone gets a mix of all the different models. I want to ensure consistency, even between versions.

The idea here is that I take the path to the SavedModel, which is just a blob of binary information, and I can do a docker run, and that creates a serving endpoint with all these nice features. Now, clearly the config file is going to be tricky, but this is a thing which does work and has been used a lot.
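Once a model is up (for example via the stock tensorflow/serving Docker image), querying the REST API looks like this. The host, port, and model name below are the documented defaults rather than anything specific to this talk, and the input instance is invented:

```python
import json
import requests

# TensorFlow Serving exposes prediction at
#   POST http://HOST:8501/v1/models/MODEL_NAME:predict
# with a JSON body containing a list of input instances.
response = requests.post(
    'http://localhost:8501/v1/models/my_model:predict',
    data=json.dumps({'instances': [[1.0, 2.0, 5.0]]}),
)
print(response.json()['predictions'])
```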
Another endpoint you might want is TensorFlow Lite; this afternoon will have a lot of TensorFlow Lite. How this fits in the ecosystem: you've gone through TFX, or whatever process you've used, and you've got a SavedModel. You can either hand it to TensorFlow Serving for API access, or convert it to TensorFlow Lite. TensorFlow Lite, and I'll just read the words, is a framework for deploying ML on mobile devices and embedded systems.

Now, on a server, basically an API server, you can expect a bunch of memory, a beautiful network, lots of cores, maybe a GPU or TPU or a farm of them. Mobile devices, not so much. My little mobile device here is less than $100. It's still got some cores, but it's only got a couple of gigs of memory, I can't be sure of the network, and I've got battery concerns, so I'm going to be careful with what I run on it.

The other thing is that this is not a small use case. There may be server farms, but only a few REST endpoints, whereas two billion of these devices exist, and TensorFlow Lite is used in production on huge numbers of them. That's not surprising: if you've got the Android suite installed and you've got Google Translate, those things are using TF Lite as a backend, so the two billion is bound to be a big number just from what Google itself uses it for.

Here are some of the things it's used for. Text: you can classify and predict; imagine auto text completion, or a Gmail auto-reply. I'm not sure that one runs on the device, but there's a bunch of text services you might have. On speech: recognition, text-to-speech, speech-to-text. Images: there's a whole bunch of neat things people are doing, like auto-photoshopping your images on the phone. Audio: there's other stuff. I'm not sure about video generation on the phone, though there are dance apps that do pose estimation on the phone, so I guess people are playing with that kind of thing. This all has to happen in real time: there's no network, there's no server getting in your way, the model has to be deployed right here.

So, "easy" to get started; I think that should be in quotation marks. There are several paths. You can use pre-trained models, which are also pre-converted: if you're a mobile developer who doesn't really want to do the training thing, there are a bunch of pre-trained models for the key tasks. You can bring your own custom model, which has its own quirks, as you'll see. And then there are other considerations, like performance and optimization.

In terms of pre-trained stuff, there's the TensorFlow Lite-powered ML Kit: pre-trained models to download, available right now. You can just dial them up; you have a model, it will do something. It won't be as nice as a model trained to do exactly what you want, but it can be very effective to piece together blocks of pre-trained stuff. You can produce magic using pre-trained models too.

On the other hand, suppose you've got your own SavedModel, your special Keras-saved thing. You take the TensorFlow model, make it a SavedModel, run it through the TF Lite converter, and out comes a TF Lite model. The thing about TF Lite is that it's TensorFlow, but lite: inside your phone, it doesn't just understand some TensorFlow, it understands the TensorFlow that your phone can actually do. Some operations your TensorFlow graph would run on your GPU will have to be done in some weaker way on the mobile device; on the other hand, there may be a better way to split the graph onto the ops your mobile device can do. In particular, there's NNAPI, an API for the special mobile neural chips, which lets some ops run really nicely. TF Lite understands how to translate between the proper SavedModel and the ops the phone can do, and to do that efficiently.

To do that, we have to convert the model: you load a TF Lite converter, point it at a SavedModel, and write the result out. In typical Google style it's never quite this easy, but fundamentally this is what you'll be doing, and it may even be this easy in the end, though the searching is sometimes hard.
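The happy path, in the current TF 2.x API (file paths are illustrative):

```python
import tensorflow as tf

# SavedModel in, TF Lite flatbuffer out.
converter = tf.lite.TFLiteConverter.from_saved_model('serving_model/1')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
```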
This is the slide they should have shown up front: there's a limited set of ops. Some of the funky stuff you might do in Keras isn't converted into good ops yet; for instance, recurrent neural networks with conditional operations aren't handled. The TF Lite team is continually growing the set of things they can do, so maybe they will at some point. What happens is that TensorFlow itself is pretty big, and they're trying to make it more efficient and smaller all the time, but people keep doing new things. TF Lite started with the idea: let's do convolutions really well, because that will handle 80% of the use cases. But with every extra use case that comes along they say, okay, maybe we need that too, and TF Lite is trying to stop itself from just becoming the same size as TensorFlow, because at the end of the day, if it's one-for-one the same thing, it's got to be the same size, right? So there's a continual tension at the TF Lite level: how many of these ops can we support, essentially more efficiently than TensorFlow would do in the first place?

Another thing you might be concerned about is how fast these models can run. We have limited CPU; we may have a GPU; we may have a DSP, because a lot of these devices have some nice silicon for sound, maybe a chip that does MP3s really efficiently. Can we repurpose that for tensor operations? There's hardware in your phone, put there by the hardware makers, which may be repurposable. Even the CPU is different: on a server you might have a Xeon where all the cores are pretty super, whereas this phone has, I think, an eight or ten core chip, because I went for large numbers of cores. But with an ARM chip there may be some small cores and some big, fast cores: while the thing is just hanging around, checking Wi-Fi, keeping the alarm state updated, watching the fingerprint sensor, it only needs a couple of tiny cores to stay alive. As soon as I fingerprint in, it wakes up more cores; and as soon as I open Minecraft, it's "oh, we're in for some fun here" and opens up all the cores so it can do more. As a power-saving feature, the chips shut themselves down to the minimum service, so TF Lite has to understand the layout of the device and how it can distribute the ops.

So how do these things perform? There's a pre-trained network called MobileNet, a standard model for ImageNet. Inference on a mobile CPU with general, real TensorFlow takes 83 milliseconds. By quantizing the model, essentially crunching it down so that instead of big floating point numbers it uses very limited-resolution ones, you may lose a bit of accuracy, but typically not much at all, half a percent or so, and the thing is now almost twice as fast: the model is smaller, and it's only doing 8-bit ops, which is much more efficient. Now, if the device has a GPU doing OpenGL, then instead of using the OpenGL to display things on your mobile screen, it can secretly do matrix operations to help the compute, and now we're about five times as fast as the original.
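For reference, this is how you run a converted model from Python with the TF Lite interpreter, the same interpreter API the mobile runtimes expose; model.tflite is the file written by the converter sketch above, and the zero input is just a shape-checking placeholder:

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one input of the right shape/dtype and run inference.
x = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], x)
interpreter.invoke()
y = interpreter.get_tensor(output_details[0]['index'])
print(y.shape)
```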
Now, if you were lucky enough to have an Edge TPU: this is a tiny little chip. I didn't bring it with me, but there's a USB-stick version. If you had a Singapore five-cent piece, you could fit three-by-three or four-by-four of these chips on top of it. They're tiny. I guess Google would love these to go into mobile phones, but on the other hand, system-on-chip makers don't necessarily want Google's IP in there. Qualcomm, meanwhile, will be making their own tensor-ready chips. But the chip makers have to persuade the designers of models to want to use their ops, because otherwise it's just a piece of dead silicon. So there's a chicken-and-egg situation: the Qualcomms of this world, making their own chips, need to persuade the TensorFlow Lite team to pay attention to their hardware, so that the ops get written, so that it all gets into the flow, and then suddenly the Huawei phones are better at these ops.

The Edge TPU is basically a tiny version of the real server deal: a little systolic array doing, I think, 8-bit ops, in silicon, super efficiently. It doesn't have to worry about displaying things on a screen, because it's not about that; it's just about matrix multiplies. This thing is 42 times faster than the CPU. Having silicon to run your model is super cool, and if Moore's law is grinding to an end, so that we're not getting free CPU doublings every year or two, we're going to have to move into more specialized things: if you understand what people want to do with models, suddenly you've got a nice piece of silicon to do exactly that.

I mentioned quantization before: a huge speed-up, and it also makes your models about a quarter of the size, because a float32 is four bytes, and if your weights are all eight bits, that's much smaller. A rather simple step, if you can do it. TensorFlow will do this by optimizing for size: you tell the converter to optimize in a certain way, and it goes away and quantizes. It does this even once the model is trained; it just figures out how to quantize the thing after the fact. There's also another way, coming soon, where you do the quantization ahead of time and see how the model trains while quantized. That works, but it's not a huge uplift; if you're an F1-score nerd you may want it, but post-training quantization works pretty well as it stands.
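The post-training version is one extra line on the converter. In current TF 2.x the flag is spelled Optimize.DEFAULT (older releases had an explicit optimize-for-size option); paths are illustrative as before:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('serving_model/1')
# Ask the converter to quantize: weights drop from float32 to 8-bit,
# roughly quartering the file size, typically for ~0.5% accuracy cost.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

with open('model_quant.tflite', 'wb') as f:
    f.write(tflite_quant_model)
```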
The next question is: what happens if you don't have something even as powerful as a mobile phone? What about microcontrollers? Two billion devices sounds like a lot, but microcontrollers are more like 150 billion devices. My household has three or four phones and some desktop-quality CPUs, too many computers really, but there are also microcontrollers in everything else: my stereo has one controlling the display, and my fridge and my toaster have them too. If I owned a car, there would be something like 30 in it. There are a lot of microcontrollers around.

So what's the issue? They don't have an operating system, so that's strike one. They have tens of kilobytes of RAM: just the image on the screen here would fill their entire RAM and ROM. It's crazy how small these things are. On the other hand, can we fit a model? Your little MCU model will maybe have a couple of layers. Say its whole job is: is there sound in the room? You might do that with an op-amp if you're an electronics person, but maybe you can do better: rustling doesn't count, but breaking glass does count. That would be super cool to detect. Or: is there human speech? The Google Home devices have this big-chip-and-small-chip arrangement, where the wake-word detection is done on a small chip. The wake-word chip just waits for likely speech to come along, and when you actually say the, well, I can't say the word in here, but something-something-Google, it lights up the bigger chip to do the actual speech capture and all the rest.

Now, Google has been in the news for listening in on conversations, but if you read what the articles say, it's basically cases where someone inadvertently triggered the device. I'm not sure how careful the Alexa devices are about just recording everything, but we're pretty convinced this thing is essentially off unless it hears the magic words. One of the issues is that I'll often be doing a hangout with Sam, and we'll be talking about Google, and these devices keep going "oh, no, I wonder what..." Anyway. It would be better if I could set my own wake word, but there we go.

Having done these small detections, it's enough that the only output of the MCU is: do I want to wake up my big brother, my bigger-CPU brother, who will wake up and actually interpret what's being said? Maybe I'm also capturing the last few milliseconds of audio, because it's going to take him a while to wake up, so I can pass that along to him too.

So TensorFlow Lite for Microcontrollers is a new thing. The idea is you need to be able to make models that fit in about 20KB, and yes, this is a thing: here is a speech model in 20KB. If you watch the TensorFlow Summit video, there's a little board like this and the guy is saying "yes, yes", and hopefully the LED lights up, though I have to say the demo was not that convincing. Apparently there's also an image classifier coming soon. This is a tiny little chip; if you know how big these pins are, you'll see how tiny this MCU is. So that's an interesting new direction, and particularly if you're into hardware, it could be very relevant.
Okay, so, wrapping up. I talked about TensorFlow Extended: it's all ready now, released over the last six months or so. If you've been doing this for a while, you've probably sellotaped together your own version of this thing and then realized it could be done better if you'd started better. If you're coming to this fresh, this is a better way to start. You may not yet see why you need all of these pieces, but you'll probably end up needing all of them, and the orchestration and metadata are good things to have; if you don't sort them out first, you'll never get around to sorting them out.

These components are not Google-only, either: they're fairly simple little blocks of Python, so you can make your own components and slot in whatever you need along the way. It's designed to be extensible, but it gives you good stuff out of the box. There's a TFX page on tensorflow.org with a bunch of material; if you're interested, go there. TensorFlow Lite lets us serve models on mobile and embedded devices, optimized for speed and size. This is one of the things that makes the TensorFlow ecosystem much more compelling than other frameworks: not only do we have the modelling part, we have all of the surrounding stuff, from heavy-duty serving down to mobile and smart objects. If you're interested in that, there's a whole TensorFlow Lite site; go there.

So this is now my advert section. We have a deep learning meetup group, TensorFlow and Deep Learning Singapore, on meetup.com. We're currently over 3,900 members, and we would love to get to 4,000, so please sign up and come. We just had one last weekend, hosted not in this room but the one next door, the main Everest meeting room. We try always to have something for beginners, something like "here's how I made this model work". Basically, we don't want people on stage who don't show code. I showed a little bit of code today, but typically we'd have Colab notebooks so that you can go and have a play. Then we have something from the bleeding edge: Sam and I take turns presenting a paper, as in "this came out in the last two weeks, here's how it works". That's one of the reasons we run the group: it pushes us to make stuff, talk about stuff, and read stuff.

And for the other people who are also pushing themselves, we're happy to have lightning talks; we've had someone here volunteer to do one. It's a five-minute thing. I've given far too many slides today, but for a lightning talk there's a limit of ten slides, and five slides will be fine. If you think about it, five slides for a lightning talk is: here's my cover page, here's the problem, here's what I wanted to do, here's why it sort of worked, the end. That's your presentation. We'd be super happy for anyone enthusiastic to come up front. The reality is that it's quite easy to speak to people, because as you've seen, no one is heckling you; everyone either wants to learn something, or wants to be undisturbed while they read on their phone, or is looking forward to the food. It's actually much less threatening to be up talking than you'd think, particularly if you've done something cool, so we encourage people who want to give their first little talk to do it this way. Typically people then have trouble fitting into five slides, the talk grows and grows, and now they're a speaker at a conference. Not that anyone is unhappy with their current company, but if you wanted something on your resume, that would be a thing.

We also run deep learning courses here in Singapore. We have a JumpStart course, which is two days long, plus about one day of online content.
There's a project component: we force people to make things with code, just like you're making things with code here. This one you do pay for, so I guess that's one benefit of today. But fortunately we have an arrangement with IMDA where these courses are approved for funding, so if you're a Singapore citizen or PR, you can get from 70% to 100% off. If you're a student, it's all off; if you're as old as me, it's still significant savings. Singapore is very good at wanting to upskill people, and this is fantastic.

Beyond that, we also have Advanced Computer Vision, which covers things like object detection and segmentation; there's lots of fun stuff beyond the transfer learning and other vision material we do in the JumpStart. There's Advanced NLP: if you've heard of models like GPT-2 or BERT, this is where we deal with them, and if you're an NLP person, you'll know there's a lot more to it than we can cover in three hours of the JumpStart. And the last one, the self-supervised course, is all about the data in the world which isn't labeled, which is most of the data in the world. For instance, you could have carefully labeled self-driving car data, with every car and dustbin and the road and the rain labeled; those are extremely good but expensive datasets. Or you could just take video from taxi drivers driving around the city and learn from the fact that they didn't crash, or hopefully they didn't crash. That's all unlabeled data, and there's vastly more unlabeled data than labeled data. This is a course we haven't yet run, but we've got the content; we need to get it approved and then we'll run it. Or think of understanding how cats move: you could get huge amounts of YouTube video of cats and then understand cats, which would be cool if you wanted to build a cat, or the same for cooking, or whatever. Unlabeled data is a huge source of material, and if machines can do a better job with it, that would be fantastic.

So those are our courses. They definitely cost money; the money side is handled by SGInnovate, so if you go under their talent section, you'll find these courses with Red Dragon on them. We at Red Dragon, and I'm not sure whether Sam mentioned this, are also interested in interns. This is for people with a burning desire to do machine learning, and our bar is now rising: fortunately, we've had interns who didn't just go through the process; they worked on their own stuff, we helped them along, and they published a paper at NIPS, or NeurIPS now. So that worked out quite well, and it's particularly suited to people interested in the academic direction. We don't want people photocopying, or just training a model; in some sense, for us, it's cheap labor for experiments, and that's what we want people doing. The thing is, we don't only do stochastic gradient descent, which is what you do when training a model; a lot of the Google models are trained using graduate student descent, where you take a whole bunch of graduate students, get them to train different models, and now you've got a good model. That's not an innovation on our part; it's what people do.

Anyway, I'm done. Thank you very much indeed.
If there are any questions, we can either talk about them here, or don't worry. Okay, thank you.