Hello, everyone. My name is Jason Mayes. I'm the developer advocate for TensorFlow.js here at Google, which basically means that if you're using machine learning in JavaScript in some shape or form out in the wild, there's a good chance we'll cross paths at some point. Now with that, today I'm going to talk to you about using machine learning in JavaScript, of course. So let's get started. Now, first up, I want to talk about how machine learning has the potential to revolutionize every industry, not just the tech ones, but all of them. In fact, we could be standing right here at the beginning of a new age. We've already been through the industrial and scientific revolutions, but what about the future? There could be a machine learning one too, and we could be at the very beginning of that right now. This is a really exciting time to start learning about machine learning, as you can jump on the bandwagon early and really get involved and have impact. Of course, before I get started on that, what's the difference between artificial intelligence, machine learning, and deep learning? I'm sure many of you today have very different backgrounds, and it's important to understand what this is all about, where it comes from, and what all these key terms mean, so we can understand what we're going to be making later on. Now, first off, I want to start with artificial intelligence, also known as AI. This is essentially the science of making things smart, or, more formally, human intelligence exhibited by machines. But this is a very broad term, and right now we're actually in a place of narrow AI. This basically means that a system can do one or a few things just as well as a human counterpart could in that niche area, such as recognizing objects. And a great example of that is when people in the medical industry are trying to understand what brain tumors look like.
Nowadays, experts use machine learning to actually work alongside them to help point out what parts of an image may contain a brain tumor, for example. And this leads to better results, because sometimes it's just too grainy for the human eye to see, but ML can pick up on these fine differences, which leads to better results for both the patient and, of course, for the doctor. Now, machine learning on the other hand, or ML for short, is an approach to achieve the artificial intelligence that we just spoke about on the previous slide. Now, the key part about these systems is that they can be reused, and this is done by creating systems that can learn to find patterns in the data presented to them. This is at the implementation level, if you will. So if you have an ML system that is trained to recognize cats, you can use the same system to recognize dogs just by giving it different sample training data. So if you just roll back to traditional programming, as you can see on this slide here, you can see that in the old days we'd use lots of conditional statements in order to find spam emails, for example. If the email contains a certain word, mark it as spam. Now, this is not very efficient, because the spammer can just change the words slightly and get around those conditional statements. Now, fast forward to today, and machine learning programs essentially get tons of emails to classify, which are marked as spam by you, and they try to find what attributes of those emails led to them being classified as spam, all by themselves. So now there's no battle between programmer and spammer, and instead the end user can concentrate on making great software instead. So what common use cases are there then? Well, actually, there's quite a few. These are the typical use cases I see machine learning being used for. There are others, of course, but we've got things like computer vision, like the object detection example we just spoke about.
We've got numerical things like regression, predicting a number; natural language, for example, text toxicity or sentiment analysis. We've got audio for speech commands, for example, and my personal favorite is generative, which is essentially things like style transfer and the creative kind of applications of ML. And you can see on this slide an example from NVIDIA, whereby they are generating human faces, and these faces do not actually exist in the real world. It's been trained on celebrities in this case, and you can see how this research can now produce very cool imagery. So what about deep learning? Essentially, deep learning is a technique for implementing the machine learning that we just spoke about on the previous slide. And one such deep learning technique is known as deep neural networks. So you can think of deep learning as the algorithm you might choose to use in your machine learning program, essentially. So if you haven't heard of deep neural networks, don't worry. Essentially, these are just programming structures arranged in layers that are loosely trying to mimic how we believe the human brain works, essentially learning patterns of patterns, and we get into that in more detail later in the talk. So in summary here, you can see how all these terms are actually interlinked. We have the deep learning that feeds into the machine learning, so the algorithm that goes into the implementation, and that machine learning gives us this grand illusion of artificial intelligence, which is what we're trying to aim for longer term. And these terms actually go back to the 1950s and 60s. It's not anything new. It's just that now we have the power, with all the cheap processors and memory, to actually make use of these techniques at scale with all the data that we now have. This previously wasn't possible. So how do we train machine learning systems? And that's a great question. Essentially, we need features and attributes.
And you can see here from this example, if we just pretend to be farmers for a second trying to classify apples and oranges, two features or attributes you might want to use would be weight and color. These things are easy to measure digitally and can be accessed at scale. So once you've got those, if we go back to our high school maths, we can try to plot those features and attributes on this 2D graph here. We've got weight on the y-axis and color on the x-axis. And you can see how the green apples and red apples kind of cluster together there at the bottom, in their respective color spectrums. And then the oranges, because they're juicier, are actually slightly higher up on the weight axis there. And we can draw a line to separate the apples and oranges. And in a way, this is actually a very naive form of machine learning, if we could get a computer to figure out the equation of that line. Because if we now classify a new piece of fruit, we take its weight and its color and we plot it on this graph. If it falls above the line, we can say with some level of confidence that that piece of fruit is an orange. And if it falls below the line, we can assume it's probably an apple. And that's kind of what is going on in all of these systems. The machine learning is essentially just trying to figure out the best way to separate the data so that it can classify it later on. What about bad features and attributes? It's not always obvious what we should choose here. And here is a great example: ripeness and number of seeds. This could lead to a scatterplot as you see on the chart right now, and there's no easy way to separate this data with a straight line, or even a curved line for that matter. And this is a good example of a bad choice of features and attributes. And you might be like, well, Jason, why would you choose such things? Well, it's not always as simple as apples and oranges. Imagine the brain tumors we were talking about earlier on.
What features and attributes would you use to be able to distinguish a positive from a negative result in that case? It gets very hard very quickly. And this is known as feature engineering: finding the set of features and attributes that give you the best separation in the data. And that's what folks get paid a lot of money to figure out properly. Let's look at higher dimensions. In our simple example, we had just two dimensions. Let's assume we had three. In that case, we'd need to plot it on a three-dimensional graph, as you can see on the right-hand side. And here, instead of using a line, we need a plane, or a rectangle in 3D space, if you will, to be able to separate the data in a meaningful way. Now, it's actually interesting to note that most machine learning problems are actually using much higher dimensions than three. Unfortunately, our human brains just can't comprehend what that looks like, but you have to trust me that the math is actually the same. And instead of using a plane, you're using something called a hyperplane. And that just means it has one dimension less than the number of dimensions that you're working with. But the math works out the same, and you're just using this high-dimensional space and dividing it up in much the same way. So it should be easy, right? We've got a dog, we've got a mop. What could possibly go wrong? Well, some dogs look like mops, and vice versa. And my point in bringing this up is that you've got to be aware of the bias in your training data. One of the biggest challenges you'll face is finding enough unbiased training data for the situations you want to use it in. So in the case of recognizing a cat, something as simple as a cat, you might need to have 10,000 images of cats of different breeds, at different stages of the life cycle, of different shapes and sizes, in different environments, in different lighting conditions, taken on different cameras.
All this is required to have the best chance of understanding what cat pixels actually are. And without that, you may end up having biases in your machine learning model, which would be very bad. The other point to note here is that data is not always imagery. It could be tables of data with text, or sensor recordings, sound samples, and pretty much anything else you can think of. As long as it can be represented numerically, we can use it in an ML system. So that brings us, of course, to JavaScript. Why would we want to do machine learning in JavaScript? And that is a great question, too. In fact, JavaScript can run pretty much everywhere: in the web browser, on the server side, desktop, mobile, and even Internet of Things. And if we dive into each one of those, you can see many of the technologies that you already know and love. On the left-hand side, there are popular web browsers you might use. On the server side, we have Node.js. For mobile, we can support React Native, and also things like WeChat and progressive web apps, of course. And for desktop, Electron can be used to write native desktop applications. And, of course, Raspberry Pi for Internet of Things. And JavaScript is the only language that can be used across all of these devices with ease, without any extra add-ons and plugins. And that is a very unique point about JavaScript on its own, which I'm sure you're already aware of. And, of course, with TensorFlow.js, you can run models, you can retrain them via transfer learning, and you can write your machine learning models completely from scratch if you so desire, just like you could do in Python if you're familiar with machine learning in Python. And that allows you to basically dream up anything you might want, from augmented reality, gesture and sound recognition, to conversational AI, whatever it might be. You can do that in JavaScript now as well, giving you superpowers in the browser and beyond.
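Before we get into the TensorFlow.js specifics, the "draw a line to separate the data" idea from the apples-and-oranges example earlier can be sketched in plain JavaScript. This is a toy illustration only, not TensorFlow.js; the weights and bias below are invented numbers for the example, not a trained model:

```javascript
// A separating line (or hyperplane in higher dimensions) can be written as
// w · x + b = 0. Classifying a point is just checking which side it falls on.
function classify(weights, bias, features) {
  const score = features.reduce((sum, x, i) => sum + weights[i] * x, bias);
  return score > 0 ? "orange" : "apple";
}

// Toy 2D example: feature 0 is weight in grams, feature 1 is a colour score.
// These numbers are made up purely for illustration.
const w = [0.02, -0.5]; // oranges tend to be heavier, so weight pushes the score up
const b = -3;

console.log(classify(w, b, [200, 1])); // heavy fruit: lands above the line -> "orange"
console.log(classify(w, b, [120, 2])); // lighter fruit: lands below the line -> "apple"
```

A real ML system learns the weights and bias from training data rather than having them hand-picked, but the classification step, checking which side of the line (or hyperplane) a point falls on, is the same idea.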
So there's three ways you can go about using machine learning in JavaScript, and we're going to go through all of those now. The first one is pre-trained models. These are essentially really easy-to-use JavaScript classes for common use cases. And you can see we have many of these already, from object detection, to body segmentation, which allows you to find where the body is in an image, to pose estimation to detect the skeleton, and we've got speech commands, and much, much more. And those are some of our newer models on the right-hand side there. You can see we now support FaceMesh, which can recognize 468 landmarks on the human face. We've got hand pose, which can detect similar things for your hand, and also the BERT Q&A model, which allows you to do question-and-answer-based natural language processing all in the web browser. So let's see some of these in action and see how they perform. So first up, I want to talk about object recognition. This is using COCO-SSD, which is the name of the machine learning model that we're using to power this, and that has been trained on 90 object classes, such as these dogs on the right-hand side. So 90 common objects can be recognized out of the box. Now, what's important is that you can see that this also gives back the bounding box data, which allows you to localize the object in the image, and that's why we call this object recognition instead of image recognition. Image recognition is where you know that the thing exists, but you don't know where it is. So this is a pretty cool one to start with, and I'm going to show you how we can write code to make this actually work ourselves. So let's dive into the code now. So first up, let's look at the HTML. This is pretty boilerplate stuff.
We're simply going to import a style sheet there, style.css, and then in our main body we're going to have a demos section that initially is going to be invisible, so you can see the class invisible is set at the very beginning there. And then we have some images that we want to be able to classify on click, so these all have the class classifyOnClick, with an image contained within that containing div. Now, these can be any images you want. And at the end there, you can see we simply have three script imports. The first one is essentially bringing in the TensorFlow.js bundle. The second one is bringing in the COCO-SSD machine learning model, and the third one is, of course, the JavaScript we're going to write to get all of this working. So, looking at the first lines of the JavaScript: first of all, we're just going to define a constant called demosSection, and that's just going to get a reference to the area where all of our images are living. We're then going to set a variable modelHasLoaded and set it to false, and also define a variable for the model, to store it once it has loaded. Next, we need to load the model, of course, so all we need to do is call cocoSsd.load, and because this is an async function, we use the then method to call back an anonymous function, in this case, with the results. You can see that anonymous function simply takes the loaded model as a parameter, and we can then assign that to our more global variable called model, and we can set modelHasLoaded to true so we know that things are ready to use. Finally, we remove the invisible class from our demos section to make sure it's now visible and not grayed out like it was before. So next, we get a reference to the image containers, i.e. all the divs that had that classifyOnClick class. We can then loop through all of those and essentially add a click handler to each, so that we can decide what to do when each image within it is clicked. Here we go, here's the handleClick definition.
We simply check if the model has loaded. If it hasn't, we're going to return straight away, because there's no point doing anything unless the model is available to use. If it is available to use, we're going to essentially call model.detect, and we're going to pass it the image that was clicked, so the event target in this case. Then again, this is an async operation, so we use then to call our other function, handlePredictions, once it's ready. And in handlePredictions, you can see we're now passed a predictions object that we can simply log if we wish to inspect it as we so desire. But essentially this contains all the machine learning predictions that came back for that single image that we tried to classify. So we can loop through those predictions, and we can create a new paragraph element for each and set what we saw along with its confidence. And then we can also set the margin of this paragraph so it sits nicely at the bottom of the bounding box. And then of course this thing called highlighter is essentially the bounding box that I've created, and we're just setting the x, y, width and height coordinates of that element so that it sits in the right place in the context of its parent div. And then of course we just add these two elements to the DOM, and they should now be visible. And finally, the CSS is pretty self-explanatory for the various elements of the GUI we're changing. So if we put it all together, this is what we get. So as you can see, this is the code running, and I can now click on one of these images, and you can see instantly I get results coming back with the bounding boxes showing the items that it's found in each image. I've actually added a little extra bit of code here to do the same thing but with the webcam. And if I enable this, you can see that I can now see myself too.
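Putting that whole walkthrough together, the demo fits on a single page. This is a sketch reconstructed from the description above, not the exact source; in particular, the CDN URLs, element IDs, and class names are assumptions, so adjust them to your own setup:

```html
<!-- Minimal sketch of the object detection demo described above. -->
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css" />
  </head>
  <body>
    <section id="demos" class="invisible">
      <div class="classifyOnClick">
        <img src="dog.jpg" alt="A dog to classify" />
      </div>
    </section>

    <!-- 1. TensorFlow.js, 2. the COCO-SSD model, 3. our own script. -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"></script>
    <script>
      const demosSection = document.getElementById('demos');
      let modelHasLoaded = false;
      let model = undefined;

      // Load the model asynchronously, then reveal the demo area.
      cocoSsd.load().then(function (loadedModel) {
        model = loadedModel;
        modelHasLoaded = true;
        demosSection.classList.remove('invisible');
      });

      // Attach a click handler to each image we want to classify.
      const imageContainers = document.getElementsByClassName('classifyOnClick');
      for (const container of imageContainers) {
        container.children[0].addEventListener('click', handleClick);
      }

      function handleClick(event) {
        if (!modelHasLoaded) return;
        // Each prediction includes a class name, a confidence score, and a
        // bounding box as [x, y, width, height].
        model.detect(event.target).then(function (predictions) {
          console.log(predictions);
        });
      }
    </script>
  </body>
</html>
```

The paragraph and highlighter elements from the talk are omitted here to keep the sketch short; they would be created and positioned inside handleClick using the bounding box values from each prediction.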
And notice how the performance is pretty cool: it's running at a high frames per second, and all of this is running live in the web browser, which means, of course, that your privacy is also preserved, because this data is not being sent to a server for classification. So the next thing I want to talk about is FaceMesh. You can see here how it can recognize 468 unique points on the human face, and it's just 3 megabytes in size. In fact, many people are starting to use this in creative ways, such as ModiFace, which is part of the L'Oréal group, who are using it for AR make-up try-on, as you can see from the image on the right. This lady is not wearing any make-up on her lips. In fact, the lips are being colored dynamically at runtime in the browser, and we can apply that because we know where the lips are from FaceMesh. Pretty cool. But let's see this running for real using my face, so I can explain a little bit more. Okay, so now you can see my face in the web browser, and as I open and close my mouth, you can see it reacts really well. It's running at a high frames per second, but this is just running on the CPU. If I use the switch at the top right, we can get even better performance by running on my graphics card. Now, in addition to doing the machine learning in real time, because JavaScript is obviously great at graphics, we're also rendering a 3D point cloud that we can tinker with at the same time. As you can see, I can move my face around on the 3D point cloud too, so you can use this to make pretty much anything you want. So next up is body segmentation. This model allows you to distinguish 24 unique body areas across multiple bodies in real time, as you can see from the animation at the bottom here. You can see how well that segments, and it even gives you an estimation of the pose of each body too, so you're aware of where it thinks the skeleton is, which can be used to do gesture recognition and much, much more.
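Models like FaceMesh follow the same load-then-predict pattern we saw with object detection. As a hedged sketch based on the original facemesh package (the CDN URLs and exact property names are assumptions, so check the model's README before relying on them):

```html
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/facemesh"></script>
<script>
  async function run() {
    const video = document.querySelector('video');
    // Load the FaceMesh model, then ask it for faces in the current frame.
    const model = await facemesh.load();
    const faces = await model.estimateFaces(video);
    // Each detected face contains up to 468 3D landmark positions.
    for (const face of faces) {
      console.log(face.scaledMesh.length, 'landmarks found');
    }
  }
  run();
</script>
```

For live use you would call estimateFaces in a requestAnimationFrame loop rather than once, so the landmarks track the video feed frame by frame.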
Now, models such as BodyPix can be used in really delightful ways too. Here's two examples that I created in just a couple of days that allow you to do some powerful things. On the left-hand side, you can see how I remove myself from the webcam in real time, rendering myself invisible, much like a Harry Potter cloak or something like this. And as I get on the bed, you can see how the bed still deforms even though I'm removed from the cam feed in real time. Now, on the right-hand side, you can see another demo I created that allows me to measure my body size in real time. Now, I don't know about you, but whenever I'm buying clothes I never know what size I am, so I made this to help me find my size for different brands on the websites that I use. And in under 15 seconds I can get a result back for my chest measurements, my inside leg, and all that kind of fun stuff, in a much more frictionless way. And of course, all of this runs in the web browser, so my privacy is preserved. None of these images are going to a server. And of course, all this can give you superpowers too. What if you combine TensorFlow.js with something like WebGL shaders? In that case, you can get an effect like this, which was made by one of the guys in our community in the USA, which can shoot lasers from your mouth and eyes, all in real time at a buttery-smooth 60 frames per second. But let's not stop there. If we combine it with WebXR, an emerging web standard, you can now even project people from magazines into your room in real time too. And this guy is using this on his phone, and then he can walk up to the person and kind of meet them in real life, virtually speaking. So that's pretty cool. And I thought, well, if I can do this, then why not go one step further and combine it with WebRTC to teleport myself in real time.
And you can see here how I can project myself from my bedroom into another living space that could be somewhere else in the world, to meet my friends and family, such that I can be closer to them even when I'm not. And having tried this myself, it actually does feel better than a regular video call, because you can walk up to the person and move around them and all this kind of stuff, which you just don't get with a regular video call. The second way you can use TensorFlow.js is via transfer learning. This is where you retrain existing models to work with your own data. And this is the next logical step after using our pre-trained models, to make things more customised to your needs. Now, if you are an ML expert, you can of course code all this stuff yourself, but I want to show you two ways today of doing this in a super simple fashion. Now, the first one is Teachable Machine. This is a website created by Google that allows you to retrain models in the web browser with your own data, for things like recognising an object, or speech recognition, or pose estimation, for example. And in just a few clicks, you can make your own ML model. So let's try this out right now and see how easy it is to use for something like a prototype. So here's Teachable Machine. We can click on Image Project to start, and I can click on Webcam, and you can see now that I'm just going to take a few samples of my head in front of the webcam. And then we take a similar number of samples, but this time I'm going to use this deck of cards, and we've got a similar number of images, as you can see. I'm now going to click on Train Model, and essentially that means it's retraining the top layers of the model that we're using, so that we can classify new data using things it learnt from before. So in just a few seconds this process will be complete, and we can now see a live prediction coming from the webcam. And hopefully Class 1 is predicted right now, and if I put the deck of cards in front, it should now show Class 2. Class 1, Class 2, and look how responsive that is.
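As an aside, once you export a model from Teachable Machine, loading it back on your own page takes only a few lines. A hedged sketch, assuming the @teachablemachine/image helper library; the CDN URLs are assumptions and the model URL is a placeholder, since Teachable Machine gives you the real snippet in its export dialog:

```html
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@teachablemachine/image/dist/teachablemachine-image.min.js"></script>
<script>
  // Placeholder: Teachable Machine provides the real hosted URL on export.
  const MODEL_URL = 'https://example.com/my-model/';

  async function run() {
    // Load the exported model plus its metadata (class names, etc.).
    const model = await tmImage.load(
      MODEL_URL + 'model.json',
      MODEL_URL + 'metadata.json'
    );
    const image = document.getElementById('webcam');
    // One entry per class, e.g. { className: 'Class 1', probability: 0.98 }.
    const predictions = await model.predict(image);
    console.log(predictions);
  }
  run();
</script>
```

From there, it's ordinary JavaScript to act on the top prediction, for example revealing a YouTube video when a particular class is detected.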
It's really, really fast, and you can get this great performance in just a matter of seconds. I feel like in 30 seconds we've made a custom machine learning model. So do try that out in your spare time, and you can use this in prototypes. You can simply use the Export Model option at the top right there, and you can save the JSON files that you need to then load this model on your own custom website later on, to do something more useful. So maybe I can show a deck of cards and reveal a YouTube video, or whatever I want to do. Now, the next method I want to show you is for when you want to do something more for a production use case, which is more than just a prototype. You might have a lot more data, and of course in the web browser you're limited by the RAM that you can use in a single tab in Chrome. So if you have, say, gigabytes of data, you can use Cloud AutoML, which trains a production-grade model in the cloud, which you can then export to TensorFlow.js just like we did before. So here you can see I've just uploaded lots of data, of flowers in this case, lots of different folders of different types of flowers. And all you need to do is then specify whether you want to train for higher accuracy or faster predictions, and of course with machine learning there's always a trade-off between these two things, but you can choose which you prefer. You click next, and then after a few hours you have the option to export to TensorFlow.js, as you see on this slide. And it's super simple to use this exported JSON file. In fact, here's the code all in one slide. All we need to do is include the TensorFlow.js library at the top here, we then include the AutoML library as well, and then below this we have a new image that we have never seen before. This is just a daisy image I found on the internet, and we can then essentially use this as the image we want to classify. And then, in just three lines of JavaScript below, we can now classify the image. So the first thing we do is we wait for the model to load, so we use
tf.automl.loadImageClassification, and we simply pass it a reference to the model.json file that you would have downloaded from Cloud AutoML, and that can be hosted on your CDN or your website or wherever you so desire. Because this is an asynchronous operation, we use the await keyword, of course, and then the result gets assigned to the model when it's ready. We then get a reference to our daisy image, which is the new image we want to classify in this case, and we simply use model.classify and pass it the image, and await the results to come back. And once this is assigned to the predictions object, which is just simply a JSON object, we can parse through it and see all the predictions that came back from the ML model for that single image. And of course, you can call model.classify multiple times once the model has loaded, so if you were to use this with a webcam, you could then of course do that instead and have it running in real time on webcam data. And the third way, of course, to use TensorFlow.js is to write your own code from scratch. Now, this is for the machine learning experts out there, or people who want to go more hands-on and low-level. And of course, going into that would be too much for a 30-minute presentation today, but there's plenty of tutorials on our website, which I'll share with you later, to get started with this. But today I'm going to go through the superpowers and performance benefits you can get by running in JavaScript and Node, for example. So first up, I want to talk about the different APIs we have available. There's two APIs. The first one is the Layers API, which is essentially like Keras if you've used Python in the past, and that is a high-level API that's super easy to use. Now, below this we have the Ops API, which is much more mathematical, and this is like the original TensorFlow stuff, if you will, and that allows you to do all the funky linear algebra and all this kind of stuff. So depending which way you want to go, there's two flavors of TensorFlow.js you can use here based on
your experience and capabilities. So you can see how this comes together. Essentially, we've got our models at the top there, based upon the Layers API, and then that sits upon the Core or Ops API just below that. Now, that can talk to different environments, such as the client side, and within the client side you might have different environments as well, like the browser, WeChat, or React Native, for example. And each one of these environments knows how to talk to different back ends, such as the CPU, which is always available, but also other things like WebGL, if you want graphics card acceleration on the front end, or Wasm, WebAssembly, if you want better CPU performance. And there's a similar story, of course, for the back end on the server side with Node.js. And here it's important to note that we actually have the same performance as Python land, because here we're actually calling the same TensorFlow CPU and GPU bindings that Python has to the C libraries that TensorFlow itself is written in, and that allows us to get the same CUDA acceleration and AVX support for the processor, to make sure things are running as fast as possible. And in fact, if for some reason your machine learning team is still using Python, then of course you can load in saved Python models from the Layers API if they're using Keras, and you can use the TensorFlow SavedModel format via our Ops API, directly in Node.js without conversion. So you can just take a SavedModel and then use that in Node.js. Now, if you want to use one of those SavedModels on the client side, then you have to use our command-line TensorFlow.js converter, and that will convert the model into the JSON format we need to run in the web browser. So let's look at performance then. Here is TensorFlow.js versus Python running MobileNet, and these are the inference times, i.e. how long it takes to classify the thing we're looking for in the image at the top there. You can see, running on the graphics card, in Python it's 7.98 milliseconds, and in Node.js just 8.81
milliseconds. So, you know, that's within a certain margin of error anyway, and it's pretty much the same for all intents and purposes. Now, what gets interesting, of course, is that if you have a lot of pre- and post-processing, which a lot of ML models basically do, because in order for the model to digest the data you need to manipulate the original data into something that is usable in machine learning land, then you actually get further performance increases in Node.js because of the just-in-time compiler of JavaScript itself. In fact, we've seen with the people at Hugging Face, who are quite famous for making natural language processing models, that they've seen a two times performance boost just by switching to Node.js for their machine learning pre- and post-processing. So now, if we focus on the client side for just a second, here are five superpowers you get which are hard or impossible to achieve on the server side. Now, the first one is privacy. As I kind of hinted at before, all of these machine learning models are running in the web browser on the client machine. That means at no point is any of the sensor data going to a third-party server for classification, and that's really important in today's world, where privacy is always top of mind, and with TensorFlow.js you get that for free, of course. Now, linked to this is lower latency. Because no server is involved when you're running on the client side, we don't have that round-trip time from the mobile device, let's say, to the server, which could be over 100 milliseconds or more on a bad mobile network connection. And of course, that leads to lower cost. If you have a reasonably popular website, you might be spending tens of thousands of dollars on graphics cards and beefy processors to run those machine learning models. By running on the client side, all of that hardware is no longer needed, and of course you can just execute directly on the client machine. As you all know, interactivity is a big thing for JavaScript. It's kind of been
designed for that from day one, so we have a much richer ecosystem for graphics and charting and all that kind of fun stuff. And the final point: reach and scale, which we all know and love, being web developers ourselves. Essentially, anyone can click on a link in the web browser and have the machine learning loaded for free, versus trying to do this in other ways on the server side, which would require you to first of all understand Linux and install Linux, then install the TensorFlow stuff and the drivers for CUDA from NVIDIA, then clone the GitHub repo and compile it and make sure it runs with the environment on the server side. So all of that hassle goes away when you're running on the client side. That can get you more eyes on your research in machine learning, which could be very valuable if you're a researcher, for example. Maybe that means 10,000 people can try your model out instead of the 5 people in your lab, and that can maybe uncover bugs or biases in your model that you can then fix before it sees prime time. Now, flipping to the server side for just a second, there are also some benefits there too, of course, if you choose to use Node.js. So obviously we can use the TensorFlow SavedModel without conversion, as we spoke about. We can also run larger models than we can on the client side, due to the memory limitations in Chrome per tab, of course. It allows you to write code in just one language, which is of course JavaScript, and needless to say, a lot of devs use JavaScript; according to the Stack Overflow survey of 2019, I believe 67% of people are now using JavaScript in some capacity, which is pretty cool. And then there's the performance benefits, of course, that you can get from the just-in-time compiler boost in Node.js over using machine learning in Python, for example. So with that, I would like to talk to you a little bit about the resources you can use to get started, if you're interested. If there's one slide you want to bookmark today, let it be this one and
the next one, actually. So essentially, here's some tutorials you can use to get started. These are codelabs; you can walk through them step by step and learn as you go. These are really robust ways to learn some of the things with TensorFlow.js and machine learning principles in general. And then, of course, this slide has pretty much everything else. Here's our website to get started. The models that you've seen in this demonstration, and many more, are available on our GitHub there, and we have a Google Group to answer any more technical questions that you may have or may be thinking about later on. And then finally, we have CodePen and Glitch, which have boilerplate code you can use to get started. Now, on the right-hand side is our recommended reading material. This is a great book that covers everything. Even if you have no machine learning background at all, that's completely fine; as long as you know some basic JavaScript, this book will take you through everything you need to know to get your machine learning chops up to scratch. And with that, please come join our community. In fact, here's just a few more examples of what people have been making in just the last few weeks, and this is growing every week. If you check out the #MadeWithTFJS hashtag on Twitter or LinkedIn, you can find what people are making right now, and please do contribute your own for a chance to be featured at future events. So the final thing I want to leave you with is this last demo, from a guy in Tokyo, Japan. He's a dancer, and he's now used machine learning with TensorFlow.js to make his next hip-hop video, as you can see here. And it's really great to see creative folks starting to embrace machine learning as well. It's no longer just for the 1% of people with PhDs; it's now for everyone, and hopefully TensorFlow.js can make this even more accessible to all in the future. And I'm really excited to see what you will make, and please do tag us with #MadeWithTFJS if you do make anything in the
future, so we can share it with the team. So with that, please do stay in touch. I'm happy to answer your questions after the talk, or connect with me on LinkedIn or Twitter, and I'm happy to answer questions over there as well. Thank you very much for watching, and see you next time.