Hello, everyone. I'm Jason Mayes, developer advocate for TensorFlow.js here at Google, and today we're going to be talking about how you can give your future web applications superpowers by using machine learning powered by TensorFlow.js. But you might be wondering, as a web developer, why should I care about machine learning? First up, I just want to say that machine learning could influence every industry out there, and as web devs we're in a unique position where the apps we create could be used in any one of those industries. So there's a good chance that in the not-too-distant future, clients will ask us to use machine learning too, as there are some unique benefits we can get by doing this client side in the browser, as we'll see later. Now, as a web engineer myself, it was not until a few years ago that I started using machine learning models in my web prototypes. So I'd like to share with you today just a few examples of how we managed to level up our apps to do some pretty amazing things in JavaScript. But wait, what is all this about exactly? Let's back up just a little bit and address the elephant in the room: what exactly are all these buzzwords anyhow? What's the difference between artificial intelligence, machine learning, and deep learning? Let's take a quick 101 on what's going on behind the scenes to help demystify these concepts. First up, artificial intelligence, or AI for short, is essentially defined as human intelligence exhibited by machines. But this is a very broad term. In fact, we're at a point in time where we typically work with narrow AI. All that means is that these systems can do one or a couple of things as well as or better than a human expert in that area, like recognizing objects. A great example of that is in the medical industry, where doctors use AI systems to help them identify issues in the grainy images that come back from scans of a human body. 
This allows them to spot things they may have otherwise missed, increasing accuracy and reducing the time taken to process scans, which of course is a great result for doctors and patients alike. There are a lot of systems right now that work hand in hand with their human counterparts to create a workflow that's more efficient than ever before. Next up, we have machine learning, or ML for short. This is at the implementation level. It's the actual program we create that can learn from the training data presented to it to find patterns in that data. It can then use this knowledge to classify previously unseen examples of the same class in the future. Machine learning is an approach to achieving the AI we just spoke about on the previous slides. And the key thing here is that these systems, once programmed, can be reused. If I create an ML system that recognizes cats, I can use the same code without modification to recognize dogs, just by feeding it different training images to learn from. This is very powerful and a big difference from how we used to program in the past. Take spam email as an example. With traditional programming, we may have had a bunch of conditionals or lookups to check if a word was associated with spam. If it was, we'd block the email. However, the spammer can get savvy to this, modify the words just slightly, and then our system is broken. A war of keeping up to date between programmer and spammer starts to develop, which is not a good use of our time. Fast forward to today, and we can use machine learning to solve this problem. Instead, thousands of users mark emails as spam, and the machine learning automatically figures out which words and features are most likely to have contributed. We can retrain the model every day with fresh content, and now no human needs to be involved, freeing up time to do other things. And here are just a few more common use cases for machine learning. 
Things like object detection, recognizing an object in an image. Or what about regression, which basically means predicting a numerical value from some input value? For example, what's the price of a house whose square footage is 1,000 square feet? With enough data, you can predict this with machine learning. Or how about natural language processing to understand human language? With this, we can mark whether a sentence in a blog post comment is toxic, or whether it's a positive or a negative statement. One could use this to assist in blocking trolls on a website before a comment even gets posted. Or how about audio for speech recognition? I'm sure many of you have smartphones or have tried the Web Speech APIs, and this is all powered by machine learning too. And then finally, we have generative or creative examples, one of which we can see on this slide right now, created by NVIDIA's recent research. None of the faces in this animation are real. They've all been dreamt up by the machine learning model, just like if I asked you to imagine a purple cat, you could probably do so even though you've never seen one before. The machine learning here has learned the essence of what a human face is composed of and was then asked to generate new ones. And what about deep learning? Deep learning is essentially one technique you can use to implement the machine learning program we just spoke about on the previous slides. You can think of it as one of many possible algorithms you can pick from to make the program learn from the data that you present to it. There are of course many other techniques too. But essentially, deep learning is where the code structures are arranged in many layers, which loosely mimic how we believe the human brain works, learning more abstract patterns the further down the layers you go. And what do I mean by that? Well, imagine at the early stages, you can recognize something simple like lines. 
You go one level deeper, and those lines might combine to allow you to recognize shapes. One level deeper still, and those shapes might combine to allow you to recognize objects. For example, a face might be represented by several shape features in certain positions relative to each other. Generally, the deeper the network, the more advanced the patterns we can recognize, but this comes at the cost of processing power. So in summary, we can see how these three terms are linked. Deep learning is an algorithm you can use to drive the machine learning program. And this machine learning program gives us the illusion of artificial intelligence, if you will. These concepts go back to the 1950s. They're not new, but it's only now that we've got the resources at a cheap enough cost, such as the RAM, the CPU, and the GPU, to make these ideas truly feasible. We're living in a genuinely exciting time, at the start of a new wave for how we create smarter systems in the future. So the next question you might be wondering is: how on earth do we train such systems? That's a great question. Now, I know we all work in many different industries, so feel free to adapt the following to your own area, but let's do a thought experiment and pretend we're trying to make a web-connected system for farmers who want to classify apples and oranges, to speed up the delivery of picked fruits that are currently mixed together and need to be sent to the right destinations. The first thing we need to identify are the features, or attributes, of the fruits that we could measure. Let's take color and weight as an example. Both are easy to measure: digital weighing scales and RGB values from a webcam would allow us to do this. 
Going back to our high school maths, if we were to sample some apples and oranges and plot these values on a scatter chart as shown, we can see that the red and green apples fall in the red and green spectrums of the x-axis of the graph and tend to cluster together with similar weights on the y-axis. The oranges, as they're super juicy, tend to be heavier and sit higher up in the chart. Now, if we can draw a line that separates the apples from the oranges, we can, with some degree of certainty, decide what fruit something is simply by plotting its feature values on the chart. If it's above the line, it's most likely an orange, and if it's below, it's most likely an apple. We've essentially learnt how to classify the fruits. So if we can get a piece of software to define the equation of this line by itself, we can get a computer to learn how to classify fruits too. And this is the essence of what's going on behind the scenes in machine learning. Not so magic, right? Essentially, we're just trying to figure out the best possible way to separate the example data, such that for any new unseen example, we have a good chance of classifying it correctly. But what if we had chosen bad features? Let's take ripeness and number of seeds. Here, the plot is less useful to us. There's no straight or even curved line that would allow us to separate these data points. We can't really learn from this data alone. And you might be thinking, well, Jason, why would you obviously choose such bad features and attributes? That's a great question. Sure, with this trivial example, it's clear this would be unwise. But what about those medical scans we spoke about at the beginning of the presentation, which are just RGB image pixels? How do you define features for that? It's not always so obvious. And what if we had more than two features? Previously, we had just two features, so we used a two-dimensional chart to separate the data. 
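To make this concrete, here's a minimal JavaScript sketch of what such a learned decision line could look like once found. The slope and intercept here are invented purely for illustration; a real system would learn them from the training data.

```javascript
// A decision line in our 2D feature space: weight = m * redness + c.
// These coefficients are made up for illustration; a real ML system
// would learn them from the training data.
const m = 0.3;   // slope of the separating line
const c = 120;   // intercept in grams

// Classify a fruit by checking which side of the line its features fall on.
function classifyFruit(redness, weightGrams) {
  const lineWeight = m * redness + c;
  return weightGrams > lineWeight ? "orange" : "apple";
}

console.log(classifyFruit(200, 150)); // red and light → "apple"
console.log(classifyFruit(30, 250));  // less red and heavy → "orange"
```

Finding the values of `m` and `c` that best separate the samples is exactly the "learning" part.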
If we had three features, we'd need a 3D chart to do so, as shown here. Here, we add weight to our previously unsuitable features, and we can now use a plane, a rectangle in 3D space if you will, to separate the oranges from the apples. Hopefully in this image you can see that the oranges are now further back along the weight axis, making them more separable from the apples. But it turns out that three dimensions is not typically enough for most ML problems. It's not unusual to have tens, hundreds, thousands, even millions of features, as in the case of images, where each pixel is a feature. And as humans, we struggle to visualize higher than 3D. I tried and I failed miserably, so you'll have to trust me on that. However, for a computer, the mathematics works out much the same, and it's perfectly capable of doing such calculations. Instead of using a plane, we use something called a hyperplane, which is simply one dimension less than the number of dimensions we have, allowing us to split the data just like we did here, but with more features and attributes, which can sometimes give us better data separation. Okay, so back to TensorFlow.js. Now that we understand what's going on behind the scenes, you'll be pleased to know that TensorFlow.js does a lot of the hard stuff for us. TensorFlow.js is a machine learning library written for JavaScript. Doing machine learning in the browser has several advantages, such as lower latency as no server is involved, user privacy as the data stays on device, and super easy deployment because anyone with a web browser can use it. And that means you can use machine learning anywhere JavaScript can run, which includes the web browser, server side, desktop, mobile, and even IoT devices. And if we dive into each one of these stacks in more detail, you can see many of the technologies we know and love. 
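The hyperplane idea generalizes the 2D line with the same decision rule in any number of dimensions: check which side of the hyperplane a point falls on via a weighted sum of its features. A small sketch, with weights invented for illustration:

```javascript
// A hyperplane classifier in n dimensions: the decision rule is just
// the sign of (w · x + b), exactly like the 2D line and 3D plane.
function dot(w, x) {
  return w.reduce((sum, wi, i) => sum + wi * x[i], 0);
}

// Returns "orange" if the point lies on the positive side, else "apple".
function classify(weights, bias, features) {
  return dot(weights, features) + bias > 0 ? "orange" : "apple";
}

// Three features: ripeness, number of seeds, weight in grams.
// These weights are invented for illustration: only weight matters here.
const w = [0, 0, 1];
const b = -180;  // threshold at 180 g
console.log(classify(w, b, [0.5, 10, 250])); // heavy → "orange"
console.log(classify(w, b, [0.9, 5, 150]));  // light → "apple"
```

With a million pixel features, `w` simply has a million entries; the mathematics is unchanged.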
In fact, JavaScript is one of the only languages that can run across all of these devices without any extra plugins, giving you the ability to deploy and run anywhere with just one code base. And this is a great win for JS devs, as you can make scaled web applications powered by machine learning in all of these environments, and even control hardware from the browser, or standalone if you wish. Now, with TensorFlow.js, you can run models, retrain them via transfer learning, or write your own models completely from a blank canvas. And with this, you can use it for anything you might dream up: things like sound recognition, gesture-based interaction, sentiment analysis, conversational AI, and much, much more. There are a few ways to use TensorFlow.js, based on your familiarity with machine learning, JavaScript, or both. The first way is to use our pre-trained models. These are really easy-to-use JavaScript classes that cover many common use cases. In many situations, we do not need to train a brand new model from scratch and can instead leverage existing work. Let's take a look at some of those now. Here you can see several popular premade models available with TensorFlow.js: things like object detection, or body segmentation, which is the act of classifying each pixel in an image to determine if it belongs to a human body or not. Or how about pose estimation to understand where the joints and skeleton might be? We even have many natural language processing models to understand human language too. In fact, our new question-and-answer model allows you to ask a question about any piece of text, and it can automatically tell you which part of that text actually answers the question. Imagine using that on a really long webpage to automatically scroll to the information that's relevant to what you want to know. You can actually do that right now. And we have many more models you can check out via the link shown on the slide. So let's see some of these in action. 
First up is object detection. This model uses something known as COCO-SSD behind the scenes and is trained on 90 common objects. It can recognize those objects in images and provide us with the location of each object via a bounding box, as you can see in the image of the dogs on the right. Notice how it can detect multiple objects at the same time. This is different from image recognition, which understands that something might be in the image, but not where or how many. And that's why COCO-SSD is super useful. So let's see it in action with a live demo in the browser. Here you can see us running COCO-SSD live on a webpage. If I click on any one of these images, you can see that it highlights the objects it's found within them, even if they're of different classes. You can see here that this dog is very close to a bowl of treats, which might be useful to know if you want to send yourself an alert. Now, we can go even better than this by enabling our webcam. If we do that, you can see me live, speaking to you right now, and it's classifying me in real time at a high frame rate. And what's really cool here is that this is all running locally in my web browser. None of the webcam imagery is being sent to a remote server for classification, so your privacy is preserved as well. Next up, we've got face mesh. This model is just three megabytes in size and has the ability to recognize 468 landmarks on the human face. Not only does this work super robustly, we're starting to see real-world use cases of people using this in production too, as we'll see in just a bit. So let's see face mesh in action. That is a pretty cool model. Here is face mesh running live in my web browser. Notice how my face is being tracked while I'm talking to you right now. 
On the left, we can show the mesh of the face in real time, but because this is JavaScript, not only are we doing the machine learning, we're also rendering a 3D point cloud on the right-hand side in WebGL that's fully interactive too. So let me show you that. You can see me moving the 3D points right now, live in the browser, and it's super smooth too. Now, you'll notice at the top left I'm getting about 22 frames per second, but that's because I'm live streaming right now. If I wasn't live streaming, we'd get closer to 30 or 35 frames per second, at least on my current system. JavaScript has a very rich ecosystem for 3D graphics and charting, far more mature than many other environments, which makes it super fun to prototype new ideas in the browser with machine learning models. Even better, we can choose which backend to execute on, such as the CPU or GPU, if we want to. We can do that in this demo by clicking on the dropdown at the top right here, and that will give you a higher frame rate depending on the type of device you're running on. And here we see a demo by ModiFace, part of the L'Oréal group, for AR makeup try-on. It should be noted that the lady is not wearing any lipstick. Here, our face mesh model is combined with WebGL shaders to augment the chosen color onto the person's lips in real time in the browser. Next up, we've got body segmentation. This model can distinguish 24 body areas across multiple bodies, all in real time. This is hard to demo live as I need more space, but notice from the image how the bodies of each person are correctly segmented, with different colors representing different body parts. Even better, we get pose estimation too, those lines in blue that estimate where the skeleton is, so we can do things like gesture recognition and much, much more. And with a little bit of imagination, we can actually emulate some of the superpowers we were promised in the sci-fi movies. 
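Code-wise, the premade models shown above all follow the same load-then-predict pattern. Here's a minimal sketch using COCO-SSD, assuming the script tags for @tensorflow/tfjs and @tensorflow-models/coco-ssd are already on the page (they expose a global `cocoSsd`); the score threshold is an illustrative choice, not something the library mandates.

```javascript
// Keep only detections the model is reasonably confident about.
// (0.5 is an arbitrary illustrative threshold.)
function filterByScore(predictions, minScore = 0.5) {
  return predictions.filter((p) => p.score >= minScore);
}

// Sketch of the load-then-predict pattern. `imgElement` can be an
// <img>, <video>, or <canvas> element, such as a webcam stream.
async function detectObjects(imgElement) {
  const model = await cocoSsd.load();           // downloads the model weights
  const predictions = await model.detect(imgElement);
  // Each prediction carries a class name, a confidence score, and a
  // bounding box [x, y, width, height] in pixels.
  return filterByScore(predictions);
}
```

Calling `detectObjects` repeatedly on video frames is what gives the real-time webcam demos above; the model is loaded once and reused.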
First up, invisibility. This is a more advanced demo than simply replacing the background with a static image; for that, you wouldn't even need machine learning. Notice here how, when I get on the bed, the bed still deforms in the image on the right as I move around, and how the laptop screen still plays. This prototype uses the BodyPix model we just saw to calculate where the body is not, so that it can incrementally learn the background and keep updating the parts where it's safe to do so. Even better, this was made in just one day and runs entirely in the web browser. No background in machine learning is required to run this code: simply click a link and it just works. And no images are sent to a server for classification, leading to real-time results. Next up, lasers. Another member of the community, from the USA, combined his love of WebGL shaders with TensorFlow.js to enable him to shoot lasers from his eyes and mouth, just like Iron Man. This uses the face mesh model we previously saw and runs in real time in the browser without any issues. Now, whilst this is a fun demo, you can imagine using this for a movie launch to amplify your reach by building a one-click creative experience for fans to drive excitement. Or how about teleportation? By combining TensorFlow.js with other emerging web tech, we can now create a digital teleportation of ourselves anywhere in the world in real time. Here, I segment myself from the bedroom using BodyPix, transmit my segmentation anywhere in the world with WebRTC, and then recreate myself in the real-world environment with WebXR and Three.js. Remember, all of this is running in the web browser. No app install is required, leading to a frictionless experience for the end user. And having tried this myself, it really feels more personal than a regular video call, as you can actually walk up to the person and hear the audio from the correct direction, as if they really are there. 
In fact, maybe next time when I'm presenting to you at a future event, I might be able to do so in your own room, just like this, as if I were standing right in front of you. And of course, there are many other delightful creations we can make beyond superpowers. How about this clothing size estimator? Here, I created a tool that can estimate your clothing size in under 15 seconds in the web browser, to automatically select the correct size of clothing on a website. Now, I don't know about you, but I can never remember my clothing sizes. With this tool, I simply enter my height, stand facing the camera and then once to the side, and it can automatically choose the correct size for me at checkout. And of course, this means fewer returns and less time wasted. This was created in just two days and can potentially be used by anyone with a single click at the point of checkout on any website. And finally, we have one more example from the community. Here, someone's managed to bring an image of a model from a magazine to life using WebXR and WebGL. Note that even with these fancy particle effects and machine learning running in the background, this is running on a two-year-old Android device, and the performance is still great. Now, the second way to use TensorFlow.js is via transfer learning. This basically means retraining existing models to work with your own custom data. If you're familiar with machine learning, you can of course do this programmatically in code, but today I want to show you two easy ways to get started. First up is Teachable Machine. This is super easy to use and runs entirely in the web browser, both for training and for inference, which is the act of using the model to classify something. The best way to explain this is with a demo, so let's try it out. If we head over to teachablemachine.withgoogle.com, we're presented with a screen like this. We can see three types of projects we can create: image, audio, or pose. 
We're going to use image today because we want to recognize objects. So click on that, and you then see a screen like this. On the left-hand side are the classes that you want to recognize. You can add more than two if you wish by clicking "add a class", but today we're just going to recognize my face or a deck of playing cards. So let's go ahead and give them more meaningful names. For class one, I'm going to call this Jason, and for class two, I'm going to call this cards. Now, all we need to do is click on the webcam button, allow access to our cam, and now you can see a live preview. We can use this to sample each object. The first one is my face. I'm just going to move my face around and take some samples by clicking this button below. Notice how I move my head around to get some variety, so it learns what my face looks like from the sides and at different angles. I've got about 36 images here, and I'm going to do the same with the deck of cards, clicking the webcam button below and trying to get the same number of images; otherwise I'll have a bias in my system. So let's bring the cards nice and close and try to get 36 images of that as well. 35, that's good enough. Now I click on "train model", and what happens behind the scenes is that we retrain the top layers of the model to distinguish between my face and the deck of playing cards. And you can see that in the time it took me to say that, it's already finished training. We've got a live preview on the right-hand side. Currently, it predicts Jason with 100%, which is correct: my face is indeed in view. If I bring the deck of cards into view, we can see it now predicts cards with 100%. Jason, cards, Jason, cards. And you can see how responsive that is too. Now, this is great for prototyping, and if this is good enough for your needs, you can click on "export model" at the top right there, click on download, and now you can download the model.json file that you need to run in the web browser. 
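Once that export is hosted, using it could look something like the sketch below. Here I'm assuming the Teachable Machine image library script tag is on the page (exposing a global `tmImage`), and that the exported `model.json` and `metadata.json` sit alongside the page; treat those file names and the exact API surface as assumptions to check against the current Teachable Machine docs.

```javascript
// Find the class with the highest probability from the model's output,
// which is assumed to be an array of { className, probability } objects.
function topPrediction(predictions) {
  return predictions.reduce((best, p) =>
    p.probability > best.probability ? p : best);
}

// Sketch of classifying a frame with a hosted Teachable Machine export.
// `videoElement` would typically be a <video> fed by the webcam.
async function classifyFrame(videoElement) {
  const model = await tmImage.load("model.json", "metadata.json");
  const predictions = await model.predict(videoElement);
  return topPrediction(predictions); // the highest-probability class
}
```

In a real app you would load the model once and call `predict` per frame, just like the live preview in the Teachable Machine UI.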
You can then host this on your own website or CDN and use it any way you wish, with a nice graphical user interface and user experience. Now, Teachable Machine is great for prototypes, but if you want to launch a production model with gigabytes of training data, then Cloud AutoML can be used for this, and it supports exporting to TensorFlow.js too. In this example, we see someone trying to classify flowers. All they've done is upload folders of flower images to Google Cloud Storage, and then we can move on to the next step of the training process. You can now select whether you want to train for higher accuracy or faster prediction times; of course, there's usually a trade-off between the two. Once complete, you'll be able to export to TensorFlow.js as shown, and you can simply download the files and host them on your website or CDN. Now, some of you might be wondering: how hard is it to use that resulting model in JavaScript? Well, actually, it's pretty easy. In fact, it's so easy it fits on a single slide, so let me walk you through it right now. First, we have two HTML script imports. The first one is for TensorFlow.js itself, and the second one is for the Cloud AutoML library. We then have an image tag for a new image that we want to classify. In this case, I just grabbed an image of a daisy from the internet, but it could be anything. It could even be an image from a webcam stream if you wanted. And then finally, we've got the actual JavaScript code, which is just three lines of JavaScript to do the hard work. On the first line, we simply call await tf.automl.loadImageClassification and pass to it the location of the model that we just trained. In this case, it's called model.json, and it's located in the same directory. This is the file we downloaded in the previous step, and it's simply hosted somewhere on our web server. We use the await keyword here because the model load is asynchronous, meaning it takes some time to complete. 
This allows us to wait for the load to finish before continuing sequentially. Once the model is loaded, we can grab a reference to the image we want to classify by using document.getElementById and passing the id of the element we wish to use. In this case, it's "daisy", which refers to the image tag above. Finally, we call await model.classify and pass to it the image we want to classify. Again, depending on the model, this can take several milliseconds to execute, so this uses the await keyword too. You'll then get a JSON object returned to our predictions constant, which we can loop through to print the results or do something more useful. It should also be noted that you can call model.classify as many times as you like with different images once the model itself has been loaded, and that's how we can achieve real-time performance on the webcam. So, we've seen a lot of great demos, but what are the core benefits of doing machine learning in JavaScript? Well, first, let's start by explaining the TensorFlow.js architecture. We've got two APIs. The first one is a high-level API known as the Layers API, which is very similar to Keras if you're coming from Python. Next, we've got a low-level API known as the Ops API. This is more mathematical in nature and allows you to do things like linear algebra, should you wish to work at that level. So let's see how these all come together. Here, you can see how our premade models sit upon the Layers API, which itself sits upon the Core, or Ops, API. Now, this lower-level API can speak to different environments, such as the client side, which includes things like the web browser, and each one of these environments can execute on a number of different backends: for example, the CPU, which is always available; WebGL, for GPU acceleration if supported; or WebAssembly, WASM for short, if supported, for more efficient performance on CPUs. And there's a similar story for the server side too, such as in Node.js. 
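Pulling the AutoML walkthrough from earlier together, the whole snippet might look like the sketch below. It assumes the two script tags (TensorFlow.js and the AutoML library) are already on the page, along with an image tag such as `<img id="daisy" src="daisy.jpg">`; `formatPredictions` is a small helper added here for illustration, and the `{ label, prob }` prediction shape is an assumption worth checking against the AutoML library docs.

```javascript
// Turn raw predictions into printable strings, assuming each prediction
// looks like { label, prob }.
function formatPredictions(predictions) {
  return predictions.map((p) => `${p.label}: ${p.prob.toFixed(3)}`);
}

// Sketch of the three-line AutoML flow described on the slide.
async function run() {
  // Load the model.json exported from Cloud AutoML, hosted next to the page.
  const model = await tf.automl.loadImageClassification("model.json");
  // Grab the image element we want to classify.
  const image = document.getElementById("daisy");
  // Classify it; this can be called again with different images.
  const predictions = await model.classify(image);
  formatPredictions(predictions).forEach((line) => console.log(line));
}
```

Once loaded, the same `model` can classify webcam frames in a loop, which is how the real-time demos are built.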
Note here that our Node.js implementation can talk to the same bindings that Python TensorFlow talks to, so performance is just as good, or sometimes even better than Python, thanks to the just-in-time compiler of JavaScript. The other thing to note is that if you're working with a machine learning research team that wants to deploy its research to the web, there's a good chance it will be coding in the Python flavor of TensorFlow. With Node.js, you can execute the saved models they produce without any conversion required, making integration super easy. However, if you want to run a Python saved model in the web browser, we've got a command line tool that helps you do that, converting the saved model to the JSON format required to use the model on the client side in the web browser. So to wrap up this section, there are five client-side benefits of doing machine learning in the browser that are worth pointing out. The first is privacy. As inference is performed on the client side, no data is ever sent to a third-party server, which means we can maintain data privacy for the user. This is particularly important for the medical and legal industries, where it might be a requirement not to transfer data to a third party, not to mention the growing concerns around privacy these days, and here you get it for free. Next up is lower latency. As JavaScript has direct access to the sensors on the device, such as the microphone, camera, and accelerometer, there's no round-trip time to a server to analyze that data. Latencies to a server could be close to 100 milliseconds on a mobile connection, but with TensorFlow.js running on device, we can go much faster than that. Next, cost. If no data is being sent to a server, then lower bandwidth and hardware costs are required, as no CPU, GPU, or RAM needs to be hired and running 24/7 for inference. You just have to pay for the hosting of website assets and the model files, which is far cheaper. 
Next up, interactivity. Web tech has been great at this since the very start and has evolved to handle even richer formats: WebXR, WebGL, and so on. I encourage you all to see how you can push machine learning models further when combined with the rich ecosystem that JavaScript has to offer. And then finally, reach and scale. Zero installation is required. Anyone can click on a web hyperlink and load a web page, and the machine learning will just work. That's all you need to do to run a machine learning demo in the browser. So finally, we'll wrap up with some resources to get started if you wish to continue your TensorFlow.js journey. If there's only one slide you bookmark and share with folks, let it be this one. This slide has all the resources you need to get started: for example, our website and API, available at tensorflow.org/js, and our models, which are also available to use. Today we just touched on three of them, but there are many, many more to check out too. We're fully open source, so check us out on GitHub; we welcome contributions. And if you've got more technical questions, check out our Google group. We've also got some boilerplate code showing how to use premade models in minutes over on CodePen and Glitch. Now, if you're looking for an all-in-one book, Deep Learning with JavaScript by Manning was written by folk on our team and takes you from zero machine learning knowledge to implementing more advanced techniques. Familiarity with JavaScript is the only requirement; no machine learning background is required. Or check out one of our many codelabs if you prefer a more hands-on approach. Learn how to make your own smart webcam, just like a Nest cam, in minutes, create custom models, or learn how we made Teachable Machine possible. Also, a quick shout-out to our community: check out the #MadeWithTFJS hashtag on Twitter or LinkedIn to see many more amazing examples that people have been creating. 
New content is coming every single week, and it's a great way to get inspired. If you make something using TensorFlow.js, be sure to use the hashtag for a chance to be featured at future events, such as our show-and-tells here on YouTube, and even in our blog posts. And finally, the only question left is: what will you make? This final example comes from a community member in Tokyo, Japan. By day, he's a dancer, but he's managed to use TensorFlow.js to create this amazing hip hop video with some awesome visual effects using BodyPix. The reason I show this is that machine learning really is for everyone now, and we're super excited to see how TensorFlow.js will enable many more people to start their journey with machine learning. Creatives, artists, musicians: no matter what your background, you can still use models in ways never even dreamt up by the original model creator, as you saw from just a few of the demos today, and we're super excited to see what you create. Please do use the #MadeWithTFJS hashtag so we can find your work. And with that, feel free to stay in touch or reach out with any questions. You can find me on Twitter at @jason_mayes, and I'd love to hear from you. Thank you for listening, and see you next time.