Can everyone hear? Hello, everyone. I'm going to be talking about how you can go from zero to ML on Google Cloud Platform. I'm going to be covering a lot of different machine learning technologies, so it's going to be a bit of a whirlwind, but there'll be some demos and we'll have lots of fun.

My name is Sara Robinson. I'm a developer advocate on the Google Cloud Platform team, based in New York. This is my first time in Singapore; super excited to be here. As a developer advocate, my job is to teach external developers, like all of you, how to use Google Cloud Platform's products, and I focus specifically on our machine learning tools. Day to day, that involves a combination of building demos and writing code to teach other people what you can do with these technologies, writing blog posts and online content like videos, and presenting at events like this. An important part of our job is also bringing feedback back to the product teams that are building all of these tools for you. So if any of you have used Google's machine learning products and have things you like or don't like about them, definitely come chat with me afterwards and let me know, and I will pass that feedback along. You can find me on Twitter at @SRobTweets.

I want to start by talking about, at a high level, what machine learning is. Machine learning is essentially teaching computers to recognize patterns in the same way that our brains do. Over time, as we give machine learning models more and more data, they improve as they're given more examples and experience.

Does anybody remember how they learned their first language? Spoken language, not programming language. Your parents probably didn't give you a dictionary and a bunch of grammar books to memorize; that'd be kind of weird. Instead, you learned over time by being exposed to many different examples. Say the first time you had pasta for dinner, you saw it on your plate and heard your parents identify it. Maybe you identified it incorrectly a couple of times, but over time this repetition strengthened certain pathways in your brain, so the next time you had it for dinner, you were able to identify it correctly. And this is roughly how machine learning works, too.

So at a high level, machine learning is loosely based on how the human brain learns: instead of biological neurons, we have mathematical neurons that mimic the way the neurons in our brain work. Machine learning lets us solve problems without knowing exactly what the solution might be, and it enables systems that improve over time as they're given more and more data.

That's what machine learning is at a high level. Now, here's how we can think about framing almost any machine learning problem. I'm going to focus on supervised learning, which means you provide a labeled data set to train your model. We can think of nearly every supervised learning problem this way: we give labeled inputs to our model, and the model outputs a prediction based on those inputs. The inputs could be many different types of things, and how much you know about how your model works under the hood depends on the tool you choose to use. I'm going to talk about a bunch of different tools for this. So what do we mean by input? The input could be this image of a cat, so the pixels in the image, and our prediction could be just the label "cat."
I'm sure many of you have heard of machine learning models that can tell you whether an image has a cat in it or not. The prediction could also be more specific: it could tell you the bounding box of where that cat is in the image. Another type of input could be text data. Say our input is the headline of this article, and the prediction is the label "sports," or more specifically, the label "baseball." Or what if we want to build a custom model that's able to predict which news publication an article came from? Another type of input could be video, and our prediction could be what's happening in every scene of the video.

Many people see the term "machine learning" and are a little bit scared off; they think it's something only for experts. If we look back just about 60 years, this was definitely the case. This is a picture of the first neural network, invented in 1957 by Frank Rosenblatt. It was called the Perceptron, and it was a device that demonstrated an ability to identify different shapes. Back then, if you wanted to work on machine learning problems, you needed access to extensive academic and computing resources. If we fast-forward to today, we can see that just in the last six years, the use of machine learning at Google has grown dramatically, from zero to over 1,500 projects that have model description files. These are just a couple of the Google products you may recognize that use machine learning in one way or another. At Google, we don't think machine learning should be something that's only for experts; we want to put it into the hands of any developer or data scientist with a computer and a machine learning problem they want to solve.

When I get started solving a machine learning problem, the first thing I like to think about is the type of problem I'm solving. Am I doing something generic that someone else has solved before, or am I doing something more custom that's very specific to the data set I'm going to be generating predictions on?

More specifically, let's talk about this in terms of image classification. If I just want to know that this image has a cat in it, lots and lots of people have built models to do that already, so I probably don't need to start from scratch; I can utilize an existing model. But let's say this cat's name is Bob, and I want to differentiate Bob from the other cats in my data set. In that case, I'm going to have to train a custom model, showing it labeled images of different cats so it can differentiate between them. I'd also need to do that to get my model to return a bounding box of where the cat is in the image.

If we're thinking about this in terms of natural language processing, we can take the text of one of my tweets as an example. If I just want to extract parts of speech from the tweet, that's a pretty common natural language processing task, and I can utilize an existing model to do it; there are already lots of models out there that will do part-of-speech extraction. But let's say instead that I want my model to recognize that this is a tweet about programming, and more specifically, a tweet about Google Cloud. I'm going to need to train a model from scratch to do that, giving it thousands of tweets about programming and about Google Cloud so it knows what to look for.
So in this talk, I'm going to show you a bunch of different products on Google Cloud Platform that will help you go from zero to machine learning. The products starting on the left-hand side are targeted more at application developers. Here we offer a set of pre-trained machine learning APIs: these give you access to a pre-trained machine learning model with a single REST API request. You don't need to know anything about how the model works under the hood, and you don't need to provide it with any training data; you can just get up and running right away.

In the middle, we introduced a new product a couple of months ago, in January, called AutoML, currently available for images. What this lets you do is customize these pre-trained APIs to your own data set: you give it a labeled image data set, and you get access to your own custom-trained API to make predictions against.

As we move toward the right, we get into products targeted more at data scientists and machine learning practitioners with a bit more machine learning experience. For this, we have TensorFlow, an open-source library for building and training your own machine learning models. And if you want to run your TensorFlow training and serving on Google Cloud, we have a managed product called Cloud Machine Learning Engine.

Finally, when I'm trying to decide which type of product to use to solve a machine learning problem, I like to think about the resources involved. These are just a couple; there may be more that I missed here, but I want to show you how each of the products I just mentioned fits on this scale.

If we look at the machine learning APIs, you don't need to provide them with any training data; you can just pass one image and get a prediction back. You don't need to write any of the model code. The APIs are hosted for you on Google Cloud Platform, so you don't need to provision any training or serving infrastructure. You just write a couple of lines of code to generate your prediction, and you can get something up and running within probably a day.

If you look at AutoML, it will require you to provide some of your own training data, because you're going to get a custom API endpoint back. The prediction code looks pretty similar to the APIs, and it will require a little more time, since you'll need to gather your image data set, maybe do some pre-processing, and maybe label that data set.

I'm going to talk about two different types of custom models. The first is transfer learning, which I'll go into in more detail later in my presentation. This involves utilizing a model that's already been trained to do a similar classification task and then building on top of that model. This will require a bit more training data, and you will have to write some of the model code yourself. Training and serving are up to you: you can run training on-premises, or you can use a managed service like Cloud Machine Learning Engine. The prediction code looks pretty similar, and it will require a bit more time than AutoML because you'll be writing some of the model code yourself and training that model. And then finally, we have building a custom model from scratch, trained entirely on your own data, and this is going to take a bit more of each of these resources.
So let's dive in, starting with machine learning as an API. As I mentioned, on Google Cloud we provide five different APIs that let you accomplish common machine learning tasks with a single REST API request. We have Vision for analyzing images. Video Intelligence will tell you what's happening in every scene of your video, and it'll also tell you at a high level what your video is about. We have Cloud Speech: if you want to implement functionality similar to "OK Google" in your own applications, this will take an audio file and transcribe it to text. The Natural Language API lets you analyze that text in a bit more detail, and finally, the Translation API lets you translate text between over a hundred different languages. So we've got a lot of different APIs here. I'm going to cover Cloud Vision and Cloud Natural Language, just to give you an idea of what these APIs look like.

The Vision API lets you do a couple of different things. The first is label detection, which will tell you essentially what the picture is of; for this image it might return "elephant," "animal," et cetera. Web detection is pretty similar, but it will search the web for additional details on what's in your image. Then we have OCR, or optical character recognition, which will extract text from your images. If anyone has ever used the Google Translate app to take a picture of a sign and translate it into a different language, you could implement that functionality on your own using the Vision API's OCR method. Logo detection will identify common company logos in an image. Landmark detection will tell you if there's a landmark in the image. Crop hints will help you crop your photos to focus on specific subjects. And finally, we have explicit content detection, which will tell you whether an image is appropriate or not. This one's pretty useful for pretty much any site that has user-generated content: instead of having somebody manually review every image that's submitted, you can just make an API call, and then maybe you only need to review a subset of your images.

One example of a company using the Vision API in production is Giphy. Giphy is a website that lets you search for GIFs across the internet. Before they used the Vision API, their search functionality was only searching GIFs by manually assigned tags. In just a couple of days, one of their interns used the Vision API to add search by text. As you know, many GIFs have text in them, and they're now searching the text in those images using the Vision API's OCR, which has significantly improved the accuracy of their search results. They wrote a great blog post about it at engineering.giphy.com if you want to learn more about how they did it.

So let's see a demo of the Vision API; I don't like to get too far into a talk without a demo. What we have here is the Vision API product page, and you can try out all of our machine learning APIs directly in the browser. I'm just going to show you the Vision API, and I'll provide the links to all of these at the end. What we can do here is upload our own image and see how the Vision API responds. I'm going to upload this picture of me seeing Hamilton a couple of months ago. Hamilton's a Broadway musical in New York; I'm sure a lot of you have heard about it over here as well.
Definitely recommend listening to the soundtrack if you haven't already. So this is me at the Hamilton theater in New York, and we will see what the Vision API says. There we go. One thing the Vision API can do is identify faces in an image, and here it's identified my face. It can tell that joy is very likely: I was super excited to be seeing Hamilton and had waited a long time to see it. It can also tell me where all the different features are in my face.

It can do label detection too, but what I want to highlight here is web detection. The web detection is actually able to identify the theater I was at: the Richard Rodgers Theatre is where I saw Hamilton. What it's doing is looking for similar images across the web; there are tons of pictures of people at the Richard Rodgers Theatre, and it's able to extract from the context of this image that that's where I am.

Another cool thing it can do is text detection. When I sent this image to the Vision API, I didn't really think about the fact that there was text in it. But if you look, I'm holding up a Playbill, and the OCR endpoint is able to extract the word "Playbill" from the program I'm holding there. Document text detection is for images with lots of text, so if you have an image of a menu or a resume, something like that, it'll break down the text by paragraph, symbol, and word. Properties can tell us the dominant colors in our image. SafeSearch will tell us whether the image is appropriate or not across five categories, rated from "very unlikely" to "very likely." And finally, we can inspect the full JSON output of our response and look at everything we get back. It gives us a ton of different facial features, things like the right ear tragion (I don't even know what that is), so you get a lot of detail on face detection.

So that is the Vision API. I definitely encourage you to try it out with your own images at cloud.google.com/vision; I'll share the link at the end. If you want to see what the API response looks like before you start writing any code, you can do that right in the browser.

It's a REST API, so you can call it from any language you'd like. I like to use Node for a lot of my demos, so this is a Node example here, but you don't have to use Node.js. We have a number of client libraries for interacting with all the different Google Cloud Platform products. Here I'm using the Node module for Google Cloud, and I'm just creating a vision object. All I need to do to run detection is call .detect and pass it the types of detection I want to run, in this case face and label, and then I get a bunch of cool detection data back.

So that is the Vision API. Next I want to talk about natural language. The Natural Language API lets you do four things. First, it lets you extract entities from your text. It will also tell you whether your text is positive or negative. It can analyze syntax, getting into the linguistic details of your text. And finally, it lets you classify content: this is the newest feature we provide, and it will give you categories back. There are, I believe, over 700 different content categories available, and this feature is currently only available in English.

I want to show you what the syntax analysis endpoint gives you back. We'll use this sentence as an example: "The Natural Language API helps us understand text."
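To make this concrete, here's roughly what that request looks like. My demos use Node, but the API is callable from any language; this is a minimal Python sketch, assuming the google-cloud-language client library (exact import paths vary a bit by client version):

```python
# Minimal sketch, assuming the google-cloud-language Python client
# (pip install google-cloud-language).
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content="The natural language API helps us understand text.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.analyze_syntax(document=document)

# Each token includes its part of speech, lemma, and dependency edge.
for token in response.tokens:
    print(token.text.content, token.part_of_speech.tag, token.lemma)
```

Each token in the response carries the part-of-speech, lemma, and dependency details I'll walk through next.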
So the first thing we get back is what's called a dependency parse tree, which tells us which words in the sentence depend on other words. Then we get back the parse label, which gives us the role of each word in the sentence: here we can tell that "helps" is the root verb of the sentence and "API" is the nominal subject. We get part-of-speech data, which tells us whether each word is a noun, a verb, a pronoun, et cetera. The lemma is the canonical form of the word: in this sentence, for "helps," the canonical form is "help." And then we get a lot of additional morphology details on the text, which will differ based on the language you send to the API; it supports a couple of different languages.

The next one I want to show you is content classification, which is the newest feature of this API. For this example, I took the headline of this article and the first sentence, and I sent it to the content classification endpoint. It's able to tell me with 99% confidence that this is an article about team sports, and more specifically, an article about baseball. That's pretty cool considering the word "baseball" is never mentioned in the text. As I said before, you get access to over 700 different content categories; you can take a look at all of them in the documentation. This is only available in English right now, but if you'd like to use it with another language, you could use the Translation API to translate your text into English first and then classify it that way.

Here's another example in Node.js of how you would call the Natural Language API. In this case we're referencing a text file in Google Cloud Storage: you can either send your data to the API as raw text or as a URL to a file in Google Cloud Storage. Here I just call .annotate, and I get some data back on my text.

So just to recap: if you want to accomplish a common machine learning task like image recognition or natural language processing, these APIs are a great place to start. You don't need to start from scratch; you can utilize these pre-trained models that we've already trained for you using lots and lots of data. It's a great way to get started with machine learning if you're new to it.

But one question I get a lot when I present on these APIs is: the APIs seem great, but they don't get quite specific enough for the use case I want to solve. What if I want to train them on my own custom data? For that, we have a new product called AutoML, currently available just for vision. There are two asterisks here because you currently need to be whitelisted to use it; I'll talk a little more about that at the end. So this lets you use your own data to customize a pre-trained API, and I'm just going to jump right into a demo to show you how it works.

For this demo, let's say I'm a meteorologist and I work at a company like the Weather Channel, and I want to predict weather trends and flight plans from images of clouds. This begs the question: can we use the cloud to analyze clouds? The answer is yes. As I was working on this demo, I learned that there are ten different types of clouds; I didn't know anything about actual clouds before I started working on this. I learned that all of these types of clouds indicate different things about weather patterns. My first thought was: let's try this with the Vision API. Let's send all these different cloud types to the Vision API and see what we get back.
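For reference, here's roughly what sending one of those cloud photos for label detection looks like; a minimal Python sketch, assuming the google-cloud-vision client library, with a hypothetical image filename:

```python
# Minimal sketch, assuming the google-cloud-vision Python client
# (pip install google-cloud-vision); the image filename is hypothetical.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Read a local cloud photo and request labels from the pre-trained model.
with open("cumulus.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)
```

(In older client versions, the image type lives at vision.types.Image, so adjust for whichever version you have installed.)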
The problem was that the Vision API returned the same thing for all of these images, even though they're obviously different types of clouds: it returned "sky," "cloud," and "daytime" for all of them. This is pretty much expected. The Vision API was trained across a broad set of image categories, so we wouldn't expect it to know, for example, that this is a cirrus cloud formation.

So this is where AutoML Vision came to the rescue. What AutoML Vision provides is essentially a UI for importing your data, labeling it, and training a model, and then you immediately get access to a custom REST API endpoint that you can use to make predictions against your custom model. You upload your photo data set, train your model, and you're ready to go.

So I'm going to dive into an AutoML demo. This might be tricky with the microphone; I'm going to try to do it with one hand, let's see how it goes. This is the AutoML Vision UI, and it takes us through every step of training our model, from importing the data and labeling it to generating predictions. The first step is importing our data. The way we do this is we put all of our images in Google Cloud Storage and then create a CSV where each row has the Cloud Storage URL of an image and the label associated with that image, something like gs://my-bucket/cloud1.jpg,cumulus (hypothetical bucket and file names). You can have multiple labels per image if you'd like. I've already imported the data for this demo.

One thing to note: let's say you don't have time to label your images. You've got a giant image data set and you just don't have the time or resources to label it. You can utilize a human labeling service, which gives you access to in-house human labelers who will label your images for you. You just need to provide a couple of base examples for each label and some instructions, and you'll get a labeled data set back. In this case, I had my images labeled already.

The next step is to review our image labels, and here I can take a look at all the different labels for these clouds. This is a cumulus cloud, for example, but if it wasn't labeled correctly, I could jump in here and easily switch the label. In this case, as I mentioned before, I'm not an expert on actual clouds, but it turns out we had a meteorologist at Google, which is pretty cool, and he helped me label all these images. Then we can look at how many images we have per label. AutoML recommends at least 100 per label for a high-quality model, but you only need 10 to start training. For this one, I don't have quite enough data for all of the labels; some of these are rarer cloud formations, and I just couldn't find enough images of them. For the demo, that's okay; if this were a real production app, I'd probably want equal numbers of images per label.

The next step is to train my model, and to do that I literally just click this Train button. I don't need to write any of the model code, and I don't need to worry about what type of model is being used under the hood; AutoML takes care of that for me, and I'll get an email when training completes. Then I want to evaluate how my model performed using some common machine learning metrics. I'm a bit short on time, so I don't want to dive too deep into this, but I will show you the confusion matrix. Again, as I mentioned, I didn't have quite enough data for... (We've got till ten past? Okay, cool.)
All right, so I'll quickly show the confusion matrix. If it looks confusing, well, it's called a confusion matrix. What we want to see here is a strong diagonal from the top left. As you saw, I didn't have quite enough data to test my model on all the different categories, which is why we see some blanks here. But this tells us that for all the pictures that were actually cirrus clouds, my model was able to label 89% of them correctly in the test set. For cirrostratus, on the other hand, 75% were mislabeled as cirrus clouds. So this can show me where I might need to go back and improve my data set.

This brings us to the most important part, which is generating predictions with our trained model, and there are a couple of different ways to do this. The first is to query online: this is the easiest way to test your model right after you've trained it, because I can just try out an image in the browser. This is an image that wasn't used in my training data set, and we'll see how the model performs. It's able to identify with 98% confidence that this is a cirrus cloud, which is pretty cool.

The query-online tool is easy to use, but chances are you want to actually build an app that makes predictions against this trained model. There are a couple of ways to do that, and I want to highlight the Vision API. You'll notice that if you've used the Vision API before, the request looks pretty similar; the only differences are adding this custom label detection parameter and, once you've trained your model, the ID of the model you just trained. You just pass that ID into your Vision API request, and you get prediction data back.

And just to show you how easy it is to build an app that queries a trained model, I built a super simple web app using Firebase Hosting. What the app does is let me upload a photo, and it tells me a little more about that type of cloud. So let's see what it says about this cloud. This is a cumulonimbus cloud; it knows that with 97% confidence. And this cloud, if you see it while you're flying, is probably not a good sign; it might mean turbulence is coming. I actually saw one on my way here, and it was a little bit scary. So this just gives us a little more information about the type of cloud, and all I needed to do was send the request to the Vision API using the custom endpoint of my trained model.

So that is AutoML Vision. I'm going to go back to the slides and talk about some companies that have been using AutoML Vision as part of the alpha. The first example is Disney. Disney built a custom model to recognize different Disney characters, product categories, and colors, and they use it to improve their search engine and provide more relevant results to users. The second example is Urban Outfitters, a clothing company based in the US. They did something similar to Disney: they built a model to create a comprehensive set of product attributes, training it to recognize things like patterns in shirts and different types of necklines that the regular Vision API wouldn't be able to differentiate, and they use those product attributes to improve their search results. The final example is the Zoological Society of London. They have a bunch of cameras deployed in the wild.
All of these cameras are taking pictures of the wildlife walking around in those areas, and instead of having a human manually review the images, they now have a custom model that automatically tags the types of wildlife they're seeing.

So that's AutoML. But let's say you want to go even further and get a little more custom: you have a prediction task that's very specific to your data set or use case. Some examples. Let's say we have Stack Overflow questions, and based on the title of a question, we want to classify its tag automatically. Or let's say we've got some demographic data, and we want to predict which way a county or a region will vote. Our inputs would look like this, some demographic data, and the output would be a prediction of the percentage of the county that voted for each candidate. Or let's say we want to identify the location of objects in an image. What if I wanted to train a custom model to recognize myself? Maybe I want to build a home security device that only opens the door when I'm walking in; I would need to train it on images of just me.

For these types of examples, you'll want to build and train a custom model from scratch using your own data, and to do this, we provide two different tools. We provide TensorFlow, an open-source framework to help you build your models, and you can train and serve your TensorFlow models anywhere. Since I work on Google Cloud, I'll talk about ML Engine, a managed service we offer for running TensorFlow training and serving.

A little bit about TensorFlow. TensorFlow was created by the Google Brain team. From the beginning, they wanted everyone in the industry to be able to benefit from the machine learning projects they were working on, so they made TensorFlow an open-source project on GitHub, and the uptake has been phenomenal. Last time I checked, TensorFlow had over 90,000 GitHub stars, making it the most popular machine learning project on GitHub. And you can deploy it anywhere: you can run your TensorFlow training jobs on your own servers and in your own data centers, you can run it on VMs in different cloud providers, and you can even compile your TensorFlow models down to ARM code if you want to run them on mobile devices.

A little bit about what the TensorFlow API looks like. At the lowest level, we have the op kernels that run on CPUs, GPUs, and mobile devices. To execute those, we have the TensorFlow distributed execution engine. And what most developers are going to interact with are the API frontends; I'm going to focus on the Python frontend. When you're building your TensorFlow model, you can choose how deep you want to go, in terms of how much you want to configure the types of layers your model uses under the hood. You can use the tf.layers API, which is a utility for building the layers of your model. If you want to go a bit higher level, we have two high-level TensorFlow APIs, the Estimators API and Keras, which make it really easy to build your model and take care of a bunch of things for you: training, evaluation, and prediction are all handled, so you don't have to configure them specifically for your model. And then finally, we have what are called pre-built estimators. These are models in a box: you instantiate them.
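For example, here's a minimal sketch of what instantiating one looks like, assuming the TF 1.x estimator API and a placeholder feature column:

```python
import tensorflow as tf  # TF 1.x-style estimator API

# A pre-built estimator is a "model in a box": describe the input schema
# with feature columns, then instantiate the model in one line.
feature_columns = [tf.feature_column.numeric_column("my_feature")]
model = tf.estimator.LinearClassifier(feature_columns=feature_columns)
```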
All you need to do is know the type of model you want to use, whether that's a linear classifier or a linear regressor. You instantiate it and you're ready to go: you can run training and evaluation, and all you need to do is configure your data and figure out how you're going to feed it into your model. I'm going to focus on pre-built estimators, and I'll do a demo of that toward the end of my talk.

Once you've built your TensorFlow model, you need to think about how you're going to train and serve it. One reason machine learning has only caught on in the last couple of years is that training a model is very computationally expensive, and only recently has hardware been able to handle it. There are many ways to do this; I'm going to talk about Cloud Machine Learning Engine, our managed service for TensorFlow on Google Cloud Platform. It lets you do distributed training with GPUs, and now with TPUs. Once you've trained your model, you can choose to deploy it on ML Engine for serving, and then you get access to a scalable online or batch prediction API for making prediction requests against your trained model.

There are three steps to using it. First, you write your TensorFlow code locally. Then you put your model code and your training and test data into Google Cloud Storage. And then you can use gcloud, our command-line interface for interacting with a number of different Google Cloud Platform products, to kick off your training and evaluation job, and you can monitor that job in the Cloud Console.

I wanted to show you an example of an app that I built, a kind of end-to-end example of building a model, training it, and finally making predictions against it from a mobile device. I know a bit of Swift, so I wanted to build a Swift client for making predictions. And I thought, if I'm building a Swift client, why not detect Taylor Swift in an image? Because that's kind of funny. So that's what I did; here's a GIF of how the app works. I used TensorFlow Object Detection, a library built on top of TensorFlow specifically for building object detection models. Object detection just means detecting the bounding box of where something is in an image. I used ML Engine to run training, serve my model, and generate predictions.

This is what the TensorFlow Object Detection API lets you do: as you can see here, we're identifying the location of specific objects in an image. One example: let's say you wanted to build a model to identify different pet breeds when there are multiple pets in an image. We actually have a great post on the Google Cloud blog about how to build this specific type of model. What this utilizes is a technique called transfer learning. Transfer learning lets you utilize a model that's already been trained to perform a similar classification task. In this case, there are many different models out there that have been built for object detection, and it turns out that, to a model, detecting where a pet is in an image isn't that different from detecting, for example, where Taylor Swift is in an image.
So what I can do is take these models, take the weights of models that have already been trained on tons and tons of data, like millions of images, and modify just the last one or two layers to train them on my own data set. The result is an updated output layer trained specifically for my prediction task. The benefit of this is that if you don't have enough training data, and models exist that do something similar to what you're doing, it's a great solution. In this example, if I had wanted to build a model from scratch to identify Taylor Swift in an image, I probably would have had to manually label hundreds of thousands of images of her, which may not have been the best use of my time. With transfer learning, I only had to label 200 images of her. I labeled the images manually using an open-source Python tool for drawing the bounding box on each image; it generated an XML file for me, which I could then convert into the TFRecord binary format.

So this is how it all fits together; I used a bunch of different products to build this. TensorFlow Object Detection for building the model. Then I used Machine Learning Engine to train my model, which took about 30 minutes in the cloud. Once my model was trained, I deployed it to ML Engine for serving, and as a result, my Swift client is a pretty thin client.

I'm using a couple of different Firebase APIs on the iOS client. For those of you who don't know what Firebase is, it's a tool for building a backend for your mobile or web app, and it's part of Google. There are probably 15 or 20 products under Firebase; I used a couple of them. When an image is uploaded to my app, it's stored in Firebase Storage. Then I'm using a tool called Cloud Functions, which lets you write Node.js functions that respond to different events in your cloud environment. There's a function that's triggered any time an image is uploaded to the storage bucket, and inside that function we're doing a bunch of things: we download the image, base64-encode it, send it to the ML Engine API for prediction, and then handle all the prediction data we get back. The prediction data looks like a confidence value and a bounding box. I take that confidence value, and I've chosen a threshold of 70% in this example: if the model is at least 70% sure that Taylor Swift is in the image, we go ahead and draw a bounding box around her and then put that new image into Cloud Storage.

I'm using a database called Firestore, which is available as part of Firebase and Google Cloud. It's a database where you organize your data into collections, and what I can do on my iOS client is create a listener that updates in real time whenever new data is written to a specific path in my Firestore database. So I can listen on the path of the specific image I've uploaded, and whenever I get my prediction back, I can update my client.

I think I have time, so I'm actually going to run a demo of this real quick. I'm going to jump to Xcode; sorry, it's hard to do this with one hand. Here's our app, and I'm going to make a prediction request, and let me actually open up... there's a bit of a cold-start time here, which is what you may be seeing. What I'm going to open up is our Firestore UI, the UI for Cloud Firestore. Here we can see the path of all of our images.
Once a new image is written here, we get the confidence value and the path of the new, updated image. So, all right, we're still waiting here; live demos, you never know how they're gonna go. But let me jump back... hopefully it will work. As I mentioned, we're probably experiencing a bit of cold-start time here. There we go: it was written to our database, and we get a prediction back.

But I may be cheating here, because you don't know whether this model is just detecting faces in an image. So let's try a trickier image of Taylor Swift with a friend who looks a little more similar to her. There we go, it's moving a little faster now, and it's able to find her in the image. Let's try one more, of her with a bunch of different people, so you know this model has actually been trained for this specific classification task. There we go: it's able to identify her in the image. So that is my little Taylor Swift demo. And yes, it's a Swift Swift demo; that was very good. I have more links about it at the end: there's a video, and there's code on GitHub.

Okay, I don't have too much time to cover this TensorFlow estimator part, but I will breeze through it pretty quickly. The next question is: what if you have custom data, a custom task, and enough training data to build your model from scratch? In this example, I want to show you how I built a classification model to predict voting trends using TensorFlow estimators.

A quick bit of background: there are about 3,000 different counties in the US, and I wanted to see if we could use demographic data to predict which way a county will vote. I used Kaggle to get the data set. It's a really awesome tool, and I definitely recommend checking it out: they've got a ton of different data sets, and if you're new to machine learning and want to find an interesting data set to train a model on, you should definitely check out Kaggle. Once I got my data, I put it into BigQuery, which is our big data warehouse on Google Cloud Platform; if you were at any of yesterday's talks, my teammates talked a little bit about BigQuery. There was a bunch of demographic data in the original data set, and I wanted to boil it down to just seven features. Features are the inputs to my model. So I used BigQuery to extract the seven features I wanted, and I downloaded the result as a CSV so I could feed it into my model.

My inputs look like this: numerical values. There are a couple of different types of models I could use to solve this problem. I could use a regression model, and if I did, my output would be a numerical value, maybe the percentage that voted for Clinton or Trump, and I'd just use the difference to calculate the percentage that voted for the other candidate. I chose to solve it as a classification problem instead. In this case, rather than outputting a numerical value, my model outputs the probability that the input belongs to one of two classes. You could use multiple classes, too; in this case I just had two classes, zero and one, corresponding to Trump and Clinton. So you get a numerical value that maps to your classes. Since my input data was all numerical, I just needed to write a quick Python script to convert the vote percentages to binary classes. And then I defined my feature columns.
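Here's a compact sketch of how those pieces fit together. It assumes the TF 1.x estimator APIs, and the feature names and CSV filename are hypothetical stand-ins, not the actual columns from my data set:

```python
import tensorflow as tf  # TF 1.x-style APIs

# Hypothetical names for the seven demographic features.
FEATURES = ["age", "income", "education", "unemployment",
            "population", "density", "diversity"]

# One numeric feature column per input, defined in one line.
feature_columns = [tf.feature_column.numeric_column(f) for f in FEATURES]

def input_fn():
    # Decode each CSV line (assumed headerless): seven floats plus a 0/1 label.
    defaults = [[0.0]] * (len(FEATURES) + 1)

    def parse(line):
        fields = tf.decode_csv(line, record_defaults=defaults)
        label = tf.cast(fields[-1], tf.int32)
        return dict(zip(FEATURES, fields[:-1])), label

    dataset = tf.data.TextLineDataset("counties.csv").map(parse)
    return dataset.shuffle(1000).batch(32)

# Swap LinearClassifier for LinearRegressor to treat this as regression.
model = tf.estimator.LinearClassifier(
    feature_columns=feature_columns, n_classes=2)
model.train(input_fn=input_fn, steps=1000)
print(model.evaluate(input_fn=input_fn))
```

(In practice you'd point evaluate at a held-out CSV rather than reusing the training input function.)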
What TensorFlow needs when you're building your model is what's called a feature column, which tells TensorFlow the format of your input data. In this case, these are all numeric columns, so I can define my feature columns with one line of code. If I had categorical data instead, there are a bunch of different types of feature columns, so I can use whatever type of data I'd like. But notice that as we define our feature columns, we haven't actually mapped our data to those columns yet, and that's where the input function comes into play.

There's a lot of code here, so I'll explain what's going on. First, we decode the CSV line by line using the tf.decode_csv method; this just tells TensorFlow what format to expect for each of the values in the CSV. In this case, they're all floats. (Time? Okay, I'll just wrap this up.) Then we convert this into a data set of tensors.

We can create our model with one line of code, which is a really great thing about TensorFlow estimators; I define it using a linear classifier here, and if I wanted a linear regressor instead, I'd just switch out that one keyword. Then I run training and evaluation, and I get some data back on the accuracy: in this case, the model was able to classify 96% of the test data correctly, and I can generate predictions with it. My predictions look like a softmax array of probabilities, so this is telling me there's a 99% chance that this particular input belongs to class zero. We're not going to see it in action because I ran out of time, but I was going to show you how to run this in a Jupyter notebook.

Finally, if you remember just three things from this presentation: you can use a pre-trained API to accomplish common machine learning tasks like image analysis, natural language processing, or translation. If you want to build a custom image recognition model, check out AutoML Vision, and come chat with me afterwards if you've got a use case for it; we'd love to hear more about it and help you get access. And for custom tasks, you can build a TensorFlow model with your own data and, optionally, train and serve it on Machine Learning Engine. I know I covered a lot of material, so if you want to quickly take a picture of this slide, it has some info and resources. I think I'm out of time, so thank you all for coming. Thank you, that was many things!