But the next speaker is Harshit. He will tell us more about Firebase Machine Learning Kit. Harshit, go ahead.

Good afternoon. First of all, a quick question, a quick show of hands. How many of you are mobile developers? Any mobile developers in the house, working on Android or iOS? All right. And how many machine learning experts are here? Experts as in, you've been working on ML projects for quite a while and you know your stuff. OK, better question: how many people in the room are not very experienced with machine learning, as in, you're fascinated by the idea of it but somehow reluctant to try it because of the math involved, the statistics involved, and all those things? Good. I'm in the right room. Makes sense.

So today I'll be talking about Firebase ML Kit. To modify the title a bit: it's not exactly machine learning made easy, it's mobile machine learning made easy. Because as time progresses, the devices we use keep getting smaller and smaller. The things our mobile phones can do today, even a computer could not do a few years back. Tech is getting more and more mobile. We have smartwatches; in the coming years we might have smart chip implants as well. So right now the fight is to implement machine learning in as small a form factor as you can. That's the crux of my talk today: how you can implement machine learning on mobile, even when you are not an expert machine learning developer.

About me: I'm an Android developer. I work at a startup in New Delhi, Coding Blocks, as an Android developer and an Android instructor. I'm also an open source enthusiast, so in my free time I love to work on open source technologies and tools.

So, machine learning as a mobile developer. You can replace that with "as a web developer" or "as a non-ML developer". This is what you see if you go to an online course by Andrew Ng; it's an actual snapshot from one of the deep learning courses he recently launched. I tried doing that course and I stopped at this slide. This was my expression. It's some sort of equation. I learned these things back in high school and college, but once you start working as a software developer, you rarely encounter them. Summation, integration, theta, alpha, beta, gamma: we don't really use them in our daily lives. But as a machine learning engineer, all the courses that you find online start with these things. I'm not saying this is not important. It is important. But it's overwhelming when you start. So yeah, this is me when I start.

So what to do? Machine learning is going to be the future, and you need to know it. There was a talk this morning about how machines are going to replace your jobs as coders. So if you are concerned about your well-being and you want to earn money when you get old (I'm not that old, sir, I still have my life), you have to keep modifying yourself, keep learning new things. Machine learning is one answer. But again, to do machine learning, you apparently have to do all these things first, which is sort of a feedback loop that you cannot get out of.
So yeah, what to do? This is something that can help us get out of that loop. The crux of the talk is implementation. I'm a practical person; I'm more interested in practical implementation than in theory, and a lot of courses online focus on theory. Apart from a few like fast.ai, most courses you see online are theory-first. Not that that's wrong, but I'm not cut out for it, because I love practical things more than theoretical things. So if you are like me, if you like to make things and learn by making them, rather than learning everything first and only building something at the very end, this is something that can help you.

So, who knows about Firebase? Anyone? Firebase is a product built by Google, built around mobile developers. It used to be a backend-as-a-service, but they have grown the product stack and now there's a lot of functionality available: you can implement auth in your app, you can send notifications, and you can do some machine learning on your apps. ML Kit was announced at Google I/O 2018, and it allows you to implement machine learning features in your app, both offline and online. So we can use it to do some practical, hands-on learning and get the joy of machine learning.

There's a very common philosophy I like to follow: we do something only when there are two motivating factors. First is money. Second is that we feel like doing it, we get joy out of it. Say you're working a 9-to-5 job: either you're motivated by the work itself and find joy in it, or you're getting money. There's no third factor. If I'm learning machine learning, I'm not getting paid for it; I'm paying money to learn it from an online or offline source. So the only factor that can motivate me is enjoying the process. And if I have to solve differential and integral equations all day long, that's definitely not a joy for me. Building stuff is how I get joy; that's what intrigues me. And that's what Firebase ML Kit allows.

So, Firebase ML Kit. It was announced at Google I/O 2018. It has some pre-built APIs that let you perform basic machine learning inference, and it also allows you to host a custom TensorFlow model on Firebase. It supports both Android and iOS, and I think web support is still in the works, so in the future ML Kit will probably also support web apps via TensorFlow.js.

What do you need to start with Firebase ML Kit? If you are just learning, you don't need much. You don't even need an app or an idea that needs machine learning features; you can simply start. Knowing the basics of Python and TensorFlow helps, but that's optional later on, not right now. To start, you just need to know the basics of app development, because right now it's not available for web apps, only for mobile apps. So you need to know how Android apps or iOS apps work, and basic Java, Kotlin, or Swift. That's it. The rest you learn as you start exploring and playing around with the APIs. So what are the available APIs?
First is text recognition: you can detect text in an image. Second is face detection: you can detect faces, emotions, facial contours, all those things. It also lets you scan barcodes and extract values from them. There's an image labeling API that takes an image and gives you some information about what's in it. There's a landmark recognition API: for example, you take a picture of a famous place and it tells you what that place is, without using the image's metadata. Lastly, it lets you host and run custom TensorFlow models. We'll see that example later on: say I want to make an app that detects whether an object is a water bottle or a pen. I can train a custom TensorFlow model, host it on Firebase, and Firebase will let me use that model as well. So it's very flexible in that regard. The first five APIs are pre-built APIs that you cannot tweak or modify, but the last one you can tweak and modify to your heart's content.

There are two types of APIs. First is the on-device API. It runs on your mobile phone; it doesn't require an internet connection or anything else. That matters for the next billion, a very popular tagline pushed by Google: supporting, or building for, the next billion users. Recently in India we got Jio, and with the arrival of Jio, more than a billion people came online for the first time. If you want to build apps or features for them, they have to be tailor-made for those constraints. The con of the on-device API is lower accuracy; the results you get are more limited. Then there's the cloud API. That's paid, but the first 1,000 calls per month are free, so if you're just playing around with the APIs you don't need to pay anything. It needs internet connectivity, so it's less suited to that offline, next-billion use case, but it has higher accuracy, obviously, because it runs on Google Cloud and uses models trained there.

So let's jump directly into the action part. I'll show you some code snippets and some working examples of how I played around with ML Kit and built some apps. Some of those apps are live on the Google Play Store and people are using them. All of this was done within a time frame of, let's say, three to four months. I am not an ML expert, but I learned things by doing them.

First is text recognition. On the left side you see the API, and on the right side an example app built with it. For the text recognition API, I built a credit card scanner app. You see this in many apps: PayPal, for example, when you want to enter your card, asks you to scan it and automatically takes the details from the card. I made a clone of that using the Firebase text recognition API. Just take a picture of the card and it recognizes the card number and the expiry date. It gives me other things as well, like the "Citi Rewards" text printed on the card, but I don't need those, so I skipped them. You get the general idea. So it can extract text from images, and it has both cloud and on-device APIs available. The limitation of the on-device API is that it can only detect Latin characters.
If you want to support Hindi, Gujarati, and other non-Latin languages, you have to use the cloud API. You can definitely do that. But cards normally have Latin characters, so the on-device API worked for me.

Now the implementation, the steps involved. Even if you don't know how mobile development works, the steps are very simple if you look closely. First of all, you create a FirebaseVisionImage from the bitmap that you receive. FirebaseVisionImage is a class within Firebase ML Kit; you create an object of that class, and Firebase performs all the ML inference on that object. It's a single line of code. Is the code visible to everyone? The important code is highlighted in red. This language is Kotlin, by the way. JS developers might find it somewhat familiar, and if you're a Java developer it's again very similar, not at all strange. So first you create a FirebaseVisionImage: a single line of code. Just pass in the bitmap, and it takes care of converting that bitmap into a FirebaseVisionImage.

Next, you get access to the cloud or on-device text detector, the thing that is actually going to detect something in this image. That's a single line of code as well: from the FirebaseVision class, you get an instance and then get the Firebase Vision cloud text detector. In this example I used the cloud variant; if you want the on-device API, you simply ask for the on-device text detector instead. The API is documented very well, so if you want to learn how it works, take a look at the documentation.

Once this is done, the next step is to run the detector on the image: call detectInImage on the text detector, pass in the image, and attach some listeners. I have a success listener and a failure listener; there are more, like a completion listener, and all of these are lambdas. In the success listener I get a single string that contains all the words in the detected text, with the lines separated by a newline character, backslash-n. So I simply split that single string at backslash-n and run some regex checks on the words I get, to detect whether a word is the card number, the expiry date, or the cardholder name.
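Putting those three steps together, a minimal sketch might look like this. The class and method names follow the 2018-era ML Kit SDK that the talk describes, and the card-number and expiry regexes are illustrative placeholders, not the exact patterns from the app:

```kotlin
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.common.FirebaseVisionImage

// Hypothetical patterns, for illustration only.
private val cardNumberRegex = Regex("""\d{4}\s?\d{4}\s?\d{4}\s?\d{4}""")
private val expiryRegex = Regex("""(0[1-9]|1[0-2])/\d{2}""")

fun scanCard(bitmap: android.graphics.Bitmap) {
    // Step 1: wrap the camera bitmap in a FirebaseVisionImage.
    val image = FirebaseVisionImage.fromBitmap(bitmap)

    // Step 2: get the cloud text detector (the on-device one works offline, Latin only).
    val detector = FirebaseVision.getInstance().visionCloudTextDetector

    // Step 3: run the detector on the image and attach listeners.
    detector.detectInImage(image)
        .addOnSuccessListener { cloudText ->
            // One string back; lines are separated by "\n".
            val words = cloudText?.text?.split("\n") ?: return@addOnSuccessListener
            val cardNumber = words.firstOrNull { cardNumberRegex.containsMatchIn(it) }
            val expiry = words.firstOrNull { expiryRegex.containsMatchIn(it) }
            println("Card: $cardNumber, expires: $expiry")
        }
        .addOnFailureListener {
            println("Something went wrong. Please try again.")
        }
}
```

The on-device variant has the same shape; you would just ask FirebaseVision for the on-device text detector instead of the cloud one.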
And that's it. Doing this won't make you a machine learning engineer; obviously this API can't do that. But what it will do is spark your curiosity about what's possible with machine learning. There's also a custom model part that I'll outline briefly, where you can train a custom model of your own. Once you train your first custom model, you start to know: OK, this thing is possible with machine learning, so what else is possible? If you never do it, you probably won't discover all the applications that are possible with machine learning, with TensorFlow, or with your own custom models. But once you get your hands dirty in the process, you know: OK, I did this, so now this is possible, and this is possible.

It's very similar to other areas of programming. If you don't know app development, you don't know what sort of apps you'll be able to make. But once you make your first app, once you know the basics of how app development works and what's possible, ideas start coming to you naturally. That's something I wanted to induce in myself: just learn the basics, learn what's possible, and once you know how things work, you can start brainstorming ideas and searching online to see whether something is possible or not. But that excitement, that drive, was not available in the online courses, because they all start from the stats and maths and probability side of things, which I personally don't like.

So that was the text detection API: very simple, three lines of code. And the best part about Firebase is that all the APIs are very similar. If you know how one API works, the structure of all the others is basically the same; you just have to change one or two things. The last step is extracting the text from the response, which is done over here: I get the list of words by splitting at backslash-n, and I just print out which words I got. That's almost it. Inside onFailure I display a message saying "something went wrong, please try again"; it's just bookkeeping.

Next up is barcode scanning. This lets you scan and process barcodes; it's part of Google's vision APIs. It only has an on-device API: it runs barcode processing on your device and doesn't require internet connectivity. Again, there's an example app on the right side: I scanned this QR code and it showed me the data inside it.

The steps involved are exactly the same. Create a FirebaseVisionImage, a single line of code. Then there's an optional step that isn't available in all the APIs but is available in the barcode scanner: you can specify which types of barcodes you want to scan. There are more than 21 types of barcodes; that's something else I learned while making this app. Say I only want to scan QR codes and nothing else: I can specify that in an options object. Here I created the options object and set the barcode formats to all formats, which is essentially the same as not passing the options object at all. But you can restrict it: do you only want QR codes, which are square, or only the barcodes that are rectangular, or other kinds like the Aztec format? If you only care about one format, specify it in the options object and your app will run much faster. Again, this is optional; if you don't pass the options object to the barcode scanner, it will detect all barcode types.

Next, get access to the on-device barcode detector, again a single line of code. Lastly, run the detector on the image: call detectInImage on the barcode detector, pass in the FirebaseVisionImage, attach the on-success and on-failure listeners, and extract the data from the response.
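A rough sketch of those steps, again assuming the 2018-era class names, and restricted to QR codes as in the example; the extraction by value type is explained right after:

```kotlin
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.barcode.FirebaseVisionBarcode
import com.google.firebase.ml.vision.barcode.FirebaseVisionBarcodeDetectorOptions
import com.google.firebase.ml.vision.common.FirebaseVisionImage

fun scanBarcode(bitmap: android.graphics.Bitmap) {
    val image = FirebaseVisionImage.fromBitmap(bitmap)

    // Optional: only look for QR codes so the detector has less work to do.
    val options = FirebaseVisionBarcodeDetectorOptions.Builder()
        .setBarcodeFormats(FirebaseVisionBarcode.FORMAT_QR_CODE)
        .build()

    val detector = FirebaseVision.getInstance().getVisionBarcodeDetector(options)

    detector.detectInImage(image)
        .addOnSuccessListener { barcodes ->
            for (barcode in barcodes) {
                // Each barcode carries a value type describing what kind of data it holds.
                when (barcode.valueType) {
                    FirebaseVisionBarcode.TYPE_URL -> println("URL: ${barcode.url?.url}")
                    FirebaseVisionBarcode.TYPE_CONTACT_INFO -> println("Contact: ${barcode.displayValue}")
                    FirebaseVisionBarcode.TYPE_WIFI -> println("Wi-Fi SSID: ${barcode.wifi?.ssid}")
                    else -> println("Raw value: ${barcode.rawValue}")
                }
            }
        }
        .addOnFailureListener { println("Could not scan this barcode.") }
}
```

Passing a single format instead of all formats is exactly the "runs much faster" optimization mentioned above; dropping the options object entirely makes it scan every supported type.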
So this is how you extract the data, depending on the type of barcode. If the barcode contains a URL, get the URL from it. If it contains contact info, get the contact info; maybe I want to store that contact. If it's a URL, I want to open it. If it's a driving licence, I want to validate it against an API. If it's Wi-Fi credentials, I want to connect to that Wi-Fi. All of these things are available out of the box with Firebase ML Kit. So that was the barcode scanner API.

Next up, you have the image labeling API. It works like Google Lens, not exactly like Google Lens, but quite close. You give it an image and it returns a list of objects, items, emotions, colors, or features that it thinks are in the image. For example, I took a picture of my headphones, and it gave me a list of items: this image probably contains headphones, it's related to technology, there's an electronic device, there's audio equipment, and so on. So it identifies objects, locations, activities, animal species, and all these things. It has both cloud and on-device APIs; the cloud API can identify many more kinds of objects, while the on-device API recognizes a more limited set.

The steps involved are the same. Create a FirebaseVisionImage. You can optionally specify how many labels you want back; say I want to restrict this API to the top 20 items. Then get access to the on-device or cloud label detector, run the detector on the image, and extract the results. It's exactly the same as the other APIs we just saw. The crucial part of this API is the bit I've highlighted: I get the name of each label and the accuracy, or confidence, for that label. As you can see, the confidence is listed as well, around 96% here. So it gives you both the label name and its confidence.

After this, you have the landmark detection API, which detects landmarks in an image. As you can see, I took a picture and it says this is probably the Golden Gate Bridge, with a latitude, a longitude, and a confidence. It does not extract the metadata from the image to determine the location; it tries to identify well-known landmarks in the picture itself. It works with the Taj Mahal, the Colosseum, or Marina Bay Sands; if you try those, you'll get a proper response. It identifies popular landmarks, so it probably won't work with a picture of your backyard, but it works with all the famous ones. I think it can detect more than 30,000 popular landmarks. And this one has a cloud-only API; there's no on-device version.
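Sketches of the two calls just described, image labeling and landmark detection, again assuming the 2018-era class names; the maximum of 20 results matches the restriction mentioned above:

```kotlin
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.cloud.FirebaseVisionCloudDetectorOptions
import com.google.firebase.ml.vision.common.FirebaseVisionImage

// Image labeling: ask the cloud label detector for at most 20 labels.
fun labelImage(bitmap: android.graphics.Bitmap) {
    val image = FirebaseVisionImage.fromBitmap(bitmap)
    val options = FirebaseVisionCloudDetectorOptions.Builder()
        .setMaxResults(20)
        .build()
    val detector = FirebaseVision.getInstance().getVisionCloudLabelDetector(options)
    detector.detectInImage(image)
        .addOnSuccessListener { labels ->
            // Each result carries a label name and a confidence score.
            labels.forEach { println("${it.label}: ${it.confidence}") }
        }
        .addOnFailureListener { println("Labeling failed.") }
}

// Landmark detection: cloud-only; returns the landmark name, confidence, and lat/long.
fun detectLandmark(bitmap: android.graphics.Bitmap) {
    val image = FirebaseVisionImage.fromBitmap(bitmap)
    val detector = FirebaseVision.getInstance().visionCloudLandmarkDetector
    detector.detectInImage(image)
        .addOnSuccessListener { landmarks ->
            landmarks.forEach { landmark ->
                val location = landmark.locations.firstOrNull()
                println("${landmark.landmark} (${landmark.confidence}) at " +
                        "${location?.latitude}, ${location?.longitude}")
            }
        }
        .addOnFailureListener { println("No landmark recognized.") }
}
```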
Perfect, yes, the mic please. So the question is: say you took a picture six months back, and at that time you didn't have location enabled, so how should those pictures be grouped? Well, this doesn't rely on the metadata; it detects the landmark in the image and gets the location from the image itself. And the other way around: there's no way to do that from the API, but if you already have the lat-long, you can use the reverse geocoding API to extract the location name from the lat-long. That's something you can do, no need to pass an image at all. If you have the lat-long, simply go ahead and extract the name from it.

OK, it's back. All right, so yeah, I think I was at the landmark detection API. Again, the steps are the same. Ten minutes? Ten, OK, let's see. So you get the lat-long in the response.

Lastly, you have the face detection API. It tells you whether the person in the image is smiling or angry, what the dimensions are, and what the X and Y coordinates of the face are. That's our lord and savior Elon Musk. Again, the steps are the same, so I'll skip directly to the code snippet. You get the right-eye-open probability, the left-eye-open probability, the smiling probability, and so on, so you can make an emotion detection app using this API.

Lastly, there's the custom model case study. I won't be covering how I made this app, because first of all there's not enough time (I only have five to six minutes, thanks to the person over there), and secondly it's not easy to summarize in a talk. I have written a blog post on it; I'll share the links with you, and you can go through the learning and creation process of this app. What Firebase allows you to do with custom models is this: you create a TensorFlow Lite model, host it on Firebase, and Firebase will serve that model to you. You don't need to include the model in your app, which would increase the size of your app. Just have the model on Firebase; Firebase will automatically download the model, run it, and give you the output.

Some tips for creating a good model: have a lot of images while training your model; augment your data by modifying the images you already have; and account for real-life use cases by adding noise to your data set. These things might not make full sense right now, but once you go through the blog post I'll be sharing, they will. Sorry about that, I could not expand on this because of the lack of time, but I'll share resources that you can go through.

So, takeaways. Machine learning is hard, but it can be fun if done right. It's definitely not easy, and by using these APIs you will not become a machine learning engineer, that's for sure. But it's fun, and it's relatively easy, if you tackle it the right way. It was fun for me, and relatively easier, because I approached machine learning through practical implementation rather than the theoretical route. The tools and APIs available make the onboarding process super easy. And for a custom model, having a good data set should be your primary focus: make sure your data set is proper, properly documented, with proper images and all of that.
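Before the wrap-up, two more sketches: the face detection snippet just mentioned, and the custom-model registration that the case study relies on. Class and method names follow the 2018-era SDK (later versions renamed several of them), and the model names pokedex_v1, pokedex_local, and the asset file are made up for illustration:

```kotlin
import com.google.firebase.ml.common.FirebaseMLException
import com.google.firebase.ml.custom.*
import com.google.firebase.ml.custom.model.*
import com.google.firebase.ml.vision.FirebaseVision
import com.google.firebase.ml.vision.common.FirebaseVisionImage
import com.google.firebase.ml.vision.face.FirebaseVisionFaceDetectorOptions

// Face detection: enable classification so the smiling / eye-open probabilities are computed.
fun detectFaces(bitmap: android.graphics.Bitmap) {
    val image = FirebaseVisionImage.fromBitmap(bitmap)
    val options = FirebaseVisionFaceDetectorOptions.Builder()
        .setClassificationType(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
        .build()
    val detector = FirebaseVision.getInstance().getVisionFaceDetector(options)
    detector.detectInImage(image)
        .addOnSuccessListener { faces ->
            for (face in faces) {
                println("box=${face.boundingBox} " +
                        "smiling=${face.smilingProbability} " +
                        "rightEyeOpen=${face.rightEyeOpenProbability} " +
                        "leftEyeOpen=${face.leftEyeOpenProbability}")
            }
        }
        .addOnFailureListener { println("Face detection failed.") }
}

// Custom model hosting: register a model uploaded to the Firebase console,
// plus an optional bundled fallback, then get an interpreter for it.
fun setUpCustomModel(): FirebaseModelInterpreter? {
    val conditions = FirebaseModelDownloadConditions.Builder()
        .requireWifi()
        .build()
    val cloudSource = FirebaseCloudModelSource.Builder("pokedex_v1")
        .enableModelUpdates(true)
        .setInitialDownloadConditions(conditions)
        .setUpdatesDownloadConditions(conditions)
        .build()
    FirebaseModelManager.getInstance().registerCloudModelSource(cloudSource)

    // Optional local copy shipped in assets/, used until the first download completes.
    val localSource = FirebaseLocalModelSource.Builder("pokedex_local")
        .setAssetFilePath("pokedex.tflite")
        .build()
    FirebaseModelManager.getInstance().registerLocalModelSource(localSource)

    val modelOptions = FirebaseModelOptions.Builder()
        .setCloudModelName("pokedex_v1")
        .setLocalModelName("pokedex_local")
        .build()
    return try {
        FirebaseModelInterpreter.getInstance(modelOptions)
    } catch (e: FirebaseMLException) {
        null
    }
}
```

The local source is optional: with only the cloud source registered, the model is downloaded on first launch and served from the local copy afterwards, which is exactly the trade-off discussed in the questions at the end.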
So what's next? After this talk, what should you do? ML Kit, the thing I just showed you, has a lot of codelabs. Have you heard of Google Codelabs? Anyone? Basically, a codelab is a step-by-step guide that Google publishes. It has starter code, it tells you what to do next, and it has the finished code at the end. So you can try these codelabs: download the starter code, then step two tells you, OK, now add this code, and this code does this thing; then add that code, which does that thing. It's a step-by-step guide to making an app that uses ML Kit.

There are various codelabs available. First, there's a barcode scanner codelab. You can take the links, snap some pictures of the slides if you want, that's fine. After that, there's a codelab for the image labeling API, so you can build a Google Lens clone. There's a text recognition codelab that you can go and play around with. And there's a codelab for running a custom model with TensorFlow and ML Kit.

Then there's one codelab you must try after finishing with ML Kit: TensorFlow for Poets. It takes you through the process of creating your own custom model. The Pokémon app that I mentioned? I made that app by following this codelab. It walks you through the entire process of collecting the data set, training your model on that data set, and deploying that model to Firebase. So definitely do this one; it's one of the best codelabs Google has so far.

Something else you can do is go through Google's machine learning crash course, MLCC. It's meant for developers, and it takes you through the mathematical aspects of machine learning, TensorFlow, and all those things. It's just a four-week course, relatively easy, and practical.

Then, if you liked what you just saw: the apps in all the screenshots I showed are part of a sample app that I made, which uses all of these APIs. It's on my GitHub and it's open source, so you can play around with it and go through the source code to see how it was implemented. There's another app that I made using ML Kit and Firebase, the one I was talking about: a Pokédex. It's also open source, the data set is available on Kaggle, and it's live on the Play Store. You take a picture and it tells you which Pokémon is in it.

So that's some proof that the process I followed works. By following it, I was able to make an app that uses machine learning, that does not depend on the pre-built APIs, that uses a custom model, and that's live on the Play Store with people using it. That's the validation: first I had the idea, and this is the validation. You might think this would not work; here's the proof that it does.

Yeah, I think that's it from my side. I have blog posts on all of these APIs and all the things I just showed you; they're available on my Medium account. You can find me on Twitter, and that's my GitHub. Do we have time for questions? OK, yeah.

So, on the question about how I augmented the data set: you can use ImageMagick. ImageMagick is an image editing tool, but it has a command-line interface, so I used ImageMagick to scale, rotate, and crop my images. Initially I had only 10,000 images, and I scaled, rotated, and cropped every image.
So for a single image, I had 21 more images, and at the end of the day I had, I think, 210,000 images. So yes, ImageMagick, you should use ImageMagick.

On the question about an API: yes, Firebase has an API to call a model that is hosted on Firebase, so that's possible. The only reason I used Firebase for this was that I didn't want to bloat my app, because the model is around 15 MB. In fact, deploying it on the phone itself can be better: if you host it on Firebase's console, the first time you launch your app the model is downloaded to your phone, and after that it's used locally, from the downloaded copy. But if you have the model on your phone from the get-go, say you include it in your app's source, then the first launch will be much faster and hassle-free for the user. Was that your question?

On battery: the only battery drain you should really be concerned about is the camera that's always running. A lot of new phones have dedicated hardware for machine learning, and if your phone is running Marshmallow and above there are the neural network APIs, specific APIs for machine learning, which reduce the battery consumption significantly. Before Marshmallow, yes, it uses your CPU and that might affect battery life, but after Marshmallow, that is Android 6.0, battery life is much better when you're running these models. But again, battery life is one of your least concerns; your main concern should be memory usage, because the camera is always running. And yeah, that's about it.

Okay, let's thank Harshit for a nice talk. Thank you. Thank you.