Welcome to this workshop on license plate recognition. I am Guillaume Moutier, and with me I have Erwan. Maybe we'll start by quickly introducing ourselves before we dive into the content. So, I am the data plumber, meaning you call me when you have any issue with your data: how to move it around, how to ingest it, how to process it. No, but more seriously, I'm a data engineering architect at Red Hat. I've been there for two years, working on data science platforms in general and all sorts of related data problems: processing, engineering, whatever you want to do with data. Erwan? Yeah, thanks Guillaume. If you're the data plumber, I don't know what that makes me. Maybe the handyman who does a little bit of everything, just not quite to completion. I've been at Red Hat for nearly a year now, I work as an architect, and I try to have as much of a holistic view of these things as I can. That's a good definition, because when you've been working in data science and data engineering for some time, in fact, as soon as you begin, you quickly figure out that you have to do a little bit of everything: not only training models, but also data engineering, preparing the data, and a bit of software engineering, as we will see in this workshop, to package applications and use them. That's a good introduction, because that's what we will try to cover here: a little bit of all the things you have to do to go from your raw data to an application you can use in a real context. So let's start. You know what, we introduced ourselves: you're the data plumber, I'm the handyman. I'm curious where people are from and what they see their job as. So if you want, type your location in the chat, and tell us: are you the electrician of the house? The subcontractor? Feel free to interact in the chat; we'll keep an eye on it as we discuss these things.
Yes, and it will help us give more details on one area or another depending on the audience. So without further ado, let's start with what we will see today. We will make sure that you have access to the OpenShift Data Science sandbox environment, so that you are able to run this workshop for yourself without having to install anything on your computer. That's the beauty of OpenShift Data Science: it provides you with those environments. We will show how to get started with notebooks in this environment. If you don't even know what a notebook is, this will be a good introduction to it and to how it's used in data science. And we'll go to the end of the process and show you how you can deploy a real application, in this case a REST API, based on the machine learning models that we use in this workshop. Before we dive into the technical aspects, the accounts, and everything, we wanted to do a small introduction to the use case, because that's really what matters, to be honest: your business case. Why are you doing this? Of course, if you are looking into data science only as a hobby, there are multiple courses and cool demos that you can find out there on the internet, but most of the time, what you do as a data science project will have to solve a concrete business case. Otherwise, your data science team may last six months and then, unfortunately, be shut down, because if there's nothing to gain from it, maybe it's not a good investment. The use case we will work on today comes from a broader demo and workshop that I created with some colleagues a few months ago, and it revolves around a smart city project. We started from a very real environmental project around the city of London.
For those who are not aware, there is an Ultra Low Emission Zone defined around the city of London in the UK, and vehicles have to pay a fee to enter the area. Based on this existing use case, we derived a demo where we wanted to illustrate how we can use the data coming from cameras all around this specific area. Our business objectives are to reduce congestion, maybe by charging all vehicles a fee for driving into the city, which is what they are doing in London. The goal is also to reduce pollution, maybe by charging dirty vehicles more. In the full demo, and we will put some links in the chat at some point because we have a recorded video of the whole use case, there are some computations done against the vehicle registration database to charge dirty vehicles an extra fee. We can even locate wanted vehicles: if we detect, at an entrance point into the area, a specific car that we want to identify for whatever reason, like an AMBER alert for a child abduction or a stolen vehicle, we can use this environment for that as well. Erwan, you had a question? Yeah, this is interesting, because last month we had a more generic object detection workshop, for those who attended; if you didn't, the recording is available. There we basically just recognized things like a car, a house, a flowerpot, whatever, which has its uses. But here we're much more specific, right? We don't need to recognize trees; we don't care about the trees. We care about the car, then the license plate, and then we actually want to do text recognition on the license plate. So this is a more specific use case compared to generic object recognition. Interesting, cool. Exactly.
And I definitely encourage you to watch the full video for this workshop, because you will see it run inside an automated pipeline, compared to the object detection you did last time. You will see it run in a broader context, with tons of cars coming in and all those things happening. But for the workshop today, we will focus on the first part that Erwan described, which is the model. How can we create a model inside OpenShift Data Science that is able to recognize a license plate from the image of a car? Because originally, what you have from the cameras are just images. So first you have to know where the license plate is; that's the first step, the extraction. And then, what's the number on the plate? Maybe this part is a little more familiar, because it's character recognition on the plate. But you will see there are some small tips and tricks to apply here, because of course you don't have a cleanly scanned page where you can simply recognize the characters. Here the images are skewed, or zoomed, or whatever, so there are some little tricks to apply. So that's the background; that's what we are going to do: work through some notebooks to see how we can use those models and package this small pipeline as an API. Some words about OpenShift Data Science from Red Hat. This is a managed cloud service, an environment created for data scientists and developers of intelligent applications. What it provides is a full environment that you can work in directly. The first problem in data science is always that you have to set up your environment, meaning you have to install Python, Python packages, this or that application, and the rest.
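The two steps just described, locate the plate and then read its characters, can be sketched as a simple two-stage pipeline. This is only an illustration: the function names are placeholders, not the workshop's actual API, and stubs stand in for the real models.

```python
# Hypothetical sketch of the two-stage license plate pipeline.
# The function names and return values are illustrative placeholders.

def detect_plate(image):
    """Stage 1: return the cropped plate region from a full car image."""
    # A real implementation would run an object-detection model here;
    # this stub just pretends the plate region is the whole image.
    return image

def read_plate(plate_image):
    """Stage 2: run character recognition on the cropped plate."""
    # A real implementation would segment characters and classify each one;
    # this stub returns a fixed string.
    return "AB12CDE"

def recognize(image):
    """Full pipeline: camera frame in, plate number out."""
    return read_plate(detect_plate(image))

print(recognize("raw-camera-frame"))  # -> AB12CDE
```

The point is the composition: whatever models sit inside each stage, the application only ever calls the outer function.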
It can be a little cumbersome, and even if you pass this first step (sorry for the dog), then you have to maintain this environment. You have to maintain it in the exact same state as your colleagues' if you want to be able to exchange notebooks, recipes, or data, and in the same state as your production environment. With the tons of data science packages available, plus the fact that it's definitely not a mature field, meaning versions and compatibility change all the time, it can be a real endeavor to set up a stable platform for your data science work. That's why we created OpenShift Data Science: ready to use, maintained, supported, and everything you can think of, to create your data science environments. It sits on top of some cloud infrastructure: AWS, and very soon Azure. I don't remember if that one is there yet, but we obviously want to support all the major service providers, and eventually some accelerators if you want GPUs inside your environment. Then you have the OpenShift layer that provides our base platform for running containers, and on top of this base platform, we deploy OpenShift Data Science for you. It comes, as you will see, with different flavors of notebooks and frameworks like TensorFlow or PyTorch. There are also other applications you may use from our partners, like Anaconda for the commercial edition of Anaconda, or Starburst Galaxy for distributed queries inside your environment. We also provide OpenVINO from Intel, IBM Watson, and Seldon. So it's a mix of our own packages and environments, plus other things from partners that are fully integrated inside the platform, to build what you need depending on what you want to develop. I'm seeing a quick question on the side: is OpenShift Data Science based on Kubeflow? Partly. The operator used to deploy all those things is related.
Yes, it shares some code with Kubeflow, so it's similar from that perspective, but here it's meant to run specifically on OpenShift. There are some caveats with Kubeflow on OpenShift environments because of all the enterprise features and security we bring in, so let's say Kubeflow is part of the upstream we derive from to build OpenShift Data Science. Okay, so now you know where you will run the workshop. Erwan will explain how to connect to it and how to use it. Yep, indeed, thank you, Guillaume. So the next step: in the announcement for this workshop there was a link at the bottom that said you should sign up for your free OpenShift Data Science sandbox account. Hopefully you've done that already, but I'm not going to assume that's the case. I know I'm not the most diligent in fulfilling prerequisites, so I'm not going to cast any stones at those who haven't had the time. If you haven't, this is the link you're going to want to go to. I will paste it in the chat so that you don't have to retype it. Let me just see; there should be a way to copy from presenter view, but I guess not. All right, here's the link, boom, I've put it in the chat. So now you can click on that, and I'll quickly walk you through what the steps are going to look like. It takes about five minutes to sign up, so it's really not crazy. Basically, if you hit this link, you get to this page with a big flashy red button in the middle: Try OpenShift Data Science in the Sandbox. If you click on that link and you're already logged in, you're probably going to access it directly; you only see this screen if you don't have a Red Hat account yet, or if you don't have a sandbox account yet. This is pretty self-explanatory: you fill out these forms, entering these kinds of details about yourself, and you will receive an email.
That's going to be for your Red Hat account if you don't have one yet, so you'll need to confirm the email. Then, when you come back here and launch the sandbox, there's probably going to be a second confirmation where they ask for your phone number. You're not going to get phone calls from us; we're not collecting any of these numbers. This is just to verify that nobody is abusing the system and randomly asking for a thousand accounts, because that would be complicated for us. Once you've filled all of this out, you're going to see a message like this: your account is being prepared. That takes three or four minutes, something like that, and that should get you into the environment. So let us know in the chat if you're having issues with any of that, but hopefully my talking through these steps has given you enough time to get your account if you hadn't created it yet. Okay, so let's get started. The workshop material we're going to cover today, at a high level: step 1A, sign up for an RHODS sandbox account, and we should be done with that already, right? Hopefully you have your account. You don't have to do the steps yourself, but if you do them at the same time as we do, you get more information, you get to ask questions, and we'll probably try to challenge you a little; we'll have a few extra quizzes in there just to see if you're able to answer. Then: accessing the sandbox, git cloning a project, talking about notebooks, building and training the model, and finally deploying the model so that it is served constantly by the cluster. These are the overall steps. Obviously, this is probably not enough information for you to get going, so I'm going to do the same here and copy this link. Actually, I'll be lazy and use my... there we go, copy this right here. These instructions are available 24/7.
Your sandbox account is good for 30 days, and then you can renew it. So if you want to go through all of this a second or third time, you don't need to wait for us; you can do it at your own pace, no problem with that. Okay, with that, let's get started on the instructions; we'll get back to the slides at the very end. I will go through this step by step. If I'm not going fast enough for you, feel free to move on ahead; there are no traps or anything, so go as fast as you want. And if you fall behind, don't feel like you have to catch up; you can do it at your own pace. The idea, as Guillaume stated earlier, is that we want to take a picture, turn that into "where's the license plate", and then read the actual text from the license plate. All right, let's get started: starting a Jupyter environment. Now, you might already have your own OpenShift cluster, you might already have OpenShift Data Science, but even if that's the case, we're going to ask you to work in the sandbox the way we do, so that we all have an identical environment. I'm going to show you what this looks like, including all the logging in. Let me wait for that. Okay, my account is already created, but I've logged myself out, so it's going to prompt me to log in again in a minute. All right, now I'm going to click here, start using my OpenShift Data Science sandbox, log in with OpenShift, and that should get me logged into the environment. At this point, well, that looks a lot like what I was doing there, right? So far so good: I seem to be able to access the front gate of the OpenShift Data Science environment. Here we're told that we need to launch a JupyterHub notebook, and we're told what details we need to fill out when we do that. But before I go there, I'm a bit curious.
If I click around and poke around, you can see here under Explore that there are other pieces of software that, in my case, are not enabled yet. I mean, JupyterHub is, but the rest isn't. The idea is for more and more options to become available there over time. JupyterHub is one of the tools in the arsenal of the data scientist, but there's a lot of third-party software that can be very useful as well. Also, if this is your first time accessing such an environment or such software, there are a lot of resources available here that you can browse at your leisure. If you've never used Jupyter, maybe you'll see "Creating a Jupyter notebook"... yeah, actually that sounds good, tell me more about this. So you have these self-guided little tutorials embedded straight into the application. And remember, since you have access to the sandbox for 30 days once you've created the account, you have plenty of time to view all those tutorials and try this for yourself. There are even other tutorials we can point you to at the end of this session, maybe the one Erwan was talking about for object detection; you can totally run it inside your sandbox environment. So take your time, test everything, and give us some feedback; that's always appreciated. Yeah, definitely. All right, so back to here. It was a Jupyter notebook I was supposed to get, and I'm supposed to select the TensorFlow image and choose the default size. So I'm going to do what I'm told and launch the application. Because I fully logged out, I'm prompted for authentication again. And yep, okay, you can see the last image I used was TensorFlow, so it does remember that. And here it's got the small size; I'm told to use the default. There we go, and I'm going to click here on Start Server. Now, this is usually fairly fast; you're looking at about 10 to 15 seconds. As a word of warning, sometimes it's a bit longer.
While I'm waiting for this, I'll explain why. The cluster behind the sandbox has been configured to auto-scale. By default it only has so many machines, and when they're full, it creates a new one. If you're the unlucky person who happens to arrive right when things are full, you're the one triggering the addition of a new machine, and it does take a little time for a machine to be created from scratch and booted up. So if most of the time it's 15 seconds, and sometimes it's more like five minutes, that's an artifact of the auto-scaling we've enabled in this cluster. All right, let me get back to my instructions. Actually, what I'll do is a side-by-side. Hopefully my font is big enough; let me see if I can make it a bit bigger. All right. So I've created my JupyterHub notebook server, right here on the right-hand side of my screen, so I can now go to the next section. I won't read all the instructions out loud; I'll go a little faster, but you can always go over them again. Now, you can see I have some leftovers from the last time I logged in and did stuff: I have the fraud detection workshop from two months ago. The one I want today is this one, called License Plate Workshop. So I'm going to copy this text, use those icons like I'm told, clone a repository, and paste that URL right there. And just like that, it has added this new folder, License Plate Workshop, freshly cloned from GitHub. Okay, there we go, I've done that. Oh, I went a bit fast. Okay, so we've added those things. Let me see; I'm sure some people are very familiar with Jupyter, so let's see if they can help the ones who are less familiar. You can see at the bottom here that there is another way to clone a Git repo in JupyterLab. Can you guess what it is?
I'm going to resist the urge to click here, and I'm going to see if anybody in the chat wants to tell us. Anybody? Okay, let me paste the instructions I'm at. The instructions on my screen have the link to the GitHub repo you need to git clone, if that helps. Oh, interesting, yeah. So I got some suggestions in there: exclamation point git clone, and, yep, open a terminal and run git clone. Okay, I'm going to have to amend my answer, because there is indeed more than one way of doing it. The way I was thinking of was: you can come here, do New, Terminal, and you can see we have this Linux-looking terminal where you can definitely do git clone blah blah blah. That's one way of doing it. If you're familiar with notebooks, you could also create a new notebook, and whatever you'd have typed in the terminal, prefix it with an exclamation point: !git clone blah blah blah. If I run this cell, it executes the command. So, okay, we do have some JupyterLab experts who are going to keep me in check, very good. This is just an easy way to bootstrap a project: you dump everything from the GitHub repo in there. But you've probably seen those buttons; you can also upload files and do all sorts of things, I'm just not going to go into it. All right, so notebooks: what's a notebook and how does it work? The first notebook here is really minimal. If you're familiar with notebooks, it's not going to be very interesting, but if you're not: essentially these are called cells, you type Python in them, and when you click the Run button, it executes what you've typed. Now, not all of those are Python cells. This one is a Markdown cell, and if I double-click on it, you can see the Markdown syntax that allows you to do this. And when you run this one, instead of executing code, it just renders.
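Both answers boil down to shelling out to Git: the terminal does it directly, and the `!` prefix in a notebook hands the rest of the line to a shell. From plain Python, the equivalent is the standard `subprocess` module. A minimal sketch, where the repository URL is just a placeholder:

```python
import subprocess

def build_clone_cmd(url, dest=None):
    """Assemble the argument list for `git clone`."""
    cmd = ["git", "clone", url]
    if dest is not None:
        cmd.append(dest)
    return cmd

def clone(url, dest=None):
    """Run git clone; raises CalledProcessError if the command fails."""
    subprocess.run(build_clone_cmd(url, dest), check=True)

# In a notebook cell, `!git clone <url>` hands the same command to the shell.
print(build_clone_cmd("https://example.com/some-repo.git"))
```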
And you can see here there's a hyperlink I can click on, and all that good stuff. So those are really the basics of notebooks. You can do plenty of things; I don't want this to turn into a notebook lesson, but you can reorder the sequence of your cells if you need to. It's a very flexible environment where you can do all kinds of things. So we've done that. All right, I've got one more question for our JupyterLab experts: there is a keyboard shortcut that allows you to run the cells in sequence. Twenty years ago, when I started working in IT, one of my mentors showed me how to stop using my mouse and do everything from the keyboard, and I tend to keep looking for those shortcuts myself. So I'm going to place my cursor here and step away from the mouse, and while people answer, I'll just show you: with the arrows I can go up and down, and there's a way I can run the cell just from the keyboard. Yup, all right, definitely: Shift+Enter. You can't see it, but trust me, I'm doing it. Shift+Enter runs the cell and moves to the next one, so basically you hold Shift and keep pressing Enter, Enter, Enter. Why Shift? Because Enter on its own lets you edit the cell: when you're in a cell and press Enter, it doesn't execute; it's Shift+Enter that executes. All right. And if you are beginning with notebooks, I really encourage you to take two minutes a day for one week to learn all those shortcuts; they will definitely help you when you are writing your notebooks. Press B to insert a new cell below, press A to insert one above, and so on. Normally, you shouldn't even need your mouse when working inside notebooks. It's not vi, okay? It's much friendlier than that, but there's really no need, when writing notebooks, to take your hands off the keyboard and reach for the mouse; there are very handy shortcuts built in.
Yeah, definitely. When I started working with notebooks, just using the icons, I kind of struggled and thought, how does anybody work this way? This is pretty slow. But as you say, the shortcuts really come in handy. Okay, all right, very good. Let's go to the next section. I've shown the basics of notebooks and all of this, but at this point, Guillaume, I'm going to let you walk us through the rest of these exercises. If you share your screen, I believe Christian is going to be able to toggle it. We see yours. Yes. Okay. So now we are on notebook number two, and I won't do as Erwan did, with half of the screen for the instructions under it, because here everything we do is mostly inside the notebook. It will be more like a code review, just for you to see what's happening in the notebooks. If I have to, I will switch back to the instructions, which I have on the other tab, but normally, from here, everything we do is inside your notebook. So again, as Erwan said, if you want to go faster for whatever reason, just go ahead. Here I'm just going to do the review, for those of you following along, and for the recording, for those who want to come back to it. Okay, so let's start. I have zoomed in a little so you can see what we are doing. The first notebook, 02, is the part where we build the logic around the license plate recognition. Obviously, we won't train the models as part of this workshop, because that can take quite a lot of time depending on the data set: typically, training a single model would take at least 10 or 20 minutes to do properly. So here we will use pre-trained models. But you will see that even with those pre-trained models, there are tons of things to do just to make this usable inside an intelligent application, inside a prediction environment.
First thing is the environment initialization: the libraries we need to install to be able to run our models. We have two libraries we need here. opencv-python-headless, that's for image processing; it's a library that will be very useful to prepare the images: rescale, resize, cut things out, and so on. And we need Keras, which is a wrapper around TensorFlow, the machine learning library. So let's do this: you run the cell, which does the pip install. As you saw before, the exclamation mark means we are running this in a terminal process, not in the notebook itself, so it just does the pip install, and we get the right versions of the libraries we need. Then, of course, we need some imports for all the different libraries we will use: some basic stuff, TensorFlow, Keras, some scikit-learn functions for preprocessing and encoding the labels, the cv2 library, which is the OpenCV library we just installed, and NumPy and Matplotlib to display what we are processing. So let's do those imports. "No module named tensorflow." Oh, I didn't launch the right image. My bad, let me just quickly... No, no, you picked the non-TensorFlow image on purpose, to show them how you could change your mind, right? Stop your notebook, go back to that screen. Don't tell them all my tricks. Yeah, indeed. I had launched the standard Data Science notebook earlier to go quicker, but here we pick the TensorFlow image, because that's the one we want to use, and then we start our server. As you see, it's very simple to switch from one image to another. We also have recipes available if you want to build your own custom image and bring it into OpenShift Data Science.
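Since a missing library means restarting the server, as the wrong-image mishap shows, a small pre-flight check of the required packages can save you a round trip. This is a sketch using only the standard library, not part of the workshop notebook; the package list mirrors what was just described (cv2 comes from opencv-python-headless):

```python
import importlib.util

def is_available(module_name):
    """Return True if a module can be imported, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None

def missing(modules):
    """Return the subset of modules that are not installed."""
    return [m for m in modules if not is_available(m)]

# Modules this notebook relies on.
required = ["tensorflow", "keras", "cv2", "numpy", "matplotlib", "sklearn"]
print("missing:", missing(required))
```

Run in the first cell, this tells you immediately whether you picked the right notebook image.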
So let's say you settle on a specific set of libraries you want to use, or some specific Jupyter extensions, or whatever you want to build into your environment: you can create those custom images. We have recipes for that, and guides and tutorials are coming. So let's redo the pip install for what was missing, those OpenCV libraries, and then we can do our imports, and this time it works properly. We now have what we need in terms of libraries and packages. The first step will be to find Nemo, or rather, to find the plate inside a car picture. I have an example here to illustrate. It's a car picture, and of course it's obvious for a human: yes, we want to recognize the license plate, the license plate is there, and we can read the number. But a computer doesn't even know what a car is, or where to look to do the recognition. We don't want to run a character recognition model against the whole image just to figure out whether there are characters somewhere; we want to extract the plate region first. So we need, and I won't go into the details, tons of what I call helper functions: to get the size and shape of the image, to resize the image, to put labels on, to do some transformation matrices, to unskew the image. There are tons of those helper functions defined here. If you have the time and interest, I encourage you to look at exactly what's happening, but for the sake of time, we won't go into those details. The next cell holds the processing functions. Here again, that's just about organizing our code properly. Of course, using notebooks and cells, we could do this step by step: load the model in one cell, then do this or that in the next.
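To give one concrete flavor of those helper functions, here is a nearest-neighbour resize written with plain NumPy. This is only a sketch of the idea; the notebook itself relies on OpenCV (`cv2.resize`), which also offers proper interpolation:

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    h, w = image.shape[:2]
    # Map each output pixel back to the nearest input pixel.
    rows = (np.arange(out_h) * h / out_h).astype(int)
    cols = (np.arange(out_w) * w / out_w).astype(int)
    return image[rows][:, cols]

img = np.arange(16).reshape(4, 4)
print(resize_nearest(img, 2, 2))  # picks rows/columns 0 and 2
```

The real helpers do the same kind of index arithmetic, just with better interpolation and the extra steps (labels, transformation matrices, unskewing).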
But it's always good to organize your code as best you can by defining functions; that will definitely help in the next steps, when you want to package it, because then you have code that is cleaner to integrate into your application. So we have a function to load the model, to preprocess the image, to reconstruct the plate image from the detected pattern; all of those are mostly image manipulation functions. And then the function that launches the license plate detection using the model: that's where it happens. First it takes the image, resizes and reshapes it, then it launches the prediction, and finally it returns the result. So let's define all our functions. This cell runs almost instantly, because we're not executing anything yet; right now we're just defining our functions. Then we can make a detection. By organizing our code this way, it's much easier to read: we only have to call the load-model function, then call the get-plate function after we've chosen which image to look at, and then we display the results. Let's run that. There are tons of warnings from TensorFlow; that's standard, don't be scared. TensorFlow puts out some warnings depending on the infrastructure you have, whether you have access to GPUs or not. What we have at the end is that, first, the model has been successfully loaded. Then we have a display of the original image; that's the first figure here. Then we have another image, which is the license plate that has been extracted. So in this part of the notebook, what we have done is take an input, a full image, and extract only this part; we have used the pre-trained model to recognize where the license plate is. Now we can go to the next step: recognizing the license plate number.
Here again we have some processing functions, like grabbing the contour of each digit from left to right. We will use some OpenCV functions to automatically detect the contours: those are built-in OpenCV functions that, by doing color and contour analysis, are able to detect the shapes. Of course, at this stage it doesn't know what the shapes are, but it's still able to extract them. And once we have those shapes, we use another model to make a prediction for each detected shape, translating it, if I may, into an actual number or character. Finally, we tie all those steps together into a single function, which I have called rpr_process: it extracts the shapes, iterates over each one to recognize the character, then puts everything back together and returns the result. So let's run this. Again, Ctrl+Enter. We have defined everything; now we can load our model. This is a specific model; it has to load a few different things, the label classes, the character classes, and so on. That's how you work with the MobileNet character recognition model we're using here. And finally, now that we have all the functions and have loaded the model, we are able to launch the detection. We will start from the license plate that is here in the images; we have plate four, that's this image here. Let's run it. Fingers crossed. Fingers crossed... and it works. Of course, it's totally faked, I have a special call in the Python function just to display this. No, it works for real, and you will be able to test it yourself with your own images if you want. As you can see, we started from this image, we extracted this license plate, and we have the result here. And you can see there is an error: here the plate says BM, but the model detected 8 instead of B.
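The "left to right" ordering step can be shown without OpenCV at all: given the bounding boxes of the detected shapes (as `(x, y, w, h)` tuples) and a per-shape classifier, you sort by the x coordinate and concatenate the predictions. The names here are illustrative, not the notebook's actual functions, and a lookup table stands in for the character model:

```python
def sort_left_to_right(boxes):
    """Order bounding boxes (x, y, w, h) by their left edge."""
    return sorted(boxes, key=lambda b: b[0])

def assemble_plate(boxes, classify):
    """Classify each detected shape in reading order and join the result."""
    return "".join(classify(b) for b in sort_left_to_right(boxes))

# Toy classifier: in the notebook, this would be the MobileNet character model.
labels = {(5, 0, 10, 20): "A", (20, 0, 10, 20): "B", (35, 0, 10, 20): "7"}
boxes = [(35, 0, 10, 20), (5, 0, 10, 20), (20, 0, 10, 20)]
print(assemble_plate(boxes, labels.get))  # -> AB7
```

Without the sort, the contour detector would hand back shapes in an arbitrary order and the plate string would come out scrambled.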
So you can see the model is not perfect. The character recognition model is not perfect. But it's understandable, because if you look closely at the license plate on the image, it's skewed a little bit because of the angle the photo was taken at, and the license plate image we extract from it has been put back horizontal and kind of unskewed. Of course, this introduces some noise and distortion into the image. Plus, the model is, on purpose, not perfect, just to introduce those errors so that we can show you in this workshop that a model is never perfect. That's why, in any image application, you have to put in some safeguards and checks and constantly retrain the models, to get something that reaches the degree of accuracy that you need. Okay? And it will never be 100%. Yeah, I mean, it's up to each person to decide how good is good enough. But if you wait for the model to be 100% accurate, 100% of the time, it might just never happen. So you're better off putting what you have into production, tracking how good it is, and then making incremental improvements on it over time. Cool. And now that we have all the right recipes and we have defined everything as functions, it's very easy to iterate over the data set that we have. We have some images here, so we can launch this cell that will run the detection for each and every image in the data set. Every time you run this cell, you get the original picture, the extracted license plate picture, and the inferred license plate number. And it does it for all of those cars. You can see it's pretty fast; it's just calling those functions. So now, as a data scientist, I can say: I've done my job. I have maybe trained a model.
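Running the whole recipe over a folder of images, as the cell described here does, amounts to something like this sketch, with a stand-in `detect` function in place of the two models and a temporary folder standing in for the workshop's data set.

```python
import tempfile
from pathlib import Path

# Sketch of the "run the whole recipe over every image" cell described
# above; detect() stands in for the plate-detection + character models.

def run_over_dataset(folder, detect):
    """Apply the full detection recipe to every image in a folder."""
    results = {}
    for img_path in sorted(Path(folder).glob("*.jpg")):
        results[img_path.name] = detect(img_path)
    return results

# Fake data set: two empty files and a detector that always answers
# the same plate, just to show the iteration pattern.
with tempfile.TemporaryDirectory() as d:
    for name in ("car1.jpg", "car2.jpg"):
        (Path(d) / name).touch()
    out = run_over_dataset(d, lambda p: "XX00XXX")
print(out)  # {'car1.jpg': 'XX00XXX', 'car2.jpg': 'XX00XXX'}
```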
Let's say I have trained those models and I have the recipe to go from the image to the license plate number. Now it's up to you people in software engineering to take care of the rest. No, no, no, hold on, hold on. That stuff is stuck in your notebook. How do I put it on the cameras around the city of London? There are a few more things that need to happen, aren't there? There are definitely many more things. That's why this is definitely a team effort. You have to have business people to define the exact use case, the degree of accuracy that is needed, and so on. You have to have, of course, data scientists to work on those models, create them, and create the recipes to work with them. You maybe have data engineers to create the data pipeline that will extract all those images and bring them inside the environment. And you have software engineers who are needed to create the application itself. Now, a notebook environment is always great for working on those recipes, because of the nature of the notebook: you can go back and forth in the code. Working in a more traditional way, you would have to put tons of breakpoints inside your code, stop at a breakpoint, come back, rerun, and so on. Here it's a totally interactive, or iterative, way of working. Okay, I'm not satisfied with the way it's resizing? Let's put 90 by 90 instead of 80 by 80 for the image resizing, rerun the cell, and see if it works. No? Rerun the cell, and so on. But at least we have the recipe. So the next step for us will be to turn it into an application, to make it work as an application. So here I'm going to close all those notebooks and open this one. This one will be a Flask application. For those of you who don't know, Flask is a Python framework to create web services. And what we will do is use Flask to serve our code, but this time packaged directly as an application.
And you can see from those cells that the Flask environment will call this wsgi.py. So let's have a look. We can see that it's a Flask application, with a first route that just gives you a status, telling you whether the web service, the API, is working or not. And then we have a path for predictions that will call our prediction function. The prediction is in this Python file. Here we are back to a more standard way of developing things, because it's a single Python file. Of course, I could have split it into classes and different files and so on. But if you look at the code, it's the exact same code that we have in our notebook. The same functions, the same transformations and so on, but packaged in a way that we can use this time, because we have defined this predict function. And what this predict function does is call all the different functions that we defined before. So that's how you translate a notebook, a recipe, into code that you can work with: basically, you retake all those steps from the recipe and repackage them into a more standard application. So we have this prediction.py that is called by our API server running Flask, and that's what we will run right now. First we have some more requirements to install, which are Flask and Gunicorn, and we make sure that we have, of course, all the needed libraries for our code itself, but mostly that they are the same libraries that we used inside the notebook before: OpenCV, TensorFlow, Keras, and the rest. And once we have this, we can run our Flask server. Okay, it's starting. And now you can see that the server is running on this address and port. Of course, you can see it's a local address, because all of this is running inside your container environment.
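A minimal sketch of the wsgi.py layout described here, assuming the `/` status route and a `/predictions` route that delegates to a `predict()` function. The route names follow the transcript; the `predict()` body and the returned field are stubs, since the real one wires in the notebook's detection functions.

```python
from flask import Flask, jsonify, request

# Minimal sketch of the wsgi.py layout described above: a status route
# plus a /predictions route delegating to predict(). predict() is a
# stub here; the real prediction.py runs the two models.

application = Flask(__name__)

def predict(payload):
    # Stand-in for prediction.py's predict(): the real one would decode
    # the image, locate the plate, and read the characters.
    return {"plate": "XX00XXX"}

@application.route("/")
def status():
    """Health-check route: confirms the API is up."""
    return jsonify({"status": "ok"})

@application.route("/predictions", methods=["POST"])
def predictions():
    """Run the prediction function on the posted JSON body."""
    return jsonify(predict(request.get_json()))
```

Gunicorn then serves this module by importing the `application` object, which is why the file is conventionally named wsgi.py.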
When you launch your Jupyter notebook, in fact it's a container that is launched inside OpenShift Data Science, and you run Jupyter and the rest of the environment inside that container. But that means that you can, in fact, run any application inside the container. That's what we are doing here. Obviously, this won't be accessible from outside of the container itself, because we didn't expose port 5000 to the outside when the container image and the notebook were built. But we can still call it from anything that also runs inside the container. And this is what we will do here. I have another notebook, file 04, that does exactly that. You can see that in the first cell I call localhost on port 5000. This localhost, from the container's perspective, is the container itself, so that will work; it's obviously accessible from itself. Let's run this. We do this curl, and we have the answer from the API. Status OK, perfect. Now let's try our function by sending an image, sending some data. Here we will send the name of the image, and the way the prediction function is made, when you send only a car image name, it will look inside its own data set, so inside our environment. It will look inside this folder to find the right image, car 374 in this case, and then it returns the prediction. Cool, now we have our API working. Of course, that's a way to call it with curl, but we could also do it from some Python code. Let's do this. Okay. This is the same thing, with a POST request to our local prediction endpoint. So that means that now, with this Flask application, which consists of the wsgi.py and the prediction.py, plus a gunicorn config, we have everything that we need to create a REST API that will give a prediction for a given image. These two notebooks, four and five, basically allow a data scientist to do a mockup of the final application.
But it's running within their notebook, so nobody else can access it. And even if somehow you could, well, if you close the notebook, then nobody else could access it again. So that's kind of the last step before really packaging it all up and deploying it in the cluster, right? Exactly. Cool. And so now we are ready for the next step, which is building a real application. And here's how you do it. So here I'm back in my OpenShift Data Science environment. If you click on this, you will see that you also have access to the underlying OpenShift console. So that's what I will do. And you can see that by default in my sandbox you have access to two different projects that have been created, Dev and Stage, where you are able to deploy applications. Because never forget: it's an OpenShift environment, a full, true OpenShift environment that you can now use to deploy the application. So the easiest way is this: we can go to Add, and we will add the application directly from a Git repository. Because, again, if I go back here and look at my license plates workshop, it comes from our Git repository. And it already has everything that I need: the wsgi.py, the prediction.py, the requirements.txt that says which libraries to install inside the image. So I have everything needed. Here, I will import from Git. Import from Git; I will just copy... where is it? It's at the beginning, it's in this repo, so I will copy it and put it there. That's part of the built-in functions inside OpenShift to create an image from a source. In this case, our source is Git. And it has detected that it is Python code, and that it can build the application using Python 3.9. I will just change that, because this specific code, given the versions of the packages we are using, is more suited to Python 3.8.
So just to show you, you also have the ability to change the import strategy. You can build an image for many different languages; it can also be a Dockerfile. There are many different ways to automatically build your applications inside OpenShift. All the rest we will leave as it is: the name of the app and the resources it will create. It will create a deployment, and it will create a route to the application so that we are able to access it. And then... okay, the service already exists. Sorry, it comes from another deployment, so let me add it to Stage instead. I will do the same thing, just edit the import strategy. Sorry, I thought I had done some cleanup after testing. So here, what happens with this information is that OpenShift has created a build configuration. And a build configuration is what we have seen before, meaning: from this Git repository, starting from this Python 3.8 UBI8 image, take this Git source and build it as a Python application. The process works this way: if it detects that there is, for example, a WSGI file, it knows that it has to build an application and serve it as if it were a web app. There are some conventions: depending on how you build your application, it looks for wsgi.py, it looks for app.py; that's how it figures out how to build the application. And what it will do, if we look at the logs here, going back to the top: it will clone the repository. Then it knows it has to do the assembly, and the assembly is mostly building the image based on the requirements.txt. So it will fetch all the packages that are needed, install them automatically inside the image, do the commit, create the image, everything that is needed, and it will also create the deployment. So that's what we see here in our topology view: there is a deployment that has been created automatically for us that will deploy the image and scale it to one.
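Under the hood, the Import from Git flow generates an OpenShift BuildConfig roughly like the following sketch. This is an assumption-laden illustration: the resource name, the placeholder Git URL, and the exact builder image tag are made up here; only the overall source-strategy shape matches what the console creates.

```yaml
# Illustrative BuildConfig of the kind Import from Git generates.
# Names, the Git URL, and the builder tag are placeholders.
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: licence-plate-api
spec:
  source:
    type: Git
    git:
      uri: https://example.com/your-org/your-workshop-repo.git
  strategy:
    type: Source
    sourceStrategy:
      from:
        kind: ImageStreamTag
        name: python:3.8-ubi8   # the builder image chosen in the console
        namespace: openshift
  output:
    to:
      kind: ImageStreamTag
      name: licence-plate-api:latest
```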
So right now it's still building, but as soon as the image is ready, the deployment will start automatically. So that's an illustration of the end of the pipeline. Normally you have the data scientist doing the model, training everything, making sure you have the recipe. Then you have the software engineering part, packaging it as a real Python application like we have seen with prediction.py. And then there is this part, which is more related to operations, so it falls under the MLOps part: deploying this thing into production. And of course, you can totally equip all of those things with automation. Let's say every time you push a new version of the model to the Git repo, it triggers a rebuild of the image, the automated deployment, and so on. So it's a good mix between the DevOps approach and functionality that you already have in OpenShift, and the data science part: notebooks, machine learning frameworks, and all those kinds of applications. It fits perfectly together. I like the self-contained aspect of this, where basically everything that you do happens in the same single cluster. You don't need to push your image to Docker Hub and then, oh, I need to log into Docker Hub, and so on. I find this UI quite nice for people who are less familiar with containers, OpenShift, and Kubernetes; it's a very user-friendly way of getting started and getting those results. But I think your build must have completed, because it seems to be scaling up to one now. Yes, scheduling. It should be ready real soon. And as we said in the beginning, you touch a little bit of everything. If you have the skill set, and these things are not that difficult to pick up, it means you are totally autonomous in a single environment to work with data, to work with models, to create APIs and web apps around those models, and the rest.
It makes it very easy to use, without many things to figure out. With OpenShift Data Science, you have access from the start to all the tools you need to create your applications. And here, now we have the deployment. We can test it directly by opening the route, and you can see that we have the answer from the API: status OK. If it says OK, it must be OK then. If it says OK, it must be OK, but let's test it for real. And to do that, we have a last notebook, Send Image. Here I'm running it inside OpenShift Data Science, in the same environment where I've done all the previous parts, but you could run it from anywhere; as you will see, it's only basic Python code. What we will do is call our newly created API. The only two things that we need are the image that you want to send and the route of the service. Cool, I have a car image here, so I will just fill in the name: it's car.png. And the route is the same route that we have here, the route directly to the API. Let's fill it in and run the cell; we have defined our two variables. And you can see that this is basic code to read the image and encode it in Base64, because that's the way it works inside our prediction.py file: it expects to receive the Base64-encoded image. We just dump the data into some JSON, with an application/json header, and we do a POST request to our API, to the route that we have defined, slash predictions, passing the data and headers. Let's have a look at the result. Okay, yeah: because it's car.jpeg, definitely not PNG. Code 18 once again; this one was a real mistake. And then you can see that we have the answer, which is again a JSON-encoded prediction: VU69YDE. We have a fully working API that you could test from your computer. You don't need to go through this environment and the rest if you have some Python, or whatever other language you want to use.
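The client-side encoding described here (read the image, Base64-encode it, wrap it in JSON) can be sketched like this. The `image` field name and the service route are assumptions, since the transcript does not spell out the exact payload schema; only the Base64-in-JSON shape is taken from the description.

```python
import base64
import json

# Sketch of the client-side encoding described above. The "image" field
# name is an assumption; prediction.py simply expects a Base64-encoded
# image in the JSON body.

def build_payload(image_bytes):
    """Base64-encode raw image bytes and wrap them in a JSON body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded})

image_bytes = b"\x89PNG fake image bytes"  # stand-in for open("car.png", "rb").read()
payload = build_payload(image_bytes)
headers = {"Content-Type": "application/json"}

# Posting it is then a single call, e.g. with the requests library:
#   requests.post(service_route + "/predictions", data=payload, headers=headers)

# Round-trip check: decoding the payload recovers the original bytes.
decoded = base64.b64decode(json.loads(payload)["image"])
print(decoded == image_bytes)  # True
```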
Once you have deployed your API, you know how to query it; it becomes part of an intelligent application. And what's nice about this is that now that this service is up and running, Guillaume can go on vacation for two weeks and it doesn't matter. I don't need access to his notebook. If I have the URL where this application is running, I can just send images to it and it will send me back the detected license plates. So this is a solid deployment now. That's really good. And again, as we said earlier, if you want to see this in context, there is the full demo of the use case, and you can see that this API is linked to a Kafka stream, then to some data gathering, sending to object storage, and then creating dashboards and the rest. So you can see it in real use in a full pipeline. But it's one of the base components that we needed: an API that we could query in our pipeline, in our application, that would detect the license plate numbers. All right. Well, thanks a lot for that, Guillaume. I'm going to ask Kristin to share my screen if she can, and we'll do a quick wrap-up. We are like seven minutes over, so I apologize for that, but I didn't want to stop so close to the end. Two French people talking: it can last a long time. Yes, yes, that's us right there. All right. So thank you for your attention, and thank you for your great participation in the chat; I hope we've answered as many questions as possible. We'll stay connected maybe five more minutes in case there are more questions in the chat. But to wrap up, just a few points. Your OpenShift Data Science Sandbox account is good for 30 days, but it is renewable. So 29 days from now you'll get an email, and if you want to renew it for another 30 days, there's no issue at all. So feel free to go through all of these steps at your own pace. There's a lot in there, a lot of things we talked about in the video.
If you're not familiar with those things and they're new to you, you might want to hear them a second time. The recording will be made available; I believe you'll receive an email with a link to it in the next couple of days. Before you go, we have a survey; the link is this one. And our AI/ML models actually predict that nobody is going to answer the survey. So let's see if you agree, and let's see if we can prove the model wrong. I'm going to share the link in the chat, and let's see how many people actually answer the survey. Models are not perfect, so let's make this one's prediction a lie. This is the link to the SurveyMonkey. There are, I think, three questions, so if you can take two and a half more minutes to answer those, that would be great for us. It would prove to our bosses that we actually did run this workshop and people attended, and give them an idea of what you thought of the materials. So help us prove the model wrong, please. And that's about it for us. I hope you had fun, and I hope to see you again soon on another DevNation. Have a good rest of your day, wherever you happen to be. Thanks, everybody. Thanks, everyone.