Hi, I'm Sophie. I'm a data scientist at Red Hat, and I'm here today with my colleague Chris, who's an application developer. When we think about the machine learning workflow, we see it as having four stages. It starts with preparing the data, exploring it, and seeing what's there. Then we move on to developing the machine learning model. This encompasses everything from feature engineering to testing out different machine learning techniques and using a range of machine learning libraries. As a data scientist, this is where I spend most of my time and energy. The next stage is usually to deploy that model as part of a pipeline or a broader application. And once it's deployed, I need to monitor it and look at the model's performance over time.

Now, the preparation, development, and monitoring stages I'm pretty confident with as a data scientist. But deployment is often a pain point for data scientists. We're not application developers, so deploying our model as part of a larger application can be pretty tricky. And that's what we're going to focus on today: the collaboration between data scientists and application developers, and the process of going from a machine learning model, perhaps developed in a notebook, to a model which is deployed and running as part of a larger application.

Today we're going to step through how to do this with OpenShift, which is Red Hat's Kubernetes distribution. As a data scientist, I shouldn't have to become a container or Kubernetes expert in order to reap the benefits of those tools. That's why we're using Red Hat OpenShift Data Science, a new managed service from Red Hat targeted at data scientists. It lets me work in my usual way, and because I'm working on top of OpenShift, integrating my model output into an application becomes easier. I'm going to begin by working in notebooks, specifically in a template that Chris has set up for me in Git. This makes it easier to ensure that the work I produce can easily be lifted into an application. That lifting is going to be done with Source-to-Image (S2I): Chris is going to show us how we can use S2I to take the output that I produce and put it into an application.

So Chris, I hear you've got a particular problem to solve.

I have lots of problems to solve, but this one in particular is about my dogs. My dogs are digging up my yard, and I want to be able to find out when they're in there. So we need a service that can detect dogs, so I get an alert when they're out there digging up my yard. Could you help me with that, Sophie?

I think I could. So let's go ahead and get a project together.

Let's start off with an S2I project in Git that'll build and deploy automatically in OpenShift every time we push up a change. First, let's start with this template. Now that we've generated a new project, you can see some files that will interest you. Here we've got some notebooks for you to get started and experiment. Once you're done experimenting and you've got a good prediction function, you can drop it into this prediction.py file. Then you can put your libraries, like TensorFlow or PyTorch, in this requirements.txt file, and they'll get included in the service as well. Once you're done, you can save your files and push them up to Git, and it'll build and deploy automatically. All right, well, there you have it.
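To make that template concrete: the repo itself isn't shown here, but a prediction.py stub in this kind of S2I project might look roughly like the sketch below. The function name and return shape are illustrative assumptions, not the actual contents of Chris's template.

```python
# prediction.py -- illustrative stub only; the real template may differ.
# The S2I builder is assumed to import this module and expose predict() behind
# a small REST endpoint, so whatever this function returns becomes the
# service's JSON response.

def predict(args_dict):
    """Take a dict parsed from the request body and return a JSON-serializable dict.

    Replace this placeholder with the model code developed in the notebooks.
    """
    return {"prediction": "not implemented yet", "input_keys": list(args_dict.keys())}
```

Any libraries the real prediction code needs (TensorFlow, PyTorch, and so on) would be listed in requirements.txt so they get installed into the image at build time.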
Let me go ahead and send you the link to this Git repo, and you're off and running with it.

Thanks, Chris. I'll take that and head over to Red Hat OpenShift Data Science. Once I've logged in, I get taken to a dashboard that looks like this. What we see here is a card for each of the applications I've got enabled in my environment; I've just got a few things enabled here. We can go over to the Explore tab and look at all of the applications we're able to install into our OpenShift Data Science instance. For each of those applications, we've also got associated resources: things like documentation, quick starts to help you get going, and how-tos for small tasks, the kind of thing you end up looking up every time you do this work.

Let's go back to Enabled, and we can see one of those quick starts here. If we click Start, we see a "Creating a Jupyter notebook" quick start that steps you through your first experience with JupyterHub. I'm going to go ahead and click Launch and launch JupyterHub. This takes me to a server start page where I can set some options for my notebook server. I'm going to stick with the TensorFlow notebook image, because I'm going to be doing object detection and TensorFlow is arguably the right framework for that. For container size, I'll stick with medium, but if I were doing something that needed larger resources, I could select that here. I can also request a GPU, given that they're enabled in my environment.

I want to add some environment variables. Today I'm going to be accessing data that's in an S3 bucket on AWS, so I add my AWS access key and secret key here. By doing this, they're injected into my Red Hat OpenShift Data Science environment, so when I'm developing notebooks I can access them through environment variables. I'll go ahead and click "Start my server" and wait for the notebook server to spawn.

So here I am in my JupyterLab environment. I'm going to go ahead and clone that Git repo Chris sent us, using its GitHub path. I'll add in the URL, click Clone, and there we have that Git repo added to the files we can access. Now, I've already made a few changes to the files Chris sent over, so let's go ahead and look at those.

In this Explorer notebook, we step through some exploratory data science. We begin by importing the packages we need to do this object detection. We then use the AWS access key and secret key, which we set in the spawner, to download an image that I've got stored in an S3 bucket. So I've downloaded this image here, and we can have a look at it: this is Max and Margo. We're going to transform that image into a tensor so that TensorFlow models can process it. When we do that, we get something that looks like this: an array.

Now that we've got the image in a format TensorFlow can deal with, we can go ahead and test out a model to see if it can detect these dogs. There are lots of pre-trained object detection models on the web, and for the purposes of this demo we're just going to use one of those: SSD MobileNet version 2, trained on Google's Open Images dataset. We load in this model, then pass our image into it and see if it can detect anything in the image.
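The notebook steps Sophie walks through here might look roughly like the following sketch. The bucket name, object key, and TF Hub model URL are illustrative assumptions, and the credentials are assumed to be exposed under the standard boto3 environment variable names.

```python
import boto3
import tensorflow as tf
import tensorflow_hub as hub

# boto3 reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment,
# which is where the notebook spawner injected them.
s3 = boto3.client("s3")
s3.download_file("my-dog-images", "max-and-margo.jpg", "dogs.jpg")  # bucket and key are placeholders

# Turn the JPEG into a float32 tensor with a batch dimension, the shape the detector expects.
raw = tf.io.read_file("dogs.jpg")
image = tf.image.convert_image_dtype(tf.image.decode_jpeg(raw, channels=3), tf.float32)[tf.newaxis, ...]

# A pre-trained SSD MobileNet V2 detector from TF Hub, trained on Open Images.
detector = hub.load("https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1").signatures["default"]

# The result is a dict of tensors: detection_boxes, detection_class_entities,
# detection_scores, and so on.
result = detector(image)
print({key: value.shape for key, value in result.items()})
```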
And this is the output we get. Again, we've got an array here, a tensor. We have a set of classes, corresponding to the objects which have been detected, and we have the names of those classes in human-readable form: you can see we've got dog, dog, footwear. And then a range of detection scores, denoting how confident the model is in the detections it made. With a bit of standard code, we can recreate our image and plot the boxes from the object detection model on top of it. Here we can see the parts of the image being recognized as dog and the parts being recognized as footwear.

Now, this model is doing a pretty good job of detecting the dogs. It knows that there are two dogs there, and if we filter out all of the predictions made with less than 50% certainty, then the only things it recognizes are those dogs. So it's much more confident in those detections than in anything else it found.

Now that we've got a model that we know works well enough for this use case, I'm going to go ahead and do a couple of things Chris asked me to do. First up, I need to put all of the requirements for this model into the requirements.txt file. The three at the top, Chris added for me; the ones below are the requirements for my notebook and use case specifically: TensorFlow, Matplotlib, and NumPy, with their versions pinned. I'm also going to go ahead and create the prediction.py function. This is what Chris's application builder, the Source-to-Image builder we're going to use, is looking for; this is what it's going to use to make those predictions. The prediction.py function really just takes the code from the notebook we had earlier: it loads in the model, and then we've got a few functions to make predictions and clean up the output of those predictions. (A rough sketch of what that can look like appears at the end of this transcript.)

I can now head over to the predict notebook that Chris put together and check that the service is working as I expect. I can install all the requirements, load in an image, and make predictions by calling that predict function to test that it's working as expected. And indeed, when we make a prediction on the data, we get a prediction response: the bounding box that goes around the object that's been detected, an estimate of the class, and the probability with which that class has been predicted. With that, I'm going to go back over to the Git helper and push this back up to Git for Chris to use.

So Chris, I've pushed that back up to Git. Can you go ahead and use it now?

All right, that's fantastic. I'll pick it up where you left off. Let's go ahead and take a look at this project and see what we've got. You created your model, you updated your prediction function, and you added your dependencies to the requirements.txt file. That means I can go ahead and build and deploy a new service straight from Git. Let's do that now. Here is the current application, ready to consume the service which we're going to create right now. It's going to recognize this as a Python project, and we'll create a new application group for it called dog detector service. All right, you can tell it's building there, and that's going to take a minute. So let me go ahead and do one more thing: we want this to build and deploy every time you push up a change.
So let's go ahead and get that working. We're going to create a webhook right in Git so that every time you push, we'll build and deploy. And there we go. Now that that's done, let's see what my app looks like. Perfect, we have detected a dog.

Well, Sophie, there we have it. My app works. The only problem is that I actually have to push this button to detect my dogs, and that's not really what I was going for. So I thought of something: if I could leave the camera running, take intermittent images, and push them up to Kafka, I could detect the dogs as they move around in full motion. I think that's the answer.

So do you need me to do anything extra?

Well, let me show you. The nice thing about doing this as a custom app is that I can do whatever I want. I'm not locked into a certain thing, not a REST API, not a certain framework. In this case, I did away with the REST API code and made a quick and dirty Kafka consumer. It picks those images up off of one queue, does its prediction, and dumps the detected objects into another queue, and then I just read those in my app and display them. (There's a rough sketch of that loop at the end of this transcript.)

The first thing I did was make a Kafka queue. I didn't really want to manage it myself, so I just made one on Red Hat OpenShift Streams for Apache Kafka, the new managed service. I've got the dog detector instance here, and I need to get its credentials. Once I have those credentials, I can plug them into the environment variables for my notebook server; that way I'll be able to use them from inside my notebooks without committing that stuff to Git.

All right, so what have we got here? As you can tell, I can connect to this Kafka queue and produce some sample messages, and then in a whole different notebook, in a whole different kernel, I can consume those messages. That sounds great, but really what I want to do is take a look at the data, the images that are in the queue, and save them to disk so I can play with them and make sure everything works. So I've got these messages with the images in them, and I need to make sure the predictions work on them too. But really, you already did all the hard work here, Sophie: I took your prediction function, since it was in a single file, and just dropped it in here. Now I can go back and build this app the exact same way I built the last application, and it runs just like this. And there you have it: a working app. All right, Sophie, what do you think?

Awesome, thanks so much for putting this app together, Chris. Are you ready to join my dog detector startup and quit Red Hat?

Sounds like a plan. I'll let you tell our manager.

Okay, let's recap what we did today. So, Sophie, what did we do?

Well, to start out, you set me up with that Git repo so that I could create a model in a format that could be deployed using the source-to-image functionality.

Right. And we did a really simple workflow, because it's just the two of us and we wanted something very simple. All we used was Red Hat OpenShift Data Science, OpenShift, and Git. We could have done something far more complex: we could have used storage, pipelines, or deployment software. And we fully expect customers to do that, using partner software, open source software, and Red Hat managed software, and to come up with their own processes that use all of their favorite tools.
But for us, this worked great. Here's where you can go to find out more about what we've shown today.
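For reference, the prediction.py cleanup step Sophie describes (load the model once, run detection, keep only detections above the 50% threshold, and return box, class, and score) might look roughly like this. It reuses the same assumed TF Hub model as the notebook sketch above and is not the actual file from the demo.

```python
# prediction.py -- illustrative sketch of the deployed prediction function.
import tensorflow as tf
import tensorflow_hub as hub

# Load the detector once at import time so each request only pays for inference.
_detector = hub.load(
    "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"
).signatures["default"]


def predict(image_bytes, threshold=0.5):
    """Run detection on raw JPEG bytes and keep only confident detections."""
    image = tf.image.decode_jpeg(image_bytes, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)[tf.newaxis, ...]
    result = _detector(image)

    detections = []
    for box, name, score in zip(
        result["detection_boxes"].numpy(),
        result["detection_class_entities"].numpy(),
        result["detection_scores"].numpy(),
    ):
        if score >= threshold:
            detections.append(
                {
                    "box": box.tolist(),  # [ymin, xmin, ymax, xmax], normalized
                    "class": name.decode("utf-8"),
                    "score": float(score),
                }
            )
    return {"detections": detections}
```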
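And here is a rough sketch of the Kafka loop Chris describes: consume images from one topic, reuse the prediction function, and publish the detections to another topic. The topic names, environment variable names, and SASL settings are assumptions about how a managed Kafka instance from OpenShift Streams might be wired up, not the actual app code.

```python
import json
import os

from kafka import KafkaConsumer, KafkaProducer  # kafka-python

from prediction import predict

# Connection details come from environment variables, the same way the AWS keys did,
# so nothing sensitive gets committed to Git. SASL/PLAIN over TLS is one common way
# to authenticate against a managed Kafka instance.
common = dict(
    bootstrap_servers=os.environ["KAFKA_BOOTSTRAP_SERVERS"],
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username=os.environ["KAFKA_CLIENT_ID"],
    sasl_plain_password=os.environ["KAFKA_CLIENT_SECRET"],
)

consumer = KafkaConsumer("dog-images", **common)  # raw JPEG bytes in
producer = KafkaProducer(**common)                # JSON detections out

for message in consumer:
    detections = predict(message.value)           # reuse the single-file prediction code
    producer.send("dog-detections", json.dumps(detections).encode("utf-8"))
```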