 Hey team, good morning. Hey Chris, how's it going? Doing good, you? Good, thank you. Let me go ahead and share my screen so we can kick this off. Alright, good morning Dev Nation. I am Sophie. I'm a data scientist here at Red Hat. And I'm here today with my colleague Chris. Chris, you want to introduce yourself? Yeah, I'm Chris Chase. I'm an application developer. For the past couple of years I've been working with data scientists like Sophie to bring their work into intelligent applications. And that's exactly what we want to talk to you about today. So we want to talk about the machine learning workflow, what's involved in getting intelligence into applications. We'll talk about some pain points that we see when Chris and I try to work together to take the machine learning that I create or massage into shape and then try to put it into an application. And we're going to show you some tips and tricks for getting this stuff going. So with that, let's dive in. We've got a few slides and then we'll hit up demo. So when we think about the machine learning workflow, we see it as having a few stages, kind of like this. So we start on the left with gathering and preparing our data and developing that machine learning model. From there, you go on to deploy that model into a larger application. And then once it's been deployed, you've got to continue to monitor this model once it's in production. Now these stages here, so the gathering and preparing and the model monitoring are really well within my comfort zone. As a data scientist, they are what I like to do. But the bit that we really struggle with and that makes it difficult to get business value from machine learning is this step here, deploying that model into an application. So I think this is a pain point for many data scientists. We're not application developers and we don't produce work that looks like artifacts that you would typically see in your software development process. 
So today we want to focus on the collaboration between data scientists and application developers and talk about that process of going from a machine learning model developed in a notebook to a model that's running as part of a larger application. So before we dive into doing that, let's briefly talk about why this step, deploying a machine learning model, is tricky. Well, there's a few reasons really. Data scientists love to work on their own laptops because it means that you can manage your own upgrades. You can sort out which versions of libraries and packages you're using. But obviously you're going to be restricted by the speed of your laptop's CPU. Your laptop likely has less storage than you'd like it to, and so you can't work with all the data that you want to. And model training is often math heavy, and hardware acceleration, GPUs, TPUs, can really help with this. But most laptops don't have those integrated into them. So working just on your local laptop isn't ideal. And it certainly isn't ideal when it comes to integrating with other application components. Moving from code that's on my laptop to code that's running in the cloud is a much bigger lift than one would expect, and not something that data scientists are trained in. Another real reason why it's difficult to get data science into applications is because of where we like to work in terms of environment. I think a lot of data scientists like to work in Jupyter notebooks, which look like this. They're nice, right? They're this mix of code and prose. You can execute the cells and see the output immediately, which is great. This makes you think that they're really shareable, that they're good communication tools. But in practice, that's not where it ends. They're not as reproducible as you'd like.
And the reason for this is because you've got no idea how your colleague ran the cells in the notebook or what they did after they ran them. So perhaps they missed out a cell when they ran the notebook. If I now share this with Chris and he runs it correctly from start to finish, he's going to get different results to me. And if I hand Chris a notebook, actually they're not always as neat as the one up there. They're really just a brain dump of everything that I might have been working on. So Jupyter notebooks don't look like traditional software artifacts. We can't just slot them into an application in a pain-free manner. Instead, you've got to do some archaeology to figure out what the data scientist has done if you want to extract the parts of the code that you need for an application, in order to be able to go ahead and integrate that into your system. So if I hand Chris a notebook, the process of turning that into a machine learning model that runs in an application is a bit of a nightmare. And that's what we're going to focus on for the rest of this demo. So today we're going to step through a process of how to ease this pain with OpenShift. OpenShift is Red Hat's Kubernetes distribution. And as a data scientist, I shouldn't have to become a container expert or a Kubernetes expert or an OpenShift expert in order to reap the benefits of containers. And so that's why we're going to be using the Red Hat OpenShift Data Science managed service today. This is targeted at data scientists, and it means that I can work in my usual way. And all of the integration into part of an application becomes easier because I'm working on OpenShift. So that's where I'm going to work. And I think Chris is going to help me out by setting up a template in Git that makes it easier to ensure that the work that I produce in my OpenShift Data Science environment can be easily lifted into an application.
We're going to use Source-to-Image to lift it into an application. And Chris, I think you've actually got a specific application in mind. That's right, Sophie. I have this problem. Someone is digging up my flowerbed. And I'm 99% sure it's my dog, but it could be like a groundhog or something, but he's very sneaky. I have yet to catch him. So what I'd like is a little application that is looking at my flowerbed, and it's going to alert me whenever my dog is in there. But I need something to help me identify dogs in pictures. And that seems like a data science thing. Do you think you can help me with that? I think I can. Okay, let me see if we can go ahead and get started on a project together. All right, so I'm going to be working in my IDE. I'm going to be working with my Python app. I'm going to be writing a little Flask app. It's very tiny. You're going to be working in your Jupyter notebooks, but we're both going to be pushing up to that same Git project. And that Git project is going to go ahead and automatically build and deploy on OpenShift every time we push up a change. So earlier, you talked about some of the problems with notebooks, like cells being run out of order, and how there's a bunch of stuff in there. Usually there's experimentation, data cleaning, a bunch of graphs. Oh my gosh, so many graphs. But really all I want is just the prediction function and its dependencies, right? So I created this little template to help us get started. So here you'll see there's a couple of application files. Not a whole lot. It's like five lines of code, something like that. But I've also put a place for you to put in your notebooks. You don't have to use them. You can toss them. You can delete them. Go to town with your notebooks.
But what I really would like is for you to tease out that prediction code and put it in this one Python file, and test that from your notebook, and I'll be able to call it from my application file in the same way, so that we're disconnected there. And in addition, put your dependencies, with versions, into this requirements.txt. And I'll use those same dependencies. That way we won't be out of sync. I won't have problems loading your model because some subcomponent of some dependency I didn't know about is leaving me out of luck. So let's go ahead and create a project for us to work on. So we need to detect my dogs. So let me go ahead and get a project to you. And I will go ahead and send you this URL and you can go ahead and get started. Let me send you that URL. All right. All right, fantastic. So I am going to head over to Red Hat OpenShift Data Science before I start using your URL. So let me show you the environment. So this is all running on top of OpenShift, but I don't really need to know that as a data scientist. You can see here that I've got a set of enabled applications. Actually, all that IT has enabled for me is JupyterHub, because that's all we need today. But there's also a range of other things that you could install on the cluster. So let's go ahead and launch JupyterHub. JupyterHub is where I load my Jupyter notebook environment. And today we're going to be doing object detection, so let me just zoom in a bit here. We want to be able to detect dogs. So TensorFlow is the perfect library to do this. I'm going to keep my container size as medium because I think the work that I'm doing today is maybe quite computationally heavy, because we need to process images and so on. So I'm using TensorFlow. I've got a medium container and I can just click start my server. And this is going to go ahead and spin up a Jupyter notebook environment for me. Now, Chris imposed some structure for me in this Git project over here.
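The split Chris is asking for can be sketched roughly like this. This is a minimal, hypothetical prediction.py; the function name `predict`, its argument, and the return shape are all assumptions for illustration rather than the template's exact code:

```python
# prediction.py -- a sketch of the contract: the notebook work gets distilled
# into one function that both Sophie's notebook and Chris's app.py can call.
# The name `predict`, its argument, and the return shape are illustrative
# assumptions, not the real template's code.

def predict(image_bytes):
    """Run the object detector over raw image bytes and return detections."""
    # The real file would load the pre-trained TensorFlow model once at import
    # time and run inference here; this stub only shows the interface shape.
    detections = [
        {"class": "Dog", "score": 0.97, "box": [0.1, 0.2, 0.8, 0.9]},
    ]
    return {"detections": detections}
```

The point of the design is that app.py can simply do `from prediction import predict`, and pinning versions in requirements.txt means both sides resolve the same library versions.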
And in fact, luckily, you don't have to see me code this all up from scratch. I've already gone and made some changes. I'm going to clone this repo into my environment. So I'm going to copy this URL. And we can wait for my server to start up. So as a data scientist, I didn't have to set up my own Git environment. I think many data scientists are becoming more and more confident with Git, and it's being used as part of daily work for version control. But that's certainly something we weren't seeing three to five years ago. So this is taking its time. So in true Blue Peter style, here's one I made earlier. So once it launches my notebook environment for me, it takes me into JupyterLab, which is what we can see here. So I could go ahead and create a notebook and start coding. But what I'm actually going to do is clone my Git repo. I'm using the helper here to do it. And I've just pasted that URL that I copied. So I can click clone and fantastic. It's successfully cloned this for us. So Chris set up some structure with notebooks. I'm going to start in notebook one and just briefly show you what I've done. We're going to whiz through this, because if you want to find out more details, you can come to our workshop tomorrow, and perhaps somebody could drop the link for signing up for the workshop into chat. But essentially I'm using TensorFlow and accessing some images. So Chris thinks it's dogs digging up his garden. So I'm going to create an object detector that can tell us when we're looking at a dog. Now, here's two cute dogs. And we'll see that that's two dogs. And luckily object detection is a problem that's already been solved for us. People have already trained models to identify dogs. So I'm loading in a pre-trained model. It's been trained on 600 types of objects. This is open source, so you can go and take this model, use it and integrate it into your applications as you wish. All of this is TensorFlow.
So we're converting our image to a TensorFlow tensor. And we can pass it the image and make some predictions and ask it what it sees. So you probably can't tell in this tiny font here, but these pink boxes are dogs. I think we can agree that that is correct. These small yellow boxes are actually footwear; it's tagged the dog's paws as footwear, which I think is quite adorable. But we don't want Chris to be woken up in the middle of the night if someone's walking through his yard in boots. We only want him to be woken up if it sees a dog. So I'm going to do a bit more filtering on this model to make sure that it only returns detections that are dogs. And we are good to go. So we've got an image, we've got two dogs. Now, I know it works, but Chris doesn't want all of these images. He just wants the model. So Chris, you said there were two things you wanted me to do. The first was put my code into prediction.py. So I've already gone and done that here. So you can see we're loading in the model and we're asking it for predictions of dogs. Tomorrow in the workshop we'll talk about how you can change that, so you can decide whether it's a dog, whether it's cattle, whether it's a cat, whatever you want really. And then the other thing you asked me to do was identify the requirements for the model. So you set up this requirements.txt file. I know I've added these here, but these other ones were already in here. So can you tell me about those? Yeah, those are my application dependencies. So for me, I'm making a Flask app to make a REST service. And so those top ones are mine. So you just have to add your data science things underneath and we should be good to go. Fantastic. So I can tell you that I've actually tested out this Flask app from my notebook environment. Again, we'll do that tomorrow in the workshop. But for now, I'm going to push these changes back up to Git and hand it back over to you. You good? Okay. Yeah, definitely.
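The filtering step Sophie describes, keeping only the detections labeled as dogs above some confidence threshold, might look something like this. The tuple layout and the "Dog"/"Footwear" labels are assumptions based on the talk's description of the pre-trained model's output, not the workshop's actual code:

```python
def filter_dogs(detections, min_score=0.5):
    """Keep only detections labeled Dog whose confidence clears a threshold.

    `detections` is assumed to be a list of (label, score, box) tuples, the
    kind of flat structure you might build from the raw detector output.
    """
    return [d for d in detections if d[0] == "Dog" and d[1] >= min_score]

raw = [
    ("Dog", 0.92, (0.1, 0.2, 0.8, 0.9)),
    ("Footwear", 0.61, (0.4, 0.7, 0.5, 0.8)),  # the paws tagged as footwear
    ("Dog", 0.31, (0.0, 0.0, 0.1, 0.1)),       # too low-confidence to keep
]
print(filter_dogs(raw))  # only the first detection survives
```

Raising or lowering `min_score` is the knob for trading missed dogs against false alarms in the middle of the night.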
Let me go ahead and see what you've got going on here. All right. So let's see. You've got your notebooks in here and you have added a prediction function in here. That looks very nice. I'd love to be able to look at that. Can you put that font size up for me, Chris? Sure. A little better? A little better. All right. Fantastic. And then you've got your requirements in here, locked down to nice versions, which we can depend on, which is very nice. So I will go ahead and take this and create an app out of it. Now, when I handed it off to you, it was already able to be built and deployed. And so you just added those files and we should still end up with that working. So here we go. I am going to go ahead. This is the app as it is now, which is going to call into that REST service. And here we go. So let's go ahead and add your REST service from Git. We'll go ahead and get the... All right. Create it from Git. And here you can tell it is already recognized as a Python application, because that's what it is. And then... Let's go ahead and create a new application group for it. I feel like we might not need a dog detector. I know. He's easily detectable right now, isn't he? He's all over it. And so we'll go ahead and create a new service for deployment. We'll create a route so we can test it. And we'll create... All right. We'll not worry about this here. And so here it is. All right. I'll create a new one. And once that goes, we'll have to go ahead and create a webhook for this application. So if you go look into this app, let me go ahead and get that for you. So while that app is building, we will go ahead and get the Git webhook for GitHub, and go ahead and paste it and create it, so that every time you push up your changes, like if you want to convert this from a dog detector into a cat detector, it'll pick up those changes and create a new deployment for that. So we'll go ahead and add that. Uh-oh. Of course this is... All right.
And so we'll go ahead and put that URL in and set it to JSON. And so every time you go ahead and push it up, we'll get a new application. And so we'll go ahead and add that webhook right now. All right. So once that thing is done building and deploying, it'll go ahead and create a new application. And so here it goes. We will... So let's see if that works here. All right. So here you've got me and let's see. We'll go ahead and create that. There we go. And so your REST service was able to identify this picture of my dog. All right. So that's pretty good. And your stuff worked great. Calling into that REST service worked out just fine. However, I'm not going to stand out in my yard and sit there taking pictures... That doesn't make any sense. I need to have something out there sending me these pictures so I can detect it. So what I think we need to do is take these images and, instead of doing a REST service, go ahead and push them onto a Kafka queue. And then a consumer is going to take a look at those images, run a prediction on each image and dump those predictions onto a new topic, where I can listen for that notification to see if there are any dogs in my picture. Okay. So I'm going to use your... So go ahead. So you're changing the application. I think I need to change the application. Now the good news is I don't think you have to do anything, because the application code is going to change, but your code is not, right? So let's see what we can do about that. Okay. So first thing, I don't really want to manage my own Kafka cluster. So let's go ahead and use Red Hat's offering, Red Hat OpenShift Streams for Apache Kafka, and go ahead and create a new queue with a couple of topics. Here you'll see we'll have a topic for the images going on the queue and then we'll create an objects topic for all those predictions that we're going to read from.
And I also created another topic just for some notebook tests so I can actually play with it from inside my notebook. And here we go. We'll grab my connection information here and then we'll use it from inside a notebook. And first of all, let's go ahead and create a new Kafka consumer project for us to work on so that we can build this later. And then we'll go ahead and get started in Jupyter, take a look. Now that connection information, I'll go ahead and enter in as environment variables for my JupyterHub notebook. I'll add my security protocol, the bootstrap server, user and password and that stuff. And I'll go ahead and save those in a secret for use in the notebook. So I'll go ahead and start that server. Now as that boots up, we'll go ahead and take a look. So like I said, you can actually play with these queues from inside the notebooks just fine. And we'll go ahead and start consuming. You can see I've listened to that notebook test topic. It's listening. And we can also go ahead and produce some messages back, to make sure we're getting those messages on my consumer. Yep. And I'm getting them. So that's pretty nice. So we know we can connect to it from Python. So we'll take that work and, instead of a REST service, let's do a dead simple consumer and producer. So we'll go ahead and stick that into an app.py. So instead of a REST service, we're going to go ahead and create a very simple producer and consumer. It's going to pick up those images from the images topic, then it's going to do the prediction on them with your prediction function, and go ahead and send out new messages with those predictions in them. And then I'll read those from the app and hopefully we'll get our app going. All right. And so you can tell your prediction function hasn't changed at all. And for mine, your requirements haven't changed at all, but I have put in kafka-python and the dependencies for my application.
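Chris's consumer-and-producer rewrite of app.py can be sketched as below. This is a hedged sketch, not the demo's actual code: it assumes the kafka-python client, the topic names images and objects, and environment variable names of my own invention. Only `handle_image_message` runs without a broker; `main()` would be the container's entry point.

```python
import json
import os

def handle_image_message(image_bytes, predict):
    """The pure core of the loop: run the prediction function on one incoming
    image message and build the payload for the outgoing objects topic."""
    detections = predict(image_bytes)
    return json.dumps({"detections": detections}).encode("utf-8")

def main():
    # kafka-python client; connection details come from the secret that gets
    # injected as environment variables (these variable names are assumptions).
    from kafka import KafkaConsumer, KafkaProducer
    from prediction import predict  # the unchanged function from the template

    conn = dict(
        bootstrap_servers=os.environ["KAFKA_BOOTSTRAP_SERVER"],
        security_protocol="SASL_SSL",
        sasl_mechanism="PLAIN",
        sasl_plain_username=os.environ["KAFKA_USERNAME"],
        sasl_plain_password=os.environ["KAFKA_PASSWORD"],
    )
    consumer = KafkaConsumer("images", **conn)
    producer = KafkaProducer(**conn)
    for message in consumer:
        producer.send("objects", handle_image_message(message.value, predict))
```

Because the Kafka plumbing only ever calls `predict`, swapping the REST service for this consumer leaves Sophie's prediction code untouched, which is exactly the point Chris makes next.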
So not a whole lot of changes, nothing for you. But for me, I've changed over the app completely without having to change that prediction code at all. So let's go ahead and create a new application from this project. So let's go ahead and get this. All right. So we've got a new Python application. We'll create a new application group. And we don't need a route to this since this is a consumer. We'll go ahead and create that app. I am having problems with my webhooks today. So let me go ahead and delete these secrets. And let's go ahead and give this another try. All right. Let's give this one more try. And there we go. All right. So now this consumer is going ahead and building. So one thing that's missing from this is our connection information to Kafka. So let's go ahead and add that. We've got a secret here with all our connection information. You can tell it's got my bootstrap server, so let's go ahead and add that secret to the workload. And we'll put it in there as environment variables. All right. So now when that deployment pops up, it'll be able to connect to Kafka and we'll have no issues. And so once that is done building and deploying, it should hopefully look a little something like this, my working topology. You've got the app. You've got that dog detector service that we created earlier and we've got a consumer. And let's go ahead and give this a try. So instead of taking a picture, let's go ahead and put it into video mode. And let's see if this works. So with that little video going here. All right. Does he got it? All right. Oh, there it is. All right. This dog is getting detected. He's not working real well with my phone though. I think, okay. Well, maybe we need a little better dog. Hey, maybe we can get a volunteer from the studio audience to help out. Franklin, that's your cue, man. Come on. You can do it. This is your fault. You've been digging up my yard. All right, little buddy. How is this working?
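Attaching the connection secret to the workload, which Chris does through the console, corresponds to something like this fragment of the consumer's Deployment. The secret name and container name here are assumptions for illustration:

```yaml
# Fragment of the consumer Deployment: inject every key of the Kafka
# connection Secret into the container as environment variables.
spec:
  template:
    spec:
      containers:
        - name: kafka-consumer
          envFrom:
            - secretRef:
                name: kafka-connection   # bootstrap server, user, password, protocol
```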
Are you recognizable? Yeah. There you go. There you go. That's you. You're the guy who's digging up my yard. And so I think if we go ahead and set up this camera in my yard, we'll be able to tell if this little monster is digging up my flowers. All right. Back to you, Sophie. So there we go. So we changed the whole app, and you wrote your prediction function once, but we changed it from a REST service to a Kafka consumer, and all that worked without too much of a problem. Right. And just because you set up the environment for me first, it made it so that we could automatically integrate into that application. So if you want to find out more about Red Hat OpenShift Data Science, then there's a blog post there. We can drop that link in chat, and tomorrow at, I think it's 11 Eastern, we are running a workshop where you can come and create your own dog detector, find out more about the data science that goes into that and understand more about that integration into an application. So with that, I think we want to say thank you from me. Thank you from Chris. And most importantly, thank you from Franklin for spending time with us this morning.