Hello and welcome to another OpenShift Commons briefing. Today we're with Chris Chase and Sophie Watson, and they're going to talk to us about integrating data science and application development. I've had Sophie on a number of times before, so you may recognize her face, and Chris Chase is new to the OpenShift Commons briefing series, so we're really happy to have them both here. I'm going to let them both introduce themselves. There's live Q&A at the end, so if you have questions, type them into the chat and we'll relay them to the guest speakers. So without further ado, Sophie, Chris, take it away.

Thanks so much, Diane. Thanks for having us. Hi, everybody. I'm Sophie. I'm a data scientist here at Red Hat, and over the past few years I've been thinking about how we can make data scientists' lives easier: how we can make their work and their workflows more efficient, and how we can get more of the work they produce actually integrated into applications. I'm here today with my colleague Chris. Chris, you want to say hello? Hey, I'm Chris Chase. I'm an application developer, and for the past couple of years I've been working with data scientists and incorporating their output into some sort of intelligent application.

Awesome. So today we're going to start with a few slides just to level-set on the stages involved in developing an application that contains data science or machine learning components. From there, hopefully we'll have enough information to talk about some of the pain points we see when we're working together, or when we talk to teams of data scientists and application developers at our customers. And then finally we'll get to the fun bit: we're going to do a demo today. We'll be developing an application that incorporates data science, and we'll be developing it live. We'll show you a workflow we've been using that leans on Red Hat OpenShift Data Science, which is a new managed service from Red Hat. So hopefully that sounds OK, and I will kick it off.

OK, so when we think about what it takes to get machine learning into an application, it helps to think about it in stages. It begins with gathering and preparing your data, exploring that data and making sure it's accessible. Your data might be all over the place: you've got to gather it, you've got to federate it, and you might be dealing with streams. Later, Chris is going to integrate some Kafka into the demo. So that's step one: get hold of your data. Step two is developing the machine learning model. Now, that sentence, developing a machine learning model, sounds simple, but there's actually a lot of work going on behind the scenes. You've got to explore your data, and you've got to carry out feature engineering techniques to extract the information from that data that you think is going to answer the question at hand. You might have a load of data that's nonsense, and you want to get rid of that, so you've got to determine what's nonsense and what isn't. Once you've done that feature engineering to extract the important features from your data, you can go ahead and think about training or working with a machine learning model.
In general, we think of training a machine learning model as allowing that model to learn some underlying parameters based on the data you show it. In the example later, we're going to use a pre-trained model. There are lots of pre-trained models on the internet, open source, just waiting for you to download and use in your application, and we'll see what that process looks like later. Once I've got a model as the data scientist, we just deploy it in an application. Simple, right? In the olden days, I would throw this model over the wall to Chris, who would then try to reproduce what I'd done and create an application containing this machine learning component.

And once that application is up and running, we need to continuously monitor and manage it. When you deploy a traditional application without any intelligence or machine learning components, there are a few ways things can break, but typically when they do, you get an error message or your system stops working. Machine learning models are often a little different: they'll continue to spit out predictions, it's just that the predictions may become wrong more often. This can be due to model drift or data drift, where the data you trained the model on is no longer representative of the data you're seeing in the real world. We wouldn't want to train a stock market model for predicting tomorrow's stocks on data from the 1800s, because it was just a completely different world back then.

This pipeline is usually carried out by a range of people. As a data scientist, I'm most comfortable with these bits in red: gathering and preparing data, and then carrying out feature engineering and developing the model. I'm also happy to monitor the model, detect data drift, and update my model if I need to so that changes can be pushed out to the system. But the part of the pipeline I find least comfortable is this one here, deploying that model in an application, and that's where Chris comes in as an application developer to help us get that working.

So these applications need to be built by cross-functional teams in order to make successful deployments, and it's complicated for a few reasons, not just the communication that needs to happen between two different personas. One tricky thing is that data scientists love to work on their own laptops, but that isn't ideal for a few reasons: you can't get access to sensitive data, you're restricted by the speed of your laptop's CPU, your laptop likely has less storage than you'd like it to, and if you do have hardware accelerators on your laptop, they're probably going to do a better job of accelerating movie-clip editing than accelerating machine learning training. So it's not ideal to work on your own laptop as a data scientist, but a lot do. The solution? Well, let's just push data scientists to the cloud, fantastic. They can get access to the hardware they need, they can use more storage, they can access that sensitive data behind company firewalls. Sadly, it's not quite so easy. Data scientists in general have a different range of skills from application developers and software engineers, so the transition to cloud computing is really not simple. And as a data scientist, I don't know much about infrastructure, I don't really care about infrastructure, I just want to access it.
So I think, in general, data scientists are looking for higher-level tools than the ones usually used in standard software engineering practice. Another reason why incorporating data science into applications isn't straightforward is the tool set that data scientists like to use. The majority of data scientists we talk to in our daily work like to work in Jupyter notebooks. If you haven't seen a Jupyter notebook before, they look something like this: a mixture of prose and code, where you can execute the cells of code and the output is printed inline in your notebook. So they're a nice communication tool, you can document what you're doing and see what's happening, and this is where I go ahead and develop my machine learning techniques. But in practice they're not as reproducible as you'd like them to be. You actually have no idea how your colleague ran the cells in the notebook. If I pass this over to Chris and say, hey, I've changed my model, it's in the notebook, he doesn't know whether I ran the cells in order or maybe skipped a cell. He's never going to be able to reproduce the results I got. Chris, I hear you've got a few other reasons why you're not a huge fan of Jupyter notebooks.

Oh my gosh, yes. Usually when I get a notebook from a data scientist that I'm trying to convert into an intelligent app, there's a bunch of dependency installations with no version locking, and I'm trying to tease out the important dependencies. If one version of just a subcomponent that isn't locked down is off, it's a total beast to figure out which one is going to work. I'll ask the data scientist, and that one doesn't work, or it conflicts with something else. The no-order thing is a big one too: it's not exactly conducive to nice modular code that I can tease out. In addition, the exploration, the training, and the actual predictions are usually all mixed into that same notebook, and I'm just trying to tease out maybe the one prediction function I need. So all of this is really me trying to tease out one function and the proper dependencies, and it's just never written that way, Sophie. I'm sorry. Not you, Sophie, somebody else. I'm never going to get another job ever again. No one will hire me.

So these are some of the reasons why the work data scientists do usually isn't conducive to just lifting that code and sticking it straight into an application. What we're going to show you today is an example of how we can do a better job at that. We're going to be using Red Hat OpenShift Data Science, the new managed service from Red Hat. It's targeted at data scientists and anybody involved in the development and management of intelligent applications, so Chris as an app dev is a perfect target user here too. I'm going to develop a model in Jupyter notebooks like normal, and then, with a little bit of structure, we can integrate that into an application in an easier way. So Chris, I hear you've got an application in mind for today.

I do. My dog, at least I think it's my dog, is digging up my flower bed. I'm not a hundred percent sure because I've never caught him, never. He is a very sneaky dog. So I would like to make an app that's going to alert me when a dog goes into my flower bed. I've got the app kind of worked out, but I need something that's going to detect dogs in images. Can you do that? Yeah, sure.
There are tons of machine learning models online, ready for use, that can detect specific objects and are trained on datasets that include dogs. So we can definitely take one of those, download it, explore it, test it out on some data, and make sure it works. Should I just go ahead and start doing that?

Let's slow down a little bit. Like you said, we could use a little bit of structure. I mean, there are a ton of solutions out there that are really, really advanced, and they're awesome, we love them for big AI teams. We could be taking your model, building it with Kubeflow or Tekton Pipelines, putting it on S3, downloading it from S3 with another pipeline, and maybe deploying it with Seldon or KFServing. But I don't want to do that, because it's just you and me, Sophie. I don't have a whole IT group supporting me. I am just an application developer, I am very simple-minded. So if I can bring you into my simple-minded workflow as just a regular OpenShift application developer with, like you said, a little bit of structure, I think we can still get our work done without adding a whole bunch of stuff.

So what we have is just Red Hat OpenShift Data Science, your Jupyter notebooks in particular, and I'm going to be working in my own environment. We already have OpenShift, so using a normal, very popular OpenShift developer pipeline, the S2I on-cluster build from Git, we can still get our work done. I'll be working in my IDE, you'll be working in your notebooks, and we'll push up to the same Git repo. OpenShift will get those notifications, rebuild every time you push up a new model, and redeploy it for me, and I'll just continue using that application.

So let's talk about that structure a little. We have a little template here, which is the most basic buildable Python thing ever, and I added a little bit to it. It's just a normal Python application project that we'll build and deploy with S2I from Git. It's got an application file, this wsgi.py, and that is really it. There's a little bit of configuration, but you're only going to have to worry about a couple of other files. First of all, I've put in some sample notebooks; you can ignore them if you want, they're just there so you can start getting used to using notebooks. Then, if you could tease out that prediction function as a standalone and put it in a Python file, that would help a whole lot. And if you make sure you lock down those dependencies and separate them into a requirements file, that's great, we'll be way ahead of the curve. That way I won't have to tease out everything you're doing and come back asking what works and what doesn't. So you really just have to worry about the prediction file and the requirements file.

And so let's go ahead and create a new project for you to work in. We're going to make this dog-detector-service. I cannot type today. Let me go ahead and create it. All right, so now I'm going to get this repo link and send it to you, and then we'll be ready to roll. All right, over to you. Let's see what you've got.

Thanks, Chris. So let me share. That was not the screen we wanted to share; let me go ahead and share the correct one. Fantastic, hopefully you can see a web browser. Let me know if you can't, Chris. So we are in Red Hat OpenShift Data Science now.
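For readers following along at home, the template repo Chris describes might be laid out roughly like this. The file names (wsgi.py, the prediction file, the requirements file, sample notebooks) come straight from his description; the exact directory structure is an assumption.

```
dog-detector-service/
├── wsgi.py             # Flask entry point that the S2I build serves
├── prediction.py       # the standalone prediction function Sophie owns
├── requirements.txt    # locked dependencies for both roles
└── notebooks/          # sample notebooks to get started with
```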
This is where I'm going to work today; this is my environment for the day. This is the launch page, what you see when you first log in. As you can see, we've only got one application enabled here today, and that's JupyterHub. That's fine, because that's all I'm going to be working with today. But if you want to see the other applications you could install into this environment, you can go ahead and click on Explore. What we've got here is a range of self-managed, partner-managed, and Red Hat-managed offerings that integrate with Red Hat OpenShift Data Science. You can use Anaconda Commercial Edition to manage your packages, IBM Watson Studio if you're looking for an AI/ML suite, OpenShift Streams for Apache Kafka. Chris, we're going to use that later, right? As long as everything goes to plan, no promises. You can also install Seldon Deploy, which is software for managing your machine learning models once you put them into production. So that final stage I talked about, where we monitor and manage the models, Seldon has lots of tools to help with that, as well as with the deployment stage, going from code to an application. Another thing that's integrated into Red Hat OpenShift Data Science is a range of resources. For each of these sets of technologies and tools you can access the standard technical documentation, and you can also get tutorials and Quick Starts that look just like OpenShift Quick Starts, if you're familiar with those. They're embedded in the system and step you through getting started with things.

So let's head back to Enabled, and I'm going to launch JupyterHub. Okay, when I click Launch, I get taken to this page here: start a notebook server. You can see I've got a few different options, so let's talk through them now. Starting at the top with the notebook image: a notebook image is just a container image that contains the Python libraries we might want to use in our notebook. You can see I've got six notebook images to choose from; you can perhaps tell from their names what they are, and we can click to get a bit more information. You can see here that this TensorFlow notebook image contains TensorFlow version 2.4.1, as well as all of these other packages and some more standard packages too. Now, the nice thing about these notebook images is that they've been vetted by Red Hat. We've checked all these library versions, well, I say we, I had nothing to do with it. My fantastic team have checked the library versions here, checked that everything works well together, and made them available. So I don't have to worry about setting up and managing my own libraries, dependencies, and versions; everything's going to play nicely together. We're going to use TensorFlow: it's commonly used for image processing, and today we're going to be dealing with images of dogs, so I guess that makes sense.

Deployment size: here I can select the size of the container I want to spin up. You can see I've got a range of options here; these have been set for me by IT based on what we have in this cluster, and I'm going to use a medium container size. Image processing is often really computationally expensive, but the most computationally expensive part is training the model, and because we're going to be using a pre-trained model today, we don't have to worry too much about that. So this medium container should be fine.
I can also add some environment variables here. We're going to connect to data in S3, so I've added my AWS access key and secret access key, and it's actually remembered those from the last time I was here. So I'm not going to spawn this server; instead, like a good cooking show, I'll just step over to one I've already got running. And here we are.

Okay, so this is what you see once your notebook server launches. We could go ahead and make a new Jupyter notebook, but Chris wanted me to start with the set of files in the format he laid out in that Git repo. So I'm going to clone that Git repo, and actually, like all good cooking shows, I've already done quite a lot of the work and developed this app. So let me just clone my fork of Chris's repo and show you what we've been doing in terms of data science. I can use the Git UI here and clone the repository, no command line, which I like as a data scientist. Failed to clone, that is not a good sign. Third time's the charm. I don't know why it works now when it didn't work the first time. You never know, it's thinking about it. Fantastic. Okay, so Chris was looking a bit worried there. I'm a data scientist who knows how to clone Git repos, so we're able to kick off. You can see that it's added that dog-detector-service here into my environment; that's everything that was in the Git repo. We can go ahead and have a look. It's got all these files Chris added for me, ready to go, and I've started working in this explore notebook. So I'll step you through what I've done, so you can see what data scientists do and how we might approach this problem.

TensorFlow was already installed in that container image, so I just have to import it, not install it. I'm also importing some other libraries we commonly need for object detection. I want to connect to a photo that's in S3; I've got this picture of some dogs in an S3 bucket. You can see here how I can use the Boto3 library and my environment variables, which we set in the spawner, to access my data in S3. The nice thing about this is that when this is pushed to Git, the whole world doesn't get access to my AWS keys, right? They're secret, they're just referenced as environment variables. So I'm going to go ahead and download this photo, and here we are, we can have a little look at it. These are two dogs: the one on the bottom right is Margo and the one on the top left is Max. They're brother and sister, and they live in Bristol, England. And I think we can all confirm, everybody can agree, there are two dogs in this photo. So let's see if we can get a model to detect them.

Well, in order to get a machine learning model to detect them, we can't just pass it an image: machine learning models need things in a specific format. We're using TensorFlow today, so we're going to have to transform it into a tensor. We take this image and transform it into a tensor, which looks something like this. It means nothing to me, but the computer is happy. We're also going to load in the model we want to use. Like I said, there are loads of pre-trained object detection models, and there's no point in training our own for this purpose. The model we're going to use today is the SSD MobileNet V2 model. You can download it yourself from TensorFlow Hub, have a go, play around with it. It's trained on an open dataset from Google, and it can recognize 600 types of objects. So it's not just dogs; it's trained on 600 things, one of which is dogs.
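A sketch of the notebook steps Sophie just walked through, assuming Boto3 for the S3 download (it picks up the AWS keys from the spawner's environment variables) and TensorFlow Hub for the model. The bucket and key names are made up, and the Hub handle is the Open Images SSD MobileNet V2 module that matches her description (600 object classes); her notebook's exact code isn't shown on screen.

```python
import boto3
import tensorflow as tf
import tensorflow_hub as hub

# Boto3 reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the
# environment, so the keys never appear in the notebook or in Git.
s3 = boto3.client("s3")
s3.download_file("my-bucket", "dogs.jpg", "dogs.jpg")  # hypothetical bucket/key

# Decode the photo and add a batch dimension; this detector expects a
# float32 tensor of shape [1, height, width, 3] with values in [0, 1].
image = tf.image.decode_jpeg(tf.io.read_file("dogs.jpg"), channels=3)
image = tf.image.convert_image_dtype(image, tf.float32)
tensor = tf.expand_dims(image, axis=0)

# Pre-trained SSD MobileNet V2 detector trained on Google's Open Images
# dataset, which covers 600 object classes.
detector = hub.load(
    "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"
).signatures["default"]

result = detector(tensor)  # dict of detection boxes, scores, and classes
```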
So we pass that tensor into our model and ask it to predict what's in the image, and it spits out a result that looks like this output here. What we've got is a hash of the detection class name, so hopefully this hash is for dog; we'll see in a moment. We've got detection scores: when the model identifies something, it's able to say how confident it is in that prediction. You can see it had over 80% confidence in its first prediction, just shy of 80 in the second, and then there's a real drop-off down to 0.2. It's got all of these objects it's detected, and we can see the ones near the bottom are detected with negligible confidence, so the model's not so sure about those. We've got labels, again corresponding to the object that's predicted: you can see the first two are the same, 446, 446, and after that everything looks kind of like 434. We've got detection boxes: object detection models are able to identify where in the image these objects are, so for each detected object it gives you four coordinates defining a box that encloses the object. And finally, we've got some human-readable detection class entities, and you can see the first two predictions were dog.

So this is what the model spits back at us, but in this form it isn't much use. What I'd like to do is embed this information in the image so we can see it, and that's what we do here: we plot the bounding boxes whose coordinates we were given above. And look at this. Okay, you'll have to take my word for it that this says dog and this says dog, so we've successfully identified Max and Margo. But we've also got some extra things going on here: you can see the dogs' paws have been identified as footwear, which I think is kind of adorable, but we want this dog detector to only detect dogs, so that Chris doesn't get woken up in the middle of the night when someone walks across his lawn with boots on. We want to filter out everything that isn't a dog. Similarly, if Chris has a camera set up and a bird flies past, we don't want to tell Chris we've found something; we only want to say when we've found a dog. So we do a little bit of work here to only accept things if the label is dog and the model is more than 30% confident that it's a dog. Hopefully that won't wake you up in the middle of the night, Chris. And you can see we've no longer got any footwear identified in this image, it's just two dogs.

Well, at this point I'd usually stop and say, Chris, I've solved your problem, but this obviously isn't integrated into an application yet. You said there were two files I need to update in order to make your life easier. Can you remind me what they were? That would be the prediction.py, and that would be the requirements.txt. Okay, so I've already pulled my prediction code into the prediction.py. This is everything from the notebook that you need to make a prediction: we load the model, and then we've just got some simple code here, pulled from the notebook, and you can see we're restricting to only things that are dog, with probability of more than 0.3. So this is going to get that information and return it. Requirements.txt: in here I've pulled in all of the requirements of my machine learning prediction function. You can see this Flask, Gunicorn, and Six. Chris, I think you put those in there automatically for me.
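The filtering step Sophie describes, keeping only detections labeled as dog with confidence above 0.3, could look something like this sketch. The result keys match the Open Images module's outputs; the function name and return format are assumptions, since prediction.py isn't shown line by line.

```python
def filter_dogs(result, threshold=0.3):
    # The detector returns parallel arrays; class entities are byte
    # strings like b"Dog" for this module.
    entities = result["detection_class_entities"].numpy()
    scores = result["detection_scores"].numpy()
    boxes = result["detection_boxes"].numpy()  # [ymin, xmin, ymax, xmax]
    return [
        {"label": "Dog", "score": float(scores[i]), "box": boxes[i].tolist()}
        for i in range(len(scores))
        if entities[i] == b"Dog" and scores[i] >= threshold
    ]
```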
Yep, those are for the Flask app we're going to serve this application with. You just need to add your data science requirements underneath. And the only data science requirements I have are TensorFlow, Matplotlib, and NumPy. There are a few things in the notebook that we actually don't need: for example, I showed you we used the Boto3 library, and we also used OS; we don't need those in our prediction.py. So with that, I think my part is done, Chris. If I push this back up to Git, are you able to turn it into an application? I certainly hope so. Otherwise, whoever's watching this is going to be severely disappointed. Let me see if I can share my screen here.

Okay, so let's see what you've got. All right, so here's the dog-detector-service app. You have pushed up changes, I see them, including your prediction file, perfect, and yep, there are your requirements. And with those two things, see, I kind of cheated: it already built once before I even gave it to you, so hopefully it's going to be super easy to build now. Let's go ahead and give that a try. Here's the project I already have my application in; I'm going to put your service in that same project space. So let's add this application. This is actually going to be super familiar to any developer who's used to working on OpenShift: we go to the project and add an application from Git. Once I get that repo in there, it already knows it's a Python application. I'm going to give it a name, the dog-detector-service, change that name a little bit, and that should build, Sophie. Let's give it a try. Okay, there it is. It's still building, which is fine, but one other thing we want, Sophie, is for my service to get updated every time you update your model. Because maybe we need a lower threshold, maybe we need like a 0.2 threshold for dogs, maybe 0.3 is too high, or whatever you want to change. So let's make that happen. Here it is: let's take this GitHub webhook, and we're going to add it to our dog-detector-service repo on GitHub. Okay, go ahead and add it. Great, that's it. So every time you push up a new model, I should get that change, and I'll be calling into the updated service and getting your new results. And that should be done building now.

Yeah, so if I update my prediction.py so that it can identify dogs and cats, so I notify you if it's a cat that's digging up your lawn, I can just push that change up, and I don't have to call you up and say, update the app? Exactly. Every time you push up that model, I'm going to be calling into your updated service. Oh, let's see if our application is working now. Okay, here it is, let's give this a try. I don't have a dog with me right now; I have fake dogs on my phone, give me a second here. That's it, that's the culprit, pretty good. Let's see if I can fake it out a little with this much worse dog. Nope. Nope, this dog is not, oh, there it goes. I got it at 27%, not a great dog. I don't know if we might want to move that threshold around, whether we want it to show up or not, but I like it. So your stuff works great, it's perfect, like really, really good. Although I hate to say it, my app kinda sucks. Yeah, that's a problem.
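For reference, the requirements.txt split and the Flask wiring described in this exchange might look like the sketch below. The package list comes from the talk; the version pins, route name, and predict signature are illustrative assumptions.

```
# requirements.txt -- serving deps on top (pre-filled by Chris),
# data science deps underneath (added by Sophie); pins are illustrative
flask==2.0.1
gunicorn==20.1.0
six==1.16.0
tensorflow==2.4.1
matplotlib==3.4.2
numpy==1.19.5
```

```python
# wsgi.py -- a minimal sketch of how the S2I-built Flask app could
# expose the standalone prediction function as a REST endpoint.
from flask import Flask, jsonify, request

from prediction import predict  # Sophie's file, untouched by the app dev

application = Flask(__name__)  # "application" is the S2I/gunicorn convention

@application.route("/predictions", methods=["POST"])
def predictions():
    # Hypothetical contract: raw image bytes in, detection dicts out.
    return jsonify(predict(request.get_data()))
```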
See, I'm going to have to push this button to take pictures and call into your app. That's not exactly the monitoring solution I was looking for, so I think we need to fix our app. Right now we have this built so that my application just calls into a REST API whenever I trigger it. What I think we want to do is stream these images up, do predictions on all of them, and then have me get an alert when one has a dog in it. So for that, I think we want to use Kafka: we can push those images up, run the prediction on each image, and push the prediction onto a new topic that I'll listen to. And when my objects topic gets a notification about a dog, I can get alerted and go catch my dog in the act. So I'm going to have to change the whole app, Sophie. Okay, that sounds bad. What do you need me to do? Nothing, absolutely nothing. Because you separated your stuff into a prediction file and a requirements file, I get to mess with the app all I want. I'll go ahead and make a new app, we'll call it the Kafka consumer, and we'll change it up so we can use it, and we'll see where that gets us.

So first thing: I think we need a Kafka queue. I don't really want to manage it myself, because I'm an application developer, not so much an infrastructure guy, so I'm going to let Red Hat manage it for me. I went ahead and created a Kafka instance on Red Hat OpenShift Streams for Apache Kafka, and this way I don't really have to mess with it. I created a few topics here: an images topic for those images, an objects topic for those predictions, and one extra topic just so I could test things out from the notebooks, so we could connect to it and try this out. So let's give this a try: I'll grab my connection information and get started in a notebook so we can see if this works.

All right, you'll see here I'm going to use the same notebook you did, the TensorFlow notebook, medium size, and I'm going to add in my environment variables. I don't need AWS; what I need is that Kafka information. So let me get started; you can see that. Let me go ahead and start the server. I am not so well-prepared, I am not a cooking show unfortunately, so I'm just spawning this for you, sorry. It won't take long though, and in the meantime you can see that we're prepared to connect to that Kafka queue. And there we go, now it's started. So let's see if we can connect to that Kafka queue from my Jupyter notebook here.

All right, first I need to connect to it and just read anything that's being pushed up. You can see here I'm going to start this whole thing: I've got the connection information here from those environment variables I injected, I get to use those, and that's my topic there, all right? And here's just a real dead simple consumer written in Python. It looks like it's listening. Let's test that out too; I don't know if it's really listening. Let's put some messages on there to make sure my consumer is working and that we can produce messages too. All right, here we go. Here again, the same thing: we're installing our dependencies, using that connection information, and we're producing, oh, there it goes, it's producing messages. Are we consuming them? All right, yes, we're good.
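The dead simple consumer and producer cells Chris runs might look roughly like this, assuming the kafka-python library and SASL/SSL credentials from OpenShift Streams injected as environment variables (the variable and topic names here are made up; his exact cells aren't shown):

```python
import os

from kafka import KafkaConsumer, KafkaProducer

# Connection details injected by the notebook spawner; the exact
# variable names in the demo aren't shown, so these are hypothetical.
kafka_config = dict(
    bootstrap_servers=os.environ["KAFKA_BOOTSTRAP_SERVER"],
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username=os.environ["KAFKA_USERNAME"],
    sasl_plain_password=os.environ["KAFKA_PASSWORD"],
)

# Producer cell: put a test message on the scratch topic.
producer = KafkaProducer(**kafka_config)
producer.send("test", b"hello from the notebook")
producer.flush()

# Consumer cell: read from the beginning and print whatever arrives.
consumer = KafkaConsumer("test", auto_offset_reset="earliest", **kafka_config)
for message in consumer:
    print(message.value)
```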
We did not fail live, perfect. So let's stop this; we don't need to listen to that anymore, it works. We have a simple Python consumer and a simple Python producer. Now, what we want, instead of a REST service, is to consume messages from one topic and put them on another topic after we call your prediction function. And as you can see, I haven't changed your prediction function at all. I have changed my dependencies, kafka-python instead of Flask, but your dependencies are still the same. And right now we can go ahead and build this app. So I totally changed the app, and you haven't done anything, which is kind of nice.

So let's go back to OpenShift. All right, I've got the service up here; let's create a new app from my Kafka consumer. As you can tell, it's still just a Python app, so I can do whatever I want. So here we go: let's create a new application for the dog detector Kafka consumer, and we'll make that. Now, one difference this time is that I'm going to have to add in my environment variables, just like I did for my Jupyter notebook, so really it's the same thing. Oops, we're going to put that in there. Now, I don't want to show you my username and password, so I already have this working, because I'm cheating, and we're going to go straight to it. So this is what it would look like: here's the consumer that's been built and is ready to go. Let's see if this works.

All right, let's go to the app. So the video's running here, let's give this a try. There it goes. Dog, dog. All right, let's try my fake dog. This dog sucks, but we'll try it. Oh, there it is, it caught it. Not the best dog. No, Sophie, those aren't very good dogs. Let me see. Right here, right here, right here. There we go, come here. See, you are the little culprit that's been digging up my flower bed. Are you a dog? Come on. There we go, dog. Dog. Franklin. Oh my gosh, it works. That is the culprit. I know we're going to catch him in the act now, Sophie. I'm going to catch him red-handed, I promise. That's so cute, Franklin, so adorable.

Okay, so thanks everyone for sticking with us today. What you've seen are some of the pain points around integrating machine learning into application development, and then a workflow where Chris did a little bit of pre-work to set up a couple of files for me in the Git repo, which enabled my work to be more efficiently and effectively incorporated into this application. We saw how I just had to do two things: copy my code into that prediction.py and put my requirements in that requirements.txt. It wasn't too tedious; I'd be willing to do it again. And with that, Chris was able to create two applications: the first was the one where he had to press the button on the camera, and for the second he used Red Hat OpenShift Streams for Apache Kafka to pass video images over a Kafka queue. And because of the way the data science had been structured, there was no need for me to do any work for Chris to change that architecture up. So I think with that we'll say thank you for spending some time with us today, and we're around if there are any questions. Thank you for that.
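Before the Q&A, here is a sketch of how the final Kafka consumer service Chris describes could tie together: consume an image from the images topic, run the unchanged prediction function, and publish any dog detections to the objects topic that the alerting app listens on. The connection variables are the same hypothetical ones as in the notebook sketch above; the demo's actual code isn't shown in full.

```python
import json
import os

from kafka import KafkaConsumer, KafkaProducer

from prediction import predict  # the same untouched data science code

kafka_config = dict(
    bootstrap_servers=os.environ["KAFKA_BOOTSTRAP_SERVER"],
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username=os.environ["KAFKA_USERNAME"],
    sasl_plain_password=os.environ["KAFKA_PASSWORD"],
)

consumer = KafkaConsumer("images", **kafka_config)
producer = KafkaProducer(**kafka_config)

# Consume raw image frames, run the detector, and publish detections;
# because prediction.py already filters to dogs above the threshold,
# the objects topic only ever hears about dogs.
for message in consumer:
    detections = predict(message.value)
    if detections:
        producer.send("objects", json.dumps(detections).encode("utf-8"))
```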
And now I know why your Twitter handle should be ChaseRedHat, because Franklin is who you should use as your Twitter icon, Chris. I think you've got a new show star there in this demo, wonderful demo. And it made me realize that even I could probably create a data science application now, so this has been really helpful for me. Is there a landing page or something you can share with us on how to get started with data science and the OpenShift offering? Yeah, of course. If I drop that into chat, will that get shared with everybody? Or you can share it on your screen now. Okay, that's advanced, let me see what I can do. There you go.

Now, this was really amazing, because I'm stuck in Jupyter notebook land, and handing off Jupyter notebooks to application developers is never successful, or never a good path to production, shall I say. But it is an amazing way to develop models and to work with them and understand them better. So I think that's really the nice juxtaposition of these two conversations you're having, Chris and Sophie, and I really appreciate that. There was one question early on that I thought was actually pretty good: where do most data science application development projects fail? And I think what I said to the person asking was, I'm pretty sure we're going to cover that in this talk. For me, it's always the handoff between the data scientist and the application developer, and making that service available as easily as possible, so that Chris doesn't have to track down Sophie to fix a model that isn't broken but maybe needs a higher threshold, because as we saw, the stuffed dog often got recognized. Tweaking those little bits, this workflow that you've come up with between the two of you, or the three of you with Franklin, works very nicely. And I think you did answer that question, but I wanted to put it to both of you: is that where you see the breakdown happening for data scientists working with application developers?

That's certainly where I see it, both for me and for a lot of the data scientists we talk to in our daily work. Creating a machine learning model, developing a data science technique, training a model, that's what data scientists are trained at. That's what we do, that's what we enjoy doing. But the application development part is just a completely different world. A lot of data scientists have mathematical backgrounds rather than computer science backgrounds, and they've just got a completely different skill set. So I think there are kind of two things to do. One is to bring structure and discipline to the data science workflow: we've got a lot we can learn from software engineers, and there are a lot of similarities between the things we do and the way we work, so we can steal some of their tools and techniques. Even up until about five years ago, data scientists didn't use Git at all; now maybe 20% are happy using Git, a few use it reluctantly, and the rest, still not so much. But just by using those tools and integrating ourselves into the application development lifecycle, we can show how to get these machine learning models into an application. Chris, what do you think? I mean, I think that's true.
I think we're trying to allow each group to focus on their specialty so that we each do a good job, and to allow some sort of abstraction so we can iterate independently and still not break everything. This was a really dead simple way of doing that, but there are, like I said, way more advanced ways; we could maybe use S3 to trigger builds and do a bunch of other stuff. So whatever works for you, but just that little bit of structure, just a Python file and a requirements file, is huge later down the road. So yeah, that helps a lot, I think.

Awesome. Well, I do just want to thank you both for your time. Awesome demo, you rocked it. I actually now think I'm going to go and try to do the demo myself, and I encourage everybody else to do it too. Do you have a repo with some of this stuff tucked away somewhere that you can share with us? Yeah, it's not even tucked away, it's right on the front page. Let me get that for you right now. So this one, here it is; let's go to our Red Hat BU org. For the dog detector setup, you will have to install Strimzi first, and then you just run the setup, and it should be running in a single namespace: the Kafka queue and the three parts of the application. The application is kind of a toy, it doesn't scale, and I'm sure there are lots of bugs, but it'll work for now, and I'll put that in chat too. So you can go ahead and set this up; it's going to stand up those three parts, and it's in our org.

Awesome. Well, thanks again, guys. I'm not seeing any other questions, but I'm sure there are going to be a few people besides me who are excited about this and will be trying it as well, to find the dog that's digging up our garden beds, or whatever that animal is. The trick will be to do variations on this. So thanks very much, everybody, and we'll have you back again soon. Thanks a lot.