Hello and welcome everybody to another OpenShift Commons briefing. Today we're going to be talking with Audrey Resnick, who is a data scientist here at Red Hat, and we're going to talk about managed services and data science. Her topic today is: what's the deal with managed services and model delivery? So it's going to be a bit of a technical overview. If you have questions, ask them in the chat, and we'll relay them to the speaker at the end of this session. So Audrey, take it away. Introduce yourself. Let's hear what you have to say today. All right, well, good day, folks. My name is Audrey, and it's really a pleasure to be able to speak to you today about managed services and model delivery. This is going to be a gentle introduction to what managed services are and how they can be used to gracefully deploy your model into a hybrid cloud. So the items that I'm going to talk about today are: what exactly are managed services, who cares about them, where do I find managed services, and what do managed services have to do with model delivery specifically? We'll go through a use case. And then there's the question that is really important: are these managed services easy to use? Because whenever somebody tells me about something new, that's usually the first thing I ask. Well, it sounds good, but is it easy to use? If we take a look at what generic IT managed services are, that will set us up for the discussion and give your brain a chance to get into the idea of services. Managed services in generic IT are the practice of outsourcing the responsibility for maintaining, and anticipating the need for, a range of processes and functions, all in order to improve IT operations and really to cut expenses. So we're going to look at everyday examples of managed services that you would find in a normal IT organization.
And this will give us a really good baseline in terms of not only what these services are, but it will get you thinking about what managed services could be available for data science. So the first thing that we have, and I think everybody's familiar with it, is the helpdesk. Then we go on to equipment installation, and along with that, hardware maintenance. With equipment installation, there are also moving services that could be placed into that category. Firewall and security: we need to be able to keep our organizations safe and secure. We don't want people breaking in, and we do that in part by keeping up to date on antivirus patches and various updates. Systems monitoring is a big part of the services, because we want to see how our systems are performing. Are they performing well enough with the number of users that we have? If we add more users, are the systems being overwhelmed? And speaking of being overwhelmed, what about disaster recovery? What happens if our main facility or our shop is wiped out? Do we actually have the capability to bring those services for our customers back online elsewhere? And what about managed backups? That's how we get to disaster recovery: we should be keeping backups of our vital information. That's just a small sample of managed services, and there are many, many more. And today in data science, we have the added complexity of the cloud as part of the platform that we work on, and there are a number of services that come with that. The security, the data repos, the servers, the communications, the sharing services: everything that we have is just a little bit more complex. So now that we understand what generic IT managed services are, let's look at what I call managed services for data science, the services that would make sense in data science, and how these services would help us deploy an AI/ML model into production.
Whatever managed services we create, we have to allow the data scientists to focus on building their models. While building their solutions, data scientists, and I'm one of them, want to experiment with the latest bells and whistles. I don't want to deal with upgrades. I don't want to deal with supported versions. I don't want to have anything to do with compatibility issues. I just want to focus on my solution. However, this does not mean that a data scientist is able to walk right up to IT DevOps, hand them the laptop where they've been creating an AI/ML model in isolation, and say, OK, I'm good, I need this model to go into production tomorrow. I've seen people do that. We need that collaboration with IT DevOps, and we need services and methods that data scientists can use to work with DevOps to put their models easily into production and monitor performance. So in data science, there are a few steps that data scientists are interested in when thinking about creating an AI/ML model to solve a particular problem, and I feel that these steps would make good managed services. So that's what we're going to go through and look at. This all starts with data acquisition, where we're extracting and transforming the data. We can integrate streaming data using OpenShift Streams for Apache Kafka, reaching out across the hybrid cloud to pull in data for analysis from multiple platforms and data services. And we can use open source services, such as Starburst Galaxy, to help us curate our data, as one example. The next thing you want to do is be able to run experiments and create the models. We can provide a notebook environment for model experiments, and for the customers that would like access to curated data science packages, we have Anaconda Commercial Edition integrated. Those that are looking to take advantage of things like AutoML can make use of things like IBM Watson Studio.
Once you've coded these experiments and determined that they fit, and it looks like your model is good and primed, you want to be able to access hardware accelerators to speed up the time to value. We partnered with NVIDIA to provide GPU capability. And what we also want to do is deploy these models as services. Once you have your models developed, you can use the source-to-image templates that we have, or use OpenShift Pipelines, to deploy and create an endpoint for testing. You can also use Seldon Deploy for model serving. Then we want to look at monitoring the models and tracking performance. We can continue to use things like Seldon Deploy, or Watson Machine Learning and Watson OpenScale, for model monitoring and performance tracking, to know when you need to retrain your model and redeploy it. And when you look at this overall picture or path that I've set up here, keep in mind that for any IT ops or DevOps folks looking at this, this flexibility can really be a nightmare, because they want a reliable, stable, reproducible environment for their customers, which we hope we can provide here. So now that we've defined this set of managed services, let's look at who besides data scientists would care about these services. It's not only the data scientists, but also the data engineers and IT ops that care about these services. And alongside these managed services, there are other things that fall into place nicely. You want to have an AI/ML model operational lifecycle, and that's what I outlined in the previous slide. The data scientists want an environment of services in which they can not only do development work using the latest open source bells and whistles, which is awesome, but also deploy their apps into production.
And this environment should contain that exploration of data and the monitoring of deployed models and applications. The other thing that you want to care about is that production-ready platform. And this platform has to be something that IT ops feels really good about, because the managed services, as I mentioned, can be a nightmare for IT ops, as they want something that's reliable, stable, and reproducible for their customers. The other item which falls into place, and is really important to think about, is the flexibility to use any open source services that look interesting to you when you're actually creating a solution. Being open source, just to remind you, means that it's essentially free to use, and that usually, with an open source service or item, there's a large network of users and developers who contribute toward updates and new features and offer support for new users. And lastly, there's deployability and portability: the ability to deploy and move your application off of the platform that you initially developed on, so you're not tied to a particular vendor. And I personally feel that to be very innovative these days, you need to be able to try a wide variety of technologies and services. That means trying out a large number of vendors so that you can create the best product that you can for your customers. Now, with all of these things said, there actually is a middle ground where we can make everybody happy. At least I feel that there is. So let's see if we can create a data science managed services platform that satisfies this middle ground and all the items that we've talked about. We're going to start with infrastructure: the hybrid cloud platform. And we'll go for hybrid cloud so that we have things on-prem inside our own network.
Maybe we'll use something like Amazon Web Services as our public cloud portion. The platform should offer a very consistent experience across on-premises, in the public cloud, as well as at the edge locations. And all of that has to be efficiently managed by IT operations. We need to look at compute acceleration. This hybrid cloud platform should have integrations with hardware accelerators, such as GPUs, to help speed up our machine learning model development and inferencing tasks. This brings us to self-managed services. We could have all these services, but what would be really fantastic is to have all of them supported on a self-service hybrid multi-cloud platform. That's the platform that would really empower anybody, such as a data scientist, data engineer, or software developer, to be agile and collaborative through the whole process, without depending too much on IT operations for individual tasks. We don't want to fill out many tickets to say, I need access to this, or I need this type of service. You should be able to go in and self-manage that: pick and choose what you need in order to get your job done. So here's the conceptual architecture for these AI/ML model services, and we'll go through a typical project lifecycle. We have data engineers that are working on gathering and preparing the data to make sure it's ready for the data scientists to develop their machine learning or AI models. A managed service that we could choose to use here is Starburst. Starburst is a fully managed service where you can access your data using Trino, a premier SQL engine, so you have fast access and the flexibility to manage your data. The next thing that we have is the business of developing a machine learning model. An example of a managed service here is something like JupyterHub, which allows you to create Jupyter notebooks for experimentation.
Now, I say JupyterHub and Jupyter notebooks for experimentation because you don't want to end up deploying a Jupyter notebook into production. Please don't do that. I've seen people that have tried to do that. It's not a good idea. However, for experimentation, when you're first getting started and taking a look at what your data looks like, how it pertains to the algorithm that you're developing, and how your algorithm can solve the problem that you've been tasked with, a Jupyter notebook or something like it is fine. And within this area, when we're developing the model, we also need to be able to add any packages or libraries that we're working with. So for Python, we may need to use pandas or NumPy, or we might choose something else, such as TensorFlow, depending on the sort of problem we're working on. But we want the data scientists, at the end of the day, to really be able to experiment with those packages. So again, whether it's TensorFlow, PyTorch, scikit-learn, or any others, the whole idea is to have these tools or services available so that data scientists can do the experimentation. Next, we have to actually deploy models in an application. So again, this is part of the model lifecycle. We want to take our model, deploy it, and start some inferencing, making predictions based on that data, and see if the problem that you're trying to solve is going to be solved by what you're experimenting on right now. So there are self-managed services, such as Seldon Deploy, which help us build a pipeline and actually deploy our model. The work does not stop there when the model is deployed. I know some people that say, OK, I'm done. You have to continuously monitor and manage any of the AI/ML models that you create in production. Make sure that they're making the right predictions.
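To make that experimentation step a little more concrete, here's a minimal sketch of the kind of cell you might run in one of those notebooks. Everything in it is illustrative: the data is synthetic and the model is a plain least-squares line fit using only NumPy, standing in for whatever framework (TensorFlow, PyTorch, scikit-learn) your real problem calls for.

```python
import numpy as np

# Hypothetical experiment cell: generate toy data and fit a simple
# linear model to sanity-check the notebook workflow before reaching
# for heavier frameworks like TensorFlow or PyTorch.
rng = np.random.default_rng(seed=42)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, size=x.shape)  # noisy line

# Least-squares fit of a degree-1 polynomial (slope and intercept)
slope, intercept = np.polyfit(x, y, deg=1)
print(f"fitted slope={slope:.2f}, intercept={intercept:.2f}")
```

The point is less the model than the loop: try something cheap, look at the result, then swap in the real library once the shape of the problem is clear.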
Make sure that drift isn't happening. And you're not going to be doing that by staring at a monitor and looking at your model performance through some simple little script that you've written. You want to have some suite of services that will give you alerts and tell you when the model is drifting, so that you can continuously monitor and manage your models in production to make sure that they're making those right predictions. And of course, when you do find something that is drifting, or something that is not quite right, you need the ability to retrain those models as needed. So keeping in mind that that's our ideal set of managed services and the platform that would go along with it, let's take an actual machine learning use case that one of my own colleagues is working on, and see if this data science managed services and model delivery platform that we've come up with would actually work for it. There is a project being undertaken by one of my colleagues, Guillaume Moutier, for Metro London that has to do with license plate detection. That all has to do with looking at the cars, grabbing the license plate, and being able to monitor traffic movement, car registration, and any sorts of licensing fees, again through license plate detection. The machine learning model has to have the ability to detect the license plate on a vehicle. If the vehicle is angled, the license plate needs to be righted, and the characters are gathered through some ML algorithms that have been developed. Data can then be stored, read, or analyzed through Kafka in this instance. For folks that don't know, Kafka is open source software that basically provides a framework so that you can store, read, and analyze any of your streaming data.
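The alerting idea above can be sketched in a few lines. This is a deliberately simple, hypothetical drift check, a mean-shift test against a training-time baseline. Real services like Seldon Deploy or Watson OpenScale compute far richer statistics, but the principle of alerting automatically instead of eyeballing a dashboard is the same.

```python
import statistics

def drift_score(reference, current):
    """How far the mean of a recent window of values (predictions or an
    input feature) has shifted from a reference window, measured in
    reference standard deviations. Purely illustrative."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    cur_mean = statistics.mean(current)
    return abs(cur_mean - ref_mean) / ref_std if ref_std else float("inf")

def should_alert(reference, current, threshold=3.0):
    # Fire an alert when the live window drifts more than `threshold`
    # reference standard deviations from the baseline.
    return drift_score(reference, current) > threshold
```

A monitoring service would run something like this continuously over sliding windows and page you when `should_alert` flips to true, which is the cue to retrain and redeploy.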
So for instance, if we're looking at some of that streaming data and we found a license plate where something was notably important about that car, we could throw an amber alert. Finally, we have to actually store that data, whether we use an object store or go back to a vehicle registration database. Storing that data then gives us the ability to do further analysis. Can we look at that data and do some analysis on traffic, movement, congestion, parking, et cetera? So now that we've gone over this example, let's take a real managed services platform and, as a data scientist, build out this AI/ML detection service for license plate detection. The architecture that we put together for managed services, and this is my confession time, already exists: the services that a data scientist could use are available as the Red Hat OpenShift Data Science platform. And I'm going to use this platform, which we call Red Hat OpenShift Data Science, to show you how you can use these managed services that we've discussed to deploy an ML model. This all comes together when a user first starts using the Red Hat OpenShift managed services platform: everything is in one central location, a shared UI, so that the user can discover and access a variety of open source solutions. Each managed service, whether it's a Red Hat or a partner service component, is integrated along with a series of quick starts and tutorials, so that users can not only work with their self-managed services, they can also self-teach and understand those managed services better as they get started working with any of the components or services. And once users have enabled components, so in this example, in the far screen capture in the back, you see that I've enabled JupyterHub.
So that's a component that's going to be available for my use, again along with all the quick starts and tutorials that will always continue to be available for people to look at. But let's specifically go back to this JupyterHub managed service and launch it. And what I'm going to do is not walk through a live demo, because we know how those go sometimes, but a canned slide demo. So we'll be clicking through things on slides and seeing how this all comes together. Again, we're assuming that our license plate data set has already been curated. Therefore, we begin by using JupyterHub so that we can experiment with the data. And just a note here: just because we use a JupyterHub managed service at this point in time, it doesn't mean that we can't integrate with any other services that are out there, or go back. We certainly can. You have the freedom and the ability to manage and use as many services as you like. Realistically, when you're developing something, there may be other parts of the system that you haven't thought about, and these managed services can fill in gaps that you may not have built out in your workflow. So we go ahead and launch JupyterHub, and we're going to create a Jupyter notebook image, which means that we're going to be packaging up a Jupyter notebook into a container image that you can deploy to OpenShift. And you're going to be able to customize a number of things here. You're going to be able to customize the notebook image type. Are you working with a problem that requires you to use PyTorch? Or does the problem that you're working on require you to use TensorFlow? Do you just want to use a standard data science image to do some exploration? In this case, we're going to go ahead and... actually, let me back up a bit and talk about containers, just in case folks don't know what a container is.
A container, you can think of as a single entity or unit that combines your entire runtime environment: your application, any of the dependencies or libraries, such as the Python libraries that you may be using, other binaries, and any of the configuration files needed to run your application. It's all bundled into one package. And by containerizing the application platform and its dependencies, the differences in your operating system distributions and your underlying infrastructure are abstracted away. And that's really good, because it means you now have something very portable that you could use on-prem, and that you could use in the public cloud, say, I don't know, like AWS. So again, containerization just provides that clean separation of concerns, so that developers can focus on their application logic and dependencies, and the IT teams focus on the deployment and management of that container without bothering about application details such as specific software versions or configurations. In this case, I'm looking again at these standard notebook images that I discussed, and here I'm going to pick a base image that contains the majority of the packages and libraries that would be needed for the license plate detection. So I'm choosing a TensorFlow notebook image. We can specify a deployment or container size that we feel we would need for our machine learning model. We're going to choose a large container size, with limits of 14 CPUs and 60 gigabytes of memory requested. You have the ability to add one or more GPUs based on the type of data analysis that you're doing and, of course, on the ML code that you're working on. In this rendition, we won't use GPUs, but remember, we can always go back and recreate our notebook image with different options if we so choose. And then users also have the ability to add any environment variables that they would need for their project.
So this is an example of adding an AWS S3 access key ID environment variable to access an S3 bucket, that is, to access your data in AWS. Once we finish adding the secret access key environment variable and its value, we click the start button to spawn our new Jupyter notebook image. That can take a bit to spin up. In the meantime, to see what is happening, you can always click on the event log to get a better idea of what parts of your image are being rolled out and where you are in the image build process. So now you're in your JupyterLab environment. And as you can see, it's a web-based environment, but everything that you do here is in fact happening on the Red Hat OpenShift Data Science cluster that's sitting on AWS. This means that without having to install and maintain anything on your computer, and without consuming a lot of local resources like CPU and RAM, you can still conduct your data science work in this stable, managed environment. So let's go ahead and populate JupyterLab with our current license plate Git repo. What we'll do is go up into the main menu, choose Git, and choose Clone a Repository, then enter the name of the repository and press the Clone button to clone the license plate workshop repository. Note that you could be asked for your Git credentials, so you'd enter your credentials and then press OK to continue. What you'll then see is the actual license plate workshop repo files appear in the file pane on the left-hand side of the window. We can then open up any of the notebooks, or we could create notebooks. In this case, I'm showing just an example of a notebook that we use to recognize and extract the license plate numbers from car pictures. And we installed some libraries a little earlier on in this Jupyter notebook that weren't part of the container image.
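Inside the notebook, the code can then pick those credentials up from the environment rather than hard-coding them. Here's a small illustrative sketch: the variable names follow the usual AWS convention, and the default bucket name is made up for this example.

```python
import os

def s3_settings(env=os.environ):
    """Read S3 connection settings from environment variables, the same
    ones we set when spawning the notebook image. Illustrative only:
    the `S3_BUCKET` name and its default are hypothetical."""
    return {
        "access_key_id": env.get("AWS_ACCESS_KEY_ID", ""),
        "secret_access_key": env.get("AWS_SECRET_ACCESS_KEY", ""),
        "bucket": env.get("S3_BUCKET", "license-plate-data"),
    }
```

You would typically hand these values to an S3 client library such as boto3 to list and download the training images; keeping them in environment variables means the notebook image itself never contains secrets.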
That's also something important to realize: not every image will be totally perfect for everybody. There may be additional items that you need to install, and that's very easy to do. We'll go ahead and experiment with our model, and at the end make sure that we can detect a license plate number. Along the way, we'll package the model that we end up creating as an API. Earlier on, before we got to this point, we learned how to create the code that would be able to extract the number from a given license plate. But of course, you can't use a notebook like this in a production environment. I do know people that have tried to use Jupyter notebooks in production. It's not a good idea. So we're going to package this code as an API that you can directly query from another application, and we do this by creating a Flask application. A few explanations, though. The code that we wrote for this particular problem that Guillaume was working on, all those Jupyter notebooks that you saw previously, ends up being repackaged as a single Python file that we call prediction.py. Basically, it's just the code that was in all the cells of the notebook, put together within a single file. And to use that code as a function that you can call, you just add a function, say predict, that takes a string as an input, which would be the name of a picture, does the recognition, and sends back the result. You could open the file directly in JupyterLab to see for yourself; you would be able to recognize the previous code with the new function added. Then what we would do is launch our server. In this case, we're just launching it locally, and we can then test our Flask application and see if it's working. And it looks like our status returned OK.
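As a rough illustration of what that Flask wrapper might look like, here's a minimal sketch. The real recognition logic lives in prediction.py; the `predict` stand-in below, the route paths, and the fake plate value are all hypothetical, chosen only to show the API shape of a status endpoint plus a prediction endpoint.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(image_name):
    """Stand-in for the real recognition code in prediction.py, which
    would run detection and OCR on the named picture. Here it just
    returns a fixed fake plate so the API shape is clear."""
    return {"image": image_name, "plate": "XX00XXX"}

@app.route("/status", methods=["GET"])
def status():
    # Simple health check, the kind of endpoint you hit after deployment
    return jsonify({"status": "ok"})

@app.route("/predictions", methods=["POST"])
def predictions():
    # The caller posts JSON naming the image to analyze
    payload = request.get_json(force=True)
    return jsonify(predict(payload.get("image", "")))
```

Running this locally with `app.run()` gives you the same two endpoints you'd later expose through the OpenShift route, which is what makes the local test and the deployed test interchangeable.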
So now that we've verified that the application is working, we're ready to package it as a container image and have it run directly on OpenShift as a service. When you do that, you're able to call that service from any other application. So we'll build that application inside of OpenShift, which means we'll go to our main OpenShift Dedicated platform. Within OpenShift, we want to make sure that we have a project namespace set up for us to work in. I just called it user one project because my brain was dead, but we have that project namespace now set up. We go ahead and import our license plate code from the Git repository to be built and deployed, and we select a number of options to create a deployment for this model. Most importantly, we want to create a route, a URL through which we'll be able to access our application. This automated build process takes a few minutes, and then OpenShift will deploy the application. In this case, we ended up with that route that I was talking about; again, this will be the URL that we'll use to send images to. So we're going to go ahead and test. We want to test again to see that the deployment actually works, so we have the application listening at the route that was created during deployment, and we test it by simply clicking on that route link that we saw previously, or by copying and pasting that URL into a browser window. Once we do that, we want to test our deployed AI/ML application. We can also test our app status through curl or by invoking a web request. We definitely want to be able to upload images, and as our application is now a REST API endpoint, there are multiple ways that we can upload images to it.
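Whether you hit the route with curl, a browser, or a script, the check comes down to parsing the response body. Here's a tiny hypothetical helper for that: it assumes the status endpoint returns JSON like `{"status": "ok"}`, which is an assumption about this particular app, not a general rule.

```python
import json

def is_healthy(body: str) -> bool:
    """Return True if a status-endpoint response body reports the app
    is up. Assumes a JSON body shaped like {"status": "ok"}; anything
    else (HTML error pages, other statuses) counts as unhealthy."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        return False
```

The equivalent from a shell would be curling the route and inspecting the JSON by eye; a helper like this is what you'd drop into an automated smoke test after each deployment.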
We can also run this app from a Jupyter notebook, who would have thought. In this case, we'll add an image, and I'm just calling it card.jpg, just a photo of a car with a license plate, and I'll also add that URL or route that I created in OpenShift. If I go ahead and run the cell, I'll see the prediction. I've got a screen capture of the car so that you can see the license plate number that the prediction came back with: BU69YDE, which is actually correct. Now that we've done that, let's take a look and see if the options that we were talking about in the managed services platform that we originally put together are actually there. They are. Again, this was a conceptual architecture for managed services and model delivery that we were talking about, but it is actually the architecture for the Red Hat OpenShift Data Science platform. As we discussed earlier, we have that typical AI/ML model or workload lifecycle: gathering and preparing your data, developing your model, integrating your models into app development, and doing some model management. And at the bottom, the gray area that you see is the managed cloud platform, provided either by Red Hat OpenShift Dedicated or Red Hat OpenShift Service on AWS.
Initially, AWS is the public cloud for the launch of this service, and we'll be looking at Azure in the future. We do include the NVIDIA GPU support. And then, of course, in the Red Hat managed cloud services, we provide our core Red Hat OpenShift Data Science offering, so that's going to have Jupyter, TensorFlow, PyTorch, and source-to-image for publishing, and also tie-ins with other optional add-on cloud services, things like OpenShift Streams for Apache Kafka and our OpenShift API Management service. For optional launch partners, we include services such as Starburst for data access and prep, and of course, as I mentioned, Anaconda for package distribution repositories. And then we also have software partner offerings like IBM Watson Studio and Seldon Deploy. So what did we learn today? Well, I hope what you learned today is what managed services are, and that they are a big deal when it comes to deploying a model, because they make the process easier: for data scientists to experiment, for the data engineers to curate the data, and, when the data scientists have built the model, for DevOps to easily deploy and monitor the model. And as a matter of fact, data scientists also have the ability to deploy and monitor the model. And with that, I would like to thank you for your time and take any questions. Well, thank you for this. You've covered many of my favorite subjects, one of which is JupyterHub and Jupyter notebooks. The tip about trying to refrain from using Jupyter notebooks in production, that might be the thing that I need to be reminded about the most. The one request we got from the chat was whether we could get a hold of your slides to share them with folks, and I let people know that I would make sure I could do that for them. But I just wanted to thank you. This has really been a very interesting approach, because most of the time, if you're a data scientist or someone who dabbles in research, you end up trying to do all of this by
yourself, or with minimal IT support. I've tried to do that before in a previous lifetime, and it's not easy, and it's not fun. At the end of the day, when you're working on pipeline delivery, you're like, I'm a data scientist, I just want to work on my freaking code. You don't have to with managed services for data science, so I think this is a huge step in the right direction. And I'm sure there are other managed services too, but it's wonderful to see it all working on OpenShift. So thank you for the tour de force today. We'll share this with the folks that are out there in the universe looking to try this out, and we look forward to having you return as new features and functions become available to talk us through those as well. So many thanks for your time today. Thank you for having me on board. Remember, folks, if you have questions, bring them on in. You can always look me up on LinkedIn and get my contact information through there, and I'd be happy to answer questions. Perfect. All right, thanks everybody, take care, and we'll talk to you all soon.