Alright, without further ado, we can go ahead and get started with the presentation. My name is Cedric Clyburn. I'm a developer advocate at Red Hat, and you can find me on Twitter at Cedric Clyburn. I'm very excited to be here to discuss the AI developer experience, from the side of both a data scientist and an application developer, and also ML engineering, to create a whole story about how we can take our models, build them, train them, test them, fine-tune them, and then deploy them onto Kubernetes, which we're going to be doing today. Then we'll go to the side of an application developer and actually scaffold a project that's inferencing against that API, and show this whole story today in the session. So we have a lot to cover. We're going to be demoing a good amount of open source projects today: Kubeflow, KServe, JupyterHub, and we're going to be using a lot of libraries like PyTorch, so thanks for being along with me for this journey. Before we begin, I want to say a little bit about myself. I'm a developer advocate at Red Hat, formerly an OpenShift developer advocate, so a lot of experience with Kubernetes. I'm also from Red Hat Developers specifically, where I create blogs and cheat sheets, and I'm working on my first book about a variety of developer tools and technologies that you can use with Kubernetes for CI/CD, things like Tekton and Argo CD, which we'll also be seeing. And I create labs with Instruqt as well, so interactive labs where you can progress through the steps and use technologies like Argo CD in your browser without having to install anything. I do a lot with IBM there too. I make these really cool whiteboard videos that you can find on YouTube. That one's Podman versus Docker; there are some other cool ones out there. And if we don't have time to connect here or do the Q&A, then I've got Twitter and LinkedIn where you can reach out to me, and I'll try to answer any questions that you might have. In general, before we start this presentation, it's really cool to be here because open source is really driving a lot of the innovations that we're going to see in the demo we'll be running through. It's kind of why I'm here and why we can all be here at this conference, so thanks to the organizers for making this happen. I want to ask a little bit about you now. Here you can see the beautiful School of Athens by Raphael. There are people in this painting such as Plato, Socrates, and Aristotle. How does this relate to the presentation? Well, I want to learn a little bit more about you and your role. So just with a show of hands, I'd love to know: who are you? Are you maybe a data scientist? Sweet, okay, two data scientists. Or are you in operations? Cool. And this is probably going to be the most people: developers? Fantastic, okay, great. Myself included as a developer. This is really helpful to know a little bit more about you and what you do, so I can tailor the talk toward that. One more follow-up question I want to ask. When it comes to working with models and deploying models, are we maybe in the beginning stages of testing things and creating these models? With a show of hands: beginner or intermediate, we've created the models, we're serving them, we're working with intelligent applications? Or expert: performance, utilization, we're trying to save money at this point because our compute bills are too high? Okay, fantastic. Well, this is great because we're going to go through the whole flow.
And so if you haven't had experience as a data scientist, that's okay, because I'll explain everything as we go through JupyterHub. But without further ado, I don't have to say that AI has permeated about every aspect of our lives, especially when it comes to development. I mean, I was just at the GitHub Universe conference about a month ago, where you probably heard the announcement that GitHub is completely rebranding itself around AI with Copilot. And I think AI is now an essential part of how we're driving value for our organizations and for our businesses. At Red Hat, we've got Ansible Lightspeed, which is based on IBM's watsonx, where you can define infrastructure just from natural language. Say I want an EC2 instance with this type of security: I can type that, and the code will be written out for me. So it's really impressive to see all the developments that we have. You probably heard about Grok yesterday, around generative AI. And I think it's really changing the way that we work with customers and what our customers are expecting from us, because there are so many use cases, right? In industries and governments and banks, for fraud detection, for all these different models that we can bring to the table. And these advantages are really immense, but it's essential to ensure that the models we're working with are, firstly, running efficiently, running reliably, and most importantly, running securely. So this brings me to the AI and ML development lifecycle. From talking to customers at Red Hat, a lot of the feedback we've gotten is that their AI and ML models are not making it to production. And that's because there's a lot of complexity in the building and the deploying and the monitoring of these models in production, right? Not just in a development sense, but actually putting them onto a cluster and running them wherever it might be. So I want to start off this presentation with the AI and ML space and the different personas that are involved in the creation. We talked about us in the audience as developers, as data scientists, as operations, but I want to talk about who's really involved in the creation. Because I was fortunate enough to be born into a world where DevOps was always a thing, right? And I think I've learned a lot, and we've all learned a lot, from these DevOps principles. But when we look at how AI and ML development is working, there are new complexities that have been introduced into the software development pipeline. It's not just development and operations now. And with these different personas, there's always going to be friction occurring back and forth. So let's look really quickly at some of the key roles. Of course, business leadership: they're defining the goals and the metrics. The data engineers are going to gather and prepare the data. The data scientists, right? They're using JupyterHub. They're using TensorFlow and PyTorch, which we're going to be using today. Application developers, someone like myself: I'm going to deploy the models within the application, whether it's a Flask app or gRPC calls. And then we also have the ML engineers and IT operations. We can't forget about those guys. They're going to handle the monitoring and the management. But the challenges that organizations and enterprises are facing, that we see a lot when we're talking to customers, are very different from what we're seeing in the research space.
It's: how do I take what I want to do with AI and actually operationalize it? And that's a huge barrier for adoption in these industries. And it's not the Googles and the Microsofts that have so much money to throw at this space. It's the average company that has to put in all this work just to operationalize a model and get that final output, which is this beautiful AI-infused application. And so these different personas need to interact. We start off with the business leadership, which is, of course, defining the goals and metrics. Then afterwards, in the second part of the process, the data engineer gathers and prepares the data. But there are so many considerations when it comes to this. Where are we going to store our data? Is it going to be in an S3 bucket, like we're going to be doing today? Or does it need to be on-premises, or on another server elsewhere that's not even S3? Maybe it's a data lake. There's data exploration and data preparation we're going to have to do. Are we going to have to handle streaming data or stream processing with Kafka? Maybe we have to use Starburst to process some of this data before the data scientists can actually work on conducting experiments and building out a model. Maybe they're using a foundational model, as we're going to do with a model from Hugging Face today. But they're going to be working in their environment, right? This is a Jupyter notebook. They're going to have some libraries. They're going to be using TensorFlow or PyTorch. And when it comes to working with these Jupyter notebooks, we want it all to be in, for example, a containerized environment for reproducibility and scalability. Next, someone like me, an application developer, enters the flow down here in this section when it comes to actually deploying our models and creating these intelligent applications. And of course, we're going to discuss MLOps today in this presentation, and pipelines, and how we can automate all of this. But say a data scientist is going to be serving their model: we can make these calls from an application, and we're going to utilize Kubernetes to perform all of this and do inference on our model. And finally, the work doesn't stop there once we have our intelligent application and our model being served. We've got to continuously monitor and manage it here at the end to make sure that the predictions are accurate and that the model isn't slowly drifting without us even noticing. So we have to iterate. We have to go back a little bit and retrain the model if we need to. And you'll notice that the ML engineer and IT operations are involved in all of these steps throughout the entire process, because they've got a vested interest in making sure that these open source tools and technologies we're using are safe and aren't susceptible to network compromises and hacking. They maintain the security of the network. And in this chart, we haven't even mentioned InfoSec and ProdSec, which are along for the entire process as well. So you look at this, and it's a huge graph, and you're probably wondering: why is it so complicated to operationalize this model development? And that's because here in the middle is this little clear box that represents such a small piece of it. And this is the research work that's happening, right? It's the building of the model. It's people iterating over things.
And when you start to do this at an enterprise level, things start to get complicated, right? There's the configuration of the serving infrastructure to be able to run these models. There's the data collection and the feature extraction, and the actual management of the infrastructure to be able to scale up and scale out with our models. And things like monitoring tools, all these different aspects that we have to think about when we actually go into production. We talk to customers, and they don't expect to see any of this, but you do when you approach the point where you have to productize and serve these models and applications at a large scale. And they keep coming back and saying: it's fine, we're running these models on our laptops, and it's great for experimentation, but when we actually go to the next stage, that's when you start to run into issues with all of this. And I think it's similar to how DevOps emerged from running software at scale. We're slowly seeing the same thing with AI and MLOps, which we'll be diving into today. Because corporations will have an ML engineer come up to them with their JupyterHub notebook out, or running this application in a Flask app, and show this really cool model. And this happened for us when we were developing Ansible Lightspeed. We had it in a Flask app, and we were trying to show these different experiments we were running. But when it came time to scale up and scale out, you need the infrastructure and the scaling and all these different components to actually do that. So it's kind of a learning process that organizations have to go through. But the platform that we have to have to support all of this, and these AI and ML models, needs to be treated how we treat software, right? There have to be rigorous processes for change management and data verification, for testing, for new feature development. Everything that we've been doing for the past 10 years or so. Because while it's really cool to just think about ChatGPT or Llama, the top of the iceberg, we've got all this infrastructure below that's going on, right? That's supporting the actual model being served. And so that's what we're going to take a look at today, and see the role open source is playing in the foundation of AI and ML models. So another representation of that iceberg that we just saw is this cool little, well, not as cool, stack graph that shows that AI is software, right? The model on top is being held up by everything underneath. And we're going to dive into this and learn how everything plays together when it comes to open source. Because of course, it's software at the end of the day. It needs to consume hardware resources, so we need to be able to schedule it. We need to be able to use technologies like containers. And of course, how do we schedule containers and orchestrate them? Tools like Kubernetes. We'll have to automate software delivery, be able to build and update our models at the same time, and automate all of that. And then, if we take a look at the higher part of the stack, this is what the data scientist is really working on, right? The tools and the libraries and the IDEs, such as JupyterHub, and the frameworks. And if you take a look from a macro perspective, nothing that has been layered on can be missing, right? We need to start from the beginning.
And to do that, you know, should we build out our own platform with all of these? Well, there's a great quote by Peter Drucker that says there is surely nothing quite so useless as doing with great efficiency what should not be done at all, which is corporate lingo for don't reinvent the wheel, right? So the way that we want to approach this, as developers, data scientists, and operations, is to focus our time on activities that actually improve what we're doing, and not so much on the things that don't at all. That could include maintaining old technologies that we don't need, or not taking advantage of the huge ecosystem that Kubernetes provides, especially in the AI/ML space. And if we lean more into the left side here, we can spend more time creating these cool models and building the applications that implement them, and outsource the rest over here. And where do we outsource that to? Well, you probably knew the answer because I'm from Red Hat, but it's to open source, right? Because when we talk about infrastructure and platforms, there are so many problems and challenges that other people have also encountered before you in their journey of working with AI/ML, maybe in a completely different setting, but maybe in the same one. And chances are there are pretty cool libraries or packages or software that have already been developed to solve the specific problem that you're going through. The open source community is full of this, right? From really small projects with a couple of maintainers, to really big projects like Kubeflow, which we're using today, that have become pretty much the standard for model experimentation on top of Kubernetes. And we're talking about collaboration from organizations that might be competing against each other but are still putting their best people out to develop these technologies. So we're also creating these open standards, and as a whole, innovating the industry that we're working in. And so how does this tie into your AI stack? Well, it starts off at the top, right? For a data scientist, we're working with things like TensorFlow, PyTorch, scikit-learn, whatever it might be that our data scientists are using. They're also working with languages like Python and R, and different development environments, whether it's JupyterHub, or using Elyra for pipeline creation. Then we go down one more layer, and we're talking about a huge, interconnected ecosystem of tools for the different stages of data processing and modeling: things like Kubeflow, MLflow, KServe, and Airflow to put everything together. And I want to take a look at the bottom as well, because we're talking about software. Software is just something that needs to consume compute resources. And how are we doing that? How are we scheduling that? Well, of course, with Linux at the foundation, the most popular open source operating system out there, and also taking advantage of things like CUDA for GPU acceleration of the training of our models. Then we go up one more part of the stack, and we're talking about technologies that have made our lives easier as developers, like Docker and Podman, and effectively being able to scale our processes and microservices in a cluster, on Kubernetes for example, and take advantage of the huge ecosystem that's there.
And then on top of that, we talk about operating containers at scale and some of the components that includes: monitoring with Prometheus and Grafana, things like Helm charts and operators, and continuously being able to deploy our software with technologies like Tekton and Argo, right? All these nice tools for auditing and building things. And then finally, we talk about software-defined storage. When we're working with these models, we've got an immense amount of different pieces going on. Also integration: we need to be able to use tools like Istio and Camel K to do, maybe, automatic encryption of requests. I don't want to go into too much detail here in this huge stack, because we're going to see it in action, but you get the point, right? Open source is everywhere when you look at the stack and the infrastructure for working with AI models. And the lower part of it isn't very new, right? We've been working with this for a good amount of time, 10 years at this point. There's a great plethora of different items to create a platform from, whether it's Kubernetes or OpenShift or whatever you're using that bundles this together. But the top part is what we're seeing more and more, which is what the AI developers and the data scientists are using, and they're building on top of this lower stack that we have here. And so we could say: there are so many great open source projects out there, let's go ahead and build our own platform. Unfortunately, it's a little bit of work, and one does not simply build a platform, right? It's a massively complex thing. It's hard to integrate all of that together, and I like to think about it this way. You've got my favorite metaphor here for platform building: a car. It's a transportation platform. The goal is to get you from point A to point B. That's it, and it does a great job. And a core piece of this platform is this machine learning or AI model, where it's the engine that really enables the thing. But you can't go anywhere in a car with just the engine, right? You need all the other pieces around it that are so critical, like every other component in the car, the computer systems, the wheels, everything else, to be able to integrate together. And you have to train a team to be able to put all those pieces together, right? So we're not talking about just movement, but also the people that are going to be moved in the car, you know, all the management, the integration, the testing of the car that has to happen. So it's not a trivial thing. And what's really tough in software, with that stack graph that I just showed you, is that all these pieces have a tendency to be updated very quickly, and you want to make sure that these updates don't break your car, or your platform in this case. So how do we handle this? Well, I'm sure you probably know the answer, because open source is really what's powering all of this, right? At Red Hat, we've known this for 30 years, from before I was even here. Think about the Linux kernel, taking that into Fedora, and then productizing that with RHEL. And so we work with the communities of over a million open source projects, not projects that we own, but ones we contribute to in a variety of different ways, for virtualization, for OpenStack, whatever it might be. Specifically in today's use case, Kubernetes. And we have an upstream-first mentality, where we're contributing upstream.
So essentially what's happening is we're bringing projects together in this middle part for integration, sometimes into one project. So for example, Fedora, right? This is a community that we sometimes manage or steer, and eventually we stabilize them at the end into these product branches, and this is where we provide stability for our customers. That's the business model, essentially the secret sauce of Red Hat right there. And this is how we bring the power of open source into different industries, wherever we are. And it really helps when companies at the end there need to be stringent in terms of security and compliance and governance. And this is what we do when it comes to open source AI. So who here is already using Kubernetes? Okay, fantastic, about half the crowd. So at Red Hat, we are a huge contributor to Kubernetes. We do a lot of things with infrastructure, but mainly, especially what I do in my job, revolves around the world of Kubernetes. We bring together a lot of different tools into OpenShift, and then we contribute upstream based on any conversations we're having with customers about features that they'd like to see. Many of the features you see in Kubernetes have gone upstream from OpenShift, from talking to different customers in different areas. And that's how we contribute, for a distribution of Kubernetes. But think about this in a different way, for the open source industry, or sorry, for the AI/ML industry. What we're doing here, which you might not have heard of even if you've heard of OpenShift, is fairly different but similar, because it's built on top of OpenShift: it's called Open Data Hub. It's a curation of all these different projects that we talked about there on the stack into one place, using Kubernetes as a platform. This includes all the data science tools, TensorFlow and PyTorch and other machine learning frameworks, and then JupyterHub as a multi-tenant IDE that our data scientists can collaborate on. Things like Spark for data processing, and Seldon and KServe for model serving. And it allows us to leverage Kubernetes and Kubernetes operators to make this happen. So we're trying to contribute upstream to these projects, but also make it easier for all the different personas that are involved in AI/ML, like we saw in that first chart, with data scientists and ML engineers and app developers. And so this is kind of how we work. A quick plug here: we have an offering for AI and ML called Red Hat OpenShift AI. This is the productized version, similar to how Red Hat Enterprise Linux exists, giving you the foundation of OpenShift running on top of Red Hat Enterprise Linux, with the support of Red Hat, for the hybrid cloud. So that's pretty neat. But we couldn't talk about AI/ML and Kubernetes without talking about MLOps, which is kind of the DevOps of ML, if you could equate it that way. So we've talked about the struggles of putting a model into production, right? The difficulty of creating a platform: it's difficult, it's time-consuming, and so is the model development itself. But I want to talk about ML operations, because that's how we're automating away all the problems that we've had up until now, all the way from development to production, which is the title of today's talk.
So the main idea behind MLOps, which is kind of a newly formed operational concept, is that we should be able to do all of this without it being painful and risky. It's about bringing collaboration together across all the different personas, data scientists and developers, and building on top of the work that's already been done with DevOps and GitOps, but from a machine learning and AI perspective. So: using a single source of truth, automating and securing everything, and being able to iterate quickly on our applications, which is really important when we're working in such a fast-paced industry. When we take a glance at this model development lifecycle chart that we looked at earlier, MLOps is here to address and minimize all of this complexity when it comes to developing our models. That includes using tools like Spark for data processing during the preparation phase at the beginning, things like AI and ML pipelines when we're deploying our applications and developing models, and then model canary rollouts at the end. So the entire process here is what we try to address with ML operations. Now, I want to talk a little bit about Open Data Hub and how we're working to bring together all of these projects to speed up the time to market for models. This is the platform that we're going to be using for today's demo, which is, as I said before, an open source community project that incorporates a bunch of different open source tools. So the data scientists have their own self-service IDEs, say JupyterHub, where they can collaborate and use all the different libraries that they're used to. And this is great because they already love and work with these tools, so why not just bring the tools to them, right? Elyra for creating pipelines, Kubeflow Pipelines for actually running those, doing experiments on our models, fine-tuning. And then KServe ModelMesh, which is going to allow us to deploy the model we're creating today onto the cluster and take advantage of all the benefits of serverless computing, as well as things like Prometheus and Grafana for actually monitoring our models and getting metrics out. The way it works is we're contributing upstream to these different projects that I just mentioned, and then that's curated into Open Data Hub, which we're using today, and which is itself an operator. So who's heard of Kubernetes operators before? Okay, fantastic. Operators are a way to extend your Kubernetes cluster's functionality. You can get them from OperatorHub or Artifact Hub, and they're a simple and easy way to install whatever you're working with onto your cluster, whether it be Knative or Grafana or Argo CD or MongoDB, and you get a lot of different features such as auto-upgrades; there's a minimal sketch of what that install boils down to just below. When it comes to the actual process that we were looking at earlier, you can use all these tools, from data preparation and running experiments directly from JupyterHub, to deploying that model as a service for other applications to take advantage of on your OpenShift cluster on top of Kubernetes, and then gathering metrics. And so that whole lifecycle is kind of boiled down. And now we get to the fun part, because we get to take a look at the actual tools we're using today.
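As referenced above, here's a minimal sketch of what installing an operator programmatically boils down to: creating an OLM Subscription object, which Operator Lifecycle Manager watches and acts on. This uses the official kubernetes Python client; the package name, channel, and catalog source below are illustrative assumptions, so check OperatorHub for the real values.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

# An OLM Subscription object: OLM watches for these and handles the install
# and auto-upgrades. Package name, channel, and catalog source are assumed
# values for illustration; check OperatorHub for the real ones.
subscription = {
    "apiVersion": "operators.coreos.com/v1alpha1",
    "kind": "Subscription",
    "metadata": {
        "name": "opendatahub-operator",
        "namespace": "openshift-operators",
    },
    "spec": {
        "channel": "stable",                    # assumed channel name
        "name": "opendatahub-operator",         # assumed package name
        "source": "community-operators",
        "sourceNamespace": "openshift-marketplace",
    },
}

# Create the Subscription as a custom resource; OLM takes it from here.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="operators.coreos.com",
    version="v1alpha1",
    namespace="openshift-operators",
    plural="subscriptions",
    body=subscription,
)
```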
So we're going to be fine-tuning a Stable Diffusion model today with Kubeflow Pipelines, taking in some training data to fine-tune our model, then serving that model with KServe ModelMesh onto the cluster, so that developers can make HTTP or gRPC calls to that endpoint to power their AI-enabled applications. And I know there are a lot of developers here, so yes, that's a good thing. And then Backstage. Who here has heard of Backstage? Okay, yes, my guy. So Backstage is really cool, because it's a really popular project right now in the CNCF. It lets you build internal developer platforms and allows platform engineers to come in and really accelerate the productivity of our developers, right? Who here has 20 different tabs open for all the different tools they're using for software development? It keeps everything in one place, and we'll take a look at that as well when it comes to building out our application on the app developer side. A quick slide about the demo we're doing today. We're starting from a data scientist's flow of working in a JupyterHub notebook, starting from a foundational model. So we're using Stable Diffusion 1.5, and then we're going to do some fine-tuning with Kubeflow on top of Tekton, do some testing, then serve our model. And of course, we can come back; this is an iterative process, as we showed before. Then we're going to switch to the app developer side, starting off from down here, to do some API inferencing on that served model, scaffold a new app built as a Flask app, deploy that onto Kubernetes, and take advantage of things like Argo CD to keep everything in sync. And then, of course, monitoring that. So is everyone ready? Sweet, thank you, I'm running out of breath, so I appreciate that. Yes. So we're not using Istio, that's the only thing. This is just a pretty basic demo. But we are using KServe ModelMesh. If we're ready to go to the demo, let me change my screen just a little bit here. I'll show the data scientist flow to start off with, then we'll refine that model, serve it, and then head over to the app developer perspective as well. So let me hop out of here. This is Red Hat OpenShift Data Science, which is built on top of ODH, Open Data Hub, sorry, which we're going to be using today. The only difference you'll really notice is that the logo is different. But this is where everything starts for our data scientists and our app developers, this shared experience for both of those personas. And it's built on top of our OpenShift cluster. You can see here that we've got this OpenShift cluster running on top of Kubernetes. What we've done is just install the operator from OperatorHub, and you can learn more about Open Data Hub at opendatahub.io: what the project is, how it works, and more. Of course, operators: there are tons of them, they're really cool, they make your life easier, that's all I have to say about that. What RHODS, which is the acronym for Red Hat OpenShift Data Science, allows you to do is take advantage of things like Jupyter as an IDE, of course, but also OpenVINO for acceleration on Intel hardware, or Starburst or Anaconda, which is really cool. But what I want to start off with is this data science project.
So a data science project is kind of a culmination of all those tools we just talked about. For example, a workbench, which allows the data scientists to collaborate together, right? I'm not a data scientist, but I'm going to pretend like I'm one. Say I was using a containerized version of PyTorch: we've got all these packages included, I can define the limits for my JupyterHub environment, I can open it, and I can also set permissions for who can access this project, which is pretty cool. It allows us to do all the testing and deploying and pipelines that we need to do here. We can set up storage and data connections. I have a connection to an S3 bucket, to pull down training data and also to upload my fine-tuned model in ONNX format. We've got a pipeline down here based on Kubeflow, and we're serving the model already. But I'll get back to all of this in a second, because I want to introduce you to the project that we're going to be working with today. So I'll give it a second for JupyterHub to get ready. And here we are. We've got this really cool project that we're going to be running through today, starting, of course, from JupyterHub. What we want to do is work with a text-to-image application. We're going to be using a model that's hopefully pretty familiar to you all. Who here has used Stable Diffusion? Sweet, cool. Stable Diffusion, it's great. You could work with any type of foundational model, but this one accelerates our workflow. Yes, sir? I can send you the materials afterwards. Or do you have JupyterHub open right now? Because it's, yeah, yeah, it's on GitHub. We have quite a hefty GPU that we're using in this Kubernetes cluster, but afterwards I can share the code with you. Yeah, what I did actually was just bring in this GitHub repository that we're working with, so I'll share that with you afterwards. And that's some of the integration that's already built into JupyterHub with the Open Data Hub project. But what we're going to start out doing is making sure we can use this Hugging Face model. So we're checking to make sure that we have GPU access; this is based on NVIDIA's CUDA. And we're going to install the libraries that we need. We're going to be working with Hugging Face Diffusers. It was really cool seeing the Hugging Face guy right upstairs, talking about this as well. And we're loading in this pre-trained model right here, Stable Diffusion 1.5. So I'll go ahead and do that as well. And I'll clear this now, because once we load in this basic, vanilla Stable Diffusion, I want it to generate a photo of a dog, right? So we're going to query, prompt: give me a photo of a dog. And what's interesting is it's going to give us any random dog. This random dog, pretty cute. But what we want to do is actually fine-tune this model to generate photos of Red Hat Teddy. Red Hat Teddy is my colleague's, my friend's, dog. Pretty cute dog, has on a nice fedora. Can't say enough great things about him. So what we're going to try to do is generate a photo of Red Hat Teddy. I'll clear out this and try to regenerate it. And of course, the model hasn't been fine-tuned yet. It has no idea what a Red Hat Teddy dog is. So it's going to generate probably another random dog. The face isn't even in the photo here. But of course, it tried to make its best prediction about what it should look like.
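For reference, here's a minimal sketch of those first notebook cells, assuming the torch and diffusers packages: check for a GPU, load the vanilla Stable Diffusion 1.5 checkpoint from Hugging Face, and generate a generic dog photo. The model ID is the public SD 1.5 repository; the file names are illustrative.

```python
import torch
from diffusers import StableDiffusionPipeline

# The notebook first checks that CUDA (an NVIDIA GPU) is available.
assert torch.cuda.is_available(), "this notebook expects a CUDA-capable GPU"

# Load the vanilla Stable Diffusion 1.5 foundation model from Hugging Face.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to fit on smaller GPUs
).to("cuda")

# The base model knows generic dogs, but nothing about "Red Hat Teddy" yet.
image = pipe("a photo of a dog").images[0]
image.save("dog.png")
```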
And so, essentially, it's not the right dog we want. So we've created a second notebook here to start the process of fine-tuning this model. Of course, we can check for the video memory and install the requirements. We're also going to be training it, so we're installing some other libraries. And what we're going to be doing is essentially loading in some data from an S3 bucket. So we've got this folder full of nice, nice photos of Teddy. I'll show you this here in a second. We've got about 10, 15 photos of Teddy. Teddy's looking good, handsome dog. We're going to feed this into the model in order to fine-tune it, right? We're going to let it know that this is a photo of a Red Hat Teddy dog. And we're going to save this, in a second, in ONNX format to the S3 bucket that we're already connected to through this JupyterHub notebook. The training is going to take a good amount of time, so I'm going to skip through it, because I ran it earlier. What we're going to be using is DreamBooth. DreamBooth is going to allow us to train this model without it forgetting everything that it already knows. And so we've started the DreamBooth training, we gave it about 10 minutes, and what we have afterwards is this trained model. So let me see here. We, yeah. So we've got a trained model that we've exported after the training is done. We're using ONNX, which is a great format for transporting and saving models. We could even take this and re-upload it to Hugging Face, which would be nice. But we've got a more fine-tuned model. So we ran a query on it again, with a prompt of a photo of a Red Hat Teddy dog, and we've got this brand-new image here, pre-loaded, that we already created earlier, of Teddy. This is Teddy in a new environment. And it does, does it look like Teddy? Pretty well, okay. And there's a way we could actually automate all of this, and that's through Elyra, one of the tools we were talking about, creating all the different steps we need for this fine-tuning to happen, right? The downloading of the data, as we did manually, the fine-tuning using DreamBooth, the exporting to ONNX and uploading that to our S3 bucket, as well as generating a sample and uploading that sample. And so we could start that training job. We can also view the actual run here in the Open Data Hub interface. It might take a few minutes, but I do want to show you what it looks like. See, we're downloading the data from the S3 bucket first, doing all of these steps. Essentially, if you've used Kubeflow before, this is built on top of Kubeflow, and we're able to view the different pipelines that we have. So here is the one that we were using. And we've already run it beforehand, so if I go back here, I can go to one that's already completed and view more about it: about five minutes, that's how long it took. And that allows us to do this training and fine-tuning in a repeatable fashion. But this isn't the last step. Yes? It is automatic. You're talking about the pipeline that we just ran? Yeah, this pipeline here. We are looking at the pipeline right now. Yes, the integration is already done for you. It is automatic, as soon as I run it again. Yes, it's done automatically, which makes it easy.
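To make that pipeline structure concrete, here's a hedged, hand-written sketch of roughly what the Elyra-generated pipeline boils down to, using the Kubeflow Pipelines SDK (kfp v2 syntax). The component bodies are stand-ins for the notebook steps, and the container image, bucket name, and prompt are illustrative assumptions, not the demo's actual code.

```python
from kfp import dsl

# Hypothetical training image with PyTorch, diffusers, and boto3 baked in.
TRAIN_IMAGE = "quay.io/example/pytorch-train:latest"

@dsl.component(base_image=TRAIN_IMAGE)
def download_data(bucket: str, prefix: str):
    """Pull the ~15 Teddy training photos down from the S3 bucket."""
    ...

@dsl.component(base_image=TRAIN_IMAGE)
def fine_tune_dreambooth(instance_prompt: str):
    """DreamBooth fine-tuning, so the model learns the new subject
    without forgetting what it already knows."""
    ...

@dsl.component(base_image=TRAIN_IMAGE)
def export_and_upload_onnx(bucket: str):
    """Export the tuned model to ONNX and upload it back to S3."""
    ...

@dsl.pipeline(name="teddy-fine-tune")
def teddy_pipeline(bucket: str = "teddy-training-data"):
    data_task = download_data(bucket=bucket, prefix="teddy/")
    train_task = fine_tune_dreambooth(
        instance_prompt="a photo of a Red Hat Teddy dog"
    )
    train_task.after(data_task)  # run the steps in order, like the Elyra graph
    export_and_upload_onnx(bucket=bucket).after(train_task)
```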
And of course, in the background, all of these steps are being run as different pods that are being scheduled, which is pretty neat. So that's the fine-tuning step, right? But the next step is to actually serve this model. So I'll show you here. In the fine-tuning notebook that we have, down here, we have to save to S3. And this has already been done for us, but essentially we're using a lot of the environment variables that we already have here to connect to our specific bucket, and we're uploading the directory with the saved, trained model in ONNX format to our S3 bucket. So we've got these paths now, for example text-to-image and text-encoder, that hold our saved ONNX model in an S3 bucket. So I can go back over to Open Data Hub, go to model serving, and deploy this model, right? We'd select the project, create a name, and select a custom model server. We're using KServe, so we've got a variety of different formats here that we can serve the model in; I would just use ONNX. And then we would select My Storage, which is that S3 bucket we configured beforehand, and I would enter in the path of the saved model. I've already done that here, so I'll go ahead and show it to you. We've got this text encoder here, which is an internal service that's serving. This is a gRPC URL endpoint, so any time I'm in this cluster, I can access and make calls to this model that's being served, which is pretty neat. Or REST over HTTP, a variety of different ways. And I could say, okay, I want more CPU, or sorry, GPU resource utilization for this, right? I'll just show you, actually. I'll go back here and add a server. I can select whether I want two CPUs or 10 CPUs, or whatever resource allocation I want this model server to have, I can define that. And you can do that with vanilla KServe as well, whatever way you want to do it. Now, the last part: with this model being served, based on Triton, we can go ahead and call it. So we've got these four models under modelmesh-serving, text to image, being served within the Kubernetes cluster. All right, so what I'm going to do is install these dependencies, well, they should already be installed. I'm going to make a connection to the gRPC URL. So modelmesh-serving is the host, the gRPC port is 8080, or 8033, sorry, and we have the name of the text encoder that we're going to call. And so instead of loading this model locally in JupyterHub, what we're going to do is just call it with this function that we have right here. So I'll go down here; the prompt is going to be, we're going to put Red Hat Teddy on the beach. We'll go ahead and make that call, and I'll start running these cells here. We already have an old Teddy, but we're going to generate a new one. What's happening, which is really cool, is that instead of using GPUs directly from JupyterHub, the notebook is only doing light CPU work. It's going to take a little bit longer, but now we're offloading these requests to the KServe service that's right here, in a serverless fashion and everything, to generate, as we'll see here in a second, a photo of Teddy on the beach, hopefully, fingers crossed. So we give it a second, and Teddy's on the beach.
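As a sketch of what that remote-inference function is doing under the hood: ModelMesh speaks the KServe v2 inference protocol over gRPC, so a call looks roughly like the following, assuming Python stubs generated from KServe's grpc_predict_v2.proto. The tensor name, shape, and dummy token IDs are illustrative; a real client tokenizes the prompt first and queries each of the served ONNX parts in turn.

```python
import grpc

# Stubs generated from KServe's grpc_predict_v2.proto, e.g. with:
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. grpc_predict_v2.proto
import grpc_predict_v2_pb2 as pb
import grpc_predict_v2_pb2_grpc as pb_grpc

# In-cluster service name and gRPC port from the demo.
channel = grpc.insecure_channel("modelmesh-serving:8033")
stub = pb_grpc.GRPCInferenceServiceStub(channel)

# Illustrative request against the text encoder: the input tensor name and
# shape depend on how the ONNX model was exported.
request = pb.ModelInferRequest(
    model_name="text-encoder",  # assumed deployed model name
    inputs=[
        pb.ModelInferRequest.InferInputTensor(
            name="input_ids",   # assumed tensor name
            datatype="INT64",
            shape=[1, 77],      # CLIP's usual 77-token context window
            contents=pb.InferTensorContents(int64_contents=[0] * 77),
        )
    ],
)
response = stub.ModelInfer(request)
print(response.outputs[0].name, response.outputs[0].shape)
```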
So that's the whole process of taking this served model and calling it from an endpoint. And it's pretty powerful, because we're able to have it served and then call it without having to use local GPU resources, which is pretty neat. But that's the serving, the inference API that we want to use. Yes. So where is the... So you can call it just from the endpoints that you're given. So, if I want to know what the parameters are, the question is: where is this API defined? Oh, okay. I think I understand what you're saying. Like, why have a gRPC call? Yes. In the gRPC call, someone had to call the gRPC endpoint, which we know, but with specific parameters. Yes, you could just change the connection to be an HTTP request in your actual application. That would be something that the data scientist would work out with the application developer. Well, let me get back to you after the presentation, and we can come up here and try to code it out. But to continue on with the flow: once we have the model being served, then I, as an application developer, can come in here and actually work on building out an application. The one that I have here is this Where's Teddy application, which is just serving, sorry, doing a prediction request on the actual model that we have. So this is probably where you would change those configurations. For example, we have this ImageGenerator class that we want to run a prompt through. Here we have that same code, sorry about that, the same code that we used in the remote inference to query the gRPC URL. Now we've just packaged it up into this application to do the same thing and run these prompts; there's a sketch of what that wrapper might look like just below. So, same code here, we've got this application. Now what we're going to do is use Red Hat Developer Hub, our offering based on Backstage, as an internal developer platform, in order to scaffold this new project out, so that someone who is new to the company can learn how the best practices work, how our company is organized, and learn about the frameworks and technologies that we're using. So this is Developer Hub, which is based on Backstage, and it's downstream of a project called Janus, which is the upstream part. So think of it like Fedora: this is just the productized version. As a platform engineer, I can customize everything here. I can add in new links. I can make it easier for my developers to scaffold new applications. Down here, I have settings for authentication providers. You'll see that I'm authenticated with GitHub, and that's because we have this cool organization here called WindTurbine, Inc. WindTurbine, Inc. is just a fake company that we've made up, with a bunch of data scientists, developers, and ML engineers all in one place. We're going to be simulating this organization as, say, an enterprise that is working to create their model. So I'll come back here, and this is done through Keycloak, which is a plugin that Red Hat offers for Backstage. I'll come back here to the home. I can also show you something called learning paths. If I'm new to the company, I can learn about things, whether it's configuring Jupyter to use GPUs, or working with your first APIs, or any kind of programming language; it's all customizable to show them the learning paths. You can also set up different APIs.
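As referenced above, here's a hedged sketch of the Where's Teddy app: a thin Flask wrapper that reads the serving endpoint from the environment (the variables the GitOps deployment injects into the pod) and forwards prompts to the model. The image_generator module, the ImageGenerator interface, and the environment variable names are assumptions standing in for the gRPC code shown earlier.

```python
import os

from flask import Flask, request, send_file

# Hypothetical wrapper module around the gRPC inference code shown earlier;
# the class name and its interface are assumptions for illustration.
from image_generator import ImageGenerator

app = Flask(__name__)

# Read the serving endpoint from the environment, matching the variables the
# deployment injects into the pod (names here are assumed).
generator = ImageGenerator(
    host=os.environ.get("MODEL_HOST", "modelmesh-serving"),
    port=int(os.environ.get("MODEL_PORT", "8033")),
)

@app.route("/generate")
def generate():
    # e.g. GET /generate?location=San+Francisco
    location = request.args.get("location", "the beach")
    image = generator.generate(f"a photo of a Red Hat Teddy dog in {location}")
    image.save("/tmp/teddy.png")
    return send_file("/tmp/teddy.png", mimetype="image/png")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```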
So for example, this photo generator: I can map in all the relationships, which APIs are being consumed, and create these new APIs to map this all out for the organization. But the biggest part about Backstage is definitely these golden path templates, which allow you to scaffold everything that you need for a cloud-native application, in one place and at one time. The issue is that a lot of developers don't know, or haven't had the time to get involved in, all of these open source projects like Tekton, like Argo CD, like Knative. So what Backstage does is actually allow you to, and I'll make this a little bigger, define all of this in code. The template.yaml is where this all happens. So we can create templates that, for example, allow the user to put in parameters, plus any automation that we want to do. We can create a skeleton source code project so that it's already designed with our best practices in mind. We can do GitOps templating and set up the Argo CD application. Anything else that we want to do in here, we can do, and automate all of the processes of building out an application on Kubernetes. So it's pretty neat. So I'll go ahead and open up this template that we've made. I'll select the GitHub organization and namespace. For here, we can create a new namespace in OpenShift, so I'll go back here and do that. As you can see, we've already got the inference API that we're using, so we can define that directly in the parameters of this template.yaml, which is pretty neat; they don't have to enter anything. I'll select myself as the owner, taking advantage of the multi-tenancy that's provided by OpenShift, and for the namespace, we'll create a new one, and we can call that AI.dev demo. Cool. So we've created this new namespace, say I'm working in this one. I'll select that there, and we'll check that everything's going to be good. It's going to build that specific application into a container, store that container in the internal registry, and then deploy it using Argo CD manifests. So here's what the magic really looks like: we're generating the source code into a component, which I'll mention here in a second, we're invoking the GitHub API to create these new repositories, we're generating the deployment resources for Argo CD, we're publishing those, and we're creating those. And now, if we go over here, we can open up this new component in the catalog, in this AI.dev demo, for the Where's Teddy application. What I'm going to show you real quick, which is really cool, is that in the WindTurbine organization, there are now going to be two new repositories. We've got a Where's Teddy application, which, just as I showed you, has the same code that's going to be invoking the model that we're using. So we've built that out, that's our skeleton project. And we also have, back here, a GitOps repository. So we've got a Helm chart, and we've also got Argo CD, which is just going to define the pipeline and the deployment of this actual application onto the cluster. If anything changes in this repository, Argo CD is going to be watching it, and it's going to sync those changes to the cluster. So any modifications I make here are going to be synced directly to our Kubernetes cluster. And this is called a component. So this is kind of like a high-level overview of everything about our application, right?
So we've got Dependabot alerts, we've got pull request statistics, we can view the source code, we can view TechDocs that we've put in there. We have plugins here, like the OpenShift Topology, where you can see the application scaling, and we can see any issues that might be in the repository, pull requests, merge requests. Plugins like Tekton or Argo CD here, where we can see everything being created, all just from this one single pane of glass. Kubernetes, to view the abstracted services and pods and deployments on the cluster, and view the API that's being used, all the dependency mapping, it's pretty cool. But that's going to take a few minutes, because we're still waiting on the build and the deployment of the application. What we can do is go and check out the pre-created component that I've already done, which is already finished. Argo CD has already synced. We can open that up here, which is pretty neat, to see everything that's already happened. We could open up a VS Code instance to start coding on it, but what we're going to do is take a look at the OpenShift Topology. We can see the deployment here that's been created, and we can also see the Python Flask application that's been deployed to the cluster, which is pretty cool. The pod has been created, the service has been created, and now, if we open up this route, does someone want to give me a location to put Teddy at? Penicles. It might not know it. San Fran. Is that how you spell it? All right, and we'll give it a second. It's just based on Stable Diffusion, so whatever the Stable Diffusion foundational model knows, this one will also know. And I want to show you as well that through Argo CD, when we created the deployment for this pod, if we actually go to the pod, we can see the environment that it has, the environment variables to connect to the model that's being served. So we've got that same modelmesh-serving text-to-image endpoint that we had, right here on port 8033. So we'll go back here, and Teddy is in San Francisco. Voila. We could put Teddy in a variety of other places. I think I put him in the gym earlier. Yeah, dogs have to work out too. And whatever we want to do with our model. This kind of shows the two sides: the developers that are scaffolding and building these applications, and the data scientists doing all their work in JupyterHub. We've tied together these two different sides into one place, and that's what this Open Data Hub project does: it abstracts a lot of these different open source technologies, but allows you to connect everything together, to run experiments, to take advantage of GPU utilization to serve our models, and to learn a lot more, which is pretty cool. So let me close out the presentation here, and I'd love to answer some questions afterwards as well. I want to talk about a cool instance of this that we saw in the education world at Boston University, where we deployed OpenShift Data Science, and there are hundreds of users able to do their data science experiments and automation, and do all their work within these JupyterHub environments. It's been pretty cool to see it from the academic perspective. And if you want to try this out yourself, we have this interactive sandbox that has essentially everything we've done today, at red.ht/rhods-sandbox. It's a free 30-day instance, and I'll go back to that. I'll just show you how it works real quick.
But essentially, you can take advantage of everything we've done today. You get 14 gigabytes of RAM and 40 gigabytes of storage. Don't mine too much Bitcoin, please. But we also have some cool resources here. If you want to learn more, there's the developers.redhat.com program, with cheat sheets and ebooks. We have a new book based on Backstage, which is pretty neat, and a great book here called GitOps Cookbook, about all the different best practices for GitOps automation. You can learn more about data science here, and, sorry about this, we have the link here on the left that'll have everything, and the slides, of course, will be posted after this session. But yeah, if I go back here, you can check out the data science interactive sandbox. And this is different from the one I was just using; this is the one that's completely free. You get the 14 gigabytes of RAM, and you can do all your data science experiments here for free, as well as take advantage of GPU acceleration. But that's about everything I have for today. So I wanted to say thank you so much for your time. I appreciate it, and have a great rest of the conference.