Hello and welcome to our talk, Unlocking Kubernetes Intelligence. I am Damian Kuroczko, a senior DevOps consultant at AWS, and this talk will be delivered by me together with Mateusz. Hello, everyone. I'm Mateusz Zaremba, a DevOps architect at AWS.

Okay, let's get started. We'll be talking about unlocking Kubernetes intelligence with foundational models, and we need a bit of an intro before we start. But very soon after the intro there will be a demo, I can promise you that, and it's pretty exciting. Today we'll cover essentially two things. First, a tool we'd like to introduce that helps integrate artificial intelligence into your Kubernetes experience. Second, how we contributed to this tool and what our contribution can give you, your team, and your company.

Before we go into the demo, I'd like to start with a little technical introduction. Let me ask for a show of hands: who has heard about LLMs, or large language models, before? Okay, I can see some hands. Some people did not raise their hands; don't worry, I will explain briefly, and for those who already know, it will only take a minute before we get back to the technical material.

So what are LLMs, or large language models? They are big artificial intelligence models that accept text as input and produce text as output. They are trained on a very large body of text: think of Wikipedia, Stack Overflow, documentation. Through training they absorb this knowledge, and they also learn how to interpret text, so they can understand the context of what you give them. Let's see an example. I asked one of the large language models what I should see in Tokyo, and I also asked it to answer in one sentence. That last part is important: the model understands not only your question but also the shape and format you want the answer to take. The example answer was to explore a couple of districts in Tokyo and see a temple. Honestly, I've been there, I've verified this, and it's pretty good advice. That's essentially all you need to know about LLMs before we proceed.

The next step is to think about how AI can help us as technical people, and I want to highlight two particular topics. The first is the pair programmer. You can use AI tools like GitHub Copilot or Amazon CodeWhisperer to help you program. If you have not used one before, it works like this: you write your code, and based on the context, on the surrounding code, the tool generates more code for you. For example, you can ask it to capitalize the first letter of every word in a sentence, and it can generate such a function quite easily.
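As an illustration (not a slide from the talk), here is a minimal sketch of the kind of helper such a tool might produce from a one-line comment:

```python
# Capitalize the first letter of every word in a sentence -- the kind of
# small helper an AI coding assistant can generate from a comment like this.
def capitalize_words(sentence: str) -> str:
    return " ".join(word[:1].upper() + word[1:] for word in sentence.split())

print(capitalize_words("unlocking kubernetes intelligence"))
# -> Unlocking Kubernetes Intelligence
```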
The second topic ties into what I said before: the source of knowledge. What I mean is that it's very easy to query the knowledge of an LLM, because you don't need to hit the right keywords or type a specific syntax. You can simply use natural language to ask for something and you will get an answer back. As I already said, these models are trained on very large sources of information, and we can get that information back very easily. We can also say in what form and shape we want the answer: maybe we want it explained in a simple way, or maybe we want the steps for solving something. And this is what we want to talk about today: using AI as a source of knowledge.

But we can go one step further. I expect a couple of people here are SREs or DevOps engineers, and what we like is automation. We don't want to rely on remembering and manually taking the same action every single time; if we have a repetitive procedure, we'd like to automate it. So the first question is: how can we automate using the AI's knowledge and skip the copy-pasting? And then there is a second question: how can we ensure privacy? Many of you have probably used tools powered by LLMs, some chat backed by AI. But where does the information we enter go? As an employee, as a team, as a company, we want to ensure that this information goes to a specific place, and quite often it's best if it does not leave our network at all, so that we control where the input goes and what the model can do.

What we want to show you today is an answer to those questions for Kubernetes, and that answer is K8sGPT. K8sGPT is an open-source tool that helps you stay inside your natural working environment, the command line interface. And on the privacy side, our contribution lets you keep the traffic, the information you enter, under your control. Here I also want to give a shout-out to Alex Jones, the creator of this project, who helped with our contribution by reviewing and verifying our pull request. Thank you, Alex. And with this bit of introduction I'm finished; I'd like to hand over to Damian, who will tell us a story.

Thank you very much, Mateusz. We want to demonstrate how you can use K8sGPT to speed up your working environment and your debugging. To make this work I had to prepare some infrastructure underneath, and we'll cover how to set that up later on; right now I want to show you how to use K8sGPT. So I have a deployment. Let's say I'm a developer who got a task to change something in my application, and this application normally runs in production on a Kubernetes cluster. To have a similar environment, I spin up a cluster locally on my laptop and I just want to run the application. But for some reason it's not working, and we see the pod status is Pending. For seasoned Kubernetes administrators that status already means a lot, and we would know what to do next: probably kubectl describe, check the pod status, and so on. But here I'd like to show you k8sgpt analyze, which analyzes our cluster and shows us all the problems K8sGPT finds. Right now we see two different problems. Normally, with kubectl describe, we would find one problem at a time, and later we'd maybe copy and paste the error message into Stack Overflow or something like that.
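The demo's first step corresponds roughly to commands like these (a sketch; the exact output format varies between K8sGPT versions, and later the same command gets an --explain flag to involve the LLM):

```bash
kubectl get pods    # the application pod is stuck in Pending
k8sgpt analyze      # run K8sGPT's codified checks across the whole cluster
```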
Here, instead, we get exact information about everything that's wrong, everything that is not in a perfect state right now across the whole cluster. Normally we would need to go through the errors one by one and discover each of them ourselves; here we get them all on screen in a single run. And what do we see? One kind worker node is not healthy, and something is wrong with our pod: the error shows it cannot be scheduled on any of the three nodes. So one node is not healthy, and the next one should pick the pod up, but that's not working either. Up to this moment we haven't involved any AI; this is pure codified experience in K8sGPT. K8sGPT knows what needs to be checked to determine, for example, that a pod is not healthy.

Now let's include some AI: let's ask it to explain why those problems are there and to give us some solutions and recommendations. What happens under the hood is that these errors are sent to our LLM, and the LLM answers with a proposed solution for each error. For the kind worker, the suggestion is to check the kubelet logs and, if necessary, restart the kubelet. Of course, whenever I send the same problem again, the LLM will analyze it again and send back an answer that can be different each time. Okay, so let's restart the node to make it healthy. Now the second problem: insufficient CPU, try rescheduling, check the CPU resources available on the instance. Let's look at the CPU part of my deployment definition. We requested 250 CPUs, which is far too big a number for my laptop, so let's make it much smaller and redeploy the deployment.

So both issues should be fixed, but the pod is still pending. Let's analyze the infrastructure once again and check why. Still pending; it's weird, to be honest. Let's investigate. The demo effect! I think I know: I'm in the wrong directory, so I was editing a different deployment. Let's have a look. Yes, okay, sorry, now it will work; I was changing the wrong file and applying the same one without any changes. That's where we need the automation, right? Okay, now it's created, and we have another error, which means we've made it to the next step. Let's analyze this once again. No problems detected, because the error didn't show up yet, but let's see now. All right, we have an nginx pod with a failed start. Let's describe it: back-off restarting the failed container. Even here we don't see anything that would be meaningful for us.

I also wanted to show a fairly new K8sGPT feature related to filters and analyzers. The way K8sGPT works is that it has a list of analyzers that inspect the cluster from different angles. We already saw that it can analyze pods and nodes, but there is a whole set of things it can analyze, and some analyzers are not turned on by default, because they may not be useful for everyone. Here I want to show you one of them: the Log analyzer.
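Listing and enabling analyzers looks roughly like this (a sketch; filter names depend on the K8sGPT version, so check the list output first):

```bash
k8sgpt filters list     # show available analyzers and which are active
k8sgpt filters add Log  # additionally scan container logs during analysis
```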
With the Log filter added to our configuration, K8sGPT can analyze not only Kubernetes from the Kubernetes point of view, but also the logs from our containers. Let's run the analysis again. This time K8sGPT found six problems; let's see what it found. Those six problems come from all namespaces, and since in this demo we want to focus on the default namespace, let's limit it, because we also get logs from the kube-system scheduler and so on. We filter down to just the default namespace; of course, we could also limit it to only services, or only pods, and so on.

One thing I want to point out about logs: our pods and containers often produce a huge, long list of log lines and errors, and it's sometimes very tricky to find the one line that is meaningful. K8sGPT picks out the line with the actual error, and here we see that the container cannot bind to port 80. So let's check the definition of our pod, and we see that in the security context we are dropping the bind-service capability. This is what blocks the container from doing what it wants.
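The full manifest isn't shown on the slides, but the relevant part of the pod spec looks roughly like this (a sketch; the pod name and image are assumptions):

```yaml
# Sketch of the demo pod's security context: with NET_BIND_SERVICE dropped,
# nginx cannot bind to port 80 inside the container.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:latest
      ports:
        - containerPort: 80
      securityContext:
        capabilities:
          drop: ["NET_BIND_SERVICE"]  # removing this drop fixes the bind error
```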
So let's change that and apply the fix. One pod is terminating; let me clean it up and apply again. All right, now it's running. With port forwarding, let's check if it's working. Almost there: we get a 404, so we reach the container and nginx is running, but the nginx configuration is still not perfect. Let's analyze the cluster once again. Here K8sGPT digs into the logs and finds the exact, most recent error, along with an example solution. The error is that the index file is not found, and the suggested solution is to check whether the file exists, and so on. We can see the path is not correct; it has a typo, and I've prepared a fix for that. This is the Dockerfile of our nginx image with the typo, and I've built a newer image version with the fix. Let's change the image version of our pod and apply it. Okay, now it's configured with the proper routing and it works. The application is working. And this very basic application only serves a screenshot from the K8sGPT release notes showing that the feature Mateusz and I prepared, adding the SageMaker provider to K8sGPT, was released. That version is already quite old: it was added a month ago, and there have been three new releases since, which shows how rapidly this project is growing. And that's the demo part; let's go back to our presentation.

You saw how K8sGPT can be used by a developer or anyone else. Now let's look at how to configure it and what is needed to make it work. From our laptop we need connectivity to the Kubernetes cluster: K8sGPT uses the same kubeconfig that kubectl uses, so you need a configured context locally. You also need connectivity to the model you pick, and there is a whole set of them, which will be covered later on. In our example we used Amazon SageMaker in AWS, and in this case, to reach SageMaker, we just need the AWS CLI configured with temporary credentials locally.

To install the K8sGPT CLI on a laptop: for macOS it's easiest to install it with Homebrew, but binaries are also prepared for Debian-based and Red Hat-based distributions and for Windows; each release comes with a list of packages you can install. In terms of configuration, K8sGPT has very good built-in help, and it is particularly useful for listing and adding backends: we can list all backends, and to add Amazon SageMaker, for example, we need to provide two elements: the region it was deployed in and its endpoint name. That part will be covered in the SageMaker section. Everything we do in the CLI is saved in a config file, which we can change later if needed; some other parameters are saved there too and can be changed either through the CLI or in the file.

The CLI is great for ad hoc, local work on the problems in front of you right now. But in production we often don't even have access to the cluster, and for this purpose the K8sGPT operator was created. We can install K8sGPT inside our Kubernetes cluster, have it running all the time, and, based on our configuration, create reports, called results, of what was going wrong at a specific time. We install the K8sGPT operator with a Helm chart and configure it through the Helm values file. Additionally, with the operator we can create a Prometheus ServiceMonitor, if you use the kube-prometheus-stack, and a Grafana dashboard that visualizes the K8sGPT reports: what was going wrong in the past, and a graph of whether the problems are growing or we are fixing them. K8sGPT also has webhooks we can configure to notify us on Slack, and for those interested in platform engineering, there is a Backstage plugin for K8sGPT as well. That covers K8sGPT itself: we know how to install it, configure it, and use it.
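Put together, the setup from this section looks roughly like this (a sketch; the region is an example value, and repository and flag names may differ between versions, so check k8sgpt auth add --help and the chart documentation):

```bash
# Install the CLI (macOS; packages for Debian, Red Hat and Windows also exist).
brew tap k8sgpt-ai/k8sgpt
brew install k8sgpt

# List backends, then register Amazon SageMaker with its two required values:
# the region it was deployed in and the endpoint name.
k8sgpt auth list
k8sgpt auth add --backend amazonsagemaker \
  --providerRegion eu-central-1 \
  --endpointname <your-endpoint-name>

# For the in-cluster, always-on setup, install the operator with Helm instead.
helm repo add k8sgpt https://charts.k8sgpt.ai/
helm install k8sgpt-operator k8sgpt/k8sgpt-operator \
  --namespace k8sgpt-operator-system --create-namespace
```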
Now we'll switch to the SageMaker part and the LLM backends, which Mateusz will cover. Thank you, Damian. We'll talk about backends in general, and we'll also cover our contribution, the SageMaker backend. But let's go into backends first. What is a backend? K8sGPT by itself does not know how to resolve the issues it finds; it knows how to analyze and what to check for, but not how to fix things. The backend is what gives K8sGPT the information on how to fix those issues. The AI backends sit behind the explain command you saw: K8sGPT sends the findings to the backend, and the backend responds. As I mentioned before, it's powered by a large language model, so it receives text and responds with text, in our case, with how to fix the issue.

Now let's talk about the SageMaker backend itself and what we thought was important to share with you about our contribution: it increases the privacy of, and the control over, the model you use. The SageMaker backend architecture consists of a couple of elements. When we run the explain command, we are the person on the left side of the diagram, and we'll walk from left to right. First, we hit the SageMaker endpoint. This is an instance, or possibly several instances, that runs all the time, has the LLM on board, and responds to your requests. By the way, SageMaker is the AWS service for machine learning, which is why the name is repeated a couple of times here. So the SageMaker endpoint runs constantly with your model. Next, to the right, is the SageMaker model. The endpoint takes the model from the model registry, and the SageMaker model is an association of model artifacts, a container image, and potentially more metadata about the model. What does the artifact mean? The artifact is the weights of the model, the actual packaged model that we want to load into the container. The container image tells SageMaker how to serve the model. Together, the container image and the artifacts, plus any extra metadata, form the SageMaker model, and from the SageMaker model we create the SageMaker endpoint. What the SageMaker backend gives you is the power to control that endpoint. You can replace the artifacts and the image with whatever you'd like. You control the network, the image, the artifacts: essentially every bit of the stack.

In this picture, though, the request still potentially comes from outside the network. We can keep everything inside our network if we go the operator way: if our cluster runs on EKS, or simply in AWS, we can have the operator on board, inside our network, and then the request may never leave it. And again, you are in control of every single element of this diagram and can change anything you want.

Next, I'd like to explain how you can deploy this entire stack, this backend, yourself, and have it running in your own infrastructure. For this we need a short detour into the Cloud Development Kit. CDK is an infrastructure-as-code tool from AWS: with the command cdk deploy, it converts your code into actual infrastructure, actual cloud resources. Let's see an example. In this example I'm creating an S3 bucket; for those who don't know, S3 is the object storage service in AWS. If you write this code and run cdk deploy, CDK will work out what infrastructure you want and deploy it into your AWS account.
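A minimal CDK app along the lines of the slide's example (a sketch; the stack and bucket names are arbitrary):

```python
# Minimal AWS CDK app (Python): "cdk deploy" turns this into a real S3 bucket.
import aws_cdk as cdk
from aws_cdk import aws_s3 as s3

app = cdk.App()
stack = cdk.Stack(app, "DemoStack")
s3.Bucket(stack, "DemoBucket")  # object storage, created on deploy
app.synth()
```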
With this information, we can deploy our own backend with the LLM on board. You need a couple of steps. The first step is to prepare the environment: in our case we use Python, so you need Python, you need CDK, and you need to authenticate to your AWS account. Then you can clone the repo we prepared for you; it contains the sample code that actually powered our demo, the very same backend we are showing here. Then you install the project dependencies and run cdk deploy. After some time you'll see output like this; here it took about 460 seconds, so don't worry if it's not instantaneous. I want to highlight one piece of the output: the endpoint name. In your command line you will see the model endpoint's name, and, importantly, this string will be different for you: it's a unique name created for your endpoint during cdk deploy. This is the name Damian mentioned before that we need to specify when configuring the backend. If you go to your AWS account, you will see such an endpoint running.

So that's how you deploy it; now let's test it. There is a file in the repository, invoke SageMaker, that you can run with Python. The example here is the actual input and output I ran against this backend: the prompt asks for a Hello World program in Python 3, and the answer starts "Sure. Here's a Hello World program in Python 3," and it is indeed a Hello World program in Python. So you can verify that your backend is working by running this file, and you can change the prompt if you like.

Now that we know the backend is working, we can edit the deployment. As I said, you're in charge of the infrastructure; you can change anything you want. Let's look at two files, app.py and cdk.json. I have parametrized a couple of values there. First, the model package name, which specifies the model I deploy: I take an existing model package, and in cdk.json you can see I use an existing Llama 2 model. The exact model we used isn't the point; the point is that you can change it and run your own model as the backend. Second, we can specify the instance type that powers the endpoint; we provide a default for you that works well. And the third thing we can change is the instance count, which I set to one for starters. All these values you can change; the code is yours. You can go to GitHub, clone it, and change it to whatever you'd like. That already lets you edit quite a lot of the deployment, and if you want more, you can look at stack.py and change the infrastructure itself. Once we're done playing with it, we can destroy it so we don't keep the instances running, because the endpoint runs all the time: simply cdk destroy. It takes a moment, and after some time you'll see that the model endpoint was destroyed. That covers the end of the CDK lifecycle, and with this, you're ready to deploy your own backend.
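End to end, the lifecycle described above looks roughly like this (a sketch; the test script name is an assumed rendering of the file mentioned in the talk, and the repo URL is on the resources page):

```bash
# Prepare the environment, deploy, test, and tear down.
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt   # project dependencies
cdk deploy                        # ~460 s in the talk; prints the endpoint name
python invoke_sagemaker.py        # smoke-test the endpoint (file name assumed)
cdk destroy                       # remove the always-on endpoint when done
```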
So we can now go to the key takeaways; we're wrapping up, and there are three points we'd like you to remember after this talk. Damian, the first one is yours. Yes: codified DevOps and SRE knowledge. K8sGPT is not magic; underneath, it's codified checks and analyzers, and you can inspect each of them on GitHub. K8sGPT simply knows how to check whether a specific element is working well or not. That's the first point.

The second one is faster debugging powered by LLMs. This can cut your debugging time down by quite a lot, and the LLM part is also what saves you the searching: you no longer need to copy-paste anything anywhere, you can do it right in the CLI. And there's a third point I wanted to highlight: you can host it and control the model yourself. You don't have to use someone else's endpoint, and that gives you a lot of power. Those are, in our opinion, the key takeaways, the three points we'd like you to remember after this talk. Thanks a lot for your attention; we're open to questions. And one more thing: if you could scan this feedback form with your phone, it's very quick, three questions, all optional.

Let's take the questions. Yes, please. [Question from the audience.] Okay, so the question is whether you can actually reproduce this, and also whether you could change the model. The model we actually used is the one you would deploy if you clone the repository I mentioned: Llama 2. But you're free to change it. As for reproducibility: LLMs generate something slightly different every time, so you won't necessarily get the same answer, but you will probably get a similar one. So yes, you can absolutely reproduce the experience, though not necessarily the exact same solution text. We used the backend from the code we showed and shared with you; the presentation is linked in the schedule, and at the end there is a resources page with all the links, including the GitHub repo. And of course SageMaker is just one of the backends; you can even use a locally deployed model, your own model, as a backend, and there is a whole list of other backends you can use with K8sGPT.

Are there any other questions? [Question:] You just said we can change to any model we'd like, but how do you make sure the output is correct? So, how to ensure the answer is correct: well, you need to trust your model, so the first thing you can do is choose a model you're comfortable with. Beyond that, each model has configurable parameters; if we could go back to the config slide, there were temperature and top-K. There you can change how the model generates the answer, so that it only chooses the words it is more sure are the right ones. But yes, it's a generative model, so you need some caution and some judgment about whether what it says is actually true; it may produce incorrect answers. Essentially, you can always switch to a different model, one that you, your team, or your company will feel comfortable with.
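For reference, those sampling parameters can be set when registering the backend (a sketch; the exact flag names are an assumption and vary between K8sGPT versions, so check k8sgpt auth add --help):

```bash
# Lower temperature makes the output less random; a lower top-p makes the
# model sample only from the most probable tokens. Values here are examples.
k8sgpt auth add --backend amazonsagemaker \
  --providerRegion eu-central-1 \
  --endpointname <your-endpoint-name> \
  --temperature 0.2 \
  --topp 0.5
```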
Thank you. I think you mean this one, right? Sorry, top-p and temperature. Okay, yes, top-p and temperature: you can change them to minimize the hallucinations the LLM generates. And you can also use different backends at the same time, so K8sGPT can answer you from several models at once and you can pick which answer is best.

Are there any more questions? Yes, there's a question. Sorry, let me repeat it for the room: is K8sGPT able to work with K3s, the smaller Kubernetes distribution? I haven't checked it, but I think yes: if you can connect to it like a normal Kubernetes cluster with kubectl, then it should work without any additional configuration.

We have one more minute, so if there is any last question... Okay, I don't see one. Here are the resources, and if you haven't scanned the feedback form yet, please do; it really takes just a couple of seconds, and it would mean a lot to us if you could leave some feedback. Thank you very much.