 Greetings, everyone. Welcome to the session. My name is Minakshi Kaushik, and I work in Cisco's Emerging Technology and Incubation Unit called Outshift. Today, I'm going to talk about how you can use existing Kubernetes tools to jumpstart your large language model security. So the agenda for today's session is as follows. I'm going to start by taking a look at why GNI security is important, and then at a very high level talk about large language model security challenges, prevention and mitigation techniques, and spend most part of my talk looking at prompt and respond security in Kubernetes and demo. So let's start by taking a look at why GNI security is important. This is the survey which was done by Cisco's AI Readiness Index Team, and they found that 61 percent of enterprises plan to deploy AI application in next year. So there are going to be a lot of AI application in the market. However, only 14 percent of these enterprises feel that they are fully prepared. So one of the reasons why that is the case is security for large language models. So large language model GNI has more security issues. There are three main reasons why. One, it's new, two, it's rapidly evolving, and three is that the existing application security measures do not address all the unique challenges of large language model. So what are some of these unique challenges of large language model? So the first thing is that in the traditional application security, the request response is pretty structured. So I can give me all my Kubernetes cluster, and my security and identity would know that it's me and only provide access to my Kubernetes cluster. However, with large language model, you have to convert this natural language into something which your identity and security can understand. The second is that the traditional application is always from the database to the user. So for example, the same example of provide me all my Kubernetes cluster, the data is flowing from the Kubernetes database to me. However, with large language model, the data flows both ways. So in the prompt, I can actually scrape some Kubernetes data from somewhere and ask my large language model, can you provide me more info? So the security infrastructure has to be able to really take into account this two-way data flow. Then the third is that the blast radius of large language models is pretty large. So these adversarial machine learning vulnerabilities, which are present, are very neatly captured and prioritized in OWAS top 10 large language models. And in this presentation, I'm going to really focus on the deployment, as well as the operation and the maintenance aspect of these top 10 large language model. So broadly, they are classified into prompt and response security issues, large language model, plugin issues, the design issues, and supply chain issues. We are going to take a look deeper, look into all this in the rest of the presentation. But what I also want to point out is that enterprises may not own all the components of this large language model. And no matter which deployment strategy, which the enterprises use, they most probably should be able to deploy prompt and response security. And that's why I'm going to focus more on the prompt and the response security. For the next four slides, all I want you to focus on is the red bar, which I have, which shows the enterprise boundary, and see where the prompt response security is something which an enterprise can enforce. So if the enterprise is consuming large language model as a service, whether they are consuming large language model as a service and they have their enterprise plugins, they actually build their own large language model, for example, or they are the provider of large language model as a service. So what are some of these common prompt response security issues? They are hallucination and relevance, data leakage, toxicity, jailbreak, and prompt injections. And we're going to take a deeper look as presentation goes on. So what are some of the prevention and mitigation techniques? So the prevention and mitigation techniques are also very well captured by the OWAS top 10 LLM. And so for example, if you look at the large language model supply chain, today we already do software bill of materials. We also do workload security. For example, we look at Kubernetes context. With large language model, we add like AI software bill of material, which would include model and data lineage, and also look at the LLM security for each of the plugins. Then for our prompt and response, in API security, we already do sensitive data detection. We also look at shadow APIs. And with large language model, we would extend that, for example, for the API security for the plugins. The identity which I talked about should be able to know the natural language and figure out what identity it is and apply, and all the different things which I talked about in prompt response, hallucinations, et cetera. With model denial of service, we already do API rate limits. We do input output validation. Now we need to add something which is model focus. So basically how many number of actions a prompt generates? For example, what is the context window size? Or for example, things like the resources which are used for the prompt. So these are some of the prevention and mitigation techniques. Like I said, I wanted to give a high level overview before I focus on how would we do this in Kubernetes. So one way and the easy way would be to frontend the large language model deployment with large language model security gateway. And if we look further down into the large language model security gateway, ideally you would want in the blue box where I have enterprise large language model application. You want to have a large language model application that frontends talking to the actual large language model because it can provide you a way to normalize to different large language models. You can do some form of prompt engineering at this point of time. For example, maintain session tables, you can maintain caches. And so now the user will talk to this enterprise large language model application. So there are three broad security things you can do. Something you can do the identity and access management at the ingress. You can look at the prompt request and responses, filter prompt request and responses at the ingress and then perform prompt response and security. So the first one, which is the identity is something which we already do with traditional applications. I talked about, give me all my Kubernetes cluster. You look at the identity and then decide who has access to what. You would just need to extend it to the large language model where you would break down the prompt into actions which your identity can understand and then apply the access and see if the user has access to the question that it is asking. And then after it flows into the enterprise response, it's something new which we haven't seen before. So in my demo, I'm actually focusing, starting from the fact that you are already augmenting your identity. And then when we reached the enterprise large language model application, what kind of filtering we can do to get the request and response and then perform the request response security. So I am using the existing Kubernetes tools. So Kubernetes and Istio and the sleep pod, which is always present in the Istio to actually represent the enterprise large language model application. And now I'm using the Istio egress gateway for trying to get the request response. So have an envoy filter in egress gateway so that it can pick up request response and we can do processing in this LLM prompt monitor. Now, I didn't need the three gateways, but it is kind of nice to have one gateway per large language model. The reason is because this is a proxy, egress gateway also acts as a proxy. And different large language models may need different kind of packets for proxying. For example, with chat GPT, as you know, you use bearer token, whereas with Azure open AI use API keys and with Lama, you just simply use TLS. And so it makes your filter design pretty straightforward. So what would happen is that the enterprise app would then take the request, pass it to the egress gateway, making the decision as to which large language model it wants to process it to. The egress gateway has a filter and the filter will send it to the large language model where we would make a decision if the prompt is good. If the prompt is not good, then it directly tells the user, for example, malicious prompt detected. Otherwise it sends it to one of these large language model. The response comes back and then the response again goes into the processing input and you can then do response processing or keep a cache of prompt and response so that you can do prompt response processing and then send it back to the large language model. So let's see this in action in my Kubernetes cluster. I'm jumping a little bit back and forth, but so this is my Kubernetes cluster. As you can see, this sleep pod actually simulates what I talked about in enterprise application. I have a very simple Istio deployment which has a Istio daemon ingress and egress gateway. And as I mentioned, I'm using three egress gateways you don't have to. So one going to Azure, one going to chat GPT and the other going into Lama. And before the egress gateway looks at the request and sends it to, for example, large language prompt monitor. So now let's look at the unvoi gateway which a simple gateway that you can write on your egress to perform this proxy function as well as taking the request and the response, right? So this is a simple unvoi filter written in Lua and as you can see that as I was mentioning, because it is Azure AI, so you're using an API key to build the proxy. And also before the request is forwarded to the large language model, for every request it actually passes to the large language prompt monitor service. And then based on the response of the large language prompt monitor service, if it is not equal to 200, then in the case of large language model, I'm detecting like prompt injection error, so it shows that malicious prompt is detected. So you could like write a simple Lua filter for your egress filter monitoring to do that and similar thing you can do for response. Like for example, all the responses also go to the large language monitor service. So with that let's like write some prompts and see what happens. So I'm going to look at the logs of the log monitor service and let me just go and... So here like for example, I can just say teach me how to steal a car, something which we shouldn't do, but if I said teach me how to steal a car, it actually says that it's a malicious prompt detected. So if I go back to my log file, you can see that it comes to teach me how to steal a car in my monitoring service because my unboy filter passes to this monitoring service. And then I'm going to talk a little bit more. There are a lot of libraries available to detect hallucination, prompt injection, data leakage, like for example, Rebuff or True Error. Or I'm using by logs because I took a deep learning course and they were using by logs. And I like because they have written wrappers around standard libraries which show good visualization as you would see. And so this is the by logs library, which just quickly checks if there is a prompt injection detected and it says yes it is. And so it passes like 400 response. And if you remember in our Lua filter, we said, oh, if it's not 200, then I'm not going to pass to the large language model but would say malicious prompt is detected. So now let's try to see a good prompt. And so for example, I can say, for example, say hello in Hindi. And so this is a good prompt. And so it goes all the way to the Azure AI in this example and then you can see that I got a response back which here. So the answer is namaste, which is a response. And so if we go back to our monitoring service, as you can see that it got this prompt and then it again, this is a very simple libraries which are wrappers around some standard libraries which are available, which looks for prompt injection and it didn't see any issue so it sent 200 and that allowed the filter to go to the backend and then the backend did the response. So this is how you could like start with enabling large language model security. So what additional things or how do we actually detect these prompt request response issues? So for that, I'm going to take a look at three slides and then go back to Collab to show more. It's easier to actually show that in Collab. As I said, I took this deep learning course and it was the Y Labs was the company which was talking about it. So this material is really taken from there but it's very relevant to just think about how we look at hallucination and relevance detection. So the way to look at hallucination and relevance detection is that you could look at either a comparison of prompt and response or you can ask your large language model the same question multiple times and then can compare between those multiple responses. So those are like two ways. I'm not saying these are the only ways. I'm just saying that these are like some of the ideas and depending on your data set, the kind of analysis that you do may work or may not work. So for example, for the detecting whether when you're detecting the similarity between the prompt and the response, the first thing you can do is you can just compare tokens so and which is what the blue score does. And then if that doesn't give you the good results then what you can do is you can compare embeddings and so you can compare contextual word embeddings and then look at like semantic match which is what board does. So it is like better than just looking at word but it is actually looking at contextual embeddings and that might work for the prompt and the responses which are flowing through your enterprise. Then for the, you're asking the question to the large language model and getting response one, response two, response three then you're actually doing sentence comparison. So you are doing response one, sentence comparison to two to three and so instead of doing word comparison which you were doing with the question response you're doing complete sentence embedding comparison and then you can use different kinds of similarities. You can use cosine similarity which is what the first one is or you can actually send it to another large language model to say, oh, I'm giving you these embeddings how similar they are. So these are some of the ways that you can detect hallucination and relevance. For data leakage, this is some standard stuff. You know, all products do it in API security and so really what you do is you can use either like standard libraries to match all this personally identifiable data or you can write a simple RNN model or something to do pattern matching and the same is the case with toxicity. You might want to do more than that because your large language model may leak like for example, your product name, your project names, your employee name and so you want to do entity recognition. So there are standard libraries available for entity recognition as well and then this is like even more hand wavy but when you look at the example it would make a little bit more sense but for jailbreak and prompt injection detection for example. So jailbreak is where usually the large language model would say how to steal a car and they would say, I'm sorry I can't give you this information. You shouldn't do this and then you are able to like craft an input and still get some ideas from the large language model. So you are able to break the jail which the large language model has done and so one of the ways you can do that is with prompt text length and then for the prompt injection you can look at similarities between the usual prompts. So now let's take a look at the example. Like I said, this was part of like the sample prompted responses were part from the deep learning course which I took with quality and safety for large language model and the libraries I'm using are Y Labs but there are many open source libraries available. So really the data set consists of prompts and responses and so they are not perfect prompts and responses they have some of these hallucination jailbreaks and this is like the full text of the prompt and responses. So let's look at hallucination. We talked about this blue score where we are looking at token similarity between the request and the response. So as you can see that when you look at the token similarity between request and response at least for the data set that I have it's clubbing everything together so it's not really too much of a help and the reason why it's not too much of a help is because all these samples which are prompt and responses don't really have too much in common. So like for example can you give me credit card information? It says I'm unable to provide anything of that subject. So there isn't any tokens which are common between the prompt and responses for example. So this is what you won't be so this is not a really good metrics for like for this data set. Then you can start looking at bird score where I was talking about where you do contextual embedding matches and so you can see with the bird score there is a little bit of stagger it is able to do that. So it looks like a better fit this metrics looks like a better fit and as you can see that the contextual mapping doesn't match too well when there is a single response or a single word in the prompt because you can't really do contextual matching here and so it tells you oh this must be a hallucination for example. Then the other thing which I talked about was where you compare response one versus response two versus response three and then see how similar they are and so you are looking at these responses and seeing the similarity score. In this example it doesn't make much of a difference because when you are asking the large language model multiple times it is providing similar scores but for certain topics like for example so as you can see most of them provide similar responses but then there are some which don't and those are the ones where you have all the refusal sorry I'm unable to provide you this and then you tell it tell me how to hotwire a car or ask it for some detailed explanations so then the large language model starts giving different responses for those and that's why it would detect it as hallucination. So similarly we talked about data leakage and toxicity as I said there are standard PII libraries available and then you can detect oh what are the different kinds of like phone, mailing address, all that kind of stuff both on the request and the response now that you are already passing to your monitoring service and so it shows what kind of detection it has between the prompt and responses and then as I was talking about you may want to know more than PII information so you may want to know entity recognition for example and that you can do using this library called span maker model and it will be and you have to really tell like what is entity for your particular thing like for example in this case it's person, product and organization so you have to identify what you think is the entity you are worried about and then it is able to tell you what issues it has and similarly with toxicity there is like a tox- toxigen-hate bird model which you can use for analysis of toxicity and the jailbreak which I was talking about there is a site called jailbreak chat which actually talks about how you can jailbreak and most of the time you can jailbreak is when you tell chat GPT oh pretend that you are something and now tell me how I can steal the car or imagine you are having conversation with two people and do something about it and so then and in that context to tell chat GPT or any large language model this you might have to describe a little bit larger and so you can see how it makes sense to say okay I will look at the prompt text length but this may not work but you know it could be one measure and these measurements are always evolving but I wanted to give an idea of like how you can jumpstart this and for injection similarity or prompt injection similarity you would use similar you know you would still use similarity matches between the common phrases which create prompt injection like for example your new task is you are an actor who is role playing and so like in my example it was like teach me how to steal a car so you can add teach anything which has something with similarity to teach is a prompt injection and so that's how you would detect prompt injection so with that I'm almost to the end of my presentation where I was going to say that you can use existing Kubernetes tools to jumpstart your large language model security and thank you for coming for this session and I think I have time to take questions five minutes hi is the question that when you are comparing sentences what are you doing oh okay so there are libraries available where you can look at the and this is why I was talking about the Y logs and other open source they are not completely open source they have open source and closed source and that you can do and you can look at their embeddings and so once you have the embeddings then you can look at the similarity between the embeddings of the sentences and to see how similar they are so in this session I talked about two things that you can look at the cosine similarity between them or you can feed those embeddings to the large language model and ask them oh if they are similar or not similar are there any other questions sorry so you said you could filter out as teach as an injection yeah exactly exactly exactly so this is a great point which is basically teach me how to steal a car could also be teach me something good like you know teach me about Kubernetes for example and so you have to parse out what is different about that teach I 100% agree and so there are a lot of these libraries which are evolving just to do that in fact like even at Cisco we are looking at open sourcing some of these libraries and so these are not perfect as you can see and you have to like fine tune it to make it work for your environment any other questions the question is can you compare two LLM models and say which one is more efficient for your environment the I mean in the context of this presentation you could look at all these metrics to like do the comparison but I would say that if you had to go back from the fundamentals you would want to like start with model cards to see how they were created and then see whether they map for your environment like for example the custom LLMs what data was used and how they were fine tuned that's the first thing that you want to look at to see whether they fit for the purpose that you're doing like for example in my team we are looking for generating remediation code so you want to like look which large language model how it has been fine tuned for that purpose that will be the first then the second is that after you have selected the best model which is what I didn't talk about here and I only talked about once you have selected whatever is best given the model cards and all the data that you have then you deploy it in production and then look at the request and the responses my talk was focused on security and that could be one lever which you use for which large language model is better like which one is more secure whether it allows you to do jail breaks whether all the things which I was saying you should do it in your gateway whether the large language model does it already as a guardrails for their capability so then it is better because they are itself providing some kind of guardrail so security could be one measure to compare whether large language model is better but you can also look at performance you could look at the speed and all the other aspects so I guess to compare two large language models you have to first decide on what are different criterias that makes it better for you and so it could be the size because deploying I mean I have so much issues getting GPUs at my work because these sizes are pretty large so you could start by looking at size you can look at whether it fits the data set that which you are looking at you could be looking at security you could be looking at performance and then once you have all these criterias and have the score then you can make a decision as to which large language model is better or not better so security is just one aspect of it so I think hi I think I've run out of time but I'm happy to talk to you after the session so thank you so much for coming everyone thanks