Hello, everyone. I'm Daniel Huynh, the CEO of Mithril Security, and today's topic is how we can make AI more privacy-friendly.

I guess if you're here today, it's because you think AI has potential, and we see results in different fields, from medical imaging to biometrics to audio analysis and so on. But there is one big issue: how do we get access to data so AI models can get better, so we can reduce the false positives and false negatives that really matter if, for instance, you want to help screen for cancer? Sharing data is complicated, because that data can be very valuable and confidential. Because of that, security and privacy are among the biggest obstacles to AI adoption in sensitive fields like healthcare.

To see how this plays out, let's take an example. Imagine a startup that trains an AI model to screen for, say, long-term COVID from X-ray images, and now wants to deploy it to hospitals. The natural way is to go to the cloud, because it lets the startup focus on its core business and makes it easier to onboard hospitals, since deployment and maintenance are much simpler. But in that case, data is sent from the hospital to the cloud to be analyzed by the startup's AI model, and that raises some issues.

Let's see how that works. First, the data is collected on the hospital's premises. Then it is encrypted and sent to the application in the cloud managed by the AI provider. So far, this is the regular setup: the data is encrypted in transit, which is good. But the data is decrypted on the provider's premises and sits there in the clear, and it is at that specific moment that it is exposed, because you need it in the clear to analyze it. That means both the AI provider and the cloud provider have access to your data, and you no longer have control over it once it leaves your hands. Because this last mile has no privacy guarantees, it is hard to share data: it is very easy to leak information if you are the owner of the machine, and from the cloud provider's side it is easy to mount attacks and dump the data. Today you have few technical guarantees; you usually only get contractual guarantees, which are often not enough.

That is why we built BlindAI, an open-source, secure solution to deploy AI models inside what we call secure enclaves, so that people can benefit from AI without having to show their data in the clear. So what's the difference? Now the data is decrypted inside an enclave. Enclaves are hardware-based solutions; you can find them on Intel or AMD processors. The idea is that an enclave is a kind of black-box environment, a secure environment, where data manipulated inside is protected from the outside, either through hardware isolation or memory encryption. External attempts to dump the data will fail, so while the data is inside the enclave, it is in the clear but still protected. Inside the enclave you can then apply an AI model; you can do arbitrary computation, but in our case we focus on applying AI models. You get a result, re-encrypt it, and send it back to the data owner, the hospital, who can then decrypt the result and benefit from the AI model.
With this workflow, your data is encrypted most of the time. It is only decrypted inside the enclave, and while it is inside the enclave, it is protected, so you get end-to-end protection. The hospital benefits from a state-of-the-art model without having to install or maintain it locally, and still gets a very high level of privacy and security.

Our solution has two parts. There is a Python client that you can try, and a Rust server with an ONNX engine that runs AI models uploaded in ONNX format. ONNX is a standard format for exporting neural networks from PyTorch or TensorFlow, for instance. We package the server in a Docker image, and today we provide two modes of deployment: you can deploy the server image on-premise, using the simulation mode if you don't have the right hardware, or you can get the right hardware on Azure or GCP, for instance. But we saw that it is very hard for users to get access to the right hardware and configure it just to try things out, so we also provide a SaaS offering that we call the BlindAI Cloud. I'm going to show you how you can use the BlindAI Cloud today to deploy, for instance, a COVID-19 model with privacy guarantees for the users.

You can find it on the website; it has been live for about twelve hours, so I hope you're not going to crash it, but please do try. You register, we provide you an API key, and then you just pip install the client. The client uploads the model to the secure enclave, and then you can query it. That's the basic interface: it looks very simple, but a lot happens under the hood.

Let me switch to the demo. We have a cloud page where you can sign up to get an API key, and it shows you how it works. Basically, you take your model, you upload it, and then you can query it easily. I will get to the security details later, but you can actually verify the Python client and the server image to make sure we provide the security we claim to provide. There is also a list of models you can try to see how it works with our solution.

Here we'll deploy a COVID-Net model. It was published, I think, three years ago to help screen patients for COVID, and you can guess the data involved is quite sensitive. There is a Colab notebook to reproduce these experiments. First you install some dependencies, including the BlindAI library. Here I take a model that is already trained and already exported to ONNX format, but in the other examples you will see the full path from training the model to exporting it to ONNX. Once you have the model, it is pretty simple: I have already loaded the API key, so I just connect to the managed cloud. If you deploy it yourself, you just provide the address of your local deployment, but here, by default, it goes to the cloud. The model is uploaded, and now I will download an image to analyze.
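To recap the client-side steps we just ran, here is a minimal sketch of the flow. The class and method names (EnclaveClient, upload_model, predict) are illustrative assumptions, not the actual BlindAI API; check the project documentation for the real client calls.

```python
# Minimal sketch of the client-side workflow against an enclave-backed
# inference service. Names are illustrative stand-ins, not the real BlindAI API.
import numpy as np


class EnclaveClient:
    """Hypothetical client for an enclave-hosted inference service."""

    def __init__(self, api_key: str, addr: str = "cloud.example.com"):
        # In the real flow, connecting also performs remote attestation and
        # sets up a TLS channel that terminates inside the enclave.
        self.api_key = api_key
        self.addr = addr

    def upload_model(self, onnx_path: str) -> str:
        # The ONNX model is sent over the attested channel and loaded by the
        # enclave's inference engine; an identifier for the model comes back.
        print(f"uploading {onnx_path} to {self.addr}")
        return "model-id-placeholder"

    def predict(self, model_id: str, inputs: np.ndarray) -> np.ndarray:
        # Inputs are encrypted in transit, decrypted only inside the enclave,
        # and only the prediction is returned.
        print(f"running {model_id} on input of shape {inputs.shape}")
        return np.zeros((1, 2), dtype=np.float32)  # placeholder prediction


# Usage: register on the cloud page to get an API key, then roughly:
client = EnclaveClient(api_key="YOUR_API_KEY")
model_id = client.upload_model("covidnet.onnx")
scores = client.predict(model_id, np.zeros((1, 480, 480, 3), dtype=np.float32))
```

The point is just that, from the data scientist's side, it stays a couple of calls: connect, upload the ONNX model, run predictions.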
So here is an image I want my AI model to analyze. Then there is some preprocessing, nothing new, we can look at the image, and then we send the data to the secure enclave for it to be analyzed. Pretty straightforward, the usual stuff. We get the result, and here it is a positive label. You can also run the model locally in ONNX Runtime, and it gives the same predictions; I'll come back to that local check in a moment. As you can see, it is very straightforward. It doesn't look like much is happening, but I will dig into the technical details of what happens behind the scenes. With two lines of code, you can make your AI model privacy-friendly and give your users guarantees that you analyze their data without being technically able to see it. I think that's pretty cool.

Let's move forward. We have tested our solution with different models, like YOLOv5 for object detection, which can be a pretty big use case, for instance in smart-city scenarios where you want to detect potentially dangerous objects in streams of people passing through. We also made it work with transformer models, which is cool: we have actually run a GPT model with around three billion parameters from EleutherAI inside the enclave, so you can run big models. And we have a client we help analyze biometrics with privacy guarantees.

As for performance, we haven't fully optimized it yet; we wanted to make it work end to end and be secure first. But it is pretty decent if you compare it to, for instance, homomorphic encryption or secure multi-party computation, which are on the order of a thousand times slower. We can still optimize, but in the worst case you see roughly a 20% slowdown, which I guess is acceptable in most scenarios.
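Going back to the local check I mentioned: you can run the same exported model with ONNX Runtime on your own machine, both to confirm the enclave returns the same predictions and to get a rough local baseline for the overhead numbers above. Here is a small sketch using the onnxruntime package; the file name and the random dummy image are assumptions standing in for the real preprocessed X-ray.

```python
# Local baseline: run the exported ONNX model with ONNX Runtime and look at
# the predicted class. Compare this with the label returned by the enclave.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("covidnet.onnx")  # assumed file name

# Read the expected input name/shape from the model instead of hard-coding it.
input_meta = session.get_inputs()[0]
print("model input:", input_meta.name, input_meta.shape)

# Build a dummy batch from the declared shape (symbolic dims become 1);
# in practice this would be the preprocessed X-ray image.
shape = [d if isinstance(d, int) else 1 for d in input_meta.shape]
image = np.random.rand(*shape).astype(np.float32)

outputs = session.run(None, {input_meta.name: image})
logits = outputs[0]
print("predicted class:", int(np.argmax(logits, axis=-1)[0]))
```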
Now, how does it work behind the scenes? What we use is called confidential computing. Confidential computing is a field where hardware-based solutions are used to protect data even while it is being analyzed by third parties. There are two main properties that are very interesting. The first is that your data is protected while it is being analyzed, protected from external parties that might want to see it, for instance the cloud provider or the AI provider. The second is something called code attestation, which gives you a kind of cryptographic proof that only code you know and trust will be executed on your data, even before you send the data. I will show you how both work.

The first part is data protection during analysis: how do I analyze your sensitive data without having access to it? I create an enclave, the data is sent in encrypted and decrypted inside, and thanks to hardware isolation or memory encryption (it depends on the technology), external parties are not able to see what happens inside. That's the first property.

The second is: fine, you say you use an enclave, but how can I know you actually do? If I claim to use an enclave but don't, what kind of guarantee can I provide users? That's why it is key to have remote attestation, which lets you verify two things remotely: first, that the remote party really is using a secure enclave, and second, that they put the right code inside. Because if I put something like a print statement inside my enclave that leaks the data, what guarantees do I get? None. Even if I use an enclave, if I put malicious code inside, it doesn't help. Code attestation lets you check both.

So how does it work? Say I want a third party to analyze my medical data. The first thing I say is: show me that you are an enclave running the right code. The idea is that these specific CPUs, the hardware-based solutions from Intel or AMD, and soon NVIDIA, have embedded hardware secrets and specific instructions. They can create a kind of report that says: I am an enclave, and this is the software and firmware loaded inside. They compute a hash of the code loaded inside the enclave, derive a key from the hardware-based secrets, and sign the report. This way you have a report that says: I am an enclave with the right properties, with a signature that cannot be forged, because it is based on a tamper-proof hardware secret that you cannot recreate from the outside. Then you send that attestation to the hardware provider. Imagine you use Intel-based enclaves: Intel has a list of keys linked to their processors, so they can say, that is one of mine, because only someone using my enclaves can generate that signature; this party is who they say they are. That answer goes back to the client, who says: Intel told me this is indeed a secure enclave I am talking to, so I will trust it. The hardware provider is still among the parties you need to trust, but everyone else, the cloud provider and the AI provider, is no longer in your trust model.

And finally, how does it work in practice? I want to send my data to be analyzed by this AI model in the cloud. I first ask: show me your enclave. The enclave generates an attestation and sends it to Intel; Intel confirms it is genuine and sends it back to the host. The host adds some information to set up a TLS channel that terminates inside the enclave and passes everything back to the client. The client verifies it is talking to an enclave with the right code, finishes the handshake, and ends up with a TLS channel that terminates inside the enclave. At the end of that process, you know you are talking to a secure enclave with the right code, and you have a TLS channel that ends inside the enclave, so now you can send data to be analyzed. So that's roughly how it works.
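To make the attestation step more concrete, here is a deliberately simplified, toy sketch of the client-side check. In reality the measurement sits inside a quote signed by the CPU and verified against the hardware vendor's attestation service; this sketch only illustrates the "compare the reported code measurement against the one you computed yourself" step, using hashlib as a stand-in.

```python
# Toy illustration of the measurement check in remote attestation.
# A real attestation quote is signed by the CPU and verified against the
# hardware vendor's keys; here we only mimic the hash comparison step.
import hashlib
import hmac


def measure(enclave_binary: bytes) -> str:
    """Stand-in for the enclave's code measurement (a hash of what is loaded)."""
    return hashlib.sha256(enclave_binary).hexdigest()


# Build (or reproducibly obtain) the enclave code yourself and measure it.
trusted_build = b"...enclave binary compiled from the open-source code..."
expected_measurement = measure(trusted_build)

# Pretend this value was extracted from the remote party's attestation report.
reported_measurement = measure(trusted_build)

# Constant-time comparison, then decide whether to open the channel.
if hmac.compare_digest(expected_measurement, reported_measurement):
    print("Measurement matches: finish the TLS handshake into the enclave.")
else:
    raise RuntimeError("Unknown code inside the enclave: refuse to send data.")
```

A single changed line in the enclave code, like the malicious print from the demo coming up, changes this measurement, which is exactly what lets the client refuse to talk to it.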
Now for the fun part: you can try this yourselves. Do you know Gradio? If you want to make a demo that looks cool, it's a good tool; it's pure Python, a bit like Streamlit on steroids. I'll give you 30 seconds to try it yourself, and I will also walk through it. It is hosted on Hugging Face. There is some text explaining how it works, but this is what the future could look like. Our goal is to become the HTTPS of AI. Today, when you go to a website that doesn't have the right security features, your browser tells you: this is not secure, don't send your data there.

To show how that translates here, I set up three servers. The first one is a regular server, not a secure enclave, so you have no guarantees; our Python client sees that there is no attestation proving it is talking to an enclave, so it refuses to send data.

The second one might be using a secure enclave, but with malicious code inside. As I said, the code inside the enclave is hashed and signed, so you have proof of exactly which code is behind it. We have pre-computed the hash we expect, shown here, and we observe this other one, and it cannot be forged, because otherwise the certificate would not validate. So we are talking to a secure enclave, but the right code is not behind it. I will publish the code to reproduce how I got to that specific hash; the only difference is that I put a print statement inside the enclave. Just with that single line, my enclave is malicious, and I can detect it. And if you don't trust me, you can take our code, compile it yourself, get the hash, and check that it matches what is measured remotely. You can also look at the code behind the demo itself.

The third one is a server set up with the right code, and I know exactly what code is behind it thanks to remote attestation. For instance, I have a GPT-2 model inside that I can use. I'm not saying the output is great; that doesn't depend on me, it's just GPT-2. But it does work. So that's a bit of what the future might look like: HTTPS for AI, where we know we only talk to services that use secure enclaves, so we know they cannot sell our data and our data is protected. That's what we are trying to build.

Just to give you a bit of background: you might not have heard of confidential computing because it is fairly recent, but the hardware providers are accelerating. Intel was one of the pioneers, then AMD got into the game, and this year NVIDIA announced that the H100 will have secure-enclave capabilities. So far it was only CPUs, but with GPUs we can do the real stuff, very interesting things with training and deployment. There is already a private preview on Azure where you can start playing with confidential GPUs; it is not fully functional yet, but it is very promising. And as you see, all the cloud providers are jumping on this and have confidential-computing offerings.

One last thing before we conclude. What I showed you is BlindAI, which is a deployment solution, but we also have something called BastionAI, which is a training solution. It will leverage the new GPUs, so you will be able to train AI models on confidential data with confidential computing. It is available in alpha; it is still rough, but it works. You can fine-tune a BERT model on sensitive data and inject differential privacy by design. And if you know differential privacy, the interesting part is that if you add noise in a centralized manner, you need much less noise for the same privacy budget: in a decentralized setting, in federated learning, every participant has to add a lot of noise for a given budget, whereas centrally you add much less.
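To give a rough feel for that gap, here is a back-of-the-envelope sketch comparing the noise on an aggregate statistic when one trusted aggregator (for instance, running inside an enclave) adds Gaussian noise once, versus every client adding the same calibrated noise locally before sending. It is only an illustration of the intuition, with made-up numbers, not a full differential-privacy accounting.

```python
# Back-of-the-envelope: noise on the aggregate under central DP (one noise
# addition by a trusted aggregator) vs local DP (every client adds noise).
import math


def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    # Classic Gaussian-mechanism calibration: sigma = sqrt(2 ln(1.25/delta)) * s / eps
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon


n_clients = 1000
epsilon, delta = 1.0, 1e-5
sensitivity = 1.0  # each client's contribution to the sum is bounded by 1

# Central DP: noise is added once to the aggregate.
central_noise = gaussian_sigma(sensitivity, epsilon, delta)

# Local DP: each client adds the same calibrated noise; independent Gaussian
# noise adds up in variance, so the aggregate noise grows like sqrt(n_clients).
local_noise_on_aggregate = central_noise * math.sqrt(n_clients)

print(f"central noise std: {central_noise:.1f}")
print(f"local-DP noise std on the aggregate: {local_noise_on_aggregate:.1f}")
# With these numbers the locally-noised aggregate is about sqrt(1000) ~ 32x noisier.
```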
So that's very interesting. The other good thing is that with confidential computing you don't need every node to have the hardware and software to run the training, because you can offload all the computation to the secure enclave in the cloud. That means you can have confidential training as a service: you no longer need to go to each hospital and set up their CPUs, GPUs, software and so on, which is a pain. You just upload and still have a very high level of protection. So that's the cool part.

So yes, we got started recently. If you're interested, BlindAI is quite functional; if you can give it a star, that would be great. BastionAI is earlier stage; my co-founder told me not to show it today, but I showed it anyway, and it does work. I'm still new to the community side of things, so if you're interested and want to know more, I'm happy to chat. But that's it. Okay, questions?

The first question is whether the enclave runs as a VM, for instance from an OVA image. There are two approaches today. Intel SGX is application-based, and AMD SEV is more VM-based, so they differ in how much you put inside the enclave. With Intel SGX you put very little inside, just what you need; there is no OS, so you trust very few components. That is good from a security point of view, but it is a pain, because you need to reinvent the wheel since you don't trust much of the usual stack, which makes it hard to develop applications with Intel SGX. With AMD you can put a whole OS inside, so development is easier, but then you need to verify that the OS itself doesn't have issues. So there are different approaches like that.

The next question, which I should repeat for everyone, is what kinds of workloads you can put inside, for instance Docker images or Python scripts. The biggest issue in privacy-enhancing technologies, not just here, is that you want something that is fast, secure, and easy to use; that's the holy grail. If I let people upload arbitrary Python scripts, it is very hard for me to guarantee they are not trying to leak data in some subtle way. That's why we restricted it to ONNX models initially; then we will cover PyTorch and progressively bigger, more expressive frameworks. Imagine the setup: there is the AI company, and a third party that wants to use the AI model. The AI company uploads its workload into our enclave. But what if they upload a malicious workload, some hidden backdoor? Maybe not an obvious print statement, but there are side-channel attacks that can leak information to the outside even without grossly printing it, and secure enclaves are not perfect; they are vulnerable to side channels. So if I allow arbitrary code, some people will find ways to leak information, and if I allow full Python scripts, they have thousands of ways to leak information to the outside.
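To make the "we only accept ONNX models" restriction concrete, here is a sketch of the kind of check a launcher could run on an uploaded model before accepting it: list every operator the graph uses and reject anything outside a reviewed allowlist. This uses the real onnx Python package, but the allowlist itself is made up for illustration and is not BlindAI's actual reviewed set.

```python
# Sketch of an operator allowlist check on an uploaded ONNX model.
# The allowlist is illustrative; a real one would match exactly the operators
# the enclave's inference engine implements and has reviewed.
import onnx

ALLOWED_OPS = {"Conv", "Relu", "MaxPool", "Gemm", "Add", "Flatten", "Softmax"}


def check_model(path: str) -> None:
    model = onnx.load(path)
    used_ops = {node.op_type for node in model.graph.node}
    rejected = used_ops - ALLOWED_OPS
    if rejected:
        raise ValueError(f"Model uses non-allowlisted operators: {sorted(rejected)}")
    print(f"OK, only reviewed operators are used: {sorted(used_ops)}")


# check_model("covidnet.onnx")
```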
So to us, arbitrary code is not viable in the long run, because what's the point of a secure enclave if you lose the security? That's why we restrict things. In the current design, because we only allow ONNX models, we can provide a high level of security to the end users: whatever the provider uploads, the only operations available are the specific ones we have reviewed and made side-channel resistant, and nothing else. So we can give much better security guarantees, and it is still easy for users to work with. As for the code inside the enclave, I haven't looked at that part for a while, but that is where the data arrives and where you could, for instance, put the malicious print from the demo; it is open source, so you can have a look at the code yourself. Other questions?

Another question: does the key change each time, or is it the same for each encryption and decryption? To be honest, I didn't implement that part myself, but I believe it is session-based for now. You could do it either way depending on what people want, but in our case I think it is per session. If you have more questions on that, you can ask on Discord or open an issue; we are open to that.

There was also a question about supporting different hardware. Yes, we try to be modular. Attestation is very hardware-specific, so we try to keep some pieces common, but some parts have to be hardware-specific, mostly attestation, and sealing can differ too. The core computational part is largely similar, unless you want to do acceleration and use, say, AVX-512, in which case you might also need hardware-specific work in the computation part. But we try to make it modular enough to accommodate different scenarios, including the code that runs inside the enclave. To be honest, I am a data scientist who got into security, but my security colleagues can answer pretty much all of these questions in more depth.

Yes, did you raise your hand? The question, which I forgot to repeat, I hope that's okay, is about the use case: with medical images, the nice thing about sending them to an enclave is that the server can't see the image, but often the AI providers don't want to give their model away either, because it is valuable, so how do you do the attestation if you don't actually know the code that is running? In other words, how do we know the code that is inside the enclave?
As I said, there is an attestation mechanism. Before sending any data, you cannot look at the Python code running on the other side, but the enclave sends a certificate, and you check two things: first, that it is actually signed by the hardware provider and linked to that specific enclave, and second, what the hash is of the code loaded inside (I said code, but there are other properties too, like the firmware). That is also why we are open source: the AI providers just provide their AI workloads, for instance in ONNX or another format. Actually, I didn't mention it, but both the data and the model are protected, because the model is uploaded inside the enclave as well. What is open source is our launcher, so people can verify it, and because we restrict what it can do, whatever payload people put inside, you don't have to worry, because we restricted the expressivity of the launcher.

Other questions? Is that it? Thank you. If you can drop us a star and come see us on Discord, that would be cool. Thanks.