Hello, I'd like to thank AI.dev and the Linux Foundation for hosting this talk, as well as you for being here and the folks on the internet. So, AI penetration testing: I've been doing this for a couple of years now, and I've been doing penetration testing for longer than that. Oh, I forgot to introduce myself. My name is, did you hear the echo in that? Okay, we'll get into that in a little bit. Anyway, I've been doing penetration testing for several years.

Hacking models and hacking with AI is sort of the breakdown of what AI penetration testing is. Some of the popular attacks today are prompt attacks, extraction, backdoors, exfiltration, poisoning, and adversarial attacks. I was looking for something more entertaining for me as a red teamer, a pentester, and I decided to focus on some multimodal attacks.

The example I've come up with here today is file fuzzing against voice to text. Here I'm using a great model called Whisper. I have a regular file that I created, and it does a great job: "This is a test. Can't hear me. This is another test." I tried adding some background noise and crosstalk to see if it could still pick up the text, and it did a great job.

But then what I started doing was flipping the bits within the audio file itself. You can use fuzzers for that. AFL++ is a great one. Here's an old one, zzuf, I guess. So I'm bit flipping. There are magic bits, and it's very easy to corrupt an MP4. In the first example I have here, I get a signal 9, memory exceeded error. It's just fuzzing with these offsets, flipping bits, and stuffing the result into the Whisper multimodal app. It creates a signal 9, and then there's a stack trace, which goes to transcribe. And then the ultimate kicker is: oh, it's calling ffmpeg.
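
The bit-flipping step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling used in the talk (which mentions AFL++ and zzuf). The names `flip_bits` and `describe_exit`, and the `skip` parameter for leaving the first bytes of the file untouched, are hypothetical and exist only for this example.

```python
import random
import signal


def flip_bits(data: bytes, n_flips: int, seed: int = 0, skip: int = 0) -> bytes:
    """Return a copy of data with n_flips random single-bit flips.

    skip (hypothetical parameter) protects the first `skip` bytes, e.g. a
    file header, so the mutated file still looks like the original format.
    """
    if skip >= len(data):
        raise ValueError("skip must be smaller than the input length")
    rng = random.Random(seed)  # seeded so a crashing input can be reproduced
    buf = bytearray(data)
    for _ in range(n_flips):
        offset = rng.randrange(skip, len(buf))  # byte to mutate
        buf[offset] ^= 1 << rng.randrange(8)    # flip one bit in that byte
    return bytes(buf)


def describe_exit(returncode: int) -> str:
    """Name the signal behind a subprocess return code.

    On POSIX, Python's subprocess module reports a child killed by
    signal N as return code -N.
    """
    if returncode < 0:
        return signal.Signals(-returncode).name
    return f"exit {returncode}"
```

One way to use it: write the mutated bytes to a temporary file, run the target on that file with `subprocess.run`, and pass the resulting `returncode` to `describe_exit` to tell a clean failure from a crash such as `SIGSEGV`.
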
So what the model is doing is calling ffmpeg as part of converting the audio file to a spectrogram, which it then transcribes into text. So, okay, I tried just taking this file, fuzzing it, and stuffing it into Whisper, and I got a stack trace. Then I took ffmpeg and fuzzed it with the same parameters, and I saw signals 6, 7, and 11, where 11 is a segmentation fault. For this particular finding it wound up being a null pointer dereference, so it's not likely exploitable. But you can get pretty far with file fuzzing and bit flipping as one of the vectors into these models. You could potentially launch an attack that does a buffer overflow on the part that takes audio and converts it to a spectrogram, and then that could take over the system.

Here's another example, where I have an audio file, and here's what it looks like. I'm using Audacity to take a look at the WAV file in this case. And you can see the transcription: "But it's cool that it's not anything like the word Pepsi." These guys are talking about Tab and Pepsi or whatever.

But then I started working within the context of the file itself. Before, I was sort of breaking the structure of the audio file, the encoding and processing and stuff like that. Then I started working within the confines of the format and making legal MP4 files, or legal WAV files in this case, but adding echo distortion. As you can see, the model did a great job on the original: it transcribed every word in the 30 seconds it was given. But when I started adding echo distortion, it started missing words. It gets "But it's cool that it's not," and then it sort of ends with "anything like the word Pepsi, anything like the word Pepsi." So apparently the model is not using echo cancellation. That could be an attack vector into it: changing words, things of that nature. Let's go to an example real quick to give you an idea of what that is. So here's the file.
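
The echo distortion described above keeps the file perfectly valid and only changes the samples. The talk does not say which tool produced the echo, so the following is only a sketch under assumptions: it handles 16-bit PCM WAV files using Python's standard `wave` module, and the names `add_echo` and `echo_wav`, along with the default delay and decay values, are hypothetical.

```python
import array
import sys
import wave


def add_echo(samples, delay, decay=0.5):
    """Mix a delayed, attenuated copy of the signal back into itself.

    samples: sequence of 16-bit integer samples
    delay:   echo delay, in samples (must be at least 1)
    decay:   volume of the echo relative to the original
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    out = []
    for i, s in enumerate(samples):
        echoed = s + (decay * samples[i - delay] if i >= delay else 0)
        # Clip to the signed 16-bit range so the output stays valid PCM.
        out.append(max(-32768, min(32767, int(round(echoed)))))
    return out


def echo_wav(src_path, dst_path, delay_seconds=0.25, decay=0.5):
    """Write an echoed copy of a 16-bit PCM WAV file (assumed input format)."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        pcm = array.array("h")
        pcm.frombytes(src.readframes(params.nframes))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    # Multiply by the channel count so each channel echoes itself.
    delay = int(delay_seconds * params.framerate) * params.nchannels
    out = array.array("h", add_echo(pcm, delay, decay))
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(out.tobytes())
```

Feeding both the original and the echoed file to the speech-to-text model and comparing the two transcripts gives a rough measure of how much a given delay and decay degrade the output.
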
"You know, like the old Tabs. Tab and Tab sound like each other. But it's cool that it's not anything like the word Pepsi." Okay. And then here we have it with echo distortion. So you can see that it's not compensating for that. You'd probably have to add some extra measures to your model, or something to cancel out the echoes. Okay. Is it time to go? Okay. So that's all for now. Thank you very much. If you have any questions, let me know.