Hi everybody, welcome to my talk. This is Baby's First 100 MLSec Words. My name is Eric Lincoln. I'm an artificial intelligence researcher at Rapid7. I specialize in breaking machine learning stuff and in applying machine learning to detecting bad things. I also do research at the Montreal AI Ethics Institute on applying DevOps principles to machine learning ethics. Last year I was here at DEF CON in person, not here in my house at DEF CON, presenting at the Cloud Village on a piece of malware that I wrote. Malware, not actually malware, just in case the legal team is listening. I presented that piece of proof-of-concept malware, called Sassy Boy, but mostly I do boring math stuff. If you can't see my speaker video for some reason, I put a picture of my face on the right so that you can see it, or so you can see it twice. You can see that my hair hasn't changed basically at all in years.

So before we get started, let's just answer the question: what is MLSec? Because I get asked that a lot by practitioners in security and, honestly, in machine learning. Mostly it's an argument, and it's an argument between three groups of people, as it usually is. The first group is people who think it's only applications of machine learning and artificial intelligence to security problems. So that's automated malware detection, that's next-generation SIEMs, that's automated red teaming, and that sort of stuff. Then there are people who think it's just securing artificial intelligence and machine learning systems. And then there are those of us who are correct, who think it's both: it's kind of any intersection of artificial intelligence, that broad category of machine learning slash AI, and security. Since this is my talk, and since I am the Mao Zedong of MLSec, I get to decide that it's both. If anybody argues with you, or you hear anybody having this totally asinine argument, you can just tell them that I said it's both, and then the argument can be over and everybody can do more productive things with their lives.

So we'll start with deep fakes. D is for deep fakes. This man on the right: if I didn't have a speaker video, I probably could have just put that picture up and said it was me, and everybody would have believed it, because he looks like the kind of guy, right? And I'm allowed to say that because he's not real. This is not a real person; this is a completely AI-generated image. To give deep fakes a real definition, a deep fake is a convincing synthetic image, video, or audio recording which purports to be real. I'm pretty convinced by this picture. If I saw it in the wild, if somebody put it as their Twitter avatar and said, yeah, I'm a software engineer at Google, I could buy that based on this image. We have our stereotypes, right: the thin-rimmed glasses, the scruff. It's super convincing, and I think when we think about impact, that's really what it is.
It's that they're convincing to users. We also see deep fakes in the text generation space, where the text purports to be a real email, or we see deep fake writing which mimics the style of famous authors. We see deep fake audio recordings which sound like the person they were trained to sound like; there's a piece of commercial software out there called Lyrebird which does this really well. So they're very convincing to users, and as a result they get used widely. And they're easy to make, right? That's the other thing: they don't take as much time as finding and crafting an image, or carefully learning a style, or mimicking a voice. You just kind of train the model and push them out. So they're easy to make and they're convincing to users, which means they can have real impact, but that impact is mostly societal. There was a famous, semi-viral video of Jordan Peele doing a deep fake where it was an image of President Obama speaking; you can look it up, it's really easy to find. The risk there is really that broad swaths of people will believe that a doctored or deep-faked video is real. So if there's something like President Donald J. Trump declaring war on North Korea on a hot mic, or suggesting that he might invade North Korea on a hot mic, that could be really convincing to people. That's the real risk of deep fakes: they are a powerful disinformation tool which is powered by artificial intelligence.

Our mitigation against deep fakes is, well, there's very little we can do about deep fakes. They're difficult to detect, they're very convincing to users, and they're easy to make. So really it's a broader social problem that we need to tackle: getting people to be inherently less trustful of information sources and to really vet where they get their information from. Which is a hard problem, but this isn't an enterprise security threat. This is a social problem.

So moving back to thinking about deployed models, we'll look at adversarial examples. This GIF over here on the right is from a pretty well-known example that was really, really hyped up: the turtle-rifle paper. The Google computer vision API is what they attacked, and they 3D printed a turtle. This is pretty obviously a turtle to anybody watching who isn't powered by the Google vision API, and if you are an artificial intelligence powered by the Google vision API, I'm so sorry for the rest of this talk, because you're not gonna like it. An adversarial example is an input to the classifier that's specifically crafted to force misclassification, and so this object, this little toy turtle, ends up being misclassified as a rifle.
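To make "specifically crafted" a little more concrete, here's a minimal sketch of one classic way such inputs are generated, the fast gradient sign method. This is not the technique the turtle paper used (that work did something more involved so the attack would survive 3D printing and camera angles); it's just the simplest illustration, and the classifier, image, and label below are stand-ins.

```python
import torch
import torch.nn.functional as F

def fgsm(model, image, true_label, epsilon=0.03):
    """Fast Gradient Sign Method: nudge every pixel a tiny amount in the
    direction that increases the model's loss on the correct label."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step along the sign of the gradient, then clamp back to a valid
    # pixel range so the result still looks like an ordinary picture.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Hypothetical usage: the perturbed turtle may no longer classify as a turtle.
# adv = fgsm(classifier, turtle_image, torch.tensor([TURTLE_CLASS]))
# print(classifier(adv).argmax(dim=-1))
```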
The risk there is, if you have a CCTV system that's powered by the Google vision API, which seems like a reasonable thing to license, and it's looking for potential security threats, and you don't put in all of that racist physiognomy that we already see in a lot of those systems but instead just say, you know what, let's just look for weapons, well, this could falsely trigger those systems. So there's a risk there too. And correspondingly, if you 3D printed a gun that got misclassified as a turtle, that would also be a threat. So the impact really is that it causes misclassification, and that's a real-world security threat. Now, 3D printed examples are really rare, they're really difficult to do, they don't tend to work in all situations, and any researcher in computer vision can explain to you why they're not a real part of the threat model, and that's fine. But our threat is that it causes misclassification, and as security practitioners, we try to look for ways to use technologies to detect bad things. So if you integrate artificial intelligence into your malware detector, for example, an attacker can bypass detection using adversarial examples.

The thing with adversarial examples, when we think about how to mitigate them, is that one of the only truly effective ways to do it is through what's called adversarial training, where you basically show the model a bunch of adversarial examples and tell it, no, classify these correctly. That's really time-consuming, it's expensive, and it's hard, so not a lot of people do it, and it requires you to take your model out of production and put it back into training, which again has a business impact. One of the other ways to mitigate it is through what's called ensembling, where you don't just take one canonical model; you take several models and average their output, or take a weighted average of their output, or whatever. That's another way to do it, because it's a lot harder to come up with an adversarial example that works across a variety of classifiers, especially if they're not trained on the same data and don't have the same architecture. It's still possible to do, but it raises the barrier to entry. If we think about it in terms of cryptography, that's really what we're trying to do. We know that there is hypothetically always an attack: you can brute force the key space, and that's about the only hypothetically feasible attack against AES, right? But it's good enough, and it takes long enough to brute force that key space that, you know, it's fine, it does the job. It's the same thing here: you just have to raise that barrier to entry to make it not worth it for an attacker.
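Here's a minimal sketch of that ensembling idea, assuming a handful of independently trained PyTorch classifiers; the models themselves are placeholders.

```python
import torch

def ensemble_predict(models, x):
    """Average the softmax outputs of several independently trained models.
    An adversarial example now has to fool most of them at once, which is
    harder when they differ in architecture and training data."""
    with torch.no_grad():
        probs = [torch.softmax(m(x), dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0).argmax(dim=-1)

# Hypothetical usage:
# label = ensemble_predict([model_a, model_b, model_c], batch)
```

A weighted average works the same way; you just scale each model's probabilities by a weight before summing.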
The next thing we want to talk about in terms of threats to deployed models is backdoors. Backdoors in a machine learning context are manipulations of trained model weights that result in a specific outcome every time. A common way to do this in neural networks is bias poisoning, where you take one class and make the bias for that one class, in just the output layer, really, really big. You're only changing one weight in the whole network, so it doesn't change a lot, but by virtue of making that weight really big, you'll always get the same classification. If it's a binary classifier, say it's again our malware detector, because people love to make neural network powered malware detectors for whatever reason, and I've made one, so I'm allowed to criticize, that's a really easy way to just say nope, everything's benign, all the time, always, 100% of the time.

So the impact is that it causes misclassification, and I say that but not really, because what it really does is remove a class, or remove all other classes, from the classifier. It's not really misclassification; it's just making it a one-class classifier. Our mitigation is: just don't let attackers get access to your model, and by your model I mean the trained model weights. I say this kind of flippantly because if an attacker has the ability to manipulate your model weights where you're hosting your model, they can do way worse things than backdoor your model. It's like a local privilege escalation that requires you to have admin to begin with. It's not really a threat.
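To show how little an attacker who does have that access would need to touch, here's a rough sketch of the bias-poisoning backdoor, using a toy stand-in network; the architecture, layer sizes, and class indices are made up for illustration.

```python
import torch
import torch.nn as nn

# Stand-in for a trained neural-network malware detector:
# two outputs, index 0 = benign, index 1 = malicious.
detector = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
BENIGN = 0

with torch.no_grad():
    # One weight in the whole network changes: the benign-class bias in the
    # output layer gets cranked up so far that the benign logit dominates
    # whatever the rest of the network computes.
    detector[-1].bias[BENIGN] = 1e6

sample = torch.randn(1, 128)
print(torch.argmax(detector(sample), dim=-1))  # always BENIGN from now on
```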
So again, attacks against deployed models: there's model theft. I want to talk about model theft and give a lot of credit to Will Pierce, formerly of Silent Break Security, from whose fantastic DerbyCon presentation last year I stole this graphic, and who was given the first CVE for machine learning, which is super dope. Model theft is the process of creating a copycat model by querying a trained model. Essentially what we're doing is hitting a model over and over and over again, and using the outputs of that trained model, we train our own model to give us an approximation of it. If you have a box and you put in a one and you get out a two, and then you have a second box, you want to put in the same input and get the same output. It doesn't really matter what happens inside that box as much as it matters that whatever input you give it, you get the same output. That's what happens with model theft.

Our impact here is kind of twofold. First, the potential for adversarial attacks goes way up, because somebody has a model they can specifically query against to validate whether or not a sample gets misclassified, without interacting with your model. It gives them sort of a private development environment for these adversarial attacks. That's what Will did in ProofPudding: he created a copycat model so that he could test phishing emails to bypass the Proofpoint email security appliance. It was incredibly clever, super dope. I recommend watching his talk, but after you watch all the other talks at the AI Village; maybe watch it after DEF CON. It's a really good talk.

The other impact, though, is the loss of intellectual property. One of the things that I think got overlooked in how cool the ability to bypass security appliances is, is that if you hire a bunch of data scientists and collect a bunch of data and clean a bunch of data and spend years creating and building and deploying a model that really differentiates you from the competition, and some unscrupulous company in not-Canada steals your model, approximates your model, and just shoves it into their product and says, yeah, we do the same thing, well, it costs them almost nothing in the grand scheme of things to copy your model. So there's no moat, so to speak. It really hurts your ability to create a differentiated product. So this is a real vulnerability from that standpoint, and we'll get one layer deeper on our next slide.

When we think about mitigations for model theft, there are two. The first one is limiting queries to the model, and this one kind of feels bad, because you don't want to limit queries to your model too much, since most queries to your model are going to be legitimate. For the most part you're expecting these inputs and giving outputs, and you created this model and you're hosting this model because it's supposed to be useful to someone, and for most people, because it's just the corporatization of information security and the corporatization of artificial intelligence, it's probably the people who pay your paycheck who want it. So inherently you want really, really high uptime, but you have to balance that high uptime, and the fact that there may be a single endpoint making a lot of legitimate queries to your model, against the fact that there may be an attacker who's trying to use your model for evil, trying to steal your model. The other thing we can do is limit the information returned from the model, and we saw the efficacy of this with SQL injection. Back in the olden days, you know, a million years ago, because it's 2020 and time doesn't matter anymore, we would return detailed error messages for SQL queries, and it made it really easy for attackers to find the information they were looking for in SQL databases, especially databases that didn't do a great job of input validation. Once we started returning basically "there was an error" as the error message, it became much harder. It's still feasible, but it became much harder. It's similar here: if all that was returned was a blocked-or-not-blocked binary signal, versus a detailed header, it would be much more difficult to create a copycat model. So that is a real mitigation.
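Circling back to the mechanics for a second, here's a very stripped-down sketch of what that query-and-copy loop can look like, assuming a remote endpoint that returns a label for whatever you send it. The URL, field names, featurizer, and surrogate model are all invented for illustration, and this is nowhere near a faithful reconstruction of the ProofPudding work. Notice that the less the endpoint returns, the slower and noisier this loop gets, which is exactly why limiting returned information helps.

```python
import requests
from sklearn.linear_model import LogisticRegression

def build_copycat(samples, featurize, api="https://victim.example/classify"):
    """Label our own samples with the victim model's answers, then fit a
    local surrogate that approximates its input/output behaviour."""
    stolen_labels = []
    for sample in samples:
        resp = requests.post(api, json={"input": sample})
        stolen_labels.append(resp.json()["label"])  # the victim does the labelling
    surrogate = LogisticRegression(max_iter=1000)
    surrogate.fit([featurize(s) for s in samples], stolen_labels)
    return surrogate  # a private dev environment for crafting adversarial inputs
```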
When we think about model theft, kind of the next step is model inversion. Model inversion is recovering training data from a trained model. There was a paper by Nick Carlini that came out two years ago, I think in 2018, about how neural networks unintentionally memorize specific training examples, and there have been papers on model inversion, both traditional and using generative adversarial networks, that have shown it's pretty effective. The impact here is the loss of data: you're losing training data. Maybe that doesn't matter a ton. At Rapid7 we have our Open Data set, which is free and open for anyone to use for non-commercial purposes, and we make it available for other stuff too. If our model was trained on open data and somebody inverted our model, we probably wouldn't care that much, because that data is already out there. But if you are training a model for a large medical device manufacturer, and it's trained on sensitive medical information, and that information gets recovered, well, that's a lot worse. If you're a data scientist at Equifax or some other credit rating bureau and you have a bunch of sensitive financial data that you trained a model on, that's pretty terrifying. The penalties for losing everybody's data at Equifax could be huge; you could have to pay for a bunch of people's credit monitoring service for a while, and that would be bad for your business.

When we think about mitigations, really the first one is to protect access to the model. If attackers have direct access to the model, it's much, much easier to do inversion; there are different techniques you can use if you have direct access. The other mitigation for model inversion is essentially: don't let people steal your model, because then it's easier to use it as sort of the discriminator in a GAN. The architecture is a little more complicated than a traditional generative adversarial network, but the idea is pretty similar: you basically show it an example and ask, does this look familiar, and it says yes or no, and if it says yes, then you may have inverted the information.

Our next threat is poisoning, data poisoning. Data poisoning is when malicious users inject bad training data into a model to corrupt it. Usually this happens in online learning, where the model is continuously trained on the input to the model, and that is definitely a threat that needs to be considered. The other option is that they can get access to wherever you store your data and just inject bad examples. Our impact is misclassification: it shifts our decision boundary to deliberately misclassify samples. So, letting spam through, because now your threshold for what is spam is so high, because you've just been doing online learning on poisoned input. When we mitigate it, we want to go back and think about protecting access to our data. We want data integrity, so having data versioning is really important: making sure that your training data has a version, has an associated hash, has a label. That's super important. And then next, we don't want to just allow unvalidated, untrusted input to our model. That's something we see a ton: people who accept whatever input with no validation on the front end, so whatever gets input to the API gets pushed into the model. That can really cause problems when you're retraining, if you just take those inputs and store them off for training later.
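As a small illustration of that versioning-plus-hash point, here's a sketch that fingerprints a training-data snapshot and refuses to train if the data changed underneath you. The paths and manifest layout are just made up for the example.

```python
import hashlib
import json
from pathlib import Path

def fingerprint(dataset_dir):
    """Hash every file in a training-data snapshot, in a stable order, so a
    version label always maps to exactly one set of bytes."""
    digest = hashlib.sha256()
    for path in sorted(Path(dataset_dir).rglob("*")):
        if path.is_file():
            digest.update(path.read_bytes())
    return digest.hexdigest()

# When you cut a new data version (hypothetical paths):
# manifest = {"version": "2020-08-07", "sha256": fingerprint("data/train")}
# Path("data/train.manifest.json").write_text(json.dumps(manifest))

# Before retraining, fail loudly if the stored data no longer matches:
# manifest = json.loads(Path("data/train.manifest.json").read_text())
# assert fingerprint("data/train") == manifest["sha256"], "training data was modified"
```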
So, we're coming up on the end of this talk, and we want to talk about: what can I do? This is sort of a choose-your-own-adventure story; it depends on your role.

If you're a hacker, and by hacker I mean this in the colloquial sense, like a black hat or a red teamer, well, black hats are already doing some of this stuff, so I really mean if you're a red teamer, or if you're catching up to the elite APT black hat hackers: take advantage of the lack of defenses on machine learning systems. Microsoft put out a report earlier this year saying that in their survey of large companies and government organizations, out of I think it was 28, only three had any meaningful defenses on their machine learning systems. So if you're a red teamer, check it out. If you're a vulnerability discovery person, check it out. You could definitely find stuff, and that's because nobody knows what you're doing and nobody's looking for you. Nobody is looking for black hat hackers, APTs, whoever; nobody's looking for threat actors in their machine learning systems today, at least nobody that I'm aware of, and if you are, let me know, yell at me in the Discord, and I'll be like, oh damn, cool.

If you're a defender, if you're a blue teamer, a SOC monkey, a researcher, and if you're at DEF CON you're probably in one of these two categories: test your machine learning systems as if they're part of your infrastructure. What we see a lot is that machine learning engineers and data scientists develop these models, and they hand them off to engineering in like a .py file, and engineering goes, uh, I don't know what this is, and they slap a Django API in front of it, put it in a Docker container, and then just deploy it in Kubernetes as an API. And then when infosec looks at it, they go, I don't know what this is, it's magic, you put in JSON and you get output, it's magic, and we don't mess with it. And that sucks. Don't do that. Test your systems, test the machine learning systems. Work with your data scientists, work with your ops people, because this is part of your attack surface. And don't let people hype you up over AI-generated phishing emails. When GPT-2 came out, people went nuts and were like, this is the end of detecting phishing, they're going to be too convincing. Just don't, okay? It's the same thing we've been saying for 20 years: look for bad stuff and patch your systems. That's it. There's nothing new under the sun; it's the exact same thing we've been saying for 20 years. Just keep looking for bad stuff and patch your systems, and if anybody gives you pushback on patching your systems, have them talk to me and I'll yell at them for you, because you have to patch your systems.

If you're watching this talk and you're a data scientist or a machine learning engineer, first of all, thank you for coming to DEF CON. DEF CON is dope, and even though it's free and remote this year, come back next year because it's cool. Conduct threat modeling on your models. Before you put a model into deployment, work with your infosec team, work with ops, and ask: what could go wrong? What are the risks of an adversarial attack on this? What are the risks associated with this particular model, and what is the attack surface of this model?
And also, when you're deploying those models, work with infosec and ops to balance uptime against the risk of model theft. These two kind of go hand in hand: you're going to deploy these models, and somebody may try to steal them. Is there a risk if somebody steals your model? Maybe it doesn't matter. And if you think it doesn't matter, ask yourself: would it be okay to publish the source code and data for this model? If your answer to that is no, then you really need to make sure that infosec and operations are aware at deployment time. And finally, don't hype people up over text generation. Models like GPT-2 and GPT-3 are super cool from an NLP standpoint, natural language processing, but they're not going to change the security landscape. Stop scaring people. Please. I'm literally begging you. Don't hype people up over text generation models.

Here are some references; these are some of the things I talked about. Thank you so much for attending my talk. I'll be in the Discord most of the weekend, so if you have any questions, feel free to drop me a line. Thank you so much, have a great weekend, and enjoy your con.