This is a great panel called Responsible Offensive Machine Learning. We have Bodacea, Filar, and Straithe, with Delta Zero moderating. So take it away.

All right, let's quickly have the panelists introduce themselves to start. What are you working on today, and what path did you take to get there?

Hi everyone, I'm Straithe. Primarily I work on using robots to social engineer people: physical robots, and getting people to do or say things they would not otherwise do. My path here has been really weird. I started in business, went through project management, then IT help desk, then university, where I did theoretical computer science and robotics, and now I'm in a cryptography, security, and privacy lab. None of those words are AI or machine learning, but I've done a little bit along the way, and it obviously feeds into the robot social engineering.

Hi, I'm Bobby Filar, a data scientist at Endgame. My primary responsibility there is building malware classification models and employing NLP to help security analysts do their jobs faster and easier. My path here was equally bizarre: an international relations background. I wanted to write about North Korean nuclear weapons; now I'm on stage talking about offensive AI. So I jumped from the policy side of the house to geospatial analytics, then into information security, and then into information security data science.

I'm Sara Terp. I'm a data scientist, and I work for a very large ad tech company. In my day job I go through very, very large amounts of data, trying to get more people to watch videos to the ends of videos, which bores the hell out of me sometimes. So as a hobby I track misinformation, very large-scale bot and troll activity, and work out counters to it. How did I get there? God, as a kid I couldn't decide whether to do psychology or computing, and then I discovered AI. For my first job I refused to go to university, so I went straight to a graduate job designing sonar systems and intelligent torpedoes, and I got one of the very first AI degrees in the world; we're talking thirty-something years ago. Since then it's been unmanned vehicles, intelligent systems, just fun stuff. If it looks like fun, I do it.

Awesome. So, starting right into things: during the past week at Black Hat, DEF CON, and BSides we've seen quite a number of examples of offensive ML, adversarial examples, things like that. What concerns you the most in the whole ML, AI, infosec intersection?

The biggest thing that concerns me is probably the marketing of AI and ML, so I'll make sure to apologize to our marketing department. But in reality, it's often positioned as a silver bullet that can solve or catch anything, that can reduce all of your data and produce no false positives. That sort of definition is dangerous, because people start to believe it, and spend a lot of resources, both people and money, on things that don't solve everything.

Well, can we talk about the training data sets used by all these companies that are adding in AI and machine learning? My favorite thing has been walking around the vendor areas and asking, hey, how do you train your systems? How do you feel about that sort of stuff?

Oh boy, I've got a long story about that one. Sorry: human inputs, very bad bias. My biggest worry is that we're getting a lot better at replicating the appearance of being human. And yeah, I know it all sounds like Terminator-type AI stuff, but I build this crap, and there is a danger of people not being ready, and not knowing how to work with bots that aren't necessarily labeled.
This is called autonomy theory. We've had a lot of it in some of the work on unmanned vehicles for years, but it's really hitting at scale now.

That brings up something else. The thing that worries me the most is that people constantly think of AI and jump right to the Terminator. Once you put an AI in a body it's very, very different, and to me, seeing what everybody's doing in AI and machine learning, the biggest threat is the social and cultural impact, not what happens after you put it into a body. Yet. Which is why I'm working on robot social engineering. AI and machine learning are already affecting us now, and I don't think people understand how much, because they haven't physically seen it. So it worries me that people are waiting for the physical impacts before they see the cultural and social ones.

Actually, on that same note: what are the impacts of AI and ML, and how do we actually keep them ethical and fair?

So I think the idea of keeping things ethical and fair is certainly a difficult one. I believe it was you who had a presentation last year on targeted phishing using AI and ML, where you mined publicly available information to create a kind of curated target set, exploit individuals to gain access, and take advantage.

Ethical and fair? You're only as good as your inputs. I am seeing so many groups using human-generated, internet-generated inputs, which by themselves are usually, come on, let's be honest, sexist, racist, and all the rest of it. I know Tay was engineered to be a racist asshole, but there's a lot of that going on. Um, so can you repeat the question?

Yeah: what are the impacts of AI and ML, and how do we keep them ethical and fair?
Oh yeah, that question. What worries me is what we count as ethical and fair. What is fair for one person is not fair for another person, and not necessarily ethical for another person. It really worries me what we are doing in North America, or even in certain subgroups, and how specific the AI and machine learning is getting for one task. Then you try to put that machine learning anywhere else, and it could actually harm people or hurt them. So I'm definitely worried about the contextualization of these tools.

Let's talk about the definition of ethics. So I've worked a lot on data ethics: what it means to be ethical when you're using data about people, even before you start doing machine learning on top. The framing I use that seems to stick most is ethics as a risk problem. Ethics is all about risk to people: you have a population, you have a risk of something, you have a probability of that risk, and then you can start talking about relative risks, relative problems, relative populations. Without that, you're just waving your hands in the air.

Yeah, and piling on to the fairness argument that I believe Straithe was making: explainability is certainly something that at least those of us in the infosec group strive for, because a black box can be extremely dangerous when you are producing some sort of output, based on math and stats, that a human being then makes a decision on, a decision that could have actual dollar ramifications. When you look at things through that lens, you have to be much more careful.

Yeah, so money is one factor, but there are so many other factors, especially when we start looking at some of the tools we are making and how they impact, for example, LGBTQ+ groups, and how their model is different even though they might be in the same house as somebody else the tools might be working for. So that worries me too: even when you have it contextualized in one specific spot, some of these personal experiences can change things so much. And then it's not even deliberate offensive machine learning; it's accidental, or non-conscious, which is almost worse to me.

Very cool. So, on the next topic: obviously we're moving toward a future where ML, AI, and robots are becoming more widespread. How will different teams, red teams, blue teams, or even other fields, need to adjust?

So yeah, I think the red team, blue team conversation is probably something folks in here are interested in as well. I think putting together solid red and blue teams is pretty much the same as putting together a solid basketball team: everybody has a role and a responsibility. The advent of ML-backed platforms, particularly in the security space, means that you will likely need a resident expert on AI, and on how to exploit it, on the red side. And on the blue side, when you're paying vendors hundreds of thousands of dollars for ML-backed security, you're going to need to understand the underlying blind spots and problem areas, and how to debug these machine learning models. I mean, hell, when we say ML, most of the stuff out there is like: hey, we've got a table, we've got some labels, let's put the thing together. So this is not exactly complicated yet. What was the question again?

How will other teams, like red team and blue team, need to adjust?
So, I've been through quite a few industries where we've added data science and ML. One of my old jobs was as a consultant going in and changing organizations, and what you see is changes not just in the technologies, but in the speed you can do things, the scale you can do things at, the types of data you can start using. Log files are boring as hell, but when you get enough of them together it gets interesting. There's always this sense of transformation across people, process, technology, culture; all of those change. The skills people need change, because they have to be able to understand the vulnerabilities within their own algorithms, and they have to understand how to attack similar things. Okay, I can see Straithe itching to get the microphone, so I'll stop for the moment.

Well, the other interesting thing with this is that when we're looking at red teams and blue teams, we're usually talking about corporations and large groups and communities, right? But a lot of the stuff you've talked about in the past few days, if anybody's had a chance to see it, is social bots, especially on platforms like Twitter and Facebook, and how they're getting people individually. So how do you red team and blue team for yourself? A lot of it comes down to trying to develop communities around blue teaming: communities to protect your friends and family and countries. What can we all be doing individually to say, hey, stop that?
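One concrete way to start judging for yourself who is a bot is timing. Naive scripted accounts often post at far more regular intervals than humans do. Here is a minimal sketch of that idea; the metric and the example timestamps are illustrative assumptions, not anyone's production detector:

```python
import statistics

def regularity_score(post_times):
    """Coefficient of variation of the gaps between posts.

    Human posting gaps tend to vary a lot (high score); a naive bot
    posting on a fixed schedule produces near-identical gaps (score near 0).
    """
    gaps = [b - a for a, b in zip(post_times, post_times[1:])]
    if len(gaps) < 2:
        return None  # not enough activity to judge
    mean = statistics.mean(gaps)
    if mean == 0:
        return 0.0
    return statistics.stdev(gaps) / mean

# Timestamps in seconds: a cron-like bot versus a bursty human.
bot = [0, 600, 1200, 1800, 2400]       # posts exactly every 10 minutes
human = [0, 40, 3600, 3700, 90000]     # two bursts, then a day of silence

print(regularity_score(bot))    # 0.0: suspiciously regular
print(regularity_score(human))  # well above 1: irregular, human-like
```

A score near zero is only a signal, not a verdict; real detectors combine many such features, and as the panel notes, adaptive bots that randomize their behavior defeat this kind of simple check.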
So this is something to think about: with how social bots are being integrated into our lives, we each need to start taking a role in this, not just leaving it to the companies, not just leaving it to the platforms, but also judging for yourself who is a bot and who is not, and seeing whether you can help parse that out for other people and prevent it. Which brings this back to Sara.

Sorry. One of the things I want to talk about is subtlety. One of the things about starting to use machine learning and AI, instead of humans hitting buttons, is that you can do much more interesting, complex, under-the-radar attacks. At the moment the botnets I've been tracking have been pretty fucking dumb. (I swear in every talk, I'm done.) So they've been pretty dumb: hey, let's just retweet this thing. They're pretty easy to find. But once you can start adapting, once you can take natural language in and generate it out, once you can start randomly creating patterns of behavior that are more human-like, more random, it makes my job as a hunter a lot harder. That's the arms race I'm expecting over the next year between the bot makers and the bot finders, using machine learning on either side and racing up that chain. So it's going to get interesting, and yes, that hits individuals as well as groups as well as countries. The surface is enormous.

So I guess my question to you then would be, and Ariel mentioned this as well with deepfakes, there was the Jordan Peele and Barack Obama fusion, or my favorite, Nic Cage as Donald Trump, which was fantastic. We're not quite that good yet, but what steps would you recommend to the layperson to educate and arm themselves? A lot of people here are very well guarded in the information security space and know how to handle themselves at places like DEF CON or in day-to-day interactions, but when you move into a more political or geopolitical spectrum and open yourself up to these alternate media streams and things like that, that's a lot to expect people to understand.

So this is something I touch on a lot with robot social engineering, because most of it is one-on-one: a physical robot and a physical human in a space, interacting with each other socially. Talking, and body posturing, which is huge with robots; being able to use body language is primarily the advantage of physical embodiment over AI on the web. And there's not a lot you can do about it. I've been trying to write the defenses chapter of my thesis, and it's basically: magical awareness will solve everything, because there's just not a lot otherwise that you can do. I've been doing experiments on whether people can tell the difference between a robot acting on its own and one being controlled by a human, and the answer is no. So far there's been no distinction between the two. People can't tell me, hey, that robot is acting weird. They think it's a bug before they think it's another human, or they don't even think it's a bug; they think it's a feature. So this is a thing: depending on how the AI is presented, people usually can't tell the difference as soon as it's in a body, because they ascribe so much more lifelikeness to robots. So that's something specifically from my area that concerns me about AI and machine learning: people don't know the difference between that and real.

Yeah, really we look along two axes. This comes from my old-school intelligence training: you always look at the content and the source. The content of something isn't enough anymore; you need to know the context around it. Where has it come from?
What trust framework does it come within? There are media literacy trainings on how to start spotting things, but things like deepfakes, I mean, I've had a little play. I am good at image processing, and I've reused some old mine-finding software, and it's not easy. It's not easy to spot. You end up falling back on common sense; you look at things like Snopes. But if you're trying to protect a group of humans, and the humans are where the attack surface generally is, you've got to go all the way back up the supply chain and stop this stuff before it gets to the humans, because they really are the very last resort.

Yeah, actually, that's really intriguing, because I know in my own research it's really hard to spot, say, spearphishing or generated audio. I'd love to hear more on the idea of what happens if it's impossible to actually detect this stuff. I know your research is about stopping it at the source; can you expand a bit?

Oh, this is probably not the place I would record it. But there's a certain building in St. Petersburg, and if it accidentally disappeared, life would get an awful lot easier for me. That's part of the chain. You've got the people who are generating, the people who are pushing it out, you have the bot, or whatever the bot application is, and you have the people at the end that it's being applied to. So it literally is a chain of information, and at every point you need to find places to stop it, find ways to stop it. At the end, with the people, you can do things like domain squatting and typosquatting and diversions. So there's a whole pile, and that's an entire different talk.

Onwards. And this is something you and I have discussed outside: the question of whether we want to be able to use that metadata to find bots. We also want anonymity for people; there are some cases where having that metadata can harm people. So if we start stripping all that out in an effort to protect individuals and have a freer internet, or however you want to think about what information we release while we're doing things, well, you use a lot of that to find bots. So if that is gone, and we have a more anonymous internet, how are you going to tell between people and bots, and where does that start affecting our culture more than it already does?

Right. And bringing this back to the infosec problem, and for me, malware classification: that's where the onus falls on folks in the crowd, fellow data scientists. You have to participate in this sort of research, adversarial machine learning, and as you could tell from the track list over the past two days, it was pretty heavy on that side. The reason it's so powerful is not because it's cool to make stickers that trick a machine into thinking you're a toaster. It's about identifying your model's blind spots, and once you can identify those blind spots, you can attempt to patch them. By patching them, you help make that platform more secure, and hopefully, in theory, organizations and people more secure.
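The blind-spot probing described here can be made concrete with a toy. The "detector" below is a hand-written linear scorer standing in for an ML-backed model, and the feature names and weights are invented for illustration. The point is the search loop, which nudges one feature until the verdict flips: the same idea behind the toaster stickers, at much smaller scale.

```python
# Toy linear "malware detector". The weights, bias, and features are
# assumptions for this sketch, not any vendor's real model.
WEIGHTS = {"entropy": 2.0, "imports_count": -0.5}
BIAS = -1.0

def score(sample):
    return BIAS + sum(WEIGHTS[f] * v for f, v in sample.items())

def is_flagged(sample):
    return score(sample) > 0

def minimal_flip(sample, feature, step=0.01, max_steps=10_000):
    """Search for the smallest change to one feature that flips the verdict.

    Moves the feature in the direction that pushes the score toward the
    decision boundary, one small step at a time.
    """
    original = is_flagged(sample)
    probe = dict(sample)
    direction = -1 if WEIGHTS[feature] * (1 if original else -1) > 0 else 1
    for i in range(1, max_steps + 1):
        probe[feature] = sample[feature] + direction * step * i
        if is_flagged(probe) != original:
            return probe[feature] - sample[feature]
    return None  # no flip found within the search budget

sample = {"entropy": 0.9, "imports_count": 1.0}  # flagged: score = 0.3
delta = minimal_flip(sample, "entropy")
print(delta)  # a small negative nudge to entropy evades the toy detector
```

A tiny perturbation flipping the verdict is exactly the kind of blind spot an adversary exploits, and the kind a defender wants to find first so it can be patched with better features or retraining.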
Well, there's also the human blind spots. One of the beautiful things about doing machine learning is you start learning where the people have missed things. It's kind of cool.

Awesome. So on that same note, let's move to the responsibility side. For people creating AI and ML systems, what responsibilities do they have when creating those systems, whether they're blue team, red team, or anyone else?

Yeah, I'll take the first crack here. For me personally, it's a lot of eliminating things. Eliminating false positives, to engender trust and make people believe that the answers coming out of these systems are fair and trustworthy. Trying to reduce the black-box feel of it: explainable AI, as I said earlier, is a huge thing. The majority of the industry right now uses tree-based classifiers, which should ensure at least some sort of explainability based on the features they're using. And the last thing, which I believe Straithe harped on, is eliminating bias in your training data. For me, bias isn't necessarily gender or anything like that; it could be nationality-based language packs in software. A lot of malware comes from Eastern Europe, but we don't want to create models where the first time they see anything from Eastern Europe that is actually a browser or a plugin, they immediately label it bad, just because so much of our training data says: if Russia, that's definitely bad. That's not ideal, and it can lead to a lot of problems, both within international organizations and for developers trying to do right by themselves and create software.

And we're back to risk equations. So you're actually talking about false positives. Actually, I've worked with a bunch of Nigerian tech people as well.
It's the same problem: nobody will answer them. Hello, Lagos. So you've got to think about what the inherent harm is in your false positives and your false negatives, and which way you should shift to keep the harm down. In terms of bots, we quite often catch real human beings who are just enough on the borderline; the list put out by Congress had a bunch of human beings on it, and their lives can get completely screwed by that. But the other way, if you don't catch a major bot that's causing a lot of harm, or a major incursion, then you're screwed that way. So it's: who are the harms to? What are the risks? What is the cost, and not just in money but in general, of doing this or not doing this, of those parameters you're using?

Let's actually move back to robots. If I'm designing a robot, are there any responsibilities I should have or should follow, in order to, I don't know, try not to deceive people?

So, there are some interesting things with robots. Most of it is not AI or machine learning related, but more about how the robot is perceived by people. Some of the things that really bother me are robots that are heavily gendered and put into gender-specific roles. We have a lot of female robots that are servants, waitresses, all that sort of thing. That bothers me, because then you have the male robots, which are being trained as managers, being trained to be in positions of authority, and they have huge biceps; they're built like football players, with a huge crotch area. It's a robot. Could we not? Or, why does the waitress robot need huge tits? They're hard plastic, and there's usually a tablet in front of them anyway.
So why is this happening? When you start doing that and you throw AI into it, you also get a bunch of really interesting quirks in how people perceive what the robot is doing. If the robot does not act like the gender it's perceived as, people also treat it differently, which is kind of neat; they get really confused about what role the robot has. That needs more research, but so far the papers are just really neat.

The other thing is how we do voices with AI, and what we name our AI assistants. Siri: a female name again. Why isn't it called, like, Jeff? I'd love an AI called Jeff. "Jeff, can you check my calendar for me today?" I'm sorry if anybody here is named Jeff. But this is the thing: I love being able to change the voice on my assistant, but how many people change it and stick with a male voice or a neutral voice, right? They're usually female as well. If you go to the Computer History Museum in Mountain View, there's Watson in the room with a male voice, but the AI that controls the lights is a female voice. Why wasn't it the same? What other options could there be, and how does that affect people? But then I use that for social engineering and messing with people more, because when they're like, oh, it's such a cute robot, oh, she's so tiny, it's like: yeah, she's gonna fuck up your day.

There is kind of an ethical, you know, element. So I was just remembering something I did for my last-but-one company. I was using all the security logs.
I was looking for anomalous behaviors of humans, or maybe-humans, so I had all the logs across all our systems, and I started finding stuff out about my team that maybe I didn't want to know. There were specific members of my team doing some really interesting stuff that you could find out just by going through the activity they had. I know who's awake when; I know there are some interesting curves going on there; I know some interesting correlations; oh look, I can go use some open data to find out more about them. It was getting a little bit creepy, and some of it is: how far do you go to protect your system? In this case, I had to track my own team to be able to spot anomalies in other parts of the team, and to spot whether any of my team's access had changed to a point where I had to worry about them. No superusers; you don't want them going rogue.

Yeah, I mean, if anybody here works on insider threat problems, you'll know, and yeah, somebody's shaking their head, it's incredibly creepy how insider threat programs work, because they need to know every single thing. When you log on, when you log off, when you use the restroom, what you print, when you're on Gmail. When you get to that level, it creates, I think, an interesting moral dilemma about security versus privacy, which is something that both fellow panelists have harped on in multiple talks.

Cool. Also, let's transition a bit to responsible disclosure. There are a lot of different cultures surrounding responsible disclosure: there's the academic IRB model, there's the, you know, hacker just-tweet-it-out model. What does responsible disclosure look like, and what responsibilities do you have when you're doing a pen test in any of your fields?
So, I've been going through IRBs this past month. They don't like me, because I fight back on a lot of these things and on what is actually ethical; an ethics review board is not the end-all be-all of ethics. For example, the one I was dealing with said you have to keep your raw data for seven years on the university servers. I asked why, and they said, oh, just in case. No. With science, the best way to have reproducible science is to release your data set. But you want to take care of your participants, so you anonymize it the best you can: you strip out as much data as you can and leave the bits you need to get the stats you got, so other people could get the same thing. If they took that data and your paper, they should be able to get the same results easily. And I said, yeah, as soon as the raw data has been shifted over and I've anonymized it, I'm going to delete it; it'll be gone. And they said no, and we're fighting about it. This is an interesting thing, because think about how much that data reveals, and how much you can get out of it, if you haven't anonymized it and it's sitting there on a server. And honestly, most people aren't there for seven years; that would be a master's and a PhD in Canada, and maybe some undergrad. That's a very long time. Who's maintaining that? Who's actually going to take it down? Who's actually making sure the data is safe? Even then, the ethics boards are like, well, I don't know, you have to take care of that. It's seven years. Who's going to remember that later, except maybe a Google Calendar reminder: hey, delete data. What data? What did I do?
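The release step described here, strip the identifiers and keep only the fields the published stats depend on, can be sketched like this. The field names are invented for illustration, and real de-identification also has to consider quasi-identifiers (age plus location plus date can re-identify someone), which this sketch does not solve:

```python
import hashlib

# Columns the published statistics actually depend on (assumed for this sketch).
KEEP = ["condition", "trust_score", "completed"]

def anonymize(raw_rows, salt):
    """Drop every column except KEEP, and replace the participant identifier
    with a salted hash, so one participant's rows stay linkable to each other
    without naming them. Discard the salt after release, and the original
    IDs cannot be brute-forced back from the hashes."""
    cleaned = []
    for row in raw_rows:
        pid = hashlib.sha256((salt + row["participant"]).encode()).hexdigest()[:12]
        cleaned.append({"pid": pid, **{k: row[k] for k in KEEP}})
    return cleaned

raw = [
    {"participant": "alice@example.com", "age": "29", "condition": "robot",
     "trust_score": "4", "completed": "yes"},
    {"participant": "bob@example.com", "age": "41", "condition": "control",
     "trust_score": "2", "completed": "no"},
]
released = anonymize(raw, salt="discard-me-after-release")
print(released[0])  # no email, no age: only the analysis columns and a pseudonym
```

The design choice here matches the panelist's point: anyone with the released rows and the paper can reproduce the stats, but the raw identifying data no longer needs to sit on a server for seven years.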
This is a problem. And also, the ethics boards are not usually geared toward specific fields; an ethics board covers everything in the university. There might be biologists, sociologists, English professors, and then maybe you'll have a computer scientist. They're not necessarily ready for AI or machine learning ethics applications, or even for going through the methodology that's used, right? So this is another problem: no one verified my methodology in any of the experiments I have put in for ethics review. And I don't know how many people here saw the paper that came out a few months ago on telling whether people are gay, or anything like that, based on their face. How is that ethical? Where was the ethics review board? It got ethics clearance, apparently. So, you know, this is an issue.

That one was supposedly all about hats and faces. Well, and this is the thing: there is a company, people can ask me about it after this, outside, and what they do is collect people's faces at public places, and they say, oh yeah, it's for glasses and facial hair and hats, to see what the latest trends are, so they can sell it to companies to see which products are selling best. But then they were talking about how at lunchtime they go through the facial hair pictures to make fun of people. That's not great ethics.

Yeah, this can be an issue if you're doing something that's so new that nobody's really ready for it. I mean, you talk to infosec people: they know that misinformation is an infosec problem, they know that it's a massive hack on people's brains and communities. But there's no real place to put it.
It's like: we find stuff all the time, and all we can do is have a very quiet word with somebody in the right sort of place. So how do you tell people that a piece of their system, one they don't even think of as a piece of their system, is broken and vulnerable, and that they're completely screwing up protecting it? On the other side, thank you, Anonymous, for releasing a huge pile of names of people who connected to QAnon, but they're not the people we've seen. So you've seen everything from us sometimes keeping stuff, because we don't know where to put it and we don't want to do harm by putting it out, to people just chucking stuff out because, hey. It's hard. It's hard for us.

Yeah, "responsible disclosure" is quite the label. But no, I think with responsible disclosure, the thing that most interests me is that we as a community have finally gotten, I think, reasonably good at it for the vulnerability research community. Has anybody here participated in bug bounties or done vulnerability research? Yeah, a few people. We have guidelines and boards and clearinghouses, and for the most part, proper steps to take to go and publish a vulnerability for a network system or a piece of software. For AI-backed security platforms, that doesn't exist at all. Yet we kind of open these platforms up to public view in places like VirusTotal, where you can test your malware against a variety of antiviruses, and then if you happen to bypass one, and you feel like vendor shaming, or gaining publicity through tweeting, that's an option on the table. There are no real ethics or guidelines attached to that, and that is probably an opportunity for the adversarial research community to learn from the vulnerability research community and attempt to establish those sorts of norms and guidelines.

And this is sometimes where academia does come in, because we aren't beholden to a company.
In Canada specifically, we don't have to worry about funding; you basically have a general grant that covers privacy and security, and you can do whatever you want under it, essentially. So this is sometimes a good opportunity to work with academics, who have more protections, who can publish this stuff under the protection of a university, and who have the funding to do it. Whereas a company might say: no, you're under us, we own your IP, you can't do that. Or maybe, as a community group, you're worried about whether you will personally be targeted. Of course, academics also have that worry, but sometimes the university can help out with that, or we can publish under the university generally and not put out specific names. So definitely start talking to academics as well; we could bridge things a bit more and maybe help solve some of this problem.

I mean, especially if you're a lone hacker, or a few lone hackers, it can be pretty scary out there. It's a seriously adversarial context, and the risk is high.

So, can you think of any other ethical responsibilities or decisions or boundaries that exist in this space? I know we kind of already covered it.

So yeah, I mean, I think what we're talking about, and what folks like Sven and Ariel and a few others who have gone over the applied adversarial approach are doing, testing vendor-based AI platforms, is a murky area, because you are inherently making this up as you go. That's where I drew the comparison to the vulnerability research community, which is very much the same idea, and that's kind of where I was going, at least with my question: do we partner with academia?
Do we partner with impartial third parties or NGOs in an attempt to create some sort of clearinghouse, in order to generate or establish these norms?

Yeah. It's like: your data was broken, your access is broken, and your algorithm is broken. Who the hell do you tell?

Awesome. So, the final question before we turn it over to the audience: how would you recommend someone get into the field or these research areas?

Yeah, I mean, a lot of people are doing it right now by going to things like AI Village. Sara leads an excellent conference that will be in Washington, DC.

I don't lead it, I'm just the program chair, but you should go. It's called CAMLIS, the Conference on Applied Machine Learning in Information Security. We're looking at the offensive side, attacking with machine learning; at the defensive side, defending machine learning; and at attacking the algorithms themselves.

So, things covered: AI Village, and also MLSec. We have a little problem with the website's address, but if you look up MLSec on Twitter, you should be able to find the community. Join the Slack channels, join the long conversations we have.

You covered it. Oh, the other thing would be just to talk to people in the room. Follow folks on Twitter; if you don't have an account, you don't need to tweet, just follow people. It's such an excellent way to learn about what everybody in academia and industry is working on. And I completely forgot: I've got a GitHub repo with links to other people's repos of interesting things to read, plus people to follow.