and I will let Catherine take the stage now. Awesome. Well, thank you so much for the introduction, and thank you so much for being here and taking your time. I know that Congress is really exciting, so I really appreciate you spending some time with me today. It's my first-ever Congress, so I'm also really excited, and I want to meet new people. So if you want to come and say hi to me later, I'm somewhat friendly, so we can maybe be friends later. Today, what we're going to talk about is deep learning blind spots, or how to fool "artificial intelligence." I like to put artificial intelligence in quotes because, yeah, we'll talk about that, but I think it should be in quotes. And today, we're going to talk a little bit about deep learning, how it works, and how you can maybe fool it. So I ask: is AI becoming more intelligent? And I ask this because when I open a browser (and of course, often it's Chrome), Google is already prompting me with what I should look at, and it knows that I work with machine learning, right? And these are the headlines that I see every day. Are computers already smarter than humans? If so, I think we could just pack up and go home, right? We fix computers, right? If the computer is smarter than me, then it's already fixed. We can go home. There's no need to talk about computers anymore. Let's just move on with life. But that's not true, right? We know, because we work with computers, how stupid computers are sometimes. They're pretty bad. Computers do only what we tell them to do, generally. So I don't think a computer can think and be smarter than me. But alongside the same types of headlines that say this, you also see this. And, yeah. So Apple recently released Face ID, which unlocks your phone with your face. And it seems like a great idea, right? You have a unique face. You have a face. Nobody else can take your face. But unfortunately, what we find out about computers is that they're awful sometimes.
And for this Chinese woman who owned an iPhone, her coworker was able to unlock her phone. And I think Hendrik and Karen talked about this, if you were here for the last talk: we have a lot of problems in machine learning. And one of them is stereotypes and prejudice, within our training data or within our minds, that leak into our models. And perhaps they didn't have adequate training data for distinguishing the facial features of Chinese folks. And perhaps it's other problems with their model or their training data or whatever they're trying to do. But they clearly have some issues, right? So when somebody asks me, is AI going to take over the world, and is there a super robot that's going to come and be my new leader, so to speak, I tell them: we can't even figure out the stuff that we already have in production. And if we can't even figure out the stuff we already have in production, I'm a little bit less worried about the super robot coming to kill me. That said, unfortunately, the powers that be often believe in this. They believe strongly in artificial intelligence and machine learning. And they are collecting data every day about you and me and everyone else. And they're going to use this data to build even better models. And this is because the revolution that we're seeing now in machine learning has really not much to do with new algorithms or architectures. It has a lot more to do with heavy compute and with massive, massive datasets. And the more we have training data arriving at petabytes per 24 hours or even less, the more we're able to essentially fix up the parts that don't work so well. And the companies that we see here are companies that are investing heavily in machine learning and AI. And part of how they're investing heavily is by collecting more and more data about you and me and everyone else. Google and Facebook: more than one billion active users.
I was surprised to learn that in Germany, the desktop search traffic for Google is higher than in most of the rest of the world. And Baidu is growing at the speed that broadband becomes available. And so what we see is that these companies are collecting this data. And they are also using new technologies like GPUs and TPUs in new ways to parallelize workflows. And with this, they're able to mess up less. They're still messing up, but they mess up slightly less. And they're not going to lose interest in this topic. So we need to start to prepare how we respond to this type of behavior. And one of the things that has been a big area of research, actually also for a lot of these companies, is what we'll talk about today, and that's adversarial machine learning. But the first thing we'll start with is: what is behind what we call AI? Most of the time, when you think of AI, or something like Siri and so forth, you are actually potentially talking about an old-school rule-based system. This is a rule: you say a particular thing, and then Siri is like, yes, I know how to respond to this. We even hard-program these types of things in. That is one version of AI. It's essentially been pre-programmed to do and understand certain things. Another form is used, for example, by the people who are trying to build AI robots, and the people who are trying to build what we call general AI, so something that can maybe learn like a human. They'll use reinforcement learning. I don't specialize in reinforcement learning, but what it does is essentially try to reward the agent for behavior it's expected to do. So if you complete a task, you get a cookie. You complete two other tasks, you get two or three more cookies, depending on how important the task is. And this helps you learn how to behave to get more points. And it's used a lot in robotics and gaming and so forth.
And I'm not really going to talk about that today, because most of that is still not really something that you or I interact with. What I'm going to talk about today is neural networks, or, as some people like to call them, deep learning, right? Deep learning won the neural-network-versus-deep-learning branding battle a while ago. So here's an example neural network. We have an input layer, and that's where we essentially make a quantitative version of whatever our data is. We need to make it into numbers. Then we have a hidden layer, and we might have multiple hidden layers. And depending on how deep our network is, or whether it's a network inside a network, right, which is possible, we might have many different layers there, and they may even act in cyclical ways. And that's where all the weights and the variables and the learning happen. So that holds a lot of the information that we eventually want to train there. And finally, we have an output layer. And depending on the network and what we're trying to do, the output layer can vary. It can be something that looks like the input: for example, if we want to machine translate, then I want the output to look like the input, right, just in a different language. Or the output could be a class: this is a car, or this is a train, and so forth. So it really depends on what you're trying to solve, but the output layer gives us the answer. And how we train this is with backpropagation, and backpropagation is nothing new, and neither is one of the most popular methods to do so, which is called stochastic gradient descent. And what we do in that part of the training is we go from the output layer backwards through the network. That's why it's called backpropagation, right? And as we go backwards through the network, in the most simple way, we upvote and downvote what's working and what's not working. So we say, oh, you got it right.
You get a little bit more importance. Oh, you got it wrong. You get a little bit less importance. And eventually, we hope, over time, the units essentially correct each other's errors enough that we get the right answer. So that's a very general overview of how it works. And the cool thing is, because it works that way, we can fool it. And people have been researching ways to fool it for quite some time. So I'll give you a brief overview of the history of this field, so we can know where we're working from, and maybe, hopefully, where we're going. In 2005, one of the first really important papers to approach adversarial learning was written by a group of researchers who wanted to see if they could act as an informed attacker and attack a linear classifier, in this case just a spam filter. And they're like, can I send spam to my friend? I don't know why they wanted to do this, but can I send spam to my friend if I try testing out a few ideas? And what they were able to show is: yes. Rather than just trial and error, which anybody can do, or a brute-force attack, like just send a thousand emails and see what happens, they were able to craft a few algorithms that they could use to find important words to change to make it through the spam filter. In 2007, NIPS, which is a very popular machine learning conference, had one of its first all-day workshops on computer security. And when they did so, they had a bunch of different people who were working on machine learning in computer security, from malware detection to network intrusion detection to, of course, spam. And they also had a few talks on this type of adversarial learning. So how do you act as an adversary to your own model? And then how do you learn to counter that adversary? In 2013, there was a really great paper that got a lot of people's attention, called "Poisoning Attacks Against Support Vector Machines."
Now, support vector machines are essentially, usually, a linear classifier, and we use them a lot to say this is a member of one class or another, particularly with text. So I have a text and I want to know what the text is about, or I want to know if it's a positive or negative sentiment. A lot of times that will use a support vector machine. We call them SVMs as well. And Battista Biggio was the main researcher, and he's actually written quite a lot about these poisoning attacks, and he poisoned the training data. For a lot of these systems, sometimes they have active learning, and this means that you or I, when we classify our emails as spam, are helping train the network. And so he poisoned the training data and was able to show that by poisoning it in a particular way, he was able to then send spam email, because he knew which words were then essentially benign. He went on to study a few other things about biometric data, if you're interested in biometrics. But then in 2014, Christian Szegedy, Ian Goodfellow, and a few other main researchers at Google Brain released "Intriguing Properties of Neural Networks," and that really became the explosion of what we're seeing today in adversarial learning. And what they were able to say is: we believe there are linear properties of these neural networks, even if they're not necessarily linear networks, and we believe we can exploit them to fool them. And they first introduced the fast gradient sign method, which we'll talk about later today. So how does it work? First I want us to get a little bit of an intuition around how this works. Here's a graphic of gradient descent. In gradient descent, the vertical axis is our cost function. And what we're trying to do is minimize cost. We want to minimize the error. And when we start out, we just choose random weights and variables. So all of our hidden layers, they just have maybe random weights or a random distribution.
And then we want to get to a place where the weights have meaning, right? We want our network to know something, even if it's just a mathematical pattern, right? So we start in the high area of the graph, the reddish area. And that's where we started. We have high error there. And then we try to get to the lowest area of the graph, here, the dark blue that is right about here. But sometimes what happens, as we learn, as we go through epochs of training, is that we're moving slowly down and hopefully optimizing, but we might end up, instead of in this global minimum, in a local minimum, which is the other trail. And that's fine, because it's still quite low error, right? So we're still probably going to be able to succeed, but we might not get the best answer all the time. What adversarial learning tries to do, in the most basic of ways, is essentially push the error rate back up the hill for as many units as it can. It tries to increase the error slowly through perturbations, and by disrupting, let's say, the weakest links, like the ones that did not find the global minimum but instead found a local minimum, we can hopefully fool the network, because we're finding those weak spots and capitalizing on them, essentially. So what does an adversarial example actually look like? You may have already seen this, because it was very popular on the Twittersphere and a few other places, but this was a group of researchers at MIT, and it was debated whether you could do adversarial learning in the real world. A lot of the research had just been on still images. What they were able to show is: they created a 3D-printed turtle. I mean, it looks like a turtle to you as well, correct? And this 3D-printed turtle, according to the Inception network, which is a very popular computer vision network, is a rifle. And it is a rifle from every angle that you can see.
And the way they were able to do this, and I don't know, the next time it rotates around, you can see perhaps (it's a little bit easier in the video, which I have posted and will share at the end) that there's a slight discoloration of the shell. They messed with the texture, and by messing with this texture and the colors, they were able to fool the neural network. They were able to activate different neurons that were not supposed to be activated. Units, I should say. And so what we see here is: yeah, it can be done in the real world. And when I saw this, I started getting really excited, because video surveillance is a real thing, right? So if we can start fooling 3D objects, we can perhaps start fooling other things in the real world that we would like to fool. So why do adversarial examples exist? We're going to talk a little bit about some things that are approximations of what's actually happening, so please forgive me for not always being exact, but I would rather us all have a general understanding of what's happening. Across the top row, we have an input layer, and these images to the left, we can see, are the source images, and the source image is like a piece of farming equipment or something. And on the right, we have our guide image. This is what we're trying to get the network to see. We want to misclassify this farm equipment as a pink bird. So what these researchers did is they targeted different layers of the network, and they said, okay, we're going to use this method to target this particular layer, and we'll see what happens. And as they targeted these different layers, you can see what's happening in the internal visualization. Now, neural networks can't see, right? They're looking at matrices of numbers. But what we can do is use those internal values to try to see with our human eyes what they are learning. And we can see here, clearly, that inside the network, we no longer see the farming equipment, right?
We see a pink bird. And this is not visible to our human eyes. Now, if you really study the image, and if you enlarge it, you can start to see, okay, there's a little bit of pink here, or greens, I don't know what's happening. But we can still see it in the neural network we have tricked. Now, people don't exactly know yet why these blind spots exist. It's still an area of active research exactly why we can fool neural networks so easily. There are some prominent researchers who believe that neural networks are essentially very linear, and that we can use this simple linearity to misclassify, to jump into another area. But there are others who believe that there are these pockets, or blind spots, where the neurons really are the weakest links and maybe haven't even learned anything. And if we find these blind spots and change their activation, then we can fool the network easily. So this is still an area of active research, and let's say you're looking for your thesis: this would be a pretty neat thing to work on. So we'll get into just a brief overview of some of the math behind the most popular methods. First, we have the fast gradient sign method, which was used in the initial paper, and there have now been many iterations on it. And what we do is we take our same cost function, the same one the network uses when it's trying to learn, and we take the sign of its gradient. And it's okay if you're not used to doing vector calculus, especially without a pen and paper in front of you, but what we're doing is essentially calculating some approximation of the derivative of the cost function with respect to the input. And this can kind of tell us where it's going. And if we know where it's going, we can maybe anticipate that and change it. Then, to create the adversarial image, we take the original input plus a small number epsilon times that gradient sign.
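To make that concrete, here is a minimal, hypothetical sketch of the fast gradient sign method. Instead of a deep network, I use a single logistic-regression unit so the gradient can be written out by hand; the weights and the input are made up for illustration. Real attacks get the gradient from the full network via a library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 for a single logistic unit."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, epsilon):
    """x_adv = x + epsilon * sign(gradient of the loss w.r.t. x).

    For cross-entropy loss on a logistic unit, the gradient of the
    loss with respect to input x_i is (p - y) * w_i.
    """
    p = predict(w, b, x)
    return [xi + epsilon * sign((p - y) * wi) for xi, wi in zip(x, w)]

# Toy model and an input confidently classified as class 1 (y = 1)
w, b = [2.0, -1.0, 0.5], 0.0
x = [1.0, -1.0, 1.0]
print(predict(w, b, x))        # ~0.97: very confident

# One FGSM step against the true label pushes the loss uphill
x_adv = fgsm(x, 1.0, w, b, epsilon=2.0)
print(predict(w, b, x_adv))    # ~0.03: now misclassified
```

An epsilon of 2.0 is absurdly large here; for images, epsilon is kept small enough that the perturbation is barely visible, as with the turtle.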
The Jacobian saliency map is a newer method, and it's a little bit more effective, but it takes a little bit more compute. This method uses a Jacobian matrix. And if you remember (and it's okay if you don't), a Jacobian matrix looks at the forward derivative of a function. So you take the forward derivative of the function, and it gives you a matrix that is a point-wise approximation at that input vector, if the function is differentiable there. Don't worry, you can review this later, too. We then use the Jacobian matrix to create the saliency map, in the same way, where we're essentially trying to find some sort of linear, point-wise approximation. And we then want to find two pixels that we can perturb that cause the most disruption, and then we continue to the next two. Unfortunately, this is currently an O(n²) problem, but there are a few people trying to find ways to approximate it and make it faster. So maybe now you want to fool a network too, and I hope you do, because that's what we're going to talk about. First, you need to pick a problem or a network type. You may already know, but you may want to investigate: what is this company perhaps using, what is this method perhaps using? Do a little bit of research, because that's going to help you. Then you want to research state-of-the-art methods. Claiming a new state-of-the-art method is like a typical research statement, but the good news is that the state of the art from two to three years ago is most likely in production, in real systems, today. Once people find ways to speed a method up, some approximation of it usually gets deployed. And a lot of times these are then publicly available models. A lot of times, if you're already working with a deep learning framework, it will come pre-packaged with a few of the popular models. So you can even use that.
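Going back to the Jacobian saliency map for a moment: here is a tiny, hypothetical sketch of the saliency step. Rather than computing a real Jacobian from a network, I hand the function a made-up Jacobian matrix (rows are output classes, columns are input features) and rank which features to perturb, roughly following the saliency-map idea; the real method also evaluates feature pairs, which I omit here for brevity.

```python
def saliency_map(jacobian, target):
    """JSMA-style saliency score for each input feature.

    jacobian[c][i] is the forward derivative d(class c) / d(feature i).
    A feature is worth perturbing when increasing it raises the target
    class while lowering the sum of all the other classes.
    """
    n_classes, n_features = len(jacobian), len(jacobian[0])
    scores = []
    for i in range(n_features):
        d_target = jacobian[target][i]
        d_others = sum(jacobian[c][i] for c in range(n_classes) if c != target)
        if d_target < 0 or d_others > 0:
            scores.append(0.0)    # perturbing this feature won't help
        else:
            scores.append(d_target * abs(d_others))
    return scores

def pick_pixels(jacobian, target, n=2):
    """Pick the n most disruptive features (pixels) to perturb next."""
    scores = saliency_map(jacobian, target)
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n]

# Made-up 2-class, 3-feature Jacobian, targeting class 1
jacobian = [[0.9, -0.2, 0.1],    # derivatives of class 0 outputs
            [-0.5, 0.3, 0.05]]   # derivatives of class 1 outputs
print(pick_pixels(jacobian, target=1))   # feature 1 ranks first
```

In the full attack, you perturb the chosen pixels, recompute the Jacobian, and repeat until the network flips to the target class or you hit a distortion budget; that iterate-and-recompute loop is where the extra compute goes.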
If you're already building neural networks, of course, you can build your own. An optional step, but one that might be recommended, is to fine-tune your model. What this means is to essentially take a new training dataset, maybe data that you think this company or this network is using, remove the last few layers of the neural network, and retrain them. So you essentially piggyback, nicely, on the work of the pre-trained model, and you use the final layers to add finesse. This essentially makes your model better at the task that you have for it. Finally, you use a library, and we'll go through a few of them, but some of the ones that I have used myself are CleverHans, DeepFool, and Deep-pwning. And these all come with nice built-in features for you to use, for, let's say, the fast gradient sign method, the Jacobian saliency map, and a few other methods. Finally, it's not always going to work. Depending on your source and your target, you won't always necessarily find a match. What researchers have shown is that it's a lot easier to fool a network into thinking a cat is a dog than it is to fool a network into thinking a cat is an airplane. And intuitively, that makes sense. So you might want to pick an input that's not super dissimilar from where you want to go, but is dissimilar enough. And you want to test it locally, and then finally test the examples with the highest misclassification rates on the target network. And you might say, Catherine (or you can call me kjam, that's okay), I don't know what the person is using. I don't know what the company is using. And I will say, it's okay, because what's been proven is that you can attack a black-box model. You do not have to know what they're using. You do not have to know exactly how it works.
You don't even have to know their training data. Okay, one addendum: it has to have some API you can interface with. But if it has an API you can interface with, or even any API that uses the same type of learning, you can collect training data by querying the API. And then you train your local model on the data that you're collecting. So you're collecting the data, you're training your local model, and as your local model gets more accurate and more similar to the deployed black box whose workings you don't know, you are then still able to fool it. And what this paper proved (this is Nicolas Papernot and a few other great researchers) is that with usually fewer than 6,000 queries, they were able to fool the network with between 84 and 97 percent success. And the same group of researchers also studied the ability to transfer the power to fool one network onto another network. They called that transferability. So I can take a certain type of network and use adversarial examples crafted against it to fool a different type of machine learning technique. And here we have their matrix, their heat map, that shows us exactly what they were able to fool. Down the left-hand side we have the source machine learning technique: deep learning, logistic regression, SVMs like we talked about, decision trees, and k-nearest neighbors. And across the bottom we have the target machine learning technique. So what were they targeting? They created the adversarial examples with the techniques on the left-hand side, and they targeted the techniques across the bottom. We finally have an ensemble model at the end. And what they were able to show is that, for example, SVMs and decision trees are quite easy to fool. Logistic regression a little bit less so, but it's still strong against deep learning. And for deep learning and k-nearest neighbors: if you train a deep learning model or a k-nearest-neighbors model, it performs fairly well against itself.
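The black-box loop described a moment ago (craft inputs, label them with the API's answers, train a local substitute, then attack the substitute) can be sketched like this. Everything here is hypothetical: `black_box` stands in for the remote model, and the substitute is just a perceptron rather than the neural network the paper uses.

```python
import random

def black_box(x):
    """Hypothetical stand-in for a remote classification API.
    Internally it's a simple linear rule, but the attacker only
    sees the labels coming back from queries."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] > 0 else 0

def train_substitute(xs, ys, lr=0.1, epochs=50):
    """Fit a tiny perceptron on labels harvested from the API."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

random.seed(0)

# Step 1: craft inputs and harvest the black box's labels
queries = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
labels = [black_box(q) for q in queries]

# Step 2: train the local substitute on the harvested data
w, b = train_substitute(queries, labels)

# Step 3: measure how closely the substitute mimics the black box;
# adversarial examples crafted against it should then transfer
agreement = sum(
    (1 if w[0] * q[0] + w[1] * q[1] + b > 0 else 0) == black_box(q)
    for q in queries
) / len(queries)
print(agreement)
```

In the paper, the substitute is itself a neural network and the query budget stays small through clever data augmentation; the perceptron here is only meant to show the shape of the loop.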
And so what they were able to show is that you don't necessarily need to know the target machine learning technique, and you don't even have to get it right, even if you do know. You can use a different type of machine learning technique to target the network. So we'll look at six lines of Python here. In these six lines of Python, I'm using the CleverHans library, and I can both generate my adversarial input and even predict on it. So if you don't code Python, it's pretty easy to learn and pick up. For example, here we have Keras, and Keras is a very popular deep learning library in Python. It usually works with a Theano or TensorFlow backend. And we can just wrap our model, pass it to the fast gradient method class, and then set up some parameters. So here's our epsilon and a few extra parameters. This is to tune our adversary. And finally, we can generate our adversarial examples and then predict on them. So in a very small amount of Python, we're able to target and trick a network. And if you're already using TensorFlow or Keras, it already works with those libraries. Deep-pwning is one of the first libraries that I heard about in this space, and it was presented at DEF CON in 2016. It comes with a bunch of TensorFlow code built in. It even comes with a way for you to train the model yourself. So it has a few different models, a few different convolutional neural networks, and these are predominantly used in computer vision. It also, however, has a semantic model, and I normally work in NLP, so I was pretty excited to try it out. And it comes built with the Rotten Tomatoes sentiment data. So this is Rotten Tomatoes movie reviews, where you try to learn: is this positive or negative? The original text that I put in when I was generating my adversarial examples was "more trifle than triumph," which is a real review. And the adversarial text that it gave me was "Jonah refreshing haunting leaky." Yeah.
So I was able to fool my network, but I lost any type of meaning. And this is really the problem when we think about how we apply adversarial learning to different tasks: it's easy for an image, if we make a few changes, to retain its meaning, right? It's many, many pixels. But when we start going into language, if we change one word, and then another word, and another word, or maybe we change all of the words, we no longer understand it as humans. And I would say this is garbage in, garbage out. This is not actual adversarial learning. So we have a long way to go when it comes to language tasks and being able to do adversarial learning. There is some research in this, but it's not really advanced yet. So hopefully this is something that we can continue to work on and advance further. And if so, we need to support a few different types of networks that are more common in NLP than they are in computer vision. There are some other notable open-source libraries available to you, and I'll cover just a few here. The Vanderbilt Computational Economics Research Lab has AdLib, and this allows you to do poisoning attacks. So if you want to target training data and poison it, you can do so with that, and it uses scikit-learn. DeepFool lets you do something like the fast gradient sign method, but it tries to make smaller perturbations. It tries to be less detectable to us humans. It's based, I believe, on Torch, which uses Lua, though there's a Python version as well. Foolbox is kind of neat; I only heard about it last week, but it collects a bunch of different techniques in one library, and you can use them all with one interface.
So if you want to experiment with a few different techniques at once, I would recommend taking a look at that. And finally, for something that we'll talk about briefly in a moment, we have the Evolving AI Lab, which released a fooling library. This fooling library is able to generate images where you or I can't tell what they are, but the neural network is convinced it is something. We'll talk about some applications of this in a moment, but they also open-sourced all of their code, which is always very exciting. As you may have noticed from some of the research I already cited, most of the studies and the research in this area have been on malicious attacks. So there are very few people trying to figure out how to do this for what I would call benevolent purposes. Most of them are trying to act as an adversary in the traditional computer security sense. They're perhaps studying spam filters and how spammers can get past them. They're perhaps looking at network intrusion or botnet attacks and so forth. They're perhaps looking at self-driving cars, and I know that was referenced earlier in Hendrik and Karen's talk as well: they're perhaps trying to make a yield sign look like a stop sign, or a stop sign look like a yield sign or a speed limit sign, and so forth. And scarily, they are quite successful at this. Or perhaps they're looking at data poisoning: how do we poison the model so we render it useless in a particular context, so we can utilize that? And finally, malware. What a few researchers were able to show is that by just changing a few things in their malware, they were able to upload it to Google Mail and send it to someone, and it was still fully functional malware. In the same sense, there's the MalGAN project, which uses a generative adversarial network to create malware that, I guess, works. So there's a lot of research on these kinds of malicious attacks within adversarial learning.
But what I wonder is: how might we use this for "good"? And I put good in quotation marks because we all have different ethical and moral systems, and what you decide is ethical for you might be different. But I think as a community, especially at a conference like this, hopefully we can converge on some ethical, privacy-concerned version of using these networks. So I've composed a few ideas, and I hope that this is just the start of a longer conversation. One idea is that we can perhaps use this type of adversarial learning to fool surveillance. Surveillance affects you and me, and it disproportionately affects people who most likely can't be here. And so whether or not we're personally affected, we can care about the many lives that are affected by this type of surveillance, and we can try to build ways to fool surveillance systems. Steganography: in a world where more and more people have less of a private way of sending messages to one another, we could perhaps use adversarial learning to send private messages. Adware fooling: again, I have quite a lot of privilege, and I don't really see ads that prey on me as much, but there are a lot of people in the world who face predatory advertising. So how can we help with those problems by developing adversarial techniques? Poisoning your own private data: this depends on whether you actually need to use the service, and whether you like how the service is helping you with its machine learning, but if you don't care, or if you essentially need a burn box for your data, then potentially you could poison your own private data. And finally, I want us to use it to investigate deployed models. Even if we don't actually have a need to fool a particular network, the more we know about what's deployed and how we can fool it, the more we're able to keep up with this technology as it continues to evolve.
So the more we're practicing, the more we're ready for whatever might happen next. And finally, I really want to hear your ideas as well. I'll be here throughout the whole Congress, and of course, you can share during the Q&A time. If you have great ideas, I really want to hear them. So I decided to play around a little bit with some of my ideas. And I was convinced, perhaps, that I could make Facebook think I was a cat. This was my goal: can Facebook think I'm a cat? Because nobody really likes Facebook. I mean, let's be honest, right? But I have to be on it, because my mom messages me there, and she doesn't use email anymore, so I'm on Facebook. Anyways, I used a pre-trained Inception model in Keras, and I fine-tuned the final layers. I'm not really a computer vision person, so it took me like a day to figure out how computer vision people transform their data into something I can put inside a network. I figured that out, and I was able to quickly train a model, and the model could only distinguish between people and cats. That's all the model knew how to do. I give it a picture; it says it's a person or it's a cat. I actually didn't try giving it an image of something else. It would probably guess it's a person or a cat, maybe 50/50, who knows. And what I did was I used an image of myself, and I set up my fast gradient sign method (I used CleverHans), and I slowly increased the epsilon. When the epsilon is low, you and I can't see the perturbations, but the network also can't see the perturbations. So we need to increase it, and of course, as we increase it, when we're using a technique like FGSM, we are also increasing the noise that we see. And I kept uploading the images to Facebook, and Facebook kept saying, yeah, do you want to tag yourself? And I'm like, no, I don't, I'm just testing. Finally, I got to an epsilon of 0.21, and Facebook no longer knew I was a face.
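The epsilon sweep I just described can be sketched abstractly. This is not Facebook's detector: `detector_score` is a made-up stand-in (a single logistic unit), just to show the loop of nudging epsilon upward until the detection confidence falls below threshold.

```python
import math

def detector_score(x, w, b):
    """Hypothetical stand-in for a face detector's confidence in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_down(x, w, b, epsilon):
    """Perturb x to *lower* the detector's score. For this linear
    score, the gradient w.r.t. x has the same sign as w, so we
    subtract epsilon * sign(w) elementwise."""
    return [xi - epsilon * sign(wi) for xi, wi in zip(x, w)]

# Made-up detector weights and an input it confidently flags as a face
w, b = [1.5, -0.8, 0.6], 0.2
x = [1.0, -1.0, 1.0]

# Sweep epsilon upward until the detector no longer sees a face
eps, x_adv = 0.0, x
while detector_score(x_adv, w, b) >= 0.5:
    eps += 0.01
    x_adv = fgsm_down(x, w, b, eps)
print(eps, detector_score(x_adv, w, b))
```

The trade-off is the same one I hit on Facebook: each increment of epsilon adds visible noise, so you stop at the smallest epsilon that defeats the detector.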
So I was just a book; I was a cat book, maybe. Unfortunately, as we see, I didn't actually become a cat, which would have been pretty neat, but I was able to fool it. I spoke with a computer vision specialist I know who actually works in this area, and I asked, what methods do you think Facebook is using? Did I really fool a neural network, or what did I do? She's convinced they're most likely using a statistical method called Viola-Jones, which looks at the statistical distribution of your face and tries to guess whether there's really a face there. But what I was able to show is transferability: I can use my neural network to fool even this statistical model. So now I have a very noisy but happy photo on Facebook. Another potential use case is adversarial steganography, and I was really excited reading this paper. What this paper covered, and they actually released the library as I mentioned, is the ability of a neural network to be convinced that something is there that's not actually there. They used the MNIST training set. I'm sorry if that's like a trigger word; if you've used MNIST a million times, then I'm sorry for this. MNIST is the digits zero through nine. Using evolutionary algorithms, they were able to generate things that to us look maybe like art, and they actually used it on the CIFAR dataset too, which has colors, and some of what they created was quite beautiful. In fact, they showed it in a gallery. And for the digits across the top, the network sees that digit; it is more than 99% convinced that the digit is there. What we see is pretty patterns or just noise. When I was reading this paper, I was thinking, how can we use this to send messages to each other that nobody else will know are there? I'm just sending really nice pictures; I'm an artist, this is my art, and I'm sharing it with my friend.
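(Transferability itself is easy to demonstrate on toy models. The sketch below, with invented data and two independently initialized logistic-regression "models", crafts FGSM examples against model A and then measures how much they hurt model B, which never saw them. It illustrates the idea only; it says nothing about Facebook's or Viola-Jones's actual pipeline.)

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, steps=300, lr=0.5, seed=0):
    """Plain gradient-descent logistic regression."""
    r = np.random.default_rng(seed)
    w, b = r.normal(scale=0.1, size=X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Linearly separable toy data.
X = rng.normal(size=(300, 6))
true_w = rng.normal(size=6)
y = (X @ true_w > 0).astype(float)

# Two models trained on the same data from different initialisations.
wa, ba = train(X, y, seed=1)
wb, bb = train(X, y, seed=2)

# FGSM adversarial examples crafted against model A only.
pa = sigmoid(X @ wa + ba)
X_adv = X + 0.4 * np.sign((pa - y)[:, None] * wa)

def accuracy(w, b, X_):
    return np.mean((sigmoid(X_ @ w + b) > 0.5) == (y > 0.5))

clean_acc = accuracy(wb, bb, X)         # model B on clean inputs
transfer_acc = accuracy(wb, bb, X_adv)  # model B on A's adversarial inputs
```

Because both models learn roughly the same decision boundary, the perturbations aimed at A drag points across B's boundary too; that shared geometry is what makes black-box attacks like the Facebook experiment possible.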
And in a world where I'm afraid to go home because there's a crazy person in charge, and I'm afraid that they might look at my phone and my computer and a million other things, and I just want to make sure that my friend has my PIN or this or that or whatever, I see a use case for my life. But again, I live a fairly privileged life. There are other people whose actual life, livelihood, and security might depend on using a technique like this. And I think we could use adversarial learning to create a new form of steganography. Finally, I cannot stress enough that the more information we have about the systems we interact with every day, the machine learning systems, the AI systems or whatever you want to call them, the deep networks, the better we can fight them, right? We don't need perfect knowledge, but the more knowledge we have, the better an adversary we can be. I thankfully now live in Germany, and if you are also a European resident, we have the GDPR, the General Data Protection Regulation, which goes into effect in May of 2018. We can use the GDPR to make requests about our data, and to make requests about the machine learning systems we interact with. This is a right that we have. Recital 71 of the GDPR states that the data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention. I'm not a lawyer, and I don't know how this will be implemented. It's a recital, so we don't even know if it will be enforced the same way. But the good news is that pieces of this same sentiment are in the actual amendments.
And if they're in the amendments, then we can legally use them. It also says we can ask companies to port our data elsewhere. We can ask companies to delete our data. We can ask for information about how our data is processed. We can ask for information about what different automated decisions are being made. And the more we all here ask for that data, the more we can also share that same information with people worldwide. Because, you know, the systems we interact with are not special to us; they're the same types of systems being deployed everywhere in the world. So we can help our fellow humans outside of Europe by being good caretakers and using our rights to make more information available to the entire world, and by using this information to find ways to use adversarial learning to fool these types of systems. So how else might we be able to harness this for good? I cannot emphasize the GDPR enough, and our right to collect more information about the information they're already collecting about us and everyone else. So use it. Let's find ways to share the information we gain from it. I don't want it to be just that one person requests it and they learn something; we have to find ways to share this information with one another. Test low-tech ways. I'm really excited about the maker space here and maker culture, and other low-tech or human-crafted ways to fool networks. We can use adversarial learning to get good ideas on how to fool networks, and then find lower-tech ways. What if I painted red pixels all over my face? Would I still be recognized? Would I not? Let's experiment with the things we learn from adversarial learning and try to find lower-tech solutions to the same problems. Finally, or nearly finally, we need to expand the research beyond just computer vision.
Quite a lot of adversarial learning work has been done only in computer vision, and while I think that's important, and it's also been very practical because we can start to see how we can fool something, we need to figure out natural language processing. We need to figure out the other ways that machine learning systems are being used, and we need to come up with clever ways to fool them. Finally, spread the word. I don't want the conversation to end here; I don't want the conversation to end at Congress. I want you to go back to your hacker collective, your local CCC, the people you talk with, your coworkers, and I want you to spread the word. I want you to run workshops on adversarial learning. I want more people to stop treating this AI as something mystical and powerful, because unfortunately it is powerful, but it's not mystical. We need to demystify the space, we need to experiment, we need to hack on it, and we need to find ways to play with it and spread the word to other people. Finally, I really want to hear your other ideas. And before I leave today, I have to say a little bit about why I decided to join the resiliency track this year. I read about the resiliency track and I was really excited; it spoke to me. I want to live in a world where, even if there's an entire burning trash fire around me, I know there are other people I care about, whom I can count on and work with, to try to protect at least portions of our world, to protect ourselves, and to protect people who do not have as much privilege. So what I want to be a part of is something that can use the skills I have and the skills you have to do something with that. And your data is a big source of value for everyone. Any free service you use is selling your data. Okay, I don't know that for a fact, but I feel very certain that they're most likely selling your data.
And if they're selling your data, they might also be buying your data. There is a whole market, legal and freely available, to buy and sell your data. They make money off of that, and they mine more information and make more money off of that, and so forth. So I will read a little of the opinions I put forth on this. Determine who you share your data with and for what reasons. The GDPR and data portability give us European residents stronger rights than most of the world. Let's use them. Let's choose privacy-concerned, ethical data companies over corporations that are entirely built on selling ads. Let's build startups, organizations, open source tools, and systems that we can be truly proud of, and let's port our data to those. Amazing, we have time for a few questions. I'm not done yet, sorry. It's fine. I'm so sorry. Let's go, no big deal. So, closing remarks, a brief roundup. Machine learning is not very intelligent. I think artificial intelligence is a misnomer in a lot of ways, but this doesn't mean people are going to stop using it. In fact, very smart, powerful, and rich people are investing more than ever in it. So it's not going anywhere, and it potentially becomes more dangerous over time, because as we hand over more to these systems, they could control more and more of our lives. However, we can use adversarial machine learning techniques to find ways to fool black-box networks, and we know we don't have to have perfect knowledge. Still, information is powerful, and the more information we do have, the more we're able to become a good GDPR-based adversary. So please use the GDPR, and let's discuss ways we can share information.
Finally, please support open source tools and research in this space, because we need to keep up with the state of the art, so we need to keep ourselves moving and open in that way. And please support ethical data companies, or start one. If you come to me and you say, Catherine, I'm going to charge you this much money, but I will never sell your data and I will never buy your data, I would much rather you handle my data. So I want us, especially those within the EU, to start a new economy around trust and privacy and ethical data use. Thank you very much. Thank you. Okay, we still have time for a few questions. No, no, no worries. Fewer than last time when I walked up here. Yeah, now I'm really done. Come up to one of the mics in the front section and raise your hand. Can I take a question from mic one? Very interesting talk. One impression I got during the talk was that with the adversarial learning approach, aren't we just doing pen testing and quality assurance for the AI companies, and they're just going to build better machines? That's a very good question. And of course, most of this research right now is coming from those companies, because they're worried about this. What they've shown, however, is that they don't really have a good way to defend against this. Most likely they will eventually need to use a different type of network. So probably, whether it's the blind spots or the linearity of these networks, they are easy to fool, and they will have to come up with a different method for generating something robust enough not to be tricked. So to some degree, yes, it's a cat-and-mouse game, but that's why I want the research and the open source work to continue as well. And I would be highly suspect if they all of a sudden figured out a way to make a neural network, which has proven linear relationships that we can exploit, non-linear.
And if so, it's usually a different type of network that's a lot more expensive to train and that doesn't actually generalize well. So we're really going to hit them in a way where they're going to have to be more specific and try harder, and I would rather do that than just give up. Next one, mic two. Hello, thank you for the nice talk. I wanted to ask, have you ever tried looking at it from the other direction, like feeding the companies falsely classified data, and doing it with such massive amounts of data that they learn from it at a certain point? Yeah, so those are poisoning attacks. When we talk about poisoning attacks, we're essentially feeding in bad training data, and we're trying to get them to learn, I wouldn't say bad things, but false information. And that already happens by accident all the time. So if we share information, and they have a publicly available API where they're actively learning from our input, then yes, I would say poisoning is a great avenue of attack, and we can also share information about how that works. I would be especially intrigued if we can do poisoning for adware and malicious ad targeting. Okay, thank you. One more question from the internet, and then we run out of time. So we can start. Oh no, sorry. Okay. Thank you, one question from the internet. What exactly can I do to harden my model against adversarial samples? Sorry? What exactly can I do to harden my model against adversarial samples? Not much. What they have shown is that if you train on a mixture of real training data and adversarial data, it's a little bit harder to fool, but that just means you have to try more iterations of adversarial input. So right now, the recommendation is to train on a mixture of adversarial and real training data and to continue doing that over time. And I would argue that you maybe need to do data validation on input.
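(The mixed-training recipe from that answer can be sketched on a toy linear model. The data, epsilon, and learning rate below are all invented for illustration: each training step augments the clean batch with FGSM-perturbed copies of itself, so the model repeatedly sees its own adversarial blind spots.)

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(X, w, b, y, eps):
    """Batch FGSM: perturb each input in the loss-increasing direction."""
    p = sigmoid(X @ w + b)
    return X + eps * np.sign((p - y)[:, None] * w)

# Toy data: the label depends only on the sign of the first feature.
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)

w, b, lr, eps = np.zeros(5), 0.0, 0.5, 0.3
for _ in range(200):
    # Adversarial training: mix clean and adversarial copies each step.
    Xb = np.vstack([X, fgsm(X, w, b, y, eps)])
    yb = np.concatenate([y, y])
    p = sigmoid(Xb @ w + b)
    w -= lr * Xb.T @ (p - yb) / len(yb)
    b -= lr * np.mean(p - yb)

clean_acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5))
adv_acc = np.mean((sigmoid(fgsm(X, w, b, y, eps) @ w + b) > 0.5) == (y > 0.5))
```

As the answer says, this raises the cost of the attack rather than eliminating it: a larger epsilon, or more attack iterations, will still find misclassified points.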
And if you do data validation on input, maybe you can recognize abnormalities, but that's because I come mainly from a production background, not a theoretical one. And I think maybe you should just test things, and if they look weird, you should maybe not take them into the system. And that's all for the questions. I wish we had more time, but we just don't. Please give it up for Katharine Jarmul. Thank you.