I would like to welcome you to our first talk of the day. This is John; he is the only person I've ever approved a machine learning talk for. I'm on the CFP review panel for quite a number of conferences and I have never approved one of these, so we're all going to give him a really hard time. Ready? Go!

Thanks. Don't worry guys, it's not going to be one of those talks. Hey guys, this talk is called Learning to Listen. I'm going to be talking about detecting rogue APs with machine learning, but in a very specific context. This relates to my own personal experience with hacking and giving presentations at conferences, and my own journey of how I got here in the first place. So if you'll follow me on this wonderful adventure, I promise we'll have some fun.

So I'm John Dunlap. I work for a company called GDS Security. We're based principally out of New York, but we also have offices in London and a handful of other places. I work on security research, reverse engineering, pen testing, the whole nine yards. I do a lot of research; I'm actually doing three talks at DEF CON this year, so I'm having a really fun time. And I swear this will not be a buzzword-laden talk. I promise I will not say all of these words in succession and confuse everybody with a crazy, dense machine learning talk. Although, let's see: I did not say "AI revolution" or "Keras" or "APT" or "cyber," so good on me.

So who here knows Gabe Ryan, the master of Wi-Fi hacking? If you know Gabe, he's really, really into Wi-Fi hacking. He talked at DEF CON two years ago, last year, and this year; he already did a talk in the big presentation hall, and he talks a lot about really complex enterprise EAP attacks. The story of Gabe and his attacks has a lot to do with the genesis of this talk. I met Gabe when I was a slightly younger security engineer and he was brand new at my company. Within about a month of joining, in his first security engineer job, he came up with this research for detecting rogue APs. It really inspired me that he had the gumption to just go for it with very little experience, and in a way it inspired me to go give talks and do research myself. So I thought I would return the favor to Gabe, for inspiring me to start my speaking career, by tearing apart one of his tools, playfully, lovingly.

His good tool is called EAPHammer; he has a lot of good tools. You should try EAPHammer. It's really hot. It helps you do enterprise EAP attacks. But two years ago he wrote a different tool, and that is the subject of this talk. It's called Sentry Gun. The idea with Sentry Gun was that he wanted a way to monitor for rogue AP attacks using little listening devices. You would deploy a bunch of little monitors with antennas all around whatever area you're trying to protect, and they would listen passively for 802.11 probe requests. They would measure the signal strength of what they heard against a predefined statistic, checking whether it was higher or lower than that baseline, and the results would get sent to a server. When the threshold was violated, the tool gave you the ability to slay the rogue AP by sending some kind of DoS attack against it.
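Conceptually, the sensor side looks something like the following. This is just a minimal sketch of the passive-listening idea, assuming scapy and an interface already in monitor mode; the interface name is a placeholder, and this is not Sentry Gun's actual code.

```python
# Minimal sketch of a passive probe-request sensor (not Sentry Gun's code).
# Assumes a wireless interface already in monitor mode, e.g. "wlan0mon".
from scapy.all import sniff, Dot11, Dot11ProbeReq, RadioTap

def handle_frame(pkt):
    # Record the transmitter MAC and per-packet signal strength for each
    # 802.11 probe request we overhear.
    if pkt.haslayer(Dot11ProbeReq):
        rssi = pkt[RadioTap].dBm_AntSignal  # per-packet signal strength, dBm
        print(f"{pkt[Dot11].addr2}  {rssi} dBm")

sniff(iface="wlan0mon", prn=handle_frame, store=False)
```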
It was part of my job working at GDS at the time, since I was a little more senior than him, to review his slides and offer critiques, and I got to have a good time giving him a hard time. But since he was so inspiring to me, I thought I would return to that and see if I really could outdo his methodology. So yeah, Gabe wants to take out rogue APs. That's pretty metal. If you didn't know, Gabe plays in a metal band; he's a very heavy metal kind of person.

Here's the Sentry Gun GitHub repo. If you haven't seen it, you should give it a shot. It has a nice little web interface, and I think it's cute that it lets you DoS the rogue APs. It also helps you locate them. The location information is based on the differing signal strengths seen by the various antenna sensors you have in the room. So you've got a spatially distributed sensor array. But you have to benchmark this sensor array against your trusted APs in a trusted environment, which kind of stinks, because you might not have a trusted environment if you're in a public place. You have to find a day when no one's in that room.

Now, what is the statistic we're using to figure out if a signal is out of band? It's basically the arithmetic mean. The way Gabe's tool works is it samples exactly 15 packets, takes the signal strength of each of those, and takes the mean. If you're past a certain deviation up or down from that mean, you're a rogue AP. And it only really considers you for rogue AP status if you have the same SSID and BSSID as a trusted AP. So you have a whitelist, and if you show up with a weird signal strength on the same SSID and BSSID as something on the whitelist, you're considered a rogue AP and put up for DoSing.
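In other words, the detection logic boils down to something like this sketch. The deviation threshold and the whitelist layout here are my own illustrative choices, not Gabe's exact code:

```python
import statistics

DEVIATION = 10  # dB; the alert threshold is an illustrative value

def train_baseline(rssi_samples):
    # Gabe's statistic: the arithmetic mean of exactly 15 sampled packets.
    return statistics.mean(rssi_samples[:15])

def is_rogue(rssi, ssid, bssid, baseline, whitelist):
    # Only frames claiming a whitelisted SSID/BSSID pair are considered,
    # and they are flagged when the signal strength is out of band.
    return (ssid, bssid) in whitelist and abs(rssi - baseline) > DEVIATION
```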
And this actually worked pretty well in practice. He was able to find and eliminate rogue APs in his tests, and using the multi-sensor setup he was able to physically locate the attackers. That said, when he first gave me the pitch for the talk, I thought: really, just the mean? Just the mean? See, I'm one of those engineers who's always trying to build a Rube Goldberg machine, and I instantly thought to myself, that's not enough math. That cannot possibly be enough math. And this is too easy; surely someone's tried this before. But I had to give Gabe respect, because it worked really well in practice.

So I demanded more. I want more math. I want more features. I want a deeper characterization of attackers, and possibly greater accuracy. I knew it worked in practice because I've run Gabe's tool a couple of times, and two years ago, and I think a little bit this year, he tried his tool out at B-Sides. They let him run it in an active threat environment and he found a lot of rogue APs. But what was never established was: what is the false positive rate of this tool? What is its accuracy? So I wanted to establish those things on my own, see if we couldn't find a better, more adaptive method for characterizing these kinds of attacks, and maybe utilize a little more of the academic literature on finding rogue APs, which is kind of interesting.

So how about machine learning? We have two very similar signals and we want to classify them, and machine learning is very good at classifying things; it's one of the primary things it does. So I thought I would develop some kind of classifier for these signals. Even if I didn't know the particular function that made the signals different from each other, the machine learning could find it for me.

My idea was to add machine learning algorithms to Gabe's tool: train a model to better identify and classify attackers, find the signals that didn't work well with Gabe's algorithm, and maybe, in the future, predict recurring attacks and locations. I got about half of that in my research, and I'm really proud of the results. I'll show you what I did and did not get.

But first, let's talk about prior work. The first time I saw Gabe's presentation I said to myself, someone must have tried this before. And the answer was no. No one had really done a practical implementation quite like this. There are a lot of close calls, and if you read journal papers there are a lot of people proposing algorithms, but not a lot of practical tests and benchmarks, which sometimes sucks.

The first paper we'll talk about is Yang, Song, and Gu, from 2012: "Active User-Side Evil Twin Access Point Detection Using Statistical Techniques." This is the first thing that came close to both my and Gabe's idea. It's not quite machine learning, it's statistics, but sometimes those are the same thing; Bayesian statistics can be machine learning, right? The idea is to find rogue APs using a number of statistics of their own invention, including a hop differentiation technique, trained mean matching, and a sequential probability ratio test, which is a term they made up. And they take a client-side approach: this would run on the workstation we're worried about connecting to the rogue AP. That's a differentiating thing you'll see in a lot of these papers.

I wanted to limit my options to techniques that were essentially similar to Gabe's approach. If you look at the market for rogue AP detection, there are a bunch of variations on the overall architecture. In enterprise products, the most common thing you see is basically a box connected to both the LAN and the Wi-Fi that measures latencies and checks whether something shows greater latency than its relationship to the LAN and the Wi-Fi would suggest. I avoided that purposefully, because that's not what Gabe's tool does. Gabe's tool is a passive one that doesn't need a lot of intrusive access to the network. You'll see what I mean in a second in terms of options I left by the wayside simply because they weren't similar to Gabe's tool; I wanted to improve on Gabe's idea specifically.

The important idea to extract from Yang, Song, and Gu is measuring hops: maybe we connect to the rogue AP and see if the path has more hops than it should. If the rogue AP relays through the legitimate AP to reach the WAN, there will be one extra hop. But that requires intrusive access; we actually have to associate to the rogue AP. Again, that's not in the spirit of Gabe's tool, so we're not going to do it. The measuring of latencies, though, is a good idea, and we should think about it.

Here's another paper, from 2013: Kim, Seo, Shin, and Moon, "A Novel Approach to Detection of Mobile Rogue Access Points." Again, we're measuring latencies for round-trip time analysis. This is a good idea: maybe a rogue AP will have much greater latency. They make the very modern inference that a rogue AP will likely be backhauled over some kind of 3G wireless. If you've ever been on a red team with a jump box, you usually have something connected to the victim network on one side and 3G on the other, so you can exfil your data without depending on the victim network. And obviously, 3G is going to add a great deal of latency. If you can measure that latency when associating to the endpoint, that's a good indication you're on a rogue AP. But there's a catch, and the catch is that you have to associate to the AP. That's not in the spirit of Gabe's tool, but it gets us thinking about time, round-trip time analysis, and techniques that might be applicable to the tool Gabe wrote and I'm extending.
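Just to make the round-trip-time idea concrete: since these are user-side techniques, a measurement can be as simple as pinging the gateway through the suspect AP and comparing against a known-good baseline. This is an illustrative sketch of the general approach those papers describe, which I deliberately avoided since it requires associating; the gateway address and sample count are placeholder values.

```python
import statistics
import subprocess

def median_rtt_ms(host="192.168.1.1", count=10):
    # Median round-trip time to the gateway, parsed naively from the
    # system ping tool ("time=12.3 ms" lines on Linux).
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    times = [float(line.split("time=")[1].split()[0])
             for line in out.splitlines() if "time=" in line]
    return statistics.median(times)

# A suspiciously high median versus a wired baseline suggests an extra
# wireless hop, or a 3G backhaul, behind the AP we associated to.
```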
Here's another one: Jana and Kasera, "On Fast and Accurate Detection of Unauthorized Wireless Access Points Using Clock Skews." The idea here is that every device has a slight aberration in its clock: as time goes on, its notion of time slowly drifts off base, and we can actually detect that from the timestamps sent in packets. We read those timestamps, compare them to our own system's time, and watch how they gradually deviate; that gives us an idea of how far off the device's clock is. It's a good paper because it gives a decent algorithm for doing that, but in general you can just subtract those two values over time and get an idea of the clock skew, and the skew will differ from device to device. And that's really what we want. We want to differentiate not just on signal strength, because that can be iffy, but on the behavior of the device's Wi-Fi card, which differs from device to device. By being more general about it, we might find interesting stuff.

One technique I left out, which I unfortunately don't have a slide for, is that a lot of people are working on ways to measure delays based on the 802.11 protocol, where there are mandatory delays between certain actions that will be slightly different based on the device itself. That can be problematic because you need very high-resolution time measurement for it to work out.
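As a rough sketch of that subtraction, my own simplification rather than the paper's actual algorithm: fit a line to the offsets between the packet timestamp field and our local receive time, and the slope is the skew.

```python
import numpy as np

def estimate_skew_ppm(receive_times, beacon_times):
    # receive_times: local receive times in seconds.
    # beacon_times: the timestamp field from each frame, in seconds.
    # The offset between the device's advertised time and our clock drifts
    # linearly; the slope of a line fit through (local time, offset) pairs
    # is the clock skew, and it differs from device to device.
    t = np.asarray(receive_times, dtype=float)
    offsets = np.asarray(beacon_times, dtype=float) - t
    slope, _intercept = np.polyfit(t, offsets, 1)
    return slope * 1e6  # dimensionless drift expressed in parts per million
```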
So, what about learning? We're going to run a machine learning experiment and see if we can't use what we just learned from those papers to design a detection algorithm with a good false positive rate that detects these things reliably. We're going to pick some side channels: clock skew, signal strength, and timestamps. We'll put those together as our data set to try to differentiate the APs.

Next, we have to pick a good machine learning algorithm. What are our requirements? First of all, this is inherently time-series data. What that means, if you're not familiar with the term, is that the signal strengths we're sampling shouldn't be viewed independently. They should be viewed as a sequence of events, because they could be oscillating in something like a sine wave, and that could be more distinctive than the signal strength itself. It could be that the way the Wi-Fi card on a particular device scans only peaks every seven seconds, or something like that. That's a very simplified way of looking at it, but it's something a machine learning algorithm could easily detect that wouldn't be so obvious in the check-the-average-signal-strength situation we have in Gabe's original implementation.

We also want something that can work with multiple features, because we want to stuff in that clock skew and that timestamp, right? We want to feed in a vector to be learned as a time series. And we want something that can do classification: we'll label our examples ahead of time and say, this is what a rogue AP looks like, this is what a nominal AP looks like, and it will differentiate them and give us a good estimate of how confident it is that something is actually a rogue AP.

Another thing to keep in mind is that if attackers are aware of this model, they might be able to adversarially influence it if they can affect our training data. In my case, one of the downsides is that we have to collect training data ahead of time, and in my design, as we'll see, you have to collect training data for each endpoint individually to characterize it. That isn't such a big deal, but it kind of sucks. It does mean you can do this characterization offline, in an isolated room rather than on the production network, because we're just characterizing the behavior of the card itself, so that should be relatively fine. We also want to be sure we're doing our training correctly, because our adversaries might be heterogeneous; they might not all behave the same. We don't want to accidentally train our network to detect one particular adversary, or our example adversary. We want to learn the difference between the nominal AP and any adversary, not just the one we trained on.

So I went with a type of neural network called an LSTM, or long short-term memory network. It's a type of recurrent neural network. I'm not going to spend 30 minutes fully characterizing what that means, but it's a type of neural network that can selectively keep track of long-term events. Recurrent neural networks have the power to look back a couple of steps; long short-term memory networks can hold on to bits of data they find interesting over longer spans of time. That's good for us, because we don't know the scale of the aberrations we're looking for in the signal or the timestamps; the LSTM will help figure that out. I considered a couple of other things. One of the to-do items I had was trying adversarial networks, if you're familiar with those, but the LSTM worked so well that I was not highly motivated to try other architectures, as you'll see in a second. There's a Wikipedia definition of an LSTM on the slide if you're interested. Again, there's a lot of data science terminology to digest there, but what's important to know is that we're able to detect sequences not just in the short term but in the long term, which is what we want.

So, when we talk about neural networks and machine learning, we talk about features, or parameters. I used four: a training label, the signal strength, the packet's timestamp, and the time the packet was received.
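Assembled into training data, that looks roughly like the following sketch; the window length and the exact record layout are my own illustrative assumptions, not the precise format I used:

```python
import numpy as np

def make_sequences(packets, window=32):
    # packets: list of (label, rssi, packet_timestamp, receive_time) tuples.
    # Slice the stream into overlapping fixed-length windows so the LSTM
    # sees sequences rather than independent packets. The window size is
    # an assumption, not the talk's actual value.
    X, y = [], []
    for i in range(len(packets) - window):
        chunk = packets[i:i + window]
        X.append([[p[1], p[2], p[3]] for p in chunk])  # 3 features per step
        y.append(chunk[-1][0])                         # label for the window
    return np.asarray(X, np.float32), np.asarray(y, np.float32)
```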
I didn't explicitly put clock skew in there, because I wanted to see if the neural network would pick that relationship up on its own, and I'm pretty certain it did. In the future I might put it in explicitly, along with a couple of other side channels I find. I ran about 2,000 rounds of training, which isn't very much; we're talking maybe 30 seconds of training time. I didn't have to use Keras or Google Compute or anything like that. It's really pretty light. And then I had to implement some protocols to avoid overtraining. Overtraining is the problem you run into in machine learning where, if your examples are too specific, you might not get a general enough model out of them. If we don't have a whole variety of adversaries, we might overtrain and only detect the one we trained on. Luckily, we have some methodologies for picking up on whether that's happening.
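As a concrete picture of what such a setup might look like, here's a minimal sketch in tf.keras. I used the lower-level TensorFlow API, so the layer width, epoch count, and callback wiring here are illustrative assumptions, not my actual code; the validation split and the TensorBoard callback are the overtraining guard I just mentioned.

```python
import tensorflow as tf

# A binary rogue/not-rogue classifier over the windows built earlier.
# Hypothetical sketch: the layer size and epoch count are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(32, 3)),   # 32 steps, 3 features
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(rogue)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Held-out validation data plus TensorBoard logging: if validation accuracy
# stalls while training accuracy climbs, we are memorizing our one example
# adversary instead of learning the general difference.
tensorboard = tf.keras.callbacks.TensorBoard(log_dir="logs/rogue_ap")
# X, y = make_sequences(packets)  # from the earlier sketch
# model.fit(X, y, epochs=20, validation_split=0.2, callbacks=[tensorboard])
```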
To give you the results of my machine learning experiment, here is the TensorBoard graph. For a little background, I did this all in TensorFlow, which is Google's machine learning API; it lets you implement these neural networks very easily and has a lot of predefined bones. TensorBoard is its automated visualization setup. I got about 90% accuracy on the predictions: about 90% of the packets I put through it were correctly identified as rogue or not. If you look at the right side of the graph, we have the accuracy/cost graph, which basically indicates that we're probably not overtraining. In an overtraining situation, that graph would stagnate or go up. Here it shows the neural network getting more accurate with each generation, which is what we'd expect from a proper setup. Every time I run this, it gets better, and it has a relatively high accuracy rate. In machine learning terms we could go much higher if I really wanted to optimize it, but in this case it's pretty impressive. So we have about 90% accuracy, we're able to detect recurring patterns in the data, and we have no signs of overtraining, which is pretty freaking sweet if you ask me. And now we have a model we can apply to live data and see what it does.

But now we have to ask ourselves: did we do better than before? After calculating it, Gabe's false positive rate is about 50%, something like that, which in practice is pretty good when you're talking about each individual packet; if you miss a couple, that's fine, because you're meant to go do something about it the moment you get an alert. My false positive rate is maybe 10%, which is pretty freaking good. One of the reasons he had the high false positive rate is that, remember, he has to train ahead of time on the mean of the signal strength, and the actual signal strength on the network changes as time goes on; the true mean drifts, but the tool was trained on the beforehand mean, if that makes sense. With more parameters in the future we might do even better, but I'm pretty impressed with 90%. I showed this to Gabe maybe ten minutes ago and he lost his mind; he said I made his tool perfect. And believe it or not, that's just about it.

I finished a little early here, so I'd be willing to take questions about the approach, the algorithm, machine learning. I hope I kept this fairly buzzword-free. Go for it.

Yeah. So that's one of the problems I identified with Gabe's original algorithm: the signal strength in absolute terms, per packet, is unreliable. Even on a relatively quiet network, and I tested this with a bunch of Raspberry Pis and a couple of legit APs I had, that signal strength will vary a great deal for no good reason, and the average will vary a great deal with it. But what does appear to be detectable is the fluctuations, and adding the timestamp and receive time together helps the neural network achieve a very high success rate. And I did find the rogue attacks. Go for it.

Yeah, so that's the next step of this research. I didn't quite get to it, but there are some really good papers on using varying RSSI and signal strength to locate people. It didn't make it in here because I didn't finish it, but I wanted to do that without having multiple sensors; that was more interesting to me. There's even a prototype antenna-on-a-servo thing to help make that happen. Another paper I left out, which I should have put in here, is that people are working on doing similar things to what I'm doing with RSSI, but using SDRs to fingerprint the radio signal in a more general sense. If you're not thinking just about 802.11 packets but about the actual shape of the radio signal, there's a much more characteristic thing to learn there. So maybe in the next couple of months I'll grab an SDR and try that out, because it seems like a very good ongoing method. Anybody else?

I am releasing the code soonish. Right now I'm in the process of cleaning it up and integrating it into Gabe's web front end so that you can use it without asking me a billion questions about how to set it up. But yes, watch my Twitter; it should pop up soon. I'm one of those skittish coders who's embarrassed of his code; I don't want someone in a future job interview asking, why did you write your Python like that? So soon, I promise. The handle was at the front of the talk: John Dunlap 2. Here, I'll flip back to the front slide. Yeah, there you go. Okay, any other questions? Yeah, go for it.

No, no, I do have another one, at one, in the Biohacking Village, if you want to see me; I'm inserting encrypted data into living organisms. No, really. Okay, anybody else? All right guys, it's been great.