I'm basically going to talk about geospatial intrusion detection and also how to beat it. I absolutely loathe public speaking, so bear with me. If I pass out, someone just continue on, just jump up here; the slides should walk you through it. Okay, I only have 20 minutes, so I'm basically going to go right for the jugular. There's going to be no beating around the bush, no tap dancing. So the agenda: I'm going to talk about the current state of mapping security alerts and touch upon the actual methodology of geospatial intrusion detection systems. Obviously vendors and accuracy always play a huge role in this, and then obviously I'll touch upon how to beat it. Okay, where IT security and mapping collided. Mapping has definitely gotten a lot more publicity with Google Earth and Yahoo Maps and all that type of stuff. Many security firms have implemented mapping tools in their products to varying degrees. Here's a couple of screen captures of some of the bigger players. John Goodall sent me this image of Meerkat; this product specifically is literally just putting geographic locations to wireless signals. This one's another one. This is a Korean-based network security product called VizNet. It's pretty kickass. This is one I did in grad school. I just called it GSWAT for some odd reason. Now a lot of these tools have very little going on behind them. They're basically taking the source IP address, translating it to lat-long coordinates, and then mapping it. It's giving the security analyst a little bit of information as far as geographically where it's coming from, but that's about it. It's basically eye candy at this point. Now this is where the geospatial intrusion detection system comes into play. It's basically putting a backbone behind it, giving it a bite. So the goal is to find a correlation between externally based network alerts by plotting their source locations on a geographic map.
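As a rough illustration of the "source IP to lat-long" step these tools perform, here is a minimal Python sketch. The table rows, the coordinates, and the function name `ip_to_latlon` are all made up for the example (the addresses are reserved documentation ranges); a real vendor translation file would be far larger and have its own format.

```python
import bisect
import ipaddress

# Hypothetical translation table: (first IP, last IP, latitude, longitude).
# These rows are invented for illustration and use documentation ranges.
RANGES = [
    ("192.0.2.0", "192.0.2.255", 38.83, -77.31),
    ("198.51.100.0", "198.51.100.255", 53.47, -2.23),
    ("203.0.113.0", "203.0.113.255", 37.57, 126.98),
]

# Convert addresses to integers and sort by range start for binary search.
_TABLE = sorted(
    (int(ipaddress.IPv4Address(first)), int(ipaddress.IPv4Address(last)), lat, lon)
    for first, last, lat, lon in RANGES
)
_STARTS = [row[0] for row in _TABLE]


def ip_to_latlon(ip):
    """Return (lat, lon) for an IPv4 address, or None if no range covers it."""
    n = int(ipaddress.IPv4Address(ip))
    # Find the last range whose start is <= n, then check its end.
    i = bisect.bisect_right(_STARTS, n) - 1
    if i >= 0 and _TABLE[i][0] <= n <= _TABLE[i][1]:
        return _TABLE[i][2], _TABLE[i][3]
    return None
```

Each alert's source address would be passed through a lookup like this before plotting; addresses outside every known range simply come back as `None`.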
This by no means is a silver bullet for companies looking to defend. It's basically giving the security analyst one more piece, one more component of information, so that they can make a better determination of what to do with that alert. The theory supporting this is, okay, so you have a hacker. He's coming from zombie computers. He's going to compromise those zombie computers by doing a sequential IP scan. It's common knowledge that ISPs divvy out IP addresses by geographic location. So if a company defending against this can basically read in, suck in all that information and run some type of pattern recognition against it, they can find the zombies and find out what the attacker's trying to home in on. Okay, so I'm going to do a very high-level explanation of the GID methodology. You want to first off just eliminate as many friendlies or false positives as possible. There's a couple of different ways to do this. One is by mapping the street addresses of the clients, of strategic partnerships, of remote locations. I've even gotten to the point where I've mapped my system administrators and their home addresses, because they always create false positives. Let's see, IP translation. Maybe if you don't have that street address, you can take where they're coming from, that IP address, and basically do the same thing with the geolocation translation. Another thing is to create an IDS alert when a customer is actually authenticated to your website. It's very easy with Snort, piece of cake. And obviously it's not bulletproof. A hacker doing due diligence is probably a customer, but at the same time it's a risk you're going to have to take. Now with each of these steps, I want to give you a little screen capture of what I'm actually talking about. So let's see, the triangles represent Snort alerts. The circles represent the actual friendlies, whether it's a remote branch or a system administrator's home address.
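The "eliminate the friendlies" step could be sketched roughly as follows, assuming both the alerts and the known-friendly locations have already been turned into lat-long pairs. The function names, the alert dictionary layout, and the 40 km default radius (roughly 25 miles) are assumptions for illustration, not the speaker's actual implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat-long points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drop_friendlies(alerts, friendlies, radius_km=40.0):
    """Keep only alerts farther than radius_km from every friendly location.

    alerts: list of dicts with "lat" and "lon" keys (hypothetical layout).
    friendlies: list of (lat, lon) tuples for branches, partners, admins' homes.
    """
    kept = []
    for alert in alerts:
        near_friendly = any(
            haversine_km(alert["lat"], alert["lon"], f_lat, f_lon) <= radius_km
            for f_lat, f_lon in friendlies
        )
        if not near_friendly:
            kept.append(alert)
    return kept
```

The radius is the knob to watch: too small and friendlies survive because of translation error, too large and an attacker sitting near a branch office gets filtered out along with it.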
So with this quick little map, doing the elimination of friendlies reduces the false positives and reduces the IDS data set by roughly 30%. Now again, at this level, looking at the whole country, you would obviously have to dig in a little bit closer, because even with these triangles and circles sitting on top of each other, they still could be 30, 40, 50 miles apart. But again, I just wanted to give you a high-level representation. Okay. The time series: the timestamp is very important. Obviously when you're doing some level of data mining, and with IDS alerts there are just so many false positives and so many alerts, you can't analyze the entire data set. You've got to break it into smaller chunks. So you want to plot rolling time series of one week, two weeks, four weeks. Again, you're trying to target the professional hacker. It's not going to be the high school student that got home from school and is doing a huge, aggressive Nmap scan. You want to target the low and slow to really hit the hackers. Now the rolling: fixed is Monday to Monday, Monday to Monday, Monday to Monday. You don't want to do that. You want to do Monday to Monday, but also Tuesday to Tuesday, Wednesday to Wednesday. You want to give it a rolling time series, because obviously hackers, or crackers or whatnot, aren't going to line up with your interval. Okay. So this is when the actual juice comes into play. Once you've narrowed that IDS data set down to a volume that you can handle, you want to run it through a GIS clustering algorithm. There are multiple clustering algorithms out there: Poisson, nearest neighbor, Moran's I index, Ripley's K function. Any of these you can play with.
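The rolling (rather than fixed) time series described above might look something like this in Python. The function name `rolling_windows` and the alert layout (a dict with a `"time"` key holding a `datetime`) are assumptions for the sketch; each window it yields would then be handed to whichever clustering algorithm is in use.

```python
from datetime import datetime, timedelta


def rolling_windows(alerts, window_days=7, step_days=1):
    """Yield (start, end, alerts_in_window) for overlapping time windows.

    With window_days=7 and step_days=1 this gives Monday-to-Monday,
    Tuesday-to-Tuesday, and so on, instead of fixed weekly buckets.
    Each window covers start <= time < end.
    """
    if not alerts:
        return
    ordered = sorted(alerts, key=lambda a: a["time"])
    # Start at midnight on the day of the earliest alert.
    start = ordered[0]["time"].replace(hour=0, minute=0, second=0, microsecond=0)
    last = ordered[-1]["time"]
    span = timedelta(days=window_days)
    step = timedelta(days=step_days)
    while start <= last:
        end = start + span
        yield start, end, [a for a in ordered if start <= a["time"] < end]
        start += step
```

Running it with `window_days` set to 7, 14, and 28 gives the one-, two-, and four-week series; a low-and-slow scan that straddles a fixed week boundary still falls entirely inside at least one rolling window.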
What I did in my research is I hired three independent GIS firms and gave them a huge data set of approximately a year, 450,000 IDS alerts, and I told them to tell me what's the best GIS algorithm to use. I don't want to get into who they specifically were, but of course they each came back with a different GIS algorithm. So again, this is still in the testing phase. I'm leaning more towards the Poisson, which you can Google; I don't have time to dive into it specifically, or in Q and A afterwards I can show you how it works. Once you've identified the clustered spots, you want to extract those to again dive in a little bit deeper. Now what I've been doing currently is, once I extract those IDS alerts, I've been manually going through them, which is an absolute nightmare. So what I'm doing right now in my research is evaluating running them through some type of weight-calculating algorithm, and working with some of the guys at George Mason that developed a product called Cauldron that does attack graphs, so I'm trying to automate it as much as possible. Okay, and typically with this level of conversation, the biggest question is accuracy and vendors. Who's offering these translation files and how accurate are they? The vendors that I know of: IP2Location, MaxMind, Quova, Digital Envoy, and the other two, sorry. The vendors are using approximately 12 different techniques to more accurately tell you where those lat-long coordinates are. And again, in Q and A afterwards I'll actually tell you what the techniques are. I just don't have time. Okay, so accuracy. These are two specific quotes off of the vendors' websites. "Our GeoIP databases are over 99% accurate on the country level." That's not too shocking. "90% accurate on the state level and 80% accurate for the US within a 25 mile radius." Now again, with MaxMind, who's actually auditing that? That could just as easily be a marketing ploy.
Quova is a little bit different because they actually have PricewaterhouseCoopers doing their auditing. So here are the statistics: Quova's country-level accuracy was measured at 99.9%. US state-level accuracy was measured at 96.3%. But again, accuracy is ultimately determined by the desired level of need. And notice how a lot of the vendors won't actually tell you what their accuracy is at a city level. Here's an example of less accurate translations. Now you can obviously see with the plotting of the Snort alerts that there's something called striping going on. It's pretty easy to spot. So there's something wrong in their algorithm. Okay, so how to beat it? Typically right now, hopefully you're thinking, okay, this jackass has piqued my interest a little bit. Now, how is he going to tell me how I can actually beat it? Well, you take that translation file and you just reverse engineer it. Typically what they're doing is you feed them an IP address and they feed you lat-long coordinates and zip codes and phone numbers and who the ISP is and what broadband they're using and all that type of stuff. Just reverse engineer it. Actually, let me jump back to this slide. So I developed this product that basically sucks in the entire database of source IP to lat-long coordinates and inverts it, so I can give it lat-long coordinates and it'll give me all the IP addresses within those lat-long coordinates. So here are just the screen captures. It's basically just web-based. You zoom in, click and drag. It shows you all the cities within your click and drag. Dive in a little bit deeper: University of Manchester in England. And once you do the fetch, it'll actually give you the IP addresses.
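Conceptually, inverting the translation file is just running the lookup the other way: filter the rows by coordinates instead of by address. A minimal sketch under invented data follows; the rows, labels, and the function names `ranges_in_box` and `expand` are hypothetical, the addresses are reserved documentation ranges, and the box is assumed not to cross the 180-degree meridian. This is not the speaker's tool, only an illustration of the idea.

```python
import ipaddress

# Hypothetical translation rows: (first IP, last IP, latitude, longitude, label).
ROWS = [
    ("192.0.2.0", "192.0.2.255", 38.83, -77.31, "Fairfax"),
    ("198.51.100.0", "198.51.100.255", 53.47, -2.23, "Manchester"),
    ("203.0.113.0", "203.0.113.255", 37.57, 126.98, "Seoul"),
]


def ranges_in_box(rows, south, west, north, east):
    """Return (first, last, label) for every row located inside the box."""
    return [
        (first, last, label)
        for first, last, lat, lon, label in rows
        if south <= lat <= north and west <= lon <= east
    ]


def expand(first, last, limit=5):
    """List up to `limit` individual addresses from the start of a range."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    stop = min(end, start + limit - 1)
    return [str(ipaddress.IPv4Address(n)) for n in range(start, stop + 1)]
```

The click-and-drag selection on the map supplies the four box edges; everything else is a table scan, which is why the translation files are so easy to turn around.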
Now, a lot of people don't realize how dangerous this is, because if you think about it, you can basically do schools; you can even take Google Earth and plot out a street address, whether it's Panera or Starbucks, and then once you zoom into it, you can actually get the IP address of that store, office, military base, contractor, subcontractor, prime, secondary, any of that type of stuff. And it's basically allowing you to narrow your target down so you don't have to rely on the WHOIS database or anything that's really not monitored. So it's definitely a lot more dangerous. Okay, so let me jump back to this. Now, the accuracy... network security isn't the prime driver for this accuracy. Network security is still riding the coattails of some of the other more... I don't want to say important, but people that have the money. So what industries are using this IP to lat-long translation to their advantage? You have credit card fraud. Obviously credit card companies are spending a small fortune to make sure this accuracy is fairly precise. You have digital rights management. In talking to some of the vendors, to my understanding, if a Major League Baseball game doesn't sell out, they don't actually broadcast that game on television. So they'll use the IP addresses to their advantage. Obviously, emergency 911 calls. Once people started using VoIP, it became absolutely essential that emergency services could locate you given an IP address. So that's a huge push. So again, with network security, we're just riding the coattails of this technology. And hopefully as natural evolution goes, it'll become more accurate and more precise. So the actual ways to beat it: obviously, don't use sequential IP addresses to attack a victim. Spread it across different geographic locations, different places, all that type of stuff. Map the remote locations and use a tool to extract neighboring IP addresses, which will hopefully get filtered out when the defender eliminates friendlies.
Obviously it depends on whether the defending company is actually eliminating the friendlies. But typically you're not going to go at a company head-on. You're going to go through one of their remote branches, smaller companies. If I'm targeting banking information, I'm not going after Deutsche Bank. I'm going after the little mom-and-pop banks that probably don't have the security in place to stop me. Attack at various times. Obviously one of the small perks of geographically locating the source IP address is you find out what time zone they're in. So if I'm a network security analyst, I'm going to treat an IDS alert differently if it's during business hours versus non-business hours for the source location. And obviously include decoy scans, look for different ports. Ultimately you're trying to throw the network security analyst off your tracks, off your scent. So let's see, if you have any questions offline, you can reach me at this email address. I had the best intentions of actually putting together a website, but I got caught up at work, so I haven't yet. I plan to make this open source, so hopefully within the next couple of weeks I'll put everything online so people can actually use it. Hopefully I'll find myself back at a Guinness tap fairly soon. But I will double back for the visualization workshop at two o'clock. Are there any initial questions that I can answer? The stuff on my website: when I sat down to do this presentation I actually came out with 75 different PowerPoint slides, and in 20 minutes it would have just been diarrhea of the mouth. So I sat down and knocked it down to, as you can tell, 26. So I will, on my website, dive into a lot more depth and detail. I'll show you the results that the GIS firms actually came up with. I can show you a little bit, and it's unfortunate that mapping software doesn't like PowerPoint presentations because the projector resolution just sucks. So let's see.
So the data set that I have, and again this is kind of shooting from the hip, here are the friendlies. So I was able to map those out, and I mapped out the clusters, actually let's see, the clusters. Those were basically just shown with the blue circles, and then I extracted the hotspots. So it's pretty nice. A lot of this is done through KML, which is the language that Google Earth talks. I don't have the funding to use Esri; if anybody's familiar with Esri, Esri is the big DoD mapping conglomerate. Absolute nightmare to deal with, just because they don't even know what's going on. But again, everything uses KML and a lot of Python, and with what I'll put online, I'll give you a Python script that basically extracts from your MySQL databases or whatnot and actually creates the KML file for you, so you can see it on Google Earth. Okay, well that's all I got unless anybody's got questions or whatnot. Come on, don't be shy. Perfect, thank you.
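For readers curious what a KML export like the one mentioned in the talk might involve, here is a small sketch using only the Python standard library. It is not the speaker's script: the function name `alerts_to_kml` and the alert layout are assumptions, the alerts are passed in as a list rather than pulled from MySQL, and the namespace is the current OGC KML 2.2 one.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def alerts_to_kml(alerts):
    """Build a KML document string with one Placemark per alert.

    alerts: list of dicts with "signature", "lat", and "lon" keys
    (a hypothetical layout; rows from a database query would be
    mapped into this shape first).
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for alert in alerts:
        placemark = ET.SubElement(doc, "{%s}Placemark" % KML_NS)
        name = ET.SubElement(placemark, "{%s}name" % KML_NS)
        name.text = alert["signature"]
        point = ET.SubElement(placemark, "{%s}Point" % KML_NS)
        coords = ET.SubElement(point, "{%s}coordinates" % KML_NS)
        # KML orders coordinates as longitude,latitude (not lat,long).
        coords.text = f"{alert['lon']},{alert['lat']}"
    return ET.tostring(kml, encoding="unicode")
```

Writing the returned string to a `.kml` file gives something Google Earth can open directly. The longitude-first ordering is the detail most likely to trip people up, since everything upstream tends to be stored as lat-long.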