All right, so first of all, thanks to everyone. I have a really short cord, never mind. Anyway, we had a little bit of a last-minute LAN party this morning. I was giving my Black Hat talk and commenting on how I was gonna be speaking at 10 o'clock in the morning at DEF CON, and how that's kind of a bad idea because no one's gonna show up. So to encourage people to show up, I said, oh, we'll have a LAN party. And so I got a server, the concerto guys gave me a server to get everything running. I downloaded all my stuff, got it ready, brought it down here with no monitor or anything, figuring I'd just turn it on and make it work, and it didn't work. So I went to log in and SSH wasn't started. Meanwhile, everyone else that was trying to get here couldn't get in, because the hotel security staff was thwarting their efforts to get here and play video games, which is kind of frustrating at 10 o'clock on Sunday, when you're thwarted by a 65-year-old hotel security guard at DEF CON. You're kind of like, oh, God, maybe I should just go home. But we started out at 8:30, and by 9:20 we had one person playing, and by about 9:25 we had two, and we actually got up to four. So it was a decent little LAN tournament. This dude here stayed up all night long to play two-versus-two Team Fortress for 15 minutes. So that's dedication to the cause, he's a serious crack addict. But we have an award for you. So what's your name? Quinn. So Quinn, if you want to stand up, and Beetle, if you could stand up too, please, if you want to award it. Many of us were out last night dressed like idiots. Beetle was wearing, here, he'll do it. Yeah, these were Muay Thai boxing shorts that they picked up for $11. And he wore them all night long. So now they're yours. I think they say juicy and tight on the ass. But I'm not too sure about that. So congratulations, thank you. Yay. My God. All right, anyway. Oh, he's putting them on.
Oh, thank God, because it wouldn't be awkward otherwise. I think they're on backwards. I think the juicy part goes on the butt. There we go. All right, so that was network flow analysis. Any questions, in all seriousness? I think we may try to be gaming later in the contest room. So if we do open CTF, own your box, drop a Team Fortress 2 server on it, and we'll be playing.
All right, so introductions. There's actually content here. Really, I swear to God. I'm Bruce Potter, founder of the Shmoo Group. I had to think about that. Founder of the Shmoo Group, I run a small consulting company out of the metro DC area, and I've done some other odds and ends. I always start my talks the same way, which is effectively: you're at DEF CON, don't believe anything, Jaco, right? My ability to be here is based on my ability to socially engineer the people at DEF CON, period. That is why everyone's on stage here. It doesn't mean that they're experts. There's no qualification test. There's no bar to jump. We're just up here spewing random crap, and it turns out sometimes people think it's valuable, so they let us back up here. But at the end of the day, most people that are up here are not formally trained in the work that they do. We do this because we love it. We do it because it's interesting. But as an industry, we kind of suck at, like, self-policing. We tend to reinvent wheels all the time. We tend to really just not be as efficient and as effective as possible. And so you get people on stage who will say the crazy shit, and you'll get 2,000 people in the audience, and everyone will believe what they say. One thing we do at ShmooCon. Anyone been to ShmooCon? All right. One thing we do at ShmooCon is we give everyone little foam balls, and the idea is it's hard to stand up in a crowd this big and yell bullshit at the speaker. Thank you, there you go. Not for a few people.
But what we do to offset that is we give people little foam balls and we encourage them to throw the balls at the speaker if they disagree. So the speaker says something that's a little wacky and boom, you know, you get a ball thrown at you. Speaker says something that's really off. That's my mouse, don't throw that at me, please. Anyway, the long and short of it is it's a physical manifestation of the bullshit flag, right? Like, you get to call bullshit, and the speaker has to justify whatever they're saying. So I encourage you at DEF CON to throw the bullshit flag: either really, like, chuck something at the speaker, or at least get up and say bullshit if you don't agree, because that's what this is all about, right? We come to DEF CON and we're supposed to challenge authority and challenge what people are saying, and just because somebody's on stage saying something, you shouldn't believe them. We're very polite here now. Like, we stand in line. The badges showed up the other day and everyone politely stood in line waiting for their badges, which is just this funny thing to see, people at DEF CON standing politely in line. It should have been, like, the most chaotic situation, people elbowing each other and saying, oh, I gotta go get married, I gotta get mine right now. I gotta go get a divorce, I gotta go get mine right now. But no, everyone's standing politely in line. So we may have lost a little bit of that edge. I encourage you to regain it.
Why are we here? So I have more slides here than I know what to do with, and I've been rambling about Team Fortress, so I'm gonna just have to bomb through stuff. Why are we here at DEF CON? I was at Black Hat and I gave a talk, basically this exact talk, but I did an s/Black Hat/DEF CON/g, and, like, no regex geeks in the room at all. Like three guys got the joke. And there were a lot of vendors at Black Hat, like, a lot of vendors.
There was an entire hallway, like, going off into the haze at Caesars Palace. And I thought, man, that's really impressive in some regards, but it's a little sketchy too, because the security industry's gotten to be very, very large, right? And we've gotten to be a very expensive industry. We get a lot of money; people in this room make a lot of money on a yearly basis. And it's important to kind of remember what the purpose of all this is, besides just messing with the hotel and having a good time in Vegas. Is the point of being here at DEF CON to release cool sploit techniques and specific vulnerabilities? Is it to really just poke holes in systems? That's a key part of it, right? But it's not the only thing. The MBTA, yes, thank you. We should be poking holes in the MBTA. I'm sure there's a lawyer gonna come after me after this. For those that aren't aware, go Google DEF CON MBTA, go get yourself smart on what's going on this afternoon. Ways of building new protection mechanisms to cover up inherent security weaknesses. Is that important? We need to build defensive products? No, we need to build products that don't require defensive products, right? Antivirus shouldn't exist. I don't mean that in the rag-on-Microsoft way, like Microsoft should write better code. I mean that from a computer science perspective, we have failed, right? We have done research on trusted and secure systems since the 60s, and we keep not doing it. Microsoft was able to build an operating system that was very effective for the users, effective for the application developers, and it got huge amounts of traction and they dominate 90% of the industry, okay? It's not their fault that people didn't care about security, but we're far enough down the road that we should be concerned about these things. We should be worrying about how we build operating systems that don't require antivirus. How do I leverage, God help us, the trusted computing platform?
How do I use a trusted computing chip to secure transactions across the internet, not just be a cute little key store on a laptop or something like that? These are problems that architecturally have been solved, but as an industry, we haven't gone after them because it's not sexy. It's sexier to write intrusion detection systems. It's sexier to write antivirus. It's sexier to sell all these products. It's not sexy to solve the core problem and put us all out of business. So: drink, gamble, figure out how far to push what happens in Vegas, yeah, I'll give you that. How many people pushed that last night? A few, that's really, not at all. Turns out we were totally normal last night. Really, at the end of the day, what we're here to do is secure the end users and the systems that they're using, okay? So if whatever action you're doing isn't achieving that ultimate goal, and, say, selling exploits to third-party companies maybe doesn't achieve that goal, right? You sell exploits to third-party companies that aren't actually patching the software the bug comes from; instead, they're making better security products to then defend against the inherent weakness in the system that they now know about the exploit for. That's not really getting there, in my opinion. We really need to think about what we're doing. Is it really making the end user more secure? The industry has changed an awful lot. There's little subtleties like that that seem like they're security-related, but they're really kind of more pocketbook-related than anything else. All right, there was no applause on that. I'm obviously not fired up enough. Thank you, yay, one guy in the front. Everyone else is selling sploits for money and is gonna give me the middle finger.
Network analysis. That had nothing to do with NetFlow, by the way. Network analysis: so you wanna look at networks, you wanna find bad guys, you wanna find interesting stuff, right? It's a cute pastime. People have done it for a while.
There's lots of ways to gather stuff from networks: SNMP, raw packet data, et cetera, et cetera, et cetera. It turns out, for most enterprises, network analysis isn't an important thing to do, right? What they do is they build defensive systems and hope that nothing goes wrong, okay? You kind of hope and pray and say, oh, I hope nobody can bust through this wall. And for a long time, that's actually worked, right? We haven't had very sophisticated, excuse me, attackers, and that's worked. But the game's changing a little bit. If you go read some CIO magazine and they talk about IT, one of the things they'll say is that IT is an enabler, okay? I'm getting all buzzword compliant, and there's actually a suit underneath this t-shirt. Although I am wearing a pin that says epic fail because of this morning. So IT is an enabler. So let's think about that. GM and Ford both have IT systems, and the core premise there is, if GM does IT better than Ford, they'll sell better cars, they'll sell more of them, and they'll make more money on every car that's sold, okay? True or false? Probably true in this day and age, right? Because GM does so much IT work and Ford does so much IT work, and it's such a critical part of their engineering, such a critical part of everything they do, that most of the Kool-Aid says, yeah, that's probably okay. Security: can you say the same thing about security? Is security an enabler? Does security make a difference in the marketplace? If GM has better security than Ford, does it really make a difference? Probably not, right? And the worst thing, let's think about security analysis. If GM has better forensic and network analysts than Ford, does that make them sell better cars? No. Security analysis is the worst tax on an enterprise, right? All the defensive systems that you invested all your money in have failed, and now you gotta go blindly running around inside to find bad guys, and you're gonna pay somebody to do that, right?
If you don't know that there's even bad guys there and you're just paying people to look, you're potentially just wasting money. You hire a security analyst, he looks at your network day in, day out for a year. Worst thing he finds is a couple of script kiddies. Does that mean there's nobody in there? No. Can you make any statements of assurance about the network? No. All you know is you paid that guy $100,000 to sit on his butt in a chair for a year and do something. That's not really useful.
So if you look at the attack space and the threat space we've been dealing with, on the far right, your right, my left, of this, think about general-purpose attacks, right? Antivirus, worms, viruses, all that kind of stuff. So when GM got hit with Code Red, did they care? No, they didn't. GM didn't care in the least. You know why? Because it hit Ford too, and it hit Chrysler, and it hit Toyota, and it hit Honda. It was a cost of doing business, right? It was a pain in the ass, I mean, don't get me wrong, but everybody paid equally. So did GM share prices go down because they got hit with Code Red? No, because everybody in the sector got hit with Code Red. Boom. You know what I do about that? Declare victory. Yay, we were insecure and it didn't even matter. Like, that is the wrong message at the end of the day for the CIOs to hear. So the real response, what they do, is there's a compliance checklist, right? There's FISMA, there's PCI, there's all these freaking checklists. All they want to do is check off the boxes for general-purpose solutions against those general-purpose problems. I have antivirus, I have anti-spyware, I have IDS, I have firewalls. Check, check, check, check. We're compliant, we're done. Fire all the security guys. On the other end of the spectrum, you have highly targeted stuff. Insider threat, insider threat. Boogeyman, you know, you get someone on the inside with lots of access. They're gonna do nasty stuff.
We're a long way away from solving that problem. That's a really hard problem. System administrators with root: bad, evil, you know? There's lots of examples of sysadmins or internal admins going rogue and just doing bad stuff. Sir, you have a question? Okie dokie. So the statement is that insider threat is actually a people problem. And I contend some of those people problems are really technology problems that are too hard for us to solve, so we just try to trust people. Authorization, it's all about trust, right? We can't do fine-grained authorization and access control on the data and the objects in your enterprise, so then we just trust the people. So, eh, I can authenticate a jackass. I mean, thanks, Olme, I appreciate that. The official recognition: look, a jackass. So anyway, insider threat, yeah, there's personnel issues involved with it, but there's also technology issues, and it turns out there's a whole spectrum of stuff, and most people just said, look, it's a hard problem. We're just gonna have to trust people not to do anything. And in reality, when something goes wrong, we use legal recourse and cops to go get that person, right?
Well, what we've seen recently in the middle ground are kind of highly skilled people leveraging general-purpose attacks and infrastructure like botnets to go after individual enterprises, right? Banks, you know. I did work for a bank that was getting targeted by organized crime, a heavy amount of fraud perpetrated through, you know, software, bots, that kind of thing. The bank next door was not getting defrauded. That makes CIOs really twitchy. Your IT systems are vulnerable to attack and apparently your neighbors' aren't, or for some reason you're the target and they are not. What do you do? Okay, there's a researcher from Google, Provos, I forget his first name, Niels Provos, who released a paper last week at USENIX Security. It's a follow-on to the research they had done last year about malware on the internet.
Turns out, so, Google has a good view of the internet. And they went and mined a bunch of the data, and they looked at hundreds of millions of URLs to look for malicious content in those URLs and see how many of the URLs that they were basically indexing and returning in results were malicious. And so they built, like, a sandbox architecture, going to these URLs, seeing what would happen, seeing if someone was shoving an iframe and malicious data at them, if they got owned, and that kind of thing. Out of like a hundred million, they found like three million of them had malicious data associated with them, and on average between 1 and 2% of the search results that are returned in any given Google search have malicious content in them. One to 2%. Okay, so that's actually kind of a scary number, because I use Google more than a little bit. I'm a consultant, so I use Google. All the consultants in the room nod, yeah, that's all we do. Just search. I can do that. Yeah, see, I can do it now. Would you like some coffee, dear? Anyway, so Google, oh, so the other interesting statistic was they had three, three, long night, antivirus vendors that they tested all this malware against to see how good the detection rate was. They didn't name the antivirus vendors, but the detection rate went something like this. The best one averaged about 75%. The next one averaged about 50%. And the next one averaged about 25% detection of the malware on those three million malicious URLs that they were dealing with. 25% of the attacks got caught. So we can just presume that these guys in the middle, who wanna pick a fight with you and not the guy next to you, they're gonna win, right? They're gonna get people in your enterprise. They're gonna be able to build a big botnet. They're gonna be able to target it against you. They're gonna carry out fraud, whatever the hell it is they're looking to do, and they're gonna be inside the walls fucking with your head, okay? So what does that really mean?
Well, it turns out it means you need to go find them. And IDSs don't work either, because the IDS and the AV industries look a lot alike when you squint, or even when you're wide-eyed and stone sober they look a lot alike, right? So you probably need to pay someone to go look at the network, which is actually a new thing for companies to have to deal with. There's some organizations, mostly government, that go and look at networks all the time, because it's critical. It's critical to the nation's health and all that kind of stuff. Most industries don't, right? I've been in ginormous companies that have, like, two security guys, you know? 4,000 employees, 10,000 systems, two security guys. They just sit there with, like, the meth line in their arm all day long. Well, network analysis is actually not that hard when you sit down and think about it. The problem is there's not a lot of products in the space, because there's been no need for products. People develop cool products all the time, but if there's no buyers, they go away, right? So network flow is an example of one of these things where there's a lot of people out there who have built cool things over the years, but they've fallen out, or they just don't have the traction, or they don't do everything you want them to do. So it is kind of like rocket science right now. So one of the things that we did, and because I'm gonna run really long in the tooth, I'm just gonna cut to the chase: we released an open source network flow analysis tool that kind of tries to bridge the gap between the open source network flow analysis tools that are out there now, which look a lot like MRTG and Cricket except they take flow data, and the commercial stuff, which has reporting and alerting and much more analytical capability.
So we're building kind of a robust framework to do analysis in this middle ground, because your goal for doing this analysis, again, recognizing that security analysis is this massive tax that nobody wants to pay, is to make your analysts as efficient and as effective as possible, so they can go through the data as quickly as they can, look at it, and then get on with their lives. Because it's probably those two guys in that 4,000-person organization having to do the network analysis in between upgrading the firewall and dealing with some jackanapes user problem that, oh, my machine won't boot, and I think it's a security issue, you know? Oh, Jesus, like, that takes half a day, and you want your network analysis to take about 15 or 20 minutes. So we built the tool, we're trying to make it better. Is it released, tarballed up yet? No, not quite yet. I'll get into the status a little later. So it turns out there are some TF2 fans here, like, there were three of us, four of us this morning. The way I like to say this is I would much rather be playing TF2 than working. So my goal is, whoo! So my goal as a network analyst is to get back to playing TF2 as quickly as possible, right? Anyway, nice t-shirt, sir, by the way. There's a guy in the audience wearing exactly the same t-shirt that I am. Not as nice as mine. I think they're functionally equivalent.
All right, so types of analysis. So there's a couple of ways to kind of go about searching through data, right? There's the really convenient way of, hey, this thing's been owned, can you look at it? Oh, hey, guess what? I found out it's been owned. It's really easy to go dig through one box, right? Some dude walks into the forensics guy's office and says, this box has been owned. The forensics guy kicks his feet up, says, yeah, I'll go image the disk. He's done it before. There's a bad guy. There's a little German side. He's gonna go find it. La, la, la, la, la. It's not fricking rocket science, right?
Go sic the forensics guy on the whole network. Hey, there's 4,000 hosts. What are you gonna do? Oh, I'm gonna fire up my EnCase tool and start imaging disks. Good luck, let me know how that works out. I'll come back in 2020 when we've had a couple of tech refresh cycles and you have to do it again. This is kind of what we do, though, when we look for bad guys, right? We use tools like grep. I'm gonna go grep for something. It's a bad thing, because when the hackers break into a box, they put "Pwned" in your log file. So grep for "Pwned", and if you don't see that, you're okay. Things that work just as well: grep, I really need a date. Grep, a bigger wallet, things like that. This is just wishful thinking, that you can use grep to go rootin' through log files and find shit like that, right? I'm sure someone has found a date, or at least some good JPEGs, by using grep. I was up at the podcasters' meetup last night and they were running Driftnet, and someone else was apparently running, like, the biggest porn bot in the world. So while the podcasters were there, there was all kinds of stuff showing behind their heads. It was cute. So the other way to think about it, and I'm not, like, an analyst, formally trained in college or anything, so this is just my view of the universe, and there's a couple of other views that I present here. Maybe one of them resonates with you. But kind of another way to think about this is a breadth-first search, right? I wanna look at the landscape. I wanna look at everything that's going on, and I wanna quickly find out what might be bad, right? If I have 4,000 hosts, what I'd really like to know is which five or six have a high probability of being owned right now, okay? That's useful information, because five or six is reasonable. Units, think units. Tens, no. Hundreds, no. Units, okay?
When you're doing this kind of work, you wanna be investigating units of things a day, less than 10, because again, remember, the goal is to play Team Fortress. So you wanna be constantly whittling down. So you gotta have tools that allow you to do this breadth-first search and get funneled in on something; then you can dive deep and go look for what you wanna look for. Sentences sometimes just get away from me. So if you don't understand it, just nod, it'll get back together again eventually.
What do all these graphs have in common? No context, thank you. Matt only generated them. What else? Time on the X axis, bingo. Every one of these is a time-domain graph, right? Time marches on to the left, except for those people that set that variable the other way, and time marches to the right, and we all get very confused, okay? I don't know if anyone's ever looked at MRTG graphs where they graph it backwards. God, do I hate that. You mean today's over there? No. That's the default. Yeah, you're right, it is the default. That's for the other side of the prime meridian. North of the equator, because the toilet water goes the other, yeah. Jeez. Except at the North Pole, it shoots right to the center of the Earth. So there's other ways to think about data besides time. Time's kinda cool because as an operator, not an analyst, an operator, administrators, that kind of thing, you kinda care about what's going on now, right? That's your job. System's down now, and you walk to your desk and say it's back up, and you're like, yeah, I did my job, because now it's working. Who cares about then? Then is this magical time that I don't give a shit about. Now is what I really care about, right?
So system administrators really focus on now, and that's why tools like RRDtool and all the derivatives that have come out of it, MRTG and so on, exist. RRD obviously has applicability beyond system administration, that kinda thing, I'm not trying to indicate otherwise, but it's a fantastic way of representing time-domain data and then presenting it on the screen. Cool. The problem is we've come to believe that that's the only way to represent data. There are guys here, I was talking to a gentleman who may actually be in the audience today, who've been doing a lot of security visualization work. They're releasing, like, a visualization toolkit on a CD that boots, covered with cool visualization tools and that kinda stuff. You can make graphs that look like things other than bars. There's a guy who's famous, his name is Bell, dot, dot, dot. So what is this? This is a frequency-domain graph, right? Like, you're graphing the frequency of something. And so you've got lots of common things happening in the middle and you've got some outliers on the ends. From an analyst's perspective, the thing that everyone is doing is probably okay. Whatever you're representing here, like number of hosts per size of email, number of hosts per hosts spoken to, ratio of inbound to outbound traffic, that's a really useful one. Imagine this graph is the ratio of inbound to outbound traffic for any flow on a network, right? So for every discrete flow, you're gonna compute the ratio, and then you're gonna generate a histogram 300 buckets wide on some logarithmic scale from zero to infinity for the ratio of inbound to outbound, or outbound to inbound, whichever you wanna choose. And then you can look and say, who's sending data out of the network and who's receiving data from outside? Things that send data: mail servers, VPN servers, FTP servers, web servers. Servers send data. Clients tend to receive it.
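The bucketing just described can be sketched in a few lines of Python. This is only an illustration, not the tool from the talk: the function names, the clamp to ratios between 10^-6 and 10^6 (standing in for "zero to infinity"), and the choice of outbound over inbound are all assumptions made for the example.

```python
import math
from collections import Counter

def ratio_bucket(bytes_out, bytes_in, buckets=300, min_exp=-6, max_exp=6):
    """Map one flow's outbound/inbound byte ratio to a log-spaced bucket index."""
    # Treat a zero byte count as 1 so one-way flows do not divide by zero.
    ratio = max(bytes_out, 1) / max(bytes_in, 1)
    # Work in log10 space and clamp, so extreme ratios land in the end buckets.
    log_ratio = min(max(math.log10(ratio), min_exp), max_exp)
    position = (log_ratio - min_exp) / (max_exp - min_exp)  # 0.0 .. 1.0
    return min(int(position * buckets), buckets - 1)

def ratio_histogram(flows, buckets=300):
    """flows: iterable of (bytes_out, bytes_in). Returns Counter of bucket -> count."""
    return Counter(ratio_bucket(out, inn, buckets) for out, inn in flows)

if __name__ == "__main__":
    sample = [(1_000, 1_000), (5_000, 400_000), (2_000_000, 30_000)]
    for bucket, count in sorted(ratio_histogram(sample).items()):
        print(bucket, count)
```

With outbound over inbound, balanced flows sit in the middle buckets, download-heavy clients sit to the left, and anything pushing far more data out than it takes in ends up in the right-hand tail.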
When you see your clients sending lots of data, especially outside of your network, to strange places in Romania, that might be of concern, right? How do you find those people? You look at the far right-hand side of this graph and you say, hey, that might not be okay. I'm gonna go look at that, right? So there's a customer I deal with, about 2,000 hosts. I look for traffic that is bigger than a megabyte and has a greater than two-to-one ratio outbound to inbound. And on a given day, I might see two of those types of flows in the network, right? Two is easy to look at. That, again, to the previous slide, gets me back to Team Fortress really frickin' quick. So that's a nice thing to look at when I'm looking for data exfiltration. So, bell curve, it's an example of statistics. Being a jackass now. So it's not a complicated thing, but actually you do have to kinda wrap your head around it, 'cause you do get a little clouded by the time-domain stuff. But frequency-domain analysis tends to be a lot more effective for analysts than it is for operators. I'm gonna try to get into this a little bit more later, although again, I've got way more slides than I have time for. The tools today are designed around this kind of concept, right? Providing tools to the analysts to be able to find those units of things that are outliers and dive down into them. That's not important.
Flow basics. Who's using NetFlow today? Let's just get that out there. All right, that's not bad. NetFlow's not new, okay? The concept has been around, I mean, for 15 years in general use as far as the term NetFlow, but even before that. There's things called pen registers for, you know, phones; it's basically who they called and how long it lasted. This is basically like a pen register for IP networks, okay? So it's accounting information on a unidirectional flow of data as it's going through a network. So let's take an example.
Who knows what the thing in the middle is? A router. Because it's actually, like, a Porter-Cable 690. For some reason, Microsoft PowerPoint doesn't have an actual router by default on a Mac, but it does have a Porter-Cable wood router. I searched for a router, I got that, and I used it. So, host A talks to host B through a router, a flow record's generated in the router, and it's basically, like, a six- or seven-tuple of information: source and destination IP, source and destination port, the number of packets, the number of bytes in the flow, and then there can be other odds and ends depending on how the network's set up. That's it, okay? It's not a lot of information. So, you know, the first time the packet goes through, that's one flow; the next time a packet comes back, a separate flow's generated. TCP is kind of one of the only session-based protocols out there. There's an awful lot of non-session-based protocols out there. Turns out TCP just makes up the lion's share of what we do, thanks to YouTube, MySpace and Facebook. That is the interweb right there. All the data goes back and forth, and as more data is exchanged, the flow is updated. At some point, that flow information expires, and it can get exported from the router, though it doesn't have to. If you really want to go low tech, you can just turn on flow accounting on your router and then log in to your router to look at it. It sucks, but you can do it. At some point then, that data gets exported to a collector. The collector can then log it to a database. The database can then analyze it, whatever, and then you've got some workstation that goes and looks at it. In basic form, you can have lots of routers sending tons of data to a collector that puts it all in the database, and someone can go look at it. That's all it is, man. It's not rocket science. And it turns out that it doesn't get any more interesting than this, so if you'd like to leave now, you're welcome to.
Reasons to care about network flows.
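Before getting into the reasons, here is a rough Python sketch of the flow accounting just described, plus the "bigger than a megabyte, more than two to one outbound" filter mentioned earlier. Everything named here is hypothetical: `FlowKey`, `FlowCache`, the 15-second idle timeout, and `exfil_candidates` are illustrative choices, not how any particular router or the tool from the talk works.

```python
from collections import namedtuple

# One direction of a conversation; the reply is a separate flow with the key reversed.
FlowKey = namedtuple("FlowKey", "src_ip src_port dst_ip dst_port proto")

class FlowCache:
    """Toy router-side flow table: update on every packet, export when idle."""

    def __init__(self, idle_timeout=15.0):  # timeout value is an assumption
        self.idle_timeout = idle_timeout
        self.active = {}  # FlowKey -> [packets, bytes, first_seen, last_seen]

    def observe(self, key, size, now):
        rec = self.active.get(key)
        if rec is None:
            self.active[key] = [1, size, now, now]  # new flow record
        else:
            rec[0] += 1      # one more packet
            rec[1] += size   # more bytes in the same flow
            rec[3] = now

    def expire(self, now):
        """Remove idle flows and return them, as if exporting to a collector."""
        idle = [k for k, r in self.active.items() if now - r[3] >= self.idle_timeout]
        return [(k, *self.active.pop(k)) for k in idle]

def exfil_candidates(records, is_internal, min_bytes=1_000_000, min_ratio=2.0):
    """Pair each outbound flow with its reverse and keep the large lopsided ones."""
    by_key = {rec[0]: rec for rec in records}
    hits = []
    for key, _packets, nbytes, _first, _last in records:
        if not is_internal(key.src_ip) or is_internal(key.dst_ip):
            continue  # only internal -> external flows are of interest here
        reverse = FlowKey(key.dst_ip, key.dst_port, key.src_ip, key.src_port, key.proto)
        inbound = by_key[reverse][2] if reverse in by_key else 0
        if nbytes > min_bytes and nbytes > min_ratio * inbound:
            hits.append((key, nbytes, inbound))
    return hits
```

The point of the design is that the router only keeps counters per tuple, so the analyst-side filter runs over a handful of small records instead of raw packets.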
You get a lot of visibility into your network without having to spend a lot of time crunching on huge data sets, right? You get more detail than SNMP. When you get SNMP information on an interface, you get, like, a byte count. Not all that interesting. I mean, it's interesting to the extent that my network's slow and you go, okay, and you look and say, yep, there sure is a lot of data, but you have no clue what's going on, right? Network flow will give you more resolution on what's actually happening, and it's a lot less complicated than dealing with packet dumps. So, let's think about NetFlow versus full packet analysis. With NetFlow, you know, I said one-tenth the storage; it's more like a thousandth of the storage, with NetFlow versus full packet. In places where you care about privacy, you're not storing anything private, right? You're storing IP addresses and byte counts, so you don't have to worry that you got someone's login information or their credit card number. I get nervous when people walk in with boxes in the middle of my talk. That's not for me, is it, guys? You're not lying to me, are you? All right, well, he's wearing a red badge, so we're gonna hold him to that. And the other guy can lie all he wants, so. All right. Dan Kaminsky in a cake. Three separate cakes, actually. It was a really exciting night. Disadvantages with NetFlow: obviously you don't have the full packet trace. Some things are really nice to know, like URLs. You see someone did something on port 80 to some screwy server, and that's all you know. There's websites out there that will track what virtual servers are associated with a given IP address. I'm not exactly sure how they do it. I haven't really cared, because I like believing in magic.
But I've been in situations where I'm investigating an incident where someone's sending tons of data out of a network, and I go and look at the IP that they went to, and I look it up, and it's got like 25 virtual servers running on that IP address. And one of them, in this particular instance, was like craigsbarnwood.com, which was a site for selling barn wood. So if you've got an old barn, people tear it down, resell it, and make really expensive kitchens and that kind of stuff. And the other 24 or so sites were gay porn. And you know, I'm thinking, well, porn sites get popped and malware gets put on them, and I figured maybe this guy was just getting owned and everything was getting uploaded to one of these gay porn sites. So we went and we bagged the box and took a look at it. No, he was actually selling his barn. Felt kind of dumb after that, and I'm like, you're not going to Craig's Barnwood-y, right? Just Craig's Barnwood. He's like, yeah. I'm like, all right, good. Gonna pay for that one later. Dude in the front in the boxing shorts is falling asleep. Anyway, you can get that data out of band through something like Websense, or there are extensions to various NetFlow records that provide the ability to get URL information along with that basic tuple. So, like, NetFlow v9 and sFlow and some other things that are out there. So if you want URL information, you can get that. You're just not going to get it running the default NetFlow version 5. NetFlow version 5 is a lot like SNMP v1, right? It's brain-dead simple. It's got a lot of weaknesses, but everybody uses it and everything supports it, right? SNMP v3 solves a lot of problems. Does anyone really care at the end of the day? No. How many people actually run v3 in their network? Okay, right. And you believe in unicorns?
So some people do run NetFlow v9 and IPv6, whatever, but it's not something that's commonly done yet because the standards are still being kind of solidified. NetFlow v5 is a Cisco proprietary spec. Lots of open source products and lots of commercial products implement it. There's no standard in the sense that there's an IETF thing. It's basically whatever Cisco decides to put in their docs. Cisco apparently has the world's largest file server on Cisco.com, right? Because they have to maintain every chunk of documentation for every Cisco device, for every IOS version, which apparently is more than the grains of sand on the beaches of planet Earth. And Cisco can serve all that up on their website. I have a lot of respect.
Products to do full packet analysis. One of them of import is NetWitness. It will grind through terabytes of raw packet data. It's actually pretty cool if you get a chance to play with that kind of thing.
NetFlow vs. IDS, really quickly. With an IDS, you look through their goggles, right? That's the world that you see. And the world that you see says port scans are really bad. That's the most annoying thing. I really could give a shit about port scans. With NetFlow, you get to kind of swim through the data yourself, and you get to decide what's bad and what's good. The problem is that, from a security perspective, you start from ground zero. A lot of the open source NetFlow products are not security specific. They're network engineering tools, and at the end of the day you just kind of sit there with a blank screen and some pretty graphs and you've got to make your way through it. It's tough. The commercial products that are geared towards security are a lot easier to just pick up and use, but they cost money, and we don't want to spend any money.
NetFlow vs. SNMP. SNMP is more real time. That's probably the biggest advantage. And you obviously get other types of system information. So I'm not going to talk about NBAD.
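For anyone who wants to see how little is actually in a v5 flow record, here is a minimal sketch in Python. It assumes the commonly documented NetFlow v5 export layout (a 24-byte header followed by 48-byte records, all big-endian). The names `Flow` and `parse_v5` are made up for illustration, and this is not the parser described in the talk.

```python
import socket
import struct
from typing import List, NamedTuple

# Assumed layout: the commonly documented NetFlow v5 export format.
# Header: version, count, sys_uptime, unix_secs, unix_nsecs,
#         flow_sequence, engine_type, engine_id, sampling_interval
V5_HEADER = struct.Struct("!HHIIIIBBH")                 # 24 bytes
# Record: srcaddr, dstaddr, nexthop, input, output, packets, octets,
#         first, last, srcport, dstport, pad1, tcp_flags, protocol,
#         tos, src_as, dst_as, src_mask, dst_mask, pad2
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")   # 48 bytes


class Flow(NamedTuple):
    """The basic tuple: who talked to whom, and how much."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    packets: int
    octets: int


def parse_v5(datagram: bytes) -> List[Flow]:
    """Decode one NetFlow v5 export datagram into a list of flows."""
    version, count = V5_HEADER.unpack_from(datagram, 0)[:2]
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version={version})")
    flows = []
    for i in range(count):
        offset = V5_HEADER.size + i * V5_RECORD.size
        rec = V5_RECORD.unpack_from(datagram, offset)
        flows.append(Flow(
            src_ip=socket.inet_ntoa(rec[0]),
            dst_ip=socket.inet_ntoa(rec[1]),
            src_port=rec[9],
            dst_port=rec[10],
            protocol=rec[13],
            packets=rec[5],
            octets=rec[6],
        ))
    return flows
```

A collector would call something like `parse_v5` on each UDP datagram it receives from a router and hand the resulting tuples to the database. Note there is no payload anywhere in the record, which is why the storage and privacy story is so different from full packet capture.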
In reality, even if you don't have the ability to be a network analyst full time, or even play one on television, it turns out that one of the greatest reasons to deploy NetFlow in your network is just to have a relatively complete forensics log of what went on. So when the dude does show up with a box in front of your $5 million an hour forensic analyst and says this box has been owned, you can go back and look at all the traffic that went in and out of that box, figure out who it talked to, where it came from, and who else might be owned, and backpedal your way through the infection in the network and see what's going on. That's fantastically useful. So even if your NetFlow stuff sits around for six months and then you finally start using it, just having it for that capability alone is worth the price of admission. Again, just to be clear, Team Fortress 2 is really the reward here if you do this right.
Sensors: when is the flow record generated? There are four basic ways. First, the flow terminates normally, for a session-based protocol. Again, session-based protocol equals TCP, because there aren't any others. Name another one. What? No, that's all right, man, I don't know any, I'm just asking. What? Reset. No, I wanna know another session-based protocol besides TCP. Alrighty then, there's one. I'm gonna declare there's one, because empirical evidence says there are no others. That's really bad science, by the way, for anyone that was paying attention. Ooh, phone's ringing. This isn't my phone, by the way. Todd Nightingale. Oh, it's a fax machine. Hello? Is that on camera, though? There we go. All right. As I said before, epic fail, okay? I gave you the goddamn phone this morning. All right! Back to business. The second reason the flow expires is you hit a max idle timer, usually measured in like a minute or two. Third, the cache might fill up, or fourth, you hit a max active timer. This is really critical.
Max active you wanna set really low if you're gonna use this stuff for bandwidth monitoring. You know why? Because some guy calls up and says, hey, the network's kinda slow, and you look at your flow data and you don't see anything, so you say everything's okay. And then 30 minutes later, you hit the max active timer of 30 minutes. The dude's been downloading, like, four Debian ISOs or something. You finally get the flow export, and it comes out and says, oh, your network's been hammered for 30 minutes, but because your active timer was so high, you couldn't see the traffic. Okay, so this is really important. Not all of the documentation for all the open source software gets into this. Ratchet that active timer down low enough that you get a view of what's going on in your network in near real time.
Configuring the sensors. The only thing really to take away: you can make your own NetFlow sensor. You don't have to have a Cisco router or whatever; there are open source tools like softflowd that you can configure. The only thing you need to keep in mind is that when you deploy a sensor inline in your network, you're replacing an Ethernet cable. Okay, your Ethernet cables are generally reliable. So you should put something in place that is as reliable as an Ethernet cable. There is, what, that $500 Denon Ethernet cable that people have been talking about for a while, gold-plated, and none of the bits leak out like with those other Ethernet cables. Denon apparently is making the statement that Cat 6 kind of sucks. Anyway, it turns out that you can buy fail-open Ethernet cards that have two ports, and when the power goes away or the OS goes apeshit, the card turns into a wire. They're not bulletproof, but they generally work the way you want them to. You can get them on eBay for a couple hundred bucks; new, they're between five hundred and a thousand.
They're worth the price, they're worth the investment if you're serious about this. Some people say, well, why don't you just use a span port? Well, you can, if you have one available. Most times people's blood-sucking IDS is already using the span port, so you can't do it. What? Or you can use taps if you wanna go that route. And again, some people don't believe in the tap mentality. So, however you gotta do it, if you're gonna put it inline, realize that you're replacing an Ethernet cable.
Okay, some real quick things. We run our software on multi-core boxes. You can buy super cheap Intel boxes, I think they're, what, Q6600s, the quad-core 2.4 gigahertz Intel boxes. I got one of them for $550 at OfficeMax. I walked in and said, I want the cheapest quad-core box you can give me. And he sold me one off the shelf for $550. I had to pay him in cash and pick it up in the back of the store. But man, four cores is really fricking cool. You can do a lot of stuff. So, when you're doing this, especially with our software, you'll have something that's collecting data off the network, you've got the database, you've got Postgres, you've got Apache, all this stuff running. And it's helpful to have lots of cores for that stuff to run on, because you get one core that's hammered by Postgres and the box is still totally responsive and able to take on other things. So I encourage you, if you play with this, go get multi-core boxes. Go buy yourself one of these fricking $550 supercomputers in cash.
As a point of reference, for people that are thinking about doing this and are not just here to humor me: 2,000 hosts will be about 10 million flows a day, give or take, and that's about 5,000 flows per host per day. That's a good rule of thumb when you're talking to vendors or you're thinking about scaling for boxes. ShmooCon was a similar situation; for the conference network, we saw about that same rate of flow data.
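The rule of thumb above turns into back-of-the-envelope sizing pretty easily. The sketch below uses only the 5,000 flows per host per day figure from the talk; the 48-byte figure is the raw NetFlow v5 record size and is an assumption about storage, since real database rows plus indexes will be larger. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
FLOWS_PER_HOST_PER_DAY = 5_000   # rule of thumb quoted in the talk
RAW_V5_RECORD_BYTES = 48         # assumption: raw v5 record, not a database row


def flows_per_day(hosts: int, per_host: int = FLOWS_PER_HOST_PER_DAY) -> int:
    """Expected number of flow records per day for a network of this size."""
    return hosts * per_host


def average_flows_per_second(hosts: int) -> float:
    """Average insert rate the collector and database must sustain."""
    return flows_per_day(hosts) / SECONDS_PER_DAY


def raw_bytes_per_day(hosts: int) -> int:
    """Lower bound on daily storage, counting only raw record bytes."""
    return flows_per_day(hosts) * RAW_V5_RECORD_BYTES


if __name__ == "__main__":
    hosts = 2_000
    print(f"{flows_per_day(hosts):,} flows/day")
    print(f"{average_flows_per_second(hosts):.0f} flows/sec on average")
    print(f"{raw_bytes_per_day(hosts) / 1e6:.0f} MB/day of raw records")
```

These are averages; peak rates during business hours will be higher, so treat the per-second number as a floor when sizing the collector.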
So there are lots of open source tools out there, and there are commercial tools. Why did we build a new one? Like I said, the most interesting open source tool, no offense to any of the other ones, is NetSA from Carnegie Mellon. It's a lot of little widgets that have been written by some really good computer scientists. They're basically an FFRDC, so they're getting funding from the government to do this. And they're great tools, but you can't really pull them off the shelf and use them. They require custom glue, especially from a UI perspective. So we've been focusing on building something that not only has cool analytical capabilities, but has UI capabilities as well. It's not up yet; it'll be up tomorrow. We've been writing this for months, and we've got tons of code. What I wanna do is quickly tell you about our status. We wrote our own parser. Yikes, that should scare anyone, because parsers go wrong a lot. Is that 10? I wasn't seeing two fives, okay. 10, 75. 55. Anyone wanna start singing Van Halen? We'll have a little rock guitar up here for a minute. It is just one more volume, absolutely. It's probably injectable, so have at it. Anyway, it inserts directly into the Postgres database. There are multiple threads per sensor, and it just dumps things in. This gets boring, that's why everyone's leaving right now. Bye, see you later when we play TF2 in the contest room. The UI is all Ajax-y and good. We're just gonna keep on trucking.
A few things to note. Multi-core is good. Linux speed-limits CPU cores. This is really fricking annoying, okay. Most of the modern CPU cores support speed scaling. They default to something like one gigahertz. The really annoying thing is, it tries to preserve power more than give you performance. If you peg one core, but the other cores are idle, and you're telling me 10 as well, you're a little drifted from his clock, sir.
The goons need to NTP sync, so. I got to go back, my 10 is right. Oh, your 10 is right, so he is the stratum one timekeeper. So they scale it down, and if you peg one core, that's not sufficient to get it to raise the CPU speed. So you actually have to go and bring it up, like turn it to 11 by hand, because it won't do it for you unless the box is busy across multiple cores. So this is what we have to do, and you might wanna do this on any of your Linux boxes. Even on general-purpose servers, I've found, some server process will spike, and since the cores don't scale up, you've got a 2.8 gigahertz processor running at a gigahertz, and you're not really getting your money's worth. So you can go into /sys/devices/system/cpu, and for however many CPUs you have, just hard set them all to the fastest speed. It's not very environmentally friendly, question mark, but who cares?
Software RAID takes overhead, but we use it and love it. RAID 0 at full throttle takes about 25% of one core, which isn't that bad, all things being equal, and we get ridiculous speeds. SATA is fast, man. It's fast disk and it's cheap, and RAID 0 is awesome. Also note you can't boot from a RAID 0 root file system. It blinks at you and says, no operating system found. You can make one. So I made one once and rebooted the box, and then promptly stared at basically nothing and had to completely reinstall the operating system.
What are you looking for? Really quickly, some hints for those of you, again, the four people here that actually care about this. If you're looking for bots, there are still IRC bots running around. The promo paper indicated that something like 15% of the bots they found dropping malware in their study were still IRC-based bots. I know a lot of us still use IRC, but most people, like most marketing departments, don't.
If your vice president of marketing and business development is suddenly taking a liking to IRC, he's either taking a liking to little girls or he's owned. Or both. Yeah, in more ways than one. Do you take that to, anyway. And just to be clear, because I know we haven't heard enough DNS talk lately, giggle. So when your clients make DNS requests, where do they go? Your recursive name servers. Not recursive name servers in Romania, right? If you see DNS requests going out the front door of your enterprise to other people's name servers, that's pretty bad. There are bots that leverage that, because port 53 tends to be open a lot. Port 80 bots are hard to find, but not impossible. Again, if you don't like Team Fortress and you want to look at traffic, you can look for port 80 bots. Look at destination addresses. If there are networks of interest, places in the world you don't want your traffic going, you can filter on that and find it. Data exfiltration: again, like I was saying, two to one is a pretty good ratio. Hi, sir, how's it going? Outstanding. Keep on truckin'. Policy violations, performance and utilization. Look, man, if you don't run NetFlow yet and you fire it up, you're gonna have these massive aha moments where you look at it and go, holy shit, that's what my network's doing? No, seriously, software engineers are the craziest people in the world, because they have this blind faith that once the code compiles, the pixies take over. The network is just this magical thing that things flow across, and it's not really true, as it turns out. I remember the first time I did a packet dump for a software engineer who was logging into a production server with Telnet and going to root. And I was like, here, let me fire up Ethereal, and I fire it up and show it to him. He's like, that's how easy it was to find my password? I'm like, it's unencrypted. What do you think that means? Is this surprising?
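Going back to the bot-hunting hints, two of them are easy to sketch against flow records: clients sending DNS somewhere other than your own recursive name servers, and hosts pushing out far more data than they pull in. This is a minimal sketch, not how any particular tool does it. The `Flow` fields, the function names, and the `min_bytes` floor are assumptions for illustration, and reading "two to one" as outbound bytes versus inbound bytes is one interpretation of the ratio mentioned in the talk.

```python
from collections import defaultdict
from ipaddress import IPv4Network, ip_address, ip_network
from typing import Iterable, NamedTuple, Set


class Flow(NamedTuple):
    """Hypothetical minimal flow record; one direction of a conversation."""
    src_ip: str
    dst_ip: str
    dst_port: int
    octets: int


def rogue_dns_clients(flows: Iterable[Flow], resolvers: Set[str],
                      internal: IPv4Network) -> Set[str]:
    """Internal hosts sending port 53 traffic to anything but our resolvers."""
    return {
        f.src_ip for f in flows
        if f.dst_port == 53
        and ip_address(f.src_ip) in internal
        and f.dst_ip not in resolvers
    }


def exfil_suspects(flows: Iterable[Flow], internal: IPv4Network,
                   ratio: float = 2.0, min_bytes: int = 1_000_000) -> Set[str]:
    """Internal hosts whose outbound bytes are at least `ratio` times inbound."""
    sent = defaultdict(int)
    received = defaultdict(int)
    for f in flows:
        src_in = ip_address(f.src_ip) in internal
        dst_in = ip_address(f.dst_ip) in internal
        if src_in and not dst_in:
            sent[f.src_ip] += f.octets       # leaving the enterprise
        elif dst_in and not src_in:
            received[f.dst_ip] += f.octets   # coming back in
    return {
        host for host, out in sent.items()
        if out >= min_bytes and out >= ratio * received.get(host, 0)
    }
```

For example, with `internal = ip_network("10.0.0.0/8")` and `resolvers = {"10.0.0.2"}`, a client at 10.0.0.9 sending port 53 traffic to an outside address shows up in `rogue_dns_clients`. The `min_bytes` floor is there so that a handful of tiny uploads with no reply don't flood the suspect list; tune it, and the ratio, to your own network.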
I read a thing the other day where this guy was gettin' a little bitchy about the latency of gigabit Ethernet and his web app. His web app was making 1,000 requests per page to the database, and he was wondering why it was taking more than a second for all of them to complete, and he blamed the latency of gigabit Ethernet. I blame the latency of the synapses in his little tiny brain. Like, 1,000 transactions per frickin' web page. This is not an unusual situation, right? If you have things like NetFlow, you'll see it. You'll see lots of little packets going back and forth and back and forth and back and forth. That's when you grab the club and you go beat the shit out of people.
Couple things, man, I flew through. There was a little content here, I apologize. Just go download the code later. A couple plugs. First of all, I don't get paid for any of this, but there's this dude, Edward Tufte, who deals with visualization stuff exclusively, and a lot of the analyst's problem is finding proper ways to represent data. This dude has Kool-Aid, man. Like, Jonestown-quality Kool-Aid about data visualization and how to represent data for engineers, for all types of people. So if you get a chance, go to Borders, the greatest lending library in the world. Read his books. Go to one of his classes. He teaches day-long classes. You get the books when you go to the classes. It's really pretty fantastic. He's got a thing online you can go read: the Gettysburg Address as a PowerPoint slide. It's what Lincoln would have done if he were a Bush. Some Republican got a little twitchy. I'd also like to thank Kenshoto, because they donated the box that I ended up not using, but we did use the five-port switch for the thingy, the Team Fortress thing. And sk3wl0fr00t is still dominating, question mark, I presume, so sk3wl0fr00t's still dominating, so there may not be a last-minute crush on who's gonna win the CTF event.
Parting thoughts. Whatever, go download some stuff really quick.
The most important slide. ShmooCon! Yay! So these are basic; the dates are right. The top line is correct, and it gets fuzzier as you go down. February 6th through 8th, 2009, at the Alexis Park. There you go, we're moving ShmooCon to Vegas. No, we're at the Wardman Park Marriott in D.C. The CFP will probably open around September 1st. You know how the CFP process works at the hacker cons, right? You just kind of start sending stuff in. Someone says, oh, thanks, and you're off and running. Ticket sales, mark your calendars. They'll probably be on these days at noon, and we mean noon. And for those of you that haven't tried to buy ShmooCon tickets before, they're apparently kind of a hot item, and you need to be hitting refresh a lot at noon to get a ticket. So in all seriousness, if you'd like to come to ShmooCon, we'd love to have you. Attendance is limited. We don't know the exact upper end, but you're probably gonna wanna buy on November 1st and not try to buy on January 1st. Further, support your local hacker con. They need you. PhreakNIC, Notacon, LayerOne, ToorCon, ChicagoCon, SecTor, et cetera, et cetera, et cetera. There are gatherings like this that are smaller, in other places. They're very interesting as well. They're all a little bit different, usually in good ways. I encourage you to go out and try one on and see how it fits. I think that's it. I'm almost out of time and I wanna go play Team Fortress, so catch you all later.