I'm sure you'll be very underwhelmed by my presentation, so please don't clap. Starting off, we're going to be talking about anomaly-based intrusion detection systems and why, as I like to say, they're a haxor's best friend. First, before I start, let me ask a couple of questions. How many people here are familiar with anomaly-based intrusion detection systems? Wow. That's pretty cool. Normally, when you go out to other places, people have never heard of them. How many people have worked with them? That's still a pretty good number. If at any time during this speech you have a question or concern, please get my attention. I'm kind of blind, so you may have to shoot up flares or something, but get my attention and I'll be more than happy to address your question. Yes? Do I have any connection with ISS? No. Even though I'm in Atlanta. You know, it's a popular misconception that everybody in Atlanta works for ISS, but it's actually not true. There's four people that don't work for ISS, and I'm one of them. I'm going to start off with the history of the castle. What is anomaly-based intrusion detection? How does it differ from signature-based systems? Well, intrusion detection has had the signature-based system for quite a long time. Basically, it looks for a static thing in a packet, like a certain string, that will identify it as a certain type of attack. For instance, password file formats or whatnot. Anomaly-based intrusion detection systems rely mostly on creating a network baseline, something they can compare against, and then checking how much a host or a network segment varies from that. Meaning that over time, as things grow, the theory is that your network is not going to change much. And if it does change, the anomaly-based intrusion detection system should be able to integrate that change.
However, when attacks happen, something that's unusual, like a new port being opened that people are communicating heavily with, a new web server that's running, or a new application, it will be able to catch this quickly and you can respond to it. Where do they come from? Anomaly-based intrusion detection systems have their birthplace in a gap in the signature-based system. Signature-based systems have a problem because they require a signature to work. They require that you know about the attack. If your attack is zero-day, then there's generally no signature for it. So if you know an attack that no one else does, it can go on for quite some time until it's discovered, attacking and compromising hosts with reckless abandon, basically. Until something is developed to mitigate this, there's not much you can do. That's where anomaly-based intrusion detection systems came from. They're just looking for things out of the ordinary. What drives their development? That gap I just mentioned: the gap between the creation of the exploit and the creation of the signature. How do anomaly-based intrusion detection systems work? A baseline is normally gathered during a tuning phase. This means that in order to create a baseline, the anomaly-based intrusion detection system has to run for a certain amount of time on your network to gather enough data to decide what is normal and what is not. The tuning phase varies between products, but it ranges from three days to a month. Say again? It depends on how the particular system is implemented. This is true, but then that's a protocol anomaly-based intrusion detection system, not a general anomaly-based intrusion detection system. What he's asking is if you have an intrusion detection system that just looks for packet anomalies. For instance, your sequence number is out of the range of an RFC-valid sequence number, or you have flags set that shouldn't be set in concert.
That is also a type of anomaly-based intrusion detection that most vendors do implement alongside their behavioral-based systems. However, unless you have a really dumb attacker, you don't have much use for that, because most people who write attacks now don't really want to vary from the RFC, since signature-based systems can also detect those if you have a tight enough system. Did I answer your question? All right. The theory behind them: traffic that's not been seen before is bad. This basically means that when you go through your auto-tuning phase, you're creating a baseline, and things that are not seen during your baseline are exceptions, or generally bad. An example of this would be: you have a network. You create your baseline for your network. A week later, somebody installs Windows Media Player, for instance. And then when they run Windows Media Player to get streaming content, the anomaly-based system will pick it up and flag it, which actually takes security admins a certain amount of time to mitigate. They've got to research this. Attacks cause things that the system has not seen before. This relies on the theory that if you're going to attack the system, there's got to be something about your attack that is strange or outside the normal activity generally seen on the network. That's not true in all cases. Can science proved in research labs work in the wild? This is an interesting question, because most of the behavioral-based anomaly intrusion detection systems are based on theory and proofs. Meaning that while there are formulas for everything, they don't take into account growing networks or changes in different environments. Basically, they assume that everything is easily representable. Academics can build rockets, but can they build security products?
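The baseline idea described above can be sketched in a few lines of code. This is a toy illustration under assumed semantics, not any vendor's product: during a tuning phase the system records which flows are "normal", and afterwards anything it has never seen, like that new Windows Media Player stream, gets flagged. All names here are hypothetical.

```python
# Minimal sketch of a flow baseline: learn (src, dst, port) tuples during
# tuning, then flag any flow not seen during the tuning phase.
class FlowBaseline:
    def __init__(self):
        self.known_flows = set()
        self.tuning = True

    def observe(self, src, dst, dport):
        """During tuning, learn the flow; afterwards, return True if anomalous."""
        flow = (src, dst, dport)
        if self.tuning:
            self.known_flows.add(flow)
            return False          # nothing is anomalous while tuning
        return flow not in self.known_flows

    def finish_tuning(self):
        self.tuning = False


baseline = FlowBaseline()
# Tuning phase: the workstation talks to the web server and DNS.
baseline.observe("10.0.0.5", "10.0.0.80", 80)
baseline.observe("10.0.0.5", "10.0.0.53", 53)
baseline.finish_tuning()

# Known traffic is quiet; the new streaming-media flow is flagged.
print(baseline.observe("10.0.0.5", "10.0.0.80", 80))    # False (normal)
print(baseline.observe("10.0.0.5", "64.4.8.9", 1755))   # True  (anomaly)
```

Real products keep far richer statistics than a set of tuples, but the core assumption is the same: whatever wasn't seen during tuning is suspect.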
My point in this statement is, unless you've actually looked at network flows for a long time and spent a long time debugging network problems, you won't realize how hard it is to recognize new attacks that you haven't seen before. I'll get to more of that later. Hardware considerations. Anomaly-based intrusion detection systems require slightly different hardware than signature-based ones. They're generally going to be your standard x86 systems with high-speed cards, but they need to be a little bit more scalable. Most anomaly-based intrusion detection systems look at things like protocol headers and leave payloads alone. Generally, they're a little bit faster than signature-based systems, which have to actually inspect the payload. However, there are some things that you need to keep in mind. One of these is the amount of data you're gathering. To make an efficient anomaly-based intrusion detection system, you have to gather enough data to do trending over time. You basically have to be able to take your baseline and augment it. That requires a lot of overhead. As far as speed goes, you need to make sure that your sniffer is fast. In addition to that, you need enough juice left in your processor to run your engine. This isn't that big a deal on a Class C, but when you get to a Class B or greater, this becomes a very big deal. The castle is built on a shaky foundation. It's built on a foundation of science, basically, like I mentioned before with the proofs and whatnot. Basically, you're starting with a problem that's unwieldy in size. You have to gather all the data on your network to make this work, generally over a period of time, and that's obviously a lot of data. You have problems with data structures in memory. You have problems with hardware limitations. You have problems with integrating network changes into this.
Meaning that if you suddenly decide you want to bring up a new network or a new set of hosts that hasn't been seen by the system before, you're going to have to go through a learning phase again, basically, to integrate this into your system, which can be a problem and offers an opportunity for attack. There are some things that most anomaly-based systems assume. Once again, when I say anomaly, I mean all types of anomaly systems, not just behavioral ones. What hasn't been seen before is bad. Machines have normal patterns that can be easily distinguished. Networks don't change. These are somewhat half-truths. I talked about this briefly before, but let me expand a little bit. Anomaly-based intrusion detection works on the principle that it's going to create a baseline, and over time it will work from that baseline and only evaluate changes to it. Unless you have a system that integrates changes to the network over time, you'll have some problems, because there's no network that doesn't grow. In my nine years of consulting, I've never seen a network that doesn't grow over time. This raises problems for administrators of anomaly-based systems, because they have to integrate these changes into the baseline. Machines have normal patterns: that's kind of a strange one. Unless you understand your operating system thoroughly, you can't exactly understand everything your operating system does. An example is a random broadcast from a host for Rendezvous. Things such as this. Networks that don't change is a fallacy. Networks are always changing. They very seldom stay the same, unless you work for the Department of Defense. Why is it hard to tell good traffic from bad traffic? Anomaly-based systems rarely look at the payload. There are some that will inspect the payload, but many of them don't. They don't care about the payload.
They're looking at your headers for things like protocol anomalies, or they're looking at flow information for things doing stuff that hasn't been seen before. For instance, talking to new hosts that haven't been seen before, or talking on services that haven't been seen before. This is a serious problem, because for most attacks you need the payload to understand what's going on. If a network runs in a non-standard configuration, there are problems. This basically means if you have in-house-developed applications, most anomaly-based intrusion detection systems have problems with these, which is an easily exploitable avenue. Attacks against common-use services on the machines are ignored. An example of this: web server attacks against public web servers go unnoticed. If there was a machine on your network that was serving port 80 to the world and I wanted to attack it, then based on the model the anomaly-based intrusion detection system uses, I could easily attack and compromise this host, and the anomaly-based intrusion detection system won't see much of a problem with this, because all it will see is data flowing to and from the host, which is normal. Like I said before, most anomaly-based intrusion detection systems rely on flows. Am I going too fast? All right. Say it again. Louder. Sorry about this. This is my first time really speaking in public. My parents kept me locked up for a long time. Don't laugh. I'm serious. Well, this example extends to many things. If you run services that are open to the world, for instance a public web server, or DNS for other companies, services that you're used to a lot of different hosts speaking to, most anomaly-based intrusion detection systems will have problems with these, detecting attacks that is.
Now, unless there's a very glaringly obvious problem, like a protocol header that doesn't match the RFC or something such as this, you're not going to see it, which is a big problem. Take the Apache chunked-encoding exploit on 443. A lot of anomaly-based intrusion detection systems will think that traffic is just fine, because they're used to seeing people talking to and from HTTPS. When in reality, you actually have a glaring payload that says, hello, I'm going to attack. So keep this in mind when you're attacking networks that have this. Configuration problems are a serious issue. A lot of anomaly-based intrusion detection systems, specifically behavioral-based systems, rely on a concept of inside and outside. That being, the inside network is what you're protecting, your specific range of hosts, and the outside network is everything else. So if things aren't set up quite right, you're going to allow communication between hosts that shouldn't be talking, and that will go unnoticed. Configuration is a large problem because this is still a relatively new type of product. Most people who are implementing it have configuration problems, such as not allowing changes to the network to be integrated into their baseline over time. This basically means that when somebody does start that new Windows Media Player, or they just downloaded AOL Instant Messenger for the first time and they run it, that host becomes, basically, a high-risk host. Most behavioral-based systems and most anomaly-based systems have a different way of dealing with this, and this can give administrators varying degrees of headaches based on how those changes are being integrated over time. I'm sorry, I'm really nervous. How to tear the castle down. Now I'm going to talk about how to attack it, and this is actually what I'm really good at. More noise, less accuracy: this is basically the theory that we talked about briefly before.
If an anomaly-based intrusion detection system has not seen this type of traffic before, it's bad. Now let me give you an example. Everybody here is familiar with Nessus, I'm assuming. Now, if you run a Nessus scan against a set of hosts that are being protected by an anomaly-based intrusion detection system, you'll notice something very, very funny. Every one of those hosts is now a very bad host. The reason for this is the response to your scans is being logged in the state table of the anomaly-based system, and it hasn't seen this type of traffic before, so generally it's all bad. So if you want to do a very simple attack on a machine that's used to not seeing much traffic, the best way to do it is to raise the alert level of every other machine around it. This is similar to doing an ARP storm to hide a port scan. Properly crafted packets will cause inside machines to appear as attackers. That's what I was just basically talking about. Based on the type of system being used, it's trivial to craft packets to make those machines appear to be very, very bad. An example is the latest 55808 problem. Everybody's been reading about that. It was a month or so ago. The ISS X-Force named it Stumbler. Stumbler. Right. The anomaly-based intrusion detection system I work with was set to flag any packet with a window size of 55808 as a bad packet, which made making machines look bad very easy, because all I had to do was use a tool like libnet to craft packets going to a firewall with the SYN flag set and a window of 55808. When the packets were reset and sent back, they, strangely enough, had the same window size and were flagged as bad packets, because the anomaly-based intrusion detection system thought that the inside machines were generating these packets. This made all those machines look very bad, and the administrator watching missed that one very critical CRC32 exploit that went by.
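The libnet trick above comes down to setting one header field. As a sketch, here is where the window field actually sits in a TCP header. This is a Python illustration using only struct: it builds the 20 bytes and parses them back, it doesn't send anything, and the checksum is left at zero, so treat it as a diagram in code rather than a working packet injector.

```python
import struct

def build_tcp_header(sport, dport, seq, ack, flags, window):
    """Pack a bare 20-byte TCP header (no options, checksum left at 0)."""
    offset_reserved = (5 << 4)  # data offset: 5 x 32-bit words
    return struct.pack("!HHIIBBHHH",
                       sport, dport, seq, ack,
                       offset_reserved, flags, window, 0, 0)

SYN = 0x02
hdr = build_tcp_header(4444, 80, 12345, 0, SYN, 55808)

# The window field lives at byte offsets 14-15 of the TCP header.
window = struct.unpack("!H", hdr[14:16])[0]
print(window)  # 55808
```

A responder's RST echoing properties of such packets is exactly what made the inside machines look like the source of the "bad" traffic.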
Covert channels. This is actually the best way of handling anomaly-based intrusion detection systems. You're basically hiding your data in plain sight. Is everybody familiar with Loki? Yeah, Loki's bad. It doesn't work that well against anomaly-based intrusion detection systems, because it violates that RFC stuff we were talking about before that most of these systems love. Now, what would work is something a little different, like a low-and-slow attack, meaning that your attack is going very slowly and taking a large amount of time. This isn't a particularly fast thing. However, it will most likely time out of the state table in the system itself, because what you're trying to do is make your traffic appear as normal as possible. So an easy way of doing this is hiding information in something like a sequence number. As long as it's following the RFC, you're good. Now, as an example of this, let's go back to the 55808 Trojan that was first found by Intrusec. It was a very simple piece of code that relied on the window size. Now, the versions that were found apparently opened a pcap filter and sniffed for... Do you have a question? Oh, I'm sorry. You stood up so suddenly, I thought you were going to say something. Throw something at me or something. It appeared that it would just open the pcap filter and sniff for anything with that window size. When it pulled that window size out, it would do a conversion on the sequence number, and based on the sequence number it would extract a new IP address, which would allow communication to a new host that something inside a NATed network or a firewalled network would be able to communicate with. Something like this is something that an anomaly-based intrusion detection system would fail at very badly. Signature-based systems generally have a better chance of detecting things like this, because you have something that's actually looking at the payload.
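The sequence-number trick just described is simple to picture in code. The sketch below is my own illustration, not the Trojan's actual code: a 32-bit TCP sequence number has exactly enough room for an IPv4 address, so the sender packs the four octets into the sequence field and the listening host unpacks them. Any conversion scheme works as long as both sides agree; this one is the most direct.

```python
import struct

def ip_to_seq(ip):
    """Encode a dotted-quad IPv4 address as a 32-bit TCP sequence number."""
    return struct.unpack("!I", bytes(int(octet) for octet in ip.split(".")))[0]

def seq_to_ip(seq):
    """Recover the IPv4 address hidden in a 32-bit sequence number."""
    return ".".join(str(b) for b in struct.pack("!I", seq))

seq = ip_to_seq("192.168.1.7")
print(seq)             # the address, hiding as an ordinary-looking sequence number
print(seq_to_ip(seq))  # 192.168.1.7
```

Since any 32-bit value is an RFC-legal sequence number, nothing about the packet itself looks anomalous; only long-term statistical analysis of a host's sequence-number behavior could catch it.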
Unless you're actually reading the payload, you have a hard time writing a signature for it, and if it's done completely randomly and you're relying on your anomaly-based intrusion detection system, this covert channel would go unnoticed. An example of this attack: say I have compromised a machine on the inside of a network that's being firewalled, and I have a rootkit running on this host that's set to listen for packets with that window size. Now, this would be a configurable option, and the inside host would have to know what it is, but that's pretty simple. So I send in a packet with a window size of, we'll go back to 55808. It pulls out the IP address. It starts communicating back to a new host, but it's doing it at such a slow rate that the system thinks it's either normal traffic or such a low-grade anomaly that it doesn't even warrant flagging as a high risk. This is generally the easiest way to defeat an anomaly-based intrusion detection system. Then we go to everybody's favorite, if you're a script kiddie, which is flooding: you just get a really big pipe and you blast it. You generally time out the state table. You create so many events that in some systems you'll have overflows in the state table. This is a less desired method. It's a very loud method, and normally the people doing this will be found. However, it's very useful for a quick and dirty attack. Breaking traffic analysis. Now, we talked briefly about the way a baseline is created and what's done with it over time. A lot of these systems have a mechanism to take changes and integrate them into the baseline over time. Now, with a low-and-slow attack like I was describing earlier, over time you can teach an anomaly-based system that your PeopleSoft server is indeed supposed to be doing IRC traffic, and everybody loves that.
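Teaching the baseline, as in the PeopleSoft-doing-IRC example, can be sketched as well. This is a toy model under an assumed policy: many adaptive systems fold a flow into the baseline once it has recurred often enough without incident, so an attacker who sends the forbidden traffic once per update cycle eventually makes it "normal". The merge threshold here is invented for illustration.

```python
# Sketch of baseline poisoning: a recurring unknown flow alerts for a few
# update cycles, then gets absorbed into the baseline and stops alerting.
class AdaptiveBaseline:
    MERGE_AFTER = 5   # assumed: cycles a flow must recur before absorption

    def __init__(self, known):
        self.known = set(known)
        self.pending = {}

    def end_of_cycle(self, flows_seen):
        """Called once per update cycle; returns the flows that alerted."""
        alerts = []
        for flow in flows_seen:
            if flow in self.known:
                continue
            self.pending[flow] = self.pending.get(flow, 0) + 1
            if self.pending[flow] >= self.MERGE_AFTER:
                self.known.add(flow)      # absorbed: now considered normal
            else:
                alerts.append(flow)
        return alerts


b = AdaptiveBaseline(known={("peoplesoft", 8000)})
irc = ("peoplesoft", 6667)
for cycle in range(7):                    # low and slow: one flow per cycle
    b.end_of_cycle({irc})
print(irc in b.known)  # True: the PeopleSoft server now "does IRC"
```

The trade-off for the defender is ugly: merge too eagerly and you can be trained like this; never merge and every legitimate network change buries the admin in alerts.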
That's great for breaking traffic analysis. A lot of people assume that if you're reconning a target, like general wide Nmap scans, Nessus scans, things like this, then very shortly thereafter you're going to try to compromise the hosts you're looking for. This is the way a lot of worm behaviors work. For instance, the Slammer worm starts looking for all the MSSQL hosts it can find. Once it finds one, it attempts to compromise the machine by sending itself to it. If you spread your recon and your attack over a great period of time, you can actually fool the system, which is pretty simple to do. Generally, you have to look at the site you're attacking. A good example would be a single Class C. Let's say there's a company you wanted to attack that had a small network, a Class C, that was running a system such as this. You would have to look at what's going in and out. You can find out what kind of bandwidth they have pretty easily. Let's say they have a T1, so you know what their bandwidth constraints are over time. You, at that point, can begin your recon on your target by understanding the data flow in and out. Over time, you can understand how much the state table will grow based on the data it has. I'm really tanking here. I'm sorry about this. Right. What we're basically talking about is state table size. When you have a smaller network with a smaller amount of bandwidth, the state table's not that big a problem. For instance, with a small company on a small Class C, you'd have to wait weeks and weeks and weeks to time out the state table. Now, let's go to something a little bit bigger. Let's say you're attacking a company with a Class B, and this Class B is being protected by two anomaly-based intrusion detection systems in a failover configuration.
Now, you know that this Class B is being fed by an OC-12. Everybody knows what the speed of an OC-12 is. You can do some guesstimations based on the system they're using: you know how much RAM an x86 system can take, you know how much disk space is possible, so you know how big their state table can get. With pretty accurate guessing, you can figure out how long it will take to time out the state table. You send in your recon. Your recon will be noted by the anomaly-based system, but it doesn't think it's that bad yet, because it's only recon, port scanning and whatnot. Over time it will be timed out, because there's no follow-up activity from that host. After that, after your guesstimation, you can pretty much do whatever you want, but you have to keep it low-key. And my last philosophy for breaking traffic analysis is blend and roam. A good way to break traffic analysis is to understand what the host you're compromising is doing. For example, take a workstation. If you compromise a workstation that makes a lot of outbound SSH connections, it's pretty easy to determine what that host does once it's compromised. If you mimic the activity of that host, your traffic looks normal. So with the SSH example, if you were to compromise a host, your rootkit has to be portable enough to bind to any port and then send its data to destination port 22, so the anomaly-based system will think this is normal traffic once again. Flaws in the system: attacks against the system itself. This is a little bit iffy, because in order to do this stuff, you have to know what system they're running. For a lot of institutions, you can search Google for PowerPoint presentations they wrote to get funding for their systems. By doing that, you can learn what types of systems are running.
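The state-table guesstimation a couple of steps back is just back-of-the-envelope arithmetic. Every number below is an assumption I'm inventing for illustration (average flow size, link utilization, entry size, sensor RAM); only the OC-12 line rate of roughly 622 Mbps is a known constant. The point is the shape of the calculation, not the specific answer.

```python
# Back-of-the-envelope estimate of how fast a sensor's state table fills,
# and therefore how aggressively old entries must be timed out.
oc12_bps       = 622_000_000   # OC-12 line rate, bits per second
utilization    = 0.30          # assumed average link utilization
avg_flow_bytes = 20_000        # assumed average bytes carried per flow

flows_per_sec = (oc12_bps * utilization / 8) / avg_flow_bytes

entry_bytes = 64               # assumed size of one state-table entry
table_ram   = 2 * 1024**3      # assumed 2 GB of RAM usable for state

entries  = table_ram / entry_bytes
fill_sec = entries / flows_per_sec
print(f"~{flows_per_sec:.0f} new flows/sec; table full in ~{fill_sec/3600:.1f} hours")
```

Under these assumptions the table fills in a matter of hours, so entries must be expired far sooner than that, and an attacker who waits longer than the expiry window between recon and attack has effectively been forgotten.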
Based on that, you can determine what's vulnerable to attack: buffer overflows in sniffers and things like this. A simpler method is to attack what's feeding the system, the router or the aggregation point for the system, because although the system may function properly, if it's got no data, it's pretty useless. Is there any way to fix anomaly-based systems? Yes and no. The best way to fix an anomaly-based system is not to rely on it by itself. Use it in concert with something else, such as a signature-based system: like somebody asked earlier, ISS, or Enterasys Dragon. There's tons of signature-based systems. My favorite happens to be Snort. Used in concert, you have something that's looking at the protocol header and you have something that's also looking at the payload. As far as the problems being fixable, that would take a great deal of time. Basically, more science needs to be developed, and these systems need to be tested more in the wild. I'm sorry, I flew through this. Is it useful? Is an anomaly-based system useful? The answer is yes. Yes, it's useful, but not by itself. Like I said earlier, it has to be used in concert with something else. Who do they keep out? They do a really great job at keeping out things like worms and automated attacks. I'm sure everybody's familiar with the recently found vulnerability? It won't be long until ISS is releasing a... I love ISS, because the guy up front asked me if I worked for ISS earlier. I love using them now. It won't be long until there's a worm out for it. Since it's such a widely spread hole, this is going to be a big problem. An anomaly-based system would do a very good job of protecting you against something like this. It'll also do a good job of protecting you against spammers or things that generally create loud network footprints that are out of the ordinary. What they won't protect you against is a dedicated attacker who wants whatever you have.
If given enough time, anybody can learn to mimic anything inside your network, which will pretty much render the system useless. How can they be better? Correlation. Correlation is good. Did somebody boo? Oh, boo. Yeah, correlation systems. You've got to have a correlation system: correlating your data from an anomaly-based system with a signature-based system, as we described before, as well as with vulnerability detection or port scan data. For instance, if you see a lot of packets going to a port that's not open, that should give you an indication of something wrong there. The problem with this is I can write rootkits now that will work on closed ports. You can get packets sent to a port that's closed, and I can do something strange based on the information in that packet, although it's not technically received, and things like a signature-based system would have problems with that. This could be a little better. What does this all mean for your standard system attacker? I'm a penetration tester. I like breaking things. This makes my life a little bit more difficult. This means I have to do more recon on my target. That's basically about it. It won't change the methodology with which I break into machines. It just means I have to spend more time making sure I know what kind of machines I'm breaking into. And that was my presentation. That was... Wait, wait, wait. I'm not quite done. I flew through this really quickly because I thought my slides were really lame, so I think I'm going to tell you now what my real opinion of these systems is. They're great. They really are, as long as you're using them with something else. If you're facing a determined attacker, and I spend most of my time attacking systems as a pen tester, they're not going to keep that single determined attacker out. So if you're betting your infrastructure on this one type of technology, you're pretty screwed.
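Before the questions, the correlation point from a moment ago is worth one small sketch. This is a toy join, not any SEM product's logic, and all the hosts and ports are made up: an anomaly alert alone is noise, but an anomaly alert plus a signature alert for the same host, or traffic to a port a scan showed closed, deserves escalation.

```python
# Toy correlation: intersect anomaly alerts with signature alerts, and
# cross-check observed traffic against known-open ports from a scan.
anomaly_alerts   = {"10.0.0.7", "10.0.0.9"}
signature_alerts = {"10.0.0.9"}
open_ports       = {"10.0.0.7": {80, 443}}
traffic_seen     = [("10.0.0.7", 31337), ("10.0.0.7", 80)]

# Hosts flagged by both technologies are the ones worth waking up for.
escalate = anomaly_alerts & signature_alerts

# Traffic to ports that aren't open is suspicious on its own.
closed_port_traffic = [(host, port) for host, port in traffic_seen
                       if port not in open_ports.get(host, set())]

print(sorted(escalate))        # ['10.0.0.9']
print(closed_port_traffic)     # [('10.0.0.7', 31337)]
```

As noted above, the closed-port check is exactly what a rootkit listening passively on a closed port is designed to slip past: the packet arrives, is never "received", and still carries its message.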
And if you're an attacker, it's pretty good for you, because if somebody's relying on this, it's a lot like people in '97, '98, '99 who relied only on firewalls for security without properly configuring them. You can have a heyday with this. Are there any questions? You. He's asking if I've worked primarily with network or host-based systems. I've worked with both. And host-based anomaly detection has a very good future, especially for discovering things like kernel-level rootkits, because if you can detect anomalies in things like your system calls in the kernel, you can do a great job with that. This won't stop somebody from getting in initially. However, once they're in, it makes getting them out easier. Any other questions? You. Are you asking if the anomaly-based systems are inline, if they're intrusion prevention systems, basically? Yes? No. Right now, anomaly-based systems are pretty much a new toy. Not many people have started adopting them. This is going to change. They do have a place, and people will be using them. They're not reliable enough to use as a prevention system, because what they can basically tell you is: this host is doing something strange. I don't know what it is, but it's strange. And based on that, you can't really prevent an action. Yes? No. His question was, have I seen products that are very effective at detecting covert channels? The answer to that is no, because basically in order to do that, you have to gather all the data that's going through the network forever, and use that for statistical analysis of what doesn't quite look right. And then you have to compare that with things like the sequence number count from your host, and ask why this packet is being generated with a sequence number over here when everything else is falling in a predictable pattern. There's nothing right now that really does this. Yes? What anomaly-based systems have I worked with? That's a funny question.
I could answer, but based on non-disclosure agreements, I could be sued. Would you like to pay my legal bills? I'm looking for someone to, if anybody would like to volunteer. Yes? Which one's my favorite? Tcpdump with a lot of Perl scripts. That seems to work well, because if you're a competent network administrator and you know what your network is supposed to be doing, tcpdump and a couple of Perl scripts will let you know when it's doing something you don't think it should be. Yes? Any open-source ones? My realm of knowledge is mostly in the commercial area. To my knowledge, at this time there's not, but I could be wrong. Say again? I'm sorry? Spade? How do I feel about that? That would work, I suppose. That's a different discussion. There's problems with that in my experience. I didn't really view that as an anomaly-based intrusion detection system as much as a, actually, I don't know how to describe that. It's a great tool, though. It mostly does statistical analysis of stuff. Any other questions? Do I know of any products that correlate data between anomaly-based and signature-based technologies? The SEMs that I'm familiar with are mostly working on that technology now. GuardedNet would be the biggest one. There's some open-source projects working on it, but they still have some time until they're effective. Other questions? Yes? There are fingerprinting techniques, but that's an entirely different discussion. Just talking about that would take a long time, so I won't begin. But if you find me afterwards, we can discuss that. Any other questions? No other questions? Okey-dokey. Wait, a question. About reactive or prevention systems? His question is, what's my opinion on the implementation of these over time? Right now, there's not an IDS, anomaly- or signature-based, that I know of that doesn't generate false positives. And until this problem is cut down greatly, no one is going to trust an inline prevention system. Now, there's several companies that are doing research based on this.
There's several companies that are very close, ISS being one of them with their prevention appliance. There's some other companies, but right now the problem is, especially as a security administrator, when you wake up in the morning to 10 million alerts because somebody was randomly port scanning your network, you don't really want to trust whatever decided that was a critical enough alert to wake you up at 3 in the morning to go randomly block hosts. So until administrators feel comfortable that whatever they're using isn't seeing a great number of false positives, there's not going to be anything that can be done about the psychological part of implementing it. Yes, sir? Perhaps I should tell you about some research I did for about a year that is consistent with yours. I was analyzing the rates of traffic to a lot of the systems that we had, not looking at absolutely everything, but trying to do anomaly-based detection on database access traffic. You know, if you use Oracle databases, SQL*Net is something that you can peek inside of; you can get rates. It turned out that establishing a baseline just didn't work for us. There were variations which I could readily see in patterns of use on a daily and weekly basis, and in order to establish a baseline it took so long that by the time I could get a baseline, people were changing what they were doing. So the only things that were really useful in that kind of analysis were really huge variations. That would be consistent with my philosophy as well. Any other questions? We have one right over there. Am I your daddy? Yes. Yes, I am. Especially if you're running anomaly-based detection as your only method of security, I will be your daddy. And you will get an email tomorrow about that.