POC. So if there's any confusion there, I apologize. I'm going to be talking about the anatomy of denial of service and distributed denial of service testing that I was able to do as part of a contract with a large ISP in the western area. I won't mention their name. Okay, this is not an advertisement, it's not an endorsement, and it's not a claim that I know how to program anything remotely related to denial of service or distributed denial of service products. As we get into it, you'll see how we got our hands on the actual tools we used. But I'm not a programmer, so some of you may have programmed the ones we used; we shall see. We'll cover why we tested, the methodology we used, the challenges and lessons learned, and the findings associated with those. Why? We needed a product to protect our infrastructure, the data, and business continuity. Another reason the testing occurred was that the ISP had just been beaten up by several denial of service attacks recently, so they decided to finally do something about it. Part of what we needed to do was evaluate emerging technologies: traditional protection against denial of service is limited, so we had to look at what new technologies were out there, and we'll hit what those are. So this is ultimately going to end up being a review of the products we tested. The other reason the problem is getting worse is that there are quite a few tools people can get their hands on very easily. Once they learn how to acquire zombies and start distributing the tools, they can quickly bring down sites, either websites, or just deny service across networks. Okay, nice pictures. Again, the problem is getting worse. Let's see: 39% of the people polled had detected or determined they had suffered denial of service attacks. System unavailability was the fourth largest area of concern that this particular survey had identified.
And the second most important project was the security and availability of websites. Again, availability being the opposite of denial of service, I guess. What were we looking for? We had to find solutions that would work in the infrastructure, something that would work at gigabit speeds, both GigE and multi-mode fiber. We also ultimately needed to find solutions that would scale to the OC-48 and OC-192 levels. They're not there yet, but again, we're hoping they will scale. We also needed something that would ultimately work toward protecting the customer: again, looking at the gigabit levels, fiber and GigE, 10/100, which is the most common customer connection at this particular ISP, and then eventually being able to roll out to the OC-48 and OC-192 levels. Here's the list of products we tested. I got permission from most of them to talk about them, so we should be in good shape. On the passive tap solution side — let me explain that for just a quick second. We looked at two categories. One was something inline that can hopefully implement protection against a denial of service as soon as it's detected. The other was something we called a passive tap, for lack of a better name, which was supposed to sit hidden on the network, with no network-side IP addresses, detecting and receiving information. In the passive tap category, we had Arbor Networks, Reactive Networks, Mazu Networks, and Asta Networks. On the inline solution side, we tested Captus Networks and Mazu Networks — Mazu actually had two products in the test. I did want to point out that each of these vendors was extremely supportive throughout the testing process. Of course, they want to sell product, but they're also utilizing these head-to-head tests as development for their products: they get lots of feedback, information that's useful in helping develop their products. Now, the methodology we used.
Basically, there are just a few things you can do for denial of service prevention today: reverse path filtering, ingress filtering, egress filtering, stopping directed broadcast traffic, and basically unplugging your computer, unplugging your network, and going back to pen and paper. I tried that. I can't find any pen or paper, so we're going to have to look at something different. We needed to imitate a customer hosting center. This involved setting up a network lab — we'll see the diagram here shortly — with equipment similar to what would be in the hosting center. We needed to run real-world tests with products that are easily available out there, so we found the different tools from that aspect. We needed to test both the network functionality side of the products and also the management side. We needed to find solutions that would work farther upstream in the network — not all the way down at the customer, but preferably up closer to where the data comes into the ISP itself, to help stop some of this stuff where it can still be controlled, where the bandwidth is a little bit higher. This is a basic snapshot of what our test environment looked like. Where it says attack network — which I guess for you is on this side — there were several attack boxes set up with the various tools we needed to do the denial of service attacks. On the far side, the victim network side, we had three to four victim computers that were ultimately the targets. We had a combination of Cisco and Juniper products, which was fairly common for the ISP itself, including GSRs on the Cisco side, where we had multi-mode fiber as well as 10/100 Ethernet connectivity. Let's see. The two Cisco 6500s — 6509s, actually — on either side of a large GSR were used for port mirroring.
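To give a feel for what one of those techniques — reverse path filtering — actually checks, here's a minimal Python sketch. This is my illustration with a made-up two-interface routing table, not any router's implementation:

```python
import ipaddress

# Hypothetical routing table: prefix -> interface the router would use
# to send traffic toward that prefix.
ROUTES = {
    ipaddress.ip_network("10.1.0.0/16"): "eth0",
    ipaddress.ip_network("10.2.0.0/16"): "eth1",
}

def rpf_accept(src_ip: str, arrival_iface: str) -> bool:
    """Strict reverse path filtering: accept a packet only if it arrived
    on the interface the router would itself use to reach the packet's
    source address. Spoofed sources usually fail this check."""
    src = ipaddress.ip_address(src_ip)
    for prefix, iface in ROUTES.items():
        if src in prefix:
            return iface == arrival_iface
    return False  # no route back to the source at all: likely spoofed

print(rpf_accept("10.1.5.9", "eth0"))  # legitimate path -> True
print(rpf_accept("10.1.5.9", "eth1"))  # wrong interface -> False
```

The same idea is what a router's strict uRPF feature implements in hardware; the sketch just makes the accept/drop decision visible.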
We were trying to test multiple products at one time, and to do that, we had to use port mirroring to get the data flowing through. Okay, on the passive tap testing side, we wanted no network-side IP addresses. That was the desire; it was not always the case, depending on the product. Data mirroring meant that all they were getting was pass-through data, not two-way stateful communication. The desire was that these products not be a single point of failure in the network. If you have an inline device, certainly one without high availability, you have the issue of it becoming a single point of failure. We also looked at the products because they offered multiple levels of protection. In this case, all of them basically recommended ACLs back on the routers themselves: what kind of traffic, what IP addresses, what IP ranges they were going to block. They were automatic, semi-automatic, or just basically a flag or an alarm to let you know there was an issue. Yes? I'm sorry, I can't hear you. I still couldn't hear you. You say the GSRs do not do ACLs? Okay, we weren't trying to do ACLs on that particular box. We were looking at basically the ingress side and the egress side, not the center. But thank you. We were ultimately looking at ACLs in the Junipers and the 6500s, where they would end up. Okay, the configuration for Reactive Network Solutions. The Reactive solution had two boxes, the detector and the actuator. The detector sat farther downstream, closer to the customer. It did data profiling, traffic profiling, and alarms. It communicates with multiple actuators farther upstream, and together — this is where their smoke-and-mirrors magic works out — they provide the recommendation for the filter that would be implemented.
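Since all the products ultimately expressed their recommendations as router ACLs, here's a rough Python sketch of what "recommend a filter" boils down to — rendering a blocklist as Cisco-style extended ACL lines. The ACL number and address ranges are made up for illustration, and this is my sketch, not any vendor's output format:

```python
def acl_lines(blocked, acl_number=150):
    """Render (source network, wildcard mask) pairs as Cisco-style
    extended ACL deny lines, followed by a final permit.
    The exact syntax mirrors IOS extended ACLs, but is illustrative."""
    lines = [f"access-list {acl_number} deny ip {net} {wild} any"
             for net, wild in blocked]
    lines.append(f"access-list {acl_number} permit ip any any")
    return lines

# e.g. a product recommends blocking two attacking source ranges
for line in acl_lines([("203.0.113.0", "0.0.0.255"),
                       ("198.51.100.0", "0.0.0.255")]):
    print(line)
```

The difference between the automatic, semi-automatic, and alarm-only products was essentially whether lines like these were pushed to the router, queued for approval, or just displayed.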
In this particular case, the actuator had to have an SNMP connection into the network, both to help track the data and to implement the filter into the network itself. Mazu had a single box called the inspector. They basically — trying to think how to say this — took an inline product and pushed it out to be passive. It was a unique solution: it wasn't duplex communication over the fiber line; only one direction of the traffic was coming into their box. In this particular release — and of course things change over time — they only recommended the filter. They did not do any actual implementation of the filters into the routers themselves. They just created the alarm and told you what they recommended putting in; you had to physically put that in. They did have a really nice traffic profiling mechanism in their product, which we actually used quite a bit during the testing for all the products, because it gave us a good idea of the breakout of UDP, ICMP, and TCP traffic that was occurring. Asta Networks' solution was composed of two boxes. One was purely a management box that took input from all the collectors. We only had one to test, but you can have multiple collectors in your network. This one worked off of NetFlow and cflowd: it got its information from there instead of from direct traffic feeds. So we had to go through and set up NetFlow and cflowd, NetFlow being on the Cisco side, cflowd being on the Juniper side. That worked in some cases and did not work in others. Depending on the versions of the Cisco and Juniper products you have, it's potentially resource intensive to implement NetFlow and cflowd. If you have the latest and greatest, they do it a whole lot better. Arbor Networks actually gave us five boxes for our test. That worked out really well. They had a combination of both packet analysis and NetFlow and cflowd analysis.
That gave us a larger picture. And of course they had a controller involved as a management device, with everything inter-communicating, providing each box data as to what was actually going on in the network. When you exceeded certain thresholds, or went out of bounds of what it knew as normal traffic, it alarmed you and made recommendations, which you could either automatically implement or manually implement in the process. On the inline testing side — obviously, inline boxes are placed in the data stream itself. With that being the case, there's always the concern about a single point of failure. At least the Captus product had a high availability solution. The Mazu product was probably not quite as mature at that time and did not have a high availability solution available. The general idea of an inline product is that it's going to be quicker in response: it recognizes the attack, and if you have it set up right, it can push out the protection against the denial of service immediately. And in this case, you had to have the interfaces visible on the network — they had to have IP addresses — so that alone made them a bit of a vulnerability. For the particular ISP I was working with, they were very much against an inline solution because they did not want to lose control. They did not like the ability of the product to automatically implement something into the network. They wanted manual review before anything was implemented, and it would take quite a bit of time for them to get comfortable with a product before they would ever let something like that happen. I saw a couple of heads shaking — that's probably the case in a lot of places. But if you think about it, firewalls are inline solutions, and IDSs can be inline solutions, so you may already have that availability issue.
Okay, on the inline solution side, the Mazu box was placed after the GSR but before the Cisco 6509, in front of the victim network. That was where they recommended we put it, and since we were doing the testing, we said, okay, we'll put it there. Again, this box did the decent traffic profiling I mentioned for the other Mazu product in the passive testing, and it gave a good representation of the data. Then it all came down to how the box performed in the testing itself. And Captus — Captus was designed to be like a firewall. As a matter of fact, Captus is an integrated firewall and DDoS mitigation solution. They do a lot of their initial filtering with the firewall portion before they implement the actual DDoS protection. The thing we found about the Captus is that it's very programmable once you learn their code. It has a lot of flexibility because it actually has no GUI; it's all command line. You can define quite a bit of information as to what you want to watch for, what you want to do with it when you find it, et cetera. Some of the tests that we ran: the first thing we had to do was come up with some baseline traffic. We had to have something to simulate a baseline network that we were working with. A lot of these products require, quote-unquote, burn-in time so that they can baseline the traffic and understand what normal traffic is; then, when you go out of the bounds of that normal traffic, you get an alarm, or you get something that says, hey, something's not normal, you need to look into it. So we actually had to burn in about 72 hours of generated traffic and keep that running throughout the entire test. Then we implemented the attacks. We did a combination of several attacks — we'll hit the tools we used here in just a second — TCP SYNs, TCP ACKs, floods, fragmented packets.
IGMP floods — and we did all of this in both spoofed and unspoofed manner, just trying to see how the products would react. We also did it on both the regular network side and the management side. Sometimes they forget about the management side. Some of the lessons learned. On the network side, when you're creating your baseline traffic, you've got to make sure it's got the full three-way handshake, because the products don't like it if they don't see all three parts of the communication. It took us a little while to figure this out. Yeah, it seems like a no-brainer now, but at the time we thought we were in good shape. The other thing about the network: if you're in a lab environment — and don't do this at home, do it in a lab — if you can control the lab, you're going to be in a lot better shape. It took us six weeks to build the lab because we didn't have control of the lab resources, and that's a long time to build a test network. But in the meantime we learned a lot about it, so the education process was pretty cool. Okay, what happens if you have bad routes and you run a denial of service attack? You kill yourself. As they were trying to upload the stuff from the war driving contest just out here a little bit ago, they had some route issues — it looked like they had two default routes. Any time you have route issues, you run the potential of running the denial of service attack against yourself, or bringing down the entire lab network. We did that. Part of the lab was ours to use, obviously, but part of it was also hosting other ongoing lab tests, and the other people weren't quite happy about that. But we did prove that the tools work, and they work very well. These products also require a separate management network.
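That three-way handshake lesson can be made concrete with a small sketch. This is my toy illustration — not any product's detection logic — of why generated baseline traffic that skips the SYN-ACK looks anomalous: a flow only counts as "normal" once SYN, SYN-ACK, and ACK have all been seen:

```python
def handshake_complete(packets):
    """Return True only if the packet sequence contains a full TCP
    three-way handshake: SYN, then SYN-ACK, then ACK.
    Packets are (direction, flags) tuples; 'c' is client-to-server,
    's' is server-to-client."""
    state = 0
    for direction, flags in packets:
        if state == 0 and direction == "c" and flags == {"SYN"}:
            state = 1                         # saw the initial SYN
        elif state == 1 and direction == "s" and flags == {"SYN", "ACK"}:
            state = 2                         # saw the SYN-ACK reply
        elif state == 2 and direction == "c" and "ACK" in flags:
            return True                       # handshake completed
    return False

good = [("c", {"SYN"}), ("s", {"SYN", "ACK"}), ("c", {"ACK"})]
bad = [("c", {"SYN"}), ("c", {"ACK"})]  # no SYN-ACK: looks like attack traffic
print(handshake_complete(good), handshake_complete(bad))  # True False
```

Baseline generators that only blasted one-directional packets produced flows like `bad`, which is exactly what the products are built to flag.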
As part of that, make sure you have the right separation between your management network routes and your live network, for the same reason — otherwise you're going to bring down your lab. For our attack network, we did our best to scrounge as much equipment as we could. I should mention that, other than my services as a consultant, this testing was done at no other cost. We couldn't go out and buy equipment, with one exception: we did have to buy some gig fiber cards. So they gave us a thousand bucks total to conduct this testing. Anyway, two Linux boxes (6.2 and 7.2), an OpenBSD box, and a Solaris box were the four attack boxes we were able to acquire throughout the process, with a mix of 10/100 and gig interfaces. We wanted to use the gig interfaces to get the traffic levels as high as we possibly could — we wanted to exceed what the normal network would otherwise see, because if you don't, you don't really have a good test; you don't see what the products can take in the testing process. Some of the tools utilized: one of the vendors had a testing box, a box that had all kinds of attacks on it. We originally were not planning to use it because we didn't want any preferential treatment toward a particular vendor, but we found the box to be very useful, so we did use it quite a bit. It gave us things we didn't have easily available otherwise, like IGMP attacks, as well as some others. On the open source side, we used Stream, Light Storm, F-script, RC8, and Slice 3, and we found some of them particularly useful. Stream — let's see, that was one of the ones that brought down the entire lab network. Fragmented packets, which we were able to generate out of the traffic generator from Arbor, brought down two of the four management interfaces. So we found some tools that we liked, that really gave us a decent test of the products. The victim networks we tried to set up with some monitoring tools. We did try to set up LaBrea on four different victim boxes.
We couldn't quite get it right, so we never did use it, but we did use Snort and a couple of other tools from our tool bag to see what was going on on the victim computers. We also did quite a few manual checks while the testing was going on: simple pings just to make sure we were still talking across the network, plus CPU utilization statistics via top on the systems to see what was running. The victim network was composed primarily of Linux 6.2 and 7.2 computers. Can we talk just briefly about NetFlow and cflowd? When you're setting that up, the sample rates you use have to match inside the products and inside the routers you're getting the information from. We also found that the Cisco 6500s — 6509s — do not create NetFlow data usable for the purposes of these products. We did get decent NetFlow data out of the GSR, and the Junipers did quite well too; the Junipers were actually more consistent and gave more useful data. This is not a sales pitch. Now, flow sampling gave us several different things. One was traffic characterization in a sampled format: the products themselves used that data to determine what the traffic patterns are and how those patterns change. Flow sampling can also be used for things like customer billing, for the same reason — you're getting traffic pattern utilization, and it can be broken down by IP or IP ranges. And certainly it can be used for DoS and DDoS detection. SNMP is used by the products themselves to communicate with the routers. You had to have a connection available for those products that were going to try to implement automatic ACLs into the routers themselves. We had a couple of situations where they didn't tell us we needed this and we couldn't get the products to work, until they finally said, whoa, what about the SNMP connection? The other thing: we were conducting SNMP testing.
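The point about matching sample rates is simple arithmetic, but it bites hard if you get it wrong. With 1-in-N packet sampling, the product scales everything it sees by N, so a mismatch between the rate configured on the router and the rate configured in the product skews every estimate. A tiny Python sketch of that (all numbers made up for illustration):

```python
def estimate_true_volume(sampled_packets, sampling_rate):
    """With 1-in-N packet sampling, each sampled packet stands in for
    roughly N real packets, so the estimated true volume is
    sampled_packets * N."""
    return sampled_packets * sampling_rate

router_rate = 100    # router exports roughly 1 in 100 packets
product_rate = 100   # the rate configured in the analysis product
# If these differ, every traffic estimate is off by the ratio of the two.
assert router_rate == product_rate, "mismatched rates skew every estimate"

print(estimate_true_volume(4200, router_rate))  # 420000
```

A router sampling 1-in-1000 while the product assumes 1-in-100 would make real traffic look ten times smaller than it is, which is exactly the kind of quiet misconfiguration that made some of our early flow numbers useless.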
Not me personally, but the group we were sharing the lab with was running SNMP tests simultaneously, against the same routers, so we had situations where we had to stop our testing because the SNMP tests had brought the systems down. Basically, the warning about SNMP and public community strings had come out about that same time, and we were running our testing at about the same time. So certainly, if you're setting up your community strings, don't set them up as public; you're not supposed to do that. What did the vendors do well? They monitored the baseline traffic pretty well. They detected changes in the traffic patterns pretty well. With some of the products you just set a fixed level, and if traffic exceeded that level, it gave you a warning; some of the products learned the data and told you if it changed over time. They did that fairly well — that's what they were designed to do, and these generally were initial releases — and they did the alerting and alarming when thresholds and statistics were exceeded. What they didn't do too well was protection of the management interfaces. They hadn't quite gotten to the point of putting a lot of thought into protecting the management side from denial of service attacks. We were able to attack the management ports and lock the boxes up, and then they become useless if someone can get in on the management side. I say this generically: a couple of the products did very well in this test. They survived, I'll put it that way — they either recovered quickly or had only a little degradation of service. Some just completely locked up. They hadn't thought about warning banners, lockouts, and that kind of thing for the management interfaces yet. Again, they took those suggestions and promised those would all be added in the next release of the product.
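The fixed-level-versus-learned-baseline distinction can be sketched in a few lines. This is my toy illustration of the learned-baseline style — an exponentially weighted moving average with a multiplier threshold — not any vendor's actual algorithm; the alpha, factor, and warmup values are arbitrary:

```python
def alarm_stream(samples, alpha=0.2, factor=3.0, warmup=10):
    """Learn a running baseline via an exponentially weighted moving
    average; after a warmup period, alarm on any sample exceeding
    `factor` times the learned baseline. Returns alarming indices."""
    baseline = None
    alarms = []
    for i, x in enumerate(samples):
        if baseline is None:
            baseline = float(x)               # seed from first sample
        elif i >= warmup and x > factor * baseline:
            alarms.append(i)                  # out of bounds: alarm
            continue                          # don't let the spike poison the baseline
        baseline = (1 - alpha) * baseline + alpha * x
    return alarms

# steady ~50-unit traffic, then a flood at indices 11-12
traffic = [50, 52, 48, 51, 49, 50, 53, 47, 50, 52, 51, 500, 520, 50]
print(alarm_stream(traffic))  # [11, 12]
```

A fixed-level product is the degenerate case where `factor * baseline` is replaced by a constant; the learned version is what let some products flag gradual pattern changes rather than only absolute spikes.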
Port lockdown on the management interfaces — having as few ports open as possible — helps prevent additional denial of service attacks against the product itself. If you're looking at products for a large enterprise, keep in mind this data is at least four months old. Unfortunately, DEF CON only occurs once a year, so the data can be a little dated. There are new releases out on all of these, and if you have any interest in learning about these products at all, I recommend going to their websites and getting some information. You'll have my contact information, and I could probably link you up with one of their sales people, because they all call me all the time. Basically, for the large enterprise, we recommend a passive solution. Most of the enterprises we talk to do not want automatic ACLs implemented, or that kind of thing. The solutions can be very expensive, depending on the number of boxes you ultimately end up with. We priced out, for this particular ISP, that the solution to protect their entire network against DDoS was going to be in the range of 10 to 12 million dollars. Well, that's probably a bit high, but it can be scaled down from there. On the large enterprise side you'd have a mix of flow collectors and packet collectors. Again, the flow collectors are the ones that get the NetFlow or cflowd data off the routers; packet collectors are the ones that process all the packets flowing through them, and those would be visible in your network. You'd have to centralize the management consoles, whether in your NOC or your security operations center. Most of the products could handle approximately ten collector boxes per management console. So again, it depends on the scale you have to look at, as to how many collector boxes or management boxes you would need in your network.
The products in this space are Arbor, Asta, and Reactive. In our particular tests, the one that, quote unquote, won — which was also the most expensive solution, the most mature solution, and the one with the most products in the solution — was Arbor Networks. They performed well in most of the testing, with some minor issues. But again, Arbor, Asta, and Reactive were all very supportive, and will all happily sell you product. On the smaller enterprise side, the inline solutions are worth considering; they can provide value. The combination firewall-and-DDoS solution, or the combination IDS-and-DDoS solution, I ultimately think will be the most useful form of this. Captus is very programmable, like I mentioned before. They had also put quite a bit of thought into protecting the management side of the product, and they performed fairly well. Their processing and storage are going to improve — that's what I've been told — with the new releases coming forward. They hope to be able to handle networks beyond 10/100, up into the gig range. The product we tested, gig-type speeds brought it to its knees, but from a 10/100 perspective it performed very well. And Mazu, a maturing product: they did well in some aspects, not so well in others. Again, everybody's improving their product as they go along. The technology is still evolving. It's something that I believe will ultimately become integrated into firewalls and/or IDS solutions, and that has already started to happen. It's definitely something that is needed, and we see acquisitions all the time; I think you'll start to see these companies buy other companies to develop the suite of products. My opinion only. They've made positive strides in the DDoS environment. The products themselves mostly do what they advertise, or what they've tried to advertise, that they do.
With minor exceptions. It's evolving, it's interesting, it's new — keep an eye on it. Information Security magazine did an article on this back in, I think, October of 2001, and that was actually the driver for selecting the products that we tested. I'm sure the security magazines will keep an eye on these products, for those companies that survive their next round of venture capital funding as viable products. Here are some resources. Basically, do a search on the web and you'll find all kinds of stuff, but here are some places where I found information on DDoS, as well as all the product vendor websites. That's pretty much it — that's the testing we did. If there are any questions, I'll be happy to try to answer them and go from there. No, we did not try Spade in that context. We knew there were some limitations on the testing we were doing, and we got stuck with time constraints as well, so that would have been a good addition. Okay, speeds of the solutions I do not have hard numbers on, per se. In terms of surviving the attacks, the passive solutions were Arbor, Asta, Mazu, and Reactive. On speeds, we had three of the boxes going at GigE speeds with data pushed through them. Arbor and Asta both processed it and did fine; Reactive for the most part processed it and did fine; so they survived that portion of it. In terms of costs, the costs have changed since we did this testing. Some have gone up, some have gone down, and it depends on which pieces you buy. Like I said, the Arbor solution was fairly expensive, but it was also the most mature. Reactive, I believe, has now gone to a model where you basically lease the license and pay a monthly fee; with Arbor you were buying the boxes and that kind of thing. So to give you a hard answer now would be difficult without researching where they stand today. The Arbor solution, I know, is very expensive for the whole shebang that we had to do, but it just varies with what you need it to provide.
Is the Arbor traffic generator available as a standalone product? The answer, at least when we checked into it, was yes, it is. The price quoted on that — again, old data — was $62,000. It was a good product. The question is that Cisco routers are known to roll over and die with too many ACLs on them. The answer is yes, they did roll over on us, especially when we were combining it with the SNMP testing at the same time. We did not actually implement that many ACLs; we were mostly looking at what the recommendations were, versus actually implementing them. We did go into detailed discussions about what it would take to do this in their environment, which would have involved upgrading all of their Cisco routers to support the new releases. I'm sorry, I can't hear you. What was the max utilization on the port? Our normal traffic — let me make sure I get this right — was primarily in the 50 meg range. We were pushing it pretty hard, even for a 10/100 meg connection, for baseline traffic. On the gig side, we were able to see 900 meg at some points, depending on the products we had in place. Of course, we never really exceeded the full gig, but that 900 meg was the attack. On the normal side, we were in the 250 to 300 meg range. The frame sizes? Actually, we did vary those: we went from 64 byte up to the 1500 MTU, and jumped around a little bit in there. Once we nailed a box and broke some aspect of it, we did a little more testing, but then we pushed it back to the vendor to try to address the issues. I'm sorry, say it again. We did do randomly spoofed addresses. Part of the challenge we had within the lab network was creating an environment where the products didn't think the good traffic was all bad traffic, because we had to limit the IP ranges we could work in, which generally were internal IP addresses — 10.-whatever, or 192.-whatever.
The products, for the spoofed addresses versus the non-spoofed, were really looking at traffic changes in general, and they responded to the change based on whatever that change might have been. Did you set the box off? That's a good question. There's really limited protection you can do against that, except to deny all for that particular service or port or something of that nature. Reverse path filtering was one of the things we had discussed as a possible solution. We never actually got to the point of implementing it, so I don't have a lot of answers on that. It sounded like a good idea; on the actual application of implementing it, I don't have the background. We need to finish up here. I'm certainly happy to answer any additional questions. My contact information is on the CD, in the back of the slides, so certainly feel free to email me. Thank you very much, and I hope you enjoy the rest of the show.