I'm Lawrence Baldwin, and this presentation is on extreme IP backtracing. Test, test, okay. I sat through two presentations here earlier and was in the back and couldn't really hear well. Is this a good volume to be able to hear me? All right, good.
I'm the developer and operator of the myNetWatchman distributed intrusion detection system. I collect firewall events from about 1,200 to 1,300 firewalls around the world and try to make some sense of them, backtrace addresses when we detect that people have been compromised, and try to send them notifications. So I probably backtrace somewhere around 25,000 to 30,000 IP addresses a day and try to figure out who the heck owns them. The rationale for putting this presentation together is to try to impart some of that knowledge, bring visibility to the fact that this process is non-trivial, and maybe encourage people to do more with their firewall logs as well.
Okay, so for a lot of these events that show up in your firewall logs and your IDS logs, it's non-trivial tracing the IP addresses back to their owner. A lot of times if you do a whois lookup, you end up with nothing more than the ISP. And really, the thing to keep in mind is that a lot of these systems that people are using to attack other people are not their own systems sitting on their own home DSL line. They've broken into someone else's machine, and they're launching attacks from these machines that they're controlling. And we have plenty of those machines out there: Code Red, Nimda. Based on that, there are really two ways to improve the overall security of the internet. One goal is reducing the total number of compromised hosts. The other is to minimize the amount of time that a system remains compromised. And so we have to protect ourselves. We need to let people know, people who have been broken into. We need to let them know, hey, we're seeing this in our logs.
I know a lot of people, when they fire up their firewalls and alarm bells start going off like crazy, get almost as upset as when they receive a piece of spam. What we really have to realize is that probably about 95% of the scanning activity that you might be seeing in your logs is not coming from a hacker directly; rather, it's really just another victim that's been compromised with something. So it's almost like these probes are a cry for help rather than something that's intentionally hostile. This quote here is taken from an article written shortly after the February 2000 DDoS attacks: authorities pursuing the attackers say the servers they used belonged to users who had no idea that their resources were being used to launch attacks.
So here's a pretty interesting trace. This was a myNetWatchman incident. The machine was compromised by the Microsoft SQL Spida worm. When we backtraced the IP, it led over to the RIPE server, and you see the net name is GAN. That's the central region of GAN RF. And you see the first notify contact is sam@gan.ru. So we go over to gan.ru and run the page through a translator. This is the nuclear regulatory agency of Russia. So, pretty nice to know that they are infected. As far as an update on this, they were actually notified weeks and weeks ago, and as of last week, they were still attacking people. So I don't know what kind of response they're planning, but hopefully they do something soon.
Okay, so we're going to start by talking about some of the different tools you can use to backtrace, and the places that information shows up. The first thing you want to do when you're actually tracing something back is validate the source IP. Before you go off and spend a whole lot of time trying to figure out who owns an address, it's a really good idea to validate where that traffic actually came from.
Just because a packet has a source IP address that appears to be from an address space halfway across the world, there's nothing to stop that packet from having been generated locally by somebody spoofing addresses, or by some kind of misconfiguration problem or whatnot. So it's certainly a good idea to use either a packet analyzer or router debug commands to validate that the packet is actually coming in physically from some remote interface. Also, when trying to analyze firewall events in a broadcast kind of topology such as a cable modem network, it's entirely possible that you'll receive bleed-through traffic that's sourced from your neighbors on the same cable segment, whether they're using private addresses or maybe even being misconfigured with valid public addresses from other places around the world.
So one of the next steps you can take is to eliminate some potentially bad IP addresses. RFC 1812 mentions the Martian addresses. I was looking around to find a consistent definition of what the Martian addresses actually are. These are taken straight from the RFC. Some people might argue that link-local and some of the other IANA-reserved stuff belong to this group, but the examples here are directly from the RFC. You've got the broadcast, the loopback, multicast and limited broadcast addresses. An interesting quote from the RFC is below: a router should not forward any packet that has an invalid IP source address. I can't tell you the number of times I've had people asking me, why is IP address 224.0.0.5 attacking me?
So you can also exclude private addresses: 10-dot space, 172.16 through 31, 192.168. That's a class A, B and C: slash 8, slash 12, slash 16. This was a pretty interesting whois record. This is from Korea. They actually published a whois record, and if you look at the IP address range, it's for private IP address space. And this came up in a discussion on the NANOG list.
And someone said: so somebody has done an APNIC registration for RFC 1918 private space. Perhaps they're suggesting that spammers, or perhaps all of Korea, be assigned private address space, which is fortunately non-routable.
So I mentioned this a couple of slides back, but you can also pretty much exclude anything that comes from IANA-reserved address space. It's really doubtful that it's attacking you. These are some of the reserved addresses. If you do a lookup for IANA at the whois server at ARIN, you'll get this list. And this is just the list continued. You do have to be a little careful about just dismissing reserved addresses, because those are, of course, subject to change. I know I have been treating a specific address range as reserved in my database, and then six months later some new allocations have been given out. So addresses that were previously invalid as source addresses are now valid. So you've got to be sure to check these ranges on a somewhat frequent basis.
You should definitely look for stuff that looks fake. 1.2.3.4 probably isn't attacking you. We also have some examples from Max Vision's whitehats document on Nmap decoy addresses. Sometimes they show up in the trace. People try to be really stealthy, and you see a trace that has 23.23.23.23 as the source IP, and then you see another packet from 24.24.24.24. And then you see something from a public address that's totally different. You're like, huh, which address are you using there, buddy?
So you can do a limited amount of spoof detection on the addresses. It gets really difficult to detect spoofing, because the place where you ultimately receive the packet is somewhere near the edge of the network. Bob Morris wrote about this in "A Weakness in the 4.2BSD Unix TCP/IP Software." This is back in 1985. He describes the weakness, which is basically that anybody can fill in whatever source address they want, similar to snail mail.
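The exclusion checks described so far (Martian, private, and link-local source addresses) can be sketched in a few lines of Python. This is a minimal sketch, not a complete bogon filter: the function name `is_implausible_source` and the contents of the network list are assumptions made for illustration, and because reserved ranges change over time, a real list would need to be refreshed from the registries.

```python
import ipaddress

# Illustrative list only (an assumption, not taken from the slides).
# Reserved ranges change over time, so a real list must be refreshed.
IMPLAUSIBLE_SOURCES = [
    ("0.0.0.0/8", "network 0 (Martian)"),
    ("127.0.0.0/8", "loopback (Martian)"),
    ("224.0.0.0/4", "multicast (Martian)"),
    ("240.0.0.0/4", "class E, includes limited broadcast (Martian)"),
    ("10.0.0.0/8", "RFC 1918 private"),
    ("172.16.0.0/12", "RFC 1918 private"),
    ("192.168.0.0/16", "RFC 1918 private"),
    ("169.254.0.0/16", "link-local"),
]

_NETWORKS = [(ipaddress.ip_network(cidr), label)
             for cidr, label in IMPLAUSIBLE_SOURCES]


def is_implausible_source(ip):
    """Return a reason string if ip cannot be a real remote source, else None."""
    addr = ipaddress.ip_address(ip)
    for network, label in _NETWORKS:
        if addr in network:
            return label
    return None


if __name__ == "__main__":
    for candidate in ("224.0.0.5", "172.20.1.1", "192.0.2.10"):
        print(candidate, "->", is_implausible_source(candidate))
```

A source such as 224.0.0.5 comes back labeled as multicast, which answers the "why is 224.0.0.5 attacking me" question before any whois lookup is attempted.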
If you don't have physical access to the network, there's really not an easy way to validate whether or not an address that appears to be attacking you is real. However, there are some basic techniques, and these are not extremely reliable, that might give you some information to decide whether or not an address is spoofed. Because obviously if somebody is spoofing an address, you don't want to be picking up the phone or sending that address a nastygram saying, you know, why are you scanning me?
So here's a technique. It's based on one of the passive OS fingerprinting techniques, which looks at the TTL value. When you receive a packet and you check what the TTL is at the point that you received it, there are default TTLs for most operating systems. Based on that default and the value at the point where you received the packet, you can do some deduction as to what type of machine was attacking you, but also perhaps find out if the address was spoofed. By tracerouting back to that IP, you'll have some idea of the number of hops that it actually takes to get to you from that IP. Asymmetric routing may add a few hops in between, but the counts should be fairly consistent. So if you subtract the final TTL from the original TTL, and do some guessing as to what kind of OS sent the packet, it's not totally reliable, but it can give you an idea whether the packet was really spoofed or not. I mean, if you receive a packet and the implied decrement in TTL says that host is like 10 hops away, but when you traceroute to that IP it's like 40 hops away, I'd say there's a pretty good chance that that was a spoofed address.
And I don't know if anyone here was in the Xprobe2 talk, but there's actually a list of what the default starting TTL values are for a lot of the major operating systems. Oh, I didn't realize you did it to me. Let's put that in. Now this was actually taken.
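The TTL comparison just described can be sketched as follows. This is a rough sketch under stated assumptions: the set of common initial TTL values (32, 64, 128, 255), the function names, and the five-hop tolerance are illustrative choices rather than values from the slides, and as noted the result is only a hint, never proof.

```python
# Assumed set of common initial TTLs; real stacks vary and can be reconfigured.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def estimate_hops(observed_ttl):
    """Guess the sender's initial TTL and the implied hop count.

    Picks the smallest common initial TTL that is >= the observed TTL.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl


def looks_spoofed(observed_ttl, traceroute_hops, tolerance=5):
    """True if the implied hop count disagrees badly with a traceroute.

    tolerance allows for asymmetric routing adding or removing a few hops.
    """
    _, implied_hops = estimate_hops(observed_ttl)
    return abs(implied_hops - traceroute_hops) > tolerance


if __name__ == "__main__":
    # Packet arrived with TTL 118: probably started at 128, about 10 hops away.
    print(estimate_hops(118))
    # A traceroute of 40 hops to the claimed source is a strong mismatch.
    print(looks_spoofed(118, traceroute_hops=40))
```

The check breaks down whenever the traceroute cannot be completed, because then there is no trustworthy hop count to compare against.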
I looked around for the default TTLs all over the place, and there was this one link on Google that unfortunately was down. So this is from the cached page. I haven't verified all of these, but I have looked into the Windows and the Linux TTLs. And in the slides on the CD-ROM, there are actually some details about how you can change your default TTL to be whatever you want it to be.
So here's an example of where this type of spoof detection breaks down. You'll notice at hop 10 we jump into 10-dot space, and then at hop 11 we jump into 172-dot space, and then the trace disappears after that. So we really have no idea exactly how many hops away this particular address is.
Another way to validate whether or not an IP address is real is to perform route validation on it by looking at BGP route tables. There are a couple of sites listed here that will basically let you punch in an IP address and will come back and tell you what the active route entries are for that address. I've only just started playing with this, but it's really a powerful concept. Almost every other source of backtracing information is stale and old. BGP route information, on the other hand, is one of the few sources of active, real-time information. And I can't tell you the number of times I've taken an address that I have some suspicion has been spoofed, plugged it into the BGP route tables, and it comes back network not in table. So we have an example of that here.
Yeah, here's just a quick spoof example. This is a lookup for 182.1.1.2 and, of course, no match. And then here are the results from the looking glass. No routes for this IP address, so it's a pretty safe bet that it's spoofed. Of course, the sooner you do this, the more accurate the information is too.
So, to do a backtrace, there are a couple of pretty standard tools. One of them is nslookup, of course.
If you just have an IP address, you want to find out what the reverse DNS name is. Beyond that, you can query for start of authority, MX records, things like that, which can give you more information about the domain itself. Here's a simple example. With nslookup, we set the type to PTR and look up the full address in the in-addr.arpa namespace. Then we look for the SOA for the full address, and, of course, it doesn't find it. So we dropped off the last octet and looked for the start of authority on this class C. And here it is, located at rr.com. In the early stages of using this technique, I was actually recursively backing up to even the class B level to do these SOA lookups. I can assure you that's not a very good idea, because you have tons of different ISPs within the same class B. So you really don't want to go further than the class C level, and even that can be touch and go if you're dealing with address space that's been very finely distributed across multiple providers.
Another one of the standard backtracing tools is, of course, the whois databases. I do my first lookups at ARIN, because if the address isn't in their registry, it'll point you at one of the other NICs. A lot of times, though, the data that you get back from the whois service is totally bogus or totally out of date. Here you have some of the advanced query syntax. Some of these are kind of fun to play with, doing full lookups and net lookups. You can actually do a net lookup on a class B, and it will give you every network in the database that is within that class B. One thing I wanted to emphasize from the previous slide is that very often people will see the email contact address listed in an ARIN record and start shooting email at that person.
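The nslookup fallback described above (PTR on the full address, then SOA on the full name, then SOA on the class C zone, and no further) can be sketched by generating the query names. This only builds the list of queries; actually sending them is left to nslookup or a resolver library. The function name `reverse_queries` and the example address from the 192.0.2.0/24 documentation range are assumptions for illustration.

```python
import ipaddress


def reverse_queries(ip):
    """Return (record type, query name) pairs to try for an IPv4 address.

    Stops at the class C (/24) zone, since backing up to the class B level
    tends to span many unrelated providers.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    octets = str(addr).split(".")
    full_name = ".".join(reversed(octets)) + ".in-addr.arpa"
    class_c_zone = ".".join(reversed(octets[:3])) + ".in-addr.arpa"
    return [
        ("PTR", full_name),      # reverse DNS name of the host itself
        ("SOA", full_name),      # usually fails, as in the example above
        ("SOA", class_c_zone),   # start of authority for the /24 zone
    ]


if __name__ == "__main__":
    for record_type, name in reverse_queries("192.0.2.10"):
        print(f"{record_type:4} {name}")
```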
It's really not a good idea, as typically the contact record names the person who maintains the IP address allocation, not a person who deals with security issues. So about the only case where you'd want to use that information is if you see a netblock that's extremely small, like a class C or below; then it might be safe to try to contact that person directly. If it's a large ISP, you don't want to be sending email to the ARIN contact. You just want to take the domain portion of that contact address and then, as we'll get into later, try to figure out what mailboxes to use and that sort of thing. So we talked about this.
Okay, so here's an example. If you look, the contact email is at hotmail.com, which is pretty useful. And the phone number is pretty good: 123-456-7890. So obviously you're not going to be able to get through to anyone at that number. This record really emphasizes the whole problem with whois in that, yes, you can look up ARIN records and find netblock information, but very often those netblock records don't tell you anything about the domain that is responsible for that address space, because it's showing hotmail.com. You really have to look at this carefully to realize that it's Internet America in Dallas, Texas, and that their domain name is probably iadfw.net. But doing that in an automated fashion is almost impossible. It requires almost manually reviewing these records and making those associations between IP addresses and domain names.
Okay, so moving on, we have the intermediate backtrace category, some more advanced tools as you move further into the tracing. Recursive whois: if the record isn't in ARIN, it will point you at RIPE or APNIC, who in turn may point you to one of the national registries. So by doing recursive lookups you can narrow down who actually owns the space. And you can cross-check what you find in the IP-based whois records with domain-based whois queries.
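Taking "the domain portion of that contact address" can be partly automated, with the caveat already made that a free-mail contact tells you nothing about who is responsible. The sketch below pulls domains out of whois text and sets the free-mail ones aside for manual review. The function name, the regular expression, the short free-mail list, and the sample record are all assumptions for illustration; real whois output varies widely by registry.

```python
import re

# Assumed, deliberately short list; extend it for real use.
FREE_MAIL_DOMAINS = {"hotmail.com", "yahoo.com", "aol.com"}

# Simple pattern for addresses as they typically appear in whois text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def contact_domains(whois_text):
    """Split contact domains into (usable, free_mail), each in first-seen order."""
    usable, free_mail = [], []
    for match in EMAIL_RE.finditer(whois_text):
        domain = match.group(1).lower()
        bucket = free_mail if domain in FREE_MAIL_DOMAINS else usable
        if domain not in bucket:
            bucket.append(domain)
    return usable, free_mail


if __name__ == "__main__":
    # Hypothetical record text, not a real whois response.
    sample = (
        "Coordinator: jdoe@Hotmail.com\n"
        "Tech contact: noc@example.net\n"
    )
    print(contact_domains(sample))
```

When the usable list comes back empty, the record falls into the manual-review case described above, where the organization name and address are the only leads.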
There are several places to do that. I like the GeekTools proxy a lot. And I just recently stumbled on uwhois.com. I'm not so sure about the reliability of it, but a universal whois sounds pretty nice.
Yeah, cross-checking is very critical, because one of my pet peeves about IP whois information is that there are absolutely no ramifications whatsoever when somebody allows that information to go stale, which is the reason why it's so out of date. I mean, I constantly see records where the last update time was like 1994 or earlier. Whereas with domain whois information, there are real ramifications if you don't put a valid email address in there. You're not going to get notifications about your domain name expiring, and you could lose your domain. So very often the domain contact information will be much more timely than what's in the IP whois record. So once you have a domain that you've pulled out of IP whois, look it up in the domain whois to validate the contact and see if the contact information is as good or possibly more up to date.
This is an interesting example from a document. I've been getting a ton of spam encouraging me to buy my .us domain name. And from NeuStar, the people who administer the .us namespace, this is an interesting quote: under the current administrative practices, the .us top-level domain has no central database that can in turn create a central whois. No mechanism is in place for delegees to provision database information to the central registry. Even if delegees wish to provide new whois information to the usTLD administrator, that capability is currently nonexistent. So we're really not doing anything to fix the problem of being able to trace these events back to an actual owner or administrator of an IP.
Once you have extracted a domain name that you believe is responsible, your next step in trying to actually make contact is to figure out an actual email box to send it to.
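As a starting point for that step, conventional role mailboxes can be generated for the domain and then checked against whatever a registry of abuse addresses reports. This is a small sketch, not a lookup against any real service: the function name `candidate_mailboxes` is hypothetical, and the default role names (abuse, security, postmaster) are an assumed choice drawn from the standard role mailbox names in RFC 2142.

```python
# Assumed default roles; RFC 2142 lists these among its standard mailbox names.
ROLE_MAILBOXES = ("abuse", "security", "postmaster")


def candidate_mailboxes(domain, roles=ROLE_MAILBOXES):
    """Return role addresses to try for a domain, in order of preference."""
    cleaned = domain.strip().lower().rstrip(".")
    if not cleaned or "@" in cleaned or " " in cleaned:
        raise ValueError(f"not a bare domain name: {domain!r}")
    return [f"{role}@{cleaned}" for role in roles]


if __name__ == "__main__":
    print(candidate_mailboxes("example.net"))
```

Generated addresses are only guesses; whether any of them is actually read is something a sketch like this cannot tell you.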
And we have the abuse.net site, which is sort of a de facto registry of abuse email addresses. You can basically punch in any domain and it will pop back and tell you what mailbox to use, whether it's an abuse@ mailbox, a security@ mailbox, a postmaster@, whatever it is. And I think now he's got about 140,000 domains in there. The site is meant more for dealing with spam-related issues, but in the large majority of cases the spam mailbox is essentially the same as any kind of security-related mailbox.
Okay, moving along, we have some more advanced techniques. The first one is, if the whois query doesn't give you the right domain, you can try searching on Google, putting in parts of the information that you got back from whois. Try adding parts of the actual address, the city, the state. Here's a neat example, something that you don't see all the time. If you look at the address space for each of these records, they're two different records for the exact same space. So it's kind of tough to tell who owns this. iWave Corp sounds a little bit more specific. So here's the full record for iWave Corporation. Again, the contact is at Hotmail. Not really helping us out. So we do a Google search on iWave, adding some address information in there, Vienna, Virginia. And lo and behold, the third link down shows us www.iWave.com. And of course, surfing to that website, you'll notice in the bottom right-hand corner we actually have a real email box, not at Hotmail.
One of the other advanced tools that you can use is rwhois, or referral whois. When I first read through the spec on rwhois, I got extremely excited. I was like, you know, great, here's this much more specific tool for getting down to who owns address space. Because, you know, very often when you do a straight whois lookup, it's just going to resolve to the service provider and not to the actual end company.
And unfortunately, trying to contact a company through a service provider is pretty much a brick wall. You can pretty much be guaranteed that you're not going to have any communication for like weeks at a time. So rwhois is basically a mechanism for ISPs to create more granular databases. Unfortunately, only a handful of service providers have implemented their own rwhois servers.
Here's just a quick example. This is actually from GeekTools. I know it's a little bit hard to read. But this is one where, when you do a lookup, it will give you back a record for Exodus and then say, for further information, you can visit the rwhois server at Exodus. The nice thing about the GeekTools proxy is that it gives you both records; it will automatically go out, pull down the rwhois record, and give it to you. So this is the domain record based on what we pulled out of this slide here. We get hostmaster at PMimgilling.com, which is nice. This illustrates why people should use role accounts for their contacts. If one person leaves the company, you don't want his email address still sitting there as the contact in your record. A nice generic hostmaster at your company is great to use, because whoever takes over as the next hostmaster can receive that email.
So there are other databases besides whois. There are the routing registry records. I list the RFC number there if you're interested. The routing registry is similar to the BGP routes, except it's stale and a lot older information, but sometimes you find that it's gold. Here is a successful trace where we actually were able to pull out information with the routing registry. The culprit in this case was clicknetwork. Unfortunately, if we thought IP whois records were out of date, the routing registry records tend to be an order of magnitude more out of date, so it isn't always a helpful resource, but it's just another possible source to try.
Here's what's actually returned by the routing registry: hostmaster at clicknetwork.com.
So then we move into some of the more extreme techniques. You have an IP address; it doesn't do any harm to check the TCP ports that are offered out to the world, and perhaps there's a banner there that you can read that will tell you who administers the machine in question. I've actually found that a surprisingly high percentage of compromised servers are also running mail servers. I'd say it almost seems like as high as 25%. Unfortunately, although the standard calls for identifying yourself in the mail banner, I'd say probably only about 60% of people actually follow that standard, so it's not always effective.
Here's an example. This particular machine had port 443 open, and looking at the SSL certificate you can see on the subject line that's highlighted, latimes.com, so here we're getting some clue as to who might be running this machine. You can get this to pop up just by surfing with HTTPS right to the IP address that's attacking you. Then this SSL certificate pops up.
And here's an example where we may have gone too far trying to trace something back.
You'll notice that the range for this record is only 4 IP addresses wide. So here's the record that's returned by the RIPE server. We try connecting to the machine, and it's running FTP; if I have it correct, the banner reads something like "the ultra secure real FTP," so we didn't find anything there. But trying some of the other IP addresses in the range, there's a web server on one of the other addresses, and there it is [unintelligible]. So here's the domain record for portaladventus.com, and we do get a webmaster at portaladventus.com as an email address here.
So there are times when you trace these back and, for whatever reason, you still have no answer as to who might run this machine. You can always move out a little bit, take a step back, and find out who runs the autonomous system. But my objective is always to try to get the trace to be as specific as possible, so we can make direct contact with whoever is actually responsible for the system. As Jaeson said, when that's not possible and you want to at least get a message to someone, you can use the autonomous system information. In general, only a larger or midstream ISP is going to have BGP autonomous system numbers, so you're not going to be able to get extremely specific in many cases, but at least it gives you an avenue to go somewhere.
So here's an AS example. This is a BGP lookup on an IP address, and I've circled the terminating AS number, 6197. This is the same command that we used previously where we got the no network found; when the network is actually valid, the output will look like this. So here's the full record from ARIN on autonomous system 6197. The contact here is a great example of useful, not useful information. I actually consulted for BellSouth for a number of years in their early days, and when I saw this record I was dumbfounded that the
contact person was listed as working at Siemens. I was pretty sure there was no logical relationship between Siemens and BellSouth, other than that BellSouth occasionally buys a phone switch from Siemens. So I had actually just punted on this and assumed that it was some kind of error. And unbelievably, about four months later, I was standing on a train platform, having come from a meeting, and there was a guy standing next to me holding the same envelope that I had, so I assumed he had been at the same meeting. I had some time to spare, so I went up and chatted with him, and a few minutes into it I found out he works for Siemens. And I was like, wait a minute, I need to talk to you. So this is the real example of extreme backtracing: I had to hunt a guy down on a train platform.
And this is another good example of the breakdown and the problems with the ARIN records. What essentially happened was he did originally work for BellSouth, and when the AS number was registered he used his contact handle. Several years later he left the company and went to work for Siemens, and he began allocating address space there, assigning it to his handle. Meanwhile, there's no mechanism to say, all these addresses that I used to administer, I don't work for that company anymore, so remove those references from me. So we're sitting here with a person who doesn't even work for the company anymore still being a contact person. That's what happened in this case. It would have been much nicer if they had a role-based account here. Originally, like he said, he had his email address at BellSouth. He updated his personal record to show that he was working at Siemens, and here we have an implied relationship between Siemens and BellSouth which really doesn't exist.
So here's another extreme technique you can use. If NetBIOS is enabled on a machine, send them a WinPopup message using net send. I haven't quite figured out a technique to actually make this successful. I've probably
sent about a hundred of these, and I've never had the person contact me. But I have a feeling that if the message was crafted in an intriguing way, such as, this is the Microsoft Security Management Center in Washington, please call us immediately at this phone number, that might work.
This is one of my favorite backtraces. It's amazing how people are willing to name their machines things that give away a tremendous amount of information. Here we did an nbtstat to get an enumeration of the machine names, and the workgroup name is Aducci_Dorf. I thought it was Doofus at first, but it was Dorf. I happened to also do a traceroute to that address, and could sort of see from the traceroute that, given the CHI in the hop names, it was probably in the Chicago area. Then just do a Google search for Aducci Dorf in Chicago and bam, the law firm of Aducci, Dorf, and Blankenship and a bunch of other names. So there's an example where, with just this little tiny piece of information, if it's unique enough, you know, not like John Smith, you can actually trace a machine name back to its owner.
Now the sad part of the story is I called up this company just to give them a heads up, got the secretary, and said, I need to talk to your security manager or your MIS manager. And she's like, who? We don't have one. So I was like, okay, let me talk to your managing partner. And I get Mr.
Blankenship on the phone, and I really nicely explained it to him, not trying to sell anything, and his response was like, okay, thanks, have a nice day. So here we've actually done a little bit of cross-checking of the domain record with what we found, and found the responsible domain to be ADLMD.com.
So sometimes, no matter what you do, there's nothing you can find out about an address. So you find out what the most directly upstream address is from the one you're looking to trace and see where that leads. Sometimes it's only the service provider, but you never know. It's really important to get as close as possible. Here's a trace where there's a record that comes back from RIPE showing that this address space is registered to one domain, but the notify contact has a totally different domain, at TACTA.net. Whenever you see disparate information like that, the people you really want to contact are at the registered domain, so you do an nslookup for mail exchange records for that domain, and as you can see there are none. No address records available for them either. So here's a traceroute to the IP. Hop 22 is the end of the trace, and hop 21 is the next IP just above it. So we try looking at that for more information, and we find that it's registered to TACTA.net, which corresponds with the other address that we found. So it looks like this is about as close as we're going to be able to get, TACTA.net.
So, conclusion. It's surprising how many corporations have an isolationist view of the internet. They're very concerned about protecting themselves on the inside, with a very hard, crunchy firewall exterior, and sometimes they actually deploy stuff within their networks. But back when Code Red and Nimda and other worms were hitting, you'd see in the web logs direct evidence of all these machines that were compromised. And most companies really don't make their money tracing these back and letting people know they've been broken into, so they really don't do it. And as a
result you have a tremendous number of already broken-into machines just waiting for someone to grab control of them and point them at somebody for a DDoS. Unfortunately, machines that are port scanning you today could be DDoSing you tomorrow, so try to backtrace them. So do read your logs, I hope. There are services that can help you do it. Lawrence's site, myNetWatchman, is excellent. DShield is very similar. If you don't like the idea of sending your logs outside of your network, if you think that might be insecure somehow, then there are solutions you can deploy internally, such as swatch and LogSentry.
So that's our presentation. Do we have any questions? What's that? The presentation: all the slides are on the CD, and if you look at the notes view, I apologize, but the notes view is totally screwed up. My PowerPoint skills are only this big. So I have it on my website, jaeson.net, that's J-A-E-S-O-N dot net. If you go to that site, it will take you to a place where you can download the latest and greatest presentation with all the notes fixed on it.