Hi, my name is Mark Grimes, also known as Obecian. Some of you may or may not know me. I'm here to present a specification and a tool called intravenous. We're going to talk a little at the beginning about marrying artificial intelligence with TCP/IP, because the areas of research you can explore with raw sockets, and how to obtain next packet injection, are very important. Everybody knows that you can connect a client and a server through BSD sockets. I want to explore what we can do with raw sockets instead, especially with Windows carrying raw sockets by default in its next version. The goal is next packet injection that keeps connection integrity with respect to time. We don't really care about payload, but we'll get into that.
First thing we want to cover is packet injection. Some of you may or may not know that I'm responsible for a tool called Nemesis. It is a nine-protocol packet injection suite. It is based on Mike Schiffman's libnet, which makes it extremely portable between operating systems. I've even had reports that it works on Sun Solaris and Trusted Solaris. I created Nemesis kind of on a whim while studying for a test called the CCNA. I don't really believe in theory without testing the practical. It's nice to read a bunch of Cisco books written by, you know, Yada Yada PhD, but it would be nice to actually see these packets go across the wire and work as the Cisco reading material suggests. Judging from the emails I've gotten about Nemesis, the three top uses are node integrity testing and evasion testing of routers, switches, firewalls, and IDSs, and stack integrity testing. So we use it to test malformed packets on the network wire.
Router congestion management is another really good use, because the tool is used for throwing gigabyte-sized files across the wire as payload to test a particular protocol and see where the router chokes. It can also be used for covert channels. I did a little demo where I sent a 35K text file through ARP, since nobody looks at the payload of ARP in an IDS. I sent the README file for Nemesis across the wire and dummied up a little program to collect the packets on the other end. A hundred ARP packets later (not that that would be noticed or anything), it made it through and was collected at the other end. It was a nice little covert channel. And obviously it can be used for spoofing. I gave the last version of Nemesis to Jeff Nathan of @stake, who presented at Black Hat this year and used the tool in his demonstration. The tool is actively being worked on again. I kind of lost the passion for working on it because at this point it's just plugging in more protocols; it seems like esoteric assembly-line work. I talked to Mike at Black Hat to see what the future of libnet is going to hold. It looks promising, and it looks like Jeff is going to have an interesting project ahead of him. I wanted to focus my research on intravenous. Nemesis can also be used for evasion, of course, if you've mapped your IDS and understand what the IDS is alerting on, because perhaps they're running an insecure X server and you're able to pull the screen.
I work for SCIC Corporation, where I'm the technical lead of network penetration and data forensics. This research is not a part of SCIC; it is stuff I've done on my own time. Anything that's said today is pointed at me and not at my corporation. The tool has been demonstrated.
It has received a lot of media attention, actually. SANS has covered it in a lot of articles, including one on choking on Naptha. Naptha, I believe, was a type of attack that was coined by BindView, and there's an article from February 2001. There are all kinds of articles on protecting network infrastructure. There was a crazy November GIAC Detect report, promoted through SANS, that pits Nemesis against the Cisco PIX, and they were actually able to send IP packets that didn't log IP addresses, which is not exactly a great thing. Fyodor has listed Nemesis among the top 50 security tools of all time, and it's available at www.packet-ninja.net if you hadn't already known about that domain.
Now we reach artificial intelligence. A lot of you are probably coming in here with the misconception that we're creating sci-fi material, like if you've seen the movie. What we are actually creating is intelligent software: software based on artificial intelligence design, with the different types of data structures that are indicative of expert systems and machine learning. I looked up intelligence on dictionary.com and was kind of intrigued that, when I looked it up again right before I finished the presentation, they had changed the definitions. But intelligence is defined as the capacity to acquire and apply knowledge, and as the faculty of thought and reason, both of which can be defined in software. You can have something automated that takes care of a human function that we would normally be doing. If you're doing a pen test and exploring the boundaries of some network with host-based exploitation, we can automate that process. If you're scanning a network, you do a ping scan, you do a port scan, you do a reverse DNS lookup; all of that can be automated. If we can write it down on a piece of paper, we can automate it.
The third definition of intelligence is where human bias becomes a factor: superior powers of the mind, where mind is defined as the human consciousness that originates in the brain (of course not dolphins, but humans) and is manifested in thought, perception, emotion, will, memory, and imagination. This is the point where the definition skews, and we really have a problem in the artificial intelligence arena with defining intelligence. Some of you who are not engaged in artificial intelligence study are perhaps thinking that artificial intelligence takes things to a different level. It doesn't, really; it's just automating human interaction.
Here is a common definition that's been seen in a number of AI texts: an intelligent program is one that exhibits behavior similar to that of a human when confronted with a similar problem. It's not necessary that the program actually solves, or attempts to solve, the problem in the same way a human would. So we work on performance measures. When a port scan comes up with port 80, we assume it's an HTTP server, and we probe for HTTP vulnerabilities, we treat that as a proper performance measure, a positive increment rather than a negative decrement, and the agent understands that it's doing the right thing and moving forward.
An artificially intelligent program is defined by its environment, which here is the network traffic that passes over the wire. The sensors are a packet capture filter that narrows down the traffic we want to see on the wire, with port exclusion and static capture rules to categorize traffic. We'll get into this when we start talking about the actual live intelligent agent that I'm producing for OpenBSD. Then we have effectors, and the effector is essentially the injection engine.
For the course of this mission, we're doing stateful packet injection, which is something that's not commonly done. We always look at things from an IDS, from a defense standpoint. This is an offensive tool. This is an attempt to create a disease so that we can create an antidote. Consider distributed denial of service: imagine you have a typical DDoS setup, and you put a packet crafter on each node instead of a particular denial-of-service attack. You're then able to tell these different agents what specific traffic you want to hit. You strategically place them on the inside and the outside of firewalls, at different points of a LAN. Where you have two different endpoints, say you want to ARP spoof in two directions so that you can bring two VLANs closer together, you have a distributed sniffer component.
Examples of agents for host-based vulnerabilities would be ASCII or binary fingerprint analysis of a target's response. Not too recently, scut of TESO created binary fingerprints in printable ASCII that could detect finger responses, and there were a couple of other tools they created in this arena. Obviously we don't want to stay confined to just printable ASCII characters; we want to go to extended ASCII, make a binary fingerprint of what we can print on the screen, and store that to a file, just as scut has done. If you want to see more of them, you can head to TESO's website. I'm sure most of you know where that is. Another example is intelligent shellcode offset alignment analysis.
We have a knowledge base, which is one integral part of an expert system. We can define OS major and minor versions, and we can do process context recognition. We can look at banners. We can look at things that would be indicative of a particular exploit and attempt that exploit.
What's the biggest problem with third-party scanners today? They're doing fingerprinting. They're saying, oh, if you're running a particular version, that must be exploitable. Well, when a particular vendor's software came back with Solaris exploits for NT, that didn't work out too well in the reports we were handing off to the customer. SCIC has developed a 500-megabyte toolkit comprising a lot of tools, some standard from Packet Storm, some private, some not, with which we can go out and assess a vulnerability to the point where it is truly exploitable and not just a fingerprint match.
For network vulnerabilities, we've got default passwords, clear-text protocols, and dynamic ARP caches. There are a ton of areas we can define agents for, making an expert system where you map everything and say: if we're hitting a Cisco device, we try these password combinations; if we're hitting this SNMP implementation, we try these combinations. There's no reason to take ADMsnmp and grind on passwords that we already know are not default passwords on that device. You can map them by device. It's just one layer deeper. NetBIOS, of course, is a nice network vulnerability. Microsoft's answer to that is, well, we don't own the protocol, we just use it; use IPsec. Which of course means you have to run Win2k, and it's attached to Dial-up Networking for VPNs, so we all know what's going on there.
Stack integrity and malformed packets are another area. Nemesis has proved that certain devices can drop like a rock when hit with particular malformed packets through raw sockets. We've got covert channels. We can look at OS RFC deficiencies. Many vendors ignore must-be-zero bits and use reserved bits to store whatever. I'm not so sure that some of these commercial vendors are actually reading the RFCs.
You can have an unbelievable number of covert channels. But the reason I've tuned this intelligent agent to one operating system, as opposed to making a portable agent, is that we can't trust the operating system's packets coming back off the wire. We all know Linux drops packets. We all know that what you see is not what you get when it's coming back from particular operating systems. We can have crafted payload and non-standard protocols, and this crafted payload can be directed or indirect. There was a talk last year about IDS, and about how you should be looking at things that are running on your network as opposed to things that are not running on your network. This is great if we have sane data. We do not have sane data out there. We have a bunch of hackers throwing packets everywhere. Look at FNET. If we send a packet to a non-existent host and an IDS isn't going to catch it, then we can pick it up on the other end if we're listening somewhere else on that segment for packets destined to that host. We can have communication running back and forth that an IDS has no awareness of. So although, yes, for optimization reasons it's important to scale your IDS to your network topology, you're also going to miss a whole hell of a lot.
Packet scrubbing was a suggested alternative. I'm all for this. Dug Song, I believe, was pointed out last year to have been working on this. I've seen the stubs go into OpenBSD PF. For those of you who are not familiar, Darren has pretty much not given us permission to use IPF, and his license violates our open standards. We are moving towards OpenBSD PF, which is a product of Daniel Hartmeier of the OpenBSD development team. He has produced a wonderful set of initial code to start from, with some optimizations that I think Darren kind of missed the boat on.
As far as a firewall goes, he's put some things with IDS implications into it, such as keep frags. We'll get to this later when we discuss TCP fragmentation, reassembly, and MTU.
We have assessment code today. Our code is singular and disjoint. We have an IDS that detects, a scanner that scans, and a sniffer that sniffs. This is really, really backwards. We have a security consultant who sits in the middle and mitigates the risk. Tomorrow, or today, we're going to have conditional response. We'll have intelligent tool suites whose agents are able to talk with each other, so that we have centralized or decentralized reporting. So if a scanner on one tier of a network infrastructure detects username Joe, password Bob logging into a network, we can report that to the network scanner so that it can detect the presence of that username and password on other nodes. Those accounts could maybe be shut out if we detect that the intrusion came at an anomalous time, say three o'clock in the morning, from the secretary, who has no dial-up access or anything. We can use these tools to secure the environment until the NOC has a chance to report the problem and get people dispatched.
We have agent usage. I'm kind of moving through these slides; I've been asked to hurry up a little since we're a bit off on time. We have automated response. We have replay attacks, session hijacking, and sniff-to-compromise. That's a nice little term that means you sniff a packet, using dsniff for example, get a username and password combination, and immediately use it against the host you sniffed the traffic on, without anybody being present. So if somebody came back, they could report that it was actually a successful login and not a bogus packet or something like that.
We've got sniffers out there that will report anything based on what you send as a crafted packet. If I can dummy up any packet in Nemesis and trigger any sniffer to believe what I want it to believe, you're not necessarily getting information that's indicative of what's really out there on your network.
We can use distributed agents and autonomous nodes. This reaches the area Ender was talking about: autonomous nodes that each have their own control and no knowledge of each other. They just have their own tasks that are mandated on the wire. This is basically the future of enterprise security solutions. Today we take information off a network wire, produce a report, and walk out of the corporation with a snapshot of the corporation. Two minutes later, as we drive out of the parking lot, if they move one box, that engagement is completely obsolete. Distributed agents can obviously extend to intelligent viral infection, DDoS attacks, and intersegment sniffers. What I mean by intersegment is having two agents on two different endpoints, both ARP spoofing across the routers, both collecting information, then sorting out which ones have the exact same packets and putting together a nice network topology out of a fully switched environment.
The problem with agents, though, is that they can't think outside the box. When you're doing a network penetration, you've got other risks to worry about. You've got policy. You've got social, political, and economic risks to the corporation that may not be so intelligible to a program looking through files that may be in a proprietary format. So obviously there's a means to an end in taking this to the electronic level.
But we're going to miss a whole hell of a lot of the actual risk in a corporation if somebody is doing laundering, or doing something that may not be seen through electronic traffic alone. We would never reach that conclusion by looking at the electronic data alone; we would have to cross over to looking at other documents in the corporation.
So now we reach intravenous. The prior slides were just about marrying agents up with TCP/IP. This is an actual live intelligent agent that I'm currently working on. I handed off the Nemesis source code to Jeff Nathan of @stake because I don't have time to work on it anymore. This is all coming out of my free time. I'm looking at marrying an intelligent agent up with a host and a network so that we can have network- and host-based assertions about what we're looking at.
Hold on. Client-server design. On our agent design criteria for intravenous, I talked to Jennifer Granick for about an hour yesterday, or the day before, at Black Hat. We're using Blowfish internally. I'm not making plug-ins, and I'm not going to scale this down because U.S. crypto law restricts exportation. There are a lot of people here from other countries. We're going to do what we can. If somebody else can help write a plug-in, that would be great. I want to marry this to OpenBSD, so if we can find some other solutions, that would be great, but I'd like to use the internal Blowfish functions. So we've got a strongly authenticated as well as strongly connected client-server model. What we mean by this is that we authenticate the Blowfish password, and then we have configuration files that are stored at both ends.
And we do a SHA-1 hash and a challenge-response to verify that the information stored about who the client IP addresses are is sane on both the client and the server. Oh, I screwed up the slides.
We have a sensor, which is modeled after OpenBSD PF and scaled to userland, so we bring stateful packet inspection out to userland. We have to take care of TCP fragment handling. We have static filtering rules that are broken up into two categories, called intrastate and extrastate. We have an expert system, which has knowledge bases in the arena of TCP/IP. We essentially have three minor expert systems rolled into one. The attack and defense expert systems are essentially function stubs for the moment, because the inference engines provide lookup functions to the knowledge bases. We're trying to concentrate on next packet injection over raw sockets, so we'll be focusing on the TCP/IP expert system. And we have an injection engine, these being the three components of intravenous. The injection engine handles user-defined injection, and it handles automated injection, which is checked against the expert system to see whether we've actually gotten packets back. We're doing this from the standpoint of not allowing hackers to dump or craft packets with Nemesis or some other tool to corrupt the state table of intrastate traffic, which is the most important stateful information that's kept on the wire.
Our model sort of looks like this. This is all designed in OpenBSD, and the diagram was made using Dia. We have packets that come in. We've got the client that connects to a server. The server rules are configured. We've got three systems: the sensors, the injection engine, and the expert system, which is made up of the knowledge bases and the inference engine.
Packets arrive at the packet filter, are brought through the sensor, and are checked against the injection table to make sure that the user who connected to the server actually injected this packet. We care less about the extrastate table getting fairly corrupted, because we keep very minimal state when it comes to extrastate information. We'll get to the definitions in just a second. Intrastate is the core component of this; it's all you need to be aware of at this moment.
The problem with Nemesis was that we have static injection. If we launch one particular packet over and over and over again, we are basically simulating a SYN flood. We have a sequence number of 100 and an acknowledgement number set to zero, and we send that to the target. The target responds with a sequence number of 200 and an acknowledgement number of 101, which is the next sequence number it expects from us. However, Nemesis sends out the same packet again, so we have another SYN that doesn't get answered, and we simply have no handshake. Nobody has really explored the arena of conducting three-way handshakes through raw sockets. This agent is all about handling the ongoing connection past the three-way handshake, for as long as we can possibly keep the connection up, through hackers throwing reset floods across the wire to try to drop our connection and the general chaos that goes on on a network that is full of maybe untrusted employees, or maybe just a public network like here, where you're going to have a lot of people running amok on the wireless LAN testing out everything that was presented at Black Hat.
With intravenous, a SYN packet goes out the same way: sequence number of 100, acknowledgement number set to zero. The target comes back with 200, 101. The only difference is that intravenous interprets the packet that comes back from the target.
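The arithmetic behind that handshake example can be sketched in a few lines of Python. This is only an illustration of standard TCP sequence-number bookkeeping, not code from intravenous; the `Segment` class and the `next_packet` helper are hypothetical names, and nothing here touches a raw socket.

```python
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass(frozen=True)
class Segment:
    """Hypothetical, minimal view of a TCP header for this sketch."""
    seq: int
    ack: int
    flags: frozenset
    payload_len: int = 0


def consumed(seg):
    """Sequence space a segment uses: payload bytes, plus one each for SYN and FIN."""
    return seg.payload_len + ("SYN" in seg.flags) + ("FIN" in seg.flags)


def next_packet(sent, received):
    """Header fields for the packet that answers `received`, given what we last sent."""
    seq = (sent.seq + consumed(sent)) % MOD
    ack = (received.seq + consumed(received)) % MOD
    return Segment(seq=seq, ack=ack, flags=frozenset({"ACK"}))


# The numbers from the talk: our SYN is 100/0, the target answers 200/101.
syn = Segment(seq=100, ack=0, flags=frozenset({"SYN"}))
syn_ack = Segment(seq=200, ack=101, flags=frozenset({"SYN", "ACK"}))

# Static injection would resend `syn` unchanged; next packet injection
# derives the third packet of the handshake from the reply instead.
ack = next_packet(syn, syn_ack)
print(ack.seq, ack.ack)  # 101 201
```

Static injection never leaves the first line of that exchange, which is why the target only ever sees repeated SYNs.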
We can have a knowledge base that consists of relative value changes based on a major and minor protocol version, which we'll get to in a little bit, and that tells us how to perform the next packet injection, which the target will then answer. This keeps the connection going regardless of payload. As we'll come to see, many protocols in use today, with something like 70% of the internet being web traffic, are simply clear text with a newline appended. When you've got clear text, you can read payload from a file just like you did with Nemesis. Each packet pulls one line of payload from the file, and each packet that goes out does the next packet injection. Unicode is a great example. You could walk a file system with intravenous, searching for a particular file through a string and regex combination.
Now the client-server model. IV, intravenous, is not to be confused with vi if you're typoing, so change your csh so that it doesn't detect typos. IV, the client, has a configuration file at /etc/iv.conf, which contains only the IP addresses of the clients. I say clients because you may have two foreign powers that both want to join in the same game of playing with packets across the network, and you can set this up beforehand with delegated IP addresses. This is what the SHA-1 hash is computed against. You've got to be a user to run the client. You've got to be root to run the agent itself, because it's doing the raw packet injection. The client just stands up a Blowfish-encrypted tunnel to the agent. The agent is the only one doing the packet injection.
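The one-line-of-payload-per-packet idea mentioned above can be sketched as follows. This is a simplified illustration in Python with a hypothetical helper name (`segments_from_lines`); it assumes CRLF line endings and, to stay short, holds the acknowledgement number fixed, whereas a real connection would update it from each reply.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def segments_from_lines(lines, seq, ack):
    """Turn clear-text lines into a list of (seq, ack, payload) records.

    Each line becomes the payload of one packet, and the sequence number
    advances by the number of payload bytes sent so far.
    """
    out = []
    for line in lines:
        payload = line.rstrip("\r\n").encode("ascii") + b"\r\n"
        out.append({"seq": seq, "ack": ack, "payload": payload})
        seq = (seq + len(payload)) % MOD
    return out


# Continuing from a completed handshake (seq 101, ack 201), send a tiny
# clear-text request one line at a time.
request = ["GET / HTTP/1.0", ""]
for seg in segments_from_lines(request, seq=101, ack=201):
    print(seg["seq"], seg["ack"], seg["payload"])
```

In practice the lines would come from a payload file, one line per packet, with the expert system deciding when the next line is allowed to go out.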
This is not set up like distributed denial of service, where you have three hops: a client going to a master, the master going to the slaves, and the slaves hitting the target, so that when forensic analysis is done on the slaves, the worst it can do is point back to the master, which can't necessarily point back to the client unless they do data forensics on the master. The whole reason for the three tiers in distributed denial of service was to maintain anonymity when conducting a denial-of-service attack. Although this is an offensive tool for exploring the defensive applications that could be implemented with AI, it is meant to be installed on OpenBSD, and it assumes that you have root access on all these agents and control over all the nodes in this network diagram. So it makes little sense to play the anonymity game when you're not just compromising a host and sending packets over the wire.
We initially said that this was going to be used with Nemesis and that Nemesis would do the injection threads. The problem is that I have had some difficulty coordinating libnet development with Nemesis development. At the beginning, Nemesis was developed just fine because we hadn't reached the limits of libnet where we are now. And libnet has been getting a little airplay these days as far as its own design. After talking with Ophir, I spent more time getting some obsolete ICMP packet types and codes in place that were not being produced as a part of libnet, and I added things like ICMP source quench. Honestly, I got bored of developing libnet when Nemesis was the tool I was developing. So I decided to bring this in-house so that we can explore the arena of host-based analysis and what we can do with host-based control if we know the operating system that we're on.
The white paper that will be released after this proceeding goes into quite a bit more detail as to why we are using OpenBSD as the platform as opposed to another operating system. The server is IVD, the IV daemon, which is the agent. I'm really, really fascinated with kqueue. I recently tossed Apache to the wayside and threw up thttpd on packetNinja.net, because I noticed that with a lot of simultaneous connections the load was going up to a point where I couldn't use the box the web server was running on. kqueue is some remarkable work that came out of FreeBSD and has been brought over to OpenBSD. We're going to use it to handle concurrency, in case lots of people on different agents in different regions want to make multiple simultaneous connections to the server.
So we have /etc/ivd.conf, which contains the IP addresses of the servers, that is, the agents. Each agent is aware of the other agents on the network wire because, as you'll see later, we do an extrastate analysis of the other remote agents on the wire. This lets us see what packets are being injected into other agents while we are still connected to yet another agent, which would be considered our local agent. Our max connections are also allocated through /etc/ivd.conf, so that we maintain some sanity and don't have lagging connections between packet injections, and the agent can keep up with the intrastate engine. As you'll see, we've modeled the intrastate engine to use as little CPU time as possible. This is done through understanding how OpenBSD PF handles TCP fragment reassembly at minimal cost. And it also has to do with, well, I lost that thought. But both the client and the agent use iv.conf and ivd.conf.
They both have copies of each other's files, with one-way hashes and strong permission checking of the files. The files are root-owned, mode 0400. We maintain the sanity of IV and IVD by assuming that we are the only ones with root access to the box, or that we know about everyone who does have root access. That comes from marrying this to OpenBSD, with its secure-by-default paradigm and the fact that no remote vulnerabilities have been publicly discovered in the last four years. I mostly work on ports, but Theo and the rest of the core development team have a lot of respect on my end.
Authentication is successful with password and hash compliance. We use both components for a two-part authentication. Say somebody is talking over the phone, on AT&T or whatever, and says, hey Bob, here's my password for the agent, and it gets tapped somehow because they're using a cordless phone or an analog cell phone, and we sniff that over RF or tap their wire. We've acquired the password without having root access to that box. If we only did password authentication, we could log into that agent and control it. But with hash compliance, the attacker would also have to know the IP addresses that are configured as the clients.
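The hash-compliance half of that authentication can be sketched with Python's standard `hashlib`. The talk names SHA-1 and a challenge-response but does not give the exact exchange, so the construction below (SHA-1 over a random challenge followed by the file contents) is an assumption for illustration only, as are the function names and the sample file contents.

```python
import hashlib
import hmac
import os


def config_digest(config_text):
    """SHA-1 over the whole configuration file, comments included."""
    return hashlib.sha1(config_text.encode("utf-8")).hexdigest()


def respond(challenge, config_text):
    """Assumed response: SHA-1 of the challenge followed by the file contents."""
    return hashlib.sha1(challenge + config_text.encode("utf-8")).hexdigest()


def verify(challenge, response, config_text):
    """Agent-side check against its own copy of the file."""
    return hmac.compare_digest(response, respond(challenge, config_text))


# Hypothetical file contents: one client address, with and without a
# pound-sign comment used purely as padding.
bare = "192.0.2.5\n"
padded = "# 7f3a9c0e51d2 padding comment\n192.0.2.5\n"

challenge = os.urandom(16)
print(verify(challenge, respond(challenge, padded), padded))  # True
print(verify(challenge, respond(challenge, bare), padded))    # False
```

The second check shows the point of the scheme: someone who knows only the client address, and not the full file, produces a response that fails.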
The hash helps only if the hacker has no prior knowledge of the client IP addresses. We use pound for comments, just like shell scripts, so we can throw garbage into comments, which helps pad out what goes into the hash. Suppose only one client was assigned in ivd.conf, and somehow the password and the client IP address got transmitted over a telephone or by word of mouth across the cubicle farm. Having one IP address in there alone and doing a SHA-1 over it is completely pointless, since that could be replicated and both parts of the authentication could be broken.
Our sensors run in promiscuous mode; since we're already root, we might as well run in promiscuous mode with filtering. We use standard pcap for this filter, and we filter by what's set up in the configuration file. We have our stateful, user-defined connection filters, which define intrastate traffic. And we have stateless filters; actually it's not really stateless, it's minimal state. We keep track of the source and destination IP addresses and the ports of the applicable protocol, and that's stored in an extrastate table. The agent filter is again a pcap filter. It stores the agent IP addresses plus a set of local exclusion ports. What we mean by that is that we want to be able to set up an agent on a web server that's already communicating with the outside world, and we don't want to be sniffing traffic that is legitimately inbound for that web server. We're not developing an IDS; there are other tools to do that. We want to maintain state on unexpected packets coming to the node that the agent is running on. For the ports that are not exempted, we want to do information gathering.
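As a rough illustration of the agent filter, here is how a pcap filter expression could be assembled from a list of agent addresses and local exclusion ports. The `build_filter` helper and the sample addresses are hypothetical; only the `dst host` and `dst port` primitives are standard pcap filter syntax. For brevity the sketch applies the port exclusions to every agent address rather than to the local agent only.

```python
def build_filter(agent_ips, excluded_ports):
    """Build a pcap filter string: traffic to any agent, minus exempted ports."""
    hosts = " or ".join(f"dst host {ip}" for ip in agent_ips)
    expr = f"({hosts})"
    if excluded_ports:
        ports = " or ".join(f"dst port {p}" for p in excluded_ports)
        expr += f" and not ({ports})"
    return expr


# An agent running on a web server: ignore the legitimate inbound web traffic.
print(build_filter(["192.0.2.10", "192.0.2.20"], [80, 443]))
# (dst host 192.0.2.10 or dst host 192.0.2.20) and not (dst port 80 or dst port 443)
```

The resulting string is the kind of expression that would be compiled once and handed to the capture library, so everything past it only ever sees traffic the agent cares about.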
We have interest state; we've reached the definitions. This is the scope of user-injected response. The table is based on splay trees, after talking with Marty about splay trees and also talking with Daniel Hartmeier. Currently OpenBSD PF uses AVL trees with a single rotation, and we've discussed possibly dumping splay trees in there. Why? When we get to the definitions of trees, you'll find splay trees adhere more closely to the way that packets move across the wire: we can actually simulate caching by bringing each expected node up to root, rather than having each node of the tree maintain a height balance of plus or minus one to keep a nice balanced tree. We use layer three for this, obviously, because we can't necessarily anticipate that all our packets are arriving through the data link on the local LAN. And we sort the splay tree by the incoming packets' source IP addresses, because the destination IP address will always be either one of the other agent IPs or the local agent IP that the promiscuous sniff is running on. We also maintain a local state of the routing table. Here we start dipping our first finger into why we're marrying this to a particular operating system and not just using any operating system. Otherwise we would only be able to maintain user-defined files or /etc/resolv.conf; we would only be able to use very clearly UNIX-compliant files that are used across all operating systems. As you can tell, I'm a big Linux fan. We have Linux. TCP/IP headers have been rewritten in various ways that make them completely incompatible at a low level. That's the reason we have tools out there like libnet, and libmua, which ZSH is making as an alternative to libnet.
We have libevent, which is basically an abstracted kqueue method, so that other operating systems like Linux that don't implement kqueue can use tools like the one Niels Provos has come up with, Vomit. I don't know if you guys have seen this. It's worth checking out. Vomit is a sniffer that will basically reassemble WAV files from voice over IP connections, since Cisco chose not to encrypt these connections. We have interest state rules. These are the static rules. Once the traffic makes it through the initial packet filter, the pcap filter on the agent IP addresses and the local exempted ports, we have a set of statically defined rules that go beyond that to define what interest state is and what extra state is. The packet must have a local agent destination IP address. The packet must also have a protocol match. So we have a little bit mask, a little u_short type, that keeps track of the protocols that are currently in the interest state table. These are the protocols that have either been user-defined injections or are following a user-defined injection through continual injection after the user has made the initial SYN packet. We have a flow that goes back and forth across the agent, hopefully automated, and the packet has to match up with a protocol that is already part of the interest state table, so that we are not getting a protocol thrown on the wire that has not already been a part of the interest state engine. Part of the reason for this is that we don't want an attacker crafting packets that can saturate our table and make it a mess to sort. A good example: there are certain tools out there, which will remain anonymous, that do passive OS fingerprinting, and they use hash tables. We chose not to use hash tables because hash tables are inherently vulnerable. Why are they vulnerable?
Well, if a hacker knows, I should say attacker, since hackers are not all bad. If an attacker knows what to inject on the wire to produce state, the attacker can fill up that hash table. The early days of a particular tool I'm thinking of, the currently publicly available one, used a hash table of size 10,000. If I inject a whole bunch of packets from random IP addresses to fill up that 10,000, passive OS fingerprinting no longer happens, because there was no mechanism in that code that would copy the old hash table into a bigger hash table, which is a mess on the big O. Everybody here knows big O notation. It takes big O of N, and if you've got 10,000 entries, that's a hell of a lot of time to re-copy that hash table. So we wanted to use splay trees, or trees in general, to keep a pointer-based reference that we can continually expand until we exhaust available resources. And since we're using OpenBSD as the operating system of choice, we've got a few APIs we can use, such as pool_*. The pool routines are a kernel resource pool manager, which basically allows an application to not drop like a rock when the box is hammered and the kernel is having enough difficulty trying to keep up. So we do other things. We do packet prediction, which will become very apparent and is discussed in extreme detail in the white paper. Packet prediction is the core component of the expert system. We have injection state. We're doing stateful packet injection, so we want to make sure that if the user says, I want to inject a SYN to port 80 of this host, that becomes part of the injection state, and that when the next packet comes back in, we marry that up with the expert system's relative value changes and we know that it's a valid packet and not a spoofed packet.
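To make the hash table saturation problem from earlier in this passage concrete, here is a toy illustration. It assumes a fixed-size, open-addressing table with linear probing and no resize step; the class is hypothetical and is not the code of any particular fingerprinting tool.

```python
class FixedTable:
    """Open-addressing hash table with a fixed slot count and no resize."""

    def __init__(self, size):
        self.slots = [None] * size

    def insert(self, key):
        n = len(self.slots)
        start = hash(key) % n
        for i in range(n):                      # linear probing over every slot
            idx = (start + i) % n
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return True
        return False                            # table is full: state is lost

table = FixedTable(8)
for i in range(8):                              # attacker-chosen spoofed sources
    table.insert(f"10.0.0.{i}")
print(table.insert("192.168.1.1"))              # a real host can no longer be tracked
```

Once every slot holds a spoofed source, new hosts are silently dropped, which is the failure mode a pointer-based tree avoids by growing until resources run out.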
And we check this against the expert system tables, which the inference engine uses against the knowledge base for lookup. Extra state is the information gathering component. Layer three; layer two is assumed. Layer three is also kept on a splay tree. It keeps minimal state. We have a linked list that keeps track of IP to MAC address relationships on the local LAN, for later purposes of doing either attack or defensive research using the packet anomalies that you'll see coming up later. The extra state rules are: we have a local agent destination with an unknown protocol, or a local agent destination with an unknown source IP address, or a remote agent destination, for strategic placement. So if you set your agents up on two different sides of a LAN where you predict attacks may be coming from, you can guarantee that one of those agents is going to be on the same network segment as the attacker, which means that we can do some sort of stateful analysis from that agent and, with minimal state, keep track of it with the other remote agents. It also keeps track of layer two and layer three broadcasts. TCP/IP header versus MTU. This gets into why we're handling fragment reassembly cheaply. The largest packet I've seen on the wire, sending out numerous packets with Nemesis and other tools, is a TCP SYN or SYN-ACK, which sends out 64 bytes because of a 25-byte set of TCP options. ARP follows at 28 bytes. The RIP base for extended RIP options is 24 bytes, and an OSPF Hello packet is 24 bytes. Those are kind of the top four of the protocols already used in Nemesis, as to the max values I've seen on the network wire. So what happens if we get a packet sent to us that is less than 64 bytes in length and we don't have a full header? Well, the minimal MTU I've seen is 572, which is indicative of PPP. I have never seen an MTU that has gone below 64 bytes.
So the best way that we can handle fragments is to simply drop the first frag if the full protocol header is not contained, because we're filtering based on this protocol header. Otherwise, we just filter on the first fragment only. Whereas Darren Reed's IPF keeps frags. Why? I'm not really sure. If we drop the first frag and let the other frags pass through, there's no first frag to match up with the rest of the fragments when doing reassembly. Therefore, we don't have a full attack taking place. The header's missing, or part of the header's missing, so the packets get dropped. So why keep frags? I've been talking with Dan about this a number of times, and we haven't been able to come up with really legitimate reasons for doing this. Now, I can understand doing this for IDS purposes, because there we're trying to match up anonymous attacks, exploitation, and fingerprinting with signatures. So we want to make sure that we get the full header along with the payload, so that we can actually assign this to rules and have some knowledge of what the attacker is doing, rather than just a bunch of random packets going over the wire, which doesn't do anybody any good and creates a lot of hell going through tcpdump. Otherwise, we filter the first fragment only, with the complete header, and the next packet injection depends only on the header. So again, we don't care about the payload. If we only care about the complete header, we can just drop all the rest of the fragments. And obviously, there are many reasons for doing fragmentation. A classic example is IP over IP tunneling, which increases the size. Our MTU is set to Ethernet. We've got 1500 bytes on Ethernet, and with IEEE and different header combinations we have 1492. An IP over IP tunnel would drag that size up to where it'd be greater than the Ethernet MTU.
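The cheap fragment policy described above can be sketched as a small decision function. This assumes the sensor already knows the fragment offset and how many transport-layer bytes the fragment carries; the function name and return labels are hypothetical, while the minimum header sizes are the standard ones.

```python
# Minimum transport header sizes in bytes (TCP without options, UDP, ICMP).
MIN_HEADER_LEN = {"tcp": 20, "udp": 8, "icmp": 8}

def handle_fragment(protocol, frag_offset, l4_bytes_available):
    """Decide what to do with one IP fragment under a header-only policy."""
    if frag_offset != 0:
        return "ignore"      # later fragment: payload only, no protocol header
    if l4_bytes_available < MIN_HEADER_LEN[protocol]:
        return "drop"        # first fragment without a complete header
    return "filter"          # complete header: run the normal state rules

print(handle_fragment("tcp", 0, 8))    # drop
print(handle_fragment("tcp", 0, 44))   # filter
print(handle_fragment("tcp", 185, 64)) # ignore
```

Because next packet injection depends only on the header, nothing here needs to buffer or reassemble payload.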
So we'd have fragmentation across all these packets. Now, of course, this would be better handled if we did even distribution and understood ahead of time that we're doing IP over IP tunneling, so that we could bump the MTU up and make sure that we weren't doing needless fragmenting, because fragmenting each packet could produce twice the cost. So our possible performance increase or decrease is based on this MTU. And then, of course, attackers fragment these packets, either very large oversized packets or, more typically, very small ones, for denial of service, evasion, or compromise. We have our state tables. First, the interest state table. The updates consist of addition and deletion. We have rapid deletion. This is why we don't use hash tables: it's an expensive cost to have to keep updating a table and deleting entries from it. It's just easier to make a prime hash table and to use linear probing or quadratic probing; there are all types of methods you can use to try to avoid collisions in hash tables. Frequent use. It's a very expensive operation. Keeping state on this table is the most expensive operation, so we need an optimal data structure. Then there's the extra state table, whose updates consist only of addition, and we'll remove the stuff on a timeout. There's really no reason to keep removing stuff over and over again like we do with interest state, because in interest state we're only trying to keep track of the current packet that's on the wire so that we can inject the next packet out. We don't want to have to sort protocol-dependent fields to figure out which packet was the latest packet sent over the wire. There's just too much information to keep track of. Oh, which sequence number is the latest sequence number? It's a mess. So we're doing updates on the interest state table with rapid addition and rapid deletion. How's my time here? All right.
Speed is important, but not at the expense of security. Hash tables versus search trees: why we don't use hash tables and why we use search trees. Hash tables can create, destroy, insert, delete, and find. We have a static array, and it's expensive to resize, as we talked about before. Our worst case is big O of N. Our average case is big O of 1 if we don't have collisions. We do collision handling by giving the hash table a prime size, and by open addressing or chaining. You can read up on hash tables if you're really interested in this. They're better if we're doing things like, I don't know, symbol tables, like when you're compiling something, where you're not going to be removing things as you compile because you want to maintain state on those symbols. It's better if you just have additions and you don't have deletions. Our search trees can also create, destroy, insert, delete, and find. A typical search tree is height balanced at each node. This can be an expensive operation if your search tree is very big. Balance is maintained through rotations. AVL trees are commonly defined with single and double rotations, which are commonly called zig-zag, zag-zig, zig-zig, and zag-zag. OpenBSD PF implements solely single rotations at the moment, but we're still filling it out. This is still early in its development, but hopefully everything will be in place and we'll have all the state tables solidified for PF by OpenBSD 3.0 this December. We have a linked structure representation rather than the static array that we've defined for a hash table, so our resize is just adding more pointers. Our worst case is big O of log n on a height-balanced tree. We decided to use splay trees over AVL trees after many discussions. I'm pretty aware that Dragos, I'm not sure if he actually implemented the code. I know he was involved with it, according to Dan.
Marty Roesch, who just left at one o'clock today, unfortunately, could probably confirm that, but splay trees are used all over the place in Snort. They can perform faster operations than AVL trees, given a good splaying algorithm. They're self-adjusting binary trees. There's no height data stored, unlike AVL, so we get faster searches, but we rotate each access to root. So basically, the reason this works better for TCP/IP is that if we're doing interest state analysis of a packet, we have source IP to destination IP and vice versa going back and forth over and over again. If, on the first rotation into the splay tree, we bump that value up to the root of the tree, we have a big O of 1 worst case for each additional packet that comes across the wire, until we get a packet that is going to get rotated into interest state because the user wants to send a packet to an HTTP server and then another packet to an SMTP server and yet another box. As these stagger back and forth, we're obviously going to have a worst cost of about the cost of an AVL tree. If anybody can argue that point with me later, that'd be great. I'd really like to hear that AVL works better than splay, but so far I'm not convinced. The amortized retrieval is big O of log n on average. Our worst case is a tree that pretty much looks like a straight line that goes across, all in order. By definition, a splay tree will never be big O of N for more than one insert into the splay tree. We bump that up to root, we have a little more balance, and things are happy. Our splay tree algorithm does double rotations and does this zig-zag, where the zig is indicative of a left branch and a zag is indicative of a right branch. This is also expressed in more detail in the white paper. Expert systems. They're unbiased, maximum performance, and lack personality.
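To illustrate the splay-to-root behaviour discussed in this passage, here is a compact recursive splay tree keyed on integers (for example, a source IP address converted to an integer). It is a textbook-style sketch, not the tool's implementation, and the recursion makes it suitable only for small demonstration trees.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(root, key):
    """Bring key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                       # left-left case
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                     # left-right case
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                          # right-right case
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                        # right-left case
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)

def insert(root, key):
    """Insert key and leave it at the root."""
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node
```

After a lookup the accessed key sits at the root, so a second packet from the same source is found on the first comparison, which is the caching effect the talk relies on.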
Mycin, which is probably the first commercial expert system, mapped bacterial infections: doctors could take the symptoms that a particular patient has, look across a number of bacteria, and try to pinpoint what might be the cause of that person's illness. Obviously, everybody in this room has a proficiency. Some people are Linux kernel coders. Some people are Linux userland coders. Some people are Windows people. But if you have an expert system that, like in Mycin's case, is mapped against all known bacterial infection types, we can deduce what the plausible bacteria types are, with human intervention to make sure that we have a possible match so we don't give somebody the wrong prescription or something. Knowledge bases. We have a TCP/IP rule base, an attack rule base, and a defense rule base. The TCP/IP rule base is the most important. It is indicative of a dictionary: a dictionary of protocols by major and minor version, with the respective dependent fields that need to be altered to maintain this next packet injection. And again, we have inference engines that match up to these knowledge bases. The inference engines pretty much tell how to sort the knowledge bases in as little time as possible. Our TCP/IP rule base consists of a list of rules by protocol. Say we inject a TCP SYN to destination port 80. So we make an injection table addition. We sniff the next incoming TCP SYN-ACK packet. We do the reverse injection state match, so we invert the source with the destination IP address and the ports. And we say, hey, that was already injected by the user, so this must be a valid sniff. We add that to the interest state table. And the rules represent the minimal state change for connection integrity. Here's an example. We have a packet prediction. We have a SYN-ACK going to an ACK. So we sniff the SYN-ACK. We want to send out an ACK as the next packet.
We sniff the ACK number of the incoming packet, and we make that the new sequence number of the outgoing packet. We sniff the sequence number of the incoming packet, add one, and make that the ACK number of the outgoing packet. Hence, we keep the connection going through raw sockets. We add a payload. This makes clear text protocols really easy. You've got Unicode and all these other little attacks that you always see people running against your Unix boxes, you know, like they did any sort of information gathering. You can do HEAD / HTTP. You could just dump this into a file. And if you're maintaining this packet prediction, next packet injection, you pull the next entry from the payload file and boom, you've got a HEAD. You've done a stateful connection, back and forth, three-way handshake, and you've actually simulated doing information gathering through HEAD using raw sockets alone, with no client/server connection through standard networking sockets like BSD sockets. Exploitation. Here's a famous one. And, ooh, IDS evasion. We can use this to map the directory structure. All we have to do is change the payload at the end here, and we can actually walk a file system and have it dump all the packet payload that's coming back on the wire to a file. Really, the goal here is not to have the agent take over the human's interaction. It's just to help out with the information gathering so that the human isn't doing all these esoteric moves. You've got to do a network penetration test. All right, I've got to do a ping scan. All right, I've got to do a port scan. All this is just so humdrum that it would be nice if we could automate some of these attacks and just bring the information that we want back. After all, we're assessing the organization for economic risk, and not necessarily for the exploit of the day for Solaris.
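The SYN-ACK to ACK prediction described at the start of this passage comes down to a handful of relative field changes. The sketch below uses a plain dictionary as a stand-in for a parsed packet; the field names are hypothetical, and the modulo reflects that TCP sequence numbers are 32-bit.

```python
def predict_ack(synack):
    """Derive the outgoing ACK from a sniffed SYN-ACK using relative changes only."""
    return {
        "src": synack["dst"], "dst": synack["src"],        # swap addresses
        "sport": synack["dport"], "dport": synack["sport"],  # swap ports
        "flags": "A",
        "seq": synack["ack"],                              # their ACK becomes our SEQ
        "ack": (synack["seq"] + 1) % 2**32,                # their SEQ + 1 becomes our ACK
    }

sniffed = {"src": "192.0.2.10", "dst": "198.51.100.7", "sport": 80,
           "dport": 40000, "flags": "SA", "seq": 5000, "ack": 1001}
print(predict_ack(sniffed))
```

Nothing in the payload is consulted, which matches the point that next packet injection depends only on the header.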
We've got bouncing and stealth. We can connect to a proxy and send a payload after we've made this connection, and we can basically masquerade our IP address. Our attack rules: I'm going to go through this real quick because these are stubs right now. They maintain exploitation signatures and denial of service signatures. All of these draw on the TCP/IP rule base. The reason these are stubs right now is that we need a rule base language for maintaining packets and their dependent protocol values. What we mean by that is we need a way to say, okay, we want to send out this packet, then this packet with these fields, then the next packet, and that's indicative of yada yada denial of service attack. So we do this at the protocol layer. We aren't calling it Smurf. We aren't calling it JP. We aren't calling it by other names for denial of service attacks. The idea here is that we don't want to make a network attack engine that can only do one thing. I want to make a packet crafter that can launch a whole bunch of these attacks, given the knowledge in the rule base, through inference engine lookups of how to launch each one of these attacks, defined by the individual packets that get sent across the wire. Keep in mind that we have full control over the protocol headers by not using standard networking I/O, so we can pretty much do anything here. The defensive rule base is the reason we're keeping this to OpenBSD: we want to marry this up with OpenBSD PF through ioctl. Wouldn't it be cool if you had an agent on a multi-homed host, and you as a user connecting to these agents knew that these multi-homed hosts were connected to different peers across two different interfaces, two different ISP peers, let's say, or upstreams.
And attacks started coming in through a particular interface, and then we could automatically drop that attack by dynamically adding a firewall rule, or dynamically adding a black hole route that would drop all those packets until somebody got a chance to come and assess the damage that was happening on the network. If nobody's getting to the server anyway because it's a denial of service attack, we might as well keep the stuff from flooding the network. If you've got IDS sensors deployed across your network and you're doing information gathering based on your topology, just like every good speaker has told you so far, you are collecting a ton of state that keeps coming in, banging your boxes, and you keep getting sensor alerts that will keep going until somebody responds. It'd be nice if we could keep the rest of the information gathering on the network sane: we detect it only once and hold that until somebody can get to the box to figure out what's going on and do some data forensics analysis. So we marry this up with both OpenBSD PF and the routing tables. The TCP/IP engine, attack engine, and defense engine. We do deterministic prediction: predict with certainty. There are many ways of doing inference engines. There are pretty much three ways: forward chaining, backward chaining, and rule value. Rule value is an optimized version of backward chaining. Forward chaining is a data-driven search. Backward chaining is a goal-driven search. We want to predict with certainty across the TCP/IP rule base inference engine. We don't want to send the next packet if we don't know exactly what we want to send next. We don't want to hose the interest state connections. That's the whole goal here.
If we can't maintain the connection because, say, we've got three unknowns and we can't decide what two of the unknowns have to map to, we just drop the packet. TCP is very forgiving. If you don't respond and there's already a three-way handshake going, it'll gladly send you another packet and tell you more about what it needs. We do probabilistic prediction for attack and defense. We predict with uncertainty. This is mostly used for user-defined injection. Want to do a Smurf attack? Send it across the wire. So we remove the greatest uncertainty at minimal cost: major protocol, TCP or ICMP; minor protocol, TCP SYN-ACK or ICMP redirect. Rule value means that we remove the greatest uncertainty at a minimal cost. So if the next packet that comes across the wire is a TCP packet, we can instantly narrow the scope of the knowledge base down to just TCP packets. And then, based on the TCP flags that come across the wire, we can narrow that down to an even finer-grained scope of what the minor protocol is. So now we're only looking at what the next relative injections are for a particular TCP event. So we have protocol- and connection-dependent relative field values. Add one to this field, set this one to a default value. We just grab the minimal state required to make the next connection out. For most packets, in my research of gluing various pieces of code together, it's a constant value. It's like big O of three, big O of four: we usually need to change about three or four values to keep a connection moving. User-defined injection. Can anybody read that? User-defined injection is the only way to have interest state. We aren't allowing any packets to move across the wire that are brought in through the pcap filter and through the interest state and extra state static rule base without the user injecting the packet first. So we inject and we add to the table.
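The narrowing by major and then minor protocol described above can be pictured as a two-level lookup. The knowledge base contents below are invented for illustration and are not the tool's rule base; the point is only that an unknown at either level yields no prediction, so the packet is dropped rather than guessed at.

```python
# Hypothetical knowledge base: major protocol -> minor protocol -> next injection.
KNOWLEDGE_BASE = {
    "TCP": {
        "SYN-ACK": {"send": "ACK",
                    "changes": ["seq = in.ack", "ack = in.seq + 1", "flags = ACK"]},
    },
    "ICMP": {
        "echo-request": {"send": "echo-reply",
                         "changes": ["type = 0", "swap src/dst"]},
    },
}

def next_injection(major, minor):
    """Narrow by major protocol first, then minor; None means drop the packet."""
    rules_for_major = KNOWLEDGE_BASE.get(major)
    if rules_for_major is None:
        return None
    return rules_for_major.get(minor)

print(next_injection("TCP", "SYN-ACK")["changes"])
print(next_injection("TCP", "FIN"))   # no rule: nothing is predicted
```

Each rule lists only the three or four relative field changes needed to keep the connection moving, in line with the constant cost mentioned in the talk.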
And then we have automated injection, which injects the next packet if the criteria are met. We have stateful packet injection. The usage is: the sensor sniffs the packets. Interest state checks the reverse of the addresses and the ports. It checks the injection table. If it's present, we send to the expert system. If it's not present, we update the extra state, we drop the packet, and we drop state. So we want to drop all the state we've been keeping on the packet and move it into extra state if it's not important to what the user has already defined they want to do across the wire. Thank you very much. This is the actual key to unlock the white paper that I will upload to the site later. I did this just to give you guys a little value-added flavor. I've had all these people on IRC asking, are you going to release the paper early? No, but if you attended the conference, which a lot of you did, cognitive science is the key, and it will be mapped to that key, and that should encrypt it and decrypt it. And that's it. Does anybody have any questions? I know you have a lot, but this white paper will clear up a lot. The presentation wasn't, but what did you mean by case sensitive? Oh yeah, it is case sensitive. Yes. www.packetninja.net. You'll find both Nemesis and intravenous there. I kind of pulled all the kiddie code that I used to write off the site. I'm trying to get deeper into OpenBSD. I currently maintain about 15 ports for OpenBSD, and I want to start delving more into the kernel. I'm providing this for OpenBSD for reasons that will be laid out in the white paper as very sane reasons for making this choice. But I'll admit to you all right now that I'm being a little selfish, in that I want to be able to research the OpenBSD kernel at a more intricate level at the same time I'm producing the intelligent agent. Anything else? Thank you very much for attending.
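For reference, the sensor flow summarized at the start of this closing passage (sniff, reverse the addresses and ports, check the injection table, otherwise fall back to extra state) might look roughly like this. The table shapes and names are assumptions: the injection table is taken to be a set of tuples recorded when the user injects a packet.

```python
def reverse_key(pkt):
    """A reply to a user injection has source and destination swapped."""
    return (pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"], pkt["proto"])

def handle_sniffed(pkt, injection_table, interest_state, extra_state):
    key = reverse_key(pkt)
    if key in injection_table:
        interest_state[key] = pkt        # keep only the current packet per flow
        return "expert_system"
    # Not tied to a user injection: keep minimal state only, then drop.
    extra_state.add((pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"]))
    return "extra_state"

# The user injected a SYN from 198.51.100.7:40000 to 192.0.2.10:80.
injections = {("198.51.100.7", 40000, "192.0.2.10", 80, "tcp")}
interest, extra = {}, set()
reply = {"src": "192.0.2.10", "sport": 80, "dst": "198.51.100.7",
         "dport": 40000, "proto": "tcp"}
print(handle_sniffed(reply, injections, interest, extra))
```

A packet from any source the user never injected toward ends up in extra state, so spoofed traffic cannot create interest state on its own.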