Hopefully this will work. Okay, so what I'm going to talk about is rootkits, and also ways to detect rootkits. I've developed a freeware tool called VICE that does that, so if you came to see a bunch of hookers taken away by some policemen, you're in the wrong room. I know! I know! The limo broke down, you know, whatever.

So, as he said, my name is Fuzen Op. If you came to Black Hat, you know I'm also Jamie Butler. It's in your program. That's no secret. I'm at rootkit.com. Everything I discuss here you can download from rootkit.com, but wait until Monday or Tuesday of next week. The source code for all the rootkit stuff I will talk about is on rootkit.com, so you can have that and change it or whatever. VICE, the freeware tool, is currently downloadable in binary format.

So here's what we're going to do. We're going to talk about hooking and how a lot of really lame rootkits hook in user land and so forth. Then we'll talk about the tool, VICE, to detect it. Then we'll talk about direct kernel object manipulation, which is incorporated into FU, and do a demonstration. Pardon my voice today. It's pretty dry.

So we'll start with a quick general overview of an operating system. This talk applies mainly to Win32, but you could also apply all the rootkit technologies and concepts to Linux or SPARC or whatever. In a Win32 type of environment, there's user land, and that's supported by kernel32.dll and ntdll.dll. Whenever you write a program, those are the libraries you link to, and you use the functionality exported by those DLLs. Once those DLLs get hold of your request, they eventually call down to the kernel, through some means we'll talk about in a moment. The kernel is what implements the low-level functionality that allows these APIs to work. So it's the kernel doing all the work; kernel32 and everything else is basically just a pass-through.

So, attack scenario. Rootkits are not about how to exploit a box.
There have been other talks here, FX and Halvar and others who are experts at shellcode and so forth. So I won't talk about how to gain access the first time. What a rootkit is for is to maintain access. You want to hide your presence, and you want a remote command and control channel that is undetectable by IDSs, by the kernel itself, and maybe by intrusion prevention systems. Rootkits, for the most part, at least the ones I write, exist in the kernel. So they act as part of the kernel. And we'll talk about why device drivers acting as the kernel can be really bad.

Until recently, most rootkits were nothing more than Trojan programs. This is probably like an 80s thing. They would replace ps or ls or some of the other commands on the file system, so that when an administrator ran the command, they would get a Trojaned version of the executable. Of course, companies came along and, for the large part, solved this problem. So rootkits evolved to filter data by hooking functions in memory. This way they could achieve the same results. By hooking a function, you can change what that function returns. So, for instance, if you're asking for a list of processes on the machine, by hooking you can filter out the rogue process that you do not want displayed.

Hooking can be done at many different places. The import address table is one, and I think we'll have a diagram of that in a moment. You can hook the system call table. You can hook the interrupt descriptor table. Really, these are only a few of the places you can hook. There's quite a multitude.

So, this is just a diagram. I don't have a light pen, so sorry, I can't point out the relevant features on the slide. The big green box represents a program that's running, and it needs functionality from another DLL. So it has stubs in its code called the import address table.
When the loader loads the program into memory, it sets the appropriate function addresses in this IAT for the functions that live in other DLLs. And that's a blow-up of it there. The functions are either exported by name or by ordinal. And the number there, one-one-two-two-three-three and so on, represents the address at which the function in the other DLL would actually be located.

So, this is the normal control flow. Code in your application goes through the IAT, and the IAT calls into some DLL, such as kernel32, et cetera. Now, you can inject a DLL into another process. There are a couple of different ways. We won't cover those, but they're well documented, and it's also on rootkit.com. There are at least three different rootkits that use DLL injection to hook things. So, in this case, the rootkit is changing the IAT entry so that the program's control flow goes to it instead of the original DLL.

Now, here's an example of a system call hook. On, I guess, my left, there are the system calls being made. That's an application asking the operating system to do something on its behalf. It passes through that magic barrier there, and now it's in kernel mode. In the kernel, in ntoskrnl, there's a function called KiSystemService. KiSystemService resolves the number of the function you're trying to call. So, for instance, take ZwCreateKey for creating a registry key. Maybe that's function number 3B. It uses that as an index into the system service descriptor table, which just contains function pointers. There's a blow-up of each table entry, and they're merely function pointers indexed by call number. The function is represented here. So you go through the call table to the function in question with this control flow. Now, if you have a rootkit, obviously you can divert that control flow to yourself. This is the same idea as a user-land IAT hook.
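To make the table-hooking idea concrete, here is a minimal user-space model in Python, not kernel code. It assumes a toy service table that is just a list of callables indexed by call number. The names `service_table`, `dispatch`, `install_hook`, and the call number `QUERY_PROCESSES` are invented for this sketch and do not correspond to any real Windows structure or API.

```python
from typing import Callable, List


def query_processes() -> List[str]:
    """Stand-in for a real service routine that lists processes."""
    return ["System", "explorer.exe", "rootkit.exe", "cmd.exe"]


# Toy service table: the position in the list plays the role of the call number.
service_table: List[Callable[[], List[str]]] = [query_processes]
QUERY_PROCESSES = 0  # hypothetical call number, chosen for this sketch


def dispatch(call_number: int) -> List[str]:
    """Look the routine up by number and call it, as a dispatcher would."""
    return service_table[call_number]()


def install_hook(call_number: int, hidden: str) -> Callable[[], List[str]]:
    """Swap the table entry for a wrapper that filters one name out.

    Returns the original entry so the caller could restore it later.
    """
    original = service_table[call_number]

    def hooked() -> List[str]:
        # Call the real routine, then drop the entry we want hidden.
        return [name for name in original() if name != hidden]

    service_table[call_number] = hooked
    return original
```

Callers still go through `dispatch` exactly as before; only the pointer stored in the table has changed, which is why this kind of hook also shows up to anything that compares table entries against where they are expected to point.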
It's just a little bit more complicated, more or less, because not as many people seem to know how to write kernel code, and they can get some things wrong.

This is an example of a more advanced technique where, after going through the system call table, control still goes to the correct function, which is represented there at the top. However, the function has been hooked with an immediate jump, the immediate jump goes to the rootkit, and then the rootkit has control once again. This is a much more advanced technique. It's sometimes hard to get correct.

Quick overview again; this is an animated slide that maybe represents it a little bit better. So kernel32.dll runs for a while. This function represents CreateFileW, to create a file in the file system. It eventually calls into ntdll, calling the function NtCreateFile. That runs for a while, and then it calls into the system. At each one of those places on the slide, you could have hooked. You could have hooked at the beginning of the function. You could have hooked where the function made the call. You could hook the IATs, et cetera.

To get into the kernel on Linux, the interrupt for requesting service from the kernel is int 80. On Windows, it's int 2e. So when you hit int 2e, that's when you jump to KiSystemService, which resolves the number of the system call that you wanted, and also where your user parameters are and so forth. Just for reference, the EAX register holds the system call number. In this case, it would correspond to NtCreateFile. And EDX holds a pointer to the user-supplied parameters. So once you're in the system call table, again, it's just an address, and it jumps off to the real function in question.

This slide isn't animated, but you could have hooked NtCreateFile by overwriting the push ebp, the mov ebp, esp, and the xor. You would have to overwrite at least that many instructions, maybe the next push also. Because a jump is a five-byte instruction, it's at least five bytes.
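As a rough illustration of why five bytes is the minimum, the sketch below builds and decodes an x86 near jump (opcode `E9` followed by a 32-bit displacement measured from the end of the instruction). The helper names `build_jmp` and `decode_jmp` and the addresses are made up for the example, and the prologue bytes are illustrative rather than taken from any real function.

```python
import struct
from typing import Optional

JMP_REL32_OPCODE = 0xE9  # x86 near jump with a 32-bit relative displacement
JMP_REL32_LENGTH = 5     # 1 opcode byte + 4 displacement bytes


def build_jmp(patch_address: int, target_address: int) -> bytes:
    """Return the five bytes that, placed at patch_address, jump to target_address."""
    # The displacement is relative to the address of the *next* instruction.
    displacement = (target_address - (patch_address + JMP_REL32_LENGTH)) & 0xFFFFFFFF
    return struct.pack("<BI", JMP_REL32_OPCODE, displacement)


def decode_jmp(patch_address: int, code: bytes) -> Optional[int]:
    """If code starts with a near jump, return its absolute target, else None."""
    if len(code) < JMP_REL32_LENGTH or code[0] != JMP_REL32_OPCODE:
        return None
    (displacement,) = struct.unpack_from("<i", code, 1)  # signed 32-bit
    return (patch_address + JMP_REL32_LENGTH + displacement) & 0xFFFFFFFF


# Illustrative prologue: push ebp; mov ebp, esp; xor eax, eax (exactly five bytes).
EXAMPLE_PROLOGUE = bytes([0x55, 0x8B, 0xEC, 0x33, 0xC0])
```

A detector can run something like `decode_jmp` over the first bytes of a function and ask whether the target still lies inside the module that is supposed to own that function.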
So you have to overwrite at least the first five bytes of the function.

So, on rootkit.com, there's a whole lot of rootkits that are user-mode rootkits. And user-mode rootkits aren't the way to go if you want to write a rootkit, because they're fairly trivial to find. By hooking the IAT and so forth, they leave a really big footprint that you can look for, because you know the address range of the DLL and so forth. So you can look for jumps outside of the acceptable range and then flag those. VICE was written to take care of that. Also, hooking, even in the kernel, is a little bit old-school now. It's been around for at least six years, probably, in public source code, and maybe in the Linux world even longer than that. I'm not familiar with that.

So, VICE detects kernel hooks, Win32 API hooks, and inline function patching. What I mean by that is, when you overwrite the first five bytes of a function, that's inline function patching. Also, and this isn't on the slide, device drivers in memory have a table of function pointers too. Whenever you talk to a driver, you're talking through I/O request packets, IRPs. So when you use netstat to list all the open ports, under the hood you're actually talking to tcpip.sys, and you're passing it one of the IRP_MJ_ query requests. That's one of the IRPs you'll pass down to the driver. Well, there's a function pointer within the driver object that holds the address of the function that handles that particular IRP. So you can also hook there. However, I haven't come across any public rootkits that do that in the Windows world, and I doubt it's done much in the Linux world. I'm not sure of the format of the LKM structure.

So I'll try to get to a VICE demonstration now and show you how it works, so that you can use it when you go home. Okay, well, let's turn that key. I'll try to show it in a minute if I have extra time. The key's not installed right now.
So to save the day, I have backup slides. Here's VICE versus Hacker Defender. Hacker Defender is a very good rootkit that's on rootkit.com. I believe it was written by Holy Father, and he's very knowledgeable. I've discussed rootkit stuff with him over email. Here's the VICE output. I'm not sure if it's completely readable, but all these slides should be on your CD also.

For the things there with the little Elmer's glue icon, you'll see it's hooking. There's something hooking kernel32.dll, and actually, in this case, way up at the top of the screen, that turns out to be NTDLL. So you might say, well, NTDLL is not a rootkit. That's correct, but the way it's used in the system, it actually hooks other Windows APIs occasionally. Therefore it is a hook, and that's what VICE catches. The things with the little Swiss Army knife icon, those are all the functions that VICE caught Hacker Defender hooking.

When possible, VICE will output where on the file system you can find the hooking module in question. That allows you to go to the file system and delete that bad DLL or whatever from the system. In this case, it couldn't resolve Hacker Defender. VICE is still a work in progress, and maybe in the future we will be able to resolve that on the file system. So you can see it hooking things like CreateProcess and so on and so forth. VICE also gives you the address in memory. So if you want to go and reverse engineer it yourself and figure out exactly what their function does, you can do that.

Here's VICE versus Vanquish. Vanquish is another popular rootkit at rootkit.com. And that's it about that.

How many read Kdm's paper in Phrack 62? Anyone? Okay, he stated that VICE could not catch his rootkit called NT Illusion, which is an entirely user-land rootkit. This was correct at the time he tested. There was a bug in VICE, and VICE had not had any development on it since it was first written in mid-April.
However, that bug is now fixed, and as you can see, it catches NT Illusion. What's noteworthy here is that it's not only catching NT Illusion, it also shows where on the file system it is, so you can go delete it.

So that was rootkits and hooking. We talked about how hooking is detectable, and now we're going to go a little bit more into what consumers are currently doing in the world of trying to prevent viruses, trying to prevent rootkits, and so on and so forth. Everyone's running to third-party personal firewalls and also host-based intrusion prevention systems.

So what are HIDS? HIDS are host-based intrusion detection systems, HIPS being host-based intrusion prevention systems. What are they there for? They're there to detect what processes are running on the system, or maybe whether a new process has started, and to block things like Netcat and so forth from running. They watch what files are created, deleted, or modified that may be important to a HIPS. For example, don't let anyone write to the WINNT or Windows System32 directory, because that's probably going to be a bad thing. HIPS also keep track of network connections made and possibly deny those connections, and they try to do a little bit with privilege escalation. And they look for buffer overflows and so forth.

However, for a lot of these different pieces of functionality, HIPS and the current security products rely on the operating system itself. So if you subvert the operating system, you subvert the HIPS.

Here's some of the functionality built into operating systems today that allows HIPS to work. If it allows the HIPS to work, you could probably also use it as a starting point for where to look when you try to unlink a HIPS, for instance, in your kernel. So you can register OS-provided callback functions. For instance, if you want to be notified every time a process is created, you can register an address to be called in that case, and the kernel will do so.
That works for threads also, and for images loaded into memory. There are all these different pieces that HIPS use to detect new things coming into the kernel or coming into the process listing. You can hook these, or you could remove them. They are nothing more than a linked list of functions to call. You could remove a HIPS product from that linked list without a whole lot of problems.

Some HIPS actually hook the system call table. You have to do that, especially on Windows, if you want to filter registry key access. There's a user-land API that you can call to get a callback when a registry key or any of its children changes, but it doesn't give you many useful details. It just says, hey, something changed, and you would have to reparse all the keys to figure out exactly what it was that changed. So you'd have to sort of keep state, like before and after. That's not very useful. So HIPS, for the most part, just hook the system call table directly.

HIPS can also hook user-land API functions. How many saw the Phrack article on bypassing third-party Windows buffer overflow protection? Okay, that was done by myself and two other anonymous co-authors, who I'd like to thank, and they know who they are. So anyway, some products were hooking the user-land APIs, and we wrote some shellcode and so forth to totally bypass their implementation of buffer overflow protection at the API level.

And then you can query the kernel, of course. So if you want to know what network connections are open, you can query the kernel and it'll tell you.

Well, there's a problem with a lot of these HIPS designs, and it really goes back to the Phrack article. They have to be on the execution path of any malicious code, and they have to know ahead of time where that execution path will be. But there's no reason that we have to use the APIs in the kernel, or even in user land. So the ones that we know they're watching, we don't use those APIs.
And I'll introduce DKOM in a second, which uses absolutely no APIs to make its changes. It may use some APIs once in a while to gather information, but this is really benign information that would never be blocked anyway.

So let's talk about operating system design quickly. Intel has four privilege levels, or four rings, as you see demonstrated here. However, Microsoft and other OS vendors do not use all four of these rings. They only use two. There's ring zero, of course, which is completely privileged code: the kernel itself, plus device drivers or LKMs. Then there's ring three. Ring three is all applications. When you're running in ring three, ring three is just ring three. The HIPS doesn't have any more privilege than you do at that point if it's only implemented at ring three, which is the case for some of these Win32 buffer overflow protection mechanisms.

In ring zero, you have access to all of kernel memory. The kernel is more or less a big accounting firm. It's keeping track of all the processes, all the ports and so forth, and it does this in structures that Microsoft calls objects. These are kernel objects, not to be confused with C++ objects. So because you have this low separation of powers, with only two privilege levels, every single device driver has access to all the accounting objects that the kernel has access to. Therefore, a driver can modify them just as much as the kernel can. You just have to be very careful how you do so. You have to understand what the structures are, how to find them, what they do, and how the kernel expects them to look. If you don't understand those important details, you may modify an object in memory and cause a blue screen.

There are so many third-party device drivers. I mean, there's your printer drivers, your fax drivers, all these different things.
And there really should be different levels of protection between the two, between the kernel and the drivers.

So the next generation of rootkit technology is DKOM. It's a play on Microsoft's terminology, obviously, but in this case it stands for Direct Kernel Object Manipulation. So what does it do? It modifies objects directly in memory, never going through APIs. There are well-known APIs to, for instance, add a SID to a token in a Windows process. Well, first of all, the kernel is going to check whether you have the appropriate privilege to add that SID, and it may stop you there. Also, the HIPS or other third parties may wrap around that function's control flow, try to intercept that call, and limit you there. But if you just write directly to the memory, there's nothing to stop you.

So you can use DKOM to hide a process, add a privilege to a token, or add groups to tokens. A cool application is to fake out the Windows Event Viewer. If a process spawns another process, such as Netcat, and the administrator has detailed process logging turned on, then in the Windows Event Viewer they will see an event that says that Administrator, or whoever you're logged in as, just started a cmd or regedt32 or whatever. Well, if you change your token to look like someone else's, just by changing a couple of bytes, then the administrator will be thrown off, because you can make it look like System did it. So System just spawned Netcat. Now, obviously, that's going to be a red flag to any halfway competent administrator. However, it's not going to point at the user account that you subverted. It's going to be a red herring that they're going to have to chase for a couple of weeks, probably, before they really figure out what happened. And maybe they'll never figure it out.

Another thing with DKOM: you could use it to hide ports. Currently, the FU rootkit does not hide ports, but you can do it.
So, obviously, there are implications of hidden processes. A hidden process can potentially have full control of the system. There's no accountability. It defeats the HIDS and HIPS type of paradigm. And it can skew the results of forensics. It all depends on how advanced your forensics are, but in most cases you only have to overcome the forensic tools that are actually in use.

So how do you go about hiding a process in the Windows kernel? Here's a picture of how the objects look in memory. The kernel has a processor control block. It's represented there at the top. Within it there are three key pointers. One is to the current thread, one is to the next thread, and one is to the idle thread. You need to find your current thread, because that's going to lead you to your current process. So you go to the processor control block and follow that pointer to the current thread. That's represented there by the ETHREAD block that's sort of in the middle of the diagram. The ETHREAD block, or object, contains a KTHREAD object, completely embedded within it at the beginning of the ETHREAD object. And that contains a pointer that leads to the EPROCESS block. Every process in a Windows operating system has to have at least one thread, and it has an EPROCESS block.

These structures are, for the most part, undocumented. Some of them are now becoming a little bit more documented. However, they change a lot between service packs and also between major versions of the operating system. When I first did this two years ago, it was all reverse engineering. Now you can use WinDbg, load up the symbols for whatever version of the kernel you're using, and type dt nt!_ followed by the object name. And if you don't know what the object name is, you can wildcard it to try to narrow it down. I will tell you that not all the object members are documented. Microsoft may just say, oh, it's a PVOID, and so you don't know what it actually leads to.
And of course, that takes a bit more time to figure out.

So within the EPROCESS block, there's a LIST_ENTRY structure. It contains two pointers. The first one's called Flink, and the second one's called Blink. A Flink is a forward link, and a Blink is a backward link. So if you just keep following the Flinks, this list of EPROCESS blocks is completely circular. You could also go backwards through the list: you'd dereference the Blink and then add four to get to the Blink of the entry behind you, and keep following the chain that way.

So if you want to hide a process, the kernel processor control block is located at a fixed address. If you don't trust that, because I haven't verified it since Windows 2000, you can use the FS register to always get the address of that kernel processor control block. Also, we talked about how you can get the currently running thread from there, and there's the assembly instruction. You would just do a mov into, like, EAX from fs:[0x124], and that gets you to an ETHREAD. The ETHREAD takes you to the KTHREAD, and the KTHREAD takes you to the EPROCESS block.

So to hide a process, all you need to do is change the EPROCESS block behind you to point to the EPROCESS block in front of you, and the EPROCESS block in front of you to point to the EPROCESS block behind you. And I think I have a graphic demonstrating that. So, I don't have a pointer, but we're trying to hide the current process. It's the EPROCESS block that's in the center. You would want to modify the Flink behind us to point to the entry in front of us, and that's represented there. Once you do that, half the game's done. Now you have to change the Blink in front of us to point to the entry behind us, like that. So you just tell your neighbors, hey, you know, point around us.

So why does this continue to run?
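As a rough illustration of the pointer swap just described, here is a user-space model of a circular doubly linked list in Python. It is only a sketch of the data-structure manipulation, not kernel code: the `ListEntry` class and its helpers are invented for this example, and a real EPROCESS block embeds its links at a version-specific offset that is not modeled here.

```python
from typing import List


class ListEntry:
    """User-space stand-in for a list node with forward and backward links."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flink: "ListEntry" = self  # forward link
        self.blink: "ListEntry" = self  # backward link


def insert_tail(head: ListEntry, entry: ListEntry) -> None:
    """Append entry just before head, keeping the list circular."""
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry


def walk_forward(head: ListEntry) -> List[str]:
    """Follow the forward links once around the circle."""
    names, node = [], head.flink
    while node is not head:
        names.append(node.name)
        node = node.flink
    return names


def walk_backward(head: ListEntry) -> List[str]:
    """Follow the backward links once around the circle."""
    names, node = [], head.blink
    while node is not head:
        names.append(node.name)
        node = node.blink
    return names


def unlink(entry: ListEntry) -> None:
    """Tell both neighbours to point around the entry."""
    entry.blink.flink = entry.flink  # neighbour behind now skips us going forward
    entry.flink.blink = entry.blink  # neighbour in front now skips us going backward
    # Pointing the removed entry at itself is a choice made for this sketch,
    # so that a stale link never leads back into the live list.
    entry.flink = entry
    entry.blink = entry
```

After `unlink`, any walker that enumerates the list by following links never reaches the removed entry in either direction, yet the entry itself still exists in memory.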
Well, it turns out that the scheduler in Windows is extremely advanced, and it's thread-based. The reporting mechanisms, however, are process-based. So you can edit this EPROCESS block, and all your threads still get scheduled, because they're off in their little thread queues, and so they'll continue to run.

I'll just cover this next part really quickly, because Linux isn't my strong point. I haven't looked at it since it came out with the new threading model in Linux 2.6. But within the Linux kernel, you could do a similar thing. However, there's sort of a gotcha at the end. The Linux kernel contains an array of task structs. Each task struct represents a process. A task struct contains a linked list with the pointers previous task and next task, which you can also follow around in a circle. And it contains two other vital pointers, previous run and next run, which determine the ordering of when to schedule a process. So to hide a process, you just remove it from the previous task and next task list, like I demonstrated with the EPROCESS block. Exact same concept. You leave next run and previous run alone.

Does anybody know what happens when you get this far? What? Speak up. Okay, the whole process freezes. So, nothing. Your process does not continue to run. The reason is the way that processes get scheduled, at least on the Linux 2.4 kernel. The scheduler walks around the previous run and next run list. I'm sorry, it follows the previous task and next task list. It walks around this linked list, going to each task structure in memory, and it calculates a goodness value for the process. Based upon the goodness value, it assigns it an order, or the number of jiffies and so forth, in which to run or to get CPU time. So that's when it puts it into previous run and next run, in that ordering.
Well, if you're never in the next task listing, then you'll never get your goodness value calculated, so you'll never get scheduled again, because the scheduler can't see you. This is probably all different in 2.6 with the new threading model.

After I hit this problem, there was a rootkit that I found online that was open source. Phantasmagoria, I believe, is the name. This person had come across the exact same problem I was having, and the way they were going to fix it was to patch the scheduler. What the solution did was this: the init process is always the first process, so it has no parent process. So the parent process pointer in its task struct is null. Well, they modify that to point to the hidden task struct. Then they modify the scheduler to always go there, look at these hidden task structs, and calculate goodness values for them. But if you go through all that effort, there are better ways to hide your process.

It should be noted that you can't just change these lists arbitrarily and not expect some problems. These lists are shared resources, especially in a multi-threaded environment, and especially in a multi-CPU environment. So you have to use some kind of synchronization object to get access to these lists if you want to modify them in a completely safe manner. In Windows, if you want to modify the list of active processes, there's a non-exported symbol called PspActiveProcessMutex. The kernel grabs that mutex every time, and once it has it, it can modify the list. PsLoadedModuleResource guards the list of device drivers. The problem is, as I said, these symbols are not exported. So we have to find another way to come up with synchronization.
One way you could do it is to hard code the addresses of these mutexes, but that would vary a lot between versions of the OS, and potentially each reboot could also have an effect here. This probably isn't the best solution. The other way you could go about it is to search for a known pattern in memory. You know that there are going to be certain functions within the kernel that have to access these mutexes in order to operate safely. So you could do a search, and Sherri Sparks from the University of Central Florida has worked with me on this. She wrote all the code to search for the pattern matches in memory in order to gain synchronization.

Another technique that we're playing around with is raising the IRQL on the processor to DISPATCH_LEVEL. If you go to DISPATCH_LEVEL, then you can never get swapped out. You'll continue to run until you give up control, because you run at more or less the same level as the scheduler itself. However, one call to this function only raises it for that particular CPU. So you would have to find another way to raise it on every single CPU. We think we may have a solution for that, and we just haven't coded it up to test it yet.

On to token manipulation. To add a privilege to some process, you can go and manipulate the token. It would be something like SeDebugPrivilege or SeShutdownPrivilege or SeLoadDriverPrivilege, for instance. You can also add groups to a token. The groups represent the SIDs that that process belongs to. Also, the first SID of the token is the owner of the process. That's important, and we'll talk about it in a moment. By altering some of these fields, you can make a process look like someone else launched it.

So how is this possible? You have to understand the structure of the tokens a bit. There's a static portion and there's a variable-length portion.
The static portion, obviously, is a fixed size. It holds important things such as the session ID. There's a pointer to the user and groups. There's a pointer to the privileges, and there's the number of privileges and groups. So we'll need to modify these things and also alter the variable-length portion. Because a token can have different numbers of privileges and so forth, and it may not be the same for every token, the privileges are put in the variable-length portion. The same goes for users and groups and restricted SIDs. I won't cover restricted SIDs. It was something that was added in Windows 2000, I believe, and I haven't seen much use of it. But you could restrict certain users from performing certain actions with restricted SIDs.

So one of the problems I ran into in the FU rootkit was that you cannot just grow a token in memory. In other words, you cannot just write past the end of it, because you don't know what's below your token. FU doesn't take the restricted SIDs into account, so you would sort of just write over the restricted SIDs, which probably isn't a good idea. Also, you don't know what's at the end. There could be another process's token, and so on and so forth. The memory may not be valid. You could cause a page fault or potentially a blue screen.

So one way I tried to solve this was to just allocate memory, and that turned out not to work. What I did was allocate enough memory to hold the new privileges and the new SIDs, and then I just changed the pointers in the static portion of the token to point to that region of memory. And that worked for the current process. But if that process ever spawns a child process, the way that SepDuplicateToken works is that it takes the parent process's token and tries to recreate that in the child process. So it's an inheritance. But the way it works, it actually has a subtraction in there that I can't figure out.
They assume they know where the token is in memory, or what's right below it, and so on and so forth. So they know some things about the token structure that I don't, and that prevented this mallocing from working. If they would just follow the pointers, which seems like the obvious way to do it to me, I wouldn't have had any trouble when I tried that method.

So the way I got around this was to look at the privileges already in the token. There are a lot of privileges within a token that are disabled by default, and if they aren't even turned on, then I probably don't need them, so let's just throw them away. I used this method, which I called the inline method, to free up some slack space within the token so I can write some valuable information there.
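The inline idea, reusing slots that hold disabled privileges instead of growing the array, can be modeled in a few lines. This is a hedged user-space sketch in Python, not FU's code: `PrivilegeSlot` and `add_privileges_inline` are hypothetical names, the flag constant mirrors the usual "enabled" attribute bit, and the numeric privilege identifiers used with it are purely illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, List

SE_PRIVILEGE_ENABLED = 0x2  # attribute bit meaning "this privilege is turned on"


@dataclass
class PrivilegeSlot:
    """One fixed-size entry in a token's privilege array (model only)."""
    luid: int        # identifier of the privilege held in this slot
    attributes: int  # flag bits; a slot without ENABLED counts as slack space


def add_privileges_inline(slots: List[PrivilegeSlot], wanted: Iterable[int]) -> List[int]:
    """Place wanted privileges into existing slots without growing the array.

    Returns the identifiers that could not be placed because no disabled
    slot was left to reuse.
    """
    wanted = list(wanted)
    unplaced: List[int] = []
    for luid in wanted:
        existing = next((s for s in slots if s.luid == luid), None)
        if existing is not None:
            # Already present: just switch it on in place.
            existing.attributes |= SE_PRIVILEGE_ENABLED
            continue
        # Reuse a slot whose privilege is disabled and not one we are adding.
        victim = next(
            (s for s in slots
             if not (s.attributes & SE_PRIVILEGE_ENABLED) and s.luid not in wanted),
            None,
        )
        if victim is None:
            unplaced.append(luid)
            continue
        victim.luid = luid
        victim.attributes = SE_PRIVILEGE_ENABLED
    return unplaced
```

The number of slots never changes, so nothing is written past the end of the array; the trade-off is that the method can only add as many privileges as there are disabled entries to sacrifice.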