Good morning. Welcome to DEF CON. Today I will talk about macOS and iOS kernel debugging and heap feng shui. Here is the outline. First I will introduce myself and explain what the kernel is. Then I will show how to do kernel debugging on macOS and then on iOS. Then I will use these debugging techniques to debug a real kernel heap overflow. Last but not least, I will introduce two methods for heap feng shui. Okay, let's start.
So who am I? My name is Min Zheng. You can call me Spark. I'm a security expert at Alibaba, I'm a PhD from CUHK, and I'm a member of Blue-Lotus and Insight-Labs. I worked at FireEye, Baidu and Tencent before. Our team has achieved a private jailbreak of iOS 9, and we will show more details in the future. Here are my Weibo and Twitter accounts. You can follow me and ask me questions there. This work has a co-author, Liu Xiangyu. He is my colleague and a security engineer at Alibaba. I also want to give special thanks to my friends who helped me with this work.
So the topic is kernel debugging. What is the kernel? The kernel is XNU. XNU is the operating system kernel developed by Apple. It is open source software and part of the Darwin operating system. The name is an abbreviation of "X is Not Unix". We know that XNU for macOS is open source. It can be compiled and debugged. However, XNU for iOS is not open source. You cannot compile it or debug it officially, but most of the implementation is the same as on macOS. We can still use some tricks to do kernel debugging on iOS too, and I will introduce them later.
Let's talk about macOS debugging first. A wise man once said that to do a good job, you must first sharpen your tools. So we need to buy some equipment. First we need two MacBooks, or one MacBook with a virtual machine. The system versions can differ, which means you can, for example, debug macOS 10.10 from macOS 10.11. You also need to buy some adapters or cables, like a Thunderbolt-to-FireWire adapter or a FireWire cable.
After we have the equipment, we need to install the KDK (Kernel Debug Kit) on the two MacBooks. On the host MacBook, we need to run a command called fwkdp. On the debugged MacBook, we need to copy the development kernel from the KDK to the /System/Library/Kernels folder and execute the command shown on the slide. This command sets the boot environment and reboots the debugged MacBook. After rebooting, the host MacBook can start debugging in LLDB with the command kdp-remote localhost. Note that we get the kernel slide immediately. It is very useful for us when debugging the kernel.
We can also use commands like image list to get the kernel addresses of kernel extensions. Note that image list only shows some of the kernel extensions; if we want all of them, we need other ways. Just like in GDB, we can use x/nx to read data in the kernel.
We also have three ways to stop in the kernel. First, we can use b *address to set a breakpoint in LLDB. We need to add the kernel slide to the static address to calculate the breakpoint address. Another way to pause the debugged machine is to use a shortcut key. You press Command + Alt + Control + Shift + Esc all at once, and it will pause the debugged machine immediately. We can also set breakpoints in the XNU source code through int $3 and print kernel information through printf. Note that if you debug the kernel this way, you need to recompile the kernel and then put it on the debugged machine.
We can also use the command script import to load Python scripts into LLDB. It is very useful because these scripts can give me a lot of information about the kernel. For example, you can use zprint to print the zone information, and you can use showzfreelist to show the elements in a zone's free list. There is another command called showallkexts. This command gets the kernel addresses of all the kernel extensions.
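The breakpoint address calculation mentioned above (static address plus kernel slide) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the static address and slide are made-up example values, and the sketch emits LLDB's long-form `breakpoint set --address` command rather than the short form used in the talk.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def slid_address(static_address: int, kernel_slide: int) -> int:
    """Return the runtime address: static (unslid) address plus the KASLR slide."""
    return (static_address + kernel_slide) & MASK_64


def breakpoint_command(static_address: int, kernel_slide: int) -> str:
    """Build an LLDB command string that sets a breakpoint at the slid address."""
    return f"breakpoint set --address {slid_address(static_address, kernel_slide):#x}"


if __name__ == "__main__":
    # Hypothetical values: an address read from the static kernel binary
    # and a slide as reported by the debugger after attaching.
    print(breakpoint_command(0xFFFFFF8000200000, 0x12A00000))
```

The same addition applies to any address taken from a disassembler when the target kernel runs with a non-zero slide.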
In addition, you can implement your own Python scripts. You can find some examples in the KDK's Python folder.
So let's talk about iOS kernel debugging. We know that there are no official tools for iOS kernel debugging, but we can use some tricks to do it. Before we debug the iOS kernel, we need to get the kernelcache. Unlike on macOS, we need to decrypt the kernelcache. Before iOS 10, the kernelcache was encrypted. You can find the keys on The iPhone Wiki website and then decrypt the kernelcache. Since iOS 10 there is no encryption, but the kernelcache is still encoded. So we need to unzip the firmware and decode the kernel using img4tool. After that, you can extract kernel information with joker and IDA.
Note that there are no breakpoints on iOS, so the most common way to get register values is to use the panic log. Something you should pay attention to: if there are too many panic logs on your phone, the system will stop generating new ones. So if you want to debug your iOS system, you need to delete the old panic logs when there are too many. You can use these two methods to do that. One is for jailbroken iOS. The other is for non-jailbroken iOS.
Although there is no KDK for iOS, we can still use the kernel task port to do arbitrary kernel memory read and write. There are two userland APIs: one is called mach_vm_read, the other is mach_vm_write. But what if we don't have the task_for_pid patch, or no jailbreak? What should we do? Well, you first need a kernel vulnerability. Then you can use this vulnerability to get the kernel task port. Then you can store the kernel task port in a host special port. Then you can use a userland API called host_get_special_port to get the kernel task port in userland. And then you can use mach_vm_read and mach_vm_write to do kernel read and write from userland. After getting the kernel task port, the next step is to figure out the kernel text base and the slide.
On ARM32 it's easy because there are only 256 potential locations for the kernel slide. But on ARM64 it's not easy. We need to do something like this: first create an OSObject in the kernel. Then we need to find its vtable pointer, which points into the kernel base region. Then we search backwards from the vtable address until we find the kernel header. For the source code, you can refer to the iOS kernel utilities project. After getting the kernel slide, we can get root privileges for our applications through kernel read and write. And then we can use offset plus kernel slide to find the kernel object addresses of related ports in memory. This is very useful for debugging Mach ports in the XNU system.
So now I will show how to use these debugging techniques to debug a real kernel heap overflow. This vulnerability exists in mach_voucher_extract_attr_recipe_trap. This is a new function added in iOS 10 and macOS 10.12. That's why there's no jailbreak for iOS 9.3.5 from this bug: the function does not exist in that version. It's a new function in iOS 10. This Mach trap can be called inside the sandbox, so we can attack the kernel from inside the sandbox.
This function first uses copyin to get the recipe size from userland and saves this size to sz. Then it uses kalloc to allocate a block of memory of sz bytes. And then it uses copyin to copy the data from userland to the kernel. However, the developer forgot that recipe_size is a user-mode pointer and used it as the size value in the second copyin. We know that a user-mode pointer may be much larger than the size value, so it may cause a heap overflow.
If we want to debug this vulnerability, we can set breakpoints before and after the copyin. Their addresses are calculated as the offset in the kernelcache plus the kernel slide. So as you can see, before the heap overflow we can find the 0xdeadbeef marker.
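The size/pointer confusion described above can be modelled in user space. The following Python sketch is only a simulation under stated assumptions, not the kernel code: the heap is a flat bytearray holding two adjacent 256-byte blocks, user memory is a dictionary, and the names (`make_heap`, `vulnerable_copy`) and the pointer value 0x120 are hypothetical. It shows why using the pointer's numeric value as the copy length overflows into the next block.

```python
BLOCK_SIZE = 256


def make_heap() -> bytearray:
    """Two adjacent blocks: a free one marked with a deadbeef pattern, a used one full of 0xFF."""
    free_block = bytes.fromhex("deadbeef") * (BLOCK_SIZE // 4)
    used_block = b"\xff" * BLOCK_SIZE
    return bytearray(free_block + used_block)


def vulnerable_copy(heap: bytearray, user_memory: dict, recipe_size_ptr: int, recipe: bytes):
    """Model of the flawed logic: allocate sz bytes, then copy recipe_size_ptr bytes."""
    sz = user_memory[recipe_size_ptr]      # first copy: read the intended size
    if sz > BLOCK_SIZE:
        raise ValueError("model only covers allocations that fit the first block")
    copy_len = recipe_size_ptr             # BUG: the pointer value is used as the length
    n = min(copy_len, len(recipe), len(heap))
    heap[:n] = recipe[:n]                  # second copy: writes past the sz-byte block
    return sz, n


if __name__ == "__main__":
    heap = make_heap()
    user_memory = {0x120: 256}             # the size value lives at user address 0x120
    recipe = b"A" * 256 + b"B" * 32
    sz, copied = vulnerable_copy(heap, user_memory, 0x120, recipe)
    print(f"allocated {sz} bytes, copied {copied} bytes")
    print(heap[256:288])
```

In this model the attacker picks the user address of the size value so that its numeric value equals the allocation size plus the desired overflow length, here 256 + 32.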
The 0xdeadbeef means this block of memory is unused, but the next block of memory is in use and filled with FF. After we trigger the heap overflow, we can see that the first block is full of our data, the character A. And the next block of memory is overflowed from the first block. It is overflowed by 32 bytes of the character B. So now we have the ability to overflow with arbitrary data, but we need to find a way to do kernel read and write. So we need to do some feng shui.
There are two ways to do heap feng shui on iOS 10 and macOS 10.12. I will introduce the first one. We know that on iOS 10 and macOS 10.12 we cannot use the classic vm_map_copy technique, which does heap feng shui by changing the vm_map_copy size, because Apple added a mitigation against the free-into-the-wrong-zone attack: if you change the size of the vm_map_copy, the kernel will panic. So we need to find a way to avoid that.
Luckily, Ian Beer of Google Project Zero proposed a new way to do heap feng shui through preallocated Mach ports. The basic idea is to use mach_port_allocate_full to allocate ipc_kmsg objects in kernel memory. The object contains a size field that we can craft, and crafting it does not require us to craft any pointer in kernel memory, so the kernel will not panic. By using the exception port, we can send and receive data in kernel memory, and this data will not be freed. That is very important. The data we send is the crash state: the register values of the crashing thread. So if we want to send information to the kernel, we need to create a thread, set the register values we want to send, and then crash the thread. The data will be written to the address of the ipc_kmsg object plus ikm_size minus 0x104.
So why is the number 0x104? We can use kernel debugging to figure it out. First we use kernel debugging to find the address of the preallocated port buffer in memory. Then we can trigger the exception and send the data to the kernel.
After that we can use kernel debugging, for example a kernel memory read, to inspect the data in the buffer. As you can see, the location of the data in the buffer is 0xD3C, and we set the value of ikm_size to 0xE40. The difference is 0x104. So that's why the number is 0x104.
So now we have the object to do the heap feng shui. The next step is to rearrange the kernel memory. First we allocate 2000 preallocated ports to ensure that the following ports are contiguous. Then the attacker allocates three ports. One is the holder. The second is the first port. The third is the second port. All of them are in the kalloc.4096 zone. Then the attacker frees the holder and uses the overflow vulnerability to overflow into the first port. The overflow data contains ikm_size and the other fields of the ipc_kmsg object. The key point is to set ikm_size to 0x1104. So why 0x1104? We can do a simple calculation: the address of the first port plus 0x1104 minus 0x104 is the address of the second port, which means we can control the second port through the first port.
Now that we can control the data of the second port, the next step is to get the address of the second port, which means we need a heap information leak. So how do we do that? First, the attacker uses the first port to give the second port a valid header. Once the second port has a valid header, you send a message to the second port, and the second port's ikm_next and ikm_prev will be set to point to itself. That means if you then receive the data from the kernel and look at the contents of the second port, you can figure out the location of the second port in kernel memory.
After getting the heap address, the next step is to get the kernel slide. But before we do that, we need to safely free the second port. The reason is that if you don't, the system will panic because the kernel detects the memory corruption.
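The two offset calculations above can be checked with a short Python sketch. The values 0xE40, 0xD3C, 0x104 and 0x1104 come from the talk; the function names and the example port address are hypothetical, and the 0x1000 spacing between the two ports is an assumption based on both living in adjacent kalloc.4096 elements.

```python
WRITE_OFFSET_FROM_END = 0x104   # constant derived in the talk
ELEMENT_SIZE = 0x1000           # assumed spacing of adjacent kalloc.4096 elements


def derive_offset(ikm_size: int, observed_data_offset: int) -> int:
    """Distance between the configured size and where the data was observed."""
    return ikm_size - observed_data_offset


def message_write_address(kmsg_address: int, ikm_size: int) -> int:
    """Where the exception message lands: kmsg + ikm_size - 0x104."""
    return kmsg_address + ikm_size - WRITE_OFFSET_FROM_END


if __name__ == "__main__":
    # Debugging result: size set to 0xE40, data found at offset 0xD3C.
    print(hex(derive_offset(0xE40, 0xD3C)))

    # Hypothetical address of the first port's buffer.
    first_port = 0xFFFFFF8020001000
    second_port = first_port + ELEMENT_SIZE
    # With a corrupted size of 0x1104, writes through the first port
    # land exactly at the start of the second port's buffer.
    print(message_write_address(first_port, 0x1104) == second_port)
```

With the original size of 0xE40 the write stays inside the first buffer, at offset 0xD3C; only the corrupted size moves it into the neighbouring element.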
After freeing the second port, the attacker can allocate a user client to take the spot of the second port. Then the attacker can read the user client object through the first port. Note that the first eight bytes of the user client object are the vtable address. That means you can compare the vtable address in the kernelcache with this runtime vtable address and calculate the kernel slide. In this case, the kernel slide is 1BC000000.
After that, the attacker can create a ROP chain which can be used to do arbitrary kernel memory read. He can use OSSerializer::serialize with uuid_copy. In this way, the attacker can copy data from any address to the kernel buffer base plus 0x48, and then use the first port to get this data back to user mode. If he swaps X1 and X0, he gets kernel memory write, because one direction does the read and the other does the write.
So we have the ROP chain. The next step is to trigger it. We can trigger this ROP chain through IOConnectGetService. This method will invoke the getMetaClass, retain and release methods. So we can create a fake vtable and send it to the second port. Then we can use IOConnectGetService to trigger the ROP chain we mentioned before and do the kernel read and write. Once we have kernel read and write, we can do the kernel patching. For the latest public kernel patching techniques, you can refer to Yalu.
Note that this traditional feng shui is not stable, because it needs to be done multiple times and uses a lot of ROP chains. It only has about a 50% success rate. So if we want a high success rate, we need to use port feng shui.
So what is a port? We know that the Mach port is the most frequently used IPC mechanism in XNU. And we can use complex messages to send out-of-line (OOL) ports to the kernel, which means we can send port objects to the kernel. In this case, we will send 32 Mach ports to the kernel.
Each port uses 8 bytes, and 8 bytes multiplied by 32 is 256, so the data will go into the kalloc.256 zone. Note that the out-of-line ports saved in a Mach message are ipc_object pointers, and such a pointer can point to a user-mode address. Therefore, the attacker can overflow those pointers and modify one to point into user mode. Then we can control this port from user mode and also create a fake task for this fake port.
So how do we overflow the right port? The right port is the key. So we need to do some feng shui to rearrange the kernel memory. The first thing we should do is send lots of OOL ports to the kernel to ensure the newly allocated blocks are contiguous. Then the attacker receives some messages in the middle to dig some slots. Then the attacker sends some messages again to put the overflow point in the middle of the slots. After that, the attacker can trigger the overflow vulnerability at the overflow point. Then we receive all the ports from the kernel and check the value of each port. As we mentioned, we send dead ports to the kernel, with the value FFFF. If the value has changed to some other value, it means this port was overflowed by us, and we can control this port from userland.
So if we can control a port in the kernel, how can we use this port to do arbitrary kernel memory read? We need to set the io_bits of the fake ipc_object to IKOT_TASK and craft a fake task for the fake port. By setting the value at fake task plus 0x380, the attacker can use pid_for_task to do arbitrary memory read. It is amazing because the function doesn't check the validity of the task and just returns the value at that address. As you can see in this function, it will use port_name_to_task to get the fake task, and the fake task is controlled by us. Then it will use get_bsdtask_info to get the information. So the function will use a1 plus 0x380 to get the data at that address. And this address is controlled by us.
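The read primitive described here can be modelled as a double dereference. The Python sketch below is a simulation only, under stated assumptions: memory is one flat bytearray instead of separate user and kernel address spaces, the class and function names are hypothetical, and the offsets 0x380 and 0x10 are the ones quoted in the talk for that particular kernel build.

```python
import struct

BSD_INFO_OFFSET = 0x380   # offset quoted in the talk: task + 0x380 holds the proc pointer
PID_OFFSET = 0x10         # offset quoted in the talk: proc + 0x10 holds the pid


class FlatMemory:
    """A toy address space backed by a bytearray."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.data = bytearray(size)

    def _offset(self, addr: int, length: int) -> int:
        off = addr - self.base
        if off < 0 or off + length > len(self.data):
            raise ValueError(f"address {addr:#x} is outside the model")
        return off

    def write(self, addr: int, payload: bytes) -> None:
        off = self._offset(addr, len(payload))
        self.data[off:off + len(payload)] = payload

    def read(self, addr: int, length: int) -> bytes:
        off = self._offset(addr, length)
        return bytes(self.data[off:off + length])


def pid_lookup_model(mem: FlatMemory, task_addr: int) -> int:
    """Model of the unchecked lookup: pid = *(u32 *)(*(u64 *)(task + 0x380) + 0x10)."""
    proc = struct.unpack("<Q", mem.read(task_addr + BSD_INFO_OFFSET, 8))[0]
    return struct.unpack("<I", mem.read(proc + PID_OFFSET, 4))[0]


def read32_via_fake_task(mem: FlatMemory, fake_task_addr: int, target_addr: int) -> int:
    """Point the fake task's proc field at target - 0x10 so the 'pid' is the target's data."""
    mem.write(fake_task_addr + BSD_INFO_OFFSET, struct.pack("<Q", target_addr - PID_OFFSET))
    return pid_lookup_model(mem, fake_task_addr)


if __name__ == "__main__":
    mem = FlatMemory(base=0x1000, size=0x2000)
    mem.write(0x2000, struct.pack("<I", 0xCAFEBABE))   # the value we want to leak
    print(hex(read32_via_fake_task(mem, fake_task_addr=0x1000, target_addr=0x2000)))
```

Each call leaks 32 bits, so reading a 64-bit kernel pointer in this model takes two calls at consecutive 4-byte addresses.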
It will also read the process PID from a1 plus 0x10, and this address is also controlled by us. So we can read the data at any address, and we get the arbitrary kernel read ability. Then we can dump the kernel ipc_object and the kernel task and copy them into our fake ipc_object and fake task. Then we can use a userland API called task_get_special_port to get the kernel task port. Then we can use this kernel task port to call the two APIs, mach_vm_read and mach_vm_write, to do arbitrary kernel read and write.
So here is the conclusion. We talked about kernel debugging. It is very useful for kernel exploit development. And we introduced two heap feng shui techniques. One is the traditional heap feng shui. It needs ROP chains to do kernel memory read and write, and it needs to be done multiple times, so it's not stable. Port feng shui doesn't need any gadgets and only uses data structures. It's stable, with a high success rate, but it's very easy for Apple to fix. So that's all for my talk. Thank you for listening. Do you have any questions? Sorry, I took a long time.