part of the checkra1n project. They investigate jailbreaks for iOS devices, for the iPhone. iPhones are usually locked down and can't be rooted, as we call it, but technical users try to get root access from time to time anyway. There's a continuous cat-and-mouse game between people who find jailbreaks and Apple fixing them again. This jailbreak is a particular one because it's not fixable. qwertyoruiop will introduce us to their work and explain what the problem is and what they found out. Please, a warm round of applause for qwertyoruiop.
Thank you. So today I will be speaking about the one weird trick that SecureROM hates. This talk was already given at POC 2019, on the occasion of the planned release of checkra1n. We missed that deadline, but it is now available for download and free use at checkra.in.
I am Luca Todesco, aka qwertyoruiop, and you can find me idling on irc.cracksby.com in the chat channel. On Twitter you can find me at @qwertyoruiopz. I'm an independent security researcher with an iOS focus, and my main background is that I've been hacking iOS and have been involved in the iOS hacking scene ever since I discovered iPhones a decade ago.
However, today I am here on behalf of more than just me, because this talk would not be possible without the work that several people have put in. I want to give a special shout-out to axi0mX, who is the person who published the vulnerability that we are abusing, and also to littlelailo, who had it independently, and to Siguza for the exploitation strategies. I also want to thank the entire checkra1n team, because we worked for months on end to get the infrastructure ready in order to turn such a powerful vulnerability into an easy-to-use and reliable jailbreak for everyone. The checkra1n team is composed of an all-star group of iOS researchers. There are more than just these, but these are the main contributors.
The knowledge base of this group is pretty impressive, and it's possibly the best team that I've worked with in my entire lifetime. So I think it's really important to give them a shout-out. I'm very grateful for what we achieved and the work we put in. Thank you.
So we'll start by talking about what SecureROM is. SecureROM is the very first code that runs on your iPhone once you turn it on. It's a stripped-down version of iBoot, which is the iOS bootloader. This code is very important because it's placed into mask ROM while your phone is being manufactured, and it's basically the most trusted piece of code that can ever run on the application processor of your phone. The goal of SecureROM is to fetch an image of the full bootloader, iBoot, from flash storage, load it into memory and jump into it. So it's very simple, but it does a bunch of signature validation. It also provides a recovery mechanism known as DFU, which allows you to recover a phone from an unbootable state, or just do a complete restore of the device, without relying on the bootloader to be sane. You can enter this mode by pressing a special key combination while the phone is turning on.
I'm also going to mention secure boot a bunch during this talk. iOS truly pioneered the concept of secure boot. It's one of the most long-running issues in the jailbreak community: the ability to load unsigned code. It all starts from the root of trust in SecureROM. Every time you load an image in SecureROM, it performs validation, except on the original iPhone, where this was not the case for images that were already flashed into storage. As I mentioned, this is jailbreaking enemy number one, and jailbreaks want to bypass it.
The DFU protocol is a fairly simple protocol implemented over USB control transfers. Again, the goal is to allow the device to get some data from the host.
Since this is a stripped-down version of the bootloader, it's designed to be very simple. Basically, there is a special request that you can send to the device with some data appended to it. Once the transfer is done, you can do a sequence of other requests, which will tell the device to boot into the image that you uploaded. There are, however, other ways to exit DFU, including a special request called DFU abort, which will shut down the USB stack and reinitialize it from scratch.
As I mentioned, this protocol is implemented over USB control transfers, so I will go through what exactly a control transfer is. Basically, it's used to issue commands from the host to the device. You can send data with your request or ask the device to provide you with some data. Successful USB enumeration requires compliant devices to support a minimum set of these control transfers. Each control transfer starts with a setup packet, which contains some information on the transfer that's about to happen. So we have bmRequestType, which specifies which direction the data has to go, bRequest, which is an index of what kind of request you're trying to do, and some other fields. But in our case, the most interesting one is wLength, because if wLength is set to be non-zero, the setup packet is going to be followed by a so-called data phase. During this data phase, the host will send data in a chunked fashion to the device. As I mentioned earlier, this is the mechanism that allows your host to upload files to a DFU device. If you specify a direction of IN, however, the transfer will instead get data from the device. Again, the maximum length of the transfer is specified in the wLength field. When you send this data phase, the data that you want to transfer is chunked into packets of sizes ranging from 8 bytes to 64 bytes, depending on what USB speed you're running at.
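To make the setup packet and the chunked data phase concrete, here is a minimal sketch in Python. The 8-byte layout and the DFU request numbers come from the public USB and DFU specifications. The helper names, the 0x800-byte payload and the fixed 64-byte packet size are assumptions chosen for illustration, not details taken from the talk.

```python
import struct

# Standard DFU class request numbers (USB DFU 1.1 specification).
DFU_DNLOAD = 1
DFU_ABORT = 6

# bmRequestType: host-to-device, class request, interface recipient.
HOST_TO_DEVICE_CLASS_INTERFACE = 0x21


def build_setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the 8-byte setup packet; multi-byte fields are little-endian."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


def split_data_phase(data, max_packet_size=64):
    """Chunk a data phase into packets of at most max_packet_size bytes."""
    return [data[i:i + max_packet_size]
            for i in range(0, len(data), max_packet_size)]


# Hypothetical upload: a non-zero wLength announces a data phase.
payload = bytes(0x800)
setup = build_setup_packet(HOST_TO_DEVICE_CLASS_INTERFACE, DFU_DNLOAD,
                           0, 0, len(payload))
packets = split_data_phase(payload)
print(setup.hex(), len(packets))
```

A DFU abort would use the same helper with `DFU_ABORT` and a `w_length` of zero, so no data phase follows it.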
The packets are sent sequentially. Once the transfer is finished, a status phase follows. This status phase marks the end of the control transfer, and a zero-length packet is sent. In the iBoot USB stack, a temporary buffer is allocated in order to receive all these chunks, and the data gets memcpy'd into the right place every time such a chunk is received by your device.
USB and DFU work side by side, because as soon as you enter DFU, the USB stack gets turned on. During the turn-on, this temporary buffer is allocated. Once a control transfer is issued, a pointer to this buffer is copied into a global variable, and the USB stack uses this global variable to figure out where in memory to put the data that it just got. However, when you exit DFU, the USB stack is torn down and the buffer gets freed.
And so we get to the interesting part, which is our bug. Essentially, when we free this buffer, the global variable holding the destination of the data write is not nulled out, and this can cause a use-after-free type of vulnerability.
So I'll recap the steps to trigger this. We start a USB control request that has an associated data phase. As the data phase is being sent out, we interrupt sending packets and issue a new USB control request. In this case, we issue a USB DFU abort, which exits DFU, turning off the USB stack with it and freeing the buffer. After DFU exits, it tries to load whatever data was uploaded, but we didn't send any legitimate, valid data, so the image loading fails and DFU is re-entered. Once DFU is re-entered, the USB stack is available to be used again, but our dangling pointer is still there. If we send a data phase at this point in time without an associated setup packet, the pointer is reused from the previous setup packet that we sent during the earlier DFU stage.
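The trigger sequence recapped above can be modelled with a toy allocator. This is only a sketch under loud assumptions: the class, its method names, the addresses and the buffer size are all hypothetical, and the step where another object lands in the freed slot is forced here for illustration. It shows the shape of the bug (a global write pointer that outlives the buffer it points to), not the real SecureROM code.

```python
class ToyDfu:
    """Toy model of the DFU I/O buffer lifecycle; not real SecureROM code."""

    IO_BUFFER_SIZE = 0x800  # hypothetical size

    def __init__(self):
        self.heap = {}          # address -> bytearray, live allocations only
        self.freed = []         # (address, size) of freed blocks
        self.next_addr = 0x1000
        self.io_buffer = None
        self.write_ptr = None   # global: where data phase bytes get copied

    def malloc(self, size):
        for i, (addr, freed_size) in enumerate(self.freed):
            if freed_size == size:      # reuse a freed block of the same size
                del self.freed[i]
                break
        else:
            addr = self.next_addr       # otherwise carve out fresh memory
            self.next_addr += size
        self.heap[addr] = bytearray(size)
        return addr

    def free(self, addr):
        self.freed.append((addr, len(self.heap.pop(addr))))

    def usb_init(self):
        self.io_buffer = self.malloc(self.IO_BUFFER_SIZE)

    def setup_packet(self, w_length):
        if w_length:                    # a data phase will follow
            self.write_ptr = self.io_buffer

    def data_packet(self, data):
        target = self.heap.get(self.write_ptr)
        if target is not None:
            target[:len(data)] = data
        return self.write_ptr

    def dfu_abort(self):
        self.free(self.io_buffer)
        self.io_buffer = None
        # Modelled bug: self.write_ptr still points at the freed block.


dfu = ToyDfu()
dfu.usb_init()
dfu.setup_packet(0x800)                 # data phase announced, then interrupted
stale = dfu.write_ptr
dfu.dfu_abort()                         # buffer freed, pointer left dangling
victim = dfu.malloc(0x800)              # assumption: another object takes the slot
dfu.usb_init()                          # DFU re-enters with a buffer elsewhere
written_at = dfu.data_packet(b"AAAA")   # data phase without a new setup packet
print(hex(written_at), bytes(dfu.heap[victim][:4]))
```

In this model the stray data lands in `victim` rather than in the new I/O buffer, which is the outcome the heap-shaping work later in the talk is trying to achieve on real devices.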
And, yeah, as soon as this happens, the data is copied on top of that freed pointer. In order to accomplish this, we need some very good control over the USB stack. littlelailo and Siguza's approach to this was to take an Arduino with a USB host shield in order to get complete control. axi0mX's approach, on the other hand, was to coerce the USB stack in macOS to diverge from the specification and abort a transfer midway. The second approach is not deterministic, but you can make multiple attempts, and many USB stacks will tell you how much data has been transferred once you abort a request. From this, you can infer whether the attack was successful or not. So as much as it's not deterministic, we found that we got an implementation that, practically, gets very close to 100% reliability. And this is a picture of the initial setup to test this out, from Siguza and littlelailo.
However, there is a catch. If you attempt to follow these steps, nothing at all will happen, except on two specific chips, A8 and A9, used in the iPhone 6 and the iPhone 6S. That's because on those two devices there is a bug in the DFU abort functionality. Basically, the USB stack has its own task to process USB transfers, and Siguza and littlelailo noted that these two devices do not free the task once you tear down the USB stack. So this structure is leaked in the heap every time a DFU abort is done. Essentially, this gives us a primitive to overwrite the contents of our task structure. Once you yield from this task, the registers are saved to this structure, and when the task is scheduled again, the registers are restored from it. So the use-after-free write would, in theory, allow direct register control.
However, this is not the case, because the use-after-free write happens on the task structure of the currently executing task. You can't overwrite the registers, because at that point in time they are live in the actual hardware registers, and when the task is scheduled off, they are saved back to the corrupted task structure, so your corruption is erased. However, tasks also have an internal linked list from which the next task to be scheduled is chosen. So you can create a fake structure somewhere else in memory and, with the use-after-free write primitive, corrupt this linked list instead. The next time the USB task yields back to the scheduler, it will branch into controlled pointers. By doing this, we achieve direct code execution, as the heap in SecureROM is executable.
32-bit ROMs cannot use this strategy, and neither can A7 or the newer A10, A10X, and A11 SoCs. There is no task structure leak to abuse there, and no crash is triggered whatsoever, because the ROM is deterministic enough that every single time we exit DFU and re-enter DFU, the heap layout looks identical, and the buffer for the transfer is reallocated at the same place every single time. So we need to find a way to break out of this determinism in order to get a proper use-after-free scenario, where the write actually hits something that's not supposed to be just generic data.
axi0mX looked at the SecureROM heap implementation and realized that the heap will return the smallest possible hole available for a given allocation size. We can use this property to do heap feng shui. Essentially, what we really need is controlled allocation primitives. When you send a USB transfer, there is an associated structure allocated on the heap.
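The allocator property mentioned above, returning the smallest hole that fits, is commonly called best fit. A minimal sketch of that selection rule follows. The hole list, the addresses and the function name are made up for illustration, and the real SecureROM heap has more bookkeeping than this.

```python
def best_fit(holes, size):
    """Return the index of the smallest hole that can hold `size` bytes.

    `holes` is a list of (address, hole_size) tuples. Returns None if
    nothing fits. Ties go to the hole that appears first in the list.
    """
    candidates = [(hole_size, index)
                  for index, (_, hole_size) in enumerate(holes)
                  if hole_size >= size]
    return min(candidates)[1] if candidates else None


# Hypothetical heap state with three free holes.
holes = [(0x1000, 0x2000), (0x4000, 0x800), (0x6000, 0x1000)]

print(best_fit(holes, 0x800))   # the exactly sized hole is chosen
print(best_fit(holes, 0x900))   # the next smallest hole that still fits
```

The consequence for heap shaping is that an attacker who can leave behind a hole of exactly the I/O buffer's size gets to decide where the next I/O buffer is placed.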
You can have multiple USB transfers, in this case specifically transfers from the device to the host, in flight at the same point in time. For instance, if you request some data from the device, the device is not going to be able to reply with the data if the host doesn't acknowledge that the data was received. Every time you send a setup packet in such a condition, while the host is NAKing the transfer, the request ends up staying around for a while. This gives us the ability to call malloc and delay the matching free temporarily, until the stall condition is cleared or the USB stack is shut down.
But the issue here is that we need allocations to persist across the USB stack teardown and initialization, in order for the heap shaping to influence the state of the heap when the buffer is allocated next. To pull this off, a state machine bug in the USB stack had to be abused. This allows us to have persistent allocations that never get freed.
This relies on zero-length packets. Once you have a transfer from the device to your host, the stack will send a zero-length packet once the request is fulfilled or the USB stack has to shut down. The issue is that if you're shutting down the USB stack, the zero-length packet is sent into the void and is never processed, because the pointer to the list of in-flight transfers is nulled out later on. This leak can be triggered conditionally, because zero-length packets are only sent under specific conditions. For instance, requests that are not a multiple of 64 bytes will not require a zero-length packet from the device, so it's not going to be sent and nothing will happen.
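The condition under which a zero-length packet is needed can be written down as a small predicate. The rule below is the usual USB convention: a device-to-host data phase that ends exactly on a packet boundary, but is shorter than what the host asked for, needs a zero-length packet to signal the end. Whether SecureROM checks exactly this is an assumption. The function name and the sample lengths are hypothetical.

```python
def needs_zero_length_packet(response_len, w_length, max_packet_size=64):
    """True if a device-to-host data phase must end with a zero-length packet.

    Usual USB convention: the device sends fewer bytes than the host
    requested (response_len < w_length) and the data ends exactly on a
    packet boundary, so only an empty packet can mark the end.
    """
    return response_len < w_length and response_len % max_packet_size == 0


# Hypothetical requests: (bytes the device returns, wLength asked by the host).
requests = [(0xC0, 0xC1), (0xC0, 0xC0), (0x81, 0xC1)]
for response_len, w_length in requests:
    print(hex(response_len), hex(w_length),
          needs_zero_length_packet(response_len, w_length))
```

Under that assumption, choosing the requested length decides per request whether a zero-length packet, and with it the leaked allocation described above, is produced.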
So we can abuse this in order to carefully craft the state of the heap and end up with an exactly sized hole, which will be the preferred place for malloc to put the I/O buffer next. Each USB request structure also contains a pointer to a callback that is invoked once the structure is freed. We can reallocate a few of these in place of our old buffer, and once we perform our use-after-free write, we are able to control this function pointer and get an indirect branch once we exit DFU next. This strategy can be successfully used on A9X, A10, A10X, and A11.
On older devices, however, this technique is not usable, as every single transfer ends up leaking if the device has an endpoint that's stalled. However, we can allocate repeatedly and have the heap almost run out, minus the size that we want to have for our next buffer. This ends up creating a hole of the exact size at the very end of the heap, and that's going to be preferred by malloc. By using this, we achieve what's called checkm8.
However, right now, all we have is code execution. What we really want is a bootkit of sorts, to continue booting normally rather than staying in SecureROM and DFU mode, and end up with a jailbroken device, hopefully. So how do we develop a bootkit? Well, we do have code execution, and we are able to re-enter the main function of SecureROM, which will attempt to boot the device off flash again. Eventually, the bootloader is going to load the kernel, and once that happens, it will branch into its entry point. So our goal becomes to patch the kernel at a point in time at which the kernel is loaded in memory, after it's been validated, and while it's still fully read-write. Some of you will know that the kernel text is not writable at later points in time.
That point in time is quite tricky to reach, especially if you want to support multiple devices and multiple versions. Resilient patch-finding strategies were really useful in order to have a jailbreak that's maintainable for the long term. So we ended up going for what's called the boot trampoline, which is the last piece of iBoot code that is run before jumping to the kernel or to another bootloader. This boot trampoline wipes the previous-stage bootloader from memory, so that later stages can't dump the memory and recover the bootloader that way, and researchers can't get access to bootloaders. It also disables the MMU and wipes all the registers, and this touches registers such as X18, which due to ABI constraints are not used anywhere else. So we can find this trampoline by looking for a mov x18, 0, which is always and deterministically going to be only in this boot trampoline. And yeah, we have shellcode that will dynamically locate it and patch it.
However, there is a chicken-and-egg issue here, because the trampoline wipes the current-stage bootloader, so it has to be relocated somewhere else in order not to be wiped by itself. There is a reserved region for this, but on SecureROM especially, the trampoline is sometimes relocated only right before it is used. SecureROM, being ROM, is not patchable, and we get code execution before any of this happens. So it's quite tricky to modify that trampoline. However, we are able to copy SecureROM into SRAM, and we can patch the page tables in order to remap the range of memory reserved for the ROM and redirect it to our writable SRAM. On some devices there is not enough free SRAM to do this, but we can use strategies such as resizing the heap, or we can get more SRAM available to us by reserving more of the L2 cache for it.
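The idea of locating the trampoline by its X18 write can be illustrated with a small scanner. This is a sketch only. It assumes the instruction is encoded as `movz x18, #0` (0xD2800012 in A64), while a compiler could just as well emit another form such as `mov x18, xzr`, and the real shellcode runs on the device rather than in Python. The function name and the sample image are hypothetical.

```python
import struct

MOVZ_X18_0 = 0xD2800012   # A64 encoding of `movz x18, #0` (assumed form)
NOP = 0xD503201F          # A64 `nop`, used to build the sample image


def find_instruction(image, insn=MOVZ_X18_0):
    """Return the byte offsets of every 4-byte-aligned occurrence of insn."""
    return [offset for offset in range(0, len(image) - 3, 4)
            if struct.unpack_from("<I", image, offset)[0] == insn]


# Hypothetical four-instruction image with the marker at offset 8.
image = struct.pack("<4I", NOP, NOP, MOVZ_X18_0, NOP)
print(find_instruction(image))
```

Searching for an instruction pattern instead of a fixed address is what lets the same patch logic follow the trampoline across devices and bootloader versions.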
But realistically, we only need to remap a single page in order to patch the trampoline, as it's a pretty small piece of code. In addition to this, we also need some area where we can stash our shellcode. However, we are still restricted to the available free SRAM, and we also need to do a normal boot, which might itself make use of SRAM in other ways. Maybe we could use DRAM; however, DRAM is not initialized when SecureROM runs. The first-stage bootloader has to initialize DRAM. However, the first-stage bootloader also has a feature called SoftDFU, which simulates DFU mode, but at a point in time when DRAM has already been initialized. This allows you to upload more data to the device, and we can abuse this to load basically an arbitrary amount of data into DRAM, without having to initialize it ourselves and without having to bother with a bunch of other complications that SRAM poses, such as size constraints.
The issue with DRAM, however, is that iBoot will wipe large portions of it. We implemented a dirty workaround, which involves hooking bzero and detecting whether the range overlaps with our shellcode, and in that case turning the bzero into a no-op. It isn't the nicest way to pull it off, but it works pretty well across all the devices that we care about. Once we hook the trampoline and bzero, we are able to reach the next trampoline invocation. On some devices there are multiple bootloader stages, so we have to re-apply these hooks for each bootloader stage that gets loaded. On devices from A10 onward, only one bootloader is present after ROM, so it's not strictly required.
Eventually, the kernel is prepared, and the trampoline is invoked. On devices before A10, with EL3, the secure monitor's entry point is passed to this code rather than the kernel's entry point.
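The bzero workaround described above boils down to a range overlap check. Here is a minimal sketch of that check, with a bytearray standing in for DRAM. The function names, the addresses and the sizes are hypothetical, and the real hook is machine code patched into the bootloader, not Python.

```python
def ranges_overlap(start_a, len_a, start_b, len_b):
    """True if [start_a, start_a+len_a) and [start_b, start_b+len_b) intersect."""
    return start_a < start_b + len_b and start_b < start_a + len_a


def hooked_bzero(memory, dst, length, protected_start, protected_len):
    """Zero memory[dst:dst+length] unless it touches the protected range.

    Returns True if the wipe happened, False if it was turned into a no-op.
    """
    if ranges_overlap(dst, length, protected_start, protected_len):
        return False
    memory[dst:dst + length] = bytes(length)
    return True


# Hypothetical layout: 64 bytes of memory, shellcode stashed at offset 32..48.
memory = bytearray(b"\xff" * 64)
wiped_low = hooked_bzero(memory, 0, 16, 32, 16)     # clear of the shellcode
wiped_mid = hooked_bzero(memory, 24, 16, 32, 16)    # would clobber the shellcode
print(wiped_low, wiped_mid)
```

Skipping the whole call, rather than trimming the range, matches the talk's description of turning the bzero into a no-op. It also means any non-shellcode bytes in such a range are left unwiped.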
However, in the arguments that are passed to the secure monitor, we can find the entry point of the kernel itself. We can patch the branch target of the trampoline, so that our shellcode is run instead of the actual secure monitor or kernel. We can also pass the original values to our shellcode by preventing registers from being zeroed out. By doing this, we now have the ability to patch the kernel dynamically at boot time.
However, we have to support a large number of device and version pairs, which means that the only viable strategy here is to find all our patches dynamically. Since we are hooking code execution after the trampoline runs, we do not have the MMU or caching available to us. This becomes problematic really quickly, because we have to scan the kernel, which is roughly 30 megabytes. It gets really slow, really fast, so we have to turn the MMU back on with a fake set of page tables in order to map the RAM as cacheable. The patch finder can use this cacheable mapping to find the patches, and that's a lot faster.
Our patch finder applies a pretty old-school kernel patch set, because KPP and KTRR, which are kernel integrity mechanisms, are not an issue. We can patch things that other jailbreaks haven't been able to patch in several years, so it's basically like it's iOS 8 all over again. We are able to allow full read-write-execute memory to be allocated by user space on iOS, which is going to be very interesting for emulator people, and which hasn't been available to jailbreakers for roughly four years now. However, kernel patching alone is not going to be enough.
The issue is that once the kernel boots, we want to keep code execution in order to prepare the jailbroken state, because we do want to install some apps and tools, and an SSH daemon, for instance. Doing this before the kernel boots is really tricky, as we would have to implement a file system driver. To solve this issue, we figured out that we could embed a tiny RAM disk in our shellcode. This RAM disk hijacks code execution once the kernel starts booting user mode. We can patch the device tree in order for the kernel to find this RAM disk, and we can alter the kernel boot arguments in order to have the kernel use it as the root file system rather than the normal NAND flash. And the end result of this was our initial patch finder here. So we started from SecureROM code execution, and we got all the way to EL0 user mode code execution.
However, we need the RAM disk to be really small, as DFU transfers are pretty slow, and having to wait several seconds on a data upload to boot your device is not great. Additionally, there are copyright concerns, since we should avoid shipping any proprietary Apple code, such as libraries, with such a RAM disk. Again, this is problematic, because it means that we have no libraries and no dynamic linker available to us at all. So it required us to write our own totally legitimate dynamic linker, which doesn't do any dynamic linking. It just mounts the real file system in a union mount on top of the RAM disk itself. Once we do that, we have the full set of Apple libraries available to us, as well as the real dynamic linker. At that point in time, we can run real code as PID 1, before the iOS init, called launchd, is ever executed.
The root file system at this point is still read-only, and we would really like to keep it that way until the user explicitly decides to change it. This is for forensics applications, for instance.
But we do also need to drop some extra files in order to get a shell and allow users to install package managers if they so wish. So we need the /private/var partition to be available. The issue with that is that /private/var is the partition where user data is stored, and on iPhones user data is encrypted. So that partition is not going to be accessible until data protection is enabled, and launchd is responsible for doing that. So we need to hook code execution at a later point in time, but still early in the boot process.
So we keep on union mounting, and this time around we mount another disk image on top of /usr/libexec, which is where system daemons are located on iOS. By doing this, we can override any system daemon. We chose to override sysstatuscheck, because it runs at the right point in time across multiple iOS versions. Finding something that reliably ran at the right point in time was quite tricky for us, and it took several attempts with different daemons until we settled on this one. Once we get code execution, we can exec back into a vnode of some other root file system, and then we can just forcefully unmount the /usr/libexec overlay DMG. At this point in time, the system is sane, and we can let launchd continue booting by just executing it. We can fork before doing so, or inject our own libraries into launchd. By doing this, our code will still be present.
At that point in time, we have to wait until USB communications are enabled, and we wait for the host to upload another disk image, this time containing all the utilities required for a basic shell to run, an SSH daemon, and an app that allows users to install a proper jailbreak. We mount this on /binpack and kickstart the SSH daemon. So the result is that we can get a root shell on any iPhone that's vulnerable to this, on any version.
And once SpringBoard, which is the daemon that shows the icons on your phone, runs, we are able to add our own checkra1n app.
We also have some future plans for all this. We would like this to morph from just a jailbreak into something like Clover, which is a bootloader for Macs and non-Mac devices, PC devices, which allows you to do custom kernel extension loading, and also allows you to do dual-booting and boot Linux and Windows on an iPhone. This is still an ongoing area of research. But we are coming up with what we call PongoOS, which we only briefly mentioned before today and which is being officially announced right now. It is a tiny pre-boot execution context. It's a fully custom OS that was written by the checkra1n team, specifically designed for Apple boards. It allows you to load modules into it over USB, and they are loaded and linked against all the facilities that PongoOS provides. We were able to move the checkra1n kernel patcher into such a module. So this will allow people to use our infrastructure for custom stuff and security research. And that was it. Thank you.
Well, thank you, qwertyoruiop, for this interesting talk. We are kind of close on time, so we'll take as many questions as time allows. Please, you know the drill, line up behind the microphones. And the signal angel is very eager: there's a question from the internet.
How was this exploit discovered, considering that most recent exploits target user space? Is iBoot still actively researched?
So this specific bug was found by multiple parties. The person who ended up publishing this bug, axi0mX, found it by diffing, because this was patched by Apple in iBoot, and the USB stack is shared between iBoot and SecureROM. There is a two-year lag between something getting pushed to iBoot and it reaching the semiconductor manufacturing facilities and being flashed onto devices.
So, yeah, as much as that was patched, it was still good enough for our purposes. I don't know if there are any more bugs in iBoot, but give it a shot and ping me if you find anything, I guess.
Okay, thank you. There are two questions in front of the microphones. Which number is it? I think it is four.
Hi, yeah. Really good talk. I've never seen so much great work put into this. Really great job, guys. My main question today is: what is the estimated time of arrival for the checkra1n Windows build? Thanks in advance.
So, we've been working really hard on this, and currently checkra1n is only available for macOS. However, the Linux release is really close, and hopefully it's going to come out by the end of the year. The Windows version is also being worked on a lot, and we have several achievements already. We are able to pwn some devices, and hopefully that's going to be coming in the next couple of months.
So, there's another question on microphone four.
Yes. Roses are red, violets are blue, Windows checkra1n, ETA when?
I mean, I just answered this question, but that was a good meme. Nicely done.
Another question: someone was wondering, where's the source code?
So, currently, checkra1n is closed source. However, the team plans on open sourcing it over time, and I believe that PongoOS is going to be the first thing that we open source, and gradually more of it will be available. We estimate that over the next year it's going to be fully open source, but yeah, you can look for more updates on my Twitter or on the official checkra1n Twitter.
Okay, thank you. Our time is over, so give him another big round of applause for this very interesting work. Thank you.