Hello everyone and welcome to my talk, "Fuzzing the Phone in the iPhone". The phone in the iPhone is the component that receives and sends SMS, receives and makes phone calls, and also manages your internet connection when you are not on Wi-Fi. However, you might now wonder what exactly it is. I'm talking about CommCenter and fuzzing it via the QMI and ARI interfaces, but this is a bit too technical for most of you, so I will first introduce you to the concept of fuzzing in general and protocol fuzzing before I dive into further details. For those of you who have not yet heard about fuzzing: you send a lot of random messages to an interface and use this to test its security. In this video, you can see how I send SMS with a Frida-based fuzzer at something like 400 fuzz cases per second, and then the iPhone receives them, caches them and also sends a couple of them on.

Let's start with a motivation and an explanation of the attacker model. If you look into a modern smartphone, you have two components, if you want to show it in a simple way. First of all, there is the hardware part with a lot of chips. And then on top of this, there is an operating system and applications. However, it's not as simple as this, because even those chips are so complex that they run their own little real-time operating systems to preprocess data. This means that you can even get code execution on such a chip. And this is usually much easier than in the operating system itself, because those chips cannot have that many mitigations. So what do you even do if you have code execution on such a chip? If you are in a baseband chip, then one escalation strategy from the chip towards the operating system might be to manipulate traffic in the browser.
However, I don't think that this is the case, because if you look at the Zerodium price list, the browser exploits are actually much more expensive, so it's probably not done like this. There must be other ways to escalate from this chip into the operating system.

In general, traffic manipulation is something that you can always do in wireless transmission or also on the internet. If you look at how those systems work these days, you have something like the internet in general that serves websites and so on, and also the core network of your mobile provider. And there are many, many ways to manipulate traffic, either if you are a state-level actor who is able to have something in the core network, or just by sending around websites or modifying websites. Then there is the base station subsystem. There might also be dragons, we don't know exactly. And of course there are over-the-air transmissions. Wireless transmissions are very special, because if there is something just slightly broken in the encryption, for example, then it's also possible to manipulate traffic there if you have a software-defined radio. So all of this could be attacked to manipulate traffic, and I don't think that one would craft a baseband exploit for this.

Already in 2014 at the CCC, there were two talks about the SS7 protocol, which runs in the core network and is actually meant to connect different mobile carriers to each other. This can also be used to intercept phone calls, for example. And this has also been exploited recently. So even though there have been some mitigations since then, it's still exploited for the same purpose, to spy on people. So really, baseband exploits only exist to escalate from the chip into the operating system. But now the question is, what are the strategies? If it's not via the browser, what else could it be?
I'm really sure it is not the browser, because you also need to have some traffic and so on. It doesn't really work instantly. The user needs to visit a website so that you can replace traffic on the website. So there must be something else. If you are on the chip with remote code execution and want to go into the operating system, there is some interface. And this means that something in those interfaces needs to be exploitable so that you can escalate privileges from the chip into the system.

Those interfaces are also very interesting from a reverse engineer's perspective. So even if you don't want to attack anything, just understanding how they work is also a goal of this work. For example, if you have a baseband debug profile, you can just download this onto your iPhone, and when you then open idevicesyslog, you can already see a lot of management messages that are exchanged between the chip and the iPhone. And if you have a jailbreak and Frida, you can even inject packets or modify packets to change the behavior of your modem.

If you want to start working on such a thing, the question is: where do you even start? Fuzzing is actually a method that can be used to understand such an interface. Initially, once you have identified an interface, you can check whether it is the correct interface, so whether the behavior really changes if you flip some bytes. You can also check how powerful this interface is, so what the features are and what breaks instantly. And if things break, you can check whether the whole interface has been designed with security in mind.

Now let's start with an introduction to wireless protocol fuzzing. This will also be a short rant, because the current tooling for fuzzing is usually not made to fuzz a protocol. Let's start with a very simple fuzzer, a fuzzer for just an image parser. You browse your smartphone for unicorn pictures, PNGs or JPEGs, and then you send them to the image parser.
In the image parser, you might be able to observe which functions are executed in the form of basic blocks. While it runs, the image parser can even report which parts were executed. And you can just start the image parser again and again with different images and get this basic block coverage back. In the next step, you can combine existing images or flip bits in these images, send them to the image parser and again observe the coverage. Most of the time, this won't generate any new coverage, so you just say you are not looking into this image any further. But sometimes you might get new coverage, like here, and then you add this image to your corpus. So over time, you will increase your corpus and increase your coverage.

Another method applies if you know exactly what an image format looks like. You might know the JPEG specification, and because of this, you could just generate images that are more or less specification-compliant. They look more artificial, like this. So you just generate images and send them to the image parser, and at some point you might observe a crash. That again depends on your harnessing. Maybe you can observe basic blocks, maybe you can just observe crashes. And then you know at which image you had a crash. You might even be able to combine these two approaches, depending on what you know about your input and how you can harness your target.

Now it looks a bit different for a protocol. In a protocol, you can have a very complex state. Let's say you are in an active phone call, or just something like receiving an SMS. You can actually force the iPhone to receive SMS if you have a second iPhone and send SMS. And then during the fuzzing, you can replace some bits and bytes like this, and then you have a modification. This is a very simple approach and it preserves the state.
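The in-place modification described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `flip_bits` and the example message are made up, and the hook that would hand a live message to the mutator (for example via Frida) is left out entirely.

```python
import random
from typing import Optional


def flip_bits(message: bytes, n_flips: int = 1,
              rng: Optional[random.Random] = None) -> bytes:
    """Return a copy of `message` with `n_flips` randomly chosen bits inverted.

    The length is never changed, so the message still fits into the
    ongoing interaction it was taken from.
    """
    rng = rng or random.Random()
    if not message:
        return message
    mutated = bytearray(message)
    for _ in range(n_flips):
        position = rng.randrange(len(mutated))  # which byte to touch
        bit = rng.randrange(8)                  # which bit inside that byte
        mutated[position] ^= 1 << bit
    return bytes(mutated)


# Hypothetical message observed during an active interaction.
original = bytes.fromhex("0123456789abcdef")
fuzzed = flip_bits(original, n_flips=2, rng=random.Random(1))
print(original.hex(), "->", fuzzed.hex())
```

Because only bits inside an already valid message are changed, whatever state the real interaction has built up stays intact, which is the property the talk highlights for this approach.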
So no matter how complex the thing is that you're currently doing, it's very simple to flip a bit here and there in an active interaction. But it's also a bit annoying, because you need to have these active phone calls and so on.

Something that's more efficient is injection. You observe certain messages and then just send them again, and then you don't even need this second phone to make calls. You can just send a lot, a lot, a lot of data. And this is the effect where your iPhone goes "ding" or something because of all the notifications and all the data that is sent. But the issue here is that this does not preserve state. There might be actions where the iPhone requests something that is then answered. The iPhone might request, for example, a date, and only then would the chip reply with a date, and only then would the iPhone accept the date. But it's still very interesting to do this, even though you cannot reach certain states, because you can do this without a SIM card and you can do it very, very fast.

Just to summarize the issues here: if you fuzz a wireless protocol, you can have very significant state differences, and just injecting packets cannot reach all states.

The fact that you cannot reach all states also shows in very simple stuff like a trace replay. A trace is something that you record. Let's say I have an active phone call, I record all the packets, and I can also observe the coverage. With Frida, you can observe coverage on an iPhone while the phone call is active. And then in a second step, you do some injection. But the only thing that you can inject are the packets sent from the baseband to the smartphone, not the opposite direction. And this usually results in much less coverage. So you are missing a lot of things due to missing state. And even worse, if you do the same thing again, you might be in a different state and you might observe a different coverage.
So you do the exact same thing, but you get different coverage. Even replaying recorded messages results in less or inconsistent coverage.

Anyway, let's take a look at an injection example. In this video, you can see how I'm in the unicorn network on an iPhone 8, which obviously has 5G, but also does a lot of fuzzing. What is interesting in the fuzzing is that you might combine a lot of states that are not usually possible together. Like you have a lost network connection while you have to confirm a PIN, or you have a network connection during this, et cetera.

So to summarize my rant, some states cannot be reached solely by injecting packets. Even if we have a very good corpus and do very good mutations, we might just miss 80% of the code. We can just fuzz anyway, but we need to keep in mind that some stuff is just not fuzzable.

We looked into a lot of wireless protocols at SEEMOO in the past, so it's worth also considering which tooling we already had available for fuzzing protocols. The most advanced tooling that we have is Frankenstein, and it was built by Jan. What Jan did is he emulated the firmware and attached it to a virtual modem and also a Linux host. For this, he first looked into the firmware that's here. We had some partial symbols for this and also some information about registers. Then Frankenstein takes a snapshot, which you can see here, including some of those registers of the modem. With this, you can build a virtual modem and fuzz input as if it came over the air. Frankenstein also emulates the whole firmware, including thread switches. So it gets into very complex states, and it's even attached to a Linux host, so it also fuzzes a bit of Linux while actually fuzzing the firmware itself. Now the issue with this is that baseband firmware is usually 10 times the size of Bluetooth firmware or even more. And we don't have any symbols for this, so it's a lot of work to customize this.
And even if one did all those steps and put all the work into this, it's only, so to say, code execution in the baseband. It's not yet a privilege escalation into the operating system.

The next interesting tooling was built by Steffen. What Steffen did is he built a fuzzer based on DTrace and AFL. DTrace is a tool that can provide function-level coverage in the macOS kernel and user space. With some modifications, you can even get basic block coverage in user space, which is required for AFL to work. So in the end, you have AFL or AFL++ as a fuzzer on any program on macOS. It's even slightly faster than Frida, at least the version that he used. And he gets a couple of thousand fuzz cases per second, even on a very old iMac. In our lab, we just had an old 2012 iMac for this, and it works on this. But the issue is that Wi-Fi and Bluetooth, which he fuzzed, are very complex protocols, so he couldn't find any new bugs with AFL. And also, in kernel space you only get this function-level coverage. Despite not finding any bugs in Wi-Fi or Bluetooth, he still got a CVE, because DTrace also has bugs, so at least there was some finding. But on iOS, this is not supported out of the box. It might be possible to get DTrace working with some tweaks, but it's a lot of work, so it's probably easier to just use Frida in the iOS user space.

Also, while Steffen was building all this very advanced tooling, Wang Yu found issues in the macOS Bluetooth and Wi-Fi drivers, so he was very, very successful in comparison to us. That's really a pity. And I think what he did is much better state modeling, so modeling how the messages interact and what is important to reach certain functions.

So what is still left? Usually, fuzzing the baseband means that you need to modify firmware or also emulate firmware. You need to implement very complex specifications on a software-defined radio if you want to fuzz over the air or build proofs of concept.
And for everything that's somewhat proprietary, you need to do protocol reverse engineering, so you can spend a lot of time and money just to do very, very basic research. Or, well, you can also use Frida. You can fuzz with Frida, and all you need to do for this is write a few lines of code in JavaScript. So I kid you not, the option is Frida.

Dennis, a thesis student I advised, was the first in our team to build a Frida-based fuzzer, and it's called ToothPicker. It's based on frizzer and Radamsa. What it does is it hooks into these connections, or into the protocols, of the Bluetooth daemon. You could also think of this upper part here as one block. The protocols are implemented in the Bluetooth daemon, but we want to fuzz certain protocol handlers, and to increase the coverage he creates a virtual connection. A virtual connection holds a connection and pretends to the Bluetooth daemon that there is an active connection to a device. Of course, the chip would then say, "I don't know anything about this connection." So there are also some abstractions in here so that the connection is not terminated. That's a very simple tool, but it really found a lot of bugs and issues. There were even some issues in the protocols themselves that also apply to macOS. So it's not just iOS bugs, but also protocol bugs in macOS that he found.

And this really got me thinking, because ToothPicker runs with only 20 fuzz cases per second. So it's really, really slow. And we were still able to find Bluetooth vulnerabilities at this speed. Why is this? First of all, if you try to fuzz Bluetooth over the air, then the over-the-air connections are terminated after something like five invalid packets. So over-the-air fuzzing is really, really inefficient. And with Frida, you can actually patch this function so it's gone. Then the virtual connections are a very important factor. They are really, really important for having coverage.
There is still a lot of coverage that we miss during fuzzing. But it's really an advantage compared to the other fuzzing approaches where you just inject packets. In addition, there is an issue here: if you have a virtual connection, it might be that this virtual connection triggers behavior that you cannot reproduce over the air. That means that for everything you find, you also need to confirm that it works over the air. At least inconsistent coverage is also fixed in ToothPicker, because ToothPicker replays all packets five times in a row. But the issue here is that if you have a sequence of packets that generates a certain bug, so you need multiple packets, this is nothing that the mutator is aware of and also nothing that's logged properly in ToothPicker. And because of this, I got a bit anxious that maybe we missed a lot of things.

So once I got the intuition that we are actually missing certain state information, I had the idea to replace bytes in active connections. This is one project that you can see here with the keyboard. I'm just replacing bytes in keyboard input and seeing what happens. I let this run for a couple of weeks, also for different protocols and so on, to see if there are further bugs that we didn't find previously. Here you can see the same for AirPods with SCO, and they produce some crackling sounds for the replaced bytes. It's even worse for ACL, so actual music, because then you can hear very noisy chirps. I let this run further for multiple weeks, and it didn't find any bugs that ToothPicker hadn't discovered before. I think the reason for this is that I mainly fuzzed in active connections, like the one with the audio or the keyboard, but I only fuzzed a few active pairings, because this requires me to actually perform those pairings by hand. So nothing really interesting.
The only bad thing that I could produce with this, but not worth a CVE, is that the sound quality of my AirPods is now really, really bad. Well, OK, and also the Broadcom chips on iOS don't check the UART lengths, but that's not that bad. I mean, if you consider that they removed the Write RAM command recently, then you might now still be able to write into the RAM via UART buffer overflows. But yeah, nothing too interesting.

So after all of this, I asked myself: what is still left for fuzzing if we cannot find any new Bluetooth or Wi-Fi bugs? Well, the iPhone baseband, or actually the iPhone basebands, because there are two. The first variant of iPhone baseband that you can get are the Qualcomm chips, and they are in the US devices. They use the Qualcomm MSM Interface (QMI), and this interface comes with some documentation, and there are even open-source implementations for it. So it's something that's probably easy to understand and easy to fuzz. On the other hand, almost all devices that I had on my table had Intel chips. Intel's baseband chip division has recently been bought by Apple. These are the chips in the European devices, and that's the reason why almost all my devices had Intel chips. They use a special protocol called Apple Remote Invocation (ARI). If you search for this on the internet, and I even checked it just today, there are no Google hits at all. So it really hasn't been researched before, at least not publicly. It's completely undocumented and it's a very custom interface, so it's not even used for Android. It's really an interface just for Apple.

The component that we are going to fuzz in the following is CommCenter. CommCenter is the equivalent of, for example, the Bluetooth or Wi-Fi daemon, but for telephony. It's sandboxed as the user wireless, but it comes with a lot of XPC interfaces. This is something that we will also see later in the fuzzing results. The next part is that there are two flavors of libraries.
Depending on whether you have a Qualcomm or an Intel chip, different libraries will be used before certain actions or data are actually processed by CommCenter itself. So we have different code paths here. But all of this runs in user space, and this means that both libraries can be hooked with Frida and can be fuzzed with Frida. That's very interesting.

There is still a lot of stuff that goes on in the kernel. What you can see here is that QMI and ARI carry some management information that is sent to CommCenter, but they don't contain the raw network or audio data. They don't contain your phone call. They don't contain the website that you are opening. The next issue is that QMI and ARI are not directly sent over the air. What is sent over the air are normal baseband interactions, and these generate QMI and ARI messages. So there is still a layer in between, and there are two ways this can go. Either there is an interaction that you can do over the air that directly causes ARI and QMI messages, which then cause an issue in the upper layers. Or you have this full exploit chain requirement, where you first need to exploit the chip over the air and then, from the chip, break the interface into CommCenter.

Now QMI: the code has a lot of assertions. It's really asserting everything about the protocol, the length, the TLV format and so on. And if anything goes wrong, it really terminates CommCenter. So if you just send one invalid packet, CommCenter is terminated. This doesn't matter a lot, because if your protocol is stable and you usually don't send any invalid packets, then an invalid packet tells you that an attack is ongoing, so it's valid to terminate CommCenter. Furthermore, it doesn't matter a lot to the user. The worst thing that happens when CommCenter crashes, for example while you have an active phone call, is just that the phone call gets lost or your LTE connection is reestablished. So you don't really notice it.
It just feels like your internet connection breaks for a short moment.

In contrast, there is the ARI protocol. And this is a part that works just very, very, very differently. Whatever it's getting, it just parses it, and it doesn't terminate CommCenter. So you can send many, many, many fancy things and it just continues, continues, continues, because the developers were probably very, very happy once they got their special protocol for Apple working, and then they never touched it again.

But what does it look like? It has a very basic format, also with some TLVs. The first thing that I noticed when I fuzzed it is that in idevicesyslog, it always complained about the sequence number being wrong. It just said that it expected the follow-up sequence number so-and-so. So I started to fix this. If you open it in IDA, you can see that the range that is expected is between 0 and 0x7FF. So you know at least the range. And then it gets weird. The sequence number is spread over three different bytes, in single bits, shifted around and so on. And it's not even contiguous. So very weird code. Probably they just added those sequence numbers to confirm some race conditions or out-of-order packets or something. I really don't know. Something weird is going on there. But I wrote the code and fixed the sequence number. And then during the replay of packets, I noticed it doesn't even matter. No matter if your sequence number is valid or invalid, parsing continues. Even packets with a wrong sequence number are parsed, probably because otherwise there would be too many issues, because the protocol implementation is too buggy.

There are also a couple of other things. For example, if you send the first four magic bytes wrong, or a wrong length or something, then the packet is potentially ignored. But the parsing continues and CommCenter is not terminated like in QMI. Since it's a proprietary protocol, there's currently no tooling available.
But Tobias is working on a Wireshark dissector, and once he finishes his thesis, it will also be publicly released. So you need to wait a while, but then you will have a tool for this.

Anyway, let's also talk about fuzzing this. I would not recommend fuzzing this, because you might brick your device or at least get into weird states. So just don't do this on your productive iPhone. I mean, obviously, I know what I'm doing. So yeah, just fuzzing packets, right? But I'm not so sure about what exactly I'm doing. The only direction that I fuzz is from the baseband to the iPhone, not the opposite direction. So I hopefully prevent anything weird on the chip, right? But the iPhone might still answer with something invalid, and this might confuse the baseband or cause other crashes.

And so I actually had to call for help: I bricked my iPhone. I mean, just one of my research devices, but still. It booted into PongoOS but no longer into iOS, and it didn't give me any debug message that was useful. Well, it turns out that at least on the Qualcomm chips, and that's where this happens, it just boots again after a couple of hours, but before that it's just stuck in a boot loop. On the Intel iPhones, I also almost bricked an iPhone 8, but luckily it didn't completely break. The issue there is that if you enable the baseband debug profile, it writes a lot of stuff to the ISTP files. That's some debug format of Intel, and every few minutes it just creates something like 500 megabytes of data, at least on the iPhone 8. On the newer iPhones, this debug format is a bit shorter, so it doesn't create as much data, but still a lot. If you don't delete this regularly, then of course your disk will be full, and an iPhone behaves quite strangely if it has a full disk. You can still interact with the user interface, but you can no longer delete photos, because deleting a photo, it seems, just needs some file interaction.
Also, you can no longer log in with SSH, which is also an issue. It somehow seems to create a file when logging in, so you can no longer delete any files. I was just rebooting the iPhone after trying a couple of things, and luckily it came back and deleted some files, and I was able to log in and remove the baseband logs. But be careful when doing this.

And of course all the iPhones are very confused from the fuzzing, so they really lose everything about their identity and location, and they want to be activated again. Here you can see a smartphone that lost its location and really wants to be activated, activated, activated. During SMS fuzzing you might even get flash messages, and if you tap on the menu with the dark theme enabled, they are displayed black on gray, so probably nobody ever tested it. Also great: if you have a locked iPhone, you can still display SIM menus and SIM messages on top of the lock screen.

Okay, so I guess I have to revise my first instruction: fuzz this, really, really fuzz this. It's a lot of fun, maybe just not on your primary device, but you will enjoy fuzzing these interfaces.

But first of all, you obviously need to build a fuzzer. So how do you build a fuzzer? The first fuzzer that I used was the one that I also used for Bluetooth, which just uses the existing byte stream of the protocol and then flips single bits and bytes. It has this high state awareness, but it also means that, like some kind of monkey, I was just calling myself, writing SMS to myself, enabling flight mode, everything that you could imagine. It's a very boring task, but it also found very fancy bugs that I couldn't reproduce with the other fuzzers yet, because it can reach states that just injecting packets cannot reach. So at least it was quite successful. I fuzzed with this for something like three days and it already found bugs.
That's very different from the Bluetooth fuzzer, so there seem to be more bugs in CommCenter. So I just wrote to Apple: "Hi there, I wrote this really, really ugly 10-lines-of-code fuzzer, and see what it found, awesome, awesome, awesome, crash logs are attached. And obviously this is simple to reproduce, because I only fuzzed for three days and got most of these crashes multiple times. So here you go, and try my fuzzer."

And this was probably quite stupid, because it's not that simple. It's really not easy to reproduce the crashes. First of all, of course, the script is so generic that it runs on all iPhones with an Intel chip, so no matter if I take an iPhone 7 or an iPhone 11, it will just work. But the crash logs that you get are very different depending on whether you fuzz on pre-A12 devices, so iPhone 7 and 8, or on the later versions like the iPhone 11 and SE 2. So you cannot reproduce the same crash logs that easily. It also depends a lot on the SIM, so even on a passive iPhone, where you don't do any active phone calls and so on, you would get different results. I actually started my fuzzing with a Singaporean SIM card without any data contract or phone contract on top of it, and it already found a couple of things, but it might just behave very differently on a slightly different configuration.

Anyway, let's listen to a null pointer that it found. This null pointer has been fixed in iOS 14.2, and it's in the audio controller, so you can hear some loop going on there. What you can see here is me calling Deutsche Telekom, and they have this announcement text. "The Telekom." And then I call again and have a crash, and now let's listen to the crash. Just for the sound effect, I also recorded another one. This one is with AldiTalk, and now let's listen to a special offer by AldiTalk in three, two, one, da-da.
Since these first fuzzing results were very promising, I decided to use the latest ToothPicker version and extend it for fuzzing ARI. I called it ICEPicker, because the Intel chips are also called ICE. I just cloned Dennis' latest ToothPicker alpha, which is very, very unstable, but this one actually runs on the iPhone locally without any interaction with macOS or Linux, so it doesn't need to exchange any payload via USB. It's also using AFL++, which is a much faster mutator than Radamsa. So from a speed consideration, this is a much better design.

However, AFL++ didn't turn out to be the best fuzzer for a protocol. Most of the time, it was actually trying to brute-force the first magic bytes, the first four bytes, because it tries to shorten inputs. It's also not aware of something like a packet order, so it was just brute-forcing those first four bytes. The next issue is that, for some reason, if the first four bytes are invalid, the ARI parser slows down a lot, so I was suddenly down to something like less than 10 fuzz cases per second. Also, ICEPicker has no awareness of the ARI host state in this case. ARI sometimes shuts down this interface if it thinks that something is very invalid, and the fuzzer would just continue. So I looked into idevicesyslog after the fuzzer couldn't find any new coverage for more than six hours, and I was wondering what the issue was. Is the implementation wrong, or is it the fuzzer? And it really looks like the fuzzer is producing inputs that are not good for protocol fuzzing.
Of course this is stuff that you can optimize. AFL++ can do a lot here, so you can tell it a bit about what the protocol looks like and also get it to not brute-force the first four magic bytes. But for this, I would have had to recompile the whole thing, and it was something that compiled on Dennis' machine but didn't compile on my machine, because I had my Xcode beta in a weird state. Of course some of you might now say, "just download and install a new Xcode," but this takes so long that actually writing the next fuzzer seemed to be easier. Still, this variant of ICEPicker was interesting to me, because it was the first time I saw that the fuzzing setup works including coverage, and also that my replay works across multiple iPhone versions. My corpus that was collected on an iPhone SE 2 was replayable on an iPhone 7. So it was not useless in that sense, but I just decided not to use this configuration.

So I just wrote a very simple fuzzer again, and I didn't port everything to run locally on iOS. I kept the design a bit simpler, or at least easier to code, and had my fuzzer running on Linux, using only Frida on iOS. It cannot reproduce all the states and crashes that I observed with my very first fuzzer, but most crashes could be reproduced. I didn't do any coverage. I didn't do any smart mutations, just very stupid mutations, and basically I just did a very blind injection. But this was super fast. Instead of the 20 fuzz cases per second, I already had something like 400 fuzz cases per second on an iPhone 7, which was about the same speed as or even faster than the AFL++ variant. And I can at least correct the length field, sequence number and so on before injecting the payload. Since it doesn't do great mutations, I at least need to collect a good corpus with many SIMs and many calls. I'm also logging the packet order with this, so it's at least aware of a packet sequence in the sense that I can reproduce the sequence later on.
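A blind injection fuzzer of this kind can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the frame layout (four magic bytes followed by a two-byte little-endian payload length), the magic value and all function names are hypothetical and are not the real ARI format, and the step that actually injects a frame into CommCenter via Frida is omitted.

```python
import random
import struct

# Hypothetical frame layout, NOT the real ARI header:
# 4 magic bytes followed by a little-endian 16-bit payload length.
MAGIC = b"\xde\xad\xbe\xef"
HEADER = struct.Struct("<4sH")


def build_frame(payload: bytes) -> bytes:
    """Prepend a header whose length field always matches the payload."""
    return HEADER.pack(MAGIC, len(payload)) + payload


def split_frame(frame: bytes) -> bytes:
    """Return the payload of a recorded frame, trusting its length field."""
    _magic, length = HEADER.unpack_from(frame)
    return frame[HEADER.size:HEADER.size + length]


def mutate_payload(payload: bytes, rng: random.Random) -> bytes:
    """Apply one deliberately dumb mutation: overwrite, insert or delete a byte."""
    data = bytearray(payload)
    choice = rng.randrange(3)
    if choice == 0 and data:
        data[rng.randrange(len(data))] = rng.randrange(256)
    elif choice == 1:
        data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
    elif data:
        del data[rng.randrange(len(data))]
    return bytes(data)


def fuzz_cases(corpus, count, seed=0):
    """Create `count` frames from the corpus and log the order they were made in."""
    rng = random.Random(seed)
    frames, order_log = [], []
    for case_id in range(count):
        source = rng.randrange(len(corpus))
        mutated = mutate_payload(split_frame(corpus[source]), rng)
        frame = build_frame(mutated)  # header is fixed up after mutating
        frames.append(frame)
        order_log.append((case_id, source, frame.hex()))
    return frames, order_log


# Tiny made-up corpus standing in for recorded baseband messages.
corpus = [build_frame(b"\x01\x02\x03"), build_frame(b"hello")]
frames, order_log = fuzz_cases(corpus, count=5, seed=42)
for case_id, source, hex_frame in order_log:
    print(case_id, source, hex_frame)
```

Mutating only the payload and rebuilding the header afterwards keeps the magic bytes and length field valid, so no time is lost on frames that would be ignored at the first check, and the ordered log makes it possible to replay a sequence of packets later.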
I had this fuzzer running on a couple of iPhones in parallel for multiple weeks, and it found a lot of interesting crashes, so that's my go-to fuzzer. I still wanted to confirm that not collecting coverage wasn't an issue, so I also cloned the publicly released ToothPicker, which definitely finds new coverage. It uses the Radamsa mutator, which is very, very slow, but it does slightly smarter mutations, at least in terms of protocol fuzzing. It's still only aware of single packets, and it just uses the same packet five times in a row to confirm coverage, et cetera. Another issue is that it cannot catch a lot of the CommCenter crashes. It happens quite often that CommCenter crashes, and if you cannot catch the crash with Frida and everything crashes, then you need to start the fuzzer again. But you also need to delete the files in the corpus that led to the crash, because otherwise you would just run into the same crash again very quickly, so it needs a lot of babysitting. I also had it running for a couple of weeks, but sadly it didn't find any new crashes, so at least I can be sure that fuzzing much slower but with coverage is not an improvement. Still, the mutations it creates are quite useful, as you can see in the following. You can even see this phone number scrolling here: it generated a very long phone number correctly into some TLV structure, and that's quite interesting to see. This is something that you could not reach by just flipping bits and bytes. There is one big shortcoming that all of these fuzzers have, including the initial ToothPicker, which is that they don't have any kind of memory sanitization. The framework that you would usually use in user space on iOS is malloc stack logging. I even got this running for CommCenter. It's a bit of command line juggling, but in the end you can enable malloc stack logging for CommCenter as well.
The issue here is that it increases the memory usage a lot, and even if you configure CommCenter to have a higher memory allowance, the usage is so high that it's immediately killed by the out-of-memory killer. So this doesn't work. Then there is also libgmalloc. It doesn't exist for iOS, it only exists in Xcode. I got one of the Xcode libraries running on one of my iPhones. I have no idea if this is an expected configuration or not. At least I could execute smaller programs, but when you use it on CommCenter, it just crashes with a libgmalloc error while parsing some of the configuration files very, very early when CommCenter starts. So all of this didn't work, and this also means that the fuzzer cannot find certain bug types, or only crashes much later after hitting a bug. So none of the fuzzers that I created are perfect, but at least they found a lot of different crashes. Let's look into this. The first obvious number that you see here is the 42, so I stopped fuzzing after 42 crashes. These are at least crashes that I think are individual crashes and that are not caused by Frida, so I tried to filter out Frida crashes. This corresponds to the total number of crashes, but only some of them are replayable with either one or multiple packets, and for the replayable crashes I can also check whether they were fixed in a recent iOS version, so the most recent iOS 14.3, or not. I also marked two colors here, because there are the Intel libraries but also the Qualcomm libraries. For the Qualcomm libraries I didn't spend as much time fuzzing, because I have fewer Qualcomm phones, but also all the asserts in the code prevent a lot of issues from being reached. So the libraries themselves have fewer issues, and within CommCenter less of the code that has improper state handling is reached.
The location daemon is also marked with a big gray box here, because the location daemon, similar to CommCenter, takes some of the raw packet inputs and parses them. It has special parsers for Qualcomm and Intel, and it's an interesting target because of this. Other than this, I got really a lot, a lot, a lot of different daemons crashing, some of them even with replayable behavior. For example, there is the Wireless Radio Manager daemon, which you can crash with just one Intel packet, but this has been fixed. Then there is one interesting crash in the Mobile Internet Sharing daemon that I actually got via both the Qualcomm and the Intel libraries, and this also has been fixed. Some of the crashes only happened via Qualcomm, but I'm not sure if that's a Qualcomm-specific thing or just randomness of the fuzzer. The Mobile Internet Sharing daemon has an issue where it accesses memory at configuration strings, so there are different strings in this memory address. I found this quite early, but I was not aware of the fact that so many other daemons are actually crashing when I fuzz CommCenter, so I didn't look into it in the very beginning. When I reported it to Apple, they said, yeah, yeah, we already know about this and we fixed it in a beta prior to your report, so sadly nothing that I got a CVE for. Another interesting crash is in the cell monitor, but only with the Intel library. The cell monitor is something that is running passively in the background all the time, and it parses, for example, GSM and UMTS cell information. I already found this with the Singaporean SIM without any active data plan in my very first round of fuzzing and reported it to Apple back then. I don't know if it's triggerable over the air or not, so I guess it's something that you first need code execution on the chip for. It has been fixed in iOS 14.2, and I wrote a lot of emails with Apple because I thought that they didn't fix it. The reason for this is that both the GSM cell info and the UMTS cell
info functions have two different bugs when they parse data, so I still got crashes in the same functions and I thought, okay, same function, still a crash, the bug is not fixed. But actually it's very high-quality code, and it's just multiple bugs per function. There is even one more issue in the cell monitor, even though I think the remaining bugs are very simple crashes, so nothing that could be exploitable at all, but it still hints at the great code quality. The sad story is that there are even more bugs to be fixed. Most of them are probably just stability improvements, but some of them are still interesting, so let's see how this goes. Since I told you that it's a very simple fuzzer, some of you might have already started coding those 10 lines of code for fuzzing while I continue talking, and grabbed the old iPhones that they are willing to lose if something goes wrong. So how can we actually build a fuzzer that is performant and replicates some of the bugs that I found within just a day? Let's take a look. When you do Frida fuzzing, a lot of what you do is limited by the processing power of the iPhone. Your iPhone will get very, very, very hot, and it might even drain more battery than it can get via the USB port, so it might even discharge while fuzzing. Performance is really key, so you need to identify bottlenecks. As I said, the initial version of ToothPicker or ICEPicker only does 20 fuzz cases per second, and you can tune this to something like 20,000 fuzz cases per second. I already told you that I tuned it to something like 400 or 500 fuzz cases per second, but why the 20,000?
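Identifying bottlenecks starts with measuring the rate. The following sketch shows one way to count fuzz cases per second for a simple bit flipper in a loop. The target here is only a stand-in function, not a real parser, and the helper names are assumptions for this illustration; the bit flipper assumes a non-empty seed.

```python
import time


def cases_per_second(run_case, n_cases: int = 10_000) -> float:
    """Run n_cases fuzz cases and return the observed throughput."""
    start = time.perf_counter()
    for i in range(n_cases):
        run_case(i)
    elapsed = time.perf_counter() - start
    return n_cases / elapsed if elapsed > 0 else float("inf")


def flip_one_bit(data: bytes, i: int) -> bytes:
    """Deterministic bit flipper: case i flips bit (i % 8) of byte (i // 8) % len(data)."""
    out = bytearray(data)
    out[(i // 8) % len(out)] ^= 1 << (i % 8)
    return bytes(out)


if __name__ == "__main__":
    seed = bytes(32)

    def target(payload: bytes) -> int:
        # Stand-in for the real parser under test.
        return sum(payload)

    rate = cases_per_second(lambda i: target(flip_one_bit(seed, i)))
    print(f"{rate:.0f} fuzz cases per second")
```

Wrapping each extra step (printing, sending a message to the host, collecting coverage) into `run_case` one at a time shows which of them costs the most.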
So initially a student of mine did some fuzzing in a very different parser and said, on my iPhone 6S it's running with 20,000 fuzz cases per second. I was like, yeah, no way, no way, but actually you can do this. It depends a lot on the Frida design. The first variant, which is how most Frida scripts are written, is that you have some Python script that runs on Linux or macOS, and it has a couple of functions that you can see here. First of all, it has this onMessage callback. This onMessage callback is something that we need later, and we just register it with our Frida script, which I'm going to show you in a second. You load the script, and the script can then even call functions on your iPhone. For this, you load a second script on your iPhone, which is JavaScript injected into the iOS target process. It can, for example, use the send function to send something back to the onMessage function, and it can export functions via RPC so that you can then call them. All of this happens via JSON, so it needs serialization and deserialization, which means you cannot send hex data or binary data directly. Instead, you have a hex string that you encode into JSON, which is then parsed as binary data. It's also all via USB, so you also have the speed limitation of USB. Of course, if you use the Frida C bindings locally on the iOS smartphone, it is a bit faster, but it's still not perfect. So the more of this JSON part and USB part you can avoid, the better. The actual pattern looks a bit like this. You are in libARIServer, so that's the lowest library from the diagram before, and then you define this inbound message callback function, which has two arguments, the payload and the length. This looks a bit cryptic, but that's basically it. Then you can, but you don't have to, add this interceptor here, because you might want to fix your sequence number or add basic block coverage to your fuzzer, etc., so this is also done there. And then you can just call this inbound
message callback of ARI and send ARI payloads. The speed can already be very different here. If you call this via an RPC export from a Python script on your laptop, you can reach something like 500 fuzz cases per second when you inject SMS, which are quite processing-intensive payloads. If you do the same thing but just run this inbound message callback in a loop locally with JavaScript, without any external Python script, then you get 22,000 fuzz cases per second on the very same device. So this is the speed difference that the JSON serialization, the deserialization and the USB in between make. I did a few more measurements, and sadly on the iPhone 8 there is a bug that prevents me from collecting coverage, but here is what you can see. The first part here is that if you just have a bit flipper in a loop that calls the target function, you can get 17,000 fuzz cases per second on an iPhone 7. As soon as you start collecting basic block coverage, not processing it, just collecting it, you drop to 250 fuzz cases per second. So you need to ask yourself if your fuzzer really gets that much better from collecting coverage. Another thing is this line above: if you just print the packet that you fuzzed or injected, and print it via Python on your laptop, you also have a huge slowdown. It is not as large as the coverage slowdown, but you can still see that every print and every message sent between the Python script and JavaScript takes a lot of time. Now, if you have the remote SMS injection that I had before, then you drop to 400 fuzz cases per second, and this is a blind injection without any coverage. If you collect coverage but don't process it, then you are down to 100 fuzz cases per second. So for the initial ToothPicker design this would be the optimum, but because the Radamsa mutator is very slow and because you also need to process the coverage information etc.,
that's down to 20 fuzz cases per second. So this is the comparison here, and now you can imagine why collecting coverage probably isn't always useful, and also why having your laptop calculate better mutations, because it's easier to write a mutator there than directly in JavaScript, is not always the best idea. So let's watch one last demo video. What you can see here is that when you try to delete the SMS after all of the fuzzing, it really doesn't work, neither via the settings nor via the SMS app. So you really need to reset your iPhone after fuzzing it for too long, there is no other way to delete the messages. With this, we are already at the end of this talk, but of course there will be a Q&A session, and if you missed the Q&A session, you can also ask me on Twitter or write me an email. Thanks for watching.