The digital distributed online chaos will now present you "No PoC, No Fix: A Sad Story about Bluetooth Security". PoC here stands for proof of concept and not people of color; it's about technology. It's just a broken memcpy in the Bluetooth stack and we really need to fix that. That's probably what the Bluetooth developer thought, and Jan Ruge is talking about what he found out about it. So, over to Jan.
Hi, my name is Jan Ruge. As Lindholm said before, my talk is "No PoC, No Fix", and I want to present some additional IT security findings that we had in the Bluetooth stack. I want to tell you something about the vulnerabilities as well as our methodology, and also something about the disclosure process that we had with those vendors.
First of all, my motivation for doing this. During my time at university, there was the Nexmon firmware patching project, as Jiska mentioned before, and it was targeting Broadcom Wi-Fi controllers. The main idea was to implement monitor mode on the Nexus 5, which wasn't possible before. So they built this whole patching framework around it, so that we have access to the firmware and can modify it. As a consequence, others used this knowledge to build exploits against those Wi-Fi controllers. Most prominent is the Broadpwn exploit by Google Project Zero, and another one was by Quarkslab in April 2019. This is always something you have to keep in mind if you do reverse engineering and make your knowledge public: others might use it to write exploits. And those are of course only the publicly known exploits. There might be even others that we don't know about.
Currently at SEEMOO there is a lot of research going on regarding Broadcom Bluetooth controllers, as you heard before. There's the InternalBlue project that adds debugging capabilities to those Bluetooth controllers.
And we currently have the same problem: we are making a lot of knowledge about the controllers publicly available. So we wanted to have a look, before others do, whether there are some obvious vulnerabilities.
First of all, a little bit of background. Most of you have already heard of Bluetooth. It's a radio protocol that operates at 2.4 gigahertz, and it's usually used for communication between two or more devices, for example your mobile phone communicating with a headset. What some people might not know is that when you have a mobile device with Bluetooth just enabled, you can connect to it by default. For example, you can run l2ping, which opens an L2CAP connection and sends ping commands over it. This is of course a huge attack surface, as you can interact with nearby devices without them even noticing. In addition, there are some protocols that are hidden, or terminated, in the firmware, so they are not accessible from the host. Those are quite poorly documented, the implementation is not public, and they are also prone to errors.
The platform that we use, I guess Jiska already showed it to you, is a Cypress development board. If I say Cypress or Broadcom, I use the terms basically for the same vendor, because Cypress acquired Broadcom's wireless stuff a couple of years ago, so the copyright in the firmware is also Broadcom. As I said, we use those Bluetooth Low Energy development boards because you can run C code on them. You have this WICED development environment and can execute your C code on the controller. Most interesting here is that there is no separation between the application that runs on the controller and the firmware. So you can read all memory regions: you can read the ROM, you can read the RAM, and even interact with it.
In addition, the development environment somehow has to link the C code against the code that is stored in the ROM, so that you can use primitives from the already existing firmware. So the development environment basically leaks function names, global variables and hardware register names, so names for everything, which made our reverse engineering progress a lot faster.
Now a little bit about the Bluetooth protocols that are relevant here. On the right, we have a diagram. On the top there is the host, and at the bottom, this big thing is the firmware. Here you can see the L2CAP protocol I mentioned before; it's basically the lowest layer that can be used to communicate between two devices. This L2CAP protocol is then wrapped into HCI, where HCI is the communication protocol used by the host to speak to the firmware. In the firmware itself, you have here on top the hardware registers for the serial interface, and this blue stuff over here is the multi-threaded firmware. There is a real-time operating system running on the firmware, implementing multiple threads. Here on the top there's the BT transport thread, which is responsible for parsing HCI messages. Then we have the link manager, which processes those HCI messages and also implements the link manager protocol, which is the protocol I mentioned before that is already terminated in the firmware and not accessible to the host. This link manager also controls something called the Bluetooth core scheduler. This is an additional component developed by Broadcom, so it's a custom scheduler. The scheduler is invoked for each Bluetooth clock cycle, which is roughly 300 microseconds, so this interrupt handler here at the bottom is invoked about 3,000 times a second. The Bluetooth core scheduler implements all the time-critical tasks for the Bluetooth protocol.
And now a little bit more background.
As I mentioned before, there's the InternalBlue project that you can use for debugging the firmware. When the project was first developed, they were working on the Nexus 5, so they didn't have any symbols and were just doing plain reverse engineering. Even so, they were able to get access to the link manager protocol, extract those messages and also send arbitrary messages. In this talk, I want to talk about Frankenstein, which is an emulation and C patching framework. This is also what was used to review the Bluetooth core scheduler.
Why is this important? If you have a look at a Bluetooth packet, this is how it looks on the physical layer. In the beginning, you have the channel access code, which is basically a link identifier between two devices. Then you have the packet header, which basically describes the type of the packet: this is a management packet, this is a data packet. Then there is a payload header, which describes some flow control stuff and a length field. And in the end, you finally have the payload. If you don't know about the Bluetooth core scheduler, you only have access to the payload. For example, if you're coming from the host, you can send arbitrary L2CAP packets. And if you can send LMP packets, you can send management packets with arbitrary payload. But as we now also have access to the packet and payload headers, we can basically send arbitrary packets. In this talk, the payload header will be most important, because it triggered some interesting bugs.
The main technique that we used was hooking. As I said before, we can execute C code on the controller, so we built a custom hooking mechanism so we can easily debug the firmware. For example, we can trace function calls and extract the arguments that are used to call a function.
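The tracing idea can be illustrated with a host-side analogy in Python. This is only a sketch of the concept (run a callback with the arguments before the original function executes), not the firmware mechanism, which patches ARM code on the controller. The names `add_hook`, `copy_to_send_buffer` and the `firmware` table are hypothetical.

```python
import functools


def add_hook(table, name, pre_hook):
    """Replace table[name] with a wrapper that calls pre_hook first.

    Hypothetical helper: `table` stands in for the firmware's functions.
    """
    original = table[name]

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        pre_hook(name, args)              # trace the call and its arguments
        return original(*args, **kwargs)  # then run the original function

    table[name] = wrapper
    return original


def copy_to_send_buffer(payload, length):
    # Hypothetical stand-in for a firmware function that copies a payload.
    return bytes(payload[:length])


firmware = {"copy_to_send_buffer": copy_to_send_buffer}
trace = []

add_hook(firmware, "copy_to_send_buffer",
         lambda name, args: trace.append((name, args)))

result = firmware["copy_to_send_buffer"](b"hello", 3)
```

After the call, `trace` holds one entry with the function name and the argument tuple, while `result` is whatever the unmodified function returned.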
In addition, we used it to implement some attacks, because we needed to modify the protocol behavior on the link manager protocol level, and we also used it for fuzzing. What we did pretty early in the process was find the function that is responsible for copying the payload data to the hardware send buffer. We basically set a hook on this function, and at this point we can modify the packet header, the payload header and even the payload, and therefore implement fuzzing.
This is how it looks in practice. This is actually the code. We have this function here, bcs_dmaTxEnableEir, which is basically the function that copies extended inquiry response packets to the send buffer. Extended inquiry response packets are the response if you're doing a device inquiry, so if you are scanning for devices, this is the response packet. It has roughly 240 bytes of data, so it was quite interesting because a lot of data fits into it. This here on the bottom is the actual fuzzing function that is invoked prior to this copy. All it does is randomize the Bluetooth address, so the scan results are not cached or discarded, and here we are simply flipping random bits in the payload header. And that's it.
So I flashed the code onto the controller, set it to discoverable and started scanning with my laptop, and boom, it actually crashed the controller of my old T430. The problem here is that the T430 is a very old laptop. The firmware is from roughly 2010; we are not actually that sure because it has no build date in it. We tried the same technique against newer devices, and "newer" means Nexus 5, so even quite old devices, and they seemed not to be affected. So I thought, well, it's probably not worth the effort, but Jiska said, maybe you should have a look anyway. So okay, I looked at it, and I had no symbols for the firmware.
So it was quite a hard time to figure out what's going on. At some point I had the full firmware image and an image of the crash. It looked like a heap corruption, because if you look at where we come from, you see that there are some functions involved that are relevant for the heap. This is quite problematic, because on this particular heap implementation, if you have a buffer overflow, you basically corrupt it at some point, then you free the buffer and everything is fine. Somewhere later you try to access the corrupted data, and then it crashes. So it is basically impossible to correlate the crash with the cause of the crash. So we had no luck with this one, gave up and focused more on the emulation part of our research.
The basic idea here is that, well, the firmware is just ARM code. Maybe we can extract a well-defined firmware state that contains all the registers and the memory, then just restore the registers and try to re-execute it. Here on the right is the code for how we did this. We have this XMIT state function, which can also be invoked as a hook. All it does is save the registers to the stack and store the stack pointer at a fixed location in memory. Then we call the XMIT memory function, which will actually disable all the interrupts, so no other code is running in the meantime, and then we send all the memory regions that we want to the host. It is important that we do not use functions that use threading, for example by invoking the BT transport thread, so we directly use the hardware registers to send out the memory dump. Once we have the memory dump on the host, we can basically call the continue function. This will first of all restore the stack pointer, load the registers and continue execution. Let's see how far we can get with it. But first of all, we have the memory dump.
Of course, we don't have the hardware registers anymore. We have no UART, and the radio front end isn't implemented. So first of all we have to make some modifications to the firmware, and here's how we did this. We just wanted to use plain QEMU ARM; we didn't want to implement anything in QEMU or mess around with it. First of all, we have here the firmware from which we extract our memory dumps, and those are then converted to object files. As I said, we need to make some modifications, and we want to write C code. The C code lives in the same address space as the firmware, and we can also compile it to an object file. As I said before, we have a full list of all functions, all global variables, et cetera. So what we can do is link our C code against the firmware image that we extracted from the chip. We can basically write C code in the same address space: we can invoke functions, we can parse data structures and so on. In the end, after we have compiled everything, we can link it into a new ELF file, and the ELF file then of course describes the memory layout of the firmware. Down here is an extra page where our C code lives. The C code first of all makes some modifications to the ROM and to the RAM, and second, it calls the continue function and restores those registers.
The first thing we did was add a lot of debug messages so that we can trace which functions are invoked, in order to understand what is going on. In the end, we were able to implement the threading behavior, and we were able to inject and extract HCI messages. We are also able to inject and extract raw wireless frames. So what we can do right now, as we can make use of HCI, is attach it to a running operating system and basically feed data to standard input that is then processed by the firmware as if it were valid wireless frames.
Then we can see how the firmware behaves together with the operating system. This is what I want to quickly show you in a demo. I hope it works.
I have here an Ubuntu. As you can see, I have the, let me make, okay, that's better. I have the Bluetooth settings of Ubuntu open; there's currently no adapter. Here I have btmon running so we can see what's going on at the HCI level, and here is a watch on hciconfig, and currently there's no adapter. Here on the top is where the magic happens. This HCI attach binary is actually the compiled firmware that we can run, and it will automatically attach to the Bluetooth stack. We now pass random data into the compiled firmware and see how the system behaves. As soon as I execute it, we can first of all see that we get a lot of logs here, and there's activity going on in btmon. We have a new adapter here that is now up and running. Sooner or later, the firmware actually starts a device inquiry and scans for devices, and we actually get valid scan results. So the random data that we are passing to the firmware is interpreted as wireless frames. If we have a look here in btmon, we can see an event, device found, another device found, and of course we also get a lot of logs. And, well, at demo time it doesn't work. But anyway, let me start it again. Scanning for devices. Well, no matter.
As you can see, we get a lot of logs, and I also told you that we can re-execute a firmware state. In this case, we also have a web front end where I can execute the firmware and get even more insights into it. Here I have a compiled firmware state that will just execute the firmware until it enters the idle state. Here we have the debug output, and as you can see, we have information about the context switches, memory allocations and so on.
We can also get a view of the memory here, where we can see which memory regions are modified, and also all the symbols. At the bottom we have an activity map of the memory, where we can see which memory regions are executed, which are read and which are written. You can zoom in here, and there are also symbol annotations for the firmware.
Sorry that the demo didn't quite work out. What actually happened is this: in the device inquiry there's actually a bug. There's a heap corruption, and the firmware should actually crash at some point with "heap corruption detected". This here is what you would see on the console. We have added a heap sanitizer to it, and somewhere here we should see it. Okay, here it is. We can see that we have a memory allocation of hex 109 bytes, and it returned this address here. Later we are doing a memcpy into this very buffer with a length that is way longer than those 109 bytes. This is then detected by the heap sanitizer, which gives us information about where the heap overflow actually happened, with all the link registers. So we can actually debug the firmware.
This was an actual CVE that we reported. It is a heap corruption during the device inquiry, and when we had a closer look at it, it was actually the same bug as on the T430. So the problem here is that we have a bug within the firmware that we can observe on a firmware from 2010 and on a firmware from 2018, so this bug was in production for at least eight to ten years. Most interesting here is that the bug is actually located in two different locations: one is inside the Bluetooth core scheduler and the other is in the link manager. If you have a look at the source code, what is happening here is that we first of all extract the packet length from the payload header by discarding the lower three bits and then masking out the length field.
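The extraction step just described (drop the lower three bits, then mask out the length) can be sketched in Python. This is a minimal sketch, assuming a 16-bit payload header whose length field is 10 bits wide with the remaining upper bits reserved; the function and constant names are made up for illustration and are not taken from the firmware.

```python
LOW_BITS = 3       # lower bits that are discarded before reading the length
LENGTH_BITS = 10   # assumed width of the length field


def decode_payload_header(header):
    """Split a 16-bit payload header into (low bits, length, reserved bits)."""
    low = header & ((1 << LOW_BITS) - 1)
    length = (header >> LOW_BITS) & ((1 << LENGTH_BITS) - 1)
    reserved = header >> (LOW_BITS + LENGTH_BITS)
    return low, length, reserved


def encode_payload_header(low, length, reserved=0):
    """Inverse of decode_payload_header, used here only to build test input."""
    return (reserved << (LOW_BITS + LENGTH_BITS)) | (length << LOW_BITS) | low


header = encode_payload_header(low=0b010, length=240)
fields = decode_payload_header(header)
```

With a 10-bit mask, anything placed in the reserved bits stays out of the decoded length, so the width of the mask decides which header bits can influence the copy length.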
In the link manager thread, we allocate 264 bytes of memory, and then we do a memcpy into this buffer with the previously computed length. What you might ask here is: okay, first of all, a length check is missing. What happens if we send a packet that is longer than the expected 240 bytes? The answer is that actually nothing will happen, because the maximum packet size is limited on the physical layer. But what's actually the deadly bug here is that the bit mask over here is wrong. It's 13 bits, not 10 bits as it should be. So the packet length also includes the reserved-for-future-use field. This field is normally set to zero, and as soon as you set a single bit in this field to one, you will greatly exceed the maximum packet size and therefore trigger a buffer overflow.
So you might say: I can have a memcpy that causes an overflow, but I can't actually control the data that I'm overflowing with, so what's the point? But in fact, if we have a closer look at the firmware, what we did is use a pattern for the payload data that we could easily recognize in memory. Here on the bottom you can see that the blue part is the actual packet that we sent. For some reason, and we don't even know why, some part of the packet is actually duplicated at the end. So even though we are only sending 240 bytes, we get a controlled overflow here.
This is pretty dangerous on this heap implementation, because if you have a look here, this is how the heap looks. It's basically just a management of buffers with a fixed size. We have here a free list, a linked list of all the free buffers. If we have an overflow, we can corrupt one of those pointers and basically redirect the linked list to an arbitrary location in memory. What we can then do is treat any arbitrary memory location as a valid buffer if we are able to allocate three buffers in a row.
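The free-list redirection can be modelled with a toy allocator in Python. This is a simplified sketch, not the firmware's heap: it assumes fixed-size buffers laid out back to back, where each free buffer stores the address of the next free buffer in its first four bytes. All sizes, addresses and names are hypothetical, and in this toy layout the second allocation after the overflow already returns the attacker-chosen address, whereas the talk mentions three allocations for the real target.

```python
import struct

BLOCK_SIZE = 16     # hypothetical fixed buffer size
NUM_BLOCKS = 4      # hypothetical number of buffers in the pool
POOL_BASE = 0x100   # hypothetical address of the first buffer
TARGET = 0x20       # hypothetical address the attacker wants handed out

memory = bytearray(0x200)   # toy address space


def write32(addr, value):
    struct.pack_into("<I", memory, addr, value)


def read32(addr):
    return struct.unpack_from("<I", memory, addr)[0]


class BlockPool:
    """Fixed-size buffers; each free buffer starts with a next-free pointer."""

    def __init__(self, base, block_size, count):
        self.free_head = base
        for i in range(count):
            addr = base + i * block_size
            nxt = addr + block_size if i + 1 < count else 0
            write32(addr, nxt)

    def alloc(self):
        addr = self.free_head
        if addr == 0:
            raise MemoryError("pool exhausted")
        self.free_head = read32(addr)   # trusts whatever is stored in memory
        return addr


pool = BlockPool(POOL_BASE, BLOCK_SIZE, NUM_BLOCKS)
victim = pool.alloc()                   # buffer that receives the packet

# Copy more than BLOCK_SIZE bytes: the extra four bytes land on the
# next-free pointer of the adjacent, still free, buffer.
overflow = b"A" * BLOCK_SIZE + struct.pack("<I", TARGET)
memory[victim:victim + len(overflow)] = overflow

first = pool.alloc()    # the neighbouring buffer, as expected
second = pool.alloc()   # the corrupted pointer: an attacker-chosen address
```

The allocator never validates the pointer it reads back, which is why a later write into the "buffer" at `second` becomes a write to an arbitrary location.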
Therefore we can overwrite arbitrary data. There are no exploit mitigations at all, so the complete memory is writable and executable. We have a write-what-where gadget, so we can basically overwrite a function and therefore gain code execution. And if you think about ASLR, forget about it, it's all static layout. So no exploit mitigations at all.
We built a full proof of concept for this and then disclosed it to Broadcom. In April last year we informed them, and after two weeks we requested a status update and asked what's going on. They said that they had found this bug in February 2018, that they had a complete fix and had informed all their customers, that there's a software upgrade available, and so everything is fine. This is a pretty strong poker face, because our latest snapshot of the firmware was from January 2018. So they stated that a bug that had been in the firmware for at least eight years was fixed two weeks later. We had a closer look at this, because I was pretty sure that I had tested the Samsung Galaxy A3 with a patch level from after February 2018. Later we also tested the Samsung Galaxy S8, which had a patch level from March 2019, which was quite recent at that point in time. In fact, we tested even more devices, and we couldn't find any device that actually was not vulnerable. Even some fitness trackers were vulnerable. So we asked what's going on, and they said: yeah, we normally provide fixes to our customers, and it's at their discretion whether they want to fix a bug or not.
As I said, we couldn't find any device that was patched, until Jiska bought the S10e. Here's the relevant part of the code. Here's the memcpy again, and they have indeed added a length check. They are checking whether the length is greater than the expected 240 bytes, and if that's the case, they are truncating it. This firmware has a build date from April 2018.
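The effect of the too-wide mask and of the added length check can be put into numbers with a short Python sketch. The 13-bit mask, the 240-byte expected length and the 264-byte buffer come from the talk; the function names and the exact form of the check are assumptions for illustration, not the vendor's code.

```python
BUFFER_SIZE = 264        # bytes allocated in the link manager thread
MAX_EIR_LENGTH = 240     # expected maximum payload length
WIDE_MASK = 0x1FFF       # 13 bits: also covers the reserved bits


def unchecked_length(header):
    """Length as computed with the 13-bit mask and no bounds check."""
    return (header >> 3) & WIDE_MASK


def checked_length(header):
    """Same computation, followed by truncation to the expected maximum."""
    return min(unchecked_length(header), MAX_EIR_LENGTH)


normal_header = 240 << 3                 # length 240, reserved bits zero
evil_header = (240 << 3) | (1 << 13)     # same length, one reserved bit set

normal_len = unchecked_length(normal_header)
evil_len = unchecked_length(evil_header)
fixed_len = checked_length(evil_header)
```

A single reserved bit adds 1024 to the computed length, far beyond the 264-byte buffer, while the truncating check brings it back to 240.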
So they indeed found the bug in February; this is legit. But in the end, we had to escalate it to the vendors. We contacted Google and we contacted Apple, because at least Apple is using Broadcom Bluetooth chips, and Samsung, of course, using Android, as well. Android supplied firmware images with this fix in August 2019. Apple also provided fixes and gave us public recognition. This was a pretty stressful disclosure, as we really had to ask the vendors: please, could you finally supply a fix for a remote code execution that has been in the firmware for over 10 years? That's not how it should be.
Now I want to talk a little bit about ACL, or asynchronous connection-less, which is the term for Bluetooth connections. Fuzzing ACL is of course an interesting target, because it's the most complex thing in the firmware, as this also includes the link manager protocol implementation in the link manager. So we really expected some bugs here, especially because Jiska had already found a missing length check in this handler table. We actually tried a lot. We tried coverage-guided fuzzing with the Bluetooth stack attached. We tried coverage-guided fuzzing on ACL packets, but this didn't yield any results, because the ACL packets are directly passed to the BT transport thread and therefore are not processed by the link manager. We also tried the same as for the inquiry packets, where we mutated the packets before they are written to the hardware registers. None of those approaches yielded any useful bugs. We were not able to crash the firmware, which was quite surprising to us. Except there was one crash we could observe on the Nexus 5. This was during my thesis, but the problem was, A, it wasn't in scope, because I wrote my thesis on dynamic firmware analysis and therefore the Android operating system was not in scope.
I didn't really have time to debug this. In addition, our builds had no symbols, which makes it even worse. And we are in a heavily threaded process, as you can see here: we are crashing in thread number 19. So I gave up on this and stated in my thesis that it's probably the fragmentation exploit described by BlueBorne.
But later we actually retested it and threw the same fuzzer against the S10e. There we could also observe a crash, which we had basically forgotten about. As you can see, we finally also have symbols here. We are crashing in a memcpy, which looks good at first glance. But if you have a look at X2, which during invocation was the length parameter of the memcpy, it looks pretty much like a negative number. The fault address, with a lot of zeros at the end, is also a strong indication that we have a memcpy with a negative number, which is an infinite memcpy. So we are running into the end of a page and therefore crash. If you have a look at this reassemble_and_dispatch function, we can see that there is indeed a memcpy with a difference in the length argument. If you run the numbers, there's indeed an edge case where you can cause a memcpy with a negative length. It is caused by this if statement, which is actually a buffer overflow protection.
So, okay, this is a valid case. But as we tried further fuzzing to see what crashes we could generate, there were some crashes that we couldn't really explain. For example, this one, where the fault address looks really odd, as if we had overwritten it with some ASCII, and it's at a completely different code location. All those crashes were not reproducible. So we might expect there's something else going on. First of all, the relevant code location here is the L2CAP defragmentation.
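How an overflow check can itself produce a negative copy length is easiest to see with numbers. The following Python sketch is a simplified model under stated assumptions, not the Android source: it assumes each fragment carries a 4-byte header that is subtracted from its length before copying, and that the check clamps the fragment length to the space remaining in the reassembly buffer. The function names are hypothetical.

```python
HEADER_SIZE = 4   # assumed per-fragment header that is not copied


def copy_length(total_len, offset, fragment_len):
    """Number of bytes handed to the copy for one continuation fragment.

    fragment_len includes the fragment's header.
    """
    if offset + (fragment_len - HEADER_SIZE) > total_len:
        # Overflow protection: clamp to the remaining space. The header
        # is forgotten here, so the subtraction below can go negative.
        fragment_len = total_len - offset
    return fragment_len - HEADER_SIZE


def as_size_t(value, bits=64):
    """How a signed value looks when passed as an unsigned size."""
    return value & ((1 << bits) - 1)


normal = copy_length(total_len=1000, offset=500, fragment_len=104)
negative = copy_length(total_len=1000, offset=998, fragment_len=204)
huge = as_size_t(negative)
```

In the normal case 100 bytes are copied. When only two bytes remain in the buffer, the clamped length minus the header is -2, which as a 64-bit size is 0xFFFFFFFFFFFFFFFE, matching the picture of a copy that runs until it hits an unmapped page.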
It works like this: if you want to send an L2CAP message that is longer than what you can send on the physical layer, you have to fragment it in some way. So when you send the packet, you say: here's an L2CAP packet that is a thousand bytes long, and here you have the first 500 or so. The host operating system will then allocate the full 1000 bytes and keep track of the offset where the current packet ends. When the next packet arrives, it copies the payload to the end of the packet and tries to reassemble it. The problem here is that the memcpy prior to Android 10 is really weird when it comes to negative numbers. We haven't done a full disclosure on this, but it was indeed possible to exploit this bug, at least before Android 10. And here's what it looks like in the end: you basically end up with a reverse shell with the same privileges as the Bluetooth daemon on Android.
As we then wrote our write-up to report it to Google, we actually had a second look at the master branch, because I had bookmarked for myself the source code of the Android 9 branch, since this is what was running on my test device. Then I had a look at the master branch, and as you can see here, there already was a commit that actually fixes this bug. It also says directly: wrong packet length leading to memory corruption. This commit was actually from April 2018. So it was quite an old bug and it was publicly visible in the master branch for a couple of months, but for some reason it wasn't applied to any releases.
As I said before, we built a PoC for this. At the point it was fixed, which was in February, it affected roughly 60% of the Android devices, so roughly 1.5 billion. We did the disclosure on 3 November 2019, and the fix finally came out in February 2020.
So even Google requires the full 90 days to basically apply a commit that is already there in the master branch. We also have the problem that there are a lot of devices that only receive quarterly updates, so only every 90 days. And there are still a lot of devices that will not receive updates at all, because the phone is older than two years. We were also contacted by some automotive vendors, where I'm not even sure what their patch cycle looks like. This is always the problem if you have a chain of different companies. If you report a bug to the first company, they require 90 days to fix it, then the next company also needs 90 days to fix it, and so on. So disclosure of such vulnerabilities can take a long time.
Finally, my conclusion: exploiting is quite exhausting. So I would like to see vendors directly fix memory corruption vulnerabilities without requiring a full remote code execution proof. For example, in the Broadcom case, they found the bug previously but they didn't supply any patches. Same for Android: the bug was known but no patches were supplied. And we still have the problem with patch supply. There are always unfixable devices, and it takes a long time for the patch to reach the customers. We have finally released the tool that we used here, which didn't work quite that well in this live demo. But if you're interested, here's the source code, which also describes how we exploited the Broadcom vulnerability in the device inquiry, along with some additional information. That was it from my side, and I'm open for questions.
Yup, I heard you. So interoftome stream, Fiji? Ah, that's good. So, which heap sanitizer has been used to detect the inquiry bug you mentioned?
This was our own heap sanitizer implementation.
They are using the heap implementation from ThreadX, and we basically reverse engineered the heap. It's actually quite easy to run some basic checks on whether the heap is still intact. Then we set hooks on the most interesting functions, which will basically test whether the heap is still intact and throw an error message if there's something wrong.
I guess ASLR is not possible on the ARM Cortex-M3 that I think they are using. ASLR is something that you really can't use on embedded devices. I guess that DEP, so data execution prevention, could be possible, but they have to write the firmware updates into the RAM, so it has to be executable at some point, and they should lock it afterwards. But I'm not sure if they will implement it at some point. Broadcom has now also implemented some sort of heap checks, and this, I guess, is also why they found the bug. But it's not really an exploit mitigation but more anti-lock.
DEP power X-M3, DEP and response to this bug. And that what, good.
I just counted it. I guess it's roughly 100 or 200 hooks that we are using, and a lot of them are debugging hooks. The number of functions that need to be replaced is actually not that large. Most important are functions like writing to a UART register, and that's a simple function: it gets a pointer to a buffer and a length and writes the data to the UART, and you can basically just replace it with a write. Same for UART read. There are some functions that do not make any sense in user space, for example enabling and disabling interrupts, so those are functions that we disabled right away. And let me think. There are all those hardware-related functions that will, for example, do a read or a write on a hardware register. As long as they can write to a valid memory location, it's probably fine. And if you have this memory state, you also have pretty good default values.
So it turned out to work okay-ish.
For which one? For the Broadcom exploit, it's in the GitHub repo, seemoo-lab/frankenstein. The Android proof of concept is not released yet.