Hello friends, and welcome to our talk, High-Stakes Updates, or as we like to call it, BIOS RCE OMG WTF BBQ. We are Jesse and Mickey. Hi, I'm Mickey. I'm Jesse. Here are our pictures, so you can someday find us and harass us face to face. We both work at a startup named Eclypsium, which has generously funded this research, and down below are our Twitter handles, where you can heckle us and ask us questions at any time. In this talk, we'll give you some background on how it all started, what we call the background story, before we talk about what we found and how we found it. Once we're all on the same page, we'll dig a little deeper into the exploitation of these issues and finish up by discussing the practicality of exploiting them at large scale.
In the last five years, we've seen almost an explosion in the number of firmware-related vulnerabilities. We've seen more and more publications about silicon side-channel attacks, and more attention is being put on lower-level components, from the Intel ME to the AMD PSP to PCI bridges, Ethernet controllers, and even more components in the server world. Since the term firmware is very broad these days, in this context we'll use it to describe BIOS or UEFI firmware. Now, in my eyes, threats and vulnerabilities are not the same. Threats are manifested using vulnerabilities; there is no threat without a vulnerability of some kind. So in response to this growing problem of increased threats, more and more solutions are popping up. The easiest and best-known solution to talk about is Microsoft's Secured-core PC. The marketing material for this solution has been putting emphasis on how large the firmware problem is and how we're all oblivious to it. It works by having a range of hardware and firmware security features work in unison and provide a sort of umbrella coverage to protect us all.
But not everyone can get a new computer with all the top security bells and whistles, and not all computers are going to have them either. So for normal users, the problem remains, and it's simple to explain. There are more issues to patch in firmware, and patching or updating firmware has never been a very user-friendly process. In fact, many knowledgeable users still fear it because of a mishap they might have had in the past themselves, or because they have heard of a friend breaking their computer during a firmware update. There are a few ways this user experience, or UX, problem is being addressed: updating with minimal user interaction, using classic OEM tools or update packages, or manually going into the BIOS menu and flashing from there. LVFS, for example, is the Red Hat open source way of providing firmware updates in an automated way for Linux users. Windows Update is another way of doing this on supported machines. The easiest example is Microsoft's Surface lineup. Since they control the whole vertical, from firmware to software to manufacturing of the computer, they can have the OS do the update process for them without worrying about any third party not doing proper validation and breaking their system. By the way, this does not mean that breaking is not a real possibility. For example, there are always those unfortunate souls who experience that one bizarre power outage right smack in the middle of a BIOS update.
There is also HTTPS boot, which is used to recover an OS, but we list it here because it's part of the recovery, quote unquote, umbrella, and it can also apply updates when a new version of the OS is installed over an older one that's not working anymore. All these methods of update and recovery need to be done in a secure and safe way, especially if they involve using a network connection at any point. We've had some experience dealing with remote code execution in the past.
Back in 2018, we discovered vulnerabilities in the ASRock and ASUS update processes through BIOS. You can watch our Black Hat and DEF CON talks from that year if you want to learn more. Looking at some of the big names in the enterprise computer hardware world, we did some attack surface scoping. We started with HP and looked at their PXE boot and HTTPS boot features, and we couldn't find anything that stuck out there. We also took a look at Lenovo's capabilities. They're very similar to HP's, but with a less fancy user interface, and nothing stuck out there either. We then moved to Dell, and we noticed that they had more options and a fancier UI. We noticed two new options coming in from Dell. One is to make it easier to update the BIOS without an OS, and the other is to recover the OS from an unrecoverable state. Now don't get me wrong, both are very useful features, and I wish I had had them many times during my career using computers. So please don't take this talk as discouragement against using these in general.
Both of these new features appear to be part of what's called Dell SupportAssist, which is an umbrella definition. I say umbrella because the same term covers software utilities at the OS level as well as the firmware features. Some of you might call it bloatware. That's the software that comes installed by default on most Dell systems. There are some ways that the Windows side of SupportAssist can interact with the firmware side. Those are well documented, and we're not going to go into them right now. If you want to explore them, you can Google them and you will find plenty of documentation. But for this talk, anything referring to SupportAssist will be in the context of firmware. Let's say a regular user gets into the boot options menu after either something bad happened to their system or they need to do a BIOS update. This is the screen that they would see.
This is the boot options menu, or something similar on any other platform. This is an example of what we're seeing on our Dell Latitude 5320. If their system will not boot to an OS and it needs to go through recovery, then they have the recovery option. If they need to do a BIOS update but not use Windows, they can use the BIOS firmware update remote option. The BIOS setup, the diagnostics, the BIOS update, and the device configuration are usually out of reach in normal usage scenarios for common users. For more technical users, they are something that most of us are probably familiar with. In case some of you missed that, that was a BIOS flash update over the air, not just a BIOS update. The OS recovery is done over the air too. What could possibly go wrong?
So let's take a look at how these features work. We need to set up a machine-in-the-middle environment. We started by using the community edition of pfSense, running on an old desktop we had lying around. Once we had all that working, we set up a sniffer and sniffed the traffic. Looking at the initial packet capture, we can immediately see a ping going out to 8.8.8.8, which is Google's DNS server IP address, and a DNS query asking for the IP of downloads.dell.com. After that, we see the handshake process starting and then failing. Looks like we need a cert.
Well, what kind of cert do we need? We need a valid SSL cert that we can use with our malicious server to do a proper handshake and sniff the traffic. So let's look at the firmware image. We pop it open and we get this string. Now, if you look at it closely, you might recognize it. This is the text header of the Mozilla CA root certificate bundle. This is just the first few lines; the file is very, very long. It contains the common CAs used by Mozilla. So let's get a cert. First, we tried to get one from ZeroSSL. It's a free service that gives you a free three-month cert for a domain that you own. You can play with it and try whatever you want.
We got the SSL cert. We loaded it up on the server. Everything looked fine when we tested it independently. But when we went ahead and did the machine-in-the-middle attack, it didn't work. Turns out the CA that ZeroSSL is using is not one of the ones listed in the Mozilla list. So we moved on. We tried Let's Encrypt. It took about three minutes before we realized that was going to be a hassle. It's not a point-and-click process like ZeroSSL, where you go to their website, click a few buttons, and get a certificate. So moving on, it turns out that you can buy a wildcard SSL certificate for about 70 euros. That's about 95 to 100 bucks. So after a few minutes of googling to find the cheapest solution, we ended up buying a certificate from Surdom. I hope that's pronounced like that. And we're set, and holy crap, it worked. We got a full capture of all the traffic, decrypted, between our laptop and our malicious server.
Now let's take a closer look at those green lines in the capture. We see that the laptop is reaching out to downloads.dell.com, first making sure that there's a connection, and then retrieving a catalog BC XML file. Let's take a step back and look at how this works at a high level. We have our laptop reaching out to Dell servers over SSL and getting a catalog file in XML format. Which entry in this catalog is used depends on the option we selected on the laptop: either firmware update, which we will refer to as FOTA, or OS recovery, which we will refer to as CSOS. According to that selection, the catalog points the laptop to the corresponding EFI file that will be downloaded from the server. Normally, modifying these would be out of reach because the traffic would be protected by SSL. But since we bypassed that barrier, we are able to play around with the files and the process and see what we can mess with and fuzz. As a result of all this, we found four CVEs.
The first one was the TLS issue, where the verification of the certificate was not done properly: the URL was not compared to the certificate. So as long as you provided a valid certificate from a trusted CA, you could impersonate Dell servers. The second, third, and fourth vulnerabilities are the ones we're going to discuss in more depth in this talk.
Okay, so now let's take a closer look at some of the vulnerabilities themselves and what's actually happening here. This is some of the contents of that catalog BC XML file. It includes things like the base location tag with downloads.dell.com. There's a lot of model-specific information in this file too. But there are also these software component tags that include things like the path to Dell FOTA launcher.efi. There's another one for Dell CSOS launcher.efi. Both the firmware over-the-air update and the SupportAssist OS recovery, depending on which path you choose, will pick a different software component to download and run from downloads.dell.com. If Secure Boot is turned off, that's an easy way to get arbitrary code execution during the pre-boot process. If Secure Boot is turned on, you do need an executable that's signed by a key that's allowed by UEFI Secure Boot, which means that it's in the db database.
In our investigation, we discovered that this base location tag has a buffer overflow. This is a heap-based buffer overflow in the UEFI firmware itself. It's in the code that runs before it even downloads and runs the Dell FOTA launcher, when it's parsing this XML file, in a component that lives in SPI flash, in the UEFI firmware that's flashed to the motherboard. That's where this vulnerability lives. So there is this overflow in the base location tag, and that was kind of fun, but there are some more vulnerabilities that we'll look at as well.
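To make the TLS issue concrete, here is a minimal sketch of the two checks a client is expected to perform, and of the one that was skipped. It is an illustration only, not Dell's code: `hostname_matches` and `accept_server` are hypothetical names, the chain validation is modelled as a plain boolean, and `*.attacker.example` stands in for any domain an attacker owns.

```python
from typing import Iterable


def hostname_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate DNS name covers the requested hostname.

    A leading "*." wildcard matches exactly one label, so "*.example.com"
    covers "a.example.com" but not "a.b.example.com" or "example.com".
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".example.com"
        head, sep, tail = hostname.partition(".")
        return bool(head) and sep == "." and "." + tail == suffix
    return pattern == hostname


def accept_server(chain_is_trusted: bool,
                  cert_dns_names: Iterable[str],
                  requested_host: str) -> bool:
    """Accept the server only if BOTH checks pass."""
    # Check 1: the certificate chains to a CA in the trusted bundle.
    # (Modelled as a boolean here; this part was being done.)
    if not chain_is_trusted:
        return False
    # Check 2: the certificate was issued for the host we asked for.
    # Skipping this step is what lets any valid certificate impersonate
    # the update server.
    return any(hostname_matches(name, requested_host) for name in cert_dns_names)


# A perfectly valid wildcard certificate for a domain the attacker owns
# must still be rejected when the client asked for downloads.dell.com.
print(accept_server(True, ["*.attacker.example"], "downloads.dell.com"))  # False
print(accept_server(True, ["downloads.dell.com"], "downloads.dell.com"))  # True
```

For comparison, Python's own `ssl.create_default_context()` enables both certificate verification and hostname checking by default, which is the behavior you want from any update client.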
When you start the SupportAssist OS recovery path, or CSOS, one of the things it does is download a JSON file from Dell as well. This is part of that JSON file. There are multiple sections that have a URL, a size, and then a SHA256. It turns out that this URL field has a stack buffer overflow, and the SHA256 field also has a stack buffer overflow. Let's take a closer look at the SHA256 overflow. The verification function that checks whether the file downloaded from the server actually matches the SHA256 that was provided has some bugs in it. This is a simplified decompilation of that function. It takes the ASCII hex string, converts it to binary, and then compares that against the hash that was calculated from the downloaded file. But the hex conversion writes the converted values into a buffer on the stack without properly verifying that it's only converting the length of a SHA256 hash. If you take a look here, they will read up to 20,000 bytes from the hex string, but they can only write 344 bytes before they run into the saved return address on the stack. This is a great thing for attackers because it's a stack buffer overflow at a fixed address, and you don't have to worry about embedded nulls or other bad characters, because you can just give it a hex string with whatever you want in it and that will get written to the stack.
So in the past, when we had to work on exploiting bugs like this, it was basically set to easy mode, because we could put the hardware in debug mode and, using Intel tools, just single-step through instructions in real time and see how the bug is being triggered and how to exploit it. But that's not common anymore. Modern computers and modern platforms no longer have that option easily available, so you have to be creative. It is still possible to do this, it's just harder, but that will be in a different talk.
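The missing step in that verification function is a length check before the hex conversion. A minimal sketch of what a bounded version looks like follows; it is written in Python for readability rather than as firmware code, and `parse_sha256_hex` and `file_matches` are hypothetical names, not functions from the Dell binary.

```python
import hashlib
import hmac

SHA256_BYTES = 32
SHA256_HEX_LEN = SHA256_BYTES * 2  # two hex characters per byte


def parse_sha256_hex(text: str) -> bytes:
    """Convert an ASCII hex digest to bytes, rejecting anything that is not
    exactly the size of a SHA-256 hash."""
    # Reject oversized (or undersized) input BEFORE converting anything.
    if len(text) != SHA256_HEX_LEN:
        raise ValueError(f"expected {SHA256_HEX_LEN} hex characters, got {len(text)}")
    try:
        raw = bytes.fromhex(text)
    except ValueError as exc:
        raise ValueError("digest is not valid hex") from exc
    # bytes.fromhex() skips whitespace, so re-check the decoded length.
    if len(raw) != SHA256_BYTES:
        raise ValueError("digest did not decode to 32 bytes")
    return raw


def file_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 of data equals the expected hex digest."""
    expected = parse_sha256_hex(expected_hex)
    actual = hashlib.sha256(data).digest()
    return hmac.compare_digest(actual, expected)


payload = b"recovery image contents"
good_digest = hashlib.sha256(payload).hexdigest()
print(file_matches(payload, good_digest))  # True

# An attacker-controlled field that is far longer than a hash is refused
# instead of being converted into a fixed-size buffer.
try:
    parse_sha256_hex("41" * 10000)
except ValueError as err:
    print("rejected:", err)
```

In a memory-safe language the oversized string only wastes memory, but in C the same unchecked loop is what walks past the stack buffer and over the saved return address.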
Instead of using that debug mechanism, we used PCILeech. We love PCILeech, PCILeech is great. PCILeech was a great way to dump all the memory in the system. We didn't have direct debug access, we couldn't single-step, we couldn't do live debugging, but we had the ability to dump the entire address space, or most of it, so we ended up with three gigabyte dumps, and a lot of them. Since we've had kind of an interesting year, we've been doing socially distant debugging. Mickey initially found this bug and was able to dump the system using PCILeech, and then he would upload this three gig memory dump. I would download it, load it into IDA, which has issues of its own, do some analysis, and send a payload back to try something. This workflow was not ideal, because Mickey has much faster internet access than I do. At one point we discovered that because the system is still booting up, there isn't a lot of live memory yet; it's mostly just a little bit larger than the BIOS region. So it turns out that these three gig memory dumps compress down to around 17 megs if you throw them in 7-Zip, and that sped things up a lot and helped quite a bit.
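The reason those dumps shrink so much is that early in boot most of physical memory is still empty. A small, self-contained illustration of the effect follows, using Python's `lzma` module (LZMA is the algorithm family 7-Zip uses by default). The sizes here are made up for the demo and are far smaller than a real three gig dump.

```python
import lzma
import os


def sparse_dump(total_size: int, live_size: int) -> bytes:
    """Build a fake memory dump: a little random 'live' data, the rest zeros."""
    live = os.urandom(live_size)  # stands in for the populated regions
    return live + bytes(total_size - live_size)


# Hypothetical sizes: 4 MiB of "memory", of which only 32 KiB is in use.
dump = sparse_dump(4 * 1024 * 1024, 32 * 1024)
packed = lzma.compress(dump)

print(f"raw:    {len(dump)} bytes")
print(f"packed: {len(packed)} bytes")
print(f"ratio:  {len(dump) / len(packed):.0f}x smaller")
```

The random region barely compresses at all, so the packed size ends up close to the amount of live data, which matches what we saw with the real dumps.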
For actually debugging these vulnerabilities, one thing that's really nice about exploiting UEFI is that there's a one-to-one virtual to physical mapping. All of the pointers that you see in the memory space correspond directly to physical offsets into the dump file, so it was pretty easy to figure out what was going on and find where things were, because I didn't need to do any kind of translation or mapping between them. We ended up basically passing these dumps and payload attempts back and forth, and we were able to test the payload remotely and confirm arbitrary code execution when I had never even seen what the physical device looked like.
There are some things to be aware of when you're doing this type of analysis, especially when loading really large images into IDA. Make sure you turn off analysis before you load the image. These were three gig memory images, and even just loading one into IDA on a machine with 64 gigs of RAM, it would go non-responsive for a while. I'd sit there and wait for it to load. If you forget and leave analysis turned on, it's really easy to run out of memory. There are some cases where you might need to turn on analysis momentarily, like if you're using Hex-Rays and something isn't decompiled properly. In that case, click to turn on analysis, click again to disable it, and eventually it will start responding again with analysis disabled. But if you forget, you'll run out of memory. Even with analysis turned off, loading these three gig dumps into IDA was creating IDBs of about 12 gigs, so I totally ran out of disk space multiple times while doing this and had to shuffle things around. Doing some preprocessing ahead of time to strip the dump down to only the regions you care about would have been a good idea.
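Because of that identity mapping, following a pointer in a dump is just a seek. The sketch below shows the idea under one loud assumption: the dump file starts at physical address 0 and is contiguous, so a physical address is also a file offset (the `dump_base` parameter is a hypothetical knob for dumps that start elsewhere). The helper names are made up, and the data is synthetic.

```python
import struct


def read_u64(dump: bytes, phys_addr: int, dump_base: int = 0) -> int:
    """Read a little-endian 64-bit value at a physical address in the dump."""
    offset = phys_addr - dump_base  # identity mapping: address == offset
    if offset < 0 or offset + 8 > len(dump):
        raise ValueError(f"address {phys_addr:#x} is outside the dump")
    return struct.unpack_from("<Q", dump, offset)[0]


def read_bytes(dump: bytes, phys_addr: int, length: int, dump_base: int = 0) -> bytes:
    """Read raw bytes at a physical address in the dump."""
    offset = phys_addr - dump_base
    if offset < 0 or offset + length > len(dump):
        raise ValueError(f"address {phys_addr:#x} is outside the dump")
    return bytes(dump[offset:offset + length])


# Synthetic 12 KiB "dump": a pointer stored at 0x1000 points at 0x2000,
# where an 8-byte table signature lives.
dump = bytearray(0x3000)
struct.pack_into("<Q", dump, 0x1000, 0x2000)
dump[0x2000:0x2008] = b"BOOTSERV"

pointer = read_u64(dump, 0x1000)
print(hex(pointer))                 # 0x2000
print(read_bytes(dump, pointer, 8)) # b'BOOTSERV'
```

With an OS-level dump you would need page tables to do the same thing; here no translation step is needed at all.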
Another thing we ran into: once we were able to come back and start working in the office again, we put our test equipment in the office and were able to do some tests that way, but then we didn't really have the ability to test remotely in some scenarios. So I did some experiments with loading three gig dumps into Unicorn Engine and debugging first-stage shellcode payloads that way, and that actually worked really well. I have a 128 gig Linux system at home, and it would take about five minutes to test a payload, which would run through until it was jumping to the next stage. So that's another thing that was really useful.
There are some interesting complications in modern exploitation. There are a lot of exploit mitigation techniques now. The stack and heap used to be fully executable, and then all these mitigations were put in place: address randomization, sandboxes, and so on. But in the UEFI environment it's kind of like going back to the 90s, because it's still lagging significantly behind the mitigations that exist in the OS and application space. In most cases you have an executable stack and heap, you don't have any canaries, you don't have address randomization, and you're running in ring zero, essentially in kernel mode. TianoCore has started providing initial implementations of some of these things, like a non-executable stack, but those need to be enabled by the OEMs, and we haven't seen any real systems that have them turned on yet. Essentially, OEMs are hesitant to turn on anything that slows down the boot process.
One of the things we looked at for our payload is that although there is no address randomization, things can still load at different addresses just because you might have a different system or a different controller. So we took a look at places where we could find gadgets at known fixed locations. One thing that's really useful is that the BIOS region in SPI is mapped at physically
known locations in physical memory, starting at 0xFF000000 and running through the reset vector at 16 bytes below 4 gig. So we know where the UEFI firmware is going to be mapped in physical memory, and we can go look for ROP gadgets in the BIOS region that's mapped from the SPI chip. There are a couple of ways that you can get the contents. You can dump the BIOS region using tools like chipsec, you can use physical access with a Dediprog or other SPI reader, or you can just download BIOS updates and extract them using this great Dell PFS BIOS extractor. It turns out there are actually a lot of useful gadgets at fixed addresses in the BIOS region. Here's an example of using ropper to search for jmp rsp equivalent instructions in this particular BIOS image. We found 441 in this image. The firmware across model families and different models tends to be different, but within a specific model, the different firmware versions tend to be similar enough that you can find common gadgets at the same addresses across all of the available firmware images. Here's an example from the Latitude 5320. All of these versions have this gadget at the same address, which is totally useful for us.
So let's take a look at what our exploit payload contains. Our first stage includes a jmp rsp at a known address that's mapped from SPI, so we can reliably get remote code execution without caring where we're loaded. At this point we can do whatever we want, but we want to use UEFI functions to do things for us. To do that, we need a pointer to the boot services table. We can scan for the BOOTSERV signature and find it that way, but we also need a pointer to the EFI handle for the current executable to do things like LoadImage and StartImage. It turns out that the executable that they just downloaded from us, Dell CSOS launcher, contains pointers to both of these, so we can just scan memory, find that executable that we just ran, we know exactly what version they have, and
then we know the offset to the boot services pointer and the offset to the image handle for the currently running executable. There are going to be multiple copies of these files in memory, so you do need to determine which is the correct one. As an example, when Dell CSOS launcher is loaded, there will be one copy from before it has actually been executed, so those pointers will be null. In the live version, the boot services pointer and the EFI handle variables will not be null. So we can search for a pattern, check the pointers, and if they're non-null, we've found the correct one.
But what about Secure Boot? We talked about calling LoadImage and StartImage, but there's still Secure Boot to contend with, and the cryptographic image verification. The UEFI framework is designed in a modular way, and it uses the UEFI Security2 Arch Protocol to abstract some of these security functions, including TCG measured boot and UEFI Secure Boot. Essentially, there are callback handlers that get registered with this protocol in order to actually do the cryptographic signature verification and the measurement into the TPM registers. When DXE core is loading an image, CoreLoadImageCommon calls Security2StubAuthenticate, which calls ExecuteSecurity2Handlers. Before it actually calls any of the handlers, that function checks whether any are registered and returns success if there are none. So it turns out there's a really easy thing we can do. In order to turn off image verification, we just scan memory to find the SecurityStubDxe executable and write a zero to that global variable, and now we can load whatever we want. This also stops updating the TPM measurements, and the rest of the UEFI firmware still thinks that Secure Boot is on and being enforced.
Another thing that we have to deal with is that we have a somewhat limited amount of space for our payload on the stack. So where do we want to load our next UEFI executable from?
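The "search for a pattern, check the pointers" step can be sketched offline against a memory dump. The real payload does this in shellcode on live memory; the version below is only an illustration on synthetic data. The byte pattern and the two offsets are hypothetical placeholders, since the real values depend on the exact launcher build, and `find_all` and `find_live_copy` are made-up helper names.

```python
import struct
from typing import Iterator, Optional, Tuple


def find_all(data: bytes, pattern: bytes) -> Iterator[int]:
    """Yield every offset at which pattern occurs in data."""
    pos = data.find(pattern)
    while pos != -1:
        yield pos
        pos = data.find(pattern, pos + 1)


def find_live_copy(dump: bytes, pattern: bytes,
                   bs_ptr_offset: int,
                   handle_offset: int) -> Optional[Tuple[int, int, int]]:
    """Return (base, boot_services_ptr, image_handle) for the first copy of
    the image whose two saved pointers are both non-null, else None."""
    for base in find_all(dump, pattern):
        end = base + max(bs_ptr_offset, handle_offset) + 8
        if end > len(dump):
            continue  # truncated copy at the end of the dump
        bs_ptr = struct.unpack_from("<Q", dump, base + bs_ptr_offset)[0]
        handle = struct.unpack_from("<Q", dump, base + handle_offset)[0]
        # Copies that were loaded but never executed still have null pointers.
        if bs_ptr != 0 and handle != 0:
            return base, bs_ptr, handle
    return None


# Synthetic dump with two copies of a fake image. All values are invented.
PATTERN = b"HYPOTHETICAL-IMG"      # stands in for bytes unique to the launcher
BS_OFF, HANDLE_OFF = 0x20, 0x28    # hypothetical offsets of the two globals

dump = bytearray(0x800)
dump[0x100:0x100 + len(PATTERN)] = PATTERN   # not-yet-run copy, pointers null
dump[0x400:0x400 + len(PATTERN)] = PATTERN   # live copy
struct.pack_into("<Q", dump, 0x400 + BS_OFF, 0x7F001000)
struct.pack_into("<Q", dump, 0x400 + HANDLE_OFF, 0x7F002000)

result = find_live_copy(dump, PATTERN, BS_OFF, HANDLE_OFF)
print(tuple(hex(v) for v in result))  # ('0x400', '0x7f001000', '0x7f002000')
```

The same scan-then-validate shape applies to locating SecurityStubDxe: find the module by pattern, then act on a global at a known offset from it.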
The easiest thing to do is to have a very small UEFI executable that's just appended to the first stage. You can also call UEFI network functions to connect to your own network infrastructure, download it, and execute it that way. But it turns out there's actually something a lot easier that we can do. Dell has their own UEFI RAM disk implementation, and Dell CSOS launcher will actually download all those URLs that we saw in the JSON file into RAM disk file systems. It'll verify them, it'll even extract zip files for you, and all of these files can be accessed using standard UEFI functions like the EFI Simple File System Protocol. Our payload actually ended up using both of these approaches: we appended a very small executable to keep our shellcode simple, and then the appended EFI executable iterated over the file systems in order to find the next stage and run that.
For our first demo, we'll just show the classic example of popping calc, except in BIOS, with Secure Boot turned off. For our second demo, we'll increase the difficulty level a little bit by using a Secured-core PC where Secure Boot and all the other features are turned on, and we'll pop a shell. For our third demo, or the boss level, we'll increase the difficulty even further by still using the Secured-core PC with all the security features enabled, but we'll drop a malicious executable into the Windows startup folder and run that.
Some of you might say, hey, BitLocker's enabled, why do you have to suspend it? We get to that age-old question of what came first, the chicken or the egg. What is the right thing to do when you have an update process you need to follow, the BitLocker measurements measure the firmware, but you need to update the firmware? According to Dell, the easiest solution is to suspend BitLocker before you update the BIOS. Unsurprisingly, HP has a very similar recommendation, and last but not least, Microsoft also recommends this for third-party updates.
So you can't update firmware without suspending BitLocker. Now, we do have modern mitigations in place, so there are some challenges if you want to persist. For example, Boot Guard and BIOS Guard are there to prevent you from modifying the firmware and persisting inside the firmware image. There are ways around this; if you look at recently published vulnerabilities, you might spot some Boot Guard bypasses. HP Sure Start is another mechanism to verify the integrity of firmware, so if you do happen to be an attacker and modify flash, you will get caught by HP Sure Start. Kernel DMA protections are there as well. With VBS and HVCI, there are protections in place to prevent you from abusing low-level mechanisms.
So how would you exploit this at a large scale? Well, we know these vulnerabilities impact a large number of laptops. We also know that home routers are being actively attacked right now, and the same goes for enterprise devices; for example, we've seen a lot of effort by attackers against VPN servers. Once they gain that foothold, they are potentially able to modify internal network appliances to redirect DNS traffic inside an enterprise environment, and so on and so forth. We can't forget all the DNS hijacking stuff. Let's say someone managed to social engineer their way into redirecting the ownership of a DNS domain through a registrar, or into modifying records. There's also BGP hijacking, and we should never forget what the ISPs are capable of doing. No one really knows what ISPs are doing, so we are at their mercy. These attacks are actually happening in the wild, and although they are not common, they have happened repeatedly and are most likely to happen again.
We can't have a talk without mentioning supply chain. Let's say we want to attack this mechanism of remote firmware updates. The simple way to do this with a supply chain attack is by compromising the web server hosting the XML files. Let's assume we didn't have the TLS vulnerability.
We still wanted to gain access to all the machines that are doing updates and exploit the other buffer overflows without the SSL issue. So a web vulnerability and a web server exploit would come in handy. You just replace the files on downloads.dell.com and you're done. The more complex scenario is when you have an insider threat inside the chain. Let's say an employee manages to modify files on some of the servers that host these files, or an insider who is involved in adding code to the signed binaries adds vulnerabilities to them. It's less likely, but it's still a scenario that we should talk about.
In conclusion, a couple of words about the disclosure process. It was not easy at first to gauge how many models were actually affected by this issue, but we ended up agreeing that there were 129 models affected. The initial disclosure went out to Dell on March 3rd, and by the time we published our initial high-level blog post on June 24th, all updates for this issue had been released publicly. Note that this was 90 days plus two weeks from disclosure to patch, across 129 models, for a vulnerability in BIOS. This, I hope, will be an example for every other vendor and every other OEM: when someone finds a vulnerability in BIOS, you can achieve a 90-day timeline without arguing for more.
If you are affected by this issue and you would like to securely update your system, we recommend that you do not use these features, for obvious reasons, and instead download the manual OEM update setup files from the Dell website and use those to update your BIOS locally. If you are afraid of rollback and downgrade attacks on your system, you can go into the BIOS and uncheck the checkbox that allows a BIOS downgrade, and only check it if you manually need to roll back your BIOS for some reason in the future. All our tools, exploits, and data are going to be available on GitHub at the following address. Feel free to submit issues if you want to ask us questions.
You can DM us on Twitter. We would also like to say a final thank you to Dell PSIRT for working with us and resolving this issue in around 90 days, and to US-CERT for helping us as well. And that's it. Thank you for your time. We appreciate you listening, and we hope you enjoyed our talk.