our next speakers, Joseph and Ilja, who will be talking about how bootloaders are broken and how to sort of look into them. Please give them a warm round of applause. Hello. Does this sound okay? Okay. Cool. Yeah. So, welcome to Boot2Root: Auditing Bootloaders by Example. I'm Joseph Tartaro. I hack things for IOActive, and this is my second time at Congress, so I'm really excited to be back here. Hi. I'm Ilja. This is my 18th or 19th time at Congress. Happy to be back here, too. I've spoken here, I think, seven or eight times before, and I'm very excited to speak here together with Joe. We've been working together on bootloaders for the last year and change. This is minus the NDA-covered stuff. This is some of the things we've observed and seen, so I'm very excited to do this. Yeah. So, the expected audience for this talk would be embedded systems engineers, security people who are interested in embedded systems, and just curious security people. Just a caveat: we're going to be quickly going through about 70 slides or so, and a lot of it is just examples of C source code, so if you did not realize you were signing up for an hour of that, feel free to walk out. We're not going to be offended. Another caveat: this isn't really trying to flex and show off all the bugs we found. The purpose is to show people, hey, if you have not looked at bootloaders before, these are our recommended areas of attack surface that are interesting. This is probably where you should get started, and in some cases nobody is really looking at them, and they should start. So that's pretty much what's going to go on right now. So, quickly, here's the agenda. We'll discuss, quote unquote, bootloaders, why they're important, some of the common ones we looked at, attack surface, and our conclusions. There's going to be a wide interpretation of "bootloader": basically what we mean by that is anything that's in your secure boot chain.
And if you don't have experience with these or you haven't looked at them much, you can think of it from an operating system standpoint: like userland calling into kernel space, you'll have normal world calling into secure world, and what you're looking for is those pivots, the places where they're processing attacker-controlled data. So why? Because they're critical for security. A bootloader is a key component of your chain of trust, and it's very obvious that a lot of device designers are poor at hardening and limiting attack surface. What we mean by that is, on a lot of the devices we've looked at over the last year or so, you'll find devices that have full network stacks even though they don't need a network stack, or a bunch of code loaded up to handle file systems that are never expected. It just doesn't really make sense why they're not limiting all that attack surface. There's also a huge underestimation of reverse engineering: people just assume that there are no bad actors, that nobody's really going to look at this thing, that it's this hidden black box we can ignore. And a little story behind the presentation: we were actually on a train going to a baseball game. I was trying to introduce Ilja to the lovely game of baseball, and we were talking about U-Boot, and I pulled out my phone and we went to the U-Boot GitHub, and in about 10 or 15 minutes we ran into like 10 bugs, and we went, yeah, we should probably audit some of this stuff. And the other inspiration was a previous talk that Ilja gave at Congress where he audited a bunch of different BSDs. I said, why don't we look at a bunch of different bootloaders? And so, just to give credit where credit is due, we are not the first to look at any of this stuff.
So this is a list of people that inspired us and have done really interesting work, and we recommend, if you're interested in any of this and you enjoyed it, that you go check these people out and see the things they have released in papers and so on. So where are bootloaders? They're pretty much in everything: your workstations, phones, game consoles, your TV, everything. And generally the security of the device depends on them, so it obviously really, really matters. With that said, we started looking at these common open source bootloaders: U-Boot, coreboot, GRUB, SeaBIOS, CFE (which is Broadcom's), iPXE, and TianoCore. And we're just looking at what's on GitHub, what we downloaded. Obviously, in real-world scenarios, the devices that you have at home or go buy and start looking at are going to be heavily modified; they're going to have weird custom drivers that aren't available, things like that. So we're not here to argue the likelihood of some of the bugs we've found. We're not going to argue exploitability; for half of them we don't know if they're exploitable. We will for one. Yeah, we will for one. But that's not really the point. The point is to show designers what they should be concerned with and show researchers who might be interested what they should look at. So, U-Boot is extremely common. It's in a ton of devices. There's a huge, very customizable config for all different sorts of boards. There are concerns around environment variables. There's a super powerful shell; you'll sometimes even see shell injection concerns based on environment variables. It's pretty funky. And there are lots of drivers for tons of devices. So it's a great first step, something that covers a huge breadth of things. The features of U-Boot that are interesting would be the network stack with its different protocol parsing, and file systems.
They'll also load their next-stage images from all sorts of weird, archaic things that nobody uses anymore. And it's just used by tons of devices. Then coreboot. And I apologize, this is a little dry right now; we'll eventually get to stuff. Coreboot is more targeted towards modern operating systems. There's no legacy BIOS support. They took a methodology that other projects don't, which is that they're not going to implement features they don't want to. So if you're trying to do network booting or something, you use coreboot to boot into something like iPXE; they're not going to implement that feature themselves. These are used in Chromebooks, and obviously some of the interesting parts come from Google. One main interesting area is SMM. And GRUB, obviously you're all familiar with this. The primary concern here is the Multiboot spec, and they support just a ton of file systems, so that's obviously the attack surface you'd be concerned with. But the interesting part here is that there are UEFI-signed versions of GRUB. So that's kind of your secure boot break right there: if you find a vulnerability in a signed version, you can load that into your UEFI environment, exploit that vulnerability, and you're good to go. Then SeaBIOS is the default BIOS for QEMU/KVM. It supports legacy BIOS stuff, so you'll see it get booted into from things like coreboot. It supports TPM communication, so that's kind of interesting, and it's used as the compatibility support module for UEFI and OVMF. Then Broadcom CFE. This is used in a ton of different network devices and TVs and things like that, and the obvious attack surface there would be the network protocols, the network stack. That's what you'd want to look at. iPXE: this would be more network-based stuff, various parsing, and, similar to GRUB, there are UEFI-signed versions, so it's a great potential pivot for secure boot stuff. And then finally, TianoCore.
There's really no introduction needed here. There have been a ton of great presentations over the last 15 years or so of people doing everything to this thing. And because it's pretty much the most scrutinized one out there, it's really mature compared to everything else. So when you bash TianoCore, realize that it's way better than everything else. And there are a lot of implementations built on top of it, like platform-specific stuff: Qualcomm's ABL and XBL, things like that. So you have the base TianoCore EDK2, and then you'll have all the custom stuff built on top. And then, related to bootloaders, you have things like TrustZone. That's where, as I mentioned, you have Normal World and Secure World. You'll have these interesting pieces of code running in those secure areas, and you're going to want to look for pivots into there so you can gain access to secrets. And then, from the host operating system, the attack surface would be that you can modify things like NVRAM: you can set variables so that the next time you reboot, the bootloader is going to process those variables. This slide's more for reference; later you can get the slides or take a picture or something. These are just links to instructions for building and whatnot, so if you wanted to start looking at these, you can quickly build a little environment and poke around. So, just to quickly go through the concept of secure boot: you have your chain of trust. You'll have the boot ROM, which will verify and load various other loaders, which then verify and load the next thing. You'll sometimes have TPM involvement, you'll have some TrustZone involvement, and then OS interaction, like the OS setting things in NVRAM, which sets boot configuration and so on.
So the boot ROM itself, you've probably seen the talk by qwertyoruiop, I think, earlier. It's really important because it can't be patched: a hardware revision would be required if a vulnerability is found. And it's so early in the chain that you'd basically compromise everything after it, so that is where you'd want to go. It's extremely bare minimum. It does some initialization of hardware and memory, and then you'll find things like maybe an implementation of fastboot, or a USB stack, so that's an obvious attack surface you'd be interested in. Then it verifies the signature of the next stage and boots that. And then you move on to the next stages, and this is where they start initializing the rest of what's necessary: network stacks, SMM handlers and whatnot, and they basically keep handing off. Then you run into things like trusted boot and measured boot, which is verification, and then measuring, which is more for logging; it doesn't actually mitigate anything, it just alerts you to stuff. Some hardware environments are a little different in the secure world stuff: you'll have ARM TrustZone, Windows has VTL0, VTL1, hypervisor stuff, and then, yeah, I'm basically repeating myself. And then eventually the operating system: the kernel's going to be loaded, it's going to be verified, and then it eventually starts running, and then you can have fun. So, early observations: in everything we looked at that's open source, there's no privilege separation. If you were to compromise one component, you pretty much own everything. What's interesting is that there are some proprietary bootloaders where you're starting to see it. Apple, for example, is doing some aspect of privilege separation, so if you were to exploit one portion, you wouldn't necessarily control the world.
Right now, at least, all the stuff we looked at did not have anything like that, but maybe in the future we'll see it. So this is the list of the attack surface that we think people should be interested in and focusing on: NVRAM, file systems and files, all the network stack protocol parsing, the various buses, SMM, DMA, and hardware stuff. The interesting thing about buses that we've noticed is that embedded designers seem to inherently trust anything that end users should never interact with. So they go, okay, an end user uses USB, so we should verify some of that (though they do a bad job). But an end user doesn't play with the SPI flash, so they just inherently trust it. Or an end user isn't talking to the TPM, so if the TPM says something, they run with it. And they mess that up a lot. So, NVRAM: these are the various environment variables that can be configured for the next boot cycle. Like I said, it's basically processing of user-controlled data. If you start looking at some of these bootloaders, here are the interesting functions to look at. For example, in U-Boot, you just search for env_get and grep through, and you see them grabbing an environment variable; then see what they do with it. If they're not doing any sort of validation, you're going to hit a bug. An example of that is right here: you see there's an env_get for bootp_vci. It checks that it exists, then does a strlen, and then just memcpys it directly into a buffer without validating that it fits the buffer. And this is actually kind of funny: earlier today we were like, ah, we should try to exploit one of these just to show it. So we toyed with this one, and you just feed it a bunch of bytes and it starts sending these weird packets over the wire, and in a bunch of the BOOTP stuff you'll see later on, you just have full raw packets of whatever payload we send. But it wasn't very realistic.
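The env_get-then-memcpy pattern just described can be sketched in a few lines. This is a minimal illustration, not U-Boot's actual code: fake_env_get stands in for env_get() and just returns an oversized attacker-controlled string, and the 16-byte destination in the usage below is an assumed size.

```c
/* Sketch of the unchecked env_get + memcpy pattern, plus the fix. */
#include <assert.h>
#include <string.h>

static char evil[50];

/* stand-in for U-Boot's env_get(): returns an attacker-controlled variable */
static const char *fake_env_get(const char *name) {
    (void)name;
    memset(evil, 'A', sizeof(evil) - 1);
    evil[sizeof(evil) - 1] = '\0';
    return evil;
}

/* the buggy shape from the slides: strlen + memcpy, no bound check.
   Shown only for the pattern; calling it with a small dst smashes the stack. */
static void copy_vci_unsafe(char *dst) {
    const char *e = fake_env_get("bootp_vci");
    if (e)
        memcpy(dst, e, strlen(e));   /* BUG: may write past the end of dst */
}

/* the fixed shape: clamp the copy to the destination size */
static size_t copy_vci_safe(char *dst, size_t dstlen) {
    const char *e = fake_env_get("bootp_vci");
    size_t n = 0;
    if (e) {
        n = strlen(e);
        if (n >= dstlen)
            n = dstlen - 1;          /* truncate instead of smashing the stack */
        memcpy(dst, e, n);
        dst[n] = '\0';
    }
    return n;
}
```

The point of the safe variant is just that the length comes from the destination, never from the attacker's string.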
It started getting into the QEMU network implementation stuff, so we avoided it and moved on to a different bug. But as you can see, it's very easy to start looking at this stuff, get interested, quickly set up an environment, and mess with exploitation. We'll get into it, but there are no mitigations. So, as you see, there's a bit of a pattern here: env_get hostname, memcpy with strlen, no checks. A 128-byte buffer, env_get of bootdev, strcpy. Then another strcpy. It just keeps happening. This is what we were talking about: the code quality is not very great. What's funny is, when we were messing with that bootp_vci stuff earlier this morning, we went, ah, this is going to be more involved, so I said, well, let me just go find a different bug. Ten minutes later: ah, I found a different stack smash. Okay, let's work on that one. It's just like, this is a perfect example. They env_get the variable, then they grab the very first element of that variable and use it as a length. And, yep, it keeps happening. So, just a quick example. The attack scenario here is: you have a device, and you have NVRAM that you can physically modify since you have physical access. This is an example of the default NVRAM. We just threw in an environment variable with 600 bytes, and you'll see towards the bottom there are four bytes where we put a function address. This had to do with the JFFS2 file system loading, so you just do fsload. And that's it. That's shellcode being executed. So this is, sorry, I can't see. So obviously, if you've never looked at this stuff, it's kind of fun to play around with, and you should start poking at it, because why not? And screenshot. Sweet, cool. Cool. So that was the NVRAM attack surface, which is usually the most fun to play with.
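The "first element of the variable is the length" pattern mentioned above can be sketched as follows. The layout (one length byte, then data) and the function name are illustrative assumptions; the fix is to clamp the claimed length against both what the variable actually contains and the destination buffer.

```c
/* Sketch of safely handling a length byte embedded in an NVRAM variable. */
#include <assert.h>
#include <string.h>

static size_t copy_with_embedded_len(char *dst, size_t dstlen, const char *var) {
    size_t claimed = (unsigned char)var[0];  /* attacker-chosen length byte */
    size_t avail = strlen(var + 1);          /* what's actually in the variable */
    size_t n = claimed;
    if (n > avail)
        n = avail;                           /* don't read past the variable */
    if (n > dstlen)
        n = dstlen;                          /* don't write past the buffer */
    memcpy(dst, var + 1, n);
    return n;
}
```

The buggy shape seen in the slides simply trusts `claimed` and memcpys that many bytes.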
Programming the SPI flash can take a little bit of fiddling, but in terms of attack surface and fun things to play with, it's, as Joe said, often overlooked. And it's easy to toy with. You saw all those env_gets and memcpys and strcpys, so there's a lot of fun there. But obviously there's not just NVRAM; there's more attack surface in a trusted boot environment. One of them is the file system, right? Because this thing needs to boot, and it needs to find images, and they're stored somewhere, and your file system brings order to that chaos. So basically you need to mount your file system. And often file systems are not signed or integrity-checked, or not all of them are, because before you can verify anything, you need to be able to read something, right? An example of a common file system would be FAT: if your boot environment supports loading from a USB drive for storage, it's probably going to be FAT. And the flash itself may have its own proprietary format, or it may use ext2 or something else. Clearly the file system parsers are prime attack surface. Closely related are the files inside of your file system. Depending on what your boot environment looks like, some of these files are going to be integrity-checked, but some of them might not be, and so those file parsers are interesting attack surface too. If you're either looking for bugs or you're building a product, we would highly recommend fuzzing these, and starting with AFL is probably a good starting point. We'll show some examples of bugs we saw in a number of bootloaders: we'll cover an ext2 one and a bitmap splash screen parsing one. The other ones are examples of where there could be bugs, but those two we'll cover. So this is GRUB, and this is GRUB's ext2 code, the symlink code.
It looks at the file system and goes, oh, okay, this is a symlink; how do I parse the symlink? And the symlink says, hi, I'm a symlink, I'm this big. What GRUB does is, oh, great, I'll allocate that much, but I'll add one more byte for the null byte. And that, of course, is a classic integer overflow. And the GRUB memory allocator actually returns a valid pointer for a zero size. So this is a perfect primitive: you get a pointer to something that is of zero size, and then it reads the symlink content from the file system into that zero-size buffer, and obviously that's going to cause memory corruption. And the primitive, you can't see it here, but the primitive is really nice, because that particular read function actually stops reading if there's not much more on disk. So even though you say, read four gigs, you can arrange the layout so it only reads, let's say, 1,000 bytes or 100 bytes. So you have this near-perfect primitive to cause memory corruption. This one was in TianoCore, for example. This was the bitmap splash screen parser. I don't know the bitmap internals in depth, but it's a very simple format: basically, you can have a 4-bit bitmap, and that gives you a palette of 2 to the 4, which is 16 entries. So the idea is, if you have a 4-bit bitmap, then it has a 16-entry palette. And it goes and reads that 16-entry palette, except you tell it how big the palette is, and you can say, okay, I know you're expecting a 16-entry palette, but I'm giving you a 256-entry palette, and it'll read it into a 16-entry buffer. So that was broken. And for the 8-bit bitmaps it was broken the same way. And these are just the tip of the iceberg; there are many, many more. This did not take long to find. But let's move on. Okay, now that we've looked at some local stuff and some physical stuff, what about remote? What is there?
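The GRUB ext2 symlink bug boils down to one line of arithmetic: a 32-bit attacker-supplied size, plus one for the null byte, wraps to zero, and a zero-byte allocation then "succeeds". A minimal sketch of the bug and the corresponding check (function names are illustrative):

```c
/* The "size + 1" integer overflow behind the symlink bug, and a check. */
#include <assert.h>
#include <stdint.h>

static uint32_t alloc_size_unsafe(uint32_t symlink_len) {
    return symlink_len + 1;        /* BUG: wraps to 0 when len is 0xFFFFFFFF */
}

static int alloc_size_safe(uint32_t symlink_len, uint32_t *out) {
    if (symlink_len == UINT32_MAX) /* reject lengths that would wrap */
        return -1;
    *out = symlink_len + 1;
    return 0;
}
```

Unsigned wraparound is well-defined in C, which is exactly why nothing crashes at the point of the addition; the damage happens later, at the read into the undersized buffer.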
Obviously, if you're talking remote in the modern world, you're talking TCP/IP. So you need a TCP/IP stack, and you need some services that you either expose or that you have client code for that you go talk to. That's BOOTP and DHCP and DNS and iSCSI and NFS. And if you're on a corporate net, you'll probably have a PXE stack, and then HTTP and HTTPS and TFTP, right? And sure enough, most bootloaders have code for this, and then you start asking, okay, what's the attack surface? For TCP/IP, if you implement your own stack, well, good luck, because you're going to screw it up. But secondly, if you look at these things and take a step back, this is all mostly TLV parsing. So you go and look at what can go wrong when you do TLV parsing, and there will be out-of-bounds reads pretty much everywhere, and you'll see them in the parsing loops in lots of places, things like that. If you look at the protocols on top of that, like DHCP and DNS, you have your standard network attacks: lease stealing or cache poisoning and things like that if you don't protect your ID. So if you have a static ID, or you don't validate your IDs, or you generate predictable IDs, you can have these kinds of poisoning, stealing, man-in-the-middle attacks. And then thirdly, obviously, the thing we really like to see is memory corruption bugs. At the end of the day, you take network data and you parse it, and if you do it wrong, you may cause memory corruption. Another sometimes overlooked, interesting class of bug in network parsing is information leaks. This often manifests itself, I mean, Heartbleed, of course, was one example.
That was a perfect primitive, but generally it'll be the thing where you end up generating some kind of packet as a response to something, and you'll have done the memory allocation but for some reason not initialized something, and so you end up sending uninitialized data over the network. So these are the common things you see in network stacks in general. If you are looking for these kinds of bugs in a boot environment, or if you are building a product that does this, I would highly recommend fuzzing them. There are numerous interesting network fuzzers. If you're looking for network stack fuzzing, I would recommend ISIC; it's pretty old, but it's still incredibly effective. It tends to break most network stacks. So with that said, let's show some examples. So U-Boot, for example: this is the U-Boot DNS code, and if you see that TID there, that's the DNS ID, and they basically use the static DNS ID of 1 for all DNS packets that they send out. So doing DNS man-in-the-middle on a U-Boot environment is extremely trivial: you guess the DNS ID by saying that it's 1, and you are correct 100% of the time. This is Broadcom's CFE, and this is the DHCP parser, and it basically has this sort of junk stack buffer where it just needs to read something out of a buffer, and it goes and says, okay, get me the length of that thing. It's a uint8, so that means it can represent 0 up to 255, and then it reads 0 up to 255 bytes into that junk buffer. So that's a stack corruption right there. If you then look further at the CFE DHCP parsing, it's very similar: in this case it has a 512-byte buffer, and it basically copies the entire RX buffer into it, and this is for Ethernet, so up to 1500 bytes gets copied into this 512-byte buffer, again causing memory corruption. Obviously we're not done with CFE yet, because it has such beautiful code.
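The TLV parsing theme above, and the clamp CFE's DHCP code was missing, can be sketched together in one loop. The tag/length/value layout and the 64-byte scratch size are illustrative assumptions, not any particular protocol:

```c
/* Bounds-checked TLV walk with a clamped per-option copy. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCRATCH_LEN 64

/* Walks tag/len/value options; returns how many were fully in bounds. */
static int walk_tlv(const uint8_t *buf, size_t buflen, uint8_t *scratch) {
    size_t off = 0;
    int count = 0;
    while (off + 2 <= buflen) {             /* need room for tag and length */
        uint8_t len = buf[off + 1];
        if (off + 2 + (size_t)len > buflen)
            break;                          /* value would run past the packet */
        size_t n = len;
        if (n > SCRATCH_LEN)
            n = SCRATCH_LEN;                /* clamp: a uint8 can claim up to 255 */
        memcpy(scratch, buf + off + 2, n);
        off += 2 + (size_t)len;
        count++;
    }
    return count;
}
```

Both failure modes from the slides are covered: the truncated option that would read past the packet, and the oversized option that would overflow the scratch buffer.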
So this is the CFE ICMP ping handler, and this has this really cute use-after-free, or double-free, bug. It basically sends out a ping and then sits there receiving, you see the while loop there, and it sits there receiving packets until it finds the right one. But because it doesn't want to hang, it also has a timeout, and there's this interesting condition where, if the last packet it looked at was the packet you're looking for but it also timed out at the exact same time, it frees that packet twice, which obviously can lead to memory corruption. Again, we're not done with CFE yet. Oh right, there we go. Not done with CFE yet. This is the IP handling in CFE, and if anybody has ever looked at an IP header, which I would assume most of you have, you'll know it has an IP header length and a total length. And of course CFE needs to know these because it needs to know where stuff is, but CFE validates neither the IP header length nor the total length, so once you start messing with those, CFE blows up really fast. So that was a quick overview of the trivial TCP/IP-related bugs that show up in your average bootloader. But let's say you've got that covered: what's the next thing? What more network stuff can we do? And then of course what comes to mind is Wi-Fi, 802.11. A surprising number of bootloaders don't have 802.11, and I don't know if that's on purpose or it's just, we didn't get to this feature yet but we will at some point, and obviously it'll have bugs when they implement it. But we did find one that has it, and I'll get to that in a second.
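Going back to the IP handling point for a second: the checks CFE skipped amount to roughly the following. IHL is counted in 32-bit words and must be at least 5; the header can't claim to be longer than the datagram, and the datagram can't claim to be longer than what actually arrived. This is a sketch, not CFE's code:

```c
/* The missing IPv4 length sanity checks, sketched. */
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

static int ip_lengths_ok(uint8_t ihl_words, uint16_t total_len, size_t rxlen) {
    size_t hdr_len = (size_t)ihl_words * 4;
    if (ihl_words < 5)
        return 0;                  /* minimum IPv4 header is 20 bytes */
    if (hdr_len > total_len)
        return 0;                  /* header claims to exceed the datagram */
    if ((size_t)total_len > rxlen)
        return 0;                  /* datagram claims to exceed what we received */
    return 1;
}
```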
One thing I did want to mention is that if you look at 802.11 frame parsing, depending on which device you have, and particularly which radio your device is using, you'll have radios that do a lot of parsing on the radio itself, in which case the stack is kind of covered, or at least you hope it is, because the radio does a bunch of validation. But then there's a whole bunch of radios that just take packets from the wire, don't do anything, and pass them on to the OS. That's the stuff that's interesting from an attack surface point of view if you're looking to attack the bootloader and not the firmware. So yeah, we looked at iPXE, and of course, any time you do any kind of Wi-Fi stuff, the first thing you do is look for an SSID, right? That's what this thing does, and it has this SSID buffer. An SSID can be up to 32 bytes, so that buffer is 32, well, it's 33 bytes, because it's 32 plus one. And it has this loop where it goes over the IEs that it gets, and when it finds the IE for an SSID, it says, okay, we'll take this IE and copy IE-length bytes. The IE length is a uint8, so it can be up to 255, and it copies the SSID IE into the SSID buffer, which is only 32 bytes, causing memory corruption. So that's iPXE for you. Okay, so the next one, if you're thinking networking, would be Bluetooth, and Joe and I have actually looked at proprietary bootloaders that have Bluetooth support. Unfortunately, we can't talk about those bugs because they're covered by non-disclosure agreements, and we tried really, really hard to find a similar equivalent in some kind of open source bootloader, and we couldn't find one. So unfortunately we can't give examples of Bluetooth bugs in open source bootloaders, but I do want to talk in general terms about what we have seen and where we suspect the bugs are going to be if somebody does do this in an open source bootloader or a new bootloader.
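The iPXE SSID copy shown a moment ago, done safely, looks roughly like this. The buffer and function names are illustrative; the essential point is that the IE length is a uint8 claiming up to 255 bytes while an SSID is at most 32, so the copy must be clamped:

```c
/* Clamped SSID IE copy: never trust the IE length as a copy length. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SSID_MAX 32

/* ssid must point to at least SSID_MAX + 1 bytes */
static size_t copy_ssid(char *ssid, const uint8_t *ie, uint8_t ie_len) {
    size_t n = ie_len;
    if (n > SSID_MAX)
        n = SSID_MAX;    /* truncate oversized IEs instead of overflowing */
    memcpy(ssid, ie, n);
    ssid[n] = '\0';
    return n;
}
```

A stricter stack might reject the frame outright instead of truncating; either way, the 32-byte bound comes from the destination, not the attacker.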
So, also, if you're going to do Bluetooth in a bootloader, it's usually for HID devices, right? Keyboard and mouse; this usually ends up being for consumer devices. But in general, if you look at a Bluetooth stack and you're looking for any kind of parsing bugs, there are three recurring themes that we saw when we looked at the Bluetooth stacks that we have, and this is if you look at the lower layers like L2CAP and things like that; it's usually related to frames and frame lengths. First, very, very large frames: a frame can be up to about 65,000 bytes, because the length is a uint16, and if you create really large frames, right up to the edge, that tends to blow up Bluetooth stacks. The other one is if you create very, very short frames, less than what something is expecting; that tends to blow up your Bluetooth stack. And then lastly, L2CAP can have fragmentation, so you can have individual fragments that get added together, where every fragment can be X bytes but the whole thing can be up to 65,000 bytes. If you start playing around with the fragmentation, we've seen numerous Bluetooth stacks blow up. Again, I wish I could have shown an actual bug here, but I wasn't able to find any in the open source bootloaders. So, moving on to USB. This is prime attack surface in bootloaders, obviously. If you've followed the news in the last couple of weeks and months, this has shown up in a number of devices. At least up till recently, I think this was sort of underreported, or people didn't quite care about it, but to me, any time I look at a bootloader, my first thought is NVRAM, and number two is USB, right?
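The L2CAP fragmentation theme from a moment ago can be sketched as a reassembly check: the total length announced up front is a uint16, and the accumulated fragments must never exceed it. The struct layout and names here are illustrative assumptions, not any real stack:

```c
/* Fragment reassembly that enforces the announced total length. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct reasm {
    uint16_t expected;    /* total length announced in the first fragment */
    uint16_t have;        /* bytes accumulated so far */
    uint8_t  buf[65535];
};

/* 0 on success, -1 if the fragment would overflow the announced total */
static int add_fragment(struct reasm *r, const uint8_t *frag, uint16_t fraglen) {
    if ((uint32_t)r->have + fraglen > r->expected)
        return -1;        /* peer sent more than it announced */
    memcpy(r->buf + r->have, frag, fraglen);
    r->have += fraglen;
    return 0;
}
```

Note the widening to uint32_t before the addition: summing two uint16 values in 16 bits is itself one of the wraparound bugs this check is supposed to prevent.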
And USB is interesting because a lot of bootloaders support USB, either for storage or for things like Ethernet dongles, but often for storage, where either you expose certain files or you try to do some kind of recovery boot from USB or something like that. And if you start looking at how USB works at a slightly lower level, it's not like PCI; it's more packet-based. So what happens when a device talks to a host is that the device is asked for, quote unquote, descriptors. These descriptors say certain things about your device, and based on the descriptors the host asks from you and all the answers you give in descriptor responses, the host can figure out what kind of device you are, what class you are, what functionality you expose, all these kinds of things. And a lot of these descriptors end up being parsed, and being parsed wrong. So generally we often see either straight-up overflows or double fetches, because the way descriptors work is that they're variable-length content but the headers are predefined. The way it works is, you first ask for the header, and then, based on the length in the header, you ask for the thing again, except the USB protocol doesn't let you get just the payload: you have to reread the whole thing, so you reread the header you already had. In most implementations, what happens is you get the header, you allocate a buffer, then you get the header and payload again and you overwrite the original header. Which means you get to overwrite the original header length, and so you can have a TOCTOU where your device gives you a header with a good length, and then the second time it gives you a descriptor with a bad length, and if your host doesn't validate that both lengths are the same, bad things happen. And yeah, straight-up overflows happen too, because nobody ever expects the malicious USB device.
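The double-fetch defense described above reduces to one comparison: after re-fetching the descriptor (header plus payload), the length in the re-read header must match the length from the first read. The struct below mimics USB's wTotalLength naming but is an illustrative sketch, not any real host stack:

```c
/* Consistency check between the first and second descriptor fetch. */
#include <assert.h>
#include <stdint.h>

struct desc_hdr {
    uint8_t  bLength;
    uint8_t  bDescriptorType;
    uint16_t wTotalLength;
};

/* 1 if the second fetch is consistent with the first, else 0 */
static int double_fetch_ok(const struct desc_hdr *first,
                           const struct desc_hdr *second) {
    return first->wTotalLength == second->wTotalLength &&
           first->bDescriptorType == second->bDescriptorType;
}
```

The alternative fix is to ignore the re-fetched header entirely and keep using the length from the first fetch, which sizes the allocation.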
There's an example in GRUB, for example, where it goes and gets a descriptor, and the descriptor says, oh, here's my config count, I have this number of configurations; go and fetch those descriptors as well. And there's a predefined array for the number of configurations that GRUB has, and it doesn't correlate that with the config count; it just always assumes the config count is less than or equal to the array size. So if you have a malicious device and you say, hey, I know your array only holds a couple of elements, but I'm going to give you 64 configurations, it'll happily write 64 configurations into a place that can only hold a fraction of them. And exactly the same thing for the number of interfaces. TianoCore had similar bugs. This was TianoCore, where they go and get a descriptor, and then the descriptor length says, okay, now go fetch me the whole thing using that descriptor length, except the original hub descriptor was a very small struct and the descriptor length is a uint8, so it can say up to 255 bytes. And so that would have smashed memory and caused a stack corruption. As you can see here, the fix wasn't actually to add a bounds check, but just to make the buffer bigger: because you know the length is a uint8, if you just make the buffer 256 bytes, then any length you're given will fit within the buffer. Hold on. Yeah, this is an example in SeaBIOS, the classic double fetch, where it goes and gets the header, does an alloc based on the header, and then gets the header again with the content, and it doesn't verify that the first wTotalLength is the same as the re-fetched wTotalLength. And because that verification isn't there, whoever calls this thing can no longer trust wTotalLength, because it could be invalid. So, as I said, USB to me is one of the prime attack surfaces against secure boot. And I want to very, very briefly mention two recent real-world cases where devices got broken into because of USB parsing in the bootloader.
So this is the case of the Nintendo Switch, the Tegra. This was done by the fail0verflow people about a year or so ago: basically you give it a length that's not validated, and then a memcpy, and that causes memory corruption. And then the recent iPhone checkm8 — this is slightly more complicated, because it wasn't straight-up memory corruption, but if you fiddle around enough with the state, it sort of gets out of sync, and it has all these pointers that it still considers to be alive but the memory's been freed. And so this ends up as a use-after-free, but it's because of a sequence of USB packets that are being sent. Right, so that's it for USB. Obviously a bunch of the buses — almost any bus on your device, if your boot chain uses it — is interesting attack surface. SPI flash shouldn't be trusted. SDIO, I2C, LPC, even your TPM, right? Even the TPM response you get back can't be trusted, because somebody could desolder your TPM and pretend to be your TPM. And if you don't validate the data you get from the TPM, you end up with memory corruption. So this is, for example, what happened in SeaBIOS. SeaBIOS talks to the TPM, and it goes and gets this structure, and you can send it less than what it expects. And then basically they subtract: the size of a struct minus a length smaller than what they expect. And so that causes an integer underflow, which ends up with a really big size. And then that size is given to malloc, and malloc internally has an integer overflow, so that really big size then becomes a very small size, and then they copy into it, and that causes memory corruption. And so this is a combination of two bugs, right? One is where the sizes are wrong, and secondly the malloc has an internal integer overflow. And this is the malloc internals of SeaBIOS, which I don't wanna dive into now because it's pretty convoluted and complex. I'll leave this as an exercise to the audience to figure out where the int overflow is.
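The integer underflow half of the TPM bug can be sketched like this. The struct layout and names here are invented for illustration, not the actual SeaBIOS definitions; the point is that subtracting a fixed header size from a device-controlled total length underflows when the device reports less than a full header.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical TPM response header (layout for illustration only). */
struct tpm_response_header {
    uint16_t tag;
    uint32_t totlen;    /* device-controlled total length */
    uint32_t errcode;
};

/* Compute how many payload bytes follow the header. If the device
 * reports a totlen smaller than the header itself, a naive unsigned
 * subtraction wraps around to a huge value, which then feeds malloc. */
static int payload_len(uint32_t totlen, uint32_t *out) {
    if (totlen < sizeof(struct tpm_response_header))
        return -1;      /* reject before subtracting */
    *out = totlen - sizeof(struct tpm_response_header);
    return 0;
}
```

In the buggy version the `< sizeof(...)` check is absent, so a short response yields a near-`UINT32_MAX` size, which the talk's second bug (the allocator's internal integer overflow) then shrinks back into a tiny allocation that gets overrun.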
And technically it's not an overflow, because it's a bunch of shifting being done, but it essentially comes down to an int overflow. Yeah, there we go. Yep, there we go, okay. So another attack surface that is interesting but often overlooked on devices — except for UEFI — is System Management Mode. And I mean, there have been over the last decade and a half numerous presentations about SMM attack surface and breaking SMM handlers for UEFI, because it's an x86 thing and you see this in UEFI stacks. And it was a sort of cat-and-mouse game, where for years somebody breaks something and then UEFI fixes it, and then somebody breaks it again and UEFI fixes it again, and somebody breaks it again and UEFI goes, okay, let's do the mitigation for this — and this has gone on for like 15, 16 years. And occasionally people still find bugs, but by and large UEFI does fairly well now in regards to SMM handling. There's some third-party stuff that still breaks occasionally, but in general they got a handle on this. Now what if you're using x86 but you reimplement your own bootloader and you don't use UEFI, right? Well, that means you're gonna run into all the same problems UEFI had, and you're gonna screw it up in all the same ways, and it'll take you another 15 years to get it right. But I guess at least they tried on the first time — this is what coreboot does, and to their credit they say: oh, we get input, we should range check it — and their range check says TODO. Okay, so now I talked about a bunch of buses. The thing to me that sort of separates those from other things is DMA. Generally, for as long as DMA has been around, the idea has been that DMA is game over, right? If you get DMA, then that's that, there's no trust boundary. Obviously that's no longer true, right?
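The range check that the coreboot TODO comment is standing in for looks roughly like this. This is a generic sketch, not coreboot code: the SMRAM base and size are made-up values (real ones come from the chipset's TSEG configuration), but the idea — an SMI handler must refuse OS-supplied buffers that overlap SMRAM or wrap around — is the classic SMM hardening check.

```c
#include <stdint.h>

/* Hypothetical SMRAM window; real values come from TSEG registers. */
#define SMRAM_BASE 0x7f000000u
#define SMRAM_SIZE 0x00800000u

/* Returns 1 if [addr, addr+len) is safe to touch from an SMI handler,
 * 0 if it wraps around or overlaps SMRAM. Without this check, ring 0
 * can trick SMM into reading or writing over itself. */
static int range_ok(uint64_t addr, uint64_t len) {
    uint64_t end;
    if (__builtin_add_overflow(addr, len, &end))
        return 0;                               /* wrapped around */
    if (addr < (uint64_t)SMRAM_BASE + SMRAM_SIZE && end > SMRAM_BASE)
        return 0;                               /* overlaps SMRAM */
    return 1;
}
```

The overflow-checked addition matters as much as the overlap test: a huge `len` that wraps `addr + len` past zero would otherwise sail through a naive comparison.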
We have IOMMUs, and if you use them, and if you have a device that has this available, then all of a sudden DMA can be stopped, it can be contained, it can be regulated — so it's no longer game over and you can implement a trust boundary. Few things there. A, if you are an embedded device and you're using an IOMMU, you are way ahead of the curve, because not that many people are doing this, right? You should, but it takes some effort. There are obviously many different IOMMUs, depending on your architecture and the board you have and all sorts of things. So Intel has VT-d, ARM has the SMMU, AMD has things like the Device Exclusion Vector and a few other ones as well — but basically, if you use one of these, DMA is no longer the game over. Now that doesn't mean that you're in the clear. Once you define DMA as an attack surface, you have to defend it, and that's where you get some difficulty, because you'll end up using drivers that have been written before it was a trust boundary — and I mean, no one has ever written a DMA handler with a trust boundary in mind, because you assumed there was no trust boundary.
So now if you start using the IOMMU, you have to go back through all your drivers and look at, okay, where am I doing DMA — and now I can no longer trust what I get back from DMA, and you gotta go validate all this stuff. But even if you do all of that right — which, by the way, is very hard and you probably won't do it right, but let's say you do — secondly, because DMA is now a trust boundary, any time you open up a memory window for a device, you can't just open it up: you have to clear the memory first, because otherwise you're now leaking memory to the device. So all these new things show up if you take DMA as a trust boundary. And let's say you do that right — well, now you still have a dependency on the IOMMU, because you are assuming the IOMMU is perfect, when it probably isn't, right? I haven't really gotten to this part yet, but one of my plans for the future is to go look at IOMMUs and see if I can attack IOMMUs and find bugs there. I strongly suspect there will be side channels and logic bugs, and maybe even some hardware implementation bugs.
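The two new rules described above — zero a window before exposing it to the device, and bound-check everything the device wrote back — can be sketched like this. All the names here are hypothetical; a real driver would map the window through the IOMMU rather than `malloc`, but the validation logic is the same.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical DMA completion record written back by the device. */
struct dma_completion {
    uint32_t nbytes;    /* how much the device claims it wrote */
    uint32_t status;
};

/* Rule 1: zero the window before exposing it, so stale heap contents
 * don't leak to a device that is now an attacker-class peer. */
static void *map_dma_window(size_t len) {
    void *buf = malloc(len);
    if (buf)
        memset(buf, 0, len);
    return buf;
}

/* Rule 2: never trust device-written fields. A device claiming it
 * wrote more than the window holds is lying (or attacking). */
static int check_completion(const struct dma_completion *c, size_t window_len) {
    if (c->nbytes > window_len)
        return -1;
    return 0;
}
```

Drivers written before DMA was a trust boundary typically skip both steps, which is why retrofitting an IOMMU onto an existing driver stack is the hard part.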
So bug-wise, this is where it gets into design. This is UEFI today — it's EDK2 platform code — where it has support for the IOMMU, so they're ahead of the curve. But if you look at the spec, there's no good handover protocol from UEFI to the next stage. And so UEFI basically boots up, very early configures the IOMMU and makes sure that devices can't peek or poke arbitrary physical memory, and then it does all of its stages — and then, when it's about to hand off, it goes: well, I don't know if the next guy supports the IOMMU. So it undoes all the IOMMU programming it did, turns the thing off, opens up everything, and then hands it over to the next guy. So you did all this work for nothing. And this is a spec bug, and it's being worked on — people at Intel are very much aware of this, people at Apple have fixed this for their devices — this is going to get fixed in the future. But given that this has to be done by spec, it takes a while, because you have to get people to agree to it and then have numerous different implementations implement it. And this is where I hand back over to Joe. Yeah, so hardware — it's pretty much out of scope for the presentation, we did not look at any of this, but we thought it would be naive to not at least mention it to people. What we mean by that are things like glitching, side channels, silicon stuff. With glitching you have fault injections, and a lot of times people go after things like the signature verification, so they'll basically glitch that part and then they'll start running unsigned code. A recent example of some glitching was done by fail0verflow for the PS4 Syscon — I forget the specifics, it's been a while, really really good blog post, you should check it out — but I think it would go into like an infinite loop, like, go to debug mode or something, and it would say that's not enabled, and they glitched out of it, and then it initializes debug mode. So stuff like that you should obviously be concerned with. Then clearly side channels, with timing
discrepancies, power discrepancies, things like Spectre and Meltdown, speculative execution in general — these are where people are leaking secrets, so they're going after keys so they can start signing their own code, stuff like that. And then chip attacks. This is somewhat interesting because it's obviously only relevant for a very sophisticated attacker; it's going to be somebody who has very expensive equipment. They do things like decapping, they use FIBs and SEMs, they can do things like optical ROM extraction and get the boot ROM, and then start auditing that and then find a bug and exploit it. Obviously totally out of scope for the presentation, but one thing that's interesting about this is that presently still not a lot of people have this expensive equipment — but I think very soon, you have people like John McMaster and such who have a SEM in their garage, and so you're eventually going to start getting these regular hackers in their garage that will get this equipment as the older equipment becomes more affordable. And maybe that opens us up to a more realistic threat model, instead of people just kind of ignoring it and going, I'm not worried about somebody with a quarter of a million dollars worth of equipment — now it's turning into somebody who dropped 10K or 15K. And then a quick note on code integrity: it's something that people mess up a lot, it's kind of hard to do right. You have people that do weak crypto, you have blacklist problems — an example of that is, you know, there's finite space for their blacklist, it'll eventually get exhausted. So when you have stuff like where I mentioned that you have a signed GRUB — there's a known bug in a signed GRUB binary that was released by Kaspersky, and if that's not on the blacklist of your UEFI platform at home, you can just load that up and then exploit it, and you just broke secure boot. And it's not going to get fixed until that platform has an up-to-date blacklist, and eventually, if all this crap gets blacklisted,
eventually that list will be full, and then they can't blacklist anymore, right? So it's a concern there. And then you'll have issues where they'll only sign certain portions, not the full blob, so you can still modify certain parts — or sometimes you'll see they just check for a signature existing, but they don't validate it. You know, really, really dumb stuff. So, conclusions. Obviously this is the tip of the iceberg; there's a lot of stuff we didn't look at. And if it wasn't clear, we did not really audit these things — we literally would chat with each other every day and be like, hey, we need a SeaBIOS bug. Like, okay, give me an hour. Okay, I got one. Okay, check the box, and let's move on to the next one. Our goal was to have an example of a bug for each of the attack surfaces we looked at, and that was it — so once we found one, we stopped. So you guys should go hunting and have fun. But if it wasn't obvious, there's a surprising amount of low-quality code. It's kind of crazy, and it's pretty clear that not a lot of people are looking at this stuff. And one thing that kind of sucks is you get to NDA hell really, really quick if you want to look at any of this proprietary stuff. So if you're interested in Qualcomm stuff, for example, and you want to get some documentation or look at it, you pretty much have to sacrifice a firstborn to get access to that stuff — it's just not going to happen, and it's kind of silly. So our advice and call to action: people should be minimizing their images, their boot environments, their host environments. Turn off features you don't need. If you don't need a network stack, why have it? If you don't need USB, why have it? All you're doing is enabling attack surface for somebody to leverage. And something that we see a lot, which is insane, is these little embedded devices that are running Linux, and they have literally more drivers than our desktops at home, and it's just like, what is going
on here? It doesn't make sense. You should really, really, really work on limiting the attack surface. And really quick, mitigations: there aren't any in most environments. Yeah, there just aren't any. As you can see from the example from before — that was, you know, just from this morning — really quick, you know, smash and overwrite the stored PC, and that's it. There's no ASLR, there's nothing like that, control flow integrity, whatnot. But there's a link to a GitHub, and that is an Intel employee who has gone and implemented a lot of these mitigations that are moving into TianoCore, so they are way, way, way ahead of the game compared to everyone else, because they're actually getting a lot of these mitigations in, which is quite impressive — and you should check it out, because it's interesting code. So, kind of a call to action: we really hope that this was inspirational to a few of you. If you've never looked at this stuff and it was always this black box that you weren't sure about, you should just start poking around. Go find the slide where we showed the build instructions, go build some of these things and mess around — you're gonna have fun. And like we said, with no mitigations, you can work on some easier exploit dev stuff, and it's cool, right? But it's clear that a lot of people need to be reviewing this, because it's clearly not happening. People should start fuzzing interfaces — everything we showed should have been found with basic fuzzing. We're kind of at a point where I just said, I'm just going to take like a Teensy or something and have it do that classic descriptor double fetch, because I think we've seen that in like every USB stack. Just make a device and start plugging it into stuff and watch it break. And then obviously periodic reviews and whatnot. But yeah, that's pretty much it, so — well, thank you. Now we have about 10 minutes for Q&A, and we will start immediately with the Internet. Thank you. Has the GRUB issue been fixed yet,
and was the code unique to GRUB or borrowed from, you know, elsewhere? I don't think it's been fixed yet, and this was from whatever the official GRUB repository was. All right, if you have a question in the room, please line up at a microphone so I can see you. And now microphone number two, please. So let's say you want to make a more secure laptop: you take coreboot, you take a static kernel so that everything is compiled in, no modules. But what about the attack surface — let's say somebody tries to interfere while it is booting, for example, and maybe somehow gets DMA access, because you somehow interfere with, let's say, a Broadcom network device that has some special firmware to optimize the traffic or something, and then this thing gets buffer overflowed and makes some problems on the bus or something. Are you talking about attack surface from devices? Yes, it's like that. Let's say you want to make a good system. So I think the best you can do, if you have a laptop, is you take coreboot and a Linux that is compiled completely without module support — everything you want to have is in the static kernel — and it just boots, and the kernel is in coreboot, so it's in the flash. But you can get a problem from, let's say, some devices that just spit into your memory, because they can be a DMA master. Yeah, that's definitely a concern, and that's one of the things where we said, okay, well, if you do use DMA as it has been forever, then it's game over. Luckily nowadays we have IOMMUs, so depending on which situation you are in and what hardware you have, you can configure your device to use the IOMMU, and if you're doing that, then even if a certain hardware device is compromised, you can still try to protect your CPU and your host from the device by having the IOMMU. I hope that answers your question, but it's about as good an answer as I can come up with.
So this is the maximum you can do against such attacks, I mean from a device — let's say you want to make an extremely hardened laptop that you use in an extremely hostile environment. Say again? You want to make a hardened laptop that you use in a hostile environment, right? I mean, I think that's the best you can do, right? You can go and look at your host as well, and look at how it parses the stuff that comes from the bus, because there'll be bugs there too, but I think that's the best I can come up with. Yeah, I mean, you minimize the attack surface as much as you can, right, and once you have it as minimized as possible, then you just kind of hope there's nothing there. But you don't need the code signing, because you have everything in your flash — or do you still need some stuff? Oh, you mean the stuff that runs on the host? Yeah, I mean, obviously — let's say you have your code, the Linux, in the flash, and so everything that we have — can you maybe take this discussion after the talk, because we have more questions? Thank you. Thanks. Signal angel, please. What is your favorite disassembler that you use for reverse engineering the bootloaders? Uh, we didn't really — we just looked at source code. We only needed one in a few places. Yeah, I mean, in this particular case most of it was white box, so except for the exploit we didn't have much need for a disassembler. I guess a little bit of gdb. In general, when we do do reverse engineering, I mean, IDA is my go-to. I've been playing around with Ghidra a little bit — looks promising, has undo, which is nice — but those are usually it. And then, you know, for doing Linux stuff, gdb is nice when you're doing debugging, but generally IDA. Microphone number one, please. I have two questions. What's your opinion on the ARM Trusted Firmware architecture? I have to work with this, and I'm a little bit shocked — in my opinion it's a little bit overcomplicated. And the
second question is: should we also question the boot ROMs? Because what I've seen in a special project leads me to believe that the boot ROM is also very broken, and then the question is if it's really necessary to harden the other stuff. Right, yep, those are really good questions — by the way, it sounds like you're working on this. So, well, your first question was what again? What do you think about the ARM Trusted Firmware? Right, yes. I, like you, have touched upon it in a few engagements. Unfortunately, any of the concrete stuff I can't really talk about, because it was covered by NDAs — yeah, I see you're aware of this problem — but I share your opinion that there are some things in there that are troubling. With regards to the boot ROMs, you're spot on. I mean, we tried to bring this up in one of the slides: if there's a bug there, you can't fix it — a hardware revision is the only way to fix it, so you're kind of screwed. There are some designs you can do to try to minimize it. So you could say, okay, well, in your boot ROM you could have a feature that allows for quote unquote patching the boot ROM, where you can have a piece of something on storage that can override what the boot ROM is supposed to do. Obviously, up until that point, whatever's in the boot ROM you still can't fix — but you can minimize the amount of stuff in the boot ROM that's not patchable. You can't minimize all of it, though; there's going to be some of it that is locked, burned into hardware, and can never be fixed for that device. I'd love to hear of a better solution — I don't know of any. No, I was just going to say, this kind of touches on when we said people underestimate reverse engineering. They keep it super, super black box, it's this secret thing, and then, you know, when somebody finds a way to dump a boot ROM, they start looking at it, and then they just see, you know, a bug in a USB
stack, and then the Switch goes down, or the iPhone, or whatever else, right? So, from my perspective, I would say it would make more sense to do more open auditing and let more eyes see it, instead of hiding it in the corner and pretending, well, it's okay because nobody's looking. That's at least my perspective on it. I know why people are super secretive about it, but you need more eyeballs. So maybe, if you've got the right contacts — my wish would be to get blank chips, so I could flash them with my own boot ROM, so I have a method to improve things and keep the attack surface to a minimum. Yeah, and then there's going to be two other ARM dies on there that are circumventing everything that you just did, so — yeah. Do we have more Internet questions? Yes: are you aware of any good, and maybe even free, bootloader fuzzers? Not in its whole — I did mention a few things. So, if you're doing any kind of file-based stuff and you can isolate it, AFL is, you know, the go-to these days. For network stuff there's a couple of interesting ones, like isic, and there are some Wi-Fi fuzzers, and I think there are, you know, some DNS fuzzers and things like that, so that works. I would say the best way, in my opinion, would be to pull out the interesting components that are doing the parsing — the stuff you're concerned about — and write a harness for that. So you don't really need to try to write a fuzzer for that environment; you just pull that feature out, harness it up, and let it run on a bunch of cores. So I think libFuzzer might be very helpful there. All right, number two, please. Hi. Would it help reduce the occurrence of such errors by using another programming language that would feature more secure things at compile time or during runtime, like maybe Rust? Would it help, and if yes, are there alternative projects that use other languages
other than C that I'm aware of? So, I think you're right: I think we can reduce the amount of bugs, especially the memory corruption stuff, if we switch to something like Rust — even though I think on occasion there's some Rust corner case that shows up where something doesn't get caught by the verifier, those are weird corner cases. In general, yes, I think you're right: if we switch languages, a lot of the memory corruption stuff goes away. But not everything goes away — all the design stuff is still there, right? Is anyone currently doing that, though? All right, well, there was actually a talk by Andrea Barisani — oh yeah — yesterday or the day before, where he's doing a Go runtime as a base for an ARM device. So this is an avenue that is currently at the beginning of being researched and worked on. I think it's interesting, I think it's a direction we could go into, I think there's something there. At the present time it still seems a bit too early, and the other thing is that while it does reduce the number of bugs that are going to be there — in particular the memory corruption stuff — there will still be, you know, your logic bugs and your hardware bugs and things like that. So it's not a silver bullet, but it could definitely help. Okay, and we're out of time. Please thank our speakers again.