because it wants to replace this closed-source firmware with an open Linux boot version, and our next speaker, Trammell Hudson, is an integral part of that project and is here to provide you an overview of this LinuxBoot project. Thank you very much, and please give a round of applause to Trammell Hudson.

Thank you. Securing the boot process is really fundamental to having secure systems, because vulnerabilities in firmware can affect any security that the operating system tries to provide, and for that reason I think it's really important that we replace the proprietary vendor firmwares with open source software like Linux. This is not a new idea. My collaborator Ron Minnich started a project called LinuxBIOS back in the 90s when he was at Los Alamos National Labs. They built the world's third-fastest supercomputer out of a Linux cluster that used LinuxBIOS in the ROM to make it more reliable. LinuxBIOS turned into coreboot in 2005; the Linux part was removed and it became a generic bootloader, and it now powers the Chromebooks as well as projects like the Heads slightly-more-secure laptop firmware that I presented last year at CCC. Unfortunately it doesn't support any server mainboards anymore. Most servers are running a variant of Intel's UEFI firmware, a project that Intel started to replace the somewhat aging 16-bit real mode BIOS of the 80s and 90s, and like a lot of second systems it's pretty complicated. If you've been to any talks on firmware security you've probably seen this slide before. It goes through multiple phases as the system boots. The first phase does a cryptographic verification of the Pre-EFI Initialization phase. The PEI phase is responsible for bringing up the memory controller, the CPU interconnect and a few other critical devices. It also enables paging and long mode, and then jumps into the Driver Execution Environment, or DXE phase. This is where UEFI option ROMs are executed and all of the remaining devices are initialized.
Once the PCI and USB buses have been walked and enumerated, it transfers to the Boot Device Selection phase, which figures out which disk, USB stick or network to boot from. That loads a bootloader from that device, which eventually loads the real operating system that then runs on the machine. What we're proposing is that we replace all of this with the LinuxBoot kernel and runtime. We can do all of the device enumeration in Linux, which already has support for doing this, and then we can use more sophisticated protocols and tools to locate the real kernel that we want to run and use the kexec system call to start that new kernel. The reason we want to use Linux here is because it gives us the ability to have a more secure system, it gives us a lot more flexibility, and hopefully it lets us create a more resilient system. On the security front, one of the big areas where we get some benefit is that we reduce the attack surface. In the DXE phase, the drivers are an enormous amount of code: on the Intel S2600, there are over 400 modules that get loaded. They do things like run the option ROMs that I mentioned; if you want an example of how dangerous option ROMs can be, you can look at my Thunderstrike talks from a few years ago. They also do things like display the boot splash, the vendor logo, which has been a place where quite a few buffer overflows have been found in vendor firmwares in the past. They have a complete network stack, IPv4 and v6, as well as HTTP and HTTPS. They have legacy device drivers for things like floppy drives; again, these dusty corners are where vulnerabilities have been found in Xen that allowed hypervisor breakouts. There are also modules, like Microsoft OEM activation, that we just don't know what they do, or things like a Y2K rollover module that probably hasn't been tested in two decades.
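The handoff described above, where Linux enumerates the devices, locates the real kernel, and hands off with kexec, can be sketched roughly as follows. This is a minimal illustration, not LinuxBoot's actual boot policy: the device names, paths, and selection rule are all hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"strings"
)

// bootCandidate is a kernel image found on some boot medium during
// the Linux-side device walk.
type bootCandidate struct {
	device string // block device it was found on (hypothetical names)
	path   string // path to the kernel image on that device
}

// pickKernel returns the first candidate that looks like a bootable
// Linux kernel. The matching rule here is a toy policy for illustration.
func pickKernel(cands []bootCandidate) (bootCandidate, error) {
	for _, c := range cands {
		if strings.Contains(c.path, "vmlinuz") {
			return c, nil
		}
	}
	return bootCandidate{}, fmt.Errorf("no kernel found")
}

func main() {
	// In a real runtime these would come from mounting and walking
	// the filesystems Linux enumerated; they are hard-coded here.
	cands := []bootCandidate{
		{device: "/dev/sda1", path: "/EFI/BOOT/BOOTX64.EFI"},
		{device: "/dev/sda2", path: "/boot/vmlinuz-4.14"},
	}
	k, err := pickKernel(cands)
	if err != nil {
		panic(err)
	}
	fmt.Printf("would kexec %s from %s\n", k.path, k.device)
	// The real handoff uses the kexec_file_load(2) system call
	// (requires CAP_SYS_BOOT), then a reboot into the loaded kernel.
}
```

A real runtime would replace the hard-coded list with a filesystem walk and perform the actual kexec_file_load call, which is omitted here because it needs root and reboots the machine.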
So the final OS bootloader phase is actually not part of UEFI; on a typical Linux system it's GRUB, the GRand Unified Bootloader. Many of you are probably familiar with its interface, but did you know that it has its own file system, video and network drivers? Almost 250,000 lines of code make up GRUB. I don't bring up its size to complain about the space it takes, but because of how much it increases our attack surface. You might think that having three different operating systems involved in this boot process gives us defense in depth, but I would argue that we are subject to the weakest link in this chain, because if you can compromise UEFI, you can compromise GRUB, and if you can compromise GRUB, you can compromise the Linux kernel that you want to run on the machine. There are lots of ways these attacks could be launched. As I mentioned, UEFI has a network device driver, GRUB has a network device driver, and of course Linux has a network device driver. This means that a remote attacker could potentially get code execution during the boot process. UEFI has a USB driver, GRUB has a USB driver, and of course Linux has a USB driver. There have been bugs found in USB stacks, which unfortunately are very complex, and a buffer overflow in a USB descriptor handler could allow a local attacker to plug in a rogue device and take control of the firmware during the boot. And UEFI has a FAT driver, GRUB has a FAT driver, Linux has a FAT driver; this gives an attacker a place to gain persistence and perhaps leverage code execution during the initial file system or partition walk. So what we argue is that we should use the operating system with the most contributors, the most code review and the most frequent update schedule in these roles. Linux has a lot more eyes on it, and it undergoes a much more rapid update schedule than pretty much any vendor firmware.
You might ask, why do we keep the SEC and PEI phases from the UEFI firmware? Couldn't we use coreboot in this place? The problem is that vendors are not documenting the memory controller or the CPU interconnect. Instead, they're providing an opaque binary blob called the Firmware Support Package, or FSP, that does the memory controller and the CPU initialization. On most modern coreboot systems, coreboot actually calls into the FSP to do this initialization. And on a lot of devices, the FSP has grown in scope: it now includes video device drivers and power management, and it's actually larger than the PEI phase on some of the servers that we're dealing with. The other wrinkle is that most modern CPUs don't come out of reset into the legacy reset vector anymore. Instead, they execute an authenticated code module, called Boot Guard, that's signed by Intel, and the CPU will not start up if that's not present. The good news is that this Boot Guard ACM measures the PEI phase into the TPM, which allows us to detect any attempts to modify it maliciously. The bad news is that we are not able to change it on many of these systems. But even with that in place, we still have a much, much more flexible system. If you've ever worked with the UEFI shell or with GRUB's menus and config files, they're not as flexible, and the tooling is not anywhere near as mature as being able to write things with shell scripts or with Go or with real languages. Additionally, we can configure the LinuxBoot kernel with standard Linux config tools. UEFI supports booting from FAT file systems; with LinuxBoot, we can boot from any of the hundreds of file systems that Linux supports. We can boot from encrypted file systems, since we have LUKS and cryptsetup. Most UEFI firmwares can only boot from the network device that is installed on the server motherboard; we can boot from any network device that Linux supports.
And we can use proper protocols. We're not limited to PXE and TFTP; we can use SSL, and we can do cryptographic measurements of the kernels that we receive. The runtime that makes up LinuxBoot is also very flexible. Last year, I presented the Heads runtime for laptops. This is a very security-focused initial RAM disk that attempts to provide a slightly more secure, measured and attested firmware, and it works really well with LinuxBoot. My collaborator Ron Minnich is working on a Go-based firmware called NERF. This is written entirely in just-in-time-compiled Go, which is really nice because it gives you memory safety, and it is very popular inside of Google. Being able to tailor the device drivers that are included also allows the system to boot much faster. UEFI on the Open Compute Winterfell takes about eight minutes to start up; with LinuxBoot and NERF, it starts up in about 20 seconds. I found similar results on the Intel mainboard that I'm working on, and hopefully we'll get the video. Here it is in action: from power-on it executes the PEI phase out of the ROM and then jumps into a small wrapper around the Linux kernel, which then prints to the serial port, and we now have the Linux printk messages. We have an interactive shell in about 20 seconds, which is quite a bit better than the four minutes that this system used to take. It scrolled by pretty fast, but you might have noticed in the printk output that the Linux kernel thinks it's running under EFI. That's because we have a small wrapper around the kernel. But for the most part, the kernel is able to do all of the PCI and device enumeration that it needs to do, because it already does that, since it doesn't trust the vendor BIOSes in a lot of cases. So I'm really glad that Congress has added a track on technical resiliency.
And I would encourage Congress to also add a track on resiliency of our social systems, because it's really vital that we deal with both online and offline harassment, and I think that will help us make a safer and more secure Congress as well. Last year, when I presented Heads, I proposed three criteria for a resilient technical system: it needs to be built with open source software, it needs to be reproducibly built, and it needs to be measured into some sort of cryptographic hardware. Open source is, I think, not controversial for this crowd. But the reason that we need it is because a lot of the server vendors don't actually control their own firmware. They license it from independent BIOS vendors, who then tailor it for whatever current model of machine the manufacturer is making. This means that they typically don't support older hardware, and if there are vulnerabilities, it's necessary that we be able to make these patches on our own schedule; we need to be able to self-help when it comes to our own security. The other problem is that closed source systems can hide vulnerabilities for decades. This is especially true for very privileged devices like the Management Engine; there have been several talks here at Congress about the concerns that we have with the Management Engine. Some vendors are even violating our trust entirely, using their place in the firmware to install malware or adware onto the systems. So for this reason, we really need our own control over this firmware. Reproducibility is becoming much more of an issue. The goal here is to be able to ensure that everyone who builds the LinuxBoot firmware gets exactly the same result as everyone else. This is a requirement to ensure that we're not introducing accidental vulnerabilities through picking up the wrong library, or intentional ones through compiler supply chain attacks such as the one Ken Thompson described in "Reflections on Trusting Trust".
With the LinuxBoot firmware, our kernel and initrd are reproducibly built, so we get exactly the same hashes on the firmware. Unfortunately, we don't control the UEFI portions that we're using, the PEI and the SEC phases, so those aren't included in our reproducibility right now. Measured boot is another place where we need to take into account the runtime security of the system. Reproducible builds handle compile time, but measuring what is running into cryptographic coprocessors like the TPM gives us the ability to make attestations as to what is actually running on the system. In the Heads firmware, we do this for the user: the firmware can produce a one-time secret that you can compare against your phone to know that it has not been tampered with. In the server case, it uses remote attestation to prove to the user that the code that is running is what they expect. This is a collaboration with the Mass Open Cloud project out of Boston University and MIT, which is attempting to provide a hardware root of trust for servers so that you can know that a cloud provider has not tampered with your system. The TPM is not invulnerable, as Christopher Tarnovsky showed at DEF CON, but the level of effort that it takes to break into a TPM, to decap it and to read out the bits with a microscope, raises the bar really significantly. And part of resiliency is making honest tradeoffs about security threats versus the difficulty of launching the attacks. If the TPM prevents remote attacks, or prevents software-only attacks, that is a sufficiently high bar for a lot of these applications. We have quite a bit of ongoing research on this. As I mentioned, the Management Engine is an area of great concern, and we are working on figuring out how to remove most of its capabilities so that it's not able to interfere with the running system.
There's another device on most server motherboards called the baseboard management controller, the BMC, that has a similar level of access to memory and devices, so we're concerned about what's running on there. There's a project out of Facebook called OpenBMC, an open source Linux distribution that runs on that coprocessor, and what Facebook has done through the Open Compute initiative is have their OEMs pre-install it on the new Open Compute nodes, switches, and storage systems. This is really where we need to get with LinuxBoot as well. Right now, it requires physical access to the SPI flash and a hardware programmer to install. That's not a hurdle for everyone, but this is not something that we want people to be doing in their server rooms. We want OEMs to be providing these systems secured by default, so that it's not necessary to break out your chip clip to make this happen. But if you do want to contribute, right now we support three different mainboards: the Intel S2600, a modern Wolf Pass board; the Mass Open Cloud is working with the Dell R630, which is a Haswell, I believe; and Ron Minnich and Jean-Marie are working on the Open Compute hardware. This, in conjunction with OpenBMC, is a real potential for having free software in our firmware again. So if you'd like more info, we have a website with some install instructions, and we'd love to help you build more secure, more flexible and more resilient systems. I really want to thank everyone for coming here today, and I'd love to answer any questions that you might have.

Thank you very much, Trammell Hudson, for this talk. We have 10 minutes for Q&A, so please line up at the microphones if you have any questions. There are no questions from the signal angel and the internet yet, so please, microphone number one. Sorry.
One quick question: A, is Two Sigma using this for any of their internal systems, and B, how much vendor outreach is there to try to take this beyond just Open Compute, to the vendors that were on your slides? So currently we don't have any deployed systems taking advantage of it; it's still very much at the research stage. I've been spending quite a bit of time visiting OEMs, and one of my goals for 2018 is to have a mainstream OEM ship it. The Heads project is shipping firmware on some Librem laptops from Purism, and I'm hoping that we can get LinuxBoot on servers as well. Microphone number two, please. The question I have is about the size of Linux. You mentioned that there are problems with UEFI, and that it's not open source and so on. But the main part of UEFI is EDK2, which is open source. And I can only guess that the HTTP client and the other things they have in there are for downloading firmware. But how is replacing something that's huge with something that's even bigger going to make the thing more secure? Because I think the whole point of having a security kernel is to have it really small so it is verifiable, and I don't see that happening with Linux, because at the same time people are coming up with other things; I don't remember the other hypervisor, which is supposed to be better than KVM because KVM is not really verifiable. So that's a great question. The concern is that Linux is a huge TCB, a trusted computing base, and that is a big concern. But since we're already running Linux on the server, it essentially is inside our TCB already. So it is large, it is difficult to verify. However, the lessons that we've learned in porting Linux to run in this environment make it very conceivable that we could bring in other systems.
If you want to use a certified, excuse me, verified microkernel, that would be a great place to bring into the firmware, and I'd love to figure out some way to make that happen. On the second question: even though EDK2, the open source component of UEFI, is open source, there's a huge amount of closed source that goes into building a UEFI firmware, and we can't verify the closed source part. Even the open source parts don't have the level of inspection and correctness that the Linux kernel has gone through. Linux systems are exposed on the internet; most UEFI development is not focused on the level of defense that Linux has to deal with every day. Microphone number two, please. Thank you for your talk. Would it be possible to also support, apart from servers, laptops, especially the ones locked down by Boot Guard? So the issue with Boot Guard on laptops is that the CPU fuses are typically set in what's called verified boot mode, and the Boot Guard ACM will not exit if the firmware does not match the manufacturer's hash. So this doesn't give us any way to circumvent that. Most server chipsets are set in what's called measured boot mode, so the Boot Guard ACM just measures the next stage into the TPM and then jumps into it. If an attacker has modified the firmware, you will be able to detect it during the attestation phase. Microphone number one, please. Just one question, thank you. On ARM, it's much faster to boot something. It's also much simpler: you have an address, you load the bin file and it boots. On x86, it's much more complex, and the amount of code you saw for GRUB relates to that. I've seen ARM boards, Cortex-A8, booting in four seconds just to get a shell, and six seconds to get the Linux kernel plus a Qt app to do a dashboard for a car, so five to six seconds. So I'm wondering, why is there such a big difference for a server to take 20 or 22 seconds?
Is it the peripherals that need to be initialized, or what's the reason for it? So there are several things that contribute to the 20 seconds, and one of the things that we're looking into is trying to profile that. We're able to swap out the PEI core and turn on a lot of debugging, and what I've seen on the Dell system is that a lot of that time is spent waiting for the Management Engine to come online. There also appears to be a one-second timeout for every CPU in the system: they bring the CPUs on one at a time, and it takes almost precisely one million microseconds for each one. So there are things in the vendor firmware that we currently don't have the ability to change that appear to be the long pole in the tent on the boot process. Microphone 3 in the back, please. You addressed a lot about security, but my question is rather this: there are a lot of settings, for example BIOS settings, UEFI settings, and there's stuff like remote booting, which is a whole bunch of weird protocols, proprietary stuff and stuff that's really hard to handle. If you have a large installation, for example, you can't just say, okay, deploy all my boot orders or BIOS settings. Are you going to address that in some unified, nice way, where I can say, okay, I have this one protocol that runs on my Linux firmware that does that nicely? That's exactly how most sites will deploy it: they will write their own boot scripts that use traditional, excuse me, normal protocols. In the Mass Open Cloud, they are doing a wget over SSL, which can then measure the received kernel into the TPM and then kexec it, and that's done without requiring changes to NVRAM variables or all the setup that you have to put into configuring a UEFI system; that can be replaced with a very small shell script. We have time for one last question, and this is from the signal angel, because the internet has a question.
Yes, the internet has two very simple technical questions. Do you know if there's any progress, or any ETAs, on the Talos II project, and are there any size concerns when writing firmware in Go? So the Talos II project is a POWER-based system, and right now we're mostly focused on the x86 servers, since those are the mainstream, widely available sorts of boards. The Go firmware is actually quite small. I've mostly been working on the Heads side, which is based on shell scripts, but my understanding is that the just-in-time-compiled Go does not add more than a few hundred kilobytes to the ROM image and only a few hundred milliseconds to the boot time. The advantage of Go is that it is memory safe and it's an actual programming language, so it allows the initialization scripts to be verified in a way that shell scripts can be very difficult to do. So thank you very much for answering all these questions. Please give a warm round of applause to Trammell Hudson. Thank you very much.