engine deep dive: understanding the ME at the OS and hardware level, and it is by Peter Bosch. Please welcome him with a great round of applause.

Okay, can everybody hear me? Nice. Okay, so welcome. Well, this is me. I'm a student at Leiden University, and I've always been really interested in how stuff works. When I got a new laptop I was like, how does this thing really boot? I knew everything from the reset vector onwards; I wanted to know what happened before it. So first I started looking at the Boot Guard ACM. While looking through it I realized that not everything was as it was supposed to be, and that led to a later part in the boot process being vulnerable, which ended up in me discovering this. I found out here last year that I wasn't the only one to find it: Trammell Hudson also found it, and we reported it together and presented it at Hack in the Box.

At the same time I was also already looking at the management engine. There had been a lot of research done on that before, but the public info was mostly on the file system and on specific vulnerabilities, which still made it pretty hard to get started on reverse engineering it. That's why I thought it might be useful for me to present this work here.

It's basically broken up into three parts. The first bit is a quick introduction to the operating system it runs, so if you want to work on this yourself you're more easily able to understand what's in front of you in your disassembler. After that I'll cover its role in the boot process, and then how this information can be used to start developing new firmware for it or do more security research on it.

So first of all, what exactly is the management engine? There's been a lot of fuss about it being a backdoor and everything. Well, in reality, whether it is or not depends on the software that it runs.
It's basically a processor with its own RAM, its own IOMMUs and everything, sitting inside your southbridge. It's not in the CPU, it's in the southbridge. So when I say this is going to be about the sixth and seventh generation of Intel chips, I mean mostly motherboards from those generations; if you run a newer CPU on one, it will also work for that.

A bit more detail: the CPU it runs on is based on the 80486, which is funny: quite an old CPU, and it's still being used in almost every computer nowadays. It has a little bit of its own RAM, it has quite a bit of built-in ROM, it has a hardware-accelerated cryptographic unit, and it has fuses, which are write-once memory used to store security settings, keys and everything. Then some of the more scary features: it has bus bridges to all of the buses inside the southbridge, it connects to the RAM on the CPU, and it connects to the network, which makes it really quite dangerous if there is a vulnerability or if it runs anything nefarious.

Its tasks nowadays include starting the computer as well as adding management features. These are mostly used on servers, where it can serve as a baseboard management controller to do remote keyboard and video. It does security: Boot Guard, which is the signing of firmware and the verification of those signatures. It implements a firmware TPM, and there is also an SDK to use it as a general-purpose secure enclave.

On the software side, it runs a custom operating system, parts of which are taken from Minix, the teaching operating system by Andrew Tanenbaum. So it's a microkernel operating system, and it runs binaries that are in a completely custom format. It's really quite a high-level system, actually. If you look at it in terms of the operating system it runs, it's mostly like Unix, which makes it kind of familiar, but it also has large custom parts. Like I said before, in this talk I'm going to be speaking about sixth and seventh generation Intel Core chipsets, so
that's Sunrise Point, Lewisburg, which is the server version of this, and also the laptop systems-on-chip. Those are just called Intel Core low power; they also include the chipset as a separate die, so it applies to them as well. In fact, I've been testing most of the stuff I'm going to tell you about on the laptop that's sitting right here, which is a Lenovo T460. The version of the firmware I've been looking at is 11.0.0.1205.

Right, so I do need to put this up there: I am not a part of Intel, nor have I signed any contracts with them. I found everything in ways that you could also do; I didn't have any leaked NDA stuff or anything that you couldn't get your hands on. It's also a very wide subject area, so there might be some mistakes here and there, but generally it should be right.

Well, if you want to get started working on ME firmware, whether to reverse engineer it or modify it in some way, the first thing you've got to deal with is the image file. You've got your SPI flash: most of its firmware lives in the same flash chip as your BIOS. So you've got that image, and then how do you get the code out? There's tools for that. It's already been extensively documented by other people, and you can basically just download a tool and run it against the image, which makes this really easy. This is also the reason why there hasn't been a lot of research done yet. Before these tools were around you couldn't get to all of the code: the kernel was compressed using Huffman tables which were stored in ROM, and you couldn't get to the ROM without getting code execution on the thing. So there was basically no way of getting access to the kernel code, and I think also to the system library. But that's not a problem anymore; you can just download a tool and unpack it.

Also, the Intel tool to generate firmware images, which you can find in some open directories on the internet, has Qt resources: XML files which basically have the descriptions for all of the file formats used by these ME versions,
including names and comments to go with those structure definitions, so that's really useful.

Right, so when we look at one of these images, it has a couple of partitions. Some of them overlap, some of them are storage, and some are code. There's the main partitions, FTPR and NFTP, which contain the programs it runs. There's MFS, which is the read-write file system it uses for persistent storage. Then there is a log-to-flash option, and the possibility to embed a token that will tell the system to unlock all debug access, which has to be signed by Intel, so it's not really of any use to us.

And then there is something interesting: the ROM bypass. Like I said, you can't get access to the ROM without running code on it, and the ROM is mask ROM, so it's internal to the chip. But Intel has to develop new ROM code and test it without re-spinning the die every time, so they have the possibility, on an unlocked pre-production chipset, to completely bypass the internal ROM and load even the early boot code from the flash chip. Some of these images have leaked, and you can use them to get a look at the ROM code even without being able to dump it. That's going to be really useful later on.

Then you've got these code partitions, and they contain a whole lot of files. There are the binaries themselves, which don't have any extension, and there are the metadata files. The binary format they use has no headers, nothing included; all of that data is in the metadata file. When you use the unME11 tool, it'll actually convert those to text files for you, so you can get started without really understanding how they work.

The metadata is a tag-length-value structure which contains a whole lot of information the operating system needs. It has the info on the module: whether it's data or code, where it should be loaded, what the privileges of the process should be, a checksum for validating it, and also some higher-level stuff such as device file definitions, if it's
a device driver or any other kind of server. I've actually written some code that uses this, and it's on GitHub, so if you want a closer look at it, some of the slides have a link to a GitHub file which contains the full definitions.

Right, so all the code on the ME is signed and verified by Intel, so you can't just go and put in a new binary and say, hey, let's run this. The way they do this is: in Intel's manufacturing-time fuses they have the hash of the public key that they use to sign it. Then on each flash partition there is a manifest, which is signed by that key and contains the SHA hashes for all the metadata files, which in turn contain a SHA hash for the code files. There don't seem to be any major problems in how this is verified, so it's useful to know, but you're not really going to use this.

And then the modules themselves are, as I've said, mostly flat binaries. The metadata contains all the info the kernel uses to reconstruct the actual program image in memory. A curious thing here is that the base address for all the modules, for all the programs, is the same across an image. So if you have a different version it's going to be different, but if you have two programs from the same firmware, they're going to be loaded at the same virtual address.

So when you want to look at one, you load it in some disassembler, for example IDA, and you'll see this. It disassembles fine, but it's going to reference all kinds of memory that you don't have access to. Usually you'd think: maybe I've loaded it at the wrong address, or am I missing some library? Well, here you've loaded it correctly if you use the address from the metadata file, but you are in fact missing a lot of memory segments. Let's just take a look at each of these. It's calling this, so it should be code. It's pushing a pointer there, which is data. And what's that? So it has shared libraries. Even though it's flat
binaries, it actually does use shared libraries, because you only have one and a half megabytes of RAM and you don't want to link your C library into everything and waste what little memory you have. So there's the main system library, which is like libc on a Linux system. It's in a flash partition, so you can just load it and take a look at it easily. It starts out with a jump table. There are no symbols in the metadata file or anything, and it doesn't do dynamic linking: it loads the pages for the shared library at a fixed address, which is also in the shared library's metadata, and then it's just there in the process's memory, and the program jumps there if it needs a function. The functions themselves just use the normal System V x86 calling convention, so it's pretty easy to look at them using your normal tools. There's no weird register argument passing going on here.

Right, those shared libraries: there's two of them, and this is where it gets annoying. The system library you've got access to, so you can just take your time, go through it, and try to figure out: hey, is this open or is this read, or what's this function doing? But then there's also a second, really large library which is in ROM. All the C library functions and some of their custom helper routines that don't interact with the kernel directly, such as the string functions, live in ROM. So when you've got your code, and this is basically where I was at when I was here last year, you're looking through it and you're seeing calls to a function you don't have the code for all over the place, and you have to figure out by its signature what it is doing. That works for some of the functions; it's really difficult for other ones. So that really had me stuck for a while. Then I managed to find one of these ROM bypass images, and I had the code for a very early development build of the ROM. And this is where I got lucky: the actual entry
point addresses are fixed across an entire chipset family. So if you have an image for the server version of, like, the 100-series chipset, or for the client version, or for a desktop or laptop version, it's all going to be the same ROM addresses. Even though the code might be different, you have the jump table, which means the addresses can stay fixed. So this only needs to be done once. In fact, when I upload my slides later, there is a slide at the end that has the addresses for the most used functions, so you're not going to have to repeat that work, at least not for this chipset.

So if you want to look at a simple module: you've loaded it, you've applied the things I just said, and you still don't have the data sections. In fact, I don't know what that function there is doing, but it's not very important. It actually returns a value that I think is not used anywhere, but it must have a purpose because it's there.

Right, so then you look at the entry point, and this is a lot of stuff. The main thing that matters here is: on the right half of the screen there is a listing from a Minix repository, and on the left half there is a disassembly from an ME module. It's mostly the same. There is one key difference, though: the ME module has a little bit of code that runs before the C library startup function, and that function does all the ME-specific initialization. There's a lot of stuff related to how C library data is kept, because there are also no data segments for the C library being allocated by the kernel. So each process reserves a part of its own memory and tells the C library: any global variables, you can store in there.

But when you look at that function, one of the most important things it calls is this function. It's very simple: it just copies a bunch of RAM. They don't have support for initialized data sections, since it's a flat binary. What they do is they actually
use the BSS segment, so the zeroed segment at the end of the address space, and copy a bunch of data from the program into it. The program itself is not aware of this; it's really in the initialization code and in the linker script. This is also very important to know, because at that address in the data section you're going to need to load the last bit of the binary, otherwise you're missing constants, or at least initialization values.

Right, and then there's the full memory map of the process itself. It's a flat 32-bit address space, and it's got everything you expect in there: the stack, the heap and everything. There's a little bit of heap allocated right at initialization. This is basically how you derive the address space layout from the metadata. Especially the location of the data segment and the stack varies a lot, because of the number of threads in use or the size of the data sections. Also, those stack guards are not really just stack guards: there's also metadata for each thread in there, but that's nothing that's relevant to the process itself, only to the kernel.

Well, if you then skip forward a bit and you've done all this, you look at your simple driver, like this. This is taken from a driver used to talk to the CPU. By the way, when I say CPU or host, I mean your big Skylake or Kaby Lake or Coffee Lake, whatever your big CPU is that runs your own operating system. So this is used to send messages there. But if you look at what's going on here... okay, I think I have a problem with the animation here. It sets up some stuff, and then it calls a library function in the main syslib library which actually has the main loop for the program. That's because Intel was smart and they added a nice framework for programs implementing device drivers. Because it's a microkernel, device drivers are just userland programs
calling specific APIs. Then there's normal POSIX file I/O: no stdio, but it has all the normal open, read, ioctl and so on functions. Then there's more initialization for the server library, and this is basically what all the simple drivers look like in it.

And then there's this. Because it's so low on memory, they don't actually use stdio or even printf itself to do most of the debugging. It uses a thing called SVEN; I'll touch on that later. So there's the familiar APIs that I talked about. It even has POSIX threads, or at least a subset of them, and there are all the functions that you'd expect to find on some generic Unix machine, so that shouldn't be too much of a problem to deal with.

But then there's also their own tracing solution, SVEN. That's what Intel calls it; the name is in all the development tools that you can download from their site. Basically, they don't include format strings for a lot of the stuff. They just have a 32-bit identifier that is sent over the debug port, and it refers to a format string in a dictionary that you don't have. There is one of the dictionaries for a server chipset floating around the internet, but even that is incomplete, and the normal non-NDA version of the Intel developer tools has some 50 format strings for really common status messages it might output. So if you see these functions, just realize it's doing some debug print there. It might be dumping some state or just telling you it's going to do something else; no important logic actually happens in here.

Right, so then for device files: they're actually defined in the manifest. When the kernel loads a program, and the program wants to expose some kind of interface to other programs, its manifest, or rather its metadata file, will contain a special file producer entry. That says: you have these device files, with a name, an access mode, the user and group ID and everything, and the minor numbers. And the kernel sends this to...
the kernel, or rather the program loader, sends this to the virtual file system server, and the program automatically gets a device file pointing to the right major and minor number.

Then there's also a library, as I said, that provides a framework for a driver, and that looks like this. It's really easy to use. If you were an ME developer, you'd just write some callbacks for open and close and everything, and it automatically calls them for you when a message comes in telling you that that happened. That also makes it really easy to reverse engineer, because if you look at a driver, it just loads some callbacks, and you can tell by their offset in the structure what call they're implementing.

Right, so then there is one of the more weird things going on here: how the userland programs get access to memory-mapped registers. There's a lot of this going on: calls to a couple of functions that have some magic arguments. The second one you can easily tell is the offset, because it increases in very nice power-of-two steps, so it's probably the register offset. What comes after it looks like a value. And then the first bit seems to be a magic number. Well, it's not. There's also an extension in the metadata saying: these are the memory-mapped I/O ranges. Each of those ranges lists the physical base address, a size, and the permissions. The index in that list does not directly correspond to the magic value; you need to do a little computation on it, and then you can access the range through those functions. The computation itself might be familiar.

So these are the functions, and the value is a segment selector. They actually don't use paging for inter-process isolation; they use segments, like x86 protected mode segments, and for each memory-mapped I/O range there's a separate segment that you manually specify. Which is just weird to me: why would you use x86 segmentation on a modern system? Well, Minix does it, but to then
extend that even to this... Luckily, the normal address space is flat, to the process at least, not to the kernel.

Right, so now we can access memory-mapped I/O, and that's all the really high-level stuff. So what's going on under there? It's got all the basic microkernel stuff, so message passing, and then some optimizations to actually make it perform well on a really slow CPU. The basics are: you can send a message, you can receive a message, and you can send-and-receive a message, where you basically say: send a message, wait till a response comes in, and then continue. That's used to wrap function calls. This is mostly the same as in Minix; there are some subtle changes which I'll get to later.

Then memory grants are something that only appeared in Minix really recently. It's a way for a process to basically create a new name for a piece of memory it has, and give a different process access to it just by sharing the number. A grant is referred to by the process ID and the number of that range. The numbers are local to each process, so to uniquely identify one you need to say process ID plus that number. Grants are only granted to a single process, so when a process creates one of these, it can't even access it itself unless it creates a grant for itself, which is not really that useful. Usually these grants are used to avoid having to copy all the data inside the IPC message used to implement a system call.

These are the basic operations on it: you can create one, and you can copy to and from it. You can't actually map it: a process that receives one of these has to say to the kernel, using a system call, please write this data into that area of memory that belongs to a different process.

And then there's also indirect grants. In Minix they do have this, but also only recently. Usually, if you have a microkernel system, you would have to copy your buffer for a read call first to the file system server and then on to either the hard disk driver or the device
driver that's implementing a device file. So the ME actually allows you to create a grant pointing to a grant that was given to you by someone else. That grant will inherit the privileges of the process that creates it, combined with those that it assigns to it. So if the process has a read-write grant, it can create a read-only or write-only grant from it, but if it only has a read grant, it obviously cannot add write rights to it for a different process.

Then there's also some big differences from Minix. In Minix, you address a process by its process ID or thread ID, with a generation number attached to it. In the ME, you can actually address IPC to a file descriptor. The kernel doesn't know a lot about file descriptors; it just implements the basic thing where you have a list of files, and each process has a list of file descriptors assigning integer numbers to those files to refer to them by. This is used so that, as a process, you can directly talk to a device driver without knowing what its process ID is. You don't send it to the file system server; you send it to the file descriptor, and the kernel just magically redirects it for you.

They also moved select into the kernel, so you can tell the kernel: hey, I want to wait till the file system server tells me that it has data available, or till a message comes in. This is one of the most complicated system calls the ME offers that's used in a normal program. You can mostly ignore it and just note: hey, those arguments are sort of the file descriptor sets as a bit field, and then there's the message that might have been received.

And there's DMA locks, because you don't just want to write to registers; you might actually want to do direct memory access from hardware. So you can tell the kernel to lock one of these memory grants in RAM for you. It won't be swapped out anymore, and it will even tell you the physical address, so you can just load that into a register and
it's not really that complicated: just lock it, get a physical address, write it to the register, and continue.

Well, that's the most important stuff about the operating system. The hardware itself is a lot more complicated, because with the operating system, once you have the code, you can just reverse engineer it and get to know it. The hardware, well, let's just say it's a real pain to have to reverse engineer a piece of hardware together with its driver. You've got the driver code, but you don't know what the registers do, so you don't know what a lot of the logic does, and you're trying to figure out both what the logic is and what the registers actually do.

Right, so first you want to know which physical address goes where. The metadata listings I showed you had names in there. Those are not in the metadata files themselves; I annotated those, so normally you just see the physical address and size. But there's one module, the bus driver module. The bus driver is a normal user process, but it implements stuff like PCI configuration space accesses and those things, and it has a nice table in it with names for devices. So if you just run strings on it, you'll see these things. When I saw this I was pretty glad, because at least I could make sense of what device was being talked to in a certain program.

So the bus driver does all these things: it manages power gating to devices, it manages configuration space access, it manages the different kinds of buses and IOMMUs that are on the system, and it makes sure that a normal driver never has to know any of these details. The driver just asks it for a device by a number assigned to it at build time, and then the bus driver says: okay, here's a range of physical address space that you can now write to. So that's a really nice abstraction, and it also gives us a lot of information, because the really old builds for Sunrise Point actually have a hell of a lot of debug strings in there, as printf format strings, not as SVEN catalog IDs.
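The strings trick mentioned above needs no special tooling. What follows is only a minimal sketch in Python, assuming the bus driver module has already been unpacked to a file. The function names are made up for illustration, the minimum length of 4 mirrors the default of the Unix `strings` utility, and the format-string check is a crude heuristic rather than a real printf parser.

```python
import re
import sys


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len bytes, like Unix `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def looks_like_format_string(s: str) -> bool:
    """Crude heuristic: contains a printf-style conversion such as %d, %08x or %s."""
    return re.search(r"%[-0-9.]*l{0,2}[diuxXscp]", s) is not None


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path is whatever your unpacker produced for the bus driver module.
    with open(sys.argv[1], "rb") as f:
        blob = f.read()
    for s in extract_strings(blob):
        tag = "fmt" if looks_like_format_string(s) else "str"
        print(tag, s)
```

Run it with the path of the unpacked module as the only argument. Lines tagged `fmt` are candidate debug messages; the rest are plain strings, which is where a table of device names would show up.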
The bus driver is one of the only pieces of ME code with plain format strings like that, so that already tells you a lot. And then there's also the table that I just talked about, which has the actual info on the devices and their names. I generated some DokuWiki content from this that I use myself, and this is what's in the table, or part of it. It tells you what address the PCI configuration space lives at, it tells you the bus/device/function for it, it tells you on what chipset SKUs they are present using a bit field, and it tells you their names in different fields. It also contains the values that are written to the PCI base address registers, so also their normal memory ranges. And there's even more devices. So the ME has access to a lot of stuff. A lot of it is private to it; a lot of it is components that also exist in the rest of the computer.

There's not a lot of information on a lot of this. These are basically all the things that are out there, together with the conference slides published by other people who have done research on the ME. I did not have time to add links to those, but they're easy to find on Google. I'll get to this later, but I actually wrote an emulator for the ME, a partial emulator, to be able to run ME code and analyze it, which obviously needs to know a bit about the hardware, so you can look at that. There are some files in specific versions of Intel's debugger package that have really detailed info on some of the devices, though not all of it, and I wrote a tool to parse some of those files. It's really rough code; I published it because people wanted to see what I was doing. It doesn't work out of the box. And there's a nice talk on this by Mark Ermolov and Maxim Goryachy. I don't actually know if I'm pronouncing that correctly, but they've done a lot of work on the ME, and this particular talk by them is really useful.

And then there's also something else: there is a second ME in server chipsets, the innovation engine. Basically, they copy
pasted the ME to provide an ME that the vendor can write code for. I don't think it's used a lot; I've only been able to find HP software that actually targets it. That has some more debug strings, but also not a lot. It mostly has a table containing register names, but they're really abbreviated. And for a really small subset of the devices there is documentation out there, in a Pentium N and J series datasheet. It seems like they compiled their LaTeX code or whatever with the wrong defines, because it doesn't actually fit into the manual that well; it's just a section that has some 20 tables that shouldn't be in there.

Right, so this is from that talk I just referenced, and it's an overview of the innovation engine and the bus bridges and everything in there. This isn't very precise, so based on some of those files from System Studio I tried to get a better understanding of this, which is this. This is the entire chipset. The little DMI block in the top left corner is what connects to your CPU, and all of the big blocks with a lot of ports are bus bridges or switches for a PCI Express-like fabric. So there's a lot going on. The highlighted area is the management engine memory space, and the rest of it is the global chipset.

The things I've highlighted in green here are on the primary PCI bus. There's this weird thing going on where there seem to be two PCI hierarchies, at least logically. In reality it's not even PCI, but on Intel systems there's a lot of stuff that behaves as if it is PCI, so it has bus, device and function numbers and PCI configuration space registers. And they have two different roots for the configuration space. So even though the configuration space address includes a bus number, they have two completely different things, each of which has its own bus zero. That's weird, also because they don't make sense when you look at how the hardware is laid out. So this is the stuff that's on the primary PCI configuration space
that's directly accessed by the northbridge on the ME CPU. So that's the Minute IA system agent. System agent is what Intel calls the northbridge nowadays, now that it's not a separate chip anymore. It's basically just the northbridge, the crypto unit that's on there, and the stuff that's directly attached to the northbridge, being the ROM and the RAM.

The processor itself is, as I said, derived from a 486, but it does actually have some more modern features. It does CPUID, at least on my systems; some other researchers said theirs didn't. It's basically the core that's in the Quark MCU, which is really great, because it's one of the only cores made by Intel that has public documentation on how to do run control, so breakpoints, accessing registers and everything, over JTAG. Intel doesn't publish this stuff, except for the Quark MCUs because they were targeted at makers, but they reused that core in here, which is really useful. It even has an official port of the OpenOCD debugger, which I have not gotten to test because I don't have a JTAG probe that is compatible with Intel voltage levels and supported by OpenOCD. It also has, like I said, CPUID and MSRs, and it has some really fancy features like branch tracing and some stricter paging permission enforcement.

They don't use the interrupt pins on this, since it's an IP block. There are some files out there, and that's where this screenshot is from, that are used by a built-in logic analyzer Intel has on the chipset. You can select different signals on the chip to watch, which is a really great source of information on how the IP blocks are laid out and what signals are in there, because you basically get a tree view of the IP blocks in the chip and some of their signals. They don't use the legacy interrupt system; they only use message-based interrupts, where the device writes a value into a register on the interrupt controller instead of asserting a pin. And then there's the northbridge.
The northbridge is partially documented in that datasheet I mentioned. It does support the x86 I/O address space, but it's never used; everything in the ME is in memory space or exposes memory space through bridges. The northbridge implements access to the ROM and RAM. It has an IOMMU, which is only used for transactions coming from the rest of the system, and at least in the firmware I looked at it's always initialized to the inverse of the page table, so linear addresses can be used for DMA. It also does PCI configuration space access to the primary PCI bus, and it has a firewall that allows the operating system to deny any IP block in the chipset from sending a completion for a bus request. So it can actually say: hey, I want to read some register, and only these devices are allowed to send me a value for it. So they've actually thought about security here, which is great.

Then there's one of the most important blocks in the ME, which is the crypto engine. It does some of the more well-known crypto algorithms: AES, SHA hashes, RSA. It has a secure key store, which I'm not going to talk a lot about; it's covered in their ME talk at Black Hat. A lot of these things have DMA engines, which all seem to be the same, and there are no other DMA engines in the ME, so this is also used for memory-to-memory copies or DMA into other devices. So that's used in a lot of things.

This is actually a diagram which I don't have the vector version of anymore, so that's why the LibreOffice background is in there, I'm sorry. This is basically what that crypto engine looks like when you look at that signal tree I was talking about earlier. The DMA engines are able both to do memory-to-memory copies and to directly target the crypto unit they're part of. I don't know about the control bits that go with this, but basically, when you set the target address to zero and set the right control bits, it will copy into the buffer
that's used for the encryption. So that is how it accelerates memory access for crypto. And these are the actual register offsets; they're the same for all of the DMA engines in there, relative to the base address of the subunit they're in.

Then there's the second PCI bus, or bus hierarchy, which in some places is called the PCI fixed bus. I'm actually not entirely sure whether this is actually implemented as a PCI bus as I've drawn it here, but this is what it behaves like. It has all the ME-private stuff that's not a part of the normal chipset, so it has timers for the ME, it has the implementation of the secure enclave stuff, the firmware TPM registers. And it has the gen device, which I've mostly ignored because it's only used at boot time, mostly by the actual boot ROM for the ME. It is what the ME uses to get the fuses Intel burns, so that's the Intel public key and whether it's a production or pre-production part. But it's pretty much a black box, and it's not used that much, fortunately.

There's the IPC block, which allows the ME to talk to the sensor hub, which is a different CPU in the chipset. It allows it to talk to the power management controller and all kinds of other embedded CPUs. So it's inter-processor communication, not inter-process, which confused me for a bit. And there's the Host Embedded Controller Interface, which is how the ME talks to the rest of the computer when it wants the computer to know that it's talking. It can directly access a lot of stuff, but when it wants to send a message to the EFI or to Windows or Linux, it'll use this. It also has status registers, which are really simple things where the ME writes in a value, and even if the ME crashes the host can still read the value. That is actually how you can see whether the ME is running, whether it's disabled, whether it fully booted, or whether it crashed halfway through but at a point where it could still get the rest of the computer running. There is some coreboot code to read
it, and I've also implemented some decoding for it in the emulator, because it's useful to see what those values mean.

Right, so then there's something really interesting: the primary address translation table, which is the bus bridge that allows the ME to actually access the PCI Express fabric of the computer. For a lot of what I call ME peripherals in this table, which are actually outside the ME domain in the chipset, it uses this to access them. It also uses it to access the UMA, which is an area of host RAM that's used as a swap device for the ME, and the Trace Hub, which is the debug port. But it also has a couple of windows which allow the ME to access any random area of host RAM, which is the most scary bit, because the UMA is specified by the host, but the host DRAM window you can just point anywhere. You can read or write any value that Windows or Linux, whatever you're running, has sitting there. So that's scary to me.

Right, and then there's the rest of the devices, which are behind the primary ATT, and that's a lot of stuff. That's debug, that's also the normal peripherals that your PC has, but it also includes things like the power management controller, which actually turns on and off all the different parts of your computer. It controls clocks and reset, so this is really important.

There's a concept that you'll come across when you're reading Intel manuals or ME-related stuff, and that's root spaces. Besides the normal addressing information for a PCI device, it also has a root space number, which is basically how you have a single PCI device exposing two completely different address spaces. It's zero for the host and one for the ME. Some devices expose the same information in either; other ones behave completely differently. But yeah, that's something you don't usually see.

And then there's the sideband fabric. So besides all the stuff that I just covered, which is PCI-like at least, there's also something completely different:
the sideband fabric, which is a completely packet-switched network where you don't use any memory mapping by default. You just have a one-byte address for a device and some other addressing fields, and you just send it a message saying, hey, I want to read configuration or data or memory. There's actually a lot of information out there on this, because it seems like Intel just copy-pasted their internal specification into a patent. This is how you address it, and this is all the devices on there, which is quite a lot. If any of you are kernel developers and you've had to deal with GPIOs on Intel SoCs, there's this P2SB device that you have to use; that's what the host uses to access this. Their documentation on it is really, really bad.

Right, so this was all done using static analysis. But then I wanted to figure out how some of the logic actually worked, and it was really complicated, so I wanted to play around with the ME. There was this nice talk by Ermolov and Goryachy where they said, you know, we found an exploit that gives you code execution and you can get JTAG access to it. Sounds really nice. It's actually not that easy. So: arbitrary code execution in the BUP module. They actually describe their exploit and how you should use it, but they didn't describe anything that's needed to actually implement it. So if you want to do that, you need to figure out where the stack lives, and you need to write a payload that will turn a buffer overflow on a stack, which by the way uses stack cookies so you can't just overwrite the return address, into an arbitrary write. You need to find out what address the return pointer is at so you can overwrite it, and you need to find ROP gadgets, because the stack is not executable. Right, and then when you've done that you can just turn on debug access or chain-load a custom firmware or whatever. So what I did
is, I had a bit of trouble getting that running. In order to test your payload you have to flash it into the system, and it takes a while, and then the system just doesn't power on if the ME is not working, if you're crashing it instead of getting code execution. So it's not really viable to develop it that way. I think some people did; I respect that, because it's really, really hard.

Then I wrote this ME loader. It's called a loader because at first I started out writing it as sort of a Wine thing, where you would just map the right ranges at the right place, jump into it, execute it, and patch in some system calls. But because the ME is a microkernel system and almost every user-space program accesses hardware directly, I ended up implementing a good part of the chipset, at least as stubs or enough logic to get the code running. Later on I added some features that actually allow it to talk to hardware. I can use it as a debugger: because it's actually running the ME firmware, or parts of it, inside a normal Linux process, I can just use gdb to debug it.

Back in April last year I got that working to the point where I could run the bootstrap process, which is where the vulnerability is, and then you just develop the exploit against it, which I did. And then I made a mistake cleaning up some old chroot environments for closed-source software and I nuked my home dir. Yeah. I hadn't yet pushed everything to GitHub, so I was stuck with an old version, and I decided, you know, let's refactor this and turn it into something that might actually at some point be published, which by the way I did last summer. This is all public code; the ME loader is on GitHub. And someone else beat me to it and replicated that exploit by the Russian guys, who up to then had produced a proof of concept for Apollo Lake chipsets, which was completely different from what you had to do for the normal ME. So I was actually a bit
disappointed by that, not being the first one to actually replicate this. But then about a week later I got my loader back to the point where I could actually get to the vulnerable code and develop that exploit, and got it working not too long after. And here's the great thing: then I went to the hackerspace and flashed it into my laptop, the image that I had just been using on the emulator. I didn't change it. I flashed it, I was like, this is never gonna work, and it worked. I've still got an image on a flash chip with me, because that's what I used to actually turn on the debugger.

And then you need a debug probe, because that USB-based debugging stuff that's mentioned here only works pretty late in boot, which is also why I only really see Apollo Lake stuff, because on those chipsets you can actually use this for the ME. So you need this thing, because there's a second channel that uses the USB plug but with a completely different physical layer, and you need an adapter for it, which I don't think was intended to be publicly available, because if you go to Intel's site and say I want to buy this, they say, here's the CNDA, please sign it. But it appeared on Mouser, and luckily I knew some people who had done some other stuff, got a nice bounty for it, and bought it, and they let me use it. Thanks. It's expensive, but you can buy it if it's still up there; I haven't checked. That's the link.

So I'm a bit late, so I'm gonna use the time for questions as well. The main thing the ME does that you cannot replace is the boot process. It's not just breaking the system if you don't turn it on; it actually does stuff that has to be done. So you're gonna have to use the ME anyway if you want to boot a computer. You don't necessarily have to use Intel's firmware, though. The ME itself boots like a microkernel system, so it has a process which implements a lot of the servers, which will allow it to get to a point where it can start those servers. This process has very
high privileges in older versions, which is what's being used on these chipsets, and if you exploit that you're still ring 3, but you can turn on the debugger, and you can use the debugger to become ring 0.

So this is what the normal boot process for a computer looks like, and this is what happens when you use Boot Guard: there's a bit of code that runs even before the reset vector, and that's started by microcode initialization, of course. And this is what actually happens: the ME loads a new firmware into the power management controller, it then readies some stuff in the chipset, and it tells the power management controller, please stop pulling that CPU reset pin low, and the CPU will start.

The power management controller is a completely independent thing. It's an 8051-derived microcontroller that runs a real-time operating system from the 90s. This is the only string in the firmware, by the way, that's quoted there. Depending on the chipset you have, it's either loaded with a patch or with a complete binary from the ME. It does a lot of important stuff, and there's no documentation on it besides the ACPI interface, which is not really useful.

The ME has to do these things: it needs to load the keys for the Boot Guard process, it needs to set up clock controllers and then tell the PMC to turn on the power to the CPU, and it needs to configure the PCI Express fabric and get the CPU to come out of reset. There's a lot of code involved in this, so I really didn't want to do it statically. So what I did is I added hardware passthrough support to the emulator and booted my laptop that way. I actually had a video of this, but I don't have the time to show it, which is a pity. But this is what I had going: the bring-up process from the ME, running in a Linux process, sending whatever hardware accesses it was trying to do that are important for boot to the debugger, and that was then using an ME in real hardware that was halted to actually do the register accesses. And it worked. Yeah, so I'm not gonna
show this, but it actually booted the computer reliably.

Then Boot Guard configuration is fun, because, you know where they say they fuse in the keys? Well, yeah, but the ME loads them from fuses and then manually loads them into a register. So if you have code execution on the ME before it does this, you can just load your own values, and you can run coreboot even on a machine that has Boot Guard. Yeah, so I'm gonna go through this really quickly. These are, by the way, the registers that configure what security model the CPU is gonna enforce for the firmware.

I'm gonna release this code after my talk. It's part of a Python script that I wrote that uses the debugger to start the CPU without ME firmware. I traced all the accesses the ME firmware did, and I now have a Python script with which I can just start the computer without Intel's code. If you translate this into a ROP sequence, or even into a binary for the ME, you can start a computer without the ME itself, or at least without it running the operating system.

So yeah, future goals. I really do want to share this, because if there is a way to escalate to ring 0 through the ROP chain, then you could just start your own kernel on the ME and have custom firmware, at least from the vulnerability on. But you could also build a mod chip that uses the debugger interface to load a new firmware. There's a lot of stuff that still needs to be discovered, but I'm gonna hang out at the open source firmware village later, at least part of the week here, because I really want to get started on open source ME firmware using this.

Right, and there's a lot of people that played a role in getting me to this point. I would also like to thank a guy from my hackerspace, Pino Alpha, who basically allowed me to use his laptop to prepare the demo, which I ended up not being able to show. But right, I was gonna ask whether there were any questions, but I don't think there's really any time for that anymore.

Peter, thank you so much. Unfortunately we don't have any more time
left. I'll be around. I think it's very, very interesting, because I hope that your talk will inspire many people to keep looking into how the management engine works and hopefully uncover even more stuff. I think we have time for just one single question. I don't know, do we have one from the internet? Thank you so much.

Okay, first off I have to tell you your shirt is nice; chat wanted me to say this. And they asked how reliable this exploit is, and does it work on every boot?

Right, yeah, that's actually something really important that I forgot to mention. So they patched the vulnerability, but they didn't provide downgrade protection. If you can flash a vulnerable image with an exploit in it, it'll just boot every time. On these chipsets, so sixth and seventh generation chipsets, put in that image and it will reliably turn on the debugger every time you turn on the computer.

Thank you so much for the question, and Peter Bosch, thank you so much. Please give him a great round of applause.
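
Editor's sketch: the talk describes tracing every register access the ME firmware makes during bring-up and then replaying those accesses from a Python script through a debugger. The block below is a minimal illustration of that trace-and-replay idea, not the speaker's script. The trace format (`kind address value`, hex), the `Access` record, the `FakeBus` class and the addresses used with it are all assumptions made up for this example; a real setup would put a debugger-backed memory interface where `FakeBus` is.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Access:
    """One traced register access (hypothetical record layout)."""
    kind: str   # "read" or "write"
    addr: int   # register address as seen by the ME
    value: int  # value written, or value observed while tracing


class FakeBus:
    """Stand-in for a debugger-backed memory interface (assumption).

    It only stores values in a dict so the replay logic can be exercised
    without any hardware attached.
    """

    def __init__(self) -> None:
        self.mem: Dict[int, int] = {}

    def read32(self, addr: int) -> int:
        return self.mem.get(addr, 0)

    def write32(self, addr: int, value: int) -> None:
        self.mem[addr] = value & 0xFFFFFFFF


def parse_trace(lines: Iterable[str]) -> Iterator[Access]:
    """Parse lines of the form 'write f0000000 1' (hex, '#' starts a comment)."""
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        kind, addr, value = line.split()
        if kind not in ("read", "write"):
            raise ValueError(f"unknown access kind: {kind!r}")
        yield Access(kind, int(addr, 16), int(value, 16))


def replay(trace: Iterable[Access], bus: FakeBus) -> List[Tuple[Access, int]]:
    """Replay writes in order and collect reads that differ from the trace.

    A mismatching read suggests the hardware is in a different state than
    it was when the trace was recorded.
    """
    mismatches: List[Tuple[Access, int]] = []
    for acc in trace:
        if acc.kind == "write":
            bus.write32(acc.addr, acc.value)
        else:
            seen = bus.read32(acc.addr)
            if seen != acc.value:
                mismatches.append((acc, seen))
    return mismatches
```

Keeping the bus behind a two-method interface is a design choice for this sketch: the same `replay` function could then be pointed at an emulator or at halted hardware, which mirrors the passthrough arrangement described in the talk.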