and right on time. So right now we're going to listen to Jean-Christophe Delaunay, a former pentester who used to play a lot with Microsoft Active Directory infrastructures, on both the defensive and offensive sides, at Synacktiv, a French offensive security company. He is now in the reverse engineering team within his company, focusing on Windows and hardware topics. His talk is "IOMMU and DMA attacks", or input-output memory management unit and direct memory access attacks. So let's start.

Hi everyone. In this presentation I'm going to talk about IOMMU and direct memory access attacks. But first of all I'd like to thank the NorthSec staff for their incredible work in organizing this event online. You guys rock, really. Thank you very much. Also, I hope everyone is safe.

A bit of presentation. I'm Jean-Christophe Delaunay, Fist0urs on Twitter. I'm working for a French offensive security company called Synacktiv. We have three teams: pentest, reverse engineering and development. As was announced, I'm working in the reverse engineering team now, but I was previously in the pentest one and used to focus on Windows-related subjects like Active Directory, authentication schemes, etc. In the reverse engineering team we focus mainly on low-level subjects like reversing, hardware, pwn, etc. And if you understood well, yeah, I'm French. I speak croissant. So please excuse my poor English and accent.

Now a quick roadmap. First I'll give a short introduction to DMA subjects. Then I'll focus on the various IOMMU implementations within mainstream OSes. And finally I'll explain some attacks, then conclude and talk a bit about our ongoing work.

First, a quick disclaimer. In this presentation I'll talk about known attacks on Intel technology only, and I'll stay quite high level. This is mostly state-of-the-art stuff, so no real brand new fancy things for DMA experts, even though I'll talk a bit about Thunderbolt at the end.
The attacks in this presentation target a computer that is already switched on. The attack vector we consider is the evil maid one: for example, someone accessing your hotel room during a conference. This is the evil maid scenario.

What is DMA? Well, on this slide I've put a simplified explanation of what it is. On the left you can see the normal workflow that occurs when transferring data with an external peripheral. It goes through the CPU and involves I/O requests, interrupts, etc., in order to interact with the main memory. On the right is the DMA workflow. You can see that there are not as many elements as in the left illustration. In fact, in order to increase performance, a dedicated bus is used with a dedicated controller to be able to access the main memory directly.

Here are the various technologies using DMA. You've probably heard of all of them. We have the good old PCI and AGP. We have FireWire, PCI Express, etc. You can see that I've put many illustrations regarding PCI Express: the graphics card, which is well known to use PCI Express, but also an NVMe SSD drive and a USB-C connector with Thunderbolt support. These two also rely on the PCIe bus and offer DMA.

Now we'll talk about the IOMMU. Intel implements what it calls Virtualization Technology for Directed I/O, which is also known in short as VT-d. Basically, this is the IOMMU stuff I will talk about. Its purpose is to perform DMA remapping in order to control which memory locations are reachable and by whom. This DMA remapping works like a classical MMU, hence the term IOMMU, in that it takes addresses manipulated by peripherals and translates them to physical addresses. Although it functions like a classical MMU, these peripheral addresses are not bound to any real virtual addresses that we could observe while debugging a process, for example. They have their own address space. A really important notion I will often refer to is the term "domain".
In the PCI world, peripherals are organized into what are called domains. Each domain has its own IOMMU configuration, and what is important to remember here is that all peripherals within a single domain share the same memory mapping. Basically, what it means is that, for example, if I have a graphics card and a network card which are under the same PCI domain, then the network card could access the graphics card's memory pages. We'll see later why that is important.

In order to be identified, peripherals are assigned triplets (bus, dev, fun), where bus is the bus number, dev is obviously the device number, and fun represents the function of the device. For example, if we stick with the graphics card example, let's say that we have a cyber-digital NSE-compliant graphics card which also makes, I don't know, coffee. Well, we would have two triplets, one with a function for graphics and one with a function for coffee.

We'll finish this short introduction with some use cases of IOMMU implementation. The first one, which is actually the well-known one, is the hypervisor use case. Imagine I have a virtual machine and a peripheral attached to it. Because my peripheral is attached to this VM, I don't want it to be able to access my host's main memory. The OS use case is pretty much the same thing, except that I want to protect my OS kernel from rogue peripherals. So I must ensure that peripherals can only access their own memory pages.

Now that we've seen a bit of what DMA and the IOMMU are, we'll take a close look at the implementations in various OSes. First, let's take a look at Microsoft Windows. On Windows, the IOMMU is used by technologies such as Hyper-V, virtualization-based security, which is also called VBS, and the new Kernel DMA Protection. Hyper-V, which you've normally heard about, is the Microsoft virtualization technology. VBS relies on Hyper-V and makes it possible to isolate critical components such as LSASS from an insecure world.
We'll see what it is after. And finally, Kernel DMA Protection is a new feature introduced with Windows 10 version 1803. Basically, what Kernel DMA Protection does is prevent peripherals whose drivers do not support memory isolation, such as DMA remapping, from working while the computer is locked. You can see that I wrote on the slide that this is according to Microsoft, because unfortunately, Windows being closed source, we can't be 100% sure, and there is not much technical documentation on the IOMMU usage here. So reverse engineering work is needed, and it is actually in progress on our side.

On this slide, I've put a diagram of what VBS is. We won't go into much detail on it, as this is not the purpose of this presentation. On the right is where the user operates when using their computer. It's called VTL0. And on the left, you have the secure world, which is called VTL1 and is isolated from VTL0. Technologies such as Credential Guard, Device Guard, etc. all rely on VBS. And the underlying stuff is the IOMMU.

Now let's talk about the Linux IOMMU implementation. On Linux, the IOMMU is not activated by default. You can change this with a special boot argument, intel_iommu=on, but it's not enabled by default. By the way, there is also an AMD IOMMU, but this talk is about Intel. Each IOMMU type defines an iommu_ops structure, which serves as an abstraction layer when interacting with the hardware. We'll see this on the next slide. And here on the slide, when I say platform type, I don't refer to x86, 64-bit, etc., but rather to Intel, Exynos, MediaTek and so on.

In the Linux world, a virtual address as seen by a peripheral is called an IOVA. It is associated with a physical address, PADDR, with its corresponding ACLs. On Linux, memory mapping is done per domain and not per peripheral. But each peripheral also has its own domain. So in fact, each peripheral has its own address space.
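The remapping idea described so far can be sketched in a few lines. This is a minimal toy model, not the Linux implementation: it assumes 4 KiB pages, uses a flat dictionary instead of real page tables, and the names `ToyIommu`, `Domain` and `requester_id` are hypothetical. The only hardware fact it relies on is that a PCI bus/device/function triplet packs into 16 bits (8 bits of bus, 5 of device, 3 of function).

```python
from dataclasses import dataclass, field

PAGE_SIZE = 0x1000  # 4 KiB pages, an assumption made for this sketch


def requester_id(bus: int, dev: int, fun: int) -> int:
    """Pack a bus/device/function triplet into a 16-bit PCI requester ID."""
    if not (0 <= bus < 256 and 0 <= dev < 32 and 0 <= fun < 8):
        raise ValueError("bus is 8 bits, device 5 bits, function 3 bits")
    return (bus << 8) | (dev << 3) | fun


@dataclass
class Domain:
    """One I/O address space: IOVA page number -> (physical page, writable)."""
    pages: dict = field(default_factory=dict)

    def map(self, iova: int, paddr: int, writable: bool = False) -> None:
        self.pages[iova // PAGE_SIZE] = (paddr // PAGE_SIZE, writable)

    def unmap(self, iova: int) -> None:
        self.pages.pop(iova // PAGE_SIZE, None)


class ToyIommu:
    """Looks up the domain of the requesting device, then translates the IOVA."""

    def __init__(self) -> None:
        self.domains = {}  # requester ID -> Domain

    def attach(self, rid: int, domain: Domain) -> None:
        self.domains[rid] = domain

    def translate(self, rid: int, iova: int, write: bool = False) -> int:
        domain = self.domains.get(rid)
        entry = domain.pages.get(iova // PAGE_SIZE) if domain else None
        if entry is None or (write and not entry[1]):
            raise PermissionError(f"DMA fault: device {rid:#06x}, IOVA {iova:#x}")
        return entry[0] * PAGE_SIZE + iova % PAGE_SIZE


# One domain per peripheral, as described for Linux.
iommu = ToyIommu()
nic = requester_id(0x03, 0x00, 0)  # hypothetical network card
gpu = requester_id(0x01, 0x00, 0)  # hypothetical graphics card
nic_domain, gpu_domain = Domain(), Domain()
iommu.attach(nic, nic_domain)
iommu.attach(gpu, gpu_domain)

nic_domain.map(0x10000, 0x7F000, writable=True)
print(hex(iommu.translate(nic, 0x10010)))  # the NIC reaches its own page

try:
    iommu.translate(gpu, 0x10010)  # same IOVA, but not mapped in the GPU's domain
except PermissionError as fault:
    print(fault)
```

Attaching both devices to the same `Domain` object instead would let the second device translate the first device's IOVAs, which is the shared-mapping property of a domain mentioned earlier.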
So here is the iommu_ops structure for the Intel platform. Looking at it, you can see why it is considered an abstraction layer. I've also put on the right the map function and the corresponding unmap function. You have all the function pointers in here. This one is for Intel, but we would have the same for AMD, Exynos, etc.

And we finish the implementation part with macOS. Apple understood many years ago that hardware security is really important. They've been adding IOMMU support for quite some time, actually. A lot of parts are open source, so we can take a look, but unfortunately the driver part is not. So at Synacktiv, we began to reverse engineer the UEFI part in order to understand how it is implemented.

In our reverse engineering work, we saw that UEFI is involved in the IOMMU configuration process. This means in particular that the IOMMU is enforced at boot time. We also saw that there is a custom UEFI protocol that lets drivers configure IOMMU mappings for peripherals. Here, when I say protocol, I mean protocol in the UEFI sense: basically, there is an interface implementing functionalities, and this interface has a custom GUID. Drivers can talk to this interface. They just have to specify the GUID they want to talk to.

When UEFI hands off to the OS, the Apple IOPCIFamily driver reinitializes the IOMMU for the OS context. This driver also defines the AppleVTDDeviceMapper class, which overrides the IOMapper class. You can easily guess what these classes do, as their names are quite self-explanatory. The AppleVTDDeviceMapper class also redefines iovmMapMemory and the corresponding iovmUnmapMemory API, which add and remove memory mappings within the IOMMU. And what is really, really, really important is that, unlike Linux, macOS uses a single domain for all peripherals. We'll see later why this is really important.
Okay, we've seen the theoretical stuff and we'll now see the attacks. Actually, if you've been bored until now, I hope that will change. We'll see.

Once again, we'll start with Windows. A quick reminder first: we consider here that the target computer is already switched on and is locked. As we saw earlier, Windows does not use the IOMMU by default. The workflow of this attack is actually pretty easy to understand. First, we find a way to connect to the PCI bus. We then probe the target computer's main memory, searching for the unlocking routine. We patch the password checking routine. Voilà, you can log in with whatever password is entered.

Now we'll see in a bit more detail how to do so. I said we are probing main memory, searching for the unlocking routine. This routine is called MsvpPasswordValidate and is located in NtlmShared.dll. In red, I've highlighted the RtlCompareMemory API, which is used in this routine to compare the entered password with a normally valid password. On the left, I've displayed the opcodes corresponding to the instructions in IDA Pro.

Now, as I said, we want to patch the password checking routine in order to be able to log in with whatever password is entered. Here is the result of such a patch. As you can see, I've NOPed the previous conditional jump to ensure that we continue in the branch we are interested in, this branch being the one that basically says "this is the good password". So we manage to have a good password, whatever password we enter.

To carry out this attack, we have to search for specific opcodes in main memory. For that, we use the awesome PCILeech tool from the no less awesome Ulf Frisk. If you look at this slide, you can notice many things. First, my amazing Microsoft Paint skills. I'm quite proud of them. But more importantly, the two opcode patterns I'm searching for in memory. We can see C60FA4 in red and FBF, etc., in orange. We are looking for them at offsets 0x73a and 0x73e in memory.
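The search-then-patch logic of such a signature can be sketched on a plain byte buffer. This is only an illustration of the matching idea, not PCILeech's code or its signature file format: it assumes the offsets are relative to a 4 KiB page, the pattern bytes and the patch offset and length are placeholders rather than the real opcodes from the slide, and `find_pages` and `patch` are hypothetical helper names.

```python
PAGE_SIZE = 0x1000  # assumption: signature offsets are relative to a 4 KiB page
NOP = 0x90          # x86 single-byte NOP


def find_pages(memory, chunks):
    """Yield the base address of every page where all (offset, pattern) pairs match."""
    for base in range(0, len(memory) - PAGE_SIZE + 1, PAGE_SIZE):
        if all(memory[base + off:base + off + len(pat)] == pat for off, pat in chunks):
            yield base


def patch(memory, base, offset, data):
    """Overwrite len(data) bytes at base + offset, in place."""
    memory[base + offset:base + offset + len(data)] = data


# Placeholder patterns (NOT the real opcodes), at the two offsets quoted in the talk.
CHUNKS = [
    (0x73A, bytes.fromhex("aabbcc")),
    (0x73E, bytes.fromhex("ddee")),
]

# Fake "main memory": four empty pages, with the patterns planted in the third one.
memory = bytearray(4 * PAGE_SIZE)
target = 2 * PAGE_SIZE
for off, pat in CHUNKS:
    memory[target + off:target + off + len(pat)] = pat

hits = list(find_pages(memory, CHUNKS))
print([hex(h) for h in hits])

# Illustrative patch: two NOPs at a placeholder offset inside the matched page.
for base in hits:
    patch(memory, base, 0x73B, bytes([NOP, NOP]))
print(memory[target + 0x73A:target + 0x73D].hex())
```

Requiring two patterns at fixed offsets within the same page keeps false positives low, since a single short byte sequence would match in many unrelated pages.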
Here on the slide, I haven't highlighted these offsets as they were not aligned, but we will see the offsets on the next slide. Now we can take a look at the patch. We are patching at offset 0x73b with opcodes NOP, NOP, NOP, etc. So what you can understand from this slide is that we are searching for two specific patterns at two specific offsets, and we provide a patch to be applied at a specific offset. This kind of signature is all you need to be able to use PCILeech. Well, apart from the hardware, of course.

So the prerequisites for these attacks are hardware and software. You can use a Spartan FPGA board with a USB 3 extension card for performance, or the PCIe Screamer R2. There is also a more recent version of it, which is based on the M.2 form factor. And actually, any adapter that lets you connect to the PCIe bus will do. On the software side, you can use PCILeech on Linux or Windows, and you also need the signatures I talked about in the previous slides.

Here is an example of my FPGA with its USB extension card on top. In blue you can see the extension card. Actually, the FPGA is quite big. We use it more for tests, etc. When you want to attack, you will prefer the PCIe Screamer because it's much smaller, much tinier.

On Linux, there is also no IOMMU used by default. The attack principle is actually exactly the same. You search for the password unlocking routine, which in this case is verify_pwd_hash from the pam_unix.so library. Then all you have to do is apply two patches, and at the bottom of the slide you have the signature for this specific library.

And finally, macOS. This part is the most interesting, as the IOMMU is enabled by default, as we said earlier. So in order to be able to compromise our target, we must find a way to bypass this IOMMU protection. Colin Rothwell found some interesting vulnerabilities during his PhD thesis. He did a really, really great job.
And following his thesis, he released, with other researchers, the Thunderclap platform. This platform contains both hardware and software and makes it possible, among other things, to compromise a system running macOS prior to version 10.12.4.

The principle is the following. You may recall that I told you that on macOS, the peripherals are all under the same domain. This means that they share the same address space. So it is possible to access my network card's memory pages, provided that this network card relies on PCI Express technology, of course. Colin Rothwell exploited this behavior to be able to execute commands as root on macOS before version 10.12.4.

Let's see the attack now. First, we have to understand how network packets are described. These packets are described by what we call an mbuf structure, which you can see on this slide. We'll also see this structure on the next slide, so I'm switching to it. This particular structure has many fields across C unions. Among these fields, we can see that we have m_dat in red, m_pktdat in orange, and m_ext in blue. These elements represent the data which is stored in the packet. And because of the union type, there can be only one type at a time, depending on specific flags. You can see these flags in the comments in the structure: M_EXT set, M_PKTHDR set, etc. And spoiler alert, we are interested here in the m_ext one, which stores data in an external buffer.

So if you look at m_ext, I said that we are interested in it, and this means that we are interested in the corresponding flag, which enables it. This corresponding flag, which is M_EXT in uppercase, is stored in the mbuf structure header, as you can see on the slide in purple. So in this specific structure, we have a field which contains the flag, which maybe we can set. Okay, so let's say we have the M_EXT flag set, which, by the way, occurs when there are big packets.
So pretty often. Because an external buffer is allocated to store the data, it must be freed when it is no longer needed. The function in charge of freeing the buffer is stored in the m_ext structure, and it is stored as a function pointer. So you should now see what the problem is. Because we have DMA access, we can modify this function pointer through DMA and also control its parameters, which are also members of the m_ext structure. So we have complete access to this structure. We can alter it. All we have to do is overwrite this pointer with the address of the KUNCExecute API. This specific API makes it possible to launch a binary as root in userland. Then all you have to do is wait for the buffer to be freed. When it's freed, this function will be called.

Apple patched this vulnerability by adding some random values, which are XORed with the data to be protected. What it means is that if you don't know these random values, you can't carry out this attack anymore. And these random values are set during the boot process. So this attack is no longer feasible. If you remember, we said our attack vector was the evil maid, so the computer must already be switched on, and we don't want it to reboot.

Now let's conclude. DMA attack vectors are more and more discussed and are still a real threat, despite having been known for ages. As expected, macOS is ahead of its competitors regarding hardware security. But Windows seems to take the physical attack vector very, very seriously.

We plan to go further than the current state of the art during our French RAPID project, which I'll talk a bit about now. So this state of the art was done because we plan to go further. Our project is called DMArvest and consists in studying DMA subjects relying on a PCI bus, from software to hardware.
We plan to look at each mainstream OS, whether it's open source or not, at various technologies such as M.2, Thunderbolt, etc., and also at architectures like x86-64, ARM, etc., and many more things. At the moment, we are studying Thunderbolt on Windows. And because it's closed source, we are reversing it.

The Intel software suite for Thunderbolt contains what is called a Universal Windows Platform application. You probably know this stuff. It's the new, well, not so new, Microsoft application model with the online store, etc. We used to call it Metro. So this is the UWP application. This suite also has a service which communicates with the application, but also with the Windows Driver Frameworks drivers. And there are two drivers: a userland driver and a kernel driver. These are UMDF for userland and KMDF for kernel.

To finish, here's the first schematic we've done to represent the Intel Thunderbolt stuff. You can see that the UWP application and the service communicate through another binary, and that the WDF drivers interact with the Plug and Play manager. This is what we are doing at the moment, and we plan to publish about it later. The aim is to keep digging all the way down to the hardware layer to understand how the IOMMU is actually used. Thank you. If you have any questions...

Thank you very much. Thank you. We'll give maybe two to three minutes for the participants to go and have a look at the questions, and maybe upvote some of them or ask new ones. So things are good.

Hi. Welcome back. Okay. So first question: are you planning to do some IOMMU and DMA research on the iPhone?

Yes, we plan to do it. Actually, with the RAPID project, the French project, at the beginning we targeted solely the PCI Express technologies because there are so many things. But as iOS, the iPhone and phones in general are targets of interest and are more and more targeted, we plan to do it, yes. And I think it will be a good thing to look at, yes.
In the near future, or more like in a few years?

Difficult to say. As I said, we've begun to reverse the Thunderbolt stuff on Windows and it's pretty vast. So let's finish that, and after that we'll see.

I bet. Okay, good. So if you did some DMA research on the iPhone already, are you planning to release this research as well?

We haven't done the research on iPhone, and because this project is public, yes, we will publish if we find something.

Okay. Is the latest macOS still vulnerable to Thunderclap DMA attacks?

No, no, no. As I said, this particular vulnerability was patched, and the Thunderclap platform uses this vulnerability. So it's not exploitable anymore, but it's interesting.

Good. Are there any kinds of signal, signature or behavior that a threat hunter could monitor to detect such attacks in real life?

Signal, signature or behavior? I don't think so. Maybe if the targeted workstation wasn't rebooted. I think that's the only thing, because the principle of this kind of attack is that you access memory directly, so the RAM, and when you reboot, the RAM is just flushed for the most part. So maybe if you don't reboot the computer, but it's most likely very difficult to spot this kind of stuff, except if there are some backdoors afterwards, and that's not the DMA stuff.

Okay. Have you heard about DMA attacks targeting embedded devices using architectures such as ARM or MIPS?

Embedded devices, not really. A guy in my company did some stuff on the HP iLO. It's not really embedded, it's more of a server thing. And there is also the guy who produces the PCIe Screamer, Ramtin Amin, who did some DMA on the iPhone. So, well, it's not really embedded devices, but you have this kind of architecture there, ARM, etc.

Could we say that it's a little bit under-researched, or...?

I don't know. I don't know.

Okay. Can you read and write TPM registers at physical address 0xFED40000 through DMA when the IOMMU is disabled?

Which address? So I've heard the TPM, yeah.
Registers at physical address 0xFED40000.

You mean in the PCRs of the TPM? Because the TPM has some registers, and the principle of the TPM is that it's a secure device, a secure component, so you can't access it. I don't know if I understood the question well. Because if it's that, no, you can't access the TPM. But whatever the TPM exports has to be used by the operating system. At that moment, yes, you can, because it's in main memory.

Okay. I think, yeah, you pretty much answered it. Are there IOMMUs in embedded systems such as the STM32F?

Good question. I don't think so for the STM. Well, honestly, I don't really know. I don't think so.

Okay. Are you leveraging the work done by the Inception project in your research?

Yes, the Inception project was really cool. Actually, I've made some pull requests to it, but the guy who did this really great work doesn't commit a lot. Inception is basically the same thing as PCILeech. It's just that Inception also supports the FireWire case. So when you're a pentester and you want to do some attacks, if you have some FireWire, you will switch to Inception. Otherwise, you'll use PCILeech.

Okay. And one last question. For non-Apple hardware, does the IOMMU seem to be correctly implemented at the UEFI level?

We haven't looked at it, so I can't answer, unfortunately. But we plan to.

Near or far future?

Same thing. But actually, as I said, there is a lot of stuff which is open source, so there is probably already some material on the internet. Somebody has probably looked at it. I don't know.

Good. Well, thank you very much, Jean-Christophe, for being with us today.

Thank you to you. Thank you very much.

Big round of applause to you through the chat. Thank you, guys. And we'll take a short break and we'll be back.