Hello again, and good evening for the last session on day three of the Congress. I'm really happy to see so many of you, so late, interested in such a particular topic that might be really, really relevant for many in assessing our threat levels. We will hear more about direct memory attacks and how they're still possible nowadays, and Ulf Frisk is here to show you and tell you more about what you should know about it.

Thank you. Tonight we're going to talk about public FPGA-based direct memory access (DMA) attacking. My name is Ulf Frisk, and helping me with the demos today I have Peter Noreen.

I will start by briefly going through some background and previous work that has been done in the area. Then we'll jump straight into the actual DMA attacking. I will try to do a live demo in which we will transmit and receive PCI Express transaction layer packets. We will dump memory at speeds up to 75 megabytes per second. Then we'll have a look at the actual FPGA design that I created. After that we will go into some more advanced DMA attacking: we will attack a vulnerable vanilla Linux system and a vulnerable UEFI. If you manage to get into UEFI, you might also be able to compromise Secure Boot, and then you can also compromise the not-yet-booted operating system, such as a Windows 10 system running virtualization-based security. At the end we will have a look at some future hardware that I'm really excited about.

My name is Ulf Frisk. I'm employed in the financial sector in Stockholm, Sweden. I previously presented my work at the SEC-T conference in Stockholm and also at DEF CON in Las Vegas. I'm the author of the PCILeech direct memory access attack toolkit. This has been a hobby project of mine since the start, and it still is. I also need to point out that I'm giving this talk as an individual; my employer is not involved in any way whatsoever.

I'm here today to present PCILeech FPGA. PCILeech FPGA is the combination of the Xilinx SP605 development board coupled with an FT601
USB 3 add-on board. The PCI Express generation 1, one-lane side goes into the target computer, or if you wish to call it, the victim computer. The USB 3 side goes into the controller computer, or if you wish to call it, the attacker computer. Once both sides are connected, the controller computer is able to send PCI Express transaction layer packets over USB to the FPGA, which will then put them on the PCI Express bus of the target system. We can also read PCI Express TLPs this way from the target system, and they will be forwarded to the controller computer.

The whole hardware setup costs between five and six hundred dollars in total, and with it you will be able to do DMA to both the 32-bit memory address space below 4 gigs and the 64-bit memory address space above 4 gigs. You will be able to do DMA at around 75 megabytes per second. Everything that I created is totally open source, but unfortunately I'm using some vendor-proprietary blobs in there, so that's why the title of today's talk is "public" and not "open".

If I compare the SP605 FPGA solution with the earlier hardware I used for DMA attacks, the USB3380: the USB3380 was sold out earlier this year, and the FPGA solution is a little bit more expensive.
It's bulkier. It's also slower as it is at the moment, but it's much more stable, and you will be able to do 64-bit DMA memory addressing as well. That means that you're able to access memory above 4 gigs as well as memory below 4 gigs. This is a huge difference compared to the old hardware, with which we were only able to access memory below 4 gigs.

DMA attacks have been around since pretty much forever. I think you have all heard of Inception, the awesome FireWire DMA attacking tool. If you haven't used it or heard of it, please look it up. As a response to DMA attacks, and also as a response to the growing need for virtualization of devices, CPU vendors introduced IOMMUs, or VT-d, from around 2008 onwards. If the IOMMUs are used properly by the firmware and the operating systems, they should be able to protect fully against DMA attacks. As we'll see today, that's not always the case.

There's been lots of research in the DMA attacking space, and I can't mention everyone here today. I thought I should mention [name unclear]'s work with IronHide from the academic area, which was used for a PhD thesis. Also, snare and rzn did a really awesome Thunderbolt DMA attacking talk back in 2014, actually using the exact same hardware that I'm using here today, the SP605. And then just a couple of months ago Dmytro Oleksiuk released what I know to be the first DMA-attack-focused FPGA bitstream into the public, with his PCI Express DIY hacking toolkit. Dmytro also supported my work with PCILeech; he shared both binaries, at first, and some source code with me, and it really pushed me to actually get the SP605 from the start and get going here. So a really huge thanks to Dmytro. Without you, I wouldn't be here.
Thank you.

PCI Express is packet based. The packets are called transaction layer packets, or TLPs. They are DWORD based, that is, 32-bit based. They usually consist of a header that is between three and four DWORDs long, and the TLPs can have different types, for example memory read, memory write, I/O, configuration, messages, completions and so on. Let's focus on the DMA TLPs here today, the memory read and write TLPs.

The 64-bit write TLP is down on the left. It starts with which type of packet it is in the first DWORD, and then you also have the length of the data that you wish to write, in number of DWORDs. The second DWORD contains the requester ID, which is the bus number and device number of the actual device sending this TLP. Then, since we are doing a 64-bit write, which means that we're writing to the 64-bit memory address space, we need to represent that address in two DWORDs, and then we have the data at the end. When we do a write, we just post this message onto PCI Express and trust that it will get written. We won't get any acknowledgement back on whether it was successful or not.

When we are doing a read, the packet looks pretty much the same, except it's a different type, of course, since we are doing a read. Here we are doing a 32-bit memory read. Once you submit that one, you need to wait a short while, and you will receive one or more completion TLPs back containing the actual data that you read.

So let's do a demo. Let's transmit and receive PCI Express transaction layer packets, let's enumerate the memory, and let's dump the memory. If we switch the image over to the hardware here: here I have the FPGA board, and I have a victim system here. So let's insert our ExpressCard to PCI Express adapter in the target computer and power on the FPGA. It's connected to my presenter computer via USB here. If we switch back to my presentation... and here we have it from a slightly different angle, the hardware. Here we are trying to read something.
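
[Editorial sketch, not part of the talk.] For readers following along in text, the memory read and write header layout described above can be sketched in Python. This is an illustration only and is not taken from PCILeech: the helper name `build_mem_tlp`, the requester ID `0x0100`, the example payload and the choice to serialise each DWORD big-endian are all assumptions made for the example. The field positions (format/type and length in the first DWORD, requester ID, tag and byte enables in the second, then one or two address DWORDs) follow the memory request layout in the PCI Express base specification.

```python
import struct

# Fmt/Type byte (top byte of the first DWORD) for memory requests.
FMT_TYPE = {
    ("read", 3): 0x00,   # memory read, 3-DWORD header, 32-bit address
    ("read", 4): 0x20,   # memory read, 4-DWORD header, 64-bit address
    ("write", 3): 0x40,  # memory write, 3-DWORD header, 32-bit address
    ("write", 4): 0x60,  # memory write, 4-DWORD header, 64-bit address
}

def build_mem_tlp(kind, address, length_dw, requester_id, tag=0, payload=b""):
    """Hypothetical helper: build a memory read/write TLP as raw bytes."""
    if address % 4 or not 0 <= address < 1 << 64:
        raise ValueError("address must be DWORD aligned and fit in 64 bits")
    if not 1 <= length_dw <= 1023:
        raise ValueError("this sketch supports lengths of 1..1023 DWORDs")
    if len(payload) != (4 * length_dw if kind == "write" else 0):
        raise ValueError("writes carry length_dw DWORDs of data, reads none")

    # Addresses below 4 GiB use the 3-DWORD header, others the 4-DWORD one.
    header_dws = 4 if address >= 1 << 32 else 3
    dw0 = (FMT_TYPE[(kind, header_dws)] << 24) | length_dw
    # Byte enables: all bytes of the first DWORD; last-DWORD enables are
    # zero for a single-DWORD request.
    last_be = 0xF if length_dw > 1 else 0x0
    dw1 = (requester_id << 16) | (tag << 8) | (last_be << 4) | 0xF
    if header_dws == 4:
        addr = [address >> 32, address & 0xFFFFFFFF]
    else:
        addr = [address]
    return struct.pack(f">{header_dws}I", dw0, dw1, *addr) + payload

# Read one DWORD from exactly 4 GiB, then write two DWORDs to the same place.
read_tlp = build_mem_tlp("read", 0x1_0000_0000, 1, requester_id=0x0100)
write_tlp = build_mem_tlp("write", 0x1_0000_0000, 2, requester_id=0x0100,
                          payload=bytes.fromhex("1122334455667788"))
print(read_tlp.hex())
print(write_tlp.hex())
```

The read request is header only, while the write carries its data directly after the four header DWORDs, which matches the description of a posted write with no completion coming back.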
We are going to read one DWORD from the 64-bit memory address space. We are going to read from the address at 4 gigs exactly, this address here. Let's see what happens. Here we send the read TLP, and we get a completion TLP back. In the completion TLP, the first three DWORDs are the header, and then we have the actual data that we read here.

So let's do a write as well. Let's do a 64-bit memory write to the same address. Let's do a few-DWORDs-long write to the very same address with this data, and see if we can overwrite the previous data. So we send that TLP, and since we are doing a write we won't get an answer back, no completions or anything like that. But we can try to read the memory back to see if the write was successful. Let's try to read 30 DWORDs this time from the very same address. Here we see that we get the data back in two different completions, and if you check in the beginning, you see that the previously read data is now overwritten with our new data.

We can also enumerate the memory of the target system. Since we don't know how much memory is in this computer, we need to check it out, and we can do this by reading a tiny portion of every page that we are able to read and seeing how much memory there is. The physical memory address space in a modern-day computer is not one big contiguous chunk of memory. You have physical memory in there, and you also have holes in memory in which there is nothing. You have memory-mapped PCI Express devices. You can have unreadable memory, such as system management mode memory, as well. Here we see that reads seem to be failing after slightly more than 8 gigs. This is probably an 8 gig system.

So let's try to dump the memory. Dumping memory takes a while, so let's go back to the presentation.

These are all PCI Express form factors.
You have the standard PCI Express card, as you all know, to the lower left. You have Mini PCI Express, which sits behind the back cover of pretty much the average laptop. You have the ExpressCard that I use here today. Thunderbolt also carries PCI Express; Thunderbolt 3 is most often combined with the USB-C connector nowadays. And then you have the different M.2 key form factors. For example, M.2 key M is really common for NVMe drives.

Here is the actual FPGA design that I created. It's rather simplistic. You have a block that receives and transmits data over a 32-bit data connection from the USB 3 hardware, the FT601, and then you have the Xilinx PCI Express core on the other side that handles the actual PCI Express communication. Everything in yellow here is Xilinx IP blocks, or IP cores, and they are not open source; it's vendor-proprietary stuff. Everything in green here is stuff that I created, though, so it's totally open source and it's found on my GitHub.

We receive some data over the USB connection from the controller computer, and we actually receive some data and some metadata, because we need to know what kind of data we are receiving. If the data is part of a transaction layer packet, a TLP, we put it on the first-in, first-out queue, the FIFO queue, for TLPs. If it's some other kind of data, for example internal loopback debug data, we put it on an internal loopback FIFO. When we put the data of a TLP on the TLP FIFO, we transmit it to the Xilinx PCI Express core, and that one will take care of everything practical. We receive TLPs from the Xilinx PCI Express core as well, and since we have different FIFOs here that we wish to read data from, we need some merge logic to merge it into one stream that we can send back to the controller computer. Actually, everything like formatting of the TLPs is done in software on the controller computer. So this is a rather simplistic design, but it works.

So let's jump into some more advanced DMA attacking. Let's do a demo on a vulnerable vanilla Linux system. Let's locate and patch into the Linux kernel. Since Linux kernel version 4.8, I believe, the kernel is fully randomized in the physical memory address space, which means that it's very likely that it will end up above the 4 gig limit, and here the FPGA hardware really shines compared to the older attack hardware that I used. So let's try to find the Linux kernel and patch into it, let's mount the file system, and let's unlock the computer.

So here we have the Linux computer, and we see that the memory dump was successful here. It's a little bit slower today since I'm going through a USB hub, unfortunately, but the memory dump seems to have worked. If we switch to the FPGA image here... Okay. Let's try to log on to this computer. We try to log on with a password of a single "a" here, and it's the wrong password. We cannot get into that Linux computer. So if we switch back to the presentation.

We can insert a kernel module into the running Linux kernel. We try to locate the Linux kernel, and as we can see here today, it's actually found below 4 gigs.
It happened to be randomized in that position, but it seems to be working anyway. Let's mount the live file system using the kernel module address here. Once the file system is mounted, we can just click into it. Actually, we have mounted the live memory, the live RAM, as well. We can go into the etc folder and locate the shadow file, which contains the password hashes of the users. We can just edit it in our favorite editor here. Here we have lots of user accounts with no hashes, and we have the user account at the very end that has a very long password hash. Of course, if you know the password hash, you can try to crack it or something like that, but that's no fun. It's much easier to just delete it and replace it with something else, and then we hit save. Let's see if we can log on. If we switch back to the FPGA... we try the single password of "a". Thank you.

So let's go back to the presentation. We need to go to the other computer here, if we can switch the camera to the other computer that was filming already.

We can also attack UEFI. Some UEFIs may protect themselves against DMA attacks; most UEFIs don't. If you are able to get into UEFI, you might even compromise Secure Boot. Let's try to get into UEFI here today. Let's backdoor the ExitBootServices function that is called by the operating system loader once it wishes to take control of the target system. Let's retrieve the EFI memory map, and let's also patch the not-yet-booted Windows kernel that is loaded at this stage. Dmytro has done some really awesome work in this area as well, so if you haven't checked out his stuff, I really would like you to do that.

So if you switch to the... now, maybe we can have this here. So here we have another system. I need to switch around the FPGA here, I think. Cabling. So what we are doing:
We are inserting the FPGA here in the not-yet-booted computer, and we start it. If we switch back to the presentation... Failed to connect to the device. Let's try to do it again. Yeah, it works better this time; probably a bad connection. The computer is starting, and now the operating system loader has called into the ExitBootServices function, which we hooked with our code. We trapped it there. We retrieve the UEFI memory map, or the EFI memory map, here.

Once we are at this stage, the Windows kernel is already in memory: the normal NT kernel is already in memory, the hypervisor is already in memory and the secure kernel is already in memory. But the Windows operating system is not yet booted, so it cannot protect itself against DMA attacks yet. So here we can actually patch into the Windows kernel.

If you look at Windows virtualization-based security, it has something that we can enable that protects kernel code integrity with the help of the hypervisor and the secure kernel. With regards to evil devices that are trying to do DMA access to memory: the hypervisor and the secure kernel memory, we have no access to that memory at all. Normal executable pages in the normal Windows space, normal user space and normal kernel space, are marked as read-only with regards to DMA from evil devices, so we cannot patch the memory directly there. Normal non-executable pages are pretty much as usual, read-write. But as I said, the kernel mode code integrity features are not yet enabled at this stage, since the Windows operating system is not yet booted. So let's try to insert some code there, and then [unclear].

Here we communicated with our UEFI module.
We located the Windows kernel, and we located some code caves in there to put our code in. Now Windows is booting and enabling virtualization-based security. We cannot edit the kernel anymore, but our evil code is already in there, so we should be able to try to log on to this computer. If you switch to the FPGA: here we have the Windows computer. We try to log on to that one using no password at all, and as you can see we couldn't log on. If we switch back to the presentation, let's change that. Let's spawn a system shell. Here we are SYSTEM, and of course if you're SYSTEM you can remove the password of the user account. If you switch back to the FPGA, we can try to log on, and we're in.

If you switch back to the presentation: we can also dump the memory of the Windows system. Here we see that we get lots of failed pages when we are dumping the memory. These are pages that Windows protects, marked as not readable via the IOMMU, via VT-d. It's primarily the hypervisor and secure kernel pages in memory. We cannot read those, but pretty much everything else we can.

PCILeech FPGA is open source, pretty much, at least the parts I coded. It's found on GitHub, and I try to make it as easy to use as possible. You don't need any prior FPGA knowledge at all; you should just be able to flash it onto this hardware and start DMA attacking. Unfortunately, it's Windows only at the moment on the attacker PC. I have some Linux driver problems with the hardware I'm using here. I hope to resolve that quite soon.

What's even more exciting is that there seem to be lots of devices coming quite soon that will be able to do DMA attacks. Some devices will be really inexpensive, while some others will be a little bit more pricey, but still less pricey than the SP605 solution. One such example is a new piece of hardware, the PCIe Screamer, by Ramtin Amin. It's going to be easier to use, it's going to have a lower price tag than the SP605 solution, and it's going to be more capable, with PCI Express generation 2. I plan to add support for this one sometime early 2018, so it's going to be really, really early next year, hopefully in the coming month.

To sum everything up: affordable FPGA DMA attacking is the reality of today. Physical access is still an issue. IOMMUs have been there in the hardware since forever, but they might not always be used. I hope I have shown you today that there is more research to be done in this area, and hopefully my tools will be useful to everyone that is interested. Thank you.

Thank you so much, Ulf. So everybody just saw that you should always keep your devices on your person. And we have questions. Microphone one, please.

One question I have: right now you're dumping memory, doing edits in memory and patching the kernel. Did you have the idea of, say, writing a driver for a virtual machine which maps another machine's memory into that virtual machine, so that you can kind of stop the processor on the attacked machine and use a virtual processor to do operations on the memory of the victim machine? Then you can see what the program is doing in your emulator.

I haven't gone into attacking with virtual machines and nested stuff as well, but it's an interesting idea to be able to go into. I do have kernel access at the moment, so it should be possible. But this is a hobby project of mine, and my time is a little bit limited here. The stuff is out there, so it would be awesome if someone could actually look into this. I think it might be quite useful.

So we have a lot of questions here, also from the Signal Angel.

It's actually not that many, just two. What prevents you from implementing the PCIe device without any proprietary stuff? And is the controller limited to Windows because of that proprietary stuff?

To answer the Windows question:
I believe I'll get it working on Linux quite soon. It's just a driver issue. I just haven't had the time to actually code it for Linux yet. I had a little bit of a problem with that driver, but it shouldn't be any problem really; I just need to find the time to actually do it. With regards to the other question: I'm quite new to FPGAs, actually, so I just use the default tools that the Xilinx toolkit provides. It should be possible to replace some elements with more open elements in this design as well, but I'm really an FPGA noob here. This was my first attempt at an FPGA. So it should be possible to do this as well.

So you should talk to each other further. Microphone two, please.

I wonder if you can access memory used by the Intel ME, the UMA, which is not accessible by the main CPU.

No, this is off limits for this. It's going to be mapped away in the PCH, the Platform Controller Hub, so I shouldn't be able to access it, and I cannot access the system management mode memory either.

Okay, thank you.

And the last question, from microphone three.

You're using ThinkPads, as I've seen. Do any BIOS settings of those ThinkPads interfere with your DMA attack? For example, does disabling the ExpressCard slot really help, or is it just disabling the power lines or something?

Disabling the ExpressCard slot will help; then I can't get in via the ExpressCard slot. But usually on laptops, if you unscrew the back cover, there is something like a Wi-Fi card in there. That's probably going to be PCI Express as well, and maybe it's harder to disable that one.

If I may, the question before the last one I can answer: you can't replace some of the Xilinx cores, for example the PCI Express one, because that's so-called hard IP. That's really non-changeable stuff on the FPGA.

So it's just, yeah, hardware.

It's hardware, yeah, but the [unclear] you should probably be able to.

Thank you.
Thank you. On microphone two, did you want to say something still? Okay, no. So thanks again. Thank you, Ulf Frisk, and... somebody showed up at microphone one.

Yeah, so regarding the hard IP: what these hard IPs normally implement is the physical interface to PCI Express, which is doing these transaction layer packets. But the actual DMA is usually done using an IP core which you load into the thing. So usually it's the DMA IP core which is proprietary, running on top of the hard IP for the PCI Express physical layer. So you would probably need an open DMA IP core.

Okay, yeah. Thank you.

Okay, so now we're done with all the questions. I guess you will have a lot of people surrounding you after the talk who will not speak into microphones. I wish you a great evening, and thanks again, Ulf Frisk.