All right, so there's a bit of a story here, and I don't know the full thing, but you'll notice that the speaker is either tiny or non-existent, which is true. Well, he does exist. He is just not here. Our speaker apparently had enough juice to convince us to deliver this presentation in an automated way, streamed via iCloud.com. That's right. We're going to try to stream an entire presentation from the internet right here on stage. Yeah, so you guys know about autonomous things, so this is Defcon's first autonomous presentation, and we also know what might happen if you get into an autonomous vehicle. That shit might just completely crash. So we have no earthly idea what's going to happen here. We hope it really... Yeah, we hope that it goes really well. However, there's one thing for sure. This is the first time we have used iCloud.com as a presentation medium, which we think qualifies iCloud.com as a first-time speaker and presenter, so therefore, we're all in this together. Here's to you, iCloud.com. And without further ado, my very small co-presenter.
Hello, everyone. This is Sanad. Sorry, I couldn't make it to Defcon. Apparently, my visa got rejected, so I had to pre-record my entire talk. I hope I can convey the entire content of this presentation through it. If you have any questions, feel free to reach out to me on Twitter. My DMs are open. So I'm a 19-year-old security engineer at GoRoot, based in Berlin, Germany. On weekends, I play CTFs with DCUA, a CTF team based in Ukraine. I normally solve pwn challenges, and sometimes I upload solutions to them in my gist. So now let's get started with the talk.
So regarding heap exploitation, a ton of techniques have been published, starting in 2005 with House of Spirit and House of Force, which allow you to get a chunk anywhere in memory. The devs started patching most of them, but the research is still going on and there are a ton of new exploitation techniques coming up, like House of Orange in 2016, which was a challenge in HITCON, and House of Rabbit in 2017, which was also used in a few Asian CTFs. So this year, I thought maybe I'd research and find something that's a little more interesting and universally applicable to many different scenarios, which is why this is titled House of Roman.
So as I said, this technique is leakless. We just use a set of partial overwrites to achieve complete RCE on binaries that are compiled with PIE too, so we don't really know where the .text section is and we can't ROP. The best part is the server does not need to send us any data back. So even if stdout was closed, we could still get a shell. I demonstrated this technique earlier with a simple UAF, but that seemed to be a pretty severe bug, so I thought maybe we do it with a very simple bug, like an off-by-one or something, to demonstrate its versatility, and also use calloc instead of malloc, because calloc memsets the memory to 0, which makes this a bit more tricky and tough to perform, but it's still doable. So as I said, the bug is a simple off-by-one and nothing else.
So as you can see, I just made a simple binary, which has three functions: malloc, write and free. There's no print function. All it does is basically take a size for malloc, then it mallocs it, you enter a bunch of A's, and they get stored on the heap, as you can see. It's really a basic skeleton program. So this is a basic recap of how the algorithm works: you free a chunk, and it gets added to its appropriate free list. If it's a fastbin, it gets added to that singly linked list.
If its size is greater than 0x80, it either gets consolidated or gets added to the unsorted bin free list or to a small bin or large bin, which are doubly linked lists with FD and BK. In this program, there's no UAF; as you can see, the pointer in the array is nulled out. There's an off-by-one in the write function. So we can change the size of a chunk and overlap it with the neighboring chunks, and thus gain control of FD and BK to perform various sorts of heap attacks.
So this is a 0xd1-size chunk. When you free it, you get some weird pointers here. These are actually the arena pointers. They point to main arena, which is a libc symbol, to complete the doubly linked list of the unsorted bin. And as I said, main arena is a libc symbol, much like system, malloc hook and all of that stuff. It's interesting to know that malloc hook is pretty close, but I'll talk about that more later on.
So again, this is just a refresher for those of you in the audience who don't know what the unsorted bin attack does. It allows you to write a particular libc address anywhere you want. Now, you can't really control what you write; it's the address of main arena. More importantly, it's a libc address.
And then there are fastbin chunks, which are of size less than 0x80. Their head pointers are stored in the main arena at an offset determined by their respective size. If you free two chunks of the same size, then they're in the same free list. And it makes it really easy to exploit them because it's a singly linked list, so fewer checks. And if you find a particular size alignment for a fastbin free list which matches the size, then it's game over.
So to actually see the off-by-one in action, this is the attack plan. I have these 0x71, 0x21 and 0xd1 chunks malloc'd on the heap. Nothing fishy here, I just made a bunch of allocs.
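As a rough illustration of the fastbin recap above, the following Python sketch models how a chunk's size field selects its fastbin slot. It assumes a 64-bit glibc with default settings (16-byte size classes, a fastbin limit of 0x80, and the low three bits of the size field used as flags). The name `fastbin_index` mirrors the glibc macro, but this is a simplified model and not the allocator's actual code.

```python
from typing import Optional

FLAG_MASK = 0x7    # low three bits of the size field are flags, not size
MAX_FAST = 0x80    # assumed default fastbin limit on 64-bit glibc
MIN_CHUNK = 0x20   # smallest chunk size on 64-bit


def fastbin_index(size_field: int) -> Optional[int]:
    """Return the fastbin slot for a chunk size field, or None if the
    chunk falls outside the fastbin size range."""
    size = size_field & ~FLAG_MASK
    if size < MIN_CHUNK or size > MAX_FAST:
        return None
    return (size >> 4) - 2


if __name__ == "__main__":
    # Sizes mentioned in the talk: 0x21, 0x71, 0xd1, plus the 0x7f trick value.
    for field in (0x21, 0x71, 0x7F, 0xD1):
        print(hex(field), fastbin_index(field))
```

Under this model 0x71 and 0x7f land in the same slot, which is what lets a 0x7f value pass for a 0x71 free-list entry, while a 0xd1 chunk falls outside the fastbin range and goes to the unsorted bin when freed.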
My plan is to use the second 0x71 chunk to overflow into the 0x21-size chunk and make it something like 0xe1, and then of course put a fake size header at plus 0xe1 so that we don't fail any malloc assertions and we bypass all of them. As you can see, with this we overlap the 0x71, 0x21 and 0xe1 chunks with our 0xe1 chunk. So we could free them and we could actually control freed chunks. So this is what it looks like: this 0xe1 chunk, when we return it for a 0xe1 size, overlaps this 0x71 and 0x21, and there's a fake size header of 0xe1. Now, one thing you have to make sure of is that at plus 0xe1 there's a good size; like here, you can see there's a malloc'd 0x21. This is really important, otherwise you will fail a malloc assertion, especially when you try to free it.
So now let's assume we have control of the FD of a fastbin free list. Now we need to think about where we want to attack. The best way to attack that I've seen is to find libc addresses or stack addresses, because they start with 0x7f, so we can use the 0x71 free list and make it point somewhere near one of them. How does it work? Basically, we take a libc address and we shift it by 5 bytes. When you shift it, it becomes something like 0x00007f, which is a valid size for the 0x71 free list. So you actually fool malloc into believing that this is a valid fastbin free-list chunk on the heap, and a subsequent 0x71 allocation will actually return you that chunk.
So just in case you didn't understand this part, this is how it works. You take a libc address, and if you have a null qword in front of it, then you can just shift it, and you can see in the diagram it just keeps shifting, shifting, shifting until you reach 0x7f. And the memory that you have to make it point to, which is at something, something, something 980, becomes something, something, something 985. So that's where you actually make it point.
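The shifting trick described above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the pointer value `0x00007FFFF7DD1980` is a made-up example of a libc-style address, the helper name `misaligned_qword` is hypothetical, and memory is modelled as one little-endian pointer followed by a null qword.

```python
import struct


def misaligned_qword(pointer: int, shift: int) -> int:
    """Lay out an 8-byte little-endian pointer followed by a null qword,
    then read 8 bytes starting `shift` bytes into that layout."""
    memory = struct.pack("<QQ", pointer, 0)
    (value,) = struct.unpack_from("<Q", memory, shift)
    return value


if __name__ == "__main__":
    libc_pointer = 0x00007FFFF7DD1980  # hypothetical address, illustration only
    for shift in range(6):
        # Each extra byte of shift drops one more low byte of the pointer.
        print(shift, hex(misaligned_qword(libc_pointer, shift)))
```

At a shift of 5 only the 0x7f byte of the pointer remains, followed by zeros, so the qword read there looks like a size field for the 0x71 free list. If the pointer sat at an address ending in 980, that read would start at the address ending in 985, which matches the adjustment mentioned in the talk.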
I chose malloc hook because malloc hook is in a place which is surrounded by a bunch of libc addresses and a bunch of nulls, if you've ever looked at it. I think it's possible with free hook too, which I also discuss later: with an unsorted bin primitive, you can actually get to free hook. And free hook gives you more control over RDI and your arguments. But nonetheless, malloc hook also works to get a shell. I mean, you've seen this in a ton of CTF challenges and write-ups: point the FD there and get it. But we don't know the libc address, so we can't really set it to somewhere near malloc hook. So we need to find an address which already points somewhere close to it, and then maybe we could do a partial overwrite on it. That's the basic idea: do a series of four partial overwrites, some in the heap, some in libc, in order to put a libc address in place, get an allocation near malloc hook and get your shell. It's all partial overwrites because we're assuming that the program does not print any data back.
Just to show you: say the address ends in BACD in one run, which you found out through GDB. Then in your script you'll code something like \xcd and then \xXa, and that capital X is what you brute. You can set it to B, you could set it to 1, 2, 3, anything you want. That's the thing you have to brute.
So imagine we have this scenario. Like I said, we have 0xd1 getting overlapped by 0xd1, and so now I just free like two 0xd1 chunks and the FD points there, and what you see here is where it currently points. Instead of leaving it pointing here, we make it point to somewhere at the top, near that 0xd1 chunk. So what we basically do is we tell malloc, hey, after the next 0xd1, think that the free list is not down there, but it's rather up here. And then we use the overlap thing to change the size of the 0xd1 to 0x71.
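The \xcd, \xXa example above can be sketched as a small Python helper. This is only an illustration: the function names `partial_overwrite` and `candidates` are hypothetical, and the value 0xacd stands for the low 12 bits you would read off in a debugger. The sketch relies on the fact that ASLR randomizes addresses at page granularity, so the low 12 bits of a target stay fixed and only the fourth nibble has to be guessed.

```python
from typing import List


def partial_overwrite(low12: int, guess: int) -> bytes:
    """Build the two little-endian bytes that overwrite the low 16 bits of
    a stored pointer: 12 known bits plus one guessed nibble."""
    if not 0 <= low12 < 0x1000:
        raise ValueError("low12 must fit in 12 bits")
    if not 0 <= guess < 0x10:
        raise ValueError("guess must be a single nibble")
    return ((guess << 12) | low12).to_bytes(2, "little")


def candidates(low12: int) -> List[bytes]:
    """All 16 payloads an exploit script could try, one per guessed nibble."""
    return [partial_overwrite(low12, guess) for guess in range(0x10)]


if __name__ == "__main__":
    # Talk example: the target ended in 0xbacd during one debugging run.
    for payload in candidates(0xACD):
        print(payload.hex())
```

Each attempt picks one of the 16 payloads, so a single guess succeeds roughly one time in sixteen; the first byte is always 0xcd and only the high nibble of the second byte changes between attempts.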
So now we've made malloc think that that's a 0x71 chunk, because of course we just changed the size, and what's more, we've even made it think that the 0x7ffff7... address is actually an FD pointer to the next chunk in the 0x71 free list. Of course, it's just a pointer to main arena, but we can do a partial overwrite of the lower 2 bytes of this address and make it point close to, or just above, malloc hook. And of course, the lower 3 nibbles will always be the same, so we just need to brute the 1/16 probability that I talked about before.
This actually turned out to be a really big problem, because once I overlap, in order to get control of that overlap, I need to get the chunk returned. If calloc is the only way to return it, then it will just memset the entire region to 0. So I'm going to talk about the calloc bypass. I actually found this while looking at the source code of calloc during a CTF called RC3 CTF. It was a real relation CTF.
Anyway, the bypass is that if calloc returns memory that came from mmap, then it thinks that it's already zeroed out and it does not do the operation of nulling it out. This is more of an efficiency thing, because you expect a memory region returned by mmap to be zeroed out. So the fact that we exploit is this: we set the lower nibble of every chunk's size to 0xf, which basically sets the mmap bit, and we make calloc believe that this chunk is not part of the heap, it's part of an mmap segment. So calloc will be like, okay, I don't need to null it out then. And we will still have the arena pointers there. So for example, you have a 0x71 chunk. It's not 0x71 now, it has to be 0x7f. This is cool for fastbins, but when you come to unsorted bins, you have to make sure that if you have, say, 0x91, it should be 0x9f, but you also have to make sure that you allocate exactly 0x9f, because, you know, unsorted chunks have this previous size thing in the next chunk's header.
So if you allocate a size less than that, then you would have a size conflict, and it would check the previous size. Rather, if you allocate exactly the same size, then it will use up the entire unsorted chunk and it won't check the previous size. So you'll bypass that check. Again, this is just a simple demonstration from when I was writing the exploit. This is what it should look like: 0x9f, and then you have the arena pointers. There's a free chunk and a calloc, and I get the same chunk back. But as you can see, the memory is not nulled out. It still has the arena pointer, so I can partially overwrite that. Of course, if you want a more detailed analysis, I posted it on my gist as part of the solution for that particular challenge in RC3 CTF.
So this is the first partial overwrite, where we have the 0x91 free list and we want to make it point somewhere near malloc hook. So we overwrite the lower two bytes, with a brute force of four bits, and the next 0x91 allocation returns a chunk there.
The second partial overwrite is when you want to make malloc believe that the 0xd1 chunk that we are forging is actually part of the 0x91 free list. To do that, you need to use an already made 0x91 free list and make it point to the 0xd1 free list. And for that, you need to do a partial overwrite in the heap. So for example, I have a pointer here to 1398190, but instead I need to make it point to 1398110, the one which I just overwrote with the pointer to malloc hook. So now malloc thinks that after 13982d0, the next chunk in the free list is 1398110, which (a) is a 0xd1 chunk whose size I changed and (b) points to malloc hook.
Of course, even after we reach malloc hook, we can't ROP. We can't do stack pivoting. We can't build a ROP chain on the heap. We don't really know the .text and we can't leak it. So here's the third part, about writing a libc address into malloc hook itself.
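Returning to the calloc bypass discussed above, here is a small Python model of why a size field ending in 0xf keeps stale data alive. The three flag constants match the chunk header flags in glibc; everything else (the function names and the idea of passing the chunk contents as bytes) is a simplification invented for this sketch, and real calloc has additional paths that are not modelled here.

```python
PREV_INUSE = 0x1       # previous chunk is in use
IS_MMAPPED = 0x2       # chunk was obtained directly from mmap
NON_MAIN_ARENA = 0x4   # chunk belongs to a non-main arena
FLAG_MASK = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA


def chunk_size(size_field: int) -> int:
    """Size of the chunk with the three flag bits masked off."""
    return size_field & ~FLAG_MASK


def is_mmapped(size_field: int) -> bool:
    """True when the size field claims the chunk came from mmap."""
    return bool(size_field & IS_MMAPPED)


def calloc_returns(chunk_data: bytes, size_field: int) -> bytes:
    """Simplified model: zero the user data unless the chunk claims to be
    mmapped, in which case the old contents are handed back untouched."""
    if is_mmapped(size_field):
        return chunk_data
    return bytes(len(chunk_data))


if __name__ == "__main__":
    # Hypothetical stale arena pointers left in a freed chunk.
    stale = (0x00007FFFF7DD1B78).to_bytes(8, "little") * 2
    print(calloc_returns(stale, 0x91).hex())  # zeroed
    print(calloc_returns(stale, 0x9F).hex())  # stale pointers survive
```

In this model a 0x91 chunk comes back zeroed, while the same data behind a 0x9f size field comes back with the arena pointers intact, which is the state the partial overwrites depend on.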
So the third step in our attack vector is to do an unsorted bin attack on malloc hook. This is a freed unsorted bin list; the FD and BK both point to main arena, because there's only one chunk. And we overwrite it, actually we partially overwrite the lower two bytes of the BK again, with the address of malloc hook minus 0x10. You don't need to brute this time, because you already bruted the address of malloc hook.
Now we're down to the fourth and final partial overwrite: we have a libc address in malloc hook and we have a chunk close to it. So now all we need to do is partially overwrite the libc address that we just wrote into malloc hook so that it points to system or the magic gadget or any libc function that you want to call. So of course, this is what it looked like: before, malloc hook is zero; after, it holds the address of main arena. Of course, if you try to malloc here, then it will think main arena is a function address and try to execute code there, which of course will make it crash. So you've got to do the partial overwrite and then call malloc. The best way to trigger a shell is basically to use the magic gadget and then trigger a double free, which sets up the stack and correctly fulfills the constraints required to call the magic gadget appropriately.
Was that as awkward for you guys as it was for me? Because it was real awkward for me. Thank you all for staying. And how about that for the people that came in and were probably very confused. Let's give, I guess, PowerPoint and three clicks for no apparent reason to advance a slide a big round of applause.