All right. Next up we have Unikernel Apocalypse by Spencer Michaels and Jeff Dileo. Please give them a warm welcome. Hi, I'm Jeff and this is Spencer. We work for NCC Group doing consulting, and we're here to talk about some intern research that Spencer did last summer under my auspices. First and foremost, I'd like to mention that all the vulnerabilities we're going to discuss in this talk were disclosed last year; some of them have been fixed and some of them haven't, for various reasons. We define "vulnerability" here as an issue that may itself be exploited, or any failing of a security protection or mechanism that was intended to be there. So what are unikernels? Unikernels are specialized applications that bundle all of their application code and resources into a single binary that runs as a bare-metal VM, where the entire application stack runs in kernel space. What could go wrong? They're intended as an alternative to full VMs and Linux containers: the thing above a unikernel is the hypervisor, whereas the thing above a container is the kernel, and then maybe a hypervisor on top of that. They re-implement a lot of components, so they tend to have their own network stacks, potentially written from scratch in a different language than one might be accustomed to. And they're very specialized; sometimes you can port code to them easily and sometimes you can't. So why look at them? They're a very new technology. I don't think they're going to completely replace containers or anything like that, but they'll probably find a niche they fit well in. And they're being looked at seriously by people trying to make things like Qubes OS more secure.
And anything that people are doing to make Qubes more secure is definitely worth looking at, especially since some of the security claims from the people writing these things really don't add up. And then there was this blog post by Bryan Cantrill, originally of Sun, basically saying that unikernels were unfit for production because they were impossible to debug. That last part was very prescient, as we'll find out. So there are a lot of claimed security advantages to unikernels. Things like: there's no unnecessary code, because if it's not used it's not compiled in, since they link the whole thing together. There's no shell, so if an attacker somehow got remote code execution they couldn't just exec /bin/sh; they'd be out of luck. They're not reconfigurable; they're completely immutable, and if you want to change anything about a running one you have to rebuild the whole thing and reload it. And there are no syscalls, only function calls, so attackers supposedly need to know the memory layout to do anything. All of these claims are basically false on their face. The unnecessary-code thing: even if they were talking about dead-code elimination, that's not really true in general, and there are a whole bunch of unikernels that link in piles of modules that may or may not be used by the application code and just sit there. The shell doesn't matter: at the binary level, you can write shellcode if you get a buffer overflow, and in a higher-level unikernel, say one running Node.js, if you get an eval, I'm fine running JavaScript if I need to; I'll do that instead of /bin/sh. As for reconfiguration, we'll find that unikernels are very, very much not immutable; some of them even take YAML configs on the fly, or take binary uploads and just hot-load them. So that's just not true.
And the syscalls claim just doesn't mean anything; it's not even wrong. So our hypothesis is that unikernels may in fact reduce attack surface by throwing out a lot of the systemd-style garbage. But that alone doesn't make things more secure; the rest of it has to actually be secure too. And once someone gets in, there's no process isolation. You don't need root: you are the kernel. You can craft arbitrary packets, do disk I/O, speak PCI to whatever God-forsaken floppy disk is emulated and given to your VM, things like that. Back when we were doing this, we started out looking at the low-level stuff, with the intention of going higher and looking at all of the re-implemented application stacks, like the networking stacks. But we just kept finding things in the lower-level parts, so that's primarily what this talk is about. Things like: do these unikernels do address space layout randomization (ASLR), which is a protection against code-reuse attacks? Which memory regions and sections have randomized base addresses? Do they have page protections, things like DEP or NX that you might be familiar with? If the pages are read-write-execute, that's probably bad. Whether or not RELRO (relocation read-only) applies — in many cases it doesn't — which is a protection for dynamic linking ensuring that the tables of function pointers you link against aren't writable once loaded, since writable function pointers are a classic exploit primitive. And whether or not there are guard pages between the sections: those are pages between your sections with no read, write, or execute permissions at all, so that a contiguous buffer overflow running straight into one hits a page fault, and the whole thing dies without letting the exploitation continue.
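As a rough illustration of the checklist the speakers run through, here is a small Python sketch that flags writable-and-executable mappings and missing guard pages between adjacent sections. The region names, addresses, and permissions are entirely hypothetical, loosely modeled on the findings described later in the talk:

```python
# Toy audit of a memory map: flag W^X violations and missing guard pages.
# The layout below is made up for illustration.

REGIONS = [
    # (name, start, end, perms) -- perms is a subset of "rwx"
    ("text",  0x100000, 0x180000, "r-x"),
    ("data",  0x180000, 0x1a0000, "rwx"),   # should be rw- at most
    ("heap",  0x1a0000, 0x300000, "rwx"),
    ("stack", 0x300000, 0x340000, "rwx"),
]

def wx_violations(regions):
    """Sections that are simultaneously writable and executable."""
    return [name for name, _, _, p in regions if "w" in p and "x" in p]

def missing_guard_pages(regions):
    """Adjacent sections with no unmapped gap between them: a linear
    overflow can run from one straight into the next."""
    gaps = []
    for (a, _, a_end, _), (b, b_start, _, _) in zip(regions, regions[1:]):
        if a_end == b_start:
            gaps.append((a, b))
    return gaps

print(wx_violations(REGIONS))        # ['data', 'heap', 'stack']
print(missing_guard_pages(REGIONS))  # every pair is back-to-back
```

On a real target you would derive the region list from the binary's program headers or the hypervisor's view of the guest page tables rather than hard-coding it.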
And then things like the null page being mapped. The null page is a bit different in the unikernel world. In the normal world, where people think in terms of kernel and user space, null-page attacks are about attacking the kernel from user space by doing shenanigans with the null page. That's not what we're talking about here. In a unikernel, if someone writes Linux-style malloc code and therefore doesn't check for NULL — because on Linux, VM overcommit means malloc essentially never returns NULL; it always returns a seemingly valid pointer, and when you run out of memory the OOM killer just kills your process, so nobody checks — well, in embedded-style systems and unikernels, malloc can return NULL. But NULL may very well be a valid, readable, writable address. So if no one checks the return value and multiple allocations fail and return NULL, those callers are now sharing the same piece of memory, and it's essentially a use-after-free. That's interesting. Stack canaries: are they used? Are they initialized correctly? Are they filled with proper random data? Heap hardening: is there pointer validation? Are the allocations randomized? Are there heap canaries, and are they any good? Entropy: where does it come from? Is it good entropy? Is the RNG itself cryptographically secure? Is there any reuse of state in deterministic ways that would be bad across multiple VMs on the same hypervisor? And standard library hardening: things like the %n format specifier, or custom format specifiers, which may lead to insecure implementations in the libc itself, where there's a section of writable memory that contains nothing but function pointers or even code. And whether or not they support, or even use, FORTIFY_SOURCE. So our first target was Rumprun. It's an interesting research target and a fairly well-known unikernel in that circle.
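The NULL-return aliasing hazard described above can be sketched with a toy allocator. Everything here is invented for illustration: a bump allocator that returns address 0 on exhaustion, in a world where address 0 is mapped readable and writable:

```python
# Toy model of the mapped-null-page hazard: an allocator that returns
# address 0 on exhaustion, while address 0 is a valid writable address.
# Two failed allocations silently alias the same memory.

MEM = bytearray(64)      # pretend physical memory; offset 0 is "NULL"
_next = [16]             # bump-allocator cursor (low 16 bytes reserved)
LIMIT = 64

def toy_malloc(size):
    """Linux-style callers assume this never returns 0."""
    if _next[0] + size > LIMIT:
        return 0         # out of memory -> NULL... but 0 is mapped!
    addr = _next[0]
    _next[0] += size
    return addr

a = toy_malloc(40)       # succeeds
b = toy_malloc(32)       # fails -> 0, caller doesn't check
c = toy_malloc(32)       # fails -> 0 again; b and c now alias
MEM[b] = 0x41            # writing through "b"...
assert MEM[c] == 0x41    # ...is visible through "c": shared state
print(b, c, hex(MEM[c]))
```

In effect, every unchecked failed allocation becomes a shared buffer, which behaves like a use-after-free without anything ever being freed.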
It's based on NetBSD's rump kernels, which were designed back in the day so that when you were writing kernel drivers and they crashed, they didn't just take out your system, because there weren't really VMs in common use at the time. What they did was put a bunch of ifdefs all throughout the NetBSD kernel code base — and they're still there — that let you take essentially the entire NetBSD kernel, link it against some kernel driver code or application-level code, and run it in user space as a standard binary. So when that thing crashes, your whole system doesn't go with it. Rumprun flipped this on its head: it took all that linking and ifdef magic and threw it back into kernel space, to run POSIX applications in kernel space. Our next target was IncludeOS. It wasn't originally our next target, but while we were doing this, a blog post came out claiming that it was the most secure thing ever, and it was written in C++, and those two things didn't necessarily add up, so we decided we would look at it. Then MirageOS, fairly well known in this circle and to anyone who uses Qubes, is a unikernel platform primarily used with OCaml, so that you can write secure services in OCaml, since it's type-safe and, for the most part, memory-safe. But it also has a C FFI, and some of its components are written in low-level code, so those need to be secure as well; we looked at how they were being protected. And the next target we have, which is actually next on the list and we haven't gotten to yet, is OSv, a unikernel focused on high performance but also on other interesting things, like being able to run the entire JVM in ring 0. So you can have your vulnerable Java enterprise web applications running in kernel space — XSS in kernel space, SQL injection in kernel space, the world is your oyster. We're going to be looking at that soon enough. So I'll hand it over to Spencer.
Okay, so let's get into the gory details, and gory indeed they are. We'll start with Rumprun, which is a 64-bit unikernel. It runs on KVM and Xen as well as bare metal; we tested on Xen, because that's mostly where people are running it. Rumprun has a POSIX interface, with a few exceptions (notably around fork and threading). So basically you can run most POSIX-compliant apps with very few modifications, sometimes none, and it supports all these languages. Interestingly, it also comes with a separate repository of packages like Apache and Nginx that basically just build out of the box on Rumprun — they've done the tweaks already — but these look like proofs of concept; it doesn't seem like they're updated. I'll get to that in a bit. The way Rumprun is constructed is, as Jeff said, you take the NetBSD kernel and strap it on top of Xen's Mini-OS, a very minimal operating system that basically does everything by making calls to the hypervisor. If you want to build a paravirtualized guest on Xen, your best bet is usually to build it on top of Mini-OS. So: Rumprun doesn't have any ASLR. It's saved a little bit by ld, which we've taken to calling the poor man's ASLR. What that means is, if you link in slightly different code, or make a function slightly longer, ld is going to push things around and put them in different places. So if you don't know the underlying code of a binary, you don't know exactly where everything is, but you can often make a reasonably decent guess — and if you do know the code, of course, it's all deterministic. What's also interesting, because this runs as a paravirtualized guest, is that Xen maps the hypercall page very early in the memory space. It turns out you may even be able to use that for ROP, though it seems like there may not be enough gadgets; we only found this recently and aren't sure of the full impact yet.
It doesn't really have any extra page protections. The text section is not writable, but data, stack, and heap are all RWX. There's a little bit of protection in that the null page is not mapped, but there are also no guard pages, so you can, say, overflow the stack into the heap. There are stack canaries only in a very narrow edge case, if you happen to get lucky: the core makefiles of Rumprun actually disable them, so Rumprun itself is never going to have stack canaries. If your compiler happens to turn them on by default, your application code — which is built separately and then linked in — might have them. Now, the random canary value is generated properly at runtime using the BSD sysctl crypto APIs. But it turns out that because Rumprun doesn't implement thread-local storage, and GCC expects the canary to be stored in and retrieved from thread-local storage, the canary — even though it's generated correctly — ends up being null in practice. It's just eight null bytes. On the other hand, if your overflow comes via strcpy, a null canary will actually save you, since strcpy can't write null bytes mid-string; for anything else, you're on your own. The heap protections are pretty negligible: allocations are deterministic and contiguous, and there's no pointer validation of any kind. There are canaries in the memalloc and page chunks, but they're (a) compile-time defines, and (b) not even positioned correctly in the one case where the define actually contains a null byte. So essentially, they're completely useless against exploitation. Jeff will talk about the entropy situation. Right, so Rumprun and the rump kernels have an interesting system where they have a whole bunch of ifdefs, and for some reason, when you're building rump kernels, those ifdefs mean that rdrand, your CPU RNG, is disabled, even though that's not a privileged instruction.
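The all-zero canary situation is subtle enough to be worth a sketch. This is a simplified Python model (the frame layout is invented): a memcpy-style overflow can simply include zero bytes and sail over the null canary, while a strcpy-style overflow cannot embed NULs and therefore necessarily corrupts it:

```python
# Toy model of the null-canary bug: the canary slot holds eight zero
# bytes. A memcpy-style overflow can write zeros over it and survive
# the check; a strcpy-style overflow cannot contain NUL bytes, so it
# inevitably corrupts the canary and gets caught.

CANARY = b"\x00" * 8

def frame_after_overflow(payload, buf_len=16):
    """Write `payload` unchecked into a buf_len buffer that sits
    directly below the canary slot; return the canary slot after."""
    frame = bytearray(buf_len) + bytearray(CANARY)
    frame[:len(payload)] = payload          # no bounds check
    return bytes(frame[buf_len:buf_len + 8])

def canary_intact(slot):
    return slot == CANARY

# memcpy-style: attacker may embed zero bytes -> canary "survives"
memcpy_payload = b"A" * 16 + b"\x00" * 8
assert canary_intact(frame_after_overflow(memcpy_payload))

# strcpy-style: any payload long enough to pass the canary slot is
# all non-zero bytes, so the null canary is clobbered -> detected
strcpy_payload = b"A" * 24
assert not canary_intact(frame_after_overflow(strcpy_payload))
print("memcpy bypasses the null canary; strcpy trips it")
```

So, somewhat perversely, a null canary defends against exactly one bug class (string-copy overflows) and nothing else.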
They totally could have used it in a user-space binary. Rumprun, using the same ifdefs, suffers from the same flaw: it basically has no CPU RNG. So it falls back entirely to the VM uptime and the CPU cycle count, which are so deterministic in how this thing starts up that, essentially, at startup you get the same exact "random" values out of /dev/urandom every time, which is spooky. It's saved a little bit by the fact that NetBSD is complicated and has all sorts of other entropy sources. Anything you printf — and I'm not sure why you would on a unikernel running on AWS — feeds into the entropy system. MAC headers from packets the system sees get added in, although those are probably going to be pretty much the same values back and forth on a given subnet between the unikernel host and the gateway. And then the host uptime, which is also known to all guests on the same hypervisor. The RNG itself is this weird SHA-1-based thing that has a whole bunch of code comments saying it hasn't actually been audited; we haven't looked at it either. But this is all accessed in the standard NetBSD ways — the BSD sysctl syscalls — and the virtual file system sets up /dev/random and /dev/urandom correctly on top of it. Okay, so the standard library is NetBSD's libc. It supports %n and doesn't support custom format specifiers. It does support FORTIFY_SOURCE, and the core NetBSD makefiles actually turn that on at the highest level — but then the top-level Rumprun makefiles turn it off again for debug builds. And it turns out that if you use the Rumprun build toolchain and scripts, they will always give you debug builds; there's no flag to disable that, it just happens silently. So for the average person building a Rumprun image, this is going to be the case. It also has syscalls — almost.
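The speakers' point about boot-time determinism can be shown with a tiny model. This is not Rumprun's actual RNG: SHA-1 stands in for the pool hash, and the seed inputs are made up. The point is just that two "boots" mixing the same uptime and cycle-count readings yield identical output streams:

```python
# Sketch of why seeding only from boot-deterministic values is fatal:
# two boots that mix the same uptime and cycle-count readings produce
# identical /dev/urandom-style streams. SHA-1 is a stand-in for the
# real pool hash; the seed values are invented.

import hashlib

def boot_entropy_pool(uptime_ticks, cycle_count):
    pool = hashlib.sha1()
    pool.update(uptime_ticks.to_bytes(8, "little"))
    pool.update(cycle_count.to_bytes(8, "little"))
    return pool.digest()

def urandom_stream(seed, nbytes):
    """Naive hash-chain output expansion from the pool state."""
    out, block = b"", seed
    while len(out) < nbytes:
        block = hashlib.sha1(block).digest()
        out += block
    return out[:nbytes]

# A PV guest's first readings after boot are effectively constant:
boot1 = urandom_stream(boot_entropy_pool(5, 123456), 32)
boot2 = urandom_stream(boot_entropy_pool(5, 123456), 32)
assert boot1 == boot2      # same "random" bytes on every boot
print(boot1.hex()[:16], "== identical across boots")
```

Any keys, canaries, or nonces derived early in boot are therefore predictable to anyone who can reproduce, or brute-force, those few low-entropy inputs.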
They're not triggered by an interrupt, but there's a function called rump_syscall which basically acts just like the regular syscall function, arguments and all. And the first 24 bytes of it, in all the Rumprun unikernels we generated, are unique, so you can scan for it very trivially as soon as you get RCE and then fire off as many syscalls as you want. On top of that, for getting initial RCE, the syscall table is a really good target, because it's populated dynamically on startup and then left in a writable state. In addition, the Rumprun build toolchain takes an image config that tells it which libraries to link into your final unikernel, and the default lists about twice as many modules as we could even get it to build with. So when they talk about reduced attack surface, they don't emphasize that you need to prune this config down yourself. Rumprun's heap implementation is kind of interesting. There are two major primitives: the memalloc chunk and the page chunk. The memalloc chunk represents the chunk of memory you get back when you actually malloc something, and its two important fields are an alignment pad and a magic value. The magic is the canary, and you'll notice that above it, unprotected, is the alignment pad: the amount you subtract from the base pointer of the memalloc chunk to get up to the page chunk above it. The page chunk has a level field, which is just metadata you can very easily guess, and a magic field, which is a canary — but it's four non-null bytes in a compile-time define, so it basically doesn't exist. And then the next and previous pointers let you perform the classic unlink attack. There are a lot of details in the slides if you're interested, but we're not going to reiterate the whole Malloc Maleficarum here, because it's basically just classic unlink.
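The "scan for rump_syscall" trick is just a signature search over guest memory. Here's a minimal sketch; the 24-byte signature bytes below are invented (in practice you'd extract them from your own test build of the same image), and the memory image is faked:

```python
# Sketch of the signature-scanning idea: given arbitrary read, locate
# rump_syscall by its unique first 24 bytes. The signature here is
# made up; a real one would come from disassembling a test build.

SIGNATURE = bytes.fromhex(
    "554889e54157415641554154"    # hypothetical prologue bytes
    "4883ec28488b05d1c3010048"
)
assert len(SIGNATURE) == 24

def find_syscall_stub(memory, signature=SIGNATURE):
    """Linear scan of a guest memory image for the stub's offset."""
    off = memory.find(signature)
    return off if off != -1 else None

# Fake guest memory with the stub embedded at a known offset:
image = bytes(0x1337) + SIGNATURE + bytes(0x100)
assert find_syscall_stub(image) == 0x1337
print(hex(find_syscall_stub(image)))
```

Because there's no ASLR, a single scan like this gives shellcode a stable, callable syscall interface for the rest of the exploit.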
So to exploit the Rumprun heap, basically all you have to do is get an overflow from one heap chunk onto another heap chunk that will later be freed. You just need to overwrite the first eight bytes — the mh_alignpad field — such that when that value is subtracted from the base pointer of the chunk you're corrupting, it points back to a fake page chunk inside the overflow buffer you've written out. Make sure level and magic check out, which is really easy, and then set the next and previous pointers however you want to get a write. There is a caveat, of course, with this kind of exploit: the value you're writing must itself be a writable address, otherwise the mirrored second write will crash — although you may actually want that, as we'll show in a little bit. So basically, the Rumprun security situation is not good. There's no ASLR; everything but text and the null page is RWX. Canaries: generally speaking there aren't going to be any, and if there are, they're pretty useless. The heap is not really protected — there are canaries, but they don't do anything in practice. The entropy situation is also weak, particularly if you're attacking from another guest on the same hypervisor. And while the standard library supports hardening, it's explicitly disabled. So heap overflows in Rumprun will give you RCE under pretty much any circumstance, especially if the attacker has the source code or the binary — but sometimes even when you don't, because you can attempt to brute-force the addresses you need. While you won't necessarily know where things are, they're not random: if you know roughly what's in the Rumprun image, you can guess pretty well where they are. And by the way, Rumprun has also been completely unmaintained since a little before we started researching it last summer.
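The classic unlink primitive the speakers keep referencing boils down to two mirrored pointer writes. Here's a toy flat-memory model — the field offsets are simplified and are not Rumprun's actual chunk layout — showing both the write-what-where and the caveat that the written value must itself point at writable memory:

```python
# Toy flat-memory model of the classic unlink primitive. The layout is
# deliberately simplified: unlink(fd, bk) performs two mirrored writes,
# which is the whole attack.

WORD = 8
MEM = bytearray(0x1000)          # the entire "address space", writable

def write_ptr(addr, value):
    MEM[addr:addr + WORD] = value.to_bytes(WORD, "little")

def read_ptr(addr):
    return int.from_bytes(MEM[addr:addr + WORD], "little")

def unlink(fd, bk):
    """free()'s list removal: fd->bk = bk; bk->fd = fd.
    (bk stored at offset WORD, fd at offset 0 -- simplified.)"""
    write_ptr(fd + WORD, bk)     # write #1: the attacker's "what/where"
    write_ptr(bk, fd)            # write #2: the mirrored write

TARGET = 0x800                   # say, a function-pointer slot
SHELLCODE_ADDR = 0x500           # where the payload lives

# Set fd = TARGET - WORD and bk = SHELLCODE_ADDR, so write #1 stores
# SHELLCODE_ADDR into TARGET. Write #2 then scribbles at
# SHELLCODE_ADDR -- which is why that value must be a writable address.
unlink(TARGET - WORD, SHELLCODE_ADDR)
assert read_ptr(TARGET) == SHELLCODE_ADDR
print(hex(read_ptr(TARGET)))
```

The mirrored write is also why "the value you're writing must itself be a writable address": if `bk` pointed into read-only memory, write #2 would fault before the hijacked pointer was ever used.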
Basically, right now it only gets updated every few months, to add compiler flags that disable new GCC defaults — defaults that would make it more secure, but that break the old Rumprun code. So that's interesting. At this point you may be thinking: well, what could possibly be worse than what I've just described? To which IncludeOS says: hold my beer. IncludeOS is a 64-bit unikernel that, unlike Rumprun, is pretty specialized: it's exclusively for C++ services, particularly web services. It runs on KVM, VirtualBox, and VMware, but it's primarily developed for and tested on Linux KVM. Now, it's worth noting before I go further that the IncludeOS version we looked at is from the summer of 2017. They have likely fixed some of the things we're about to talk about; given that it's actually maintained, it's probably more secure than Rumprun at this point, but we haven't gone back to check. But before I talk about what I found in IncludeOS, let me explain a little about how I actually found it, because it turns out that debugging IncludeOS is really, really, really difficult. In fact, I suspect the IncludeOS developers have never actually debugged IncludeOS running as a unikernel, because the boot script they use doesn't even have a flag to enable the GDB debug bridge. It's trivial to add, but the fact that it wasn't there is kind of suspicious. If you compile binaries with debug symbols and then actually run them as IncludeOS unikernels, they'll just crash on startup most of the time. We figured out a way around this — run the non-symboled version, attach GDB, and load symbols from the symboled version — but that doesn't work most of the time either. Generally it dies with a CRC mismatch error, and we don't even want to speculate about what causes that.
If you manage to get past this — which, again, just happens sort of randomly — breakpoints also don't really work. You usually have to start the guest paused, insert jump instructions as makeshift breakpoints, and then manually put the original bytes back when you're ready to continue. It's not good. So, having gone through all this rigmarole, we find that the IncludeOS ASLR situation is exactly the same as Rumprun's. What makes it more embarrassing for them is that in that "IncludeOS is secure" blog post, the CEO claimed that IncludeOS, quote, "randomizes addresses at each build" — and it seems like he's just confusing normal linker behavior with compile-time ASLR, which IncludeOS definitely doesn't have. He also says that IncludeOS is immutable. It's not. It's about the most mutable anything can possibly be, because every single page of IncludeOS is RWX. We didn't build a full PoC for this, but we got far enough to realize that you could probably inject your own TCP/IP stack, load a new IncludeOS image on top of an existing one, and start running that. It's pretty terrible. As you can probably expect, there are no additional page protections either: there are no guard pages, and even the null page is mapped RWX, so you get the use-after-free situation Jeff was describing. The stack canaries are certainly more prevalent than in Rumprun, but that's not particularly helpful. They use -fstack-protector-strong in the core kernel and in the application preamble CMake files, so everything's going to have stack canaries — but the canary values are compile-time defines, generated by CMake's random string function, which is low-entropy and not cryptographically secure. And the CMakeLists that generates it lives in the core of IncludeOS, so the canary only gets regenerated when you rebuild the main IncludeOS image itself.
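A canary that's a compile-time constant, drawn from a small alphanumeric alphabet, and unchanged across restarts is straightforwardly brute-forceable. Here's a sketch of the byte-at-a-time approach; `service_survives` is a stand-in oracle for "send a partial overwrite and see whether the unikernel crashes," and the canary value is made up:

```python
# Sketch of brute-forcing a canary that never changes across restarts.
# The oracle and the secret are fake; in reality each query is one
# network request against a service that instantly respawns on crash.

import string

SECRET_CANARY = b"aK3pQz9L"      # hypothetical CMake-style random string
CHARSET = (string.ascii_letters + string.digits).encode()

def service_survives(guess_prefix):
    """Oracle: the service survives only if every canary byte we
    overwrote (a partial overwrite just past the buffer) matches."""
    return SECRET_CANARY.startswith(guess_prefix)

def brute_force_canary(length=8):
    known = b""
    for _ in range(length):
        for c in CHARSET:        # at most 62 tries per byte
            if service_survives(known + bytes([c])):
                known += bytes([c])
                break
    return known

recovered = brute_force_canary()
assert recovered == SECRET_CANARY
print(recovered)
```

Because the value persists across restarts, the worst case is about 62 guesses per byte rather than 2^64 overall — and as the demo shows, a crashed unikernel restarts fast enough that this is entirely practical.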
So all of the images you build against one build of IncludeOS are going to have the same canary, and it persists across restarts, so you can brute-force it pretty easily, especially given how fast the thing restarts after crashing. And by the way, the thread-local storage bug from Rumprun appears pretty much verbatim here, so just ignore everything I said: the canaries are actually null. The heap doesn't have any protection at all: contiguous, deterministic allocations, no canaries, no pointer validation, nothing. The entropy system is actually one thing IncludeOS did fairly well: it just uses rdrand for everything. It does some CPU feature detection, and if rdrand isn't available it falls back to cycle counts, but on modern CPUs that's not going to be a problem. The actual RNG applied on top is basically the internal sponge function taken out of Keccak, which is fine; that works. But to access the RNG, you just open /dev/urandom or /dev/random — and right at the beginning of their open implementation, they literally strcmp every single path that's passed in, check if it's /dev/random or /dev/urandom, and if so return you a magic file descriptor that goes to the RNG. The interesting thing is that in the middle of our looking at it, they changed this a little, because the magic file descriptor used to be the number 998, and they never checked, when incrementing file descriptors elsewhere, whether a newly returned fd collided with the magic one. So if you opened enough files, eventually one of them would just be /dev/random and not the file you intended to use. They later fixed that. For the standard library, they're using Red Hat's newlib, which was designed for embedded systems and has basically no security hardening whatsoever, much like the other things they provide.
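The magic-file-descriptor bug is easy to model. This is a toy VFS, not IncludeOS's code, but it captures the described flaw: the RNG gets a reserved descriptor number, while the ordinary fd counter marches upward with no collision check:

```python
# Toy model of the magic-fd bug: open() returns a reserved fd (998 in
# the version described) for /dev/(u)random, but the normal fd counter
# never skips it, so the 998th ordinary open collides with the RNG.

MAGIC_RNG_FD = 998

class ToyVFS:
    def __init__(self):
        self.next_fd = 3          # after stdin/stdout/stderr
        self.files = {}

    def open(self, path):
        if path in ("/dev/random", "/dev/urandom"):
            return MAGIC_RNG_FD   # magic descriptor, counter untouched
        fd = self.next_fd
        self.next_fd += 1         # bug: no check against MAGIC_RNG_FD
        self.files[fd] = path
        return fd

vfs = ToyVFS()
fds = [vfs.open("/tmp/file%d" % i) for i in range(1000)]
# One ordinary file was silently handed the RNG's descriptor:
assert MAGIC_RNG_FD in fds
print(fds.index(MAGIC_RNG_FD), "-> open() returned the magic RNG fd")
```

From that point on, reads the application believes are coming from its own file would be served by the RNG (or vice versa, depending on how the descriptor is dispatched), which is exactly the kind of silent aliasing the fix addressed.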
So it's got %n, no custom format specifiers — probably just because of space and complexity constraints in embedded environments — and no support whatsoever for FORTIFY_SOURCE. All right. So I'd like to bring your attention to this point: the very first sentence of this IncludeOS post says that it was written with security in mind. We'll let you decide. Probably the most egregious thing the IncludeOS project did was decide to throw out all conventional knowledge about how memory is supposed to be laid out for applications and kernels. In the IncludeOS world, the stack is in low memory, followed immediately by the text section, followed by data, followed by the heap in high memory. And if you see anything odd about this, remember that when you have a buffer, you write toward increasing addresses. What this means is that if we have a stack buffer overflow, we can just continue past the end of the stack, right into the text section, and start overwriting code. And if you're really lucky, the actual copy loop — maybe it's memcpy, strcpy, some for loop — will be early enough and short enough that your buffer overflow can actually reach it. And when you overwrite the actual mov instruction that's performing the buffer overflow, you don't want to use a NOP sled, because much like your buffer writes, code executes toward increasing addresses — the instruction pointer just keeps incrementing forward. So you actually want a chain of jumps that go backwards, one to the next, until you get all the way back to the start of your buffer, and then run your shellcode going forward. And as our demo shows, even if you're not that lucky, you're still in pretty good shape, because the way IncludeOS links code together puts basically all user application code right at the beginning of the text section.
So even if you can't reach memcpy itself, when memcpy returns, it's just going to return right into your shellcode — and ideally your shellcode doesn't check the stack canary; just don't do that. So we have a demo: we break on memcpy — we have to run through a couple of hits, because IncludeOS uses it during bring-up — and then we start stepping, and you can see us stepping through our reverse jump chain, as we like to call it, until we get to the very start with the final jump, and then our actual shellcode runs. It prints out "Hello ToorCon XX." The only reason the shellcode is at all complicated is that I wrote it so that it doesn't have any null bytes in it. I'll hand it over to Spencer. Okay, so as Jeff said, all the low-level stuff in IncludeOS is implemented with newlib, and the heap is about as insecure as that sounds: it is basically just vanilla unlink again. If you want to know about this, the Malloc Maleficarum applies verbatim. Basically, with unlink, because of chunk coalescing in free, if you corrupt a chunk you can write one to three pointers anywhere you want. So how do you chain this to RCE?
It turns out that IncludeOS has a panic handler, on_panic, which is just a function pointer that, if non-zero, gets called when the OS crashes — say, on a page fault. You can overwrite this. Now, if you want to jump back into your code in the heap — and the heap is actually really deterministic, so you might well know where it is — but for the sake of argument, suppose you don't. What you can do, since you're corrupting a chunk, is induce a crash inside free; at that point, a pointer to the corrupted chunk is sitting on the stack. So you use the first unlink write to overwrite the panic handler, pointing it at any known writable location — which is everywhere, because it's IncludeOS; it also needs to be executable, which is likewise everywhere. Then you use the second write to place eight bytes of shellcode at that location. All that shellcode needs to do is increment RSP until it points at the address of that buffer saved on the stack, and then ret — and that returns into your buffer, which is great. But there's one more problem: again, newlib is made for embedded devices, so it tries to save space. Malloc chunks are part of a linked list when they're free, so they have forward and back pointers; when they're allocated, those aren't needed, so the user buffer starts where the forward pointer would be. Because the chunk that gets passed to free gets unlinked first, forward and back need to be valid writable addresses at that point, otherwise you crash too early. But then you return right to where forward is — so forward and back also need to be valid shellcode. And the eight bytes of shellcode we get don't have enough room to jump anywhere beyond that. We thought this was going to be really hard, maybe impossible, until—
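The job of those eight bytes of shellcode can be sketched abstractly. This is a simplified model, not the real payload: the real code just does a fixed `add rsp, N; ret`, whereas the toy below "walks" the stack pointer until it finds the saved chunk pointer, to show why the trick works at all. All stack contents here are invented:

```python
# Toy model of the tiny pivot payload: raise RSP until the slot it
# points at holds the address of our heap buffer (free()'s saved
# argument), then behave like `ret` so execution lands in the buffer.
# In the real exploit the offset is fixed; the search loop here is
# only to illustrate the idea.

HEAP_BUF = 0x6100        # address free() was called on (our payload)

# The stack at crash time, top first; free()'s argument -- a pointer
# to the corrupted chunk -- is still sitting a few slots up.
STACK = [0xdead, 0xbeef, 0x4141, HEAP_BUF, 0xcafe]

def pivot_and_ret(stack, target):
    """Walk RSP upward one slot at a time until *rsp == target,
    then 'ret': jump to the value popped."""
    rsp = 0
    while stack[rsp] != target:
        rsp += 1             # the real payload: add rsp, 8 (fixed N)
    return stack[rsp]        # ret -> execution resumes in the buffer

assert pivot_and_ret(STACK, HEAP_BUF) == HEAP_BUF
print(hex(pivot_and_ret(STACK, HEAP_BUF)))
```

The beauty of this construction is that the attacker never needs to know the heap address at all: the kernel helpfully left a pointer to the payload on its own stack.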
—we looked at the first instruction in the IncludeOS binary we were targeting, and it happens to be an eight-byte NOP whose bytes also happen to form a valid writable address. So, thanks, Clang. We have a demo for this as well — whoops. Here we have a vulnerable TCP server with a buffer overflow on the heap, and we assume we don't know the size of the overflow. Basically what I'm doing here is a sort of exponential back-off, sending bigger and bigger buffers and seeing where it crashes — and you can see it restarts really fast, so it's great for brute-forcing. We're finding the distance to the next heap chunk header so we know how to corrupt it, and then we use that to write to progressively higher addresses, incrementing by one pointer length each time. And in just a moment it's going to — whoops — it's going to find the panic handler, and there: it's printed our message. So, IncludeOS: there basically aren't any protections, with the exception of some really half-hearted canaries. There's no ASLR, everything's RWX, the canaries are constant across reboots, not cryptographically random, and also null all the time. The heap has no protections at all, the entropy is actually okay — except for the NSA case — and the standard library doesn't have any hardening at all. So basically, IncludeOS doesn't have any security features: if you get any kind of memory corruption, or even look at it the wrong way, you're going to get RCE. Now, on a better note, we have MirageOS, an OCaml-based unikernel. What drew our interest to it is that it's intended to be used, probably in Qubes OS, to create secure, minimal-attack-surface VMs, in which case it'd be running on Xen. It supports both KVM and Xen, but we tested on Xen because that seems the most interesting for this use case. And because it runs OCaml, it also supports an FFI, so you can call—
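The distance-finding phase of that demo is a simple search problem, made practical by how fast a crashed unikernel respawns. Here's a sketch; the oracle and the chunk distance are fabricated:

```python
# Sketch of the overflow-distance search from the demo: grow the
# payload exponentially until the service crashes, then bisect to find
# the exact distance to the next chunk header. The oracle stands in
# for "send this much data and see if the server dies and respawns."

CHUNK_DISTANCE = 712     # unknown to the attacker; invented here

def crashes(payload_len):
    """Oracle: writing past the next chunk header corrupts it and the
    service falls over (then instantly restarts, identically)."""
    return payload_len > CHUNK_DISTANCE

def find_distance():
    # Phase 1: exponential growth until we first observe a crash.
    hi = 1
    while not crashes(hi):
        hi *= 2
    lo = hi // 2
    # Phase 2: binary search for the largest non-crashing length.
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid
        else:
            lo = mid
    return lo            # last safe length == distance to the header

assert find_distance() == CHUNK_DISTANCE
print(find_distance())
```

Determinism works entirely in the attacker's favor here: every restart reproduces the same heap layout, so each probe answers the same question about the same memory.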
native code from OCaml. The way it's constructed is pretty much like rumprun: you take the OCaml runtime and put it on top of Xen's Mini-OS, although they have their own fork with a few small differences. You might be wondering why you'd use OCaml, since it's a somewhat obscure functional language. The argument basically comes down to the fact that it also supports object-oriented and imperative programming pretty decently, and it's reasonably memory safe — not as much as something like Rust, but if you're doing regular OCaml things you're going to be fine; you're not going to run into a buffer overflow or anything. Although it turns out that vanilla OCaml does have support for some operations that are distinctly not memory safe. But if you're writing normal OCaml you'll be fine, right? Well, no, not exactly. OCaml has some pretty bad CVEs. The one we're pointing to with the arrow is a 10.0 that actually isn't applicable to unikernels, because it uses environment variables, but the top one is a memory corruption that absolutely would be useful to us, although we haven't done a PoC. There's also a spooky library called Marshal, which is for deserialization, and if you specify the wrong type, quote, "anything can happen at runtime." The only limit is yourself — welcome to Zombo.com. So, much like IncludeOS, we were having quite a bit of trouble debugging Mirage, because right now Mirage only supports running as a paravirtualized guest on Xen, and it turns out that there are no good debuggers for Xen paravirtualized guests. And by "no good debuggers" I mean there are literally no debuggers for paravirtualized guests on Xen. There's pyvmidbg, which is really nice — it's modern, it's live, it's VMI-based, it supports lots of nice things — unfortunately, none of the things we were looking at: it only runs with Linux and Windows HVM guests, and Mirage obviously is
not Linux, not Windows, and not HVM. There's also gdbsx, which comes packaged with Xen and is basically a GDB server. It goes downhill very quickly: if you look into the codebase, you realize it's using gdbsx-branded hypercalls implemented in Xen specifically for gdbsx. After a while I realized this is probably because the Xen developers documented their VMI and debugging APIs so badly that they couldn't find them later, so they just said, screw it, we'll implement hypercalls specifically for this. It can also only do peek and poke — well, it theoretically supports reading registers too, but it doesn't speak the GDB remote protocol well enough to talk to either GDB or LLDB when it sends them. So eventually I realized that using these was a nightmare, and it was actually easier to just go and write my own debugger. So I did. I call it xendbg, and Jeff calls it "xenbag," and I kind of don't like that, but it's stuck. It uses — this time — the correct Xen debugging and VMI APIs, which I basically had to hunt down myself by dissecting the implementation, because good luck even finding a comment at the top of a header file that tells you what anything does. Right now it has its own REPL, which works pretty nicely, but even more functional is the GDB server — it's a stub GDB server, and by GDB server I mean LLDB server, because it turns out that GDB doesn't actually speak its own remote protocol to spec, and I'm not going to deal with that, honestly. So right now xendbg supports paravirtualized guests, and it can read and write registers and memory and do breakpoints and stepping. We're working on getting memory region info — you can read the page tables, but it turns out Xen doesn't actually have an API for this in paravirtualized guests, so it's going to take a little while. I've got a demo of this as well.
Right now, on the bottom right, we've got a simple MirageOS unikernel; it just prints "hello I'm a test" every second, 100 times. We run it in the bottom left and then start xendbg with port 1234 open. This is actually the hypercall page it's stepping through — it even follows the ret. Here we search memory for the string "hello," write a magic value into that area, and then continue — and we should see it's now printing our message. Fabulous. Okay, so MirageOS. The ASLR situation is the same as the other two: public code yields deterministic binaries. The paravirtualized-page thing from rumprun is also here, because it's running as a paravirtualized guest as well. What's interesting is that OCaml has a pretty big runtime — all of it written in C at some point — so most of the OCaml runtime is going to be deterministically ordered and placed at a location that isn't random, just one you won't necessarily know. So if you can find any part of the OCaml runtime, you probably know where the rest of it is, and you may be able to do something with that. The page protections are just like rumprun: text isn't writable, but data, stack, and heap are all RWX. The additional page protections are actually good — they've pretty much done everything they should. They have unmapped guard pages (it's hard to tell, because we can't inspect PTEs yet) at least between the data and stack sections, and between stack and heap, and the null page is mapped but not RWX, so that's good. RELRO again doesn't apply, because all the linking here is static. There are no stack canaries — unlike rumprun and IncludeOS, it doesn't even try here: they're explicitly disabled in the MirageOS core makefiles, and the build toolchain also disables them for your application code, even if your GCC turns them on by default. The heap
doesn't have any protections at all, with the exception of a little bit of pointer validation. It's implemented with xmalloc from Mini-OS, so allocations are deterministic and contiguous; there are no heap or chunk canaries; unlinking small chunks has pointer validation, but most code paths don't, and I have a PoC for an unlink bug just like the one I showed you on IncludeOS. The entropy situation is interesting, because there are two packages you can use for entropy, and they don't really specify which of them is good — and they really should, because one of them is great and one of them is terrible. The terrible one, of course, is mirage-random, which is a wrapper for OCaml's Random module — not even cryptographically random, but that aside, the way it initializes itself is to get a seed from /dev/urandom. But because there's no file system, and it doesn't do any special-casing like rumprun or IncludeOS, opening /dev/urandom just returns 1 — in fact, the implementation of open() is literally "return 1." It then falls back to the PID and PPID — the parent PID — as well as gettimeofday(). But the first two don't exist, because there are also no processes on unikernels; they're just hard-coded to return 2 and 1, respectively. So ultimately the only entropy source it actually gets is the current time, which is obviously bad. There's also mirage-entropy, which is good: it can do RDRAND and RDSEED, of course only on x86, and it also has something called the Whirlwind RNG, which they reference from a research paper, and which "attempts to exploit CPU-level data races that lead to execution time variability of identical instructions." We're not crypto experts, but that sounds kind of spooky — spectral, even. We don't have the expertise to assess it beyond obvious non-crypto issues, which we didn't find, but it would be interesting if someone with crypto expertise could take a look.
You might find something there. There's no standard library hardening: they've got their own standard library called mini-libc, which is a really thin wrapper around Mini-OS, and it's the usual — it supports %n, doesn't support custom format specifiers, and doesn't have any FORTIFY_SOURCE support. Something interesting and rather unique to MirageOS: it uses opam as a package manager — you use opam both to install packages and to build — and opam has a SAT-solver-based dependency resolution mechanism. What this means is that it can install arbitrarily out-of-date packages without warning you. Say you have something like this diagram: a top-level package A that depends on package C at, say, version 1.0 or greater, and that also depends on package C transitively, at 1.0 or less, via another dependency. opam won't install two versions of the same package, so to satisfy this it'll just install version 1.0 of package C, and the interface will give you pretty much the same readout as if it had installed the current version — it'll show the version in small print somewhere, but unless you're really looking hard it won't act any differently, and it won't warn you. So you have to be really careful, if you're developing anything for MirageOS, that you don't let opam screw you with really out-of-date packages. Now, there are a couple of other vulnerabilities — real vulnerabilities — that we found, but we can't talk about them, because we only found them in the past few weeks and we're still in the disclosure process; rest assured, they'll be in the white paper. So, in short: MirageOS has no ASLR; basically no page protections on data, stack, and heap, though text is okay and the null page is okay; no canaries, even if you try to enable them for your application code; nothing at all on the heap; and the entropy is good as long as you're using the right package, but they don't actually
specify that that's the one you should use — and the standard library doesn't have any kind of hardening. So basically, if you get a stack buffer overflow, great: you get RCE pretty much immediately. Most types of heap buffer overflow, except some cases with small chunks, will give you an arbitrary pointer write — again, keeping in mind that the function addresses you might want to use are going to be in slightly different places, but you can brute-force or scan for those. And the entropy implementations and the OCaml package manager, opam — you have to be pretty careful about how you use them, because they can easily shoot you in the foot. So, given these three assessments, I think it's pretty obvious that it is not the case that unikernels are secure; rather, they're hilariously broken, and that's basically because they don't implement even simple security measures — to the point that most types of memory corruption, on the unikernels we looked at at least, will lead you very easily to RCE, and oftentimes, as in the IncludeOS demo, you don't even have to have seen the binary or the source of what you're attacking. On top of this is the fact that everything has kernel-level VM capabilities, so as soon as you get any control, it's already full control: there's no privilege separation, there's no root or anything; as soon as you get RCE, you can start crafting arbitrary packets, do PCI, whatever — anything the VM will let you do, you can do. With that said, it's worth mentioning that even though Mirage at a low level is basically the same situation as rumprun, it's still an order of magnitude more secure than any of the native-code unikernels we looked at, because on Mirage most of what you're running is memory-safe OCaml. That said, memory-safe-language unikernels like Mirage still need to be pretty careful about how their low-level components, which are most likely going to be written in C, are hardened, and they also need to focus on actually
providing secure APIs for application developers to use, because they can't just rely on being inherently secure — they actually have to help application developers develop secure applications. So basically, unikernels, contrary to the claims of many of their proponents, are not a cure-all: they're essentially embedded systems running in a VM, and they have about the same level of security as you'd expect from a lot of embedded systems. To put it lightly, there's a lot of work that needs to be done before unikernels become at all suitable for production. In retrospect, this should have been obvious as soon as we saw that there basically wasn't a debugger for the overwhelming majority of the unikernels we looked at, and we actually had to write our own to assess some of the underlying components. We'd like to thank some people: first of all, Mindy Preston, who was very helpful in teaching me all sorts of things about how OCaml's FFI works during the last CCC in Germany, and then Bryan Cantrill, for being an epic troll who happened to be right this time. For future work, we're going to start looking at OSv. We have a white paper — at least the first one — coming out soon; it's at about 100 pages right now, mostly diagrams and exploit shellcode, so that'll be cool when it drops. And then we're going to be looking more at those higher-level issues in the stack code: when you decide to re-implement your own TCP/IP stack, what can go wrong? That kind of bug is actually one of the sorts we already found — we weren't really looking for it, but it's one of the ones we can't talk about yet — what happens when you re-implement the entire application stack, from userland to kernel to hardware, and throw it all into ring zero. So we're looking for more of that. If there are any questions, we have a little bit of time left, actually — we got through this
faster than our demo runs took us. I'll also add that NCC Group is hiring, so if anyone wants to talk to us about that, we'll be around — feel free to stop by and ask us questions. So, any questions? Okay — thank you.