Okay, so let's start this presentation. My name is Thomas Barabosch, and I'm presenting joint work with Maxime Villard. It's called KLEAK, and it's about practical kernel memory disclosure detection. Before going into the details, and I should warn you that this is a little more technical than the last talks, we have to do a quick recap on two or three things.

The first thing is: what is a syscall? I suppose everybody knows the picture: there is user space, with potentially untrusted programs that are isolated from each other, and there is kernel space, where the kernel runs and where the power lies. Because the programs are isolated, to do anything meaningful they have to ask the kernel. They do so via syscalls, for example a syscall X, which could send a packet or write to a file, and then control transfers to the kernel. There is a syscall dispatcher which does some preparation, chooses the right syscall handler to call, and finally returns to the user program. The interesting thing, which I will keep coming back to, is the boundary between user space and kernel space: it's a trust boundary. Programs should not be able to read data from the kernel that they are not allowed to read, and they should not be able to write there.

There is usually some data exchange going on. First, some data has to go into the kernel. The user program cannot write directly into kernel space, so what happens is that the user program provides a pointer to a buffer, the kernel fetches the data, and the kernel then works on its own copy. In the slide you can see a buffer on the user space stack that gets copied onto the kernel stack, and some work is done on that copy. Again, there is this trust boundary between user space and kernel space.

And now the really interesting part, which this talk is all about: copy-out, getting data safely and securely out of kernel space. There are dedicated functions for this, like copyout() and copyoutstr(), to get data back to the user mode program. The kernel typically does not write directly into user space; it uses these APIs, also because of things like Supervisor Mode Access Prevention (SMAP) and similar exploit mitigations. If you know what a syscall is and you know about the trust boundary, then you can follow this talk perfectly, I suppose.

So, what are kernel memory disclosures? One could say they are inadvertent writes of kernel data to user space: data that does not belong there but that was written there by accident. As a consequence, you may first leak random data, which is useless for an attacker, but then the interesting things can happen. For example, you may leak kernel pointers, or parts of a kernel pointer, and if you have kernel ASLR, this basically breaks that exploit mitigation. It can also happen that you leak more sensitive material, maybe crypto keys and so on. Typically these bugs do not lead directly to exploitation or privilege escalation, but on a modern system with kernel ASLR they are a step towards that goal: to do privilege escalation you have to know where the kernel lies in memory, you need kernel pointers, and so on.

So let's have a quick look at how such a kernel memory disclosure looks in code. I hope everyone can read this. This is a kernel memory disclosure that I found by auditing the FreeBSD code.
On the left hand side we can see the code of the syscall. It's the getcontext syscall, which provides the context, that is the register state, to the user space program, and it's architecture dependent, because different architectures have different registers. On the right hand side is the kernel stack. We call into the syscall, there's the stack pointer, and now we're at the beginning of the syscall: the parameters get pushed onto the stack, and most of the rest of the stack is uninitialized, we don't know what's actually lying there. Then there are the local variables, which at this point are only reserved, so stack space is set aside for the new context, and somewhere there is logically the return value as well. There is some sanity checking on the buffer, and then we go into another function, in this case the architecture dependent get_mcontext(), which fills this ucontext struct with the register state.

So how does this struct actually look? Well, it's a struct nested within structs, and I think nested one more level down as well. It's complicated enough that nobody really knows how the compiler will lay it out in memory, because of padding and things like that; I believe nobody in this room would get it right. So we don't really know what happens with this data. As a consequence, getcontext fills the memory area we pass a pointer to, but there are holes in it. Then we copy the whole struct out to user space, and by now those holes are leaked stack memory: everything that was there before is now in user space. In this case it was about two and a half kernel pointers, roughly 20 bytes, every time. That's the thing: you won't see it at first glance, but if FreeBSD had had kernel ASLR, it would have been broken by this. The fix is quite easy: you just zero out the whole struct before you write to it.

So why are these kernel memory disclosures hard to detect? Well, they are silent bugs: they do not cause crashes, and without a crash nobody usually looks into it. They may also be hidden behind your C library, because in a userland program you usually don't talk to the kernel directly via syscalls, and maybe the leaked bytes never make it through the C library, so nobody notices. The root of all evil here is, in the end, the C programming language: there is no safe way to transfer data from the kernel across trust boundaries. The current state of compilers doesn't help either. They will tell you about an uninitialized variable, but they fail with nested structs of structs and the like, and they give you no hint that there may be a kernel memory disclosure. System memory allocators are also part of the reason, because the kernel stack is usually not zeroed out, it's uninitialized, and the heap in many cases returns uninitialized memory that can then be leaked to userland. Furthermore, architectures matter: on one architecture the code will leak, on another it won't, because compilers may add padding inside the structs and you won't notice. If you develop something on x86, it may leak on AMD64. And maybe some developers are simply not aware of this issue or don't take it seriously.
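To make this concrete, here is a minimal sketch of the bug pattern and its fix. It is not the actual FreeBSD sys_getcontext() code: the struct layout and the names are invented for illustration, and the exact kernel headers differ between the BSDs.

```c
#include <sys/types.h>
#include <sys/systm.h>          /* copyout(); exact headers vary between the BSDs */

/* Illustrative stand-ins for the nested ucontext/mcontext structs. */
struct mcontext_sketch {
	long	mc_regs[16];	/* general purpose registers         */
	int	mc_flags;	/* 4 bytes ...                       */
	/* ... the compiler may insert 4 padding bytes here on LP64 */
	long	mc_fpregs[8];
};

struct ucontext_sketch {
	struct mcontext_sketch	uc_mcontext;
	long			uc_flags;
	/* more members, potentially more padding ... */
};

int
sys_getcontext_sketch(void *user_ucp)
{
	struct ucontext_sketch uc;	/* kernel stack, uninitialized */

	/*
	 * The fix: zero the whole struct before filling it.  Without this
	 * memset() the padding holes still contain whatever happened to be
	 * on the kernel stack, and the copyout() below leaks those bytes.
	 */
	memset(&uc, 0, sizeof(uc));

	/* The architecture dependent code (get_mcontext() in FreeBSD)
	 * fills the named members; here we only pretend: */
	uc.uc_mcontext.mc_flags = 1;
	uc.uc_flags = 0;

	return (copyout(&uc, user_ucp, sizeof(uc)));
}
```

On 32-bit x86 the same struct may have no holes at all, which is exactly why this kind of leak can appear on one architecture and not on another.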
Typical error sources: there is a paper by Google Project Zero, about 20 pages just about typical causes of kernel memory disclosures, which came out about half a year ago, in summer 2018. For everybody working with low level stuff I can definitely recommend reading it; afterwards you will know a lot about this bug class. In general it boils down to properties of the C language again. There are uninitialized variables, where the compiler will at least warn you. Then there is struct alignment: members get aligned so they can be fetched in one CPU cycle, and the compiler may add padding bytes inside and at the end of structures. Unions are kind of evil as well, as far as I know: if there are two members, a smaller one and a bigger one, the union will have the size of the bigger one, and if you copy out the smaller one to userland, there is effectively padding at the end, and again the allocator and the stack won't help you here. A sketch of this union case follows below.

So how to avoid it? First, this applies in general whenever you pass data across a trust boundary; it doesn't have to be userland versus kernel, it can also be the network, for example. Local variables on the stack are usually uninitialized, we don't know what's in them, and your heap implementation, if you don't pass the right flags, returns uninitialized memory as well, so initialize it. Don't trust the compiler, at least at the moment; I guess in a couple of years there will be some advances and it will help you more here. And don't assume any particular architecture or padding when laying out structures mentally in memory, because that will break your neck. Initialize your data as soon as possible: in the getcontext example I showed you, if you trust that it will happen later, it most likely won't happen, so do it yourself as early as possible and you won't forget it. Also, if you write your own syscalls, dump the data you're exchanging with userland and look for suspicious bytes; that's another way to find leaked bytes. And finally, when in doubt, zero it out: security over efficiency. When you go the long way to implement kernel ASLR, one leaked byte may be enough to break it, so the couple of cycles spent zeroing out data structures may help you a lot.
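Here is the union case just mentioned as a small sketch; the types and the function are invented, and the commented-out memset() is the "when in doubt, zero" fix from the list above.

```c
#include <sys/types.h>
#include <sys/systm.h>          /* copyout(); exact headers vary between the BSDs */

/* Invented types, for illustration only. */
union reply_sketch {
	int32_t	short_reply;	/* 4 bytes                            */
	char	long_reply[64];	/* the union itself occupies 64 bytes */
};

int
send_reply_sketch(void *user_ptr)
{
	union reply_sketch r;	/* uninitialized kernel stack memory */

	/* memset(&r, 0, sizeof(r)); would be the "when in doubt, zero" fix */

	r.short_reply = 4;	/* only the first 4 bytes are written */

	/*
	 * Copying out sizeof(r) hands 60 stale stack bytes to userland,
	 * even though only the small member was ever initialized.
	 */
	return (copyout(&r, user_ptr, sizeof(r)));
}
```

Copying out only sizeof(r.short_reply) would also avoid the leak, but as soon as the length comes from the union type itself, the trailing bytes travel along.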
So what to expect? If you look at the history of kernel memory disclosures in other operating systems, this is not a BSD problem, it's a problem everybody is facing. In a publication by Lu, in 2011 I think, they stated that around 40 kernel memory disclosures had been found in the Linux kernel, mostly by manual code audits, and when they automated things they found about 20 more in Android and Linux. Then there was the paper by j00ru last year, who found around 70 in Windows alone. So everybody is fighting with this, and the list goes on: many individual researchers have found kernel memory disclosures by looking at the code. So far there hasn't been any systematic investigation in the BSDs, as far as I know. Maybe you could call OpenBSD's manual code reviews an exception, but then again, correct me if I'm wrong, I think two weeks ago there was an errata notice in OpenBSD about pledge, and it involved a kernel memory disclosure. So if that is true, everybody really is fighting with this.

The assumption, then, is that there must be many kernel memory disclosures in the BSDs, and while I was reviewing the code of FreeBSD, NetBSD and OpenBSD I confirmed it by finding some low-hanging fruit. Then I thought, well, we could actually patch the kernel, use some form of taint tracking, and do this automatically, and this is basically where I teamed up with Maxime and we came up with this idea called KLEAK.

KLEAK is an automatic approach to detect kernel memory disclosures, and it uses a rudimentary form of taint tracking. We taint our memory sources, the kernel heap and the stack, then we let the tainted bytes travel through kernel space, and at the data sinks, in our case copyout() and copyoutstr(), we look at the buffers and search for certain marker bytes. If they are there, we have detected such a leak.

Let's have a quick look at how this actually works. We've got our program, we've got our potentially uninitialized stack, and we've got our patched heap that already returns tainted chunks. When we use a syscall and enter the kernel, at the syscall dispatcher we taint the whole stack by calling a function of our own that allocates a huge array and memsets it with our marker byte. Then we call the actual system call. The syscall will do memory allocations, it will get memory from the heap, it will write to the stack, and at some point there must be some exchange, so data gets copied out to user space. That is our data sink: we have a wrapper around copyout(), we look at each buffer and check whether there is a marker byte, and if there are, say, four of them in a row, then we know, cool, there is some kind of memory disclosure. Finally the syscall returns to user space.

Now you may think: one marker byte, that's not a good idea, because however uncommon a byte value may be, you will see it all the time. Think of a syscall like getrandom: all 256 byte values will show up there. Therefore we had the idea of a hitmap, where we introduce rounds. We vary the marker byte: in the first round we use marker byte 1, in the second round marker byte 2, and every time we encounter the current marker byte in copyout() we record that in our hitmap. If we say we want to check for, say, 8 rounds for a leak, then we encode those 8 rounds in the hitmap. This decreases the possibility of false positives tremendously, because it is very improbable that our marker bytes show up at the same offset every time in some random output. There might still be false positives, but it should reduce them tremendously.
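To make the sink side concrete, here is a rough sketch of a copyout() wrapper combined with the per-offset, one-bit-per-round hitmap idea. This is not the actual KLEAK code; in particular, how the real hitmap is keyed is Maxime's implementation (see the question at the end), and KLEAK_BUFMAX and the run length of four are just illustrative values.

```c
#include <sys/types.h>
#include <sys/systm.h>          /* copyout(); exact headers vary between the BSDs */

#define KLEAK_ROUNDS	8
#define KLEAK_BUFMAX	4096	/* made-up bound on tracked buffer offsets */

static int	kleak_round;			/* current round, 0 .. KLEAK_ROUNDS-1 */
static uint8_t	kleak_marker;			/* marker byte used in this round     */
static uint8_t	kleak_hitmap[KLEAK_BUFMAX];	/* one byte per buffer offset         */

int
kleak_copyout(const void *kaddr, void *uaddr, size_t len)
{
	const uint8_t *p = kaddr;
	size_t i, run = 0;

	for (i = 0; i < len && i < KLEAK_BUFMAX; i++) {
		if (p[i] != kleak_marker) {
			run = 0;
			continue;
		}
		/* require a few marker bytes in a row to cut down on noise */
		if (++run >= 4)
			kleak_hitmap[i] |= 1u << kleak_round;
	}
	return (copyout(kaddr, uaddr, len));	/* then do the real copyout */
}

/* An offset is reported only if its marker was seen in every round. */
static int
kleak_is_leak(size_t offset)
{
	return (kleak_hitmap[offset] == (1u << KLEAK_ROUNDS) - 1);
}
```

The point of requiring all round bits is exactly the getrandom argument from above: a byte that merely happens to equal the marker is unlikely to reappear at the same offset in every round once the marker byte changes from round to round.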
So let's go into the details. One data source is the heap: we instrument the dynamic memory allocator to return marked chunks, so we memset the chunks before returning them. If you call malloc, for example, you get the memory you requested, but memset with our marker byte rather than left uninitialized. The exception is zeroed chunks: if you request zeroed chunks, we have to return zeroed chunks, because otherwise you get kernel panics.

Another source is the stack. Right before entering the syscall, at syscall dispatch, we call our own function that taints the whole stack. But then we have a problem: during execution of the syscall you enter one function, then another, and each of those opens a local stack frame and closes it again, there is data in between, and maybe the leak sits a bit deeper inside the syscall. So what we do is continuously re-taint the stack: we use some compiler instrumentation, and every now and then we re-taint a smaller part, I think at the moment it's 512 bytes. This increases the probability that we encounter those tainted bytes later in the copyout.

Detecting the leaks is rudimentary taint tracking: we don't track each and every byte through the kernel, we just let them travel, and at the sinks, which are defined as copyout() and copyoutstr(), we check on each invocation whether we see the marker values.

Now you may ask: what are good marker values? If I marked my tainted heap chunks with 0 or 255, for example, that would be a poor choice, because those values are much more likely to occur anyway. So we had the idea of at least trying to estimate the byte frequency first and seeing which byte values are more common than others. It's not super scientific or exact, but for our purposes I think it was a decent approximation. What we did was patch copyout() and copyoutstr() in NetBSD 8, and every time copyout() or copyoutstr() was called, we had an internal data structure in which we counted the bytes we saw in the buffer being copied to user space: in this buffer there were 3 occurrences of byte 1, 2 occurrences of byte 2, and so on. Then we added another syscall in order to fetch this data from kernel space and read it out. To get a lot of interaction with the operating system kernel we used the NetBSD test suite, ATF; I think it ran in my VM for half an hour or so, and it provoked many, many invocations of copyout() and copyoutstr().
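As a rough sketch of that byte-frequency estimation, and not the code we actually patched into NetBSD 8, the idea looks roughly like this: a global table counts every byte value seen in copied-out buffers, and a hypothetical extra syscall hands the table back to userland.

```c
#include <sys/types.h>
#include <sys/systm.h>          /* copyout(); exact headers vary between the BSDs */

static uint64_t byte_freq[256];	/* how often each byte value was copied out */

int
counting_copyout(const void *kaddr, void *uaddr, size_t len)
{
	const uint8_t *p = kaddr;
	size_t i;

	for (i = 0; i < len; i++)
		byte_freq[p[i]]++;	/* count every byte value we see */

	return (copyout(kaddr, uaddr, len));	/* then do the real copyout */
}

/* Hypothetical syscall backend: hand the counter table to userland. */
int
sys_byte_freq_sketch(void *user_buf)
{
	return (copyout(byte_freq, user_buf, sizeof(byte_freq)));
}
```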
And that's basically the result. On the left hand side, on a logarithmic scale, you see the results for copyout(), and on the right hand side the results for copyoutstr(). On the left hand side you will first notice that bytes like 0 and 1 are a bit more common. Then in the middle there is more or less the ASCII range, and those values also seem to be common. There are also bytes like minus 1, that is 255, which is quite a common byte, but bytes towards the end of the spectrum, in the range of 200 to 220, are not that common, so choosing them as marker bytes is a better idea. On the right hand side we see the values for copyoutstr(), which only copies out ASCII strings, and the good news is that we did not count any bytes around 200 there, which would have indicated kernel memory disclosures; the bytes really are in the ASCII range, and the little spike on the left is the line feed character, value 10. At the bottom you can see, on the left, the bytes that are quite common, like 0 and 255 and, I don't know why, a couple of others, and on the right the bytes that are not so frequent and which we chose as our marker bytes: bytes like 154 and 218, things like that, and I think of those 10 marker bytes maybe 4 are prime numbers.

So, our solution: as I told you, using only one marker byte is not a good idea because you have to consider the amount of false positives. The solution was basically to invoke the kernel entry points over and over again while changing the marker byte, and to use this hitmap. In the hitmap, for each offset we have one byte, and each byte contains 8 bits; if we choose 8 rounds, then in each round in which we encounter the marker byte at that offset, we switch the corresponding bit on.

Jumping to the implementation: in NetBSD there is the userland tool kleak, which interacts with the kernel part. It is not enabled by default, it's a development option, but your system remains quite usable; there is no tremendous slowdown when you use KLEAK. Here we can see an example: we call kleak with the number of rounds, in this case 4, and the command ps, and out of what came back, Maxime found this one, which was quite tremendous: it leaked 931 bytes of memory from kernel space to userland. I think this was the biggest one we found, so that's quite a lot.

Quickly about the limitations. One trade-off is simplicity and speed versus precision. We are not that precise, because we don't do sophisticated taint tracking where we follow the bytes through every operation in the kernel; we rather let them travel through the kernel, check at the end, and that's it. That keeps it simple, the KLEAK patch for NetBSD was definitely less than 1000 lines, I don't know exactly how many, and it's reasonably fast, you don't notice a big slowdown, which was one of our goals, and it actually worked, as you will see. Code coverage: it's a dynamic analysis approach, and that's the usual problem with dynamic analysis, you have to cover as much code as possible. You can simulate user interaction, you can use testing frameworks like the ATF tests, you could use fuzzing to improve the coverage, but in the end getting close to 100% coverage is difficult; that's a problem all dynamic analysis approaches share. Another thing may be portability across the BSDs, but it's not a huge issue: I did a proof of concept port to FreeBSD which produced some results, and I think you could port it to Apple's systems and to Linux if you want; Windows, I don't know.

So these are the direct results of KLEAK, and I don't think the list is complete. I ported it to FreeBSD 11.2 as a proof of concept, and Maxime implemented it fully in NetBSD-current. These are more than 20 leaks, from everywhere: from dynamic memory, from the kernel stack, some around 900 bytes, some 92, some smaller; usually the small ones are half of a pointer on AMD64. And the research basically goes on; these are just the direct results we found by using KLEAK. I was told there will soon be a security advisory by NetBSD about many of the leaks, more than 15 or 16.
The FreeBSD leaks were all fixed, but there is no security advisory so far; I don't know what happened there. What was a kind of cool follow-up by the FreeBSD devs: the leaks were reported, and what they then did was look around in the code, because where there is one leak, there may be another. On the left hand side, there was one leak we found, and it turned out there were something like 20 file system implementations with the same leak, and they fixed them all. So as a follow-up they fixed another 50 or 60 kernel memory disclosures, which was great work.

As a conclusion: we saw what kernel memory disclosures are and how to avoid them. I presented our approach, called KLEAK, which is fully implemented in NetBSD; it detected more than 20 of those memory disclosures, and as a follow-up dozens more were fixed by other developers, so it is a real thing. One more thing: the BSDs are far from being KMD-free, and that's true for other operating systems as well. So today, instead of planting a tree, maybe we can fix kernel memory disclosures, and we get a bit closer to a fully secure system. Keep looking for them, and thank you for your attention. Any questions?

Question: How expensive would it be to insert, at compile time, zeroing instructions for the stack used by kernel functions? Could you modify the compiler so that on function entry it zeroes out all the stack variables?

There actually are projects like that; for example, on Linux there is something where every time you enter a system call, your stack gets zeroed out. Then there is the performance question: it costs some number of CPU cycles, I don't know how many, and if your workload is performance critical you won't want it, but it exists. It's not a standard thing, it's one of those security patch sets, the Linux hardening patches maybe. But then again, there are for example leaks that come from data you wrote there yourself, and there are leaks from the heap, so you would have to patch your heap implementation as well; there are many places where these leaks can occur.

Question: What do you use as a key for the hitmap?

The hitmap was actually Maxime's great idea, and he implemented it, so you would have to ask him; I can't give you all the details on the hitmap.