Welcome back from lunch, everyone. I hope you still have some mental capacity to follow a couple more talks. Please welcome Andrew Turner, who is going to talk about fuzzing FreeBSD. OpenBSD, sorry. Please welcome him.
Thank you. I say that, but I'm a FreeBSD committer, and I'm not just going to talk about FreeBSD: I have looked at the state of things on FreeBSD, NetBSD and OpenBSD. So, about myself. As I said, I'm a FreeBSD committer; you may have previously seen me doing things like arm64 support. I'm a research associate at the University of Cambridge. If you were in Brooks's talk this morning on CheriABI, I work on the same project, where we're trying to add more security by default into hardware and therefore into your software. And sometimes I pretend to be a freelance software engineer, if people feel like paying me money.
I'm going to start off by talking about the sanitizers. Even though I said this talk is on fuzzing, the sanitizers are useful for fuzzing, and I'll explain why. For sanitizers I want to ask: you've got some code; what do you know about its quality? We've got this big piece of code we call the kernel, with lots of different parts. What's the quality? Are there buffer overflows that could lead to someone exploiting the system? Are we accidentally leaking memory, and therefore leaking information that a user shouldn't have access to? Do we have a way of figuring this out? You can think of the sanitizers as instrumentation of your code. Generally it's the compiler doing this, inserting a function call somewhere in the code depending on the sanitizer.
Some will insert a function call when you enter a basic block; some will do it when you're doing a comparison, to see what values you're comparing; some will insert a function call when you're doing a load or store. Exactly what we do with that call is handled by a runtime. In user space, Clang or GCC will provide you with a runtime. Not so in the kernel: we have to do this all ourselves, because it's always a special environment.
Undefined behaviour is always a favourite. There's the user space undefined behaviour sanitizer and the kernel equivalent, which usually starts with a K. Some undefined behaviour can be detected at compile time, but a lot of it is only detectable at run time. Things like misaligned accesses or null pointer dereferences. I ran into this recently: you get given a null pointer somehow, and you can't possibly detect it at compile time because it comes from user space, and you're just missing a null pointer check. Or you may be shifting by a variable amount, and that value may be 64 when you're shifting a 32-bit int, which is well outside of its range. Thanks to NetBSD we have the micro-UBSan runtime. NetBSD committed it over a year ago, I imported it into FreeBSD late last year, and OpenBSD did the same this year. You do get some issues, though: the kernel can be a wee bit big. Here are three examples I found in NetBSD. Alignment issues are often a case where the compiler assumes, because you've told it so, that something is going to be 128-byte aligned, and then you give it a 64-byte aligned thing. I've seen plenty of cases where that happens, and then suddenly you're reading the wrong thing and you could easily turn this into a security vulnerability.
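As a rough illustration of the kinds of report mentioned here (misaligned access, null pointer use, out-of-range shift), the following is a minimal user space sketch of well-defined alternatives. The structure `struct packet` and all function names are hypothetical and are not taken from any of the BSD kernels.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure, used only to illustrate the offset idiom. */
struct packet {
    uint8_t  kind;
    uint32_t length;
    uint64_t cookie;
};

/* Shift that stays defined for any count: a count of 32 or more
 * yields 0 instead of being undefined on a 32-bit value. */
uint32_t shl32_checked(uint32_t value, unsigned int count)
{
    return count < 32 ? value << count : 0;
}

/* Read a 64-bit value from a buffer that may not be 8-byte aligned.
 * memcpy avoids the misaligned load that a cast to uint64_t * and a
 * dereference could perform. */
uint64_t load_u64_unaligned(const void *src)
{
    uint64_t out;

    memcpy(&out, src, sizeof(out));
    return out;
}

/* Offset of a member via the standard macro, rather than taking the
 * address of a member through a null pointer cast to the struct type. */
size_t cookie_offset(void)
{
    return offsetof(struct packet, cookie);
}
```

With helpers like these, `shl32_checked(1, 64)` returns 0 rather than leaving the result up to the compiler, and `load_u64_unaligned(buf + 1)` works for any byte offset into a buffer.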
This null pointer dereference is actually code trying to find the offset of a value inside a struct by casting null to the struct type and then taking the address of the field you want. Clang definitely has a built-in way of doing this, and I expect GCC has the same. Please don't just cast null to a struct any more. And there will be things like shifts out of bounds. The compiler is allowed to assume you never do any of these, and if you do hit undefined behaviour, the compiler could do almost anything. Don't rely on undefined behaviour staying the same across compiler versions or between compilers. GCC and Clang will probably do something different with your undefined behaviour, and so could Clang 8 and Clang 9. For example, if Clang 9 can show that a variable is const and you're writing to it, it can say that can't possibly happen, and in some cases it will just drop that store. So do try to remove your undefined behaviour if you can.
That was the easy one. I should add that there are some issues with having it enabled: we can't enable it by default because it blows up the kernel size, but it is possible to build kernels that work. The second one is the coverage sanitizer. When you run a system call, what paths through the kernel code does it take? The compiler inserts trace calls into your code to say "I've entered a new function", or "I am now running this if statement, here are the two values being compared, and here is how wide the comparison is". This one is actually very powerful. All three of OpenBSD, FreeBSD and NetBSD have added support for it, and we'll see exactly why later.
The FreeBSD code originally started with a co-op student that Ed Maste had; I then took the code, worked on it a bit more, and managed to get it committed. It works like this. We ask the kernel for a buffer. It starts with the number of valid entries in the buffer, followed by the list of basic blocks it has gone through, or rather program counters inside those basic blocks. It doesn't have to be an actual PC value; it just has to be something unique to each location. On FreeBSD and NetBSD these entries are always 64-bit; on OpenBSD they're whatever the pointer size is, so 64-bit on a 64-bit architecture. Normally the value is just the return address of the trace function. The exact value doesn't matter that much as long as it's unique. This is good if you just want to find out what path you took through the code. But then: I want to try a different path through the code, so how do I modify my input to get there? There are probably some if statements in there, but we don't know where; all we have are addresses that may be related to an if statement.
So we have the comparison trace, which is very useful for fuzzing. It's the same idea. We have the number of entries, which is slightly different here: it's the number of groups of values. In each group, the first value is the type; 2 means, I don't remember exactly, something like an 8-byte comparison. Then come the two values being compared, so hex 10 and hex 20, and then an address, approximately where the comparison was. And then you have a second comparison later.
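To make the comparison trace layout concrete, here is a small sketch that walks such a buffer held in an ordinary array. It assumes the layout as described above: the first word is the number of records, and each record is four 64-bit words (type, first operand, second operand, address). The field order, the struct and the function names are assumptions for illustration, and the type value is treated as opaque.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed record size: type, two operands, address. */
#define CMP_WORDS_PER_RECORD 4

struct cmp_record {
    uint64_t type;  /* encodes the comparison width; opaque here */
    uint64_t arg1;  /* first value compared */
    uint64_t arg2;  /* second value compared */
    uint64_t pc;    /* roughly where the comparison happened */
};

/* Copy record `index` out of the trace buffer. Returns 0 on success,
 * or -1 if the index is past the record count stored in buf[0]. */
int cmp_trace_get(const uint64_t *buf, uint64_t index,
                  struct cmp_record *out)
{
    if (index >= buf[0])
        return -1;

    const uint64_t *rec = &buf[1 + index * CMP_WORDS_PER_RECORD];
    out->type = rec[0];
    out->arg1 = rec[1];
    out->arg2 = rec[2];
    out->pc = rec[3];
    return 0;
}

/* Count the comparisons whose two operands differed. These are the
 * ones a fuzzer might try to flip by changing its input. */
uint64_t cmp_trace_count_mismatches(const uint64_t *buf)
{
    uint64_t mismatches = 0;
    struct cmp_record rec;

    for (uint64_t i = 0; cmp_trace_get(buf, i, &rec) == 0; i++) {
        if (rec.arg1 != rec.arg2)
            mismatches++;
    }
    return mismatches;
}
```

In a real setup the buffer would be the memory shared with the kernel rather than a local array; the parsing logic would stay the same.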
You could imagine we've got some input data we're passing in, we see it in here, and it was a failed comparison. If we change the input, could we find a new path through the kernel, one where that comparison succeeds? This is just finding paths through the kernel. It's not necessarily going to find the fact that you've got a buffer overflow. It helps you find how to get to the buffer overflow; it doesn't tell you that you have one. As for the interface: you have a device, and the information I've presented is put into a buffer shared between the kernel and user space, so that as a user space program I can read it and learn more about the system calls I've executed. There are alternative interfaces. There's AFL, the American Fuzzy Lop fuzzer; there was a talk just now from NetBSD about their support for it. I also have patches for FreeBSD to add that support to kcov, which lets you extract the right information. I've also got patches for the case where you've got a completely different idea of how this should work: I've split the actual sanitizer out from the interface, so we can have alternative interfaces. If anyone has ideas about what you'd want to do with run-time information about every comparison in the kernel, let me know. I'm sure we can come up with a module that will give you this information and break the kernel even more.
Those are the two nice ones to have, but the next ones are the ones that will actually break your kernel. These are the ones that will find lots of useful security vulnerabilities. The address sanitizer checks memory accesses. I give you a pointer.
Say it's an array, and I give you a length. What happens if the length isn't quite right and you then try to read it? Or you've got a bug where you read one past the length, or something similar? The idea of the address sanitizer is that it will catch this sort of thing. It tries to help you find small buffer overflows, in the range of 8 bytes or so, so it's good at catching off-by-one errors. It won't really help you with a very large overflow, or when you've somehow managed to corrupt your pointer in other ways. The way it works is with a shadow map. For every 8 bytes of kernel address space you have an extra byte in the shadow map. In that byte we mark that the first 0 to 8 bytes of those 8 bytes are valid. On every load and store you then have a check: is every byte in this access valid? The valid bytes have to be the first ones and they must be contiguous. You can't say the first, third and seventh bytes are valid; that's not going to work, just because of the way the format works. When you're marking memory as invalid, we've got specific values that say why: this is invalid because it's freed memory, so we know it's a use after free; or it's the padding you get after a malloc, so it's a buffer overflow on a malloc buffer; or it's padding between two values on the stack, or it's the top of the stack. So you get a little bit of information about why you've got your buffer overflow, which helps you work out what you're doing wrong.
So how it works: this is your memory, and I've given you three examples of memory that has been allocated. Say you've got three bytes there. Because ASan works on 8-byte chunks, you always get the extra five bytes at the end.
So you've got three bytes of memory valid and five bytes invalid. Or you could have every byte valid, or none. At the top you'll see a white box with n in it, which is this block's shadow memory, at a fixed offset. Depending on the value of n: if n is positive, the first n bytes, one to seven of them, are valid; if n is zero, all bytes are valid; and if n is negative, nothing is valid. This does mean all allocations have to be at least 8-byte aligned and rounded up to a multiple of 8 bytes in size, so there's a bit of memory overhead. It means that small buffer overflows are detected. By default I think FreeBSD adds, or will when I implement it, one extra block of 8 bytes afterwards. If you think there's going to be a larger overflow, we could add more if we need to. Because I'm using the same runtime as NetBSD, I believe they'll be the same.
So here's an example; hopefully this one doesn't have any buffer overflows. I've allocated some memory. If you're not familiar with FreeBSD, this is just a malloc. M_TEMP says it's a temporary allocation where I don't care about the type; the type is just there for later, for tracking who is using memory. M_WAITOK means I'm happy with sleeping, which guarantees that a non-null value will be returned. So I don't need to bother with a null check in this case, because the result is guaranteed to be valid.
I get some data: I call a function, pass this pointer in, and I want to return the result. So I copy it onto the stack and free the malloc'd memory. I'm only using malloc here to show how it works with that. You can imagine the real thing is more complex, with the malloc and free maybe in different functions, but this is a simplified case. So what's happening? It allocates the memory. Because, as I said, it has to be 8-byte aligned and 8-byte sized, it allocates the first 8 bytes, and then rounds up to include a second block, so that for buffer overflow detection we're guaranteed to have something after the memory we've allocated. Only the first four bytes are marked as valid. We then put some data into the shadow map: 4 says four bytes are valid in the first entry, and FB says it's a malloc'd buffer, so if we have a buffer overflow there we'll know it's from reading past the end of a malloc'd array. As I said, the first half is valid, and I apparently forgot to write on the slide that the second block is marked as malloc padding. So a load or store that includes bytes 4 through 15 will be detected, as long as the code has been built with the sanitizer enabled. When we detect that, we can either printf or panic. Panic will stop the world; printf will just say you've done something naughty. A load or store outside of this range may or may not be detected. KASAN doesn't give you perfect security. It's not a security tool; it's just going to give you an idea. So, my example again.
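The shadow encoding described above can be sketched as a small user space simulation. This is not the NetBSD or FreeBSD runtime: the arena is just a range of offsets, the marker values and every name are hypothetical, and the check walks each byte for clarity where a real runtime would be more economical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE            8     /* bytes of memory per shadow byte */
#define ARENA_SIZE         64    /* size of the simulated memory */
#define SHADOW_UNALLOCATED (-1)  /* hypothetical marker: never allocated */
#define SHADOW_FREED       (-3)  /* hypothetical marker: freed memory */
#define SHADOW_REDZONE     (-5)  /* 0xFB as a signed byte: malloc padding */

/* One shadow byte per 8-byte granule: 0 = all 8 bytes valid,
 * 1..7 = only the first n bytes valid, negative = nothing valid. */
int8_t shadow[ARENA_SIZE / GRANULE];

void shadow_reset(void)
{
    memset(shadow, SHADOW_UNALLOCATED, sizeof(shadow));
}

/* Mark `size` bytes at `offset` valid, followed by one padding granule. */
void shadow_mark_alloc(size_t offset, size_t size)
{
    size_t g = offset / GRANULE;
    size_t full = size / GRANULE;
    size_t tail = size % GRANULE;

    assert(offset % GRANULE == 0);
    assert(g + full + (tail != 0) + 1 <= ARENA_SIZE / GRANULE);

    for (size_t i = 0; i < full; i++)
        shadow[g++] = 0;
    if (tail != 0)
        shadow[g++] = (int8_t)tail;
    shadow[g] = SHADOW_REDZONE;
}

/* Mark a previously allocated region as freed. */
void shadow_mark_free(size_t offset, size_t size)
{
    size_t granules = (size + GRANULE - 1) / GRANULE;

    assert(offset % GRANULE == 0);
    assert(offset / GRANULE + granules <= ARENA_SIZE / GRANULE);
    for (size_t i = 0; i < granules; i++)
        shadow[offset / GRANULE + i] = SHADOW_FREED;
}

/* Would an access of `size` bytes at `offset` pass the check? */
bool shadow_access_ok(size_t offset, size_t size)
{
    assert(offset + size <= ARENA_SIZE);
    for (size_t i = offset; i < offset + size; i++) {
        int8_t s = shadow[i / GRANULE];

        if (s < 0)
            return false;  /* freed, padding or unallocated */
        if (s > 0 && i % GRANULE >= (size_t)s)
            return false;  /* past the valid prefix of the granule */
    }
    return true;
}
```

For the 4-byte allocation in the example, `shadow_mark_alloc(0, 4)` leaves the first shadow byte at 4 and the second at the padding marker, so a 4-byte read at offset 0 passes while a 4-byte read at offset 4 fails.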
In malloc we set up that shadow map, and then at the use we have a function call inserted by the compiler. This __asan_load4 implements the check on the shadow map. The compiler already knows that an integer is four bytes, so it knows to insert the 4-byte-wide check. Exactly what __asan_load4 does is runtime dependent. The NetBSD code, which we and NetBSD use, ends up doing a lookup in the shadow map and checking that you've got valid memory, and depending on how you built it, calls panic or printf. If instead we were to do this, and note the difference: before I was reading data[0] and here I'm reading data[1]. I've now inserted a buffer overflow. This moves on to the next entry, so data plus four bytes, and asks what the shadow value is there. The hope is that the load4 code then detects this buffer overflow and prints a message saying exactly where it was. Because the check is right beside the use, we should be able to figure out exactly where in the code the issue was.
A question? The question is: does it work with the zone allocator? Yes, except in one case where I've got a panic. If you know about UMA, I'd like to talk. I'm not sure I've actually got the answer there.
NetBSD has had this for a while, about a year. I had a Summer of Code student working on it, who got a lot of the work done, and we managed to get FreeBSD booting successfully on amd64. That was with my previous runtime, which I decided I didn't like. Since the Summer of Code I've also brought the NetBSD runtime into FreeBSD, and I've pushed it to a Git branch for people interested in playing with it. I am currently testing it and trying to break it.
I have found one known issue with UMA, which is in my use of it, on the KASAN side, not the UMA side. So the problem is with my code, not with the existing code, so far. Because of the way it works, when we free we also mark the memory invalid, so it will also detect use after free, which is nice.
There's another interesting sanitizer, the hardware-assisted one. ARM64 has support for a thing called top byte ignore: we can tell the hardware to just ignore the value in the top byte of pointers, so it could be anything. If you enable that in the hardware, we can use that byte on every check as information about whether or not the pointer is pointing at valid memory. So it stores an 8-bit tag in that field. Because the hardware ignores it on loads and stores, when we instrument the code we can check the tag before the load, but we don't have to modify the pointer to remove it. This is an ARM64-specific thing. I don't know of any other architecture that supports something similar, certainly nothing that FreeBSD runs on. It does mean you have to allocate random tags. White here is the default, so unallocated memory; we start off with pointers whose tag matches that, so a load or store to that memory is valid. We do some allocations. In this case memory addresses 1 and 2 have been marked as blue, so your pointer has to have the correct tag, whatever number it is for blue, to load from that memory. Eventually we'll do some more allocations. They may have gaps between them and so on, but that's fine.
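The tag check can be sketched in plain C by treating pointers as 64-bit integers with the tag in the top byte. This is only a simulation of the idea: the arena, the 16-byte granule, the tag values and all names are assumptions made for illustration, and real tags would be chosen at random by the allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT   56   /* the tag lives in the top byte of the pointer */
#define TAG_GRANULE 16   /* assumed: one memory tag per 16 bytes */
#define ARENA_SIZE  256  /* size of the simulated memory */
#define TAG_DEFAULT 0    /* "white": untagged, unallocated memory */

/* One tag per granule of simulated memory, all TAG_DEFAULT at start. */
uint8_t mem_tags[ARENA_SIZE / TAG_GRANULE];

uint64_t ptr_with_tag(uint64_t addr, uint8_t tag)
{
    return (addr & ~((uint64_t)0xff << TAG_SHIFT)) |
        ((uint64_t)tag << TAG_SHIFT);
}

uint8_t ptr_tag(uint64_t ptr)
{
    return (uint8_t)(ptr >> TAG_SHIFT);
}

uint64_t ptr_addr(uint64_t ptr)
{
    return ptr & ~((uint64_t)0xff << TAG_SHIFT);
}

/* Tag `size` bytes at arena offset `addr`; return the tagged pointer. */
uint64_t tag_alloc(uint64_t addr, uint64_t size, uint8_t tag)
{
    assert(addr % TAG_GRANULE == 0 && addr + size <= ARENA_SIZE);
    for (uint64_t a = addr; a < addr + size; a += TAG_GRANULE)
        mem_tags[a / TAG_GRANULE] = tag;
    return ptr_with_tag(addr, tag);
}

/* Clear the tags again so that stale pointers no longer match. */
void tag_free(uint64_t ptr, uint64_t size)
{
    uint64_t addr = ptr_addr(ptr);

    for (uint64_t a = addr; a < addr + size; a += TAG_GRANULE)
        mem_tags[a / TAG_GRANULE] = TAG_DEFAULT;
}

/* The instrumented check: the pointer's tag must match the memory's. */
bool tag_access_ok(uint64_t ptr, uint64_t offset)
{
    uint64_t addr = ptr_addr(ptr) + offset;

    if (addr >= ARENA_SIZE)
        return false;
    return mem_tags[addr / TAG_GRANULE] == ptr_tag(ptr);
}
```

Because there is one tag per granule, an access that stays inside the last granule of an allocation passes this check even if it is past the requested size, while an access that reaches a granule with a different tag fails.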
Later on, if you try to load from memory location 3 with a pointer that has been marked as blue, it will fail. This has the advantage that larger out-of-bounds accesses are more likely to fail, because even if we've allocated memory there, if it wasn't allocated with the correct tag, the access fails. It also gives you use after free, because when you free memory it clears the tag on that piece of memory. You add one byte of kernel address space per 16 bytes allocated, and therefore memory needs to be 16-byte aligned. Because you're allocating the tags randomly, you can argue probabilistically that the next allocation probably has a different colour, a different value, so we don't need to allocate any padding to catch slightly out-of-bounds accesses. It does mean that if you're only one byte out of bounds on a 7-byte allocation it won't detect that, but it will detect it if you're 16 bytes out of bounds. So you won't necessarily get a failure for a slightly out-of-bounds access. And because a few tag values are reserved for other uses, the chance of your pointer having the matching colour when it shouldn't is about 1 in a bit under 256. I only know of Linux implementing this. So if anyone wants to do this on FreeBSD, NetBSD or OpenBSD, come and talk to me; I'd be interested in a runtime.
Then we've got CHERI. If you were in Brooks's talk this morning you'll know all about CHERI; if not, go and have a look at the videos when they're posted later.
Basically, you make pointers bigger and store bounds and other metadata in them, and you make sure they're non-forgeable, so that if you've got this capability you can't possibly use it to access memory outside of its bounds, no matter how much you try. You get features like bounds and permissions, and capabilities can only be derived from other capabilities, so you can only shrink them; you can't expand them. Go back in time and see Brooks's talk this morning if you're interested in learning more about this. We should see some experimental hardware soon, in a few years, as Arm is interested and has been working on this. As long as the capability is set up correctly, we can detect out-of-bounds accesses. If you've got a large allocation we may miss a slightly out-of-bounds access, but we will detect a large out-of-bounds access, because there are no tag collisions as there are with the hardware-assisted sanitizer. There's interesting work ongoing. This has been ported to the kernel: we have a kernel that boots to user space where every pointer is a capability. And we've got work on reducing bounds, so arrays inside structs will have bounds on them, for example, with sub-object bounds. But they may need annotations; KQ is the main one there.
Lastly, it's possible to accidentally read memory before you've written to it, and it's undefined exactly what will be in that memory. If memory you're reading like that is passed out to user space, you could accidentally leak information. KMSAN checks for this. We store a bit to say whether this memory has been initialised, and then on use it checks whether the memory is valid.
What counts as a use is the interesting part. A use is when the value appears in a conditional, so an if statement or a while statement, when it's dereferenced as a pointer, or when it's copied out to user space. The last one is there to stop information leaking out of the kernel. Here's an example. We create a variable a. b = a is not a use in this case: it's not in a conditional and it isn't changing the flow of the program, it's just an assignment. So the fact that a is uninitialised propagates to b. Then on copyout, which copies from the kernel to user space, it will warn that you've used an uninitialised value at that point. The second, more surprising one is this code. c = a + b is not considered a use, but whether a was initialised depends on whether the flag was set. If we went through this condition, c is marked as initialised; otherwise it's not. Later on we use the flag again to decide whether to copy it out, so in that case c is initialised at the copyout. I'm assuming b is initialised in this example. Then a slightly more complex one, where the use is in the conditional. When you allocate memory, it's all marked as uninitialised unless you've explicitly asked otherwise. Then on reading the flag it will say you haven't actually initialised this, so it's an invalid use. Like KASAN, it uses a shadow map; it only needs one bit per byte, so each byte of memory has a bit somewhere in the shadow map saying whether it has been initialised. When you allocate memory it's poisoned, and when you zero it or write to it in some way it's unpoisoned. The shadow state is propagated: when you do an assignment we propagate the state, so that when you actually use the value later it will know.
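The propagation rule described above can be sketched by carrying the initialised state next to each value. This is only a model of the idea, not how KMSAN stores its shadow: the struct and every function name are hypothetical, and the "uninitialised" value is stored as 0 so that the sketch itself stays well defined.

```c
#include <stdbool.h>

/* A value together with its shadow state. */
struct tracked_int {
    int  value;
    bool initialised;
};

/* A freshly declared variable that has not been written to. */
struct tracked_int tracked_uninit(void)
{
    struct tracked_int t = { 0, false };
    return t;
}

/* A value written with a known constant. */
struct tracked_int tracked_const(int v)
{
    struct tracked_int t = { v, true };
    return t;
}

/* b = a: not a use, the shadow state is simply copied along. */
struct tracked_int tracked_assign(struct tracked_int src)
{
    return src;
}

/* c = a + b: not a use either; the result counts as initialised
 * only if both inputs were. */
struct tracked_int tracked_add(struct tracked_int a, struct tracked_int b)
{
    struct tracked_int t = {
        a.value + b.value,
        a.initialised && b.initialised
    };
    return t;
}

/* A use (conditional, pointer dereference or copy to user space).
 * Returns true if this use would be reported. */
bool tracked_use_reports(struct tracked_int v)
{
    return !v.initialised;
}
```

With this model, assigning an uninitialised value and then "using" the copy is reported at the use, not at the assignment, which matches the b = a example.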
Lastly for memory there's KLEAK, which is similar but, I hear, is likely to be deprecated in NetBSD, because it uses in-band signalling and so is more likely to give false positives. When you allocate memory it puts a known value in, and on copyout it checks whether that value is in the structure you're copying out. They've done some science on this: they looked at which one-byte values are likely to be copied out, chose the marker values with that in mind, and rotate them. There's a paper on this if you're interested. And threading: threads are hard. There is a kernel TSan, but I'm not sure what its status is. The last commit I saw to the Linux tree was in March, from Google, and I haven't looked recently to see whether they've made progress anywhere else.
So why do this? To find bugs. That's my reason. Fuzzing often needs this sort of thing, and it makes fuzzing much more powerful. The comparison example I gave you in particular means the fuzzer gets feedback. As you make the fuzzer more likely to find new code paths, it's more likely to find bugs. If you can make it easier to find bugs, you'll find more bugs. There are two fuzzers I'm going to talk about. The first is syzkaller. It's a project from Google that understands a lot about different system calls. All three of NetBSD, OpenBSD and FreeBSD are supported by it. It's good at finding new and interesting ways of panicking the kernel from user space by composing different system calls together. As you can see, this is from earlier in the year. Google will provide you with a list of panics. But you'll note that there's only one. There are now two, but at the time there was only one of these fuzzers running, which is ci-freebsd-main.
There's now a second one for FreeBSD, running on i386, in case you want to find compat32 issues. Compare that to Linux, where they have many different compile options: they have KASAN, they have KMSAN. It would be nice to have that for FreeBSD. I think NetBSD might have KASAN enabled by default; I'm not sure, but I saw some KASAN panics when I looked earlier. It tries to find new paths through the kernel. It uses the comparison sanitizer to say: I've given you this data, I see these comparisons that possibly use this data; if I change it to the other side of the comparison, does it get into a new code path? That's how it works: it's good at finding these bugs in the kernel because of the feedback it gets from the kernel. It understands the various arguments and how they relate to each other, and it will do things you don't expect. It will take a socket, a value in a file descriptor, and pass it to something that definitely doesn't expect a socket, and see what happens. It tries so many different things that you don't expect, and that's where your kernel crashes happen. It's very good at panicking the kernel. It will try to find you a reproducer, so it will give you a C test case if it can find one. Adding sanitizers makes these bugs a lot easier to find. That's why sanitizers are good: if we add them, it finds more bugs, we can fix them, and we can hopefully make better claims about having fewer bugs in our kernel than before. If it finds a panic, it gives you some information. It gives you the stack trace, for example; this just comes out of FreeBSD on the panic anyway. If you look down at the bottom, you'll see there is also a log.
It will give you a syz reproducer, which is just a textual description of the system calls and how it ran things, and it will find you a C reproducer if it can, which you can then download and try to reproduce locally. It will email the reports out; there are mailing lists, so join the appropriate one for your BSD or the operating system of your choice. When people fix things, if they tag the commit, syzkaller will detect this. As it pulls in new source code with your changes it detects that you've got a fix for something; it gives you an email address to put in a Reported-by tag, and then it checks that you have actually fixed it. AFL, which is American Fuzzy Lop, is a file format fuzzer. I'll skip over this a wee bit, as the previous NetBSD talk was about it. There have been patches, and unfortunately I found them to be slow, so talk to me if you've got ideas for improving the performance.
So, conclusion: fuzzing is good, and sanitizers are good for fuzzing. We have kcov, we have UBSan, and ASan seems to be a good one for finding bugs; the other ones need more work. Google will fuzz for you if you ask them to, which, if you're running one of the BSDs, they have done already. Look through these reports if you're interested in this. Any questions?
I'm interested in the work on coverage. Did you use any of the hardware facilities, like CoreSight on Arm or Intel PT, for that?
No, the coverage just uses the compiler to insert trace points into your code. I have thoughts about using these sorts of tools on Arm and x86, but it depends a little on whether they are actually present in the hardware, which may not be the case on Arm. So yeah, if you've got ideas about that, let's have a beer later. Any more questions?
When you're using the sanitizer and coverage tools, I guess the coverage tool cannot cover itself, as that would cause recursion and a feedback loop.
You have to be very careful to build the coverage tool without coverage enabled.
Right, so that was my question.
Sometimes you have to build without the sanitizer support, otherwise things get very recursive very quickly and you don't even get to the first printf. Any more questions? No? In that case, thank you for your presentation.