All right, good morning. My name's David Moore. Very happy to be here, and I just want to quickly say thank you for being here. I really appreciate it. Welcome to the Embedded Linux Conference, and thank you very much to the Linux Foundation for inviting me to speak today. Just a few quick details about my background. I've been a professional software engineer since 1994. I had the opportunity to work with some great companies, starting with NeXT Software, where I worked for Steve Jobs back in the 90s doing WebObjects consulting, one of the first web frameworks. I became an Apple employee, and went on to work with some really excellent companies and really great people in engineering, sales, consulting, and business development roles. I was having a great time at that, a great career, but I decided to switch into consulting, and so there I was, consulting in Indonesia. I was able to do that for a little while, and that was really fun. And then I actually spent a couple of years as a semi-professional opera tenor. I trained into opera, took a lot of voice lessons, and reached essentially a regional or semi-professional level, so it was pretty exciting. I kind of got kicked out of opera, though. I wasn't loud enough. You have to be very loud. At the same time, I was seeing some pretty amazing things going on in the security world, starting with Stuxnet and a lot of the big breaches that happened around 2012, 2013. That really piqued my interest. I wanted to get back into tech, so I decided to do that and focus specifically on security, really offensive security. To do that, I trained into it: I used different vulnerable web apps and things like that, did tons of reading, and then I started doing bug bounty programs. Those are where you can attack a website with the company's permission, and if you find something and report it responsibly, they pay you, usually a little bit of money.
And they certainly give you public recognition, which I think is more valuable. It's also a very good way to learn hacking: you're going after live targets that are secured by professional security engineers, so it's much harder than the practice targets. So that was great, but I pivoted into fuzzing and memory corruption as well, and that's what we're going to talk about mostly today. I found a few things in Ruby and PHP and other projects. One quick story: I found a bug in the kernel, in ext4. I wrote a script to highlight duplicate lines in source code: anytime the same line appears twice, flag it. I ran it over tens, maybe hundreds of thousands of lines of open source code, and it was quite time consuming, because usually a duplicated line is a legitimate thing, or often it is, but sometimes it's not. I was really looking for copy-and-paste errors. And I found one where they were checking the mutex on inode1 twice, when there was also an inode2. So that was a bug, a copy-and-paste error where the programmer forgot to change the 1 to a 2 in the second line. So I sent in a patch, it got accepted, and now I have a one-byte fix in the kernel. Then last summer I really got into fuzz testing. I think it's a really powerful thing to do, so I started a company, and we do fuzz testing at scale in the cloud, as well as training. OK, so today we're going to talk about memory corruption, specifically handling the memory corruption bugs that fuzzers find. When I first started fuzzing, I wondered to myself, what am I going to do when I find a crash? I can do this fuzzing thing, and it's good at finding crashes, but I really wasn't even sure how to deal with that. So I worked it out, and this talk is essentially the results of what I found. By the end of this talk, you should be able to triage crashes, meaning determine the likelihood of exploitability. In some places it will hopefully be a bit of a review of secure coding as well.
We are going to be addressing memory corruption issues in C and C++ Linux programs, so this is unmanaged code. Fuzz testing, at the moment, really looks for crashes, segfaults, and so it's really focused on unmanaged code in C and C++ programs. This talk is about the middle section of the process. The first section is doing the fuzz run itself: things like choosing a fuzzer, choosing the seed files, how long to run the fuzz run for. That's another talk. And we're not going to cover exploit development either; that's certainly another talk. So this is the middle section: you do the fuzz run, you get crashes, and how do you process those, with the goal of fixing the bug or possibly reporting it to the maintainers? We're going to go through a quick review of memory corruption bugs, go through my workflow for dealing with them, and then show a couple of real-world examples. So what do we mean by memory corruption? Pretty much, it's invalid reads and writes. That is, the program can be tricked into allowing an attacker to either read or write memory beyond the bounds of whatever buffer or variable that memory was assigned to. If you can write out of bounds or read out of bounds, that's the most fundamental form of memory corruption bug. These are also called out-of-bounds reads and writes, often abbreviated OOB reads and writes. And just a quick review, the causes of this kind of memory corruption are very often off-by-one errors. I see those so often still; they're so easy to make. Well over half of the crashes I find are fundamentally off-by-one errors, so fuzzing is good at finding those. Unvalidated input obviously is an important one: any untrusted input has to be carefully checked. And then there's still some use of known unsafe functions such as strcpy and gets.
So there are two areas in memory where you store variables in a process: the stack and the heap. Local variables are stored on the stack; you say int x = 5, that's a stack variable. Memory obtained using malloc, like a string buffer, is in the heap. So if you're allocating x number of bytes, that's always a heap variable or heap buffer, and in the case of heap memory, the programmer has to explicitly call free as well. So we're talking about buffer overflows: that's another term for an invalid read or write, and buffer overflows are still pretty common in both the heap and the stack. One quick note: you'll hear people talk about just strictly a stack overflow. Strictly speaking, that's not a memory corruption error; a stack overflow, as opposed to a stack buffer overflow, is different. A standard stack overflow is when you have out-of-control recursion: you're using recursion, you're not checking things, and it goes essentially infinite until it runs out of memory or until something else determines that there's a problem and stops the program. So we're talking about stack buffer overflows today, among other things. This is a quick example of a really basic stack buffer overflow: you have a program that reads from standard in, puts it into a character array of 8 bytes, and then uses a known unsafe function to copy it from the arg into the buf. That could very easily be a stack buffer overflow. In this case, we're sending 12 bytes into the program, and that's going to overflow by 4 bytes. So what happens if free isn't called properly? That can be another class of bug called a use-after-free. And a use-after-free is really just what it sounds like: the program continues to use a freed pointer. These still have a pretty good likelihood of exploitability.
We're going to talk about mitigations in a minute, but use-after-frees are still fairly likely to be exploitable, especially in C++ code. These often show up in error handling or other weird corner cases, and especially when it's not clearly laid out which part of the program is responsible for freeing allocated memory. For instance, on a function call, who calls free, the caller or the callee? If you're not really careful with that, you can easily get into a use-after-free situation. Here's a quick pseudocode example of a use-after-free: malloc a 4-byte character buffer, do some amount of work with it, free it, some other stuff happens, and then inadvertently it gets dereferenced again, in a printf, or maybe you log it, or whatever else you're doing. That's a classic use-after-free. Then there are other memory bugs that are typically not as critical but are important to talk about. A double or invalid free is when you free the same pointer twice: you free it once, and then later you free it again. That is hard to exploit, but it's potentially exploitable under race conditions in multithreaded programs. Another kind of memory bug is when a conditional, like an if statement or a case statement, depends on uninitialized memory. If you just say int x and don't assign anything to it, and then you say if x, do something, that's undefined behavior at best, and possibly a hacker could use it in a control-flow attack and try to control the flow of the program. And then finally, one I'm sure you're all very familiar with: memory leaks. That's when the programmer forgets to call free; free never gets called on allocated memory. In that case, memory keeps getting allocated, and that could be a denial-of-service attack: if an attacker finds a mechanism to keep allocating memory that's never freed, they can exhaust memory and bring the program to a halt. So what is exploitability?
I've talked about it a little bit already. At the most basic level, it's reprogramming the application with input data, not code. If we can trick the program into executing attacker-controlled input data as if it were code, that's a code injection exploit. Another way to think about it is that input streams become instruction streams. Or, from the attacker's perspective: can I make your program run my program? If so, it's definitely exploitable. This typically involves controlling the instruction pointer, EIP on 32-bit or RIP on 64-bit. But this kind of attack is getting really hard to do because of some of the mitigations I'm about to talk about. It's rarer and rarer that attackers can get their own code, their own shellcode, into an existing process. More and more attacks are now about code reuse, and that means reprogramming with existing code; it's already in the process. The attacker can't bring his or her own code in, but they can leverage existing code, both from the application itself and, even more so, from all the libraries linked with it. There's enough code there that an attacker can string together existing functionality and reach an exploitable state. That's called return-oriented programming, or ROP, because it relies on manipulating the return pointer. It's also called weird machine programming. Most hacks we're seeing nowadays are some kind of ROP attack. Then, does exploitability matter? Why do we really care? In a way, it doesn't, because if you can find a memory error, it's good to fix it. But certainly there are issues of prioritization. We all have lots of bugs to deal with, and it's very important to know if one of them is critical, or could be exploitable; you obviously want to fix that one sooner.
And if it could be exploited in the wild, if the code with the vulnerability already exists in users' hands, then it's also important to understand that it's exploitable so that you can arrange to push a fix out to the user base. For what I do, security research, it certainly matters too, because you want to understand what you have when you report it. You want to report it as a security bug if it is one, and not if it's not. It's also important for motivating the vendor or the maintainer to fix things, if you can demonstrate some level of exploitability. But then, exploitable by whom? There's a wide range of exploit-dev skill out there. At one end, Google has a group called Project Zero, Google's internal white hat group. Their job is to test any code that Google customers might use, any application; it's their purview to find bugs in it and to demonstrate exploitability. They have about 15 people, very top engineers, some of the top hackers in the world, and they have quite a bit of compute power behind them too, of course, because they're Google and they have Google Compute Engine. They can make fuzz runs as big as they want; they have essentially unlimited resources. Another group with really amazing resources is the NSA, or any other nation-state intelligence agency, and they obviously have even more people and even more resources than Google. And then, who knows who else is out there? Who is trying to do the exploit? What is their motivation? How important is the target? How widely deployed is the application? This is a good time to reiterate: security is never 100%. What we're trying to do is raise the cost of the attack to the attacker. We want the attack to be more expensive than the value of whatever data they're going to get out of it, or whatever nefarious purpose they have.
So that's fundamentally what we're looking at in terms of securing applications. Another important point is that nowadays, most exploits are not built on just one bug. There's been a lot of mitigation, a lot of things have been fixed, and things are getting much better. To really get an exploit nowadays, you're looking at a bug chain: you have to chain together multiple bugs to reach a fully exploitable condition. Given that, it's a little bit moot to talk about whether or not any single bug is exploitable, because even one that's definitely not exploitable on its own could play a critical role in an exploit chain. And sometimes things are surprisingly exploitable, too. There are bugs that you would think are not exploitable, or are very difficult, and yet they are. There was one disclosed recently in a DNS library called c-ares in the Chrome OS operating system, and someone found full remote code execution, a pretty serious bug. When I say someone, it was anonymously reported by some researcher, or more likely a group. What's interesting about this is that it was a buffer overflow on a write, an invalid write, but it was a write of just one byte. It only wrote a single byte past the end of the buffer. Not only that, but the attacker could not control what was written into that byte. A lot of times we can control what gets written in bad writes, and we'll see an example of that later. But in this case, it would only ever be the digit one. And that's all the attacker had. But they were able to chain this together, or actually it wasn't even chaining, it was heap grooming, where they make lots of calls to arrange the heap in a specific manner to make it exploitable. Or that's what I'm speculating; there's really a 37-page report about this bug, and I did not read all 37 pages. It was reported to Red Hat as well, and they rated it as moderate security impact. This was before the exploit was disclosed.
And I would have too; it seems pretty hard to exploit something like this. One final thing: this is what triggered it. If you had an escaped dot at the end of the DNS name, that's what would trigger this bug. And this is the kind of thing that fuzz testing often finds, a weird corner case. OK, so we've talked about memory corruption and exploitability. Here are a few of the mitigations that have made exploitation harder. The first one is stack canaries. A stack canary is a random integer that gets pushed onto the stack in between stack frames. Just like a canary in a coal mine, if the canary is not looking good, something's gone wrong. These canaries sit between each stack frame, and only the OS knows what they are; they're random, not available to the attacker. So if the attacker wants to overflow a buffer from one frame into another, and they typically need to in order to reach the return pointer, they have to write through the stack canary. They don't know that number, so they're going to change it, and the operating system will detect that it changed, realize something's gone wrong, and stop the program. Another mitigation is data execution prevention, which simply marks some regions of memory as non-executable, meaning that if the instruction pointer is ever pointed into them, it's time to stop the program. This is supported at the hardware level by the NX bit in modern Intel CPUs. Now, this is embedded Linux.
I'm sure there are people working on other kinds of hardware, and I'm not even sure if data execution prevention is available on every hardware platform, but it is on generally anything in a laptop. The combination of stack canaries and data execution prevention makes exploiting stack bugs pretty difficult at this point. Another important mitigation is address space layout randomization, ASLR, and this scrambles memory. Just like when you buy a deck of cards it's in order by suit and number, and then you shuffle it and that order is gone. With ASLR, the mapping is maintained: just as we map virtual memory to physical memory, ASLR adds an extra layer of scrambling and mapping over that. And again, the operating system knows that mapping, but an attacker cannot. So even if you control the instruction pointer, you don't know where to jump to; you're jumping to a random place. That's fundamentally how ASLR works. One caveat: it's only really effective on 64-bit systems, because there you have a very large memory space. With 32-bit, you're down to a 4 GB address space, and there are actually attacks on ASLR there that use heap spraying or brute forcing. It takes many, many calls, obviously, but attackers can keep trying until they land the instruction pointer where they want in memory and run the hack. OK, so we've talked about memory corruption bugs and mitigations. Now let's talk about the workflow I use to deal with crashes. The first step is to minimize the crash corpus, which means reducing it down to the important parts, no extraneous information. Then use memory corruption tools to gain information about those crashes. And finally, get down to brass tacks: determine exploitability, or find the root cause of the bug. So, minimization. When you do a fuzz run, the fuzzer is just trying stuff, making up crazy random files and running them through the program.
So there can be a lot of extraneous information in each of those files, and you can wind up with lots of crashes; if something has never been fuzzed before, you can easily see a couple of dozen crashes. The fuzzers do their best to make those crashes unique, to make them represent unique bugs, but that's pretty hard for the fuzzer to do while it's doing the fuzz run. So we need to minimize two things. The first is the corpus of crashes. If you do a fuzz run and you get 20 crashes, you want to make sure those actually represent 20 bugs and not, say, 8 bugs. And fuzzers have tools for this. I haven't mentioned it yet, but I use American Fuzzy Lop, often called AFL. It's an open source fuzzer developed at Google, and it has a suite of tools. One of them minimizes crash corpora; it's called afl-cmin. What it does is reduce the corpus of crashes down to only those that represent distinct bugs; that's the idea. So it can easily reduce 20 crashes down to 8 or 6 or whatever it is. Those are individual crash cases, and to be really specific, a crash case is a file, found by the fuzzer, that you run against the program to make it crash. Once the corpus itself has been minimized, it's important to minimize each case individually. As I mentioned before, a case can have lots of extraneous bytes that are not relevant to the crash, so there are case minimizers too. AFL's is called afl-tmin. It runs the file against the target program over and over again, and it experiments: it tries removing a byte, and if the same crash still happens, that byte never comes back. If it removes a byte and the crash goes away, or even a different crash happens, then the minimizer knows that byte is important to the crash.
By doing that, you can reduce the case down to only those bytes that are relevant to the crash, which makes the debugging much easier. And then quickly, there's one more tool, called fdupes. It's a hash-based system: it takes all the files in a given directory, and this works on any kind of file, and it computes and compares the MD5 hash of each. If two or more files are identical on a byte-for-byte basis, fdupes will eliminate all but one of them. And I've definitely seen cases where this matters. These tools do their best, but they're not perfect: I've run the corpus minimizer, run the test case minimizer, and still found byte-identical files. fdupes is a nice, quick way to get rid of those. OK, then it's time to start doing some analysis, and there are some pretty useful tools to do that with. Before I go into those tools, it's important to remind everybody that all bets are off: once things go south with memory in a C program, you really don't know what's going on. You have to be open to anything, and it can affect the tools. The tools do their best, but it's important not to blindly trust anything that comes out of a computer. The first tool is called AddressSanitizer, and it's found in compilers; GCC and Clang both have it. You enable it with the -fsanitize=address flag on the compile line. It operates at both compile time and runtime to detect memory errors: at compile time, it adds instrumentation to the code, to the binary itself, and at runtime, it replaces malloc with its own runtime library for allocating memory. Between those two mechanisms, it can be very, very accurate at detecting memory errors. This is an example of the output: you can see it found a heap use-after-free, a UAF in the heap. Here's the address, some other information, and here's a stack trace. Now, I didn't compile this with symbols, so we're just getting memory locations here.
But it gives you a lot of other pretty nice information. We had a bad read, a read of size 4. AddressSanitizer also finds invalid reads and writes, any kind of buffer overflow, use-after-frees, double frees, and some other kinds of memory corruption bugs. Another really useful tool is called Valgrind, or Memcheck. Technically, Valgrind is a family of tools, and Memcheck is the one that checks memory; a lot of people use the two terms interchangeably. It's distinct from AddressSanitizer in that there's no need to recompile: you can simply take any binary and run it under Valgrind, and it will give you the information. So that's pretty nice. It does produce a lot more output, and it's not as clean as AddressSanitizer; it doesn't just tell you, hey, this is a UAF, so you have to interpret it a little bit more. Here's an example of the output from Valgrind: we have a bad write up here and a bad read here. Anytime you have a bad write in the heap, you have to consider it exploitable. And in this case, they're close in memory, too; the bad read is here and the bad write is here, not that far apart. That also suggests that maybe the attacker could get something going in there: if you have a bad write, you can write something, and if it later gets read, all bets are off. And then there's one more really useful tool, simply called exploitable. It was developed by CERT and has now been open sourced; it's available on GitHub, maintained by Jonathan Foote. It's an extension to GDB, although you can run it as a standalone script as well. What's really nice about it is that it actually categorizes the crash into various levels of exploitability. It has four categories: exploitable, probably exploitable, probably not exploitable, and unknown, for the cases where it just can't tell. And it gives you a really nice layout of exactly what's going on.
So between these three tools, you can get quite a bit of information for understanding what's really going on. Then comes the next step, which is to actually figure out what's going on: determine exploitability, or find the root cause. We're basically into hardcore debugging now. Before you do that, it's important to disable ASLR. As we just discussed, it's a good mitigation, but if you run the program over and over with ASLR on, you're going to get different memory locations for everything every time, and that makes it essentially impossible to do any kind of triage or debugging. So disable ASLR, and certainly do it behind a firewall or a NAT router or somewhere else where you're safe and not exposed to the internet. On Ubuntu, the file under /proc that controls whether ASLR is on is /proc/sys/kernel/randomize_va_space, and since you can't just edit files under /proc, this is the command to disable ASLR by writing to it. If you do it this way and then reboot the machine, ASLR will be back in place. The next step is to identify critical memory locations. It gets confusing when you're doing this sometimes: you see lots of memory, and there's the code, and where it crashed, and all this other stuff. I like to be very careful about understanding the critical memory locations. These vary with the crash, but certainly: where the crash happened in the code, meaning what instruction was being executed when it crashed, which is a very important point; where the invalid read or write occurred in the data; where the memory in question was allocated and/or freed, which is a code issue, that is, where in the code the allocation happened, and where it was or was not freed, or was freed too many times; and then wherever the data is reassigned, perhaps to a different variable or buffer, or gets copied, that can be important too, in both the code and the data.
Once you've mapped this out, and I like to actually write these down because it does get confusing after a while, it's time to fire up GDB and start digging in. Full GDB usage is outside the scope of today's talk, but the basic idea is that you set a breakpoint where the crash happens and work backwards from there. You certainly want to compile with -g to get the symbols and -O0 for zero optimization; that way you get as much information as possible in terms of variable names and things like that. And pretty much, you run the target with canary values like capital A's, so that you can track them through memory. It's easy, when you dump memory, to see a bunch of 0x41s: that's the hex for capital A, 65 in decimal, its ASCII code. We're pretty happy when we see a bunch of 0x41s where they're not supposed to be, and we use that to track things too. But GDB is a little bit tricky, because you have to work at it a lot. If you set a breakpoint where the crash happens, the idea is that you run the program over and over again, setting breakpoints earlier and earlier and dumping memory to see what's going on. That works, and a lot of times that's what you have to do, but it's time consuming and cumbersome. So I've been using another tool in addition to GDB. It works through GDB, and it's called rr; it does reverse debugging, allowing you to step backwards through the control flow of the program. You run the program under rr, and it records the execution. Once you've done that, you can replay that execution, set a breakpoint, and once you hit the breakpoint, step backwards. Just like you normally step forwards, you can turn back time. For a lot of bugs, that's a very efficient way to figure out what's going on.
Now, a lot of bugs are pretty easy to figure out; off-by-one errors, certainly, are typically fairly easy to find, identify, and obviously fix. But some of this stuff is not. So it's important to take breaks and be persistent; this kind of low-level debugging and memory dumping is hard work. One more thing: once a crash is fixed, that crash might have been masking another crash. It might have stopped execution before some other bug could be discovered by the fuzzer. So it's important to fix those bugs and then rerun the fuzzer against the target in the same manner to try to flush out a few more crashes. OK, for the final section, I'm going to go over a couple of real-world examples that I found in my own research. The first one is PHP. I was fuzzing PHP, and I found a low bad read, low meaning low in the address space. It was actually very low; it was essentially a null pointer dereference, so not really exploitable at all. Here's my report. I was running PHP from the command line, and I was using a dictionary in the fuzzer, so I gave it all the standard commands. The setting involved is in the PHP INI file and is typically not used at the command line, which is part of how I flushed out the crash. So the fuzzer was always using these; I set it up with a dictionary so the fuzzer didn't have to generate these parts of the INI file. But it just happened to put a 1 there, and that made PHP think it was running in the context of a web server when it was not, which triggered this null pointer dereference. You can see here the Valgrind output: bad read of size 4, address 0x10, that's 16 in decimal, is not stack'd, malloc'd or recently free'd. So this is a standard null pointer dereference, no longer exploitable. In the old days you could exploit these, but those days are over. And then this is the fix that the PHP devs put in.
They're simply checking now that PHP is operating in the context of a web server and doing the right thing as appropriate. So then I turned my attention to Ruby and started fuzzing the regular expression compilation in Ruby, and I found a few bugs in it. This is one of the more interesting ones, a pretty nice buffer overflow, and this is my report. It's pretty weird: if you open a character class in a regular expression and you don't close it, and you have an octal number in it after that, that's the weird corner case the fuzzer found. Here's the ASan output: a heap buffer overflow. What's kind of interesting here is that Valgrind gives us even more information about this bug. Valgrind says there's an invalid write of size 4, and then an invalid read later, and they are, again, fairly close in memory. As I mentioned before, anytime you have a bad write in the heap, you really have to consider it exploitable, and I do. So this was really an exploitable bug in Ruby. However, the attack surface for it was very small, because we're talking about a regular expression as the untrusted input, and it's already not a good idea to accept untrusted regular expressions in any application, because you're opening yourself up to, minimally, a denial-of-service attack. There are regular expressions that are essentially regular expression bombs, or that can really exhaust memory. Setting that aside, that's not memory corruption, that's just a DoS attack, but it's why you really shouldn't expose yourself to untrusted regexes to begin with. Here's another reason: this code could have been attackable in Ruby as of eight months ago. It's been fixed. So this was definitely exploitable, but thankfully with a pretty small attack surface. And here's the Ruby devs' fix, where they're dealing with the octal case and all that.
And then I found a bug in a system called Dynomite, which is written by Netflix; I found an invalid write in it. Dynomite is a replication and sharding layer for Redis and Memcached, which are key-value storage systems. Dynomite sits between these storage systems and the internet, and it does either replication, for reliability, or sharding, say dividing the data up by the first letter of the last name. That's what Netflix Dynomite does. It's on the open internet, Netflix is using it, lots of other people are using it too, and it is open source. So I decided to fuzz it and see what happens. Now, what I was fuzzing is kind of a weird thing to fuzz; it was what I call an oblique attack. I wasn't fuzzing it head on. I fuzzed the admin file, which is a YAML-based configuration file in Dynomite. So I hit it with lots of weird versions of this file, and I did get a pretty nice crash. Now, that's kind of a weird thing: OK, so an admin, or a bad admin, can cause a crash. That's still an important bug that has to be fixed, because you could have a malicious or compromised admin, and that's what I thought I had. So I filed it. This is the minimized case that I sent in as a YAML file: six capital A's, just to demonstrate the issue to Netflix. So I ran the program, and down here we got an error, actually from glibc. Sometimes glibc gives you a nice error as well, and it goes to standard error, not standard out. And we've got the crash here: "invalid next size". So in GDB I just dumped that address, and here we have the capital A's. Then I dumped the next address, and there are more capital A's. And it's Intel.
So it's little endian, and these are actually six contiguous writes. And this is attacker-controlled: the attacker could put anything she wants in lieu of those A's. You get six bytes to do whatever you want. So OK, this is a cool bug, but again, it's "only" an admin file. So I filed it. And to my surprise, they didn't fix what I thought they would. I thought it would be somewhere in the YAML code or somewhere else, but it was actually in their string functions. So this is code that gets run a lot, especially in something that handles key-value storage; all it does is handle strings and duplicate them. In fact, Dynomite sits in between, so it is essentially a man in the middle itself, and if it gets compromised, that's a very serious thing. I was pretty shocked when I saw this fix. I didn't think it would be down in the string code. Essentially, they were using a string library, but they were also writing a little of their own code over it; they weren't using it strictly. Here's their diff, and you can see this is a standard off-by-one error. That's how they fixed it. So this got fixed, and I got into the Netflix Hall of Fame, which is a good place to be. And it's good to have it fixed. I was just really surprised about this one. Theoretically, it could have been a pretty critical bug: a hacker could have gotten man in the middle between the internet and Netflix, or a lot of other users. So a pretty serious bug, I would say. And that's going to do it for the examples. I'd like to mention some references. Rensselaer Polytechnic Institute has a great course called Modern Binary Exploitation, which is both on GitHub and on their site, so it's good to search for it. On GitHub they have a bunch of vulnerable apps to practice against, and on their site they have the full course materials.
So you can just take this course on your own, and I did; I learned a great deal from it. They also cover reverse engineering in addition to binary exploitation. And then the bible of this kind of work is called Hacking: The Art of Exploitation, by Jon Erickson. I got the first edition in 2005; there's a second edition now. And Project Zero is a great blog. There are some other really nice references as well, more than I can list. So that's going to do it for now. I really appreciate your attention, and I think we have a little bit of time if there are any questions. Yeah, we're going to use the mic here for the questions. That was a good talk, I liked it. I wanted to ask, have you had any experience with using any kind of fuzzing tool or dynamic tool to try to exploit Java code, or something that runs in Java? Yeah, thank you, great question. So the question is, do we use fuzzing to break Java, or other managed languages? That's the next step, and my company is actually researching that right now; I'm working on it right now. Let me put it this way: a fuzzer fundamentally throws random data at an application and monitors, in this case, for crashes. But a fuzzer could monitor for anything, for any kind of condition. The problem, or the hard work, is specifying those conditions. So I'm trying to think of ways to do that in Java. Obviously you could do it by hand, but that's sort of like writing unit tests, so I'm trying to work out ways to automate it. The answer for now is that there's really not a whole lot that fuzzers offer out of the box for managed code, but we're working on it, looking for ways to do that. Anybody else? Yeah, OK, cool. It's like one of those talk shows or something. Thank you. Does your workflow change at all for multi-threaded applications? Oh yeah, good question.
Yeah, so does the workflow change for multi-threaded applications? Not mine, but then I don't really do that: when I'm fuzzing, I'm usually fuzzing in a single-threaded context, and I typically don't fuzz multi-threaded code at this point, or I haven't. Now you could, but it's obviously harder for a lot of reasons. Actually, the fuzzer I use is itself single-threaded, though you can certainly spin up multiple instances of it to fuzz at scale. But in terms of specifically fuzzing to try to find race conditions and things like that, I've never done that. It's a pretty interesting idea. Anybody else? All right, thank you guys very much. Really appreciate your attention. Hope you have a great rest of your day. Thank you.