Okay, Hanno is a freelance journalist and security researcher. Sometimes he tricks Symantec into getting their math wrong with regards to TLS keys, and otherwise he just throws random crap at free and open source software as part of the Fuzzing Project. So please welcome Hanno.

Yeah, hello. So today I want to present some methods to you for how you can improve your software, if you write software, and introduce the little project I have been running for a few years. I run the Fuzzing Project, which is also supported by the Linux Foundation's Core Infrastructure Initiative. I generally try to improve the security of free and open source software there. And I occasionally find these bugs where I think: these bugs really shouldn't happen.

So this was a bug in KeePassXC, which is like the standard free software password manager people use these days. There was a string of 48 bytes, and it was stored in that variable. Does anyone see anything wrong with that? Huh? Zero terminator, I heard it. Yeah. The thing is, if you have a string in C, it's zero-terminated. It means it has a zero byte at the end, and this zero byte has to be stored somewhere. So if you have a string of 48 bytes, then your buffer needs to be 49 bytes. This code was part of KeePassXC. As I said, it was executed when you started it, and it was copied over from a project called zxcvbn-c. That's an algorithm to judge the quality of a password. It's been ported to various languages, and this is the C version of it. So the situation we have here is a buffer overflow that gets triggered right at the start of an application. It comes from a security tool, and then it gets copied over into the code of another security tool. That's a bit worrying.
I mean, this is probably not exploitable in any reasonable way, because there's no attacker-controlled input or anything, but it's clearly a bug, and it's concerning that these bugs happen. So why did nobody notice that?

Another example: Samba. You probably all know Samba. It's kind of the Linux implementation of the Windows network file system protocol. And you probably heard of the Shadow Brokers, that group of unknown origin that dropped a bunch of exploits a while ago. They are presumably from the NSA. We don't know for sure, but that's what everyone assumes. Most of these exploits were against the SMB protocol, the Windows network file system. And I had the idea: if you run these exploits against Samba, the open source implementation, maybe that also triggers bugs. The exploits probably won't work, but they may trigger a bug. That did not happen, I didn't find any bugs triggered by these exploits, but I found a couple of other bugs.

This one here may seem a bit odd if you look at it. You have two structs here, and for each you pass a pointer to another struct and its size. But you have key twice, while on the right side you see key and rec. So something is weird here. What was really meant is that the second line should also use rec and the size of rec.

And then this is a very classic bug. This was a function accessing a string at position minus one and checking if that's a newline. But the string could be a string of zero size, and if you have a string of zero size and then access index minus one, you end up in invalid memory.

So of these bugs, two I found by running make check, the test suite of the application, and one I found by just running Samba and trying to access it. So how did I do that, and why did nobody else find these bugs?
Because clearly they were triggered just by the test suite or by running the application and accessing it. What I'm using here is a tool called Address Sanitizer, which is a feature that is part of the compiler. It's available both in GCC and in Clang, the two usually used free and open source compilers, and you can activate it with a compiler flag.

Here's a very simple example, a textbook buffer overflow. We're defining an array with three elements, and then we're writing to element three. Given that we start counting from zero, the elements zero, one and two are valid, and element three is invalid. I have this code here as an example. There it was. So what do you think happens if we just start this? Anyone? Yeah. It will just do what we expect: it prints the five, because apparently at this point it can write to that memory, nothing crashes. So we won't notice that bug. We have a buffer overflow, but we don't notice it.

Now if we add this flag, -fsanitize=address, and start it again, we get a crash and a really nice error message. And if we add -g, which adds debug information, we even get line numbers. So it says: ERROR: AddressSanitizer: stack-buffer-overflow, WRITE of size 4, in line four. That's very nice. Now we get very good information about what's going wrong and that we have a buffer overflow here.

This is another example, a classic use-after-free bug. With use after free, it's kind of unpredictable what happens if you just run the code. What we're doing here is: we allocate some memory, then we copy a string into it, and then we free that memory. You're shaking your head? There's a quotation mark missing for "last test"? I fixed that, but I didn't press reload. Sorry.
And then, after freeing it, we're trying to print it. If you compile that normally, nothing happens. But if we compile it with Address Sanitizer and debug information again, we again get a very nice error message. For use-after-free bugs... oh, it doesn't detect it as a use after free here, it thinks it's a buffer overflow. It's not working perfectly, but you see that there's a bug here.

So Address Sanitizer is a very powerful tool to find bugs in your C or C++ code. And I'll just say: if you're responsible for developing or maintaining any kind of C code, there's just no excuse not to test it with Address Sanitizer. And if I see that security tools, or something like Samba, which is very exposed to attacks, have apparently never been tested with this tool, which is freely available and really simple to use, I don't get that. So I really try to tell people: hey, this is a very powerful tool to improve the quality of your code, please test your code with it.

To get back to the example from the beginning with KeePassXC: I compiled that beforehand, because if I compiled it during the talk you would have to wait several minutes, which is not very interesting. But if I just try to start it, I get this very long stack trace, and it tells me: WRITE of size 1. So that was the buffer overflow we saw right at the beginning.

So when Address Sanitizer finds all these bugs, you could wonder: okay, what happens if we try to build a whole system with it? Just a whole Linux system. I mean, we have all the code, we could do that. So what I did was create a Gentoo system. With Gentoo it's kind of convenient, because it's a system that you compile yourself anyway. It wasn't that easy, because you have dependency issues and you need to get the right order in which you compile the packages. But eventually it worked.
And I was able to run a full system compiled with Address Sanitizer. It ends up getting very slow, and it needs a lot of RAM because it has a huge memory footprint, but it works. And this is just a list of applications where I found bugs simply by compiling them and running them with Address Sanitizer. At the beginning I had to fix some of those bugs just to be able to use the system at all, because Bash was crashing all the time, which is not very usable. But eventually a lot of bugs got fixed in a lot of important packages.

Address Sanitizer is part of a whole suite of different sanitizer features. They are developed mostly by people from Google, usually in Clang, and some of them are then ported to GCC. Address Sanitizer is definitely the most powerful one: it's very easy to use and it finds bugs with high impact. But the others are worth looking at too.

There's Undefined Behavior Sanitizer. If you follow discussions about C, there are a lot of situations where the C standard says: if you do this, that's undefined, and after that you can have no expectation that your code does anything correct. A typical thing is a signed integer overflow. If you're overflowing an integer and, for example, checking afterwards whether the integer has overflowed, that does not work, because the compiler can say: a signed integer overflow can never happen, so we can just optimize that check out. Another thing is invalid shifts. For shift operations there are a lot of rules: you cannot shift by a negative value, and you cannot shift a negative value. Undefined Behavior Sanitizer finds these things.

Then there's Memory Sanitizer, which finds uses of uninitialized memory. Typically you're defining a variable and then reading it before you've written anything into it, things like that.
The problem with Memory Sanitizer is that it's a bit more tricky to use, because you not only need to compile your application with it but also all the libraries, and if it's C++, that includes the C++ standard library. So using it is a bit annoying: it works for small applications, but for bigger applications it gets really tricky. Google has built Chrome with it, so it is possible to build huge applications with it, but you have this dependency issue. It's not as straightforward as Address Sanitizer. And then there's also Thread Sanitizer, which finds concurrency issues, race conditions and things like that, which is mostly interesting for more complex applications.

Okay. This is a security advisory for tcpdump, and there's a large number of CVEs: 41, if I'm not mistaken. More than half of them were reported by me. And when you see something like that, it usually means someone was using fuzzing, because you find a large number of bugs with that. So let's talk a bit about fuzzing.

The basic idea of fuzzing is that you're testing software with invalid inputs. You have a fuzzer that takes some example input, adds random errors to it, hundreds of times a second, and tests the application again and again and again, and sees if it crashes at some point. And if you have a crash, there's a high likelihood of some kind of memory corruption, so it might very well be a security bug.

Traditionally, the very first idea of fuzzing was dumb fuzzing, which just means you take a valid input and add random errors to it. That's already quite effective, but you don't find the more complex bugs with it. And then, in the past, many people wrote fuzzing tools that were specific to a certain file format or a certain protocol.
That is more effective, but the problem is that it doesn't really scale, because you need a specific fuzzing tool for basically every application, or at least every format you're testing. And then there's been a new development in the past couple of years, which is so-called coverage-based fuzzing. There the fuzzer itself gets kind of smart, but you don't have to do anything for it. These are fuzzing tools that detect which code paths are triggered within an application, and that gives you feedback about which inputs are interesting. So if I have, I don't know, a JPEG file and I'm fuzzing a JPEG parser, and changing this byte to something triggers a new code path within my executable, that means this is an interesting input. It may trigger some unusual feature, and the fuzzer can then use that input as a starting point for further fuzzing. That has turned out to be very effective.

The tool that first introduced this technique is called american fuzzy lop, or AFL. Here's a screenshot of it. It shows you lots of information; the most interesting part is the upper right, which tells you how many crashes it has already found. It's relatively straightforward to use. The one thing you need to consider is that you need to recompile your application: because AFL needs this feedback mechanism for the code paths, it has to add special instrumentation to the code. So you have a compiler wrapper, afl-gcc or afl-clang, you recompile your application, and then you run the fuzzer. It's really straightforward, not a lot of work, and I think that's also why it's quite a popular tool: you get started with it really fast.

So I would say AFL has revolutionized fuzzing. Occasionally you have debates with people who have ideas about fuzzing that are, let's say, outdated, because fuzzing has been around for decades.
But what has happened in the past couple of years is that we now have tools that are so much more powerful that it doesn't really make sense to use the methods of the past anymore. You should use these modern fuzzing tools.

AFL, I tend to say, has found bugs in basically every major piece of software out there. I'm just naming a few. That should say OpenSSL, so there's another error on the slide. Plenty of bugs in OpenSSL, in Apache, in libjpeg and libpng, the major image libraries, in SQLite, GnuPG, Bash, whatever. These are just prominent examples; if you go to the web page, you can find a list of where it has already found bugs. It's extremely effective.

I talked earlier about Address Sanitizer, and you can use both together, because Address Sanitizer enables the detection of additional bugs that don't necessarily crash an application, particularly things like buffer overreads. Usually if you read past the bounds of a buffer, it doesn't crash most of the time, because there is still valid memory beyond the buffer that you can read. Combining the fuzzer with Address Sanitizer will detect those. It slows things down, but it increases the number of bugs you find massively.

There's another tool called libFuzzer, which is based on the same basic idea: coverage-based fuzzing with code path detection. The difference is that AFL tests executables and libFuzzer tests functions. Here's an example. What you do is write a test function which takes a buffer and a size, and then you pass that to the function you really want to test. In this case, the function expects something that is zero-terminated, so I allocate a buffer that is one byte larger, add a zero at the end, and copy the input buffer into my temporary buffer.
The function here, ares_create_query, belongs to c-ares, which is a DNS library. And I have a demo here. I have already compiled that library with some specific CFLAGS and with Clang; libFuzzer only works with Clang, it does not work with GCC. Now I'm compiling the fuzzing harness. This file is the code I just showed you. Then I link in the static version of the library I just compiled, then I link in libFuzzer itself. Then, okay, it needs the include file, which is in the current directory, that's trivial, and it needs the thread library. And then I add Address Sanitizer again, because I want to find more bugs, and I add this flag, -fsanitize-coverage, which gives the fuzzer the capability to detect the code paths. And I add -g, which adds debug information so I get better error messages.

If I run this, and this is an older version of that library, you see that basically in an instant I get a crash. And this is a real bug. It was used in an exploit chain against Chrome OS, a really complex exploit, but this was the initial attack vector. So if anyone had fuzzed this function with libFuzzer before, they would have found that bug. I kind of retroactively created this fuzzing harness after I knew there was a bug in that function, but I think it shows how powerful this is. And I didn't even give it any starting input here; I basically started the fuzzer without a corpus, so it starts fuzzing with random bytes. So we have a real bug with real, severe impact, and we can find it very easily.

The advantage of libFuzzer over AFL is that it's much faster, because we're calling functions, and a function call is faster than calling an executable; loading an executable has a lot of overhead. The disadvantage is that it's more work, because you usually have to write some code. I mean, you saw it's not a lot of code.
But it takes longer. If I start to fuzz something with AFL, it takes me maybe five minutes to get started. If I want to fuzz something with libFuzzer, it's more like half an hour. It's not dramatic, but it's definitely a higher barrier.

So you probably all remember this, right? The Heartbleed bug, which was a very severe bug in OpenSSL. I decided to make a little experiment: I wanted to know, could we find a bug like Heartbleed with fuzzing? I have to say something about this first: Heartbleed was basically found with fuzzing. It was found by two people independently, and one of them was using a fuzzer. But that was a specialized fuzzer for the TLS protocol, and it's commercial, so it's not publicly available, it's not free software. I cannot look at it and I cannot easily use it.

So I did an experiment where I tried to fuzz the handshake of OpenSSL. I created a little application that would basically do a handshake with itself, and while doing that, write out all the handshake messages into files. Then I added functionality so I could swap out one of these handshake messages with one I passed on the command line. With that I was able to run a fuzzer like AFL against it, in combination with Address Sanitizer. And after six hours it found Heartbleed.

Now, it's always a bit tricky to say you retroactively found something; that's easy. But I would say, and you can read my blog post about it, that I didn't use any specific knowledge about the Heartbleed bug. It was really just fuzzing the handshake in the straightforward way you would expect if you wanted to find bugs in a TLS stack. Also interesting: I didn't know libFuzzer back then, but later I got an email from its developer, who said he tried to recreate my experiment with libFuzzer, and it's much faster: it can do it in five minutes.
So that shows how much faster it is. Okay, differential fuzz testing. What I told you until now was mostly about memory safety issues, the typical bugs you have in C code: buffer overflows, buffer overreads. But there are also very different kinds of bugs where you can use fuzzing. Differential fuzz testing means that we fuzz something and compare the output of different implementations.

When we look at crypto: crypto is ultimately just math, right? We're doing calculations with keys, which are in the end just numbers. But a question we can ask is: is the math always correct, and what if there's a bug? Because bugs in the crypto can be pretty devastating.

One example is the so-called RSA-CRT bug. If you're doing an RSA signature, what most real-world implementations do is an optimization where they split one big, costly calculation into two smaller calculations. I cannot go into the details, but what happens is: if one of these calculations has an error in it, and it doesn't matter what kind of error, then the signature you get reveals your private key. And that is pretty devastating. There was a very nice paper by Florian Weimer, who works at Red Hat, where he just connected to servers on the internet and checked whether they gave him signatures with this bug, so he could get their private key. And he found a couple of hundred.

So it's very important that the math we're using for our cryptographic algorithms is correct. One thing I did was to do a calculation with input from a fuzzer and then compare the output of two different implementations. For example, you take OpenSSL and you take libgcrypt and you do, say, a modular exponentiation, which is the basic operation for RSA, Diffie-Hellman and many other algorithms. And then you see if the results match.
Because it's math, there should be only one result for an exponentiation; there's no ambiguity here. So if we get a different result, that's definitely a bug in one of the two implementations. And that turned out to be pretty successful. I found a bug in the modular exponentiation of OpenSSL. I found one in NSS. I found several bugs in the elliptic curve operations of Nettle, which is the library used by GnuTLS. I found a bug in the Poly1305 authenticator in OpenSSL before it was officially released, so it got fixed before the release, which is always nice. And in MatrixSSL. With the MatrixSSL bugs, I very much suspect that some of the keys Florian Weimer was able to extract from devices were due to this bug. So this shows that you can also use fuzzing to find completely different classes of bugs.

Now I want to talk a bit about why all this matters. We heard earlier that many of you are using Linux. That's great, because it's free software; it gives you the freedom to change it and so on. But how secure is it? We actually have some pretty scary attack surfaces on typical Linux desktops.

There was a very interesting exploit a while ago. There are a couple of browsers that automatically download files into your downloads directory. Chrome does that, but also Epiphany, the standard GNOME browser. Basically that means a web page can create files on your hard disk, which may be concerning. Then we have desktop search tools that automatically index all those files: GNOME has something called Tracker, and KDE has something called Baloo. And these search indexing tools use a lot of code to index your files, for example to extract metadata or generate a thumbnail or whatever. If you can find an exploitable bug in one of those parsers, you end up in a situation where a web page can exploit code running on your Linux machine.
And these tools use many parsers that are not very well tested for security issues. Chris Evans, a security researcher, found a bug where he could use a parser for Nintendo sound files, which is more or less an emulator of that sound chip, in GStreamer to exploit a Linux desktop from a web page. So: the web page downloads the file, the file gets indexed by Tracker, Tracker passes it to GStreamer, GStreamer passes it to this Nintendo sound file parser, and that could be exploited. We're exposing all this code, which is very often not very secure, to files from the internet. That's a bit concerning. That basically sums it up. I've given a whole talk about this at FOSDEM; if you want to look it up, it should be online on YouTube.

GNOME, in reaction to this, has now sandboxed the desktop search and also the thumbnailer, because you have a very similar issue with the thumbnailer: if you open a file manager, it will automatically create a small image of each file, and that also exposes a lot of potentially insecure code to untrusted input. KDE has not done anything in that direction, so if you're using KDE, you should maybe be a bit concerned about this. Also in reaction to this, I started fuzzing GStreamer, and it led to around 20 bugs being fixed. But there's a lot more code that is exposed through these mechanisms.

To come to a conclusion: I know a lot of people say we should no longer use C at all, we should rewrite everything in Rust, or Go, or Haskell. But realistically, there's still a lot of C code that we're using. Your operating system is probably written in C, your browser is written in C, and most other applications also use C. With a combination of these sanitizer features and fuzzing, we're able to discover a lot of these typical C bugs. And we should do that. The tools are free, so please use them. Yeah, that was it. Thanks.
So if there are any questions, we have two mics open in the middle. Go there.

Hi Hanno, thanks for the talk, it was very nice and entertaining. I have no questions, I would just like to thank you for all the work you've done, because in my personal opinion you make the software we all use every fucking day way better and safer. So thank you from my heart.

But you can do that too.

Hi, I had a question. You said you found the Heartbleed bug using fuzzing. But as I understand it, Heartbleed was one that leaked information without causing any sort of crashes. How do you find that sort of bug?

That is where Address Sanitizer comes in. Heartbleed was a classic buffer overread: you had a length and a message, and you could give a longer length than the message, and then it would just read random memory. Address Sanitizer detects exactly these things. If you read past the bounds of a buffer, Address Sanitizer will crash your application.

Of course. Good answer. Thank you.

In the front. Hi. So you said that you recompiled Gentoo using ASan, the Address Sanitizer, right? I actually did the same, but when I tried to use wget I already found a bug, so I gave up on that part. I'm happy that you went down that path and found all those bugs. But have you, because Gentoo also has a way to test each package after you build it, so maybe you could also run the test suites and not just try to use the system after you compile it?

I actually did that. I didn't go through everything it found and report all of it, because it was just too much, but I went through the relevant packages and did that.

Thank you. In the back. Thank you for that awesome talk. I've never used the sanitizers, but I've used Valgrind. How do they compare?

Yeah, okay, that's actually a good question. I think Valgrind is mostly deprecated these days. It's an awesome tool for what it does.
But Address Sanitizer covers most of the things that Valgrind covers, it additionally finds bugs that Valgrind by design cannot find, because Address Sanitizer instruments the code at compile time while Valgrind works on the unmodified binary, and it's much faster. The only thing Valgrind finds that Address Sanitizer doesn't find is uninitialized memory. For that there's Memory Sanitizer, but Memory Sanitizer is a bit more tricky to use, so there Valgrind may still have a use case. It's actually something I'm trying to preach a bit: Address Sanitizer is kind of like Valgrind, but better. And if you're only using Valgrind these days, I'd say you're doing it wrong.

Hi, thanks for your talk. What do you think about sanitizers as security features?

Yeah, that's also a good question. I thought that was a good idea, but I was corrected. I had the idea to say: okay, Address Sanitizer has a huge overhead, but maybe you can afford it for a high-security system. The problem is that Address Sanitizer basically disables ASLR and a bunch of other security features. You can even use Address Sanitizer's crash reporting to get root if you have a setuid binary compiled with it. So there are a bunch of issues with it. There was a very detailed post on oss-security where someone detailed all these issues. It's basically not designed to be a security feature; it's a debugging feature for developers to find bugs.

And do you see some way to make it more of a security feature? Maybe modify it somehow?

That may be possible, but I don't have a good answer to that. One would have to go into the details of how it works and check. The thing is, there are some new exploit mitigations available that are not widely used, and with exploit mitigations, people only use them if they have a very small overhead. I mean, it took us more than ten years to convince Linux distributions to enable ASLR.
And you won't sell them a feature that has 50% runtime overhead and even more in memory use. So I would not be overly optimistic. Maybe it's just not the right tool for exploit mitigation.

Thank you. In the front. What about complex systems like NVIDIA CUDA, if I want to find out whether there are bugs or security vulnerabilities in it? Because the CUDA library and the NVIDIA driver are closed source, the NVIDIA GPU and its firmware are also closed, and there are boundaries between these domains.

I have no idea, sorry. You're basically going to a completely different architecture with a different system design. I don't think the things I presented here apply to that. It's just a completely different domain.

Are there more questions? Three, two, one, no. Okay, then thank you, Hanno.