Hello, my name is Dmitry Vyukov. I work as a software engineer at Google in the dynamic tools team, and I'm going to present "syzbot and the Tale of a Thousand Kernel Bugs". First of all, I would like to ask: how many of you know about syzbot and syzkaller? OK, so roughly half, maybe. How many of you love syzbot? OK, about 10 people. And how many of you hate syzbot? OK, two, three. And how many of you both love and hate syzbot? OK, a few people; I would have said that would be the majority.

So first I'm going to talk about the kernel bug disaster. Then I'm going to talk about what we're trying to do about it. And then I will talk about what we're not doing yet and where we need help.

As of today, civilization runs on Linux. We have 2 billion Android users. We have cloud services, supercomputers, desktops, notebooks, also cars, plants, the space station, and last but not least, our coffee machines. And security is critical. Linux protects the privacy of more than 2 billion users and protects corporate and government information. It protects safety-critical systems. And it's the first line of defense for all incoming network packets, for untrusted applications, for virtual machine guests, and also for USB, NFC, and Bluetooth traffic. Also, for things like cars, phones, and plants, stability and safety are critical too. So you can say that the Linux kernel is one of the most security-critical components in the world today, or maybe the most.

There are all those bugs with logos and bold headlines, and they produce lots of noise. People start running and screaming, and then we fix them, and then the number of known bugs with logos is zero, so we're kind of good. But that's only the tip of the iceberg, because the kernel has bugs, and it has lots of bugs. And as we know, bugs are the source of security issues. Last year, there were 450 CVEs registered against the kernel; some of them were classified as code execution, some as gaining privileges, but lots of bugs are unaccounted for here. And last year, there were more than 4,000 official bug fixes. I think they count them by Fixes tags, so again, lots are unaccounted for here.

About a year ago, we deployed a system called syzbot, which is a continuous kernel fuzzing system, and for the past 12 months it has reported about 200 bugs per month. It has reported more than 1,000 bugs in the upstream kernel by now, and 1,200 bugs in Android, Chrome OS, and some of our internal kernels. Before that, we also used the fuzzer manually and reported more than 1,000 bugs over the previous two years. So in total, we have reported more than 3,000 bugs by now.

To give you one example: some time ago we started testing the USB stack from the external side, so from the side of the USB cable connected to the machine. And by barely scratching the surface, we found more than 80 bugs. Those were all kinds of bad bugs, including use-after-frees, out-of-bounds accesses, double frees, and all of them are triggerable by just any cable that you connect to the machine. And we didn't even get past the handshakes, so we didn't actually test the main driver code, because we just ran out of time and had to switch to something else. So I'm sure there are two, three, five hundred bugs, actually more, that we didn't find yet. And USB is not special; this flow of bugs is representative of just about any subsystem that we start testing: KVM, TCP, UDP, sound, 9P, BPF, you name it.
Here you can see a snippet from the syzbot dashboard. We now have more than 200 open bugs, including things like use-after-frees and out-of-bounds accesses. They still happen, they have reproducers, they were reported a while ago, and they are still unfixed. So that's what's currently present in the kernel. Of course, we're getting not just use-after-frees and out-of-bounds accesses, we're getting a whole mix of bugs: besides those, things like BUGs, WARNINGs, null derefs, uninitialized memory, deadlocks, hangs, and all the other stuff. But the modest estimate is that we have reported at least 500 security bugs, and this is not counting things like local denial of service. And very few of those bugs have CVEs.

It's important to note that exploitable doesn't necessarily mean a use-after-free. We've seen a case where a machine was just unresponsive, but after debugging it turned out to be a full guest-to-host escape, because there was a guest-triggerable page reference leak. We've seen a warning that turned out to be an inter-process, inter-VM information leak, because it was a warning about restoring registers on context switch. And we've seen stalls that turned out to be remote denial-of-service bugs.

But that's not all. I looked at the number of backports in the stable releases, should I say "stable", and for the latest active ones like 4.4 and 4.9 we have almost 10,000 backports there. Not all of them are bug fixes, but I sampled maybe 100 of them and got the impression that more than 95% of them are actually bug fixes, so almost all of them. On top of this, we also have some fixes that are already upstream but not backported yet. This happens frequently because there's just no process; we know that there are at least 700 of them, but most likely many more. There are also bugs that have already been found upstream but just not fixed yet, also hundreds, and obviously lots of bugs that we didn't find yet for various reasons. Based on that, I conclude that every stable release that we produce actually contains more than 20,000 bugs. And it's not getting better over time; it's not that we can fix this massive amount of bugs and then the code is much better and we don't have them anymore. If anything, it only gets worse. And no, this is not normal.

But the state of upstream, and even of stable, doesn't matter in the end, because nobody uses upstream; people have their own forks of the kernel, and that's what they use. So it's the state of those, let's say, distros that actually matters in practice. And distro people say that they simply can't keep up with this flow of changes. And CVEs are filed for very few of those bugs. So the stable process is not fully working, and the CVE process is not working. If you ask why, here's another view of the stable releases: the number of backports per month, and to make it more apples to apples, split into the first year of a release's life and the second year. You can see that for 4.9 it was 400 per month the first year and then 500 per month the next year, and for the latest one, 4.14, it's now at about 700 per month for the past nine months. This is just a huge amount, right? It's like 22 per day, each day; no, we can't keep up. So people can't keep up with this. And so people don't use upstream, right? Everybody forks the kernel, and there are lots of those forks out there.
And each forked bug is effectively a new bug for most practical purposes, because that's a separate code base maintained by separate people, with a separate process and separate testing, so a bug can be fixed in one but not in another. Say, for Google, if upstream contains 20,000 bugs, that already makes it hundreds of thousands of bugs that we need to deal with, and if you look at this industry-wide, it already makes millions of bugs that people need to deal with. Being on stable helps a lot, but it's still a huge stream of bugs, and you obviously don't want something like continuous deployment to the space station, right? You still need to do some testing. And bugs are being backported to stable too, at a significant rate, which is kind of expected, because bugs are being introduced at a high rate and testing is weak, so bug fixes contain bugs, and those bugs are being backported to stable. There are also stable-specific bugs, like missed backports or things that are just slightly different. And obviously there are some distributions that have a large number of custom patches, and for them any backporting is pain and work, right? And there are also things like board support packages, which frequently are basically not updated at all after they've been released.

This kind of makes me sad; it doesn't look like the way things should be for a system that is as fundamental and as security-critical as the Linux kernel. I'd say that we need to reduce the number of bugs per release by 100x, so not just an order of magnitude but two orders of magnitude, to get something like 200, which is much more reasonable to deal with.

There are some defenses, which is great, but the existing defenses are not enough to protect from that many bugs. There is attack surface reduction, which is great, but a large surface is still open and most subsystems are still relevant. For example, USB is not relevant for servers but is relevant for clients, and namespaces are the other way around; so in the end we still care about all of them. We have some mitigations like stack protector and refcount hardening, but they simply can't mitigate hundreds of arbitrary memory corruptions, because usually it's assumed that there are a few bugs, and then mitigations can maybe help with those. They also don't mitigate lots of types of bugs, like races and uninitialized memory, and if there is a more or less write-what-where primitive, then it's just game over. CFI is also not completely effective in the kernel, because for some function types there is a very large number of functions with the same signature; for example, for the read callback there are about 4,000 functions with this signature, and any of them can be called (a sketch of that signature follows this paragraph). And some mitigations are also not backported or not enabled in some kernels, because, obviously, performance.

Then there are things like SELinux, namespaces, or fs-verity, but this is just logical protection, so it directly assumes that the kernel is not buggy, which is not true. In particular, user namespaces open an even larger attack surface: they open up lots of things that were historically root-only and contain lots of bugs. So today it's even unclear whether they are a win or not with respect to security.
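For reference, here is a sketch of the shared prototype behind the CFI point above; it is based on the read callback in the kernel's struct file_operations, which is the example mentioned in the talk.

```c
#include <linux/fs.h>

/*
 * The read callback in struct file_operations. Under a type-based CFI
 * scheme, an indirect call through ->read may legitimately target any
 * function in the kernel with this exact prototype -- on the order of
 * 4,000 of them -- so CFI narrows the target set far less than hoped.
 */
ssize_t (*read)(struct file *file, char __user *buf, size_t count, loff_t *pos);
```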
And there's the thing called hiding buggy code behind root. It can help to some degree, but things like SELinux, IMA/EVM, and module signing significantly restrict root, so root simply can't, say, load arbitrary code into the kernel anymore, and, for example, on Android root is just explicitly not a trusted entity. So when people get execution in a process with root privileges, they still go for a kernel exploit. And end users still need to do what they need to do: they need to mount an image, and it's protected behind root, so what do they do? They just say sudo. They start saying sudo left and right and still exercise this functionality. So the existing defenses can't save us on the security front from that many bugs in the kernel.

OK, that's it for the sad part, and now the not-so-sad part: what we're trying to do about this. What we're doing is only part of the solution; the problem is significantly messier than that, so it's not that there's just one magic thing we can do to magically solve it. We have several bug detection tools, called KASAN, KMSAN, and KTSAN. We also have a bug discovery tool called syzkaller, which is a system call fuzzer, and we have a system called syzbot, which is an automated systematic testing solution.

KASAN, or KernelAddressSanitizer, is kind of our security workhorse, both in the kernel and in user space. It detects bugs like use-after-frees and out-of-bounds accesses on the heap, stack, and globals. It detects bugs right at the point of occurrence. It provides informative reports: for example, for a use-after-free it shows the stack where the bad access happens, where the heap block was allocated, and where it was freed. It's easy to use: you just enable CONFIG_KASAN. The tool is based on compiler instrumentation, and you need at least GCC 4.9, or Clang. It's also reasonably fast and has a reasonable memory overhead of about 2x, but that's only for the kernel part of the workload, and the kernel usually takes a small share of the overall CPU time, so in practice it may be close to unnoticeable. It has been upstream in the kernel since 4.3.

The next tool is KMSAN, or KernelMemorySanitizer. It detects uses of uninitialized values in the kernel, and in the context of security this means in particular things like information leaks, both local and remote. Those are very easy to exploit, way easier than all that speculative stuff, and they give you a much faster leak channel from the kernel. It can also lead to control flow subversion, when uninitialized values are used in control flow. And it can also lead to attacks on data, for example if we have an uninitialized user ID, and we have actually seen such bugs. It's not upstreamed yet, it's on GitHub, and at this point it's almost ready: it already works, it's mostly enabled on syzbot and finds bugs; it has found more than 50 bugs upstream so far. But we are currently fighting the long tail of various false positives and crashes, because the tool is quite complex.

And the last tool is KTSAN, or KernelThreadSanitizer, and it detects data races. Oh, and I forgot to mention that KMSAN also requires Clang; the rest of the tools were ported to GCC, but this one will not be ported to GCC. OK, KernelThreadSanitizer. It finds data races, and kernel data races also represent a security threat.
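Before moving on, here is a concrete illustration of the kind of bug KASAN catches; this is a minimal hypothetical sketch, not code from the talk. With CONFIG_KASAN=y, the access below is reported as a use-after-free, together with the allocation and free stacks.

```c
#include <linux/slab.h>
#include <linux/errno.h>

static int kasan_demo(void)
{
	char *p = kmalloc(64, GFP_KERNEL);

	if (!p)
		return -ENOMEM;
	kfree(p);
	/*
	 * Use-after-free: KASAN flags this read at the point of occurrence
	 * and prints this stack plus the kmalloc() and kfree() stacks.
	 */
	return p[0];
}
```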
There is that typical time-of-check-to-time-of-use pattern: you load something, check it, and then use the value later assuming it's still the same, but another thread could have changed it in between. And again, sometimes we see data races on, say, credentials, and also lots of use-after-frees and double frees in the kernel are actually caused by data races. We have a prototype on GitHub. It was done by an intern, and it is currently frozen due to lack of resources. It found about 20 data races. The main obstacle to deployment is that the kernel is full of so-called benign data races, which strictly speaking are undefined behavior in C, but historically the kernel has been sloppy about this, so there are lots of unmarked accesses. To deploy it, we would need to get rid of all of them.

syzkaller is a system call fuzzer. It's grammar-based, coverage-guided, and mostly unsupervised. It also supports multiple operating systems, architectures, and machine types. In the first session there was a question about Fuchsia testing; we also ported it to Fuchsia, and Fuchsia runs on syzbot. There are also gVisor and Akaros, and all of the BSD flavors are also mostly supported by syzkaller. Compared to other fuzzers, it tends to find deeper bugs, it usually provides reproducers, it does decent regression testing, and it scales to a large number of machines.

I've said that it's grammar-based, and what this means in practice is that we have declarative descriptions of system call interfaces. You can see an example on the slide. It looks mostly like C function and struct declarations, so hopefully you can read it; they just carry more semantic information for argument types and fields. Those descriptions help to generate a much better workload, but the fuzzer tests only what has been described. It doesn't just magically test all of the kernel; it tests only the things for which it has such descriptions. And from those descriptions we generate programs in the form you can see on the slide. This also looks mostly like a C program, just a sequence of system calls with actual arguments, and that's what we actually execute, mutate, and store.

On top of this we have syzbot, which is a fuzzing automation system. It does continuous kernel and syzkaller builds and updates, so it always uses the latest version of both. It does test machine management, bug deduplication and localization, and in the end it automatically reports bugs to the kernel mailing lists and then tracks bug status, so it can understand when a bug is fixed.

OK, so that's what we're doing, but we also need your help, simply because there are too many developers and too many bugs, and we can't handle all of this ourselves. First of all, we need more of the system call descriptions, because the current coverage is far from complete, and the more descriptions we add, the linearly more bugs we discover and fix. We also have poor environment setup for some things like network devices, SELinux policies, and some others. The problem is that there are hundreds of subsystems in the kernel, lots of them are quite complex, and we are not experts in any of them.
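Coming back to the data-race point from earlier for a moment, here is a minimal sketch of the check-then-use pattern described above; the code and field names are hypothetical, not from the talk.

```c
#include <linux/errno.h>
#include <linux/types.h>

/* Hypothetical shared state, resized concurrently by another thread. */
static unsigned int item_len;
static u8 *item_buf;

static int demo_read_byte(u8 *dst, unsigned int off)
{
	if (off >= item_len)		/* time of check */
		return -EINVAL;
	/*
	 * Time of use: another thread may have shrunk item_len (and
	 * reallocated item_buf) since the check above, so this read can
	 * go out of bounds. KTSAN would flag the conflicting unmarked
	 * accesses to item_len.
	 */
	*dst = item_buf[off];
	return 0;
}
```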
But then we see things like, say, a CVE classified as remote code execution, bad, and it was in something called netfilter, and at the time we just didn't know that this thing existed. Or we see an Android use-after-free classified as high severity in something called nsfs, and we still don't know what it is and don't test it. Adding those descriptions is not hard; there is some learning curve, but by the time you do the second one it should be pretty easy, and we have lots of examples.

The next thing is testing external inputs to the kernel. Currently syzkaller can inject network packets into the networking stack via the TUN device, as if they came from outside, and this obviously gives the most critical bugs. We have some basic coverage: we have descriptions for Ethernet, IP, TCP, UDP, and a few other protocols, but I'm sure there are more protocols, they are quite complex, and some of them require some network device setup. I mentioned USB; we want to revive that work and actually test more of the USB drivers. But then there are lots of other things like NFC, CAN, Bluetooth, guest-host interfaces, maybe things like keyboard and mouse, and also things that I don't know about yet. Some of them may require adding stub devices, like TUN, that allow injecting inputs into that subsystem as if they came from outside. Some may already have it; for example, USB has such support and doesn't require any additional devices.

OK, we also have lots of open bugs, hundreds of bugs that were reported and not fixed. Some of them are just bad vulnerabilities in themselves. Others affect stability or are denial-of-service types of bugs, but even the rest still harm syzkaller's ability to uncover more critical bugs. So we can't expect that we'll find all of the critical ones if we don't start fixing all of them. We need help fixing those bugs, and also triaging, routing, de-duplicating, and closing fixed and obsolete bugs. Usually this is considered part of the development workflow, so if you submit code or maintain a subsystem, please also contribute to those efforts.

OK, the next thing is related to KASAN. KASAN is based on compiler instrumentation, so it checks only memory accesses done from C code, and it checks them with respect to the kmalloc state. It does not check memory accesses done in assembly or in hardware, and it doesn't catch use-after-frees and out-of-bounds accesses if there is some kind of custom caching, growing, or amortization scheme involved. So an object can be freed into some custom cache, but not actually freed to kmalloc, to the slab; then it's still considered allocated, and KASAN will not catch bugs on such objects. KASAN has annotations both for checking whether a range of memory is good or bad, and for marking a range of memory as good or bad.

A good example is the skb. The sk_buff is the core networking data structure, and it's used to hold packet data. It has so-called linear data, which is a directly accessible buffer of packet data, and there is an API where you can ask to pull, say, two bytes into this linear buffer, and then you can access those two bytes but should not access the third byte; or if you ask for three bytes, now you can access three bytes, but the previous buffer may have been reallocated, so you should not access the previous pointer.
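As a rough illustration of the skb rule just described, here is a simplified, hypothetical parsing sketch using pskb_may_pull(); it is not code from the talk.

```c
#include <linux/skbuff.h>
#include <linux/errno.h>

static int demo_parse(struct sk_buff *skb)
{
	const u8 *hdr;

	if (!pskb_may_pull(skb, 2))
		return -EINVAL;
	hdr = skb->data;	/* hdr[0] and hdr[1] may be accessed */
	/* accessing hdr[2] here would be a bug, even if it often "works" */

	if (!pskb_may_pull(skb, 3))
		return -EINVAL;
	hdr = skb->data;	/* must re-read: the pull may have reallocated
				 * the linear data, making the old pointer stale */
	return hdr[2];
}
```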
But the thing is that skbs use very aggressive amortized growth, obviously so as not to reallocate every time you ask for another byte, so usually you can get away with accessing more or with accessing the old pointers. This is super easy to get wrong, there is lots and lots of such code, and it's just a bug nest. I'm sure there are dozens of remotely triggerable bugs that we currently mostly do not detect, so it could make sense to use a strict, exact growth policy under KASAN for skbs; this is something that is currently not done. Obviously we don't want those annotations sprinkled throughout the code base, but there may be some places where it's still worth it: for example, something in DMA, I2C, SPI, virtio, maybe something common in USB or file systems. So if you have any good ideas, know that adding some additional annotations is a way to potentially detect lots of bugs there.

Then there are some other tools which we do not use on syzbot currently, but it would be great to use them. The first one is kmemleak, the memory leak detector. I've heard from server people that memory leaks are actually the worst bugs out there, because they just silently drain machine resources slowly, and then everything gets slow and you can't understand why. Obviously remote memory leaks are also bad. But the problem is that the tool has false positives, and this means that we cannot use it in an automatic, systematic testing setting, because we automatically report everything we find, and nobody will be happy if that produces a constant stream of false positives. It should be possible to make it precise, and that would be very nice to do.

The next tool is UBSAN, the undefined behavior sanitizer, and it finds more local cases of undefined behavior in C. It can find intra-object overflows. It can find cases where bools or enums have wrong values, and surprisingly this can lead to control flow hijacking, because you can have a bool that is neither true nor false, or both true and false, or you can have, for example, a switch on an enum value implemented as a jump table, where the compiler has eliminated the bounds check on the enum. It also detects overflows and invalid shifts, which can also lead to out-of-bounds accesses, sometimes in surprising ways. For example, if a variable is used as the right operand of a shift operation, then the compiler can start making assumptions about the range of values in that variable and, as a result, can, say, eliminate some bounds checking. But I think the kernel still needs some cleanup for those bugs, and a more serious problem is that fixes for lots of these bugs face significant opposition, especially the overflows and the shifts, which makes it nearly impossible to deploy if we need to have a battle over each and every bug.

And the next tool is KTSAN, which I already mentioned. I'm sure it will find thousands of hard-to-localize races in the kernel and provide actionable reports for them, but we need to say no to the so-called benign races and just mark all concurrent accesses. Races are undefined behavior in C, they are super subtle to reason about, and super hard to get right; if you think it's easy, then maybe you just don't see the whole problem. Even things like an aligned int store and load can actually lead to very surprising things.
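To illustrate what marking concurrent accesses means in practice, here is a minimal sketch with hypothetical field names, not from the talk: plain loads and stores of shared data are exactly the unmarked "benign" races mentioned above, while READ_ONCE()/WRITE_ONCE() make the concurrency explicit to both the compiler and the tool.

```c
#include <linux/compiler.h>
#include <linux/types.h>

struct conn {
	int dead;	/* set by one CPU, polled locklessly by another */
};

static void conn_kill(struct conn *c)
{
	WRITE_ONCE(c->dead, 1);		/* instead of the racy: c->dead = 1;  */
}

static bool conn_is_dead(struct conn *c)
{
	return READ_ONCE(c->dead);	/* instead of the racy: return c->dead; */
}
```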
And the last thing I wanted to touch on is kernel testing. Most of these bugs could be prevented with testing, and I feel that we're not doing enough, that we can do much better on the kernel testing side. We have these 20,000 bugs per release, new bugs are being introduced at a very high rate, bugs are being backported to stable, bugs are being reintroduced, and nobody can keep up with this flow. Development is also slowed down, because there's a high reliance on manual labor. There are delayed releases; there are broken builds, which in particular prevent bisection: people ask us for bisection, but sometimes the kernel is broken or doesn't boot for months. There's also long fix latency: in most projects with modern development processes, it's usually possible to push a critical fix, for example for a build breakage, within a day or even within an hour, and in the kernel it frequently takes a month. There is also late feedback on bugs to developers: a developer can receive a bug report two or three months after they submitted the change, and by then they have already forgotten about the change or are already on vacation.

I spent some time thinking about what to say here, because there are some tests somewhere, and somebody kind of runs them sometimes and does something with the results. So if I say that there are no tests, or that there is no testing, then that's false, right? But it still feels like there's something to improve in testing. I think testing needs to be an integral part of the development process; it should not be just something on the side done by some other people. So we need tests. The tests need to be easy to write, easy to discover, easy to run, and easy to understand when they pass or fail. We need both user space tests and some in-kernel tests, with support for easy hardware mocking. We need to add tests for new functionality and regression tests. And we absolutely need automated continuous testing that is part of the development process and not just somewhere on the side. So we need pre-submit tests; we need, say, the developer waiting for a +1 from the testing infrastructure before the commit is submitted; or even, say, a thing called a commit queue, which is now used in some projects: when a change is approved by humans it gets into the commit queue, and then robots test it on head, and only if the tests pass do they automatically commit the change. And the infrastructure also needs to use all of the available tools, because frequently we see bugs that could be trivially detected if the existing debugging tools were used, tools like KASAN and lockdep in particular, but lots of such bugs are still being committed to the kernel. This is not an easy thing to do, and it's not that I have all of the answers and can say exactly how to do this, but I feel that that's the direction we need to move in. And with this, thank you, and I'm ready to answer questions.

You mentioned that you file bugs. How do you figure out where to file a bug? Because there's really no clear way of figuring that out. Do you look through the MAINTAINERS file? How do you figure out where the bug should go to be reported?

We send bugs to mailing lists, and from the crash report we try to find the guilty file, which is usually near the top of the stack trace, but we skip some of the top frames, like the common ones; for example, if it's mm/slab, then we skip it. And then we run get_maintainer.pl on that file, and that's the list of people.
Do you need this process improved? Do you need a better way to figure out where the bug should be filed? Because I just looked through the MAINTAINERS file, and I believe there are only like 19 references for where to file bugs.

Right, but all of the entries have the people to CC, email addresses.

OK, but you mentioned that you also track when a bug has been fixed.

Yes. The kernel doesn't have a bug tracking system per se, right? There's Bugzilla, but it's not really used. So we built the bug tracking process around the existing email-based kernel process. The main scenario is that we give a Reported-by tag in the report email, and then the developer needs to put this tag into the fix commit. Then we pull the git trees, and when we see this commit we understand that it fixes that bug, and when we see this commit on all of our builders, we close the bug. Closing bugs is important because only once we close a bug can we report new bugs that look similar. There are actually cases when we see a bug that looks almost exactly the same, but the new one is reported only when we close the previous one; while the previous one is still open, they all just pile into the same bin.

I just wanted to know if there's a copy of your talk online. I saw there are a lot of hyperlinks, but I didn't see where they pointed to.

Yes, I will send it to the organizers today or tomorrow, and I will also post it somewhere online.

I was curious what our options are if we want to fuzz a piece of kernel code that's not reachable due to kernel config decisions. I was taking a look at the kernel configs that you have, and, for instance, it wouldn't be possible to fuzz AppArmor currently. Is that possible? Is there a way to specify a custom kernel config?

You mentioned AppArmor, right? It's enabled, but SELinux is used by default. OK, so there are two things. One is just the syzkaller fuzzer, which anybody can run locally; it can use, say, QEMU virtual machines, and then you can give it any kernel. And regarding syzbot, we have a fixed set of configurations, and potentially we can add new ones.

And you could handle custom kernel configs?

Yes; AppArmor has already come up, because it's also used in some of our internal kernels, so it's kind of important to us too. So if you find me and send me a mail, we can discuss this.

Great, thank you.

So as a developer, when something shows up in my mailbox or on LKML that is clearly in my code, what's the best thing I can do to communicate back to you that we've actually done something with it, or that we're not going to do something with it, or any kind of interaction that we might have? Because there are several bots out there right now that are doing wonderful and glorious things for us, but sometimes the question comes up as: gosh, I fixed the code, now what do I do?

With each email there's a link for more information, and there you can find the syzkaller mailing list where you can reach us. We're usually subscribed to the reports themselves, but there are lots of them and we can miss something. So if it's like a separate proposal for our systems, it's better to send it to our list; it's just syzkaller@googlegroups.com. You can find this email.

How large is the testing infrastructure in syzbot?

Sorry, can you repeat?

How large is the testing infrastructure in syzbot? How many syzkaller instances are running? How many Pixel devices?
Because you mentioned that you also run on devices, right?

Devices, so syzbot doesn't run on devices right now. The syzkaller fuzzer itself supports several types of machines, in particular Android phones and ODROID boards, and it can be extended if you want to. For the syzbot infrastructure, we currently run on GCE, in the cloud, on virtual machines only. And for the upstream Linux kernel we have maybe 52-core machines, or something like this.

Any more questions?

This is a bit more of a niche question, but I saw you were talking about CAN messages. Could you elaborate a little more on what that's looking for?

Can you repeat, please?

With syzbot, you said that one of the things to add later would be CAN functionality?

CAN?

Yes, fuzzing for CAN messages, I believe I saw that.

CAN, so currently we do something for CAN. Again, I don't know exactly what, because I have no idea what CAN is and what its interfaces are. I talked to the CAN maintainers, and they had some initial interest in actually looking at what we're doing and where it could be improved. And there's something for, I think, actually injecting CAN packets. I think there was some issue, maybe it didn't work with namespaces or something, but they never got back to us. So yes, we have something, but I don't know exactly what. You can look, and we also provide a kernel coverage report, so you can see what parts of your subsystem we cover and what parts we don't, and assess how good it is.

OK, I think that's it. Thank you.