Hi, everyone. My name is Nick. I work on the Linux kernel and the LLVM compiler project at Google in Mountain View, California. Hi, my name is Bill, and I also work on Linux, but on the prod kernel side, and with LLVM as well. So, some of the goals of this project: whenever you point a newish compiler at a code base that hasn't used that compiler before, both the compiler and the code base stand to benefit. On the source side, the Linux kernel benefits by getting additional warning coverage. We've sent quite a few patches just trying to drive the warning count when compiling the kernel with Clang down to zero; we still have work to do there, but we've already upstreamed a few hundred patches related to that. We're also trying to reduce undefined behavior in the kernel. There's not a large amount of it, but there's a lot of code, so you always have issues there. The earlier talk on static analysis is actually why I got into this whole project: I was just trying to run scan-build on the kernel. Unfortunately, static analysis needs to know exactly how things are defined via the preprocessor, and which symbols get included or not via -D flags and such, so you basically need to be able to compile the source code in the first place before you can do the static analysis. So I went down this rabbit hole of: okay, let's get it building first, and then we can do the static analysis later. Maybe at some point in the future I'll be able to revisit that. Some of the other stuff I'm really excited about is the thread safety annotations, which we use quite heavily in google3 C++ code for statically verifying thread safety.
You can annotate your code and say things like: you must acquire this mutex, or these mutexes in this order, to protect these members of the struct, or pointed-to values, and so on. Beyond that, the dynamic analyses (Address Sanitizer, Undefined Behavior Sanitizer, Thread Sanitizer) all have kernel variants implemented in LLVM as well, which are really nice. There's some cool newish stuff we're looking to do in the kernel; we're still working through a lot of issues, but things like link time optimization, and lately some research-grade post-link optimization stuff that we're playing around with a little bit on our kernels. And ideally we want to lower the switching cost between compilers. Going back to economics, you have this notion of substitute goods, and in order to have substitute compilers you need to lower the friction of swapping them in and out. That puts a constraint on Clang: it needs to compile almost the same code as GCC. Not exactly, but you can't even come to the table unless you can compile the same code. If we can lower the cost of switching compilers, that makes it easy for other people to report bugs in the compiler, try things out, and see how it works for them. For improving LLVM, we basically have a whole brand-new customer to give us feature requests and help us figure out what features were missing, from either the C language standard or the various GNU extensions. We can't compete with GCC if we can't build the same code; that's a non-starter. And finally, having more and more code to throw at your compiler helps you find more compiler bugs. So one of the things we're trying to do is create Linux distributions that are entirely built with Clang and LLVM. Android is in the process of moving there.
If you have a Pixel phone, those are already entirely built with Clang, even the kernel. We're now working with Android OEMs and vendors on getting the whole ecosystem moved over to Clang-built kernels as well. Chrome OS is entirely Clang-built, too. OpenMandriva is, I guess, more traditional Linux packages, and is a project working through getting all of that code ported over and buildable with Clang. And then Bill will talk a little bit about the work he's been doing on Google's production servers, the ones serving your traffic when you visit various Google properties. The earliest history I could find was a project called the LLL project, which has a few commits in 2011. I don't really know too much about it, but it's proof that someone was trying to build the Linux kernel with Clang back then. They have patches on top of a 2.6 kernel, which is ancient history to me. Then from 2012 to 2016 there was the LLVMLinux project. They have a large set of patches, and a lot of the work I started doing was based on finding their patches, applying them, rebasing them, cleaning them up, and in the process upstreaming them as well. They did a lot of really good work and found a lot of long-standing issues in Clang and LLVM. I think part of the issue they had was both upstreaming patches into the kernel itself and also getting fixes in on the Clang and LLVM side of things. Around 2016, my co-worker Greg Hackmann and I were looking into getting this up and running on the Google Pixel. I was looking into it for static analysis of the kernels, and around that time Greg came to me and said: hey, I got this up and running, it's building. Can you test and see if it's booting?
At that same time, I noticed Matthias Kaehlcke on the Chrome OS kernel team submitting fixes upstream to LKML for warnings reported by Clang. So I went over to Matthias, showed him what we were trying to do, and we started collaborating. We figured out: hey, there are a lot of really good patches from the LLVMLinux project; we should get those upstreamed, or work through whatever issues upstream maintainers have, and get it all up and running. We actually had it up and running for Pixel 1, but due to the way feature releases and product cycles work, we were asked to give it more soak time, so Pixel 2 was the first device we shipped it on. I think we used Clang 4 to build the full ARM64 Linux kernel for that. So that was the first one; then in 2018, Chrome OS. I have two links in there: the first, I think in March, is when they started flipping it on for their various LTS kernels, and by October they had flipped it to be the default compiler, so all kernels going forward for Chrome OS are built with Clang as well. In 2018, for Pixel 3, we got LTO working in the kernel, and then control flow integrity up and running; that CFI helps prevent ROP chains in the kernel. So, 2019: what are we working on? I would say LLD support is imminent, at least for ARM64. The day I was flying out here, my project manager was bashing me over the head asking how come these patches aren't in yet, and I'm like: oh, I've got to clean up the commit messages, and there's still some work I need to do to properly upstream them and get them reviewed, but it's there, it's working. A little more work for x86, but ARM64 is ready to go. Asm goto: the patches are posted. This is a feature that's used in the Linux kernel; x86 basically requires it. Support patches are up for both the Clang side and the LLVM side.
I've been pounding on them nonstop for the past two weeks, using C-Reduce, which I'll talk about later, to find bugs in the implementation and make sure we land a high-quality, bug-free implementation. Then there's Clang's integrated assembler. LLVM has a lot of substitutes, not just for the compiler, GCC, but also for binutils: the assembler, the linker, and then nm, readelf, and all these other tools like objtool, objdump, strip. There's definitely a long tail across all the tools; the integrated assembler is the one I'm most worried about, the one with basically the longest tail of things we need. We're in the process of moving all of Android over, and I'm working with OEMs on issues they're seeing with it. The prod kernel, Bill will talk a little more about. The other part of CFI, shadow call stack, is another piece of the ROP chain prevention. Thread safety analysis: we're trying to get an intern to see if we can get this working in the kernel, because that would be nice. A lot of the bugs I fixed when I worked primarily on the Nexus and Pixel kernel team came down to concurrency bugs in third-party drivers. AutoFDO is kind of the next generation of PGO; we want to do low-overhead sampling, which on ARM requires ETM, or on x86, last branch records. Then BOLT is this cool thing that just came out of Facebook research for post-link optimization. People are exploring whether that can move into the compiler, or whether it needs to be done after the link stage, and that's one of the open questions there. And then we're trying to get Clang integrated into the upstream Linux kernel continuous integration systems. For KernelCI, I would say support for Clang is imminent; they just re-architected it to support multiple compilers in order to report bugs or regressions in different versions of GCC.
And for them now it's generic enough to just add Clang to it. The 0-day bot: we were this close to getting it integrated, and then the x86 maintainers forced the use of asm goto the same week I was talking with the 0-day bot team. That was a little unfortunate, but once we have asm goto landed, we'll start those talks back up. If you want to try it out: usually when you build your kernel you end up playing around with your kernel configuration (make localmodconfig is kind of the basic target), but make has variables you can set on the command line. You set CC=clang and you're off to the races. You can set LD=ld.lld to start trying to link with LLD; there are some issues I still need to fix upstream, but that's going to be the command for invoking LLD. And then, cross compiling: one of the things I like about Clang is that the default build of Clang has all of the different backends built into it, whereas with GCC and binutils you typically install a cross-tool version that has a prefix in front of it. Because we're not using Clang's integrated assembler yet, we still shell out to GNU as, so when cross compiling you need to supply that target triple. If you were going to cross compile for ARM64, which I do commonly from my x86 workstation, these are the variables you set to do the cross compile. These are very rough measurements (I don't know if git grep supports Perl regexes, which probably would have been nicer), but these are very rough counts of commit messages that mention Clang or LLVM in the kernel. Probably lots of the ones that just mention Clang are related to eBPF; I didn't do any very scientific measurement there. But LLVM as well now has quite a few commits in it that say: hey, this is something we fixed or implemented because the Linux kernel is making use of it. My turn.
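The invocations described above look roughly like this. (A sketch, not the slide's exact commands; the aarch64-linux-gnu- prefix is one common cross toolchain name, and exact spellings vary by kernel version and distribution.)

```shell
# Native build with Clang, optionally also linking with LLD:
make CC=clang
make CC=clang LD=ld.lld

# Cross compiling for ARM64 from an x86 workstation. Since GNU as is
# still used for assembly, CROSS_COMPILE supplies the tool prefix and
# gives Clang its target triple:
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang

# One way to get the kind of rough commit count mentioned above:
git log --oneline | grep -c -i -E 'clang|llvm'
```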
Hi. First off, I'm going to apologize: I'm fighting a cold, so I may start coughing. So the production kernel is basically a different beast from what he's working on, which is Android. Android tends to work closer to top-of-tree Linux, while the production kernel is on various, what do you call them, long term... LTS branches. Yeah, LTS branches. The one I'm using is kind of an ancient one; I think it's pre-4.5. Anyway, it's all more of an experimental situation right now. Well, asterisk: it's more than just experimental, we really want to do it, but we need to get performance there, for instance, and I'll go into that in a bit. So what I did is, I went to the team about a year ago, and luckily the Android team had already done LTO on Linux, so I was able to use their patches and get our kernel compiling with LTO. Basically it just took a bunch of changes in the build system, mostly to do with the fact that during LTO you generate LLVM IR files instead of ELF files, so you really don't generate ELF files until you start linking the vmlinux.o file, and that's when you do all of the LTO optimizations. So the built-in.o files that are generated are not really... well, they're just not really archives at all. Basically what you're doing is creating a thin archive that just lists all of the .o files you want to shove at the linker, and then the linker takes all of those, does its magic, and creates the .o file. I started by using the gold linker, because that's what you're supposed to use, and then, in order to get the actual final vmlinux, I linked it with the BFD linker. There are some issues with the gold linker, however, and our teams internally at Google no longer support the gold linker, so basically they said: use LLD. So about a month ago I went ahead and switched us over to LLD. Excuse me. Anyway, like I said, the gold linker had bugs; in fact it was asserting on me a lot,
but it was otherwise workable. However, it turns out that when you use LTO and you're also linking ELF .o files with it, the LTO files basically get generated as one huge .o file, and then the other ones are scattered around it. That wouldn't necessarily be an issue, except Linux has this section, .init.data, which holds the list of initcalls, and those have to be in a specific order, because some of them initialize structures and so on before other ones do, and we were just not able to replicate that order. So I basically had to come up with a kind of hack to get around it. First off, this is where you call it, here: what I do is pop each initcall into a section named after the file it occurs in, and then during linking I generate a really massive linker script that goes through and reorders all of the sections so that they are in the quote-unquote order they appear in on the command line. This is absolutely horrible. It actually causes the link to go from about 1.2 minutes to over 3. There are ways I can maybe get around that, but in general it's just bad, so please don't tell anybody, haha. And so now we're going to discuss a few interesting issues we came across while developing this; I'll let him go first. So, in the interest of time, I won't get into all of these, but the very first one was interesting. If you see this statement here, you're declaring a variable foo; it's a long long, which should be 64 bits. We had this issue in Clang on 32-bit hosts where you're saying: please use register edx, for instance (so you can guess which architecture this is), please store it in that register. The curious thing is, if you're on a 32-bit host and you say use edx, which is a 32-bit register, and you want a long long, well, where's the other 32 bits of that 64-bit variable? Does anyone know?
Well, basically, you have to implement bug-for-bug compatibility with GCC on how it chooses the next 32-bit register, and if you say something like ESP, it'll just crash. So yeah, that was kind of interesting. One of the bugs we ran into with AArch64 was kind of clever: the kernel futex code, which is a synchronization primitive, would explicitly dereference NULL to see how the hardware behaves, and then select this implementation or that implementation accordingly. AArch64 has an explicit zero register, which is nice, so Clang was trying to generate code that dereferenced XZR, the zero register. The issue is that there is no valid encoding for that: you can write it in assembly, and then your assembler will choke and say there's no bit pattern to represent this. That was kind of funny. gnu_inline: if someone asks you what the semantic differences between C89 and C99 are, how extern inline works is one of them. The kernel makes use of this; I wrote the documentation on it and have the patch in the kernel for it. I'm not really proud of it, but that's something you can go look up if you're interested. Symbol clashes with the C standard library: say you were to #include stdio.h or something; the kernel itself doesn't have a C runtime, it implements its own helper functions, and there are collisions there. Custom calling conventions. VLAIS, a variable-length array in a struct: you can put a variable length array in the middle of a struct, not at the end but in the middle, and that was an issue. The one I'd be most interested in showing you is this one. Quick interview question: how can you write code that is valid at O2 and not at O0? Think about it. Okay, did anyone get this? It might be hard to see here, but basically we have a static inline function; the inline function has inline assembly with
some constraints on it, saying this operand is going to be an immediate value, like the number 42. When we compile this at O2, everything is semantically valid; but if we don't inline that static inline function, it is no longer semantically valid. The kernel is using this in a few places; coincidentally, everywhere it uses asm goto is wrapped in this pattern, which is making testing asm goto very difficult, especially because the initial patches for asm goto skipped inlining support. One thing that's kind of interesting: if you use __attribute__((always_inline)), it doesn't actually mean always inline, because the compiler still needs to know how to do the inlining, and a lot of times you'll see "FIXME, implement later, not needed for correctness." So these need to be macros if you want them always inlined. Bill will talk about some of these other ones in a little bit. Oh, there we go. There were a couple of issues. One of them actually would not have been a problem if I'd been using a more recent kernel, because the patch already existed in the upstream kernel. Basically, what was happening was that during one of our tests the machine would simply assert, and I whittled the test cases down and found that with one of the crypto library .o files, if you compiled it with O1, everything went fine, and if you compiled it with O2, everything crashed. It turned out to have to do with inlining, actually, and it wasn't a bug in Clang, which is what I thought initially; it was some weird alignment issue. Basically, because Linux does just horrible things with pointers, it has this thing called sg_set_buf, where you're setting a buffer inside of a structure, and it's going to be aligned at a certain place and so on. Inside of sg_set_buf, the offset here is 12-bit aligned, as you can see. So when
it's not inlined, Clang looks at that and doesn't really try to calculate the value or anything like that; it just leaves the 0xfff alone. However, when you do inline it, you get the bottom 4 bits turned into zeros. And because you have this variable-length array being defined like this, and s and d are actually defined towards the end (there's one variable afterwards, but it's not allocated any stack), basically d here is the last thing on the stack, about right here. What was happening is, after it set this buffer like this, it would put a whole bunch of data in there, of course, and it would end up overwriting the return address, because of the misalignment: all the lower bits there would just be cleared out, it would be off by 8, and everything would blow up. That was an interesting bug to find. The resolution is basically to specify the stack alignment on the command line, which, like I said, had been fixed by a patch probably a couple of years earlier; it just hadn't made it into one of our LTS kernels yet. I think... yeah, you went by my other ones, okay. He mentioned earlier that there had been lacking support for some of the GCC features in Clang, and one of the big ones was __builtin_constant_p. It wasn't that we were getting it wrong; it's just that GCC was, oh geez, doing what GCC does. And a lot of built-ins aren't super portable, aren't super well defined in the first place: basically, the definition of what a built-in does for GCC is whatever GCC does, and that's it. So after inlining happens, GCC is still able to say: oh, is the argument to __builtin_constant_p an integer constant? If it is, we'll just go ahead and say yes; otherwise, no. And in the vast majority of cases we used to take the slow path as opposed to the quick path. In one of the newer kernels, it
turns out that kernel hardening actually fails without this; you would get this scary message here. So both James Y. Knight and I tackled it: James did the LLVM IR addition, the is.constant intrinsic, and I went ahead and modified the Clang frontend, which is not actually the code base I'm most used to, but I became used to it, because my god, I had to touch every part of it. Richard Smith, one of the main maintainers of Clang, asked me to implement, more or less, a wrapper class called ConstantExpr. It is the correct way of doing it; it's just the long way of doing it. The benefit is that it's a more natural way of doing it, and it also allows us to support C++ features in the future. Essentially it does what the name says: if you are in a constant expression, then you assume you are in a constant context, and that determines whether you evaluate it early in the Clang frontend, or late in the LLVM IR middle-end. So yeah, we were able to get past that, and now we are bug compatible, at least on that, with GCC. And I think... that's you. Yeah, so, if you're interested in how you might be able to help: I highly recommend this project. This is how I was able to start contributing to the upstream Linux kernel and to LLVM itself. We have a bug tracker, and I try to highlight good first issues, so we're happy to hand out bugs if you're looking to start contributing to either one of these code bases. I'm more than happy to help you; shoot me an email, it's on the first slide. My thing is: how do I get people involved in contributing to the open source software that they love and use every day? That's something personally important to me. We're running continuous integration for various architectures: ARM64, x86-64, ARMv7, ARMv6, ARMv5, PowerPC 64-bit little endian, and PowerPC 32-bit, running continuously against the mainline Linux
kernel, linux-next, and the LTS branches: 4.19, 4.14, 4.9, 4.4. It's a poor man's CI, a cron job running every night, but for all these various, I don't know, thirty-something targets, so you can check that out; it's just running on Travis. We have a link to other talks and other material if you're interested in finding out more about the history and the project. godbolt.org is amazing; that's the link I showed you earlier, with the code that was only valid when inlined. If you want to communicate a bug to a compiler developer, send them a Godbolt link. That is the shared language compiler people understand: here is a clear case, here's what GCC does, this is what I want in Clang; or, here's my code and here's what Clang does with it. C-Reduce: I just wrote a blog post on it; it's incredible. I have at least 10 compiler bugs that I've found with it. You give it a shell script where you put whatever you want in it, plus a source file, and it mutates the source file, paring it down until you have a nice, concise reproducer. I've found errors at link time with it: you can say this script returns a different return code based on whether a symbol is found after running nm on a binary, or whether it crashes the compiler, or whether the compiler produces strange output or strange assembly. Bear is this utility that hooks make, and it spits out a compile_commands.json file, which you can then feed into static analysis tools like scan-build, the Clang static analyzer, or, was it cppcheck, I think, is the other utility that uses it. Alternatively, the kernel will spit out these .o.cmd files that have all the commands for each translation unit when you build. The issue with that is it only produces them, I believe, if compilation is successful, which we could probably fix in Kbuild, but that's helpful as well, because when C-Reducing a bug you
sometimes need to know exactly what flags were passed to the compiler to reproduce a given issue, so getting those flags out of compile_commands.json or the .o.cmd file is critical. Same thing with static analysis: different command line flags will change your translation unit once it's been preprocessed. Yes... yep... so, yep, I've had that issue before, where I just say grep for a crash, and it pares things down to some bizarro C syntax thing, and Clang's maintainers are like: well, do we need to fuzz the C frontend? If it's not valid C code and the compiler crashes, is it really a bug? I don't know. Cool, so that's all we have. Thank you very much, and we're happy to take questions now... oh, sorry, I guess there's time. Yep, yep. So, the question was around what set of kernel configuration options we use. The kernel itself has a build system called Kbuild that is highly configurable, so it depends on the distribution. You'll have distributions of Linux where there's no guarantee what kind of hardware they're going to run on, so they compile a big kernel image that has tons of drivers in it, because someone's going to try to put Debian on their toaster or their faucet or something; they just turn everything on. For Android, particularly for Pixel, we have one hardware configuration, so we pare our kernels down and know exactly what we're running, but then we may have out-of-tree drivers that are problematic as well, and that's definitely an issue of "does it build or not." So someone might come to me and say: well, can it build the kernel outright? And my question is usually: well, what are your configs? The hard thing with the kernel is that the baseline we run in CI is making sure the defconfigs build; each architecture has a recommended set of configurations, so
that's what we cover continuously. From there, we try to run allyesconfig builds every so often. The issue with allyesconfig, which turns on lots of configs, is that the kernel has a lot of configs that are mutually exclusive, so you get this implementation of a function or that one depending on the config; allyesconfig doesn't mean all the code in the kernel. So there's probably code hidden somewhere in the kernel that will crash Clang, or that Clang can't compile, and it's a matter of finding it. The kernel has something called randconfig, which KernelCI does run, so once they have Clang support they'll probably help us find it. It's just a random coin flip, so you have to flip that coin a lot to find all the bugs. So, those are on Phabricator; I link to them from the slides, and if not, they're posted in our issue tracker as well. You can pull them down on top of mainline Clang, rebuild Clang, and then try it out on the kernel. You'll run into the issue with the static always_inline; I have a patch for x86 that converts those all to macros. For ARM64, that rabbit hole got really deep and I never finished that patch, but it's a starting point if you're interested. You'll still need the out-of-tree patch, because inlining support is not implemented in the asm goto patches yet. So the plan is to land... yes, yes, that is the plan: once Clang has support for asm goto, we're going to work on getting it to inline basic blocks that have asm goto statements in them, and then, theoretically, there should be no other bugs. But sometimes you don't have visibility beyond one bug, so you're like, yes, I fixed it, and then you try it out and something else is broken. So I can't promise, but I think it should work after that. Alternatively, you can revert the kernel's use of asm goto, forcing it off. It's not
a real solution, but you still get a working kernel. So, a question: one of the big problems we had in LLVMLinux was that we had to track three different patch sources: LLVM always moving, the kernel always moving, and our own patches always flipping between the two. And because we had just two or three build bots, like x86 and one ARM, with the ARM one running in QEMU, it was hard enough to keep track of all three of these sources in a single configuration, and you have thirty-something configurations. How do you find whether the bug is in LLVM, or in the kernel, or in your own patches? So, eventually we're looking to get help from KernelCI and the 0-day bot for spotting these regressions, and KernelCI is doing that: they pin the kernel version and then try different versions of Clang, like top-of-tree Clang, and they'll report to us when people break us in the Clang and LLVM trees, which is pretty frequent, which is kind of frustrating, but it's good that we're getting eyes and coverage. Usually I'll forward these reports to LLVM developers, saying: hey, you probably already know this, someone probably poked you, but in case they didn't, you're breaking the Linux kernel right now. Then on the kernel side, we're trying to always have patches upstream, to have zero delta or minimize it as much as possible, so we're not carrying patches around; as soon as we have a fix, it goes upstream, and we work with the kernel maintainers to make sure it lands. And this is with no integrated assembler, because there are a lot of inline asm problems; we disable the integrated assembler (-no-integrated-as) for our kernel builds for now. The kernel makes use of assembly, both external assembly files and inline assembly, which is even more complicated because it has a whole constraint language with it, and the kernel abuses non-standard assembly extensively.
They even have C code that generates assembly by way of inline assembly, which is then recompiled as an assembly file by the C compiler. "Abuse" is an accurate term when talking about this code base, yes. Okay, any other questions? What's your plan for plugins? Good question. So, the question is about the plan for plugins. The kernel has, I think, some GCC plugins checked into its source, and then some are maintained externally. LLVM does have a plugin system, and I think back to the earlier talk today, an excellent talk, about tooling and how to support this ABI instability in LLVM. That's probably a good thing to point to: now you have a plugin, but how do you guarantee it works with the given compiler version a user has, especially if the compiler doesn't have a stable ABI itself? From a plugin maintenance perspective, it makes my skin crawl a little bit. It's not impossible, LLVM does have a plugin system, and I think plugins are useful, but anyone who thinks of reaching for a plugin as a tool needs to sit down and really think hard: should this be implemented as a plugin, or in the compiler itself? For instance, I'll give a case: a recent talk at linux.conf.au was saying, hey, we use a GCC plugin to default-initialize all variables. Personally, I think that's better implemented in the compiler, and there's actually discussion about this for C++, and disagreement within the LLVM community, with people saying: oh, you're going to fork C++ and create another dialect, you can't do this. And I feel like saying: well, for C, where development has kind of stopped, can we please have this, since we already have our own dialect anyway that we use in the kernel, with every -f flag you add? To me, this could just be another flag, a -fdefault-initialize, and there's no need for a plugin. Plugins are great if you want to hack something up, play with the compiler, and get something working. I would say the best case
where plugins make the most sense is when you need compiler technology for one given project, because if you were to submit that to the compiler vendor and say this is only used in this one code base, they're going to tell you to get lost; they're not going to want to maintain it, given whatever maintenance overhead there is. That only works if the compiler version is fixed, which with GCC is normally what happens; they have GCC 5 forever, right? With LLVM, fixing it in LLVM 6 or LLVM 7 doesn't make sense, because it progresses so quickly that a plugin that only works in 7... nobody cares, it's dropped. So it's a completely different situation. Yep. Okay, thank you so much, everyone. We appreciate it.
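As a postscript to the default-initialization discussion above: that particular feature did later ship in Clang as a compiler flag rather than a plugin. A hedged sketch of the invocation (flag spelling per Clang's documentation; very old Clang versions gated the zero mode behind an extra opt-in flag, and driver.c is a placeholder file name):

```shell
# Zero- or pattern-initialize all automatic variables at compile time,
# the plugin-free equivalent of the GCC plugin mentioned above:
clang -O2 -ftrivial-auto-var-init=zero    -c driver.c
clang -O2 -ftrivial-auto-var-init=pattern -c driver.c
```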