Hello and greetings, everyone. In this session we are going to learn about building an embedded Linux system using the Clang compiler and tools. Clang has been developing as a compiler option over the past several years, and I'm going to cover what's going on with Clang in the embedded Linux community in general. Several people, including myself, have presented on various aspects of Clang. This talk is specifically about building a full system and trying to use Clang as the system compiler.

Here is a rough agenda. First, we'll go through what a Clang-based toolchain looks like, and then we'll cover the kernel compile status. Then we'll build a platform, and I'm going to use the Yocto Project to do that. There are other platform builders out there; I'll briefly mention them, and you're welcome to try those as well. Then I'll go over the options available for runtimes, which ones we can choose within the Yocto framework, and what works and what doesn't yet. We'll also go over some of the common errors that you need to address for various packages. I'll cover some of this from Debian's perspective, because I think that's one of the largest archives for which there is an effort to recompile with Clang.

When building a platform based on Clang, there are three things you will see in addition to Clang itself. The first is compiler-rt, a compiler runtime similar to libgcc that provides the initialization code. It also has the compiler builtins that are generally provided as a library, plus support for sanitizers and profiling. libc++ is the standard C++ runtime, so in toolchain-speak you can think of it as the counterpart of the libstdc++ library.
Then there's a low-level implementation of the C++ ABI, which is called libc++abi. And there's an unwinder library as well, called libunwind. We'll cover those too.

If you look at the tools map between a GNU GCC toolchain and Clang/LLVM, this is roughly how they map. There is a C/C++ compiler, which is clang or clang++, and the assembler, cc1as. This is the internal assembler, which Clang uses by default. There are options to not use it, but it will be the default unless you make a specific effort to disable it. For the linker, there is LLD, the LLVM linker. It's relatively new and can be used as an option. However, Clang works very well with the BFD linker or the gold linker; there are no issues there. For the debugger, there's LLDB, which is your front end for native debugging. There is also lldb-server for cross-debugging: you install lldb-server on the target and run LLDB on the host, similar to a GDB and gdbserver setup.

In terms of compiler runtime, the compiler-rt project provides the C runtime I was discussing earlier, implementing some of the functionality that libgcc provides. For the unwinder, there's libunwind, a separate library that implements the unwinding routines you'll generally find in libgcc as well. In terms of C++ runtime, we've got libc++ and libc++abi, which we talked about. All of this can work with several standard C libraries, glibc and musl. There's also LLVM libc, a relatively new project that is offered as an option. It is currently in a nascent state, so I'm not going to cover it much. From a Linux platform point of view, glibc and musl are currently more important and more interesting. There are also binutils equivalents for archivers, symbol dumping, and other utilities.
So you will find LLVM equivalents of all those utilities as well. That is roughly the map you see here.

Those are the standard tools, but there are additional tools that are very useful. clang-tidy is a linter. clang-doc generates documentation, similar to Doxygen. clangd is a tool you can use to add C/C++ language features to editors like VS Code, Vim, and others, so it offers a good editing experience; all the language features are exposed through clangd. scan-build, which we'll cover a little later, is the static analyzer. It's very handy, and you can use it to do some static analysis as well. All of this is bundled with the Clang tool suite. We'll run through these tools a little later.

Clang advertises itself as GCC 4.2.1. That is a very important distinction. When we compile software, we see the compiler's internal defines used in several places to find out which features a given compiler supports. There is conditional code that checks __GNUC__ or the minor version, __GNUC_MINOR__, and then decides to go one way or another, for example whether to support C99 inlining. Be aware that code doing this will be fooled by Clang: Clang may have those features, but your auto-detection tools or scripts may not be able to detect them. Clang also exposes some defines of its own, primarily __clang__. That is the define you can use to check whether or not you are using a Clang compiler. There are also __clang_major__ and __clang_minor__, which are similar to __GNUC__ and __GNUC_MINOR__, to find out which version of Clang you're running.
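
As a minimal sketch of the detection issue described above, the following C snippet tests __clang__ before falling back to __GNUC__, so that Clang's GCC compatibility version is not mistaken for its real feature level. The function names (compiler_family, compiler_version_code, gnuc_compat_code, print_compiler_info) are illustrative and not taken from any project.

```c
#include <stdio.h>

/* Illustrative helper (hypothetical name): identify the compiler family.
 * __clang__ is tested first because Clang also defines __GNUC__. */
const char *compiler_family(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "other";
#endif
}

/* Illustrative helper: major * 100 + minor of the compiler's own
 * version, or 0 when the compiler is not recognized. */
int compiler_version_code(void)
{
#if defined(__clang__)
    return __clang_major__ * 100 + __clang_minor__;
#elif defined(__GNUC__)
    return __GNUC__ * 100 + __GNUC_MINOR__;
#else
    return 0;
#endif
}

/* Illustrative helper: the GCC version the compiler claims through
 * __GNUC__ and __GNUC_MINOR__, or 0 when those macros are absent. */
int gnuc_compat_code(void)
{
#if defined(__GNUC__)
    return __GNUC__ * 100 + __GNUC_MINOR__;
#else
    return 0;
#endif
}

/* Print one summary line; intended to be called from main(). */
void print_compiler_info(void)
{
    printf("family=%s version=%d gnuc-compat=%d\n",
           compiler_family(), compiler_version_code(), gnuc_compat_code());
}
```

Under Clang, gnuc_compat_code() reflects the advertised GCC compatibility level rather than the actual Clang release, which is why feature checks keyed only on __GNUC__ can take the wrong branch.
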
Similarly, the Clang assembler I mentioned, the internal assembler, is limited in several ways and differs from the GNU assembler. On ARM, for example, it only supports unified syntax, so if you pass inline assembly or mixed assembly code through Clang, it may not accept it. It is a single-pass assembler. That means if you're doing symbol calculations based on distance, which is very common in assembly files, it may not be able to handle them. What I've usually seen is that files written in pure assembly, or oriented toward assembly, are where you find these kinds of incompatibilities. They are addressable, but they might show up as errors at the very onset when you port software to it. The good news is that Clang can work with the GNU assembler. You can disable the internal assembler with the -fno-integrated-as option. Once you pass that, Clang expects you to provide a GNU assembler for the assembly step, and if you build the toolchain correctly, it will find it automatically.

Now moving on to the kernel side. There was an LLVMLinux effort that started a few years ago, and now there are several community members interested in making sure that the Linux kernel can compile with Clang. There is now a landing page called ClangBuiltLinux. It runs a continuous integration job that builds various configurations of the kernel using the Clang compiler and other Clang tools. Options have been introduced in the kernel to build with just the Clang tools, and the CI exercises all those options and reports back on the build status. It uses Travis to do that. If you are interested in following which combinations it checks, please go to the link I've provided here. There's an issue tracker that uses GitHub issues.
If you are interested in this effort, file issues there when you find a problem in your own builds. If you are preparing patches to fix something in the Linux kernel with Clang, submit those patches upstream to LKML directly. GitHub issues are for tracking which issues are pending. The repository is a fork of Linus's tree, and they may accept a pull request as a placeholder for a patch, but if you want the patch included upstream, go directly to LKML and submit it there. You can ask the community for help if you want to make sure your patch is in good shape to be submitted upstream.

I've given a snapshot of ClangBuiltLinux here. There is a wiki where you can learn how to compile the kernel using Clang and which repos are involved. There's a mailing list and there's IRC as well. More importantly, there is a bi-weekly meeting held online, where a lot of these issues and the status get discussed. If you are interested in becoming active, you're very welcome to join.

Right now ARM, ARM64, and x86-64 are the primary, I would say tier-one, targets that compile with Clang. They are in fairly good shape: they boot and all the features work fine. There is also a big list you can find there, and you can see that limited test configurations are available for PowerPC and MIPS. RISC-V is in progress, and there is a lot of interest in it. If you have other architectures that are supported by both Clang and Linux but don't yet have a build here, please contribute. This is a good effort toward getting the kernel compiled with Clang. There have been several talks on it at various conferences.
So feel free to search for those. Recently there was a talk at an LLVM meeting, and there was a mini-conference at Plumbers where this was discussed in detail as well.

Now I'll move on to building the platform. There are a few options out there in terms of infrastructure that you could use. Gentoo has a Clang overlay, which is in a way also used by Chrome OS. Debian is building Clang-based packages and archives. Majaya is another one, and it is ahead of all the others: it already uses Clang as the default system compiler for its targets. You could also use the Yocto Project, and as I said, that's what I'm going to use throughout this presentation. You could also build your own, because Clang-based toolchains are available. Clang is inherently a cross-compiler, so creating a cross-compiler with Clang is no big deal. You could set up your own if you want to build something yourself or you have internal build systems.

In the Yocto Project there is a separate layer for Clang. It's called meta-clang, and I maintain it. It provides the overlay that adds all the Clang toolchain packages through Yocto recipes. It also provides the additional tools we discussed a little earlier, for example the debugger, the LLD linker, and the runtime libraries. You will also find recipes there for some tools that are written using Clang. But the primary purpose of this layer is to provide a toolchain for Yocto Project builds. You can build a Yocto Project system with it, which means it will be the internal toolchain, and you can also make it part of the SDKs or extensible SDKs that you build out of the Yocto Project. We'll go over that later. Setup is very simple.
You just clone the Poky reference distro, then you clone the Clang layer and add it to the layer mix of your project. It doesn't depend on any layer besides the core layer, so you can just clone the Poky repo and include it. That's all you need in terms of setup.

However, it's an inert layer, which means that adding it shouldn't have any impact if you're not using it. As a result, your default compiler is still GCC even after the layer is added in the previous step. To make it active, you add something like TOOLCHAIN = "clang" to your local configuration metadata. Once you set that, it instructs the system to use Clang as the default compiler from then on. There are two options for this: you can also set TOOLCHAIN = "gcc". We'll go over later some cases where packages are not compilable with Clang. There you just use TOOLCHAIN = "gcc", and the Clang setting will not be effective for that particular package.

Another variable of interest is RUNTIME = "llvm". When you select that, you default to compiler-rt for the C runtime, libc++ for the C++ runtime, and LLVM libunwind for the unwinding runtime. So this tries to provide all of the C/C++ runtime from the LLVM project. That's not the default, though. The default is to use the GNU runtime, so RUNTIME = "gnu" is the default. We'll talk about the various combinations and the issues they might have, and why these options are chosen to remain GNU to begin with.

Building an image is easy. If you work with the Yocto Project, you just build the image you normally build. If you build a core image, it is built using Clang, which means all the packages built for the target are built using Clang. The native packages are different: certain packages are built for your build host, and they are not built using Clang.
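
The compiler and runtime settings described above can be sketched as a conf/local.conf fragment. TOOLCHAIN and RUNTIME are the variables named in the talk; the recipe name my-recipe is a placeholder, and the _pn- spelling assumes the underscore override syntax in use at the time of the talk (newer Yocto releases spell overrides with a colon, as in :pn-).

```conf
# conf/local.conf (sketch)

# Use Clang as the default compiler for target packages.
TOOLCHAIN = "clang"

# Optional: take the C/C++ runtime from LLVM (compiler-rt, libc++,
# libunwind). The default described in the talk is "gnu".
RUNTIME = "llvm"

# Fall back to GCC for a single recipe that does not build with Clang.
# "my-recipe" is a placeholder, not a real recipe name.
TOOLCHAIN_pn-my-recipe = "gcc"
```

Leaving RUNTIME unset keeps the GNU runtime while still compiling target packages with Clang, which is the combination the talk describes as the default.
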
Native packages are still built using GCC on your host, your host GCC so to speak, so Clang is effective only for building the target packages. That's a distinction you should be aware of.

Then you can run the image. By default it builds for a QEMU machine, so you can just run QEMU and it should boot. You can also build an SDK, and these are the standard steps for building one. There's a variable called CLANGSDK. If you set that in your local.conf, Clang is also inserted into your SDKs as an alternative toolchain, so you'll have both the GCC and Clang compilers available as part of the SDK.

There are a few exceptions. The list has been shrinking with every release of Clang and of other software, which means that patches are flowing upstream and packages are getting fixed to build with Clang. That's good news. In some cases the exceptions are just tweaks to flags, because a package might use a warning option that is not available in Clang, or vice versa. There we still use Clang to compile but remove those options from the build. Others override the compiler outright, explicitly setting the TOOLCHAIN variable we discussed to gcc for that given recipe.

To look in a little more detail at which ones those are: if you are building a glibc-based system, glibc can't be built with Clang yet. There have been some efforts in the past, but I think we haven't gotten to the point of a fully functional glibc compiled with Clang. musl, however, is another option in Yocto, and it does compile with Clang. So if you are choosing musl, or are already based on a musl system, you're in luck: once you start using Clang, it should build musl fine. The GCC runtime, as I mentioned, is the default we use today.
These are things like libgcc, libstdc++, and the other language runtimes that GCC provides, and they still need GCC to build. So you'll see that the compiler runtime recipes are still hard-coded to use GCC.

We are still using GCC to compile U-Boot, although some configs do work, and there is a detailed list here of how to enable Clang to build U-Boot. We haven't enabled it by default in meta-clang, because U-Boot is used for many, many different machines and not all of them may work. Maybe one of these days I will enable it for a particular machine and see how it goes.

elfutils is another package, used by a lot of tools to poke at objects and binaries, and it doesn't work with Clang yet. There are a few others, but it is only a handful of packages that Clang really cannot build at all.

GRUB is the same story as U-Boot. It does work: I've compiled the latest master with the latest Clang 10, and it compiled fine. I just had to disable -Werror, because Clang finds new warnings and treats them as errors. Once you disable that, it builds fine.

Python 3 is another key package that we build with GCC. The reason is that we cross-compile Python and enable profile-guided optimization for it. Python is very core to us, so we want the most optimized Python on the system. For that, we run the PGO workloads in QEMU, and when those workloads are built using Clang, QEMU crashes. We haven't debugged why yet, but I think there is something to fix there. Once we fix that, we should be able to build Python 3 with Clang. It builds fine if you disable PGO.

Then, in the file I mentioned, nonclangable.conf, you will see many packages that just disable the integrated assembler by passing the -fno-integrated-as flag.
That is for the reasons I mentioned earlier: the package uses inline assembly that the internal assembler probably doesn't understand yet. In many cases your C code has inline assembly that Clang itself doesn't understand. Clang tries to match whatever GCC does for inline assembly, but code written with inline assembly often takes advantage of undocumented features, and those might not be available in Clang.

Next, the C runtime. This is your crtbegin and crtend. If you dump what gets linked into a particular executable, these are the startup routines that get linked into the application. They currently come from the libgcc package. compiler-rt also provides crtbegin and crtend, and we currently do not use them. This is one of the to-dos; we would like to use them in the future. Right now, when we build compiler-rt, the CRT feature is disabled by default. If you enable it, you start getting the crtbegin and crtend objects compiled and made part of compiler-rt as well. It would also require a little hand-holding, because in some cases we might want to include both compiler-rt and libgcc in the system, with builtins provided by compiler-rt but unwinding provided by libgcc, and so on. So there is some careful crafting to do, but this is certainly of interest for the future.

As I mentioned, the GNU runtime is the default, and it works well. This is basically Clang relying on libgcc and libstdc++ to provide those runtimes. Mixing both may not work really well, and we haven't tried it. While Clang works fine with both libc++ and libstdc++, the other way around is a less tested combination, so that's where mixing may be a little troublesome. The other option is to use the LLVM runtime, which means your system has libc++ as the default.
There is one issue with that: if a package doesn't build with libc++, we have to include both runtimes. Say there are two packages, one built with libc++ and one built with libstdc++, and a third package depends on both of them. Which headers should it use? So there are some complex situations there. We do provide libc++, and some applications might link with it, but at the system level I still see libstdc++ used as the default. We have tried using libc++ as the default as well. We can get a system working with it to a certain extent, but for many packages you will find build issues that you have to address.

The LLVM linker is built as part of the compiler toolchain when you use meta-clang, but it is inert by default. You enable it with the -fuse-ld flag, which is similar to what you have in GCC: you say -fuse-ld=lld. You can enable it per package, meaning per recipe, or globally. To make it easy to enable globally, there is a distro feature you can add, ld-is-lld, which switches your system linker to LLD. This works well for x86-64. I haven't tried other architectures yet. Similarly, there are LLVM versions of the binutils tools, the archiver and the symbol readers, and they are used when you set TOOLCHAIN = "clang". So the Clang setup already switches to them.

There is an LTO use case available as well. There is a class that recipes can inherit for LTO, and it exposes two options, thin LTO and full LTO. Thin LTO is the faster variant. Full LTO takes a lot longer to compile and might need more resources, but it might end up with better code in the end, with more size reduction and so on. Thin LTO is fast and might give you good results as well. So you can choose whichever you want based on what you need.
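
The linker switch described above might look like this in conf/local.conf. This is a sketch under assumptions: ld-is-lld is the distro feature named in the talk, the per-recipe line uses the generic LDFLAGS variable with a placeholder recipe name (my-recipe), and the _append and _pn- spellings assume the underscore override syntax of that period.

```conf
# conf/local.conf (sketch)

# Global: switch the system linker to LLD through the distro feature.
DISTRO_FEATURES_append = " ld-is-lld"

# Per recipe (assumed approach): pass -fuse-ld=lld only to one recipe.
# "my-recipe" is a placeholder, not a real recipe name.
LDFLAGS_append_pn-my-recipe = " -fuse-ld=lld"
```

Trying the per-recipe form first is a low-risk way to check LLD on an architecture other than x86-64 before switching the whole system.
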
One issue you will generally find is that if you optimize for size, Clang somehow doesn't like it: enabling LTO along with optimization for size doesn't work yet. There are bugs already open in the LLVM bug tracker.

I talked about additional tools, and one of those made available is the static analyzer. With Clang you have an option to enable it. You inherit scan-build in the same way, and you can set whether you want to scan every package or a particular package. In this case I've set SCAN_BUILD_pn-curl = "1", which means the static analyzer is enabled only for the curl package. If you want to disable it, you can do so by emptying that variable for the particular package. Once it has run, you can see the results with a task: run bitbake -c scanview and give it a recipe name. It gives you a link that you can open in a browser. I've given you a snapshot here of an OpenSSL scan that I ran, and you can see that it reported some issues. It's a good tool, and it's integrated into meta-clang as well. If you're interested, give it a shot.

On the extensible SDK: as I said earlier, you can build it as we covered before. Make sure CLANGSDK = "1" is set, then build and install it. Once it is installed, you will have the Clang tools as part of the SDK as well. I've given you some snapshots here with the commands for installing it and for setting it up for use.
There are some variables made available in the environment, such as CLANGCC, CLANGCXX, and CLANGCPP, which are the C compiler, the C++ compiler, and the preprocessor, and which point to Clang. By default your compiler in the SDK is still GCC, so if you're building an application and want to use Clang, you set CC=$CLANGCC, and that enables it. CLANG_TIDY_EXE is also made available, so if you run linters on your source code, you can use that in the SDK as well. At this point we make Clang a secondary toolchain. In the future you might be able to use Clang as the default toolchain in the SDK, but right now we don't do that, although there are tweaks in the system that let you do it if you want.

All right, moving on. Debian has a pretty decent-sized archive that is now buildable with Clang. This is a graph I took from clang.debian.net, and it tracks the progress from the LLVM 2.9 release onward, up to Clang 10.
Clang 11 has been released recently, but there is no data for it yet. As you can see, there's a gradual reduction in the failure rate, from 15% to below 5% now. Around 31,000 packages are tried, and just a little over 1,000 fail to build, so there is good progress. I'll go over some of the common errors that account for those 1,000-odd packages.

The imake failure accounts for, I don't know exactly how many, but probably hundreds of these failures. Internally, imake expects a traditional preprocessor, and the preprocessor you have with Clang does not implement the traditional CPP syntax. That is the crux of the problem. There is a -traditional-cpp option, but I think it doesn't implement whatever imake expects. I think a defect has already been filed, and there are workarounds.

Another error you'll see is that C++11 requires a space between literal and identifier. This is a warning that Clang emits but, for whatever reason, GCC didn't. You can disable it on the command line or address it in the source code. In many cases I've seen patches where people fixed the code, and in some cases the warning is disabled. This is also a large set of the errors you see in Debian.

Linking with LTO also fails. I think this is because of the way the gold plugin is invoked by the compiler driver: the driver is not able to find the gold plugin. The error shown at the top is not indicative of what's going on, but if you pass the absolute path to the plugin, it starts to work. In Yocto I think this issue doesn't happen.

This next one I think is an old problem. I'm not sure we have this issue with newer GCCs, but it depends on whether you have an old version
of GCC. Clang has always followed the C99 inline behavior, while older versions of GCC might default to GNU89. The behavior is a little different there: extern inlines work differently in the GNU89 model, and that might cause linking errors in C99 mode. What I've seen is that recent GCC and Clang behave very similarly by default, so I think these issues are probably addressed.

One issue we've seen is with the %n printf format. Clang errors out and says that an int * is expected, which is right, but doesn't find one. GCC doesn't warn about this, and the issue is seen in some packages as well.

In summary, as you can see, Clang can be used as a default system compiler. We've been using it for ARM machines and x86 and x86-64 machines as well as ARM64, and I've done runs of Yocto builds for PowerPC and MIPS, and they go through. So today, if you include meta-clang in Yocto, you can do world builds, except for the issues I talked about. Most packages do compile with Clang these days, and it's getting better with newer releases.

There is still some work to do. We need to get glibc and the GNU runtime compiling with Clang, or maybe make the LLVM runtime the default, and enable Clang-based builds for U-Boot, GRUB, and other bootloaders. Once we have that enabled, we will be able to build end to end using Clang. That's all I had today. Thanks, everyone, and now we can take some questions. Thank you.