Okay, it looks like people are there, so let's get this started. We don't have a whole lot of time, but if there are any questions or any problems, feel free to interrupt me at any time, and we'll just have to try to get through it either way. This talk is going to be about toolchain options in 2023, and probably most people are familiar with the current toolchain situation. There's one well-known, well-documented way to build cross-compilers, and that's probably what many people have been doing for years. But there are a couple of other options that are not as widely known and that might be worth investigating for new projects.

Traditionally, you would build binutils, then build a minimal GCC cross-compiler (C only, no threading support), then you'd build glibc, and then build GCC again with all the features you need. There are a couple of good reasons to keep doing exactly that: it's by far the most widespread approach, most third-party libraries and applications have been tested with it, chances are that if you have problems you can find people on IRC and forums who have done it before and can help you, and it works well and optimizes well enough. But like I said, it's no longer the only option.

Let's take a look at binutils first. That's a collection of tools that deal with object files. If you haven't used binutils directly, you've used it indirectly, because your compiler calls out to it: for example, whenever you're linking object files into a final executable, that is done by binutils, and tools like objdump and so on come from there. There are three major implementations, and there's a good chance you actually have all of them installed on your computer right now. First is GNU binutils, the standard implementation that has been around forever. Then there's elfutils, which is shipped with pretty much all Linux distributions these days to get libelf, which is one of its components, but it also contains its own implementations of pretty much all the tools provided with binutils. And then there are the LLVM binutils, which are part of LLVM, which is also included in pretty much every distribution out there, if only to build Mesa with it.

Elfutils has pretty good implementations of the tools that it provides, but it's lacking a linker, which is one of the key components you will need when you're compiling stuff. So elfutils by itself won't do it, but you can supplement it with GNU binutils, LLVM, or mold. The tools in GNU binutils and LLVM's binutils are pretty much interchangeable and use mostly the same parameters, but the tools that come with LLVM have one big advantage: all the cross targets are built in. So even if you just have an object file and you don't even know whether it was compiled for x86 or ARM or RISC-V or PowerPC or whatever, you can use llvm-objdump --disassemble, and it will just detect what it is and give you the right output (there's a short sketch of this below). So if you're expecting to work with disassembly across multiple architectures, that may be a good reason to check out the binutils parts of LLVM.

The next key part is the linker. If you don't do anything special, on most machines and most distributions you will have the BFD linker installed by default, and that will do its work in the background when you call GCC or Clang or whatever. But there are a couple of alternatives. The LLD linker, which is part of LLVM, has been pretty good at replacing GNU ld ever since about version 10; it's currently at version 16.
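To make the llvm-objdump point concrete, here is a minimal sketch; the file name, the triples, and the flags are illustrative assumptions, not from the talk. A single llvm-objdump binary disassembles object files for any target LLVM supports, because it reads the architecture from the file itself, whereas with GNU binutils you need the matching per-target objdump.

/* blink.c: a trivial function to have something to disassemble.
 *
 * Compile it for two different architectures (no sysroot is needed,
 * since we stop at the object file and include no headers):
 *   clang --target=aarch64-linux-gnu -O2 -c blink.c -o blink-arm64.o
 *   clang --target=riscv64-linux-gnu -O2 -c blink.c -o blink-riscv.o
 *
 * One and the same tool disassembles both:
 *   llvm-objdump --disassemble blink-arm64.o
 *   llvm-objdump --disassemble blink-riscv.o
 *
 * With GNU binutils you would need the matching cross objdump for each
 * target (e.g. aarch64-linux-gnu-objdump) instead.
 */
int blink(int x)
{
    return (x << 1) | 1;
}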
Back to LLD: there are a couple of compatibility problems you might see if you're using very complicated linker scripts, but other than that, LLD is pretty much ready to replace the other implementations. Another interesting option is the mold linker, which was started by the original developer of LLD, and it focuses on linking speed and parallelism. So if you're ever concerned about your application taking way too much time to link, that is an interesting option to check out. Another interesting part of it is that it supports both Clang LTO and GCC LTO. But its linker script support is somewhat limited, and full support isn't planned. So if you're using any complicated linker scripts, for example on an embedded system where you try to include all the system boot-up code right in your main binary, mold is not an option for the moment. But for bigger systems (desktop systems, servers, or even big embedded systems) it can certainly be interesting these days.

The next big component is the compiler. Obviously GCC has been around forever, and it still does a very good job. It supports C, C++, Objective-C, Fortran, Ada, Go and D, supports all the major architectures, supports the latest versions of the languages (C17, most of C++20, parts of C++23), and optimizes pretty well too.

The main alternative to it is Clang, which comes from the LLVM project. It also supports all the interesting languages: C, C++, Objective-C, Fortran. Frontends for other languages are available out of tree; for example, Rust always uses LLVM as a backend, sharing a lot of code with Clang, Swift uses LLVM, and Pony uses LLVM. Its architecture support is also pretty good: in addition to all the standard processors you'll have come across, it can also target a couple of GPUs, as well as WebAssembly and BPF. It also supports the latest versions of the languages (C17, C++20, C++23), optimizes them well, and has quite a few sanitizers that can help debug your code and find additional problems. Another interesting piece of information about it is that it's a cross-compiler by design. With GCC, you have to build one cross-compiler for every target; with Clang, all the cross-compilers are built in. So you use one compiler for all the platforms you might be targeting: you just say clang --target and give it a target triple, and it will target that architecture (there's a small sketch of this below). Another advantage of Clang is that if you are looking at the compiler's code itself, unless you've been working on GCC's code for a long time already, it's much easier to understand Clang's code than to get into working on GCC itself.

We've already talked about the targets. Performance is important, obviously, and the performance of Clang-built binaries and GCC-built binaries is pretty much similar these days. There are special cases where either one will perform better than the other. Overall, most benchmarks show that Clang 16 carries a slight advantage over GCC 13, but in most cases it's not very relevant; it's within a range of two to five percent. It's also interesting that Clang tries to be a drop-in replacement for GCC, so it takes pretty much the same command-line options, and you can switch one for the other without making too many changes to your makefiles or anything. Clang has been around since 2012, while GCC has been around since 1987. Obviously, the age is both an advantage and a disadvantage for each.
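Here's the small sketch mentioned above; the file name, the target triple, and the sysroot path are illustrative assumptions, not anything from a real project. It shows the same command line working with either compiler, Clang selecting its target with --target, and both drivers being pointed at an alternative linker with -fuse-ld.

/* hello.c, used to show that the two compiler drivers are largely interchangeable. */
#include <stdio.h>

int main(void)
{
    printf("hello from whichever toolchain built me\n");
    return 0;
}

/*
 * Native builds, identical flags for both compilers:
 *   gcc   -O2 -Wall hello.c -o hello
 *   clang -O2 -Wall hello.c -o hello
 *
 * Cross build with Clang: one binary, the target is selected with a triple.
 * The sysroot is a placeholder for wherever your target's headers and
 * libraries live:
 *   clang --target=aarch64-linux-gnu --sysroot=/path/to/arm64-sysroot \
 *         -fuse-ld=lld -O2 hello.c -o hello-arm64
 *
 * Choosing a different linker works with both drivers:
 *   clang -fuse-ld=lld  hello.c -o hello
 *   gcc   -fuse-ld=mold hello.c -o hello   (needs a reasonably recent GCC)
 */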
On the age question: GCC has been around for a longer time, which means it has collected more bug fixes and more experience over the years, whereas Clang was written quite a bit later, which means it probably doesn't have the same amount of experience in it, but it also doesn't have to carry around cruft from 1987 that has been obsolete for ages. So that's not a clear advantage for either one, but it's a notable difference. Another difference is in licensing: GCC is GPL, Clang is Apache 2.0. I'm not going to get into the flame wars about this; personally, I think both of those licenses have advantages, and neither one should be a reason to rule out using that compiler. And of course, there's a good chance you will be using LLVM in some form anyway, because for example Mesa uses it, and if you have any form of UI, that's likely to be on your device. But you'll probably also end up having GCC, because you will need libstdc++ if you are using C++, or you will need libgcc_s if you are using anything that uses the low-level compiler runtime and was built with GCC. So there's a pretty good chance that for whatever you're doing, you will end up building both compilers.

Compile time is also interesting. That can be significantly shorter with Clang, especially when we are talking about C++ code. It doesn't make much of a difference for things like building the kernel, which is C; they take about the same compile time there. But for example, building LLVM with GCC takes almost twice as long as building it with Clang, because that code base is heavy on C++ and Clang's C++ frontend tends to be much faster. Another interesting difference is that Clang's code is more modular, so most of the functionality is contained in libraries. If you want to create a new programming language, or embed a programming language into an application as a scripting backend, or if you are targeting new processors, or want to target architectures like WebAssembly, or embed compiler functionality like syntax checking into an IDE, there's a good chance you can use the libraries provided by Clang, and it makes more sense to use those than to come up with some hack that calls GCC from the command line, parses its output, and then decides on warnings and so on based on that.

There's one piece of good news: they are fully binary compatible. So you can build a library with GCC and then build an application that uses it with Clang, or vice versa, and you can even build one object file inside a project with one compiler and another object file with the other compiler, link them together, and they will just work (there's a small sketch of this below), unless you're using LTO, where the compilers use different formats. But there are tricks with which you can do even that; those tricks involve building the object files first, converting the LTO code into traditional object code, and then linking the files together. If you want to mix the compilers, you have to use the GCC support libraries, libgcc rather than compiler-rt, because Clang can make use of the GCC libraries by default, but GCC, unless you are doing some stdlib trickery in your makefiles, won't make use of the LLVM versions of those libraries.

Clang is not the only alternative compiler out there. There's also TinyCC, which is what the name implies: probably the smallest possible implementation of a full C99 compiler. The compiler source is smaller than four megabytes, compared to almost a gigabyte of uncompressed source for either GCC or LLVM.
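Before moving on, here is a minimal sketch of the object-level mixing mentioned above; the file and function names are made up for illustration. It works as described as long as LTO is not enabled (or the LTO objects are turned back into regular object code first).

/* square.c, build this translation unit with GCC:
 *   gcc -O2 -c square.c -o square.o
 */
int square(int x)
{
    return x * x;
}

/* main.c, build this one with Clang, then link the two objects with
 * either driver:
 *   clang -O2 -c main.c -o main.o
 *   clang main.o square.o -o mixed      (or: gcc main.o square.o -o mixed)
 */
#include <stdio.h>

int square(int x);   /* defined in the GCC-built object */

int main(void)
{
    printf("12 squared is %d\n", square(12));
    return 0;
}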
Coming back to TinyCC: obviously that small size has drawbacks, like not implementing all the optimizations. If you care about the quality of the binary that's being produced, chances are you want to use Clang or GCC. But TinyCC has other uses: if you want to embed it into your own application as a scripting backend, it makes sense not to make your application too big, especially if you're on an embedded device, so TinyCC might be a good option for that (there's a small sketch of that below). And in some cases it might even be sufficient to be the system's only compiler.

Then there's the topic of BSPs. Many board support packages come with a compiler, but the situation there is really not what it should be in most cases. Mostly what you will find is some outdated fork of an outdated version of GCC or Clang, and usually, in the time that has passed since those compilers were forked from the upstream projects, the upstream projects have added much better support for the hardware in question than whatever the BSP maker added. So unless you're working on a very special device which is not supported by upstream compilers, it's usually good advice to ignore the BSP toolchain and just build your own Clang or GCC and use that. Especially if you're making use of newer language features, the old BSP compilers tend to be far behind if you're looking into stuff like C++23. Sometimes that means you have to add a couple of kernel patches to support newer toolchains, because unfortunately, again, many BSPs include an ancient kernel. The proper fix obviously is to upstream the changes to the mainline kernel and use a mainline kernel, but where you can't do that, because the BSP has thousands of patches needed to make your hardware work, you can usually find the patches you need in the mainline kernel repository. So usually just backporting a patch or two from the mainline kernel to whatever 4.x kernel you might have in the BSP will make it work with the current compilers you've been building.

There's a bit of a special case for the Xtensa architecture: Clang doesn't have support for it upstream yet. It's supported by the LLVM libraries in version 16, but not yet by the Clang frontend. There is a vendor version that works there, and while working on libAPU, which is a library for talking to APUs, we have created a little fork of that, rebased onto LLVM 16, giving you the latest LLVM with all the Xtensa patches. We really hope both of those versions can go away with LLVM 17 merging all the remaining Xtensa patches.

Now, like I've already mentioned quickly, Clang and GCC are similar in performance across all the architectures I've looked at, which are AArch64, RISC-V 64-bit, and x86-64. But that doesn't mean you will never run into any surprises. For example, if you take the loop unrolling test from the Adobe C++ benchmarks on AArch64 and make it work on 32-bit integers, the GCC-built version will take 208 seconds, while the Clang-built version will finish in 14 seconds. But before you say "ha, GCC sucks, Clang is so much better", you don't even have to switch benchmark suites: you just run a different file from the same benchmark on x86-64 and you get GCC outperforming Clang.

So what are the conclusions for compilers? GCC and Clang are both good; there's no clear winner, and both have been used to compile full systems, including the kernel, these days. Most Linux distributions are built mostly with GCC, with the notable exceptions of OpenMandriva and Android, which are built with Clang.
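Going back to the TinyCC embedding use-case for a moment, here is a rough sketch of what using TinyCC's libtcc as an in-process scripting backend can look like. It is written from memory against the libtcc.h API of the 0.9.27 release (tcc_new, tcc_compile_string, tcc_relocate, tcc_get_symbol), so check the header shipped with your TinyCC version before relying on it; the embedded "script" string is made up.

/* host.c: compile a C snippet at runtime and call into it, all in-process.
 * Build with something like: cc host.c -ltcc -o host (possibly plus -ldl). */
#include <stdio.h>
#include <libtcc.h>

static const char *script =
    "int count(int n) {"
    "    int s = 0;"
    "    for (int i = 1; i <= n; i++) s += i;"
    "    return s;"
    "}";

int main(void)
{
    TCCState *s = tcc_new();
    if (!s)
        return 1;

    /* Generate code straight into memory instead of writing a file. */
    tcc_set_output_type(s, TCC_OUTPUT_MEMORY);

    if (tcc_compile_string(s, script) < 0 ||
        tcc_relocate(s, TCC_RELOCATE_AUTO) < 0) {
        tcc_delete(s);
        return 1;
    }

    /* Look up the freshly compiled function and call it. */
    int (*count)(int) = (int (*)(int))tcc_get_symbol(s, "count");
    if (count)
        printf("count(10) = %d\n", count(10));

    tcc_delete(s);
    return 0;
}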
As for the BSDs, they are mostly built with Clang. Quite a few built-from-source distributions offer both choices. If you're using Yocto, it uses GCC by default, but you can pull in the meta-clang layer to get Clang support, and that works pretty well too. So Clang makes it easier, unless you are already very much into GCC's code base, to add new architectures or new languages, or to use the compiler as a library; if you are planning to work on the compiler itself, Clang is probably the way to go. If you are using glibc, at the moment you still need GCC to build it; Clang support for glibc is still in progress. So if you don't need any of the extras offered by Clang and you want to go with just one compiler, that's a good reason to use GCC for that particular system.

And of course, the overall conclusion is: multiple compilers are good. If you can, try to build your code with multiple compilers. You can find out which compiler works best for the particular code you are developing, you won't run into nasty surprises like the compiler being 20 times slower than it should be just because you didn't try anything else, and it will also help you find bugs, because different compilers warn about different problems. I found a construct like the one sketched below in a Bluetooth driver, where they actually assumed that, passing a character array and using sizeof on it, it would tell them the size of the array. But it obviously returns the pointer size of the target system, because it's being passed around as a pointer. Up until very recently, only Clang would warn about this; a similar warning has been added to GCC lately. And I'm sure there are other situations in which GCC will warn about something that Clang silently allows. So it is always useful to run your code through multiple compilers and see what they have to say.

The next component of a toolchain is a libc. The default option that pretty much everyone is using all the time is glibc. It's the most widespread, most complete, most standards-compliant thing there is, it's very well tested, and it has very complete architecture support. But its code is not very readable, and it needs GCC to be compiled, so if you are opting for a Clang-based system, that may be a reason against using glibc. It's also not very optimized for small systems; it's rather big, roughly four megabytes for the dynamic loader, libc and libm. So especially if you are targeting a low-end system, you might want to look at some alternatives.

One of those is musl. It's also complete, fast and relatively small, 785K compared to glibc's four megabytes. It's designed for newer versions of the standards, C11 and POSIX 2008, and it also implements many glibc extensions, Linux extensions and BSD extensions, so you probably won't run into many problems at compile time while you're porting to it; chances are it will just work. It also has really good architecture support, and the code is quite a bit more readable. It has been around since 2011, which is once again the same situation as with GCC and Clang: one has been around longer and can probably be considered a bit more mature, the other is newer and therefore doesn't carry around all the cruft. So that's an advantage and a disadvantage at the same time. One thing people frequently point out speaking against musl is that systemd claims it needs glibc. So you might think that if you want to use systemd on a system, you have to use glibc, but that's not actually true.
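The construct from the Bluetooth driver example above, reconstructed as a minimal sketch with made-up names (this is not the actual driver code): an array parameter decays to a pointer, so sizeof on it gives the pointer size of the target, not the length of the array that was passed in.

#include <stdio.h>
#include <string.h>

/* The parameter is really a 'char *' no matter how it is declared, so
 * sizeof(buf) is the pointer size (typically 8 on a 64-bit target),
 * not the caller's 64-byte buffer. Current Clang and GCC both have a
 * warning for this kind of sizeof use. */
static void clear_packet(char buf[64])
{
    memset(buf, 0, sizeof(buf));   /* clears 8 bytes, not 64 */
}

int main(void)
{
    char packet[64];
    clear_packet(packet);
    printf("sizeof in the callee: %zu, sizeof at the caller: %zu\n",
           sizeof(char *), sizeof(packet));
    return 0;
}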
Coming back to systemd and musl: if you edit the makefiles a bit and patch around three lines of code, at least the basic components of systemd will work perfectly well with musl. It's possible that there are some components that won't, but at least I haven't run across any so far.

Another option is uClibc-ng, which is a project that has been around for a long time, although it has also stopped a couple of times and then been restarted. One of the most interesting features of this one is that it's easy to strip down. It has a configuration system similar to the kernel's, where you can just leave out parts of the libc that are part of the standard but that you won't need in your particular embedded system. For example, you can leave out libm, or you can leave out the stdio stack, or things like that, if you don't need them. It supports many processor types, including ones that don't have an MMU. So if you are targeting the very low end, it's certainly worth a look as well.

Then we have klibc, which was originally written for the early boot process; some distributions like Debian use it in their initramfs. It only provides a subset of the libc functions, and it's optimized for size over performance. It tries to make direct use of kernel structures so it doesn't have to convert types: for example, in other libcs the kernel might have a different idea of what's in struct stat than the libc, while in klibc the kernel structures are used. Because of that, it's extremely small; you can get a full libc in 75 kilobytes. It's probably not powerful enough as a general-purpose libc, but it might be an option for some embedded systems. One problem is that it uses GPL kernel headers, and the resulting license situation is not 100% clear: when does that make it a derived work, so that you have to put everything that uses it under the GPL, and when doesn't it really matter because these are standard interfaces? But of course, if you're building a system that is fully GPL anyway, that doesn't really matter.

One thing that is coming up, not yet fully finished but looking interesting, is the LLVM libc. It's in early stages; some code is there, and you can look at it in the LLVM git repositories. It's potentially interesting because it has been designed from scratch to work with sanitizers and fuzz testing, and it targets only C17 and up, so it doesn't carry around any of the ancient cruft that you might want to get rid of. A design goal is to use source-based implementations whenever possible, instead of using assembly like many of the other libcs. That is a deliberate choice because it's being written by compiler developers: the idea is to fix the compiler to generate the code that you would otherwise handcraft. While it is not yet ready, the LLVM project has a track record of delivering good toolchain options, both in binutils and in compilers, so it's certainly worth mentioning already. But if you're planning to ship something this year, it's probably not an option yet.

Another option there is Bionic, which is what Android uses. It's originally based on a BSD libc. It currently supports ARM and x86, both in 32- and 64-bit variants, and RISC-V support is underway. It's rather well optimized because of the vendor support Android is getting. In its early days it used to be pretty much unusable for a regular Linux system, because it wouldn't provide System V shared memory and the other things you would need to get an X11 stack running, but these days it has pretty much caught up on that.
But at the same time, it has added more and more Android-isms that are not used outside of Android, like APEX and system properties, and the build system is totally tied to the Android tree. So even just building it outside of the Android tree to make use of it in any different context is a bit of a challenge. But of course, an advantage of using Bionic, even if you're not on Android, is that closed drivers written for Android can be used in a different Linux system that uses Bionic without having to go through something like libhybris. So if you're trying to build anything that is in any way a Linux/Android hybrid system, Bionic might just be the way to go.

There are a few other options that should be mentioned. Newlib is a pretty much complete implementation of a C library, mostly for embedded devices. It only supports static linking, so as soon as we are talking about bigger devices it becomes less interesting, but it is certainly useful in the low-end embedded world, and it's used by most Zephyr builds today. Zephyr is currently transitioning to picolibc, which is a fork of newlib and also of AVR libc, from which it took mostly the stdio bits. It frequently incorporates changes from newlib, so it's probably a better option than newlib itself, because it gives you newlib plus all sorts of extensions, a better build system, thread-local storage and things like that. Then there's dietlibc, which is really optimized for small size and static linking, although it supports dynamic linking even though it's not really meant for it. It's not very actively maintained, and it's under the GPL license, not the LGPL, so you might have a license problem there if you're building any components that aren't GPL. And lastly, there's the option to take a BSD libc and port it to Linux, which has been done, for example, by Bionic initially.

So, conclusions: there are many interesting options, and again, there's no clear winner for every situation. My advice is: if you need maximum compatibility with other systems, go with glibc, because that's what everyone in the desktop and server world and on many embedded devices uses. If you need something that is full-fledged but smaller and more memory-efficient, musl. If you just need a subset, and you want the possibility to strip out features of the libc that you don't need, try uClibc-ng. And if you want to experiment with Android features on an embedded device, Bionic might be worth looking into as well.

The next thing is C++ support. There are primarily two contenders for the standard C++ library. One is libstdc++, which is part of GCC and used by pretty much all the Linux distributions, even those that use Clang as their primary compiler; Android is probably the only major exception, using libc++ from LLVM instead. If you expect stuff to just compile without having to add missing includes or anything, there's a good chance you just want to use libstdc++. The main other option is libc++ from LLVM. It's newer and smaller than libstdc++ and carries less cruft to support ancient code. Many benchmarks show it performing better, but not all; it's the same situation as with GCC and Clang, they are similar in performance, and you can craft a benchmark that will show one being much superior to the other, and another benchmark that shows the opposite. One problem is that you can't mix the two.
So, for example, if you build Qt against libc++, and then you get a third-party binary that uses Qt, but that Qt was built with libstdc++, that is going to clash, because the two libraries use many of the same symbols but are not fully binary compatible. So try to pick one of the C++ libraries. There's one context in which that's different: a lot of third-party applications, like Chromium if you use the pre-built binaries, are built with libc++, but those projects have made sure that they are not being mixed with C++ libraries that have been built against libstdc++, which is one of the reasons why, for example, Chromium binaries usually include a lot of shared libraries that are just duplicated from the system.

There's also uClibc++, which is an interesting option in concept. It was an attempt to write an STL implementation that goes along with uClibc, so, like uClibc-ng, it had all the features to strip out stuff you don't need, but it hasn't been maintained since 2016. So if you need some of those features, like being able to rip out parts of the STL that you don't need, you may want to revive the project; but if you're not prepared to revive it, it's not seeing any upstream maintenance today, and it's not worth using unless you spend significant time on it. There are a few more implementations, like STLport and the Apache C++ Standard Library, but they have been unsupported since 2008. One thing that is possibly worth a look, especially if you're looking into working on the STL itself, is the MSVC STL, which has been opened up under the Apache license recently. It has not yet been ported to Linux, and the project has no intention of doing so, but if you're interested in looking at how some platform-independent things work, it might be another interesting place to look at, copy some code from, and maybe merge into libc++ or libstdc++.

Conclusions: essentially, there are two options that are generally useful. If binary compatibility with other Linux distributions is a concern, you certainly want to use libstdc++, because that's what everyone uses. And if you use Clang and you care about performance and memory efficiency, and not so much about compatibility, libc++ is probably the one to try.

Lastly, the debugger component of toolchains. GDB has been around for a long time and is still being maintained actively, and, as usual, LLVM has come up with an alternative, which is LLDB. Both do pretty much the same job, and both do it well. LLDB provides many command aliases for compatibility, but it's not fully compatible, and its native syntax tends to be cleaner, having been designed 20 years later, but it's also more verbose. So if you like typing short one-letter commands, GDB is still the one to learn. LLDB has good C++ support and can evaluate expressions using the built-in Clang and LLVM JIT. So if you don't know how to use either of the debuggers yet, LLDB is probably the interesting one to learn, but if you're already using GDB, there's no really compelling reason to switch.

Now, that's all, and we have one minute left. So if you have any questions or feedback, let me know now, or contact me there. And if you have any bags of cash for me, just throw them at me, but not where the tax office can see them.