Hello everybody. Welcome to this session on advanced systemd for the embedded use case. I decided to do this presentation after discussing systemd for the embedded use case with various people at ELCE last year, and realizing that most of the people I know who work more or less in the embedded world don't know systemd well enough to really use it at its best. I mean, systemd is a very advanced tool with lots of features, and you pay a cost for these features. It's really a shame not to use them, because they really are useful in the embedded use case and will greatly simplify writing our code, in particular all the dedicated code that can't be reused between projects.

Just a little word about what I'm going to say in general, because of obvious time constraints. This will not be a tutorial. This will be more like a feature list. I will just go through all the things that systemd can do for you, and if you really want to know how to do it, we can have a talk afterwards, or you can just go to the man pages. systemd has very good man pages. I'm from the embedded world, but the embedded industrial world: not mobile phones, not IoT. Well, I do a little bit of IoT, but that won't be the subject of this talk. So that's a limit on the point of view I have and the kind of problems I solve.

Another thing to know is that I'm someone who knows systemd really well, and I actually teach systemd professionally, but I also work as an expert on embedded projects that use systemd. So when I say that a feature is easy to use, it usually means that I end up explaining that feature to people, they get it right away, they use it correctly, and I never hear from them again. When I say a feature is a bit harder to use, it usually means that when I explain it, people have a little problem and I need to do a little intervention to get things working. So some things are useful but a bit harder to use, and some things are really easy.
I wanted to do some metrics and measurements. I've done a few for this presentation, very few. You can see my system there. It's pretty simple: a minimal Buildroot image with basically nothing in it, no useful software started. I went from using System V init to systemd and the image grew from 9 megs to 17 megs. So that's a back-of-the-envelope metric of how much space systemd costs. I wanted to measure boot times, but it turned out not to be really significant, because most of the boot time on such a simple system was DHCP and network time synchronization, nothing really relevant. So there we go.

This talk will be in three parts. The first part is about what I call headline features, which are things you probably know systemd can do for you, but we'll go a little bit deeper into them to show how good they are for the embedded use case. Hidden gems are features that, unless you really know systemd, you probably don't know about, and which can really, really save you time. And then I'll go on to discuss some features of systemd which are pretty useful, but usually not for us, not for embedded people. So I'll give a quick word about why they are there, why I usually disable them and, if relevant, why I sometimes use them anyway.

So the first thing, and I think probably the most important thing with systemd, is that when you start a daemon with systemd you have a very, very fine-grained and very easy-to-set-up environment for your daemon. When you start a process on a Linux system there are all sorts of things you can set up for the process. There are the ones everybody knows about, like environment variables, or which user and which group to use. Some people will set up some limits with rlimit and that sort of tool. But there is actually way more you can set up, and most of these things are considered advanced by most users, hard to set up, and usually are not set up at all. With systemd those things are easy to do, and that's the main point.
Let's start with the standard file descriptors. If you have a System V init script, where do the standard output and the standard error of your daemon go? Nobody knows. If you're lucky they will go to /dev/null. Most of the time they will go to whatever standard output is configured, which will be the console. Some startup scripts redirect them to files, which is not good either. With systemd you just have one parameter which allows you to set up the most common cases for these standard file descriptors, like sending them to syslog, sending them to the journal, which is the default, or sending them to /dev/null. For the input you can give a character string or some binary data which will be fed to your program, and so on and so forth, so it's really easy to set up.

There are some more advanced things that most people don't know how to set up, but again systemd can set them up for you easily. That would be things like Linux capabilities: you can run your program as root with limited power. Syscall filtering, which can forbid your daemon from making some system calls. There are all sorts of settings for what I call file system masking, which is allowing your daemon to only access some parts of the file system, and that's not Unix permissions. That's actually using bind mounts and mount namespaces to limit what your daemon can do. And there are all sorts of advanced features, like forbidding a daemon to load kernel modules, through various layers of security.

All these things are easy to set up with systemd and can be checked globally with systemd. It has a very good security analysis tool which will take all the services, to use the correct term, on your system and will check for every one of them which security features are enabled and which are disabled. Doing an audit is easy, it's something you can actually do, and that's the main point. You really are in control.
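To give an idea of what this looks like in practice, here is a minimal sketch of a service that uses the settings just mentioned. It is written as a small Python helper that renders the unit text, so the result can be inspected or written to a file. The directive names are real settings from systemd.exec(5); the daemon path, the description and the values chosen are hypothetical and would need adapting to a real daemon.

```python
"""Render a hardened systemd service unit as text (illustrative sketch)."""

# Directive names are real systemd.exec(5) settings. The values below
# (capability kept, writable path) are hypothetical examples.
HARDENING = {
    "StandardInput": "null",                          # stdin from /dev/null
    "StandardOutput": "journal",                      # stdout goes to the journal
    "StandardError": "journal",                       # stderr goes to the journal
    "CapabilityBoundingSet": "CAP_NET_BIND_SERVICE",  # root with limited power
    "SystemCallFilter": "@system-service",            # syscall filtering
    "ProtectSystem": "strict",                        # read-only view of the file system
    "ReadWritePaths": "/var/lib/sensor",              # the one place it may write
    "ProtectKernelModules": "yes",                    # no module loading
    "NoNewPrivileges": "yes",
}


def render_service(description, exec_start, settings=HARDENING):
    """Return the text of a .service unit using the given [Service] settings."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
    ]
    lines += [f"{key}={value}" for key, value in settings.items()]
    lines += ["", "[Install]", "WantedBy=multi-user.target"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # /usr/bin/sensord is a made-up daemon used only for illustration.
    print(render_service("Sensor daemon", "/usr/bin/sensord"))
```

Once such a unit is installed, running `systemd-analyze security` on it should report which of these protections are enabled and which are still missing.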
You can really build a jail for your daemon, and you can do it easily through a very simple configuration, even though some of this stuff is really tricky to do by hand.

We're talking a little bit about security here. This is not a security talk, but it's important to understand where systemd fits in the security model. When you're securing a daemon you have two layers of security, both very important but totally different. The first layer is your daemon itself. The daemon must validate its input. It must work correctly, and in general nobody should be able to use it to do things that it is allowed to do but should not do. An example would be a database. A database is allowed to erase all data, but it shouldn't do it. That's writing a correct application. That's not where systemd acts in terms of security.

systemd acts as a second layer. It's more the idea that your daemon has been corrupted: someone malicious has somehow taken complete control of it. systemd is there to make sure that it cannot do more than what the daemon is theoretically allowed to do. How does it do it? It configures the kernel. It does not do any security checks itself. It just configures all the security mechanisms of the kernel. Those mechanisms are extremely powerful, but most of them are hard to set up. If anybody has read the man pages for namespaces or for capabilities, those are extremely hard to understand. It takes some time. Once you wrap your head around it, you can figure out what's going on, but it's really, really hard. With systemd, using those mechanisms is really easy. Some very high-level constraints are trivial to do. You want your daemon to have a read-only view of the whole file system? Tell systemd to do so and it will be done. You want your daemon to run with the no-new-privileges flag? It's one line in systemd. systemd helps with security not because it's foolproof or very well coded. Well, I think it is, but that's not the main point.
The main point is that all those security features are really easy to set up, and that's a huge gain.

So that was your daemon. You configure it, you start it, and you control everything in its environment, so you can secure it, jail it and have it well controlled. Now the other aspect where systemd is really important is all the synchronization: starting the daemons in the right order, having a startup logic and restarting daemons. This part is a bit tricky. That's a part where I usually have to come back and give a few tips after the fact. But overall, once you get it working, once you understand the actual logic, it's extremely powerful. When you have bugs, they tend to be easy to reproduce and so easy to debug. And when you have complex use cases for restarting daemons, that's where systemd really shines.

When you're in an embedded system, your system needs to take care of itself. There is no admin. There are no crashed instances that get replaced by other instances. Your system is on its own. It has to detect when things go wrong. It has to deal with things going wrong on its own. And that's where systemd really helps you, because this whole logic of robustness, not crashing, or restarting automatically when something crashes, you don't have to write it. You don't have to write a single line of code in your daemon. Well, you have to write one line, the one that tells the watchdog you're still alive. Everything else systemd can do for you.

First, it has a very robust startup logic. It's a logic that has been written to deal with all the various startup behaviours that Unix daemons have had over time, things like privilege dropping, or daemons that fork or don't fork. All of that is something systemd is used to dealing with. And everything is protected with timeouts, so you will never have a daemon that is completely stuck, because systemd is watching it. It also has a great system for readiness detection. So what is readiness detection?
The idea is that when you start your daemon, the moment you start the process is not the moment when the daemon is ready to work. It needs some time to prepare. So you want your daemon's dependencies to start a bit later. systemd has various ways to detect readiness, but the most useful one, especially when you write your own daemons and you know you will be using systemd, is to just tell systemd. There is one call in the systemd API, one function. You call it, that tells systemd you're ready, and systemd will synchronize any dependencies on that signal. So it's one line of C code, or of whatever binding you like to use in your own language, to get it working. It's pretty easy.

You can easily ask systemd to run scripts before your daemon, and since the pre-start scripts are part of the service, you are guaranteed that they will always be run correctly. You can also have post-start scripts. Those are scripts that will be run by systemd after your daemon has said it is ready, but before dependencies are started. That's really handy when you have some cleanup to do, and boy, I don't know about you, but in the embedded world we have some very badly written daemons. So having that sort of tool is really awesome.

Watchdogs: systemd provides you with a software watchdog. systemd itself will feed the hardware watchdog. So if the kernel crashes, the hardware watchdog will not be fed. If systemd crashes, the hardware watchdog will not be fed. And then systemd itself provides software watchdogs for any application that needs one. So if your daemon is coded for systemd, you have one line of code to add in your main loop and that's it: your watchdog will be working and systemd will monitor your application. And if your application does not feed its watchdog in time, systemd will stop it and restart it in a very well-defined way, again with all the timeouts you want. If you want it to restart the whole machine, you can do that too.
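Here is a minimal sketch of those two "one line" calls, readiness and watchdog. In C you would call `sd_notify()` from libsystemd; this version is an assumption-laden reimplementation in plain Python of the documented notification protocol, which is just a datagram sent to the Unix socket named in the `NOTIFY_SOCKET` environment variable. The `run()` loop and its `work` callback are hypothetical placeholders for a real daemon's main loop.

```python
import os
import socket
import time


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process was
    not started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True


def run(iterations: int, work=lambda: time.sleep(0.1)) -> None:
    """Hypothetical daemon main loop, bounded so the sketch terminates."""
    # ... daemon initialisation would happen here ...
    sd_notify("READY=1")         # dependencies may be started from now on
    for _ in range(iterations):
        work()
        sd_notify("WATCHDOG=1")  # feed the software watchdog


# Matching unit settings (real directives, example value):
#   [Service]
#   Type=notify
#   WatchdogSec=10
```

With `Type=notify` the service is only considered started once `READY=1` has been received, and with `WatchdogSec=` set, a daemon that stops sending `WATCHDOG=1` within the interval is treated as failed.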
And the whole logic is there and ready to go. You can configure a restart very precisely, including grace periods if you need them, and burst protection. A grace period is basically you telling systemd: I want my daemon to wait a couple of seconds before being restarted. And systemd will guarantee that you get that couple of seconds. That's pretty good when you have some hardware that needs some time to restart, but you cannot detect when it is done. And burst protection is basically the idea of saying to systemd: I want this daemon to be restarted as soon as possible, but if it restarts more than 10 times in a five-minute window, just kill it.

So the whole thing is a little tricky to understand. As I said, you have to understand the whole mechanism. But once you get it going, you have a very well-defined dependency and ordering system. You have very precise control of how, when, and in what order your various daemons are started. And again, it's good not just for having a faster boot. We'll discuss boot time a little bit later. But really, unlike System V init, you have a real understanding of what's going on. And it's really easy to get it bulletproof, as in: your system will never be stuck. I mean, it might reboot. It might restart daemons. But you have someone who is actually monitoring everything, making sure things work as they need to, and who has very precise instructions on what to do in every case. So not only is your daemon much easier to write, because you don't need to do the cleanup yourself, you don't need to drop privileges yourself, you don't even need to fork yourself anymore, but the whole thing is configurable by the sysadmin if you need to tweak it. And overall it works really well.

Even if systemd is first and foremost a system to monitor your daemons, it's also in charge of boot. And in that regard, systemd has a few not so well-known features that are really great for us embedded people.
The first one, and probably one of the most important ones, is boot blessing. That's a rather new feature. Basically there is a point in systemd where we decide whether a boot is successful or not. And what's really great is that this is a neutral point, which is independent of the hardware, independent of the distro you're using, and independent of the over-the-air upgrade system you're using. So you can add any test you want before that point, and it will influence boot blessing in a neutral way that can be reused from product to product, without anything specific to any hardware. And on the other hand, you can also add scripts after that point that are specific to your boot loader or your over-the-air system, and they will take into account any script that comes before the boot blessing, which, again, is neutral. So it's very good, because you can do those tricky boot blessing things in a neutral way.

When systemd boots, it has a boot target. Without going very far into detail, it's basically the same thing as a runlevel. systemd has those boot targets and you can have multiple boot targets, and that's really handy, because in the embedded world it's very common that we want to boot in different ways depending on various criteria, the most common being production boot versus development versus factory test, and so on and so forth. Not only can you have multiple boot targets, but systemd has this command called systemctl isolate, which allows you to switch mode on a live system, including things like exempting some services so that they are not killed when you change mode, if you have, I don't know, a watchdog or something like that. So the whole thing is pretty handy and pretty versatile, and you can use it in any way you want.

One of the very good things with systemd is that you have some very precise boot time analysis tools, which know how systemd starts things, take parallel startup into account, and really tell you what you're looking for.
Reducing boot time is a very, very common problem in the embedded world, and we usually do a lot of guessing in this area. With systemd, no more guessing. You have a real tool which will tell you which daemons you've been waiting for, and which daemons took a long time to start without you actually waiting for them, so that you're perfectly fine and don't need to optimize those. And usually the first time you use those tools you get some surprises, because time is not spent where you think it is.

The last thing I really like is generators. Generators are small executables that systemd will run very early in the boot process, and their job is to generate new configuration files for systemd, new units. That's pretty handy for us in the embedded world, because a very common thing we're asked to do is to have one piece of hardware in multiple products, and to detect which product we are on and boot differently, or to have multiple pieces of hardware with one image, because they have the same CPU, and we need to detect what hardware we're on and boot differently. We usually do that using CPU IDs or some hardwired GPIOs that tell us what we are. Generators are typically the answer. I've used them a couple of times to read GPIOs and decide which services needed to be started, or to choose a boot target. Once I also had a customer who wanted everything to be configured with a single XML file, and that included starting multiple services in different ways, so I had to write an XML parser that would generate a systemd configuration file. Dealing with XML was hard, but actually getting systemd to do what I wanted was easy.

A small side comment about systemd and why it boots faster. I was not able to get relevant boot times, but most people agree nowadays that systemd actually gains you boot time. But the question of why is rarely answered, or at least not answered correctly. The first reason is parallelization. That's a well-known reason.
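Going back to generators for a moment, the idea can be sketched in a few lines. This is only an illustration, under stated assumptions: the board identifier is read from a file (on real hardware it might come from a GPIO or a CPU ID), and the service names, the mapping and the unit directory are hypothetical. What is real is the calling convention: systemd runs a generator with three output directories (normal, early, late) as arguments, and a `.wants/` symlink placed there pulls a unit into the boot. Real generators are usually small C or shell programs, since they run very early.

```python
#!/usr/bin/env python3
import os
import sys
from typing import Optional

# Hypothetical mapping from a board identifier to the service to pull in.
SERVICES = {"0": "product-a.service", "1": "product-b.service"}
# Assumed location of the installed unit files.
UNIT_DIR = "/usr/lib/systemd/system"


def generate(normal_dir: str, board_id_path: str) -> Optional[str]:
    """Add a multi-user.target.wants/ symlink for the detected product.

    Returns the path of the created symlink, or None for an unknown board.
    """
    with open(board_id_path) as f:
        board_id = f.read().strip()
    service = SERVICES.get(board_id)
    if service is None:
        return None
    wants_dir = os.path.join(normal_dir, "multi-user.target.wants")
    os.makedirs(wants_dir, exist_ok=True)
    link = os.path.join(wants_dir, service)
    os.symlink(os.path.join(UNIT_DIR, service), link)
    return link


if __name__ == "__main__" and len(sys.argv) == 4:
    # argv[1..3] are the normal, early and late generator directories.
    # The GPIO path below is a made-up example of a hardwired strap pin.
    generate(sys.argv[1], "/sys/class/gpio/gpio17/value")
```

The same pattern extends to the XML case: parse the customer's file, then write out unit files or symlinks into the directory systemd hands you.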
systemd starts all daemons that can be started in parallel. All those daemons will be started simultaneously. That's really good at saturating both CPU and disks, to the point that systemd used to have a readahead feature to preload pages into memory, and they dropped it because the I/O bus was saturated anyway, so you didn't gain any time by doing that. So that's the first and most well-known boot time optimization in systemd.

The second one is socket-based dependencies, which basically means that if you have a daemon that writes to a socket, the socket needs to be created before the daemon can be started. However, if another daemon reads from that socket, the reader does not need to be started right away. And systemd deals with that. That means systemd can create the socket for you, start both the writer and the reader at the same time, and use the socket as a sort of buffer between the two to allow them to start in parallel. That's particularly useful with syslog or journald, because then they do not block all the daemons that need to write to the syslog.

systemd has a very robust on-demand startup system, which means that daemons will be started only when they are needed, and that includes not starting daemons that are irrelevant for your particular hardware. I don't know if you remember booting an old Ubuntu, like five to ten years ago, but you would have CUPS started, you would have SANE started, and they would be started even if you did not have a printer or a scanner. I don't know if anybody noticed, but those days are gone. Even on rich desktop distributions, SANE is not started anymore. Why? Because systemd waits to see if there is a scanner before deciding to start the daemon. As a consequence, you have far fewer daemons started at boot, because useless daemons are not started.

And last, when your init system is based on shell scripts, you start a lot of shells.
You start a lot of subshells and you start a lot of commands. I mean, whenever you want to change permissions on a file or change user, you have to launch a command. And starting a process on Linux is fast, but it still has a cost. systemd does all the setup for your daemon, the environment I mentioned on my first slide, by calling the system calls directly and not by starting processes that exist only to make one system call. So the number of processes started is significantly lower with systemd. On my minimal system with nothing on it, which just started a shell, the PID of the first shell after startup was 72 with systemd and 155 with System V init scripts. And again, nothing was started. It was just starting that shell. So that's why I think systemd is faster at startup.

So those are mainline features of systemd, but features that are really useful for us in the embedded world, and I would say especially for us in the embedded world. I'm now going to move on to some features that most people probably have not heard of but that are still really useful for us.

The first one is the one everybody loves to hate. It's journald. So, logs. Let's just put it that way: in the embedded world, we don't really like logs. They clutter our file systems. They eat our flash drives. And on an embedded system that will end up under the sea, or in a car, or on a plane, nobody will ever read the logs, because we don't have any sysadmins. We don't have any log centralization. So most of the time logs are a real problem. journald helps a lot with logs.

The first thing that makes journald awesome is that it's exhaustive. Except for applications that write directly to their own log files, everything else ends up in the journal. Every daemon started by systemd, unless you explicitly configure it differently, will have its standard output and standard error sent to the journal.
Everything that goes through syslog will end up in the journal. Any kernel message, so what you usually get with dmesg, will end up in the journal. Audit messages end up in the journal. And if you use containers (I never use containers in embedded systems, but you never know), all your container messages can end up in the same journal, the journal of the host, which means that you have them centralized for free.

The next thing which makes the journal really awesome is that it collects metadata. If you use traditional syslog, you only get one line of text per message. That line of text may contain a timestamp. It may contain a PID. Sometimes it doesn't. Sometimes it doesn't even contain the name of the program that generated the log. With journald, the journal will collect every possible piece of metadata it knows about, and some of it is really great.

The first one, which is really awesome, is reliable timestamps. journald records its own timestamp when it gets a message and will thus keep all messages from all daemons in one consistent order, which is the real order. That's pretty awesome. And as a reminder, I said earlier that journald collects messages from all containers, which means that all your containers are in the same journal with reliable timestamps. So everything is perfectly ordered between containers.

It stores a boot ID with every message. The boot ID is basically a random string that is generated at boot and that uniquely identifies that boot of your machine, which means that you can easily get all the messages from a particular boot. You also get all the process information: the PID of your process, the command line of your process, the UID that launched your process, what capabilities your process had at the time, and so on and so forth. There's all sorts of stuff. It's pretty useful.
The next thing that's really cool for us in the embedded world is that journald is not just there to collect logs and write them down in files. It actually has a complete API which allows you to control it. One thing we commonly have to do in embedded systems that have a user interface, so ones with a screen or something like that, is to have some sort of debug menu where you can scroll and view the logs. With journald you have an API which allows you to search the logs and to follow the logs continuously. You have a poll-aware mechanism, and you can give this poll-aware mechanism filters, so you will only be woken up when a given type of message arrives. You can find where you were last time when you restart, which means that your log visualization tool can catch up if it crashes. So your log tool is monitored by systemd. It crashes. It restarts. It reconnects to the journal. It will get all the messages it missed while it was down. And that's for free. It's easy to do. This is an area where I just told someone: you have this API, go look into it. And then the application was working.

You can store any extra data you want in the journal when you're storing a message, not just one line of text, and that includes binary data. The systemd people use this feature, the fact that you can store binary data with messages, to store core dumps inside the journal. I'm personally not convinced it's a great idea, but it can be done. What is great is that you can store any binary data in the journal, and I usually use it to store faulty bus frames. When you debug a system in a real environment, commonly you're not there. You just give your system to a mechanic. The mechanic will put it in the car. The car will go and run around a circuit for hours, and then it comes back and you have to see what has been going on.
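As a rough illustration of storing a binary frame next to a message, here is a minimal sketch that speaks journald's native datagram protocol directly from Python. In a real project you would more likely use `sd_journal_send()` from libsystemd or a binding; this version only assumes the documented wire format (text fields as `NAME=value`, binary fields as the name, a newline, a 64-bit little-endian length, the data and a newline) and the usual socket path `/run/systemd/journal/socket`. The `BUS_FRAME` field name is a made-up custom field.

```python
import socket
import struct

# Usual location of journald's native-protocol socket.
JOURNAL_SOCKET = "/run/systemd/journal/socket"


def encode_entry(fields: dict) -> bytes:
    """Serialise fields using the journal's native datagram format."""
    out = bytearray()
    for key, value in fields.items():
        name = key.encode()
        if isinstance(value, bytes) or "\n" in str(value):
            # Binary-safe form: name, newline, 64-bit little-endian
            # length, raw data, newline.
            data = value if isinstance(value, bytes) else str(value).encode()
            out += name + b"\n" + struct.pack("<Q", len(data)) + data + b"\n"
        else:
            # Simple form for single-line text values.
            out += name + b"=" + str(value).encode() + b"\n"
    return bytes(out)


def log_frame(message: str, frame: bytes, path: str = JOURNAL_SOCKET) -> None:
    """Log a warning together with the raw bytes of a faulty frame."""
    entry = encode_entry({
        "MESSAGE": message,
        "PRIORITY": 4,        # syslog "warning" level
        "BUS_FRAME": frame,   # hypothetical custom field holding raw bytes
    })
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(entry, path)
```

A call such as `log_frame("CRC error on bus", raw_frame)` would then produce one journal entry carrying both the text and the bytes, with the timestamp and process metadata added by journald itself.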
So with this kind of tool, you can just store any faulty frame you find inside the journal and get it back with all the traces, again with timestamps that allow you to see all the messages your application produced before and after detecting the problem.

journald has a network protocol option. It's optional, you don't have to compile it in, but if you do, it's pretty awesome. The traditional syslog protocol is just: put the message in a UDP packet and send the packet. So you get integrity from the IP stack, but you have no guarantee that the message will arrive. You have no guarantee of the order and you have no timestamps to correct the order. So usually it's used in a data center where the network is reliable, but you could never use that sort of protocol over the Internet. journald comes with its own web client and web server, which allows you to easily fetch or push the logs over an HTTPS-based stack. So it's really easy to put in place. It has already been done for you. You can use certificates to guarantee that only authorized people can get at the logs, and it even integrates a very simple, very light web page which allows you to browse your logs directly on your machine. So if you need a one-off log visualization tool, something simple, you have zero development to do. You just use the integrated web server and you get the logs. And if you don't need it, you just don't compile it in.

And last, in the embedded world, we tend to have file rotation issues. We have some very complex logic about how many files we are allowed to keep, what size they must have, how old they can be. journald has a very, very complete file rotation configuration and will basically deal with anything you can throw at it. So just go for it. That's a solved problem for me. So journald, as far as I'm concerned, has solved all the log problems I have when doing embedded systems.
And in particular, all the things linked to having an API, which makes things way, way, way simpler.

Something I love about systemd is that you have complete D-Bus control. Everything systemd can do, you can command through D-Bus. That means that you can monitor your system through D-Bus, and systemd will send you a D-Bus signal when a unit changes state. So you can monitor units and have your nice little green and red lights on your UI simply by monitoring systemd and looking at unit states. And the monitoring application can itself be monitored by systemd, which is pretty awesome. You want to restart a service with a button on your UI? Well, the button just sends a D-Bus message to systemd, and systemd is very good at restarting services. So it's pretty trivial to do.

Any property is available through D-Bus, and those that can be changed dynamically, you can change through D-Bus. And that includes all control group settings. I didn't go into control groups earlier, but basically they allow you to control how much CPU and how much memory a given service has available. You can change that dynamically through systemd, and you can then drive all this through D-Bus.

So when you need to interact with the system in general, you do it through D-Bus. I mean, all major system daemons nowadays have a D-Bus API, and so does systemd. But systemd is particularly interesting because it controls the system. So you can reboot through a D-Bus API, restart any daemon through a D-Bus API, change how the daemons work through a D-Bus API. It's really easy. And again, it's something that people learn very fast. You have to understand systemd to control it through D-Bus, but once you do, it's extremely complete and it solves all sorts of real-life problems.

In the embedded world, we have all sorts of problems with file system and partition management.
To give you a few: usually we are asked to make system images that are as small as possible because of flashing time in the factory. And usually we want to partition and create our data partitions on first boot. We also usually have A/B type upgrade systems, which means that the whole data is copied twice over the bus, which takes a lot of time in the factory. So we want the system to create its second partition on first boot. There are all sorts of cases, and systemd provides some great tools to help us with that problem.

The first one is that systemd can detect which partition goes where and mount them correctly based on the GPT header. GPT is a replacement for the old BIOS partition tables. You can encode in there what each partition is for, and systemd will detect this automatically. The latest version of systemd has a new component called systemd-repart. It will look at a couple of configuration files, and if it finds that some partitions are missing, it can create them. And it can create them dynamically, with constraints like "just take all the space available". So it's pretty easy to have a partition created that will take whatever disk space we have. And that's pretty important, because in the embedded world you usually have new versions of the product in which the only difference is the size of the disk. So having a system that can grow into its disk is pretty awesome.

systemd can detect when a partition is not formatted and format it automatically. And it can also detect when a partition is empty and populate it automatically. So that's pretty awesome, again either for our A/B use case or, more commonly in embedded systems, when you have a data partition but you need to have some content in it. You just leave your disk empty, and with a properly configured systemd, systemd will create the data partition, format the data partition, and initialize the data partition for you.
So this really solves a problem, because fstab is complicated to handle, especially in Yocto where we hate having to rewrite files. fstab is one single file, whereas with systemd you can use drop-ins.

Portable services are also pretty awesome. I'll try to be a bit faster on those. A portable service basically allows you to have a binary, all its libraries and all its configuration files in one single file system image. So it allows you to have packages in one file without the risk that comes with a normal package system like Debian's, which, to put it simply, can break systems. You can have all these images and services in one single file each, and this way you can easily remove them if you need to do a factory reset. So portable services give us a cheap, poor man's packaging system without paying the price and running the danger of a full-blown packaging system.

Those were my favorite features in systemd. Now one quick slide on the features I do not use in embedded systems.

The first one I would like to mention is systemd-networkd. networkd is a network configuration tool, so it serves the same purpose as NetworkManager or ConnMan, except that networkd is really built around the use case of interfaces appearing and disappearing dynamically, which is typically what happens with containers that are started and stopped. We usually don't use containers that much in the embedded world. So networkd is really not practical for our complex use cases, and I usually use something like ConnMan.

I usually disable systemd-logind, systemd-homed and per-user systemd sessions, because those are all things that are linked to human users. We use dedicated UIDs for security purposes in the embedded world, to separate security domains, but we generally do not have human users who need a shell or things like that. So I disable those features. systemd-nspawn: again, there is very little use for containers in the embedded world, so nspawn is not really useful for us.
systemd-boot is a very good tool, but it's only for EFI systems, and in the embedded world we usually don't have EFI systems. We have U-Boot-based systems, so it's not that useful. The same goes for systemd in the initramfs, because our initramfs images, when we have them, are usually pretty trivial and are simpler to write in shell.

So this is why I think systemd is really great in the embedded world. It makes it way easier to write a daemon, because a daemon is just a normal application: no forking, no complex configuration of your system. systemd can do everything for you. systemd allows you to monitor and to secure your daemon easily, and systemd allows you to interact with the system easily, be it changing the network configuration, reading the logs, writing to the logs, or debugging. For all those things you have great tools, if you take the time to learn them.

So that would be my general advice with systemd: take the time to learn. It's really worth it to have one person in your team who really understands and masters systemd. They will write all the services you need, and it will really, really simplify your main application, and that's the big gain. Things are simple. Securing is simple. It's robust, and robustness is really the number one thing we want in embedded systems. Thank you very much.