So welcome everyone, to my talk on Xenomai — building hard real-time systems with Xenomai. For those of you who do not know me: my name is Jan Kiszka, I'm working at Siemens Technology, but today I'm standing here as one of the key maintainers of this project. My daily work is dealing with embedded Linux for various devices in our company, and as such I'm also maintaining a couple of open source projects, and Xenomai is among them. So today I would like to introduce you to Xenomai if you haven't heard about it, or give you a brief update on where we are standing and on what people are doing with it. The agenda: why Xenomai, some examples I have brought with me, a little look under the hood, how to set it up if you want to play around with it, a bit of a look into our community and the challenges that come with it, and a summary on all that. So, if we talk these days about real-time and Linux, what would you associate with that? With real-time you may have something in mind like: okay, there is some serious time-critical work that has to be done, and if I want to do it with Linux, then I will use PREEMPT_RT — the natural choice. You probably will not think immediately about using Xenomai instead, and you would probably not associate real-time Linux and Xenomai with the fact that you may already have been in contact with it — hopefully not at a doctor's or in a hospital. That comes into play. You may also wonder why that Emmy is on the slide; I will come to that later as well. So first of all, why is PREEMPT_RT omnipresent, and how does it work in a nutshell? I'm not going to explain how it works in all detail — that would be too complicated — but just to give you the idea: the benefit of PREEMPT_RT is clearly that it enhances the existing kernel ABI to provide certain real-time paths through the kernel to the hardware and back. That's the great advantage.
In theory, you don't have to touch your application, you don't have to touch your drivers, you don't have to touch, well, anything — at least this is what some people are dreaming of. In practice, in some cases it actually works like this, by making the kernel itself way more preemptible than it is by default. It's always this trade-off between performance and determinism, and with PREEMPT_RT we are clearly after the determinism: short reaction times, and more importantly deterministic reaction times, and for that PREEMPT_RT paves the way through the kernel. And the other key asset of the PREEMPT_RT approach: it's going to be merged into the mainline kernel next year. Maybe this time next year, really — we are very much looking forward to this. That obviously makes it quite present and quite easily accessible, which is not the case with the other solutions I will talk about. So that's the benefit of PREEMPT_RT. In contrast, there is another concept to achieve real-time with Linux, and that's the co-kernel approach, which Xenomai implements. Xenomai is not the first one to implement it — there have been solutions around a long while ago, like RTLinux, RTAI and others — but Xenomai is the one, I would say, that survived them all and is still standing and providing that solution. So what is the difference to PREEMPT_RT? PREEMPT_RT is this unified approach to the kernel, while with the co-kernel, as the name already suggests, there are two kernels. Well, actually they are not completely separate kernels, but at least there are different schedulers, and there are different paths into the kernel for the different purposes. The real-time paths are handled by a real-time scheduler, which is separate from the standard Linux scheduler. And the underlying layer — in this case I-pipe or Dovetail, I'll come to this later on — basically enables you to receive certain external events separately and before the rest of the Linux kernel.
So you get a faster path directly to the real-time scheduler. Also, the real-time scheduler can preempt Linux basically at any time — not quite true in the technical details, but it is way easier for the real-time scheduler to interrupt the Linux scheduler than the other way around. So you have a different level of preemptibility here. How much it actually is in the end depends, obviously, on the hardware. But concept-wise, preemption can happen at any point where there is no really critical synchronization between the two. This is not a hypervisor approach, just to be clear — we have a collaborative model here. The code bases of the two schedulers are still in the same code base, in the same security domain. And, the other important thing: the application process using this real-time scheduler is not running on a different operating system. The very same application that you spawn can have real-time tasks and non-real-time tasks, using these different schedulers, while being one process. So the programming model is not identical to a standard Linux application, but it is very close to it — also in the way you debug and interact with these things. Now, the question is: why should I want this? Even in the times of PREEMPT_RT and its omnipresence, there are three major reasons why people are still using co-kernels with Linux and are still taking the effort to maintain these differences. One comes from the scenario where you have an existing software stack coming from an RTOS, or from a homegrown operating system, which is not quite POSIX — POSIX being what Linux normally implements — but slightly different. Sometimes it's just how many priorities you have, or how exactly the scheduler behaves in certain scenarios.
And if this is not exactly mappable onto a POSIX system, but your application stack, consisting of a few million lines of code, actually relies on it, you may port it over — but it may also be hard to find the engineers who can still do that. That's where the co-kernel comes into play: you can modify its scheduler for specific purposes without harming the standard use cases we like Linux for, and without the negative impact that could have on throughput and other aspects of the Linux scheduler. Then there is tuning: there are different ways to tune these systems, and you can have different tunings for the different schedulers in the same system, while still keeping the same performance-optimized tuning for your Linux part. With Linux alone you have to decide: do I want to be PREEMPT_NONE or do I want to be PREEMPT_RT? I can't say I want PREEMPT_NONE on this core and PREEMPT_RT on the other — at least not today. You can also build additional APIs if needed, and simply emulate existing RTOSes. This is what people did in the past and are still doing — we are still doing it in some cases — just to make the porting of an application stack more manageable. The second reason to use this is the separate architecture. From the kernel perspective it means quite some cost, but for user space it also brings some advantages. With PREEMPT_RT, if you are running an application with libraries in between, system calls, you name it, you are always somewhat unsure: am I really on this sweet-spot path where PREEMPT_RT can actually guarantee me certain reaction times? And that's not easy to find out — it depends on the kernel version, on your external dependencies, on the libraries you pull in, some of which may actually do a malloc or something else in between. You never really know for sure.
With the co-kernel approach and its separation, you architecturally get clearer paths between these two execution environments, and that can give the application a much earlier signal: you are leaving the known-good path. That's what Xenomai can provide to the application. Once my critical application has been initialized — with all the malloc and all the dynamics that come with initialization — I declare: I'm now in operational mode. And if I leave that mode, for example by calling a Linux system call, or by touching memory which is not yet ready for access, then I get a signal. I can debug my misbehaving real-time application early, so to say. If you have complex application stacks to maintain, with developers who may have to change code paths while not being fully aware of real-time properties, that can be beneficial. And last but not least, some people tell us that there are actually cases where this co-kernel approach still delivers better latencies. Whether that really matters depends on the use case. If you are looking for ten times better latencies — sorry, we can't. But if your latency requirement sits right at the edge of what PREEMPT_RT can currently deliver, you may want some safety margin, or the co-kernel can actually lift you over that edge, into an area where you can achieve things which are not yet achievable with PREEMPT_RT. That specifically applies on lower-end platforms, and it specifically applies when you are condensed — when you have to run the real-time workload and the non-real-time workload on the same core. This is where we see a more significant latency difference with PREEMPT_RT, even on more performant platforms, compared to the co-kernel approach.
The co-kernel approach can also help in some scenarios when you are scaling up — although there is an issue with the legacy architecture of Xenomai regarding scaling as well — but we also have running systems where we confined the real-time part to a small number of cores while the other dozens of cores run non-real-time performance workloads, and that happily lives side by side. The other cores don't have to pay the overhead that high preemptibility normally brings with it. As I said, I'm working for a company that builds some stuff in this area, so I'd like to present two use cases that we have for Xenomai — also a reason why I'm still doing this. At Siemens we have, for example, a portfolio of motion controllers for milling machines and other machine tools, and those have quite demanding real-time requirements, because the faster the system, the higher the accuracy and the better the output. So they are really after these lower latencies. They also have a software stack which is predating — not quite predating Linux, but close — at least predating PREEMPT_RT and predating Xenomai, and they cannot easily change that from one day to the other. So we are in a constant process of adjusting the APIs and reducing them towards more and more POSIX, but specifically for the initial step of getting the software stack over onto the Linux platform, it was and still is very helpful to have flexibility in the APIs provided here. And last but not least, the colleagues are also very much after long-term support, as these machines easily live a decade or longer. Another use case, and this brings me back to my earlier slide, is from our Siemens Healthineers colleagues. For quite a while — I think by now it's about 15 years — they have been running the data processing and control unit of these magnetic resonance imaging systems on Linux and on Xenomai.
Not the whole thing is done with Xenomai, but the critical parts are. You can imagine: the control of these measurements, when certain coils have to be stimulated in certain ways and the measurement data has to be collected — that is time-sensitive. It's not that the patient is immediately in danger, but on the other hand, patients are usually not inside these machines for fun, so the measurement really has to work reliably. So this is a critical operation. The software stack is quite complex, as you can imagine, and it is a mixture of certain control elements — real-time elements — and a lot of data processing, number crunching I would call it, to get this huge amount of measurement data out and, in the end, to a terminal that visualizes the results. So we have this co-location of a complex software stack, real-time and non-real-time, and the colleagues really enjoy that there is a clear separation between both, and that not all developers have to be deep kernel experts in order to write applications that still behave properly on the real-time path — they get that information early. And we also benefit from the fact that the performance impact of the co-kernel approach on the non-real-time path is — at least the last time we measured — still quite low. So those are our use cases, but there is way more out there. And that brings me to the Emmy, actually. Would you have imagined that you watched a movie, possibly recently, which was shot with the help of Xenomai? There is an interesting company which is doing motion control as well, like we are, but for different purposes: motion control to enable actors to interact with models of characters in a movie, to act on platforms which are moving, or to interact with virtual models which will be rendered later for the final movie. Concept Overdrive is providing this kind of solution.
It's a mixture of standard components and, obviously, specialized customized solutions for the individual movies. They have been doing this for quite a while: originally on embedded controllers, later on other, older real-time systems, and in the past years they have migrated over to Xenomai 3. There is a very interesting talk, which I can only recommend, by Steve Rosenblatt, the CEO of this company, at last year's Xenomai meetup. It is still available online — watch it; it's a mixture of interesting control technology and the connection to the movies. Be warned, the pictures I'm showing here were probably not yet taken with Xenomai, but he shared a list beforehand of recent movies that were shot with the new architecture. And actually, that's maybe something to show — one example. Whoop, where is it? No, not that one — ah, here it is. He just shared it three minutes before the talk, so thanks a lot, Steve. That's the Dungeons & Dragons film, recently shot with the new architecture. So maybe you've seen in the movies that real-time is not only about industrial control or healthcare systems; it can also be like this. I really like this case a lot, because it's the same kind of application we build ourselves, in a completely different domain — way more interesting than our domain. Another use case: audio. There is a Swedish company called Elk Audio, providing solutions and building blocks for low-latency audio. I think they also have a live jamming offering. One of their building blocks is an embedded Linux distribution called Elk Audio OS, which is tuned for low-latency audio. They support a number of common embedded boards for building your own audio solutions. And they have been using Xenomai for quite a while — formerly Xenomai 3, but they have recently migrated to the latest version.
And that's a quote from them: to achieve extremely low hard real-time performance. They actually support both PREEMPT_RT and Xenomai, but they say: in most cases you might be fine with PREEMPT_RT, but if it's really about this lowest-latency real-time performance that gives you the best audio experience, then you have to go the Xenomai path. It's a very interesting architecture and solution, and if you go to their website, they have the code public on GitHub, so you can look it up and see how they actually integrate with Xenomai, how the application looks, and how they address these different operation modes. So if you are into music — I'm not, at least not at this level — you will probably find some interesting pointers on how they achieve that. Okay, now let's have a look under the hood of this thing. How is the Xenomai stack built up? The foundation of Xenomai is, first of all, a set of kernel patches, and this is common to the different Xenomai versions we currently have in the field. It's called the Dovetail patch; it enables this split-up of incoming interrupts and other events into the two domains, Linux or real-time, and we maintain it, so to say, as the must-have patch on top of the kernel. We support a number of LTS kernels with it, steadily moving forward — currently there are four LTS kernels in the field; we will probably have to adjust that in the future, depending on how many hands we have on this topic. Architecture-wise we are on ARM, 32-bit and 64-bit, and on x86, obviously. There is also quite some interest in getting it onto RISC-V, but still some way to go, although at least some people are working on this. As I said, this is common to the major versions we currently have in the field, including how it is being provided.
So if you are dealing not just with the plain vanilla LTS kernel, but have to struggle with a vendor kernel, we provide both a rebase patch queue, which hopefully allows you to more easily get that queue onto a vendor kernel, as well as a continuously merging branch, so that you can also see what is changing along the LTS updates. Just for reference: in the past, for older kernel versions, we had the I-pipe patch, which still shows up — and as the adoption of these technologies is sometimes a bit slower in this industry, you will probably still find a lot of references to I-pipe in the field. That is supported basically up to the 5.4 kernel, and only by Xenomai up to release 3.2. And we also have some older kernels here still in maintenance, based on the Civil Infrastructure Platform long-term kernels. I already mentioned there are two major versions — so what are the major differences between them? The common part we already discussed: the baseline kernel patch. The Xenomai 3 series, which is probably what most users currently are on, consists, first of all, of a further kernel change — an out-of-tree kernel code base, think of it like a driver: the Cobalt core, the scheduler — plus its libraries, which are user space. This core natively provides a POSIX API as its ABI to user space. On top you then have either the POSIX library interface directly — a second libc, so to say, for the scheduling-relevant functions — or further libraries on top, in order to emulate some RTOS or your own flavor of API. In this model, the real-time drivers that we have in the core are, well, kind of forks or rewrites against a special API in the kernel, the Real-Time Driver Model (RTDM).
There are a couple of such drivers, but don't expect as many as you have with the Linux kernel, obviously — also because a lot of hardware is not really real-time capable by nature, so it doesn't make sense to write drivers for it. So this subset of drivers is limited. What users often do is bring in their own driver source code, implemented basically custom for their specific hardware design. The Xenomai 3 series, as I said, is currently in various products, and that's also the reason why we currently have three stable series under maintenance, from 3.0 to 3.2. That may be reduced in the future, based on feedback and on our available capacity to maintain them, but currently it is reasonably easy for me — I'm the maintainer of these branches — to handle the backports of the patches and, hopefully, keep everyone happy with it. So that's the Xenomai 3 path. Then there is "even less", EVL as it's called — Xenomai 4. That is driven by the former Xenomai maintainer, Philippe Gerum. He basically handed all the legacy stuff over to me, and he was happy to drive this new activity, with the goal of really reducing this co-kernel approach to its core again. That comes at a price, because certain features are cut off; at the same time it comes with the benefit that less ideally means faster and more maintainable. The EVL core — the counterpart, so to say, to the Cobalt core — is maintained differently, in the sense that it comes as a pre-integrated kernel patch: you get the Dovetail patches, and on top another branch which contains the EVL scheduler patches, in the kernel tree. So you don't have to deal with the integration yourself — we will see later that you do on the Xenomai 3 side. It has its own ABI; it is separate, it is not POSIX, so there is also currently no foreseen path to emulate legacy RTOSes with this approach.
Also, the drivers that are available — for example for the audio scenarios — are actually in-kernel drivers, enhanced by a separate out-of-band code path to enable real-time operation. So you interact with the driver basically like you interact with a driver in Linux, except that there is a special mode you can switch on, so that the event path and the data path go in via this co-kernel. That works quite well for some drivers; for others, well, you still end up having to rewrite certain things, simply because the original kernel driver was not designed for real-time purposes. The maintenance goal of this approach is currently two LTS kernels in flight, plus there is always a head of tree where Philippe maintains the latest version against the latest kernel. So that's the Xenomai 4 path. Now a brief look at how to try it out. For Xenomai 3, you have to fetch basically two essential artifacts. The first is Xenomai itself: the project consisting of the libraries and the Cobalt scheduler, as a release or as a git branch. Then you have to pick up the Dovetail patches — and with them, the kernel already. In the past we had the kernel patches distributed separately; we no longer do this, simply because there was not much interest in that path, at least from most users we heard about. So there is just the kernel tree available, and you check out the respective branch for the version you want. Then you have to marry these two elements — the kernel code coming from the Xenomai code base with the Dovetail kernel — and that is done with scripts that basically run this kind of unification, an in-tree patching or integration. The advantage is that we can maintain the scheduler mostly independently of the kernel version; we don't have to rebase it continuously over the kernel tree. The price is that the integration is obviously a little bit more tricky.
Then you can compile the kernel. The next step is going to the Xenomai code base and compiling the user space part there — the libraries and the tools. That's the from-scratch approach in a nutshell, whether you do it on a distribution or for your embedded system in its build system. For Xenomai 4, things look a bit different, because, as I said, all the kernel pieces come from one tree. So you just clone a different kernel tree, the linux-evl tree, and you have all the needed kernel changes in one place — configure and compile it, done. Then you fetch the library part, which also contains a few benchmark tools — just like Xenomai 3 also contains benchmark and testing tools — and then you have the userland part. Those together make your real-time system. Or, if you do not like to write C anymore, there are also Rust bindings available by now. They are not yet complete, but that is supposed to be done early next year. If all of this is still too complicated, we also have a third option. It currently only targets Xenomai 3, but that is going to be fixed in the coming days. It builds you a ready-to-use minimal Debian system with Xenomai pre-installed and pre-configured, with all the tools and libraries you need. Basically all you need is Docker running locally — a privileged Docker in this case — and you can just clone the repository and run the build instructions. You can select what you want to build: which kernel configuration, which Xenomai configuration, and for which target. All architectures are supported. As I said, currently only Xenomai 3; EVL is on the way, so it will also be covered soon.
And we have a couple of targets: QEMU images, obviously, if you just want to try out how it feels — but don't look at latency in QEMU, obviously — as well as generic x86, the BeagleBone Black on the Arm side, and the HiKey on the Arm64 side, and more will be added in the future; it's not very hard to add further boards. It's also nice to have because you then have this small Debian system running, so you can play around and modify the system on the fly. And that's also a baseline: if you want to build a product or a real integration out of this, you can use this layer as a baseline and do your own customization on top — the tool is prepared to enable that stacking. It's a little bit similar to how Yocto works in this regard, except that it pulls from the Debian distribution. So, I've talked a lot about technology, but there is more to an open source project, and that's the community. The Xenomai project has been around for more than 20 years now, which is quite a long time. What has also been around since then are most of the challenges we are facing. First of all, we have a user community — we know that, sometimes we even know them personally — but that's where it stops, because they cannot speak up. Even in 2023 it is still problematic for some companies to talk about what they are actually doing; that is also why I appreciate these open contributions of users talking about their use cases so much. As a consequence, we get very little feedback on how Xenomai is actually being used: what is relevant, what is no longer relevant, which versions are mostly used and which are not. Up to the point that I actually tried to remove a subsystem, and only when the patch was about to be merged to mainline did the user speak up and say: but I'm using it. Okay then.
So yes, this is really a challenge for this community, given that the domain we are targeting — despite all the use cases I've shown — is comparably small. The other aspect: as you can imagine, digging into the kernel and hooking all of that up is not a beginner's task. It can be quite hard to diagnose and adjust certain things in the kernel, so the number of people who are really familiar with all the internals is small, while we need them to maintain the system over time. On the other side — again, we have been around for 20 years — the problem is solvable and has been solved so far, with no major outage on that front. Another interesting aspect, which I only learned very recently, is that we apparently have an even larger community in China by now, while the project originally comes from Europe. So how do you deal with all that? What we did in the past years is try to improve the transparency of our workflow, to make it more open for others to join the project and contribute. That's the thing — if you don't get feedback, why should you work openly? It's always this discussion: if I invest in being more transparent and more explicit, and no one gives feedback, why should I do that? On the other hand, if you don't do it, you will never get feedback. So this is one of the investments we made in the past years. We also try to get more developers on board and really bound to the project. For me, on the company side, it was easier, because I could kindly ask my colleagues: could you join the project? I have some tasks for you, would you like to help here? And that's what happened — we have more engineers involved on our side. We would like to see that from other companies as well.
And, to be mentioned in this context: Intel joined the project three years ago, basically also bringing in the information that, by the way, China is using this a lot, and they also granted some engineers to the topic, which is highly appreciated. What we also do these days — again an initiative from Intel — is a bi-weekly community call you can just dial into. It's not friendly to the US time zones, but it is friendly to the Asian folks, because it's early in the morning in Germany and later in the evening in Japan and China. Anyway, it's open for everyone; you can join, follow along, and ask questions this way, if not on the mailing list or other channels. We had a virtual meetup last year, which was initially planned as a physical meetup, but for various reasons that was still not possible. The interesting outcome: the usual amount of participation from Europe, from the Western world — and over 90% from China. The numbers were quite impressive. That was one of the reasons why we held a workshop just last week, actually, in Bushi. Not hundreds of people, but we were about 50, from different corners and different companies. The interest shows quite strongly: there were companies presenting applications and use cases, there were universities showing research on the topic, and it's quite impressive and quite interesting to see how the community is evolving there — not directly visible to us, so we try to engage with them. So, to summarize: the co-kernel is here to stay. As I said, we have been around for 20 years, and although PREEMPT_RT is most likely the default choice for most of us — also for us at Siemens, we have more PREEMPT_RT use cases than Xenomai use cases — the Xenomai use cases are apparently not going to go away. They are valid use cases, as I pointed out, so it is good to have this choice.
There are different versions of Xenomai now, as I said; the project keeps evolving. It's not evolving like cloud technology — which is probably a good thing; the pace is slow but steady — and we are committed to really supporting these solutions for a very long time, because that is what our users expect from us. So we are happy if more active users stand up and provide feedback. If you are already using it, or if you will be, please do that — we need it, otherwise we may make choices on the project side which are not in your favor, and then, well, you asked for it. With that, thank you for your attention, and I'm open for questions.

One of the things I think is missing, for both Xenomai and PREEMPT_RT — maybe I'm just not aware of it — is published benchmark numbers on your real-time latencies. Both projects have a marketing problem.

Yeah, that's the point. I've seen a lot of benchmarks in the past 20 years, as long as I've been doing real-time Linux. Some of them should not have been published in that form; others were helpful; others were more harmful, because people were deciding which technology to choose based on a 5% difference. When I'm in internal discussions about which real-time Linux to recommend, I say: don't start with the benchmark numbers, unless you can really explain to me that you are that close to the limits. The problem with benchmarking, as you know, is that it depends on where you run it, how you run it, how you tune it, and whether you stop once you have numbers or start to understand the numbers, dig deeper, and make sure that you really benchmark both sides correctly. Just to give an example: I asked internally to please redo the benchmarks between Xenomai 3 and Xenomai 4 — the current numbers are not presentable, simply because they are not really in favor of Xenomai 4, but that is our fault; I have seen numbers from Xenomai 4 which were better.
It depends, again, on what mistakes you make along the line — and although Xenomai is sometimes easier to configure, you can still make certain mistakes, or the hardware can, that ruin the whole benchmarking effort. I always recommend: if you have a use case, benchmark your representative use case, not just the latency test everyone runs — a representative I/O benchmark on your platform — and then conclude whether, benchmark-wise, it makes a difference for you. And there are the other reasons I gave for using a co-kernel approach: if any of them applies, you may consider it; if none applies, use PREEMPT_RT — this is the standard.

If I understand correctly, there are two schedulers — so does the application need to be aware of that, to pass work down to the real-time scheduler rather than the normal one?

Yes, to a certain degree; that depends a bit on the version, and it's interesting from our perspective as well — you still have to look into that in each case. With Xenomai 3 we have a mode where, with a linker trick, you link against the POSIX wrappers, and every thread that you create with real-time priority is automatically handled by the real-time scheduler — unless you explicitly put something in the code saying: I want this to stay on Linux, even if it's real-time. So this is almost transparent. With EVL you have a clear API separation, and if you look, for example, into that Elk Audio code base, you'll find different code paths there for creating a real-time thread against the EVL API. So those are the pros and cons between both: it is more explicit to have a separate API, but if you are porting over a large code base, or want to keep a large code base compatible with both, it's a disadvantage. My colleague actually asked me: if I now want to port over to Xenomai 4, do I have to rewrite my application again? You just promoted POSIX, and now there is no POSIX.
There might be POSIX for EVL, for Xenomai 4, in the future. That's something we are currently considering; it's not impossible, but it's not the current default path, and nothing that is available today. Further questions? Okay. Otherwise, I'm still around — you can talk to me about this or other topics as you like. Thanks for your attention. Hope to see you again.