Hello, my name is John Ogness and I'm here to give a talk about everything you need to know about writing a real-time application under Linux. There are a lot of small details you really need to pay attention to when you're writing such an application on a general-purpose operating system like Linux, and I hope we can go through the critical points here so that at the end you'll have a checklist you can use as a reminder of everything you need to do.

So let's begin by talking about what real-time actually is, because that's an important thing to understand. What is a real-time application? Of course, when we're writing applications we want them to be bug-free; that's important for any application. But for a real-time application, correctness means not only running without bugs, but running at the correct time. When a certain task needs to wake up, it's really important that the task wakes up when it wanted to wake up. If it wakes up later than that, that delay is a latency we can measure, and if the latency is larger than what we have in our requirements, then we've failed to meet our requirements and the real-time system has failed.

Now, to really consider your system a real-time system, it's not enough that you would like your application to meet certain timing requirements, or that meeting them is important to you. It has to actually be a requirement: when the system fails to meet the timing requirements, you have an error in your system. Only then can you call your system a real-time system. So it's really important that you define which tasks on your system, which applications or which threads, are actually time-critical. This is a quite important step, because a lot of people make the mistake of saying: okay, we have a real-time system, so let's just make everything on the system real-time. That is generally wrong, and you're going to have a lot of problems, because real-time applications have totally different requirements than applications that are not real-time. So take a moment to really analyze your system and ask: what actually has the timing requirements that we have to meet?

Now, for an operating system to support real-time, there are really three main requirements that have to be satisfied. First, you have to have some sort of deterministic runtime or scheduling behavior, otherwise you have no way of controlling the system at all. If I call a function and it could run for days or it could run for microseconds, that does not help me; I have no chance of building a deterministic, low-latency real-time system. Second, the system has to be interruptible, because the CPU is always doing something. It's really important that the system is able to interrupt whatever is currently running and do something else: when my important real-time task needs to wake up, we need to kick somebody off the CPU for it. And third, we need a way to avoid priority inversion.

Priority inversion is a situation where a high-priority task is waiting on a low-priority task. In the picture we see here, we have task 3, which is low priority, and in this scenario task 3 is holding a lock. Now task 1 comes along with a higher priority, and task 1 wants that lock. That situation is okay.
You're allowed to grab locks as a low-priority task, and it can happen that a high-priority task then wants that lock. In that case the scheduler does something intelligent: it puts task 3 back on the CPU, because task 1, the high-priority task, wants that lock, and the only one who can free it is task 3. So task 3 is put back on the CPU.

But now let's assume task 2 comes along. Task 2 has nothing to do with task 1, nothing to do with task 3, nothing to do with this lock and the contention we have there. But task 2 has a higher priority than task 3. So when task 2 comes along, it's actually a correct decision for the scheduler to assign the CPU to task 2, which was previously running task 3. Now we have the problem that a totally unrelated task is holding up task 3, and indirectly task 1. This is the priority inversion situation. The priority inversion is not between task 1 and task 3; it's between task 1 and task 2. We have a high-priority task, task 1, waiting indirectly on a lower-priority task, task 2, and task 2 might run forever. This is a quite serious problem, and it can happen very easily in a complex system like Linux if you're not very careful.

One way Linux can handle this: in the moment when task 1 wants the lock that task 3 holds, Linux will do something called priority boosting. It's actually called priority inheritance, but we often talk about priority boosting. The priority of task 3 gets boosted to the level of task 1, and now, while task 3 is running in order to free that lock, there is no chance that task 2 can get involved, because task 2 has a lower priority than task 1, and task 3 is currently running at task 1's priority. The moment task 3 gives up the lock, in that exact moment, task 3 is de-boosted: it goes back to its old priority, and now task 1 can grab the lock and keep running. This is how priority inversion is avoided.

Now, in general, when you're writing real-time applications under Linux — and this is actually one of the nice features of doing so — everything is done with the POSIX API. There is a real-time extension defined in the POSIX standard, and that is what Linux uses to implement its real-time functionality. So if you're used to working with file descriptors, open, close, read, write, creating sockets, all these POSIX things, then you'll also be very comfortable writing real-time code for Linux. In fact, sched.h, time.h, and pthread.h are the only headers you really need to have full access to all of the real-time functions in Linux. That's something that's really nice about writing real-time applications under Linux.

Now, the scheduling policies in Linux differ between real-time and non-real-time. There are currently three different scheduling policies for non-real-time tasks, and the important thing to understand about the difference between the non-real-time policies and the real-time policies is that the non-real-time policies have a fixed time slice. It doesn't matter if a task is in a busy-waiting loop, an infinite loop just going and going and going:
if it runs with a non-real-time policy, it will eventually get scheduled out. And that's actually what's nice: we don't have to worry about applications going out of control or taking over the system, because they have their time slices. In fact, on your real-time system I expect that most of the tasks will be running with a non-real-time policy. Logging daemons, web servers, other kinds of middleware, databases, things that are accessing the file system: these probably should not be running as real-time tasks, and they are all very well suited for non-real-time.

Now, some people might ask: why am I even talking about non-real-time? This is a real-time talk. The reason is that even for these non-real-time tasks, you have a lot of opportunities to configure how they get their piece of the CPU. There are nice values, and there are control groups, where you can limit how much CPU different tasks are allowed to have. This has nothing to do with real-time, but it gives you the opportunity to say, for example, my web server is more important than my logging daemon, or vice versa. Or to say: if I have a web browser running on my real-time system, I want to make sure the browser never takes more than 20% of the CPU, and things like this. These are all ways I can package my non-real-time applications into cages. And I strongly encourage you to take the time to not only evaluate the real-time tasks, but also to ask, for this whole group of non-real-time tasks, how can I optimally make sure that they share the CPU in a way that makes sense for my system?
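To make the nice-value idea concrete, here is a minimal sketch — my own illustration, not from the talk's slides — of a process lowering its own scheduling weight; a hard cap like the 20% browser example would instead be done with control groups (for example the cpu.max file in cgroup v2):

```c
/* A minimal sketch (not from the talk) of adjusting a non-real-time
 * task's nice value from code. setpriority() is the POSIX call behind
 * the "nice" mechanism; 19 is the nicest (lowest weight), -20 the
 * most aggressive. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* Make the calling process "nicer", so more important
     * non-real-time work (say, a web server) wins the CPU first. */
    if (setpriority(PRIO_PROCESS, 0 /* self */, 10) != 0) {
        perror("setpriority");
        return 1;
    }
    printf("now running with nice value %d\n",
           getpriority(PRIO_PROCESS, 0));
    return 0;
}
```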
Now, the real-time tasks are tasks that run as long as they want to run. If I have a real-time task at priority 30, the only way its time on the CPU ends is if something with a priority higher than 30 becomes runnable. So with a real-time task you have to be really careful with your code: you're not allowed to have bugs like possible infinite loops, because they could render the system dead, or apparently dead, just because one high-priority task has run away with the CPU.

SCHED_FIFO is typically what people use. It means I get the CPU as long as I want, and at some point I go to sleep because I want to: I'm waiting for an event, or I'm blocking on some sort of I/O or some resource, and at that point I go into the D state or the S state and freely give up the CPU. But as long as I have something to do, as long as I'm in the runnable state, I can run as long as I want, unless someone with a higher priority comes along. With SCHED_FIFO you have priorities from 1 to 99, where 1 is the lowest and 99 is the highest — although please don't ever use 99 for your applications. There are some very important kernel threads that run at priority 99, and they are more important than your application. In my opinion you should never go higher than 98 for your highest priority.

There's another policy called round robin, SCHED_RR, and it's exactly like SCHED_FIFO except for one difference: if there are two tasks with the same priority, there will be time slices between those two tasks. So if I'm a task running with SCHED_RR, I have set a time slice for myself for the case where another task with the same priority is waiting to get on the CPU. That's the only time it matters; otherwise it's exactly SCHED_FIFO.

SCHED_DEADLINE I'm not going to talk about at all. It doesn't fit into this picture; it's another type of priority management, where the scheduler decides the priorities based on timing constraints. There are some great talks from Steven Rostedt on YouTube if you're interested in SCHED_DEADLINE. The only thing I want to mention here is that SCHED_DEADLINE wins over even the highest-priority SCHED_FIFO or SCHED_RR task. If you decide to do something with SCHED_DEADLINE, it will always win, and SCHED_FIFO and SCHED_RR take a back seat to it. Mixing them is a little bit strange, almost a little bit of guaranteed priority inversion, but you might be interested in using only SCHED_DEADLINE. You won't learn about it in this talk, but you should know it exists.

Now, one thing I mention here at the bottom, which is really important, is that by default the Linux kernel limits the amount of CPU time that all real-time tasks together are allowed to use. If the combined CPU time of all the real-time tasks adds up to more than 95% of a second — 950 milliseconds within a second — then for the last 50 milliseconds of that second, no real-time task is allowed to run. This is the worst kind of priority inversion you could have, and you need to make sure it doesn't happen on your system. The file /proc/sys/kernel/sched_rt_runtime_us is where you can configure that maximum time, and writing -1 to it disables the feature. In a real-time system this really is a dangerous feature: if you do hit the limit, you'll see a message show up in the kernel log (dmesg) telling you that runtime throttling kicked in, and at that point your system is broken, because it went 50 milliseconds without running any real-time task at all. So make sure to disable it. This is not a compile-time option and it's not persistent; you're going to have to write a boot script that echoes -1 into that file every single time you boot: echo -1 > /proc/sys/kernel/sched_rt_runtime_us. Don't forget this. Really important.

This slide is just showing how you can set the priority of tasks. Every single thread — a task is a thread or a process, and I'm going to say the word "task", but it means threads or processes — every task can have its own real-time priority, or SCHED_OTHER with nice values, things like this. With the chrt tool I can set the real-time priority and policy for a task. There's a -p option where I can specify the thread ID of a running task, so I can modify the real-time priority and policy of something that's already running. Or I can start an application with a given policy and priority: chrt -f 10 followed by the application name starts it with SCHED_FIFO at priority 10. You also see here that sched.h provides sched_setscheduler(), so instead of relying on boot scripts or external scripts to set the priorities correctly, your applications can programmatically set their own priorities when they start, choosing whatever you feel fits.
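Roughly, that programmatic version looks like this — a minimal sketch assuming you want SCHED_FIFO at priority 30 (the function name and priority are my own choices for illustration):

```c
/* A minimal sketch of a task setting its own policy and priority
 * at startup, as described above. */
#include <sched.h>
#include <stdio.h>

int make_me_realtime(int prio)
{
    struct sched_param param = { .sched_priority = prio };

    /* pid 0 == the calling thread/process */
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        perror("sched_setscheduler"); /* usually EPERM: needs CAP_SYS_NICE */
        return -1;
    }
    return 0;
}

int main(void)
{
    if (make_me_realtime(30) != 0)
        return 1;
    /* ... real-time work ... */
    return 0;
}
```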
Scheduling is one topic, but CPU affinity is also very important, because when I'm running my real-time applications it may be critical that I have them isolated on certain CPUs. Let's say I have a system with eight CPUs, which is pretty common these days. Then maybe I dedicate six CPUs to the general system — the web server, the logging daemon, maybe systemd — and the top two CPUs I use just for the real-time application.

CPU affinity is basically a bit mask of the CPUs that a task is allowed to run on, and this is also per task, so per thread or per process. You can say that this one task is only allowed to run on this one CPU, or on these three CPUs, or however you want.

You can set the CPU affinity for tasks, but you can also set CPU affinities for hardware interrupts, which we'll look at in a second. When a hardware interrupt arrives on an interrupt line, one of the CPUs is going to have to service it, and by default any of them is allowed to. Whichever CPU services that hardware interrupt will be interrupting whatever task was on that CPU at the time. So if I have a real-time application on a certain CPU and I want to prevent it from being interrupted by hardware interrupts, I can set CPU affinities for the hardware interrupt handlers too. We'll take a look at that in a second.

And the third point is that you can also boot the kernel and tell it not to put any of its own kernel threads on certain CPUs. That way, when your system boots, you already have some CPUs that are free to be assigned to your real-time application, and the kernel will not use them for itself.

In this last little paragraph there's something really important to understand: in a lot of systems, multiple CPUs share the same caches. Make sure you're aware of how your system is architected, so that you know, for example, that two of the CPUs share a certain level-2 cache. Those two are probably the ones that should host the real-time application, because otherwise your non-real-time and real-time tasks may be on different CPUs, but if they share caches, the non-real-time task can still have an adverse effect on the real-time application. So be aware of caching; it's a really important topic.

In this example, we're seeing how we can control the CPU affinity of applications with the taskset tool. I can either start a program with a certain CPU affinity mask, or I can modify the affinity mask of any running task. If I say taskset -p, the help text says PID, but you can actually provide a TID, a task ID, because you can do it per thread. So I can set the mask for an individual thread. And if I don't provide a mask, just taskset -p with the ID, it tells me what the mask currently is for that thread.
So I can look at the current mask and also set new ones. By the way, that also works for chrt, the tool we just looked at for changing priorities: it can also show you the current priority, although you can see that in ps anyway.

Just like with setting the priorities in code, we can also set the CPU affinity in code, if you want applications to set their own affinity. The reason you need the _GNU_SOURCE define here is that sched_setaffinity() is actually a function that is not part of POSIX. It's implemented in glibc, and with the _GNU_SOURCE define we're basically saying: we know we're using glibc, so this function is available. You actually need that define for it to be visible in the header file.

I talked about the boot parameters, so you can have the kernel boot up and only use certain CPUs. You have two different options for this. One is called maxcpus, and it limits the number of CPUs the kernel can see at all. For example, if I have an eight-core system and I set maxcpus=4, the kernel will only use four CPUs, and the other four are left completely alone. Some people use this, for example, to run certain bare-metal applications on the remaining cores, for really hardcore real-time where they need busy-waiting, polling applications similar to what runs directly on bare metal on a microcontroller, and those can then communicate with Linux through shared memory or something like this. So with maxcpus we can restrict Linux to a certain number of cores and run our bare-metal applications on the other cores.

isolcpus works differently. isolcpus specifies certain CPUs where Linux will not put any of its own kernel threads when it starts them, and when it starts applications, it starts them with a default CPU affinity mask such that nothing lands on those CPUs. So basically we've isolated some CPUs; however, Linux is still aware of them and is allowed to use them. When the system boots there will be some CPUs that are free, and I can then choose to start my real-time applications on those CPUs.

The next two sections of the slide show how you can set the default SMP affinity (/proc/irq/default_smp_affinity), i.e. when a new interrupt handler is registered, what the default CPU affinity for that hardware interrupt handler should be. And for interrupts that are already registered, you can go into /proc/irq/, where effective_affinity shows you the CPU affinity mask that the hardware interrupt handler actually has, and you can write mask values into the smp_affinity virtual file to change it. Just make sure you look at effective_affinity after you write into smp_affinity, because some hardware is not capable of arbitrarily assigning CPUs to hardware interrupt handlers. You can put whatever value you want into smp_affinity; then check effective_affinity to see what was actually set.
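For reference, the in-code affinity setting looks roughly like this — a minimal sketch assuming you want to pin the calling thread to CPU 7 (the CPU number is just for illustration):

```c
/* A minimal sketch of pinning the calling thread to one CPU.
 * _GNU_SOURCE must come before any #include. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(7, &set);           /* allow CPU 7 only */

    /* pid 0 == the calling thread */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    /* from here on, this thread only runs on CPU 7 */
    return 0;
}
```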
Memory management is probably the most important topic when you're talking about real-time development and possible latencies, and it's because of how the kernel memory management functions, as I'll explain right now.

Basically, any time you allocate memory, you don't actually get physical memory assigned to you. When you do a malloc, you receive a pointer back, and in your MMU table it's marked that this amount of memory has been assigned in your virtual address space, but there are no physical memory mappings behind it; no pages have been assigned yet. That only happens when you actually access the memory. The first time I try to write to it, that generates an MMU exception, a page fault. The kernel has a handler for this page fault, and in that moment it goes looking for a free page — generally a four-kilobyte block of physical RAM — does the assignment, and then the application can continue. So my malloc itself is very, very fast; if you time it, you can see that malloc is really fast. But as soon as you start touching that data, all of a sudden it's really expensive — the first time. Once the page fault has happened and the memory is there, we're okay.

And this is not just for the heap with mallocs; this is for all components of the process's virtual address space: the text segment, the initialized data segment, the stack, and the heap. Take the stack, for example: when the stack grows into the next page, just moving into that next page initially causes a page fault. So page faults are quite expensive, which means they're something you want to avoid, and I'll show you how to avoid them.

There are basically three steps you need to take. The first is to tune glibc's malloc to make sure it's actually using the heap, because there are two ways glibc can allocate memory. It can use the heap, really calling the brk system call to grow the heap, as most people learn in university. The other method glibc can use is mmap, where it requests memory directly from the kernel, and the kernel provides a block of memory that's not within the heap; it's something totally separate. We don't want this memory-mapping variant of allocation, because when I get memory from the kernel this way inside malloc and then I do a free, it goes back to the kernel. That means the next time I do a malloc, I get fresh, unpaged memory again, and I need to page all of it in — the first access is very expensive — and when I do a free, it's gone again. So with the mmap path you're guaranteed to keep getting lots and lots of page faults, and we don't want that. What we want is to use the heap for our mallocs, have it paged in — we're going to cause those page faults once — and hold on to that memory. That's why we need to tune glibc to make sure it's always using the heap, never going to memory mappings.

We also need to lock down the allocated pages, because we want to make sure that memory we've allocated never goes back to the kernel and is never recycled for something else. Typically we know this as swapping. And some of you out there might say: we have an embedded system, we don't do swapping.
You're wrong. There are pieces of your process address space that are available on disk — the text segment, for example. If the kernel really runs low on physical RAM, it will start recycling all the pages where it knows it can get a copy back, and that includes the text segment. So the code that's actually being executed could get paged out. And that would be horrible, because it means the next time your application gets the CPU and tries to execute an instruction, that causes a page fault, and the page has to be pulled back from disk, which can be quite expensive. So don't think that just because you don't have a swap file you're immune from swapping; that is not true. We need to make sure we lock down the pages of our real-time applications so that they cannot be paged out.

And the third step here is pre-faulting. This means we want to create a heap and cause all the page faults in that heap up front, so that we have memory available that's already ready to go. The same goes for the stack. We don't want our stack to grow, suddenly move into a new page, and cause a page fault. No matter how deep we go in our stack, we want that memory to have already generated its page faults, so that we don't get surprised if our stack gets a little bit bigger.

Here we can see the first two steps, how we can tune glibc. First, we see the mallopt() function. With the first call we're saying the maximum number of allocations that may be done via mmap is zero, which basically disables it: glibc will never do an mmap to get memory for this application, it will always go to the heap, which is what we want. The second thing we're doing here is disabling the trimming of the heap. Trimming is also implemented in glibc: if there's a large contiguous area in our growing heap that happens to be free, glibc will trim the heap and give that physical memory back, so it's available to other applications on the system. We also don't want that. We're going through the effort of pre-faulting, causing page faults for all of this heap; we don't want to give back any pre-faulted memory. It's ours to keep. That's why we turn off the trim threshold there.

Then there's mlockall(). This is where we lock our memory into RAM, as I mentioned. The flags are MCL_CURRENT and MCL_FUTURE: any memory we've allocated now, or will be allocating, should stay. For much of it — for example, if I don't have a swap file — there's a lot that wouldn't be paged out anyway. But with mlockall() we know for sure that the entire virtual address space of this application cannot be paged out for any reason, under any circumstance.
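In code, those steps look roughly like this — a minimal sketch (the function name is mine) matching the mallopt() and mlockall() calls just described:

```c
/* A minimal sketch of the glibc tuning and memory locking described
 * above; call this once at startup, before the real-time work begins. */
#include <malloc.h>     /* mallopt(), M_MMAP_MAX, M_TRIM_THRESHOLD */
#include <stdio.h>
#include <sys/mman.h>   /* mlockall(), MCL_CURRENT, MCL_FUTURE */

int configure_memory_for_rt(void)
{
    /* Never satisfy allocations via mmap: always use the heap. */
    mallopt(M_MMAP_MAX, 0);

    /* Never trim the heap: keep pre-faulted memory for ourselves. */
    mallopt(M_TRIM_THRESHOLD, -1);

    /* Lock everything we have now, and everything we allocate later,
     * into RAM so nothing can ever be paged out. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");     /* needs CAP_IPC_LOCK or a permissive
                                   RLIMIT_MEMLOCK */
        return -1;
    }
    return 0;
}
```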
Here's an example of how you can do heap pre-faulting. We have a simple function, prefault_heap, that takes some size. It allocates a very large chunk of memory — ideally the worst case for your application: what's the worst-case amount I'm going to need? Maybe 10 megabytes, 12 megabytes, 100 megabytes. We allocate this on the heap, and then we go through it page by page and touch all of that memory, really writing a non-zero value, just to make sure we're definitely writing to each page. That's going to be quite an expensive for loop. And at the end, we do a free.

So: we allocate a big chunk of memory. Since we've already configured glibc to go to the heap, it does the brk and really grabs all of that into the heap. We touch all of it, which causes the page faults, and then we free it. And now we have this huge heap area that has already been pre-faulted, and since we turned trimming off, we know it's not going to shrink back down.

The stack pre-faulting is similar, but instead of doing a malloc to get memory from the heap, we just create a very large stack frame — something we learn we're never allowed to do; in this case, we want to. This example is for a 512-kilobyte stack: I have a function that contains a 512-kilobyte character array in its stack frame, and I touch one byte in each page of it. So basically we've created a huge stack frame and touched all of it, and when the function returns, the frame disappears from the stack. Now we have this huge stack area that's ready and already pre-faulted. And that memory also won't be given back, because we've done mlockall(): the kernel cannot take back that stack memory. It's ours to keep.

Now, both of these functions, prefault_heap and prefault_stack, we only have to run once in our application, right at the beginning, at startup. So startup is maybe a little bit slower, but then we have all of our memory available and ready to go. So that was memory management: avoid the page faults.
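The slide's code would look roughly like this — a minimal sketch reconstructing what the talk describes (the page-size handling is my own detail):

```c
/* A minimal sketch of the two pre-faulting routines described above;
 * the names match the talk's description, the details are mine. */
#include <stdlib.h>
#include <unistd.h>

#define PREFAULT_STACK_SIZE (512 * 1024)   /* worst-case stack depth */

/* Grow the heap by 'size' bytes and touch every page once. */
void prefault_heap(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    char *buf = malloc(size);

    if (!buf)
        return;
    for (size_t i = 0; i < size; i += (size_t)page)
        buf[i] = 1;   /* non-zero write forces the page fault */
    free(buf);        /* the heap stays grown: trimming is disabled */
}

/* Create a huge stack frame and touch every page of it. */
void prefault_stack(void)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile char buf[PREFAULT_STACK_SIZE];

    for (size_t i = 0; i < sizeof(buf); i += (size_t)page)
        buf[i] = 1;
}
```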
Now we're going to talk about locking. POSIX provides the mutex for locking, and the mutex is a very important object for synchronization because it has the concept of an owner. If a certain task on the system takes a lock, that task is the only one on the whole system that can release it. This is an important attribute, because if one thread takes a lock and another thread comes along wanting that lock, the kernel needs to know: who do I need to put back on the CPU to free this lock? The only way it can know that is if you're using a mutex. If you're using something like semaphores, or some other locking mechanism where there are no owners involved, the kernel cannot help you, so you're guaranteed to have priority inversions there.

Also really important is the priority inheritance that I already showed you. Unfortunately, this feature is not the default for pthread mutexes, so you actually have to turn it on. I don't know why you would not turn it on. And then there are a couple of attributes, robustness and shared, that you might have to turn on if your mutex is sitting in shared memory and you have a multi-process application rather than a multi-threaded one.

So this is just a really simple example: you have a mutex attribute object, and I can initialize this attribute object to say, for example, that I want to use priority inheritance. Then I can initialize my mutex using this attribute object, and then I can do locking and unlocking. Pretty common; most people are probably aware of this.
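That simple example looks roughly like this — a minimal sketch (function names mine) of a mutex with priority inheritance turned on:

```c
/* A minimal sketch of a priority-inheritance mutex, matching the
 * simple example described above. */
#include <pthread.h>

static pthread_mutex_t lock;

void init_pi_mutex(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Not the default: turn on priority inheritance explicitly. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);
}

void critical_section(void)
{
    pthread_mutex_lock(&lock);
    /* ... shared data access; if a higher-priority task blocks on
     * 'lock', our priority is boosted until we unlock ... */
    pthread_mutex_unlock(&lock);
}
```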
The pshared and robust attributes are of interest when you're doing multi-process work. If I have multiple processes — not threads but processes — and a mutex sitting in shared memory between them, it is critical that I activate the shared attribute. There's also a feature called robust, which I'm not going to go into, but there's a man page you can read on how robust mutexes work. This is also very interesting for multi-process architectures, where multiple programs communicate through shared memory with a mutex inside. Just be aware that if you turn on robust, there's also a lot of code you have to write. You can't just turn on robust and everything's fine; there are more things your application has to do at that point. So that was locking.

Another topic is signaling. If two threads want to signal each other, there's something called condition variables. Condition variables are very nice because they connect a wait object with a locking object. If you have the pattern in your code where I'm waiting for something, I wake up, and the next thing I do is grab a lock — if you have that pattern, that's exactly what condition variables were made for, because they connect the wait object with the lock object.

If you're doing anything with real-time, you probably want to avoid UNIX signals, just because with signals you have a really hard time controlling the environment: when the signal handler runs, you're in an environment you're not really in control of. With signals you basically register a callback, and when that callback gets called — what's my priority? Am I allowed to take locks? What's my scheduling policy? You actually don't know, and it depends, also on the glibc or libc implementation you're using. Just try to avoid signals. The only thing signals are really good for is terminating your application.

Also really important with condition variables — and we're going to see this in a minute — is that whoever sends the signal must send it while still holding the mutex.

So here's an example, first of all, of initializing the condition variable. There is also a condition variable attribute object, but this condattr is only needed if the condition variable is sitting in shared memory for multi-process use. If it's just sitting in memory shared between different threads of the same process, you don't need an attribute and can just pass NULL as the second argument.

And here's an example of what the code looks like. We have the waiter, and we see that the waiter also has to take the lock. Then it calls the cond-wait function and goes to sleep waiting, and in the moment it begins to wait, the lock is released. Going to sleep and releasing the lock happen as one atomic operation, and when I wake up again, I receive that lock again. So from the perspective of the waiter, I never lost the lock: I grab the lock, I wait, and when the wait function is done, I still hold the lock. But I got it back, and then I have to unlock it again.

If I'm the sender, the one doing the notification, I also grab that lock, work on the shared area that it synchronizes, then I do a broadcast to wake up whoever is waiting on this wait object, and then I do the unlock. And here's what I was talking about: it's important that I do the broadcast before the unlock. The reason is that in the moment when I'm holding a pthread mutex, I might be extremely critical for the system without being aware of it. We saw this example of priority inheritance, the priority boosting; that will only happen while I'm holding a mutex. So it might be that all of a sudden I'm extremely important for the system, maybe my priority was boosted — this is the moment I should send the broadcast. If I wait until after I release that mutex, I might never get to the broadcast: maybe in the moment I release it, I'm not important anymore, and there could be some higher-priority task waiting on that condition variable. So it's really important that I do the broadcast before the unlock.
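Put together, the waiter and sender sides look roughly like this — a minimal sketch with my own variable names, with the broadcast deliberately inside the critical section:

```c
/* A minimal sketch of the waiter/sender pattern described above.
 * For real-time use, initialize 'lock' with the priority-inheritance
 * attribute shown earlier instead of the static initializer. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool data_ready;

void waiter(void)
{
    pthread_mutex_lock(&lock);
    while (!data_ready)                   /* guard against spurious wakeups */
        pthread_cond_wait(&cond, &lock);  /* atomically sleeps and drops the
                                             lock; holds it again on return */
    /* ... consume the data, still holding the lock ... */
    pthread_mutex_unlock(&lock);
}

void sender(void)
{
    pthread_mutex_lock(&lock);
    data_ready = true;
    pthread_cond_broadcast(&cond);  /* broadcast BEFORE unlock: we may be
                                       priority-boosted right now */
    pthread_mutex_unlock(&lock);
}
```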
When you're dealing with clocks, it's also important that you use the monotonic clock. The monotonic clock is a clock that always moves forward. It still respects the human definition of a second, but it doesn't get changed twice a year for daylight savings, it isn't pushed forward and back by NTP adjustments, and it isn't affected by time zones or anything like this. It always just moves forward, ticking in our correct units of seconds, microseconds, nanoseconds, but always in one direction, which makes it a much better clock to work with.

It's also important that you use absolute times. If I'm in a cyclic task, which we'll see in a second, I shouldn't try to calculate how long I want to sleep until the next wakeup. I should just increment an absolute time and wake up at that time. And here's an example of that. If we look down here at the cyclic task's main function, we're sitting in a loop. Before we enter the loop, we read the clock once. Then in this loop we just do our critical real-time work, increment the absolute time to sleep until, and go to sleep. We wake up, do our work, increment that absolute time — we're not even checking the clock again — and go back to sleep. This way, even if we look at what this program is doing ten years down the road, it's still waking up exactly when it's supposed to, maybe exactly on the second, every second, forever, because we're using absolute times. If you instead calculate how long you should sleep, you get a lot of jitter and variation.

The only other thing to mention here is this norm_ts function up here. It's implemented because, by the definition of a timespec, the nanosecond field is not allowed to hold more than the number of nanoseconds in a second; norm_ts is just a normalization function to make sure that stays true.
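Reconstructed from that description, the loop looks roughly like this — a minimal sketch assuming, say, a one-millisecond period (the period and names are my own choices):

```c
/* A minimal sketch of the cyclic task described above: CLOCK_MONOTONIC
 * with absolute wakeup times via clock_nanosleep(). */
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define PERIOD_NS    1000000L       /* 1 ms cycle */

/* Keep tv_nsec below one second, as the timespec definition requires. */
static void norm_ts(struct timespec *ts)
{
    while (ts->tv_nsec >= NSEC_PER_SEC) {
        ts->tv_nsec -= NSEC_PER_SEC;
        ts->tv_sec++;
    }
}

void cyclic_task(void)
{
    struct timespec next;

    clock_gettime(CLOCK_MONOTONIC, &next);  /* read the clock once */
    for (;;) {
        /* ... do the critical real-time work here ... */

        /* Increment the absolute wakeup time; never recalculate a
         * relative sleep from "now". */
        next.tv_nsec += PERIOD_NS;
        norm_ts(&next);
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}
```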
When you want to evaluate a real-time system, cyclictest is probably the best tool for that. cyclictest is simply a tool that measures and tracks latencies at a certain priority level. I typically call it as cyclictest -S -m -p with the priority level whose latency I'm interested in measuring; the other command-line arguments are quite important too. Generally you start this tool, and basically all it does is go to sleep; it knows when it's supposed to wake up, and when it actually does wake up, it checks the clock to see what the latency was. And it does this again and again. It builds statistics, and you can generate histograms of the latency at that level. And I say "at that level" because at priority 20 there might be different latencies than at priority 90, since there are other real-time applications running. I don't just want to know my latency at priority 99; that's not very useful. If my application runs at, for example, real-time priority 30, then running cyclictest at 31 is a good way to see what kind of latencies priority 30 can expect while my application is running. The reason I say 31 instead of 30 is that we don't want the priority-30 task to affect the latencies of the test itself.

Now, while cyclictest runs, you want to generate a lot of load on the system and basically try to cause latencies. There are tools like hackbench, which generates scheduler load. You can have an external ping flood — not internal, but coming from outside — which generates a lot of interrupts. I can run top -d 0 over serial or SSH, which generates a lot of packets and a lot of interrupts; particularly over serial, that's pretty painful. I can provoke the OOM killer: I take every page of free memory, the system starts to panic a little, and it starts killing applications in order to free up memory. Even in that situation, my real-time application should run fine if I've configured everything correctly and applied the tips I showed you before. There's the stress-ng tool. And obviously you want to stress all the components of your system: if I have Bluetooth, I should be doing Bluetooth stress testing; if I have an SD card, I should do SD card stress testing. You're testing all of these components because you really want to shake out whether there's any possible code path where you really do have a high-latency situation. Whatever worst case you find, that's the worst case of the system. If you see a value you don't like, that's what you've got: maybe it's a bug in the kernel, maybe your kernel isn't configured correctly, maybe your applications aren't configured correctly, but that's the reality of the situation. Lastly, don't forget to test an idle system. Sometimes the biggest killer of real-time applications is the idle system: if I set cyclictest to wake up once a second and the system keeps going into low-power CPU modes and things like this, that actually might be my worst latency.

We're running out of time here, so I just want to quickly mention a couple of things. perf is a great tool for tracking and counting events — for example, page faults, cache misses, things like this I can track with perf. There are also some examples here of how you can call it. For example, if my system is running in a pretty critical situation, it's nice to just start perf and have it track: are we getting page faults in this situation? Where are the CPU cycles going? We can look at which task or which symbol is causing the most CPU cycles, things like this. perf is a really powerful profiling tool for identifying places in your system or in your application where you have optimization opportunities. If I see that my application is causing a lot of page faults, that's obviously something I can fix; I've shown you how to fix your page faults.

There's also a tracing infrastructure available in Linux. This doesn't just profile what happens; it actually takes timestamps of the events, so you can see when they happen. Here's an example with trace-cmd, where I trace wakeup and scheduling events, and you can even look at these graphically. In this particular picture I can see that mate-terminal lost the CPU, probably because of its time slice, but strangely enough, CPU 0 is free here. There's actually a moment where mate-terminal is in the runnable state while I have an idle CPU. That looks like it could be interesting: why is our application not on a CPU when we have a CPU free? With tracing you can actually see these things, and it can go really deep. We can see the events for when, for example, the wakeup happened and we're coming out of a pthread mutex lock, where the scheduler switch is happening — that would actually be the broadcast there — and this one, coming out of the wait, would be the condition variable. And with KernelShark you can very easily measure these distances, so you can get really good microsecond-accurate values for how long these things are taking, and you can actually see what's happening. It's nice to view things graphically, because you can see what's going on.

The last thing here is the optimal real-time configuration for your kernel. This is basically just a list of options. They don't necessarily mean that your kernel is configured incorrectly, but they really are things to look at. If I look at somebody's kernel and I want to see whether it's optimal for real-time, these are the configurations I look at, and then we ask: okay, you're using CPU idle — is that something you actually need? Do you need the low-power CPU states or not? Is power saving important, or is real-time more important? Things like this. These are just some things to be aware of when you're looking at the kernel.

Lastly, here is the checklist you've all been waiting for. Consider your real-time priorities, and also your non-real-time priorities; I can't emphasize that enough. Use nice and control groups to your advantage, so that the things that are not real-time are still running the way you want them. Set the CPU affinities. Avoid page faults; they are deadly for real-time. Make sure you're using the monotonic clock with absolute times. Make sure your kernel is configured optimally. Don't use signals for anything except killing your application. Priority inversion.
It's the killer for all real-time applications. And make sure you verify your results. Don't just assume you've programmed it correctly: trace it, analyze it, profile it, and look at the situations you think are happening. If you think there's a situation where priority inheritance should kick in, trace it, see it happening, and make sure: yes, we did implement it correctly, I can see the priority boosting happening. Now you know you did it well — because maybe you forgot to activate priority inheritance for the mutex, and then it won't happen, right? You need to see it happening. So really trace and verify those results.

Lastly, NMIs, non-maskable interrupts, are the absolute killer on any real-time system, because Linux can't do anything about them. They're just going to come and jump into some BIOS vector or something. Make sure you know about them, make sure you know how to avoid them, and talk to your hardware vendors to get that cleared up.

So thank you. I went a little bit over; I hope that's okay. I really appreciate you taking some time to watch this. A virtual conference is a little bit difficult this year. The real-time wiki site is at the address I've got there: go there, there's information about configuring a real-time kernel and also examples of writing applications, and if you see that an article is missing, write one. It's a great place to go. So I thank you for your time, and have fun with real-time Linux.