Hi folks, very happy to be here, and thank you for taking the time to watch this presentation; I hope you enjoy it and learn something from it. Allow me to briefly introduce myself. My name is Kaiwan, Kaiwan N Billimoria. I've been in the industry a while; the slide has the details about me, so I won't blab on, I'll leave it to you to read. More relevant to our topic is my GitHub page. I have several repositories there, a few of which have actually managed to grab a star or two, so please do take a look. One thing I am proud of: from 2018, Packt Publishing contacted me, and I've been lucky enough to be contracted to write some books on Linux, which I've been doing alongside my regular work. As of now I have written these four books. Please do pick them up, have a look, and let me know if you like them; point out mistakes too. All of them have open-source GitHub repos, so make a pull request and we take things forward that way. And this is my author page on Amazon. Anyway, enough about me; what are we here for? The title of this presentation is "Leveraging the Linux CPU Scheduler", and the idea is to show you things like the ability to write real-time apps on Linux, scheduling, and so on. Let me get into it; I hope my screen is showing correctly. Okay, cool. Before getting into the agenda and the actual material: you'll find all of it, the PDF slides and the source code, on this GitHub repo of mine, so please clone it, run the code, read the presentation, and enjoy. In terms of what we're covering, we want to get into the meaning of real-time and several other things. So let's get started. What exactly do we mean by real-time? I think we all intuitively know.
It's the situation where it's not just about getting the result right, but also getting it on time. In other words, there are time constraints, deadlines, and we have to meet them. This is unlike most of the apps that most of us are used to writing; the apps that run on typical systems are often non-real-time. Real-time apps exhibit certain characteristics, and one of the key ones is determinism: we need a deterministic response. No matter the load on the system, we must still respond within the given deadline and get the job done within it. Of course, the real world being what it is, we cannot absolutely guarantee a deterministic response, even in real-time apps; to a small degree there will always be some variance in meeting the time constraint. At times you might do a little better, at times a little worse. The term used to describe this variance is jitter, and we look to keep the jitter as low as possible in our real-time systems and apps. The good thing is that this kind of thing can be measured; I'll show you some benchmarks later when we come to Linux doing this kind of work. Another key point, one that's misread pretty often: we tend to think that real-time means real fast. Not necessarily. It does sometimes, but not all the time. What I'm getting at is, for example, something like a nuclear power station: you have a controlled reaction, and what if it goes wrong? That could become a disaster, as we're all aware. So the engineers have a means to stop the reaction, the insertion of cadmium control rods or whatever; I forget the details I learned back in high school. The point is, there will be a given amount of time to do it, and it's fairly generous, probably in seconds or perhaps even minutes.
I don't really know exactly, but it's not milliseconds or microseconds; yet it still very much is real-time, because if we don't get it done within that deadline, it's an unmitigated disaster, as we all know. Moving along: in order to run real-time apps, we require what's known as an RTOS, a real-time operating system. I know the question is: is Linux one? I'm getting to that very shortly; for now, let's just talk about real-time. Another key characteristic of real-time is that most of its algorithms tend to be O(1) algorithms. You've heard of time complexity: O(1) is deterministic, meaning that even under worst-case load the algorithm still completes within a certain bounded worst-case time. Those are super algorithms, and as a matter of fact even the vanilla Linux kernel uses many such algorithms in its normal day-to-day work, which is fantastic. But real-time needs O(1) pretty much everywhere. So we could, very broadly, classify operating systems like this: at one extreme the so-called GPOS, at the other extreme the RTOS. GPOS, of course, is an abbreviation for general-purpose operating system, a non-real-time operating system. Linux is, in fact, a GPOS, as are Windows, Unix, and macOS; these operating systems were never designed to support real-time. It was never the intention, so they don't. An RTOS, on the other hand, is explicitly designed to support real-time, obviously, and that is what you use it for. So in terms of real-time, here's the interesting bit: on the scale of real-time response, the GPOS sits at the completely non-real-time end, and at the other extreme, with the RTOS, we have what's called hard real-time. Hard real-time covers the exciting domains, the cases where you must meet the deadline every single time. Absolutely essential. You can't miss the deadline.
Missing it could result in financial loss, loss of human life, both, and all kinds of bad stuff. When you have such requirements, you need an RTOS; there's no other way. The interesting thing is, there is something in between that some folks refer to as firm real-time, and further to the left, less stringent than firm, something referred to as soft real-time. Soft real-time is interesting to us. Why? Because the vanilla, generic, mainline Linux operating system easily qualifies as a soft real-time OS. But what does that mean? Let's come to these step by step. So here we are, and I think this is pretty obvious: a GPOS is non-real-time. There is little to no determinism, jitter can vary to any extent, and frankly we don't care too much, because we aren't writing those kinds of apps. Your typical business app, enterprise database stuff in general: we're not writing it to be real-time, at least I'm assuming so. With your typical web apps, nobody's really measuring whether it ran one second faster or slower than yesterday; we frankly don't even care to measure. As long as it works, we're happy. Those are the non-real-time apps, which a GPOS runs perfectly well. Then we have a GPOS with soft real-time, and this is an interesting case, because Linux easily qualifies. Soft real-time is all about best effort: the code, the algorithms, will strive to their best extent to meet all deadlines all the time, but they don't guarantee it. You might meet deadlines pretty much all the time but miss some of them, and that's okay; that's exactly what soft real-time means. You're not meant to use this stuff to control an aircraft or a submarine. There is some jitter; there is determinism, but not complete determinism, and deadlines don't always get met, but that doesn't have a major impact. It's more of an annoyance. Consumer electronics is a great example.
On your smartphone, you're watching a streaming video; at times it stutters, literally jitters, buffers, and it's annoying. You're listening to music and it glitches; annoying, but you won't die, right? Unlikely. Okay. Firm real-time sits in between. Hard real-time is completely deterministic, and that's where we use an RTOS. I've already said it: we must always meet deadlines, otherwise it's disastrous, and the examples are pretty obvious. Many kinds of transport: modern cars all support ABS, anti-lock braking systems, an amazing technology, right? The wheels are braked at different points using quick calculations by the computer, and thereby it prevents you flying off, say, an icy road. Stock exchanges are a great example of hard real-time systems. Medical electronics: you don't want the pacemaker in your heart developing variance in how it paces, right? And so on. Nowadays, drones are probably a good example too. Having said all this, I'd like to reiterate this fact: where is Linux, our beloved Linux operating system, in all of this? Vanilla, mainline Linux is capable of soft real-time, but not capable of hard real-time. Perhaps that's a bit of a disappointment: "Hey, I came here because I thought we can do the really cool hard real-time stuff with Linux, but now you're saying we can't." Well, guess what? We can, and that's an incredible thing. That, as you know, is the power of open source, the beauty of open source. I keep telling people: because the source is open, people have taken the source code and modified it to fulfill their objectives, and a team of people did exactly that with vanilla Linux, converting it into an RTOS. But hold your horses, folks; we'll come to that aspect of Linux later in this presentation.
Before that, I'd like us to gain more understanding of the process and its state machine, leading up to scheduling fundamentals and, to some extent, how scheduling is done (we're not really going into the deep internals), and then to actually write a multithreaded app which performs as real-time, soft real-time of course. Okay, so let's get going. There is obviously some technical background necessary to understand these things. On Linux, we've got processes executing, and we all know what a process is: it's really just an instance of a program in execution. Processes can be single- or multithreaded; a thread, of course, is an execution path within a process. Now, let me switch to my terminal window, which I've kept open over here. I'm running standard Ubuntu Linux, the 22.04 LTS release, and you can see it's a fairly recent Ubuntu kernel, 5.19-based. Good. So what are we getting at? If you use the ps command, you'll be able to see all processes alive; `ps -A` or `ps -e` will reveal that. And on my box, by the way, I'm running Linux natively on my laptop, there are many processes alive. Many: how many? That's easy. Right now, it's 378. But I'm sure you realize these are processes only; we don't see their threads. With GNU ps, the ps we're running, it's always possible to look up many, many more details, so let's look at the threads of every process. The command is `ps -LA`; think of Los Angeles and let's go Hollywood. This time you'll see a lot more, and here they are. So here's the deal: a new column has come up, LWP, the lightweight process, which in effect is the thread. Here's how you read this output: if the PID and LWP match, it's the main thread of the process. If the left-hand column, the PID column, repeats, the process is multithreaded; else, it's single-threaded. Right? So check this out.
It's pretty clear all these are single-threaded, and they are. In fact, as I'm sure many of you know, besides systemd, which is of course our init manager, the modern one on Linux, all the threads that follow over here are actually kernel threads, or kthreads. Just to quickly show you, using the BSD-style syntax, `ps aux`, is one way: the processes whose names appear in square brackets are kernel threads. That's a quick way to know. So we do have a few dozen kernel threads alive on the system. They are threads that run in kernel mode, with kernel privilege; they have some housekeeping to do, and they do it. But the vast majority of processes and threads are in user mode, and it's pretty much always like that: it's user mode that hogs the resources, CPU, memory, network, disk I/O. So, coming back to Los Angeles, `ps -LA` gives us a list of all processes and threads. Now let's look: are any PIDs repeating? We should come across some. Can you spot any? I haven't seen them yet, so let's scroll down further. Where are you? Okay, I guess we see some now. Check this out: what does this tell us? It tells us here's a multithreaded process, called boltd, whatever the heck that is, with a total of three threads; this is the main thread and these are its two worker threads. And we'll now start seeing many examples. Here's another multithreaded process; I run an IoT service called remote.it, it's very good, and clearly it's multithreaded. And we'll find a lot more. Okay, cool. Also, you can see that the number of threads far outweighs the number of processes, which makes sense. Cool, so we come back here. Now, when a process executes, it goes through several well-known states that are maintained by the Linux kernel. We need to understand these states and the transitions between them; we call this the state machine.
This diagram is from my system programming book, so let's check it out. A process, and let me interject with one thing: this applies equally to threads, is born. The moment it's fully born, the Linux kernel makes it schedulable, which means it's a candidate for the CPU scheduler, which really means, under the hood, that it goes onto some kind of data structure called a run queue. Now, over here it perhaps looks to you like an array, but it isn't necessarily an array; it could be anything, a linked list, a tree of some sort. Frankly, it doesn't matter right now: there is a data structure, and once the process or thread is on it, it's in the ready-to-run state. In effect it's saying: "I want to run; give me the CPU." The scheduler is the arbitrator; it's the one that decides, among the candidates on the run queue, who runs next. It picks one of them and context-switches to it, and that task is now running on the CPU, which I show over here as "running on CPU". So we have ready-to-run and running, but you know what? The Linux OS sees this entire thing as one state, not two, and just calls it R. Now when I say it calls it R, what do I mean? Can we see this somewhere? Yes, we can. It's our good old ps again, but instead of just running ps, use `ps -l`, a long listing just like `ls -l`, and look at the second column: the second column is the state. See, we see an R over here for ps itself, meaning it's either ready to run or running; that's how we interpret it. Great. Here's Bash, our shell, of course, and it's in the state S. What does that mean? That's what we're going to see. So once a process is running on the CPU, what happens? Several things can happen. Usually it runs and runs, and often, not always but pretty often, it hits what's called a blocking call, a blocking API.
So, as I'm sure you know, what does this mean? A blocking call is one where the process is put into a sleep state, denoted by this box over here. Under the hood, the reality is that it's dequeued from the run queue and enqueued onto another type of data structure called a wait queue. You can have any number of wait queues, because the kernel and our drivers set them up, and a wait queue is associated with an event. Why? Because the task is blocking on an event, on something occurring, and until that something occurs, it waits; in other words, sleeps. So what do we mean, sleep? There's no comfy bed or down pillow in the computer, right? Okay, forgive my silly jokes. Sleep really means: you are not a candidate for the CPU scheduler. You won't run, but you're very much alive; you're waiting for something to occur. For example, in a C program, if you call sleep(5), the sleep function with parameter 5, what are you saying? You're saying: keep me asleep until five seconds elapse. The event you're waiting upon is the elapse of five seconds of time, right? When that event occurs, the kernel, or perhaps the underlying driver, wakes you up, and when you wake up you become runnable again: you are dequeued from the wait queue and enqueued on a run queue, and now you're again a candidate to run, and you will run in the near future. That's what it means. Okay, so a blocking call is a possibility. But check this out: there are two kinds of sleep, or blocking. One is an interruptible sleep, denoted by a capital S; that's this guy. The other is an uninterruptible sleep, denoted by a capital D. The meaning is with respect to signals. By the way, all the signals the platform supports can be seen with `kill -l` (list); these are the signals the Linux OS supports.
If any of these signals arrives while you're in an interruptible sleep, you will react: you'll run the signal handler. The signal might kill you, it might stop you, or you might run some custom code because you've installed a signal handler. However, if the driver or the kernel has put you into an uninterruptible sleep, which is controllable via kernel-level APIs like wait_event(), then no signal can disturb you, not even `kill -9`. So that's about that. Coming back here: what else can happen to us while we're running on the CPU, besides going to sleep? We can be sent a signal, the stop signal, which puts us into a state called the stopped state, denoted by the letter capital T. When we're in the stopped state, we're frozen, suspended; we're definitely not dead, you should realize that. Okay. And how would we get into the stopped state? Signal number 19 is SIGSTOP, and signal number 20, SIGTSTP (terminal stop), has the same effect. When you press Ctrl-Z on the keyboard, it delivers that signal, which puts us into the stopped state. Give me a sec. Right, so we're stopped. If we're stopped, we're not running, so how do we continue? Well, the question answers itself: there's a signal called SIGCONT, the continue signal, and once you receive it, you again become runnable. How is this signal sent to us? Several ways. One of them is the job-control commands of the shell, `fg` and `bg`, foreground and background; under the hood, they deliver SIGCONT. I won't go further into those things; just look up job control for more details. What else can happen to us over here? Well, the only other thing that can happen is that we die. Death, too, is a state we pass through, and on the way to it there is a transient state called zombie.
And if you've been programming on UNIX or Linux for long enough, you surely know about zombies. It's not our topic here, so I won't delve into it, how we prevent zombies and all of that; with Linux it's really easy. I'll move ahead from here, otherwise we won't finish. So let's move forward. Having seen the state machine, let's now start understanding scheduling on the Linux OS better. Here's a really key point; just think about it. Say, for simplicity, we have three processes alive: P1, P2, and P3. P1 has one thread, it's single-threaded; P2 is multithreaded with two threads; and P3 has three threads. Let's just assume, for the moment, that all of them want to run. How do they compete for the CPU resource? Is it the processes that get scheduled by the scheduler? Is it the threads that compete for the CPU, the threads of the processes? Is it that, within a process, the threads compete? Or is it something else altogether? In effect, we're asking: what is the atomic unit of scheduling? What is it that the scheduler schedules, that competes for the CPU resource? In computer science this is called the KSE, the kernel schedulable entity, and on the Linux OS it's very clear: the thread is the KSE. This is very important to understand and internalize. I've already said what a process is, and we know about these things; it's very much a Unix thing. But multithreading has been a reality for a long while now, so we have processes as well as threads, and threads are an execution path within a process. Every process requires at least one thread, because a thread is an execution path, right? It's what executes code. So minimally you'll have one thread, and in fact that thread is the main thread, also called the T0 thread.
If we have more than one thread, then of course it's a multithreaded (abbreviated MT) process. Okay. So where does all this live? We visualize each process as living in a sandbox, what we call the virtual address space of the process. Have a look: this is the user-mode virtual address space of a process. Without going into too many details, it begins at the low address, zero, and goes up to the high address. I'm not going to be technically accurate here, just aiming for understanding: on a 32-bit system, the high address would be 2^32, which of course is 4 gigabytes; on a 64-bit Linux, it would be 2^64, an incredibly large number, 16 exabytes (16 × 10^18 bytes). Now, I just said this isn't perfectly accurate: the reality is that the virtual address space is shared between user mode and kernel mode, with the kernel on top, so the split is at an in-between address. But in this particular presentation I'm not delving into all of those details; just keep it in the back of your mind. Because, after all is said and done, Linux is monolithic, a single piece of stone: user mode and kernel mode are one piece within the entire virtual address space. Anyhow, let's move along. The virtual address space is divided into homogeneous regions known as segments; a better term is mappings. Each of the regions you see here is a mapping. We have a mapping called text: the machine code. The instruction pointer, the PC, iterates over it, and that's how your code runs. Then we have the data segments, of which there are three, and we know the heap is a dynamic segment which grows towards higher virtual addresses. The stack, the stack of main(), sits at the top of the user-mode virtual address space, and on all modern processors it's a processor feature:
the stack grows down, towards lower virtual addresses. In between, we have the library mappings. Think about it: even "Hello, world" wouldn't work if you couldn't map in the text and data of glibc. So glibc gets mapped in, and all of this is done by the loader, via mmap(), at load time. Now, this is just to give you an overview; let's leave it at that. Okay, so let's move on. This next slide is a reiteration of the previous points; you know these things now. And here we come to a key point regarding a little bit of kernel architecture on Linux. Let's assume we have a multithreaded process, say with three threads. Obviously the operating system has to track every process, and it does: it uses a metadata structure called the process descriptor, like the classic UNIX PCB. But it's terribly named: though the name is process descriptor, it doesn't track a process, it tracks a thread belonging to a process. The reality is that for every thread alive, the kernel keeps a metadata structure to track it and hold its attribute information. It's colloquially known as the process descriptor, but I'd prefer you think of it as the task structure; "task" is a much nicer word, and a thread is a task. Okay. So every thread alive has its own kernel task structure. Now here's another thing: we need a stack to support execution, because it's the stack that holds execution context; call frames are, we say, pushed and popped as you execute functions and return from them. When you're executing user-mode code, you're using your user-mode stack. But the moment your thread issues a system call, you switch to kernel mode and begin to execute kernel code in the context of the same thread; we call it process context. Now, if we're executing code that calls functions, and of course you're going to be calling functions, don't we need a stack? Yes, for sure.
We can't use the user-mode stack. So, guess what: there's another stack, allocated in the kernel, for our use when we're running in kernel mode, and that's called the kernel-mode stack. It's important to understand the distinction. In a nutshell, for every privilege level the OS supports, we have a stack. Our modern OSes support two privilege levels, user and kernel, so we have two stacks: a user-mode stack and a kernel-mode stack. Well, every rule has an exception: a kernel thread has only a kernel-mode stack, because it never sees user space, only kernel space. Fine. Now, I've put up a diagram here which I hope makes this clear. Let's visualize three processes alive, two of which are multithreaded, plus a few kernel threads. Of course, the picture is simplistic; there's a lot more to the kernel, but for our understanding I think it's fine for now. So check this out: we've got P1, P2, P3 with different numbers of threads, and look at the mapping. In the kernel, there's a task structure representing each thread, and a kernel-mode stack for it. Because P2 has two threads, we've got two task structures for P2 and two kernel-mode stacks. I haven't shown the user-mode virtual address space, or the user-mode stack, which of course is part of it, in this diagram; it would get too crowded. In this example we've got five threads, so five task structures, three of which I've shown here, and five kernel-mode stacks. We've also got kernel threads alive and running; each kernel thread has a task structure and a kernel-mode stack, but no user-space mapping: there is none. This, I think, is an important diagram; it helps us understand the basics of kernel architecture. Now, since we're biased towards scheduling: the task structure contains all the attributes of the process/thread, and that includes the scheduling attributes, which I just show as this yellow box named "sched".
You can see in the legend here that, internally, it contains all kinds of scheduling attributes, and I want you all to realize that you can look any of this up at the level of the code; again, the beauty of open source, right? So let's do one quick thing. I keep this website bookmarked: Bootlin, a wonderful French company, has an online kernel source browser. We're on the latest stable kernel here, but you can look up pretty much any kernel. Let's search for the task_struct. It finds it, it's over here, in a header, and we click on it. Check it out; brilliant, right? Here's the task_struct, and it's a big struct with many members. This is per thread, and all the attributes are here. It would take far longer than this presentation to explain them all; I won't even try. But look at this: many of these members are to do with scheduling, as I'm sure you can see. We'll refer to some of them, though we're not really going to go very deep into this; still, a lot here is very relevant to scheduling, as is obvious just by looking at it. Great, let's come back here. These things I've explained, and I'm sure you understand. Let me move along. So here's the next key point: what are we after in this whole exercise of ours today? We want to know how, on a per-thread basis, we can leverage the CPU scheduler. Are you starting to see it? All the scheduling attributes are in here, in the task structure. There must be an API, a way for me to query and set these attributes, and thereby influence CPU scheduling. Of course there is. The API necessarily has to cross the user-kernel boundary; therefore, the API has to be a system call. There are multiple system calls which allow us to fiddle with these things, to query and set them. Permissions also matter, and I'm going to talk about that.
And you'll also find that it's not just system calls: we've got pthread (POSIX thread) wrapper APIs as well, which obviously issue the system calls in turn. Cool. Let's keep going until we reach the place where we can actually program these things and try them out for ourselves. I always like to tell people: be empirical. Don't just believe the book, the tutorial, the presentation; try it out for yourself and see. (But you can believe me. Yeah. Okay.) Let's keep going. So: Linux conforms to the POSIX standard, and there are POSIX scheduling policies. What are they? What do they mean? This is a key part of our discussion. What exactly is a scheduling policy? Well, it's an algorithm, implemented in code within the Linux kernel's scheduling subsystem, and it determines how the OS schedules; rather, it implements how it schedules. When I was new to OSes and Linux, I always thought there's just one scheduler, but it turns out it isn't like that. With modern Linux, to an extent, these things are configurable and very much in the hands of the app developer, who gets to decide which scheduling policy to use for which thread. That's the real point, and that's what I'm trying to show you here. So let's get down to it: what are the scheduling policies we're supposed to support? These are the ones, and these are the names given to them: SCHED_FIFO, SCHED_RR, and SCHED_OTHER. SCHED_OTHER is also called SCHED_NORMAL and is in fact the default scheduling policy, this last one. These are known as the POSIX scheduling policies. As a matter of fact, the Linux OS certainly supports these, and it supports more as well: there are ones called SCHED_BATCH and SCHED_IDLE. Honestly, they're not as important, especially SCHED_BATCH. SCHED_IDLE is used when the machine, or rather the CPU core, is in an idle state; it only comes into play at that point. Right.
Also, in the interest of completeness: there are even more, for example the deadline scheduler, but I'm not considering those here; we're going to focus on these three. Okay. Now the obvious question, probably in your mind: "Okay, you've told us there are these three that are important, but I don't understand them. What do they mean?" So let's get into that. There's probably a bit too much text here; this is an extract from my kernel programming book, and of course I'll explain it. We encapsulate the meaning of these scheduling policies in these tables, so let's talk about them. Let's begin, in fact, with SCHED_FIFO. First thing to know: both SCHED_FIFO and SCHED_RR, which of course stands for round-robin, are real-time scheduling policies. That's the first key thing. The second thing I'll say is a reiteration: when I use the word real-time now, I mean soft real-time, not hard real-time, because we're talking vanilla Linux. It does support real-time, but in the domain of soft real-time. Okay, we always keep that in mind. With that in mind, what do these policies, these algorithms, buy us? Think of it this way: if you make a thread run under the SCHED_FIFO policy, and I'll show you how to do that, then, remembering the state machine, once it's running on the CPU it will only get thrown off under three circumstances. What are they? They're mentioned over here. One, it blocks on I/O; in other words, it hits a blocking call and goes into a sleep state. Two, it gets stopped, or it dies; well, obviously, then it's off the CPU. And three, the really interesting case: the moment a higher-priority real-time thread becomes runnable, it will preempt it. Okay, interesting; but what are the priorities? I'm getting to that in a moment, hang on. What about SCHED_RR?
If that's SCHED_FIFO behavior, then what's the difference with SCHED_RR? It's like this: it's identical to SCHED_FIFO, except that it has a finite time slice. Guys, did you notice? I never mentioned the words time slice when I talked about SCHED_FIFO — but doesn't everyone talk about time slices when we talk about scheduling? Well, that's the beauty of this algo: with SCHED_FIFO, you effectively get an infinite time slice. What we come to realize is that a SCHED_FIFO task is a very aggressive task. It latches onto a CPU and it hangs on; it doesn't want to relinquish it. If one of those three things happens, it will relinquish it, but it's never about a time slice. RR is a bit more polite: when the time slice is used up, that's one more reason to get preempted. So what is the time slice? On Linux, all these things are tunable. The default tends to be 100 milliseconds on a typical Linux, but it's tunable via procfs — the sysctl facility. We'll see a little of that later. So, great: they're really the same thing, but RR also has a time slice. Now, it's important to remember that neither of these policies is the default when you create a thread. So what is? It's this one: SCHED_OTHER, aka SCHED_NORMAL. The moment you create a process or thread, it defaults to this policy — unless the parent is of another policy, in which case it inherits the parent's scheduling policy. But let's consider the default case. When you create a thread, it becomes SCHED_OTHER (SCHED_NORMAL). And what does that mean? It means it's very much non-real-time. So what's the algorithm? It's a fairness-based algorithm. It's all about overall throughput and the prevention of starvation of any thread. In fact, the algo used on modern Linux, and for a very long while now, is called CFS, the Completely Fair Scheduler. And in the Linux kernel, we call it a class — we have these modular scheduling classes.
So the CFS class serves this policy, and it's all about fairness. Now, these things are mentioned over here, and I've also mentioned batch and idle, which are less important to our discussion. Now, all of this discussion isn't going to be meaningful unless we understand prioritization and priority levels. So here's a simple way to look at it. The y-axis is the real-time priority, and look: clearly SCHED_FIFO and SCHED_RR are peers — they are soft real-time — and of inferior priority to them are all the non-real-time threads on the OS, which includes SCHED_OTHER, always the default policy, and of course includes these. Okay, so having seen that, now let's figure out priorities. We have a priority scale, and the priority scale for real-time threads — which of course means soft real-time — is from 1 to 99, with 1 being the least and 99 being the highest priority. So here we are: 1 is the lowest real-time priority and 99 is the highest, and this is for these two policies. Okay, that's fine — but by default a thread is never one of these; it's always SCHED_OTHER (SCHED_NORMAL). If that's the case, what is its real-time priority? Quite logically, its real-time priority is zero. So, guys, think about it: if one or more soft real-time threads are alive and runnable, they will always be favored by the scheduler over a non-real-time thread. They will always win, because they're always of superior priority — the non-real-time thread's real-time priority is zero. Okay, now that's fine and that's what we expect, but you might well ask: within non-real-time, where I have most of my application threads, can I have relative prioritization between them? Isn't that important? Of course it is. We need to be able to prioritize among the non-real-time threads, and it's a done thing: an old UNIX facility called the nice value of a process or a thread. It's a bit peculiar, guys. This is how the nice value scale works: it goes from −20 to +19, with a default of zero.
That's the base value everyone starts at unless you program it to be something else. +19 is the worst, −20 is the best. If you're non-root in terms of permissions, you can only make it worse. So if you make your nice value +5, you can't even come back to zero — you can only make it worse than 5. But if you're root, you can do anything; you can make it better. That's why it's called the nice value, guys: the regular user can only manipulate it to make his or her prioritization worse, therefore being nice to others. I guess it's the UNIX chaps' idea of a joke. Okay, cool. Let's move on. I hope you've got this, guys. So, moving along, there are a lot of utilities on UNIX and Linux — and of course I'm talking about Linux — to make it easy for us to script these things and to try them out on an interactive shell. One of the utilities is called nice. We can set the nice value with it, so let's try it out, since we're in the mood to try things out. Okay, so let's do one thing: let's do nice -5 and run something. Folks, I'm going to do a silly thing: I'm going to run vi in the background. Now it's running, but it got stopped, because you can't run an editor in the background. Regardless, I don't really care. The point here is that vi is alive, and well, stopped. But look at this: its nice value is +5. Guys, be careful: that -5 is the hyphen option switch, not the minus sign. If you wanted to make it minus five, you'd do nice --5. Okay. But you might say, it's already alive, so now what do I do? Now we use renice. renice affects a process that's already alive, so we can pass −5 for this process. I guess I did it wrong. Okay, folks, sorry about that — I don't want to spend too much time on this. Actually, the point I'm trying to show is this: if we try to renice something to −5, it fails, because we're not running as root.
But the moment we run as root, it succeeds, and now we're running ps at a nice value of −5. Look up the man page for renice and learn how to use it. I did it wrong — you do it right. I'll leave it to you. Good excuse, right? Okay, let's move along. Folks, there are many useful utilities. One of them is called chrt — "change real-time". In fact, Robert Love, and perhaps collaborators, wrote this utility, and it's very useful. So whenever you come across a new utility, please look up the man page. We're going to use it, and it's so easy: you can set the scheduling policy as well as the priority for a given PID, or you can query them. So we're going to make use of this guy. These things are just valuable, and they even allow us to script things — you could write a script which does this on your product. Why not? Think of the possibilities. So, moving along: I'm sure you've heard of CPU affinity. Okay, let me say this before jumping into it. There's a nice statement about the Linux CPU scheduler: it's perfectly SMP-scalable, which means for each live CPU on the system we have a run queue, and the kernel scheduler treats each as a separate unit, performing scheduling independently on each CPU core — therefore taking advantage of, leveraging, every core on your multi-core box. And this buys us parallelism, which is exactly why we spend all this money on multi-core, multi-processor SMP hardware, and the software effort of writing multithreaded apps and a multithreaded kernel like Linux. Okay. So let's say we have four cores. By default, when you run a process or a thread — and, guys, let's get into the habit of saying thread, because we know the thread is the KSE, the kernel schedulable entity — it can run on any of those cores. That's denoted by having the CPU affinity bitmask set to binary 1111, which is hex 0xF. So you understand, right?
That means it can run on core 0, 1, 2, or 3. Yeah. But guess what: it's in our hands, as the owner of the process or thread, to change that. And of course there's a system call interface, but there's an easier interface: the utility called taskset. With taskset, we can change the CPU affinity mask, and that's wonderful. So see, guys, the help screen of taskset shows us an example — we can query it, we can set it. Say 0x3 is the bitmask. You understand, right? 0x3 in binary is 011, so we can run on core 0 and on core 1, but no other cores. And the OS scheduler will honor this. That's a pretty powerful thing. Folks, a word of advice: it sounds tempting to start setting our own affinity bitmasks and thereby controlling things. But unless you really know what you're doing, unless you have a broad strategy, leave it to the OS. It will do the best job of deciding which core to run a task on. It will load-balance; it understands the CPU domains. You're pretty well off leaving it alone in the general case. Okay. Fine, guys. I wrote a wrapper script — internally it uses chrt and taskset, and it displays all these things to us. I think this is a good time to grab a look at it. Okay. So folks, as I mentioned at the beginning, this is the GitHub site, guys, and I'll put the latest version of everything over here by the time you see it. Worry not. And I'm going to use code from here; in fact, I'm within that folder. So I'll switch to the source directory, the scripts directory. There's just one script — why don't we just run it? Hang on a sec. I have many threads alive; it's going to query several scheduling-related things about them. So let's run it and look at the output. So folks, a lot of output, and we'll interpret these columns. It's quite easy, right? This is the PID, this is the LWP.
And I've even indented the output so that you can see: this is the main thread, and these are its worker threads. Okay. This is the name of the thread. In fact, this is one of the Chrome processes, and these are its worker threads. You can see that they're all running as SCHED_OTHER — it's the default. And guys, just look: by far, SCHED_OTHER will tend to be the policy, because it is always the default. And here's the proof. Coming back here, the next column is the real-time priority, and of course it's zero, because these aren't real-time; they're non-real-time threads. And any guesses, guys — what is FFF? It's the CPU affinity mask, of course, because look, my laptop has 12 CPU cores. So think about it: 12 ones in binary is 0xFFF. It means all these threads can run on any of the available cores. But as another example, this particular thread — in fact, it's a kernel thread — can run only on core number 1 (core numbering starts at zero). Over here it's different, and over here it's different. You'll find that occurring especially with things like kernel threads; it's pretty commonplace to isolate them to perhaps one core. That's fairly common. Okay. So guys, just look: the majority tend to be SCHED_OTHER — the majority of threads alive. And of course, this is a desktop Linux; perhaps on your embedded Linux it's not exactly like this. But folks, it tends to be the rule rather than the exception. I'll keep scrolling up, and when we get to the higher numbers, you'll start seeing some changes. So let's look at this example: we have PID 290. As a matter of fact, it's a kernel thread — it's called an IRQ thread — and it runs at SCHED_FIFO, priority 50. And I put this asterisk here to catch our eye: it means it's a real-time thread. So the moment you see an asterisk, those are real-time threads. So guys, interesting, right?
The SCHED_FIFO thread runs at priority 50 — exactly halfway between 1 and 99. That's very deliberate. This is a feature of real-time Linux that has been backported to the regular kernels we all use; it's called the threaded interrupt. Guys, that's beyond the scope of this presentation, so I won't dive into it further, but look it up. For example, on my box, to serve the Wi-Fi, these are all threaded interrupts, and in fact they run on different CPUs to give more throughput on TX and RX. It's to do with interrupt processing, and you'll find other examples. So here we have a watchdog daemon that again happens to be at priority 50. And look at this: I put three asterisks to catch your eye. Why? Because these are kernel threads running at SCHED_FIFO priority 99. So you might say, whoa, 99 — this guy must be hogging the CPU. Hang on, guys, it doesn't work that way. Think for a moment: this thread is to do with migrating threads to other CPUs when the need arises. So that's the key point: these threads don't run continually. In fact, the majority of the time they'll be asleep. But when they are woken up by the kernel, it's pretty much guaranteed they will run more or less instantly, preempting everything else, because they are SCHED_FIFO 99. So folks, in a way, that is one of the really key points I'm trying to drive at in this presentation: it's up to you as the app developer. You can set your real-time threads to SCHED_FIFO 99 if they are very important — if, when they wake up, they must run, and with priority. That is the point. So you will find a few kernel threads that run at 99. There won't be too many of them, and they tend to be very specialized. So here we are, guys, and the header shows you these things that I've already mentioned. Cool. So let's come back here. Yeah, you can get the script here, and I just showed you this: this is a screenshot of a sample run on my box. Try it out on your box, guys. I don't have to say it.
Always be hands-on, be empirical. Try things out. So let's get a move on — time is getting on. How do we get where we want to get? How do we query or set an individual thread's scheduling policy and priority? So folks, we have APIs: pthread APIs and system calls. This is kind of obvious stuff. Let's move along and talk about the pthread APIs, because that is how we intend to do it in this presentation. So folks, let's get down to brass tacks. Here we are. We have an API: pthread_getschedparam(3) — get scheduling parameters. Of course, you'll remember that the 3 in brackets means section 3 of the manual. Guys, a quick revision — I like this about Unix and Linux: you want help on man? It's man man. And these are the nine sections of the manual. Section 9 is a bit of a lie. Section 2 is system calls, section 3 is library calls, and so on. So if we do this, you see it is in section 3 of the manual, which means it's a library API, and here are the details — and of course I'm going to explain this, folks. Okay. So let's come back here. This API allows us to query the scheduling policy and priority of any given thread, well, within our application. And this one, pthread_setschedparam(3), does the complement: it allows us to set the scheduling policy and priority. Okay. It should be obvious: querying is fine, but setting needs special permission. You either need to be running as root, or — guys, I hope you've heard of the modern POSIX capabilities model. It's a really powerful thing where we split up traditional root permissions into a bitmask of much finer-granularity permissions called capabilities. In fact, it's nothing new — I shouldn't use the word new; it's been there for decades. Unfortunately, many aren't aware of capabilities and are therefore not using them as much as we should. Guys, I recommend you look up man 7 capabilities and read that page. These are the capability bits; they all start with CAP_ — CAP for capability, CAP_foo.
As an example, if you have this capability in your process, you can change the ownership of any file object. You don't need root. Of course, by default, processes have no capabilities. And folks, we have utilities to look these things up and set them: getcap and setcap. In fact, we will use them. So what I'm getting at is that root is nowadays considered the older way — and, to be honest, the less secure way, because running as root attracts malicious hackers. We really don't want that. Just give a few capabilities — you know, it's the infosec principle called PoLP, the principle of least privilege: run your app with the least privileges it needs. And we can enforce this with the capabilities model. So coming back here, guys: besides root, you can use this capability, CAP_SYS_NICE, and it will give the process or thread the ability to set scheduling policy and priority. You don't need root. So that's wonderful, and I'll show you a demo of this using setcap. So guys, now we are starting to learn these things. Besides the pthread APIs, there are system calls, and obviously these are the kinds of calls that the pthread wrappers invoke to actually get the job done — because remember, under the hood, the job is done at the level of the task structure, and only a system call can get us there. So these are the available APIs to set and to get. sched_setattr(2) and sched_getattr(2) are considered the modern ones, but there are others, and you are free to use any of them. Look up the man pages. Okay. Same thing, folks. On Ubuntu Linux, I find it nice to do man -k sched — k for keyword. Let's just do it; be empirical. This is nice: it shows us all the scheduler-related man pages. So guys, we can learn stuff from this. Okay, I'll leave it to you to look it up in detail. The sched(7) page — section 7, informational — is good. Check it out. Fine.
Folks, this is from my system programming book, pretty much showing you the same thing, and these things are explained there in greater detail. I know — too much marketing. Let's move on. Here's the signature. So guys, we want to write code, so let's get down to it. Include the header pthread.h. This is pthread_setschedparam; it's got three parameters. What do they mean? And this is the get variant, with the same three parameters — though over here, of course, the values are returned. So the first parameter is the thread — the one you want to query or set; you give the thread ID, the pthread_t. The second parameter is the policy, and this is a key thing, obviously: this is where we specify the scheduling policy, or get it returned. It will be one of SCHED_FIFO, SCHED_RR, SCHED_OTHER, or even these others. And the third parameter is a pointer to a struct of type sched_param. So guys, as of today, struct sched_param has only one member, and that member is the scheduling priority — and we mean the real-time priority. So you remember the scale, 1 to 99; that's what we're talking about. Okay. And of course, zero means non-real-time, and in that case you should be using the nice value if you want to prioritize. All right, this I have already told you. There are many related APIs; I, of course, have to leave it to you to look them up if you're interested. And there are interesting ones: get the minimum priority, get the maximum priority for a given policy. We have system calls to do all these things. Let's get on with it. So guys, we've reached the point where we make this whole thing practical. Let's look through some code and then run it. Okay. So folks, this is all on the GitHub site. You can see this is the demo app, this is the code, and I wrote a small wrapper script to make it easy to try. There's a small header, and there's a Makefile to build it. So great, let's actually get into this. So folks, I'm going to do a make clean. And here we are.
So I'm going to open the code in an editor, and let's keep it here. And what is our intention? Let me explain that first. Our intention in this demo: we'll create a process — that's what runs — and we'll be placed in main. main is the first thread. Inside main, we're going to create two threads using the pthread_create API; let's think of them as the second and third threads. We're going to use the pthread APIs — you remember pthread_setschedparam — to make those threads real-time, which of course means soft real-time. We're going to give them a scheduling policy of SCHED_FIFO, and a priority that you will decide: you'll pass it on the command line as a parameter, a number between 1 and 99. Okay. Now, to actually see that they are real-time — I mean, we'd like to be able to visualize this stuff — what I do in the code is write a small macro. It executes a for loop, and in that loop it emits a character to the screen using the write system call. So we'll make the second thread write the character '2' and the third thread write the character '3'. The idea being, you're watching the threads execute: when you see 2s coming, the second thread is executing; when you see 3s coming, the third thread is executing. And hey, by the way, we're going to run them at differing priorities, so you will also see the preemption. After main finishes its job, it's going to call that same delay-loop macro and print 'm' for main onto the console. You will find that it won't get a chance until the real-time threads finish — which is the whole point of this discussion: prioritization. So guys, let's look at the code. I don't have the time to read every line, and you don't need me to. We all know C programming.
And I'm assuming you have an appreciation of pthread programming, at least the basics. So folks, let's check this out. In main, these are our local variables, and you know, you're expected to pass the priority — excuse me, the real-time priority. So guys, we query the minimum priority of SCHED_FIFO — it turns out to be 1 — and we query the max — it turns out to be 99 — and we print them out. Guys, this message is just a macro which emits a debug print. All that is simple enough. Over here, we get the priority value. There's a note here, and it's important: we should use robust APIs to prevent integer overflow and similar bugs. It's very important in production, but over here it's fine, as a demo. Now, let's look at the nitty-gritty. Over here, I call pthread_create. And guys, as I'm sure you know, this is the thread ID, and this is the thread function — the life and scope of the thread we create. This is the parameter we pass it: the priority parameter, the number between 1 and 99 that you passed. We do exactly the same thing here for the third thread, with its own thread function, and we pass it the same thing, the priority. Let's have a look at the code of the second thread's function to begin with. Here it is. So see, we've got a local variable, a struct sched_param. And folks, as you know, threads run in parallel. So the moment main creates this thread, it will start executing, and we will come here; this printf will get emitted. Now, I very deliberately go to sleep for two seconds. Remember the state machine — we're in a blocking call. You know why? This allows the main thread some time, a chance to print a few m's onto the console. After two seconds, it will wake up. It sets the priority to whatever you passed — okay, let's say we pass 30, so its priority will become 30. And then, the key part of this program, we invoke the pthread API.
And guys, as the comment says, it becomes a system call under the hood — sched_setscheduler. So we call pthread_setschedparam. The first parameter is the thread ID — pthread_self(), on myself. The second parameter is the policy — SCHED_FIFO, right, soft real-time. And the third parameter is the pointer to the struct which contains the priority. So folks, by the time this call is through, assuming it succeeds, this thread is running as real-time, with a certain real-time priority, let's say 30. Now we emit a small print and we call my delay-loop macro. Guys, all this is in the header file; it's easy to study the code. It prints the character '2' in a tight loop, 350 times. Hey, one more point: you should not compile this program with optimization, because that kind of defeats the purpose. By compiling it for debug, with -O0, you will actually see these prints being emitted. And that's what I'll do. After it finishes printing '2' it will die; it will terminate. Now folks, the code of the third thread is pretty much identical, with one key difference: after getting the priority value — let's say 30 — it bumps the priority up by 10 points. That's interesting: this thread is going to have a higher priority. But hang on, it hasn't taken effect yet; it's still running as SCHED_OTHER, SCHED_NORMAL. We set the member in the struct, and now it takes effect: pthread_setschedparam on myself — this thread, SCHED_FIFO, and a pointer to the param struct. Now, assuming this call is successful, it is running at a higher priority. Shouldn't it preempt the other thread? It should. But look what we do: we make it sleep for four seconds. So don't you see — we deliberately hit a blocking call, going off-CPU, allowing the second thread to print the character '2' to the console for some time. So guys, in effect, once this one wakes up, it is definitely going to preempt that guy.
It's going to run, printing the character '3' to the terminal — in this demo, 210 times. Only when it finishes will it die, and only then will the second thread continue; and only when that one dies will main come over here and print the character 'm' to the terminal 400 times. That is the code. So now let's run it. So see, folks, I'll leave it to you to examine the Makefile. It is kind of detailed, and it's a style of Makefile that I like to call a "better Makefile" template. Please read through it, and I'm sure you'll understand. I use it in my books as well. Have a look, but I don't want to focus on that — we don't have too much time left. So here's what I'll do: I'll build it. Now, there are a few warnings — it's very pedantic — and I'm going to ignore them for now. The Makefile is running the setcap utility as root, and here is where we set the capability CAP_SYS_NICE onto our binary executable. We need to do this as root, of course. It's done, and our programs are ready. Okay, so folks, a quick ls. I should run the debug version; we will ignore the other builds, guys. So let's do that and run the debug binary. When we run it, it asks for the real-time priority as a parameter. So folks, what did we say? Let's pass 30. Now let's have a look at the output. Remember, what should it do? main should run; we should see a few m's. After that, it creates the threads, which make themselves SCHED_FIFO, real-time. Thread 2 should emit 2s, preempting the m's, and after two seconds, thread 3 should print 3s, preempting the 2s. When it dies, we should see the 2s again, and when that dies, we should see the m's. But don't believe what I say — be empirical, try it out. Here are the m's. Ah, it's working, but not how we expect. See, guys — I mean, it's nice, but this is not what we expected. Do you see the interleaving? We can literally see thread 2 and the main thread running in parallel.
And the moment the third thread comes alive, even it is running in parallel. Now, I mean, in a way that's a nice demo of parallelism, okay, but this is not what we expected. By the way, all this is fine — but look, read the output carefully. To run this as soft real-time, we need to run it either as superuser or with CAP_SYS_NICE. Now, we do have CAP_SYS_NICE, but it still doesn't seem to work. What's wrong? Guys, think about it: we are running on a multi-core system. The Linux kernel has the intelligence to simply place the second and third threads onto other cores — that's why they're all running in parallel. So nothing has gone wrong here. In fact, it's taken advantage of our hardware, because remember, I've got 12 cores — more than enough to run three threads. That's why they're running in parallel. So guys, don't you see: to get our demo to work, we need to guarantee that all these threads run on exactly one core. And how do we do that? By changing the CPU affinity mask. Now, I could do it programmatically, but it's a bit of a headache. I mean, see, guys, I remind you: man -k sched, and we will find it. Where has it gone? Here we are: this is sched_getaffinity, this is sched_setaffinity. So we can do it; these are system calls. But you know what? It's much easier with taskset. With taskset, we can use its syntax and just do it. So guys, let's run again. And hey, I forgot to show you all something: getcap on our binary executable proves that we are running with the CAP_SYS_NICE capability, which means we don't need root. It will still work. So that's good to know. Now, this is how we ran it earlier. But now let's do this — this is one way, guys: run it with a CPU affinity mask of 0x1. So, okay, to make it a bit more interesting, shall we say 0x2? What is 0x2 in binary? It's 010, which really means it will run only on CPU 1, the second CPU core on my box.
And now let's see how it works. So here we go. Here's main. Aha — it's working. Here's main again, and it finished. So folks, now do you see? main ran in the two seconds available to it. But then thread 2 woke up and preempted main, because it ran as real-time — guys, remember the code: thread 2 ran as real-time with a priority of 30, so it preempted main. It kept running until another two seconds elapsed — because of the sleep — when the third thread woke up and printed 3s at a higher real-time priority: 40, that is, 30 + 10. It preempted thread 2, not giving it a chance. So it ran, and only when it died did thread 2 come back, and only when thread 2 died did the main thread come back and print 'm' for main — and then it died, and we are back at our shell. So folks, that's the demo. Okay, I hope this shows you, and gives you ideas for your own apps, how you can leverage the CPU scheduler and make individual threads run as real-time — of course, soft real-time. Guys, I'll need to hurry a bit. So we come back here: this is a screenshot of the same demo. But you know, one thing — let me go back — one thing that didn't happen in my demo run, because, to be honest, I missed doing something. Guys, very often you will see the main thread suddenly appearing. Do you see the m's over here? It's because of a scheduling tunable, and I will explain that now. procfs allows us to tune the kernel using sysctl. One of the many scheduling tunables is called sched_rt_period_us — that "us" suffix means microseconds (µs), right. So guys, we call this one the total period, and its partner, sched_rt_runtime_us, the runtime. The runtime value is the amount of time, with respect to the total period, that real-time threads will be allowed to run. So let's look up the default values — on your typical desktop Linux, and as far as I know any Linux, they'll be these numbers. So see, guys: the period is one million microseconds, and the runtime is 950,000.
In effect, what we are saying is: out of every period of so many microseconds, for this many microseconds only real-time threads will run on the CPU. In other words, 95% of CPU bandwidth is allotted to real-time threads. But you might correctly say: shouldn't it be 100%? Well, technically yes, because that's the meaning of real-time. Real-time means we don't care about fairness; we are ruthless. But you know, this is Linux — it's a GPOS. So the default behavior is to deliberately leak 5% of CPU time to non-real-time threads, allowing them to run. So guys, on my box, to be honest, I forgot: I had set this value earlier. Let's look it up, right? This is the period and this is the runtime. I had set the runtime to the same value as the period, which is why no leakage occurred. So guys, we can change that. Here's my next demo: I'll echo 900000 into the runtime tunable, to leak 10% of CPU time to non-real-time threads. But this will fail — it will fail because of permissions. So of course, guys, you have to do it as root, and note the syntax. Let's check. Yes — do you agree with me, guys? The period is 1,000,000, but the runtime is 900,000. So I'll run the same demo, but this time we will have some leakage — CPU-time leakage. The main thread will have the ability to peek in. Let's try it out; let's be empirical. Here's main. Here's thread 2 — but look, main came in, main came in. It leaked. See? Interesting, right? So folks, this is in your hands. You decide how much CPU you want to leak to non-real-time threads, because we don't want them to completely starve. This is all dependent on your project or your product, obviously, and you make the decision — it's part of the design. So, folks, these things are reiterated in this presentation. I wrote a wrapper script called runit.sh; just do runit.sh -h and it'll explain how you can do these things. It has the intelligence to do these demos. So I leave it to you to run these things and try it out.
Here's essentially a similar demo to what I showed you; here we are showing no leakage. Fair enough. So, guys, I have only a few minutes left; I hope you're still interested. A few more things to cover. The modern init framework on Linux is called systemd. With systemd, we can run services at boot, and whether it's an enterprise-class server or a small embedded Linux, this is the preferred, modern way to run stuff at boot. The reason I bring up systemd here is that a lot of tuning can be done with respect to CPU scheduling, without the need for any programming. So it's really powerful stuff, and you can take advantage of it as well. Guys, I'm not going into details here; I'm just giving you a very brief overview. The key man page to look up for the stuff I'm showing you is systemd.exec, in section five of the manual (file formats); check it out. You can give parameters like the CPU scheduling policy to run a service, meaning a process, under; you can choose one of these. I mean, look how easy it is: no programming involved, just write it in the so-called service unit. Of course, you have to learn how to use systemd; there are many tutorials out there, just Google it or ask ChatGPT. You can set the nice value; you can set the priority. So this makes it really easy, guys, and in fact, here's an example. You can set the CPU affinity mask; it's pretty amazing. We know about resource limits on Linux: every process has a set of default resource limits, on, you know, CPU time, memory, I/O, and so on. Even those are settable from within systemd for the services you are launching. So again, it makes it really easy for us, guys. You can even set CPU bandwidth as a percentage; imagine that. So folks, under the hood, systemd leverages a really powerful infrastructure of the modern Linux kernel: control groups, or cgroups.
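As a sketch of what this looks like in practice, here's a hypothetical service unit (the service name, description, and binary path are made up for illustration); the scheduling and limit directives are from systemd.exec(5), and the bandwidth one from systemd.resource-control(5):

```ini
# /etc/systemd/system/myrtapp.service  -- hypothetical example unit
[Unit]
Description=Example service with CPU scheduling tuning

[Service]
ExecStart=/usr/local/bin/myrtapp
# CPU scheduling policy and real-time priority, no programming needed
# (see systemd.exec(5)):
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=50
# Pin the service to CPUs 2 and 3 (the affinity mask):
CPUAffinity=2 3
# Resource limits are settable here too:
LimitMEMLOCK=infinity
# CPU bandwidth as a percentage, via cgroups
# (see systemd.resource-control(5)):
CPUQuota=80%

[Install]
WantedBy=multi-user.target
```

Note that running a service under a real-time policy like this requires the appropriate privileges, just as it does from code.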
And cgroups give us the ability to do bandwidth control on resources like CPU, memory, network, disk I/O, and so on. Again, cgroups are something else to study and appreciate; it's really powerful stuff. You can even set the capabilities of the process you're launching via systemd. So guys, a lot of things. Now, I hope I've taught you a lot of stuff, so now I'm going to give you exercises. I am a teacher, right? This is for the fun of it; it's in fact from my system programming book, from one of the chapters on scheduling, and it's all up to you. Have fun looking through this stuff and trying it out. Let's move along. So guys, the final section in this presentation; I hope you're still there. Right at the beginning, I said we have a GPOS, a general-purpose OS: Linux. We have soft real-time, which Linux easily qualifies for; we have firm real-time; and we have hard real-time. Now, Linux is not an RTOS, not hard real-time, but we can convert it. This isn't meant to delve into depth; it's an overview, but you can get started. In terms of the theory behind it, Linux conforms to a scheduling scheme we can call fixed-priority preemptive scheduling. Just look at the words. We know what priority means: we have a priority scale from 1 to 99 for real time. Fixed priority means it's up to the app designer or developer to fix the priority of individual threads in the process; the OS will honor it. The OS is not going to modify the priorities; you will, as the app designer or developer. And that's very powerful. Similarly with the nice values for the non-real-time threads. So that's a big deal, guys. It allows us to say: among the 10 threads in my application, three of them are SCHED_FIFO with priorities 50, 60, and 70; perhaps one of them is SCHED_RR with priority 65; and the remainder are non-real-time default, but I can fix their nice values. We have APIs even for that. Okay. So anyway, we have the word preemptive coming up here.
Guys, we know what preemption means; this is Wikipedia's definition, and we understand these things. What I want to say here is that there are two kinds of preemption: user-mode preemption and kernel-mode preemption. When we say preemption, often we're thinking of user mode. So guys, consider this. We're running on Linux, with one CPU to keep things simple. You write a C program whose one line of code is "while (1);". You compile it and run it in the background. We have just one CPU, we have a graphical desktop, and we run an analog clock application, so you can see the second hand of the clock ticking along. We can visualize this, right? It's easy to visualize. Tell me something: while the while(1) is running, will the second hand of the clock stop ticking, because the while(1) is eating up all the CPU? Quiz question. What do you think? So guys, after a little thought, and do think about it, you will realize, and I encourage you to try it, that the second hand keeps ticking. Why? Because, folks, this is the basis of a modern OS scheduler: it has the ability to preempt tasks which hog an unfair share of the CPU. In fact, the while(1) is terribly unfair, so it's going to be penalized, as a matter of fact. If you know about CFS: it will go to the right of the rbtree and won't run for a while. It will be deliberately penalized. It's a question of karma: be a naughty boy and you'll be penalized; sip on the CPU and you'll get it more often. It's exactly how we'd expect it to be. So that's wonderful. But now I ask you the next question: what if I write a kernel module with "while (1);" in its init code, and I have one CPU? Well, you've pretty much had it, because how can the kernel preempt itself? But guess what: from the 2.6 kernel, and of course 2.6 is modern Linux, two big features of the Linux kernel were advertised.
One of the features was the O(1) scheduler, which meant it was kind of real-time-ish. And the other feature was that you could configure the kernel to be preemptible. That's still the case today: we can configure the Linux kernel to be preemptible, and guys, in the latest 6.x kernels it's become even more tunable, with the PREEMPT_DYNAMIC feature having just come up. To begin understanding these things, I like this slide, which I've taken from an existing presentation; all credit to the people who built it. It's a really nice slide. Look at it: it shows how kernel preemption has evolved from old Linux kernels to modern ones. The red color means a non-preemptible section of the kernel. In our really old kernels, and even up to 2.4 Linux, the vast majority of the kernel was non-preemptible. But from 2.6 Linux, a large percentage of the kernel became preemptible. Why do I mention 2.4? Because the feature got backported: companies like Red Hat saw the advantage and immediately backported it to serve their customers. So this is good stuff. It's actually great for multimedia-rich systems, stuff like that. It's not really something we'd want on server-class machines; we don't really need a preemptible kernel for that purpose. It's more for the high-end, multimedia-rich desktops and laptops, and smartphones as well. And from 2.6.18, guess what: we have patches which, if we apply them and then configure and build the kernel, give us a real-time kernel, a hard real-time RTOS. With that in hand, it's almost 100% preemptible, real-time Linux. So guys, it's evolved like this, and it has led to the creation of hard real-time Linux. The project is called RTL, Real-Time Linux, and this effort goes back a long way. In its early years it was called PREEMPT_RT, preempt real-time. More on that in a moment. So guys, it's amazing: Thomas Gleixner was passionate about converting Linux into an RTOS.
In fact, I met Thomas very briefly, many years back, at a conference in Bangalore, India, where I live; of course, he would have forgotten. He gave us a wonderful talk all those years back. They have worked on this for a long time, and back in September 2006, with 2.6.18, the patches were merged, well, hang on a sec, the patches were made available to convert Linux into an RTOS. So why do I keep saying patches? It's not in mainline? Yes, it's not in mainline, and Linus and the other maintainers resist pushing it into mainline. Why? Isn't it a good thing? Well, it depends where you are in the ecosystem. See, folks, Linux was never designed as needing to be an RTOS. So he's right: we don't want highly invasive patches which convert it into an RTOS and change a lot of stuff. So it is an external, out-of-tree patch set that you have to download and apply; then you can configure and build the kernel, deploy it, and you are running an RTOS. But it's not in mainline. But guess what: in recent years it's been getting closer and closer, and the hope is that it will soon be in mainline, which has always been a goal of the project. The original effort was called PREEMPT_RT. A very positive thing happened in 2015: the Linux Foundation adopted it as a project, and it's now called the RTL Collaborative Project. Okay. Guys, don't confuse RTL with earlier approaches; there was an earlier, different one called RTLinux. This is the modern one. So, these are things that have changed within the kernel because of RTL. The reality is that RTL has produced so many goodies that they have been backported into the mainline, vanilla kernel. We owe thanks to Thomas Gleixner and team for modern features like high-resolution timers, threaded interrupts, priority inheritance, and so on. It's all been backported, and it gives us a better kernel.
So RTL becoming part of mainline is something that I think will happen. Anyway, say you want to run Linux as an RTOS; maybe you have a drone project up your sleeve. Wonderful. You can read up about it here: there's a good wiki site, and here's a quick screenshot. They have a blog, and it's very nice. There is an older wiki site, much of which I now find has been deprecated, but you'll still find some very good articles there, so do check it out. I just told you this: it's not yet a part of the mainline Linux kernel, but one day it will be. So guys, how do we convert Linux into an RTOS? You need to get a version of mainline and the corresponding version of the RTL patches, and apply them. Now, not every mainline, vanilla release is supported by RTL, because there are just too many of them. So see, folks, this is where we go: on kernel.org, under pub/linux/kernel/projects/rt. This is where the patches are. Now, look at it: as of now, it's being maintained from 2.6.22, and that's literally a decade back, or close to it. And as of today, as I'm giving this presentation, the latest LTS kernel is 6.1. So let's look at it for 6.1. You see, there are lots of 6.1 kernels, but they have picked up the 6.1.19 kernel and modified it to be real time. All you need to download is this single patch, which is of course compressed. You download it, you uncompress it, you apply it. Apply it on what? On 6.1.19, and no other kernel; it'll fail to apply otherwise. Apply it on that, then do the make menuconfig. Okay, let me show you; I'm sure I have the screenshots. I've already told you this stuff: you can go here, you can download this, you can unzip it, you can apply it. The usual stuff, guys: patch -p1 < ../..., the usual syntax. So why isn't it a part of mainline? I was talking about these things: it's quite invasive, but we're hoping it will get into mainline at some point.
Folks, what if I do have an RTL kernel; can I figure that out in code? Yes, of course you can. There are multiple ways: one is #ifdef CONFIG_PREEMPT_RT; another is IS_ENABLED(CONFIG_PREEMPT_RT). That's the config directive. In the kernel programming book that I wrote, you'll find the full example of downloading and applying the patch, and reconfiguring and rebuilding the kernel for a real machine, the popular Raspberry Pi. And I even do some benchmarking, so I leave it to you to check out. Okay, guys, I'm sorry: I haven't shown the make menuconfig screenshot in this presentation; looks like I missed that. But if you do it, you go under, if I'm not mistaken, Kernel Features, and there you will find a menu called Preemption. Instead of three options, you'll find a fourth one, and that fourth option is for the fully preemptible kernel, in other words, RTL. You turn it on, you save and exit, you build the kernel, you deploy it on your target, and you are now running an RTOS. Fantastic, amazing. But imagine we do all of this: does it help? Let's just look at some benchmarks. Now, it is an old benchmark, but nevertheless, look at this, guys. Benno ran this benchmark, audio sampling under load, years ago on a 2.6 vanilla kernel. See, folks, this is the latency on the Y axis; this line is five milliseconds. Most of the samples are much under a millisecond, and the red line is where the human ear detects dropouts. It's well below that; even the outliers are well below the red line. Which means what? Which means that even vanilla Linux is definitely soft-real-time capable. I mean, this is great. There is jitter; look at the variance, right? There's a lot of jitter, especially here, but it's not too bad considering this isn't an RTOS; it's a GPOS. So I mean, we can certainly live with that. Okay.
Now see, he runs the same benchmark, but with a 2.6 preemptible kernel, meaning CONFIG_PREEMPT: applying the patch, configuring the kernel, building and deploying it, and then running the benchmark. Check this out. Wow. Guys, do you see, the red line is here and we are nowhere near it. And just look: there's virtually no jitter, or it's very tiny. That's pretty awesome. I hope this slide is okay. Yeah. So this is great; it really does make a difference. So I did another thing, guys. There's another utility called cyclictest; if I'm not mistaken, Thomas Gleixner wrote it, and I demoed it in my book. I run the Raspberry Pi under load with three different kernels, run the cyclictest benchmark in parallel, and plot the results. So here is a screenshot of the results from my kernel programming book. Here it's running under the real-time kernel, which happened to be 5.4; here, under a standard vanilla 5.4; and over here, with the Ubuntu 5.4. And guys, this is the latency experienced via the cyclictest benchmark: minimum, average, maximum. This is very revealing. Look at it: the max latency for the real-time kernel is very low compared to the other kernels, and that is the benefit. Okay. Now, this perhaps makes you say: amazing, I'm only ever going to use this. But guys, be careful; it's not only about latency. Though the max latency here is really minimal compared to these, look at the minimum and the average: the standard kernels actually do a lot better there than the real-time kernel does. See, over here the average is 26 microseconds; here it's 16; here it's 3.8. So what we infer from this is that not all use cases are good for real time; in fact, the reality is that very few are, guys. Fine, you're flying a drone: you probably need a real-time kernel. But if you're not really doing real time, if you don't really need real time, don't use it. In fact, throughput tends to decrease. It's kind of non-intuitive, but that's how it is.
Do a lot of testing. So folks, one more thing to say. It's not enough to merely be running an RTOS kernel like RTL: if your intention is to build a real-time system, even the applications you run, the processes and the threads, have to conform to real-time guidelines. These are some of the real-time guidelines that I'm aware of; there are probably many more. Keep all of these in mind, okay? Otherwise they cause latencies, and that could make the system unacceptable. In fact, a related topic is threaded interrupts, and you should look that up. Guys, these are articles on the older wiki site, so I don't know if all these links are still okay; as I say here, your mileage may vary. Try it out. And is anyone using RTL? Yes, of course. This is OSADL, the Open Source Automation Development Lab; they use it for network testing, and this is a big network rack that's running RTL. Drones are already using it, and you do find RTL in many products; I have some customers who have begun to use it. So folks, we're almost done. Why is this smiley over here? I'll show you, hang on a second. There's always more to learn, and I mean, we're all learning all the time, guys. So I'm just encouraging you to learn more about these things. There is so much to learn; there are so many tools on Linux. Nothing is hidden on Linux; it's just a matter of the effort of looking into things. So folks, there's a lot to learn. How do you learn all this? Read my books. That's why the smiley. Okay, guys. Thank you so much. I regret that I could not personally be at the conference and take questions; I would have loved to do that. Another time; I'm hoping that with the next LF conference I'll be there in person. But if you'd like, please email me, I'm giving you my email address; raise issues on the GitHub site; talk to me. I'd appreciate it. Thank you so much for your time. I hope you enjoyed this presentation, guys. Take care. All the best.