Hello, I'm Daniel. I'm a principal software engineer at Red Hat and a post-doctoral researcher at the Scuola Superiore Sant'Anna. In both cases I work with real-time systems, but at Red Hat from a more practical perspective, and at the university from a more theoretical perspective. Today we'll talk a little bit about the junction of these two things in the real-time Linux context, discussing the state of the art and what is next — where should we go in the future?

It's worth saying that this talk doesn't aim to be very complex or to deeply explain all these topics. Instead, it should be an easy-going talk that creates the context for these things, serving mainly as a starting point, or an index, of the things that are going on and the motivations that we have.

We'll start with a brief introduction to real-time systems theory and to what real-time Linux is, and then we'll discuss the current users and the next users. From the conclusions we draw about the needs of those users, we'll discuss the recent findings in applying the theory to the practice in the analysis of preemption on PREEMPT_RT, and the benefits of this analysis. Then we'll discuss what is next: where else we can apply this more sophisticated reasoning, testing, and analysis of the Linux kernel. We'll finish with some remarks.

Before starting, let's recap a little: what are real-time systems? Real-time systems are computing systems whose correct behavior does not depend only on the functional behavior, but also on the timing behavior.
In other words, the response to a request is only correct if the logical result is correct and produced within a given deadline; generally, the main metric here is the response time of each task.

The findings of real-time systems theory are generally communicated through scientific papers. These papers start with a clear definition of the system, capturing all its behaviors — generally called the task model. Based on this task model, an algorithm is proposed, and the algorithm is then analyzed to demonstrate that the worst cases are known and somehow bounded. This analysis is generally done with mathematical reasoning, like a theorem, and from it a set of equations or models is proposed to conclude whether or not the system can accomplish all its tasks within their deadlines — in simple words, showing that the response times of all tasks are shorter than their deadlines.

The result of this analysis is generally a formula. Some have a somewhat simple format, like the schedulability analysis for single-core EDF, but others, in more complex scenarios, require more complex reasoning and may demand high mathematical skill — not only in the final result, but in all the steps that lead to it. The mathematical reasoning required for such analysis is very complex, and often assumptions are made on the task model to facilitate it; otherwise, it would be too complex to review and to analyze in depth with the kind of rigor that mathematical analysis requires. That's how things move on that side. On the other hand, with Linux the approach is way more practical.
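As a concrete illustration of the "simple format" mentioned above, here is a sketch of the classic single-core EDF schedulability test (my own illustration, not taken from any particular paper): a set of independent periodic tasks with implicit deadlines is schedulable under preemptive EDF if and only if the total utilization does not exceed 1.

```python
def edf_schedulable(tasks):
    """Single-core preemptive EDF utilization test.

    tasks: list of (wcet, period) pairs for independent periodic
    tasks with implicit deadlines. Schedulable iff U <= 1.
    """
    utilization = sum(wcet / period for wcet, period in tasks)
    return utilization <= 1.0

# Hypothetical task set, (WCET, period) in milliseconds:
tasks = [(1, 4), (2, 6), (3, 12)]
print(edf_schedulable(tasks))  # True: U = 0.25 + 0.33... + 0.25 <= 1
```

The more complex analyses the talk alludes to (multi-core, blocking, overheads) do not reduce to a one-line utilization check like this; that is exactly the gap being discussed.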
But real-time Linux itself is not a single thing. It's a set of features that, together, try to provide deterministic timing behavior for Linux. The approach used is more practical, and it usually starts with a metric that needs to be improved — for example, the wake-up latency, which is reduced with PREEMPT_RT. In the case of PREEMPT_RT, some background was considered: the theory of fully preemptive systems. A difference between the practical and the theoretical analysis is that on Linux the analysis is oriented more towards testing tools. For example, for PREEMPT_RT the cyclictest tool was developed, and it is used to show that the system can schedule the highest-priority thread within a given time, generally within a few microseconds. Then this approach repeats again and again for the other metrics.

As I mentioned, real-time Linux is not composed of a single tool, but is instead a set of features. Among them, the most important nowadays are: PREEMPT_RT, which implements the fully preemptive mode, in which the system approximates the fully preemptive system of the theory and gives us low latency in the activation of the highest-priority thread; SCHED_DEADLINE, which implements an advanced global scheduler that tries to provide deterministic response times for tasks; and the priority inheritance protocol on the mutexes, which tries to avoid unbounded priority inversion — among others, but these are the main features nowadays.

Sometimes these features are easy to explain with words, but they require years and years of development, because Linux kernel development is complex, and sometimes assumptions are made to facilitate the development.
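The cyclictest idea mentioned above can be sketched in a few lines. This is only an illustration of the measurement principle, not the real tool: cyclictest is written in C and uses clock_nanosleep() with absolute timeouts on SCHED_FIFO threads, while this sketch just measures how much later than requested an ordinary sleep returns.

```python
import time

def measure_wakeup_latency(interval_s=0.001, loops=100):
    """Cyclictest-style loop (simplified sketch): sleep for a fixed
    interval and record how much later than requested we woke up.
    Without an RT scheduling class and an RT kernel, the numbers are
    only indicative."""
    worst = 0.0
    for _ in range(loops):
        t0 = time.monotonic()
        time.sleep(interval_s)
        # Overshoot past the requested wake-up time, in seconds.
        latency = (time.monotonic() - t0) - interval_s
        worst = max(worst, latency)
    return worst

print(f"worst observed wake-up latency: {measure_wakeup_latency() * 1e6:.0f} us")
```

Note that, as discussed later in the talk, a tool like this only reports the worst case it happened to observe; it cannot by itself prove a worst-case bound.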
For example, the preemptive model just approximates what the fully preemptive model is in the theory. On SCHED_DEADLINE, assumptions are made — such as that some overheads are tolerable, or that all tasks can run on all processors — so it is not fully compliant with the context of these techniques in the theory. But otherwise it would be too complex to develop and too complex to integrate, and we might end up having nothing. So these things are just approximations, but they are good for many use cases.

As I was explaining, there is this gap between the theory and the practice: the theory tries to provide answers for the problems that we have in practice, but without considering all our restrictions; and we try to implement some algorithms from the theory, but without paying attention to all the assumptions that are made.

But at the end, who cares about all this complexity? Nowadays we have workloads that require some level of determinism. For example, high-frequency trading requires fast response to external events, and for these people even nanoseconds count.
We have many embedded electronic devices that need to react promptly to external events, mainly in telecommunications: with 5G now providing low-latency communication, we need an operating system that provides this low latency as well, to not become the bottleneck. We can also run this together with KVM, providing fast response times to external events on the guest virtual machine, with very nice results. But generally these use cases require just a few real-time tasks — generally one task per CPU.

Still, they are real-time cases, and they are pushing us towards more complex use cases. These more complex use cases require not only a fast response for the highest-priority thread, but a deterministic response for a set of tasks — a pipeline of tasks, way more than one per CPU — many of them requiring synchronization among tasks across many CPUs, increasing the complexity of the timing guarantees that we need to provide.

Moreover, the usage of real-time Linux on safety-critical systems requires not only that the system is able to do it, but also that we are able to explain why the system is able to accomplish it, with tests and analysis that go far beyond regular testing. They need strong evidence that the worst-case scenarios are found and can be handled in the case of a fault; and for some more extreme cases, the usage of formal methods is recommended to show that we really are able to address these worst cases accordingly.

And then we return to this gap between theory and practice: these new use cases require a closer integration between the explanations that the theory can provide us and the results that we have in practice — and how one thing can match the other.
Luckily, we have made some progress in recent years that brings the theory and the practice closer. We have some good results, and we will discuss them next.

In this part of the talk we will discuss the findings of this better integration between theory and practice, in the case of the preemption analysis. In the fully preemptive kernel with PREEMPT_RT, the threads become as preemptible as possible. That's a side effect of moving all the non-schedulable context to the thread context, making the threads preemptible always — unless preemption is explicitly disabled with the preempt_disable() and preempt_enable() functions, or the functions that end up using them.

As a side effect of this change, when a new highest-priority thread arrives, it is promptly handled, with a very short delay. This delay is demonstrated using the cyclictest tool, which creates the arrival of a new highest-priority thread and takes these measurements. The results are so good that this delay is in the order of 100 microseconds, even in very pessimistic cases. This allowed Linux to be used on systems with such requirements. It allows the scheduler to take actions within one millisecond, so it improves the granularity of the scheduler's decisions, and it brings Linux closer to the theoretical fully preemptive system on which many techniques are based — for example, SCHED_DEADLINE assumes that the system is fully preemptive.

So why is this not good enough?
It's not good enough because there is no clear description of the factors that cause latency, so it's hard to provide evidence that the worst-case scenario was found, and so it's harder to convince more skeptical people, like the theoretical community.

Trying to learn from the theory, things start with the creation of a precise definition of the system, the development of the algorithm, and the definition of the worst case. Actually, for the PREEMPT_RT case things should be simplified, because the algorithm was already created, and we have very good evidence that it actually works, because cyclictest gives us nice results. So things should be easier, right? Why can't we try to explain this in the same terms that the theory uses?

And we did it. Following that approach, we started by creating a very precise definition of the synchronization of threads on PREEMPT_RT, considering the events that are important in the activation of the highest-priority thread. This resulted in a thread synchronization model for PREEMPT_RT, which was published in academic papers.
You can find more information in the paper referenced in the slide. Then we clearly defined the property that we would like to analyze, and the property was the scheduling latency. The scheduling latency experienced by an arbitrary thread is the longest time elapsed between the instant A, in which any job of this thread becomes ready while being the highest-priority one, and the instant F, in which the scheduler returns, allowing this thread to start executing its own code.

With this in mind, we took that formal model and translated its properties and specifications into a set of arguments that are commonly used in real-time theory. From these arguments, we defined the worst blocking that a lower-priority thread can impose on this highest-priority thread, delaying its activation, and we also defined how much interference interrupts can add in this time window. And voilà, we ended up finding a latency bound using these theoretical arguments — and it was good enough to be published in a paper. The idea here is not to explain all the details, but you can read the paper, and there is also additional material on the page linked in the slides.

With that, we brought the theory closer to the Linux kernel. But what else could we do? How can we bring the practice closer to what we see in theory? To do that, we jumped back to the approach we use on Linux, which is trying to create a tool that shows us these delays in practice.
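To give a flavor of how a bound of this kind is computed — this is an illustrative sketch in the general style of such analyses, with hypothetical numbers, and not the exact terms or equations of the paper — the latency can be bounded by the worst blocking (the longest preemption/IRQ-disabled section) plus the interference of the IRQs that can arrive in the resulting window, solved as a fixed point:

```python
import math

def latency_bound(blocking, irqs, horizon=10**6):
    """Fixed-point sketch of a scheduling-latency bound:
    latency <= blocking + IRQ interference over the latency window.

    blocking: worst preemption/IRQ-disabled section (e.g., in us).
    irqs: list of (cost, min_inter_arrival) per IRQ source, using a
    sporadic arrival curve: floor(window/T) + 1 arrivals in a window.
    Returns the converged bound, or None if no bound is found below
    the horizon (the recurrence diverges)."""
    L = blocking
    while L < horizon:
        interference = sum((math.floor(L / t) + 1) * c for c, t in irqs)
        new_L = blocking + interference
        if new_L == L:
            return L  # converged: a safe bound under these assumptions
        L = new_L
    return None

# Hypothetical numbers in microseconds: 50 us worst blocking, a timer
# IRQ costing 10 us at most every 1000 us, a NIC IRQ costing 15 us at
# most every 500 us.
print(latency_bound(50, [(10, 1000), (15, 500)]))  # prints 75
```

The "arrival curves of IRQs" mentioned later in the talk play exactly the role of the `irqs` description here: the more pessimistic the curve, the larger (but still bounded, if the iteration converges) the resulting latency.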
That connects the theory with the practice. Based on the latency bound, we developed a tool that traces the system — because the model was based on traces — to show the values of those variables. The tool is called rtsl; you can find information in the paper as well. With it, we parse the kernel events to find the values of those variables, and with those values we can state what the worst-case scheduling latency would be. So we have practical output from the tool.

With that, we brought together the theory and the practice, and it was very welcome, both on the theoretical side and on the development side, which was good. One fear that people frequently have when we join theory and practice in real time is that we could get very pessimistic values, but that was not the case. The theory was adequate to demonstrate that the system has a bound — a safe bound — for the scheduling delay, and it was not just a side effect of adding pessimism. If you look at the results shown in the paper referenced in the slides, even being pessimistic, the results are within a millisecond for many arrival curves of IRQs, and they only crossed the millisecond barrier if we used some ultra worst case. But even using this ultra worst case for the IRQ arrivals, they are still bounded. So the analysis converges, and that's good.

So what are the final remarks of this section of the presentation?
The absence of formalism didn't prevent Linux from having a sound preemption model: the PREEMPT_RT model was already deterministic. The thing that was missing was a precise definition of the problem, and a definition of the behavior of the system in the terms used by real-time systems theory. With that, we open the door to a new set of analyses that we can make on Linux. And even though some of these results showed that cyclictest was somewhat optimistic, the values provided by the proposal show that Linux is still a viable option in the current scenarios, and it really endorses the fact that people would like to go for it and use Linux on more complicated systems, like safety-critical systems.

At this point many people might be asking: is Linux now a theoretically proven real-time operating system? And the answer is obviously no — there is still a long way to go. The point is that the main metric for real-time systems is the response time, and even though the scheduling latency is a fundamental step towards it, it's not enough. The scheduling latency is just about when the task actually starts running, not about when the task actually delivers its final result. There are more points that affect the response time of a thread, and we still need to analyze them.

So the next question is: where should we start, from the practice or from the theory? And the answer is that this is not a chicken-or-egg problem; it's actually an evolution problem. Looking back at the preemption case, we can see that the fully preemptive mode was something that existed in real-time theory since, like, the 60s or 70s, when the theory was created and developed — way before the start of the development of Linux — and PREEMPT_RT tried to bring Linux closer to this theoretical system.
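Going back to the point above that scheduling latency is only one ingredient of the response time: here is a textbook-style sketch (an illustration, not an analysis of Linux itself) of the classic fixed-priority response-time recurrence with a scheduling-latency term added, showing how the latency bound would plug into a full response-time analysis.

```python
import math

def response_time(task, higher_prio, sched_latency, limit=10**6):
    """Fixed-priority response-time recurrence, sketched with a
    scheduling-latency term:
        R = sched_latency + C + sum(ceil(R / T_j) * C_j)
    task: (wcet, deadline); higher_prio: list of (wcet, period).
    Returns the response time if it converges within the deadline,
    or None otherwise."""
    C, D = task
    R = sched_latency + C
    while R <= limit:
        new_R = sched_latency + C + sum(
            math.ceil(R / T) * Cj for Cj, T in higher_prio)
        if new_R == R:
            return R if R <= D else None  # deadline met?
        R = new_R
    return None

# Hypothetical numbers in milliseconds: a task with C=2 and D=10, one
# higher-priority task with C=1 every 4 ms, scheduling latency bounded
# by 0.1 ms.
print(response_time((2, 10), [(1, 4)], 0.1))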
Until the point was it was barely impossible to make progress anymore So at that point Linux was not fully matching the theory And the next point was then to create a new theory that actually fits on Linux Which was closer to the primitive but not exactly that And we saw that it was possible So trying to think in a way that the theory evolves And the way that Linux evolves It seems that the way to go next is Try to find the next problem that we need to resolve on Linux To make it a better real time operating system And then try to create a theoretically sound algorithm To address this problem like based on some existing theory And try to create a precise definition of how Linux implements this theory Finding the worst case and trying to see if it matches Or if it doesn't match with the theory in which Linux is based on If it completes match It's nice, it's perfect, it's worked on We should jump to the next problem But often it will not be the case because of the problems that we have in the practice So when this is the case We need to try to create a new theory A new set of theorems A new set of argumentations that show that Linux is deterministic And with the new theory we have new methods And with new methods we have a new way to test and verify Linux Behavior and improve Linux in the testing side And also we have new metrics to evaluate Linux And to follow the progress of this new algorithm And try to find things that might break this theory and cause regression And this goes in the flow that Linux is going with the continuous testing And continuous integration of the operating system And then we can try to replicate this approach on the next and the next and the next problem But yeah Now we should jump to the question of what is next So what are the next problems that we need to try to address To try to provide the answer for the response time of Linux Right, to the response time of threads that run on Linux And we already have some ongoing work on this 
regarding synchronization, scheduling and locking That we will discuss now So the first problem in which we are somehow naturally applying such approach Is the migrated disable In the primed party we have the migrated disable synchronization primitive Which is used mainly to replace the primed disable In the case in which the primed disable was used to hold Thread into a given processor to force the synchronization So in many cases when the thread doesn't actually need to have the primed disable But instead just don't want it to be migrated The migrated disable is used on the primed RT Because by not using primed disable you can reduce the scheduling latency Which is the metric that the primed RT tries to maximize So that's why it's good But on the other hand it brings the side effect of Breaking the working conservateness property of the multi-core schedulers The working conserving property says that on a system with M processors The M highest priority threads will be assigned to the processors However There are cases in which the migrated disable breaks this assumption For example imagine a system with two CPUs And the CPU 0 and CPU 1 are busy running the two highest priority threads Then the thread in the CPU 0 goes there and disables the migration Then another thread arrives and it's scheduled in the CPU 0 Because it's now the highest priority thread on that CPU And then right after that the CPU 1 goes idle And in the current algorithm the thread that is primed on CPU 0 Would be pulled to the CPU 1 to keep the working conservateness But it cannot be pulled because of the migrated disable And that's a problem that creates idle time in the CPU 1 And uh well but that would also happen in the case of using primed disable No because if you if the primed disable was used It would the the thread on CPU 0 would not be primed at the at the beginning So this problem would not happen So we have somehow a dilemma here because in one hand we break the schedule latest In the 
other hand we break the response time and both are bad for real time right But uh we we are trying to find a way out from it And Peter came with an idea which is very promising So his idea is instead of trying to migrate the thread that was primed When the CPU 1 goes 0 it would try to pick the running thread If the the thread that it was about to pull Was the the one with a migrated disable Right and this should recover the the working conservateness of the process And uh that's good so Peter said that next steps he would follow would be To create a new algorithm to to to pull the threads on the RT And in the outline is scheduler to add the trace points So we can observe and measure the the case of migrated disable So we can trace in the same way that we do with primed disable And uh and finally he said that he was supposed to twist my arms To try to update the model to include these the scenarios and the numbers right The the it is there the the the new theory would be created to address this problem right And that's very nice because this wasn't a forcid occasion It just popped up on the LKML And that's a good sign of the evolution of the Linux kernel development community Towards having better real-time properties That's actually very cool So recalling here so the working conservateness is a required Proper for the current real-time scheduler With migrated disable we break this property There is a way out trying to minimize this gap But uh this uh this kind of problem was I never heard of anything in the theory right like this right So we need to try to find if there is something that approximates of it This looks very much the arbitrary priority affinity Which is a problem that is is now being handled in the real-time community And there are some cases of scheduler that tries to add the arbitrary affinity control But it is something that we need to try to address in the theory as well At least trying to find something that matches And uh that's nice because 
also we reach to this place This will start influencing the decisions of the design of the schedulers for example And uh yeah Linux is moving forward And it's moving forward in a nice way So what else what else can we we try to use this approach So the scattered line currently does not support tasks with different affinity Right because it's a global scheduler And sometimes we need to accept the tasks that are pinned to CPUs On the deadline scheduler for example some key worker cases that are actually there So at the end we have somehow a workaround to make them possible And that is good because we make Linux going forward But we are breaking some theoretical assumptions that we have on the schedule deadline The the same partition scheduler which is an idea that I explain on the On the slide pointed on the presentation pointing in this slide Might be a way to go that would reduce the these kind of problems But still as I said before arbitrary affinity is a still an open problem And a difficult problem in the theory as well And another thing that we need to try to address on the schedule deadline Is that it doesn't explicitly considers the overhead of the operating system Like it doesn't explicitly considers the the scheduling overhead Or the scheduling latest that we talked before And also the kind of tests that we do in the schedule deadline They are most based on the tasks running user space And try to say if they achieved or or not achieved the result before the deadline But there might be some hidden behaviors in the scheduler that we could catch With formal methods looking at the system trace And I know that there are some groups working on it with results that will be charged soon And that's something to keep an eye Another problem is that in the preemptor t we have the priority inversion protection Using the priority inheritance protocol The priority inheritance protocol is very nice for single core fix the priority scheduler Which is not the case of this 
cat deadline Which has each activation will have a different priority So the highest priority thread is always a different thread, right? You don't have a single highest priority And actually now we are trying to address this problem using the deadline inheritance But there are some no issues with deadline inheritance That are explained in the talk which is referred in this slide So one idea is use the proxy execution to overcome this problem But still it's a good mechanism But we need to develop more the analysis of the proxy execution mechanism In the Linux kernel with all the restrictions that we impose And that's also a very challenging problem And so on So we have problems to resolve And it seems that the lessons learned are being put in practice Actually in the kernel development by the kernel developers Because we are seeing the benefit of it And we are making progress So but looking backwards Linux had an enormous progress in the last decade With preemptor t with cat deadline we've tested in two And this is pushing us to have more and more use cases And these use cases are requiring more let's say sophisticated analysis So to conclude The real time Linux with the preemptor t is not only an integral part Of Linux now if the merging takes place But also in the culture of the scheduling development Right? And there are more challenges to come Mainly because now we can focus more in the response time As the main metric for real time Linux And this will make us to go forward Apply new techniques And trying to make them run in parallel with a theory and practice And the methods that can give us a better Insurance that Linux behaves correctly But in the logical and in the timing behavior And there's a lot of work to do in the next decade So that's it It was nice to make this presentation Even though I would prefer to do it live and have a beer after this But that's live how it is now, right? And I'm open for questions Thank you