So, this is about real time coming to Linux. Now, you might all be laughing about that, because we've been saying it since 2004. Maybe not 2004 — I think we were saying it around 2008. The whole joke was: "This is the year real time is going to be merged into Linux." It was like saying "This is the year Linux is going to rule the desktop." It was equivalent. This time we actually mean it. And how do I know we mean it this time? The fact that all our focus — our talks, our meetings — is no longer about what we need to do to get real time into Linux. All the meetings are now: "Oh crap, we're getting into Linux. The job is not done." Not that there isn't more work to do to get it in — it's that once it gets into Linux, a whole new set of problems comes up. One thing is it now needs to be maintained. Kernel developers can't break it, and if they do, it's our responsibility to teach them not to break it again. So before I go any further, I have to get my, you know, selfie. Oh shoot, the battery died. No. Oh crap. I could use my phone, but that's cheating. So no selfie today — the battery just died. That's a first. Okay. So who is this talk for? Well, it's mainly for Linux kernel developers. How many Linux kernel developers are here? Okay, how many people are not kernel developers? About 50/50 — I'm actually pretty impressed, pretty good. So if you do core kernel, driver code, file systems — pretty much anything that touches the Linux kernel — this talk is for you. It's also for those who are just curious about what this weird, weird thing we call PREEMPT_RT is, and what makes it different. And it's also for those who want to see how fast I can talk. So what is real time? It's kind of like asking: what is your favorite color?
The term is ambiguous. I always hate that term, so I looked it up this morning — I actually started writing these slides at 5 a.m., seriously. I figured, we're going to talk about real time, so let's see what the internet says about it. I went to one of the most common sources of information, Urban Dictionary, and it says: "real time: instantaneous; taking place at once as all other things are also in progress. 'When I surveyed the situation in real time, there were only four people who met the qualifications.'" Okay, I was actually pretty impressed that Urban Dictionary had something decent. Then I went to WhatIs.com, and it says: "Real time is a level of computer responsiveness that a user senses as sufficiently immediate, or that enables the computer to keep up with some external process." It continues: "Real time is an adjective pertaining to computers or processes that operate in real time." I love definitions where the thing you're trying to look up is included in the definition. And they actually had more: "Real time describes a human rather than a machine sense of time." Actually, I think that's the most accurate. But that's not what I care about. I hate the term real time because it has all these meanings — you get real-time tracking when you check UPS or Amazon for your package; you hear "real time" everywhere. The question is: what does it mean to us, the PREEMPT_RT — aka the real-time patch — developers? By the way, that's another hint that it's getting into Linux: we no longer call it the real-time patch, because it's not going to be a patch anymore; we call it PREEMPT_RT. To us it means determinism. It has nothing to do with speed; it has to do with knowing what's going to happen. We care about latency. We always calculate: what's the worst that could happen?
Because, as I always say, it's knowing what will happen and when it will happen. That's what real time really is to us. I always say this should not be called a real-time operating system — it should not be an RTOS, it should be a deterministic operating system. So what's our strategy for making Linux into a real-time operating system? Well, the first thing is we have to make it as preemptible as possible. Any time a high-priority task wants to do something, we let it do it as quickly as possible. Let the user determine what's important, and as soon as their designated important task wants to do something, it should be able to do it, with as much power as possible, without being interrupted in any way. So we try to add preemption points everywhere we can, and make scheduling work everywhere: let the most important, highest-priority task run when it wants to run. That's what we try to do. So when real time comes into the Linux kernel and you do `make menuconfig` and go to "Processor type and features", there's a window that pops up for the preemption model. The first three options are what's there today: "No Forced Preemption (Server)", "Voluntary Kernel Preemption (Desktop)", and "Preemptible Kernel (Low-Latency Desktop)". Two new things are added, and the last one is fully preemptible PREEMPT_RT — that's what we get. "No Forced Preemption" is the old days. Does anyone ever run with CONFIG_PREEMPT_NONE? Anyone do that? A couple of people — I almost want to ask why. Voluntary preemption is probably the most common default — Red Hat, I believe SUSE, Debian. Voluntary preemption is basically: any time you hit something like a might_sleep() within the kernel, it becomes a preemption point.
might_sleep() was a debugging option put in to trigger if you're in a critical section and you call a function that has a path that might sleep. A lot of times you could run your code and everything's fine, but there might be a function with one path in it that actually goes to sleep — and you'll sleep while holding a spin lock and your system will crash, and we're like, oh no. These bugs kept bubbling up, so we put these might_sleep() calls throughout the kernel, so that if you hit a function with some strange conditional path to a sleep, you'll know right away: oh, we can't call this function from a critical section — let's call something else and fix up our code. That solved a lot. Well, what they realized was that all these might_sleep() calls scattered around — because those functions can sleep — also tell us that we're at a point where it's safe to sleep. So voluntary preemption took up on that and said: hey, when we hit this might_sleep(), if something needs to be scheduled, let's schedule right there. That's what voluntary preemption is — it took a debugging option that saved us from calling these might-sleep functions in bad contexts, and used it to say: we know this function can sleep, it's okay to schedule out here. Let's use it. Great. It actually helped a lot. I would love to get rid of server/none because it's pointless; I think the might_sleep points are perfect. Then the preemptible kernel is what we have today as low-latency desktop. That's where spin locks become the preemption boundaries: when you grab a spin lock it disables preemption, and you can be preempted almost any time you're not holding a spin lock or don't specifically say "I don't want to be preempted."
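As a rough kernel-style sketch of the pattern being described — the loop and helper names here are hypothetical, and this fragment is not runnable outside the kernel — the same might_sleep() annotation that catches atomic-context bugs doubles as a voluntary preemption point:

```c
/* Hypothetical driver loop, kernel-style sketch (illustrative names). */
static int copy_chunks(struct device_buf *buf)
{
        int i;

        for (i = 0; i < buf->nr_chunks; i++) {
                /*
                 * Debug check: warns if called from atomic context.
                 * Under CONFIG_PREEMPT_VOLUNTARY this is also a
                 * scheduling point, so a waiting high-priority task
                 * can run between chunks.
                 */
                might_sleep();
                copy_one_chunk(buf, i);  /* may sleep */
        }
        return 0;
}
```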
So that's throughout the kernel, and it's already very reactive — it's basically a soft real-time system. But we have two new options. The first one, "Preemptible Kernel (Basic RT)", I'm going to mostly ignore — that's more for our testing purposes. The next one is full PREEMPT_RT, and this is why we call it PREEMPT_RT: because that's the config option. So if you're wondering why we always say "the PREEMPT_RT patch" or just "PREEMPT_RT" now, and not "the RT patch" — this is why. It's going to be a config option within the mainline Linux kernel. So: interrupts as threads. One of the things that helps make a preemptible kernel is having interrupts able to run as threads, because with a long-running interrupt, nothing else can happen while it's going on. We have to be able to preempt it, and the only way to make it preemptible is for it to have its own scheduling context and be able to context-switch. So: request_irq() versus request_threaded_irq() — request_threaded_irq() has been there since 2009. How many people have used request_threaded_irq()? Hey, good — thank you, I appreciate it, awesome. Because even on a mainline kernel, your interrupt is then not keeping important processes from running. That's nice of you. Also, there's this thing called forced threaded IRQs, which was added as a debug option to debug certain cases — but really it was because we wanted to get our real-time patch into the kernel, so a lot of times we would dress things up as a Trojan horse and say: hey guys, this helps you so much.
This is a rationale we always had: all the real-time stuff — all the patches that came from the real-time patch — were basically hidden as gifts to developers. Lockdep came from the real-time patch. The mutex came from the real-time patch. Generic IRQs — making the whole IRQ layer into core, architecture-independent code — came from the RT patch. This was because we couldn't do RT without it, but people liked it: hey, this makes things cleaner. One thing about the real-time patch is that it requires clean code, so our gifts are code cleanups, given to you. But this one, forced threaded IRQs, was sold as: hey, you could debug your system by making all your interrupts threads — that way, if something crashes, you can debug it more easily. That was our rationale; that's what we said. With it, almost everything becomes a thread — everything except interrupts specifically marked "no thread", and those are things like timer interrupts and IPIs. Regular drivers don't get to say "my interrupt is so important I never want it to be a thread" — because people will hate you for that, just to let you know.
Normally, with no threaded handler, you have a high-priority task — and this is why we don't like interrupt handlers that are not threads — the interrupt preempts the task, the handler has to finish, and boom. When you do request_irq(), you give it a single function, the handler, which runs when your interrupt vector triggers. When you do request_threaded_irq(), you give it a handler and a thread function. The way that works is: when your interrupt triggers, the handler executes first. This way you can quiesce your device — say you have a shared interrupt line, or your handler takes a long time on a line shared with other devices — and you may want to run the bulk of the work as a thread. So you acknowledge your device in the handler, return, and do the real work in your thread function. Amazingly enough, I searched the kernel and very, very few people use this split — which is actually a good thing; I'll explain why in a second. With forced threaded interrupts — which is basically what happens when you enable PREEMPT_RT: forced IRQ threading is turned on — the interrupt triggers; you can't stop interrupts themselves, that's a mechanism of the hardware. All the interrupt does now is say: okay, we're going to disable that interrupt line, and then wake up the interrupt thread to handle everything else. Basically all we do is turn off the line, so no more interrupts come in, and we wake up a thread — that's all. I would have made the interrupt box on the slide smaller, but then the word "interrupt" couldn't fit in it — this is not to scale, it should be a sliver. It goes off — boom — and right back. It's really, really fast.
A microsecond at most. Then your high-priority task runs, and when it's done, the interrupt thread gets scheduled. Now here's the interesting thing: if you use both — if you have a top half and a bottom half, a handler and a thread function — I'm sorry, but you're going to take two scheduling switches, because your top half is still going to run as a thread and your bottom half is also going to run as a thread. We were working on ways to make that run as a single thread, to get rid of the extra schedule switch — we're looking at ways of fixing that — but that's how it is for now. So: we enable PREEMPT_RT. Here's the big thing, the big hammer. This is what's getting into the kernel, and this is exactly what we mean by "real time is coming into Linux." It's the option we've all been dreaming of: turning spin locks into mutexes. They're not really spin locks anymore, are they? They don't disable preemption, and they don't disable interrupts. Even if you tell it to disable interrupts — if you do spin_lock_irq() or spin_lock_irqsave() — nope, it's the same as spin_lock(), same function. Boom. So why can we do this? Well, the reason spin locks disable interrupts — the reason you use spin_lock_irqsave() or spin_lock_irq() — is when you have a spin lock that's shared between thread context and interrupt context. If you were to take the lock without disabling interrupts, and your process gets interrupted, and the interrupt handler runs and grabs the same lock — boom, deadlock. Don't want that. But on RT we don't need to disable interrupts, because everything is a mutex, everything can schedule. So you take your lock, the interrupt comes in — I should have made the handler box smaller again; the handler is less than the interrupt, I could have made it shorter, I was lazy; remember, I started this at 5 a.m. — and the handler's only job is to wake up the interrupt thread.
Say this time there's no higher-priority process running, so the interrupt thread runs right away, and it takes the lock. But the thread that it preempted — preempted now, not interrupted — still holds the lock. So the interrupt thread has to block: it schedules out. Interrupt handlers can now schedule out. Your task runs, and once it releases the lock, the interrupt thread is scheduled back in. So: priority inheritance. Right now we have priority inheritance within the kernel, but only for futexes — fast userspace mutexes. If you use pthread mutexes within your application and you set the PRIO_INHERIT attribute, you actually get priority inheritance. PREEMPT_RT does that inside the kernel for all these mutexes — all the sleeping spin locks and the mutexes. Now, my talk can't be complete without these slides. I made this slide for the first talk I ever gave about real time, and it's been in every single talk of mine since — it's like me taking that selfie where the battery died. The best example of where this can go wrong came in 1997: the Pathfinder flying to Mars. It started resetting — it would suddenly reboot. They didn't know what was going on. It got to Mars, and every so often it would reboot; they'd lose communication for some time, then it would come back, and this scared them. So they took all the code and were able to simulate, down in their labs, the exact same thing that was going on up in space, and they found out what was happening. They had this very low-priority, rare process that seldom ever ran — let's call it process C — and this process would just collect meteorological — I can't pronounce the word — meteor data.
It would wake up, collect some things, put them out on the bus, and go back to sleep. In the meantime the bus manager — the highest-priority process on the system — would be accessing that bus, and it had to grab a lock before it would touch the bus. And for some reason priority inheritance was turned off on that lock — I think they did it for performance reasons — and they forgot about this lowly little task that was sharing the lock. What would happen is: the low-priority process would wake up, then get preempted; the bus manager would run, try to grab the lock, and have to go back to sleep; the little logging task continued; and then an intermediate-priority process — one that runs for a very long time — started up and preempted it. Now the low-priority task was stopped while holding the lock, so all bus management stopped, the watchdog timer went off, and it rebooted. Boom — things were locked up, so the system rebooted itself. The fix was simple: they were able to upload a little command — a switch that turned priority inheritance on for that lock — and the resets stopped. What happens then? C runs, the bus manager A blocks on the lock, and C inherits A's priority. So when the intermediate task B wants to run, it can't — it has to wait. C releases the lock and loses its boosted priority, A runs and then sleeps, and B can now run normally. Reader-writer locks have been our nemesis for some time. We've done things — I actually wrote code to do multiple-owner priority inheritance. Now, people don't like priority inheritance because it's kind of complex, and I actually made it more complex. I don't know if you know me, but I kind of like code that's complex — I love complex algorithms and complex problems that require complex solutions.
I revolve around it. So I made this multiple-owner thing, and Thomas looked at it and said: no, get rid of it. So what do we do about reader-writer locks? Everyone loves reader-writer locks — you'd think they're really good for multiple CPUs, because you can have multiple readers grabbing something at once. The whole point is that you usually do a lot more reads than you ever do writes; if you seldom write and mostly read, you want all those reads in parallel — that's great. So each reader that goes to look at the data grabs the lock — fine; multiple readers, multiple CPUs, all fine. Then a writer comes along, and it has to block and wait for all the readers to finish — depending on whether it's a fair lock or not, that can have its own issues — and finally, when all the readers are done, it gets the lock, does its updates, finishes, and all the readers come back in again. It sucks for multiple CPUs. Don't use reader-writer locks where you don't have to — use RCU and get rid of the reader-writer locks. For those of you who are hardware people — this is ELC, I'm sure there are a lot of you here — cacheline bouncing will kill you. I could show lots of graphs of performance when you're sharing the same cacheline; that's the bottleneck we're hitting almost everywhere, and reader-writer locks are a bunch of readers hitting the same cacheline. We've actually made spin locks special so the spinning happens on local memory, outside the shared cacheline, so we wouldn't have the cacheline bouncing that was killing everything. A lot of work has been done to keep cachelines local, because that's what speeds things up. So again: reader-writer locks are not much faster — with more CPUs the benefit flattens out. Avoid that; with more CPUs, use RCU.
It's still exponential — well, not really, but pretend it is. So we compromised. We just said: reader-writer locks become mutexes too. Even if you take an rwlock, on RT it turns into a mutex. That's fine, but readers won't get any priority inheritance. So avoid rwlocks for another reason: you can get priority inversion with reader-writer locks on the RT patch. Avoid those paths — we're trying to get rid of them. When we analyze real-time systems, we look at the device drivers and try to avoid all reader-writer lock paths, or at least make sure no writer is involved in any of the real-time threads. Writers do still get inheritance — on the writer side it's basically a normal mutex. Next: the trylock issue. The trylock issue. Okay, how many people have ever done something like this? A few hands — and some of you are afraid to raise your hand; you should be. Here's what usually happens. You have a locking order: say you always grab lock B first, then lock A. For anyone who's not a computer-science person and is just here to see me talk — which means I'm talking very slowly right now — if you always grab B and A in that order, you're fine; you're not going to deadlock. But if you ever have any place where you grab A and then B, you can deadlock, because with a mixture of B-then-A and A-then-B, we're dead. So we always grab locks in one order. One way around this — because sometimes you're working on data that requires the protection of A, and then you want to do something that requires the protection of B, and you don't want to drop your lock and redo everything — is to do a trylock on B. If we get it, great. Otherwise, we drop A and go back and try again — because whoever holds B may be blocked on A, and once we release A they can run and eventually release B. So when we grab A again, we can try B again. This is great.
Okay, problem solved — except it doesn't work for mutexes. It's great for spinning locks, but not for mutexes. With a real spinning lock you have CPU 0 and CPU 1. CPU 1 takes lock A; CPU 0 has lock B. Now CPU 1 trylocks B and goes into this little loop: it spins while CPU 0 has lock B — release A, grab A again, trylock B, fail; wash, rinse, repeat, boop boop boop, nice pretty colors on the graph — and finally CPU 0 releases its lock, and hey, we're done, we got B, we're all happy. Now what happens when you do that with sleeping spin locks — with mutexes — on a single CPU, where, say, the green task is much higher priority than the yellow one? The yellow task grabs lock B; the green task takes lock A and tries to get B. It goes through that whole dance — release A, grab A, trylock B, over and over — and guess what: you preempted the owner of B. You'll never get that lock, so you'll just spin forever. So, one solution — and it's a hack; the best solution is don't do the trick at all — but if you haven't figured out what else to do yet, here's a hack that will work: after you release A, take a blocking lock on B, then release B, then grab A again and retry. How does that work? You try to get B; you don't get it. Okay: you release A and then take B with a blocking lock. You're almost guaranteed to block now, because you preempted the owner — B blocks. Not only that: priority inheritance kicks in. Boom — you just boosted that owner to your priority. The lower-priority holder gets your inheritance and runs, and once it releases B — boom, you get it and go on. It's a hack. It works. Try not to do it.
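The back-off hack can be sketched with pthreads. The names are illustrative, and the priority boosting only actually happens when the mutex is a PI mutex on an RT system — here the blocking lock simply lets the owner run:

```c
#include <pthread.h>

/*
 * Canonical lock order is B then A.  This helper is for the
 * out-of-order path that already needs A first and then B:
 * trylock B, and on failure drop A, block on B (so the owner can
 * run — and, with a PI mutex, inherit our priority), then retry.
 */
int lock_a_then_b(pthread_mutex_t *a, pthread_mutex_t *b)
{
        for (;;) {
                pthread_mutex_lock(a);
                if (pthread_mutex_trylock(b) == 0)
                        return 0;          /* got both, no deadlock */
                pthread_mutex_unlock(a);
                pthread_mutex_lock(b);     /* wait our turn on B */
                pthread_mutex_unlock(b);   /* we only wanted the wait */
        }
}
```

On success the caller holds both locks and must release them itself; the loop never spins without sleeping, which is what fixes the priority-inversion livelock described above.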
The best thing to do is avoid that little trick entirely. Next: per-CPU variables. A per-CPU variable is only associated with your CPU. This is something we encourage people to use — yes, this is awesome. Like I said, we don't want cacheline bouncing; to scale, you need to do everything you can on one CPU without touching any other CPU's data. That's great; we love it. Well, how do people usually protect per-CPU variables? They disable preemption — and sometimes they do it for a long time. And sometimes they say: hey, spin locks disable preemption, so I'll grab a spin lock, do my per-CPU variable stuff, and release the spin lock — great, I just protected it from preemption. Well, guess what: we don't disable preemption anymore. So this is what you'll sometimes see: say I have two per-CPU variables, X and Y, and I want to add them together and store the result in a third per-CPU variable, Z. I grab a spin lock, add X and Y into Z — boom, right. Spin locks on RT do disable migration, so this still works; it's okay, you don't have to change your code. When PREEMPT_RT comes in, this is still fine, because we disable migration — as long as you protect that data with that spin lock. And if it's a per-CPU spin lock, we do have per-CPU spin locks; that's okay, we love them too. But don't mix both. Say task A grabs a spin lock, thinking "I'm disabling preemption here" — no, you're not — and does its little update; and task B has another path where it figures it doesn't need the lock, so it just calls preempt_disable() and preempt_enable() around the same little update. What happens here is this: that operation is supposed to be atomic — you have to do all three steps without anyone jumping in between. Well, guess what.
You grab your spin lock, but you didn't disable preemption — you're just stuck on that CPU. So say a higher-priority process comes in: you get scheduled out, boom. Meanwhile task B's preempt_disable() really does turn off preemption; it does its work, preempt_enable(), fine — but the two paths no longer exclude each other. So this is bad. But preempt_disable() itself is not bad — we don't want to tell people it's bad. The only thing is: if you use a preempt_disable() and you can't see the preempt_enable() on that same small screen on your phone, you did too much. Here's a test you can say to yourself: could I put this code on a slide and project it so people in the back row can see both ends? Then it's fine. Ideally, don't call functions in between, either. And don't do this — this is something we've seen: somebody needs to kmalloc() something, and they just happen to be inside a preempt-disabled section, and they don't really want to reorganize their code, so they're being lazy and allocate right there. Boom. Now you also have to use GFP_ATOMIC, which is bad in and of itself — you don't want the allocation to schedule out or anything. The problem on RT is that kmalloc() takes spin locks, and spin locks sleep now. That's not good. Don't do this. Do your allocation up front; if it fails, don't disable preemption at all; then do your work. It takes a little more organization, but guess what: it's cleaner code. It makes your code much nicer and easier to look at — people will be pleased; they'll read your code when they go to bed at night. Do it with GFP_KERNEL: now you do a normal allocation, because you can be preempted, you can sleep — great, you don't have to do these crazy things with special atomic memory. So preempt_disable() is not bad if it is short. Keep slow operations out of preempt-disabled sections.
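A kernel-style sketch of the before-and-after being described (hypothetical names; not runnable outside the kernel):

```c
/*
 * Bad: allocating inside a preempt-disabled section forces
 * GFP_ATOMIC, and on PREEMPT_RT kmalloc() takes sleeping locks.
 */
preempt_disable();
item = kmalloc(sizeof(*item), GFP_ATOMIC);
update_per_cpu_state(item);                 /* hypothetical helper */
preempt_enable();

/*
 * Good: allocate up front with GFP_KERNEL (we may sleep here,
 * and can bail cleanly on failure), then keep the
 * preempt-disabled region short.
 */
item = kmalloc(sizeof(*item), GFP_KERNEL);
if (!item)
        return -ENOMEM;
preempt_disable();
update_per_cpu_state(item);
preempt_enable();
```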
This is not only good for real time — it's good for the general responsiveness of the system; it's good for mainline too. Disabling IRQs: avoid local_irq_save(). Most likely, if you're using local_irq_save(), it's a bug. A lot of times I've seen it in old drivers that were written when everyone had a single CPU, where it stood in for the spin lock they needed — I think we got rid of most of those. You really shouldn't have to disable interrupts, and I'd say you're being greedy — being selfish — whenever you use it. Especially in a device driver: when you use local_irq_save(), you're delaying interrupts for everyone else, not just for you. I could make a political point about my country's leader right now, but I don't want to go there — I'm American, so proud of that. Okay, anyway. There is a case where you have per-CPU data that's shared with actual interrupt handlers and you don't want to grab a spin lock. For that we have something in the real-time patch called the local lock. I didn't add it here because Thomas doesn't want to add it — he says "I'm trying to find better ways to write code so we don't need it" — but we haven't come up with an answer for the situation where you genuinely do have per-CPU data shared between a thread and an interrupt handler. So on mainline, the best thing you can do there is local_irq_save(), because it stops the interrupt handler from coming in — but then again, you're stopping everyone else's interrupt handlers too, so you're being selfish with that as well. So it's best to rewrite your code.
Maybe try to find another mutex or spin lock. We have this thing called a local lock, which on mainline is basically just a local_irq_save() or preempt_disable(), but on RT it's a mutex, so it's fine to take. What's also nice about it is that it annotates why you have this local_irq_save(). A lot of times — and we've actually had this — we go back to the developers and say: we've studied this for three days and we have no idea why you have a local_irq_save() here. And the developer goes: oh, I forgot about that, we don't need it anymore. We just deleted it; that was the solution. They didn't know why they had it, because there's no annotation on it — it's just "local_irq_save(): I need to stop interrupts on this CPU" — but their code had changed, it had spin locks now, and the local_irq_save() wasn't needed anymore. And don't do this — we've seen people say, hey, I can be fancy: local_irq_save(), do a lot of work, then spin_lock(), spin_unlock(), more work, then local_irq_restore(). No. That's just spreading the pain out to everyone. If you do local_irq_save(), keep it tight with the spin lock; if you hoist it away from the lock, you're not saving anything — there's no benefit to being clever about where you disable interrupts versus where you grab locks. Don't do that. It also messes up RT, because local_irq_save() is a real disabling of interrupts, and then when you take the spin lock — guess what, you schedule. Softirqs: they are one of our biggest nemeses — to mainline as well as to us. They have a long history.
I'm not going to talk about the history. What you do with a softirq is raise it — raise it from the dead — and then you're basically asking it to run, and it will run when it can. Right now, the way it works, it mostly runs from interrupts: softirqs are in interrupt context most of the time. But if you write a softirq handler or a tasklet and you think you're always going to be in interrupt context — you're not, because when we get too many softirqs we kick off ksoftirqd, and that runs your handler in a thread. So we have local_bh_disable() and spin_lock_bh(). spin_lock_bh() actually does something a little different from our other locks: it takes a special lock for softirqs. It disables preemption on non-PREEMPT_RT kernels — sorry, on PREEMPT_RT kernels preemption stays enabled, but migration is disabled, so you can still use per-CPU variables; that's fine. Currently on mainline, softirqs are indiscriminate in what they run: when you raise one — say you raise the networking softirq — if there's also, say, a block softirq that wants to run, it will run too, and you have to wait for that as well. There's no way to set priorities between them; there's no way to make it deterministic. So softirqs really do suck on mainline, and they hurt. People are actually working on fixing this in mainline now, and they're using RT ideas to do it. We do things differently with softirqs in RT, and we went through several iterations to get it right — probably every three years we rewrote the softirq logic completely differently. We've been consistent ever since about the 4.0 kernel, I think — that's when we finally went:
"Hey, we finally got softirqs right." Before that it changed all the time, because we had no idea what to do. We wanted to separate them — change the paradigm — we wanted every softirq to have its own priority, so you could run the networking softirq at the priority of the networking stack, and let people change it. We had softirq threads at one point, but they caused deadlocks, because they changed the paradigm of how things work: softirqs can run on any CPU at the same time, just not the same softirq on the same CPU at the same time. So things got really, really weird.

What we finally figured out we could do is: just let whoever said "I want this softirq to run" run it. Whoever raises it — that's the priority; you know what the priority is. So if your high-priority task raises the softirq, it runs the softirq logic in its own task context. Works great.

Mainline, meanwhile, is currently suffering from softirq starvation. Like I said, the networking softirqs can't run because the block softirqs are going too long, and people are having issues with network performance — everyone wants really good response time, even on mainline Linux. So there's work going on right now from Frederic Weisbecker, who also came up through the real-time side; his work is influenced by RT and is on LKML now. The first series, I'd say, is not the best, and it's going through review — you'll see soon. What we're going to start doing is annotations: when you enable or raise a softirq, with a guard that saves and restores, it's going to record, "okay, we know what you raised," and when you re-enable softirqs it will only run what you raised; everything else goes to ksoftirqd, that kind of thing. So only the ones that matter run inline — things like the RCU softirq go to ksoftirqd, because nobody really needs to run the RCU one directly. That's a garbage collector.
The RCU softirq can run whenever it gets a chance, as long as it eventually runs — that's fine. But things like that don't need to run inline, and they can take a long time, because they're running RCU cleanup code, which can take forever. So you don't want those things running in your path. We're actually working to get mainline to say, "hey, just pick which softirq I'm going to raise," and have it all done magically for you. Hopefully — we'll see what happens with Frederic's patches.

The way we do it on RT now is: once you do the raise, it sets a flag in the task_struct saying, "okay, this guy raised it, so he wants it run." We run it at his priority — he asked for it, let him take the penalty for it.

So softirqs on a non-real-time kernel look like this: a high-priority task raises a softirq, an interrupt handler goes off, and before it goes back to user space it runs all the pending softirq handlers — RCU, whatever's there — so the high-priority task has to wait for all of that. What we do on real time: the interrupt goes off and triggers the softirq, and within the handler it comes back — actually, this slide is a mistake, I just noticed: the green should have run first, then the handler; just swap that in your head and put the schedule there — and then it runs the softirq logic within the raising task itself.

Finally — and this is getting close to the end — raw spinlocks. This was actually our gift from Linus. Back in 2009, raw_spin_lock was introduced to the Linux kernel. How many people have seen raw_spin_lock in the kernel? ... I guess people are just tired.
They don't want to raise their hands. Yes — raw_spin_lock has no meaning in mainline. Zero. None. Nada. We came up with excuses, but they're all BS. It's just a way for us to differentiate when you have a spinlock that can't sleep. For instance, the scheduler spinlocks and the timer spinlocks can't sleep — you can't have the scheduler's own spinlock call schedule; it kind of gets into an infinite loop that way. So there are things that actually have to be a raw spinlock, and that's what it is: raw_spin_lock is "scheduling mode." We have it in there; don't use it just because a sleeping lock becomes a problem for you.

It also makes lock ordering important. Yeah, this is the slide I wanted: you can only use raw spinlocks for critical — usually hardware or scheduling — activities, or major events like a CPU going down. Not your normal device driver. Don't say "all my code is special." Don't do it. If you do, it can cause problems, and it actually defeats the point of a fully preemptible kernel: the more raw spinlocks you add, the less real-time it becomes. It's less deterministic, it changes timing, it's horrible. Don't do it. Your lock is not as important as you think it is.

Questions? What an anticlimactic ending. Okay — I think we have about four minutes for questions. There are microphones, because I guess this is being recorded, so come show your pretty face to the TV. Oh, I see someone coming up. Thank you.

Q: Can you go one slide back, please — the one about "if you cannot figure it out." We actually had a problem with "scheduling while atomic": I had a lot of those error messages in my kernel log, and it was because we ported from 4.9 to 4.14, where the irqsafe flag inside the hrtimer struct was removed. We got it compiled again, but then we saw a lot of "scheduling while atomic."

A: Is this mainline, or the PREEMPT_RT patch set?

Q: The PREEMPT_RT patch set.

A: Have you reported that to Thomas?
Q: No.

A: Please do, because that could be the case where — when he removed the irqsafe flag, what that means is he didn't think there was any path that could get there that's not from an actual hard interrupt—

Q: Yeah, we are using it in a hard interrupt.

A: Oh, no, no, sorry — turn that around. I started these slides at 5 a.m., okay? I meant it the other way around: he thought there was no path where it would be used from hard interrupt context — he thought that to get there it had to be threaded. If not, then either something about his assumption is wrong, or something about your assumption about using that function is wrong. So it's a communication thing. I would highly recommend reporting it: "Hey, I found a path — when you changed this, this broke. What can we do to fix it? Or should that be irqsafe?" You can always send a patch and see what happens. Okay, someone else?

Q: Yeah — right near the beginning you mentioned—

A: Wait, were you number two there? I thought — oh, where were we — you stood up — geez, I was waiting patiently — you're right, it's just that the speakers are over here and I thought someone was yelling. Okay, sorry. Go ahead.

Q: You talked about basic RT at the beginning, and then the full RT. What's the difference between those two?

A: I believe — it's been a long time since we used it, and we kept it in so that if anything breaks we can come back — basic RT turns on everything except the sleeping spinlocks. It does everything else: threaded irqs, all of it. It's basically a way to—

Q: Is it to tell whether the bug is because of the sleeping spinlocks?
A: Or whether it's a bug in something else, yes — because the sleeping spinlocks are the big thing we want; everything else is kind of secondary. There might be a bug there too, and when you have both on it's hard to differentiate. I've never used it myself. Thomas, I think, still uses it — I asked, "should we get rid of that?", and he doesn't expect it to go into mainline. That's the one thing he wants to rip out before it actually makes it into mainline. I only put it on the slide because if you download and install the PREEMPT_RT patch you'll see it, and I wanted to explain what it was. Anything else?

Q: Yes — a slight comment, if I may correct it. You put on one slide "use RCU if you can." But RCU is only good for high-priority tasks; it's often better to use mutexes.

A: I would say "can" and "may" are two different things. The reason I say "use RCU when you can" is that it won't suffer the performance hit — because RCU is extremely fast for writers, extremely slow for readers — sorry, swap that, I need more coffee — extremely fast for readers, horrible for writers. So yes, if you have a lot of writers, RCU is going to kill your performance. In that case, maybe you can try to rewrite your code — I've actually worked on rewriting code to get rid of the writers, or move them elsewhere, and piggyback things so we can use more RCU — because RCU really scales, and rw mutexes do not.

Q: Okay, thank you. One more comment, about drivers — actually this slide, raw spinlocks. We did several patches to fix a GPIO driver, because it implemented an irqchip. So in some drivers it's the only way.

A: Okay, let me say something. The reason I said it so strongly is that I've seen raw spinlocks thrown in too quickly. There have been a few cases where we said okay — they grab a lock—
—and it's basically the same situation as the preempt_disable() thing: if you can see the whole critical section on your cell phone, or on one slide, with no function calls — just grabbing the raw spinlock and doing a few things — I think slab might do this too — and it never holds it for more than a very, very short time, that's perfectly acceptable. But anything more than a screenful, or with function calls inside it — no. Does that make more sense?

Q: Yeah — but this GPIO driver is basically just irq registration, nothing else.

A: Well, we're slowly fixing things — one thing at a time.

Q: What is in place, or being put into place, to help catch regressions early on PREEMPT_RT?

A: Are you coming to Plumbers? Linux Plumbers — we actually have an RT microconference there, and that's literally one of the agenda items: how do we catch the regressions? Because, like I said at the beginning of the talk, it's not "we got it in, great" — oh no, we have a lot more work to do, and that's one of the topics. We have ideas, we have answers; I'm not saying anything yet because it's still a work in progress.

Q: What about seqlocks — are those good for RT, bad for RT?

A: Try to avoid them if you can; they're not great even for mainline. They're like a retrying lock, you know — well, it matters; it matters how much contention there is. I think on RT — I haven't looked at the code recently — we do make them more deterministic, because right now that's non-deterministic functionality, depending on your load. But most of the time, like grabbing a timestamp — that's usually a reader, and most paths that hit that are not real-time tasks, and if it's not a real-time task we really don't care about determinism. It's one of those things you have to audit. So yes — seqlocks. Anyone else?
Q: Yeah, over here. Are we still trying to get rid of the semaphores — the traditional ones?

A: Um, I don't think there are many left. What's left?

Q: No, they're actually getting more — they're getting added faster than they can be removed. We were down to 30, I think, at some point; now we're over a hundred.

A: Good to know, thank you. We would like to get rid of them — we have some ideas about how we can improve that. Are you going to be at Plumbers? Yes? I'll see you then. Anyone else? Well, you've got to come up and get a mic, since it's being recorded — but if it's a short question I can repeat it.

Okay, the question is: if we get PREEMPT_RT into the kernel, what will happen to the patch set — will it go to zero? Hopefully, yes. We might keep a niche patch set for things that may not get into mainline — for some device, or for some weird situation — just as a kind of staging area while we try to get mainline to come up with a proper solution. But the big hammer is the sleeping spinlock: once that's in, that's real time. Everything else is sort of little fixes. Everyone looks at the patch set and says, "wow, you've got 300 patches in there" — but 290 of them are basically little fixes like that. Once we get one big thing into mainline, it'll take out 200 patches, because those 200 things are hacks for not having a solution to that one thing. Once that's solved — boom, those go away. We may keep little things around, but hopefully everything goes in.

Q: Will it be optional?

A: Yeah, it'll be optional. Yes — at that point.
It'll be something you can turn on. As before, for those cases where someone says "I want to get this in" and we say "that's not a great solution" — we may maintain a patch set for the cases where we don't really have a real solution yet. But 99.9% of the people here won't care about that. In fact, I probably would never run it myself. It would be more like, "hey, I've got this thing, we haven't figured out how to solve it here yet, so we'll put this patch here so everyone who has the same issue can share it." And the Linux kernel really is always forked anyway. People ask me, "what happens if Linux gets forked?" — I hate to tell you, there are hundreds of thousands of Linux kernels out there, and no two of them are the same. They're all forks: Android, even Debian, Red Hat. Red Hat is its own system; they have their own patches. SUSE has their own patches. Same thing. Anything else?

Q: When will it happen?

A: We were hoping this year. It looks like, because of recent events, it's not going to happen — and that has nothing to do with technology. Anything else? Okay, thank you.

Q: One more — it's actually very practical. With all these things, are we getting truly deterministic behavior — you know, "it'll happen within this many clock cycles" — or are we just saying it's guaranteed 99% of the time, that kind of thing? What's the latency we're guaranteed, the jitter?

A: That is not determined by the RT kernel. It's determined by your hardware.

Thank you. Yes.