Right, hello, and welcome to our talk. Let's start. I hope you are having a great week — thank you to the organizers for this great event. It's Friday afternoon and we are all a bit tired from the week, so let's have an easy talk about threads in embedded Linux.

What I would like to present: perhaps some of you have already been doing threads in another programming language, or in another operating system — on some embedded OS we have tasks — and the good news is that what you have learned you can also apply on Linux. But the Linux landscape is a bit different from what you know, and so it's important: even though you can take your map with you, some regions of your map will be inaccurate, and what I would like to do in this talk is help you correct your map. Also, some of you may already be using threads successfully on Linux; for these people I would like to show you some internals and perhaps help you connect some dots.

I'm Loïc Domaigné, and I welcome you to this talk. We will have six easy pieces: getting started, thread creation and life cycle, the thread stack, memory access, mutexes and condition variables, and threads and signals.

Who am I? Well, I'm Loïc Domaigné. I have been working with Linux since 1996. I have a twist for safety-critical software, and I'm getting a bit older, so I thought it's a good idea now to pass the baton to the new generations. So since 2018 I'm a trainer at Doulos. Doulos is a training company that has been providing great training for about 30 years now, in hardware design and verification as well as embedded software and deep learning.

So the question is: why Linux, why threads? You see here on the slide some of the systems I have been working with: foundry simulation, telecom carrier-grade, air traffic control — and I would like to add medical and automotive. What do all these systems have in common?
Well, they are either using Linux or threads — and most of the time both. Threads should be seen as a programming model to simplify the code when you have different concurrent activities. Particularly in embedded systems, we will use threads when we want to handle multiple I/O that may happen seemingly at the same time. We may also want to use threads to take advantage of multiple cores — that's the case on Linux. But more importantly, you should be using threads because they will make your code easier to understand and to maintain.

Now, if you have been doing threads you may say that's exactly the other way around — it makes things more complicated. I see some confirmation over there, thank you. The point of using this programming model is that you have to have a clear understanding of the timing dependencies, the temporal dependencies, between your activities. It forces you to get this right, and that's something we want in safety-critical systems.

Okay. Now let's review: what is a thread, what is a multi-threaded process on Linux? Nowadays a process is simply a collection of one or several threads, and all these threads share some common resources, like for example the memory mappings or the list of open file descriptors and so on. Now, a thread is an abstraction that contains all the information we need in order to run code on the CPU. In particular you will have thread-specific attributes like a stack — you need one to call functions — and other attributes needed by the scheduler, for example the state: is it running, sleeping, and so on. If you want a list of which resources are shared by the threads — what is common to all threads and what is thread-specific —
please look at man 7 pthreads to get started. On Linux there is a one-to-one mapping between the user-space thread and the corresponding kernel task_struct. Meaning, the Linux kernel only sees tasks, and the scheduler will schedule your threads accordingly.

All right. Now I want to discuss a bit of the detail of how it is done on Linux. When we create a thread at application level we use pthread_create, which is offered by the C runtime library, and underneath, the C runtime library will call the corresponding kernel system call — that system call is clone. Here you see a simplified view of this function: we need a start routine, we need a stack, and there is a bunch of flags that specify what you want to share between the cloner and the new clone you are creating. Of course pthread_create, per the POSIX standard, specifies which resources to share — but nothing prevents us from creating a new user-space API that would be thread-like but share other resources. I find it a very beautiful API that is unfortunately underused on Linux. Of course that would be a Linux-specific solution, but sometimes you have problems that need specific solutions.

All right, we have now achieved the first easy piece. Let's move on: thread creation and thread life cycle. Well, there is an API, and it looks quite straightforward. We need to include a header file, we need to call pthread_create, and there are four parameters: thread ID, thread attributes, start function, and an argument we can pass to the start function. Okay, that does not look complicated. So here I have main — that's my main thread of execution — and at some point I call pthread_create; I will then have two threads of execution: main, and the start function with the argument I passed. Well, straightforward. Okay, then let's demo that. Let's start with a first thread — how hard can it be? Here I have created a simple program with pthread_create.
I have even done some error checking to make sure that the thread creation is successful, and then I return zero from main, as I've learned. Of course, we are doing our first thread program, so it must be a hello: in the start function we just print "Hello, I'm a thread" and the name of the thread that is passed as an argument. Nothing complicated. Let's compile, go to the target, and run the program. Oh, that's strange — we don't see any output. What's going on? Perhaps I'm not using the API properly? No, everything looks good. So you could look on the internet for what the problem is, but you may also have heard the talk on strace — so let's use strace to figure out what's going on. I use -f to follow the parent and the threads, and what do we see? We see that our clone function is called as expected; we also see the kind of resources that are shared with the clone; we see that the clone is indeed created — it has ID 575 at the Linux kernel level. But then we see someone calling exit_group(0) for whatever reason, and we also see that the new thread was about to start — but, well, everything terminated.

So now we need to figure out who is calling exit_group(0). That's not us — we have an exit, but that exit is only in the error case, and we would then see this print statement, which we don't. Fortunately we live in a world of AI, and because I am also an AI trainer, I created just this morning a great "AI" that will be able to help me figure out where the issue is. Here I have chosen the musl library because the code is a bit simpler to understand than glibc, but you have exactly the same thing in glibc. Okay, great: hey AI, who is calling exit_group in the musl library? Let's see if we get some answer. Oh —
we have now these few lines. There is a __libc_start_main that calls main like this: exit(main(argc, argv, envp)). And that's super funny, because in the ELF file there is an _start function — the entry point — and on ARM it does a branch-and-link to exactly a function named like that: __libc_start_main. That's a coincidence, right? So here is what happens when your process starts: there is a jump to a start routine in the C library that does all the necessary initialization before calling your main function, and the main function of your program is called like that: exit(main(...)). So when we return zero from main, we are passing zero to exit — and bad luck, exit is the system call that terminates the process, that is, all running threads. And that's why we are not seeing the result.

What's the fix? The fix is easy: we should wait. We should join the thread — wait for the thread to finish its business before we return from main. That's one possible fix.

What is the takeaway of this small test? The takeaway is that there are certain aspects to watch out for when you are coming from some other OS or language. For example, in some other languages, when you do something similar, no problem — the system will wait for the thread to finish. There are a few aspects to consider; I won't go into the details now, but I created these two tables that you can look at — you have access to the PDF — with the kinds of aspects you should be aware of. That's the table I use when I'm doing threads in some other operating system or language as well. There is one which is dear to me, safety-critical: what happens if a thread causes a hardware exception, a segfault? Will only the faulty thread be terminated, or the whole process? Here, all threads will die. That's okay — it only has an impact on how to do the error recovery.

Okay, the thread life cycle: because of the one-to-one mapping between the user-space thread and the kernel task,
it follows exactly the same life cycle as any process on Linux. From a programmer's point of view, you can adopt a simplified version of it with only four states: runnable — ready to run but not yet on the CPU; running — on the CPU; sleeping; and exited. That's enough to reason about your program, and it's also enough to debug it.

All right, piece number three: the thread stack. That's a super important topic. On an embedded OS, when we have a small RTOS, when we create a task we need to specify the stack. And here is how we have started our thread so far: pthread_create(&tid, NULL, start_routine, NULL). Well, the thread requires a separate stack, and as we have seen, many embedded OSes require you to define at least the stack size when you create a new task — and so does the clone API, by the way, as we have seen. As a matter of fact, the second argument here, NULL, means you are using the default thread-creation attributes — and there is a default stack size. Then again, with my safety-critical twist, I'm interested in what happens if we have a stack overflow. So let's have a demo — let's overflow the stack, for fun and teaching. That's the second demo.

It's a simple demo: I create a thread that will overflow the stack, and before starting the test I pause and print some information about the stack. The thread itself is quite easy: we pause before we start the test, we create an array on the stack of 1 megabyte, and then we just call the function recursively, forever. Okay, let's demonstrate what happens. Here we see the information about the stack. Remember, on most architectures the stack grows from higher addresses to lower addresses. We see we apparently have 8 megabytes.
Well, that seems to be big. Also, if you translate the thread ID that you get from pthread_create to a pointer, it seems to be an address located at the top of the stack — just perhaps a coincidence. And then we have something called a guard size, of 4 kilobytes; we will see what that is. So now, an 8-megabyte stack seems super big — and remember, on Linux memory allocation is a bit different. Here I'm using the tool called pmap, where you can see the process's virtual memory mappings. So if we look at pmap, we see the stack that has been allocated: 8 megabytes plus a guard, allocated at that address, and the first 4 kilobytes are marked such that we cannot do anything with them — that's the guard. What's more important, I think, in this output are these two numbers here. We are allocating an 8-megabyte stack, but that's only the virtual address range that we are reserving. Remember, memory allocation on Linux uses a page-fault mechanism: when you access a valid address that is not yet mapped, that will be caught by the Linux kernel, and it will do the commitment — the physical memory commitment. And that's exactly the third column here: we have allocated an 8-megabyte virtual memory range, but we are right now using only something like 20 kilobytes.

Right, let's now start the infinite recursion — and we get a segmentation fault. And that's really a great thing, again, because it means that if we have several stacks, several threads — stack number one, then the stack of the second thread after it —
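The stack information printed in the demo can be queried programmatically. A minimal sketch, assuming the GNU extension pthread_getattr_np (available on glibc and recent musl); the function names `show_stack` and `run_stack_info` are my own:

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <assert.h>

/* Print where this thread's stack lives, how big it is, and the guard size. */
static void *show_stack(void *arg)
{
    pthread_attr_t attr;
    void *stack_addr;
    size_t stack_size, guard_size;

    (void)arg;
    assert(pthread_getattr_np(pthread_self(), &attr) == 0);
    assert(pthread_attr_getstack(&attr, &stack_addr, &stack_size) == 0);
    assert(pthread_attr_getguardsize(&attr, &guard_size) == 0);
    printf("stack @ %p, size %zu KiB, guard %zu KiB\n",
           stack_addr, stack_size / 1024, guard_size / 1024);
    pthread_attr_destroy(&attr);
    return NULL;
}

int run_stack_info(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, show_stack, NULL) != 0)
        return -1;
    return pthread_join(tid, NULL);
}
```

Comparing this output with pmap for the same process shows the reserved range versus the committed pages discussed above.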
we won't be able to smash the stack of another thread: it will crash. On Linux, again, that's great from a safety-critical point of view, because we know there is a problem and we can do error recovery.

Right, let's summarize the situation. This is what we have: at the beginning of the stack, typically the thread library — sorry, the C library — will put a thread control block, that's information for managing the threads; then your stack grows downwards; and as we have seen, we have a guard — a region of one page, 4 kilobytes, that is mapped but that we cannot access. So if we try to go past this guard, the MMU will immediately inform the kernel and you will get a crash.

Some consequences of what we have seen. First of all, remember that a virtual memory range is mapped; that's not physical memory that gets committed. Memory is committed using the page-fault mechanism — so if you are doing real-time applications, you may want to look at mlockall. Then, the C library: on embedded Linux systems we have different C libraries — it could be glibc, but it could also be musl. Now the point is, on glibc the default stack size is 8 megabytes, but on musl it's only 384 kilobytes. So if you need half a megabyte in your application, it will run fine on glibc and crash on musl. So here you need to adjust the stack size. And finally, why did I speak about the thread control block? Where that block of information is located is an implementation detail, but it has an implication: namely, if you don't join your thread, or if you don't detach it, this mapping will be retained until you join. There are APIs to control this; what you typically control is the stack size. There is an API to create a default pthread attribute object, initialized with the default values, and then there are APIs where you can change the stack size and pass that to pthread_create.

Right, let's continue with piece number four: memory access. So, first of all, what can we access?
Because we are sharing the virtual memory space among all the threads, as long as you have a valid memory address you can access it from all the threads. Now of course, we want to keep a certain sanity: if in one thread you do a write access, x = 42, and later in another thread you do a read access of x, and nobody in between messed with x, you expect to read the value 42, right? That's the minimum sanity we need. But for that we need synchronization. First of all, you need to make sure that the write occurs before the read, for this example. And even if it happens in the right order, you still have no such guarantee: because of modern CPUs, multiple cores, the way the caches and so on are done, we need special instructions to make sure that the value is seen.

So let's discuss the memory visibility rules that are there by default. First of all, when you set some values before creating a thread, the new thread will see these values — that's guaranteed. Likewise, when you change a value in the thread and you join, then in the thread where you join you will see what was changed — that's guaranteed. But what's not guaranteed is what happens in between. So here, between the thread creation and the join, I set the value of t, and then in another thread I do some operation on t — there, I have no guarantee. So, you know that compilers can reorder instructions; well, so do modern processors now.
They can also reorder the instructions. So we need to ensure that something like that does not happen, and one way to ensure that is to use the synchronization mechanisms offered by pthreads. We will discuss two of them: mutexes and condition variables.

So, mutexes. That's something I'm sure many of you know: the classical critical section. You have a shared resource — here, that's my t — and what you do is take a lock, modify your shared resource, and then unlock the mutex. What this lock guarantees is, essentially, that either you have done it or you have not. You won't get the problem where you are in the middle of changing some data structure, another thread jumps in and also messes around with it, and you see an inconsistent value. But the mutex does more than that: when we unlock in thread A and then acquire the mutex in thread B, we also have a visibility guarantee for the value. So the mutex also implements all the necessary memory barriers for you.

Perhaps some quick facts on mutexes for people coming from other operating systems. You may say: oh well, a mutex, that's really like a binary semaphore. And that's true, but there are a few differences. First, it's faster, meaning you enter the kernel only if another thread has already locked the mutex. Second, it's owned by a thread, and that's great: when a thread tries to lock and someone else has taken the mutex, we know which thread that is, we can look at its priority, and when we use real-time priorities we can then adjust the priority to mitigate priority inversion. That's the major difference compared to a binary semaphore. Now, there is a consequence: we have this ownership. It means that normally the thread that locked the mutex should be the one that unlocks it. What happens in other situations —
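The critical section described above can be sketched as follows — a minimal illustration (the counter and function names are my own), showing both guarantees: atomicity of the read-modify-write and visibility of the updates between threads:

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Each worker increments the shared counter n times inside the lock. */
static void *inc_worker(void *arg)
{
    long n = (long)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&lock);
        counter++;                 /* protected read-modify-write */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_counter_demo(void)
{
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, inc_worker, (void *)100000L);
    pthread_create(&b, NULL, inc_worker, (void *)100000L);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;                /* deterministic thanks to the mutex */
}
```

Without the lock/unlock pair, the two increments can interleave and updates can be lost; with it, the result is always exactly 200000.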
you have one thread that locks and another that unlocks, or a thread that locks it twice? That depends on the mutex type. With the default mutex on Linux, if you lock twice you deadlock, for instance. Know that there are other mutex types you can use — for example the error-checking type, which has all the error checking in place. It's slower, but you will detect this kind of problem, and it's great during debugging sessions.

And a mutex does not synchronize which thread locks first. On the previous slide, the example worked because this instruction here happened before that one. Now, how do I guarantee that it is exactly this order? If I reverse the order, I don't get the right value. For that we have something called a condition variable. That's again a synchronization mechanism, and people coming from other operating systems say: oh yeah, that really sounds like an event flag, I know that already — great, let's use it. Okay, so it works as follows. You lock your mutex; here we set the value of t; and then we signal that we have set the value of t, so the other thread can go on with its business. And in the thread where we want to act on this change of t, we lock the mutex and then we wait with cond_wait — we wait on the condition variable, we wait to be signaled. Okay, now it's safe to continue and use the value of t. If you know event flags, that's the way you would program this kind of thing.

Now, there is a bug in this code. It works fine in the order presented on the slide, but if it happens that thread A runs first, then you will block forever in cond_wait — because there is a rule: when you notify, if nobody is waiting on the condition variable, the notification is lost. There is a piece missing here, and that's the condition itself: the idea of a condition variable is that you want to wait for a certain condition to become true.
So here, what's missing is the condition — and here it is. That's the way you should implement a condition variable. What I'm doing in thread B: I say, as long as t is not set — I'm using zero to indicate that the variable t is not set — I will wait to be notified that it is set. And now, no matter in which order things run, it works.

So, the usage pattern — and I've seen this error many times, that's why I mention it. The notifier thread: you lock the mutex, change your condition, notify the waiters, and unlock. And the waiter does as follows: you lock the mutex; while you do not have the wanted condition, you wait for it; then, when you come out of the while loop, you are guaranteed — you can do some work and change the condition. Why does it work? Because of how cond_wait behaves: when you enter cond_wait, it does two things — it unlocks the mutex and waits on the condition — and when you are woken up, it relocks the mutex. So you always have the mutex protection. And now you also see why you need a mutex here: you check the condition in the waiter thread while another thread changes it, and the mutex guarantees the visibility.

Well, a bit more information about condition variables: signal will wake up only one thread, broadcast will wake up all the waiters. And also, use a while loop here. The while loop means something like: when you are woken up, you say, okay, it might be the condition I'm interested in — but you re-check it again. That's a pattern from safety-critical, and it's also what the standard says you should be doing. There is a long story about that, but I have no time to tell it now.
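The notifier/waiter pattern above can be sketched as follows — a minimal illustration (variable and function names are my own) in which the waiter loops on the predicate, so it works regardless of which thread runs first:

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int t;                       /* 0 means "not set yet" */

/* Notifier: change the condition under the lock, then signal. */
static void *notifier(void *arg)
{
    pthread_mutex_lock(&m);
    t = (int)(long)arg;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&m);
    return NULL;
}

int run_condvar_demo(void)
{
    pthread_t tid;
    int seen;

    t = 0;
    pthread_create(&tid, NULL, notifier, (void *)42L);

    /* Waiter: while the wanted condition does not hold, wait. */
    pthread_mutex_lock(&m);
    while (t == 0)                  /* the essential while loop */
        pthread_cond_wait(&cv, &m); /* unlocks m, sleeps, relocks m */
    seen = t;
    pthread_mutex_unlock(&m);

    pthread_join(tid, NULL);
    return seen;
}
```

If the notifier runs first, the waiter never even calls cond_wait — the while loop sees t already set — so the lost-notification bug cannot occur.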
We can discuss that offline. Right — there are other synchronization mechanisms I won't discuss: barriers, read-write locks, spin locks. Also know that on embedded systems we don't want to wait forever, and some of the lock and wait APIs also have a timed version, so that we can have a timeout.

Right, piece number six: threads and signals. So: signals, threads, and processes. You want to have a signal handler — how does it work? You have your main execution; a signal arrives; you jump to your signal handler; you run the code in the signal handler; you return; and that's it. Now, threads make asynchronous code synchronous — so that's somewhat antinomic to signals — and there are other issues. Before, we had only one thread of execution to interrupt; now we have many threads of execution. Which one should be interrupted? Maybe it will interrupt the one I don't want. So how should I deal with that? Here be dragons, as they used to say on the old maps, and what I want to do is meet some dragons with you. Will compiling with -pthread break my single-threaded code? That will never happen on Linux. But what can happen on Linux: you don't have any threads, and you deadlock — and this is what I want to show you now.

All right, the last demo for today. We have here a real-time monitor of some beverage consumption — version 0.0.1, as you see. What we have is this monitor, but we also have a signal handler, because the numbers were increasing too fast, or rather a stakeholder asked: could you implement something like pressing Ctrl-C to reset the counter, so that the value does not go too high? Okay, so I've done that: I set the signal handler with sigaction, and here is my handler — I just print "reset", reset the counter, and that's it. Let's run it and see what happens. rt_drink: right, we see this huge number for the beverage consumption, and if I press Ctrl-C — well, great, it resets.
So, problem solved — until... Well, if you try hard enough, you may encounter one problem, and it can take some time to trigger the bug. That's why I prepared a video of it, which you are seeing now. Here is what happens: you press Ctrl-C and nothing happens anymore. The program seems to be stuck. So what do you do in this case? I have opened a new shell and started strace with -p, attached to the rt_drink process. Okay, and now we see that it seems to be stuck in futex. futex is the system call that is used when you lock a mutex and you need to wait. That's very strange: I have no mutex, I have no futex in my code — where does it come from? It's really weird. So here I have no idea, and, well, perhaps let's attach a debugger. I'm in the lucky situation that I have a debugger and could reproduce the problem, so let's see what's going on. I attach the debugger — it takes some time to get started — and then I do a backtrace to see where I am. It's a bit slow... right. Okay, backtrace. So what do we see? Ah, we were printing the beverage consumption. Okay, all right. Then what happened: while we were doing this printing, the signal got received, we jumped into our handler, and then we have _IO_puts. We didn't write puts — but our print statement does not have any format arguments, so the compiler turned the printf into puts. And here, it seems we are in futex_wait. That, of course, is really what's going on here. Fortunately, we have this great tool, our grep-powered "AI", so perhaps we could ask it: why does printf take a lock in a signal handler, question mark? Well, this is what we get. It seems it found that there is something like a lock — a lock on the file — implemented in stdio. I won't go into all of that.
It doesn't matter exactly — that was the hint. And if you look at the file, you see that this is managed inside the I/O layer: internally, the print functions have some data structures that need to be protected in case you run multi-threaded code. So what it does is actually take a lock — and this is exactly what we are seeing here. There is a lock implemented, and the problem is: you have taken the lock in the first print statement; the signal handler gets called; you print, so you re-enter print and lock again. You take this lock twice — and you deadlock. And the point is: there is a bug in my code. It's not safe to call printf in a signal handler — that's really the bug here. And the bad news is that none of the pthread functions are safe to be called in a signal handler either.

Is there a way out of that? There is, and it goes as follows. Before starting any thread, you block the signals you are interested in, with pthread_sigmask. When you do that before you start the threads, all the threads will block these signals. Then you create a special thread that waits synchronously for just the signals you are interested in — and then you can do whatever you want: it's a normal thread, and you don't have the mess I showed you before.

Wait, there is more to say — but I think we have only four minutes left. You have seen a pattern for handling a signal in multi-threaded code. There is more to say about signals and processes, but I will skip that for now.

Right, if you are interested in going further, there are great books. man 7 pthreads — that's the starting point. The book by Michael Kerrisk is a really great book — I'm biased because I reviewed it. This one is an old book by David Butenhof, but still relevant. And if you want to learn more about parallel programming, concurrency, threads and much, much more, there is this really great book by Paul McKenney. It's available online.
That's a book that Paul has been maintaining and updating for 20 years; it's really a great resource. So, I hope you have enjoyed our talk. Come to our booth, or visit doulos.com if you want great training like this one. And, well, maybe you have questions. Thank you very much.

Questions? Yes, there is a question over there. I'm very sorry — I need the mic or a device, otherwise I won't understand your question. [Question about the grep "AI".] Do you know this meme? I will show you afterwards what the grep AI is. Okay, thank you. Unfortunately, I introduced a bug — I didn't test it properly. All right, any further questions? Any question online? Yes, there is an online question: threads.h is included in C since C11 — could we not use that in embedded Linux? Oh, that's absolutely correct. Actually, if I remember correctly, on musl and on glibc it uses pthreads underneath — yes, that's true. That's something I haven't looked into yet, but correct: if you want it from the language support side, that would be a way to go. Yes, thank you.

Well, I think that's it. Enjoy the show and the rest of the week. Thank you very much for being here. Thank you.