All right, welcome back. I don't know what day it is now, Thursday, lecture seven. Hopefully everyone's lab one is done, though it sounds like some people have taken something like 80 hours doing it, which is ridiculous. You might want to put that in the suggestions: labs up to three are kind of set in stone, but if these labs take too long I'll try to convince Ashton to change the later ones. Anyways, today we talk about threads. The other section has already talked about threads, but because we actually know about processes, and pretty much everything we've learned about processes maps exactly onto threads, we're going to pass the other section and we're going to be just fine. We've already front-loaded the hard work with processes, probably the hardest thing to wrap your head around initially. Threads are exactly like processes except for shared memory: the exact same principle as a process, except by default they're in the same address space, so they share memory. This is kind of what you expected when I showed you that first fork example, where changing that x variable didn't affect the other process. With threads, any change you make in memory appears in every thread. You can also optionally have memory that's specific to an individual thread, which other threads shouldn't touch; you do that with thread-local storage, but we won't see that for now, that's just an aside. [Question] Yes, physical memory. Anything that one thread touches, the other threads are going to see.
So a process can have multiple threads. If we think of a process as containing everything we need to execute a program, then if we carve out just the virtual CPU part of a process, that virtual CPU is a thread. By default there's only one, but you can have multiple threads in a process if you so choose. Because of that, all you need to create a new thread is a virtual CPU, which is much, much smaller than a process, since a process is everything: creating one means cloning a virtual CPU and also cloning all the memory and so on. [Question: the stack?] Yeah, the stack is memory specific to one virtual CPU that's executing your functions, since locals are all stack allocated; it's there so one CPU doesn't interfere with another when executing code. So when you're executing multiple threads, any stack variables in C are perfectly fine, they're your own copy. Anything allocated on the heap, every thread has access to, and that's where we run into problems. By default, threads just express concurrency, even with a single CPU. Threads existed back in the day when we only had a single CPU; they allow you to express concurrency, which remember is just being able to switch between multiple things, make progress on multiple things, and have the illusion that multiple things are running at once, even on a single CPU. Some things are just easier to express concurrently. For example, if we have a web server and some user tries to connect, we could create a thread to process that request. That way we can handle 5,000 requests to our server, do a little bit on one, a little bit on another, without having to write any extra code to keep track of all the requests coming in and where we are at
executing them and all that kind of nonsense; we can just have threads and let that nice library take care of it for us. So threads are lighter weight than processes. If we compare them: a process has independent code, data, and heap, so an independent address space, while a thread shares code, data, and heap, a.k.a. it is in the same address space, meaning the same virtual memory. A process has completely independent execution; nothing a process does affects another process unless it explicitly communicates through inter-process communication. A thread must live within an executing process: by default your process has one thread, but it could have many, and as soon as the process is dead, all the threads within that process are dead too. A process has its own stack and registers because it has one virtual CPU by default; a thread also has its own stack and registers, just so it can execute code independently, and shares the rest of the resources. A process is more expensive to create, because for now we can assume we have to create some representation of a virtual CPU and, since we don't know any tricks yet, also copy all the memory. For a thread we only do the first half of that: create a new virtual CPU. Both of them have to context switch whenever we want to stop execution on one execution stream and move to the other. When a process exits, it's completely removed from the operating system, although we know that's not strictly true: it becomes a zombie and something has to wait on it before it gets freed. When a thread dies within a process, because there can be many of them, it just removes its stack from the process, and that's pretty much it. So when a
process dies, all the threads die as well. Let's go ahead and make sure I'm not full of it and see how fast creation actually is. In this fork example, main has a for loop that executes 50,000 times: I call fork, there's error checking, in the child I immediately exit(0), so the child is technically a zombie, and in the parent I wait on that process ID, so I wait for it to finish and clean up all its resources. It just creates a single process, waits for it to die, and does that 50,000 times. If I time it, it'll probably take a while, since this is 50,000 processes... It took 12.7 seconds. That's kind of slow. If instead we create a bunch of pthreads (I'll show you later how to actually do this, but trust me, it creates 50,000 threads and does the same thing), it should be significantly faster: 5.1 seconds. It's significantly faster just because it doesn't have to do anything with the memory; that's all in the same address space. [Question] Yeah, those would be thread IDs. For this course we'll be using POSIX threads. Windows has its own API, Win32 threads, but people hate those so much that they use POSIX threads on Windows, so we're just going to use POSIX threads. To use them, you include the header file (it's a C library) and add -pthread when you compile and link. They all have documentation in the man pages. We're just going to see how to use threads and discuss them today, but in lab 2 you're implementing this stuff: you essentially have to make your own representation of a virtual CPU. I believe the way it's structured now, you use pthreads later on, but the other section isn't going to learn about pthreads until way later.
We're just going to learn them now, so we can actually understand threads while we're playing with them at the same time. You don't have to know the pthread APIs yet; the APIs definitely won't be on quiz one, but they can ask you about threads, so we'll learn threads by playing with them. pthread_create takes a few arguments. First, a pointer to a thread structure that it will initialize for you; you just have to create some space for it. Then there's the attribute argument, which is basically how you set options the thread should have. Don't worry about the format for now; I'll show some examples. You'll probably never use it, but it's good to understand what the defaults are. Then we have this great third argument type; anyone want to hazard a guess what it is? It's kind of ugly because it's C, but the third argument is a function you define that takes a pointer and returns a pointer. void * is just C's way of saying "I take a pointer, I don't care what data type it points to, you can figure that out later." The fourth argument is the pointer to actually pass to that function when the thread starts to run. pthread_create returns zero on success and otherwise returns an error number, which is a bit different from the old "return negative one and set errno" style. We'll see an example of this so it makes more sense. [Question] Yeah, the structure just holds a thread ID, somewhere within it. If it's a real thread, there's a system call you can use to ask for the thread ID. A real thread is a kernel thread, as opposed to a user thread, which is what you'll be implementing; you'll spend a lot of time on the lab and you won't even have real threads, it's kind of sad. [Question: is this a kernel thread?] Well, it can be; most of the time it is. It might not be, because it's just a library and you don't know what actually happens
under the hood, but on Linux it is a real thread, a kernel thread. We'll talk about kernel threads and user threads next lecture, so don't worry about it for now; just assume there's a thread. All right, this is what it looks like to create one. We can start reading from main like we always do: you define a pthread_t, call it thread, and for pthread_create you pass its address as an argument so the call can initialize it. We give it the default attributes, which we don't really know what they do yet, then the address of run, which is just our function that takes a pointer and returns a pointer, and I pass NULL as the argument, since I don't even use it. So what are some differences between this and when we created processes? Right: with processes we fork, and then two things are executing at the exact same point in the code. With pthreads, the first thread, the main thread, which you could think of as a parent if you want, executes pthread_create and then just continues executing, so it would print "in main", and the newly created thread starts executing this run function. [Question] Yeah, just like main has to take argc and argv and return an int, pthread functions have to take a pointer and return a pointer. This is where we can compare it to processes: you can think of this run function, whatever it is, as a thread's main. If a thread returns from the function it's supposed to run, it implicitly calls something that acts like exit, and then the thread's dead. So you can think of it exactly like main; we'll see it's pretty much exactly analogous. [Question] It depends; it's kind of like our exit status for a process. If you want something to read this return value, you have to get something to acknowledge it.
There is an equivalent of wait, so you can get that return value back. Just like how main returns an integer and something can read it, a thread returns a pointer and you can read it. [Question] No, the main thread will call pthread_create here and then just keep on executing, so the main thread prints "in main", and the newly created thread starts executing here; as soon as it hits its return, that thread is dead. So we should only get two prints here. We don't know how to get the return value yet, but it's the same question as when we started this course: main returns an int, where does that go? No one knew at the beginning of the course, but now we do. [Question] It runs at least concurrently; you don't know if it runs in parallel. And how do you know a thread is done? Same answer as how you know a process is done (processes can run in parallel too, right): if it exits and then you wait on it, then you know it's dead. We'll see the same thing here for threads. Threads don't appear out of existence like fork, where the child just starts running after the fork; a thread has to start executing a function. [Question: which runs first?] They may or may not; it's up to the kernel. It's the same as two processes running: which one runs first? You don't know, but like we saw with the child and parent, it was like 999 times one way and once the other; it's going to be the same thing here, as we'll see in a second. All right, let's just go ahead and see it. The wait equivalent is called join, just to confuse you: pthread_join. It takes that thread structure that got initialized by pthread_create, and then this nice void ** just to make sure we super understand pointers: you just have to allocate space for a pointer so it can write the value of the pointer there,
because that's C, and that's how return values work; trust me, this will make sense when we see a little example. The important thing is that this is essentially like waitpid, where we're waiting on something specific, so you should only call this one time per thread; otherwise it leads to undefined behavior, and C is allowed to format your hard drive and do whatever it wants. Undefined behavior means you don't know what's going to happen: going by the spec, it could rm -rf and just delete everything if it wants. So this is how you would wait in that previous example, and we'll show an example of another type of thread, because this is kind of annoying: we have a thread that essentially turns into a zombie that we have to wait on, or join. We'll see an alternative. And yeah, when you join it, the resources get cleaned up, so in this case after it's joined, its stack would be dead. [Question: can it be any thread? Can you join on yourself?] Well, in this thread, if I somehow got this thread's own structure, I could call join on myself; it'd be kind of weird and you'd have to go some roundabout way, but yeah. The main thread just goes to sleep and waits for that specific thread to essentially stop running, and then it cleans up; it cleans up the zombie thing. All right, there's also an equivalent of how we could either return from main and return a number, or call exit early and return the number from that. To end a thread early, it's called pthread_exit, so it actually has the same name. It just takes whatever value you want to return to whoever joins on you. When the start routine returns, that's the equivalent of it implicitly calling pthread_exit, just like how returning zero from main implicitly calls exit. The exact same thing happens between a process and a thread; you just get to put pthread_ in front of it.
So let's first talk about detached threads, and then we'll see an example. Joinable threads are the default: we have to wait on them and acknowledge them to get the return value back, and then the resources are released. But thankfully, as opposed to processes, where you absolutely have to wait on them to get the resources reclaimed, you can have something called a detached thread: as soon as the thread is dead and returns, its resources are cleaned up automatically and you don't have to worry about it. The con is that because you don't know when it dies and it just ceases to exist, you can't get a return value out of it. So you'd only use this if you really don't care what a thread returns, and you do it just by calling pthread_detach. We'll see an example of that here. We have the same code as before; we're just going to detach the thread. When we run this, the main thread should print "in main" and also create a thread that should print "in run", and they should happen in any order, it's kind of up to the kernel. So let's go ahead and execute that. You can guess there's probably something wrong with it, because I call it detached-error... that wasn't supposed to happen, I need to recompile. If I do that, it just prints "in main", which sounds like it shouldn't actually happen, right? Except then it does some really weird stuff... what the hell, okay, it didn't actually print "in run" twice; I'm going to have to watch this recording again later. It should only print "in main" pretty much every single time... okay, except for that time. So most of the time it prints "in main", which is what I would expect, but sometimes we get lucky and it would print "in run" and then "in main". So what's
happening here? Because it's a detached thread, it will just run; you don't have to acknowledge it or anything, you're not going to wait for it to die, so it just dies whenever, and the main thread just dies whenever. But when the main thread, the one executing main, exits, that means the process is dead. And if the process dies before the other thread runs, remember: when a process dies, all the threads die. So whenever it prints only "in main", it means the main thread reached exit before the other thread had a chance to execute; the kernel just decided not to schedule it. In the lucky case, the kernel decided to schedule the thread before the main thread reached exit. In order to fix this with detached threads, you put a pthread_exit into main. What that does is your process won't die until all your threads die (unless you kill the process): pthread_exit there just kills the main thread and lets the other one keep running until it's dead. If we go ahead and run that now, no matter how many times we do it, we'll get "in main" and "in run"; it may not be in that order, but both of them will always print. This is now where we get to relate this to our system calls, which are annoyingly named the complete opposite way. Remember in lecture two, the system call was called exit_group: exit_group kills the process and all of its threads. Meanwhile the system call exit, annoyingly enough, is not related to the C exit; the system call exit just stops a single thread, which is essentially what pthread_exit does. Confusing. You don't actually have to know that little detail; all you need to know is there's a way to exit a thread and a way to exit the process, and if you exit a process, it kills all the threads. [Question] Yeah, even if the threads were joinable, if I reach the end of main without joining, whenever main ends they would all die.
But if you join, you make sure to wait for the other one. So you need to join threads to clean up all their resources; they're not that big, so you could waste resources if you want, but you should join them. What if I comment out both these lines and just make it a joinable thread, like that? It should be the exact same thing as before, where it just prints "in main", because we didn't join it. Joining just makes you wait for the thread and then release its resources, so this is the same as the detached case: the other thread that's running, even if it's joinable, it doesn't matter, it's still running that code and the kernel doesn't schedule it, and then the original thread exits from main; unless it calls pthread_exit, which just exits that main thread. pthread_exit just kills a single thread, and by default there are two ways to end the process: either explicitly, through returning from main, which is like an exit_group, or when every single thread in that process is dead, and the kernel keeps track of that. So pthread_exit in main will wait for all the detached threads to exit, and now it works like we expect. Here's a quick example, and then we'll have time to play with more stuff. This is just to see the attributes; you don't ever have to know this, it's just for reference in case you get into the weeds. For example, this is how you actually query the stack size of a thread: there's an attribute call, pthread_attr_getstacksize, and you have to do all this pointer crap. This is also how you can set a thread's detach state immediately through the attribute, if you want to create a detached thread without calling pthread_detach, although it's super ugly and takes a lot of code, and it's easier just to call pthread_detach, so you don't actually have to do it. But let's go ahead and quickly execute the stack size one, just to see that I'm not lying to you: the stack
size is that number, which if we convert it to powers of two looks to me like about eight megabytes, and 2^23 bytes is eight megabytes. So there we can see that the default stack size of your thread is eight megabytes, which doesn't actually seem like a lot, but hey, it works. Also, when you get a stack overflow, it just means you've used your eight megabytes of stack. [Question] Yeah, the kernel has to keep track of all the threads, and then when all the threads are dead, it can exit the process. And because threads are in the same address space and exist inside the process, you can have the situation where one rogue thread calls the C function exit, which kills the process. So one thread can kill everything, and then it's a race to see whoever calls exit first. This is where it gets tricky. All right, let's do the fun thing; we have a bunch of extra time for questions, so hopefully we can catch up a bit if we need to. Let's compare creating processes to threads. This is exactly like the lecture 4 example, where we created four processes and had them each print 10 times, and we saw that the kernel just puts them in any order it wants. Now we'll see the example with threads, and there is actually something we can do to make our program a bit better. Let's go ahead and walk through that real quick, starting at main. If we start at main, we can assume there's a new process that's just running one thread, called the main thread. The main thread goes into this for loop, which executes four times, and calls new_thread; it does some error checking and exits if there's an error, but for now just assume it calls new_thread four times with the arguments i equal to one, two, three, four. Then inside new_thread is where we have to pass variables through pointers and actually
use malloc. We create a new pointer location to store our argument, which is going to be an integer, so we have to declare space for an integer, and then, using that newly allocated space, we write the parameter we want to pass to the thread function there. So now the argument is a pointer that points to the number we want to give to the function. Then we create a thread with pthread_create, which takes the address of that thread, gets NULL for default attributes (so it's a joinable thread), the run function that we'll see next, and then that argument, the address that is pointing to an integer. Inside run: it has to take a pointer and return a pointer, and because we created our own threads (they only live in a process), we know what it points to, so we can cast that argument back to an integer pointer and dereference it. Now we can store the ID on our own stack, because a thread is like a virtual process, so we have our own stack that is independent. Then we can free the argument that the main thread malloc'd, because we don't need it anymore, and we're nice and always pair every malloc call with a free, right? Then we do the same thing as before: print the thread ID, and print the number ten times. There's also a sleep of a thousand microseconds in the loop, which is a millisecond; I do that because otherwise each thread would just run to completion and print all its lines at once, since your processor is pretty fast. So if I run it, this is what I get: everything just kind of executes whenever the kernel wants. The first round goes one, four, three, two; you can see they're executing concurrently because it's switching every single time and the number keeps incrementing.
The next time, the next numbers are one, four, three, two; then we get one, four, three, two again, kind of boring because we always seem to get one first. Oh, here we go: at eight we get one, two, four, three. So it's up to the scheduler, which we'll see later; you don't have a choice which one runs, and this goes on until the very end. All right, do we have any questions about that program at all? Do you want the ultimate test of your first-year knowledge? Knowing everything we know now about threads, address spaces, and stacks being local (I had this question in the other section, which was fun): what if I say, to hell with all this pointer nonsense, why would I do that? I know how to get that argument: instead of arg, I pass the address of id directly, and here, because we don't allocate anything, I can also remove the free. So I'm all good, right? Who wants to tell me what happens now? That is a very good guess and very close. So we've got: always four; we've got a segfault. Any other bets? It's going to go up by crazy amounts, like one to three to six. All right, any other guesses? All right, leading candidates: who's team segfault? We've got some at the back. Who's team always four? That's a good suggestion, and why would it be always four? Right: if you're passing one address to every thread and they all point to the same address, it might be the last value used. Any more team always four? What about team random garbage, but team always the same random garbage? So: new_thread is only called from the main thread, right, and these are all stack arguments. You were right that this id is on the stack, and as soon as new_thread returns it's out of scope, but it's not an invalid memory address, because it's still part of the stack and you can access it. Each call goes through the same sequence, so it gets the same address every time, because it's doing the same calls, so they
just end up with id being the same address for every single thread. Once we're done creating all of our threads, they all return, and we just have a pthread_exit, so the main thread dies but still has some stack space, and that stack got overwritten by some random value. So now every single thread is pointing at that random value, which was valid at some point; it's part of the stack, it's not a segfault, so they all print the same garbage value over and over again. Generally you only get a segfault when you access memory you're not allowed to; I can access my own stack as much as I want, unless there are some protections, otherwise your program wouldn't be able to work. [Question: the main thread?] Yeah, it's not dead, it just exited, so it's in that zombie-thread state; it's still all good, well, kind of all good. [Question] That's a good point: it might print the right value for maybe the first thread that runs. It just so happens that the way the Linux scheduler works, if your main thread is running, it's going to run for some time, and it's fast enough that it always beats all the other threads, so it finishes before any other thread executes. All right, this will make our heads hurt: if I add a sleep, this will print thread one, thread two, but do it for every thread, because it's still reusing that value. This is where it gets hard, because we're all in the same address space, and these things are insanely hard to debug. This is going to be the first time you'll say "I wish it segfaulted": now you're going to do stuff that should cause a segfault, but it just kind of happily works and does something insanely strange. And nope, Valgrind does not help with this. Let's debug it: oh, you took the address of a stack variable, that's very silly of you, don't do that. So there we go, we debugged it.
They're different stacks, but if you give out an address from here, that's always the main thread's stack, so every thread gets to access the main thread's stack. And it can be even worse: say I return an address that is somewhere in the main thread's stack, and then in a thread I start writing to that address; now I'm changing the main thread's stack. What if, for whatever reason, that was pointing at the return address of whatever function the main thread is supposed to return from? Then your main thread just calls some random thing, probably segfaults, and you'll have no idea what the hell happened; that's going to be pretty much impossible to debug. So the original is a good example, because it's the first example of doing something where really, really weird stuff happens. The correct version didn't pass any address from the main thread's stack: it malloc'd on the heap, and each thread sees the same heap, there's nothing weird, everything shares the heap. The main thread mallocs, passes the pointer as an argument, and then never touches it again, and the thread reads whatever was at that address and then frees it. [Question] Yes, if everything was pointing at the same address, instead of that garbage value I could have written whatever I wanted to it, and it would change for everyone. In fact, let's get some insight as to why things are difficult, why this course is hard, and why computers are hard, because up to this point programming has actually been fairly simple; this is where it gets terrible, so you can trust me on this one. Say I create a global variable, so it's not on the stack or anything, called count. Let's do something fairly sane: I'm going to have four threads, no prints this time, each goes 10,000 times and just increments count. (Yeah, the parameter's unused, that's fine.) So if I have four threads, say I want to speed up some program: I
have four threads, and I know they might be able to run in parallel, or at least concurrently, so it should make things faster. Four threads, each adding 10,000 to the same global variable: at the end of this program, what would I expect count to be? 40,000? 40,000 or random garbage? Oh crap, I have to reorganize this because the thread dies; to save time I'm just going to print the count at the moment every thread dies, so each thread prints its ID and what it thinks count is when it exits. If everything's working as nicely as we think it should be, the last one's 40,000, and before that maybe 39,999-ish, and so on. And, oh, perfect, no bugs, we're good... So this is why bugs can exist for seven years: this class of bugs, which we'll see in the next lecture, works twice, perfect, three times, bug free, bug free, bug free... okay, that's a weird order. Oh crap, there it is. The clean runs happen because one thread just executes all the way to completion before any other thread even gets scheduled, so it gets its nice 10,000. We'll get into why: there's something called a data race. The problem is this count: every thread doing this increment to the global variable has to load the value, then increment it, which it can do itself, that part's local to the thread, but then it has to write that value back out. If two threads load from the same address and then both increment it, they both increment to the same value and then both write that same value, so they just overwrite each other. [Question] Yeah, it looks like it went up by one instead of two, and you can see how bad this case is by how often that happens: in this run I got unlucky about 2,000-ish times. So two threads are both trying to read it and increment it, which the
increment part keeps local, and then both write that value back out to that same shared memory location. This problem happens whether it's concurrent or parallel: one thread could load that value, then switch to another thread, that thread loads the value, and at that point it doesn't matter what executes next, you're screwed. If that doesn't make sense, that's next lecture anyway. [Question] Yeah, the scheduler can context switch you out at any point, so we have no control; the kernel's just doing stuff. Luckily, or unluckily, in lab 2 you're writing the threads, so you get to decide when to switch. You're writing a fairly silly scheduler. The easiest scheduler: okay, if I wanted to give out a thousand pieces of candy fairly in this room, how would I probably do it? Throw them on the ground, scramble? No, I'd just go one, two, round robin, until I run out; that's the easiest scheduling you can do. All right, let's wrap up quickly, because that was fun, and again, all that data race stuff is next lecture. Oh wait, I have a few more things to say. So: threads are lighter weight, and each process can have multiple threads. Also, the fun thing, consider if you're an operating system: say you have multiple threads executing at once, what happens if one of those threads calls fork? And another thing to leave you off on, which is even worse: say you have ten threads and a signal comes in. All right, have a great weekend. No, this stuff is not on the quiz; the rest I'll talk about more in the next one.