Okay, good afternoon. Welcome back to operating systems. If you missed the initial announcement: there's chocolate up here, so go get it. I do not want any of it to be left at the end of the lecture, because I will make poor dietary decisions like I always do, so please eat it.

Okay, so today we're doing stuff that's more or less fun. It's still on topic, but it's not really covered on the exam, other than that we're going to revisit concepts we introduced in the course. So your first question might be: what in the heck is a kernel module? We know there's user mode and kernel mode, and that the kernel runs in kernel mode and has all the privileged instructions, all that fun stuff, but we don't actually know how to program in it. Well, the Linux kernel is written in C, and you can write your own kernel code if you want, and that's exactly what we'll be doing today.

Alright, so again, this will be recorded. We will make a kernel module today, and it will run in kernel mode. It will be a bit different from regular C programming: we're still using the C language, but because we are in kernel mode there is no standard C library. There is no printf, there is no malloc, there is no whatever, because everything has to be contained in the kernel, right? Whenever you boot up your machine there is nothing else: the kernel loads, and then the kernel is the only thing actually running.
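For orientation before the walkthrough, here is a minimal sketch of what a module like the one described in this lecture might look like. It is not the lecture's actual source. The file name "count" and the show function come from the lecture (spelled count_show here as a guess); the names count_init, count_exit, and entry, the log messages, the 0444 permissions, and the rcu_read_lock() around the traversal are assumptions added for illustration. It assumes a recent kernel (proc_create_single needs 4.18 or later) and has to be built against the kernel headers with the kernel build system, not compiled as a normal user-space program.

```c
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/sched.h>
#include <linux/sched/signal.h> /* for_each_process() on recent kernels */
#include <linux/rcupdate.h>

/* Handle to the /proc entry so it can be removed on unload (name assumed). */
static struct proc_dir_entry *entry;

/* Called whenever someone reads /proc/count. */
static int count_show(struct seq_file *m, void *v)
{
    struct task_struct *task;
    int count = 0;

    /* Walk the kernel's process list; the task itself is not inspected. */
    rcu_read_lock();
    for_each_process(task)
        count++;
    rcu_read_unlock();

    /* Whatever is printed here becomes the contents of the fake file. */
    seq_printf(m, "%d\n", count);
    return 0;
}

/* Runs when the module is inserted into the kernel. */
static int __init count_init(void)
{
    /* World-readable, no parent directory (NULL means directly under /proc). */
    entry = proc_create_single("count", 0444, NULL, count_show);
    if (!entry)
        return -ENOMEM;

    pr_info("count: module loaded\n");
    return 0;
}

/* Runs when the module is removed from the kernel. */
static void __exit count_exit(void)
{
    proc_remove(entry);
    pr_info("count: module unloaded\n");
}

module_init(count_init);
module_exit(count_exit);
MODULE_LICENSE("GPL");
```

Each piece (the headers, the init and exit hooks, the show function) is explained in the walkthrough that follows.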
So we saw a little bit of the proc file system, and we know those aren't actually real files because the file descriptors are wrong, but they kind of look like files: we can cat them, we can read them. So today we will actually create one of them. All this kernel module will do is count the number of running processes in the kernel, so if you read the file it should just give you a number back. It's pretty much all written for you, so we'll just go over it and explain it.

Again, we don't have the C standard library, so we don't have any of the standard headers. All of the header files, because it's still C, begin with linux, because they like doing things properly and have a proper namespace. There are a few headers we need. First we need module. A module is just a loadable piece of code that you can insert into the kernel while it's running. You can think of the kernel as a long-running process: whenever it starts, it keeps running, and the only difference between it and your processes is that it's a bit special in that it runs kernel-mode code. So in order to insert code into it, it's not like we can just start up another kernel, or another kernel process, or something like that. We have to actually insert code into the kernel and say, hey, Mr. Kernel, or Mrs.
Kernel, whatever, hey you, kernel: here's some code, please execute that for me. There are some triggers that will load a module for you, so you can say only load this if you find specific hardware, like if you're writing drivers or something like that, but we'll just write a module that loads whenever we tell it to.

The next header file: well, the kernel doesn't have printf or anything like that, but it has a kernel equivalent of printf called printk. Very clever naming: print for the kernel. Then we want to create a file in the proc file system, and there's a header file for that too, with a bunch of helper functions; it's just called proc_fs. Then there's seq_file, which is just a sequential file, so it's like a normal kind of file that you just read and write bytes to, and we need that because our file in the proc file system is going to look like a regular file. And then, since we're going to be looking at processes, there's a header file called sched, short for scheduling, that contains all the information about processes, threads, everything like that. In lab three you had your thread control block; well, what's in sched is the process control block. It has all the information about a process: where you had a thread ID, in here it has a process ID, and then a bunch of scheduling information and a whole bunch of other stuff.

So how does a kernel module work? Like I said, there's no main, there's no nothing, so you have to tell it what to run whenever it's inserted into the kernel and what to run whenever it's unloaded from the kernel. There are two macros for that, called module_init and module_exit, and you give each one a function: one that runs whenever the module is inserted into the kernel, and one that runs whenever it's removed. The insert is when we're going to create our file. So we just
have a pr_info. Unlike printf, which just outputs whatever you write to the terminal, the kernel has different log levels, so there are different print functions. You can say this is an error log, a warning log, an info log, a debug log, and then you can selectively change what level you want to see. We'll just use the info level, which means it's just some information, nothing super critical. So we'll print whenever we insert our module.

Then there's a function called proc_create_single, which will create a single file for us in the proc file system, and we will call it count. We give it some file permissions, then an argument for some additional information that we don't have to provide, and then another function that says what should happen whenever someone tries to read from that fake file. So we'll give it count_show. Then we make sure there weren't any errors; the kernel has a slightly different way of handling errors, but we don't really care about that for now.

And count_show: there's some magic we have to do. This struct task_struct is what they call the process control block in the kernel, so this is the real thing. If you're interested you can go see all the fields; it's absolutely massive, because it keeps track of literally everything about a process. All we're going to do is use for_each_process, a nice little macro that traverses every single process on the system. We're not going to use the task_struct itself; we're just going to count how many there are. So we introduce a variable called count and increment it, and then we output to the file. You can use seq_printf for that, and it will fake the contents of that file, so whenever it's read from we just print the count with a newline, and that's it.

And then whenever we exit, all it's going to do is say, hey, we're
going to print an info message, and then it's going to remove that entry from the proc file system, so it's not just floating there without anything backing it.

So we would compile it. It's already built for us, and it's just an ELF file like we know, but they give it a special extension: it's called a .ko file, or kernel object file. Realistically it's just an object file that's supposed to be a bit special. If we want to insert our module into the kernel, there's the insmod command. insmod says, hey, go insert this code into the kernel, and the kernel runs the initialization function in there, and hopefully that creates the /proc/count file and we can actually do some stuff.

Before we do that, let's verify that in /proc there's currently nothing called count. If we go through here there's consoles, cpuinfo, crypto, config, cmdline, so there's nothing called count. We'll go ahead and insert it into the kernel, and then all of a sudden in /proc, well, hopefully if everything has gone right, there's now a file called count. Cool. And it behaves like a normal file, so if we cat it, this gives us however many processes are currently running on our machine. The code that counts them is running in kernel mode because we asked for it. Under the hood, cat just does a normal read system call, and the kernel is like, oh, this person is trying to read from the file descriptor for /proc/count, so it should go to that show function we wrote, and those are the bytes that get returned. So you can see that's not a real file; we're just kind of faking it. Any questions about that? Kind of cool.

So if we look at dmesg: dmesg is a way to read all the messages your kernel is trying to print. Everything that has called printk prints to this log, and this is a way to read the kernel's log in user space. So -l
just shows everything that's at the info level. So here's when I was testing it earlier, and here is when we just inserted the module, and we can see it actually prints the message that we put in. We can also see there's a bunch of other messages that other things printed to the kernel log: systemd printed some messages, and the kernel is talking about how it found some USB devices. So this is a fun way to figure out what your kernel is actually doing, you know, if you're kind of sadistic and want to read some debug messages.

But yeah, that's actually just code running in kernel mode, and essentially all the safeties are off. If I actually knew what instructions to call I could crash my system, I could write random stuff to the hardware, I could kill init if I really wanted to, because the kernel is not there to protect itself from itself. It assumes that you know what the hell you're doing. In this case I don't know the exact way to kill the process, but we could do all sorts of really weird stuff and there are no protections at this point. Alright, any questions about that? So it's a real kernel module running in kernel mode.

Yeah? Yeah, so there's a bunch of stuff underlined in red because your IDE, like VS Code, assumes regular C and wants to read those header files. I haven't set it up so that it knows where those header files are, and all the other stuff depends on them, so it's just a bunch of red even though it actually works. I could get rid of the red if I configured it to look at the right header files. But yeah, there's nothing stopping it. Alright, any other questions? Because that was pretty much the whole plan for today.

Yeah? Yeah, every time I do a read, the kernel knows to route any read of that file to this count_show, and it would run this and
then this is what gets read. So I always output a number, and it can change every single time I read the file. Let's see /proc/count: before it was 175, now it's 173, so some processes died in the meantime. So every time you read it, you get the number of processes that were running at the moment you read it. Alright.

Yep? Yep, the final is cumulative, mostly focused on... where is it? Can I show you this? Probably, let's see. So yeah, I can post the first page of it in the Discord; I was planning on doing that anyway. Well, first off, there were some lab six things I needed to address. Alright, then we can talk about the final.

They haven't confirmed it yet, but right now, if you all want to write this down (I'll probably paste it in the Discord later), the questions are: short answer, 20 points; page replacement, 20 points; virtual memory, 10 points; processes, 10 points; processes and threads, 10 points; locking, 10 points; semaphores, 15 points; threads, 15 points; and file systems, 20 points. So the total is 130, and the goal is one point a minute.

Yeah? No. Yeah, I'll post the front page, but basically it's cumulative, with more focus towards the end. But some stuff, like, if I give you any program and say, hey, I execute it, it's a process, so I can't really get away from that at all. So for the review topics, there's a straight-up processes question. I wrote it a while ago, so I kind of forget now; I have a really bad short-term memory, which is kind of good, because it means I don't spoil the exam. But I think there's a virtual memory question, there's a processes question, and that's about it, and then processes and threads combines the two: if I have both of them, what happens?

So the rest of the plan: next lecture we'll do a little introduction to Rust. It will probably only take a short time, and it will be related to threads and stuff. Then we can start doing final review. I was going to bring up the past final; we can start it, then collect some stuff, and the very last lecture
would also just be review. But yeah. Alright, anything else? Questions, concerns?

Yep? Yeah, so on WSL you are not allowed to use kernel modules, because it's Microsoft's version of the Linux kernel. It's kind of a real kernel, but not really; it's a really weird thing. So you can only insert kernel modules if you have a full-blown virtual machine, like what we talked about last time, an actual type 2 hypervisor. Well, you don't actually need a type 2 hypervisor, just a real virtual machine and not Microsoft's weird kernel. Because yeah, it is a Linux kernel, but it doesn't have all the same stuff, and you're not allowed to load kernel modules into it. I actually don't know why, because you should be able to.

Alright, anything else? So, final review: it shouldn't be that bad. It is cumulative, but it's not too bad. The short answer questions also relate to the beginning of the course, at least some of them. It's more focused towards the end of the course, the material you haven't actually been tested on yet, because that makes sense.

Yep? Yeah, yeah, code. I don't want you to write code from scratch, because that kind of sucks on finals. There are questions where I give you code and then say, fix it please, or what has happened, or what should I do, things of that nature. So there's a bunch of examples where it's just a big code block and it's like, help. It's kind of the same format as the CS 111 fall final on the website; it's fairly similar to that one, although this one's probably a bit tougher. So like I said, that final is a good indication, and I tried to relate it to questions that came up during the course and what we could do to fix them. From the outline, obviously you should know how to use semaphores, so you can probably imagine I'll give you something like: here's some code I want to run in a specific order, how would you do
that? For threads, you can probably imagine there's going to be a data race thing, like, does this have a data race, and why? Things like that. But in terms of studying, as long as you did the labs and understood the labs, it shouldn't be too bad. The virtual memory lab was meant to make sure you understand virtual memory, so that question will be related to that; if you did the lab it shouldn't be too bad. Hopefully.

Yep? Yeah, virtual memory would be like: here's a virtual address, and possibly here are some memory values, so what would happen if I actually used this address and tried to translate it? If you did lab five, you know the steps: you start with the L1 or L2 and hop, doot-doot-doot-doot.

Alright, well, we have time, so we can start going over one quickly if we don't have any other questions. So here is where it would be: under courses we've got the CS 111 fall final. Can we see that? Not really. So the first question is page replacement, pretty much exactly what we had, right? Here are the following accesses, you have three physical pages in memory, assume the pages are all initially on disk, use the clock algorithm. Hint: probably a good one to make sure you know how to do. Oh, and now use the optimal algorithm. Okay, great, fairly straightforward; that should be more or less free marks.

Then threading, and we should know how to do this one too. This one's a bit weird because we're calling fork within a thread, so let's look at the question. We have int main, we have our socket, and some comment that says we set up the socket: call socket, bind, all that fun stuff. Then while true, accept. So this is obviously a server, although we don't really have to read this part to understand it. accept would create a connection, and that file descriptor represents the connection. Then we create a new thread, pass the file descriptor to the thread, and detach it. So remember, everyone: joinable and detached threads, probably a
good thing to know. Then within the thread, it gets the file descriptor, goes to sleep for five seconds, calls fork within the thread, and then checks if the process ID returned from fork is zero, which means this process is the child. The child would dup2 the file descriptor onto fd 1, so it dups that socket connection to 1, which is standard out, and then it execs ls. So it would basically print the output of ls to the socket. This is like a remote ls, which is a fairly dumb thing to do, but kind of fun.

So the questions would be something like this. Assume four clients connect, each one makes it to the call to sleep(5) (okay, you need to scroll over, stop it, alright) and none of them have returned from sleep yet. How many threads are there in total, and where are they executing in the code?

Yeah? Yeah, four clients. A client just means someone connecting; this is a server, so a client is just someone connecting to it. So four connections would be accepted. Whenever we start executing a process, whatever starts executing main is a process, and it has one thread, called the main thread, that starts executing there. By default, whenever you create a fresh process, it has one thread. So we have a main process, which has the main thread, and it starts executing this. We said to assume four calls to accept succeed, so it would call accept, which returns a new file descriptor, and create a thread; call accept again, create another thread; call accept, create another thread; call accept, create another thread. After four, the main thread would be stuck: it would be blocked in this accept call, waiting for a new connection to come in. And we created four threads, right? So our main thread is blocked in accept, and then we have four new threads that we created, and from the question they'd all be stuck in the sleep. So in total we'd have five threads: the first thread we start off with
and then the four more we created. Okay, any questions about that?

Okay, let's see if we can do the next one. All threads return from sleep and complete. What gets sent back over the socket, and why? Which I kind of accidentally spoiled. So what does each person that connected see over the socket, and why?

Yeah? Yeah, so each one would see the output of ls in whatever the current directory is, which is probably more information than you actually need to answer this question: just the output of ls, whatever it prints. And that's because of this dup2, right? We're replacing standard out with that socket connection, so when we call execlp and it starts running ls, ls just writes to file descriptor 1, and file descriptor 1 points to the same thing as the socket, so all the output from ls gets sent over the socket. Everyone's used SSH, right? This is essentially a really scuffed SSH that always just executes ls and doesn't care what you type to it. If you want to expand on this, guess what, SSH kind of looks like this, minus, you know, some security and all that stuff, but who cares about security, right?

Alright, so: how many processes did we create, and it says excluding the original process? Well, in this case we would have created four. And the wording is mostly a hint: I wrote it as "exclude the original process" so you could be like, oh wait, did that have a thread? Why does it say exclude it? Should I have included it in the first part? So that was hopefully a little hint. But in this one we created four processes, excluding the main one, because each thread called fork and created a new process.

So then: oh no, what did we forget to do with the newly created processes? Yeah, you have to call wait on them. We're awful parents, right? You always need to wait, or at least argue about waiting. So it says: why is this especially an
issue in a case where a server runs for a very long time and may end up serving thousands of requests? So let's look at this. Every time someone connects, we call fork, and we didn't wait on the child, right? So we made a new process, it probably terminated because it sent something back, and now which state is that ls process in? Anyone? What's the best zombie thing now, The Last of Us? Yeah, so they're all zombies. Every time a client connects, we make a new process, exec ls, and then we don't wait on it, so we create a zombie process. So why is a zombie process bad?

Yep? Yeah, so it still has a process control block, so it's wasting resources. The kernel is not allowed to delete it because someone needs to read its exit code. Alright, any questions about that? So you can imagine that if this runs for a really, really long time and thousands and thousands of people connect to it, well, it's a zombie each time. Even if every zombie only wastes, you know, a hundred bytes or something like that, which isn't much, after a few million times that's a lot of space wasted for nothing.

The next question was: what would happen if we did not fork and instead just always executed dup2 followed by execlp directly in each thread? So this question just says, essentially, if I delete the fork, what happens?

Yeah? Yeah, so essentially whatever thread makes it to this execlp first... well, I only have one process, which is supposed to be the server that's accepting the connections, but execlp replaces the currently running process. So the first thread that makes it to execlp calls it, and now my server process is dead. It would have sent the ls output over the socket to one client, and then that server process no longer exists. It would have been dead, wiped off the
face of the earth, nothing more. So only one client would see output. The rest of the people would be like, oh, server not found, server's dead, cannot connect, or something like that. Alright, any questions about that one? Cool, on a roll.

Alright: what would be the issue if we removed pthread_detach? That's a good one. Yeah? Yeah, so it's pretty much exactly the same thing, right, except we have zombie threads. Here we don't actually use the return value at all, and we don't need to join, but if we remove the detach, it's a joinable thread, which means its thread control block won't get cleaned up until you join it. If we never join it, then guess what, it's essentially the same problem as before where we don't wait: there's going to be a zombie thread that wastes some resources and takes up a thread ID at the very least. Alright, any questions? So that's a threading question. I can tell you, from when I wrote the exam, that this is similar-ish to what you have, and this one might actually be harder than the one I gave you.

Alright, yeah? Do what with the created thread? Oh, yeah. So the reason we have to make a new process every time is that, in the case where we don't fork, whatever thread makes it to execlp first transforms the process, and now that server is dead, right? Whatever was executing main, that process no longer exists once we hit execlp; it's been replaced. Crap, I should probably rewrite your exam. Actually, what I should probably do is stop talking.

Okay, locks. We should be able to at least do this question today. Locking is typically one of the hardest things we have to do. So: you're given a transfer function to safely transfer funds between accounts. When did we see that before? Well, we had a whole bank sim, so hopefully this will be familiar. You decided to refactor it such that there's a separate function for
deducting, that is, removing funds from an account. Assume that multiple threads can call transfer simultaneously using different accounts. Consider your initial implementation. Here we have an account that has a mutex, a name, and an amount. In transfer we first check if from is equal to to, so if you're trying to transfer to and from the same account, it just returns and doesn't do anything, and you don't have to worry about that case. Otherwise we grab the lock for the to account, then call deduct and add whatever we took away from the from account, and then unlock the mutex. In deduct, we grab the lock for the account, which in this case is always from, then we check whether the account actually has enough money; otherwise we set the amount to zero, which means it won't do anything. Then we unlock and return the amount.

Now, the first part says: does this code have any data races? Come up with an example of a data race if it does, or justify why it does not. So I'll give you a little bit to look at that and see who wants to justify some data races.

Oh, also, the teaching station is broken, so I can't read the Discord chat, and I can't read it from this angle, so someone will have to repeat it for me. If it's about the midterm: it wasn't, like, super high, so the question is what the realistic average should be, as opposed to what you guys are used to. A goal average is probably, I don't know, 75-ish or 80. I kind of set it so that if it was an average class doing average, it would be 75. Hey, you guys are better. It's fairly objective, I think; I think 80 was fairly fair for that. I don't know, it's like, what's the exam supposed to test? If everyone just fails it and the exam average is 50 and everything gets scaled up, does that really evaluate how well you learned anything? Probably not. It just evaluates
how well you can suffer. So yeah, I'm not terribly concerned. I'm more concerned about that midterm average with the labs pushing the grades up. But we'll see if the university yells at me. So far, in my whole history of teaching courses, I have never curved a grade or adjusted anything, so if the university yells at me, so be it. I think the labs are probably weighted a bit much, but I guess we'll see.

Alright, anyways, this question. Data races? Yeah, so in this case, whenever I modify any of the variables associated with an account, I hold the mutex for that account. So there are no data races, because only a single thread can run at a time between that lock and unlock, also called a critical section. Remember the definition of a data race: two concurrent accesses to the same location, where at least one is a write. In this case I only have one access at a time, so I don't have that problem. Here I add to to's amount only while I hold the to lock. And here I only subtract from this account's amount, which in this case is always from, while I actually hold its lock. Here I just return the amount; I'm not modifying anything, I just return whatever I read.

Yep? So yeah, there are some points where the result might be a bit different depending on the order, but at that granularity the order isn't really deterministic anyway, because doing one transfer and then the other is different from doing the other transfer first. That's not a data race. A data race is about bad values, but this is just the order of transfers, which we can't really determine. So we don't get any wrong values here; we just don't have any guaranteed order between transfers, because that's just the way it is. Okay, so the answer is that it does not: all accesses to the account's
amount are protected by that account's mutex, and that includes reading, subtracting funds, and adding funds.

Next question: does this code have any deadlocks? Come up with an example of a deadlock if it does, or justify why it does not. Alright, so now deadlocks. Is that true? Yeah, yeah, but go back to your first point. One of the things you said is: don't try to get a lock while I already have one. Well, that's one of our conditions, so let's just read the code. Imagine we have from as account A and to as account B. Well, transfer would acquire lock B first, since it locks the to account, then we would call deduct, and then we would try to get from's mutex. So we hold lock B and then we try to get lock A, and that's already not a comforting sign.

Yep? Yep. Yeah, so we've identified, at least initially, that there's a case where, even with a single thread, you hold one lock and try to acquire another lock. That should be setting off alarm bells. I don't know what the reference is for you guys; I would say the red alert thing, but probably no one watches Star Trek. Anyways, that's a bad sign. So suppose I have a transfer from A to B and, in another thread, a transfer from B to A. What could happen is one thread gets here, the A-to-B one, and it acquires lock B. Then we could context switch over to the other thread, where from is B and to is A, so it would try to acquire lock A, and it would get it. So now thread one has lock B and thread two has lock A, and whenever they call deduct, they go into this lock, and, guess what, each is trying to acquire a lock held by the other thread. So they're deadlocked. So the answer is yes, we have a deadlock, and here's where it is: like I just said, if we have one thread doing A to B and another thread doing B to A, then we can have a deadlock. Alright, so it says:
given the issues you found (so we found that there were no data races, but we did have a deadlock, which is not good), explain how you would fix the code to prevent these issues. And it says you may not significantly refactor the code: transfer must add to the account and call deduct, and deduct must safely decrement the account only if that won't leave a negative amount; otherwise the amount deducted must be zero. deduct always returns the amount deducted. You may write replacement functions or explain yourself clearly. So what would we want to do?

Yeah? Yeah, so his suggestion is that we essentially save the amount returned from deduct, save it to an integer or something like that, and only after that acquire the mutex for to, add the amount, and unlock it. That way we never hold two locks at the same time. So that seems credible. Any other ideas? Yeah? Yeah, so here the accounts only have names. So let's see.

So this essentially has the idea that he explained. In this case we just don't hold two locks at the same time. We call deduct; deduct locks from and then unlocks from before it returns, so whenever we return from deduct it's safe and has no data races, and we don't have to change anything there. We just save that amount, and after that we can grab to's mutex, add that amount, and unlock it. This is safe because we're only adding funds to the other account, right? We only hold one lock at a time, so we don't have hold and wait or anything like that.

So, to prevent a deadlock, we can either always acquire the mutexes in the same order in every thread, or we can eliminate hold and wait, and in this case we eliminated hold and wait. There are two ways to eliminate hold and wait: just don't acquire another mutex while you hold one, or, if you hold a mutex, be able to give it up, with something like trylock.
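The fix discussed above can be sketched in C with pthreads. This is a minimal reconstruction from the spoken description, not the exam's actual code: the struct layout and the transfer and deduct signatures are assumptions, and the job struct, worker, and run_opposing_transfers are hypothetical helpers added only to exercise the code.

```c
#include <pthread.h>

/* Assumed account layout: a mutex, a name, and an amount. */
struct account {
    pthread_mutex_t mutex;
    const char *name;
    int amount;
};

/* Remove `amount` from `from` only if the balance covers it.
 * Returns the amount actually deducted (0 if funds were insufficient).
 * Only `from`'s lock is held, and it is released before returning. */
int deduct(struct account *from, int amount) {
    pthread_mutex_lock(&from->mutex);
    if (from->amount < amount) {
        amount = 0;
    }
    from->amount -= amount;
    pthread_mutex_unlock(&from->mutex);
    return amount;
}

/* Deadlock-free transfer: deduct() has already unlocked `from` by the
 * time `to` is locked, so no thread holds one lock while waiting for
 * another (hold and wait is eliminated). */
void transfer(struct account *from, struct account *to, int amount) {
    if (from == to) {
        return;
    }
    int deducted = deduct(from, amount);
    pthread_mutex_lock(&to->mutex);
    to->amount += deducted;
    pthread_mutex_unlock(&to->mutex);
}

/* Hypothetical test helpers below. */
struct job {
    struct account *from;
    struct account *to;
    int rounds;
};

void *worker(void *arg) {
    struct job *job = arg;
    for (int i = 0; i < job->rounds; ++i) {
        transfer(job->from, job->to, 1);
    }
    return NULL;
}

/* Runs A->B and B->A transfers concurrently, the pattern that deadlocked
 * the original version, and returns the combined balance afterwards. */
int run_opposing_transfers(int rounds) {
    struct account a = { .name = "A", .amount = 100 };
    struct account b = { .name = "B", .amount = 100 };
    pthread_mutex_init(&a.mutex, NULL);
    pthread_mutex_init(&b.mutex, NULL);

    struct job a_to_b = { &a, &b, rounds };
    struct job b_to_a = { &b, &a, rounds };
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, &a_to_b);
    pthread_create(&t2, NULL, worker, &b_to_a);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    int total = a.amount + b.amount;
    pthread_mutex_destroy(&a.mutex);
    pthread_mutex_destroy(&b.mutex);
    return total;
}
```

Calling run_opposing_transfers from a main function should return once both threads finish, with the combined balance unchanged. The other option mentioned, always locking both accounts in one fixed order, would keep the whole transfer inside a single critical section, at the cost of changing transfer and deduct more than the question allows.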
Yeah, so this doesn't have any data races and it won't deadlock, but you could argue that if you get really unlucky, it might break what a transaction should be, which is a problem specifically if you actually ran a real bank. In terms of deadlocks and data races we don't have any of those, but it's a bit silly in that, if you actually had a bank, you would care about a transaction as a whole thing, so you'd need a slightly different approach. But yeah, your example on the exam doesn't use a bank. It's not too big of a deal if there are several deducts in a row, because deduct will never make the amount go negative. I believe if you think about it really, really hard you can see some weird orderings where, if you have a long string of accounts where you expect the transfers to behave like transactions, you might get slightly different answers than you expect.

Alright, well, oh wow, another locking question, great. This one I would probably title, for your exam, semaphores. This one is way longer than yours, but yours might require some more thought. I'll probably leave this one for next lecture. And yeah, fill the Discord with questions and stuff for your exam; on the last day we're going to have lots of time to do essentially whatever you want. We've done the first three questions of this exam, and I think there are only eight questions in total, so we've already done over a third of the exam. So that's pretty much it.

Oh, right: fill out your course evals, please. No, I can't threaten you, never mind, just please do it. Your midterm grade is great; that means you should be nice. Oh yeah, the design fair for ECE also starts tonight, if you want to see some projects. I have one group tonight and then two tomorrow. It starts, what, in Myhal 150 at 8 for the public, I believe. So if you're around, you can check that out if you want to see what fourth-year projects look like and get some ideas for
yours, maybe. So yeah, that's pretty much it. Just remember: pulling for you, we're all in this together.