Alright, welcome to ECE 353, Systems Software, or in other words, Operating Systems. So, who am I? I am Jon Eyolfson, your instructor. Do not worry about trying to pronounce my last name. I did not pronounce it correctly for 25 years, until some nice Icelandic man was very disappointed in me when I tried to pronounce it myself. So, don't worry about it. I'll try to learn your names throughout the course, and if I mispronounce your name, forgive me, because I don't know how to pronounce my own.

My name is a bit of a mess to spell and pronounce, so I will teach you a quick little secret for doing both. If you want to pronounce it without screwing it up beyond all recognition, like when I get phone calls from the bank or something like that, think of it like this. It was just Christmas time, just the holiday season, so we can all pronounce "elf" perfectly fine, hopefully. So, "elf" and then "son": "elf-son" is a perfectly good way to pronounce it. Jon "Elf-son" is usually what I use, because I can't even pronounce it, right? So that's the easy way to say it, or just Jon. And if you need to spell it for some unforeseen reason, the easiest way is to shove the L and the F over a little bit and, in that hole, just throw "yo" in there. Then you're done, you can spell my name now. Hopefully you never have to.

So, why are we taking operating systems? Throughout the course, you'll come to understand how an operating system works, and it will make you a better programmer. This is probably the most important course to take at U of T if you want to do software, and I'm not biased or anything like that. The reason is, well, there are only two types of software you will ever write in your life. You're either going to write software that interacts with the operating system, or that software is going to be the operating system, and there's no other option.
So, no matter what you do, knowledge from this course will help you in either case.

A little admin to get out of the way: some important URLs for this course. I have pretty much everything public. All of the lecture slides and everything are just on my site, so you don't have to worry about logging in to Quercus or anything like that. I just use Quercus as a central location for all the links for the course, and your grades will be on there. Aside from that, I don't really use it. What we will use is actual modern software and tools, things you will actually use in industry. For code submission, I don't take zip files or anything like that. You'll be using Git. If you haven't used it before, you'll learn it as part of this course. It might take a bit to get started, but you'll have some time, because the labs start slow and then get progressively longer as we go on. So, we'll be using GitLab. For online discussion, and also for taking questions during the lecture, I use Discord. I'm active on there. We use that instead of Piazza. There are forum posts on there, and if you have a quick question, it's way easier to get it answered.

Also, if you don't want to ask questions in person, like raising your hand during the lecture, or you're sick and staying home, all the lectures are live streamed on YouTube and all the recordings are there too. Again, you don't have to log in to anything. Everything's there. But please do that as a last resort. Please still show up to the lectures, because if you don't, I get very lonely sitting up here. It's also the easiest and fastest way to get your questions answered because, well, I try to pay attention to Discord, but I might not. So, as an example, just post in the lectures channel in there. That's what I can see up here.
So, if you have any questions you don't want to raise your hand for in the lecture, or if you're not here and are watching the live stream, you can use the lectures channel in the Discord and I will try to answer there. Again, no guarantees, but I try. And that's it.

I think over half of you are already in the Discord. You can follow that link, and it also has the benefit of setting up your account on the GitLab server, synchronizing you, and joining you to the Discord server as well. That site currently has nothing on it, really, so if you see something like this after you click the link, that is normal. It means your accounts are connected, and in your actual Discord client, the web client or the app, you are already in the server. You don't have to do anything else. All right. Any questions about that? That is our housekeeping, how the course will be structured.

All right. So, like I said, please still show up to the lectures even though we have all this nice stuff. You can not show up if you really want to, but again, it's much faster to get feedback if you're actually here in person. We'll have live coding as part of all the examples, so telling me to break stuff is probably the best way to learn things. If you break things yourselves, it might be hard for you to figure it out. Better to just ask me, and I can relate it to the course and all that fun stuff. If there's anything at all in the course that I can do to improve or make things better, don't hesitate to let me know. I like feedback. I like improving the course, and hopefully it gets better and better over time. At least that is my goal.

So, boring evaluation for the course. We're going to have six labs that are graded. There's a lab zero that helps you set up. So, lab one is broken off into two parts; the coding part is really short.
There's like, you have a week to just set up the environment and get used to everything, and then you have a little questionnaire at the end to make sure that you did it, and it's worth 1%. It should be a free 1% for you. Other than that, we have six labs. They're worth 4% each. We have a midterm already scheduled for February 26 at 6:30, and that's worth 25%. And then we have a final worth 50%. Yay, joy.

And the final will be... oh, I should ask. Okay. So, currently the final is one where you get a single cheat sheet, and you can write whatever you want on that. I'm also open to, well, I don't want to make it open book, because people just bring all their lecture slides, print them out, and kill a bunch of trees. So, I'm open to making it just closed book and then providing you with what you need. Let's do a quick little poll. Do we want a cheat sheet? Hands up for cheat sheet. Ah, good. All right. Reset. Hands up for just closed book. Some of you are double voters. That's, like, actually split. Yep?

To what extent, like APIs, things like that, anything? We haven't taken the course so we don't really...

Yeah, so...

If you give us an example of, for example, what people did put on a cheat sheet, would you provide the exact same, or is it going to be less or more?

So, at least the last time I did this course, people put the lecture slides on their cheat sheet and said that during the exam they didn't use them at all. That was what they said. So, it doesn't really matter to me which one is picked. Yep. All right. So, probably closed, I guess. I'll probably do closed. I'll put another poll in the Discord, just to be sure that the people who showed up today aren't biased or anything, but it seems pretty evenly split and you don't really know what to expect yet. So, if people don't care, I'll probably do closed, because I think it would be better and easier, but we'll see. All right. Oh, yep. Compared to what? Sorry. Yeah. No.
So, labs are just weighted less now because, essentially, I got yelled at for the grading. I got yelled at by people, so that's why it's adjusted. The labs are pretty much the same as in previous years.

All right. So, academic policy: do not cheat. Please don't do it. It's a software course. It's really easy to see if you just copy-pasted from other people. Don't do it. The rules are: you can study together and discuss concepts together. Don't give each other your code. You can ask me if anything is unclear at all, or if you need help going through your code. There are also lab sessions where the TAs will go through your code with you. So, you should have no excuses. Cheating is not tolerated. Please do not make me go through this process. I had to go through it last semester with people cheating. It is not pleasant for you. It is not pleasant for me. You have to meet with a bunch of people. You would have to meet with me, and I'd have to show you all the evidence, which is probably really, really obvious. Then it goes to the department, and after the department it goes to someone else, and you just have to meet over and over again with people. If you really want to cheat, the labs aren't worth that much, so just don't do it. You'll probably be better off for it. A lot of this course is practicing, getting used to doing things, and actually understanding what is going on. Most of the labs are designed to help you actually understand what's going on, so not doing them will only hurt you.

So, the recommended books that complement the lectures. There's Operating Systems: Three Easy Pieces, which is available online as a PDF. I don't follow it directly. It's just nice if you want something to complement the lectures. It goes over the same concepts, but I don't use it. You don't need anything except the lecture slides, me, and asking me questions. Aside from that, this course is all done in C.
So, if you're rusty on C, you can look at The C Programming Language book, by the people who actually wrote the language. Some background skills you should practice again if you need to: well, C programming and debugging. For debugging, you don't really have to use GDB if you don't want to. Just printf debugging is fine for this course. But following programs logically, thinking about what is actually going on, and knowing that computers don't just do random things, they do exactly what you tell them to: being able to apply logic to any problem you encounter is the number one skill that you actually need to have. Being able to convert between binary, hex, and decimal might be nice for some things, but it's not really required, and hopefully it isn't new knowledge to you. Little-endian and big-endian may pop up, and you should know what that means. And the most important thing, which you should definitely, definitely know, is that memory is byte addressable. The smallest unit you can address is a byte, which is 8 bits. People really get tripped up on that. That is all memory is: just a bunch of bytes. And memory addresses are just pointers, so they mean the same thing.

So again, please provide feedback whenever you can. The course is challenging. Let me know if anything is unclear. It will start off really, really, really fast, and then after the midpoint it will actually chill out a little bit. But this will be your first real programming course where you encounter real-life problems, when you try to make something go fast and have multiple things going on at once. In the world you're used to right now, you just write a program and you run it. That's it. That is no longer true. So, that will make things very, very fun. So, again, ask questions.
All programs interact with the OS, so even if it's not directly relevant to something I am showing you, you can just ask a question about it. You can even ask questions about different languages or anything like that, because guess what? No matter what language you use, it all goes to the operating system at some point anyway, and you will learn that throughout this course. So, by the end of this course you'll be a better programmer. Maybe not a better C programmer, but you will be a better programmer no matter what language you choose to use.

So, the main goal of an operating system is to manage resources. We've seen the software stack before, where there are applications, or programs. They use the operating system, so really they request resources like CPU time, memory, maybe disk space, something like that. The operating system is the thing that manages the resources, so it interacts directly with the hardware. It figures out what memory to actually use, what application to run, and where things are stored. The application part is what you wrote in your first introductory C course. For ECE students, it's APS105. I forget what it is for EngSci, but you have an equivalent intro-to-C course. Then there's a hardware course that you should have also taken. Again, excuse me, I don't know the course code for the EngSci version, but it's the one with assembly, where you actually interact directly with the hardware and you don't have a pesky operating system getting in the way. This is the course where the operating system actually matters, because we need to generalize things.

So, there are three core concepts in operating systems. The first, which will come up over and over again in this course, is something called virtualization. It allows you to share one resource by mimicking many independent copies. We'll see what that is in just a few seconds.
The second is concurrency: we have to handle multiple things happening at the same time. For instance, all of your CPUs nowadays have multiple cores that can execute multiple things at the same time, and we will have to utilize that. Once you start running multiple things at the same time, you encounter new and interesting problems, which will be one of the confusing parts of this course. The next is persistence, which allows you to retain data consistently without power. That's like actually saving files on an SSD or something like that, and we won't get to it until the very end of the course. That's probably the least important topic. The first two are the very important ones.

So, there's this fun quote I like, which I was introduced to in fourth year and am now introducing to you in third year. It is really, really true: all problems in computer science can be solved by adding another layer of indirection. What does that mean? Well, instead of solving a very, very difficult problem, I just kind of generalize it, give it a new name, and solve that problem instead of the hard one.

One of the first abstractions, or layers of indirection, they kind of mean the same thing, is something called a process. Before, we made a program, which was just a file containing some instructions and some data required to run. Now we need a concept for what we call a program that's actually running, because, well, we can run multiple copies of a program at the same time, even though we might not have thought about it before. What we call an actually running program is a process. One program can have many processes running at a single time, and they should all be independent of each other. So, the basic requirements of a process, what the operating system needs to provide in order to run it: well, in that process there needs to be some concept of something called virtual registers.
So, if you were just using assembly, right, you just use the actual physical registers on the machine. But if you're running a process, and you can run multiple processes at once, or switch between them, or something like that, the registers of your program can't just change randomly. Register A can't just randomly change value. No other process should be able to change the values of your registers. To solve that problem, the operating system has to create something like virtual registers that are independent for each process, so they don't interfere with each other.

The other thing a process needs is, well, some memory. It needs a stack. Hopefully we all know what a stack is. Everyone know what a stack is, ish? All right, good. I see some head nods. So, the stack: that's where all your local variables go. Up to this point it wasn't really important; in this course it will be important. And then the final one is the heap, which is the region of memory that gets used whenever you call malloc or do any other dynamic memory allocation.
All right, so I have a question for you. This is probably something you have never thought of before. So, here is a very, very simple program. Let's go through it. It starts executing at main. We'll figure out later that that's not really true; this course is just debunking all the lies they've told you throughout your programming career so far. What this does is create an int called local, initialized to zero, and it also has a global variable called global that we initialize to zero.

Anyone know what static means here, or have they not taught you that? Anyone want to tell me what static means? Yep? No, not quite. Yep? Not quite. All right. So, if I just did something like that... whoops, I deleted too much. If I did something like this, it's just a global variable, right? Nothing special about it. In C, if I put static in front of it, it's still a global variable. All it changes is where I can use it. If I put static in front of it, it means I can only use it in this C file; I can't use it outside of it. That's all it does. It doesn't actually do anything else special. If it's in different places it actually does different things, but for now, if you haven't seen static, it's just a way to say, hey, I have a global variable, but I only use it in this C file. Why would you do that? Well, if at some point I stop using it, the compiler will actually be smart and tell me, hey, you don't use this anymore. If it was a plain global, it wouldn't be able to make that determination, because it might be used in a different file.

Then this just has while (1), so it is essentially an infinite loop. It increments local, increments global, prints out their values, and then sleeps for a second. So, anyone want to tell me what happens when I run this program? Anyone? Yep? Yeah, infinite loop. And what am I going to see? It shouldn't crash. Yeah, it should just count up once every second, right? Print 0, 1, da da da. So if I run that, it should just be like local is one, global is one, and they just
keep incrementing over and over again, right? So, I can leave that running, switch to a different terminal, and run it in that one too. That local variable should be unique to that one process. Let's see what global is currently at. Global is currently at, like, 25, let's say. If I run this program again, well, I don't see global at 25. They look completely independent of each other, right? How is it possible that I'm using the same global variable in one and the other, but they have different values? That looks weird. Anyone ever think about this before, and how it could possibly work? Hmm. Yep? Yeah, so it's being executed in two different places, and maybe the memory is in different locations, right? Like, the memory for the global variable is in a different spot in one than in the other. That would make sense, right? Is that what you were going to say too? Virtual registers? So, those variables would be stored in memory at some point.

All right, well, let's see if that's true. We like pointers, right? So let's just print the address of the local variable. Whoa, didn't mean to do that. If I do that, it'll print the address of that local variable, and we can see if it's different each time I run it. In this one, I run it and the address ends in ee8c, and in this one it ends in ee7c. So they are different pieces of memory, and that kind of makes sense for how this would work.

What about the address of the global? Hmm. In order for this to work, they would have to be different addresses, right? Everyone agree with that? They would have to be different addresses; otherwise they're both reading and writing the same memory location, and one should be able to affect the other. So, let's print whatever the address of the global is. The address of the global is something ending in ac0044. It's counting up, it's counting up. Let's switch over and launch another one. Oh, it's something ending in ac0044. See, the exact same memory address, but in this process it's currently 10 and
this one's 21. How is that possible? Yeah. Yeah, so this is where virtualization immediately comes in, and you know it has to be a thing. So, let's go back. The memory a process uses, well, it doesn't just use physical memory directly. That has security implications, and you don't even want that anyway. Each process has something called virtual memory. Each process has its own independent view of memory. It thinks it has access to all the memory, and the operating system creates the illusion that that is true. In this case, both my processes, which were running the exact same program, looked like they were accessing the exact same memory location, but they were completely independent, because one value did not affect the other. So there must be some virtualization at play here. Virtual memory is something that is part of a process as well. Any questions about that example? You never thought of that before, right?

All right, so how does the OS actually allocate different stacks for each process? Well, this is what we said could be true if they are at different memory locations, and it looked like this was true. They need to be in physical memory, so often the operating system just allocates, you know, some unused memory to the stack. That could be true without having virtual memory, and we kind of verified it, because the addresses of the local variable were different between the different processes. It turns out they didn't actually have to be; that's actually a security feature.

For global variables, how they actually work is that the compiler doesn't do anything that tricky. All it does is pick a random memory address for each global variable, and the only thing the compiler does is make sure that whatever it picks is consistent. So it just picks a random memory address for a global variable when you compile, and that's it. If you only had access to physical memory, and you had to coordinate all of physical memory across all the processes, that would be next to
impossible because, well, you would need something like a global registry of global variables, and you would run out of space, or you'd have to request so many that maybe your global variable could only be one byte or something like that. There'd be lots and lots of issues with that. You'd also have to know your memory addresses ahead of time, before you compile.

So, this is what your memory layout could look like. We have some physical memory, and between address 1000 and 2000, that's memory for process one, and above that, that's memory for process two. That way they're not interfering with each other; they're independent. But we run into problems when process one wants to use more memory than has been set aside for it. And this quickly gets out of control. Most people don't run one application at a time. It looks more like this, where you have Firefox and Chrome, which use, you know, gigabytes and gigabytes of memory. Imagine you had to set aside some memory for Chrome ahead of time. That would have to be the most memory Chrome could ever use, and I don't know about you, but as soon as I get to 30 tabs or so, it's like 12 gigabytes or something like that. It grows really, really quickly. And, well, you might have multiple web browsers, so set aside some memory for Firefox too, something like that. It quickly gets out of control. That memory space is also wasted if you're not actually running the program. Even if I'm not running Chrome, the memory would have to be set aside in case Chrome was running, to make sure that nothing else interferes with Chrome's memory.

So, this was the example we used: two processes ran the same program, and we came to the conclusion that, oh, well, because the global variable has the same address, they have to use something called virtual memory. That solves the problem of having a big global registry or something like that, where every program needs to declare how much memory it's
using up front. Each program can just use all of the memory, and it is up to the operating system to figure it out. That figure-it-out part is what we do in this course. Virtual memory is one of the most fun topics in this course. It's probably the hardest or second-hardest one when we actually go over how it's implemented, but it's really, really important. It's why all your programs work, and it's pretty much why we have operating systems in the first place.

So, what did we find? The address of the local variable was different between different processes. It has to be in a different physical location, and what we printed could have been an actual physical address. But we know from running this that it doesn't have to be, because the address of the global variable was the same between the two processes and it had different values. So we know that virtual memory does exist in our operating system.

Is there anything else that we may need for a process to run? What other information does it need? It needs some virtual registers, some virtual memory. Anything else it could use? Yep? Yeah, storage: accessing files on the disk, and actually interacting with monitors and things like that. We'll get into that. That is the last part we'll need, and we won't get into it until we approach the middle-ish of the course.

For now, this is what we need to keep in mind when we think about a process, which is literally any program or application that runs on your machine. This is true even on your cell phone. iOS, whatever, it's an operating system, it has processes, and this is what they look like. Within a process, you would have to have virtual registers, because, well, you can't use the registers directly. If you switch which process you're running, you don't want to come back, resume your process, and suddenly find the register values changed. That would be a nightmare to debug and you would never get it right. So there's
virtual registers as part of a process, and then the stack and the heap, and indeed any memory the process uses, are part of virtual memory. So, two main parts: the process already has two virtual parts to it. We took our first concept and we are running with it already.

So, how I structure this course is that all the example code will be available to you. I have not made your accounts yet, but I will shortly. By Thursday at the latest, when you start lab zero, you'll have access to all the code repositories, including your own. They'll be posted at this location, which right now you probably don't have access to. You can clone that repository, or get a copy of it on the virtual machine you'll have set up, and you can compile it, fiddle around with it, and ask questions about it, or just use it to see how it's done.

Another example that shows what kind of code we have in this course is this read-four-bytes program, so let's go over that quickly. Don't worry if you can't really read this yet, because we'll work up to it, but this is the style of code we will have in this course. We have to include a bunch of library header files, and then, well, let's just start with main. Main can use command-line arguments through argv and argc; hopefully we've seen that before. In this case, the program just takes a single argument that is a file name. If you haven't dealt with files at all, there is a function called open that takes a file name, in this case the first argument from the command line. It opens the file for reading and returns this int, and we will learn what that is as we go through the course. Again, don't worry if this does not make complete sense yet, but it is called a descriptor. Then what we do here is create an array of characters that is four bytes long. One thing you probably want to refresh is how big things are; that might be useful. A character is one byte, so this array is four
bytes. And if you don't know how to use arrays, you should, because the data structures in this course will not be very complicated. Arrays are pretty much all you need, and then possibly linked lists. What this does is call read, which reads the contents of the file into this buffer, and then we have some things that check for errors. What read returns is the number of bytes it has successfully read. In this case, hopefully, it's four bytes and it filled up that buffer. Buffer is just another name for some location to store some data, so here the operating system is writing data into that buffer array. Then we have a for loop; hopefully we all know what for loops are. It goes from zero up to the number of bytes read, not including it, so if we have four bytes, this should go through zero, one, two, three. Here we just get the character in the buffer at that position. This if statement checks whether it's a printable ASCII character. If it is, it just prints it as a character; otherwise it prints whatever the byte is as a hex value. Again, this is more for some low-level knowledge. It isn't actually important to the course; it's more just for fun. I have a sick view of fun, and this is fun to me. It may not be fun to you, but it will be as we go on, so don't worry about it. I'm weird. Computer people are weird. Deal with it. Okay, just as a warning, the Google searches that result from this course will get really, really weird. So if someone can see your Google searches, just prepare for that. It'll get really weird, trust me.

So, if we run this program, read-four-bytes, we can try to read, I don't know, let's read /usr/bin/ls. So, ls is something we have all used over and over again. It's a program that actually executes, and if we read the first four bytes of it, it actually looks like something kind of readable. Byte 0 is a weird value of 7f, but the rest of it actually looks like ASCII, so it looks like ELF, and
that has nothing to do with me. I did not write ls or anything like that. But that's kind of weird, right? What that is: any program on Linux, if you read the first four bytes of it, will always start with that. In fact, we can be meta. We can read the first four bytes of read-four-bytes, and that also starts with 7f and then ELF, which is a bit weird.

So, computers use something called magic. This is literally called magic. The way a computer figures out what a file actually represents is that it reads the first few bytes and sees what that value is, and if the value matches some hard-coded thing, it knows what the file is. If a file starts with 7f ELF, your computer knows that, hey, that magic means it is an executable file. In fact, this is a very standard thing. Same thing for PDFs: a PDF starts with a percent sign and then PDF, or something like that. Do I have a PDF here? No, I do not. But all the files will start with something like that, and it's literally called magic. What you figure out with most computers is that, at the end of the day, there's just some magic number somewhere, some number that we humans assign meaning to, and that is how anything gets done.

To further dispel the myth of computers being smart, here is what I will leave you with for the day. If I write this program... so this is a bunch of bytes, and each byte is just two hex characters. If, for some sick reason, like you're me, you write a file that contains these bytes in this specific order and you run it, it will actually print out hello world for you. This is much simpler than the C version of hello world, right? It's actually a few bytes shorter. It's only like 136 bytes, I think. But believe it or not, this is hello world, and this is where I will leave you for this lecture. For most of the lectures, too, I build in some time at the end to go over any questions or
anything like that. So, first I'll do a scan. Hopefully we're okay for the first lecture. Any questions? Yep? Yeah, so one question is why there are so many zeros in here. Well, it turns out the zeros are mostly just unused values, because I don't actually need to use them, or some extra space, because, well, memory addresses are really, really huge and I can't use all of them, so most of it goes unused. Any other quick questions for the day? All right. So, just remember: I'm pulling for you. We're all in this together.