Alright. Welcome, everybody, to a new term. We seem to have somehow managed to get 292 people so far on the Zoom. That's pretty impressive. I'm going to be lecturing twice a week from here, and hopefully this will work out well. To avoid chaos, let's have people put their questions in chat. I have two screens, so I can watch the chat, and let's just see how this goes. So welcome to the virtual version of CS162. Since we talk about virtualization in this class, this is particularly appropriate. Can I get a couple of thumbs up or something in the chat, just to make sure the sound is good for everybody? And please disable your cameras if you could. Alright, great. So my name is John Kubiatowicz. I'll introduce myself a little bit later. Today we're going to dive right in and ask some questions, like what an operating system is, since you're taking this class, and maybe we'll say something about what it's not. We're going to try to give you some idea of why I find operating systems so exciting, and we'll tell you a little bit about the class. But let's dive right in before we get to the operational parts of the class. I want to point out that interaction is very important. The interaction portion is challenging on the first day even when I teach live normally. But once we get into the second lecture, I think we should be able to ask a lot of questions. It'll be great. So there's a question about whether slides are going to be posted. Yes, we're going to post both the slides and the videos; we're still getting things moving here. Keep in mind that you're free to ask questions, and let's move forward. So first of all, let's start with this, which is what I like to call the greatest artifact of human civilization. Anybody recognize what this might be? Do we have any thoughts? Other than random ones?
Yeah, very good. We've got several people who said the internet. And in fact, there it is. What's great about this is that we have billions of people and devices all interconnected in one huge system. And lest you think this is a plug for 168 or 268, the networking is just one of the amazing aspects of this huge system; the operating systems are really what tie it all together and essentially tie everybody together, which is part of what's very cool about the internet. And of course, you're all on the internet right now, since this is entirely virtual. So I guess it's appropriate to start a class this way. I had a couple of people comment that this looks like a big brain. Yes, in some sense, it does. It's got huge numbers of connections, interconnectivity, and multiple redundancies. So the notion of this as a brain is not entirely off base. As the term goes on, I'm going to try to tame some of the complexity that's hidden inside both the internet and the devices on it, and see how to understand it. If this were 168, we would probably dwell on slides like this one, but this is pretty impressive. The original ARPANET couldn't handle more than 256 devices. And now we've got billions, four and a half billion devices, penetrating maybe 60% of the world, which is just an astounding place to have come to. One of the dates here that is particularly interesting: I think the very early 90s is when the World Wide Web took off. That was when the internet suddenly became something that lots of people could use, and that turned it into what it is today. The other thing that's interesting, which is in this picture as well, is this idea of the diversity of devices.
This is a graph, which I sometimes like to call Culler's graph, or it's Bell's Law, of the number of devices that a person has. Originally, it was one computer for millions of people, back in the early days. And as things move down, now each of you probably has hundreds of devices that are working for you. Modern cars have hundreds of processors in them. You all have cell phones and laptops and little computers inside of devices and thermostats and so on. So this graph is kind of funny. It's almost an inverse Moore's law graph, where the number of computers per person is increasing as they get smaller. And there are a lot of timescales. When we start talking about how to make this whole system work, we're going to have to figure out how to deal with everything from nanoseconds, or femtoseconds in some cases, up to things that take seconds and tens of seconds. And somehow the system has to work across those timescales. There's a little magic involved in that, and we're going to try to talk about it. I did see somebody say "femtoseconds, no way." But a lot of laser communications do operate very rapidly. So we'll hold off on the femtoseconds here, but sub-nanosecond definitely these days. All right. Now, operating systems are really at the heart of all of this. We make incredible advances continuously in the underlying technology, and each device is a little bit different. Every technology generation is a little different. And you've got to somehow provide a consistent programming abstraction to applications, no matter how complicated the hardware is. And you've got to manage sharing of resources. So the more devices we have connected, the more resources there are to share.
And as we get closer to the end of the term, so I'm going to say the last third of the term, we're going to start talking about some of these very interesting peer to peer systems that are out there that basically allows us to have huge storage systems that span many devices. I did see some questions about postings of slides. We will definitely, in the future, I'll have them up earlier than the day after lecture. So some of the key building blocks to operating systems are really things that we're going to learn about in class. Processes, threads, concurrency, scheduling, coordination, many of these things you've learned about in the 61 series. Address spaces, protection, isolation, sharing, security, and that level of security is going to be both at the device level. And then as we build out into the network, we'll talk about things like SSL, and then we'll talk about more interesting security models as we go. And there's going to be communication protocols. There's going to be persistent storage. There's projects I've worked on in the past. I'll mention briefly later where it was interesting question was how do you store information for thousands of years without it being destroyed. We'll talk about transactions and consistency and resilience and interfaces to all these devices. So this is a class that spans a lot of interesting topics. And so, for instance, here's something you do every day without thinking about it multiple times. You've got your cell phone and you want to look up something on some device, a web page, or maybe you're using an app. And what happens there? Well, the first thing that happens is there's a DNS request that tries to figure out what the internet IP address is to where you're trying to go. And that goes to a bunch of DNS servers on the network. And they return information that helps now your cell phone route through the internet, which is a very interesting device in and of itself consisting of many pieces. 
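The first step of that walk-through, turning a name into an IP address, can be tried from an ordinary user program. What follows is only a minimal sketch: the helper name `resolve` and the default port are made up for illustration, and the call simply asks the system's resolver, which may answer from a local hosts file or cache rather than contacting DNS servers.

```python
import socket

def resolve(hostname, port=443):
    """Return the sorted set of IP address strings the system resolver reports.

    `resolve` is a hypothetical helper for illustration; the real work is done
    by the standard library call socket.getaddrinfo.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    # "localhost" is used so the example does not depend on network access.
    print(resolve("localhost"))
```

Replacing `"localhost"` with a public host name would exercise the resolver path described above, at the cost of needing a working network connection.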
And that may go to a data center somewhere with a load balancer that will then pick one of several possible devices out there, which will then maybe do your search and retrieve information from a page store, put it back together into some page that you can use, and then you get the result back. And, you know, you do this every day and don't really think about it too much. And once you start thinking about it, it gets pretty interesting. Like, for instance, how is it that those DNS servers stay consistent? And why is it that it's not possible to hack into them? Well, in fact, back in the middle 2000s, people were hacking into them. I'll tell you a little bit about that later. And how do you make sure that the packets get enough priority when they come into an operating system that maybe your particular query doesn't get delayed a long time? So there's some scheduling questions. So this is a pretty complex system. And it's, every time I spend the time to think about it, I'm amazed it works. Okay, it's pretty impressive. And hopefully by the end of this class, you'll have enough knowledge of what's going on in all parts of the operating systems and the networks that you too, you know, you'll be much smarter than when you started the class, of course, but you'll be able to appreciate and sometimes maybe wonder why it is that it actually manages to work or be impressed that it works. So yeah, but what's an operating system? Okay. What does it do? We could ask that question. So most likely you could say, well, from the standpoint of what it does, this is like being a physicist that's maybe measuring a bunch of things. You say, well, it's memory management. It's IO management. It does scheduling, does communication, does multi tasking or multi programming. You could, you could ask those things. You might ask is an operating system about the file system or about multimedia or about the windowing system or the browser. 
You know, back in the 90s, there was a lot of fighting between Microsoft and a bunch of other companies about whether the internet browser constitutes part of the operating system. And depending on your point of view, that may still not be a resolved question. But anyway, it is one that has been asked. Also, I would ask everybody to turn off their video, if they could, while we're talking. So, are these questions only interesting to academics? That's a question you might ask. Hopefully not; hopefully they're interesting to you. Could I ask the person who just came in to turn off your camera, please, because that will show up in the recording. So, a definition of an operating system: there's no universally accepted definition, which is part of the problem here. "Everything a vendor ships when you order an operating system" might be a good approximation, but that varies pretty widely. It might be "the one program running at all times on the computer." That's the kernel. You'll learn a lot about kernels as the term goes on. But you can see these two points of view are different. Nobody would disagree that the kernel is the core of the operating system. They would disagree pretty widely about whether everything that Microsoft ships with a Windows product is part of the operating system. Probably not. All right. So as we try to drill down on what an operating system is, keep in mind that we're going to talk about things it does and pieces that are important, but maybe you'll never fully know what an operating system is. It's typically, among other things, a special layer of software that provides the application access to hardware resources. So it's convenient abstractions of complex hardware devices. Protected access to shared resources. Security and authentication. Communication.
So we could look at something like this where we have some hardware and it's the fact that many applications can simultaneously run on the hardware is something that the OS has provided for us. Okay. So, yeah, that makes sense and you will understand exactly how this works actually in a few weeks. But maybe we could do it this way. Well, operating system, what's the first word operating? Well, that comes actually from there used to be people. Like there was a switchboard operator, believe it or not, when you made a phone call, they actually had to plug you into the right connection and make the wires connect. Then there were computer operators, which were people that basically sat at one of these big machines for a long time and made sure it was running correctly. And then, of course, operating systems, the operating part of it then became more about, well, we're making sure that the disk is operating correctly or the network is operating correctly or the graphics cards are operating correctly. What about the word system? Well, this is interesting as well. So what makes a system? So a system is something with many interrelated parts where typically the sum is much greater than the sum of its parts. And every interrelated part potentially interacts with others. And of course, that's an n squared level of complexity at least. And we're going to have to come up with APIs and other clever techniques to avoid n squared complexity here because things are complex enough as it is. And making a system, which I showed you earlier on the internet, that's a system that has billions of components to make it robust and not fail is going to require an engineering mindset. So you guys are going to have to start thinking like engineers and we're going to give you some tools to really think about how to make something that complicated actually work. Again, the internet is something which, you know, it's a great example of a big system that is amazing that it works. 
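To put rough numbers on the n-squared remark: if every one of n components may interact directly with every other, the number of distinct pairs is n(n-1)/2, while routing everything through one shared interface needs only n connections. The small sketch below just does that arithmetic; the function names are invented for illustration, and the "one shared interface" model is a simplifying assumption rather than a description of any real system.

```python
def pairwise_links(n):
    """Distinct pairs among n components: n * (n - 1) / 2, which grows as n squared."""
    return n * (n - 1) // 2

def links_via_shared_interface(n):
    """Assumed alternative: each component talks only to one shared API, so n links."""
    return n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pairwise_links(n), links_via_shared_interface(n))
```

For 1,000 components that is 499,500 possible pairwise interactions against 1,000 connections to a common interface, which is the gap that well-chosen APIs are meant to close.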
And actually, it doesn't always work. I'll pull up some stories later in the term about times when it definitely didn't work. My favorite is one time when there was a single optical fiber whose loss divided the network into two pieces. It went through this tunnel in the middle of the country, the U.S., and a truck went in and blew up, and it melted this fiber, and it actually temporarily partitioned the network. So there are times when it just doesn't work properly, okay? Systems programming is an important part of this class, and you're going to do a lot of it. You're going to learn how to take a system like this and figure out exactly how to make it work. And that's exciting. You're going to get some of the tools. You're going to learn about how to work in groups. You're going to learn about testing and all of these things that help make a complex system manageable and, hopefully, eventually workable. So part of making things work is interfaces. Here's a 61C view of things, the hardware/software interface. You have hardware, that's these bricks, and you've got software, which might be a little bit of a program, all of which hopefully will start coming back to you very rapidly. You had a processor, and you had memory, which had the OS in it. Maybe you had registers in the processor, and those registers pointed at parts of memory, and that allowed the program to run. Maybe you had caches. We'll learn about caches again, or mostly remind you of how they work; they help make up for the slow memory. The way I like to think about a system with caches is that you want to make it as fast as the smallest item, like the registers, and as large as the largest item, like the memory or disk, and the way you do that is with caches. And of course there are page tables and TLBs, which will help us out with virtual memory. And there's storage, disk drives, et cetera.
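The "as fast as the smallest, as large as the largest" idea is usually quantified with the average memory access time formula from a machine structures course: hit time plus miss rate times miss penalty. The sketch below applies it with made-up numbers (a 1 ns cache hit, a 100 ns trip to memory, a 2% miss rate); those values are assumptions chosen for easy arithmetic, not measurements of any real machine.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns

if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    cached = average_access_time(hit_time_ns=1.0, miss_rate=0.02, miss_penalty_ns=100.0)
    uncached = 100.0  # every access goes all the way to the slow memory
    print(f"with cache: {cached:.1f} ns, without: {uncached:.1f} ns")
```

With these assumed numbers the average access costs about 3 ns instead of 100 ns, even though almost all of the data still lives in the large, slow memory.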
There's all sorts of devices, like networks and displays and inputs. And making all of this tied together is something you started down the path with 61C. Hopefully you remember that. And then of course there's interesting things like buses that tie it all together, okay? And 61C doesn't quite get into that level of detail and we're not going to do that too much. I might suggest 152 and 151, some of those interesting classes. If you really want to talk about maybe 150 if you want to talk about the buses and so on. But then of course there's an instruction set architecture, which you did talk about. And that abstracts away a lot of what's going on in the processor so that people running programs and compilers, they're compiling programs have something common to use, okay? And so what you learned in 61C was machine structures. And you also learned C, which you're going to get to exploit a lot. So I know the notion that in 61C you learn C is maybe shared with a little bit of skepticism by people. But you're going to get to learn it a lot more in this class. So the OS abstracts the hardware details from the application. So not just the instruction set architecture is going to matter anymore. So that abstracts away the computation elements of the processor. But we're going to learn how to turn a bunch of storage devices like disks and USB keys and cloud storage and turn it into a single abstraction, like say a file system, so that a user can use that easily without having to worry about where the bits are stored, okay? And so that's where we go with this class is we're going to learn not just about the abstractions from hardware for 61C but processor but abstractions for other devices as well. Okay. So what is an operating system? So let's go through some things it does again. Let's try to maybe get an idea operationally. So one thing that I've started to talk about here is the fact that the operating system is an illusionist in some sense, all right? 
It's going to provide clean, easy to use abstractions of physical resources. And it's going to do so in a way that allow you to at least temporarily think that you've got infinite memory. You have a machine entirely dedicated to you or a processor. That there are higher level objects like files and users and messages, even though as you probably already know, but will know very well by the end of the term, there aren't files or files are an abstraction of a bunch of individual blocks on a disk that somehow are put together with iNodes to give you a file. So the operating system is busy providing an illusion of a much more usable machine so that when you program it, you have a much easier time of it and you don't have to worry so much about whether it's on disk or on a USB key or in cloud storage. And we're going to learn also about abstractions of users and messages and we're going to talk about virtualization and how to take the limitations of a system and hide them in a way that makes it easy to program. So for instance, so virtualizing the machine, so here's a 61C machine which has a processor it's got memory, it's got IO with maybe storage and networks and on top of it we're going to put this operating system thing, which we're learning about as we speak and that operating system instead of giving us a processor with limitations the processor has it's got a certain set of registers, it's got certain floating point operations it has certain exceptions that are causing so on we're going to give an abstraction of something really clean called threads. We're going to have address spaces for instance, we're going to learn about rather than a bunch of memory bytes that are in DRAM and scattered about we're going to provide a nice clean address space abstraction that will give us the ability to treat the memory as if it's entirely ours even when there's multiple programs running. 
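As a toy picture of the "files are blocks put together with inodes" point, the sketch below stores a short message in scattered blocks of a pretend disk and uses an inode-like record to read it back in order. Everything here (`ToyDisk`, `ToyInode`, `read_file`, the 4-byte block size) is invented for illustration and is far simpler than any real file system, which would also handle allocation, indirect blocks, permissions, and caching.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is easy to follow

class ToyDisk:
    """A pretend disk: just a list of fixed-size blocks."""
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        # Pad short data with zero bytes so every block has the same size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, index):
        return self.blocks[index]

class ToyInode:
    """Inode-like record: the file's length and which blocks hold it, in order."""
    def __init__(self, size, block_numbers):
        self.size = size
        self.block_numbers = block_numbers

def read_file(disk, inode):
    """Present scattered blocks as one contiguous sequence of bytes."""
    data = b"".join(disk.read_block(b) for b in inode.block_numbers)
    return data[:inode.size]  # drop the padding in the last block

if __name__ == "__main__":
    disk = ToyDisk(num_blocks=8)
    disk.write_block(5, b"hell")
    disk.write_block(2, b"o wo")
    disk.write_block(7, b"rld")
    inode = ToyInode(size=11, block_numbers=[5, 2, 7])
    print(read_file(disk, inode))
```

The caller of `read_file` never sees block numbers 5, 2, and 7; it only sees the bytes of the file, which is the illusion being described.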
Again I just talked about files rather than a bunch of individual blocks we're going to have files and rather than networks which are a bunch of individual Ethernet cards let's say that are connected point to point between here and Beijing we're going to have sockets and routing under the covers okay so that's a pretty clean abstraction which of course ultimately allows me to teach you guys spread all over the globe as you are okay on top of this these threads address spaces files and sockets are going to be the process abstraction and that process abstraction is going to give us an execution environment with restricted rights provided by the operating system and that process abstraction is going to be a nice virtual machine that your program can run in that's abstracted away from all of these physical details okay and so on top of that you could have your program so the one thing that you guys get to do a lot more of than you've done so far in your career is you get to actually do user level programs running on top of a Unix environment okay and so you're going to have compiled programs that you have produced that are going to run on top of your process abstraction and in order to give you a clean environment into the process abstraction there'll be system libraries so there's even a system something the C library, the security libraries many of the libraries abstract even further and give you nice clean abstractions that maybe allow you to do SSL very easily or so on okay there is an interesting question in the chat which I'm going to point out some people are asking about closed captioning some classes like last time we even had closed captioning but that's when we need it and we actually have a live captioner in that case unfortunately we don't but what I will do when I put the videos up is they will get automatically closed captioned by YouTube when I put them on there and that will be something but they won't be live sorry about that so this is our 
virtualized machine view. The application's machine is the process abstraction provided by the OS (and, some people might argue, the system libraries as well). Each running program runs in its own process, and the process gives you a very nice interface, nicer than the hardware. Now, the question in the chat is whether the hypervisor or the Docker daemon is part of the process, acting as the top layer of the VM. We will talk a little bit later in the term about Docker. Docker is a way of wrapping up multiple different little environments and potentially running them inside the process abstraction. It's not as isolated as, say, a full virtual machine, but we will talk more about that in detail. Let's stick with process abstractions for now. With the process abstraction, as I will show you in a second, you can have multiple processes all running at the same time, and they are each given isolation from each other. So that's what we will start with for this first lecture. ISA, by the way, stands for instruction set architecture; that was the question. So, the system libraries: what does the systems programmer see? The system libraries are linked into your program, which is run through a compiler and turned into bits that will run in the process.
You will get very good at this as well: you will learn how to compile programs, link them with libraries, and then execute them in a process environment, and you will learn how to invoke the compiler to do that. So this is the programmer's view. So what's in a process? Remember, the process is an environment that gives you threads, address spaces, and files. A process, as I said, has an address space, which is a chunk of protected memory. It has one or more threads of control executing in that address space, and the system state associated with open files and so on. So this is a completely isolated environment. We will dive into processes very quickly in this class, and you will learn how we can have a protected address space and multiple threads running in an environment that's protected from other processes. Even though, for instance, maybe there's only one core running, we're going to give the illusion that there are multiple cores running, with multiple processes at the same time. You've all done this. Here's an example on, say, a Mac, where you look at the process monitor, or the Task Manager, or you do a ps aux on a Linux box. What you see here, which is perhaps surprising if you haven't really thought about it, is that there are many processes running all the time on your typical laptop. So many things are going simultaneously, 50 or 100 of them. Mostly they're sleeping, but they're there to wake up and do some execution at some point. Now, the question: why are the middle layers of abstraction necessary? Part of the reason that we have many layers of abstraction is that if you try to squash all the layers down, which is sometimes done in very specialized environments, you end up with an undebuggable mess. So multiple abstractions, assuming they don't make things too slow, are a crucial aspect of making things actually work properly. And so you'll see even modern operating systems still have several abstraction layers, and you'll appreciate
them I think as we go forward because it's much easier to actually have an operating system that has a device driver talking to the disk and then you have a file system that provides files and then you have a process abstraction which protects those files and exports them to programming and yes somebody brought up the programming in ones and zeros I can say that I've done that and it's not pleasant but anyway moving moving on here so here's the operating systems view of the world when they're multiple processes so each process gets its own set of threads and address spaces and files and sockets okay and they might run a program with its own linked libraries okay what's interesting about this point of view is these processes are actually protected from each other okay so the operating system translates from the hardware interface down below to the application interface and each program gets its own process which is a protected environment alright and so in addition to illusionist we're going to talk about another thing that operating systems do which is referee which is manage the protection isolation and sharing of resources and this is going to become particularly important when we talk about global scale systems you could imagine we talk about storage that spans the globe with many individual operating systems running at the same time each of which could be corrupted in one way or another you kind of get to the interesting question of well how do you protect anything and this is where the referee point comes into play and so here I'm going to show you we're going to now be more consistent with our coloring for what's going forward but here we have file program number one and number two each of them are linked with system libraries you're going to learn about the C library very shortly like I said and they are running independent of each other and however in this simple example there's only one processor okay so that one processor and one core by the way for 
before somebody asked one processor one core and how can these two things appear to be running at the same time well we start out with one of them running so the brown one's running it's got it's using the processor registers it's got a process descriptor and thread descriptor and memory to learn about those as well and it's busy getting CPU time okay the green process is not running but it is protected okay and so now how do we get the illusion that there's more than one processor or that each process has its own processor the each process has its own process descriptor in memory and then the operating system has to have some protected memory as well and what we're going to do periodically is we're going to switch from brown to green and vice versa okay so here's the example of going from brown to green so the brown device has this process descriptor here the green one has the other the green one and what we do is we go through a process switch where the registers are stored through the OS into their own process descriptor block and then the green ones are reloaded and what happens is voila the registers are now pointing at the green memory and the green one picks up from exactly where it left off okay and then a little bit later a timer is going to go off and we're going to switch back the other way and if we do this frequently enough you get the illusion that multiple processes are running at the same time and we're going to talk about this how this works in detail so I can very confidently say that in a few weeks you will have a very good idea of how this works so but at the high level it's very simple we're just switching the processor back and forth between brown and green and as a result we get the illusion that they're both running and notice that what do I mean by the illusion well the process one can pretend like it's got a hundred percent of the processor and process two can pretend it's got a hundred percent of the processor and things just work out 
okay and that's up to the operating system now the question that's interesting here and does a program become a process when loaded into memory a program becomes a process that's a very good question for next week but a program becomes a process when the binary has been loaded into memory and into the proper OS structures so it has to have a process structure allocated for it and it has to be put into the scheduler queue and so on once that's happened now that process is an instantiation of a running program so going a little further to that question that was there both brown and green could actually be the same program running in different instances with different states so we could have we could have one program two processes each of them doing something different and this is typically what would happen if you were logged into a shared machine and you were both say editing with Emacs or VI each of you would have your own state okay so and then the interesting thing about shared data we'll get to in a little bit next week probably but yes so you guys are way ahead of me so that's good so now the question about I will say one answer this question here about what does it mean when a process is some percent of the CPU that literally means what it says if process one has 90% of the CPU and process two has 10 it means that if you were to look from 10,000 feet you would look down and you see the process one gets the CPU 90% of the time and process two gets a 10% of the time and mostly what you're going to see is that there might be one thing that's getting most of the CPU and the rest of them are getting very little of it and that's because they're mostly sleeping or waiting on IO typically but if you look carefully and you add everything up you'll actually get 100% okay but that oftentimes if something's mostly idle most of that time comes up as the idle process which we'll talk more about too okay so let's talk briefly about protection so here we have brown and green 
but I said they were protected from each other so what happens if process two reaches up and shows tries to access brown's memory or tries to access the operating system or tries to access storage which is owned by some other user what happens is protection kicks in the operating system and voila we basically give that process the boot and typically cause a segmentation fault dump core and the green process is stopped now this isn't about more than 100% is an interesting one it really depends on how the statistics are reported if you have multiple cores you have say four cores in one view of the world you could have up to 400% execution in another you could say only if you use all four cores you get 100% so you have to be very careful about what the reporting statistics are because I've seen them both ways okay but if you have more than 100% then you definitely have it reporting multiple cores or each core is 100% okay so does one CPU equal one core I'm going to say yes for now and just know that that's not the whole story we'll go a little further for now but for now today you can certainly think of one CPU equal one core for this lecture absolutely the CPU often has many cores so we're not going to go there today but we're going to go there so this protection idea is really the OS synthesizes a protection boundary which protects the processes running on top of the virtualization from the hardware and prevents those processes from doing things that we've deemed not correct that are not part of the protection okay so what we're going to talk about as we go is exactly what I just said here I didn't talk about this in terms of virtual memory but one of the reasons that the green process isn't able to reach out and touch the brown memory is that virtual memory prevents it but this reaching out to memory you're not supposed to have access to can be shown reaching out past the boundaries of what the operating system has mapped this lecture is giving you some of the 
ideas at the high level, which we're going to drill down into in a couple of lectures. So this protection boundary is again part of the virtual machine abstraction. Somehow we've got these networks, which have little packets with MTUs that are 200 bytes or what have you; we've got storage, which is a bunch of blocks; we've got controllers, which do a bunch of complicated stuff. You as a programmer don't want to think about the underlying hardware, because if you had to do that you wouldn't be getting anything done. And so part of what the OS does is it puts these protection boundaries in and gives you a clean virtualization, precisely so you can program without thinking about those things, and without worrying about somebody else trying to hack in as well. So that's the idea. There's an interesting question in the chat here about whether the Java virtual machine would be an OS. There are points of view in which the Java virtual machine could be considered an OS, so let's save that question for another day, but bring it back if it looks like we're going somewhere where that's appropriate. So the OS isolates processes from each other, and it isolates itself from other processes, even though they're all running on the same hardware. That's an interesting challenge, and we're going to tell you how it works. Finally, the operating system has a bunch of glue that it provides, which are common services. You may not have thought of it this way, but if you have a good operating system, it's going to give you a file system, so you get a storage abstraction; or it's going to give you windows that properly take in mouse clicks and so on; or it's going to give you a networking system that can talk from Berkeley to Beijing and back without worrying about packets. These common services are typically linked in with libraries, and those libraries are things that you come to depend on when you're writing a program. So really, an operating system, if you
were to look at its functionality: referee, illusionist, glue. All of these things are part of what an operating system might be considered to be doing. Where it gets interesting is when you have non-mainstream operating systems. If I don't run out of time, I'll briefly talk about the Martian rover, for instance. You might have stripped-down versions, with not as much functionality, to run on simpler hardware or in a less malicious environment where there might not be somebody hacking in. So many times people build specialized operating systems, which perhaps don't have all the protection internally, or maybe don't have all the storage services that you might see here, et cetera. That doesn't make it any less an operating system; it makes it an operating system more directed at a particular task. So finally, some of the basics are I/O: clearly, I've just said that we're providing the ability for storage and networks to have a nice clean abstraction of the hardware that we can deal with, as common services. There was a question here about flipping transistors and heat. I tell you what, I promise as a computer architect to talk about that in a few lectures, if that's interesting. Is there a smallest OS? Well, there was something that David Culler put together in the early 2000s called TinyOS, which is pretty small. Finally, the OS gives you some look and feel, so maybe you have display services. There's an interesting point here, back to what I talked about earlier in the lecture: is windowing part of the operating system? Is the browser part of the operating system? Well, perhaps; it depends on the operating system. For instance, Microsoft Windows went through a phase: Windows NT initially was a microkernel-type operating system, and the windowing system was outside of the kernel. Then they decided they weren't getting enough performance, so they went the opposite direction and put the windowing entirely inside of the kernel, which is almost like a
reactionary response. So you can have windowing both in and out of the kernel, and the distinctions there have to do with protection, security, durability, reliability. Some of those questions come up, and hopefully you'll know enough to decide where you think it belongs as we get further into the class. And then finally we've got to deal with power management and some of these things, which only really show up on portable devices, but these are all potentially managed by the OS. So what's an operating system? Referee, illusionist, glue: many different possibilities. So why should you take 61C? Well, other than being one of the best classes in the department, if I do say so myself, some of you will likely... I said 61C; I meant 162, my apologies. Boy, I'm slipping up here tonight. By the way, just to be clear, I was saying that CS 162 is one of the best classes, but you shouldn't quote me on that or I'll get in trouble. But some of you may actually design and build operating systems in the future, and it'd be very useful for you to understand them. Many of you will create systems that utilize core concepts from operating systems, so this is more of you. It doesn't matter whether you build software or hardware, or start a company; the concepts you use in 162 are ones that carry across very easily to many of these different future tasks. So you're going to learn about scheduling, and, well, you could schedule in the hardware if you're designing processors, you could schedule in the lower levels of the OS if you're building a core OS, or you could schedule in a big cloud system if you're building cloud apps. So the ideas that we learn here carry across to many different places, and we'll even talk about some cloud scheduling a little later in the term. All of you are going to build apps, I guarantee it, as you go
forward, and you're going to utilize the operating system. So the more you understand about what's going on, the more likely you are to not do something that was not a smart thing to do. Hopefully you'll learn about locking, you'll learn about concurrency, and you'll learn enough about the right way to design some of these systems that you're going to write amazing bug-free software, as opposed to almost amazing, very buggy software. Okay, so who am I? My name is John Kubiatowicz. Most people call me Professor Kubi, maybe because they can't pronounce my last name. I have a background in hardware design: there's a chip I designed for my PhD work, which was one of the first shared-memory multiprocessors. I also did a lot of work in peer-to-peer systems, so the OceanStore project. I have a background in operating systems: I worked for Project Athena at MIT as an OS developer, did device drivers and network file systems, and worked on clustered high-availability systems. We had a project for a while in the Par Lab called Tessellation, which was a new operating system we were working on. OceanStore, with our logo here of the scuba diving monkey, was addressing the idea of storing data for thousands of years, and we were pretty much one of the first cloud storage projects, before anybody talked about the cloud, back in the early 2000s. So some of the concepts I talk about at the end of the term will come from some of those ideas. I also do some quantum computing, though perhaps that's a little off topic for this class. Most recently I've been working in the Internet of Things, or the swarm. Specifically, I have a project called the Global Data Plane, which is looking at hardened data containers. We like to use the analogy of the shipping containers that everybody sees down at the Port of Oakland, where these shipping containers are cryptographically hardened containers of data that can be moved around to the edge devices and back into
the cloud, and are ideal for edge computing. We'll talk about some of these ideas as well, and if any of you are interested in doing research in that, that's certainly something you could talk to me about. Alright, and I will say that quantum computing is the real thing, becoming more real as we go. It's got to be real, because Google and IBM talk about it all the time now. That's a little bit of a joke. We have a great set of TAs this term. Neil Kulkarni and Akshad Gokali are co-head TAs, and we have a set of really good TAs, so I'm very excited about our staff. I will tell you a little bit about where we're at in terms of scheduling sections: the sections are still TBA, and I'll say a little more about why that is in a second. Okay, so let's talk a little bit about enrollment. The class has a limit of 428. I just raised it, and it's not going to go any higher. There's one circumstance where that might happen, but I think it's unlikely at this point. I will say something here: running a class virtually in the middle of a pandemic, especially something like CS162, is a serious challenge. So you're going to have a pretty good, I would say excellent, ratio of students to TAs this term, and that's to make sure that things run smoothly. So we probably won't make the class any larger. The other thing to keep in mind is that this is an early drop class. September 4th, which is a week from Friday, is the drop deadline, and what an early drop class means is that it's really hard to drop afterwards. So in the next two weeks you need to make sure that you still want to be in the class, because if you are still in the class and you get past that early drop deadline, you either have to burn your one special late-drop token that you get as a student, or there's some appeals process that doesn't always work. The early drop deadline is really there to make sure that when you guys start
working in groups, it's going to be stable. We instituted that because what would happen is people would form their groups, and students who weren't entirely serious about the class ended up dropping out on their project partners, and that got to be a problem. So in the next two weeks everybody needs to make sure they want to be in the class, and if you don't, you should drop early so that other people can get in, because we currently have a waitlist that was 75 or so the last I checked. The other thing, which I'm going to say more about in a moment, is that we're very serious about requiring cameras for discussion sessions, for design reviews, and even for office hours, and we're certainly going to use them for exams. So if you don't have a camera yet, you need to find one. The only place in this class where you're not going to want to turn on your camera is lecture, because we currently have 328 people on the chat there, and that would be bad. With the Wi-Fi issues people are asking about, just do your best. Zoom tries to adjust a little bit, and we'll deal with problems on a case-by-case basis. I'm going to tell you more about this in a moment, but really, having a class like this all virtual is very hard unless people interact a little more normally, and that requires people to be able to see you. If you're on the waitlist, like I said earlier, we've kind of maxed out sections and TA support, so if people drop, we're going to automatically move people from the waitlist into the class. So here's the thing you should absolutely not do. If you have friends who are on the waitlist and are thinking they're not going to take the class, make sure that they either get themselves off the waitlist or do all the work in the class. Because, as I'm going to mention in a little bit, if you're still on the waitlist and a spot opens up, we will enroll you in the class, and you'll be stuck, as I mentioned
earlier. If you're not keeping up, that could be a problem, because we have occasionally had people discover weeks into the class that they were enrolled and couldn't get out of it. So don't be one of those people. Now, the question about discussion sessions: I'll say a little bit more about them in a moment. Okay, but how do we deal with 162-19? Well, if you look at this particular wordplay here, we've got collaboration in the middle. We've got to remember people, and we've got to figure out how to combine all of you together in your groups and produce something successful. So this is challenging, and I know this is not the term you thought you were getting this fall when you thought about coming to Berkeley, and I apologize. I think many of you experienced the end of last semester, unfortunately. But collaboration is going to be key. Things are considerably different this term, I would say, even than they were last term, because we're starting out fully remote, so you don't even get to see anybody in person. Maybe some of you will get to see each other, but I would bet that the bulk of you won't. The most important thing is people, and then interaction and collaboration. So I put up something here; I fondly remember coffee houses. This is what they kind of look like: you sit with people, you drink beverages of choice, I'm going to say coffee to keep from getting in trouble, and you discuss things. So this is how groups ought to work, and the question is how we do this when people are all remote. First of all, it's going to require work. I hate to say this, but the way we make this turn out well is we've got to work at our interactions. Because, as you well know, if you don't look at anybody with cameras on, and you just exchange email, that can go south very quickly: even when you didn't intend to imply something, everybody gets their feelings hurt, and things are just not working
out well. So we've got to figure out how to bring everybody along with us so we don't lose anybody. And if you notice here, by the way, these people are holding hands; that's virtual, so we're not suggesting that you don't socially distance when you bring people along. But the camera is a part of this. Call this an experiment, but cameras are going to be an essential component. You've got to have a camera and plan to turn it on, and if you have issues with spectrum, let's figure out ways of maybe lowering the bandwidth a little bit, but you certainly need it for exams. So you've got to make sure you've got enough spectrum and a camera for the exams, and you're going to need it for discussion sessions, design reviews, and possibly even office hours; that's going to depend on whoever's running the office hours. I'll get to section this week in a moment, but yes, we do have section this week. The thing about cameras is that they give us the ability to at least approximate what we used to be able to do when we sat physically in person. In fact, we are probably going to give extra credit points for screenshots of you and your group meeting on a regular basis, drinking a beverage of choice and talking to each other. This is the kind of thing that needs to be strongly encouraged. Even before we had a pandemic, I had groups that somehow, despite the fact that they could meet, never met the whole term. This got bad: by the end of the term all of the members of the group were upset with each other, the project failed, and they all got bad grades. It was just a bad scenario, and it didn't have to happen that way, because they should have been meeting, they should have been looking at each other while they were talking, and it didn't happen. So this is our experiment. Cameras are a tool not of "the Man"; they are a tool of collaboration. So we want to bring back personal interaction
okay, even though we're on either side of fences. Humans, even computer scientists, are really not good at text-only interaction. So we are going to require attendance: we're going to take attendance at discussion sessions and design reviews, with the camera turned on. Hopefully that's clear. Any other questions on the camera? Why don't you type your question. And please turn off your mic if you're not asking a question; actually, type your questions too. Alright, so infrastructure. Well, it's only infrastructure; you can't come see us. But we have a website, which you've probably all gone to, cs162.eecs.berkeley.edu. That's going to be your home for a lot of information related to the course schedule. We've got Piazza, so hopefully you've all logged into Piazza already. Assume that Piazza is the primary place where you're going to get your information. I'm also going to be posting the slides early, as has been asked several times, on the class schedule on the website, and when the videos are ready they'll be posted on the class schedule as well. So you'll be able to get everything related to the schedule on the website, and Piazza is kind of everything else. The textbook is Operating Systems: Principles and Practice. It's a very good book. The suggested readings are in the schedule, so if you try to keep up with the material, you can get a version in text of what I talk about, and I think those two together help a lot. There are also some optional things you could look at. I know David Culler really liked the Operating Systems: Three Easy Pieces book, and there's the Linux Kernel Development book; some of these are interesting to look at as a supplement. One thing you may not have known is that if you log in with your Berkeley credentials to the network, which I think you need to use a VPN to do, you can get access to all of the O'Reilly animal books over the network as well. That's something that
Berkeley has negotiated with the digital library, which is pretty cool. And then there's online stuff. If you look at the course website, we've got appendices of books, sample problems, things in networking, databases, software engineering, security; all that stuff's up there, plus old exams. The first textbook is definitely considered a required book; you should try to get a copy, even if it's only an e-book. There are also some research papers on the resources page that I put up there, and we'll be talking about some research as we get later in the term, so use that as a good resource. So, the syllabus. We're going to start by talking about how to navigate as a systems programmer. We're going to talk about processes, I/O, networks, and virtual machines. Concurrency is going to be a big part of the early parts of this class: how do threads work, and scheduling, locks, deadlock, scalability, fairness; how does that all work? We'll talk about where address spaces come from and how to make them work, so we'll talk about virtual memory and how to take the mechanisms and synthesize them into interesting security policies: virtual memory, address translation, protection, sharing. We'll talk about how file systems work, so we'll talk about device drivers, file objects, storage, block stores, naming, caching, how to get performance, and all of those interesting things about file systems which you probably haven't thought about. In the last couple of weeks of the class we'll even talk about how to get the file system abstraction to span the globe in a cloud storage system, so that'll be interesting. We'll talk, like I said, about distributed systems: protocols, RPC, NFS, and DHTs. We'll talk about Chord, we'll talk about Tapestry and some of those other things, and we'll also talk about reliability and security to a pretty big extent. There's a question in the chat about cloud systems and why they haven't really taken over as operating systems, and
I think maybe they have, more than you might think. I think the cloud has really become part of our day-to-day lives. Things that people call the cloud operating system, maybe where they put a capital T, C, O, S or something, may not have taken over, but a lot of other mechanisms have been synthesized together in a way that you haven't thought about. So hopefully by the end of the term you'll have enough knowledge to evaluate that question for yourself: what is up with the cloud, and is it really a monolithic thing or is it a bunch of mechanisms? Okay, so we learn by doing in this class. There's a set of homeworks, and each of them is one or two weeks long. There's one that you've got to get going on right away, which is homework zero. This is one of the things we do in the very first week; it's already been released, I believe, and you should get moving on it. It's basically learning how to use the systems. There's also a project zero, which is done individually, and you should get working on that too. So this class is as much about knowledge as it is about actually doing things. I should say that the other way: it's as much about doing things as it is about knowledge. You're going to build real systems, and you're going to learn some important tools as you do that, and the work is either going to be done individually or in groups. There was a question about Kafka and Cassandra; we'll probably get to some concepts from them a little bit later. So the big thing to learn from this slide is to get going on homework zero, and project zero will probably get posted soon. Both of those are things to do on your own, without your group. Group projects have four members, never five and never three. Okay, it's four; three requires very serious justification. You must work in groups in the real world, so you learn how to do it here. And all of your group members
have to be in the same section with the same TA. That's why the sections you attend in the next couple of weeks, and you are going to attend sections, are just any section you want, because we don't have your groups yet. Once we have your groups, we will assign you to sections and go from there; you should attend that same section, and that's when the requirements for attending section will kick in. We do have a survey out on time zones and so on, to try to get an idea of where the best place to put some of these sections is. So communication and cooperation are going to be essential. Regular meetings with cameras turned on are going to be important. You're going to do design docs and be in design meetings with your TA. And yes, you can use Slack, Messenger, whatever your favorite communication tool is, but if that's the only thing you do, it's not going to be great. You've got to have your camera, you've got to get together and see each other. Your groups are going to have to be formed by, I think, the third week of classes; it's in the schedule, take a look. When we get into groups, we're going to have a half lecture where I talk a bit about mechanisms for groups as well: ways that you can cope with the typical problems that groups have, and some good tools there. But the short answer is you've got to decide groups very shortly. We do that typically after the early drop date, because at that point, in theory, people are really going to be in the class. We're going to have some mechanisms to help you form groups. There's going to be a Piazza "looking for a group" kind of thread, and we may even have some Zoom rooms set up for people to, I don't know, interview potential group members or talk to them. We have a couple of different things we've been thinking of to try to get your groups together. But keep in mind you want to have four group members in your
group: not five, and three probably only under serious justification. And you're going to be communicating with your TA, who's like a supervisor in the real world. So this group thing here is very much like what you're going to run into when you finally exit Berkeley and confront the real world. So, getting started: well, there's going to be a survey out. On the question in the chat about TBD: yes, we're assuming that many of you might not have group members yet, and it's also the case that the final discussion section times haven't been decided; the current ones are only for the next couple of weeks, until groups are formed. There's going to be a time zone survey out; you've probably already seen it, I think it was released on Piazza, but you need to fill that out and let me know where everybody is. I want to know if you're in Asia, or in Europe, or in New York, or wherever. Get going on homework zero. Project zero is not quite out yet, but it will be very soon. Homework zero gets you going on things like getting your GitHub account and registration, getting your virtual machine set up, getting familiar with the 162 tools, and how to submit to the autograder. So homework zero is up, and it's something to get going on right away, and we will announce as soon as project zero is up. Sections are on Friday: attend any section you want. We will post the Zoom links very shortly, if they're not already posted, and you'll get your permanent sections after we have groups set up. To prepare for this class, you're going to have to be very comfortable with programming and debugging C. You're going to want to learn about pointers, memory management, and GDB, and this is a much more sophisticated and larger code base than 61C. So we have a review session on Thursday the 3rd of September to learn and quickly review C and C++ concepts. Just stay tuned; we're going to get that out,
and consider going, just to give you a refresher. The resources page has some things to look at: there are some ebooks on Git and C, and there's a programming reference that was put together by some TAs a couple of terms ago. The first two sections are also about programming. Alright, the tentative breakdown for grading: there are three midterms and no final. The midterms are going to be Zoom proctored, and a camera is going to be required, just so you know; please figure that in as part of the class, and get yourself a camera. So that's about 36% midterms, 36% projects, 18% homework, and 10% participation. Projects I've already talked a lot about; homeworks you've heard about a little bit. As far as the midterms are concerned, we are going to set times after we know more about where people are. We haven't entirely decided, but midterms are going to be either two or three hours long each. The other thing I want to talk about here is personal integrity. There is an academic honor code, which is: "As a member of the UC Berkeley community, I act with honesty, integrity, and respect for others." I strongly suggest you take a look at it. This class is very heavily collaborative between you and your group, but it should not be across groups, or across other people on homeworks. So things like explaining a concept to somebody in another group are okay. Discussing algorithms or maybe testing strategies might be okay. Discussing debugging approaches, or searching online for generic algorithms, not for answers: these are all okay. These are not things where you're getting specific answers to your labs and homeworks. Sharing code or test cases with another group: not okay. Copying or reading another group's code: not okay. Copying or reading online code or test cases from previous years: not okay. Helping somebody in another group to debug their code: not okay. So sitting down for a long session of debugging to help somebody without,
you know, maybe you think you're not copying code in, but I'll tell you, a long debug session has a tendency to cause the code to become your own code, so that's not okay. And we actually compare project submissions, and we catch things like this. We caught a case once where somebody sat down and debugged with another group and helped them out, and didn't do any direct copying, or at least they claimed not, but when it was done the code looked so close that the automatic tools caught it. So don't do that. The other thing not to do is put a friend in a bad position by demanding that they give you their answers for homework. We've had several cases like that recently, where one person was having trouble with homework and they kind of guilted a partner or a friend into giving them an answer, and that gets both of them in trouble. So just don't do that. Do your own work. And by the way, to help with this, we're trying something for the first time this term: instead of a curve in this class, we're going to do an uncurved version. We haven't put up the thresholds yet, but we'll see how that works. But please just don't put your friends in bad positions by making them give you code, because they get in trouble as well, and it's just not worth it. And you don't learn what you could learn by actually doing the work, so what's the point of being in the class in the first place? The goal of the lecture is interaction, so lots of questions. We already had a bunch of questions today; that's great, and I'm hoping that continues. Sometimes it may end up that we don't quite get through the topics I was hoping to, but it's much better to have interesting questions. And what I can do in a virtual term like this is post a supplemental extra 30 minutes of lecture or something like that, for stuff I thought we would get to. So let's give this a try and see if we can make this virtual term as good as or better than it would be under normal circumstances. All
right, and again, if you have more questions about logistics, Piazza and the class website are your two best places to look for information. So let's finish up here in the last 10 minutes or so. Let's get started: this is what makes operating systems exciting. The world is a huge distributed system. We showed you what people were calling the brain view of the network earlier, but the thing that's interesting about it is all the devices on there, from massive clusters at one end that span the globe, down to little MEMS devices and IoT devices, and everything in between. Modern cars, for instance, and refrigerators have processors and web browsers. We've got huge cloud services, we've got cell phones, little devices everywhere. All of this together is one huge system. This is exciting: why does this work in the first place, and what's its potential? This is why I think operating systems are so exciting, because they're what makes this all work. Without them there would be chaos, and things just wouldn't work. So of course you've all heard about Moore's law many times; you wouldn't be at Berkeley if you hadn't. The thing about Moore's law which I like, and always want to mention, is this. Moore's law basically says that you get twice as many transistors every 1.5 years or so, and that held for many years, although it's starting to disappear on us now. That's an exponential curve, or a straight line on a log-linear plot. What you may not know is that Gordon Moore was actually asked at a conference once what he thought was going to happen, and on the fly, on a log-linear graph, he put down a couple of points, drew a straight line, and said, well, this is what's going to happen far into the future. Now, normally that would be ridiculous and laughable, except he was right, which was pretty amazing. So what's the thing about Moore's law? It allows you to make zillions of interesting devices, because there are so many transistors that you can shove into a little bit of a
device. Of course the downside, which hit back in the early 2000s, was that putting more and more transistors on chip ran into problems with capacitance and power, such that you weren't able to make an individual processor much faster. It used to be that you could wait a few years and get twice the performance of the machine you were currently working with. Somewhere around the 2000s that stopped, and suddenly what did you do? Well, suddenly people had to make multi-core processors with lots of parallelism. From the operating system standpoint this is par for the course, because I already showed you a huge system with billions and billions of devices. So the fact that chips have multiple cores on them is cool, and it enables lots of stuff, but that's just the way it is, and what's interesting is how we get around that complexity. So around the 2000s we suddenly had multi-core. The power density thing, I think, is a funny way to look at this. If in 2000 people had tried to keep making processors grow in performance as fast as they had been, we would have ended up with chips that had the power density of a rocket nozzle, and you can imagine that putting a laptop like that on your lap might be a little uncomfortable. So power density, capacitance, a lot of things led people to suddenly make multi-core instead of making individual processors faster. So by the mid-2000s we had many cores on a chip, and parallelism is exploited at lots of levels. Somebody pointed out that the stock of Intel and AMD went up hugely. That's true, but that was because they were delivering something that everybody needed, which was lots of processors on a chip. The problem, of course, as you're well aware, is that Moore's law is ending. Well, it's officially over in terms of the original growth rate, but people are still shoving a few more transistors on
there. But unless there's some fundamentally new technology, we're basically going to see the end of that growth of more, you know, smaller transistors. But it doesn't mean that people aren't still shoving lots of devices together and connecting them with networks; it just means networks become more important. Okay, and by the way, vendors are moving to 3D stacked chips and all sorts of cool ways of having a single device have even more transistors on it, even if Moore's law is ending. So I have no doubt that things are going to continue quite a ways into the future. The other thing is storage capacity keeps growing. Okay, we've got various Moore's-law-like graphs of storage. Society keeps getting more and more connected, and so more devices, more storage, and more people mean more need for operating systems, so you're in the right class. Network capacity keeps going up. Okay, people need more connections, and they're there at the small scale and the large scale. But it's not only PCs: we have lots of little devices, lots of Internet of Things devices. You saw this graph I showed you earlier, but we've got little temperature sensors and Fitbits and things you carry on your body and things you put in your cars, and all the way up to the cloud. Okay, so what's an operating system again? It's a referee, it's an illusionist, it's glue that helps us build these huge interesting systems, and that's what you're going to learn about this term. The challenge, which I'm going to kind of close with, is complexity. Okay, applications consisting of many software modules that run on many devices, implemented on many different hardware platforms, running different applications at the same time in unexpected ways, under attack from malicious people: that leads to craziness and complexity, right? And it's not feasible to test software for all possible environments and combinations of components, and so we're going to have to learn how to build these complex systems in ways that
basically work, and some of that is going to be learning just how to design systems that are correct by design rather than correct by accident. Okay, the world is parallel, if you haven't gotten that by now. Here's an example from 2017: the Intel Skylake, 28 cores, each core has two hyperthreads, so it's 58 threads per chip, and then you put a bunch of these chips together and you get a huge parallel system in a tiny box, and you put a bunch of boxes together and pretty soon you've got the world. Okay, yes, 28 times 2 is 56, not 58, very good. So with that, not only do we have the chips, which are interesting, but I want you to realize that the processors are only part of the story. All of the I/O is the interesting part, and we'll talk about that. It's not just this processor up here; it's everything connected to it. It's the devices, it's the networks, it's the storage. Okay, so this is the interesting complexity when processing hits the real world, and that's where operating systems get involved. I thought I'd put up this graph just to leave you with a few things to think about. So here is millions of lines of code. If you look at, not the original Linux, but version 2.2, which is from, you know, 15 years ago or whatever, at least, and you look at the Mars rover, these are on the low end of this scale. But now you look at Firefox and Android and Linux 3.1, which is a little bit older now, and Windows 7, and then you get up into Windows Vista and the Facebook system itself and macOS. And then you look at mouse base pairs here; that's a genetic thing, that's 120 million things. You can see that our systems are very complicated. Okay, and by the way, you can go to this source, select the things you want, and look at this yourself. It's informationisbeautiful.net, visualizations, million lines of code. It's kind of fun to look at. Okay, so the Mars rover, here it is, is a very amazing one. You know,
there have been a couple of instances of the rover, but this particular one, one of the first ones, was pretty amazing. They were able to send it up and land it on Mars, and it ran for a decade or more. It had very limited processing: 20 megahertz processors and 128 megabytes of DRAM and so on. It had a real-time operating system, but for instance you can't hit the reset button, and you can't debug it very easily. However, they were able to set it up in a situation where they figured out some timing problems they had, and they were able to debug it remotely, which is pretty amazing, and I'll talk more about that as we go. But you need an operating system on something like this, because you perhaps don't want it to run into a ditch while it's busy taking scientific data or whatever. Okay, and so it's very similar to the Internet of Things in its size, so this kind of processing is par for the course for really tiny devices, and we're going to talk about this kind of device in addition to the really big ones as we go. Okay, so some questions to end with. Does a programmer need to write a single program that performs many independent activities and deals with all the hardware? Does every program have to be altered for every environment? Does a faulty program crash everything? Does every program have access to all hardware? Hopefully the answer to these is no, and we'll learn why as the term goes on. Operating systems basically help the programmer write robust programs. So in conclusion, to end today's lecture: operating systems provide a convenient abstraction to handle diverse hardware. Convenience, protection, and reliability are all obtained in creating this illusion for the programmer. They coordinate resources and protect users from each other, and there are a few critical hardware mechanisms, like virtual memory, which we briefly brought up and which we'll talk about, that help us with that. The operating system simplifies application development with standard services and gives you fault containment, fault tolerance, and fault recovery. So CS162 combines things from all of these areas and many other areas of computer science, so we'll talk about languages and data structures and hardware and algorithms as we go. I'm looking forward to this term. I hope you guys are all having a good first week of classes, and we will see you on Monday. Alright, ciao.