All right, welcome back to operating systems. Today we'll talk more about operating systems. This lecture is another one where you don't have to know 100% of everything. It will be a lot of interesting stuff; some of it might go over your head, and that's okay. We're going to talk about libraries, because you are going to make libraries and they are a core part of an operating system, so we should learn about them. First off, a question for you. We actually haven't answered the question of what exactly an operating system is, because the answer is that it kind of depends. We know the kernel is definitely part of the operating system. It operates in kernel mode; good to know. But, for instance, macOS, iOS, iPadOS, watchOS, tvOS: are they all different operating systems, or what's the difference between them? Anyone want to tell me? All right, first answer: they work on different devices, so they might be different operating systems. Yeah, they kind of have different interfaces, and Apple says so, so I should listen to them. Anyone else have ideas about why these are different? Or is it marketing BS and they're actually the same? Different applications. Well, let's get into it. Applications are what we ultimately care about at the end of the day. When I showed the three core layers, it was applications on top of the operating system on top of the hardware. Libraries are an important part of an operating system, and which libraries you have to use depends on what your application does. So here are three applications: NetworkManager, which is a low-level application to manage your Wi-Fi, your Ethernet connections, all that fun stuff; LibreOffice, which is basically an open source clone of Microsoft Office; and Firefox.
So those are three different applications, and they might use different libraries. All of them probably use the C standard library, and then things are built on top of that. On Linux there is a display server if you want to display any graphics, and the current one is called Wayland. It gives you fairly direct access to raw buffers and stuff like that, so maybe you don't want to use it directly. Maybe you use another library that provides a GUI toolkit. On Linux, one of them is called GTK, and it will render buttons for you instead of just giving you raw buffers that you might not care about. Firefox uses GTK to render some user interface elements, so it would use that library. But Firefox is a web browser; it does page layout and stuff like that, so it also uses Wayland directly to display your web pages, because it renders them itself. It might also talk to a system daemon, which kind of abstracts some hardware on your machine and makes some requests for you. We won't have to use that in this course, but it's one of the things that gets used. So Firefox might use all of these together in order to work correctly. LibreOffice, on the other hand, might not request any details about the hardware; it might just use the GUI toolkit and that's about it. NetworkManager might not use the GUI toolkit, and it might use something like udev directly. So what's an operating system? Which libraries you need actually depends on the application. There's no one-size-fits-all answer; it depends. For the Apple example, all those operating systems actually run the same kernel, and they're exactly the same in that respect. A lot of the libraries under the hood are the same too; they're something like 95% the same. But they all run different applications, because Apple makes a distinction between a watch application, a tablet application, and a phone application, each with its own sort of standards.
So Apple is right in calling them different operating systems, because each one expects a certain type of application. But if Apple were a bit more open, it might depend on what applications you want to run. For instance, Android and your Debian VM both use Linux. Linux is the kernel. If you just have your little Hello World application, or something you've written in 104 or 105, well, Android uses Linux and it has a standard C library. You could run your application on your phone if you wanted to. It would work perfectly well; you wouldn't have to do anything special. So if those were the only applications you cared about, Android and Debian would probably be considered the same operating system. Almost no one is sick enough to actually run terminal applications and compile stuff on their phone, but if you really wanted to, you could have done 105 completely on your phone. Again, you would probably have to be fairly crazy, like myself. So they might be the same if you only care about terminal applications. Otherwise, Android applications are a thing, so they would be very distinct operating systems. You couldn't just run an Android application on Debian and expect it to work. What about Linux distributions? Linux is just the kernel. You might call a full operating system GNU/Linux, if you have ever heard that terminology before. Basically, GNU distributes the standard C library, which is a core part since basically every application uses it, and it gives you all the standard utility commands like cp and ls; all that stuff is written by them. So you might call Debian GNU/Linux if you want a generic term for that type of software stack. But at the end of the day, an operating system consists of a kernel, because something has to run in kernel mode, so an operating system will always have a kernel.
And then it's any libraries required for your application. If you just use the kernel directly, you don't need any libraries, but your life is going to be very difficult. Now we have to do some review in order to talk about libraries, so we have to talk about how C compilation works and all of that fun stuff. This should be no surprise if you've ever created an executable from multiple C files, and if it is new, which it is for at least one person, then this is how it works. This may have been hidden from you in the past, but what basically happens is that you write some code in these C files, and your compiler compiles each of them individually and creates a compiled version of that file called an object file. You don't really need to know that for the course; it's basically the compiled version of your code, and its name ends in .o. Then, at the end of the day, it will link them all together. That's just a generic term meaning it combines the compiled versions of all those files to form your executable. So that executable will have instructions for every single function you wrote in C. Any questions about that? This may have been hidden from you if you've used make or any other build system before. It would have done the compile and link steps for you, but if you separate them out, this is what happens. Static libraries are basically the same thing, except the point is to reuse something. Instead of compiling everything individually and combining it all into the executable, say I want to reuse util, foo, and bar. Instead of just having a .o file for each and linking them, I can combine those .o files, which have the function definitions and everything in them, into an archive.
And this archive basically consists of all those .o files sandwiched together. So this lib.a is just one file, but you can think of it as actually containing those three files. Yep? Yeah, those three .o's would be the same ones from here, instead of putting everything in with main. So the question is, why is the compiler deciding to put it in the library? The compiler doesn't decide. This is something you decide as a developer. This is me deciding, as a developer, that I want to reuse the code that's in util, foo, and bar, so I create a static library. That way, whenever I create an executable, I just have to link together main.o and lib.a. It behaves the same. The idea is that if I wanted to create a new executable from, say, main2.c, I could just compile main2.c and use lib.a, and I don't have to compile everything else and link it all together again. I just say use lib.a. So it saves us some time; otherwise it behaves exactly the same. It's just a way for you to say, I want to bundle these together because I might reuse them. You have to organize it this way yourself; this is something you pick. You probably haven't had to pick this yet, but eventually you might have to decide which code you write is actually going to be reusable. So that's a static library. It behaves exactly the same; I'm just bundling things together so I can reuse them. Now, in computer science, if I say there's a static something, that means there's a dynamic something. They kind of always come together. You may have noticed that the C standard library is not a .a file; it's a .so file. On Linux, dynamic libraries are .so files, which stands for shared object. On Windows, they're known as DLL files. On macOS, they're called dylib files. They're all the same thing.
The name just depends on the operating system. The idea is kind of the same as static libraries, with one key difference. It's still that I have two applications that want to use the same code, but I don't want to waste space by including the same thing over and over again. I want both applications on the same system to just use the same code. That's why the standard C library is a dynamic library. If you have two applications both using it, the operating system can be smart about it and load it into memory the first time something uses the standard C library. Then if another application uses it, the operating system can be smart enough to say, hey, I already loaded it into memory, so this program can use it too. We'll see how that sharing actually works once we get to virtual memory, but that's the idea behind it. It's supposed to save some memory, because I can actually share some information. So, how it looks: very similar to the static library. It's the same idea, but instead of creating a lib.a that has all the .o files, I create a lib.so, and I have to pass a special flag to the compiler that makes it do some low-level things a bit differently. Essentially it does the same thing and smushes them all together in that lib.so file. So that lib.so file will contain the definitions of any functions that were in those C files. Yep? Yeah, lib.so would be the result after the compilation and linking process, so it's the thing we actually care about. Yep? Right, libc is a .so file. Yep? So the question is whether the executable built with the static library will be larger than the dynamic one, and the answer is yes. In the static case, we're essentially just shoving all those files into the executable, so it's going to include all the function definitions and everything. In the dynamic case, we're not going to do that.
This lib.so file will have all the definitions, but when we link to create an executable, we're not going to sandwich all those function definitions into it. We're just going to say, hey, whenever you run this program, I need this library, and the operating system will have to look it up every time you run your program. So I would say I have a main, and everything else I'm going to reuse is in my library. My executable just consists of main, and whenever you run it, the operating system looks up that dynamic library, and any library functions you call will actually call the ones in there. Yep? So the question was, is the only difference a compiler flag? To create them, yeah, the only difference is a compiler flag. But the difference in behavior is that in the static case all of the library is included in the executable, and in the dynamic case the library stands by itself and gets looked up whenever you run the program. That will bring us some interesting benefits and drawbacks. Oh, and there's also a useful command line utility if you get into deep systems programming: you can run ldd on any executable and it will tell you which dynamic libraries it uses. It's a fun thing to poke around with. We won't have time to do that here, but you can go ahead and try it. So you might think, hey, there's static and dynamic; why do I have both? When should I use one versus the other? Again, static basically copies all that code and shoves it in your executable when you compile it, and dynamic keeps it separate and looks it up at runtime.
So, some drawbacks. Software evolves over time, right? People change it, they update it, hopefully they make it better. If the library you're relying on is a static library and they release an update that fixes something, you have to recompile your program, because the library is included in the executable. So if they make an update, you have to recompile everything. That might be okay if you have control of all the source code, say because you're using open source, but if you don't have access to the source code, recompiling might not be an option. So it's not great if you have to recompile for every update. Another issue is that it wastes a lot of space. If the standard C library were a static library, every executable that uses printf would have the definition of printf in it. In that case, if I have a thousand executables, I have a thousand copies of printf just wasting hard drive space. And if I execute them, they would waste memory too, because the copies would be independent and the operating system wouldn't know they're actually the same. So it would waste space both on your hard drive and in memory, whereas with dynamic libraries I just have one copy of printf in the standard C library and every single program uses it. So can anyone think of any issues with having dynamic libraries? Yeah, oh sorry, you go first. So the question is, what are the issues with dynamic libraries, given that with static I have to recompile everything for any update? Okay, sorry, let me rephrase the question: is there an issue with dynamic library updates? Yeah? Were you going to say the same thing? Okay. Same comment?
Okay, yeah, that's another one. So to repeat, there are a few issues. You might have to recompile if a static library updates, which kind of sucks. With dynamic you don't have to recompile, but you're not guaranteed that every update is good. If an update breaks something, then suddenly everything is broken, because everything relies on that one thing. So if you're making a shared library, you'd better be sure that whenever you update the code you do things correctly; otherwise you'll break everything, which is not good. Another comment was that if the application is already running when you update the library, that might also cause problems. Some kernels will keep the version the program started with when you launched it, rather than doing a halfway update, so that's another issue that's a bit trickier. The other one is that the update might just suck. This actually happened, I forget what year it was, but basically the standard C++ library did an oopsie and broke something, and suddenly every single program on your machine died and everyone had to recompile everything in order to fix it, and it sucked. We're going to see an example of a really subtle change that will break things. So the major drawback is that dynamic library updates can break executables, and the change can be really, really subtle. For example, if you didn't know this before: if you define a struct in C with fields in it, the way you define the fields determines how it is laid out in memory, and that is defined in the C standard so that C compilers agree with each other. If you compile a struct with GCC and with Clang and then call code between them, they agree on the layout of the struct and everything will be fine. It's the same thing you had when you made function calls in assembly: you have to agree. It's the same thing for compilers. Basically all
computers are just people agreeing on things over and over again and not breaking that social contract. That's basically the only way computers work, which is kind of scary and kind of amazing, but at the end of the day that's like most things in computers. So in this example we might have an executable that uses a struct defined in a dynamic library, and bad things are going to happen. There's some other text on the slide, but let's just get into the example. Like I said, structs are laid out in memory in a very specific way in C. Let's say I was creating a library for a point, one that just has an x and a y. I might define struct point like this: int y, then int x. Everyone would agree with me that that's a good struct for a point; it has an x and a y, and they're ints. Later I might change my mind and think, why did I put y before x? So I come along and change it to this version, where x comes first and then y. Well, if you get into the detail of what the compiler actually does, these are represented differently in memory. You might see the term offset in this course. You can think of it like an index into an array, except that offsets are always counted in bytes; otherwise it's the same idea. In version one, the first thing in the struct is y, and the size of an int is 4 bytes, so the first 4 bytes of the struct represent y and the next 4 bytes represent x. You would say x starts at offset 4, or 4 bytes from the beginning. In version two, x is at offset 0 because it's at the start. This is not an API change, because I can describe both structs the same way: they both have an x and a y, and both are ints. So I didn't change the API; I just changed the order, which actually changed the ABI. To explain it a bit differently, in case that didn't
make sense: in both versions, the compiler could represent the struct as an array of two ints, if that way is easier to think about. What changes is where y is and where x is. In version one, y is at index 0 and x is at index 1, and in version two, when I swap the order, x is at index 0 and y is at index 1. Any disagreements with that? It might make more sense to see it, so let's go into the library itself. Between both versions, my library code is going to be exactly the same. Here is the C file that represents my library; it's just one C file. It uses point.h, which has that struct in it. I'm not going to change this file between the two versions; the only difference is the struct. In here I have point create, which creates a point: it does the malloc and then sets the fields. Then there's a get x, a get y, and a destroy, which just calls free. So this is the code that is part of my library. Now, the code that uses the library is the example. It contains main, and it will be the only content of my executable. It uses point.h too, and just from reading it you don't know which version of point.h that is. It might be version one or version two, depending on what I compile it with. In here it creates a point where x is supposed to be 1 and y is supposed to be 2. Then I have two printfs. The first prints x and y using the library's accessors, get x and get y, and the second prints them by reading the struct fields directly. Now, if I compile this with version one and use version one of the library, they'll both agree with each other, so I'll see x is 1 and y is 2 on both lines. There's a way to play with dynamic libraries on Linux using
environment variables, so I can simulate an update like this. If you haven't seen it before, this sets an environment variable called LD_LIBRARY_PATH, and I'm pointing it at version two of my library to simulate an update to just the dynamic library. What this does is force the operating system to look for libpoint in that directory first, before it checks the default directories. It sets the environment variable just for that run of the executable. So this simulates me updating only the library to version two. If I go ahead and run this again, it's the same executable; the only difference is which library gets used at runtime. Yikes. You see the first line is still correct, 1 and 2, but the other one flipped. This is a really subtle change, but you could imagine it having a profound impact. Imagine this was drawing a user interface or something like that, where your x's and y's had to make sense, and suddenly they swapped. You would probably say, that's a pretty big bug, my screen flipped 90 degrees, that's not good. So things like that can happen. Does everyone see why that's a problem, and could you explain it? I'll show it again just to be clear. There are a lot of things going on here; there are actually four things in play: two libraries and two executables, depending on which version I compile with. On the left, in the gray boxes, are the two versions of the executable, and on the right are the two versions of the library. In the executable compiled with version one, the library calls, point create, point get x, and point get y, will call into whatever library I'm using at runtime. I don't know ahead of time which one that will be, because it's decided when I execute the thing. But because I used the struct directly from that point.h file, when I compiled with version one it was
y then x. Because I accessed the fields directly, those accesses are compiled directly into the executable; the offsets are part of the executable and they won't change. So the executable thinks y is at index zero and x is at index one. At runtime, if I use it with version one of the library, the library also uses y at index zero and x at index one for its three functions: point create, point get x, and point get y. That's why the first line is correct: it only uses the library, and the library is going to agree with itself, since it created the point and it has the accessors. And when both versions match, the executable's direct struct access is right as well, because they agree with each other. When it got flipped, what changed is that the library got updated to version two. Now the library thinks x is at index zero and y is at index one, so point create, get x, and get y from the library all use those indexes. If I use version one of the executable and call into this library, the library created the point and the library's accessor functions read it, so the library calls are still correct; that line is always 1 then 2. But for the direct access I have a mismatch: my executable thinks y is at index zero, but the library that created the point put y at index one. So they're flipped, and that's why the output flipped between the two versions. It also works the other way: if you use version two of the executable with version two of the library, that works fine, but if you use version two of the executable with version one of the library, that's another disagreement and you'll get it flipped again. Yep? So there could be a few solutions. One would be to say, oopsies, I broke the library, so you have to recompile your executable and only use version two from now on.
That is a solution, but then you get into the situation where someone says, oh, I can't update my library, it's stuck at version one, and now there's a version two and I have a mismatch the other way, which isn't great. This exact problem, this really subtle bug, is what happened with the standard C++ library. In std::list, someone made this mistake when they updated it: they modified the fields, and suddenly that broke literally every program that used a list. It behaved in really unexpected ways, and the only solution was your solution, which was to recompile everything, because everything has to use the newer version of the library now. That move to the new standard C++ library took some people years. So that is a warning for you: if you're developing these libraries, don't do things like this. You have to know what you're doing. So the question is, how does the library deal with the change? Showing the code would probably help. There's point create and point get x, which are both in the library, and they both access the struct. With version one of the library, the fields are in a certain order, and however you compile your executable, the library is going to agree with itself. For version one of the library, x is at index one, both where it sets it and where it reads it. It always agrees with itself because it's the same library, so that's why the library calls are always correct: it created the point and it has the accessor functions as well. But the two libraries are different. Version one of the library has x at index one and y at index zero, and version two of the library has them flipped, but it still agrees with itself: in version two, x is at index zero both when it creates the point and when it reads it. Yep? So this has to do with when you compile
it. When you compile, it creates an executable, and then it's stuck like that. My executable also includes point.h, which has the struct in it, so when I compile, it's stuck with whatever version of point.h I used. The same executable won't change. So that's a fun thing. There are solutions to this. One is to recompile everything; that's a bad solution. What's another solution, so we don't encounter this error? Yep, don't change the order. That's the best solution. You could add fields to the struct, and it would be fine if you add them at the end. The other solution is that if you're going to change the fields of your struct, don't include it in your header files. Don't let users compile it in; hide it from them. Only let them access things through pointers and calls into your library, and keep the struct definition inside the library. You make them do everything through pointers instead of letting them know what the struct actually looks like. That's another option, and it has an easier time with updates because everything's a reference, which is a fancy word for a pointer. Java can do all this stuff under the hood and change things because you never actually know where things are in memory, but in C you might actually know where things are in memory, and bad things might happen. For the purposes of the labs, whenever you create libraries, I would stick with the strategy of do not screw with the structs. That is a good strategy, because once you have them they're set in stone; you're not allowed to change them. The Linux kernel has structs defined. Is it ever allowed to change them? No, because then every single application on your machine might break and lots of bad things would happen. That's why people sometimes debate for years what goes in a struct in a library: once they have it, they're generally stuck with it for like 30 years, because
some things in software don't change that often. Yeah? So this is part of the give and take. You might think, hey, these updates suck and I don't want to deal with this, so I'll go back to static libraries. Then I don't have this problem, because everything's compiled into my executable and it won't change from one run to the next; it will always behave the same way. In software engineering there are fads about whether people want to use static or dynamic libraries; it just changes like flavors. If you're relying on bad dynamic libraries written by people who don't know how to write code, you're going to get fed up really fast and switch to static, like you said. That will work for a time, and then it will probably get really old, because you won't update it and you won't recompile it. It'll fall behind and have security vulnerabilities, and then you'll lose the source code and never be able to update it again, and you'll say, man, I should have used the dynamic library. And then you go back and forth between the two. So the answer is that dynamic libraries are better if the dynamic library is of very high quality; otherwise you'll probably have an easier life with static. It generally changes depending on who you ask. If you're a good programmer, you say dynamic libraries are great, because all the updates work, everything works flawlessly, and I'm the best programmer on the planet. Hopefully that is us, so we're going to make dynamic libraries in this course. Yeah? So some libraries will offer both versions, depending on what you want, and in general you use the dynamic one. The standard C library does have a static option if you want it, but the C standard library developers are generally very, very good, so using dynamic is often a good thing to do for that
unless you really know what you're doing. Okay, so there is a system for conveying what you mean through version numbers when you're using dynamic libraries. You've probably seen version numbers for stuff. For applications they're generally fairly meaningless, like Chrome being on version 100-whatever; no one really knows what that means. But for libraries, the version numbers should actually mean something. Typically you have something like 1.1.2. Given a version number like that, you usually call the numbers major.minor.patch, and the rules for incrementing them are as follows. If you increment the major version, that's supposed to signify that you made a change that broke the API or ABI. What that signals to a developer is that if I'm using version 1 of the library, I can't use version 2 without recompiling, but I should be able to use any version 1 of the library and it should be okay. The rule for incrementing the minor version is that you're adding functionality to your library: I added a new function, I didn't delete anything, and I didn't otherwise change function signatures or do anything silly like that. Then you know that, hey, my program needs version 1.1 because it uses a function that wasn't in 1.0, and you also know that a function that was in 1.1 is still in 1.2, and so on. Finally, the last number is supposed to represent a bug fix, so things should always keep working: if it worked with 1.1.0, it should work with 1.1.1. Yeah? So this is just a convention that you hope developers follow. The convention would be that if I add a new function, I increment the minor number, so I could say that in version 1.2 you can use this new function, magic calculator or whatever, and then people know, hey, I need version 1.2 for this. Yeah? So patch versions should always
just work with each other. A patch doesn't break anything; it should just fix things and make life better, so that if I'm running 1.1.1 I can upgrade to 1.1.2 and nothing breaks, and it might just work better. And if that version did break something, well, then hopefully there's an update to 1.1.3 that actually works.
Yeah, so the question is: if they add new functionality in the library, do you not have to recompile, because it'll just be included whenever you run your executable? No. If they added new functions or anything, you don't have to recompile your code, but it wouldn't be able to use those new ones unless you recompile. So if you wanted to use the new functions, you'd have to recompile, but if your code doesn't use those functions and a new version of the library adds them, your existing executable keeps working as it is. Yeah, so new functionality is a minor version. Usually if you increment either number, it should also have bug fixes and stuff that makes things better.
Yeah, time for another quick example that will be fun. They also allow for some easier debugging, so let's look at this example. It's a normal C program. It has a main that calls malloc with sizeof(int), which should allocate 4 bytes, right? Then I have a printf where I display the address that I got from malloc. This cast here is just to make the compiler happy, because it's printing off a pointer and it wants it to be a void star, which in C basically means it's a pointer and I don't care what type it is. So this program isn't actually doing anything but printing off the address that we got from malloc and then freeing it.
So, question to you, and this should be very readable: how many times do I call malloc? Once, hopefully. Once? That's a good question. Does everyone agree we called it once? Hands up for once. Hands up for twice. All right, we got two for twice. So this isn't too surprising. Look at it: this would start at main, call malloc, printf, free. I count one malloc. So let's go ahead and see this. So go here, let me see it again. So I do have one malloc. What I can do is, I wrote my own malloc, so my own malloc
will print off anytime you call malloc and then call the real malloc to do all the hard work, and it also does this for free. So my malloc and free will just print off if any one of these happens and let us see the calls that happen. This is basically how Valgrind works, if you've ever used that. If I run it, I can see how many malloc calls happen. Two, so twice was correct.
So it looks a bit weird. Clearly the size of an int is 4, so that corresponds to this malloc, and that looks right. If I look at the address, well, it ends in 12a0, and then I called free on an address that ends in the same thing, so that seems to match up. But then there's this weird malloc of 1024. Did you know you did that? Who did it? Who can explain it?
Yeah, so that is a good guess. One guess is that it's for a string constant, so this string constant here. The answer to that is unfortunately no. This is basically a global variable. It's loaded with the process, so it doesn't need to use malloc or anything for it. The compiler kind of figures it out, and it's more or less a global variable. Another good guess: does it have anything to do with the percent sign? That is very close, getting warmer. Sorry, another guess: is it the pointer? Getting colder. Close. Yeah, how do you think printf is defined? So printf made the call to malloc. That makes sense, because it had to take this string and then replace the format so that it looked like this, so it needed to use some memory. So it makes sense that printf would use malloc, and that's where the second call came from.
So, well, we just discovered something by messing with libraries. If you've used Valgrind before, you could use it on this too and get a report. This report is a bit weird. We'll go really quick and then I'll let you go. If you look here, Valgrind says, hey, there are two allocations and two frees, which is not what we saw. The library didn't free anything. I freed mine, but it didn't free its allocation. So if you look at the documentation real quick, it basically says that the standard C library might use
memory. It might call malloc, but it doesn't have to free what it allocates, because the operating system will go ahead and free all the memory when the process dies anyways. And Valgrind has special handling for this: if the allocation came from the standard C library, it gets cleaned up at exit and counts as freed, so you don't get any false positives. Unfortunately, that's all the time we have. So just remember: I'm pulling for you, we're all in this together.
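
The major.minor.patch rules from this lecture can be written down as a small compatibility check. This is only a sketch of the convention as described, not how any real loader decides things. The names `struct version`, `version_parse`, and `version_compatible` are made up for illustration, and the check assumes every library author actually follows the convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* A major.minor.patch version number (hypothetical helper type). */
struct version {
    int major;
    int minor;
    int patch;
};

/* Parse text such as "1.1.2" into *out.
 * Returns true if three dot-separated numbers were read. */
bool version_parse(const char *text, struct version *out) {
    assert(text != NULL && out != NULL);
    return sscanf(text, "%d.%d.%d", &out->major, &out->minor, &out->patch) == 3;
}

/* Can a program built against `built_against` run with `installed`?
 * Assumed rules, following the convention from the lecture:
 *   - a different major version means the API or ABI may have broken;
 *   - an older minor version may be missing functions the program uses;
 *   - the patch number only signals bug fixes, so it is ignored here. */
bool version_compatible(struct version installed, struct version built_against) {
    if (installed.major != built_against.major) {
        return false;
    }
    return installed.minor >= built_against.minor;
}
```

With this sketch, a program built against 1.1.3 is accepted with 1.2.0 installed, while a program built against 1.2.0 is rejected with 1.1.3 installed, because it may call a function that 1.1 never had.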
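
The malloc wrapper from the demo was not shown in full, so here is a minimal sketch of how one could be written on Linux with glibc. It is an assumption about the approach, not the lecture's actual code. It looks up the real `malloc` and `free` with `dlsym(RTLD_NEXT, ...)`, and it logs with `snprintf` plus `write` rather than `printf`, since `printf` may itself call malloc and recurse into the wrapper. The counters `malloc_calls` and `free_calls` and the helper `log_line` are invented for illustration, and the counters are not thread-safe.

```c
#define _GNU_SOURCE            /* needed for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical counters so the calls can be inspected afterwards. */
unsigned long malloc_calls = 0;
unsigned long free_calls = 0;

/* Format into a stack buffer and write it to stderr without using stdio streams. */
static void log_line(const char *what, size_t size, void *ptr) {
    assert(what != NULL);      /* programming-error check only */
    char buf[96];
    int len = snprintf(buf, sizeof buf, "%s size=%zu ptr=%p\n", what, size, ptr);
    if (len < 0) {
        return;
    }
    if ((size_t)len >= sizeof buf) {
        len = (int)sizeof buf - 1;
    }
    if (write(STDERR_FILENO, buf, (size_t)len) < 0) {
        /* Logging is best effort; ignore write errors. */
    }
}

/* Replacement malloc: log the call, then let the real malloc do the hard work. */
void *malloc(size_t size) {
    static void *(*real_malloc)(size_t) = NULL;
    if (real_malloc == NULL) {
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        if (real_malloc == NULL) {
            abort();
        }
    }
    void *ptr = real_malloc(size);
    malloc_calls++;
    log_line("malloc", size, ptr);
    return ptr;
}

/* Replacement free: log the call, then hand the pointer to the real free. */
void free(void *ptr) {
    static void (*real_free)(void *) = NULL;
    if (real_free == NULL) {
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
        if (real_free == NULL) {
            abort();
        }
    }
    free_calls++;
    log_line("free", 0, ptr);
    real_free(ptr);
}
```

One way to try it, assuming the code is saved as a hypothetical `wrap.c`: build it as a shared library with `gcc -shared -fPIC -o libwrap.so wrap.c -ldl`, then run an unmodified program with `LD_PRELOAD=./libwrap.so ./program`. The program does not need to be recompiled, which is the point of doing this through a dynamic library.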