All right, class size is dwindling. What is happening? Were people watching the Sabres last night? Is that what's happening? File systems are boring? Is that possible? Man, this is terrible. All right, well, it's a nice intimate class today. So we're going to keep talking about file systems. Maybe if I continue to talk about file systems in apparently the same dull and uninspired way that I've been talking about them so far, I can reduce the class size down to just one or two, and then we can just go out for coffee together and talk about operating systems and the future of computer science, et cetera, et cetera. So that'd be kind of fun. So, OK, today we're going to talk about on-disk data structures, sort of what happens at the actual block level, right? What happens on the disk? Where do things end up? We'll use some examples drawn from the EXT4 file system, showing you how you can use some simple tools to kind of poke around a little bit. And then toward the end we'll talk about the actual mechanics of resolving path names and finding data, right? So there were two things we really wanted to be able to do, right? One was translate a file name into the file contents, and the other was modify the file, or essentially translate an offset into a data block, right? So those are the two things that file systems are doing: translating file names to inodes, these index nodes that provide more information about where the rest of the file is, and then, once we have the index node, translating an offset to the actual data block, right? Where we will find the data that we are going to read, write or otherwise modify, okay? All right, so some of you guys have seen me over the past month, month and a half wearing a whole series of CS161 t-shirts, right? At Harvard, every year this course was offered, there was a competition to design a class t-shirt, and the t-shirts that you guys have seen me wearing have been the output of that, right?
So I would like to do something similar for this class, right? And I don't know exactly how this is going to work. I don't know exactly how we're gonna pay for it, but we'll figure that out. The shirts will either be free or they will be heavily subsidized by me and/or the UB CSE department, right? Depending on who I can trick into paying for it, including myself and/or my wife. So we'll talk about that, but other than my poor spelling of competition, I think that the forum is a great place to do this. So I'm aware that there is this CSE 421 memes thread that has been started. Those look like at least kind of nice starters for t-shirt ideas. You know, I'm just gonna let you guys pick a t-shirt, right? I mean, really the only rules are that it has to be clean and, you know, that's it, right? So, you know, come up with a design that is kind of commemorative of what you think your experience has been. You should start that process now, before the t-shirts get really dark and angry, as they might over the next month as you guys transition from assignment two to assignment three. But yeah, so let's get going on that. And then just a final reminder, assignment two is due Monday, right? And I think that apparently people are aware of this fact and are making a strategic decision not to come to class, right? I know the class is early. You know, I like when you guys are here, it's more fun for me. Clearly the videos are online, but, you know, there's an assignment two deadline coming up, so crank away, guys, you know? People have been coming to talk to me, they've had great questions, so I know that people are making progress. I'm really excited to see what you guys can accomplish for this assignment, all right? Any questions about sort of course logistics? Stuff. I've got a small and subdued group this morning, so this is good. All right. So let's go back to Monday. Monday, Monday, so far away.
I know that there are probably thousands of lines of code that have emerged from your fingers between you and Monday. But let's talk about file systems. So on Monday, what did we talk about? On Monday we finished talking about kind of where different file semantics come from. And we talked about Unix semantics, we talked about hierarchical naming, we read a nice poem by Christina Shuddleworth. You know, so any questions about the material that we saw on Monday? Sort of Unix file semantics, hierarchical naming, hierarchical directories, questions about this. We also introduced some file system design principles, which we probably didn't linger on long enough, which is why it's good that we're about to review them. But any questions about Monday's material? It is Wednesday, right? It is Wednesday. Okay, so Monday was fine. All right, so who remembers some of the design goals? So we talked about the fact that all the file systems that we're gonna look at support some of these standard features, right? They all support the idea of hierarchical directories and hierarchical file names, right? They clearly allow you to locate files and retrieve their contents. That's pretty basic stuff, right? Underneath the covers, however, there are a lot of differences, and those differences are driven in many ways by the design goals of file systems. So what are file systems trying to do at a high level? Do you guys remember what some of the design goals of a file system are? What would you like your file system to be able to do? And how would you like it to be able to do that? I'm just gonna start picking on people. Right, right, so exactly. So we've got these complex, ugly devices, and we wanna provide that file interface, right? Now all the file systems are gonna do that, right? So that's certainly one design goal, right? But what about that file interface? I mean, what would I like it to be able to do? How would I like it to behave?
Alex? Survive failures. Survive failures, okay, so that's a great one. Of course, I hate that these are in order. I wish I could put them up in the order that people suggested them, because that's an awesome one, right? But of course first, what do I need to do? I actually need to be able to translate the name of a file into the contents, right? And I wanna be able to do that efficiently. So when somebody asks me, can you get /path/to/foo/bar? I need to be able to figure out, here's the data that's associated with that, right? So efficiently doing that, doing that quickly, okay? I need to support the file interface, right? So I need to allow files to move, to grow, to shrink, to change. This is all pretty basic stuff, right? I probably want to try and improve access to single files. And we'll talk a little bit about that today, actually, because some of the file systems and some of the new file system techniques actually do this, right? And there's a trick to optimizing access to single files, right? And then another thing that I might want to do is optimize access to multiple files, right? So for files that are related in some way, files that are frequently accessed together, I can use my knowledge of the disk geometry in certain cases to do a better job of this, right? So that's something. And then, as Alex pointed out, surviving failures, right? And this is actually an area that's received a lot of attention and a lot of design, and we'll talk about it, not today, but probably Friday and maybe next Monday, right? So some of the standard techniques and some new techniques that have been developed to help file systems survive failures, right? And the reason for this is that, you know, disks are these slow devices, and crashes do happen. File systems are essentially these big complex data structures, right? So it's possible that a crash at some point can leave the file system in an inconsistent state.
And at worst, that could corrupt the entire file system and essentially make it impossible to find any of my files, which would make me sad, at least. It wouldn't make me that sad, actually. I have a lot of backups, right? So it maybe would make you guys sad. How many people in here keep really good, sort of consistent, religious backups? Sad, yeah. So for the rest of you, let me just make a prediction. For the rest of you guys, there will be an event in your future that will cause you to raise your hand the next time that question is asked, right? So you're welcome to wait around for it and let it sort of provide its lesson the hard way, right? You can also find a way to do backups now and maybe avoid that, all right? Questions about file system design goals? Are you stretching or? No, no. Okay, great. Oh, yeah, yeah, yeah. So actually, that's great. That's not on here, but that's a great design goal. So there are file systems, not some of the ones that you guys are used to, but there are file systems that do versioning, right? And they do automatic backups, right? So a lot of network file systems do this. How many people have ever used an NFS file system that has a snapshot folder? So a lot of NFS file systems would be configured to regularly record the contents of directories and save that on another part of the storage device, so that when you accidentally, you know, fat-finger your rm -rf command, you can get in there and recover some of the things that you accidentally blew away, right? Yeah. How is that different from surviving failures? So there are two kinds of failures, right? There are failures caused by acts of God: the power went out, I tripped over the power cord, I dropped the machine on the floor, right? Then there are failures that are caused by users just making mistakes, right?
So at time A, you said, you know, I will come up with an example that's like, I am so through with that guy, he's not my friend anymore, I'm deleting all of his email, right? And then you go out for coffee with him and, you know, you have this great heart to heart, and then you come back to your computer and you're like, I wish I had all that email from that nice friend of mine, right? So that's not really a failure, right? Because the file system did what you wanted it to do in the first place, which was that it deleted all those files. But there are versioning file systems that are built with this ingrained assumption, which is that users make mistakes, right? And so it's possible that what you told the file system to do at time T, you might wish that you hadn't, right? And also part of this is driven by the fact that a lot of network file systems have gobs of storage, right? And to the degree that you have gobs of storage, why not use the space that's not being used for visible files to store older versions of things, just in case stuff like this happens, right? So a lot of network file systems are administered by people who have this sort of view of the world that essentially allows them to think about users as idiots, right? And rather than pulling out the tape backup every time, or rather than telling the user, I'm sorry, we have to completely reinstall your system, they're like, oh, just go look in the snapshot folder and you'll find the file, right? So to some degree, these are good features. I wish I had remembered that, that should be up here. Yeah, yeah, you made that point. That was a great point. So providing versioning and some form of built-in backup, that's what I'm trying to say, okay. All right, so now, you know, given that we have this set of design goals, and backup and versioning should be on there, potentially, what makes file systems different, right?
So we talked about, you know, they all have these goals, they all use a lot of the same data structures, but what makes them different, right? What do they do differently? Anybody remember? Yeah, like where do they put things on disk? Like where are the actual data structures? Hi. You're having a good day, aren't you? Where are the actual data structures, right? And what are the data structures that file systems use, right? Today we'll start to introduce you to some different design options for certain parts of the file system, right? So that's one thing. So, you know, the on-disk layout, the data structures, and then finally, crash recovery is one area where there's less of a standard, right? There's less of a standard interface. A lot of this goes into just implementing that standard file interface, and so a lot of it ends up looking very similar, but crash recovery is the case where file systems are given free rein to do things a little bit differently, right? So with some file systems, when you crash, you have to run some program that very, very slowly tries to rebuild the file system. Other ones come up immediately and just say, hey, here's how things work, great. But there are some design decisions there, okay? All right, so let's talk about why this is hard, right? Remember, we talked about the fact that one of the reasons this is hard is that I have this big complex data structure, it's got lots of moving pieces, and typical file system operations require modifying lots of different things, right? And making those modifications in a consistent way. So let's walk through our example. So let's say I want to write data to a file, okay? I've got a file, and let's say I want to append some data to the end of it, right? So what are the things that have to be done here, right? In really any order, right? Some of these are ordered, some of them are slightly unordered, right? But what's one thing that I have to do?
I have new data I want to write to a file. So I need to find the file, right? I also need to find space, right? So I need disk blocks to put that data into, okay? So I've got to locate some disk blocks. That's one of my tasks. What else? Determine, oh, right, right, right. So the disk blocks, I have to find empty disk blocks. I have to find free disk blocks, disk blocks that aren't, yeah, that aren't associated with any other file. If I found disk blocks that were in use by some other file, that would eventually make somebody sad, right? Because their MP3 would have this little bit in the middle of it that was like my text data, right? And that would be not good, okay? So now, the next thing I need to do is actually attach those blocks to that file, so that when I find the file later, I find the new blocks that I'm about to write to, okay? What's the third thing that I need to do, right? I mean, I need to update the file size, right? And that might be stored in some other place, or it might be cached in the file's inode itself, right? It gives me an idea of how many blocks are in the file so I can quickly return answers to things like ls. What else? Carl. Maybe, if you're a versioning file system, marking it as dirty before you begin writing. Yeah, I might have to mark some of the on-disk data structures in a certain way if I'm doing versioning, sure. And then actually copying the data out, right? So actually writing the data to the file blocks, right? And again, I mean, one of the things we talked about is this is potentially modifying a set of disk blocks. This is modifying a different set of disk blocks. This is modifying maybe a different set of disk blocks, although it might be the same one as number two. And this is modifying a different set of disk blocks, okay? All of these operations are slow, and all of them kind of have to happen before the data is actually in the file and everything is consistent on disk, right?
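
The append steps listed above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the names `TinyFS`, `Inode`, and `append` are hypothetical, the "disk" is a Python list, and the file is assumed to end on a block boundary so that new data always starts on a fresh block. A real file system performs each step as a separate, slow write to a different on-disk structure.

```python
# Toy model of the append steps from the lecture. All names here
# (TinyFS, Inode, append) are hypothetical and exist only for illustration.
from dataclasses import dataclass, field

BLOCK_SIZE = 4096


@dataclass
class Inode:
    size: int = 0                                # file length in bytes
    blocks: list = field(default_factory=list)   # data block numbers, in file order


class TinyFS:
    def __init__(self, num_blocks):
        self.in_use = [False] * num_blocks       # free-block map
        self.data = [b""] * num_blocks           # block contents

    def append(self, inode, payload):
        """Append payload, assuming the file currently ends on a block boundary."""
        needed = -(-len(payload) // BLOCK_SIZE)  # ceiling division
        # Step 1: find free blocks and mark them as in use.
        free = [n for n, used in enumerate(self.in_use) if not used][:needed]
        if len(free) < needed:
            raise OSError("no space left")
        for n in free:
            self.in_use[n] = True
        # A crash at this point leaves blocks allocated but attached to no file.
        # Step 2: attach the blocks to the file.
        inode.blocks.extend(free)
        # Step 3: update the file size.
        inode.size += len(payload)
        # Step 4: copy the data into the blocks.
        for i, n in enumerate(free):
            self.data[n] = payload[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
```

In this model the four steps touch three different structures (the free-block map, the inode, and the data blocks), which is why a failure between any two of them can leave the whole thing inconsistent.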
And depending on where a failure happens here, I could be left in a variety of bad places, right? So for example, let's just play a little game here. So let's say that right here, between one and two, is when Chuchu runs into the side of the computer and knocks it over and it shuts down, okay? So I boot up again. Now what's wrong with the file system, potentially? Right, so I've got disk blocks that are marked as in use but they're not associated with any file, okay? And why is this bad? It's a waste of space, but why is it a waste of space? What will never happen to those disk blocks, right? They'll never be reclaimed, because when I delete a file, I'm assuming I'm gonna figure out what blocks are in the file and mark those blocks as free, but these blocks aren't associated with any file. So they're just kind of sitting there, right? They're marked as allocated, but depending on how I do free, they may never be freed. Now what could I do to discover those disk blocks? What could I do to see if I could find any disk blocks that aren't associated with any file, yeah? So that's interesting, right? The problem is that some files, you know. No, no, no, okay, so you're getting closer. Let's say I boot up again, right? And I'm trying to find these blocks that aren't associated with any file. What could I do to the file system to try to locate these blocks? Right, so essentially I could walk through the entire file system, and I could do this in memory, and I could have a data structure and I could mark. Every time I open a file I could figure out what blocks are in the file and I could mark those, and then when I was done I could compare my list of marked blocks with the list of allocated blocks that the file system has. And that should allow me to see this, because what I'll see is that, hey, you know, these blocks are marked as allocated but they're actually not part of any file, right?
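
The scan just described (walk every file, mark the blocks it owns, then compare against what the file system believes is allocated) can be sketched in a few lines. The data layout is an assumption made for illustration: a set of allocated block numbers and a dictionary mapping each file name to its block list; `find_leaked_blocks` is a hypothetical name.

```python
def find_leaked_blocks(allocated, files):
    """Return blocks marked allocated that no file references.

    allocated: set of block numbers the file system has marked in use.
    files: mapping of file name -> list of block numbers (hypothetical layout).
    """
    referenced = set()
    for blocks in files.values():      # walk every file
        referenced.update(blocks)      # mark each block it owns
    return sorted(allocated - referenced)


allocated = {3, 4, 5, 9, 10}
files = {"notes.txt": [3, 4], "song.mp3": [5]}
leaked = find_leaked_blocks(allocated, files)   # blocks 9 and 10 belong to no file
```

The cost is the point: the scan has to visit every file on the file system before it can say anything about a single block.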
And those blocks are ones that I can probably just fix by marking as unallocated, okay? Now of course, again, the problem here is that I have to run this big disgusting task when I boot up my computer, right? Which isn't gonna make people that happy, right? So again, we'll talk later about some better ways to do this, right? Ways that allow me to recover faster. All right, and again, from the perspective of our process, all this stuff needs to happen synchronously despite the fact that we're actually touching a lot of different disk blocks. Okay, any other questions about Monday's material? File systems, yeah. Yeah. Okay. Okay. So I don't know enough about this specific library to answer that question, right? So what you're saying is that I can put files into a folder without moving them on disk? Yeah. Yeah, so I don't have that in today's slides, but a lot of file systems for a long time have supported the idea of what's called a soft link, right? And a soft link, have you guys ever created a symbolic link? I mean, yeah, people are kind of familiar with these on Unix, right? What a symbolic link literally is, is a special file that contains the path to some other file, right? And the way that your system dereferences it is that when it opens that, it reads the special file's contents and it just tries to resolve that new path, right? And so what that means is that the symbolic link file is really just holding this file name, right? And the file it names can be anywhere else on disk, right? There's also a concept in Unix of a hard link, which I didn't slip into today's slides, but I can probably talk about. So can you pause that question and we'll come back to it maybe in like a lecture or two? Because I think it'll make more sense when we talk about symbolic links. Okay, sorry, any other questions on file systems? That was a good question. It's a little bit ahead of me.
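
The description of a symbolic link as a special file holding a path can be checked from user space. The sketch below uses Python's standard `os.symlink` and `os.readlink` on a Unix-like system; the file names `real.txt` and `alias.txt` are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "real.txt")
    link = os.path.join(d, "alias.txt")
    with open(target, "w") as f:
        f.write("hello")
    os.symlink(target, link)          # create the special file
    stored_path = os.readlink(link)   # what the link itself contains: a path
    with open(link) as f:             # opening the link resolves that path
        contents = f.read()
```

Here `stored_path` is simply the path of the target, and reading through the link returns the target's contents.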
Yeah, yeah, yeah, okay, so we'll get to this a little bit today. So maybe you guys have seen this. I mean, if you've implemented open and close, right? You've used the OS/161 virtual file system, right? This is vfs_open, vfs_close, right? VOP_READ, VOP_WRITE, right? What this is, is an implementation of a standard way to support multiple file systems, right? And this gets gross. If you guys have looked into this, you probably have discovered that it's kind of gross, mainly because C doesn't support nice inheritance and objects and nice sort of interface definitions, right? So the way that you can hack this together in C is by using function pointers, right? And function pointers are their own whole, you know, like function pointers are pretty gross, right? But the idea is that you create a standard file system interface, okay? And then in order to write a new file system, you have to implement that interface yourself, right? If you look, the VFS layer defines an interface, and client file systems, in order to be supported, have to implement those calls, right? And what actually happens is when you guys make those calls to vfs_open and vfs_close and then the VOP_READ and VOP_WRITE calls, those are actually being mapped onto the emufs layer, right, which you can find buried in the architecture-dependent part. So this is kind of the standard way, right? The operating system defines an interface and the file system is required to support all those calls. Now those calls don't have to do anything, right? Like for example, emufs has some parts of the interface that just return unimplemented because they haven't done it, right? But you have to at least allow those calls to be made, right? So yeah, that's a great question. So this is, you know, this is a case where it's kind of interesting, right?
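
The pattern described here, a fixed interface that every file system must fill in, can be sketched with a table of callables. This is not the OS/161 API: in C the table would be a struct of function pointers, and every name below (`RamFS`, `make_ops`, `vfs_call`, `unimplemented`) is hypothetical. Python is used only to keep the sketch short and runnable.

```python
# Hypothetical names throughout; this only mirrors the shape of a VFS layer.
def unimplemented(*args):
    raise NotImplementedError("operation not supported by this file system")


class RamFS:
    """A toy file system that keeps file contents in a dictionary."""

    def __init__(self):
        self.files = {}

    def read(self, path):
        return self.files[path]

    def write(self, path, data):
        self.files[path] = data


def make_ops(fs):
    # The operations table plays the role of the C struct of function pointers.
    return {
        "read": fs.read,
        "write": fs.write,
        "symlink": unimplemented,   # part of the interface, but not supported
    }


def vfs_call(ops, name, *args):
    # The generic layer knows only the interface, not which file system is behind it.
    return ops[name](*args)
```

Every entry in the table has to exist so the generic layer can call it, even if, as with `symlink` here, the call only reports that it is unimplemented.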
We talked with scheduling about the fact that the operating system doesn't provide a standard scheduling interface that would allow me to easily implement multiple schedulers. With file systems it does, right? So it provides a standard interface, and then if you wanna write a Linux file system, you can go and there's probably documentation about exactly how to do it. It says here are the functions you have to write, right? Like, I'm gonna give you a bunch of disk blocks and those are yours to use, and then you have to support these functions, right? You're gonna get passed a path and asked to translate it to data, right? And so those are the kind of things that you would have to implement, right? But yeah, certainly on one system you can have multiple file systems. You know, as long as they're using distinct blocks on disk, right? This comes back to partitioning, which we talked about before. I can break up all the blocks on disk into multiple groups and then I can put a different file system on each one, right? Yeah? So it's more complicated than that, right? I mean, that kind of atomicity would allow me to provide atomicity between multiple threads, but normally with file systems what we're really more concerned about is a more transactional definition of atomicity, right? Where the disk itself. So remember, atomic operations are supposed to essentially be indivisible, right? That's this idea of atomicity. So if the machine fails halfway through, the operation either should happen or not happen, right? And what happens with file system consistency is, if you break that guarantee, you can leave things in this weird state, right? Like we just talked about, that can mean either you have to do this expensive process of repair or you have to give up and the file system might be corrupted forever, right? So in general, that atomicity is a little bit stronger. We'll talk about this when we get to failure, all right? Okay, so let's look at what actually happens on disk, right?
So today we're gonna talk about how we translate paths to index nodes, or inodes, right? How does this actually happen, right? I give you a path, and what the file system wants to find is the inode, because the inode contains all of the other information about the file: the attributes, pointers to the data blocks. That's really it, right? That's what we care about, right? So again, now that we have the inode, how do we find the data blocks? So what data structure is in the inode allowing me to locate the data blocks that are associated with the file? And this is a little bit of a challenge, right? Because a lot of files are really small, right? I might have a file that's like 4K, and then I can have files that are gigabytes, right? You know, like your entire HD-quality rip of some movie, right, might be like three gigs or something. So how do I have a data structure that allows files to get really, really huge, but also respects the fact that most files are not huge, most files are small, okay? And then, you know, how do I allocate and free these inodes and data blocks on disk, right? Okay, and I'm trying to keep this at sort of a high level, because what I want to do next, when we start talking about FFS and the log-structured file system, is talk about different design decisions that those file systems made. But in order to make things concrete, I'm gonna use some examples that are from an EXT4 file system, right? EXT4 is kind of like a classic file system design, right? It's not super fancy, it's not doing anything really, really crazy, right? It's relatively simple and straightforward and popular. I mean, it's kind of the standard Linux file system today, okay? All right, so let's just pause before I confuse you guys a little bit and get something straight here, right? So remember we talked about disk sectors, okay? The sector is the smallest unit on the disk that I can actually read or write, right?
And normally, when we talk about a sector in this context, we refer to a small unit of disk data, like 256 bytes, right? Disks think in terms of sectors, all right? The problem is that file systems think in terms of blocks, right? And file system blocks are usually bigger than sectors, they're usually like 4K, all right? Why would I want file system blocks to be 4K, right? I mean, there are two questions in there. One is why would I want them to be bigger than sectors? And the second question is why 4K, all right? So first question, bigger than sectors: why would I want a file system block to potentially always be eight sectors? This is close, right? But it's not about trapping into the kernel, right? I do probably want to write things in bigger chunks, but if I create a block on disk of, what is it gonna be? 16 sectors, where are those sectors going to be, right? I'm gonna group them together, they're all gonna be right next to each other, right? On disk, and so writing that block is going to be a very, very efficient operation, right? Because I'm gonna seek once and I'm gonna, you know, let the sectors rotate under the head and go, boop, boop, boop, boop, boop, and I'm gonna write out 16 sectors all at once, right? Where's Chuchu? Uh-oh, may have escaped. Chuchu? All right, let's pause for a minute. All right, disaster averted. Okay, so I wanna write, you know, a large number of sectors because I can stream them out to disk very efficiently, right? What about 4K? Where do you think 4K comes from? Same size as a page, why? What am I gonna do to improve performance? I'm gonna cache things in memory, right? My memory page size is 4K, and so making the disk block size the same size is kind of convenient, because it means that a disk block fits in one page of memory, right? So to allocate a spot for a disk block in my file system cache, I'm allocating one page, 4K, and operating systems are good at allocating pages, right?
As you guys will find out in assignment three when you write the page allocator, right? Okay. So finally, some newer file systems even use a larger chunk, right? And this is called an extent. An extent doesn't have a fixed size. An extent is just a series of contiguous blocks on disk, okay? And why would file systems wanna write even bigger chunks of data into contiguous blocks on disk? The answer is up on the slide. So it's gonna hold a big chunk of the same file, but what does this improve? Right, it improves performance, right? Cause again, I seek once, and depending on the size of the extent, all of my blocks might be on one track, or if they span multiple tracks, the seeks are still small, right? So if I can get a huge chunk of file data into a huge chunk of contiguous data blocks on disk, I'm doing great, right? My life is good. And new file systems, as Isaac pointed out in one of the last classes, try to play games in order to delay writes, to allow me to write as large chunks as possible, right? EXT4 actually does this as well, it turns out. So file systems will try to figure this out when you start writing to a file, right? Let's say a process starts writing to a file and it writes one 4K block, another 4K block, another 4K block, another 4K block. If I find all of those blocks one by one, they could be all over the disk. If I wait and I find one 16K extent, then all those blocks will be right together, right? So to some degree, this is another case where procrastination, right, not looking for disk blocks right away, allows me to do things more efficiently. Carl, you had a question or at least a proto-question. Yeah, yeah, yeah. Yeah, so we're still in spinning disk world. Yeah, right, right, right. So some of this, you're right. So some of this is definitely still spinning disk world, right?
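
The difference between grabbing blocks one at a time and waiting to grab one extent can be sketched with a free-block map. This is a minimal first-fit search, not how EXT4 actually allocates; `find_extent` and the sample map are hypothetical.

```python
def find_extent(in_use, length):
    """Return the first block of a free run of `length` blocks, or None."""
    run_start, run_len = 0, 0
    for n, used in enumerate(in_use):
        if used:
            run_start, run_len = n + 1, 0   # run broken; restart after this block
        else:
            run_len += 1
            if run_len == length:
                return run_start
    return None


# True means the block is already in use.
in_use = [True, False, True, False, False, True, False, False, False, False]
one_by_one = [n for n, used in enumerate(in_use) if not used][:4]
extent_start = find_extent(in_use, 4)
```

With this map, allocating four blocks one at a time takes blocks 1, 3, 4, and 6, scattered across the disk, while waiting until all four writes are known finds the contiguous run starting at block 6.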
Now I haven't thought about this, but for Flash, I think that you're also correct in that Flash also, to some degree, benefits from large writes, right? Because Flash has to erase large sectors before it can write them. So if I do four writes, that might mean four erases followed by rewrites, whereas if I do one big write, it's one erase followed by a rewrite, right? We'll talk a little bit more about what makes Flash file systems different once I've finished introducing you to state-of-the-art 1970s technologies, right? All right, so let's talk about inodes, okay? So these are EXT4 inodes, right? One inode per file, right? Remember, the index node stores the attributes of the file and allows me to find the data blocks, okay? Those are pretty important things, and I have one inode per file, okay? Inodes in EXT4 are 256 bytes, okay? And what that means is that there are 16 of them packed into one disk block, right? Disk blocks are 4K, 256 bytes per inode, so that's 16 per disk block, okay? The inode contains the location of the file's data blocks. That's kind of the whole point. This is the index node, right? The index of the file. It includes permission information and a bunch of timestamps, right? I'm gonna show you this information for a specific inode in just a second, okay? Inodes are named and located by number, right? You know, we've been talking about these nice human-readable names. I mean, to some degree, they're nice, okay? But file systems and computers, they don't do names, right? They do numbers, right? They're good with numbers, bad with names, okay? And so inodes are all numbered on your file system, and the number is how the file system locates them, right? And we'll come back to how that works in a second, okay? So there's this great tool called debugfs, which I encourage you guys to play around with. debugfs actually allows you to do all sorts of fun things to your file system.
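
The packing arithmetic above (256-byte inodes, 4K blocks, 16 inodes per block) also says where a given entry of an inode table lives. A small sketch, where `locate_in_table` is a hypothetical helper and the position is assumed to be a zero-based index into one contiguous table:

```python
INODE_SIZE = 256      # bytes per inode, as stated in the lecture
BLOCK_SIZE = 4096     # bytes per disk block
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE   # 16


def locate_in_table(index):
    """Map a zero-based position in an inode table to (block, byte offset)."""
    block = index // INODES_PER_BLOCK
    offset = (index % INODES_PER_BLOCK) * INODE_SIZE
    return block, offset
```

For example, entry 17 of a table sits in the table's second block (block 1) at byte offset 256. Because the size is fixed, finding an inode by number needs only division and a remainder, with no searching.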
So if you have a file system that you don't care about, maybe you create a little partition and you wanna play with a file system, you can use debugfs to change all sorts of on-disk structures that you should not be able to change, okay? You can allocate blocks, you can change pointers, it's kind of fun. It's like, here, please mess up your file system, right? So don't do this on a file system you care about. This was all run in read-only mode, right? Because otherwise I would've probably done something dumb. Okay, so what is this doing, right? So I ran a stat command and, again, file systems understand numbers, right? So this is for inode two, right? Inode number two. It turns out that inode number two is a special inode, and we will talk about why it's special in a second, right? But this is the interpreted information that is stored in that inode structure, right? debugfs has read that binary data and printed it out in a nice way, okay? And what does it tell me? So it tells me what type of file it is, right? This is a directory. It gives me some information about the permissions of the file, including the read-write permissions and then also the user and group, right? User is zero, group is zero, any guesses who that is? Root. The size of the file, so here's a good question. You guys have done ls on your system and it normally prints the size of files, right? For directories, it usually prints 4K, why? Because that's the block size, right? Remember, a directory is just a normal file, right? And a directory, until it contains too many entries, is 4K, right? Because that's the smallest block size. So that's the fewest number of blocks I can associate with the directory, right?
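
Several of the fields shown in that inode dump (type, permissions, user, group, size) are also visible from user space without debugfs. The sketch below uses Python's standard `os.stat` on a freshly created temporary directory; the values it reports depend on the machine and file system, so none are asserted here.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as d:
    info = os.stat(d)
    is_dir = stat.S_ISDIR(info.st_mode)               # file type comes from the mode bits
    summary = {
        "inode": info.st_ino,                         # the inode number
        "permissions": stat.filemode(info.st_mode),   # a string such as "drwx------"
        "uid": info.st_uid,                           # owning user
        "gid": info.st_gid,                           # owning group
        "size": info.st_size,                         # size in bytes
    }
```

On many Linux file systems a new directory reports a size of one block here, which matches the 4K that ls shows, but that is a property of the file system and not something this code guarantees.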
If you put a directory on your system and you create like 10,000 different names in it, you will see the directory size start to grow because as you add names to the directory, the file system runs out of space in that 4K area for the names and it starts allocating more space, right? But initially, directories are allocated to be 4K, okay? So I think that this number here, the link count, is a count of the references to this inode in other places on the disk, although I'm not sure. Let's see here. We've got a whole bunch of timestamps, right? I'm gonna get these wrong. This is the creation time, right? So you can see this is actually on the CSE 421 web server, so you can see the date that I set it up. This is the, oh man, did I write this down? I did, ha, good, okay. The creation time, the attribute modification time, which in order to confuse people is called ctime, the delete time, which isn't defined for this because it's still there, and the access time, right? So the access time is here. I don't know what mtime is, is mtime even on here? Yeah, the content modification time, right? So content and attributes, right? So the last time I added or removed something from this directory was February 29th, and that's weird because I don't know what I would have added or removed, but oh, you know what I probably did? I probably updated the kernel, right, which is stored in this directory, all right? Okay, so you might be wondering, right? I mean, we talked about names, and one of the things the file system has to do is translate names, and now I've introduced this idea of inode numbers. How do I translate an inode number, right? On some level, I've got to start somewhere, right? Like, there has to be something that the file system knows how to find, right? Like, at this point, the file system doesn't know how to find anything, right? Finding names requires finding inodes, and finding inodes requires what, right?
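The fields that debugfs printed for inode two can also be seen for any path through the ordinary stat interface, without touching the raw disk. The following is a minimal Python sketch of that idea, not the debugfs tool itself; the `describe` helper and its dictionary keys are names made up for illustration. Note that on Unix `st_ctime` is the attribute modification time discussed above, not the creation time.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return the inode attributes that stat exposes for path.

    Hypothetical helper: the key names are chosen for this example only.
    """
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "inode": st.st_ino,                  # the inode number
        "type": kind,
        "mode": stat.filemode(st.st_mode),   # permission string
        "uid": st.st_uid,                    # 0 would be root
        "gid": st.st_gid,
        "size": st.st_size,                  # bytes
        "links": st.st_nlink,                # link count
        "atime": time.ctime(st.st_atime),    # last access
        "mtime": time.ctime(st.st_mtime),    # last content modification
        "ctime": time.ctime(st.st_ctime),    # last attribute modification (Unix)
    }


if __name__ == "__main__":
    # Use a scratch directory so the example does not depend on any real path.
    with tempfile.TemporaryDirectory() as scratch:
        for key, value in describe(scratch).items():
            print(f"{key:>6}: {value}")
```

Running it against a directory on an EXT4 volume should show the 4K directory size mentioned above, although other file systems may report directory sizes differently.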
How do I find an inode? What do you think? What's that? There's a table, right? And when is that table created? What's that? So when I create, exactly. So when I create the file system, I create a single table or set of tables that stores the inodes, and those tables are stored at a well-known location, right? Or the location of that table is stored in the superblock or somewhere where the file system knows how to find it, right? So the point is that, you know, I take my entire disk, and when I create the file system, when I format it, I create all the inodes that I'm ever going to have on the file system, okay? And, you know, on some file systems, they're all packed towards the beginning of the disk. What's the problem with this, right? What if I put all of my inodes right here? Like, this is a very small disk, it only has seven inodes. Right, so if these are all my data blocks, if I'm accessing a file that has data blocks way over here, I'm constantly jumping back and forth between the inode table and the data blocks, right? So what EXT4 does to mitigate this is it puts inode tables at multiple locations throughout the disk, right? So there's like 1,000 inodes right at the beginning and then there's a bunch of data blocks and then there's 1,000 more inodes and a bunch of data blocks and, you know, it tries to allocate data blocks close to the inodes that they're associated with, right? Pretty obvious. But what are the other consequences of this? So I'm creating all my inodes at format time. So what does that mean? What's one limitation of the EXT4 file system? Remember, I said one inode per file. What's that? Yeah, so now I've got a limit on the number of files that I can ever have on the system, right?
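Because the inode tables are fixed-size arrays created at format time, going from an inode number to a disk location is plain arithmetic. Here is a small Python sketch of that calculation, assuming inode numbers start at 1 (as I understand the EXT4 layout) and that each table is a contiguous run of blocks. The `locate_inode` function, the `InodeLocation` type, and the table start blocks used in the usage note are hypothetical; on a real file system the table locations come from on-disk metadata.

```python
from typing import NamedTuple


class InodeLocation(NamedTuple):
    group: int   # which inode table (which region of the disk) holds the inode
    index: int   # slot within that table
    block: int   # disk block that contains the inode
    offset: int  # byte offset of the inode inside that block


def locate_inode(ino, inodes_per_group, table_start_blocks,
                 inode_size=256, block_size=4096):
    """Map an inode number to the block and byte offset where it lives.

    table_start_blocks[g] is the first block of table g (assumed input).
    """
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    # Inode 1 is slot 0 of table 0, so shift by one before dividing.
    group, index = divmod(ino - 1, inodes_per_group)
    # 256-byte inodes pack 16 to a 4K block.
    block_in_table, offset = divmod(index * inode_size, block_size)
    return InodeLocation(group, index,
                         table_start_blocks[group] + block_in_table, offset)
```

With made-up values of 8192 inodes per table and tables starting at blocks 1000 and 33768, `locate_inode(2, 8192, [1000, 33768])` lands in table 0 at block 1000, byte offset 256, and inode 8193 is the first slot of the second table.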
And, you know, when I was poking around looking at the material for this, I found some people online who were complaining that they couldn't create new files on their file system, despite the fact that they had empty space, because they had run out of inodes, right? So if you have a lot of very, very small files on your system, EXT4 may run out of inodes, right? What EXT4 does is, okay, so inodes may run out, we already got to that. So we can run out of inodes before we run out of data blocks. And EXT4 approximates the relationship between inodes and data blocks by essentially allocating one inode per 16K of data blocks, right? So what does that mean about EXT4's guess about the average file size on your system? That it's 16K, right? I mean, the idea is that I'm allocating inodes based on some assumption about the average file size. And I'm assuming the average file size is about 16K, okay? Is that true? I don't know, it probably is, actually. You probably are surprised because a lot of the files that you work with are big, right? MP3s, you know, movies that you downloaded from completely legitimate websites, things like that, right? Whereas, what you don't see is that there's an enormous number of files on your system, many of them configuration files, that are small and are under 4K, right? Sometimes you guys don't notice those files. There are also files on your system that come and go very, very quickly that are also very small, right? Lock files and other things, right? So 16K isn't a terrible assumption, but if you want to, when you format the disk, you can actually tell EXT4 to create more inodes, right? So you can tell EXT4, use eight kilobytes per inode rather than 16, right? If you have a file system where you think you're gonna have a lot of small files, all right? So we've talked about how directories are really just special files, right?
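To make the one-inode-per-16K rule concrete, here is a short Python sketch of the arithmetic. It assumes the file system size and file sizes shown in the usage note, which are invented for illustration, and it ignores the space that metadata itself takes, so the numbers are approximate. The function names are hypothetical. On Linux the ratio is, as far as I know, set at format time with the `-i bytes-per-inode` option of `mke2fs`.

```python
KIB = 1024
GIB = 1024 ** 3


def inodes_created(fs_bytes, bytes_per_inode=16 * KIB):
    """Inodes reserved at format time: one per bytes_per_inode of space."""
    return fs_bytes // bytes_per_inode


def limiting_resource(fs_bytes, file_bytes,
                      bytes_per_inode=16 * KIB, block_size=4 * KIB):
    """Say which runs out first if every file is file_bytes long.

    Rough model: ignores metadata overhead and assumes each file
    occupies at least one data block.
    """
    blocks_per_file = max(1, -(-file_bytes // block_size))  # ceiling division
    files_by_blocks = (fs_bytes // block_size) // blocks_per_file
    files_by_inodes = inodes_created(fs_bytes, bytes_per_inode)
    return "inodes" if files_by_inodes < files_by_blocks else "blocks"
```

For a hypothetical 8 GiB file system, `inodes_created(8 * GIB)` is 524288, and halving the ratio to 8 KiB doubles that to 1048576. Filling the same disk with 1 KiB files, `limiting_resource(8 * GIB, 1 * KIB)` reports `"inodes"`, which is the situation those people online ran into: free blocks but no free inodes.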
Directories are special files that simply contain a series of inode number and path pairs, okay? And the paths are relative paths, not absolute paths, right? Well, there's probably multiple reasons for this, I won't even ask the question, right? Because I would never store absolute path names in a directory, that would get gross, okay? So for example, how many people have ever used ls to list inode numbers? I don't know why it has this feature other than it's cool, but you can actually use the standard ls command on Linux or Unix to show you the inode numbers in any given directory. So for example, here what I've done is I've listed the inode number for the root directory, right? And remember, we said this root directory had a special inode number, and that inode number is two, right? So I think this is up on a slide in a minute or two, but why does the root directory have a special inode number? Why would I wanna have a fixed inode number for the root directory that the file system always uses? Right, so whenever I resolve an absolute path, I have to start somewhere, and the way I bootstrap myself is by loading inode two, right? So the way I always start resolving paths is I start with inode two. After that, I use the information inside the directory to figure out what inode the next path component translates to. But I have to bootstrap somewhere, right? I need to start somewhere, and the way I know where to start is I just choose two, or on older file systems, I think it was like one or zero, as the starting point for all my absolute path resolution. Right, we'll talk about this in a second, right? So now here what I'm showing you is the contents of the root directory, right?
So most of these are other directories, there's a couple of files here, so this I think is a file, this is a file. But everything in it just has an inode number, right? Relative path component, inode number, right? And so if I wanted to continue doing this, let's say I was translating home, you know, then I can list the home directory, and the home directory has two other path name, inode number pairs. This is how path name resolution is done. And I don't know why I don't have that example in here, but, yeah, yeah, yeah. Where? Ah, ah, ah, okay. Yeah. Ah, man, you guys are very, very perceptive. So proc, I know for sure, right, is not an actual file system, right? And I think we talked about this a long time ago, right? But there are things on the system that are actually not file systems, they're file system-like objects, right? And proc is not an actual file system, proc is a pseudo file system that is implemented on the fly by the kernel. So when you do an ls in the proc file system, there's no actual real file system on disk anywhere. What the kernel does is it makes it look like a file system by returning things that look like file objects that contain information about the running processes on the system, right? Sys, I don't know, right? I can look it up and get back to you, but I think this is what is going on. Dev also over here is something else that is not a real file system. It's filled in dynamically with information about devices by the kernel. So, I should look into this. Lost and found too, I think, is also something that is not quite a real file. Yeah, but, right, right, right, okay, yeah, of course. Yeah, so these are pseudo file systems. They're not real file systems. And what I mean by that is they support the file system interface, but there's no data blocks anywhere on disk that are holding those contents. So when you do reads and writes for them, the kernel isn't reading from a device. It's just giving you data, right?
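Going back to the path name resolution described a moment ago: start at inode two, look up one path component in the current directory, and repeat with the inode number that comes back. Here is a toy Python sketch of that loop. The in-memory `DIRECTORIES` table stands in for directory blocks read from disk, and every name and inode number in it other than the root's inode two is invented for the example. It skips details such as `.` and `..` entries, symbolic links, and permissions.

```python
ROOT_INO = 2  # fixed starting point for absolute paths

# inode number -> directory contents (relative name -> inode number).
# Hypothetical data standing in for directory blocks on disk.
DIRECTORIES = {
    2: {"home": 12, "etc": 13},
    12: {"alice": 20},
    13: {},
    20: {"notes.txt": 31},
}


def resolve(path, directories=DIRECTORIES):
    """Translate an absolute path into an inode number."""
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    ino = ROOT_INO
    for component in path.split("/"):
        if not component:          # skip empty pieces from leading or doubled "/"
            continue
        entries = directories.get(ino)
        if entries is None:        # current inode is not a directory
            raise NotADirectoryError(path)
        if component not in entries:
            raise FileNotFoundError(path)
        ino = entries[component]   # one directory lookup per path component
    return ino
```

With this table, `resolve("/")` returns 2 and `resolve("/home/alice/notes.txt")` walks three directories to return 31. Each step only ever compares a relative name, which is why the directories never need to store absolute paths.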
So you say, show me a list of all the processes on the system, and the kernel gives you back what look like file contents, but they're actually generated by reading its own data structures. So this is clever, right? And we talked about this a couple months ago, and we talked about the fact that it's clever, right? Because it reuses the file interface as a way of exporting data, right? All right, so just in case I didn't convince you before, file system names are inode numbers and directories are files, right? So let's go back to this. This was our stat output from before. This is inode number two. Inode number two is an inode. It's a directory, right? It's got one block attached to it, right? The block with number 8737. And when we did ls and we looked at the output from the directory with inode number two, that was what we read from, right? I tried just printing off the block itself, but it's machine readable. It's not human readable, right? It's not that pretty thing, because ls is actually interpreting the data structure for you, right? Okay, so before we finish today, I think we're almost out of time. So let me talk a little bit about the superblock, right? So the file system's superblock is almost like the inode for the file system itself. It's got all sorts of important and useful data about the whole file system, right? So the kind of file system it is, different features that the file system supports, the type of file system structures that are in use. The location of the inode tables, right? Remember that when we're translating an inode number, we need to find the inode tables, those arrays that store the inode contents. The superblock knows where the inode tables are. The superblock has information about when the file system was created, the state that it's in, et cetera. So what happens if my superblock gets corrupted? I'm toast, it's goodbye, file system, right?
So right, so if I was mean, if you played this game with me and you said you can change one bit of my disk anywhere you want to, right? You can flip one bit. If I was mean, I would flip a bit in the superblock because the likelihood of that breaking something is really, really high, right? If I flip a bit in a data file somewhere, you'll probably never notice. The MP3 will sound a little bit different. The movie will have a little wrong-colored pixel in it or something, right? But if I flip a bit in the superblock, I actually might be able to just totally blow up your file system, right? So what do I do about this? This is super important data. How do I make sure that mean me can't flip a bit and totally ruin your day? What's that? I've got backups, right? So if this data is in trouble, I'm in trouble, and what I do is I keep multiple copies, right? So it's a very, very simple technique. File systems store multiple copies of the superblock at known locations throughout the disk, and that way if a sector goes bad or if your mean professor flips a bit in your file system, you have another couple superblocks that you can look at and say, okay, this one is right, this one is broken. Yeah, I'm awake. Okay, fine, two bits, right? Four bits, give me enough bits to overcome your ECC, you're right. But the point is that this stuff is really, really important and the file system here is very fragile and very dependent on that superblock. So let's just go over this and then I'll let you guys go, right? So I printed off some information about the superblock for the file system that's mounted on the 421 web server and there's a lot of cool information in here, right? So this is a file system magic number. I'm assuming that this identifies it as an EXT4 file system of some type, right? So when the file system is mounted, the kernel kind of needs to know what kind of file system this is because it needs to figure out what file system implementation to use to read and write to it, right?
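The magic number and the backup copies fit together: a reader can check the magic number at each known superblock location and use the first copy that looks sane. The Python sketch below shows that idea under a few assumptions. The field offsets and the magic value 0xEF53 (shared by ext2, ext3 and ext4) follow the published EXT4 on-disk layout as I understand it, the helper names are hypothetical, and a magic check alone is a much weaker test than the checksums a real implementation would use.

```python
import struct

SUPERBLOCK_SIZE = 1024
PRIMARY_OFFSET = 1024   # primary copy starts 1024 bytes into the volume
EXT_MAGIC = 0xEF53      # magic number shared by ext2, ext3 and ext4


def parse_superblock(raw):
    """Decode a few little-endian fields, or return None if the magic is wrong."""
    if len(raw) < SUPERBLOCK_SIZE:
        return None
    (magic,) = struct.unpack_from("<H", raw, 0x38)
    if magic != EXT_MAGIC:
        return None
    inodes, blocks_lo = struct.unpack_from("<II", raw, 0x00)
    (log_block_size,) = struct.unpack_from("<I", raw, 0x18)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 0x20)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 0x28)
    (inode_size,) = struct.unpack_from("<H", raw, 0x58)
    return {
        "inodes": inodes,
        "blocks_lo": blocks_lo,                  # low 32 bits of the block count
        "block_size": 1024 << log_block_size,    # stored as a shift count
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "inode_size": inode_size,
    }


def first_good_superblock(image, offsets):
    """Try each candidate byte offset in order; return (offset, fields)."""
    for offset in offsets:
        fields = parse_superblock(image[offset:offset + SUPERBLOCK_SIZE])
        if fields is not None:
            return offset, fields
    raise ValueError("no valid superblock found")
```

A caller would pass `PRIMARY_OFFSET` first and then the backup locations. Where the backups live depends on how the file system was formatted; with 4K blocks and 32K blocks per group, I believe the first backup sits at block 32768, but treat that as an assumption to verify with the formatting tool's output.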
It shows me where it was last mounted. This is mounted at root, that's where it's used. It gives me a count of the number of inodes and blocks on the file system. It gives me a count of the number of free inodes and free blocks, right? The block size is 4K as we talked about. There are 32K blocks per group. Remember where I said that EXT4 distributes the inodes across the disk? Each one of those is called a group. So it consists of a bunch of inodes and then a bunch of data blocks, right? And then this repeats, okay? And then it gives you some information about when the file system was created. Again, pretty close to when I mounted or created root, right? And it gives me an idea of how many times it's been mounted, et cetera, et cetera, right? And here's some more information printed by the same tool. I would encourage you guys, you could use this in your virtual machine, right? It's kind of fun to look through this and look at what's going on here, right? So this I thought was kind of cool. This gives me an idea of the lifetime writes to this file system. Why might I care about this? Yeah, yeah. So I might just want to know, like, how old is the file system? How many writes have I pushed through it onto a particular disk, right? That might be useful information. It gives me some idea of when the disk is about to wear out, okay? Inode size, 256 bytes as we talked about. The number of directories on the system is also listed. And then here now we have information about those groups that I told you about, right? So group zero, the first group. Here's where the free block bitmap is. This is how I allocate blocks in the group. Here's where the inode bitmap is. This is how I allocate inodes. This is where the inode table starts. And then here's information about the number of free blocks. So this group as you see is full, right? Its inode table's full, right? There's zero free inodes, okay? And then this information goes on and on.
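The free block and inode bitmaps mentioned for each group are what make allocation cheap: one bit per block or inode, set when it is in use. Here is a minimal Python sketch of allocating from such a bitmap. It assumes bit 0 of each byte is the lowest-numbered item, which I believe matches EXT4 but should be treated as an assumption, and the function names are hypothetical. One detail that falls out of the numbers above: a single 4K bitmap block holds 4096 × 8 = 32768 bits, which lines up with the 32K blocks per group.

```python
def allocate_from_bitmap(bitmap):
    """Claim the first free item in a bytearray bitmap.

    Returns the index of the item that was claimed, or None if every
    bit is already set (the group is full).
    """
    for byte_index, byte in enumerate(bitmap):
        if byte == 0xFF:               # all eight items in this byte are in use
            continue
        for bit in range(8):
            if not byte & (1 << bit):
                bitmap[byte_index] |= 1 << bit   # mark it allocated
                return byte_index * 8 + bit
    return None


def free_count(bitmap):
    """Count the clear bits, i.e. the free items the bitmap tracks."""
    return sum(8 - bin(byte).count("1") for byte in bitmap)
```

For example, starting from `bytearray([0xFF, 0b00000111])`, the first call claims item 11 and leaves four items free. A group whose bitmap is all ones, like the full inode table in group zero above, makes `allocate_from_bitmap` return `None`, so the allocator has to move on to another group.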
So I don't know. There were like 64 groups on this file system or something, right? And clearly not all of them are full, all right? Questions about this, right? Superblock. So next time we will get to actually doing an example path name translation. But any questions about superblock contents, inode sizes, any of this stuff, all right? All right, so next time we'll go through an example of open and show you exactly how it works, though you can probably guess how we do this by now. And then we'll talk about some options for how to store data block locations in the inode itself. All right, I'll see you on Friday.