All right, so we've got inodes today. We left off at indexed allocation. We essentially had a single-level page table, and that only supported files up to about 16 megabytes, which kind of sucked. So this is the actual structure used for storing information about files and which block represents what. It's called an inode, and it stores everything about a file and what blocks it points to. So a few things are stored in the inode. First, the mode of the file. That's all the permissions and everything like that, whether I can read, write, or execute it, all that fun stuff. There are two owners, so every file is owned by a user and then a group. And then there are three timestamps: accessed, modified, and created, or something like that. Those timestamps are stored directly on the inode, and you are free to modify them, which is why I don't trust any dates you give me. So if you have an English elective and you want to try and submit a late assignment, don't say I said this, but you could just edit the timestamp and say, hey, I turned it in on the due date, look at the timestamps, they're correct. If someone doesn't know about computers, you can fool them, but unfortunately you can't fool me. The next field is the number of blocks the file takes up, the number it actually points to. And then you'll see there are a few different options for blocks. There are a few pointers stored directly on the inode, which are just called direct blocks. Each of those direct pointers points straight to a block on the hard drive, so you don't have to jump through any hoops. We don't even have a page full of pointers or anything like that, we just point directly to the block. And the idea behind that is, if it's a small file, we don't want to waste a whole page storing pointers when the file only takes up a few blocks.
We'll instead have some space directly on the inode to point to it. And then the next one is the single indirect. And that uses the exact idea that we saw yesterday that had the index block. So it will point to a block full of pointers. And then that block full of pointers each points towards a block. And that had our 16 megabyte limit. Now, instead of that, we now have more options. So if you use that pointer, it's a single indirect block. Well, if you use the one after that, it is a double indirect block, which essentially means you have to go through two levels of blocks of pointers. So you first go to one block of pointers as that L1 page table. And then you go to the next one. It's like an L2 page table. And that acts directly as the single indirect where it just points to a page. And then you use that, go ahead, find the blocks. So you increase the number of blocks you can point to by a factor of how many you can fit on a block. And then triple is the same thing, except you do that three times instead of two. So that, again, is to be really efficient for small files and also support large ones. Because as we saw with multi-level page tables, the more levels you have, the slower it is to access because you have to go through a bunch of hoops. So the same idea applies here, except that there's varying levels of hoops to go through, depending on how large the file is. So if it's a small file, I don't go through any hoops. If it is a medium-ish file, I go through one set of hoops. If it is a bigger file, I go through two sets of hoops. And if it's a really, really huge file, I go through three. So if we go back to that same scenario where the disk block size is like 8 kilobytes, a pointer is 4 bytes. And we have just indirect blocks consisting of direct pointers only, so it's exactly like that index thing. Well, we should be able to figure out now what the maximum size of a file is managed by an inode now. 
So if I only had one direct pointer, the maximum size of my file would only be 8 kilobytes, because I can only point to one block. If I have 12 direct pointers, I can point to twelve 8-kilobyte blocks. And then if I have a single indirect block, that's the same calculation as before. I take 2 to the 13, which is my block size, divided by 2 to the 2, which is the size of my pointer to a block. That means I can fit 2 to the 11 pointers on a single block. So at the end of the day I can support 16 megabytes through that, because 2 to the 11 blocks times 2 to the 13 bytes is 2 to the 24 bytes. Well, if you extend that, the number of addressable blocks you have is going to be, first, 12. That's my direct pointers. Then 2 to the 11, which is my single layer of indirection that points to just a block full of pointers. Then if I do a double indirect block, that's 2 to the 11 to the power of 2, because it's exponential in the number of things I can actually point to. And similarly for the triple indirect block, it's 2 to the 11 to the power of 3, because I have 2 to the 11 choices at each level. So if you want to be lazy about this and just approximate it, the biggest term here is 2 to the 11 to the power of 3, and it's going to dominate all the other terms. So you can save yourself some math and say the total is approximately 2 to the 11 to the power of 3, which is 2 to the 33. That's just accounting for the triple indirect blocks, which dominate the number of blocks you can point to. So that means I can point to about 2 to the 33 unique blocks. If I want to figure out the maximum size of my file now, it's how many blocks I can point to, which is 2 to the 33, times the size of my block, which is 2 to the 13. That's 2 to the 46 bytes. Or if I pull out a terabyte, which is 2 to the 40, it means a single file can now be 64 terabytes. So that should fit all your movies and everything. So that is the current limit for Linux file systems that use this format.
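The calculation above can be checked with a few lines of Python. This is only a sketch of the arithmetic for the lecture's scenario (8-kilobyte blocks, 4-byte pointers, 12 direct pointers); the function and constant names are made up for illustration.

```python
BLOCK_SIZE = 8 * 1024    # 8 KiB blocks, as in the lecture scenario
POINTER_SIZE = 4         # 4-byte block pointers
DIRECT_POINTERS = 12     # direct pointers stored on the inode


def max_file_size(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE,
                  direct=DIRECT_POINTERS):
    """Return (addressable_blocks, max_bytes) for one inode."""
    per_block = block_size // pointer_size          # pointers per indirect block
    blocks = (direct                                # direct pointers
              + per_block                           # single indirect
              + per_block ** 2                      # double indirect
              + per_block ** 3)                     # triple indirect
    return blocks, blocks * block_size


blocks, size = max_file_size()
print("pointers per block:", BLOCK_SIZE // POINTER_SIZE)
print("addressable blocks:", blocks)
print("max file size in TiB:", size / 2 ** 40)
```

The exact answer comes out slightly above 64 terabytes because the direct, single, and double terms are not thrown away here; the triple indirect term still accounts for almost all of it.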
And this is the maximum size of a file. So if you go over 64 terabytes, they're going to have to start doing quad indirect pointers or something like that and change the format. But for now, this is the format we have. So any questions about this calculation or anything about it? Yep. Oh, so what I mean by an indirect block consisting of direct pointers only is that the block only has pointers on it and no other information. It's just treated as a flat array with no other data. So it's basically, how many ints can I fit on an 8-kilobyte block, and that's how many things I can point to. All right, so that is how big our inode is. And congratulations, that is the actual thing that is used to represent a file. So now we get into, what the heck is a directory then? All the file names we've used so far are pretty much just for you. A name is just a pretty way to let you access inodes and nothing past that. A hard link, which is what any file name you have ever used is by default, is just a pointer to an inode. So if I have todo.txt, it actually points to some inode, and that inode is given a number. Everything in a directory, which is just a file name, is called a hard link. And all your directory consists of is a bunch of name-to-inode tuples. That is all a directory is. It's just name, inode, name, inode. That's it, nothing else to it. So if you have a pointer, well, we all love pointers in C and know that we can point to the same thing and make our lives difficult. The same thing can happen with hard links. In the same directory, I can have todo.txt and b.txt and point them to the same inode, so they just look like the same file. And this leads to a point: when you do an rm of something, you think it gets deleted. But in reality, all it does is delete the name-to-inode mapping. It doesn't actually delete the file whatsoever.
So you'll notice, if you look through the list of system calls, if you're sadistic enough, there is no system call called delete or rm or anything like that. The closest thing is a system call called unlink, and that just removes a name entry from a directory, and that's all it does. So if I try and delete b.txt, what that will actually do is just remove this pointer. It removes the b.txt entry, so it no longer points to inode one. But the contents of that file and everything, you can still access through todo.txt. So the contents of the file don't get deleted. And for the most part, this is also how your recycling bin works. You delete a file, but the recycling bin just keeps a name that still points to that inode, or if you're on Windows, whatever their inode equivalent is. It doesn't actually delete it, so you can go ahead and recover it. All you have to do is make that old name point to the same inode and, bam, your file gets recovered. You don't have to copy it. You don't have to do anything. You're just changing a name. So there's another thing called soft links, and we'll go ahead and experiment with these because it's lots of fun. With a soft link, instead of a name pointing to an inode, you have a name pointing to a name. So now your kernel actually has a bit of a harder job. When you try and open b.txt, say you tried to cat it or something like that, your kernel is going to be like, okay, to find b.txt, I actually have to find todo.txt. And then to find todo.txt, oh, that's inode one. And then you can open the inode, read the contents of the file, and all that. So the fun thing about soft links is, while a hard link has to point to an inode, a soft link just points to another name, and that name doesn't have to exist. The name it's pointing to can be deleted at any time, which would make that soft link essentially point to nothing.
And unresolvable soft links generally lead to an exception or some error, because they don't actually point to anything valid. On Windows, this is pretty much the equivalent of a shortcut, so it's not actually a copy of anything. It's just another way to point to the same thing. Okay, so let's go into a fun little problem and experiment with this new knowledge. Let's just open a file called todo.txt. Oops. Hello everyone. So this is something we're all familiar with. I opened a file, I wrote to it. If I do cat todo.txt or something like that, I get the contents of the file. Everything is normal, what we knew before coming to this class. But now I want to create a hard link. You create hard links by using ln, which is just link shortened, don't ask me why, they just shortened it to ln. So I create a hard link to todo.txt and I'll call it b.txt. Now if I look at my directory, I have two files, and they're the same size, everything like that. And now, if I cat b.txt, what should I see? Yeah. Oh, sorry, yeah? The todo file? What was in the todo file? Yeah, hello everyone. So now if I cat b.txt, it's hello everyone. Well, what happens if I edit b.txt? I get more excited. And then if I look at b.txt, wow, it's what I just wrote. What happens if I look at todo.txt? I never edited it. Yeah, it's the same, it's super exciting. So this is a real thing. You might have thought before that each file name is a unique file, but it's not. Names are essentially pointers. And if you use ls, we'll introduce a new column. Since we now know what inodes are, ls basically just tells you information about all the inodes. Aside from the name here, which is part of the directory entry, all this other information is from the inode. So if I do ls -l and add -i, it will also include the inode as a column. This first column here is actually the inode number.
So if we look at b.txt and todo.txt, they both have the same number, because they're both pointing to the same inode. Then here is something we're probably familiar with. It's the permissions for both files. They're the same because the permissions are stored on the inode itself. And then this number here, you've probably seen before, because if I just do a normal ls -l, that number's there. Any guesses as to what this number actually is? I will give you a hint. Let's make another hard link to todo.txt. Let's call it d and we'll see. Yeah. Yeah, it's the number of hard links to that inode. That's actually stored on the inode itself. So you can see how many things are pointing to it just by checking this number, instead of driving yourself insane. If that number is not one, then you have to kind of watch out. The next columns are the user that owns the file and the group that owns the file. Then the size of the file. So 18, it's 18 bytes. So it's hello everyone and a newline. And then this is just the modified date. So that is all about ls, and it's mostly just displaying information about an inode. Let's go back here. Yeah, so we can do some other fun stuff. So let's go ahead and make a soft, oh, yep. Oh yeah, so the question is, does the inode store the number of hard links so it knows when to disappear? And the answer to that is yeah, the kernel keeps track of it. If there are now zero references to that inode, you can actually delete it, maybe. The operating system can choose whether or not to delete it, but it would be eligible for deletion. Yeah, yep. So one question is, why would I want to point to the same inode, that seems odd? There are cases where you would want to. Off the top of my head I don't know a great one, but one case is that directories themselves are represented by inodes.
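The ln and ls -li experiment can be replayed from a program. The following is a small sketch using Python's os module on a POSIX system, run inside a temporary directory; the function name and the appended text are made up for illustration.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Create todo.txt, hard-link it as b.txt, and report what the inode says."""
    todo = os.path.join(directory, "todo.txt")
    b = os.path.join(directory, "b.txt")

    with open(todo, "w") as f:
        f.write("hello everyone\n")

    os.link(todo, b)                                  # like: ln todo.txt b.txt
    same_inode = os.stat(todo).st_ino == os.stat(b).st_ino
    links_after_ln = os.stat(todo).st_nlink           # link count stored on the inode

    with open(b, "a") as f:                           # edit through the second name
        f.write("more excited\n")
    with open(todo) as f:                             # read through the first name
        seen_via_todo = f.read()

    os.unlink(b)                                      # like: rm b.txt (removes one name only)
    links_after_rm = os.stat(todo).st_nlink
    return same_inode, links_after_ln, seen_via_todo, links_after_rm


with tempfile.TemporaryDirectory() as d:
    print(hard_link_demo(d))
```

Both names report the same inode number, the edit made through b.txt is visible through todo.txt, and unlinking b.txt only drops the link count back to one.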
And if I do this bigger listing, it shows you the hidden files, so anything starting with a dot. Well, dot and dot dot are just entries in a directory. Each directory has an inode, and there are a lot of things that point to it. So for this one, the current working directory, only two things point to it: its own dot entry and its name in the parent directory. While this one, the parent, has 22 things pointing to it, because that is essentially my examples directory. So it's essentially telling me how many directories are in there, because each of those directories has a dot dot that points to that directory. So you mostly see this duplication of hard links with directories. All right, any more questions about that? Yep. So the question is relating it to the last lecture: if two processes have their own open file tables, would they manipulate each other's position if they point to the same inode? No. The position is stored in the global open file table, and they only share it if one forked from the other and they actually point to the same global entry. And then as part of that global entry, it points to a vnode, right? Which is independent of that. So it's not connected to this at all, and the position works the same as before. But now, connecting it to that, the vnode for a regular file actually points to an inode. So that's where this comes in. It's what that vnode points to. If it's a file, it just points to an inode. So two entries can have vnodes that point to the same thing even though they got there using a different name, because the names point to the same inode. Okay, so any other questions before we do some fun stuff that probably breaks something? Okay, well, let's start breaking stuff then. So let's create a soft link. Let's say I create a soft link to todo.txt and call it c.txt. So now if I look in here, I have three things all pointing to the same inode.
So to do.txt, D.txt and B.txt. But then I also create a soft link. So soft link, like I said, is just a name pointing to another name. So if I try and cat C.txt, well, it would have to go look up todo.txt and then through that, it would actually find the inode. So soft links themselves are a different inode. So it's a unique thing and the contents of the file is just the name of the thing it points to. And we'll have the last lab. You will definitely figure that out. But just in case, here it is. So soft links kind of look like a normal thing. So they just point to a file. Yep. Oh, okay. So the question is, why does the soft link have all those permissions? So the permissions of a soft link, it's just has all the permissions because all it is is a name to a name. You don't really care. The permission check happens, whatever file it looks up. So it's just the most permissions. So you can always just resolve the soft link, but you might not be able to access what it points to. Okay, so what happened? So now I can cat C.txt. It looks exactly the same, but now I can remove todo.txt. So now if I remove todo.txt, well, my terminal gives me some nice help and paints it in red because it actually doesn't exist anymore. So C.txt points to todo.txt and that doesn't exist. So now if I try and cat C.txt, it'll just say possibly a fairly confusing error message if you don't know about sim links because clearly C.txt exists, but the error message is that no such file or directory C.txt. And that's because to resolve C.txt, it looked at todo.txt and that doesn't exist. So you have to be a bit more careful with this. All right, so does that make sense kind of? So why would you ever want to do a soft link? So by default, directories and everything are like a directed acyclic graph. So they form a nice tree and they don't have cycles or anything like that. But if you have a soft link, you can create cycles with soft links. Can anyone think of a fun cycle I can do immediately? 
So we have c.txt pointing to todo.txt. Yep. Yeah, why don't I just, it's like I can fix this. So I'll just make todo.txt point to c.txt. So now if I look at this, c.txt points to todo.txt and todo.txt points to c.txt. It's fine, it exists now. So what's going to happen if I cat this? Will anything detect that? Will the kernel detect that and stop me? Okay, so that's one vote. Any other possibilities? Yeah, infinite loop. Okay, so we've got kernel stops me and infinite loop. Anyone think cat's smart enough to figure this out and stop me? Enter. So we actually don't know. It says error, too many levels of symbolic links, but we can't be too sure whether the kernel stopped us or cat stopped us just by looking at this, right? So how could I figure out what stopped me? Putting on our detective hats here. Yeah, so there's probably a limit to the number of symbolic links, but the question is what stopped me, the kernel or this program? Yeah. Can you strace it? Of course I can. It's like I heard Wireshark before. This is like the Wireshark of OS. So if I want to figure out what actually stopped me, I can strace it, because if cat stopped me, I'll see it open c, then open todo, then open c, then open todo, then open c, then open todo, and then give up at some point. If the kernel stopped me, it will probably just give me an error immediately. So I can strace it and see a bunch of crap. We can go through this. Where the hell's my mouse cursor? So we know this is exit_group, it stopped, and hey, it has an error code, so it stopped with an error. It closed some standard file descriptors, kind of weird, but see here, this is where it printed out the error message. So we're getting close. libc did some weird stuff where it checks some locale settings. Close three, we're getting closer. It opened some other file that we don't really care about.
We see that it wrote c.txt, and that's to standard error, because it's file descriptor two. It wrote cat, and then, oh, here, whoops, here it is. So it did openat, which wasn't quite what we asked for, we asked it to do open, but anyways. So openat on c.txt, read only, and the kernel gave us the error. It returned negative one with the error code ELOOP, too many levels of symbolic links. So cat didn't save us, the kernel saved us. There's some checking in the kernel, yeah. Because if this wasn't the kernel, I would have seen that cat opened c.txt, and then, well, that points to todo.txt, so it would try to open todo.txt, and then from todo.txt open c.txt again, and then the program would have given up at some point. But here I don't see todo.txt anywhere, so the program never tried to do anything with todo.txt. The kernel gave us the error. So at the end of the day, whenever you do an open, the kernel will resolve symbolic links for you, and then internally, in that global open file table, the vnode will point to an inode, which is the final result of the resolution. And I guess this leads to some security things. By default, the open call will follow symlinks and try to resolve them. Sometimes you don't want that, because symlinks can create loops, do weird things, go up to your root directory, all sorts of stuff like that, because they can point to anything. So one of the options for open is to not follow symlinks. It will just give you an error straight away if what it tries to open is a symlink, so you're not allowed to go through it. You'll see that as a configuration option on servers a lot, like web servers. It'll just say follow symlinks, no, because there are too many problems with them. So some software lets you disallow them because weird things will happen.
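Here is a hedged sketch of the same experiment without cat or strace, assuming a Linux system. It builds the two-link cycle in a temporary directory and reports the errno the kernel hands back from open. The helper names are made up; os.O_NOFOLLOW is the flag for the "do not follow symlinks" option mentioned above.

```python
import errno
import os
import tempfile


def open_symlink_loop(directory):
    """Make c.txt -> todo.txt -> c.txt and return the errno from opening c.txt."""
    c = os.path.join(directory, "c.txt")
    todo = os.path.join(directory, "todo.txt")
    os.symlink("todo.txt", c)        # c.txt is a name pointing to the name todo.txt
    os.symlink("c.txt", todo)        # todo.txt points back to c.txt
    try:
        fd = os.open(c, os.O_RDONLY)
    except OSError as e:
        return e.errno               # the kernel rejected the open
    os.close(fd)
    return None


def open_without_following(path):
    """Try to open path but refuse to go through a symlink in the last component."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as e:
        return e.errno
    os.close(fd)
    return None


with tempfile.TemporaryDirectory() as d:
    code = open_symlink_loop(d)
    print(code == errno.ELOOP, os.strerror(code))
    print(open_without_following(os.path.join(d, "c.txt")) is not None)
```

The program makes exactly one open call and gets the error back immediately, which matches what the strace output showed: the loop detection happens inside the kernel, not in the calling program.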
All right, any other questions about this fun stuff? Bueller, Bueller, no, sweet. All right, so we go back. So the example that was in the slides is we created todo.txt, made a hard link to it, then made a soft link to it, and then we moved todo.txt to d.txt, and yeah, this brings up me talking about move. So move doesn't copy the files or do anything like that. That's why move, even if you have a 50 terabyte Blu-ray rip, which no one ever did that legally, of course, that happens instantly because all it's doing is changing the name. It's still pointing to the same inode, it's just a different name is pointing to the same inode. So this move is actually more accurately just called rename, so I renamed todo.txt to d.txt. I don't change any of the contents of the file or do anything like that, I'm literally just changing its name. And then similarly with remove, that just removes the directory entry and so it removes that name to inode tuple and it would decrease the number of things that points to that inode. So here is the solution to this. So right before the move, when we have two hard links to inode one, you just give it a name and then you have c.txt pointing to todo.txt. Well, after the move, all I do is rename todo.txt to d.txt. So now that soft link is pointing a name that no longer exists and now b and d.txt point to the same inode and then right after the remove, I only have d.txt pointing at that inode now and now c.txt points at todo.txt which doesn't exist. Yeah, so right now if I did rm on d.txt. Yeah, so right now if I did remove on d.txt, I'd have nothing pointing to that inode. So if you're the kernel, you're free to delete it but also if you're the kernel and you want to be really, really fast, deleting it takes time. So I can just say I can reuse it whenever I want and you just don't delete it. And that's again, if it wanted to hold on to it to make sure it couldn't delete it, well, that's what your recycling bin does. 
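The slide example can also be replayed step by step. This is a sketch on a POSIX system using Python's os module; the function name and the returned dictionary keys are made up, and the file contents are placeholder text.

```python
import os
import tempfile


def slide_example(directory):
    """Replay: create todo.txt, ln to b.txt, ln -s to c.txt, mv to d.txt, rm b.txt."""
    def path(name):
        return os.path.join(directory, name)

    with open(path("todo.txt"), "w") as f:
        f.write("hello everyone\n")
    os.link(path("todo.txt"), path("b.txt"))       # hard link: second name, same inode
    os.symlink("todo.txt", path("c.txt"))          # soft link: a name pointing to a name
    inode_before = os.stat(path("todo.txt")).st_ino

    os.rename(path("todo.txt"), path("d.txt"))     # mv todo.txt d.txt
    inode_after = os.stat(path("d.txt")).st_ino

    os.unlink(path("b.txt"))                       # rm b.txt
    return {
        "same_inode_after_rename": inode_before == inode_after,
        "links_left": os.stat(path("d.txt")).st_nlink,
        "c_target": os.readlink(path("c.txt")),            # still the old name
        "c_resolves": os.path.exists(path("c.txt")),       # follows the link
        "c_entry_exists": os.path.lexists(path("c.txt")),  # looks at the link itself
    }


with tempfile.TemporaryDirectory() as d:
    for key, value in slide_example(d).items():
        print(key, value)
```

The rename keeps the inode number, the remove only lowers the link count, and c.txt keeps pointing at the name todo.txt even though nothing has that name anymore.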
It will just keep a pointer to it, or it would just keep it around, and then later you can actually try and recover deleted files from your drive by taking advantage of the fact that it didn't delete them. Nothing points to them anymore, that's all. So you can kind of just probe along, look for inodes, and then see what they point to, because there might be fun stuff, especially if you're part of the FBI or something like that. So yeah, it's not that you guys need it, but it's a warning that rm doesn't actually delete anything. So yeah, no, move just renamed that. In this case, all move did was change the name of todo.txt to d.txt. It just changes the name. Nope, because c.txt is a symlink and it just points to a name. So if you rename whatever it points to, it doesn't care. It still holds the same old name. Yeah, so the question is, how does one file not overwrite another file if it were to grow? It's essentially the same thing as what we had with pages. If a file needs to grow, the kernel grabs a new data block for it, or data page, and just fixes up the reference to it, and that block can be anywhere. It doesn't need to be contiguous or anything like that. So it just picks up a new one and adds it. Yeah, okay, and also as part of this, if you really want to get into inode stuff, there's the stat command. So stat will tell you all about files. So what do I still have? I have my infinite loop. So if you do stat on b.txt, it will tell you all about an inode in all of its gory detail. It will tell you the file and the size of it, so 18 bytes. Then blocks, which looks a bit stupid: there are eight blocks for an 18-byte file. That sounds really dumb. And it says the IO block size is four kilobytes, so that's the actual block size on the hard drive. It tells you that it's a regular file, what device it's on, the inode number, the number of links to it, and the same information as ls, just more of it. So the dumb thing about blocks is that blocks don't correspond to IO blocks.
Blocks have a hard coded size of 512 bytes. You'll never be tested on this, but just so you understand this. So that's why it says eight blocks because eight 512 blocks make up a single IO block. So this file takes up a single IO block. So on the actual hard drive, this B.txt, which just is 18 bytes that just says hello everyone. Well, that takes up an entire block or page, whatever you wanna call it, on my hard drive. So this actually takes up four kilobytes of space on my actual hard drive plus the inode itself. So lots of little files do waste some space. There's some, when in lab six, you'll see there's an optimization. We can probably see it. So if we look at c.txt, it says it is a symbolic link and it says that c.txt points to todo.txt. Its size is eight, which is actually just the size of todo.txt in bytes. But the fun thing about this is the blocks is zero. So a sim link actually takes up zero space on my hard drive. So you will figure out that optimization in lab six because you will do it. But little sim links, there's an optimization where they just don't take any space. Yep, nope. Yeah, so the question is, or sorry, the thought is do I put them all in one file? So I'll give you a preview. So if we go here, oh sorry, were you gonna say some? No, you're good? Okay, so if we go here and look at the number of pointers, so there's 12 pointers to direct blocks, then one for single, one for double, one for triple. So all in total, there are 15 pointers. And if my pointer size is four bytes, all those pointers take up 60 bytes, which is a fair number of bytes. So if the contents of your file are less than 60 bytes, while instead of pointing to a block and wasting four kilobytes, I can just have a little optimization that says, hey, if the file name is less than 60 bytes, I'll use that space for pointers to store the name itself. So the name just goes over the pointer, so we're reusing some space. So that's the optimization there. 
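The same fields that stat prints are available from a program. This is a sketch, assuming a POSIX system; the helper name is made up, and the 18 bytes written are placeholder content rather than the file from the demo. Note that os.lstat is used so that a symlink is described rather than followed.

```python
import os
import tempfile


def describe(path):
    """Return a few inode fields for path without following a final symlink."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,
        "links": st.st_nlink,
        "size_bytes": st.st_size,
        "blocks_512": st.st_blocks,     # counted in 512-byte units, not IO blocks
        "io_block": st.st_blksize,      # preferred IO block size
    }


with tempfile.TemporaryDirectory() as d:
    b = os.path.join(d, "b.txt")
    c = os.path.join(d, "c.txt")
    with open(b, "wb") as f:
        f.write(b"a" * 17 + b"\n")      # 18 bytes of placeholder content
    os.symlink("todo.txt", c)           # the target name is 8 bytes long
    print(describe(b))
    print(describe(c))
```

The symlink's size is the length of the name it stores. Whether its block count is zero depends on the file system, so the sketch prints it instead of assuming it.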
All right, any other fun questions? So don't worry if that doesn't make sense, because you'll do that in lab six, and lab six is pretty fun. Yeah, so in UNIX or Linux, pretty much everything is a file. Directories are just files. They just have a special format, which is a bunch of name-inode tuples, and that's it. A bunch of other things look like files, so your hard drive looks like a file. It's called a block device, which means you can only access it in units like pages at a time, but it still looks like a file. Your network card looks like a file. Your graphics card looks like a file. So if you figure out how to access your graphics card or whatever through a name, which would be in like dev something, you can just write random bytes to it if you really want. You'll probably corrupt something, but you can go ahead and do that. You'll see pipes are represented this way. Sockets, so I heard Wireshark and a networking course, do you guys have a networking course? Just us, okay. Well, we have a socket thing, so we'll see how networking connects to operating systems. We'll do that Monday, because it's fun. So, yeah, sockets are how your network works. They look like regular files, and there's nothing that new about them other than that they go to a network card. And yeah, as part of the directory, so directory inodes don't store any pointers to data blocks, but rather just file names and pointers to inodes. And sorry, I had this earlier: one purpose of soft links is just to break the acyclic structure, to make cycles if you really want. Another question: so every file has an inode, even a soft link? Yeah. Every file on your device has an inode, literally everything. You can check it. Whoops, if we go back to this, you can start doing ls -il on everything, and the first column's the inode, so you'll see everything has an inode. So if I go to, well, even things in proc have inodes.
So, you know, if I want to see my CPU info, well, that's an inode, it just looks like a file. It's not actually a file represented on the hard drive, it's like some information from the kernel. So we probably have enough time, we'll see how to create these entries too, probably. And you can do all sorts of stuff if everything kind of looks like a file because all your tools kind of work with it. So if you go over, oh, yep. So the question's what maps the string file names to their inode and all that is in a directory entry. So a directory kind of looks like a file that points to blocks, but the blocks are formatted in a somewhat specific way where it's just like a bunch of name inode tuples and that's it. So it's just a bunch of name inode tuples. So yeah, so as a preview for lab six, you'll create a directory from scratch, a file from scratch and a sim link from scratch. So you'll actually point to blocks and there'll be lots of fun. So yeah, so you'll really get your hands dirty with it and figure out exactly how it works and hopefully that's the goal for lab five with virtual memory too, but hopefully that goes okay. So if you ask questions about what's stored in inode, well, the file names not stored in an inode, all the file names are only stored in the directories and that's the only notion of a name whatsoever. Inodes also have no idea what directory they're contained into because the file could be in multiple directories. So multiple file names can all point to the same inode. You have no way of figuring out from the inode itself what directories it's contained or what points to it. But what is on the inode is the file size. So that's, so you don't just read invalid information because it will be allocated blocks. So you wanna make sure you don't read blocks or information the blocks you shouldn't. So your file will contain a block, but like we saw with the to-do.txt, it's only like 18 bytes long. 
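To make the "bunch of name-inode tuples" idea concrete, here is a small sketch that reads a directory as exactly that list, using os.scandir from Python's standard library. The function name is made up, and the demo files are placeholders created in a temporary directory.

```python
import os
import tempfile


def directory_entries(directory):
    """Return the directory's contents as sorted (name, inode number) tuples."""
    with os.scandir(directory) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


with tempfile.TemporaryDirectory() as d:
    todo = os.path.join(d, "todo.txt")
    with open(todo, "w") as f:
        f.write("hello everyone\n")
    os.link(todo, os.path.join(d, "b.txt"))         # second name for the same inode
    os.symlink("todo.txt", os.path.join(d, "c.txt"))  # a symlink has its own inode
    for name, inode in directory_entries(d):
        print(inode, name)
```

Nothing in the inode records these names. The only place the strings b.txt and todo.txt live is in the directory's list of tuples, and the two hard links show up with the same inode number.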
So that's the size is stored separately from the number of blocks it has. So I only access the first 18 bytes. Then what's also cool, oh yeah, it's probably the wifi. Wonder if the stream's still going. No stream's still going. All right, well, whatever. There's still audio recording. So yeah, so you don't know the number of soft links or anything because they're all independent of the inodes. But what you do know on the inode is the number of hard links to a file because the kernel needs to know when it could erase them. Again, that doesn't mean it necessarily erases them. And then you also don't know any locations to hard links because again, the inodes don't know about it, but what is on the inodes is all the access rights. So whether you can read or write to it timestamps. So when you last changed it, when you created it, when you modified it, like swear to God, I modified this last week, it was on time. And then file contents are sometimes stored in the inode and that was that little optimization for SIM links where instead of pointing to a block, I'll just use that space for the pointers to actually store the contents of the file. And then as part of the inode, they always have an ordered list of data blocks. So if you want to figure out what block one of the file is, well, it's gonna be the second direct block it points to or the second block in that file. So they're all ordered exactly like the index ordering, except now you're gonna have to count a bit differently. So in the index ordering, they were all in one block while if we have a bunch of direct pointers, well, if we start using an indirect block, since we have 12 direct pointers first, we would only need that if we pointed to 13 blocks. So your first 12 blocks are pointed to by the inode itself. Then if I need a 13th block, well, I have to go through a single indirect block and because I have no other choice. Okay, a few other little things. 
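The counting rule for the ordered list of blocks can be written down as a short function. This is a sketch under the lecture's earlier scenario (8-kilobyte blocks, 4-byte pointers, 12 direct pointers); the function name is made up, and block indexes start at zero, so the 13th block of the file is index 12.

```python
def indirection_level(block_index, block_size=8192, pointer_size=4, direct=12):
    """Return how many pointer blocks must be read to reach a file block.

    0 means a direct pointer on the inode, 1 single indirect,
    2 double indirect, 3 triple indirect.
    """
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < direct:
        return 0
    remaining = block_index - direct          # skip the blocks covered by direct pointers
    per_block = block_size // pointer_size    # pointers that fit on one indirect block
    for level in (1, 2, 3):
        capacity = per_block ** level         # file blocks reachable at this level
        if remaining < capacity:
            return level
        remaining -= capacity
    raise ValueError("block index is beyond the triple indirect range")


for index in (0, 11, 12, 12 + 2048, 12 + 2048 + 2048 ** 2):
    print(index, indirection_level(index))
```

Small files never leave level 0, which is the point of keeping direct pointers on the inode.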
So your file system will also cache to speed up writing to disks. Writing data to disk is really, really slow. You have to wait for something. So you can use a cache to speed it up, and memory is a great form of cache. File blocks are conveniently, most of the time, the same size as pages, so you can map them one to one. So file blocks can just be cached in memory, and the kernel would figure out that, hey, this page actually corresponds to this block on the drive, it's a cache for it. And I'll just keep it in memory instead of reading the disk again, because I just read it, and I'll just modify it in memory and I won't bother writing it out to the disk right away, because likely someone else is going to modify the file or do whatever to it. Also, if you're reading a big file, well, if I read one block, I'll probably read the next block, so you can go ahead and cache that ahead of time and just read it into memory before they actually use it. And if you keep track of which is which, well, then you don't actually have to read the disk again. So you could just have a kernel thread that only runs when there are no user threads with anything to do, when it's otherwise idle. And then you can use that idle time to flush all your caches to disk and actually write the information to the disk. That way, all your writes look really, really fast because, in reality, all they're doing is writing to memory and not actually going to disk. So sometimes you like that because it looks really, really fast. Sometimes you hate it because if it's cached and not actually written to disk and your power goes out, well, then all of your work is gone because it never actually got physically written to the disk, and now your homework is gone or whatever. So there are some controls you have over it. That's what flush and the sync system calls will do. They make sure that everything that is cached is actually physically written to disk.
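From user space, those controls look roughly like the following Python sketch. The helper name `write_durably` and the file name are made up for the example. `f.flush()` and `os.fsync()` are real standard-library calls: the first empties Python's own buffer into the kernel, and the second asks the kernel to push that file's cached data out to the device.

```python
import os
import tempfile

def write_durably(path, data: bytes):
    """Write `data` to `path` and do not return until the kernel has been
    asked to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)          # may only reach Python's user-space buffer
        f.flush()              # hand the buffered bytes to the kernel (page cache)
        os.fsync(f.fileno())   # ask the kernel to write this file's cached data out

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "homework.txt")
    write_durably(demo_path, b"finished on time\n")
    with open(demo_path, "rb") as f:
        readback = f.read()
```

Without the `os.fsync` call the write usually still returns quickly and reads back fine, since reads are served from the same cache; the difference only shows up if power is lost before the kernel gets around to writing it. `os.sync()` is the system-wide counterpart on Unix.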
So if you start putting in USB drives or something like this, I've had this happen before where I'm trying to format a USB drive and copy files to it. I copy a file to it. It says it's done, and then I take it out and the file's not actually there, because the kernel lied to me and didn't actually write anything to it. So what I should have done is copied it and then run sync to make sure it's actually on the USB drive. So I actually have my whatever. Nothing illegal, of course. So the last thing is the journaling file system. For that, well, power failure kind of sucks. And also, if you want to delete a file, deleting a file actually takes three steps if you're deleting the last reference to it, the last reference to that inode, so that it actually wants to get rid of the inode. So it has three steps. You would need to remove the directory entry. That's the name-to-inode mapping. And now that inode has zero things pointing to it. So the ideal next step is that you release that inode to the pool of free inodes to say, hey, this inode is available for reuse for a new file. And then whatever disk blocks that inode pointed to, you also release them and say, hey, those disk blocks can be used for other things. But the kernel keeps track of all of this, and you could have a power-off at any one of these steps, and then you would get into a really inconsistent state. So if you crash between one and two, well, nothing would point to that inode, but that inode would think it is still in use, so you would never be able to reuse it. If it crashed between two and three, well, that inode would be available to reuse, but all the blocks it points to, I can't reuse anymore, which is also bad. So there's something called a journaling file system that will basically just say, hey, I'm going to delete this file and start the three steps. It will write to a log when it starts, try to do the three steps, and then write again when it finishes.
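The three steps, the two crash windows, and the start/finish log records can be sketched as a toy model in Python. Everything here is hypothetical: `ToyFS`, its fields, and the `crash_after` switch exist only for illustration, the journal is a plain list standing in for an on-disk log, and redoing the steps on recovery is one possible recovery policy, assumed here because each step is written so that repeating it is harmless.

```python
class Crash(Exception):
    """Simulated power failure."""

class ToyFS:
    def __init__(self):
        self.directory = {"a.txt": 1}      # name -> inode number
        self.inode_blocks = {1: [10, 11]}  # inode number -> data blocks
        self.free_inodes = set()
        self.free_blocks = set()
        self.journal = []                  # stands in for an on-disk log

    def _delete_steps(self, name, ino, crash_after=None):
        self.directory.pop(name, None)     # step 1: remove the directory entry
        if crash_after == 1:
            raise Crash
        self.free_inodes.add(ino)          # step 2: release the inode
        if crash_after == 2:
            raise Crash
        # step 3: release the data blocks the inode pointed to
        self.free_blocks.update(self.inode_blocks.pop(ino, []))

    def delete(self, name, crash_after=None):
        ino = self.directory[name]
        self.journal.append(("begin", name, ino))   # logged before any step
        self._delete_steps(name, ino, crash_after)
        self.journal.append(("commit", name, ino))  # logged once all steps are done

    def recover(self):
        """Redo every delete that has a begin record but no commit record."""
        done = {(n, i) for kind, n, i in self.journal if kind == "commit"}
        for kind, name, ino in list(self.journal):
            if kind == "begin" and (name, ino) not in done:
                self._delete_steps(name, ino)
                self.journal.append(("commit", name, ino))

    def leaked_inodes(self):
        """Inodes that nothing points to but that are not marked free."""
        return set(self.inode_blocks) - set(self.directory.values()) - self.free_inodes

    def leaked_blocks(self):
        """Blocks still attached to an inode that has already been freed."""
        owned = {b for ino, blks in self.inode_blocks.items()
                 if ino in self.free_inodes for b in blks}
        return owned - self.free_blocks

fs = ToyFS()
try:
    fs.delete("a.txt", crash_after=1)  # power fails between steps one and two
except Crash:
    pass
leaked_before = fs.leaked_inodes()     # the inode is unreachable but not free
fs.recover()
leaked_after = fs.leaked_inodes()
```

Crashing with `crash_after=2` instead leaves the inode free while its blocks stay stranded, which is what `leaked_blocks()` reports until `recover()` runs.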
So if it crashes at any point in between the three steps, well, it would have already written what it is doing in the log, and if it's still in the log, it means it hasn't finished. So you can essentially go through, retrace your steps, and figure out at what point you crashed, and then you can recover so you can actually free all the blocks or all the inodes. Yep. So the log would be on disk. Before I do this, I write to the disk what I'm about to do. So yeah, the log would have to be on disk, otherwise you're screwed, right? Yeah, okay, yep. Yeah, when it says journal, it means it has one of these. So if your power goes down, hopefully nothing bad happens. Yeah, well, if it's not journaled, it doesn't keep track of it. So it might be that it doesn't need to journal because it can do it in one step, or something else. You'd have to let me know the specific thing you saw, but generally either it doesn't care about it or it's a one-step operation. Okay, so real quick, inodes were that hybrid allocation strategy. They give us greater flexibility than anything we saw yesterday: contiguous, linked, FAT, or indexed. It essentially uses the idea of indexed allocation, but with multiple levels like page tables, and then it kind of opts for pay for what you use. So we only use three levels of indexes if my file is huge and I actually need to point to that many blocks. Otherwise, if it's small, I use direct blocks. If it needs a bit more, I just use one table. Then if it's a bit bigger than that, I use two levels, and so on. So it's a pay-for-what-you-use thing. Then on Unix, everything's a file. Names in a directory can be hard or soft links. We saw that, lots of fun. All right, so let's remember, pulling for you, we're all in this together.