Okay. Hello. Welcome to the last third of the course. This is the chill third of the course, so we're pretty much done with all the hard stuff. For this last third, feel free: if you want to review anything we've seen before, or if you want to try applying threads or processes to your solutions or other projects, we can discuss that and do all sorts of fun stuff. I'm also not opposed to going over lab three. If people want that, here's a fun thing I'm throwing out there: someone who doesn't understand it, or whose solution didn't work, can just tell me to review their code. I'll review it in class and compare it with the solution. We'll have all types of fun. And if you do volunteer for that, I will not tell anyone who you are. So, you probably don't want to do that. All right. I might have to go over lab five stuff too. I'll release lab five, and hopefully grade lab three, after this on the commute home. But for today, let's talk about this, and to justify my decision to cut some content: who has ever used a hard drive that spins? You know, the spinning thing that goes whirr. One, two, three, four. All right. Any other spinning-disk users? So not that many. Over half of you have never used a magnetic hard drive. So yeah, I'm just cutting that content from the course, because no one really cares how a magnetic hard drive works unless you're hoarding a bunch of movies or something like that, in which case they are useful. I cut that, and we will talk about solid-state drives instead. They're the more modern thing. They use transistors, like RAM. They're not magnetic, they don't spin, they don't do anything like that. So there are no moving parts, and jostling them doesn't hurt. You know, the first-generation iPods were magnetic spinning disks, so if you ran too hard they would skip and do all sorts of weird stuff. And you don't want that.
So SSDs are solid-state drives: no moving parts or other weird physical limitations. As you might imagine, if you have a spinning disk, you have to wait for it to spin, and it spins at a certain rate. Who cares? We use SSDs. They have higher throughput. They have good random access, so you can access any page you want and get it at pretty much the same speed. They're energy efficient. They have better space density. The only cons are that they're more expensive and they have limited endurance. You might not have known this, but if you buy an SSD, it's only rated for a certain number of writes. After that, it wears out and can't store any information anymore. There are also some weird, peculiar behaviors that we will go into today, and they're a bit more complicated to write drivers for. Thankfully, we won't be writing any drivers, but we'll talk about what some of those complications are, because they kind of seep into operating system design and things you have to think about. SSDs also contain pages. Again, pages are just fixed-size blocks of memory, so SSDs have pages just like the virtual memory system deals with pages. They're organized a bit differently, though. There's a big hierarchy. In the green here, those are pages, typically four kilobytes each, so 4,096 bytes at a time. Those are organized into blocks, so a page lives in a block, and typically there'll be like eight or 16 pages in a block. Then multiple blocks live on a plane, and it all lives on a die that you can actually see if you open up an SSD and start playing around with it. So that's how they're organized. For the purposes of this course, we only care about pages and blocks. Yeah. Yeah. So they're kind of related. They nicely line up with pages in a virtual memory subsystem.
And they just made it that way because it matches and everything lines up. But the original four-kilobyte page size came from the virtual memory system. All right. So SSDs, flash, whatever, are much faster than HDDs because they're not spinning or anything like that. There's a typo on there I need to fix. Pages typically line up exactly with what we have from our virtual memory system, so four kilobytes. And then some timing metrics: reading a page is like 10 microseconds. Writing a page is typically slower, like 100 microseconds. And erasing a block is like one millisecond. You'll notice it says erasing a block, which is a bit weird, because it doesn't say erasing a page. The weird thing about SSDs is that you can only erase whole blocks at a time, not individual pages. And remember, going back here, there are multiple pages per block. So if you erase a block, you wipe out eight, 16, whatever pages, and that's what makes things difficult. That's just the way the hardware works. Yep. Oh, that was the question. Yeah. So it's designed that way due to physics I don't understand. Some of you might learn the physics of an SSD later. I decided not to care about it, but if you really care, there's a reason to that madness. I just don't know what it is. Some physical limitation, something like that. But if we care about programming: well, you can only read complete pages. And another weird thing is that I can't just write to any page I want. I can only write to freshly erased pages. That adds another layer of complexity for us if we are managing the SSD. So again, remember, erasing is done per block. Blocks can actually have more pages than the diagram shows, like 128 or 256 pages. And again, the entire block needs to be erased before you can write to it, and you're not allowed to rewrite a page after you've written it once. And writing is really, really slow, especially if it needs to use a new block.
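As a rough sketch of those rules (reads and writes happen per page, a page can be written only while freshly erased, and erasing happens per block), here is a toy model. The class and method names are made up for illustration; a real SSD controller or driver does not look like this.

```python
class FlashBlock:
    """Toy model of one SSD block: pages are written once, erased together."""

    def __init__(self, pages_per_block=8):
        # None marks a freshly erased page that may be written exactly once.
        self.pages = [None] * pages_per_block

    def write_page(self, index, data):
        if self.pages[index] is not None:
            raise ValueError("page already written; erase the whole block first")
        self.pages[index] = data

    def read_page(self, index):
        # Reads are always a complete page.
        return self.pages[index]

    def erase(self):
        # Erasing is per block: every page in the block is wiped at once.
        self.pages = [None] * len(self.pages)


block = FlashBlock()
block.write_page(0, b"hello")
try:
    block.write_page(0, b"again")   # second write to the same page is refused
    rewrite_allowed = True
except ValueError:
    rewrite_allowed = False
block.erase()                       # wipes all eight pages, not just page 0
block.write_page(0, b"again")       # fine now that the block is freshly erased
```

The only way back to a writable page is `erase()`, and that takes every other page in the block with it, which is what makes in-place updates awkward.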
You can imagine a scenario where you want to write to one page on a block, and that block has 255 other pages filled with data. If you want to write to that page and it's not freshly erased, what you might have to do is erase some other block, copy every single page over to the new block, and then write your new page to it. Which, as you can see, is kind of annoying. So if you're the OS doing that, you have to manage all of it and follow those rules. And the operating system's job is to help that hardware be as performant as possible, so you want to minimize the number of blocks you're erasing, the content you're moving, and all that fun stuff. So SSDs need to garbage collect blocks. That's what I was alluding to there: if you need to write some new data and you're not allowed to because the page has already been written, you have to move pages to a new block, which of course also has to be freshly erased. So there may be some overhead. You have to wait for the block to be erased, then write over any pages you want to copy, and so on. Also, the disk controller is generally pretty dumb. You might write to a page and then delete the file. You know you've deleted the file, but on the disk that data is still there. The operating system is just not going to use it anymore, so the disk won't do anything special about it. That's also one of the reasons why, if you delete a file, people can recover it if they know what to do: generally, whenever you delete something, it doesn't actually get deleted. It just doesn't get used anymore. So that's why, if you need to shred some documents or something like that, you should write random crap over them so you obliterate the data.
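To put rough numbers on that rewrite scenario, here is a back-of-the-envelope sketch using the timing figures quoted earlier (about 10 microseconds to read a page, 100 to write one, 1 millisecond to erase a block). The function name and the exact sequence of steps are assumptions for illustration; real drives overlap and reorder this work.

```python
READ_US = 10      # read one page (rough figure from the lecture)
WRITE_US = 100    # write one page
ERASE_US = 1000   # erase one block (1 ms)


def rewrite_cost_us(live_pages):
    """Microseconds to update one page when `live_pages` others must be kept.

    Assumed steps: erase a spare block, copy each live page (read + write),
    then write the updated page into the new block.
    """
    copy = live_pages * (READ_US + WRITE_US)
    return ERASE_US + copy + WRITE_US


worst_case = rewrite_cost_us(255)   # 256-page block, one page being updated
fresh_page = WRITE_US               # writing to an already erased page
```

Under these assumptions the update costs about 29 milliseconds instead of 0.1 milliseconds, which is why you want to avoid moving pages around whenever you can.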
So, since the SSD hardware doesn't know which pages or which blocks may still be alive, the operating system has to help it out and tell it which ones aren't used, and therefore the operating system needs to keep track of that. There is something called a TRIM command that informs an SSD that a block is unused, because otherwise it wouldn't know whether or not the pages on the block are used. So the TRIM command is a way to tell the SSD hardware: hey, I'm not actually going to use that block anymore. Then the SSD hardware itself will go ahead and erase that block whenever it has free time and is otherwise idle. You don't pay that overhead later, and that speeds everything up. So far, we've only been talking about single devices, which is kind of boring. Sometimes that's jokingly referred to as a single large expensive disk, and if you really care about your data, you generally don't want to do something like this. If all your homework is on one hard drive and that hard drive dies, you are now completely screwed. That's a single point of failure. The opposite of a single large expensive disk is a redundant array of independent disks, implying that they're probably cheaper. Instead of buying one disk and writing your data to it, you buy multiple disks and distribute your data across them. And because you can distribute your data across different disks, you have a bunch of design options, depending on how much you care about that data. You can use redundancy, and all redundancy means is that I have multiple copies of something. I can use redundancy to prevent data loss, or I can use the fact that there are multiple copies of a file to increase the throughput of all my disks working together over just a single disk.
So the easiest one is called RAID 0, and it is called a striped volume. Typically it takes all of your data and organizes it into stripes, which will be like 128 kilobytes or 256 kilobytes, and they're distributed over the disks. So disk 0 here would have the first 128 kilobytes of a file in stripe A1, the next 128 kilobytes would be in stripe A2 on the other disk, and then it just alternates back and forth between the two drives. Essentially, half the file is on one drive and half is on the other. In this case we don't have any redundancy: we still have one copy of file A. It's just that half of it is on one disk and half on another. So why would I want to do this? Well, if you stripe data across all the disks, you get faster access. Instead of reading one file from one disk, I can speed it up by a factor of two in this case. If my bottleneck is the disk, because disks are slow, I read half of the file from one disk and the other half from another disk at the same time, and therefore it's twice as fast. Any questions about that? Nope. Okay, so this extends to however many drives I want. I could do it with four drives: instead of splitting your data half and half, it's a quarter, a quarter, a quarter, and a quarter. When you want to read, you read a quarter from each drive in parallel, and it's about four times as fast.
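The striping described above can be sketched as a small address calculation. This assumes a plain round-robin layout with 128-kilobyte stripes; the function name `locate` is made up for illustration and is not any real RAID driver's API.

```python
STRIPE_SIZE = 128 * 1024  # 128 KiB stripes, one of the sizes mentioned above


def locate(offset, num_disks, stripe_size=STRIPE_SIZE):
    """Map a byte offset in the volume to (disk index, byte offset on that disk)."""
    stripe = offset // stripe_size   # which stripe overall (A1 is 0, A2 is 1, ...)
    disk = stripe % num_disks        # stripes alternate round-robin over the disks
    row = stripe // num_disks        # how many stripes precede it on that disk
    return disk, row * stripe_size + offset % stripe_size


first = locate(0, 2)                    # start of A1: disk 0
second = locate(128 * 1024, 2)          # start of A2: disk 1
third = locate(256 * 1024 + 5, 2)       # 5 bytes into A3: back on disk 0
```

Consecutive stripes land on different disks, so a large sequential read keeps every disk busy at once, which is where the speedup comes from.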
But now you are in the unfortunate situation where before you had a single point of failure, and now you have multiple points of failure. If I go back to this, I don't have two copies of the file. I have half on one drive and half on the other, so if one of these drives dies, I lose half of the file and I can't recover it. The file is just corrupted. It's useless, I can't use it anymore. It's worse if I have four drives: if any one of those drives dies, I lose all my information. But the nice thing about this is it's really, really fast. So if you really care about performance and you don't care about the integrity of your data, say for a game or something like that, where if my hard drive gets corrupted, whatever, I'll just re-download it, then I might want to use something like this. It's typically only used for performance. The other extreme is something called RAID 1, which is a mirror across all the disks. It's like an actual mirror: everything is exactly the same on each disk. So if this file were split up into four little chunks, A1, A2, A3, and A4, both hard drives would have a copy of each of those chunks. Now if one of these drives dies, it doesn't matter, because it was an exact copy of the other drive and I haven't lost any data. You can actually recover from this. In general, this scales up: in RAID 1, every disk is just an exact copy of all the other disks, so if I had four, all four would be exact copies of each other. Why would you want to do this? Well, maybe you really, really care about your data. As long as one of those disks is still alive, you have not lost any data, which is great. It also gives you good read performance. Even though there's a full copy of the file on each hard drive, I can still get the read performance benefits of RAID 0, because I could read, say, A1 and A3 from disk 0 and A2 and A4 from disk 1, so I
can still read half the file from one drive and half the file from the other drive and get, in this case, twice the read performance. But my write performance is the same as a single disk, because if I need to write this file, I have to write it to every single disk. I can write to every disk in parallel, theoretically, so it's just as fast as one disk: I get no speed improvement whatsoever, but it also doesn't really go slower. This has a really high cost for redundancy, though. I'm wasting an entire drive, or if I have more drives in RAID 1, I'm essentially just wasting drive space, unless I really care about read performance and really care about my data. And you don't get any benefit to write performance. Any questions about RAID 1? Cool. All right, so RAID 4. You'll notice we have skipped some numbers. RAID 2 and 3 are just some weird mistakes of the past that we will not go over. In fact, RAID 4 is also a bad idea that is never going to be used, but we'll go over it very quickly because it will help us understand RAID 5. RAID 4 introduces something called parity. It's the same idea as RAID 0, where we stripe the data over the disks, but in this case we have a dedicated kind of backup disk, a parity disk. The property of that parity disk is that, as long as we have the parity information, if one of the disks dies we can reconstruct whatever data got lost. And all the parity is, is an XOR calculation. We can see that real quick. Say we have four drives, disks 0, 1, 2, and 3, and on disk 3 we have the parity information. If you don't want to go into the details of XOR, the easiest way to think of it for binary numbers is that it tells you whether the result of adding everything together is even or odd. If you XOR everything together and it's 0, the sum is even; if you XOR everything together and it's 1, the sum of everything is odd, if you want to
think of it that way. Typically I think that's an easier way to think about it. So the idea behind the parity bit is that it's the XOR of everything. Say on disk 0 we have 1 0 1, on disk 1 we have 0 1 0, and on disk 2 we have 1 1 0. So this is the data on my drives. For each bit position, across all the drives, I calculate a parity, which is just an XOR of everything in that column. For the first column I XOR 1, 0, and 1. If I want to take the shortcut, the sum is an even number, so XORing them all together gives 0. For the second column it's the same idea: the sum is even, so the parity is 0. For the last column I XOR together 1, 0, and 0, which is 1, or otherwise I can think of it as odd. Now the fun property is that I have some extra information, so if one of these disks dies, I can reconstruct it without knowing anything that was on it. Remember disk 1 held 0 1 0, and we'll figure that out as if we knew nothing about it. So say this disk dies. Now we've lost all that information, but because we have the parity information, we can deduce what bits were there. The idea is: the drive dies, you go to the store, you buy another one, and you say, please reconstruct my data, I don't want to lose anything. We're basically solving an equation here. For the first column, the surviving bits are 1 and 1, so 1 plus 1 is even, and then plus either a 0 or a 1 has to give us something even, because the parity bit is 0. If I say the lost bit was a 1, that doesn't make sense according to the parity bit, because it says that if I add up disks 0, 1, and 2 I should get an even result, and with a 1 there it would be odd. So clearly what I lost was not a 1, it was a 0, and I reconstruct a 0. Then you do the same thing with the next column. I have a 0 and a 1, so the sum is 1, which is odd, and then plus 0 or 1 has to equal something even. So this has to be a 1, because 1 plus
1 makes something even, right? If it were a 0, it would be odd, and then it would not agree with the parity. Similarly for the last column: I have a 1 and a 0, which add to 1, and I know that adding my missing bit should produce an odd result, so this bit has to be a 0. So we can reconstruct the drive we lost just by using that parity information and looking at all the other information, working backwards to figure out what was lost. Any questions about that? Okay, so that's the whole idea behind RAID 4: no matter what, if we lose one disk, we're good. If we ask how much space we can actually use, well, because the data is striped across the disks, I can use all the space minus that disk 3 there. There's one disk reserved for parity, so I can't use it for any of my useful information. It won't store any of your files. It's just full of XOR calculations, and that's it. With this you also need at least three drives for it to be worthwhile, or even possible, and one disk is going to be used up storing all that parity information. The pros: we get a performance boost over using a single disk, of course. We get N minus 1 times the performance, because if we remove the parity disk, it looks exactly like our striped example, just with one extra disk used for parity. And the nice thing is that now we have some redundancy. If one of my disks dies, I don't lose any of my precious data, because I can go buy another one, stick it in there, and rebuild it. It will get all my data back, and I will be happy as a clam. The bad thing is that write performance really suffers, because every write has to write to that parity disk. Because these are striped, if you're using a bunch of little small files, well if I
just touch A1 here on disk 0, and I touch B2 on disk 1, and I touch C3 on disk 2, well, those are three accesses to different disks that can all be independent. But disk 3 would need to update the parity for A, B, and C, so it has to do three times the work, which isn't that great. There's a bit of a bottleneck on that disk, because whatever you do to any disk, you're going to have to touch the parity disk, which is not going to be terribly good for performance. That is where RAID 5 comes in. RAID 5 is literally the exact same idea as RAID 4, except it removes that bottleneck on a single drive. Instead of having one drive hold the parity information for all the other drives, the parity is distributed across all the disks. In this case, disk 3 only has the parity information for A, disk 2 only has the parity for B, disk 1 has the parity for C, and disk 0 has the parity for D. You just distribute the parity across all of them, so there's no one bottleneck on any individual drive. Any questions about that? Same idea as RAID 4, but it gets rid of that bottleneck. That's why no one ever uses RAID 4: this is the same idea, but better in every respect. So all the same pros as RAID 4, and performance is improved a bit too, because we no longer have that bottleneck on the single parity drive. If I have all those writes, they don't all have to wait on one disk, because the parity is distributed across all of them, so your write performance will be a bit better. Sorry? Yeah, well, the idea is that it's just distributed across all of them. You can come up with cases where one disk will still be kind of a bottleneck, but on average it will be distributed across all of them. If I'm writing to all the files, it's not concentrated on one disk that I'm waiting for. It just
improves the average time. It doesn't reduce the number of writes or anything, it just distributes them a bit more. You can still come up with cases where one disk gets an excess of writes, but in general, on average, it gets better. All right, so RAID 6 is the same thing, except it adds another parity calculation. Instead of just a single parity, there are two sources of parity. We now have P, which is our normal parity information, and also Q, which is another parity calculation. So we have two calculations now, or two equations. It's the same idea: it just distributes those parity blocks across all the disks. For example, disk 3 has the P of A, with the Q of A on a different disk, and disk 2 has B's P and disk 3 has B's Q. So it's the same idea, we just have another redundancy calculation, and it's distributed. Yep? No, P and Q aren't the same. P is just the XOR calculation, like that equation. The Q calculation is a bit more involved, so we won't go into the details. It gets into some fairly advanced math, Galois fields, if anyone has ever heard of those, something like that. It basically comes down to what you'd see in a math course: it's just another equation, so you can solve for two unknowns now. We don't have to worry about the actual parity calculation. You can just think of it in terms of discrete math or whatever: we have another equation, so we can solve for another unknown. It's the same idea, with an extra parity calculation, so now we can tolerate two disks dying, because we have two equations. One is easy to understand, which is just straight XOR. The other one, if you're interested, you can look up. Like I said, it's more involved, but it's perfectly doable. So now we have essentially two disks' worth of parity, because we have two equations. This requires at least four drives, and for your usable space, well, two drives are essentially used for parity information,
one for the parity of Q, one for the parity of P. Yep, you can reconstruct any two drives. The write performance is going to be slightly lower because we have to do another parity calculation, and rebuilding might be a bit slower because it's two calculations, but yes, you can rebuild any of the ones you want. Yep? So if you just have three, that's one drive for useful information and two drives of parity information, and at that point you may as well just use RAID 1 and have a mirror, because it's the same thing. You're just wasting space at that point, so you want at least four drives for this. Yep? So the question is how often drives fail. Drives fail a lot. Typically these RAID configurations use spinning magnetic disks, so that's in the big warehouses and stuff like that. If you run a big data center, these things fail. I think the average failure rate is like 5% a year, so if you're running a Google data center, drives are going to die constantly. It's not a question of if a drive will die, it's just when, and they die all the time. Yep, that's a funny story. As part of my grad studies, I knew about this, so I had my work on a RAID 5, so I could tolerate a drive failure, no problem, right? Inevitably, when I'm about to finish up grad school, one of my drives dies. I'm like, okay, well, I can still access all my files, no problem. And while I'm too lazy to go out to the store to buy another one and reconstruct it, another one dies. So, since I had a RAID 5, did I lose data? Yeah. I had a RAID 5 and two drives failed, because I didn't replace one in time. Typically they will go at around the same time, because you usually buy them all at the same time, so their failure characteristics will probably be fairly similar. What you typically want to do is buy the same brand of drives but from different stores, to try to spread it out a bit so they won't all fail at the same time. Luckily for me, the one that started to fail was still not too
far gone, so I ran to the store, bought two drives, and it recovered most of it. So I was lucky, but don't be like me and don't do that, because if one drive fails, you are living on the edge. Another one's probably going to fail, so you may as well go fix it. And depending on how fast the drives are, RAID 5 and RAID 6 take a long time to rebuild. Recalculating parity for terabytes of data takes a long time, on the order of days. So you have to wait a few days, and you can't really use it while it rebuilds. That's another consideration. There's also something we didn't talk about but should: a combination of RAID 0 and RAID 1. RAID 0 by itself has no redundancy whatsoever: it just puts half the file on one drive and half on another. But you can do this in a tiered configuration, called RAID 1+0, or sometimes RAID 10, where essentially you have two copies of this. So you have four drives. Two look exactly like this, with half the data on one drive and half on the other, and then you have an exact mirror of that: another two drives that hold the same two halves. That way you can only use half the space, because you have an exact mirror, but you don't have any parity calculations. So if a drive dies, reconstructing it is really fast, because all you have to do is copy the data. No calculations whatsoever, you just copy. Each of the drives has a mirrored partner, so you only lose data if you're really unlucky: you lose data if both drives in a pair die at the same time, but as long as you have one drive left in each pair, you can tolerate some failures. If I have eight drives, then I have four different pairs, so in the best case I could lose up to four drives and not lose any data, if it was one drive out of the unique pair for
each of the drives that died, or I could get really unlucky and both drives of one pair die together, and then I lose a bunch of data. So those are other ways to go about it. RAID 1+0 is actually typically used more, just because of that reconstruction time: it's a lot faster than recalculating parity information. You lose a bit more space, but if something fails, you typically want to get back up and running as fast as possible, so people use that. I've actually switched my stuff over to it so I don't end up in that same situation again, because, yeah, it's not good. All right, any other questions? Basically, don't do what I did. All right, so these are disks, and they are part of the persistence side of an operating system. We'll go into more details later. Today we explored SSDs and RAID. These are, again, your non-volatile storage, so whenever you reboot your computer, all the information is still going to be there, and it's going to be great. We saw SSDs: they're more like RAM, except they survive a reboot. They are accessed in pages and blocks, and they have a lot of very weird rules, mostly that you can only erase a block at a time and you can only write to a freshly erased page. That comes with a bunch of challenges if you get into the nitty-gritty details. Because of that, SSDs need to work with the operating system to get the best performance, like with that TRIM command that informs the SSD hardware: hey, I'm no longer using this block, you're free to erase it when you're otherwise idle. And then we saw a bunch of RAID levels. They allow you to tolerate failures and improve performance using multiple disks, and that's what any data center is going to use. Yep? So SSDs can be used, especially the cheaper ones. People are buying them and putting them in RAIDs now, but most people do RAID with spinning magnetic disks just because they're cheaper. We'll talk more about different storage media when we go on. All your
laptops will not have this, so that's why you should push your code to the server often. Then if your laptop dies, it's kind of like you have a RAID 1, because the mirror is the server, and you don't have to worry about it. But if you're also a data hoarder like I am, then you might want to know this stuff too. And essentially, don't do what I did with my RAID 5, where a drive dies and you just don't do anything about it. All right, any other questions? We can wrap up early. Told you: chill third of the semester. So lab 5, you'll see it right after this. You guys can read it and tell me what you think. I might have to go over virtual memory again, but yeah, it'll be virtual memory. But just remember, we're all in this together.
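For anyone reviewing the RAID 4 and RAID 5 parity idea afterwards, here is a minimal sketch of the XOR calculation and the rebuild of a dead disk. The bit values are one reading of the three-disk example worked through in the lecture, and the function name `parity` is made up for illustration; a real array does this over whole stripes, not three bits.

```python
from functools import reduce
from operator import xor


def parity(disks):
    """XOR the bits at each position across all the given disks."""
    return [reduce(xor, column) for column in zip(*disks)]


# Assumed bit values for the three data disks in the example.
disk0 = [1, 0, 1]
disk1 = [0, 1, 0]
disk2 = [1, 1, 0]

p = parity([disk0, disk1, disk2])      # what the parity disk would store

# Disk 1 dies: XOR of the survivors and the parity gives back what was lost.
rebuilt = parity([disk0, disk2, p])
```

The same function does both jobs because XOR is its own inverse: XORing the parity with every surviving disk cancels them out and leaves only the missing one.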