Okay, all right. So this is the division: the planes are divided into things called blocks, and those blocks contain a bunch of pages. Sometimes the pages are the same size as the pages we use for memory, so 4096 bytes. Sometimes they're a different size, so they don't necessarily have to match memory; they can be, like, eight kilobytes, something like that. But the pages live within a block, and that's where the complexities happen.

So pages are typically four kilobytes, and that lines up nicely with the pages we use in memory, but they have a few weird caveats. Reading a page is really, really fast, on the order of ten microseconds. Writing to a page takes a lot more effort, so it's like a hundred microseconds. And then erasing a block is like one millisecond. You might look at this and be like, well, that's weird, what about just erasing a page? Well, that's where the complexities come in, because you are not allowed to erase a single page. You have to erase a block, and a block contains multiple pages, so erasing a block erases multiple pages.

The other silly rules are that you can only read complete pages, and you can only write to freshly erased pages, so I can't just overwrite data without first erasing it. And again, remember, erasing is done on a per-block basis, and usually a block will have like 128 or 256 pages. That means I have to wipe out the entire block before actually doing some writing, and again, writing is slow.

So we might even need to use a new block if we just want to modify a single page. If we modify a single page, well, there might be 127 other valid pages on that block. To modify it in place we'd have to erase the entire block, and that would erase those other pages too, so we'd have to save them: we'd have to move them to a freshly allocated block. You can see how this gets challenging. Fortunately, we won't have to deal with these issues ourselves.
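To put rough numbers on that single-page modification, here is a minimal sketch in Python. It assumes the ballpark latencies quoted above (10 microseconds to read a page, 100 to write one, 1 millisecond to erase a block) and assumes every operation happens one after another with no parallelism inside the drive. The function name `rewrite_page_cost_us` is made up for this example; a real controller is far smarter than this.

```python
# Rough latencies from the lecture (orders of magnitude, not a datasheet).
READ_US = 10      # read one page
WRITE_US = 100    # write one page (only possible on a freshly erased page)
ERASE_US = 1000   # erase one whole block


def rewrite_page_cost_us(valid_other_pages: int) -> int:
    """Estimated cost of changing one page when the other valid pages in
    its block must survive: copy them to a fresh block, write the new
    version of the page there too, then erase the old block."""
    copy = valid_other_pages * (READ_US + WRITE_US)  # read old, write to fresh block
    write_new = WRITE_US                             # the modified page itself
    return copy + write_new + ERASE_US


print(rewrite_page_cost_us(127))  # 15070 microseconds, about 15 ms
print(rewrite_page_cost_us(0))    # 1100 microseconds if nothing else is valid
```

So under these assumptions, touching one page in a full 128-page block costs roughly 150 times more than a plain 100 microsecond page write, and almost all of that is copying pages that did not change.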
We're just taking a brief overview. So the operating system has to help with this. SSDs will garbage collect blocks: if you're using a block and there are only a few live pages left on it, well, the SSD could say, there are only like two pages left, so I may as well move these to a freshly allocated block and then completely erase this block. Now I can write to it again, and I have some more space. Moving those live pages is going to be some overhead, because I'm just copying pages that exist anyway, but I'm doing it because I want to free up that block so I can actually write to it again.

Now, the disk controller is going to be really dumb. It doesn't really know which blocks are still alive and in use by the operating system, referring to files and things like that. So your operating system is going to have to help the drive and say: okay, these pages, or this whole block, I'm not actually using anymore, so you're free to just erase it. You don't need to preserve it for me.

That's what the TRIM command is, if you've ever seen that. TRIM is an option for SSDs, as an optimization to get much better performance out of the drive and also increase its longevity. Essentially all it does is inform the SSD that an entire block is unused. The SSD has its own little controller on it, and it's then free to erase that entire block without moving anything. By default it would have to move things to be safe, but if you're not using anything on the block, it can just erase the whole thing. You don't have to worry about it, you don't have any copying overhead, and the SSD can do it while it's otherwise idle, because it has its own controller.

So far, that's about all for hardware; we won't get into too many details. The thing that comes up next is, well, a single device. If you are trying to be resilient and protect against a drive failure or something like that, single devices are often a bad idea.
So sometimes just having a single hard drive or SSD is called a single large expensive disk, SLED, to kind of make fun of it, because it's essentially just one giant point of failure. Most of you have probably had a computer die on you, or a hard drive die on you, and then guess what: all your files are lost and you're screwed, right? You usually don't want that failure, especially if it's really critical data. So for instance, I don't know, it's your PhD thesis or something like that. You probably don't want that to fail, because there goes three years of your life, or four or five, or however long it takes.

So there's another technique called RAID, if you've ever heard of that before. It stands for redundant array of independent disks, and it aims to use multiple drives together to provide some redundancy. Redundancy is just having multiple copies of the data, to prevent data loss in the event that a single drive fails, because that happens all the time. RAID can also use this redundancy, having multiple copies of the data, to increase throughput or otherwise go faster than a single disk.

For instance, if you're Google or something like that, you will have tons and tons of disks. And with hard drives, it's not a matter of if a device will fail, it's a matter of when. Hard drives fail all the time. It's someone's job at the Google data center or the Amazon data center to go through and replace bad drives, and that's what they do all day.

So there are many techniques to combine disks. Each cylinder in this diagram represents an entire disk, and the letters with the numbers represent pieces of data. RAID 0 is called a striped volume, and it just distributes the data over the disks. Let's say this file, call it A, is in eight different parts. Well, RAID 0 doesn't put all eight parts on a single drive.
It just splits the file across two drives, so essentially each drive gets half of the file. A1 goes on disk 0, A2 goes on disk 1, A3 goes on disk 0, A4 goes on disk 1, and so on, and they each have half of the data on them, right? So any questions about that?

Yeah, so at this point, if disk 1 fails, I've lost half my file and I can't recover anything. So this does not help you with failures. It's actually worse, right? The default case is: if my one disk dies, I lose everything. In this case, if either disk dies, I lose everything, because half a file is useless; you still can't reconstruct it or anything. So this is actually worse if you care about the integrity of your data, because now instead of a single point of failure, I have two points of failure.

So why on earth would I do this? Well, this is primarily for speed. Instead of reading a file from a single drive, I can read a file twice as fast: I read one half from this drive and one half from that drive, so I use both disks at their full speed. So I get a two-times speedup, or in general a speedup of however many disks I have. This is purely for performance. So, like, if you have a sick gaming rig or something like that, it might have two drives in RAID 0, and the reason for that is to go fast. The con is that, well, you get lots of speedup, but any disk failure results in data loss, so you actually have more points of failure. Any questions about that? So RAID 0: go fast.

The other extreme is something called RAID 1. That is a complete mirror, and it will mirror all data across all the disks. So say file A is now just in four parts. Disk 0 would have an entire copy, so A1 through A4, and disk 1 would also have an entire copy, A1 through A4. So in this case, if either disk drive dies, I don't lose any data, right?
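The two layouts can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a "disk" is a list of part names, striping is round-robin, and the helper names `stripe`, `mirror`, and `survives` are invented for this example rather than taken from any real RAID tool.

```python
def stripe(parts, n_disks):
    """RAID 0 sketch: deal the parts out round-robin across the disks."""
    disks = [[] for _ in range(n_disks)]
    for i, part in enumerate(parts):
        disks[i % n_disks].append(part)
    return disks


def mirror(parts, n_disks):
    """RAID 1 sketch: every disk holds a full copy of every part."""
    return [list(parts) for _ in range(n_disks)]


def survives(disks, dead, parts):
    """True if every part still exists on at least one disk that is not dead."""
    alive = {p for i, disk in enumerate(disks) if i not in dead for p in disk}
    return set(parts).issubset(alive)


file_a = [f"A{i}" for i in range(1, 9)]   # A1 .. A8
raid0 = stripe(file_a, 2)
raid1 = mirror(file_a[:4], 2)             # A1 .. A4 on both disks

print(raid0[0])                           # ['A1', 'A3', 'A5', 'A7']
print(raid0[1])                           # ['A2', 'A4', 'A6', 'A8']
print(survives(raid0, {1}, file_a))       # False: half the file is gone
print(survives(raid1, {0}, file_a[:4]))   # True: disk 1 still has everything
```

The same `survives` check gives opposite answers for the two layouts after a single failure, which is the whole trade-off in miniature.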
So if disk 0 dies, it doesn't matter, because my data is all on disk 1. All right, any questions about that?

Yeah, so say each disk was two terabytes or something like that. Then you could only store two terabytes of data in total, because there's a copy on every single disk; they all look the same. So that's another point to take.

Yeah, so that's another question: would you also get the performance enhancement with RAID 1, like with RAID 0? So if we were to read a file from this, could we read half from one disk and half from the other disk? I see a no. Why no? Yeah, you can. In this case, if I'm reading, I could read A1 from disk 0, A2 from disk 1, A3 from disk 0, and A4 from disk 1, so I could get double my read performance if I wanted. But if I have to write data, say I make a new file called B, I have to put B on both disks, right? So for writing I don't get any benefit, because I have to do the same thing for every single disk. It behaves exactly the same as a single disk.

Yeah, so I can write that entire file in parallel to both disks at the same time, but it's going to have the same performance as a single disk, right?
I just do it in parallel. So I can speed up my reads in this case, because I could read part of the data from each drive, but I can't speed up my writes.

And with this one I waste a lot of space, because I only get one disk's worth of storage. If each of these disks is two terabytes, then I can only use two terabytes. With RAID 1 the same rule applies no matter how many disks you have. If I had a third disk here, it would have an entire copy too, so I could speed up my reads by three times, but my writes are still like a single disk and I can still only use two terabytes. While in RAID 0, if both my disks were two terabytes, I can use four terabytes; if I had three disks and they were all two terabytes, I can use six terabytes. So RAID 0 lets you use all the available storage and you get speed benefits, but you've created more points of failure.

So RAID 1 is real simple, but kind of wasteful. It has really good reliability: as long as a single disk remains, since all of them are copies of each other, you have not lost data. If I have 10 disks and they're all copies of each other, I can lose nine before I lose my data. Nice and safe. It also has good read performance, because I can split up my read across all the disks and get a nice speedup. But it has a really high cost for redundancy, since everything is just an exact copy of everything else, and the write performance is the same as a single disk. We can actually do a bit better.

So these are the two extremes. Likely you want something in between: something that is also a bit fast, that doesn't waste as much storage, and that has fewer points of failure. So there is something called RAID 4. You might ask what happened to RAID 2 and 3. Two and three were bad ideas, so we go straight to four. Four is also a bad idea, so this one's not actually used.
So we'll do a little tweak to this later, but RAID 4 is the first one that actually makes some sense. What RAID 4 does is essentially take the idea from RAID 0 and distribute the data across the disks. In this case I have four disks, and I distribute the data across three of them, and I'm going to use the last disk here, disk 3, for some parity.

What's parity? Parity is just some information I can use to reconstruct the data in the case that I lose something. In this case the parity would be an XOR, so it would be an XOR of all the bits in A1, A2, and A3. That way I can reconstruct the information if one of those disks dies. So how would that work?

Anyone remember how XOR works? Yeah, all right, so we remember this. If we have A and B, then A XOR B is 1 when exactly one of them is 1, and 0 otherwise. That's one way to think of it. Another way: you can have XOR with as many inputs as you want, as long as they're binary, and an easier way to think of XOR is that it essentially adds all of the digits together and tells you if the result was even or odd. If it's a 1, the sum was odd; if it's a 0, the sum was even. Especially if we have way more inputs, that's probably an easier way to think about it.

So say we have A1, A2, and A3, and let's just assume each is a single bit; if it was a byte, you'd just do this for every bit. Say A1 is 1, A2 is 0, and A3 is 1. We compute a parity, and our parity is going to be an XOR of all of them, so I write the XOR symbol twice. If you take my shortcut, I can add them all together and it will tell me whether the result was even or odd. So 1 plus 0 plus 1, is that even or odd? Even. So what should the result of XORing all of them be? 0, right? Now, this gives us a nice trick.
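As a quick check of that shortcut, here is a small Python sketch. It computes the parity of single bits, confirms that it matches "add everything up and see if it's odd", and then does the same thing for whole bytes, one bit position at a time. The names `parity_bit` and `parity_block` are just labels chosen for this example.

```python
from functools import reduce
from operator import xor


def parity_bit(bits):
    """XOR of any number of single bits (each 0 or 1)."""
    return reduce(xor, bits, 0)


def parity_block(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


a1, a2, a3 = 1, 0, 1
print(parity_bit([a1, a2, a3]))   # 0, because 1 + 0 + 1 is even
print((a1 + a2 + a3) % 2)         # 0, the "even or odd" shortcut agrees

# Same idea on bytes: XOR works on every bit independently.
print(parity_block([b"\x01\x02", b"\x04\x08", b"\x10\x20"]).hex())  # 152a
```

Python's `^` operator on integers already XORs all bits at once, which is why the byte version needs no extra logic.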
So knowing that, say for some reason A2 is dead. That disk has died, its data is lost, and now I have no idea what this value is. Can I work backwards, using the parity and knowing the values of A1 and A3, to get what this should be, whether it should be a 1 or a 0?

Yeah. So in this case my parity tells me that if I add everything together, it should be even, right? That's basically what a parity of 0 says. The two values I still have are 1 and 1, which add up to an even number, which means the missing one can't be a 1. If it was a 1, then the sum of all of them would be odd, which conflicts with the parity I calculated, right? So if it can't be a 1, it has to be a 0, and I can reconstruct it. I can get back to the original value knowing that parity information. And this works no matter which one dies. If I have the same setup and I lose A1 instead, then knowing that A2 was 0 and A3 was 1, and that everything added together was even, A1 must have been a 1. So I can reconstruct it, right?

So any questions about that? Yeah, what if two disks die? So if it's like this, what can you do? In this case, both of those are valid solutions, and one's right and one's wrong. So if two disks die, you're screwed. We'll get to that: if you wanted to protect against two disk failures, it gets much more complicated. It gets into, like, math stuff.

Yeah, so this is just showing a single bit, but if those A blocks were bytes, you'd do the same thing. It's just the same XOR across all of them; the concept is the same, you just deal with every single bit.

Yeah, no, in this case we're assuming disk failure, like the entire disk is dead. So through that, we just argued that we can recover if a single disk dies, right? One disk dies, we're good. What about performance?
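The recovery argument can be sketched the same way. The assumption here is the setup from the example: three data blocks plus one XOR parity block, with whole-disk failures only. XORing the parity with everything that survived gives back the missing block, and the last few lines show why two missing blocks cannot be pinned down. `xor_blocks` and `reconstruct` are names made up for this sketch.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


a1, a2, a3 = b"\x01", b"\x00", b"\x01"
p = xor_blocks([a1, a2, a3])            # parity is b"\x00"

print(reconstruct([a1, a3], p) == a2)   # True: lost A2, got it back
print(reconstruct([a2, a3], p) == a1)   # True: works for any single loss

# Two losses: with only A3 = 1 and parity = 0 known, list every (A1, A2)
# pair that is consistent with the parity. There is more than one.
candidates = [(x, y) for x in (0, 1) for y in (0, 1) if x ^ y ^ 1 == 0]
print(candidates)                       # [(0, 1), (1, 0)]
```

One parity block is one equation, so it can solve for one unknown; two unknowns leave two equally valid answers.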
Yeah, so in this case reading and writing is about three times as fast, because minus that one disk, it's the same as RAID 0, right? I'm essentially just using a single disk for parity, and that's it. Yeah, that's right, it wouldn't be exactly three times, because we're going to spend some time calculating the parity, and then we also have to write it to the parity disk.

Oh, also in this case, if all those drives are two terabytes, then we get six terabytes of usable storage, right? Because one disk is set aside for parity, and we're essentially striping data across the remaining three, so we can use all the space minus the one disk.

Yep. Yeah, so some things do it in software and some things do it in hardware; most things do it in software now.

You can see that this is kind of a bad idea, though. Say on disk 0 we change A1, on disk 1 we change B2, and on disk 2 we change C3. Well, guess what: the parity disk gets hit by every parity calculation. It has to write three times, while each data disk only writes one little block there. So we create kind of a bottleneck with that parity disk, which isn't great.

Yeah, disks are way slower than the CPU, like many orders of magnitude slower. And the operating system is telling the disks what to write anyway, so that data is going to be in memory and it can just XOR it there.

Yeah, so each one of these disks is a separate hard drive. This would be like you go to the store and you buy four disks; these are all physically different disks. So this is only for when you care about your data, because in most of our computers, in any of our laptops, it's a single disk, right?
So we're kind of screwed, but if you actually care about your data, use something like this. In this case we get to use essentially all of our available space minus one disk, because we're using that one for parity, and this technique requires at least three drives. The pro is that we get N minus 1 times performance. If we had five disks, for example, one of those disks is for parity and the other four are for striping, so we basically get four times the performance. And if any one disk fails, we can recover: we can just put a new disk in, recalculate the values it should have had, and we're good to go. The con is that write performance can suffer, because everything is bottlenecked on that single parity disk; every write needs to change something on that disk.

Yep, so it's just N times more performance, and it's always compared to a single disk. So for RAID 1, if I have two disks, the read performance will be two times faster than a single disk; if I have three disks, the read performance will be three times faster than a single disk. For write performance, it's the same as a single disk, and yeah, we're assuming we can write in parallel. So all of the comparisons are in relation to a single disk. Here, when we say N minus 1, that's the multiplier: in this case we have four disks, one is used for parity, so it's three times faster than a single disk.

All right, so now to get to the fun one. We can actually fix this con of being bottlenecked on the parity disk, and now we get into the first RAID level with parity that is actually used. I mean, RAID 1 and RAID 0 are used, but RAID 5 is the first one with some parity that's actually used. Conceptually, it's exactly the same as RAID 4, except we distribute the parity blocks across all of the disks.
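The difference between the two layouts comes down to one function: which disk holds the parity for a given stripe. A rough Python sketch follows. The rotation used for RAID 5 here (start on the last disk and walk backwards one disk per stripe) is an assumption picked for illustration, since real implementations offer several rotation schemes, and the function names are invented for this example.

```python
from collections import Counter


def raid4_parity_disk(stripe, n_disks):
    """RAID 4: parity always lives on the last disk."""
    return n_disks - 1


def raid5_parity_disk(stripe, n_disks):
    """RAID 5 (assumed rotation): parity moves back one disk per stripe."""
    return (n_disks - 1 - stripe) % n_disks


def parity_writes(parity_disk, n_stripes, n_disks):
    """Parity writes per disk if one data block is updated in every stripe."""
    return Counter(parity_disk(s, n_disks) for s in range(n_stripes))


print([raid5_parity_disk(s, 4) for s in range(4)])   # [3, 2, 1, 0]
print(parity_writes(raid4_parity_disk, 4, 4))        # Counter({3: 4})
print(sorted(parity_writes(raid5_parity_disk, 4, 4).items()))
# [(0, 1), (1, 1), (2, 1), (3, 1)]
```

With four stripes updated, RAID 4 sends all four parity writes to disk 3, while the rotated layout sends one to each disk, which is exactly the bottleneck being removed.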
So instead of disk 3 having all the parity, disk 3 has the parity for A, disk 2 has the parity for B, disk 1 has the parity for C, and disk 0 has the parity for D. It's otherwise exactly the same; we just take that parity information and distribute it across the disks. So nothing else really changes, except we remove that bottleneck, right? Any questions about this?

Yeah, so the kernel would be keeping track of which block is the parity one. That would be part of the kernel, part of the file system.

And again, remember: one disk failure and we're good, two and we're not. This is what I had when I was working on my thesis. A disk drive died and I was okay, so I continued working, thinking nothing bad would happen. But disks usually die quite close to each other, so before I got to replacing the disk, guess what, another disk drive died. And what happened to my work? Yeah. So don't be me. If you do something like this and a drive dies, go replace it.

Yeah, so especially magnetic disks, they don't last that long, like years or something like that. But if you tried to buy, you know, a 12 terabyte SSD, that's going to be thousands of dollars; it's going to be pretty expensive. If you buy one of those magnetic disks, it's like $200 to $300, something like that, so it's a lot more affordable. I back up all my lecture files and my recordings, which are gigantic, so they're on a RAID 5 array of hard drives, because I can't afford a RAID of SSDs.
I don't get paid that much.

Yep, yeah, you could implement them yourself, but typically, on Linux or whatever, there's like a RAID file system, and you just set that up, and then you can put anything else you want on top of it, and the kernel will take care of doing all the RAID stuff.

No, you just buy a bunch of drives. They should all be the same size and have similar characteristics, because you're limited by the slowest one anyway. So you pretty much always buy identical ones, and then if you want to upgrade, you buy another four that are identical and replace them. Unless you're Google, and then you buy hundreds of these damn things.

So, yeah, well, this is the same in the cloud; this is what the cloud uses. What's the quote? The cloud is just someone else's computer. The nice thing about that is, if a cloud drive dies, some guy in the data center is going to go replace it for you. You don't have to go to the store or anything. It's basically like insurance: they get to fix it, they get to rebuild it, and all that fun stuff.

So this is what you were talking about: hey, let's add another parity calculation, so two drives can die. That's exactly what RAID 6 is. RAID 6 adds another parity calculation, so this P would just be an XOR, and this Q would be a new parity calculation such that if any two disks die, I can recalculate the data. You might ask what that parity calculation is, and I will refer you to the CS department, because that gets into some heavy math. If you've never heard the term Galois field, then you will not understand it, and I do not understand it. You'd have to take some grad math course to understand what it does. Just know that it is possible, and some math geek has figured it out.

So with this we have extra parity. We're essentially using two disks' worth of space for parity, so two disks can die and we don't lose any data. But now we are using another disk for parity.
So we have less available space. For this we need at least four drives, so two for parity and two working, and we have way less available space. In this case, if I have five disks and they are all two terabytes, then with RAID 6 I can only use six terabytes of storage, because two disks' worth is used for parity. So there's always a trade-off: I have less usable space, but I can survive two disk failures now. If I had RAID 5, I'd have more usable space, but I could only survive one disk failure. It depends on what your level of tolerance is and how much you care about your data. I probably should have cared about my data a bit more, but again, I could only afford four disks and I didn't want to halve my usable space. So, yeah.

Write performance is going to be pretty similar to RAID 5, but a bit slower, because now I have to do two parity calculations instead of one and write a bunch more information. We can just say it's about the same: a bit slower, but not significantly.

Yep, for spinning disks. It depends on the solid state drive, because some of them are way faster than others. But if you have enough drives, you won't really notice that much. Standard hard disk drives are like 120 megabytes a second, and some crappier SSDs top out at like 800-ish, something like that, I forget. So as long as you have, what, like six of them, it's about the same. And if you've seen some big data hoarders, their server will have like 48 or 50 drives in it. At that point performance is pretty much faster than an SSD anyway, but they also spent thousands of dollars on it.

So, any other questions for fun RAID stuff? How many more RAIDs are there?
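Before moving on, the capacity numbers quoted so far all follow one pattern, which a short Python sketch can collect in one place. It assumes identical disks and only covers the levels discussed up to this point; `usable_tb` and the minimum-disk table are written for this example and are not part of any real tool.

```python
# Minimum disk counts: the 3 and 4 for parity levels are stated in the
# lecture; the 2 for RAID 0 and RAID 1 is assumed from the two-disk examples.
MIN_DISKS = {0: 2, 1: 2, 4: 3, 5: 3, 6: 4}


def usable_tb(level, n_disks, disk_tb):
    """Usable capacity in terabytes for n identical disks at a RAID level."""
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        return n_disks * disk_tb          # striping: every disk holds data
    if level == 1:
        return disk_tb                    # mirroring: one disk's worth
    if level in (4, 5):
        return (n_disks - 1) * disk_tb    # one disk's worth of parity
    return (n_disks - 2) * disk_tb        # RAID 6: two disks' worth of parity


print(usable_tb(0, 3, 2))   # 6
print(usable_tb(1, 3, 2))   # 2
print(usable_tb(4, 4, 2))   # 6
print(usable_tb(5, 5, 2))   # 8
print(usable_tb(6, 5, 2))   # 6
```

With five 2 TB disks, going from RAID 5 to RAID 6 costs 2 TB of usable space in exchange for surviving a second failure.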
So the other one that is sometimes used: there's RAID 0 and RAID 1, and some people combine these in something called RAID 10, or RAID 1+0, or whatever. They essentially do the striping like this, so there's a RAID 0 where you don't have any extra copies of anything, and then they do a RAID 1 on top of that, so they might have two copies of the stripe. You would have four disks: of the first two, one would have half the data and the other would have the other half, and then there would be another two disks that are a copy of these. So you essentially have four disks, where two have one half and two have the other half.

Let me draw it. RAID 1+0 might look something like this: I have disk 0 with A1 and disk 1 with A2, and between these disks it's a RAID 0. Then we have a RAID 1 on top of that, so these extra disks would be new copies: this one would have A1 and this one would have A2.

So now how many disks could fail before I lose data? Two. I could lose this one and then this one, the two that both hold A1, and now I've lost data. But I could get lucky, right? What about these two, one with A1 and one with A2? Have I lost data in that case? No. So I could survive two if I got lucky; if the wrong two die, I get unlucky.

And it's the same thing if I extend this even more. Say I added another pair, disk 4 and disk 5, and again they have A1 and A2. How many disks could I withstand losing now before I lost data? Three, right? It depends how unlucky I get. If I lose this disk and this disk and this disk, all three copies of A1, I've lost some data. But if I got lucky, I could lose four and I'd still be fine. You're essentially leaving it up to the gods.
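How much luck is involved can be counted by brute force. The Python sketch below models each layout as a map from disk number to the part it holds (an assumption that matches the drawing: every disk holds exactly one part) and tries every possible set of failed disks. `data_survives` and `count_fatal` are helper names invented here.

```python
from itertools import combinations


def data_survives(layout, dead):
    """True if every part still has a copy on some disk that is not dead."""
    parts = set(layout.values())
    alive = {part for disk, part in layout.items() if disk not in dead}
    return parts.issubset(alive)


def count_fatal(layout, k):
    """Return (fatal, total): how many k-disk failure sets lose data."""
    combos = list(combinations(layout, k))
    fatal = [c for c in combos if not data_survives(layout, set(c))]
    return len(fatal), len(combos)


four = {0: "A1", 1: "A2", 2: "A1", 3: "A2"}   # two copies of each half
six = {**four, 4: "A1", 5: "A2"}              # three copies of each half

print(count_fatal(four, 1))   # (0, 4): any single failure is fine
print(count_fatal(four, 2))   # (2, 6): two of the six pairs are fatal
print(count_fatal(six, 3))    # (2, 20): only losing all A1s or all A2s
print(count_fatal(six, 4))    # (6, 15): four failures can still be survived
```

So with four disks a second failure is fatal one time in three, and with six disks even four failures leave the data intact in nine of the fifteen cases.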
So you have a variable level of tolerance depending on how lucky you get, so this is like a good spin of the roulette wheel. But in this case I'm essentially just wasting half my storage.

And I could also split it up differently. Instead of adding more copies, I could have only one mirror of the RAID 0, which is more typical, and if I added another two disks, I would stripe across them instead. That three-copy version is a bit more wasteful, because I can only use like a third of my storage, and ideally I'd like to waste just half of it. So you do something like this with disks 2, 3, 4, and 5: the data is striped as A1, A2, and A3, and each part has one mirror.

So now how many disks can I lose before I have some failure? No, A3 is just data; there's no parity in this. If I got unlucky, two and I'm done, right? If I got super lucky, three. So same idea here, but in this case each part of the stripe is different, so I can use half of my space, right? I essentially just have one RAID 0 and one extra copy of everything.

Yeah, so RAID 10s are only mirroring and striping. You don't really mirror a RAID 5; it gets kind of messy when you do that. Some people, if they have, you know, one of those drawers of 45 drives, they'll RAID 5 like six of them and do multiple RAID 5s. So RAID 5 on six, RAID 5 on another six, RAID 5 on another six, and then they kind of combine them all together so it looks like one big hard drive, and you can withstand more failures.
So that again goes to your level of tolerance, how many drives you're willing to have fail. If you had 40 drives and you just did one RAID 5, then if two of those 40 drives die, you are screwed. You probably don't want that, so you can break it up into smaller RAID 5s if you want. That's typically what happens.

Yeah, so in industry, again, it just depends, right? There are more complex file systems that will let you combine stuff for big setups. More complicated things will also have this on multiple servers and then have redundancy between servers. So like, I don't know, S3 and stuff like that, or Google's distributed file system, will make sure that there are multiple, like three, copies of the data across the world. So there's a whole bunch of trade-offs; it just depends on your tolerances.

Yep. Yeah, if you use RAID, your file explorer looks exactly the same. The low-level details the kernel gets to handle, and to you it just looks like the same file system, right? You can upload your file to the cloud, and it could use some crazy distributed system. If you upload your file to the cloud, it'll be on a hard drive somewhere, right? But it's not going to be on a single hard drive; pieces of it are going to be all over the place. It still looks like one hard drive, though: as far as you're concerned, the cloud has a copy of your file and it looks the same. But that's part of the illusion of all the software, right?
It wants you to think that there's one file, because that's easy for you to understand. You wouldn't want it if they told you every time some drive that was holding your data failed. You would probably freak out and go crazy, because it probably happens more often than you think. That's why, if you look at the cloud providers' guarantees, they won't guarantee a hundred percent, because they could get super unlucky: a whole bunch of drives die, and then you lose your data. That's why it's like 99.9999999999 percent, because you could still get super unlucky and lose some data.

Yeah, if you look at the history of, like, Reddit outages or any big Netflix outages, stuff like that, that can happen. Even if you have a big distributed thing, well, it lives in a big data center. If you have three copies across three different computers, then most of the time you're good if a computer dies. But if the entire data center goes down, then what can you do? So now we have to back things up across data centers. And then, in the unlikely event that two data centers die, well, right. But that's more in line with the distributed computing course that comes after this, which is why this is a prereq.

So any other questions before we wrap up? Yeah, so it's not really. A distributed system usually means between two or more computers; this is just within a single computer, so it's not really a distributed system. The most common distributed system that everyone uses is just the cloud, because that's all distributed, but you have no idea, right? Famous quote: the cloud is just someone else's computer. But they'll do this stuff.

All right, any other last questions? All right, let us wrap it up and go for reading week. Hell yeah.

All right. So disks are the thing that enables persistence. We went through two topics, SSDs and RAID. SSDs are more like RAM.
So it has some random access. It's divided into pages and blocks, which is pretty annoying, because you can only erase a whole block at a time, and you can only write to a freshly erased page. So your operating system is going to have to do some clever things and work with the SSD to get the best performance. One of those things is called TRIM, which is just letting the SSD know that you are not using any pages on a block, so it can freely erase the block and give you some fresh pages.

Then we saw RAID. RAID is a fun topic: it gives you lots of opportunities to tolerate failures and improve performance using multiple disks. And as we saw today, there's RAID 0, RAID 1, RAID 5, and RAID 6, which all have different trade-offs in terms of how many disks can die while you still retain your data. You can also combine them, like RAID 1+0.

So remember: pulling for you, we're all in this together, and happy reading week.