Tom here from Lawrence Systems, and my favorite ZFS question is: how do I expand? How do I expand the pool? Can I add one drive? Can I add three drives? Can I expand it asymmetrically? Those are all related, and I'm going to talk about how, because I've needed to make this video for probably a long time. People keep asking these questions, and I want to explain the intricacies of how to expand ZFS. Too long, didn't watch: you can't just add one drive to the pool. Well, you can in a way, and that's why this video breaks down into two parts. I've put together some graphics that show how to do this visually, and then we're going to play with TrueNAS and show you how to do it in a lab environment, so you don't have to take apart your lab to figure this out. So I'm going to cover how to expand the pool, and how to expand it later, because it can be done, there are just rules you have to follow. This video is being recorded here in October of 2022, and I say that because this is the current status of ZFS. There are changes coming. There are expansion options coming. They just don't exist in TrueNAS yet, which is where we'll be doing the demo. So there could be a future, when you're watching this, where you go, "Tom, all of this is irrelevant because they've now added a feature where you can just add one drive and it redistributes all the data." Perfectly fine. Awesome. I hope you live in that future, and I hope I live in that future as well. But let's get back to reality and start with the graphics. I have plenty of videos linked down below that get into more of the complexities of ZFS, how it works, and how caching works. We're going to focus just on the data VDEVs. Essentially, the way ZFS works is that your ZFS pool has datasets and zvols. Most of you probably only use datasets, and that's where you create your shares from.
I've got plenty of videos on that topic as well, and as I said, we're going to focus on the data VDEV types specifically. These are the ones people are asking about when they want to expand: how do I expand these? And can I add a single drive? No. That's the first problem and the first answer I give, but there's more to the story, because there are ways to expand the pool. So you have your four drives here, making up this RAID-Z1 VDEV. The important part, and how you expand this, comes down to the fact that the VDEVs have to be symmetrical by RAID type. That's the first rule. So if you started out with a VDEV that's RAID-Z1, each subsequent VDEV you add has to be Z1. We have four drives here, we want to add four more drives, no problem, but they have to be Z1. If the first were Z2, the new ones would have to be Z2, et cetera; you get the idea. That's the symmetrical VDEV expansion, and you can keep adding many, many of these. You can add as many VDEVs as you have ports to hook up all the drives, or whatever JBOD expander you may be using. You can put quite a few of them together and group them, and there are performance strategies for why you might take something like my Storinator Q30 and break it up into a series of VDEVs instead of making it all one, but this video is about expansion, not performance. The next thing is, the VDEVs do not have to be all the same size, though for performance reasons they should be. So we can have four two-terabyte drives, and later add four three-terabyte drives. From a storage perspective this works perfectly fine; there aren't any issues in terms of storage by doing this, but your performance will be that of whatever the slowest grouping of VDEVs is.
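To put rough numbers on that symmetric expansion, here's a back-of-the-envelope sketch of the usable capacity before and after adding a second RAID-Z1 VDEV with bigger drives. This is simplified math that ignores ZFS metadata and padding overhead, so real pools report a bit less.

```shell
# Approximate usable capacity of a RAID-Z1 vdev: (drives - 1) x drive size.
# This ignores ZFS metadata/padding overhead, so it's an estimate only.
raidz1_usable() {
  local drives=$1 size_tb=$2
  echo $(( (drives - 1) * size_tb ))
}

first_vdev=$(raidz1_usable 4 2)    # four 2 TB drives: ~6 TB usable
second_vdev=$(raidz1_usable 4 3)   # four 3 TB drives added later: ~9 TB usable
total=$(( first_vdev + second_vdev ))
echo "pool usable: ~${total} TB"
```

Both VDEVs pool together into one expanded capacity; the mismatch in drive size is fine for storage, it just means the slower or smaller grouping sets the performance floor.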
The datasets themselves are unaware of how many VDEVs are behind the scenes; the ZFS pool just handles that magically. As you write things to the dataset, it will evenly distribute them across all the available VDEVs. And when you add a new VDEV, it will start adding data not only to that VDEV, but cumulatively across all of them. Rebalancing the pool, the easiest way to describe it, is that if you take all the data off and put it back on, it'll automatically rebalance across all of them. You can get some unpredictable performance, but don't worry about data integrity. A lot of times, if you're using this for archival storage and you're more worried about integrity than performance, it's fine to just keep expanding, adding another Z1 at a time. But what about doing it symmetrically in terms of RAID-Z type, but not in terms of number of drives? That actually works too. You're going to get a warning about doing it that way, and you'll find a lot of people discussing it in the forums. I haven't found anything related to data loss from doing it this way, but it doesn't seem to be the best idea because of the way it calculates parity; I believe it costs a little bit more. So once again: performance issues, yes; data integrity should be fine, but it's generally frowned upon, and when you set it up, and we'll show you how in TrueNAS, you get some warnings about it. What about mirrors? I'm positive there's already a comment down below going, "just use mirrors, end of story, mic drop." Yes, mirrors will work. Mirrors do follow the same rule, because a mirror is a type of RAID and you have to symmetrically match them. The downside, of course, of using mirrors is that they're storage-inefficient: you're using an extra drive for every mirror. But you get good performance out of it, and it's a good way to do it.
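To make the mirror trade-off concrete, here's the same kind of rough math for eight hypothetical 4 TB drives laid out three different ways. Again this ignores ZFS overhead, so treat it as an estimate, not what `zfs list` will report.

```shell
# Rough storage-efficiency comparison for eight 4 TB drives.
# Ignores ZFS metadata overhead; estimates only.
drives=8; size_tb=4

mirrors=$(( drives / 2 * size_tb ))    # four 2-way mirrors: half the raw space
raidz1=$(( 2 * (4 - 1) * size_tb ))    # two 4-drive RAID-Z1 vdevs: one parity drive each
raidz2=$(( (drives - 2) * size_tb ))   # one 8-drive RAID-Z2 vdev: two parity drives

echo "mirrors: ~${mirrors} TB, raidz1: ~${raidz1} TB, raidz2: ~${raidz2} TB"
```

Mirrors give up half the raw capacity, which is exactly the trade the transcript describes: easy two-drive expansion and good performance, at the cost of storage efficiency.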
You just have to buy the drives two at a time when you expand. And yes, you can just build out a lot of mirrors, but you're losing some amount of capacity in trade for setting it up this way. You can also just do multiple pools, but as we saw in the very first slide, datasets are not shared between pools. This is another way you can expand, and it applies to TrueNAS and other systems: you can have multiple pools. This is popular where people maybe started out with RAID-Z1 and then wanted to go RAID-Z2 as they added more drives. If you have a large system with a lot of capacity, so you can add many pools, that's great. It's also really good if you're doing things like building a storage pool for virtualization, where one pool is full of flash, really fast drives, and the other pool is full of spinning rust. That works well. Also, when you have two pools, it allows you to easily use replication between them to synchronize the data, not for sharing purposes, but for backup purposes. Maybe you want a really fast pool, but it's only RAID-Z1, and you have a larger spinning-rust pool where you just need all the data archived; you could set that up as a Z2 pool and replicate to it. Losing any single VDEV loses the pool. This is really important to remember: if one VDEV goes bad, data loss, data loss everywhere. And yeah, this is a problem and something you really have to consider. If you lose any one of the VDEVs within a pool, the pool itself is now gone, because there is no redundancy across VDEVs; the redundancy is all built into each VDEV. This is why Z2 is pretty popular, since it can survive two drive failures, but it really comes down to your risk tolerance whether you should go RAID-Z1 or RAID-Z2.
If you want better integrity: if a drive is lost and you only have RAID-Z1, well, you have a limited amount of time before you could potentially have a problem again, or the resilver process itself, reading all the data to rebuild onto the replacement drive, could be the stress that makes another drive in that RAID-Z1 fail. So use Z1 at your own risk. It comes down to your risk tolerance and what kind of data is on there, but Z2 is obviously going to be more recommended for production and better data integrity. Well, then, why don't we just set everything to Z2? It comes back to storage efficiency: the more parity you have, the more space you've given up to parity. Now let's jump into TrueNAS and show you how this works. For the lab part, TrueNAS SCALE 22.02.4 is what we're using, which is the most current version here in October of 2022, and let's create a pool. Now, we could build this pool using mirrors, which would give us a 16-terabyte capacity, but as I said, that's kind of storage-inefficient. We want to do it much like we had in the slides: we're going to take four drives and throw them over here as a RAID-Z. Now, if we want to add another VDEV, if I click Add VDev again, it'll automatically figure it out, so let's go ahead and do this. We'll add a few more drives down here, and now we've got an error, because adding data VDEVs of different types is not supported. So let's set that one to RAID-Z too. Now, as I said, you'll get a warning that mixing VDEVs with different numbers of drives is not recommended; the first one has four and this one has two, so you want to make sure they match. You get the warning, but it will still let you create them; it just gives you a checkbox you need to check to confirm you're willing to take that risk on yourself. But we'll start with the basics here: we're just going to add a RAID-Z with four drives, and we'll keep building from there.
I mean, instead of building them all at once right now. So this is the most storage-efficient way, but not the most risk-tolerant way, because we can only survive a single drive failure. There's some risk with it, but hey, we want lots of capacity, and it's a lab system, so let's play with it and show you what happens. Go ahead and create the pool. All right, we've created our demo pool, and we've got a few terabytes of storage available in there. Let's go ahead and add a dataset called fio, because we're going to use a tool called fio to write some data to this pool. The command we're going to start with at the top here is zpool iostat -v demo 0.1; the 0.1 means update every 0.1 seconds, so you can watch things update really, really fast. Let me make this a little bigger. All right, and at the bottom here, we're going to run fio. fio lets us write some data out to the drives, and we want to see how it gets allocated and where it goes. So we kick off our fio job, it writes to the demo fio dataset, it's a random write, and we're going to fill it up with some data. You can watch it spread between all the drives and slowly fill up the allocated space. So four gigs of space, five gigs, just about. There we go, jumped to six. All right, now the job's running, and you can see we're just doing some read and write tests here. And it's filled up; it's got 13 gigs of storage allocated. So let's go ahead and expand our pool, we need more storage. We'll cancel that job and go back over here. We want to add VDEVs, so let's just add a couple more. Sort them by capacity again, and we're putting in four more drives. Add them over here; it automatically knows, it won't even let me choose a different Z type, because it's locked in: the first one's RAID-Z, so the next one has to be RAID-Z. So we'll go ahead and add that.
"The disks are erased, the pool is extended onto the new disks with the chosen topology, existing data on the pool is kept intact." Absolutely. Now please note, this is a one-way operation: once you've added another VDEV, you can't remove it, because ZFS doesn't have a process to migrate the data off a RAID-Z VDEV and over to the remaining VDEVs, so there's no undo for this. Once you've added a VDEV, the pool now has two and will always have two VDEVs. Now let's go back over to this. You can see the pool has 13 gigs over here and only 848 kilobytes over here. Now, if we ran the same fio command again without changing anything about the file names, it would actually overwrite the existing data, which may be what we want, and then it would start allocating to the other VDEV, but let's do it a little differently. We're going to go over here to the demo fio mount, make a directory called test, and move everything into test. Okay, we moved everything to test, so that data will stay where it is, and now we'll run the fio command again and it's going to start distributing the new data across there. The moment you start moving data around, and we'll go ahead and do this, here we go: it's now distributing it evenly, evenly based on size and capacity. This is an interesting aspect of the way this works. It realizes the total available free space per VDEV is different, so it slices the file up in a way that distributes it across each VDEV accordingly. So it's writing all these files out and adding to the allocated capacity. Now let's change this up a little, for those of you wondering: what happens if I'm in a production environment and I want to add a VDEV, but I don't really have time to shut everything down? So what if it's running for, oh, I don't know, 900 seconds here?
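That free-space-biased allocation can be sketched with some made-up numbers. This is a simplification: real ZFS allocation weighs more factors than just free space, so take this as an intuition-builder, not the actual allocator.

```shell
# Sketch of free-space-proportional write placement across two vdevs.
# Numbers are hypothetical; the real ZFS allocator is more sophisticated.
free1=5     # GB free on the older, fuller vdev
free2=18    # GB free on the newly added vdev
write=100   # allocation units to place

to_vdev1=$(( write * free1 / (free1 + free2) ))
to_vdev2=$(( write - to_vdev1 ))
echo "vdev1 gets ${to_vdev1}, vdev2 gets ${to_vdev2}"
```

The emptier VDEV soaks up most of the new writes, which is exactly what you see in the zpool iostat output: the new VDEV fills faster until the pool evens out.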
We'll let it just pound away at the drives and keep writing. We've got the first RAID-Z1 VDEV and the second one here, so let's add a third one while the drives are busy. We can go to the dashboard; it's responding a little slowly, this isn't the most performance-oriented system, but yeah, we're at high CPU usage, it's doing its thing. Let's go back over to storage and expand the pool while it's writing data. We go right here to the pool and we're going to add VDEVs. While this is under heavy load, it's going to respond slowly. We'll select the drives, and before we do it, we'll jump over here, make this a little bigger, and you'll see the new ones show up. As soon as we click Add VDEVs: "the disks are erased, the pool has expanded," same message we got before, and we'll watch it just keep writing here, doing its thing. There is a pause in the writes as it sets up and formats the disks; we appear to have paused it, and it's just not writing much. It'll change in a second here, and some writes are going through. This is where you have an I/O fight going on: it's trying to expand things while simultaneously trying to keep this job running. So that's what would happen. "Extending the ZFS pool, pool update, capacity change detected," so it's still going. In our job here, you can see the writes slowing way down. There we go, now it came back. Now here are all the writes; they're accumulating on the new VDEV. These files are still where they were, and the new writes are expanding onto this VDEV. Of note: unless I rerun this task again, where we re-lay out the files, so this will re-lay out those same files and rewrite over them... no, actually I'm wrong about that, it didn't. So let's go ahead and remove them all. Let's remove all these files real quick.
Delete all, hey, why not? Up arrow, run it again. All right, now it's laying out the files again, and they're going to get evenly distributed all across here; we've expanded the pool and rewritten the data. Now, there are some different scripts you can find for moving your data back and forth without manually copying and pasting it, but the simplest way is obviously to just copy the data somewhere else and move it back, and that data will get redistributed across these drives. One final note on expanding your pool: there is one more way to do it, it's just a little more tedious. If I have four drives, for example, and those four drives are all one-terabyte drives, and I'd like to replace them with eight-terabyte drives, I can replace each drive one at a time. Each time, you remove one drive, put the new one in, and resilver it into the pool. Once that resilver is done, the pool still doesn't expand to the new drive size until you go to the next drive and do it, and the next, and the next. Once you've gone through the whole process, if we started with four one-terabyte drives and have now replaced them all with four eight-terabyte drives, the pool will expand to the larger size. So that's one other methodology that works. Leave your thoughts and comments down below, and head over to my forums for more discussion on this topic. Let me know what you think, or maybe this was the deal breaker: this is why I don't like ZFS. There are people who like to point me to that article, "the hidden cost of ZFS," because this is why they hate ZFS: "I can't just add one drive at a time, so I'm going with some other RAID type that does that." And that's fine. These are things I think are important to know; they're part of your decision-making process for whether or not you choose to build a ZFS pool.
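The replace-one-at-a-time approach looks like this in rough numbers. The real mechanics would use `zpool replace` per drive and the pool's `autoexpand` property so capacity grows after the last resilver; the pool and device names in the comments are hypothetical, and the math ignores ZFS overhead.

```shell
# Replace-in-place expansion math for a 4-drive RAID-Z1 vdev.
# In practice this would be, per drive (names hypothetical):
#   zpool replace tank old-disk new-disk     # then wait for resilver
# with `zpool set autoexpand=on tank` so the pool grows
# once every drive in the vdev has been replaced.
drives=4
old_tb=1; new_tb=8

before=$(( (drives - 1) * old_tb ))   # 4x1TB RAID-Z1: ~3 TB usable
after=$(( (drives - 1) * new_tb ))    # 4x8TB RAID-Z1: ~24 TB usable
echo "before: ~${before} TB, after: ~${after} TB"
```

No extra capacity appears until the final drive is swapped and resilvered, which is why this method is tedious: four resilvers, one at a time, before you see any growth.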
ZFS pools are very popular because of their data integrity and the performance you get from them. But yes, this is one downside that exists here in October of 2022, and I hope, as I said at the beginning, there's a future where it doesn't, where you can simply add one drive and the magic behind the scenes works. That's going to take some more engineering, and we're not there yet, but hey, I think we'll get there someday. Thanks, and thank you for making it all the way to the end of this video. If you've enjoyed the content, please give us a thumbs up. If you'd like to see more content from this channel, hit the subscribe button and the bell icon. If you'd like to hire us for a project, head over to lawrencesystems.com and click the Hire Us button right at the top. To help this channel out in other ways, there's a join button here for YouTube and a Patreon page where your support is greatly appreciated. For deals, discounts, and offers, check out our affiliate links in the description of all of our videos, including a link to our shirt store, where we have a wide variety of shirts and new designs come out, well, randomly, so check back frequently. And finally, our forums at forums.lawrencesystems.com are where you can have a more in-depth discussion about this video and other tech topics covered on this channel. Thanks again for watching, and I look forward to hearing from you.