Tom here from Lawrence Systems, and datasets and zvols are the topic for the day. They're fundamental to managing your data within ZFS. Datasets are more like enhanced directories with a few extra features. We're going to dive into the details of how they're different from directories, how important they are to your structure, and when you should be using them. We'll also talk about zvols and how they function as a virtual block device within the ZFS environment.

I have a whole series of other ZFS tutorials linked down below, such as ZFS as a COW, ZFS and the RAID types, and ZFS and the special vdevs, so you can go through those videos as well to get a better understanding. This video is going to be broken into two parts: first a few slides where we talk about some of the fundamentals, then we'll go into TrueNAS for a demo of how to add, create, or change these inside of TrueNAS. But in general it all applies to ZFS.

Some of you may ask, "Why are you in a sling, Tom?" Well, that's not relevant to the tutorial, so I'll throw that in at the end of the video to answer any questions for those who are curious.

Now let's get started with this tutorial. Datasets and zvols live within a ZFS pool. That pool can have a single vdev.
It can have multiple vdevs. I've got other videos breaking down the different vdev and RAID types. You can also have a pool with multiple vdevs along with some special vdevs, such as a cache drive or a metadata drive. Those are not assigned to a dataset or a zvol. They're used within the pool, and the datasets and zvols you build on top of the pool, which you can have many of in any combination you'd like, take advantage of the underlying features you have built into that zpool. So if this ZFS pool has, let's say, some special vdevs and a pair of RAIDZ1 vdevs, that's all taken advantage of whether you create a zvol or a dataset. It all lives within there.

What does not work is this: if you have a system with more than one pool, you cannot have a dataset or zvol span multiple pools. That's not a function that's built in. This is something TrueNAS SCALE is working out with Gluster, where you take multiple pools across multiple different machines and use Gluster to create essentially a span across them. But we're narrowing the scope here to just datasets and zvols, which live on an individual pool and whatever that pool is made up of.

Now let's talk about the features of a zvol. Zvol is short for ZFS volume, and it's the virtual block device within your ZFS pool. You can think of this virtual block device as a hard drive. Because it presents as a block device, it leaves the OS that you're attaching it to to decide what file system goes within that zvol. That also means you don't have direct visibility into it: it's not going to be easy to see inside unless you have something that can mount it and read whatever that file system type is. This is also not a shared type of storage, with certain exceptions for software that understands how to do sharing, such as when it's used as a target for virtualization. If you're using something like VMware, XCP-ng, or many other virtualization platforms, they can share it because they are handling the access
to the block device, so you can have more than one machine able to see it. This is not, though, where you're going to store normal files the way you would with a dataset, which we'll get to next.

The advantage, though, is that you can snapshot these and clone them. You can replicate and back them up using ZFS replication. They have compression on a per-zvol basis, so for each one you create you can choose the compression type. There's deduplication at the block level. I say at the block level because even though ZFS may be unaware that you formatted this with NTFS because you presented it to Windows, it's looking at the blocks and which blocks it can deduplicate. So deduplication still works without understanding the file system being used on the zvol as it's presented to other operating systems.

You can encrypt each zvol, and they can be set up as sparse or not, which means thin or thick provisioned. That means if there's a four-terabyte zvol that we configure, do we want it to allocate all that space at once, or do we only want to allocate what is actually used within those blocks?
So if we have only one terabyte out of four terabytes stored, it doesn't take up four terabytes on the system unless you tell it to thick provision. It can take up only the amount that's actually used, so that is an option. There are advantages and disadvantages to using that.

Sync on or off: if you need it to synchronize the writes, that is a per-zvol option. Then there's a block size for different workloads, whether it's a database or virtualization. It depends on the software that's tied to the zvol; there may be better optimization with one block size or another.

Primarily you're going to use these for two things. One is local virtual machines, where the zvol is the hard drive for virtualization inside of TrueNAS. I've got a video on how to set that up, but you set up a zvol and attach it as the actual hard drive for that virtualization system. The other is iSCSI storage targets, which can be used by any application that uses iSCSI. I have a couple of use cases. Virtualization is one. Another is presenting it as a virtual hard drive to Windows. I actually use that for my gaming system because it's quite convenient to have all the data stored there and be able to expand this zvol, because that's another feature zvols have: they can be expanded later and presented essentially as a hard drive to a Windows system.
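To make the zvol options above a bit more concrete, here is a minimal Python sketch that only builds the `zfs` command lines and does not run anything. The `zfs create -s -V`, `-o property=value`, and `zfs set volsize=` forms are standard OpenZFS usage; the helper names and the pool path `tank/vm/test` are made up for illustration, and `reserved_tib` is a deliberately rough model that ignores compression and metadata overhead.

```python
from typing import List, Optional


def zvol_create_command(
    name: str,
    size: str,
    sparse: bool = True,
    volblocksize: Optional[str] = None,
    compression: Optional[str] = None,
    sync: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) the argument list for creating a zvol."""
    cmd = ["zfs", "create"]
    if sparse:
        # -s makes the volume sparse (thin provisioned): no space is reserved up front.
        cmd.append("-s")
    # Only pass properties that were explicitly chosen; the rest inherit or use defaults.
    for prop, value in (
        ("volblocksize", volblocksize),
        ("compression", compression),
        ("sync", sync),
    ):
        if value is not None:
            cmd += ["-o", f"{prop}={value}"]
    # -V marks this as a volume of the given size rather than a filesystem dataset.
    cmd += ["-V", size, name]
    return cmd


def zvol_resize_command(name: str, new_size: str) -> List[str]:
    """Build the argument list for growing a zvol by changing its volsize property."""
    return ["zfs", "set", f"volsize={new_size}", name]


def reserved_tib(volsize_tib: float, written_tib: float, sparse: bool) -> float:
    """Rough space a zvol ties up in the pool (toy model, no compression or overhead)."""
    return written_tib if sparse else volsize_tib


if __name__ == "__main__":
    # Hypothetical zvol name; substitute your own pool and path.
    print(" ".join(zvol_create_command("tank/vm/test", "4T", sync="disabled")))
    print(" ".join(zvol_resize_command("tank/vm/test", "6T")))
    print(reserved_tib(4, 1, sparse=True), reserved_tib(4, 1, sparse=False))
```

In TrueNAS you would normally set these through the UI rather than the shell; the sketch is only meant to show which knobs exist per zvol and how thin versus thick provisioning changes what the pool has to set aside.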
Presenting a zvol over iSCSI works on Linux as well; it works with whatever can use iSCSI.

Now, the much more common use case is a ZFS dataset, which presents like a directory but is much more flexible. Datasets can also be nested in other datasets to create a directory-like hierarchy within ZFS. When you share these, they look like a series of directories, but they have some extra features, and these extra features are sometimes why you will have several of them, because you may want different policies for each dataset.

Snapshots and cloning are still possible with each dataset individually. This is a big reason you may want to have several, because you may have different policies for data retention. Being able to set that per dataset means one particular folder, as the user may see it, can have a higher frequency of snapshots and another a lower frequency. Replication and backup use ZFS replication. Once again, it goes back to your data retention policies and how much data, or how frequently, you want backed up or snapshotted. There's compression on a per-dataset basis and deduplication at the block level. This is still done per dataset, and there are some advantages to enabling it for only certain datasets rather than the entire set of datasets you have.

There are encryption settings for each one. You may not want your main system to have all datasets encrypted with a password that's required at boot. But maybe once the system starts up you want one particular dataset to require a password to get further. That's a nice feature you can have. You can also have different sets of keys that auto-unlock, and those keys can all be nested within there for, once again, different security policies based on where you may be replicating and sending the data.

Sync on or off, whether or not you need synchronized writes.
That is a per-dataset option. Record size can be set for different workloads. ACLs give you sophisticated controls over file permissions. I've just done a recent file permissions video, which you'll find down below as well, and those file permissions allow you to go much more in depth and carry the extra metadata to set the different user permissions. That is all set up on a per-dataset basis. Quotas and reservations are a couple of extra features you can use.

Now, this is the metadata special small block setting. I'll leave a link to a video that Wendell from Level1Techs did covering this. It's actually pretty slick, and he's got an entire write-up on it. This says that a particular dataset should be indexed on one of those special metadata vdevs, if you have them. There are some parameters and some thought needed about your storage architecture to know when this has an advantage. For example, you might want to put a lot of the small writes on a faster special metadata vdev and have them indexed over there. But there are a lot of precautions and things to think about. It's a pretty in-depth write-up that Wendell has from Level1. As I said, it's linked down below if you want to dive into that, because it's not as simple as just adding it and having it solve your speed problems.

Now let's talk about how these work inside of TrueNAS and what they look like when you create them. Right here I have my virtual disk dataset, so let's talk about using a zvol. Under virtual disk, I have this dataset called iso storage, where I store ISOs that I want to use in my virtualization. Then we have this text linux and then this toms ubuntu. If we want to create another zvol, we can create it here. We just go to the top and hit Add Zvol and give it a name. Let's call it test, set the size, and we'll say 1 TiB. This is where we can force the size if we need to. Do we want sync to inherit from the parent, or disable it?
We're going to disable it, since we don't need sync on this. ZFS deduplication can be on, verify, or off, and those options are explained here. Do we want it set to read-only? Do we want it set to hidden? Then we hit save. Well, there are some advanced options, but not many. It's just whether or not you want the snapdev to be visible or hidden. We'll just hit save and let it create this.

So now we have these three different zvols and this one iso storage dataset. What I'm going to show now is that if we go to the mount location under /mnt/dozer for virtual disk and do ls -la to show everything, you'll notice the iso storage dataset, but you don't see any zvols here. That's because they don't present to the file system, but they are indeed nested underneath here.

If we go back over into TrueNAS and look at virtualization, click on my one virtual machine here, and look at the devices, you can see the disk that is attached. If I click the three dots and go to edit, it'll let me attach the new one if I wanted to. There's the existing one. That's good. This is another one I had here. All the functionality is there if I wanted to clone these or add another disk: I'd add it as a disk, it would show up, and I could add more than one disk to my virtual machine.

Now we go back over to datasets and go here to tom's gaming desktop, and we see it's presented over iSCSI. If we scroll down, we can go into managing the iSCSI shares, and this is where you can create the associated targets for each one of these. We can click edit, and I'm not going to break it by changing this target, but if you want to add more targets or use the wizard, you could build more iSCSI shares that target this. This actually presents to my system.
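As a rough illustration of why `ls -la` shows the dataset but not the zvols, here is a small Python sketch. It assumes the usual TrueNAS layout where pools are mounted under `/mnt` and the OpenZFS convention that zvols appear as device nodes under `/dev/zvol/`. The dataset names are hypothetical spellings of the ones on screen, and the helper functions are made up for this example.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def dataset_mount_path(dataset: str, root: str = "/mnt") -> PurePosixPath:
    """Where a filesystem dataset shows up, assuming default mountpoints under root."""
    return PurePosixPath(root) / dataset


def zvol_device_path(zvol: str) -> PurePosixPath:
    """Where a zvol shows up: as a block device node, not as a directory."""
    return PurePosixPath("/dev/zvol") / zvol


def visible_in_listing(children: List[Tuple[str, str]]) -> List[str]:
    """Names a directory listing of the parent mount would show.

    Each child is (full_name, kind), where kind is "filesystem" or "volume",
    matching the types reported by `zfs list -t`. Only filesystems are mounted.
    """
    return [name.rsplit("/", 1)[-1] for name, kind in children if kind == "filesystem"]


if __name__ == "__main__":
    # Hypothetical names modelled on the demo; the real spellings may differ.
    children = [
        ("dozer/virtual_disk/iso_storage", "filesystem"),
        ("dozer/virtual_disk/toms_ubuntu", "volume"),
        ("dozer/virtual_disk/test", "volume"),
    ]
    print(dataset_mount_path("dozer/virtual_disk"))
    print(visible_in_listing(children))
    print(zvol_device_path("dozer/virtual_disk/test"))
```

The point of the sketch is only the split: datasets land in the mounted directory tree, while zvols sit alongside them in the ZFS hierarchy but are reached as block devices by whatever consumes them, such as a VM or an iSCSI extent.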
So here's the tom windows one, and this is what allows me to present this zvol to my Windows system and save all my games on it, so I don't have to have a big hard drive in there. My TrueNAS is connected at 10 gig, so this actually works really fast. It's formatted NTFS. If I need to expand it, I can go in here and edit and expand that storage. So we go here to datasets, tom gaming desktop. I'm only using two terabytes right now, but if I needed to, we could expand this to be even bigger. I could just change this to six, and I don't need to force it, and then expand it out from there. How that expands on the other side depends on the system you're connecting it to: it may auto-expand, or you may have to do some type of disk management.

Now if we go all the way to the top, this is our primary dataset that everything gets nested underneath, and then I have several different datasets, such as my archive versus my production one for LTS videos. Or even under TrueCharts here, because I have a few different things set up under that. I have FreshRSS in there, and I'm doing each one of these as its own dataset. If I wanted to add another dataset nested under here, I could create one for some app, put in some comments, and set all the settings, so when I point an application at it I would have all my granular control.

Now, you could create a series of folders, but I usually recommend datasets because you have the apps tied to them. The comment lets me know this dataset is used by FreshRSS, and you may have different policies you want on a per-app basis. FreshRSS I find pretty important.
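The per-app granular control comes from how ZFS properties work: a nested dataset inherits a setting from its parent unless you set it locally. Here is a toy Python model of that lookup, not real ZFS code. The `Dataset` class and the names `tank`, `apps`, and `freshrss` are invented for the example; the source labels mirror what `zfs get` reports (local, inherited from, default). Note that not every property inherits (quotas, for example, do not), so the sketch sticks to ones that do.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Dataset:
    """Toy model of a ZFS dataset with locally set, inheritable properties."""

    name: str
    parent: Optional["Dataset"] = None
    local: Dict[str, str] = field(default_factory=dict)

    def child(self, leaf: str, **props: str) -> "Dataset":
        """Create a nested dataset, optionally overriding some properties locally."""
        return Dataset(name=f"{self.name}/{leaf}", parent=self, local=dict(props))

    def get(self, prop: str) -> Tuple[Optional[str], str]:
        """Return (value, source), walking up the hierarchy until the property is found."""
        node: Optional[Dataset] = self
        while node is not None:
            if prop in node.local:
                source = "local" if node is self else f"inherited from {node.name}"
                return node.local[prop], source
            node = node.parent
        # Nothing set anywhere in the chain: ZFS would fall back to its built-in default.
        return None, "default"


if __name__ == "__main__":
    pool = Dataset("tank", local={"compression": "lz4"})
    apps = pool.child("apps", sync="standard")
    rss = apps.child("freshrss", recordsize="16K")
    for prop in ("compression", "sync", "recordsize", "atime"):
        print(prop, rss.get(prop))
```

This is why a plain folder is not equivalent: a folder has no properties of its own, so it cannot carry a different compression, record size, or snapshot policy than the dataset it lives in.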
I really like FreshRSS, and if I go over here to data protection, there are currently 337 snapshots. There's not much data changing, but I would hate to lose the data, so I have the snapshots running very frequently and keep them for several weeks, which means there are a lot of snapshots for this particular dataset. Because each one is a dataset, you can build a policy on a per-dataset basis and decide what you do or don't want to keep. And because the snapshots only reference the changes in data, they don't take up much space.

The other important use case for datasets, of course, is permissions. I cover that in my permissions video, where I talk about how you nest the permissions. If you want to handle the permissions via TrueNAS with its users and groups, you're going to assign them each to a different dataset so you can get the granular control you're looking for. It is different with Active Directory, though you still need to start with a dataset. I cover how to do it inside the TrueNAS UI in my permissions video.

Hopefully this clarifies the difference between a dataset and a zvol and why they're important. Using many of them is really a great idea, because there's not some limit where you can only create 5 or 10. You can create really as many of these as you have space for on your system. Having everything in a series of hierarchies, with the permissions set to automatically inherit, can really help get you on the way to building a good data structure. I love everything being nested in different datasets when we design things for clients or for ourselves, just to keep the data very organized, with me understanding what belongs where and having policies that are driven by that hierarchy.

Nonetheless, I love hearing from you, so leave your thoughts and comments down below. Like and subscribe if you want to see more content from this channel.

Now, as to why
I'm in a sling here in July of 2023: I like to go off-roading on my motorcycles. I was riding, lost control hitting some mud, slid, and managed to break several ribs and a clavicle, which means at least the next few videos are probably going to be done in the sling. I love hearing all the positivity from you, so if you have some comments or questions about all that, leave them below. I don't know what the future of riding motorcycles looks like for me, but I know it's not in my immediate future while I take some time to heal. Thank you, and take care.