Understand XenServer VM High Availability. This question comes up quite a bit: how do you set up an HA environment in XenServer? Is it supported? Good news is yes. This is an article from 2014, so it refers to Citrix XenServer. Now in 2019, we're talking about XCP-ng version 8.0 and how to do it there. So this article is actually still quite relevant even though it's been five years, because the underlying operating system is still based on the same principles; it's just been modernized and updated with lots of fancy new features. So we're going to be talking about XCP-ng 8.0 and how to set up high availability in there. Now, I'm going to link to this article, which is still a great read; other than some of the older screenshots, it's completely relevant. We have these three old Dells, and that's also what led to the name of the cluster. If you look at our resource pool here, it's called the LacklusterDell cluster, because it's just a bunch of old Dells. This is just a lab environment I wanted to set up for a demonstration. There's a Dell 3010 and a couple of 3020s; they're pretty similar. Ideally this is not the hardware you want for HA in a business environment, but it's fun for a lab and a demonstration. For business, you'd of course want much higher-end servers with better redundancy and higher quality, plus a redundant switch and redundant storage, but for lab and testing this is great. Now, this is going to be the layout for our lab: we have XCP-ng one, two, and three, those three Dells you've seen, connected to a single switch, all sharing storage over one gigabit line to a single FreeNAS storage box serving NFS. That will be the shared storage configuration. And you may have noticed I have a tab pulled up here that says HA-Lizard, because someone always asks about this. Yes, you can do HA that way with XenServer as well.
If you're not familiar with these types of configurations: HA-Lizard is a way you can do HA without having shared storage. You can just have two servers, configure high availability, and use the local storage pools on each individual server. The problem you're going to run into right away with this is IOPS. If you have a ton of writes going on on host one here, those writes have to be synchronized over to the second host, so that if there's a failover you have the most recent copy of the system. That is a bottleneck. So you'd have to have a really fast connection between the servers. And it's not only the fast connection that's needed; there's also the demand put on the drive system. When a write happens over here, that write has to travel not only over the network connection between the two of them, but also into the disk activity on the other side: hold on, I need to take this write and get it over to the other system so it stays in sync. So this is a way to do it, but it comes at a cost: keeping those two nodes in sync is expensive in hardware because of keeping up with all those writes. If you have something that doesn't write a lot, or that uses some type of external storage where all the heavy I/O activity goes, it can be great. The OS can live in a two-node HA environment like that, and I think it's a cool project. I've never actually messed with it much, but people always ask, "Have you heard of it?" It's mentioned in the comments on many of my videos. I have never tested it with XCP-ng, but it should work; I don't really see why not. It's kind of a neat system. I'll leave it at that and leave a link down below if you want to try it yourself. Like I said, we're going to be doing this with a normal shared storage system.
Now obviously this lab setup is non-redundant, and you'd use something much higher-end in a commercial environment. To bring up an example: you'd have a whole storage SAN with failover. Look at something I've reviewed before, like the TrueNAS systems, where you can have a single storage server that has redundancy not just in the power supplies but even redundant motherboards to help survive a failure. Then of course you'd have multiple switches, multiple network interfaces, and so on. This is just a lab environment. All right, so, understanding the mechanisms. With HA, we'd love it if it were more magical and just immediately moved a VM over, but we can't predict a failure. What I mean by that is it does move the VM over, but it has to restart it. This is a misconception a few people have about what happens when you lose a server. The reason it doesn't just magically keep running across these three servers: if a VM is running on server two and server two fails, the VM doesn't instantly shift over, because the pool doesn't know the server is going to fail, and trying to keep memory in sync between two servers would be really, really difficult. So it does restart. If the VM is running on server two and server two fails out of the three-node cluster, the VM will just restart on one of the other servers. This is where the shared storage comes in: all the VM's data lives on the shared storage, so we never have to move it, and all three hosts are connected to it. So, the way you set this up is to go over here to the pool. Now, whenever you add hosts (I have a more in-depth video on how pools work), you can join other devices to the resource pool. You just go here to Add Hosts, grab the first machine (I named the pool LacklusterDell cluster), and then add the other two hosts to it.
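The pool setup in the video is done through the Xen Orchestra UI, but for reference it maps to the xe CLI roughly like this. This is a sketch; the IP address and password are placeholders for this lab, not values from the video:

```shell
# On the host that will become the pool master: name the pool
# ("LacklusterDell cluster" matches the lab's pool name)
xe pool-param-set uuid=$(xe pool-list --minimal) name-label="LacklusterDell cluster"

# On each additional host: join it to the pool
# (192.168.1.10 is a placeholder for the master's management IP)
xe pool-join master-address=192.168.1.10 master-username=root master-password=<password>
```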
Once I had all the hosts added, all three of them combined into this resource pool. It currently has two VMs, and there are three hosts in the pool: xcp-ng one, two, and three. You can see their IP addresses here. Now, storage. This is the important part. When we add storage, we go to New Storage; you can choose any one host in the cluster, but because they're all in the pool, the storage gets made available to all of them. So we go ahead and choose an NFS storage mount. Once you connect it to one host, it just connects it to all of them. Pretty straightforward. Once it's added, we'll go over here to Storage and show it again: you can rescan it or connect it to all hosts, but by default, when you add storage it's going to connect to every host in that particular pool. So this is how we get the storage set up, and it's important because of this little UUID right here: I can hit Copy to Clipboard, and we'll understand this a little better in a moment. One of the important things is that all these machines have to have exactly the same time. That's important because they're using time as part of what they call the heartbeat. This explains how the heartbeats work: all these boxes have to talk to each other, and every box has to maintain a heartbeat on both the network and the storage. Once the storage is set up, you then have to tell HA which shared storage to use and set up the heartbeat: xe pool-ha-enable heartbeat-sr-uuids=<UUID>. You want to pass in the UUID of the storage in order to enable HA. So we would go here, copy the UUID right here, and paste it in as the storage ID. On my system it's already enabled, so it won't let me do it again, but that's how you tell it "this is the heartbeat storage."
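Putting those steps together, enabling HA from the CLI looks roughly like this. The SR name-label here is a placeholder for whatever you called your NFS mount:

```shell
# Find the UUID of the shared NFS storage repository
# ("NFS-Lab" is a placeholder name-label for the lab's NFS SR)
xe sr-list name-label="NFS-Lab" --minimal

# Enable HA on the pool, using that SR for the storage heartbeat
xe pool-ha-enable heartbeat-sr-uuids=<SR-UUID>

# Verify that HA is now enabled on the pool
xe pool-param-get uuid=$(xe pool-list --minimal) param-name=ha-enabled
```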
So you can have storage that's not part of the HA heartbeat and storage that is; this particular storage is. Now, if we go over here to Pools, we get the cool little high availability indicator, so we know the pool is in high availability. And by the way, if you ever want to know how to change which host is the master, that happens to be right here: we can remove any of them, including the master, and the pool should automatically pick a new master, but that goes beyond this demonstration. So let's look at the VM itself. This VM only has four gigs of RAM, but the larger part is the 16 gigs of storage it has, which I know is not much. Let's show what happens when you migrate it. It's currently on XCP-ng 2, so we're going to move it to XCP-ng 3; you just see me click Move. And let's see what it's running; I've got htop running just so it's doing something, and we'll show you how fast that move happens. Because we only have to synchronize the memory, and the storage is shared, I can live-migrate this like nobody's business, and there we go, it's moved. Takes no time at all, I didn't fast-forward that, and now it lives on XCP-ng 3. What if I have to restart XCP-ng 3? Well, that's where it gets kind of interesting, because we can restart some of these servers and watch it handle things automatically, but let me show you something first. If you have multiple VMs running, you get to pick which of them you want to be HA. Do you want HA enabled? Yes. What do you want it to do? You want it to restart. It also has "best effort" and "disabled" options, so you can disable it, but I just recommend "restart." Restart means: if the host this VM is on is lost, go ahead and restart the VM elsewhere.
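That per-VM protection setting can also be made with the xe CLI; a minimal sketch, assuming a VM with the name-label "lab-vm" (a placeholder, not from the video):

```shell
# Find the VM's UUID ("lab-vm" is a placeholder name-label)
xe vm-list name-label="lab-vm" --minimal

# Mark the VM as protected: restart it on another host if its host fails
xe vm-param-set uuid=<VM-UUID> ha-restart-priority=restart

# The other accepted values are "best-effort" and "" (empty string = disabled)
```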
So, no problem. We'll go back to the console and see the VM is doing something right now, but now we're going to move it by restarting one of the hosts. We can go here to Hosts, and we'll just reboot this one. Now, obviously this is a soft fail, so to speak: we did some maintenance, we need to restart a host because it needs something. What happens is it goes, "Hey, you tried to restart a host and this VM is HA," so it grabs the VM instead of shutting it down and moves it over to whichever server it needs to go to before restarting that particular computer. Obviously this is the nice way of doing it; the next way we're going to do it is less nice, because we're going to unplug a server with a VM running on it. So here the VM migrates right over to XCP-ng 2, and now we're waiting for the other host, which, if we go over here to Hosts, is restarting; it turns yellow until it becomes available again. This also allows you, without having to shut things down (provided you have enough resources across your pool), to do things like load patches and then roll them out machine by machine. And by the way, Xen Orchestra will handle that for you automagically: you apply patches to the resource pool, and with the VMs in HA it'll shift them between hosts while it does the restarts. It's a pretty slick system. While we're waiting for this reboot, let's read a little more here. We started with three hosts, but yes, you can add more, and you can determine the maximum number of host failures to tolerate. You can get really in-depth, and they have a good explainer for each step of this. Halting the VM.
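The failure tolerance mentioned above is a pool-level parameter; roughly, from the CLI it looks like this (with three hosts, tolerating one failure is a reasonable value for a lab like this):

```shell
# Set the maximum number of host failures the pool plans for
xe pool-param-set uuid=$(xe pool-list --minimal) ha-host-failures-to-tolerate=1

# Ask the pool what it could tolerate at most, given current VM resources
xe pool-ha-compute-max-host-failures-to-tolerate
```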
So you can halt a VM, but you do have to make sure all the hosts are talking to each other when you do one of these failures, because if you were to halt it while the machines were out of sync, one of the problems you could have is the pool trying to start that VM on another machine, and you can end up with some errors. I'll let you read through all of this to get more of those details, but generally speaking, what you're trying to protect against is the pull-the-plug type of failure. Hardcore: we're just going to pull the plug, shut it off unexpectedly, and show you what happens. Because we did nice soft fails, when you go over here to the VMs, those soft failures meant the VM never had to restart; it was just live-migrated between the machines while we did maintenance. Those are nice and convenient, but let's show what happens, and how long it takes, when I unplug XCP-ng 2 in the cluster: from the moment I unplug it, how long does the VM take to restart on the next host? So all of our hosts are up and running, and we can see the VM is running on XCP-ng 2, so we're just going to unplug XCP-ng 2 and see what happens. So I walked over to the Dells, unplugged it, and started the timer on my phone; so far we're at 30 seconds. We lose the console right there, and we're waiting for this to start back up. It'll take a second; XCP-ng 2 is down, the pool is doing its thing, confirming that the host is no longer alive, so this should disappear shortly. It's been 50 seconds... at a minute 20 it's restarting the VM. We're now at about 1:30; I saw the restart happen at 1:20. And let's see, check this right here: it's up and running. So in under two minutes it restarted the VM, and away you go. It was about a minute 45 from the time I pulled the plug until it was booted up and pinging, and we can log back into it over here. So it happens really fast.
Like I said, there's fine-tuning you can do with the HA system, but it works as expected. If we go over here to our hosts, that one's missing from the list now because it's down. We can see it just shows red right here; I could try to restart it, but obviously I have it physically unplugged. All we have to do to add it back, once we decide to put the system back in play, is plug it back in and start it up, and it joins the resource pool again; no big deal. And if it was permanently dead, or we were decommissioning things, we could just remove it from the cluster entirely. So I'll leave links to all of this so you can do some deeper reading, and participate in the forums over at XCP-ng, or our forums, if you have questions or are thinking about building one. And of course, if you're putting this into production, you obviously want to design it with either multipath iSCSI or a larger storage array. I do recommend some of the TrueNAS systems, which I've reviewed, because they're really awesome for this: you can build a highly available single storage box that's reasonably priced. All of those single components become weak points, so you have to double up on all of them when you're building an HA setup. And I won't lie, the HA-Lizard system is pretty cool too, but if you have high write volume, think about whether that solution works for you. I'll leave links to all of this so you can check it out. And once again, as I've mentioned before if you haven't watched any of my XCP-ng videos, this is 100% open source. All of this is free, with the exception of those Dells, which are actually technically free too since they were headed for recycling. They're some old recycled computers, but this is a 100% open source project.
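For the permanently-dead-host case mentioned above, the removal can also be done from the CLI; a sketch, assuming the failed host is really gone for good (HA generally has to be disabled before the pool will forget a host, then re-enabled afterward):

```shell
# List hosts to find the UUID of the dead one
xe host-list

# Temporarily disable HA so the pool will allow the removal
xe pool-ha-disable

# Permanently remove the failed host from the pool
xe host-forget uuid=<HOST-UUID>

# Re-enable HA against the same heartbeat SR
xe pool-ha-enable heartbeat-sr-uuids=<SR-UUID>
```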
There are no license fees associated with any of these HA modes I discussed here on XenServer, either. All of this is done with open source software, so I wanted to throw that out there and mention it. And that's it. Thank you, and head over to the forums if you want to continue the discussion. And thank you for making it to the end of the video. If you liked this video, please give it a thumbs up. If you'd like to see more content from the channel, hit the subscribe button, and hit the bell icon if you'd like YouTube to notify you when new videos come out. If you'd like to hire us, head over to lawrencesystems.com, fill out our contact page, and let us know what we can help you with and what projects you'd like us to work on together. If you want to carry on the discussion, head over to forums.lawrencesystems.com, where we can carry on the discussion about this video, other videos, or other tech topics in general; even suggestions for new videos are accepted right there on our forums, which are free. Also, if you'd like to help the channel in other ways, head over to our affiliate page; we have a lot of great tech offers for you. And once again, thanks for watching, and see you next time.