Issues in your setup. So usually it's not a feature that we want to promote as something great, but it needs to be there to fix the status of your hosts.

What are we going to do? We will start with an introduction and some terminology, to be able to talk about what fencing is and how it is done. Then I will show you some real-life examples: some situations and how fencing tries to fix them. We will also take a look at the future plans, because there are still some areas that we would like to improve.

Before I start, just for my knowledge: how many of you are using oVirt? OK. And how many of you have ever needed fencing, not necessarily in oVirt but in some other cluster software? OK, so it seems it's usual that things don't work as expected, so we need something like that.

So, in short: this is a very simplified picture showing the architecture of oVirt. We have something called the engine. It's kind of the brain of the whole system: everything important is decided and then executed on this machine. We have a bunch of hosts, which are used to run your VMs, and all of these hosts are connected to shared storage, so we can migrate VMs between those hosts. We also have logical units: a bunch of hosts accessing the same storage can be treated as a cluster, and each cluster can have its own configuration. Above the cluster is the data center, which is a set of clusters and also a set of storage that can be used in those clusters.

Hosts can have several network connections. The green one is the typical network connection you want, and the red one is usually the power management of the host. I will talk about it later, but it's an interface that allows you, for example, to stop and start your host using power management actions: not by doing SSH to the machine and executing shutdown, but really, for example, to cut the power to your host and shut
it down.

So, as I said: we have a host, which is a physical server that runs the hypervisor and the VMs. We have a cluster, which is a set of hosts with the same capabilities and the same configuration. We have a data center, which is a set of clusters and storage.

We also have something called a highly available VM. By default, if a VM runs on a host and this host crashes or stops working, the VM is not restarted automatically. If we detect that the host is really down, we just mark the VM as down, and it's up to an administrator what to do with it. But if the VM is configured as highly available, we want to run it on a different host as fast as possible. So if we detect that the host crashed, we need to make sure that the VM is no longer running, that it doesn't access storage, and so on; once we are sure of that, we start it on a different host. This is what we call a highly available VM.

Some other terminology. Power management interface: most modern servers have a special network connection with a device that lets you start the host remotely, shut it down remotely, get its status, and so on. This is what we use to manage the power of our hosts.

Fence agent: it's a tool that provides a common API for us to manage all types of power management interfaces. The most usual one is IPMI, but there is DRAC and some other proprietary power management interfaces, and we want to handle them through a single API. That's what we use the fence agent for. The fence agent is not oVirt specific.
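
To make the "common API" idea concrete, here is a minimal sketch of how a caller might build the command line for a fence agent. It assumes that agents are executables named `fence_<type>` (for example `fence_ipmilan`) and that they accept the long options `--ip`, `--username` and `--action`; the helper name `build_fence_command` and the sample address are made up for illustration, and credentials are deliberately left out.

```python
from typing import List

# Power actions used in this talk; real agents may support more.
ALLOWED_ACTIONS = {"status", "on", "off", "reboot"}


def build_fence_command(agent_type: str, address: str, username: str,
                        action: str = "status") -> List[str]:
    """Build (but do not run) the argument list for a fence agent.

    Assumption: every agent is an executable called fence_<type> and
    understands the same long options, which is what lets the caller
    treat different power management interfaces uniformly.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        f"fence_{agent_type}",
        f"--ip={address}",
        f"--username={username}",
        f"--action={action}",
    ]


# Same call shape regardless of the interface type behind it.
print(build_fence_command("ipmilan", "192.0.2.10", "admin"))
```

The point of the sketch is only that the caller changes the agent type, not the way it asks for `status`, `off` or `on`.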
The same fence agent package is used, for example, by cluster suite or OpenStack.

Another term is non-responsive host. It's a host that the engine cannot communicate with. Something happened, the engine is not able to contact the host, and we need to find out what and try to fix it.

Fence proxy: when we get back to the picture, if, for example, this host is non-responsive, we need to somehow contact its power management API to find out if it's down, up, or broken. To do this we select a different host, and on this host we execute the fence agent, through VDSM, to get the status of the non-responsive host or to execute the action. Why is that? Usually the engine is the point of access for our clients, and it usually doesn't have direct access to the power management API of the hosts, because that network is considered to be secret and not open to the public. That's why we execute the action on one of our own hypervisors, which calls the fence agent. So that's the fence proxy.

Now let's take a look at the UI of oVirt. When you add a host there is a special tab called Power Management, and here you can set up several things. The important part is here: in this picture it is, for example, the IPMI interface of the server, from which we can get the status. There is also a secondary agent, which is not directly in the host; for example, it can be the UPS port that powers the host. So if we cannot contact the host because of some networking issue, or because it died completely, we can take a look at the UPS, which may say: OK, the host is without power.
Then we can be sure that it is dead.

OK, so this is the slide that shows how this power management operation works. When, for example, an administrator wants to know the status of a host, we execute an action, for example get status, in the engine, which contacts some other host in the cluster. On this host we execute the fence agent, which uses a different IP address to get the power status of the desired host. This is the simplest scenario. Usually in production, in large sites, these networks are completely separated, and that's also why we use this slightly more complicated approach.

As we said, we need some other host to execute the power management command. We call this fence proxy selection. It's a process that takes all the other hosts from the cluster or data center, evaluates them, and picks the single one whose status fits best, to make sure that the power management action succeeds. In this process we get rid of every host that has connection problems or is non-operational for some reason, and we take just the best-fitting host; this one is selected to execute the power management operation.

By default the process first tries to use a host from the same cluster as the host that we want to execute the power action on. If that's not possible, we go up and try to select a host from the same data center but a different cluster. Even that may not be possible.
In that case we may also try a host from another data center, but this really depends on configuration and you have to enable it manually, because by default we don't expect that the data centers are connected. But if you know that your data centers are connected, you can enable it, so even if your whole data center is down you can still get the status and restart the host from another data center.

This is the same dialog, the host detail; here are the advanced parameters. Here you can see that for this host we try to select the fence proxy first from the same cluster and, if that doesn't succeed, we continue to the data center of the host. If you want, for example, to add the other data center, you can do it using this button. Same cluster and then same data center is the default for the project. If you want to change it globally, you can; if you want to customize it per host, you can do it here.

So, fencing. Simply speaking, fencing is a process that tries to make a non-responsive host responsive again. It uses several approaches to achieve that, which I'll show later, and it's tightly coupled with host monitoring. We monitor our hosts, and if there are, for example, some connection issues, we wait a bit to see whether they are temporary or not. If a timeout, which can be customized, has passed, we mark the host as non-responsive and try to execute fencing to make it responsive again.

Why is fencing needed? As we said, for a highly available VM we need to be sure that the VM is no longer running on the host that is not responsive. Why? If we start the same VM on a different host while it is still running, there could be data corruption, and that's what we need to prevent at all costs.

So how do we know that the VM isn't running, or that the host is really shut down? At the moment we have only two ways. If we successfully detect that the host is in the dumping
process, we know it is dead and the VMs cannot be running. Or we successfully execute the host shutdown and the host status after it is off; then we know that the VMs are not running. These are the only cases when we can start a highly available VM on a different host.

So the fencing process is: try to fix the host, and if that's not possible, either detect that it is dumping or detect that it is shut down, so we can treat the VMs as down or, if they are highly available, start them on a different machine. As we said, preventing data corruption is the most important goal of fencing. From our point of view it's better not to restart a highly available VM at all than to run it twice and cause data corruption.

In oVirt the whole fencing flow contains three important steps. The first one is something we call SSH soft fencing. It's, I would say, the easiest thing to do: try to connect to the machine and restart VDSM, which is our agent that handles all calls between the engine and the hypervisor. If that's not possible, or if the restart succeeded but the host still doesn't work, we continue to the next step, which is kdump detection. When your server has a hardware error or some kind of kernel issue, it usually reboots into what is called the kdump kernel, gathers memory data, and tries to save it to a predefined location for further analysis. When this reboot to the kdump kernel happens, everything that was running on the server is stopped, so we can assume that the VMs are no longer running and we can restart them.

If we detect that the host is not dumping, then we know something else happened, and the only thing we can do is try to restart the host completely using a power management action. We execute power management stop, then test whether the host really stopped. When it has stopped, we execute start and test whether the start worked. When everything went fine, we
still keep the host marked as non-responsive, but we expect that, since the start was successful, it will come up at some point. How long it takes to come up again depends on the server, its size, and its load.

So before we go to the real-life examples, any questions so far?

So the question is whether we plan to monitor hosts through storage. Our plan for the future is to introduce storage fencing; I will try to say something about it at the end. There is also an existing feature, which I will talk about: it's not storage fencing, but we detect whether the host is still accessing the storage or not.

Yes? So the question was how we detect that the host is dumping its memory. When we deploy the host we configure kdump for it, and we add fence_kdump, which is a module that sends notifications. When you boot into the kdump kernel and start gathering the dump, you can run a process that sends a notification: OK, I started dumping. Those notifications are sent until the dumping is over and the host reboots into the normal kernel again. We gather those notifications, and I will show you the example later. From them we know: OK, this host is dumping, we don't want to fence it, so as not to lose the dump information.

Any other questions? If not, OK, let's go to the basics.

This is a pretty easy example. We have a host which is not responding and we know nothing more about it. Let's assume that this is the simple case: networking is working, but our VDSM agent has crashed. So fencing starts in the engine, and as we said the first step is SSH soft fencing: we initiate an SSH connection to the server and restart the VDSM service. After that we wait for some time to see whether the host comes up. If the host goes up, everything is fine; if it doesn't, we continue to the next step. In this case let's assume that the restart of VDSM worked, so the fencing flow is over.
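
The three-step flow described so far (SSH soft fencing, then kdump detection, then power management restart) can be sketched as a simple decision chain. This is only an illustration under stated assumptions: the three callables and the `Outcome` names are hypothetical stand-ins, not the engine's real interfaces.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    HOST_RECOVERED = "VDSM restarted over SSH, host is up again"
    HOST_DUMPING = "kdump detected, wait for the dump to finish"
    HOST_RESTARTED = "host was stopped and started via power management"
    FENCING_FAILED = "nothing worked, an administrator has to step in"


def fence_host(ssh_soft_fence: Callable[[], bool],
               kdump_detected: Callable[[], bool],
               power_restart: Callable[[], bool]) -> Outcome:
    """Run the steps in order and stop at the first one that settles things."""
    # Step 1: cheapest attempt, restart the agent over SSH.
    if ssh_soft_fence():
        return Outcome.HOST_RECOVERED
    # Step 2: a dumping host runs no VMs, so do not power it off.
    if kdump_detected():
        return Outcome.HOST_DUMPING
    # Step 3: last resort, stop and start the host through a fence proxy.
    if power_restart():
        return Outcome.HOST_RESTARTED
    return Outcome.FENCING_FAILED


# The easy example: networking works and only the agent had crashed.
print(fence_host(lambda: True, lambda: False, lambda: False))
```

Only the second and third outcomes allow highly available VMs to be started elsewhere, because only then is it known that they are no longer running on the original host.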
That first example is pretty easy. Let's continue to another one. In this example, let's say a switch port stopped working, so we lose the connection between the host and the engine. As before, the host becomes non-responsive in the engine and we initiate the fencing flow. The first step is SSH soft fencing, so we try to connect to the host over SSH. Unfortunately that's not possible, so when we get the exception that the SSH connection failed, we initiate the next step. In this case, for simplicity, kdump is not configured, so let's skip it for now.

The next step is the power management restart. The engine tries to select a fence proxy host, as we said before. Once the fence proxy host is selected, the engine sends a command to it: please execute power management stop on this non-responding host. There VDSM calls the desired fence agent, which uses the power management API of the host and tries to stop it. If this was successful, we know that no VMs are running on this host any longer and we can restart the VMs on other hosts. So the engine executes another action in parallel, which starts the highly available VMs, sets the status of the normal VMs to down, and other things. In parallel we send another command to the fence proxy: OK, we know the host is down, please start it again. So we call the fence agent again and it tries to start the host. If the start command was successful, we keep the host marked as non-responsive and wait until it comes up.

Any questions about how this flow works? OK, let's go on.
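
As a rough sketch of the ordering in that power management restart: stop the host, confirm it is off, and only then touch the VMs and start the host again. The callables `run_fence_action` and `restart_ha_vms` are hypothetical placeholders for "ask the fence proxy to run the fence agent" and "start highly available VMs elsewhere"; the talk says the last two steps run in parallel, while this sketch runs them one after the other for simplicity.

```python
from typing import Callable


def power_management_restart(run_fence_action: Callable[[str], str],
                             restart_ha_vms: Callable[[], None]) -> bool:
    """Return True if the host was stopped, its VMs handled, and start issued.

    run_fence_action(action) is assumed to return the power status
    ("on" or "off") reported by the fence agent after the action.
    """
    run_fence_action("off")
    # Never restart VMs on a guess: the status must really be "off".
    if run_fence_action("status") != "off":
        return False
    restart_ha_vms()          # safe now, the VMs cannot be running there
    run_fence_action("on")    # host stays non-responsive until it boots
    return True


# Tiny fake host to show the order of operations.
events = []
state = {"power": "on"}


def fake_fence(action: str) -> str:
    events.append(action)
    if action in ("on", "off"):
        state["power"] = action
    return state["power"]


ok = power_management_restart(fake_fence, lambda: events.append("restart-vms"))
print(ok, events)
```

The design point is the guard in the middle: if the status check does not report `off`, nothing is started elsewhere, which is what protects against running the same VM twice.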
In this case everything around the host is fine, but there was some kind of kernel issue and the host started dumping. As we said before, when we register and deploy the host to oVirt, if kdump detection is configured (you can turn it off if you don't want it, but by default it's turned on), we alter the kdump configuration on the host and add a special fence_kdump configuration, which sends notifications from the host to the engine. At the moment it's the engine, but another host can be selected as the receiver.

So let's assume that the host started to dump. It booted into the kdump kernel, and fence_kdump can now start sending notifications to the engine that the host is dumping. At this moment the engine doesn't know what happened; it only sees that the host is not responding. So the whole machinery starts up, and as we said, the first thing the engine wants to do is SSH soft fencing. It tries an SSH connection, which doesn't work because the host is dumping. The next step is to detect whether kdump is going on. The engine takes a look at the database, and there it finds that the host is dumping at the moment. The non-responsive treatment of the host then stops and waits until those notifications stop coming. There is special configuration for this, but I'll skip it for simplicity.

The host is still dumping; if you have a host with lots of memory, it can take significant time. Once the host stops dumping, the notification process stops sending messages, the host reboots into the normal kernel, and the boot process starts again. At this moment, after some timeout, the engine says: OK, we stopped getting notifications about the host dumping, so we can assume that the host is restarting. We mark it as non-responsive and wait until it comes up.

So this is kdump. Any questions about it?

Yes? So the question is how we handle
failures of the fence proxy. There are two cases. If we are not able to contact the proxy host, so the command to execute the fence agent does not get to it, we detect that in the engine and select another host at the moment we detect that we cannot contact the original fence proxy. If something happens during the execution of the fence agent, the proxy returns the failure, with the reason, back to the engine, and the engine also selects a different proxy. By default we do three retries per command in the same cluster. If that doesn't work, we try the same data center, also with three retries, with different hosts. It's customizable: whether you want more retries or fewer is up to you, but this is usually enough.

Yes? So the question is whether we have some kind of timeouts while kdump is running. Yes, there are several timeouts. We have a timeout for how long the engine tries to detect kdumping before the first notification gets in; this is the first timeout. Then, because by default we send the notification every five seconds, we have another timeout that determines how many of those notifications can be lost. For example, if we lose five notifications we still assume that the host is dumping; if we lose the sixth one, OK, kdump is over. And we also have another timeout for how long after the last received notification we wait before we consider the kdump process finished and assume the host has restarted.

Yeah, so even in this case we just wait until it finishes, because we have an event that we send to the administrator.
The event says "this host is non-responsive" or "this host started kdumping", so it's up to the administrator: he receives the event and has to handle it somehow. If he knows that something is happening, it's up to him to stop the dumping of the host and reboot it manually. After that he can mark the host in oVirt: OK, I rebooted it manually, treat it as rebooted, and things go on. But it's up to the administrator. We don't have any other way to detect a stalled kdump or something like that, because the only thing we can detect is: OK, we received a notification, kdump is going on.

OK, so let's go on. This is a slightly more advanced configuration. Let's assume that we have hosts in one cluster and hosts in another cluster, and each cluster is connected through its own switch. Let's assume that there is an error in the switch for cluster one. Also, each host has its power management interface connected to a different switch, and its other network is connected to a different switch as well. In oVirt we can also have VM access, using SPICE or VNC, defined on a different network. I didn't draw it, but let's assume that there is also a different switch through which users can access the VMs.

So in the engine the host is non-responsive. The engine tries SSH soft fencing and it fails. Then it tries to select a proxy host from cluster one, which fails because the whole switch is down. So it tries another cluster in the same data center, selects a host, and on this host it executes the power management command, and the host is successfully fenced.

So does it seem right to you, or do you see any issue with that?
From the previous slides we would say: OK, the host is fenced, everything is fine.

Yeah, exactly: the host is still connected to the storage, and we don't know whether the issue is only here. OK, the engine cannot connect to the host, but we may have users connected through a different network who can still access their VMs and are pretty happy; they don't know about any issue. That's why we have the cluster fencing policy, and exactly for these situations we can turn on the option to skip fencing if the host is still connected to the storage. That's what I was talking about earlier.

So what happens in this case if this option is on? When the engine tries to execute power management stop through the fence proxy host, the proxy is able to connect to sanlock, which we use for storage synchronization, and ask it: does this host still access the storage, does it still renew its sanlock lease? If sanlock says that the host is still accessing the storage, we skip the fencing and say: OK, there is an issue, but the host is still alive and users can access the VMs. This is a kind of protection for more complicated network setups. We cannot assume that, if this link is down, the whole host is broken. There is an issue, but for us the primary goal is that our users are able to access their VMs. That's why we have this kind of failsafe: if the host is connected to storage, we skip fencing.

Now another question: OK, there is probably an issue with the whole cluster. Do we still need, for each host, to execute SSH soft fencing, then, if it fails, execute host stop, and then detect whether the host is connected to storage? That's inefficient, right?
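
The storage-lease failsafe described above can be sketched as a single guard in front of the power-off. The function and parameter names here are hypothetical, and `lease_is_renewed` merely stands in for "ask sanlock whether the host still renews its lease"; it is not a real sanlock call.

```python
from typing import Callable


def should_power_off(skip_if_connected_to_storage: bool,
                     lease_is_renewed: Callable[[], bool]) -> bool:
    """Decide whether the fence proxy should really stop the host.

    With the policy option on, a host that still renews its storage
    lease is considered alive and is left running.
    """
    if skip_if_connected_to_storage and lease_is_renewed():
        return False  # users may still reach their VMs, do not fence
    return True


# Switch failure example: engine link is down, storage link is fine.
print(should_power_off(True, lambda: True))
```

With the option off, the lease is not consulted at all and the host is powered off as in the earlier examples.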
For the case where a whole cluster is affected we have another option: skip fencing on cluster connectivity issues. You can define a threshold: for example, if 50% of the hosts in the cluster are non-responsive, fencing is skipped, because you know that this is probably an issue in the switch. In this case we can prevent unwanted fencing, and we can do it much faster than if we executed the whole flow and detected whether each host is connected to the storage.

Yeah. We try the soft fencing, it fails, and then we call the special verb on VDSM: please do power management restart. With this command we also send the option to skip fencing on a live storage lease, if it is turned on. If it is, the first thing the proxy does is contact sanlock and ask: is the host still connected to storage? If it is not, we continue with power management.

If you go back to the cluster fencing policy, the first option is Enable fencing. It's something that we introduced in 3.5. Disabling it turns off fencing completely for the whole cluster. Now you may wonder whether that's a good thing or a bad thing. We have users who don't run highly available VMs and, for example, the link between the engine and their hosts is pretty slow. So they want to skip fencing.
They want to do everything manually. For them it makes sense to turn off fencing completely, because they know the risks and they are OK with that; they don't want their hosts to be fenced inappropriately.

But in the usual flows, the only reason to skip fencing is, for example, if you know that this switch is being replaced or reconfigured. This is a planned reconfiguration and you know it's expected to take, say, half an hour. If you don't touch anything in oVirt, all the fencing steps will happen. Even if they don't succeed, they will start: we will see that the hosts are not connecting and are non-responsive, we will attempt SSH soft fencing, and so on. If you want to prevent this, you can say: OK, our switch maintenance starts in 10 minutes, I will disable fencing for the cluster completely, so we will not fence anything. When the switch maintenance is over, you enable fencing again and everything runs smoothly. I think this is the only viable case in which you should turn off fencing.

Any questions on that?

Yes. Yes, all of the functions and definitions are available through the REST API, so everything you can turn on or off in the web admin you can also do in the REST API.

Yes? Yes, so in this case, if you want to be sure, there are, as I said before, the secondary or additional power management agents. You can define, for example, your iLO as the first one, and for the case that it's not reachable you can define, for example, a fence agent on the UPS.
So when we contact the UPS and it says the power is off, we can treat the host as powered off. If even this doesn't work and everything is broken, the administrator has an option in the web admin or REST API to say: OK, I manually rebooted the host and I take the responsibility. This is the last step we can offer. It's up to the administrator: if he only claims to have done it, he has to expect what can happen.

Yes, I will talk about that in the future plans. We would like to introduce proper storage fencing, because when we have this we will be able, for example, to say: OK, this host is in a state we don't know about, please tell sanlock to stop it, or to place a lock so that this instance of the VM is not able to access the storage again, and then restart the VM at once. This is a plan for the future. Any other questions?

OK, so I talked about storage fencing. The other thing we are thinking about is this: if your host is dumping, it might be a kernel issue that is fixed by the next reboot, but it can also be some kind of hardware error. In that case the host starts dumping, reboots, the normal boot process starts, the issue occurs again, it boots into kdump, and this is a never-ending story. So we are thinking of adding some kind of configuration and heuristics: for example, if the host was kdumping maybe three times in the last hour, mark it as non-responsive, don't deal with it any further, and leave it to the administrator to sort out.

So that's it. If there are any other questions, I'll be available and happy to answer them. Thanks a lot for your attention.