Customers who get health checks benefit from the ability to predict and avoid issues in the future. Customers who have implemented the recommendations that we provide in health checks also see fewer issues in their deployment activity and open fewer cases over the life of those nodes.
This is your host, Swap, and welcome to another episode of ClearHour. Let's talk. My next guest on the show is Philip Mary, software engineer at SIOS Technology. Philip, it's great to see you again on the show.
Good to see you again, Swap.
Today's conversation is going to focus on the anatomy of high availability health checks. So let me ask you: what is a high availability health check?
A health check is just what it sounds like. It's an evaluation of the cluster to look for underlying problems. The purpose is to make sure that a cluster is healthy and to head off any detectable issues in the future. There are a few different types of health check, but they all focus on making sure that the deployed systems are in a configuration that's going to be valid and operational with LifeKeeper. Our goal is that the health check not only uncovers and remediates any problems we discover, but also gives the customer confidence that their upcoming deployment, their maintenance activities, or just the continued operation of those nodes is going to be without issue.
Is it a process that teams do themselves, or is it part of SIOS's offerings, so that it's not another task in the teams' pipeline?
The health check is something that SIOS would take care of. Typically it's an engagement that you would reach out to SIOS for, and we would schedule it. There would be a little bit of involvement from your team, but it would primarily be work done on the SIOS side. I'll talk about it a little later on, but the primary engagement between your team and SIOS's team would be the screen share session.
And like I said, I'll get into a little more of what we do in that session later on.
When are these health checks done? Is it something that organizations should do on a regular basis, or more when they start seeing some problem?
I would definitely suggest starting one of those health checks before you see problems. What sort of health check you want depends a little on what you're doing, but I would always suggest a health check before the deployment of those nodes, and maybe before an upgrade. In the health check we could also review the state of the clusters afterward and make sure that the application versions are compatible with what LifeKeeper is able to support.
And how deep do these health checks go?
There are a few types of health checks. I refer to them as short-form and long-form health checks, and I'm going to talk about them separately because their depth differs. A short-form health check primarily focuses on validating that the configured environment meets the requirements for protection by LifeKeeper. I know that sounds general, but it's primarily focused on application settings and configuration. It's a really good option for new deployments, because a log review isn't included in that evaluation. A long-form health check, on the other hand, is tailored to nodes that have been in deployment for some time and might have a history to investigate. It covers everything the short-form health check does, but it goes into the additional detail of reviewing logs for not just LifeKeeper, but even VLS. If there's been an issue in the past, it's something that would typically get uncovered in that long-form health check. I'd also add that the long-form health check benefits from the context of any previously opened cases and from the experience that the customer experience team has, not just with your cases, but with all of the cases we've handled.
In some instances, customers have opted to add on to the health checks, and we've done reviews of runbooks and validated maintenance procedures as an additional part of those services.
Of course, when things have been running for a while, you do have to keep an eye on their health. But do organizations also need a health check when changes are made, whether they update something or deploy something new? What I'm trying to understand is the importance of health checks in the process: whenever a change happens, do you also check how it is affecting the health of the whole system?
Health checks are really best done whenever there's been an update or any material change to those clusters. If something is going to be a marked shift from the previous deployment configuration, then I would say it's a good idea to go ahead and get that health check done, so that you have some perspective down the road on what might need to be remediated or what issues can be avoided ahead of time.
You already talked about short-form and long-form health checks. How long do these checks take?
A short-form health check can take about one week, give or take a few days for anything unique to that cluster. They really benefit from their narrower scope. But we still follow due diligence, and we make sure that any remediation suggestions we provide are appropriate not only for your environment, but also for your business needs. A long-form health check, on the other hand, can take about two to three weeks. They're a little more in depth. I've done health checks where I've investigated a year's worth of logs, so there can be a lot more to evaluate before composing that report. But they also benefit from a much more fine-toothed-comb approach, going through anything that might have been affecting those clusters.
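The durations quoted in this answer can be turned into a rough planning window. The following is a minimal Python sketch, not a SIOS tool: the function name `estimate_report_window`, the type labels, and the reading of "give or take a few days" as plus or minus three days are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Rough duration ranges in days, taken from the conversation.
# Assumption: "give or take a few days" is read as +/- 3 days.
DURATION_DAYS = {
    "short-form": (7 - 3, 7 + 3),  # about one week
    "long-form": (14, 21),         # about two to three weeks
}


def estimate_report_window(start: date, kind: str) -> tuple[date, date]:
    """Return the (earliest, latest) date a report might be expected."""
    if kind not in DURATION_DAYS:
        raise ValueError(f"unknown health check type: {kind!r}")
    low, high = DURATION_DAYS[kind]
    return start + timedelta(days=low), start + timedelta(days=high)


if __name__ == "__main__":
    kickoff = date(2025, 3, 3)  # hypothetical kickoff date
    for kind in DURATION_DAYS:
        earliest, latest = estimate_report_window(kickoff, kind)
        print(f"{kind}: report expected between {earliest} and {latest}")
```

The ranges are deliberately kept in one table so a team can swap in whatever timeline SIOS actually quotes for its engagement.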
Of course, when we look at SIOS, we talk about high availability as well. When these health checks run for a whole week, is there any impact on business continuity? Any downtime? Or are those two separate things, so organizations don't have to worry about downtime for these health checks?
Whether you take downtime for your health check is up to you. The screen share session that I talked about before is a session where SIOS engineers get on and speak with the team responsible for those nodes to collect environment information. As part of that screen share session, we might do functional testing, actually perform failovers, and actually simulate failures on those systems. But that part is entirely optional. I've performed health checks where customers were ready and eager to do those tests, and I've performed health checks on systems where customers didn't want to take the outage or couldn't afford the time to take those systems out of production. When you're not able to perform those sorts of tests, it's not always a negative. It just means that I'll look back in the logs for other instances that might meet the criteria for the tests that I couldn't perform.
And when organizations are planning these health checks, whether long-form or short-form, what kind of prep should they do internally? Or does it not really matter, and all they need to do is get in touch with your teams and plan it out?
The first step, I would say, is getting in touch with one of our teams and starting the planning, just so you can get an idea of the timeline. As for actions for the customer to take, I would just say make sure that the systems you're going to have checked have already been fully configured.
When we get on to do that screen share session, we want to make sure that we're collecting the most accurate information about the deployment configuration of those nodes. So it's always good that things are, quote unquote, finalized before performing that health check.
What does this process look like? You mentioned screen sharing. How do you interact with the teams?
In that screen share session, like I mentioned, one or two SIOS engineers, maybe more depending on the scope of the health check, will hop on with the team responsible for those nodes. There's a whole lot we go through. We start with the operating system configuration and make sure that it is configured to meet the requirements for LifeKeeper, and that the installation of LifeKeeper happened effectively, with no unknown or silent issues during that installation. So far we've never uncovered anything like that, but we always do our due diligence to make sure that there's nothing there. Next we go into our quorum checks to make sure that the topology of your cluster is something that will enable you to have high availability and avoid issues with network bifurcation or anything like that. Once we've finished that, we go into what we call ARK testing, ARK for Application Recovery Kit. We test the configuration of each of your configured resources to make sure that it meets the requirements for LifeKeeper to run, that it meets the behavior requirements your business has for those applications, and that things are configured in a way that will enable high availability, or higher availability.
Can you also talk about how a health check fits into a successful deployment?
Customers who get health checks benefit from the ability to predict and avoid issues in the future.
Customers who've implemented the recommendations that we provide in health checks also see fewer issues in their deployment activity and open fewer cases over the life of those nodes. A health check gives you insight into the obstacles that might lie ahead, and it also gives you the toolkit you need to address them and make sure that those obstacles don't lead to interruptions for your business.
I'm not asking you to share your whole playbook, but can you tell our viewers what advice you have, or what best practices teams should follow, so that they are well prepared for these health checks?
For teams that are going to be getting these health checks, it's actually pretty light on your side. The primary thing I would suggest is to make sure that we have access to the resources we'll need to perform all of the checking during that screen share session. I mentioned that we're going to go through not only the operating system but also the protected applications. We'll also go through cloud environment settings and, if the nodes are running in a VM, maybe even hypervisor settings. It's important that we have access to the personnel who are able to gather information for each of these respective resources. I would suggest making sure you have availability from any teams that are responsible for the protected applications. We need them available in case we need permissions that they hold, or just their expertise, to perform the checking more efficiently. It's best that they know we're doing the checking and that they're able to hop on and join that screen share session if their presence is needed.
Whenever you're doing a health check, I would also suggest leaving ample time for your team to receive the report, read it, ask any questions that might arise, and make the remediations recommended in that report. It's important that you have time before your freeze and cutover not only to implement those changes but to test them, look at them, and make sure that the recommended changes are going to be acceptable for your environment. Of course, that's not always possible for nodes that are already in deployment. In that case we would suggest you use some QA systems, or, if that's not an option, we might even test the configurations on our lab systems to make sure they're acceptable before handing that recommendation to your team.
Thank you so much for taking time out today to walk us through the whole process and the importance of health checks for high availability. Thanks for all those insights and advice, and I would love to chat with you again soon. Thank you.
Thank you for having me, Swap. I'd like to be on again sometime.
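The advice about leaving time before a freeze and cutover can be sketched as simple backward planning. This is an illustrative Python sketch only: the function `latest_start` and the buffer lengths for report review, remediation, and testing are hypothetical values that a team would replace with its own. Only the idea of reserving time for those phases, and the worst-case three-week health check used in the example, come from the conversation.

```python
from datetime import date, timedelta


def latest_start(freeze: date, check_days: int, review_days: int,
                 remediation_days: int, testing_days: int) -> date:
    """Latest kickoff date that still fits every phase before the freeze.

    check_days is the worst-case health check duration. The other three
    arguments are team-specific buffers (hypothetical, not SIOS figures).
    """
    phases = {
        "check_days": check_days,
        "review_days": review_days,
        "remediation_days": remediation_days,
        "testing_days": testing_days,
    }
    for name, value in phases.items():
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    # Walk backward from the freeze by the sum of all phases.
    return freeze - timedelta(days=sum(phases.values()))


if __name__ == "__main__":
    # Hypothetical plan: long-form check (up to 21 days), 5 days to read
    # the report, 10 days to remediate, 5 days to test before the freeze.
    print(latest_start(date(2025, 6, 30), 21, 5, 10, 5))
```

Counting in calendar days keeps the sketch simple. A team that plans in business days or around change windows would need to adjust the arithmetic accordingly.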