And I've seen him give some talks over here as well. I think this morning he had some work commitments. You have those cards? You can give them to people who ask good questions. I should have foreseen the delays; I got this from Amy. We are starting the next talk. Please keep your mobiles on silent; check your messages, but keep them silent. If you are going to enter, please be careful with the doorstep; it is quite noisy, so please be quiet with it. Today at the facility in Auditorium D11045 they are holding a competition with prizes, so we encourage people to come over and take part. There are some prizes and a few different goodies you can win, so do come over and take part in the competition. For those who are here, we have a few cards that the presenter can give as a gift for good questions. And that is it for the operational part. So now please welcome Ramesh Shmuthrou, who is our Gluster engineer.

I am Ramesh Shmuthrou, working at Red Hat Bangalore on the oVirt team. Mostly I work on oVirt and Gluster integration. Today I am going to talk about oVirt and Gluster hyperconvergence. So, how many of you know oVirt? How many of you use oVirt with Gluster? Cool, this might be interesting for you then. This is going to be my agenda for the next 40 minutes. First I will introduce you to oVirt; since most of you may already know it, I will just roughly talk about the architecture and storage. Then I will talk about the hosted engine. Then I will introduce you to Gluster and the basic components in Gluster. Then we will see how to set up a hyperconverged cluster using oVirt and Gluster: how it looks, how to set it up, and how you manage the setup. Then I will talk a little bit about the recent enhancements we did for the hyperconverged case in oVirt as well as in Gluster. Let's get started.

So basically, oVirt is a virtualization management platform which helps you to manage your virtual machines, as well as the storage and networks for those virtual machines. As you see from the diagram, on top we have the oVirt engine, which runs on separate hardware inside a JBoss application server. The oVirt engine is written in Java EE and runs in a JBoss application server; it takes care of managing your hypervisors and storage. The oVirt engine uses PostgreSQL as its database and exposes two interfaces for the user to communicate with it. One is the web admin portal, where you can define your cluster, add your hypervisors, add your storage nodes, and create your virtual machines. It also exposes nice REST APIs, which can be used to integrate oVirt with other management platforms like ManageIQ or CloudForms. Below the oVirt engine you see the clusters; a cluster is a group of hosts which are all virtualization enabled. Mostly we use KVM virtualization, which takes care of running the virtual machines on the hosts. On each host oVirt has something called VDSM, an agent or daemon running on all the hypervisors, written in Python. It takes care of all the hard work for oVirt. The oVirt engine communicates with a host through VDSM: it asks VDSM to create a virtual machine, stop a VM, attach a disk, and so on. All those tasks are done by VDSM; it manages the host.
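As a side note on the REST API mentioned above: this is only a sketch, not from the talk, and the engine address and credentials are placeholders, but listing VMs from the API can look roughly like this.

    # Sketch: query the oVirt REST API for the list of VMs.
    # engine.example.com and the admin password are hypothetical placeholders.
    curl -k -u admin@internal:password \
         -H "Accept: application/xml" \
         https://engine.example.com/ovirt-engine/api/vms

Anything that can issue HTTP requests (ManageIQ, CloudForms, or a simple script) can drive oVirt through this same interface.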
And underneath, in the diagram, you see the storage. Traditionally oVirt supports different storage types like NFS, FCP or iSCSI, and GlusterFS as well. That storage can be managed by Gluster, managed by oVirt, or be independent of oVirt, like iSCSI, FCP, or NFS. Next we will look specifically at the storage architecture. Storage is a main component in oVirt, because VDSM relies completely on the storage to store the virtual machine images as well as the config files and other data. It also relies on the storage for synchronization between the VMs; for example, the same VM should not be running on multiple hypervisors. The locking and synchronization are handled by sanlock, and that is done through the storage domain, which is the actual storage here. oVirt supports storage types like NFS, FCP, and iSCSI. There is one difference between those storages and GlusterFS: NFS and FCP storages have a single point of access, where you have to communicate with one host to get your storage. GlusterFS is different because it is a clustered file system and it has multiple hosts, so if one host fails, VDSM or oVirt can talk to the other hosts to get the VM images or other data. That is why GlusterFS is more useful here.

Next, the hosted engine. In the previous slide, the oVirt engine runs on separate hardware, and the question is: what if that machine fails? That is a failure point. Also, you have to dedicate one piece of hardware for that oVirt engine. Instead, you can do something called hosted engine, where the oVirt engine itself runs in a VM, and that VM runs on any of the hypervisors. You can get rid of one physical host, and you also get HA: if one host fails, the oVirt engine can run on another host, so you get HA for the oVirt engine as well. In this hosted engine setup there is something called the hosted-engine HA agent, which runs on the hypervisors, so that if one host fails, another host will automatically start the hosted engine, and that engine can manage all the other hosts. In the end, the oVirt engine will always be running and you can always manage your hosts. But it is kind of a chicken-and-egg problem, because you have a VM that manages itself and the hosts which are running that VM. So you have to do some bootstrap deployment to bring up that VM and then add the other hosts. I will talk later about the exact steps for doing that. Okay, that is about the hosted engine.

Next is GlusterFS. As you all know, GlusterFS is an open source distributed file system. It runs on any commodity hardware and it can scale up to petabytes easily. As you see in the diagram, it can be accessed through multiple protocols like FUSE or NFS or SMB, and lately natively via libgfapi. What it does, basically, is aggregate different storage exports from different hosts and present them in a single namespace, which is more reliable and can be replicated or distributed. There are multiple concepts here; next I will talk about the basic ones.

So, the brick. A brick is the basic component in Gluster. It is the actual place where your data is stored; it is basically a mount point. As you see from this diagram, you have a block device in the system, you format it to get a file system on it, you mount it on your system, and that mount point becomes your brick. A brick is the place where your actual data is stored, and the size limit of a brick depends entirely on the system; there is no limitation from the Gluster side.
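To make that concrete, a minimal sketch of preparing a brick on one host might look like the following; the device name and mount path are hypothetical placeholders, not from the talk.

    # Sketch: turn a spare block device into a mount point that can be used as a Gluster brick.
    # /dev/sdb and /bricks/brick1 are placeholder names.
    mkfs.xfs -i size=512 /dev/sdb     # XFS is the commonly recommended file system for bricks
    mkdir -p /bricks/brick1
    mount /dev/sdb /bricks/brick1     # this mount point can then be given to Gluster as a brick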
Whatever amount of data the brick can hold is entirely up to the file system and the disk. Next is the GlusterFS volume. A volume is a collection of bricks. As you can see, we have three bricks from three different hosts; you can combine all of them and export them as a single volume. GlusterFS supports multiple volume types, such as Distribute, Stripe, and Replicate. Take Distribute, where your data is distributed across different bricks. For example, we have three files called File1, File2, and File3, and each file may go into a different brick, so you can still access them easily through the volume. Another type is Replicate. Replicate is more reliable, especially for virtualization and other use cases where you need reliable data and do not want to lose your data at any cost. There you can use a replicated volume, where your data is replicated across bricks: whatever file you write to the volume is replicated across all the bricks in the replica set. In the case of replica 3, it is replicated on three bricks. You can also combine types, like Distribute-Replicate or Distribute-Stripe, and so on.

Another important aspect here is quorum. Gluster supports something called quorum, and if you enable quorum, you can avoid split-brain. A split-brain condition is like this: you have three bricks, one brick goes down or something happens, and you end up with the three bricks having different data. Now you do not know which is the final, correct data. Those cases can be avoided easily with quorum: if you enable quorum and the majority of the bricks are not available, it automatically makes the file system read-only and you cannot write anything, so you can safely recover from the failure.

Next, this is how you access a GlusterFS volume through FUSE. In the virtualization case, you have QEMU running your virtual machines. If it wants to access an image on a GlusterFS volume, the request goes through the kernel; from the kernel it goes to the GlusterFS FUSE client, which requests the data from the Gluster brick running on the remote Gluster server over the network, and that in turn goes to the underlying XFS file system. As you can see, there are a lot of context switches happening between user space and kernel space, so this path does not perform well. So what we have instead is something called gfapi support. This lets you communicate directly with the Gluster volume to get your data. QEMU supports gfapi, so QEMU can directly talk to the GlusterFS volume and fetch the image data. There is one bug, or one RFE, missing here: currently QEMU talks to only one host. Although GlusterFS is a clustered file system, QEMU connects to a single host, and if that host fails, your VM gets paused. We are working on a patch so that if that host fails, QEMU can still talk to the other hosts in the cluster to get its volume back.
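As an illustration of the gfapi path, a sketch only, with hypothetical host, volume, and image names: QEMU can be pointed at a Gluster volume directly with a gluster:// URL instead of going through a FUSE mount.

    # Sketch: boot a VM whose disk image lives on a Gluster volume, accessed via libgfapi.
    # host1, vmstore and vm1.img are placeholders.
    qemu-system-x86_64 -m 2048 \
        -drive file=gluster://host1/vmstore/vm1.img,format=raw,if=virtio

Note the single server name in the URL, which is exactly the limitation mentioned above: if that host goes away, the VM is paused.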
Next is the hyperconverged oVirt setup. Let's compare this with the first diagram. How many hosts did you have there? One for the engine, then three hypervisors, and then the separate storage; you need at least five or six hosts. In the hyperconverged case, you need just three hosts, and the same hosts are used for both storage and virtualization. This is very useful in small-scale deployments where you have very limited hardware and still want to run virtualization. Here we run both GlusterFS and KVM on the same physical nodes, and for the oVirt engine we use exactly what I explained before, the hosted engine. So the oVirt engine becomes highly available automatically and runs inside a VM. It can manage the other VMs and hypervisors, and it can also manage the GlusterFS volumes. These boxes can be standardized so that you can scale out easily: you can add more storage later to the same hosts, or you can bring in additional hosts to scale out your setup. This is the overall hyperconverged oVirt-Gluster setup.

We had to do some enhancements to get here. The first thing was libgfapi support; it was not there before, and now it is there. And the replica 3 volume requirement: previously, especially with oVirt 3.5, when people tried the hosted engine with Gluster, they used replica 2 volumes and got into split-brain situations. Split-brain situations are very costly in oVirt, because when you get into a split-brain, your VMs are lost or at least paused. You have to reboot them; in most cases they will not recover automatically. And sometimes you have corrupted data, or cases where the same VM is running on different hosts; there are multiple problems. As a result, replica 2 volumes are no longer supported, and in 3.6 GlusterFS support for the hosted engine was enabled with replica 3 volumes. So you must have a replica 3 volume to use GlusterFS for the hosted engine. Now that support is there.

Another thing is the oVirt engine appliance. Deploying a hosted engine setup was traditionally very tough, because you have to solve the chicken-and-egg problem: you have to bootstrap the virtual machine which is going to run the hosted engine, and then add the hypervisor hosts. In oVirt there is something called the hosted-engine deployment script. It is a very interactive script and it asks a lot of questions, all related to the setup. If you make any mistake in between, you have to start from the beginning and clean up the setup; even we faced a lot of issues. But with the appliance model, you get a pre-packaged VM appliance which has oVirt already installed, and you can just use that. With this model you avoid multiple errors in your deployment. Also, that VM can be customized using cloud-init. Basically you have to configure a few things, like the FQDN for the VM, the username, the password, and so on; you can configure those easily.

One more enhancement is the shared configuration. Previously, although there are three hosts and the hosted engine VM can run on any of them, the configuration for the hosted engine VM was stored separately on all three hosts. So if you wanted to make any change, you had to edit the same configuration on all the hosts. Now that is shared: it is stored on the shared storage, the GlusterFS storage or whatever storage the hosted engine uses, and in that way the configuration is shared across all the hosts.
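Going back to the replica 3 requirement for a moment: a minimal sketch of creating such a volume from the Gluster CLI would look roughly like this. The host names and brick paths are hypothetical placeholders, not from the talk.

    # Sketch: form a trusted storage pool from three hosts and create a
    # replica 3 volume for the hosted engine storage domain.
    gluster peer probe host2
    gluster peer probe host3
    gluster volume create engine replica 3 \
        host1:/bricks/engine/brick host2:/bricks/engine/brick host3:/bricks/engine/brick
    gluster volume start engine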
And we have a mechanism to upgrade from oVirt 3.5, where the configuration is stored separately on all the hosts, to the shared configuration. And we have a management GUI. Until now, if you wanted to change something in the hosted engine VM, like giving it more memory or changing the number of processors allocated, you had to edit the config file. Now you can do this through the UI. But I came to know that this feature is currently not available; at the last minute it was removed from 3.6, but it will be there in the next 4.2 release. Okay.

What are the main ingredients for a hyperconverged setup? You just need 3 physical hosts with virtualization enabled. Then you need at least about 20 GB of memory and 20 GB of storage space in each system, because we are going to run Gluster storage there and you are going to run your VMs there. That is the bare minimum if you want to try this hyperconverged setup on virtual machines; on real hardware I hope you will have more capacity by default. The next important thing is that you have to configure your DHCP: basically you have to reserve a MAC address for your hosted engine VM, and that MAC address should resolve to a fixed IP. And you should add an entry in DNS to give a predefined name for that IP, because we are going to communicate with the hosted engine VM using the same FQDN always, so we have to make these configurations. Next, you just have to install the packages. And the last point is interesting: you need physical console access, or network access plus the screen package. Why do you need the screen package? Because, as I told you, bootstrapping the hosted engine VM is a little complex and involves multiple steps like configuring your firewall and configuring your network, so you may end up losing your connectivity to the physical server. If you do lose it, you have to get back to the same session, and that can be done through screen or the physical console.

Next, how do you set it up? The first step is setting up your Gluster volume. As I told you, we need replica 3 volumes. We have three hypervisors: make a Gluster storage pool out of those three hosts, then create at least one three-way replica volume for now; later we can create another volume. One volume will be used for the hosted engine storage domain, which will contain the disk image of the hosted engine VM. Another volume will be used for the regular VMs which you are going to create. And, as I already told you, we have to enable the Gluster quorum; otherwise you may get into split-brain situations. Another important feature is sharding. This was recently introduced in Gluster. Without this feature, we were facing many problems with self-heal: when self-heal happens, it eats up all the CPU time, your VMs do not get any CPU and they get paused. As a result you have to reset the VMs, and especially in the hyperconverged case even your hosted engine VM gets paused, so it is a difficult situation to recover from. But if you enable sharding, your big files, your disk image files, which are anyway going to be big, will be split into multiple pieces. Not striped; it is a little different from normal striping.
The file will be chunked into chunks of a considerable size; by default we use a 512 MB shard size, so it will be chunked into 512 MB shards and those will be stored on different bricks. This greatly improves self-heal performance as well as geo-replication performance; we tried it with geo-replication as well. And it actually solves the CPU problem: now, with self-heal running, we are not seeing heavy CPU usage anymore. We had to do a few more tunings, like setting the network ping timeout and the self-heal algorithm, and setting the owner UID and user ID to 36. Sorry, that second one should be the GID.

Okay, the next thing is the hosted engine setup. As I told you, we need these three packages. One is the oVirt engine appliance, the OVA image for the hosted engine. Next is the hosted engine setup package, which brings in the hosted-engine setup script that helps you bootstrap the hosted engine VM. Next is vdsm-gluster, which brings in all the VDSM Gluster packages. Then comes our friend, hosted-engine deploy. When you run this script, it is very interactive and asks multiple questions. These are the few important ones. One is the storage type: you have to choose the storage type and enter the GlusterFS volume name along with the host. Next it will ask you how you want to bootstrap your VM. There are three options: you can use the OVA image, a CD-ROM, or PXE boot. As I told you, using the OVA image is the easiest and better way, so choose the OVA image. Next, you can do some customization of the VM through cloud-init, and the script supports on-demand creation of the cloud-init configuration, where you can specify the FQDN, the root user password, and so on. Also, you have to give the MAC address for the VM NIC, the one you reserved in your DHCP server. If you want more details or the exact commands, please read my blog; I have written it to help with this deployment, and it covers all the details, the exact sequence of commands you have to run and what you have to answer.

Once you do this, what you get is the hosted engine VM running. You can log into the oVirt engine, and you will see a cluster with one host, the one where you ran the script. The next step is enabling the Gluster service. By default, the oVirt engine does not enable the Gluster service for the cluster where it was deployed, so you have to enable the Gluster service for that cluster. Then you have to add the remaining two hosts to the setup and create another storage domain where you will create your VMs.

Any questions? Yannick, do you have one? You mean this ping timeout? Yes, these are our volume configs, the Gluster volume options. After creating the volume, you can set these options. Basically, these are all volume options: enabling the Gluster quorum and enabling sharding.
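As a rough sketch of what those options look like from the Gluster CLI: the volume name here is a placeholder, and the values are just the ones commonly recommended for VM storage, so treat this as illustrative rather than the exact configuration from the talk.

    # Sketch: options typically set on the VM-store volume in the hyperconverged case.
    # "vmstore" is a placeholder volume name.
    gluster volume set vmstore cluster.quorum-type auto
    gluster volume set vmstore cluster.server-quorum-type server
    gluster volume set vmstore features.shard on
    gluster volume set vmstore features.shard-block-size 512MB
    gluster volume set vmstore network.ping-timeout 30
    gluster volume set vmstore cluster.data-self-heal-algorithm full
    gluster volume set vmstore storage.owner-uid 36
    gluster volume set vmstore storage.owner-gid 36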
So how do you enable Gluster? Gluster has been integrated with oVirt since 3.1, for a long time. Currently you can manage your whole Gluster deployment using oVirt. In this hyperconverged case, we have a cluster where we want to run both virtualization and Gluster. By default, and I think this should be fixed in oVirt, it does not enable the Gluster service, so we have to go and edit the cluster to enable the Gluster service. If you enable the Gluster service, it automatically takes care of bootstrapping your Gluster host, like managing your firewall and managing the glusterd service. It also supports volume management: you can create your volumes, start and stop them, add bricks, or rebalance. It supports almost all the Gluster operations, and you can use the same REST API exposed by oVirt to do all your volume operations. Although we enable both the Gluster and virt services for the same cluster, there is one problem with oVirt: it does not consider the fact that the Gluster service is running on a host when it does fencing or other power management. What happens is, if you enable power management for a host and that host is not reachable for some reason, oVirt will reboot it. That may not be desirable in the Gluster case, but it will still reboot the host, because it is not considering the fact that the host is running the Gluster service. This has to be fixed.

The next thing is adding the additional hosts. Since we are done with the first host and we have bootstrapped the hosted engine, the only thing remaining is adding the remaining two hosts to the cluster. Just go and add the host; in the Add Host dialog you will see a checkbox asking whether you want to configure this host as a hosted engine host. If you say yes, it will configure the hosted engine agent on this host as well. But this UI was removed at the last minute for some reason, because of some bugs. So again, hosted-engine deploy is your friend; it can be used to add the additional hosts. Go to the second host and run the same script. This time it will not ask as many questions; you only have to give the same volume, and it automatically takes care of most of the things. Now your second host is there. Similarly you can add the third host. Now you have three hosts in the cluster which are running both Gluster and virt services.

Next is creating the storage domain from Gluster. As I already told you, oVirt supports Gluster deployment and Gluster management. You can create your volumes in oVirt itself, either through the web admin portal or through the REST API. It supports other Gluster features like volume profiling, capacity monitoring, geo-replication, and volume snapshots, almost everything that is there in Gluster. This slide is about adding the storage domain using the GlusterFS volume. Now that we have created a volume which will be used for the regular VMs, we have to create a storage domain using this GlusterFS volume. Here the important point is this one: you have to give the volume path, and you have to give the mount options with the backup-volfile-servers. This will be used so that when the main host, host 1, fails, it can talk to host 2 or host 3. So this is really required.
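To show what that option means, here is a sketch with placeholder host and volume names: this is the equivalent manual mount of the volume, and in oVirt you put the backup-volfile-servers part in the storage domain's mount options field.

    # Sketch: mount the Gluster volume with fallback volfile servers, so the mount
    # can still be established if host1 is down. Names are placeholders.
    mkdir -p /mnt/vmstore
    mount -t glusterfs -o backup-volfile-servers=host2:host3 host1:/vmstore /mnt/vmstore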
So we have developed a dashboard as a UI plugin in oVirt. Actually, we originally had a dashboard for Gluster which covered all the Gluster-related entities; now we have included the virtualization and storage pieces there as well, and it gives you a complete picture of what is happening in your deployment: how you are doing with capacity, like how much storage you have in total, how much is used and how much is free, and how you are doing with CPU and memory utilization. That is averaged out over all the hosts. It also shows how much network traffic there is, how many hosts are up and how many are in maintenance, and how the volumes are doing. We have some different states for the Gluster volumes: a volume can be up or a volume can be down, which is normal, and we have something called degraded. Degraded means that in a replica 2 or replica 3 volume one brick goes down; your data is still available, but it is degraded, because you have only two bricks serving the data. We also have something called partial, where in a distributed volume, if you lose one node or one brick, your data is only partially available, not fully available; that is called partial. You will see those statuses when that happens. You can also see the virtual machines here; if your network is broken for some reason, you should see some VMs down or something similar there. This is quite useful.

So next, how do you manage your hosted engine, a little bit about it. You can change the hosted engine configuration, like memory and CPU allocation and network configuration, through the UI, but currently it is not supported; I think it should be in 4.0. It uses the shared storage, so when you change something and the hosted engine is restarted, it automatically picks up the new configuration and uses it. The next thing is hosted engine management. For the hosted engine you can configure a few things like state transition notifications over SMTP. These are very important, because your oVirt engine is a VM: what happens if for some reason the VM dies and you do not even know that it died? The hosted engine VM itself is not running, which means you are not monitoring or managing the whole setup. But for the hosted engine you can configure your mail address, so whenever the VM state changes, whether it goes from up to down or down to up or any other state change, you will get a mail notification saying it is going down or it is up. In that way you are alerted that something has gone wrong. You can also configure other things like timeouts and the host scoring. Host scoring is used to decide where the hosted engine VM will run: among the three hosts, if one host is very powerful and the other two are not, having less CPU or less memory, then the more powerful host will get the highest score and it will run the hosted engine VM.

We also did a few enhancements recently. One is Gluster sharding, which we already talked about. Another is the arbiter volume. Actually, to try out this hyperconverged setup you do not really need three full hosts; it could be done with two hosts, but for a replica 3 volume we need three hosts. That can be easily solved by the arbiter volume, where the third brick is used only for metadata. When you store a file in a replica 3 volume, it normally gets stored on all three bricks; but in an arbiter volume, the third brick stores only metadata, not the actual data, so it can be much smaller in capacity, and that machine does not need to have the same capacity as the other machines. Another is replace brick. This is important because if you want to replace one node in the hosted engine setup, currently it is difficult: you have to do some manual steps in Gluster to replace the node. What we are trying to do is make all of that possible from the UI itself, so you can bring in a new host as a replacement for an old failing host, all through the UI without any usage of the CLI. And there is the backup-volfile-servers support in libvirt and QEMU.
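Going back to the arbiter volume for a second, a sketch of creating one from the CLI might look like this; the host names and brick paths are hypothetical, and the key point is only that the third brick holds metadata rather than full data.

    # Sketch: replica 3 volume where the third brick is an arbiter (metadata only),
    # so the third machine can be much smaller. Names are placeholders.
    gluster volume create datavol replica 3 arbiter 1 \
        host1:/bricks/datavol/brick host2:/bricks/datavol/brick host3:/bricks/datavol/arbiter
    gluster volume start datavol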
We talked about that earlier, and the next session will focus more on these topics. That is all I have today; I hope you learned something. Any questions? I think GlusterFS by default does not use much CPU, I feel; that is why. We are running some tests and we did not see any major performance issues with that. The main problem was with the self-heal runs, where we used to see more performance issues, but with sharding that is also getting reduced, so we do not see any major performance bottlenecks. We did not see any, but we are still in the evaluation stage; we are running some performance tests now, even with arbiter volumes. Currently it is not there; that is why we recommend not configuring power management in the hyperconverged case. Any other questions? Can I ask you, can I take a few minutes? How can I do it, how can I do it for down? No, no, I have it here and I have it down. I am not sure about that. So you can try, you can try. You can give those away for questions. And there is some water if you need some. Thank you.