Sorry. Hi, my name is Guido. I'm going to introduce Ganeti, which is an open source cluster virtualization management tool we wrote at Google for use in our corporate environment.

First of all I'm going to give an overview of what Ganeti is and how to manage a Ganeti cluster. Then I'll go a bit into the details of how failover works internally in Ganeti, just to show it, and then we'll talk about what we're doing on Ganeti next. Hopefully by then I'll have convinced you to use Ganeti in your production network, so I want you to know what we're working on and what the next features will be.

So what is this about? Basically, you have Xen as a building block, and you might want to use it in a cluster rather than on a single box, possibly with failover. But you don't have, or don't want to buy, a SAN to store your instances on, and things like that. What we do is tie Xen together with DRBD and build a purely open source cluster management system, without any need for special hardware and without any need for RAID on the machines: you can just take desktop machines, put them somewhere and start using them. We want support for different types of host systems, because we have different customers with different needs. And we want to manage tens of nodes and hundreds of instances at the same time, because we don't want to manage each node one after the other; we want a single cluster system.

So what do we do? This is a Ganeti cluster. These are the physical nodes, the dom0s in Xen terminology, and then we have virtual systems. This is one virtual system: it has a virtual disk, which is actually a DRBD device mirrored between node one and node two. So what you can do then is migrate the system between node one and node two. Right now this is a non-live migration, because we needed some features of DRBD 8 in order to do live migration. We have them in the latest DRBD 8 now, and we will implement live migration later, but for now you have to shut the system down and restart it on the other node. Xen does support live migration, but only if you have a SAN; it doesn't work with DRBD without some changes.

So this is how to use Ganeti. It's very simple, very straightforward. First of all you init the cluster and say: this is my new Ganeti cluster. Then you can go ahead and add your nodes. This one is node zero; it was already added to the cluster when you initialized it. Then you add nodes one, two and three, and you have one cluster, so you can for example install something on all nodes at once, and things like that. And this is the package that provides the instance operating system definitions, which we'll actually need in order to set up our cluster.

So this is the cluster setup. After we've set it up we can just go ahead and use it, so we can create instances. This is an example command to create an instance; there are a lot more options, for swap space, disk space and so on, but for now let's keep it simple: you can create a normal instance with default options. Then, for example after a node crash, you can fail over the instance to its secondary node: it was on node two, you fail it over to node one. And then you can replace its disks and move them to node three, so the instance will be mirrored again, whereas after the crash it was only on one node. Of course, if it was just a short downtime you can bring the node back up again and you don't need to do that.
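For reference, the workflow just described might look roughly like this on the command line. The gnt-cluster, gnt-node and gnt-instance tools are the ones named in the talk, but the hostnames are invented and the exact flags and template names are from memory, so they may differ between Ganeti versions; treat this as an illustration, not a reference:

    # set up the cluster: init on the first node, then add the others
    $ gnt-cluster init cluster1.example.com
    $ gnt-node add node1.example.com
    $ gnt-node add node2.example.com
    $ gnt-node add node3.example.com

    # create an instance with mostly default options, with its disk
    # mirrored via DRBD between a primary and a secondary node
    $ gnt-instance add -t drbd -n node2.example.com:node1.example.com \
        -o debian-etch web1.example.com

    # node2 crashed: restart the instance on its secondary, node1
    $ gnt-instance failover web1.example.com

    # then restore redundancy by moving the mirror to node3
    $ gnt-instance replace-disks --new-secondary node3.example.com \
        web1.example.com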
It all depends on how bad the node crash was.

Then you can basically query your cluster status: which instances you have, what they are doing, how much memory and how much disk space they use. This is an example listing; there are other fields you can ask for, with various information about the instance configuration. And this is the status of all the nodes in the cluster, so you can decide where to allocate your instances, and things like that.

Now, what can you do? This is the whole of Ganeti in a short version. You have a gnt-node command that lets you manage your nodes. gnt-instance lets you manage the instances. The gnt-os command basically lists and queries the operating systems you can install on the instances; for example, you could install Ganeti on Debian, but have Red Hat, Gentoo and Debian images for your customers, or whoever, to use. And you have global cluster commands, plus an instance import/export tool to back up an instance to some other disk and then re-import it, possibly with some changes.

So let's go a bit into the details of an instance failover. First, this is the normal cluster status, as we've seen before. Then what happens is that node one fails. As you can imagine with Xen, if the dom0 fails, the virtual system on it is down too; it has crashed. But luckily we have a copy of the data. So if node one's motherboard has failed, for example, and we don't have a node one anymore, or all its disks have crashed because you didn't have RAID on node one, it doesn't matter: we can just run the failover, and the system will be rebooted on node two, which has a copy of the data as of the crash. So it will basically be as bad as pulling the power plug, but not as bad as having lost all of your data. Of course, this supports only a one-node crash out of the whole cluster, or more if you're lucky, but we can't guarantee more: if both nodes holding an instance crash, say node one and node two holding virtual system one, then you've actually lost virtual system one and you can't do anything about it.

So then what we do is replace the disks. This is of course the wrong arrow somehow: the real copy goes from node two to node three, because we have no node one anymore. But conceptually you move the secondary disks from node one to node three, and then your virtual system is redundant again and ready for another failure. After you've done this, with your cluster size basically reduced, you can survive another one-node failure, as long as you have enough RAM and enough resources to host all your instances.

So this is basic Ganeti, what it does right now. We also have some optional features which are considered advanced, for various reasons. For example, the DRBD replication normally runs on the same network as the instance traffic, but you might want to put it on a separate network for speed reasons: if you have two network cards on the nodes, why not use the second one for that. Or you could have different instances on different bridges: rather than creating only one Xen bridge you create five of them, connected to different routes and different firewalling policies, for instances that are not supposed to talk to each other, and then you connect the instances to them. This is an example of it: the green instances all talk to each other, the magenta instances talk to each other, and the blue instance is on a network by itself and doesn't talk to anybody. And here the disks are connected to a special interface, a special switch which is just for replication, which increases your performance during replication.
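As a sketch of how that separation might look on the command line: the -s (secondary IP) and -b (bridge) options shown here are my best recollection of the interface and may not match your Ganeti version, and all addresses, bridge names and hostnames are made up:

    # give each node a second address on a dedicated replication
    # network, so DRBD traffic stays off the instance network
    $ gnt-cluster init -s 192.168.1.1 cluster1.example.com
    $ gnt-node add -s 192.168.1.2 node1.example.com

    # put an instance on a non-default bridge, e.g. one of several
    # bridges with different routing or firewalling policies
    $ gnt-instance add -t drbd -n node1.example.com:node2.example.com \
        -o debian-etch -b br-green green1.example.com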
Then you have tagging: you can tag instances or nodes with labels and then search for them, if you need to mark them as special for any reason, for example all the instances belonging to one customer.
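Tagging could look something like this; the add-tags, list-tags and search-tags subcommands follow the naming style of the gnt-* tools, but check your version's documentation for the exact spelling, and the tag values here are invented:

    # mark everything belonging to one customer
    $ gnt-instance add-tags web1.example.com customer:acme
    $ gnt-node add-tags node1.example.com rack:b4

    # inspect and search the tags later
    $ gnt-instance list-tags web1.example.com
    $ gnt-cluster search-tags customer:.*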
So this is basically what we have until now. Next, what we're working on for the next version of Ganeti, which we don't know yet when it will be released, but we're aiming at some time in the second half of this year.

First of all, we're doing a job queue. Right now you have to create instances, or do anything else, one step after the other, and people of course want to create big numbers of instances, so they just script it and run it serially. We want them to be able to submit a job and then query the job status later. And we want to be able to run these jobs in parallel, because right now, as in every project when it starts, we have one big lock, and only one Ganeti command can run at a time. We call it the big Ganeti lock, for historical reasons of course. Then we want granular locking: for example, if you want to create instances on two different nodes, or sets of nodes, there should be no reason why that wouldn't work in parallel. So we're working on that.

Another thing we're doing is a remote cluster API. Right now you have to SSH to the cluster and actually type your commands. But what if you want a web interface to control the cluster, or something like that? We don't provide one now, but some people might want to build one, so we want to build an API that allows you to control the cluster, and maybe in the future even lets single customers control single sets of machines. That's not a release target for the 1.3 release, at least; if someone has patches, of course, it might happen.

Another thing that was asked for is file-based storage. What Ganeti does now is require you to have LVM, and it allocates the instance disks as LVM volumes. This works pretty well for DRBD, but if you actually do have a SAN, it was reported to us that it becomes tricky, because you would need something like a cluster file system anyway; it becomes more complicated than just storing the instances in files and letting the SAN deal with everything else. So what we are going to do is allow you to say: put the disks in this directory, and then you manage your SAN, configure it, and take care that the directory is visible from the other node for failover. At least it will allow you to use Ganeti together with that.

Then we'll have the possibility to customize your instances a bit more. As we said, you can already install different operating systems, but the instances are kind of limited: you have one disk, one swap, one network interface, and that's all. You can hack the Ganeti config file manually to get more; people have successfully done that because they needed it, but it's not quite intuitive. Basically all the support is inside Ganeti but not in the interface, so we're planning to polish the interface and let you have special instances, for example instances with two network interfaces connected to two bridges, if you want to do things like VPNs or central points on Ganeti. You'll be able to do that after this change.

Then of course we want to have live failover. We don't have it now because DRBD 0.7, which we were using before, works in master/slave mode, so you couldn't fail over the instance without tearing down the DRBD device and re-enabling it with the master changed to the other node. DRBD 8 can work in master/master mode, but we don't want to do that, because people would log into a node and just mount the disk for debugging while the instance is running on the other node; they don't have a cluster file system, and everything would break. So we want to stay with master/slave mode, where the disk on the slave node is just read-only. What we want to add is the possibility to change the roles at runtime. So when you fail over an instance live, we will just change the roles, because we don't need to tear down the whole DRBD disk stack anymore, and so we can do live failover in the next version.
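To make the master/slave role change concrete, here is roughly what such a failover amounts to at the DRBD level. This is a conceptual sketch only, with a made-up resource name r0 and instance name web1; Ganeti drives this internally rather than having you type it:

    # on the old primary node, if it is still alive: stop the
    # instance, then demote its DRBD resource to slave (secondary)
    $ xm shutdown web1
    $ drbdadm secondary r0

    # on the surviving node: promote the resource to master
    # (primary), then start the instance from the mirrored disk
    $ drbdadm primary r0
    $ xm create /etc/xen/web1

    # DRBD 8 could run both sides as primary at once
    # ("allow-two-primaries"), but as explained above Ganeti
    # deliberately keeps the slave side non-writable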
Then we want multiple coexisting hypervisors. Right now it's Xen only; in the latest versions you can have Xen or Xen HVM, so you can run Windows or other operating systems, and not just versions of Linux with paravirt ops or otherwise paravirtualized. With multiple coexisting hypervisors you could run those together, and you could also run KVM, or VirtualBox if we get a module for that, on the same box or at least on the same cluster. We do have a module system for the hypervisors now, but we are only supporting these two and we don't allow you to use more than one at the same time; one cluster is tied to one hypervisor. We want to implement more and we want to make them coexist, so instances can have different parameters depending on their hypervisor, and things like that.

And then what we want is multi-cluster management, but that's later, after 1.3. We want to use the remote cluster API to build a multi-cluster manager, in order to decide which cluster to allocate your instances on from a central point external to Ganeti. But that's also something we plan for a lot later, and we don't know when it will be released.

And that's basically it. Thank you very much for your attention. If you have questions we have two minutes now, or you can follow me outside for other follow-ups, or just send a mail; we have a mailing list. I didn't put the website on this slide, sorry, but it's code.google.com/p/ganeti, or just look for Ganeti on Google. Join the mailing list, talk with the development team, send issues, try it, and let us know what you think about it. Thank you very much. Do you have any questions?

[Audience question, inaudible]

Yes, you have to do it manually. You could install some monitoring system that fails over for you, but right now we have no integration with heartbeat, checking node uptime and so on. What we do is basically: if some node is down and its instances are down, someone gets paged and fails over the machines. But at least you don't lose anything; it's better than what happens when plain physical hardware fails, but it's not automated for now. You could plug it into some automatic system that just runs the command, like heartbeat or something.

Thanks.

[Audience question, inaudible]

Yes, the cluster knows; all nodes know about the whole status of the cluster. So if you lose the cluster master, you can fail the master role over to another node, and you will have all the status there.

Thanks. Other follow-ups outside.
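Both answers could be made concrete along these lines. An external monitor that "just runs the command" might be as simple as the hypothetical watchdog sketched below; the script, its ping check, the field names and the flag spellings are all illustrative, not a shipped tool:

    #!/bin/sh
    # hypothetical watchdog: if a node stops answering pings,
    # fail over the instances that have it as their primary node
    while true; do
        for node in node1.example.com node2.example.com; do
            if ! ping -c 3 -q "$node" >/dev/null 2>&1; then
                gnt-instance list -o name,pnode --no-headers |
                awk -v n="$node" '$2 == n { print $1 }' |
                while read inst; do
                    gnt-instance failover -f "$inst"
                done
            fi
        done
        sleep 60
    done

And moving the master role, as mentioned in the last answer, is a single command, run on the node that should take over:

    $ gnt-cluster masterfailover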