So we'll get started in a minute here. Hello, everybody. My name's Rodney Peck, and I'm here from eBay. We're working on the eBay-PayPal separation project, and as part of that project we're moving a lot of virtual machines into PayPal. So we've brought this information to you to talk about how to move virtual machines in an efficient manner. My colleague Darshan is back at the office working, so he's not here to present his part; I'll do that for him.

So the problem is: if you're running a large cloud, eventually you've got to move some virtual machines. Why? Maybe you've added hardware, or done a large OpenStack upgrade where you can't do a minor version change, so you have to move whole machines to new hypervisors. Or maybe you've got a new data center, and you've got to move all of your machines there. It seems like a simple problem, but it turns out it's not so simple.

The first thing you'd try in OpenStack is just to make a snapshot: use Glance, download your snapshot, copy it to the other server, upload it into Glance on the other side, and boot your machine. I'll try to slow down so I don't go through the whole talk in ten minutes. But the problem with doing that is, say you have 100 machines, and each one's got a 30 gigabyte disk. You download that, and it's going to take quite a bit of time; even over a 10 gig link, 30 gigabytes takes a while. So you download it to your machine, and then you upload it to Glance on the other side. You've taken a snapshot of a machine that was taking no space in Glance beyond the original image it booted from, and now you've put 30 gigabytes on your target Glance. Multiply that by 100 or 1,000 machines, and soon you've run out of space: you've got 30 or 40 terabytes of snapshots on your target Glance server. And that's pretty expensive.
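To put rough numbers on that, here's a quick back-of-envelope sketch using the round figures from above (1,000 VMs, 30 GB flattened snapshots, a 10 gigabit link); real transfers would be slower than line rate:

```python
# Cost of the naive Glance snapshot approach.
GB = 1024 ** 3

vms = 1000
disk_gb = 30          # flattened snapshot size per VM
link_gbps = 10        # 10 gig link between the clouds

total_bytes = vms * disk_gb * GB
link_bytes_per_s = link_gbps * 1e9 / 8   # bits -> bytes

# Each image crosses the wire twice: download from the source
# Glance, then upload to the target Glance.
transfer_hours = 2 * total_bytes / link_bytes_per_s / 3600

print("snapshot storage on target Glance: %d TB" % (vms * disk_gb / 1000))
print("wire time at full line rate: %.1f hours" % transfer_hours)
```

Even at a perfect 10 gigabits per second with no protocol overhead, that's about 30 TB of new Glance storage and well over half a day on the wire.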
And worse, every time you want to boot a machine on the target, it has to download the entire image just to boot. So that's going to take forever, it's going to use all of your network bandwidth, and it's going to be very expensive.

So we thought about it for a while and came up with an alternate plan, one we had used several years ago to upgrade from Essex to Folsom. We took that plan and added more features to it so that we can do more than just VM booting. As I said on the previous slide, if you move all those images, you have an enormous amount of data. The images are qcow2. If you look at how they're stored by Nova, in /var/lib/nova/instances under the instance name, there's a disk file, and that disk file has a backing file. The backing file is your base image, and it's stored in a different place, shared by all the VMs. If you copy flattened snapshots to the target, all your VMs will lose their overlay efficiencies. Normally on the target system you'd have a booted base image of a gigabyte or so, plus your local disk overlay. Instead, the image gets flattened: the base image merged with your data, and that full image gets uploaded every single time, using all your space. So you lose all your overlay efficiencies. And on the hypervisors you're booting from, the base directory will be full of all of those flattened base images, instead of just the one tiny Ubuntu image plus your overlays. All in all, it's just not a good situation.

What else did I add here? Users, roles, and quotas. That's an important point I added at the last minute: using Flyway, it'll copy all of your user information. The tenant, the members of the tenant, their roles in the tenant, the quota of the tenant: all of that gets copied over.
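As a concrete illustration of that layout, here's a small Python sketch that asks `qemu-img` where a given instance's backing file lives. The instances path is Nova's default; the helper names are my own:

```python
import subprocess

INSTANCES = "/var/lib/nova/instances"  # Nova's default instances path

def info_cmd(uuid):
    # The per-VM overlay lives at <instances>/<uuid>/disk;
    # `qemu-img info` reports its "backing file:" line, which
    # points at the shared base image in _base/.
    return ["qemu-img", "info", "%s/%s/disk" % (INSTANCES, uuid)]

def backing_file(uuid):
    """Return the overlay's backing file path, or None if the
    disk has been flattened (no backing file in the output)."""
    out = subprocess.check_output(info_cmd(uuid)).decode()
    for line in out.splitlines():
        if line.startswith("backing file:"):
            return line.split(":", 1)[1].strip()
    return None
```

A flattened snapshot is exactly the case where `backing_file` would return None: all the shared data has been folded into the per-VM disk.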
If you just snapshot, none of that happens; you'd have to do it all yourself. So, Flyway: it copies VMs between clouds. It's designed around a message bus queue, and it copies directly between the hypervisors. I can go into the architecture a little bit here. Over on the far left there's an API server. A user might enter, "I want to migrate a tenant from this zone to that zone." That goes into the Flyway queue, which is a list of tenants to be processed, and it gets picked up out of ActiveMQ by an agent that's listening for messages. That agent starts the process by initializing the users and the roles in the target based on the source. Then it sends a special message to the hypervisor over here in cloud B, telling it to rsync its data from the hypervisor in cloud A.

That's the secret sauce in Flyway that makes it better than just doing a snapshot. It takes the base image, synchronizes it to the other side, and then boots a VM there and shuts it down. So you have a same-sized flavor on the other side, but its data is just the base image. Then it shuts down the source as well, for consistency, and tells the hypervisor to rsync the data file into the target's instance directory. Now you've got the overlay disk and the original backing file, but under a different UUID, so it rewrites the backing store reference, stitching the two together so that the new qcow2 image holds exactly the same data as the source. What we've done is copy the base image only once, and it's very small, maybe a gigabyte of Ubuntu. The data the VM has written lives in the overlay file, which is much, much smaller than the overall image, and that's what gets synced between the two. And in your target Glance, all you have is the base image; you don't have all of the flattened images. That's basically all there is to it. And then this slide shows the progression as a tenant goes through the queue.
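A minimal sketch of that hypervisor-side sequence. The hostnames, UUIDs, and base-image path are invented for illustration; the real agent drives this over the message bus:

```python
NOVA = "/var/lib/nova/instances"  # Nova's default instances path

def overlay_copy_cmds(src_host, src_uuid, dst_uuid, base_image):
    """Commands run on the target hypervisor once a same-flavor VM
    has been nova-booted from the same base image and shut down,
    and the source VM has been shut down for consistency."""
    src_disk = "%s/%s/disk" % (NOVA, src_uuid)
    dst_disk = "%s/%s/disk" % (NOVA, dst_uuid)
    return [
        # pull the source overlay over ssh (port 22 must be open
        # between the hypervisors)
        ["rsync", "-aS", "%s:%s" % (src_host, src_disk), dst_disk],
        # stitch the copied overlay onto the local base image; -u
        # (unsafe mode) only rewrites the qcow2 header, which is
        # fine because the backing data is byte-for-byte the same
        # image, just under a different local path
        ["qemu-img", "rebase", "-u", "-b", base_image, dst_disk],
    ]
```

The key saving is in step one: only the overlay crosses the wire, not the flattened 30 GB disk.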
At any of these stages, something might go wrong and the migration falls into an error state, and we have a database that collects the status. So that's basically the structure of it. What we're looking to do is put this out as open source; hopefully people will pick it up and be able to add additional features. The idea is that it's a framework. We've added in Cinder, we've added in Trove, and Load Balancer as a Service, and it migrates all of them. So if somebody had some other kind of feature that they wanted, it should be pretty simple to add: basically there's pre-staging, staging, and post-wrap-up, and you make a class and plug it in.

Just to go through some of the features: Flyway makes the migration as transparent as possible, and it copies all of these things. I don't really need to read them again. Essentially, if you're a user of one cloud and you have to move to another cloud, we try to make it as transparent to you as possible. I have another section about FlyView, which is a different tool from Flyway, but maybe we could do questions and answers first. Does anyone have any questions? Yes. There's a microphone if you want it, so the other people can hear; or if you're close you can just say it and I can repeat it.

Right. So the question is: what do you have to do to make the rsync between the two hypervisors work? What sort of setup is required? It's quite a bit. In our case, we have a lot of security and firewalls, so we have to open the firewall between the hypervisors in the two zones. We opened port 22 between our production zones for that. We also opened Keystone, Nova, Cinder, Trove, all of the OpenStack services, between the two zones so that the agent can talk to them. So there's quite a bit of firewall setup that you would have to do, it's true. Yes, sir.
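The plug-in shape described above might look something like this. The class and method names are invented, not Flyway's actual API, and the key-pair example is a toy:

```python
class MigrationTask(object):
    """Base class a new service module (Cinder, Trove, LBaaS, ...)
    would extend. `ctx` carries source and target cloud state."""
    def pre_stage(self, ctx):     # e.g. create tenants/users on target
        pass
    def stage(self, ctx):         # move the service's actual data
        pass
    def post_wrap_up(self, ctx):  # record status, clean up
        pass

class KeyPairTask(MigrationTask):
    """Toy module: copy key pairs from source to target."""
    def stage(self, ctx):
        for kp in ctx["source"].get("keypairs", []):
            ctx["target"].setdefault("keypairs", []).append(kp)

# the framework would run each registered task through all phases
ctx = {"source": {"keypairs": ["rodney-key"]}, "target": {}}
task = KeyPairTask()
for step in (task.pre_stage, task.stage, task.post_wrap_up):
    step(ctx)
```

A security-group module, for instance, would follow the same query-massage-apply pattern in its `stage` method.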
So I have a question about the VMs. When you migrate them over, do you rename them, because they're in a different cloud? Excellent question. Yes. What it does is save the original name. When you nova boot, our system assigns a four-digit number to the prefix name, plus the local domain name. When you migrate to the other cloud, it's got a different domain name, a different network, and all of that. Flyway takes the initial name and uses it to nova boot again. So say your host was rodney1111; it'll come out rodney2516 or some other number in the other cloud, but it'll have the same prefix. We also keep a mapping in the database from source UUIDs to target UUIDs, and that's used by Trove and other parts of Flyway to know what the VMs they use became in the other cloud. The users can use it too: they can look up a server they had in the source cloud and find out its new name. But definitely, they don't keep the same IP address, and they don't keep the same domain name. It reruns cloud-init: when the machine comes up it has a new UUID, so we leveraged that to have cloud-init set the domain, the DNS resolver, the LDAP settings, all the localized variables. But the content of the disk is preserved.

Yes, you can go ahead here. Is it doable to do it live, without turning off the VM? Yeah, so it's not live migration like VMware; it's point-in-time snapshot migration, I would call it. The source VM has to be shut down for data consistency. If you were feeling risky, you could try to migrate VMs while they're idle, but the problem is the source VM might have data in memory that hasn't been flushed to disk. If that were to happen and you synchronized it over to the other side and tried to boot it, it would just fsck and crash horribly. So for consistency, we shut down the source VMs.
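That renaming and UUID bookkeeping could look roughly like this. The naming scheme (prefix plus four-digit suffix plus local domain) is as described in the talk; the table layout is my guess:

```python
import re
import sqlite3

# Invented schema for the source-UUID -> target-UUID map that
# Trove and the users can query after a migration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vm_map (src_uuid TEXT PRIMARY KEY, dst_uuid TEXT,"
           " src_name TEXT, dst_name TEXT)")

def rename(src_fqdn, new_suffix, new_domain):
    # keep the user's prefix, swap the four-digit suffix and the
    # local domain: rodney1111.dc1.example.com -> rodney2516.dc2...
    host = src_fqdn.split(".")[0]
    prefix = re.sub(r"\d{4}$", "", host)
    return "%s%s.%s" % (prefix, new_suffix, new_domain)

def record(src_uuid, dst_uuid, src_name, new_suffix, new_domain):
    dst_name = rename(src_name, new_suffix, new_domain)
    db.execute("INSERT INTO vm_map VALUES (?, ?, ?, ?)",
               (src_uuid, dst_uuid, src_name, dst_name))
    return dst_name
```

A user's "what did my server become?" lookup is then a single select on `src_name` or `src_uuid`.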
You might be able to do something like inject a sync and halt it, or suspend it, some way to make the data consistent. But for reliability, with the number of machines we're doing, we decided we'll just tell users we're going to shut down your machine and then reboot it for you. So it's not really live, because it does have an impact on the users. Sure, second question here? Yes, I will. I can't really see you guys over there; there's a really bright light over your head, so I'm looking at you, but I can't quite see. OK, I won't throw it.

What if your VM's got its backing file in RBD? I mean Ceph. Do you do an RBD export and import on the other side, or do you just not handle that case? What was the first part? RBD export. Your VM's got a disk in Ceph, as an RBD image. Oh, boot from volume, you mean? Not a volume, just the normal disk, in Ceph. In Ceph? Well, we don't do that. But I can maybe talk about boot from net, which we do do, where we have a Cinder volume and that's the vda root disk. That's sort of a weird situation, because the tool copies the VM and then does the volumes afterwards. In the case of boot from volume, you have to copy the volume first and then boot it, so I have to write that code this afternoon. But yeah, we're adding edge cases, and that's one of them.

Is there a way to map the users from one cloud to the other? Yes. When it migrates the tenant, it'll look up in Keystone and find all the users in that tenant, and check whether they exist on the target. If they don't, it creates them, and then adds them to the tenant it creates on the target. Also the roles: if you're a member or an admin, it copies that too. Well, we don't map them to other users; we try to keep them the same. But you could change the code a little bit; it's certainly doable. You'd just replace the target name with whatever you want it to be when you create it. It's possible. Right. Not yet. Yeah.
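Here's an in-memory sketch of that tenant, user, and role copy. The `Cloud` class just stands in for one zone's Keystone; the real tool talks to the Keystone API, and all the names below are invented:

```python
class Cloud(object):
    """In-memory stand-in for one zone's Keystone."""
    def __init__(self):
        # tenant name -> {"quota": {...}, "roles": {user: set(roles)}}
        self.tenants = {}

    def grant(self, tenant, user, role, quota=None):
        # creates the tenant and user on first reference
        t = self.tenants.setdefault(tenant, {"quota": quota or {}, "roles": {}})
        t["roles"].setdefault(user, set()).add(role)

def sync_tenant(src, dst, name):
    """Copy a tenant, its members, their roles, and its quota from
    `src` to `dst`, creating anything that doesn't exist yet.
    Users are kept one-to-one, as in the talk; a renaming hook
    could be added where `user` is reused below."""
    t = src.tenants[name]
    for user, roles in t["roles"].items():
        for role in roles:
            dst.grant(name, user, role, quota=t["quota"])

src, dst = Cloud(), Cloud()
src.grant("payments", "rodney", "admin", quota={"instances": 50})
src.grant("payments", "darshan", "member")
sync_tenant(src, dst, "payments")
```

This is the agent's first step, before any disk data moves: the target tenant, members, roles, and quota all exist before the first VM is booted there.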
Those are the sorts of things that would be really interesting extensions; they're not that difficult, but they don't really fit our use case. Yes. Yes. That's exactly what we're doing. So the question was: does it do volumes, does it do Cinder? And yes, absolutely. It's based on a particular vendor of Cinder right now, but it's Cinder-to-Cinder, so it'll copy any volumes across. In fact, we built it because we have a particular application for it at eBay, and we're doing thousands of VM migrations for exactly that. Because there are so many, we can't be individually checking them, so we need it to be as transparent as possible. No, you could, but that would be pretty aggressive. So we migrate everybody, let them boot their machines, and then come June 15th, the plan is that we'll delete the sources.

So, if there's not another quick question, the other half of this talk is about FlyView, which is a tool we invented to help visualize a cloud. This is one of my tenants: the yellow things are Cinder volumes, and the blue are VMs of different flavors. One of my Cinder volumes is attached to my VM. So that's one tenant. Then you can scale that up to a zone; each of these is a blow-up of a particular cloud zone. The big red ones are half a hypervisor each: those are 16-processor virtual machines that we use for our stage environments. And the little ones are mediums and larges over there. And this is a zoomed-in section of a larger cloud: that's on the left again, and this is one of our data centers here on the right. You can see the stage tenant on the bottom left is really, really big, then some smaller and bigger ones, and a little tiny tenant up at the top. It's an interesting thing, and we have a movie showing the life of the cloud. It turns out that in the Nova records, every time a VM is booted, the date and time are logged.
And when it's deleted, that's logged too. So if you go into MySQL you can actually make, I wasn't going to show this, but why not, you can make a movie of the growth of a zone over time. This is one of our data centers, and each frame is a week. You can see that one tenant getting bigger and bigger and bigger. That was several months for that one. No, it's just that I didn't think it would go so high; I didn't know how many we'd have.

So, questions or anything? We have 10 minutes or so? Yes. Yeah, that's a very good question. The main issue we would have is volume transfer. If we do too many tenants at the same time, we'd be pushing too much data, maybe from eight or ten hypervisors simultaneously at 10 gig rates, through our switch, and that would saturate the connection between the two sites. So it uses ActiveMQ: there's an agent that listens to the queue, we can have any number of messages in the queue, and then we tune the agent so that only 5, 10, whatever number we want are running at once. That's how we control the process. Right now, we're running with 32 servers at the same time.

I need to talk to our legal team to find out what we need to do to vet it, but I would expect that in the next month or two we'll have it online. PayPal and eBay have a GitHub that's available publicly, and I think that's where it would be. And of course, if people are interested in working with it, we can help. It would be great if somebody were interested and could take it to the next level; I don't think our team has time for that.

Yes? Right, there could be firewall issues; there could be database access issues. So the question is: how do you tell the users, what do they have to do when they migrate, and how long do you wait before you delete their source? The answer is, we'll migrate them, and the VMs are left unbooted, so they boot them up and then go and look at them.
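The growth movie comes straight out of those two Nova timestamps. Here's a self-contained sketch against a toy copy of the nova `instances` table (real Nova stores many more columns, and the dates here are made up), counting live VMs one frame per week:

```python
import sqlite3
from datetime import datetime, timedelta

# Toy copy of nova's `instances` table: created_at/deleted_at are
# all the movie needs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE instances (created_at TEXT, deleted_at TEXT)")
db.executemany("INSERT INTO instances VALUES (?, ?)", [
    ("2014-01-01", None),           # still running
    ("2014-01-01", "2014-01-20"),   # deleted after ~3 weeks
    ("2014-01-10", None),
])

def live_vms(day):
    # VMs created on or before `day` and not yet deleted by `day`.
    row = db.execute(
        "SELECT COUNT(*) FROM instances"
        " WHERE created_at <= ? AND (deleted_at IS NULL OR deleted_at > ?)",
        (day, day)).fetchone()
    return row[0]

# one frame per week, as in the movie
start = datetime(2014, 1, 1)
frames = [live_vms((start + timedelta(weeks=w)).strftime("%Y-%m-%d"))
          for w in range(4)]
print(frames)  # counts to plot, one per weekly frame
```

Plot each weekly count (or, grouped by tenant, each tenant's count) and you get exactly the kind of zone-growth animation described above.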
So if they've just migrated a big MongoDB database, it's not going to work; the data's not going to be in the right format. What we recommend in that case is they boot it, erase all the data, and then use whatever data transfer mechanism exists, like mysqldump: copy it and restore it on the other side. For MongoDB, the same sort of thing. But in general, for the checkout process, what we're planning to do is give them some time to work on it, and if there's a problem, we'll help them with it. But we don't want to just delete their source. In this particular case, though, PayPal is going to be its own company, so at some point we will.

Yeah, we also update DNS, because we have Designate, the new DNS service. We can change their CNAMEs to point to the new machines; we can make it as seamless as they like. That sounds like a lot, though, to do all that. Right, and we don't want to just cut it over hot, because there are too many teams. It's such a big company: we're migrating thousands of tenants, and to each tenant their stuff is really important. But we can't deal with 800 people asking questions at the same time, so we'll let them switch over when they're ready, because they understand their applications better than we do.

They could. You could use this for DR. For disaster recovery, say you already have an application running and you want to make a DR copy of it; yes, you could snapshot and clone it with this tool. Which kind of brings up a cloud philosophy point: this is pretty anti-cloud in a lot of ways, so we're kind of hesitant to put it out there, because it encourages people to build one machine and then just rubber-stamp it, which isn't what we recommend. We want you to use a base image and then use Puppet or some sort of configuration management: boot a base image and then run your stuff on it. In reality, that doesn't happen, and in this case we need to move all of these quickly, so we're trying to make it transparent.
But that said, you could use it as a feature to move an application to another cloud very quickly for DR, if you wanted to. I don't know; we're kind of flipping back and forth internally on whether we'll keep this around after we do the migration, for that reason: it encourages people not to use cloud techniques.

He's first. Is it possible to just move to different hypervisors? Yes. Do you mean in the same cloud? In the same cloud. Sure. If you give it a target cloud that's the same as the source, it'll do exactly that; it doesn't know that it's the same cloud. So that would be another application: you've bought new hardware, and you want to move to it. Maybe you migrate everybody to the new, bigger hardware and then retire the old hardware. You had a question? The question was, can you migrate within the same cloud, sorry; you can't all hear him.

What about security groups, are they also migrated? No, we didn't migrate the security groups. It would be very simple to do. In our case we have a default security group that's open for everybody; we're on a private network, so our VMs boot up open to 0.0.0.0/0. But it's a relatively simple thing, and the framework can handle any sort of OpenStack service: we could add a module that says, query it from this side, massage it to whatever we need, and then apply it to the other side. So we could do security groups. We also do key pairs. So that's, thank you, yes, you with your hand up.

Right, so the question is, let me see: would you use this without the rsync trick, with a public cloud, if you can't get at the hypervisors between the two clouds? Say you're migrating from Amazon to your private cloud; how would that work? We thought about that. You could have multiple ways to do the migration and use the one that's ranked highest: if you can connect directly, do that; if you can't, then fall back to SSH. There are ideas on that.
We don't have any solution for that yet, though. But as we were toying with what would be interesting to people at the OpenStack Summit, since this talk is really focused on what we do, we were thinking you could add an abstraction layer where you just plug in, this is for Amazon, this is for this other cloud, and then migrate: use it as a tool to migrate between public clouds, private clouds, things like that. It wouldn't be that difficult to add that abstraction layer. Then you might need to figure out some other transfer mechanism. Possibly, right. For volume copy, it's the same sort of thing: if you have a backend service with some fast way of moving compressed data between the nodes, you'd want to use that. But you could also just use dd to copy the entire disk from one side to the other; it's just a lot slower. Yeah, I don't think that port is open. So in that case, you might have to instantiate a SOCKS tunnel on either end, or something like that; it's hard to say. I don't know how much time we have. Six, maybe ten minutes. Good questions. Thank you, everybody, if there are no more questions.