So, welcome everyone. I'm Otilo Slovenček, working for General Electric, and my co-presenter is Viktor Šlofr, working for Nokia. We worked together on an OpenStack migration project a year ago, and that's what we are going to talk about here.

So, let's see the starting point. We had a Rackspace-managed OpenStack environment, roughly 2,000 VMs and 500 terabytes of data, with a mixed storage setup: local disk for Nova, Swift, so an object store, for Glance, and Ceph for Cinder. We didn't have an in-place upgrade option, because this was an insourcing project and we wanted to manage the OpenStack entirely ourselves. Hence, we stood up a new OpenStack environment in the same cage, in the same data center, so we had a low-latency, high-bandwidth connection between the two environments. The target environment was a Red Hat OpenStack based on Queens, so OSP 13, and it was an all-Ceph environment.

Of course, we gathered business requirements before starting with that. The first requirement was full automation. The reason is very simple: the team is small and the change windows are very short, so there is no time for improvisation; everything has to run automated. Also, if something goes wrong or we hit some edge case, we have to be able to roll back, so that's another requirement. We also have to preserve the IPs, and the MACs on request. And since the migration runs for about six months and we are migrating VM by VM, if only half of an application is migrated, those VMs have to be able to communicate with each other, so Layer 2 trunking is a requirement here as well.

Right-sizing, so even shrinking the root disk, is also important from the storage efficiency point of view. And the whole migration procedure has to scale out, which means that, because we might have some crazy scheduling on how many VMs get migrated, the methodology should scale to something like 100 VMs per day if required, so there shouldn't be any kind of infrastructure or process bottleneck stopping us from doing as much as we can in a short time span. We don't have physical access to the source machines, the source hypervisors, because this is a vendor-managed product, so anything done on a hypervisor is either a support exception or a ticket, and that's something we cannot really automate; we should be restricted to the API as much as we can. Again from the storage efficiency point of view, we have to utilize the Ceph copy-on-write feature, so the deduplication it gives us, as much as we can. And because this is infrastructure-as-a-service, we don't really have access to the operating system, so we cannot, and we don't want to, install something like a local agent on the machines and interfere with the users; that is also an important constraint.

Scalr is also a very strong requirement: around 70% of the VMs are managed through Scalr, which is a cloud management platform, and it is a fair request that the users want to manage these environments via Scalr even after the migration. And flexibility, of course, is very important: whenever a new feature or a new edge case comes in, we have to react on that very fast, so the whole design should be built around plugins and hooks, and we should hold the software in our own hands, just to be able to react very fast.
Of course, before reinventing the wheel, we started with some research, so we examined what would happen if we were copying VMs via CopyStack, a well-known free tool to migrate workloads between OpenStack environments. Practically, CopyStack would do something like this: stop the VM, create a snapshot into Glance, so into Swift practically, then download that snapshot into a staging area and do some crunching on it. First of all, we have to convert from QCOW2 to RAW, because RAW is the image format currently supported by Ceph, but we would also do other things like resizing or injecting some scripts. Once that is finished, we upload into Glance again, and when that's done, we can spawn the VM. (A sketch of this naive flow appears at the end of this section.) The whole procedure for a moderately sized root disk takes over an hour, and of course we have VMs with root disks eight or ten times bigger, so those would probably take four to eight hours to migrate, which is probably too long for us. The other thing is that the whole procedure is funneled through Glance: the Glance and the Swift on the source side and also the Glance on the target side are a bottleneck, so we cannot do as much as we want at the same time, because we are restricted by the network bandwidth of that environment, or at least by the load balancer. Also, every single VM ends up with its own image, which is not effective from the storage perspective, and that's really a big deal: compared to the scenario we ended up with, using CopyStack would have cost an extra 200 terabytes, which is like 600 terabytes of raw capacity in Ceph. That's a lot.

So, the reasons for reinventing the wheel. First, the lack of Scalr integration: the Scalr import and this kind of farm migration is quite exotic, so there is no off-the-shelf tool for it. The root disk copy usually goes through Glance, which is not good for the reasons mentioned previously, and the lack of data deduplication is also a problem here. The Ceph integration is also missing: most of the tools try to be universal, any-cloud migration tools, which means they rather clone or copy the volume and do some crunching on it, or do the migration on their own, but since we have Ceph on both sides, it would be practically insane not to use the Ceph-provided migration methodology. We would also have had to develop an IP collision avoidance tool on our own. And of course, cost is a very important factor here: most of these migration companies provide professional support, or something like cloud-migration-as-a-service or disaster-recovery-as-a-service, and in most cases, if you are not their most important customer, you are probably waiting weeks for a fix or for some new plugin to be implemented. So that was also important for us.
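To make the Glance round trip concrete, here is a minimal sketch of the CopyStack-style naive flow described above, driving the standard OpenStack CLI and qemu-img from Python. The server, snapshot, flavor and staging names are hypothetical placeholders; the point is only that every byte of the root disk passes through Glance twice, plus a local conversion in between.

```python
import subprocess

def sh(*cmd):
    """Run a CLI command and fail loudly; stdout is returned as text."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

# Hypothetical names for illustration.
SERVER, SNAP, STAGING = "app-vm-01", "app-vm-01-snap", "/staging/app-vm-01"

# 1. Stop the VM and snapshot its root disk into Glance (backed by Swift).
sh("openstack", "server", "stop", SERVER)
sh("openstack", "server", "image", "create", "--name", SNAP, "--wait", SERVER)

# 2. Download the snapshot into a staging area.
sh("openstack", "image", "save", "--file", f"{STAGING}.qcow2", SNAP)

# 3. Convert QCOW2 to RAW (the format Ceph expects); resizing or script
#    injection would also happen at this step.
sh("qemu-img", "convert", "-f", "qcow2", "-O", "raw",
   f"{STAGING}.qcow2", f"{STAGING}.raw")

# 4. Upload into the target Glance (assumes the environment now points
#    at the target cloud).
sh("openstack", "image", "create", "--disk-format", "raw",
   "--container-format", "bare", "--file", f"{STAGING}.raw", SNAP)

# 5. Spawn the clone from the uploaded image (network options omitted).
sh("openstack", "server", "create", "--image", SNAP,
   "--flavor", "m1.medium", "--wait", SERVER)
```

Multiply steps 1 through 4 by the Glance and Swift bandwidth of the environment, and it is easy to see why even a moderate root disk already costs an hour.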
So, the toolset we invented has as its centerpiece a set of Python scripts that manage the migration of images, VMs and volumes; practically, this talks to the OpenStack APIs to change the VM states or provision new resources. It also has a Ceph REST proxy piece, which orchestrates the Ceph RBD mirroring. It has an independent part, the port sync daemon, which takes care of IPs and avoids IP duplication. And we have a REST proxy for Scalr, which creates the new constructs, the new containers, in the new Scalr environment and also manages the state changes of the VMs. And of course, we have a bunch of agents. I said previously that it was a requirement not to have an agent; the trick is that these agents run from a rescue image, in rescue mode, so nothing has to be installed on the operating system, and that requirement is satisfied.

So how do we clone the VMs? What is the basic idea here? If you look at step four at the bottom of the slide, the end state is this: we have a source VM created from a Glance snapshot, with a single volume attached, and the requirement is to have a clone of that VM in the other environment, based on the same image and with clones of the same volumes attached. To achieve the goal, we first have to copy the snapshot to the other side. We start by tracing back the parent, or the grand-grand-grandparent, of that image: this is the base image in step one. What we do here is take this base image and create a temporary VM from it on the source side, and we create another VM on the target side from a zero image; a zero image is practically a small, tiny image full of zeros, because at that point there is nothing to compare the base image against. We then transfer everything other than zeros from the base image to the other side, and when we snapshot that temporary VM, we have a clone of the base image. Once we have the base image clone, we can do the same procedure for the snapshot image: create a temporary VM on the source side, compare the contents of the snapshot to the base image clone, and finally snapshot the temporary VM, so we have the snapshot clone. A few days before the change window of the migration, we initialize the RBD mirroring for the attached volumes. So all these steps are practically preparation; when the change window comes, we do the final migration of the root disk, attach the volume clones, and we are done with the migration. Of course, if anything goes wrong, the original source VM and the original volumes are still on the source side, so the rollback procedure is very simple: we just discard the target VM and restart the source VM.

So what's under the hood? How does the root disk migration workflow go? First of all, we have a rescue image; that's where the agents I mentioned are installed. Once you reboot the machine into this rescue image, the agents start, and you can control the streaming from the source to the target VM. As I mentioned, we are practically comparing checksums, comparing the target disk's content to the source disk's content, in order to avoid reading and writing the target disk at the same time. At the time we make a copy of the source image, we also create a checksum file, generating a checksum for each 64-kilobyte chunk.
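As an illustration of this chunking scheme, here is a minimal sketch, not the actual agent code, of how such a checksum file could be produced and then used on the source side to decide, per 64 KiB chunk, whether to send data or just tell the target to seek. The function names and the send/seek callbacks are made up for the example; only the 64 KiB chunk size comes from the talk.

```python
import hashlib

CHUNK = 64 * 1024  # 64 KiB chunks, as described in the talk

def write_checksum_file(disk_path: str, sums_path: str) -> None:
    """One checksum line per 64 KiB chunk of the reference image copy."""
    with open(disk_path, "rb") as disk, open(sums_path, "w") as out:
        while True:
            chunk = disk.read(CHUNK)
            if not chunk:
                break
            out.write(hashlib.sha256(chunk).hexdigest() + "\n")

def stream_differences(disk_path: str, sums_path: str, send, seek) -> None:
    """Compare the source disk against the reference checksums.

    `send(offset, data)` and `seek(offset)` stand in for the agent's
    transport to the target VM (hypothetical callbacks): identical chunks
    cost only a seek on the target disk, so unchanged data is never
    rewritten and the Ceph copy-on-write clones stay thin.
    """
    with open(disk_path, "rb") as disk, open(sums_path) as sums:
        offset = 0
        for ref in sums:
            chunk = disk.read(CHUNK)
            if not chunk:
                break
            if hashlib.sha256(chunk).hexdigest() == ref.strip():
                seek(offset)           # same content: skip, store nothing
            else:
                send(offset, chunk)    # differs: transfer this chunk
            offset += len(chunk)
```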
And by the time you come to the point of migration, you just download that checksum file from an object store into the source VM. So all you have to do is read the checksums from the checksum file and calculate the checksums from the root disk, and based on whether there is a difference or it is the same, you either transfer the data or instruct the target VM to do a seek on the target disk. So for identical chunks, you are practically not storing anything in the Ceph layer for that target file. And the final step, of course, is running all the plugins: fixing files, fixing the MAC address, fixing something in the /etc/fstab file, doing any other kind of modifications on the target side, and rebooting that machine. And we are happy; that's how the migration goes. Handing over to Viktor.

Yeah, so both environments use Ceph for their volumes, so it was an obvious choice to use RBD mirroring to migrate them from the source OpenStack environment to the destination. RBD mirroring between Ceph clusters of different providers required additional validation before we were comfortable to move forward with this. A Ceph REST proxy tool was also developed to provide an interface to the Ceph clusters, so that the VM migration tool could orchestrate the RBD mirroring of the individual RBD images: setting the image features required for the mirroring, enabling and disabling the mirroring, and also promoting and demoting the images (a sketch of these calls follows at the end of this section).

The main takeaways of the Ceph RBD mirroring were the following. The journaling has an impact on write performance, so it makes sense to enable it just before the migration. Enabling the journaling feature may fail if the RBD image is under heavy load; in such cases, pausing or shutting down the VM may be the solution. Snapshots of the RBD image are also mirrored, which was quite useful for us, as some of our customers had data retention policies that meant the snapshots had to be migrated as well. Another consequence of this is that the dependencies between the volumes had to be resolved by flattening some volumes, in order to prevent migration scheduling constraints. You can only enable mirroring between pools that have the same name; that's important as well, so it had to be considered when the Cinder volume types were created in the new environment. The last takeaway here is that the RBD mirroring can be a bottleneck as well: in our case, the observed RBD mirroring speed was around two terabytes per hour, and this was with only five RBD images mirroring in parallel.

Okay, the next one is the port sync daemon. The two OpenStack environments were running in parallel for multiple months, and we had the requirement to keep the IP addresses of the migrated instances, so we had Layer 2 trunking between the OpenStack environments to make this possible. As a result of this, we had to find a solution that allowed us to create new Neutron ports on both ends without causing an IP or MAC address collision. We used a different MAC prefix in the new environment, but we could keep the old MAC addresses as well on request. For the IP addresses, having non-overlapping allocation pools is only half a solution, as ports with a fixed IP address can be created without respecting the allocation pool settings. Therefore, we developed the port sync daemon, which created port clones on the remote end to prevent reusing the same IP address (see the sketch after this section). These clone ports were then reused during the migration. The last takeaway is that the port ownership needs to be considered here, so that the port can be managed by its owner later on.
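For reference, the RBD mirroring primitives the Ceph REST proxy wraps map onto the standard rbd commands roughly like this. A minimal sketch, assuming the rbd-mirror daemon and the cluster peering are already configured; the pool, image and cluster names are placeholders, not our actual configuration.

```python
import subprocess

def rbd(*args):
    """Thin wrapper around the rbd CLI; raises on failure."""
    subprocess.run(("rbd",) + args, check=True)

POOL_IMG = "volumes/volume-1234"  # hypothetical pool/image name

# Journal-based mirroring needs the exclusive-lock and journaling features.
# Enabling journaling hurts write performance, so do it just before the
# migration; it can fail under heavy I/O (pause or stop the VM then).
rbd("feature", "enable", POOL_IMG, "exclusive-lock", "journaling")

# Start mirroring this image to the peered target cluster.
rbd("mirror", "image", "enable", POOL_IMG)

# At cutover: demote the source copy, then promote the target copy so it
# becomes writable in the new environment.
rbd("mirror", "image", "demote", POOL_IMG)
rbd("mirror", "image", "promote", POOL_IMG, "--cluster", "target")

# Rollback is symmetric: demote on the target, promote on the source again.
```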
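And this is roughly what a port clone amounts to in openstacksdk terms: a minimal sketch, assuming matching networks exist on both sides; the cloud names, network ID and the device_owner tag are illustrative, not the daemon's actual code.

```python
import openstack

# Two clouds defined in clouds.yaml (hypothetical names).
src = openstack.connect(cloud="source")
dst = openstack.connect(cloud="target")

DST_NET = "net-uuid-on-target"  # counterpart of the source network

def clone_port(src_port_id: str):
    """Reserve the same fixed IP (and MAC, on request) on the remote end.

    Once the clone exists, Neutron refuses any other port with that IP,
    even a port created with an explicit fixed IP that would otherwise
    bypass the non-overlapping allocation pools.
    """
    port = src.network.get_port(src_port_id)
    return dst.network.create_port(
        network_id=DST_NET,
        fixed_ips=[{"ip_address": ip["ip_address"]} for ip in port.fixed_ips],
        mac_address=port.mac_address,          # only when MAC keeping is requested
        device_owner="migration:port-clone",   # marks ownership for later handover
        name=f"clone-{port.name or port.id}",
    )
```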
So, this slide is about the Scalr migration tool and REST proxy that we developed. A majority of our customers were using OpenStack via the Scalr cloud management platform, so the migration had to cover a seamless transition within their Scalr environments. Scalr is a hybrid cloud management platform that provides a cloud-agnostic definition of the infrastructure with high-level primitives like farms, farm roles and roles. These Scalr resources can be defined in different scopes, and the visibility of these different configuration items is also important. Scalr had no support for this kind of farm migration between Scalr environments, so we developed our own tool to take care of it, mainly based on the public APIs provided by Scalr.

There are a couple of interesting things to mention here. Scalr has its own desired-state engine, which means that Scalr tries to converge to its internal known state of the VMs and farms, so the VM lifecycle management had to be done through Scalr. Therefore, we developed a Scalr REST proxy as well, to provide a simple API for the VM migration tool to execute lifecycle management actions on Scalr instances. External integrations with LDAP servers and DNS had to be considered as well; these were mainly done via webhook integrations that we had to trigger manually during the migration, as again this was a use case Scalr wasn't really developed for, so this is something we had to trigger ourselves. There was also one interesting tidbit: the private keys for the Scalr farms are generated when a farm is created, so you have no control over that. This was not good for us, because we intended to keep the old private keys, and unfortunately this had to be done via Scalr database hacks, as there was no API for things like this.

Okay, so this slide is for me again: how can we use the toolset now that the migration has finished? The first obvious way is image copying between regions. We have a monthly obligation to create GE-customized images for OpenStack; we build them on one platform and copy them to the other regions, and this provides an effective copy between the environments. The other thing is a missing feature in OpenStack: project migration, or project split and merge, is not something embedded in the base OpenStack software, and moving VMs between projects is practically just a simplified migration. Within the same region, a lot of things get simpler: we don't need port sync, we just share the network between the projects and move the fixed IP between the ports, and of course we use Cinder volume transfer instead of Ceph migration, which is also simple (a sketch of these two moves follows below). I have to mention that snapshots are not transferred along with the volume, so that is a takeaway from this use case.
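The two moves that make same-region project migration simple look roughly like this in openstacksdk plus the standard CLI; a minimal sketch with hypothetical IDs, not our production tool.

```python
import openstack

conn = openstack.connect(cloud="mycloud")  # hypothetical cloud name

# Move the fixed IP between ports on the shared network: release it on
# the old port, then claim it on the new port in the destination project.
OLD_PORT, NEW_PORT = "old-port-uuid", "new-port-uuid"
old = conn.network.get_port(OLD_PORT)
ip = old.fixed_ips[0]["ip_address"]
conn.network.update_port(old, fixed_ips=[])  # release the address
conn.network.update_port(NEW_PORT, fixed_ips=[{"ip_address": ip}])

# Volumes move by ownership transfer instead of Ceph mirroring, e.g. with
# the CLI (exact flags depend on the client version):
#   openstack volume transfer request create <volume>    -> prints id + auth key
#   openstack volume transfer request accept <id> <key>  (run as the target project)
# Note: volume snapshots are not transferred along with the volume.
```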
We are also considering some sort of region-to-region migration, but Ceph is a sore spot there: the high latency of the WAN connection is probably in the way of doing RBD mirroring, so that's a problem, and the L2 trunking is also complicated, because if your default route is in the original environment, you have to route everything back over this WAN connection to the other region, so some sort of proximity routing is needed; that is probably a challenging point as well.

Okay, so the lessons learned. First of all, migrations can introduce a lot of inefficiencies, and if you don't pay attention, the promise of lowering IT spending can quickly evaporate. We put a lot of effort into using the storage efficiently, and it allowed us to save around 200 terabytes of Ceph storage, which translates to about 600 terabytes of raw storage. It's also important to make a quick study before you start developing your own tool. We realized early on that there was no off-the-shelf tool meeting our business requirements, but the things we learned by checking these products helped us come up with our own solution. We also tried to plan everything in advance; still, with this amount of workloads there were a lot of gotchas and edge cases that we only identified during the migration, so it was really useful that we had the migration toolchain under our own control, and it allowed us to adapt quickly to handle non-trivial use cases. For a longer-term project, it's also important to identify the long-lead-time items at the beginning: coordinating development with third parties, networking changes, and customers with less flexible schedules are all long-lead-time items that may impact the timeframe of your whole project, so it's very important to identify these things early on to make sure that you will be able to meet your deadlines. And finally, I would like to say thank you to the team for their contribution to this project. Thank you guys for all the coding, testing and of course doing the migration. And now it's time for questions.