Okay. Thank you very much for coming. I know it's the end of the day already, so I'll try to get through this quickly so you can go get yourself a drink and start the pub crawl and the socializing. It's great to have a full room; hopefully the full attendance reflects the interest in the topic. Today's presenter is myself, Irat Karadinov. I'm an OpenStack engineer at CloudOps, a delivery partner of Mirantis in Canada. With me today I have Alexey Kolodyazhny, a senior deployment engineer at Mirantis, and Ivan Kolodyazhny, a Cinder developer at Mirantis as well; both of them came from Ukraine. And we have Christian Huebner, a senior systems architect at Mirantis.

Before going to the agenda, I'd like to ask you a few questions. Is anybody here already running OpenStack in production with critical workloads? All right, nice, look at so many hands. Are any of you still running on Grizzly, Icehouse or Juno, the older releases of OpenStack? Raise your hands. It's okay. Cool. So the rest of you, I guess, are running your clouds on Kilo and Liberty. But here we are at the Summit and a new release of OpenStack is ready, so a question for you: how many of you are considering upgrading to the latest release? One, two, three... I guess everybody wants to, right?

As you probably know, in-place OpenStack upgrades are still a work in progress in the community. Most vendors have implemented upgrades, but they are limited to one or two releases at a time, so you cannot really jump from a very old release. And there is no question of changing your storage backend or your network architecture along the way; only a small set of reference architectures is supported for upgrade. So wouldn't it be great if you could somehow get your old cloud to the latest version anyway? The alternative I want to discuss today is migration: you have two clouds side by side and you move your workloads from one cloud to the other.

For this Summit we actually came with two presentations. The other one is tomorrow, and it covers why and how to assess and migrate cloud applications between OpenStack clouds. Today's presentation focuses only on storage migration between clouds. The agenda: a quick introduction to storage migration between clouds and between different backends, then a deep dive into how we implemented the different migration methods. We start with granular migration, move on to all-in-one migration, and finally I'll finish with the native Cinder APIs, which let you migrate volumes between backends only, because the native Cinder APIs cannot migrate between clouds. Now I'll pass the mic to Alexey and he'll continue.

Hello, everyone. My name is Alexey Kolodyazhny, and I'll give you a bit more of an introduction: what migration is, and the main approaches to migrating the data. So what is migration? Migration is transferring some kind of data from a source to a destination, with appropriate adaptation of that data along the way.
The data in the cloud can be divided into two groups. There is the configuration part, which includes user information, tenants, networks, security groups and so on. And there is the actual data, which is what we will talk about in this presentation: ephemeral disks, images, shares and volumes, everything that can live on a storage backend. I also want to note that migration can be done inside a cloud, between clouds and between backends, and we will touch on all of these.

So why migrate? Many different situations bring us to the idea of migration. First of all, cloud upgrades: unfortunately, in-place upgrades are currently not available, and the alternative is to create a parallel control plane and migrate the workloads from the source to the destination cloud. Then there is disaster recovery, when something has gone wrong and you have to move your workloads from the disaster-recovery storage back to an operating cloud. And there is instance migration, when you have to move instances and volumes between tenants, users or clouds. The benefits migration gives us are no vendor lock-in, because you can move between different storage backends, and flexibility, which is very important for the business.

The main expectations we have of a migration are: minimal downtime — we have to bring downtime as close to zero as possible; data consistency — we have to take care of the data and not leave anything behind; and reliability — we need a plan B, because migration is not an easy thing and there is a lot of business logic hidden under the hood. The main difficulties come from the network, because it is often the main bottleneck; from the fact that the data is in use, so users should not notice that we are migrating it; and, last but not least, from the fact that the applications customers want to migrate are usually not cloud-ready, so we have to deal with legacy applications.

Now let's get closer to the approaches. The first one is low-level migration. It belongs to the group of approaches we call granular migration, where we work with each volume individually. The idea is that a volume is presented as a block device — a Ceph RBD device, for example. You present this device somewhere; on the destination side you create a similar volume of the same size and present its block device as well; and then you transfer the data with a tool like dd. The pros of this approach: it is quite simple, easy to automate, and it gives an exact copy of the volume. The cons: you have to copy the whole block device, because you are not dealing with files and you don't know how much space is actually used — if the volume is two terabytes, you copy the whole two terabytes. It is also very slow, so you cannot apply it to all volumes, because that would take ages.
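To make the block-level idea concrete, here is a rough sketch (not shown in the talk) of what the dd copy can look like once both volumes are presented as block devices on hosts reachable over SSH; the host names and device paths are placeholders.

```python
# Minimal sketch of the low-level (block device) copy described above.
# Assumes the source and destination volumes are already created and
# attached as block devices; hosts and device paths are placeholders.
import subprocess

SRC_HOST = "root@source-controller"
DST_HOST = "root@dest-controller"
SRC_DEV = "/dev/mapper/volume-src"    # source volume as a block device
DST_DEV = "/dev/mapper/volume-dst"    # destination volume of the same size

def copy_block_device():
    # Stream the whole device over SSH and write it block by block on the
    # destination. The entire allocated size is copied, whether or not it
    # actually holds user data -- that is the main drawback of this method.
    cmd = (
        f"ssh {SRC_HOST} 'dd if={SRC_DEV} bs=4M' | "
        f"ssh {DST_HOST} 'dd of={DST_DEV} bs=4M conv=fsync'"
    )
    subprocess.run(cmd, shell=True, check=True)

if __name__ == "__main__":
    copy_block_device()
```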
The next approach is a bit more high level: we work at the file-system level. Imagine you have a volume that already contains a file system. You can mount that file system somewhere — on some node of the cloud, or you can create a small operational VM and access the volume from there. You do the same thing on the destination cloud: create a volume of the same size, create a file system, mount it, and then copy the data between the two mount points, using rsync or SCP, it doesn't matter. The plus of this approach is that you copy only the user data: you are dealing with files, so if only 20 gigabytes of that two-terabyte volume are used, you copy only that actual data. The con is that it is quite hard to automate, because you cannot be sure in advance which file system you will find on the volume.
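As a counterpart to the block-level sketch, here is a hedged example of the file-system level copy, assuming the script runs on the node where the source volume is mounted and the destination volume is already formatted and mounted on a host reachable over SSH; the paths and host name are placeholders.

```python
# Sketch of the file-system level approach: copy files, not blocks.
import subprocess

SRC_MNT = "/mnt/migration/vol-src/"              # source volume mounted locally
DST = "root@dest-node:/mnt/migration/vol-dst/"   # destination volume, mounted remotely

def copy_files():
    # Preserve permissions, ownership, timestamps and hard links. Only the
    # files that actually exist are transferred, so a mostly empty 2 TB
    # volume moves much faster than with a raw block copy.
    subprocess.run(["rsync", "-aH", "--numeric-ids", SRC_MNT, DST], check=True)

if __name__ == "__main__":
    copy_files()
```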
The next approach is backend replication. This belongs to the group of approaches we call one-shot migration. The idea is that you have a backend attached to the source cloud; you attach a similar backend on the destination side and replicate the data between the storage arrays using their native replication. The plus is that you can transfer all the volumes in one go — but that is also a con, because there is no granularity: you cannot migrate only the volumes of one particular user or tenant, you have to move everything at once. It also brings vendor lock-in, because you can usually replicate only between the same storage types, and it costs extra, because you have to pay for additional hardware on the destination side and possibly additional license fees.

The last but not least approach is backend re-usage. In this case you use the same backend on the source and on the destination and access it from both sides simultaneously. The pluses: you don't need to transfer the data at all, so you don't lose time on it, and you don't pay for additional hardware or licenses — you just reuse the storage. Any cons come only from the particular implementation of this approach. All of these approaches will be described in more detail later on. Now I'd like to give the microphone back to Irat, who will tell you more about granular migration.

All right, thanks, Alexey. I'll try to get the ball rolling. Alexey covered the nuts and bolts of migration between clouds. The approaches he described have been implemented in the migration tooling that Mirantis has been using for several years already and offering to customers who want to upgrade their clouds this way — these migrations have been used where an in-place upgrade was not feasible. Creating this automation tooling involved a lot of magic, and today we want to share how that magic is done. As you know, OpenStack has many storage backends — I think more than 70 now. That's great to have, of course, but when it comes to migration it is quite a pain, because you have to consider that you might have to migrate between any two of them. So when we were building our migration tooling, we had to think about how to make the process general enough to reapply it to other backends. For this presentation we prepared the iSCSI, Fibre Channel, NFS and Ceph types of migration, and I will start with the NFS-type backends.

NFS-type backends are basically the simplest to migrate, because the volume is presented on the file system as a plain file, so we only need to migrate the Cinder metadata and then transfer the file from source to destination. Before explaining the migration, let me quickly remind you how NFS-type backends work, using volume creation as an example. We have our controller node and a NetApp NFS server, and they communicate through the NetApp 7-mode direct NFS driver. We use NetApp here simply because we recently used it in a migration, so it is still fresh in my mind. When you create a volume, it is created as a file in the NFS share, and if you want to attach that volume to a VM, the share is mounted on the compute node. The important configuration is the NFS mount point base and the shares config file; you can have several shares, or even several NetApp servers, attached to the same controller. If everything is configured correctly, you will see — in our case — two file systems mounted on the controller node under /mnt/cinder-volumes, and when you create volumes, 'test1' and 'test2' in our case with their sizes, you will actually see them as files under /mnt/cinder-volumes.

So we need to transfer this metadata and the volume file to the destination. Here we have two clouds: a controller node running Grizzly and one running Mitaka, the latest release. As you know, the Cinder API cannot create volumes in a different cloud, so we had to put our migration tool in between; here it is deployed on the controller node. The first step of the migration is to migrate the users and tenants, so they are available on the destination. The Cinder volume migration then starts with the quotas, if you want to migrate them — this is optional. Then you look up the volume on the source cloud and create exactly the same volume on the destination with a Cinder API call, so it has the same name and the same size. And then you just transfer the file from source to destination. We automated this process so it can handle many volumes, but it is still slow. To make it a bit faster, we migrate volumes in parallel and use a more advanced transfer protocol: we found a tool called BBCP, which uses the bandwidth more efficiently than SCP or rsync. I hope this is clear — it's very simple.
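To illustrate the NFS case, here is a hedged sketch of those two steps: create the matching volume on the destination cloud, then copy the backing file with BBCP, falling back to rsync. The file paths, volume name and size are placeholders, the OS_* environment variables are assumed to point at the destination cloud, and the exact `cinder create` flags vary slightly between client/API versions.

```python
# Sketch of NFS-backend migration: matching volume + backing file transfer.
import subprocess

SRC_FILE = "/mnt/cinder-volumes/<share-hash>/volume-<src-id>"   # file backing the source volume
DST_HOST = "root@dest-controller"
DST_FILE = "/mnt/cinder-volumes/<share-hash>/volume-<dst-id>"   # file created by Cinder on the destination

# 1. Create a volume with the same name and size on the destination cloud.
#    (OS_* environment variables are assumed to point at the destination.)
subprocess.run(["cinder", "create", "--name", "test1", "10"], check=True)

# 2. Copy the backing file. bbcp opens several parallel streams and uses the
#    bandwidth far better than plain scp; fall back to rsync if it is absent.
try:
    subprocess.run(["bbcp", "-s", "8", SRC_FILE, f"{DST_HOST}:{DST_FILE}"], check=True)
except FileNotFoundError:
    subprocess.run(["rsync", "-a", "--inplace", SRC_FILE, f"{DST_HOST}:{DST_FILE}"], check=True)
```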
Now things are rolling, and we move on to the block-level type of backends: iSCSI and Fibre Channel. These are more complex to automate and migrate, because the volume is a block device, so you need to find a way to present that block device on the controller. Again, we just recently worked with the VMAX driver, so it is fresh in my mind, and I'll first show how a volume is created on OpenStack with it, and from there how we do the migration itself. We have a controller node and the VMAX, and the communication between the controller and the array goes through the SMI-S provider of EMC Solutions Enabler — a small server in between that exposes the APIs that talk to the VMAX. When you send a create-volume command, the driver uses that API to create a VMAX volume, and it lands in the default storage group. This is specific to the VMAX driver; it can also be configured with a FAST pool — in our case it was a FAST pool, so the volume ended up there.

If you then want to attach this volume to an instance, what always happens is that the volume is moved from the default storage group into the storage group of a masking view. The masking view is the way to make this block device visible to the compute node. Then you run a rescan, and you can attach the volume to the compute node over iSCSI. So for the migration we need to work with this block device.

Now I can show how we did the migration, and this implementation is a little more complex because on the one side we have an NFS-type backend and on the destination we have VMAX iSCSI. We do it the same way: find the volume we are migrating on the source cloud and create exactly the same volume on the destination cloud, where it lands in the default storage group. The next step is the magic, so to speak: we had to reverse-engineer how the VMAX driver works, and instead of attaching the block device to a compute node we attach it to the controller, because that is where our migration tool lives. Once that is done, you can do a block-by-block transfer from the source cloud to the destination. Once it is automated it works well, but again, it is very slow, so it should only be used for small groups of volumes or for the really critical workloads you want to migrate this way.

Finally, the Ceph-type backends. The migration works in exactly the same way — the idea is the same — the only difference is the commands you use: instead of a block-by-block transfer you use rbd export-diff piped into rbd import-diff. With rbd export-diff you transfer not the whole volume but only the data actually allocated inside it, so the migration is faster. If you migrate from a block backend to Ceph you use rbd import, and from Ceph to a block backend you use rbd export.
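For the Ceph case, a hedged sketch of the export-diff/import-diff pipe looks roughly like this; the pool and image names and the destination host are placeholders, and the destination image is assumed to already exist with the same size.

```python
# Sketch of Ceph-to-Ceph volume transfer using rbd export-diff / import-diff.
# Only allocated extents are written to the stream, so much less data
# crosses the wire than with a raw block copy.
import subprocess

SRC_IMAGE = "volumes/volume-<src-id>"
DST_IMAGE = "volumes/volume-<dst-id>"     # pre-created on the destination cluster
DST_HOST = "root@dest-ceph-client"

cmd = (
    f"rbd export-diff {SRC_IMAGE} - | "
    f"ssh {DST_HOST} 'rbd import-diff - {DST_IMAGE}'"
)
subprocess.run(cmd, shell=True, check=True)
```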
As I said, these methods are slow, so my colleague Christian will now cover how you can do the migration in one shot. Christian, please.

Okay, thank you, Irat. In many cases, when you migrate, you do not want your users and tenants to even see that a migration is happening. So the idea is, instead of using OpenStack mechanisms, to use the mechanisms built into a host of storage backends. One of them is Ceph RBD incremental replication; NetApp has SnapMirror, EMC has MirrorView, and a number of other backends have something similar built in.

The other approach I would like to discuss is: why migrate the data at all? The problem is that if you have a three, four, five petabyte store and you try to push all those bits through the network, you will find, first of all, that it impacts your performance, because hard disks have only a limited capability for doing multiple things at once — a standard eight-terabyte spinning disk gives you 100 IOPS if you're lucky — and secondly, you normally don't want to clog your production network with all the data being pushed through. So the approach we are currently working on is: don't migrate the data at all. Take the backend, attach it to the new cloud, migrate the Cinder entries from here to there, and keep using the backend with the new cloud.

Back to replication within backends: of course there is a performance impact. In many cases you can throttle it — NetApp has a mechanism for that, EMC has a mechanism for that — but you will still notice it, especially if your storage array is not running at only 20 percent. Everyone in the room probably knows that is not what normally happens: your array runs at 60 or 70 percent capacity, and if you add the load needed to push all the data into the new cloud, you will have some performance impact. The idea is to get the replication delta down to a couple of minutes before the cutover — in my experience maybe two minutes is the best you can get, because the data keeps coming in as fast as you can write it — and then, for the final swing, you shut everything down. This is the one-shot part: you have to shut down all your applications at the same time, migrate the Cinder database, and start the applications up against the target storage. Obviously this is only applicable to whole data pools — you cannot do it for individual volumes — it may not work with all backends, and the downtime is going to be measured in 15 minutes, 30 minutes, an hour, depending on how well it is automated. Here is a list of replication mechanisms; as we are going to share the slides, I am not going to read them out loud, and I'll move on to not migrating at all.

This uses Ceph as the example, but it would work with most of the other storage backends too. The idea is that the backend is attached to the old cloud, and then you also attach it to the new cloud. We are currently working on a mechanism to automate that attachment, so you can simply tell Mirantis OpenStack during deployment that you want the new cloud attached to the same Ceph backend. Then the only thing you actually have to migrate — and it is tiny by comparison — is what is in the Cinder database. The other big advantage is that, since your data stays in place, you can do the migration volume by volume: you say, okay, this volume is currently in the old cloud's database, I take the entry out, put it into the new cloud's database, and fire up the application in the new cloud. And you can do that over days or weeks; it doesn't really matter, because the data is not actually moving anywhere.
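To show the scale of what actually has to move in the re-usage approach, here is a deliberately simplified sketch of copying one volume's row from the old cloud's Cinder database to the new one and repointing its host. The connection strings, volume ID and new host string are placeholders, and it assumes both clouds run the same Cinder schema version; a real tool also has to remap users, projects, volume types, quotas and attachments.

```python
# Rough sketch: move a single Cinder volume record between two clouds that
# share the same storage backend. NOT production code.
from sqlalchemy import create_engine, MetaData, Table, select

SRC_DB = "mysql+pymysql://cinder:secret@old-cloud-db/cinder"   # placeholder DSNs
DST_DB = "mysql+pymysql://cinder:secret@new-cloud-db/cinder"
VOLUME_ID = "<volume-uuid>"
NEW_HOST = "new-controller@rbd-backend#rbd-backend"            # host@backend#pool in the new cloud

src = create_engine(SRC_DB)
dst = create_engine(DST_DB)
src_volumes = Table("volumes", MetaData(), autoload_with=src)
dst_volumes = Table("volumes", MetaData(), autoload_with=dst)

with src.connect() as s, dst.begin() as d:
    row = s.execute(
        select(src_volumes).where(src_volumes.c.id == VOLUME_ID)
    ).mappings().one()
    record = dict(row)
    record["host"] = NEW_HOST   # point the entry at the new cloud's cinder-volume backend
    # user_id / project_id remapping is ignored here for brevity.
    d.execute(dst_volumes.insert().values(**record))
```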
We are currently working on this plugin; it is already functional, and if somebody is interested in what it looks like, I have a brown bag talk tomorrow where I describe attaching Ceph to multiple clouds — not for the purpose of migration, but for running more than one cloud on one Ceph backend to take advantage of the economy of scale. The downside, of course, is that the storage networks of the clouds must be connected. If you are a major bank and you have a production system and a test system, you probably do not want to share a Ceph cluster between them, but in many cases this is an acceptable drawback, especially if it is done for a migration and you only use it like this for a few hours, or a few days at most. And obviously, as this is still under development, there are going to be a couple of bugs, and we would much appreciate it if somebody who finds one actually tells us about it. And of course Ceph takes its name from cephalopods, so I put a cephalopod in there. The final point is that you take advantage of the economy of scale if you run more than one cloud on one Ceph cluster — that's a pretty important thing, because a lot of people try to run three, four, five node Ceph clusters and find out that they are not quite as stable and not quite as fast as one would think.

So, we have discussed all kinds of migration from cloud A to cloud B, but quite often we also need to migrate within one cloud, and Ivan is going to show how we can do that with the native Cinder API.

Thank you, Christian. As you may know, Cinder has several APIs to migrate your data from one storage to another, and Cinder uses the same approaches Alexey mentioned earlier. I will mention the migrate, backup and replication APIs, plus a few words about retype, because retype sometimes triggers a migration. Let's start with migration. Cinder can migrate a volume in two different ways. The first one is very fast and simple: if the backend storage and the Cinder driver support direct migration between two backends, the data is transferred directly. Unfortunately, this is supported by only about 10 drivers in Cinder as of Mitaka, and usually you can only migrate between backends of the same type — NetApp to NetApp, Pure Storage to Pure Storage, and so on — but it is very fast. The second approach also goes through the Cinder migration API, and it works for any supported Cinder backend: Cinder attaches the source volume and the destination volume to the cinder-volume node and copies all the data with dd. This can be very slow, especially for big volumes, but it works for any backend — for example, you can migrate your volumes from iSCSI to Fibre Channel, or from iSCSI to NFS or Ceph. In Cinder we call this generic volume migration.
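A hedged example of driving that API from a script: the volume ID and the destination host string (in the usual host@backend#pool form) are placeholders. Setting --force-host-copy to True would skip the driver-assisted path and force the generic attach-and-dd migration.

```python
# Sketch of a scripted 'cinder migrate' call (admin credentials assumed).
import subprocess

VOLUME = "<volume-uuid>"
DEST_HOST = "controller-1@solidfire#solidfire"   # placeholder destination backend

subprocess.run(
    ["cinder", "migrate", "--force-host-copy", "False", VOLUME, DEST_HOST],
    check=True,
)
# As admin, progress can be followed in the volume's migration_status field,
# e.g. with: cinder show <volume-uuid>
```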
Now retype. As I mentioned, retype can trigger a volume migration, so let's discuss what a volume type is from Cinder's point of view and why retype may call migration. A volume type is just a set of labels, a set of volume definitions: for example, you can map a volume type to a specific backend so that all volumes of that type are created on SSD or HDD storage. Extra specs are more granular storage definitions — they can be storage-specific, like QoS, where you define properties for your storage. So say you have HDD storage behind type A and SSD storage behind type B, and you call cinder retype to move your data: don't forget the 'on-demand' migration policy, because if you want to move — in our case from HDD to SSD — without it, Cinder will not migrate your data. Retype works very simply: Cinder compares the contents of the volume types; if they are the same, there is nothing to do except change the database record; if they differ, Cinder asks the driver to do the retype, and if the driver doesn't implement it, Cinder falls back to the migration API. That is why the migration-policy option is so important — right now, in Mitaka, without this option you won't even see why the data was not migrated. Another important note: migration is an admin-only operation. Of course you can change the permissions in policy.json, but that is not the recommended way to do migrations. Also, once you start a migration, the user doesn't know about it: they don't have permission to view the migration progress, they just see their volumes as available. I will not cover migration of in-use volumes, or live migration, because Alex covered it in a previous session and you can watch it in the video.

Next, backups. You can also use the Cinder backup service to migrate data between storages. It's very simple: you have the original volume, you create a backup, and you restore the volume from it. From Cinder's point of view a backup is a full copy of the volume, so it is easy to use it to move data, but there are limitations: there are only about six supported backup backends — the newest one is Google Cloud Storage, introduced in Mitaka. But once you have a backup in one of those stores, you can restore the volume to any backend supported in your cloud: your backup can live in Google, Swift or Ceph, and you can restore it to LVM, NFS, NetApp, SolidFire and so on. It can be extremely slow, but usually, once a backup exists, users don't work with it, so that doesn't matter much.
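Here is a hedged sketch of both flows just described: a retype from an HDD-backed type to an SSD-backed type with the on-demand migration policy, and a backup-based move. Volume and backup IDs and the type name are placeholders, and admin credentials are assumed for the retype-triggered migration.

```python
# Sketch of the retype and backup/restore flows via the cinder CLI.
import subprocess

VOLUME = "<volume-uuid>"

# Retype to a type that lives on a different backend. Without
# '--migration-policy on-demand' Cinder will not move the data.
subprocess.run(
    ["cinder", "retype", "--migration-policy", "on-demand", VOLUME, "ssd"],
    check=True,
)

# Backup-based migration: a backup is a full copy of the volume, and it can
# be restored onto any backend the destination cloud supports.
subprocess.run(["cinder", "backup-create", "--name", "vol-backup", VOLUME], check=True)
subprocess.run(["cinder", "backup-restore", "<backup-uuid>"], check=True)
```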
That's all about the Cinder APIs; you can ask questions after the talk. Christian?

Okay, thank you very much, Ivan, that was very informative. I would like to wrap up — and also say a word about replication, since there was a very good talk about it in a previous session and I almost forgot about it. Replication is about disaster recovery, not about migration. As mentioned before, replication v2.1, Cheesecake, was introduced in Mitaka, and in Newton it will be Tiramisu. So if you need disaster recovery, replication makes sense, but it is not a data-migration mechanism.

Now the summary, very quickly. One thing to keep in mind is that migration is nothing to be trifled with: you have valuable data, you have performance requirements, you have SLAs, and if it is not well planned it is a recipe for disaster. And obviously moving data is bound by physics: a petabyte of data does not flow by itself, and it certainly does not flow quickly. We have presented a set of methods — granular migration, all-in-one migration, migration between backends, and migration within one cloud — and in many cases you will find that a combination is the best choice. For instance, you migrate while keeping a Ceph cluster in place, and at the end of the migration you want to add a SolidFire array, so you then just migrate volumes inside the cloud from the Ceph backend to the SolidFire. So thank you very much for staying with us even though the booth crawl has probably already begun, and for all your migrations I sincerely wish you success: on one hand it's not easy, but on the other hand, doing it right and finding out that everything worked and your customers are happy — that's the real reward. Please feel free to ask questions; we will be here a few more minutes, and even though we are technically out of time, I would appreciate being able to clarify a few things. Thank you, guys, enjoy the Summit. Any questions? Yes?

[Question about seeing why a retype or migration fails.] Unfortunately, as a user or administrator you currently have no way to see why it failed other than the logs. In Newton there will be a new API for user-facing messages.

[Question about migration between backends versus between clouds.] Migration between backends can happen inside one cloud or between clouds. For instance, you have one cloud with a NetApp backend and one cloud with, let's say, an EMC backend, and you want to migrate between those clouds in either direction — that is migration between backends and between clouds. But you could also have performance tiers in one cloud — some spinning storage and some SSD storage — and find that a certain instance is starved for IOPS, so you take the volume attached to that instance and migrate it inside the cloud between tiers. So the two are basically orthogonal; they can exist together or apart.

[Question about hyper-converged setups.] The reattach option would still work in some cases. You could, for instance, go from a hyper-converged to a non-hyper-converged setup: you leave the OSDs on the nodes they are currently on, but migrate the compute services into the new cloud — basically, instead of moving data from one server to another, you take the services off and put them onto separate servers.
But hyper-converged to hyper-converged obviously does not work with reattach, for obvious reasons; in that case you would most probably use the backend replication method.

[Question about encrypted volumes.] So the question is whether it matters, when we migrate, if the volumes being transferred are encrypted. Christian? The answer is yes and no. If you do something like SnapMirror, it doesn't matter whether the data is encrypted: you get an exact, identical image on the new NetApp of what you had on the old NetApp. If you go volume by volume from backend to backend, you obviously have to have the decryption key so you can actually read the data and move it over to the other device. So it really depends on the methodology — for instance, if you don't have the decryption key but still want to migrate from here to there, a granular backend-to-backend copy is often not the best thing to do, and a replication that preserves the exact bits is the better choice.

All right, guys, thank you very much for your time, enjoy the Summit. Thank you.