Welcome to KubeCon again, and welcome to everyone here, and also to the people joining remotely. Very excited to have in-person events again. I really missed KubeCon over the last years, I must say. Welcome to our talk about transparent live migration of services between Kubernetes clusters across multi-cloud. As you can already tell, we tried to squeeze as many buzzwords in there as possible. But what is it really about? It's about moving stateful services — in our case, a distributed database — between different Kubernetes clusters, between different regions, and even between different cloud providers. This is quite a challenge for us, because it should be transparent for the user. So when we do that, we want our database users to not lose the availability of the cluster. They might see a small performance degradation. But from an operational perspective, this is super useful, as you'll see, for upgrading, et cetera, and for running a general managed service for databases. We were spoiled for choice: we evaluated a number of options. All of them require the cooperation of multiple tools, from Kubernetes over our Kubernetes operator to the networking layer, and the database itself needs to be aware as well. So we'll talk a little bit about that and also try to give general advice for different setups. And then we'll attempt the challenge of it all: a live demo here on stage. Let's all hope this goes well. This is my colleague Adam here. Hello, my name is Adam. I started my adventure with Kubernetes itself at version 1.1. From that point in time I worked as a DevOps engineer, but I found that development is a little bit more interesting for me, and for two and a half years I have been developing the operator which is used for ArangoDB. We went through a lot to get to this point, but I will pass it back. Are you OK now? I actually took the opposite path — I'm a little further away from development right now, which is very sad.
But I'm currently CTO at ArangoDB, very happy to combine my two big passions. One is cloud nativeness — I was in early on this hype, back at Mesosphere, also building up the community, so it's really great to see KubeCon growing this much. And the other passion is database systems, also across different career stages. And now it all comes together in one role. Pretty cool. Why are we here as a database company talking at KubeCon? If we look at databases in 2022 — even if we just look at the landscape — we see it's the first block in the CNCF landscape. So we already see it's quite important, and if we walk around among the sponsors, we also see a bunch of different database companies. This has really changed over the last years. Being a database company nowadays usually means you are cloud native in some way or another. And interestingly — and this is something I enjoy quite a lot — it also means you change a lot of things in how you build your database. Five years ago it used to be servers in some basement with a fixed switch in the middle. Nowadays it's basically a distributed stateful system on dynamic infrastructure: Kubernetes telling us what to do, AWS telling us, please reschedule this or that. So it's a very different game, but it's also quite interesting to turn into a fully cloud native database. And just from the customer side: probably most of our users — ArangoDB is an open source project — are leveraging some cloud service or another, and most of them also use our Kubernetes operator. And then we also have our customers who run in the cloud on top of the different options that we'll come to in just a second. As said, ArangoDB is an open core graph database. It's really a scalable graph database at its core, but it also supports other data models, such as documents. We have full-text search and retrieval on top — imagine something like Lucene built into the graph engine.
We have graph analytics on top, based on Google's Pregel framework, and we also have a graph ML stack. But in this talk we want to focus on the upper left corner of this circle, and that is about how you can run ArangoDB — how you can distribute it. What you see on the right is just an architecture diagram; I don't want to go into too much detail, it's not the focus. But in short, ArangoDB is a distributed system. It has different components, so it's not trivial to set up, to maintain, to fail over, et cetera. And this is why we invested early on in kube-arangodb, as it's called — this is our Kubernetes operator. And on top of that we built our managed service, Oasis. I briefly want to introduce it, because it was one of the main drivers behind the changes presented in this talk — but those changes are also in the open source operator. ArangoDB Oasis is our fully managed cloud service: automated deployments, failover, scale up, et cetera, yada yada, as you've probably heard from all the other vendors here offering a managed service as well. I think what makes us special is that we actually run across all the big cloud providers — AWS, Google Cloud, and Azure — and we do that by leveraging their managed Kubernetes offerings. The naive assumption could be: hey, Kubernetes is the abstraction layer, so it's all the same. Interestingly, it's not quite that easy. There are still quite some differences in between, but I think we are managing that pretty well right now. Just a brief architecture piece, maybe, to understand the motivation for this talk: we have one central control plane, controlling Oasis itself, but then we have data clusters. You can translate those to Kubernetes clusters, and they are spread across different cloud providers and across different regions.
Right now we have a very, very large number of those data clusters flying around — up to hundreds of them. So this was the motivation, of course. We were facing some operational challenges — and this is only the small subset which motivated this talk; there's a lot more. To start out: the updates of Kubernetes itself. In particular, as we're using the managed Kubernetes offerings, we are often forced to upgrade. Especially Google will tell us: you have to update by then and then, or we will do it for you. And being a stateful distributed database, we of course want to be in control of when we upgrade. And there the first big challenge comes in: an upgraded cluster is not the same as a new cluster spun up with the same version. For example, on Google Cloud we started with something like 1.14, and we know that when we hit Kubernetes 1.24, we won't be able to upgrade anymore, just because the changes accumulated by then are too big, and we'll actually have to deploy a new cluster, probably at around 1.24. This also makes testing much harder, because when we test, the easy thing is to just bring up a new cluster — you don't even have the choice anymore to bring up one of the old clusters. So this is one of the big operational challenges we are dealing with: how do we upgrade our managed Kubernetes clusters, and how can we test whether everything still works, because it's simply not exactly the same environment. The next challenge is migration between different regions. The obvious cases are an outage in a certain region, or unavailability of resources in a region. Or a new region simply opens up and the customer says: hey, I want to move my deployment closer to where I am — reducing the latency between client and cluster.
The next thing is: even if we start by migrating between regions, we would also want to do the same between different cloud providers. And maybe the last migration option — and it's also great — is that we have a number of people in the community, and also customers, who run their own self-deployed ArangoDB cluster on Kubernetes. Of course we also want to offer them an option to migrate into ArangoDB's cloud. So in short, what we want to achieve is migration between different Kubernetes clusters, and ideally also between different regions, providers, et cetera. Maybe just briefly, as we talked about this upgrade procedure: what we do as a pretty standard routine is this two-dimensional update — either upgrading the Kubernetes version of a particular cluster, stepping it up, or updating the ArangoDB version, from, I don't know, 3.8.2 to 3.8.3, et cetera. This is something we really do at a daily-operations level, and it results in this kind of step function. But as said, there's a limit to that. So what we would actually like to do is not just upgrade the same, still fixed, cluster — we would like to be able to move, or upgrade, between different Kubernetes clusters. And this could go back and forth. In the end we want to move from our two-dimensional upgrades to a three-dimensional structure, in which we can upgrade or move between Kubernetes clusters — so not always an upgrade, but basically moving along that Z dimension as well. So the challenge is there, and therefore I would love to hand over to my colleague Adam, because he's the mastermind behind the technical implementation of all this. Thank you. All right, so from this specification we went to the requirements, which we specified to satisfy the customers we have.
It is a little bit tricky when you have, for example, one cluster with one hundred deployments behind it, to make sure that all of the services are provided to the customer all of the time, and that the customer, in the best case, doesn't even see a difference after the migration. During the migration there can be a small difference — that's why maintenance windows are useful — but after the migration there should be no difference on the cluster itself. We also had to remember that database and client migration is not an atomic operation. When, for example, a customer wants to migrate from one region to another, he probably also needs to migrate his applications, and we need to keep in mind that this will not happen at the same time. So there needs to be backward compatibility, where all the endpoints used by the customer are still reachable for everyone outside. Also, on the other side, since we are using DNS to propagate our database endpoints to the customer, we need to keep in mind the DNS propagation times — these can be up to 24 hours. In this scenario, even if we migrate from one place to another, we will of course have different load balancers and different IPs behind them. That means we still need to serve the old endpoints for at least 24 hours — configurable, of course, depending on the customer's use case. And after that we can simply shut down the old cluster. About latency: latency should also be as low as possible. We are using TCP, so each bump in the latency between two servers increases query times for us a lot, I can say. And performance shouldn't be affected in the sense that the customer can have decreased performance, but must not get timeouts on query execution. About the networking, reliability was also very important for us. This is due to the fact that the customer is using the service all of the time. He cannot stop using it, so he should be able to reach the data inside at all times.
About other things like security and reachability: this was also very important for us — in principle, data shouldn't be able to leak outside of the cluster. We are migrating data between different regions. That means we need to go over the internet if we are crossing providers. If we stay within the same provider, we can go through a secure path inside, and then this is not a problem. There also needs to be good cooperation between all the components involved. It is not possible to do this migration from one side only — all of the components need to be aware: DNS, load balancers, operators, even the data centers need to be aware that something is going on. From these requirements, we came to four options we could use to migrate. The first one was direct networking between the two Kubernetes clusters. In the first stage, for example, we discussed a VPN, where pods from one cluster would be able to reach pods in the second cluster by pod IP. The second one was the most universal for us: kubectl port-forward functionality. This allowed us to just expose the Kubernetes API, open a tunnel, and pass the traffic between the two clusters. The next one that came to mind was internet exposure. We wanted to use hostPort to expose the ports via the host IP — in case the kubelets are exposed, the nodes have access to the internet and have, for example, elastic IPs assigned — and just go through that path. As the last option, we proposed the load balancer approach. This required a lot of evaluation, because the main goal of load balancer services in Kubernetes is to load balance the traffic between different pods. We wanted to use it in a slightly different way: to map one port to one pod, and to expose multiple ports through one service.
For example, if you need to migrate three services at the same time, you need to expose three different ports, and you do not want to create three different load balancers, because that would not make sense. After the proposal and the first POCs, two options were picked to be implemented: direct networking, which I will describe on the next slide, and service port mapping based on the load balancer, which I will describe in two slides. All right, so what worked for us? For the first scenario, we picked the Cilium Cluster Mesh mechanism, which allowed us to interconnect two clusters located anywhere in the world — in the demo I will present, it will be between Europe and the United States — and all of the pods were cross-reachable. That means we could reach a pod IP in cluster two from cluster one. It was very useful for us, but it required additional logic to manage the endpoints inside. This is due to the fact that when you specify a service, you can provide a selector, but as you know, a selector is limited to one namespace. With that, we got limitations: the Pod CIDRs cannot conflict, because if they conflict, pods can get the same IPs and even routing will not work properly. And it required one important precondition: Cilium needs to be installed on the cluster, be the network layer of the cluster, and also needs to be ready for cluster mesh. This was not always the case — during our tests, when we tried to change the network layer, it could cause an interruption of the service, because some of the pods were on the new networking schema while other pods were still on the old one. That's why the second option was also important for us: the service load balancer. This could be used almost anywhere, because in this scenario we did not care about Pod CIDRs.
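To make the Cluster Mesh preconditions just mentioned a bit more concrete, here is a rough sketch of Cilium Helm values for one of the two clusters. Everything here is an invented example — the cluster name, ID, and CIDR are placeholders, and the exact keys depend on your Cilium version:

```yaml
# Hypothetical Helm values sketch for one cluster joining a Cilium Cluster Mesh.
cluster:
  name: cluster-eu          # must be unique across all clusters in the mesh
  id: 1                     # unique numeric cluster ID
ipam:
  operator:
    clusterPoolIPv4PodCIDRList:
      - "10.1.0.0/16"       # must NOT overlap with the other cluster's Pod CIDR
clustermesh:
  useAPIServer: true        # expose the clustermesh-apiserver for cross-cluster state
```

The second cluster would use a different name, ID, and a non-overlapping Pod CIDR (for example `10.2.0.0/16`), which is exactly the "Pod CIDRs cannot conflict" limitation from the talk.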
We just have a load balancer which internally maps to some IPs, and in principle, from our point of view, we did not pay attention to that. We just needed the IP of the load balancer and the range of ports, and that was fine. This worked on all major cloud providers without any adjustments from our side. And it gave us the ability to migrate a specific deployment — a specific customer deployment — between regions or even cloud providers. About the limitations: there was a limitation regarding the load balancer implementation, because on managed Kubernetes clusters in the cloud these load balancers are already implemented — but what about bare metal? There you may not have anything that implements the load balancer for you. And about latency, there was an issue on bare metal only: if you need to provide something yourself, it brings additional latency to your connection. On the cloud providers there was no huge impact — just one to two milliseconds of additional latency on the connection, and in principle it was invisible for the use case. The first scenario, the direct connection, worked perfectly fine for upgrades of the Kubernetes clusters. And the second scenario works perfectly fine for migration of a cluster between regions or even continents. All right, so from the network perspective, given this load balancer or direct connection, how do we make sure that our service, which is stateful, is able to talk to the proper endpoint on the other side? This is due to the fact that when you try to reach the database, you usually have some access points — in our case the coordinator — which need to know where the data is placed. In this scenario, all of the services expose the endpoints on which they are listening and reachable, and we had to somehow make this transparent for the cluster we are using.
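The port-mapping idea described above — one LoadBalancer Service exposing several ports, each forwarded to exactly one database pod — can be sketched roughly like this. All names, IPs, and the extra ports are invented for illustration (8529 is ArangoDB's default port); the trick is a selector-less Service whose named ports are matched one-to-one against hand-written Endpoints subsets:

```yaml
# Hypothetical sketch: one LoadBalancer exposing three pods on three ports.
apiVersion: v1
kind: Service
metadata:
  name: db-export              # example name
spec:
  type: LoadBalancer
  # no selector: the Endpoints object below is managed manually
  ports:
  - name: dbserver-a
    port: 8529
  - name: dbserver-b
    port: 8530
  - name: dbserver-c
    port: 8531
---
apiVersion: v1
kind: Endpoints
metadata:
  name: db-export              # must match the Service name
subsets:
- addresses: [{ip: 10.1.0.11}] # pod A (example pod IP)
  ports: [{name: dbserver-a, port: 8529}]
- addresses: [{ip: 10.1.0.12}] # pod B
  ports: [{name: dbserver-b, port: 8529}]
- addresses: [{ip: 10.1.0.13}] # pod C
  ports: [{name: dbserver-c, port: 8529}]
```

Because the subset port names match the Service port names, each load balancer port is forwarded to exactly one pod, so three services can be migrated through a single load balancer instead of three.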
That's why we used services to communicate with each pod. When the pods were in the same namespace, we had a service — call it service one — pointing to the pod behind it via the proper selectors. But if we migrated this pod to the other data center, this selector didn't work anymore. So we used the fact that you can define the endpoints manually: in principle, we kept the same service, removed the selector, and defined the endpoints on our own. In scenarios like the one described here, with direct networking, the endpoint was simply pointing to the IP of the pod in the other cluster, which was already scheduled there. And for the load balancer method, it was pointing to one of the ports of the load balancer, which had one pod behind it in the second data cluster. Okay, so what does the migration look like? Let me describe what we had. On the left side, we had our operator with one service on top of it — the load balancer service — and three database instances behind it. On the right side, we had the Azure cluster, where we first created the service itself. In that service we didn't provide a selector, so we specified that all of the endpoints on the right side — in the second data center — point to the instances which are still in the first data center. Then, during migration, we picked the first member out of the first data center, moved it to the second data center, and at that point we were able to edit the endpoints of the services to point not only to the instances in the same data center as the load balancer, but also to the other data center. This allowed us to have failover recovery: if an instance in Azure went down — and it's normal that it can go down and just restart — there was still a fallback to the instances which were in AWS.
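A minimal sketch of this selector-less Service trick: the Service keeps the name the application already resolves, the selector is removed, and a hand-written Endpoints object points at the migrated member. The names and IPs below are invented examples:

```yaml
# Hypothetical sketch: same Service name, selector removed, endpoint redirected.
apiVersion: v1
kind: Service
metadata:
  name: service-one         # same name the application already resolves via DNS
spec:
  # no selector: Kubernetes will not overwrite the Endpoints object
  ports:
  - port: 8529
---
apiVersion: v1
kind: Endpoints
metadata:
  name: service-one         # must match the Service name
subsets:
- addresses:
  - ip: 10.2.0.42           # example: pod IP in the remote cluster (direct networking),
                            # or the remote load balancer IP:port in the LB variant
  ports:
  - port: 8529
```

From the application's point of view nothing changes — it still talks to `service-one` — while the traffic is quietly redirected to the other data center; a mid-migration Endpoints object can even list addresses from both clusters at once, which is what provides the failover fallback described above.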
We proceeded with that instance by instance: we picked one instance from AWS, moved it to Azure, and modified the endpoints. After all of them were moved, the service in AWS was not pointing to any instance in the AWS cloud anymore — it was pointing only to the instances in Azure. The operator was still left in AWS, but in the next step it was moved as well. Here we have the propagation time we need to allow for DNS, so we waited until there was no more traffic through AWS, and after that was done — everything had migrated, all DNS propagation had taken place — we finally shut down the AWS cluster, and we had migrated to Azure with, in principle, no downtime. And now we will go to a small demo which will show the benefits and the side effects of the migration. On the slide, in the description on the left side of each line — EU or US — we have noted where the client is placed. Then in the second place — EU or US — we have noted where the service load balancer is placed. In the scenario at the top left, we are contacting from Europe a cluster placed in Europe — so a load balancer placed in Europe. On the right side, we are contacting from the US to the EU. We see a huge increase in ping due to the distance. At the bottom left, we see that the ping is almost two times higher than the others. This is due to the fact that our traffic first went to the US and then had to go back to the load balancer in the EU. After that, we see the US-US communication — it has almost the same ping as the US-EU communication, because the traffic just needs to go through the same path. All right. Just a second — here we have the live version of it. I will first start the migration. This is the command which we provide to the — is it — oh, it's too small? Can you? I think you can just press Control-plus. All right.
So here we have the information which we need to provide to our operator to migrate the clusters. In this scenario, we define that in the context Europe — so for all of the pods placed in Europe — we want to migrate them to the second cluster, which in our case is US. I will issue this command, and on this side we will see how the endpoints change. As you can see, one by one, the coordinators — on top you can see the EU cluster, on the bottom you can see the US cluster — and the endpoints for both clusters are changing all of the time. In this scenario, once a pod becomes ready in the US, the endpoints for the service load balancers — which are database and database-ea — change to the new values, which are only in the same data center, reducing the ping for us. This proceeds one by one. So each time one pod is killed in one data center, a new pod appears in the US data center. This is currently going on — the second one started, now it will proceed with the third one. We'll see that the third endpoint disappeared from the EU. That's the second one; it will just proceed, and the third pod will start in the US in a few seconds. Give me a second — it should take about 33 seconds. Yes. Okay, the third pod has been killed. After the third pod is started in the US, we'll see that all of the endpoints which were in the EU are pointing to the pods in the US. And from this point in time, our service is fully migrated. And due to the fact that we use the services inside, everything happened in a transparent way for the service in this namespace. Okay, I'm going back to the presentation. So what did we learn in this implementation and investigation phase — what did not work for us? Our first idea was the port forward, but in principle it gave additional overhead in terms of delay.
And it was just not a reliable solution for us, because it added something like 40 milliseconds of delay even when we migrated within the same region. Also, exposing ports via hostPort on a public IP didn't work for security reasons, because our data needs to be protected and hidden in a private network, and we need full security — firewalls or security groups — to protect access to it. In the first step we also had a plan of manually managing the endpoints during the migration, but the problem was scale: you can do it for one customer, but not for 200 customers on a regular basis. And why did we come up with two solutions instead of one? This was due to the different kinds of updates. When we had to migrate a whole cluster, it was easier for us to set up the cluster mesh and do it for everything inside. But if we had to migrate just one deployment within the Kubernetes cluster, it was unnecessary to set up the cluster mesh — we could do it with the load balancer, just for this particular deployment. For the future, our plan is to run the operator across two Kubernetes clusters. Right now, when we run the operator in HA, we provide functionality which uses Kubernetes API leader election, but that will not work when you are in two Kubernetes clusters. We need to come up with a better solution which will allow us to make sure that at least one of them is active — because if the first data center goes down, the second still needs to work. And based on this migration, we came to the conclusion that distributed clusters can also be a good idea: you can use three different regions to spin up your cluster, to be able to handle a single-region failure, which might be very useful for customers who demand such resiliency from their solution. All right, so — yeah, thanks very much.
Feel free to drop by if you have any questions about avocados, about migrating services in the cloud, graph databases, or anything else. Our booth is stand 72, I believe, in Pavilion 2 — just follow the avocado, or just drop by here. Thanks very much. Thank you.