How are you all doing today? My name is Davis Phillips, and this is Chandler Wilkerson. We work in Solution Engineering at Red Hat, and we've been working with a product called KubeFed. KubeFed basically allows you to take multiple clusters and manage them as one. The federation product in Kubernetes has been redone: there was a v1, and now there's a v2. It's still in tech preview as an OpenShift product, but in Kubernetes the re-spin lets you manage multiple clusters from a single control plane.

Now, how does it work with GitOps? GitOps is a relatively new concept that uses a Git repo as the source of truth for your infrastructure and deployments. That's not a new concept in Kubernetes or OpenShift, but it's a newer concept from an infrastructure perspective. We'll get to a demo at some point, and lastly a Q&A session. I guess I covered the agenda already.

Alright, so with that plan laid out: KubeFed is still a work in progress. It is very much still an alpha product, not in beta yet, so there may be some changes to how the product works compared with today. The basic idea stays the same, but the implementation may change over time. And lastly, backwards compatibility is not assured until the 0.1.0 beta.

So our world today looks like this. We have multiple Kubernetes or OpenShift clusters on different platforms. Cloud provider A could be vSphere, OpenStack, AWS, GCP, even bare metal. All these clusters are managed independently, and typically you'd have infrastructure endpoints either to get traffic into the clusters or to load balance across the clusters, but no way to sync the clusters or sync applications between them. Tomorrow's idea is a hybrid cloud with Kubernetes, with a control plane that manages all three clusters. That control plane lets you roll out and migrate applications between the clusters seamlessly using Kubernetes concepts like deployments.

Some of the problems we're trying to solve with federation: having a unified control plane for all applications, because right now, like I said, everything is independently managed. High availability for applications, being able to move from data center to data center with limited or no downtime. Disaster recovery. Geographically dispersed load balancing of applications across data centers. And then application portability: being able to move your application around where you need it, when you need it, based on demand.

At Red Hat, the people working on the federation product include the CTO office, Systems Engineering (our group), the engineering group writing the binaries, the storage team for our storage backends, and networking and other groups for both the interconnects between clusters and the ingress and egress points for each cluster. And lastly, Federation v2 is an open source upstream product.

Alright, some vocabulary for federation. I covered a little of this before, but multi-cluster means lots of single, independent clusters that aren't aware of each other. There's no shared workload management. There could be CI/CD pipelines that span those clusters, but for the most part they're completely independent of one another. We explored the idea of a stretch cluster, which is taking a single cluster and stretching it across multiple data centers, so your CI/CD and API endpoints are co-located. We encountered some storage and performance issues with this.
In most scenarios you'd have to have metro-area connectivity with very little latency for that to work. Now, on to the federated cluster, which is multiple clusters connected by a single control plane. The host cluster is the cluster actually running the control plane; it's the manager of all the clusters. And lastly, the member clusters are the ones joined to the control plane.

So why KubeFed then? It's a control plane to manage resources across multiple independent clusters. It's cloud agnostic, which is an important selling point for application portability: you don't want to be locked into AWS, GCP, OpenStack, or vSphere. You want to move your application around as you need to, and stand up and move data centers as you need to. The best part is that basically any of the API resources in Kubernetes can be federated. That means your deployments, config maps, secrets, service accounts, all your standard Kubernetes resources can be federated and applied across multiple clusters. And there are no strict latency requirements; the clusters can be connected over long distances with high latency and low bandwidth.

Here's an example of a standard resource being converted to a federated one. Your typical deployment is the Kubernetes object you use to deploy your pods, applications, secrets, all the things for a standard deployment. Converting it to a federated deployment adds placements, so you can put the application across multiple clusters, and overrides, so you can customize it and move it between clusters. kubefedctl, the CLI for KubeFed, lets you convert standard resource types into federated resource types.

Federated resources are made up of three main properties: the template, the placement, and the overrides. I'll show you an example here; this is a federated config map. If you look, the spec has a template, and that template gets applied to all three clusters, or however many you're using. In this case we have placement for two: the config map is applied to cluster one and cluster two. And then you can do overrides for individual clusters if they need to be customized; in this case, cluster two has an override for the data path. And the values are, obviously... what did you say those were? "Ala has a cat" and then "Ala has a dog," in Polish.

One of the ongoing discussions for KubeFed and federation: initially there was cluster-wide federation, which federated the entire cluster and all its resources. Since then, there's been discussion about moving to namespace-scoped federation, so that only a single namespace is federated. That matches the multi-tenant Kubernetes ideology by keeping federation independent of the entire cluster and applying it to specific namespaces in each cluster. Cluster-scoped KubeFed handles resources in namespaces cluster wide, like I was saying. That's required if you're trying to federate resources that are themselves cluster wide, like a cluster role, a cluster role binding, or storage classes. Basically, you control multiple namespaces with one set of cluster relationships. Namespace-scoped KubeFed allows multiple instances of KubeFed to operate independently. That means lower role privileges for the service accounts, not cluster wide, only for specific namespaces, and each instance handles its cluster relationships independently.
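To make the template, placement, and overrides structure concrete, here is a minimal sketch of a federated config map along the lines of the one just described. The names, values, and the types.kubefed.io API version are illustrative and may differ from the slide and from newer KubeFed releases.

```yaml
# Hypothetical FederatedConfigMap: one template, placed on two clusters,
# with a per-cluster override for cluster2. Names and values are illustrative.
apiVersion: types.kubefed.io/v1beta1
kind: FederatedConfigMap
metadata:
  name: test-configmap
  namespace: test-namespace
spec:
  template:                 # the ConfigMap content propagated to every placed cluster
    data:
      A: "ala ma kota"      # "Ala has a cat"
  placement:
    clusters:               # which member clusters receive this resource
    - name: cluster1
    - name: cluster2
  overrides:                # per-cluster customizations applied on top of the template
  - clusterName: cluster2
    clusterOverrides:
    - path: /data/A
      value: "ala ma psa"   # "Ala has a dog"
```

In this sketch, the controller on the host cluster would reconcile a plain ConfigMap named test-configmap into each placed cluster, with cluster2 getting the overridden value.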
While trying to federate multiple clusters together, one of the issues we've run into, in addition to getting the objects synced between the clusters, is figuring out how to get the networking working between them. This is mostly work in progress and about identifying the key points. Basically, one of the first things you need to figure out is how to get a tunnel going between the back-end service networks of the clusters. One of the issues we've hit there is CIDR ranges: the actual IP ranges tend to overlap when you just blindly deploy clusters. So if you want to build tunnels between clusters, you have to think about that up front and make sure the IP ranges are distinct between the clusters so that you can actually write routing rules. Routing itself is an additional issue, because you may have an overlay or service network like VXLAN inside an OpenShift cluster, and you have to get into the rules and adjust how traffic flows between the clusters. Once you've figured out tunneling and routing, you have another problem: you actually want to be able to use services from one cluster in another cluster, and you realize that lookups against a cluster's DNS for a service usually resolve under cluster.local. So putting some sort of cluster identity into that naming is work still to be done. And finally, once you get your DevSecOps going, you need to make sure your network security people are involved and that there's a way to apply actual network policies to this kind of setup, so that when you do have data going off-cluster you're not violating policies you want to enforce.

We're just looking at the technologies right now. One we're fast-following is Submariner. Submariner has this concept of setting up a bunch of member clusters and laying what they call undersea cables between them, which are really just VPN connections. There's discussion about whether this should move into the Kubernetes multi-cluster SIG, the special interest group which also handles KubeFed at the moment. And, oh my gosh, we left that in... sorry, that was from another DevConf. Apologies.

Okay, the KubeFed operator, and how we're deploying in OpenShift 4. OperatorHub is pretty much what we're moving to for deploying anything complex, and there is an operator for KubeFed. There may still be an older version in OperatorHub called Federation, but we're moving to KubeFed now. As I said, it's the supported way when you're going to OpenShift 4 and beyond, and it lets you do both namespace-scoped and cluster-scoped federation. I call that out because it wasn't that way at first; it was namespace-scoped only. As of slide-writing time we're at 0.1.0, but not quite the full beta version just yet; I think we're still on release candidate 5 or so, and we're very close to full beta once the team works out some remaining issues.

There's one gotcha I wanted to point out here, because of the single-namespace versus cluster-wide distinction: when you install the operator, it actually makes more sense to install it to watch a single namespace if you want your KubeFed to be cluster wide, and vice versa: if you want KubeFed instances in multiple different namespaces, each doing single-namespace federation, you have to make the operator watch all namespaces. It's just a bit of a naming disconnect there.
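To put the two scopes side by side, here is a hedged sketch of the KubeFed custom resource the operator waits for (discussed further in the Q&A below). The operator.kubefed.io API group, the kind, and the scope field reflect our reading of the kubefed-operator project at the time and may have changed, so treat this as illustrative only.

```yaml
# Hypothetical KubeFed CR for a cluster-wide deployment. Dropping this in is what
# makes the operator stand up the KubeFed controller manager.
apiVersion: operator.kubefed.io/v1alpha1
kind: KubeFed
metadata:
  name: kubefed-resource
  namespace: kube-federation-system   # where the CLI expects to find the cluster-scoped instance
spec:
  scope: Cluster        # cluster-wide federation
---
# Hypothetical namespace-scoped variant: one per tenant namespace,
# each instance managing its own cluster relationships.
apiVersion: operator.kubefed.io/v1alpha1
kind: KubeFed
metadata:
  name: kubefed-resource
  namespace: team-a      # illustrative tenant namespace
spec:
  scope: Namespaced
```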
Here's the link for the operator at the bottom of the slide.

Sorry, I couldn't quite hear... okay, so the question is: you mentioned that we could install the operator in such a way that it watches one namespace; does that mean we federate the objects in that namespace across to another cluster? So there's the disconnect, and that's why the naming gets confusing. The operator watches for a specific custom resource called KubeFed, which then instantiates the actual KubeFed controller manager. All the operator is doing is waiting for you to drop that KubeFed resource in, either for a cluster-wide version or for a specific namespace. If you want cluster-wide KubeFed, you want a single KubeFed CR, usually in the kube-federation-system namespace, because that's where the CLI automatically looks for it. So that's why there's a little bit of confusion around the operator. Thanks for the question. Thank you.

Okay, so like I said, we have a lot of teams looking into this, and we've identified some challenge areas we're still working through. One is getting DNS and ingress for services into federated clusters, with the ability to switch between regions quickly. We have the typical problems everybody has with DNS time-to-live: trying to shorten the time it takes for changes to take effect, and DNS is not very cooperative with that. Multi-cluster storage is a challenge, because if you're thinking about moving apps around, it's always easy if your apps are stateless and not so much if they're stateful, so we're still working out exactly what federation looks like with actual storage rather than just flinging resource types around. Federating operators themselves would be a really powerful concept that we're still exploring. Because operators imply a sort of ordering to things happening, or just the ability to run whatever code you need, it's harder for them to fit the slap-it-down-and-expect-it-to-be-eventually-consistent model of Kubernetes, which is part of the reason operators came into existence in the first place; so federating that is a challenge. Beyond what's in an operator, there's federating applications themselves: how you do that in the right order so they work and can then migrate. We have some work in that area, and some demos. Then there are infrastructure concerns, like whether you can get KubeFed to federate some of the more fundamental things behind a Kubernetes cluster and allow more admin-type roles to be federated; still working on that as well. And the final one is the sticky point of day-two operations. We like to spin up quick demo clusters, say "hey look, it works," shut everything down, and forget about tomorrow. So we always have to come back and ask: how do you ensure authorization works, how do you ensure backup and recovery work, what do you do if you need to move a data center, how does all of this keep working?
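For reference, the kind of federated application resource that ties these pieces together (and that the Pac-Man demo later builds on) looks roughly like the sketch below. The image reference, cluster names, and the types.kubefed.io API version are illustrative assumptions rather than the actual demo manifests.

```yaml
# Hypothetical FederatedDeployment: an ordinary Deployment wrapped in a template,
# placed on three clusters, with one cluster's replica count overridden.
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: pacman
  namespace: pacman
spec:
  template:                      # a normal Deployment spec propagated to each placed cluster
    metadata:
      labels:
        app: pacman
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: pacman
      template:
        metadata:
          labels:
            app: pacman
        spec:
          containers:
          - name: pacman
            image: example.com/pacman:latest   # illustrative image reference
  placement:
    clusters:
    - name: east1
    - name: east2
    - name: west2
  overrides:
  - clusterName: west2
    clusterOverrides:
    - path: /spec/replicas
      value: 2                   # e.g. run two replicas in the west region only
```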
So, as I was saying earlier, GitOps is a shift of the source of truth from a pile of YAML in an admin's directory to a Git repo, with proper pull requests, a chain of the events that have happened, sign-offs, and all that kind of thing. There are a number of tools coming out right now, and we've been looking at a few of them while working out a GitOps workflow. What we've found so far is that this obviously looks different from what we're doing with KubeFed, because it's a pull modality rather than a push idea, although some GitOps tools do multi-cluster, and some of them push and some of them pull to the remote cluster, so there isn't even quite consensus on that. You can basically think of GitOps as running kubectl apply, or oc apply, over and over again on a set of YAML. So you get a workflow where you're putting YAML into a Git repo somewhere, and a container makes sure it gets pushed out to the cluster.

What we're thinking with KubeFed is to put KubeFed in between, so you keep the GitOps model, but GitOps is feeding federated resources rather than plain resources into your cluster, and KubeFed then handles the multi-cluster part. We'd like to do that not just because we're betting on KubeFed, but because we see real value in putting KubeFed in the middle of that. One piece is that you can set up overrides per cluster, so the multi-cluster side of this is more manageable. Another is replica scheduling preferences: rather than setting a replica count per cluster, KubeFed can express something like "across this set of clusters I'm propagating this resource to, I want to ensure there are n copies in total," rather than "each cluster should have five copies" or "that cluster should have three." That's what our demos look like right now, and we're still working towards exactly what the final shape will be. Do you want to drive the demo?

What we're going to show you here is a federated application running across three different AWS clusters: one in us-east-1, one in us-east-2, and one in us-west-2. Chandler set it up earlier, and there's an HAProxy listening at a URL, set to do a round-robin distribution across the application. So we're going to open it up, then open it up again, and the application will show you which availability zone it's in and which cloud it's in. We were having some Mongo issues earlier and he said he got some of that fixed; let's see if he did. It's demo time. No, no Mongo still, that's alright. But it's the availability distribution that's important. You can see here we're in AWS, and this is East 1. This highlights one of the big issues with multi-cluster and federation: ingress points. We really need a great solution to be able to say, hey, you're coming in from here, let's push you over there. There are several proprietary load balancing solutions that offer capabilities like this, but something open source, cloud agnostic, and multi-location is one of the stories we're working on here.
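Coming back to the replica scheduling preference idea for a moment, here is a hedged sketch of what such a resource can look like, following our reading of the upstream KubeFed user guide. The scheduling.kubefed.io API version, cluster names, and weights are illustrative, not the demo's exact configuration.

```yaml
# Hypothetical ReplicaSchedulingPreference: instead of fixing a per-cluster replica
# count, ask KubeFed to keep 9 total replicas spread across the placed clusters.
apiVersion: scheduling.kubefed.io/v1alpha1
kind: ReplicaSchedulingPreference
metadata:
  name: pacman            # name/namespace match the federated workload being scheduled
  namespace: pacman
spec:
  targetKind: FederatedDeployment
  totalReplicas: 9        # total across all clusters, not per cluster
  clusters:
    east1:
      weight: 1
    east2:
      weight: 1
    west2:
      minReplicas: 1      # always keep at least one replica in the west
      weight: 2
```

The controller then maintains per-cluster replica overrides that add up to the requested total.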
So this is basically what we're looking at with the demo environment. As Davis was mentioning, we have, I don't know how well it comes out on the screen there, us-east-1, us-east-2, and us-west-2 across the sides here, and we're running the same set of MongoDB pods in each. They're using Amazon EBS storage, essentially, and we have a Pac-Man pod that's designed to work with the Mongo database for the high score system. When the demo is cooperating, what you find is that you can run it in any of the regions, save your high score, and the high scores are all saved to the same database regardless of which region you're running in. That's our federated Mongo slash data replication setup; essentially, the Mongo pods are all in one replica set, which is outlined more specifically in the following slides. And there's the HAProxy with, essentially, just a Route 53 DNS entry that says pac-man.sysdesng.com is over there, or in this case we used pac-man-devconf if anybody wants to play. What's important to note is that the HAProxy is running on the federated control plane, so there is a single ingress point, and from there traffic is distributed between all three data centers.

Okay, so this is where we've got a number of our demos if you want to go and try this out yourself. We have some that outline the difference between the scopes, namespace scope versus cluster scope, and we have the Mongo and Pac-Man ones as well. And a bunch of useful links: the upstream project for KubeFed, which lives in the Kubernetes special interest groups' KubeFed GitHub repo, and our federation-dev repo for OpenShift-specific demos. There are a couple of presentations from KubeCon around federation and KubeFed; the one from 2018 probably refers to it as Federation v2. We have a Katacoda scenario which lets you spin up two or three test clusters and then federate them quickly, under the cover of a Katacoda learning experience. And we have a couple of blog entries, one describing the process that happened with the KubeFed rename, and one about mixing OpenShift versions, which is kind of nice: we assumed it would work, but then we actually tested it, so you can take a 3.x cluster and a 4.x cluster, federate both, and move applications between them. It works mostly as expected, which is great for things like migrating from 3.x to 4.x. All your stateless workloads, at least; we are working on the stateful side of that as well, which is also a work in progress. Okay, with that we'll say thanks and turn it over for questions.

So the HAProxy is on the same cluster as the federated control plane? Exactly. Basically there's a federated ingress point that has to exist, and it has to point somewhere, which is kind of the whole ingress dilemma: while you have three routing targets, you still have a single ingress, so you still have that point of failure. Yeah, so if that goes away, then you're sunk, huh? Yeah. That's the goal: finding a solution where we can distribute that, so that losing it stops being a problem. Within a DNS TTL kind of window? Yes. There's a great project that tries to do something like this called external-dns, a Kubernetes add-on that lets you create a service IP or a load balancer IP, and it will take that and automatically create a DNS record in your DNS provider of choice, whether you're using Route 53, GCP, BIND, whatever.
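As a rough illustration of that external-dns pattern: you annotate a Service, and the controller publishes a matching record in your DNS provider. The hostname and service details below are invented for the example; the annotation key is the one the upstream external-dns project documents.

```yaml
# Hypothetical Service that external-dns would pick up and publish as a DNS record.
apiVersion: v1
kind: Service
metadata:
  name: pacman
  namespace: pacman
  annotations:
    # external-dns watches for this annotation and creates the record
    # (e.g. in Route 53) pointing at the load balancer that gets provisioned.
    external-dns.alpha.kubernetes.io/hostname: pacman.example.com
spec:
  type: LoadBalancer
  selector:
    app: pacman
  ports:
  - port: 80
    targetPort: 8080
```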
It works with all of them. Okay, thank you.

Yeah? You mentioned that you're also working on figuring out how to federate operators. What exactly are you attempting there? Is it to have a single controller running on cluster A and have the same controller watch resources on some other clusters? That's the basic idea, right? Logic dictates that you should just be able to do this, but we're trying to figure out where the corner cases are. We're still trying to figure out whether there's something we should be changing within KubeFed to make this work, or whether it just works.

The other question we had is: when we set up federation across multiple clusters, does whatever sets up the federation also validate the API compatibility between the different clusters? There's some validation that takes place between the clusters, because, like we were saying earlier, you can't create a cluster-scoped resource unless you have cluster-wide federation or cluster-wide bindings without fully federating. But for specific resource types, I'm not sure; I'd have to defer to the upstream project. I imagine there is something, but so far most of the validation errors we hit are the local cluster saying something's wrong with the YAML. It's still very much a work in progress. Actually, right before we came in here we were trying to fix the demo, and our old YAML files were pointing to a cluster context that didn't exist. We ran it and everything completed without an error, and we were thinking: it's still not working, that's strange, and there's no cluster role being created in any of the clusters; what's going on here? So, like I'm saying, it's still coming along, right?

The last question I have is: you showed that you have MongoDB replication across the three clusters. Is that a KubeFed-managed replication, or is that MongoDB's? That's a MongoDB replica set, right. I mean, this couldn't be done with MySQL, because you'd need the backing storage to be replicated across clusters. So it's one of those corner cases, like he was saying, right? And how many applications, especially enterprise applications, run MongoDB versus MySQL or whatever? Thank you.

It looked like the KubeFed operator is in one of the data centers, in one of the clusters. What happens when the cluster which is hosting the KubeFed operator goes down? Essentially, the reconcile step will no longer be happening, but the other clusters will continue to run with whatever Kubernetes objects have already been deployed. So for instance, in our scenario, if we just knocked out East 1, then East 2 and West 2 would still have Pac-Man. They'd still, and this is the important point, have their own reconcile loops going on, making sure that those deployments stay. So if they also suffered node failures, they would still ensure that the number of pods comes back up to the desired count and everything. That's kind of the cool thing about KubeFed: it sits on top of the Kubernetes reconcile loop.

Second question is: what would be your recommendation for admins who are going to use this? Should they use only KubeFed for all their objects, or only for the objects that they want to have federated? Or do they go around KubeFed, since they can just do an oc create using the API master for one of the clusters? I think it's probably safe to say that you should use KubeFed for the ones that you want to actually federate.
Because at any point in time you could also decide, okay, well, I thought I didn't want to federate this, but I do. In that scenario, you can just run the kubefedctl federate command, basically lift that application into federation, and then choose its destinations. That may also be a good case for cluster-wide versus namespace federation: if you have a specific namespace you want to federate versus federating the whole cluster.

Great talk, by the way. Thank you. So would this allow pods to communicate into another cluster as if they're in the same namespace, or is that not what the federation's goal is? That's not what KubeFed itself is doing. That's why we tack on all these other things before we actually say we're done with federation, right? That's from the networking slide where we were talking about establishing tunnels between the clusters: you've got to do that, you have to have the routing working, and then you have to have the service discovery. That's where the Submariner project he was talking about comes into play. That'll kind of create those routes, the VPN tunnels and everything, for you, for cross-cluster communication with the SDN. Cool, thank you.

If you have federation at the namespace level only, what kind of permission is required on each one of the member clusters? Is being project admin enough, or do I need cluster admin privileges for that? Okay, when you're joining the clusters it establishes a service account, and when you're using namespace-scoped federation it only establishes that service account with namespace admin permissions. Very good.

No other questions? Cool, what's up? This might be easiest to describe if you go back to your Pac-Man infrastructure slide. I totally get the use case, and it seems very easy and makes sense to me, when I have a unified application that I want to run globally, like this Pac-Man service: get it highly available, share state in some sense across these clusters, with MongoDB as you said. What if I wanted these to be private Pac-Man instances? Maybe I'm administering three different Pac-Mans for my three different friends, and I want to keep things in sync, but not necessarily the data; I want each friend to have their own copy of the data. Is that a use case for KubeFed managing multiple Kubernetes clusters? Does that make sense? Is it easy to do? I think that's a much simpler use case than the one we're trying to get to. You think it would be easy to do with KubeFed? Rather than all the work we had to do to make Mongo do this replica set, you'd just have a Mongo pod per cluster. Exactly, okay, thanks. I just wanted to make sure. I guess we're good to go? Yeah, any more questions? Well, thank you all very much for coming.