Hello, and welcome, everyone, to Cloud Native Live, where we dive into the code behind Cloud Native. I'm Taylor Dolezal, head of Ecosystem at the CNCF, where I get to work closely with teams as they navigate their Cloud Native journeys. Every week, we bring a new set of presenters to showcase how to work with Cloud Native technologies. These folks will build things, they will break things, and they will answer your questions. In today's session, I'm stoked to introduce Flynn from Buoyant, who will be presenting on multi-cluster failover using Linkerd. This is an official live stream of the CNCF, and as such, is subject to the CNCF Code of Conduct. So please don't add anything to the chat or questions that would be in violation of that Code of Conduct. Basically, please be respectful to all of your fellow participants and presenters. Be excellent to one another. With that, I'd love to hand it over to Flynn to kick off today's presentation. With that, Flynn, please take it away.

Today is going to be very interesting, as I was telling Taylor. I think we might have lost your audio. Yeah, I think we lost your audio, Flynn. Right now is about the time I make a UDP joke, but I'm not sure that you'd get it. Hold tight, folks. We're going to see if we can get some audio back. I'm going to continue assuming that things are mostly OK. Let's see if I can get a screen share going for everybody, for better or for worse. So let me do this. Here's our kind of saving grace, which is that what I have been planning to do here is actually up on the repo. There we go. Hey, OK, I got the screen share going. Things are starting to work a little bit better. Everybody keep your fingers crossed. OK, so packets are flowing. I'm going to talk a little bit about, yeah, packets are flowing, this is a good thing. I'm going to talk a little bit about what we mean by multi-cluster failover and what we mean by failover, that sort of thing. And hopefully, maybe, we'll get to a point where we'll be able to do a little bit of a demo. If not, everything that I was going to do is posted up in a GitHub repo. And I'm going to send the URL of the GitHub repo so you all can follow along, no matter what happens to me. OK. So in there, there is a really huge file called readme.md. I believe that that link will get posted to the main chat in a moment. The readme.md has the steps that I've been following along to get things set up. And hopefully, we'll be able to actually follow some of that live, so I guess we'll see what happens, right? All right, in the meantime, let's talk a little bit about multi-cluster and failover and all that kind of stuff. So first up, OK, that's me. First up, let's see if the screen share is going to catch up now that I've done something different. Wow, this is so much fun when nothing at all is working. I just love that. It's how you know it's a live demo. And thankfully, it's not Friday, so we're good on that front. I'm not sure what would happen on Friday. All right, well, here we go. Yeah, it's coming close to Friday, for sure. So I guess we get to just do this the old entertaining way. Taylor, do you want to drop that screen share? And let's see what we can do. I'm going to try restarting that once. And if that works, then great. And if it doesn't work, then we can manage something else, right? Always, absolutely. Don't try this at home, kids, or something like that. It's wild to think that they had to resort to charades before developing PowerPoint and slide decks.
So hopefully we don't have to go to that today. Well, you know, worse comes to worst, I can, in fact, just go through and talk out loud and we'll see how it goes, right? But I would really like to have at least some things that I can point to; that would be lovely. OK, let's try this once more, shall we? And then it's going to be a lot of fun trying to go through and catch up so we don't run too far over time, right? All right. Yeah, you know, it might not let me share at all. That would be delightful, wouldn't it? I don't want to drop out and come back. I would just kind of like it if it did the right thing here. Well, yeah, great. Oh, well, there's that, but that is not right. All right, well, you know what? I guess we're just going to have to do this with words. How creepy is that? You can go ahead and drop that, because that doesn't seem to be updating with what I'm actually doing.

Let me talk a little bit about failover and what we mean by multi-cluster at all. Basically, failover is a really, really old concept. It has been around since long before Kubernetes. It will be around probably forever. It's basically just the idea that if you have a service that isn't working, then you want to redirect traffic for that service to something that is working, which I hope makes sense to everybody. I'm also seeing some questions in the chat about Istio and Linkerd, and we can come back to that a little bit later. When we're talking about simple failover, we're mostly talking about failing over a service within a cluster. So maybe you have our Emojivoto demo running, which is what I'm supposed to be demoing for you, and you can have multiple instances of the Emojivoto pods running in the same cluster. Then if one of your pods goes down, you can just redirect and let the traffic go to a different pod. And that works out really well.

One of the really interesting things about Kubernetes, though, is that as people have gotten better and better with Kubernetes, they've been getting more and more sophisticated with how they use it, and they keep coming up with more and more different patterns. One of the patterns that's getting a little bit more popular these days is the idea that the entire cluster can be a fungible object. And so this implies not just that you're going to treat the services as something that can die and get immediately restarted, but that you'd like to be able to treat the entire cluster as something where, if the whole cluster crashes, no big deal: just go ahead and pick it up with a different one. This is a really cool idea. And multi-cluster failover is a specific example of this, where if you have a service in one cluster that dies, you can just go ahead and route traffic over to another cluster, the same service in a different cluster. It isn't quite as dramatic as having your entire cluster catch on fire and be replaced by another cluster for everything, but you can do that as well if you treat the ingress as the service that's going to be failed over, right? So that's basically what we're talking about with multi-cluster here. Questions so far, as I try to figure out how to continue with some of the other possibilities here? Not seeing anything. All right.

So, multi-cluster failover and Linkerd. There's another old concept in software, layering, where as things get more and more complex, you try to split them up into layers.
So you have something simple that happens at the bottom, then you layer something a little bit more complex on top of the simple thing, then you layer something more complex on top of that. This is all over the place in computing, from networking through into application design; Kubernetes itself is a good example of this. And in fact, Linkerd uses the same concept when we talk about failover and multi-cluster. At the very bottom of this stack, and I have to play a little bit fast and loose here, we have the Service Mesh Interface (SMI) extension to Linkerd. The SMI extension allows Linkerd to honor the TrafficSplit resource. The TrafficSplit resource can then do things like say, oh, 50% of the traffic going to service foo should go to foo-one, and the other 50% should go to foo-two. It's a simple way of doing canarying or load balance... well, not really load balancing, but canary deployments, things like that. We get to use it for failover as well. The Linkerd failover extension sits on top of the Linkerd SMI extension. And if it sees that a service has crashed, it can manipulate the TrafficSplit resource to divert all of the traffic for the broken service to a service that's actually working. Makes sense so far?

OK, the multi-cluster extension fits into all of this by giving Linkerd not so much a way of magically causing a cluster to connect to another cluster. What the multi-cluster extension really does is allow Linkerd to see what services have been exported in one cluster and bridge only those services into the other cluster. And to Carlos's question, the service mesh should be in both clusters. What you do with Linkerd is you install Linkerd in both of your clusters. For the example here, I'm going to be talking about the east cluster and the west cluster. So they would both be running Linkerd, and then you would provide links from the east cluster to the west cluster, and, if you want, from the west cluster back to the east cluster. The link is directional because it determines the direction of service mirroring. If you link the east cluster to the west cluster, the west cluster will be able to get services from the east cluster. And vice versa: if you link the west cluster to the east cluster, then the east cluster can have services mirrored from the west cluster. That in turn allows us to use the TrafficSplit resource to split traffic between services that aren't even in the same cluster, by using those mirror services. And that's the way all of this works: we set up a TrafficSplit that allows Linkerd's failover extension to flip traffic between the normal service in the same cluster and the mirrored service in the other cluster.

All right, so let's live dangerously and try sharing the screen. Because you know, that would be kind of nice. Oh, hey, the menu opened this time. Yes. That's always a good sign. Everybody keep your fingers crossed. Let's see if I'm working at all at this point. We can still hear you. It looks like your video... okay, now your video is coming back. All right, we'll see if maybe the packets are gonna flow again. While we're at it, I suppose I can talk about... hey, look at that. We kind of got some things going there, maybe. Yeah. I picked the wrong window to share. So while I'm messing with this a little bit further, let's go ahead and talk to Carlos's questions there.
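To make that layering a bit more concrete, here is a minimal sketch of the kind of SMI TrafficSplit the SMI extension honors. Everything in it is illustrative: the namespace, the apex service foo, and the backends foo-one and foo-two are made-up names, and depending on your SMI version the apiVersion may be v1alpha1 rather than v1alpha2.

```bash
# Hypothetical TrafficSplit: half of the traffic addressed to "foo" goes to
# backend foo-one and half to foo-two. The failover extension works by
# rewriting exactly these weights when a backend stops being healthy.
kubectl apply -f - <<EOF
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: foo-split
  namespace: demo
spec:
  service: foo            # the apex service that clients actually call
  backends:
    - service: foo-one    # gets 50% of the traffic
      weight: 50
    - service: foo-two    # gets the other 50%
      weight: 50
EOF
```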
Yes, there are situations where it can be a very good thing to have both of your clusters active. And yes, you do need to be careful about those, for all the usual reasons: you don't wanna be redirecting traffic to one cluster that's then gonna redirect it back to the first cluster. Definitely a thing that you need to be very careful of. A more common scenario, actually, would be that you might go ahead and link the clusters together bidirectionally, but you might export different services. So for example, a kind of contrived example would be that you might choose to have the web server in the East cluster fail over to the web server in the West cluster, and then your single-page-app service might fail over the other direction. That feels a little bit weird; it's not clear to me that that's really the sort of thing that would happen in the real world. But on the other hand, something that might well happen in the real world is to imagine that you have clusters that are in different regions, and now imagine that you're dealing with GDPR, for example, where it's very, very important that the personal information for your European users stays in Europe and for your American users stays in America. That's a scenario where it could very well make sense to allow the web service itself to fail over. In the American cluster, you would say, oh, the primary web service is the American web service, but if need be, we can fail over and use the European web service, and vice versa for Europe. But you would never export the services that would allow access to user data out of their respective clusters. And hopefully that makes a little bit more sense. All right, so I think we can do some stuff here.

To Ugo's question, it's going to depend. In the case of the failover, will the traffic still be passing through the first cluster to only then go to the failover cluster? It kind of depends on which service it is. In general, the traffic has to get to the point that Linkerd can get a hold of it. So with just two clusters, yes, you would have to go from your first cluster over to your second. But we could also imagine a scenario where you have three clusters, where one of them holds the ingress and the other two only hold back-end services, in which case the ingress cluster would be the one redirecting to one of the others, and the traffic would not, at that point, need to go through the other back-end cluster. You can go from the ingress to the West cluster or the ingress to the East cluster without having to go from the West to the East. Did that make sense? That would be a little easier to do with drawing things.

All right, so normally when I'm doing these things, I can go through and actually run all these commands while we're looking at this stuff, and you get to see all of it live. I kind of don't dare do that right now, because I'm kind of convinced that if I try to do that, the world will come to an end. So instead... okay, for some small value of the world coming to an end. But at least now I can show you the things where I can walk through the steps and, maybe more importantly, point out the gotchas that are in here. And actually, let me first start with the bit where the assumption we're making here is that you have two clusters for this demo and that they are called East and West. And if you're trying to do this with clusters that are not named East and West, it doesn't really matter.
You can just change the context names in the rest of this file, or you could use kubectl config rename-context if you want to be a little destructive about it. But I want to point out a specific thing if you're trying to do this with K3D, which is what I was doing. And for that, we need to look in the create.md file. Come on, browser, it's just a markdown file. You can do it, it'll be okay. We believe in you. What's really remarkable about this is knowing full well that an LTE cell modem is vastly faster than a bunch of things I used years ago, and now it feels so slow and unpredictable. It's just remarkable. Okay, let me point out these things, because these are really important. If you're trying to do this with K3D... or let me back up a moment. If you're trying to do this sort of thing with Linkerd at all, there are two big requirements. The first one is that your clusters have to have IP connectivity, which sounds kind of silly, but yeah, the clusters have to be able to talk to each other. The other really important thing is that, and we'll talk about this a little bit later, they have to share a trust root, because they're gonna do mTLS between the two clusters, which is very, very helpful when they talk to each other over the public internet. If you're going to set this up with K3D, there are some kind of weird gotchas in here. Specifically, when you're doing this, you must set up all of your K3D clusters so that they're on the same Docker network. Otherwise, they won't be able to talk to each other at all. And because they're on the same network and they're bridged through the same host, you also have to play games with setting the port for the API server and the ports you want to expose. We set this up with an ingress controller so that we can actually use things like name-based routing, and all of those want to expose ports to the host. But again, this is K3D on the Docker network: they all show up with the same IP address, and so you have to give them different port numbers. I'm not gonna go through all the rest of this script, but I did wanna point out those gotchas, because those are very, very important.

So let's start walking through, assuming that you already have the clusters set up. This really, really long markdown file has all the commands to go through and set up the clusters from scratch and get everything installed in a way that will work, so let's go ahead and walk through that, since I don't think I dare try to run these right now. Once again, a couple of obvious things here: this starts by running linkerd check to make sure that your East and West clusters can actually run Linkerd, which is kinda important since we're using Linkerd. Then we get into certificate setup. There's a whole Service Mesh Academy workshop on certificate management that goes much more into the details of what... I think we might have lost you there, Flynn. Give it just a second so it comes back. Oh, am I back again? There, yeah, you're back. Okay. Where'd you lose me? Was I talking about the step command and stuff? Yes. So like I said, there's an entire certificate management workshop that goes into considerably more detail here, but we basically are using the step certificate create command to make a single trust anchor certificate. We do that here. And then we create an issuer certificate for each cluster. The clusters have separate identity issuers, but both of the identity issuers are signed by the same root, the same trust anchor.
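As a rough sketch of those K3D gotchas and of the certificate step being described: the network name, ports, and file names below are illustrative rather than the literal contents of create.md, but the shape is the same, one shared Docker network, distinct host ports per cluster, one trust anchor, and one issuer per cluster signed by that anchor.

```bash
# Two k3d clusters on the SAME Docker network so they can reach each other,
# each with its own API-server port and its own host ports for the ingress.
k3d cluster create east --network=mc-demo-net --api-port 6440 \
  -p "8080:80@loadbalancer" -p "8443:443@loadbalancer"
k3d cluster create west --network=mc-demo-net --api-port 6441 \
  -p "8081:80@loadbalancer" -p "8444:443@loadbalancer"

# One shared trust anchor, created offline with the step CLI...
step certificate create root.linkerd.cluster.local root.crt root.key \
  --profile root-ca --no-password --insecure

# ...and a separate identity issuer per cluster, both signed by that same
# root, so that cross-cluster mTLS validates against a common chain.
for cluster in east west; do
  step certificate create identity.linkerd.cluster.local \
    issuer-${cluster}.crt issuer-${cluster}.key \
    --profile intermediate-ca --not-after 8760h --no-password --insecure \
    --ca root.crt --ca-key root.key
done
```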
That way, each cluster gets to manage all of its own workloads, but whenever the East cluster talks to the West cluster, mTLS just works, because the root of the chain is the same certificate in both. After we've made the certificates, we can then go through and install Linkerd. This is pretty much straight out of the getting-started instructions for how to install Linkerd 2.12, with the notable exception that for both clusters, we explicitly say which certificates we want it to use rather than letting Linkerd generate its own. And again, if you have questions while I'm walking through this... I know I'm going a little bit quickly, because we're running a little behind where I would like to be, what with the network restarting. Okay, hopefully I'm back again here. Yeah, I think it's gone through a couple of blips; the last one was short, but this one's been a little bit longer. We'll see if... Was it long? There we go. Now you're back. Okay.

After getting Linkerd itself set up, we install Grafana and Linkerd Viz, and then we install an ingress controller. Again, this is mostly just to make sure that we don't have to use kubectl port-forward all over the place to be able to talk to things. I'm kind of biased; I've had lovely experiences with port-forward dying out from under me, things like that. Okay. Next, almost the last thing in the install-everything step is we install the Emojivoto setup. This is straight out of the Emojivoto quick start, except that we're just installing it into each different cluster. And then the last thing we do is set up access through the ingress controller so we can talk to things. I think I can show at least this... you have that.

So, okay, on to actually setting things up for failover, and for multi-cluster in general, and the failover extension. The important thing to note is that neither the SMI extension nor the failover extension ships with Linkerd itself. So this is using the Linkerd CLI to install the SMI extension; the failover extension we'll do a little later on with Helm. Again, these both get installed into both clusters, so that we have the traffic-splitting functionality from Linkerd SMI, which we need after the install. I think we might be having some latency once again. Apologies, folks. The Linkerd multi-cluster... Let's see here. Might just have some network issues. Let's try a few things. Let's see if Flynn comes back here in just a second. Then we'll go ahead and continue. We might disable our videos and just leave the slides up here. ...extension is also pretty straightforward. One sort of weird thing is... am I back? You're back, yeah. I'm gonna turn off my video. And I think if we turn off our video that should fix some of the things. Cool. Actually, you know what we could do is, why don't you share a browser window with us in it? Yeah, yeah, let me do that. Because hopefully that'll work out a little bit better, we hope. Absolutely. Yeah, and if you need me to scale or... And I am gonna go ahead and turn my video off. And then if you could put that in our private chat, I can get that loaded up. It's already in the private chat. That's the... sorry, that readme. The Mesh Academy link? Yep, that's the readme. Perfect. I'll get that up here and go ahead and share my... All right, and then if I can get your help, Libby, with sharing that. All right, so scroll down to the section that says linking clusters. There we go. Okay.
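Very roughly, the install sequence just described looks like the following. The flags for supplying your own trust anchor and issuer are the standard Linkerd 2.12 ones; treat the rest (context names, certificate file names, the Helm repo and chart name for the failover extension, and the assumption that the linkerd-smi CLI plugin is installed) as assumptions in this sketch rather than the exact contents of the readme.

```bash
for ctx in east west; do
  # Core Linkerd, with the shared trust anchor and this cluster's issuer,
  # instead of letting the CLI mint its own certificates.
  linkerd --context=${ctx} install --crds | kubectl --context=${ctx} apply -f -
  linkerd --context=${ctx} install \
    --identity-trust-anchors-file root.crt \
    --identity-issuer-certificate-file issuer-${ctx}.crt \
    --identity-issuer-key-file issuer-${ctx}.key \
    | kubectl --context=${ctx} apply -f -
  linkerd --context=${ctx} check

  # Viz (for the stat commands used later), the SMI extension for TrafficSplit
  # support, and the multicluster extension for service mirroring.
  linkerd --context=${ctx} viz install | kubectl --context=${ctx} apply -f -
  linkerd --context=${ctx} smi install | kubectl --context=${ctx} apply -f -
  linkerd --context=${ctx} multicluster install | kubectl --context=${ctx} apply -f -

  # The failover extension is installed with Helm; repo/chart names are assumed here.
  helm --kube-context=${ctx} install linkerd-failover \
    --namespace linkerd-failover --create-namespace \
    linkerd-edge/linkerd-failover
done
```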
Yeah, so if you scroll up... and yeah, I think you could make that quite a bit larger in terms of font. If you scroll up just a touch more... there we go, that looks great. Okay, so what I was trying to say is that that line there that's doing the annotation on the linkerd-multicluster namespace, right in the middle of that bash block, is really more about providing a little bit more information for debugging than it is a strict necessity. Yeah, that one. It's in here because it was helpful for me when I was doing all this stuff. Okay, so once again, linkerd check is your friend, and running linkerd check on both of those clusters can be important, right? All right, so let's scroll down a little bit more to the linking clusters section. I wonder if I'm gone again. I think I can hear you; it might have just dropped out. There we go, I can hear you. Great. I should just make Taylor narrate all this, put him on the spot.

A particular gotcha that can be a little bit weird, especially if you're using K3D, is that when you set up the links using the Linkerd command, one of the things that has to happen is that if you want to link the East cluster to the West cluster, then the West cluster has to be able to get cluster permissions for the East cluster. It actually has to get Kubernetes credentials. And the way it ends up doing this is that the Linkerd CLI itself is what tries to go ahead and read the credentials for the Kubernetes cluster. But unfortunately, if you're using K3D, the Linkerd CLI running on the host will end up seeing a localhost address for the API server, and that won't work from the other K3D cluster. So I'm not gonna go through this absurd API-server-address function, but the point here is that it can go through and figure out the right API server address to use for the setup that we're using here. And it should also do the right thing if you have a pair of clusters that are, say, Civo clusters or GKE or something like that, where they can just talk to each other over the network.

If you scroll down a little bit more, let's take a look at that link-the-clusters command. So I wanna point out something here. Most of the time, if you carefully read these commands, you'll see things like linkerd --context east piped to kubectl --context east. So we're running linkerd with the context for a specific cluster, and then we pipe it to kubectl for the same cluster. But that is not what happens when you're doing the linkerd multicluster link command. What happens is linkerd multicluster link goes through and constructs a Link object, a Link resource, that then needs to be applied; but you construct the Link resource based on information for one cluster, and then you apply it to the other cluster. So if you look carefully at these two linkerd multicluster commands in this bash block, you're gonna see that the linkerd command has a context for one cluster, and the kubectl command has a context for the other cluster. That is not a typo; that is necessary. So it's very important as you're looking through this to remember that.

The next thing happening in step three is that we actually go through and export some of the services. In this case, we're going to export the emoji service, and the emoji service is the one that produces lists of emoji. So it's kind of down deep in the call graph.
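Here's a hedged sketch of that crossed-context link step before we get to exporting; the cluster names follow the demo, and the K3D-specific API-server-address juggling is left out for brevity.

```bash
# Build the Link from EAST's credentials but apply it to WEST: west can then
# mirror services that east exports. Note the deliberately mismatched contexts.
linkerd --context=east multicluster link --cluster-name east \
  | kubectl --context=west apply -f -

# Same thing in the other direction, so east can mirror services from west.
linkerd --context=west multicluster link --cluster-name west \
  | kubectl --context=east apply -f -

# linkerd check, once again, is your friend.
linkerd --context=east multicluster check
linkerd --context=west multicluster check
```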
What happens is that your web browser is gonna talk to the web service, and then the web service ends up talking to the emoji service, as do some of the other things. But we're gonna export the emoji service from the east cluster to the west cluster and vice versa, so that from either cluster, you have a way to reach the emoji service in the other cluster. And as noted there, after you do that, if you run that kubectl --context east get service -n emojivoto, you should actually see a service in there called emoji-svc, and you should see one called emoji-svc-west. In the west cluster, you'll see emoji-svc and emoji-svc-east. So, wow, yeah, I really wish I could show this stuff. At this point, after going through and exporting the services, everything should be working exactly as normal, because the services being exported are not gonna be taking any traffic yet. And if you run linkerd viz stat, you can see that the emoji service will be taking 100% of the traffic, and the emoji-svc-west and emoji-svc-east mirror services won't be getting any traffic at all.

All right, let's head on down to the failing-over part. There is one more thing that has to be done in terms of setup, which is that we have to install a TrafficSplit resource to tell the Linkerd failover extension that it's okay to go through and shuffle traffic around. Taylor, can you still hear me? Can you scroll down a bit, if so? Absolutely. Or possibly I'm not seeing what is actually on your screen, but... I'm in the install-the-traffic-split section, but I can keep going down if that's... Perfect, okay, yeah. The install-the-traffic-split section is exactly... oh, now I see it, great. That's exactly what we wanna see. You can actually... well, you'll have to scroll way up to the top, but you could actually look at the emoji split east file in the failover config directory. It's checked in in the repo. It doesn't do anything terribly fancy. Yeah, if you just scroll all the way up to the top, you'll be able to click on the path of that. There you go. Click on the failover config directory, and then click on emoji split east.

So if we look at this, it's not very profound. It might be a little tough to read, depending. There are a couple of things that are important here. One of them is there's a label that explicitly says, hey, Linkerd failover, it's okay for you to mess with this; Linkerd failover will not touch a TrafficSplit that does not have that label. There's also an annotation that tells it the primary service, and that is letting it know that if nothing's gone wrong, emoji-svc is the one you want to use. This is important because if you really wanted to, you could say things like, oh, emoji-svc-west is the primary. I don't know why you would want to, but make sure it's set the way you actually want: set it to the one that's actually in your cluster. Finally, the last bit here is you can look at the weights down there in the backends, and you'll see that currently this TrafficSplit is configured to send all of the traffic to the primary service, emoji-svc. That's exactly what we want. When we install this TrafficSplit, we don't want it to do anything silly like splitting 50/50 or anything weird like that. Okay, so let's go back to the readme again. And many thanks to Taylor for his wonderful slide running here.
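Sketching the export step and the checked-in TrafficSplit just described: the mirror.linkerd.io/exported label is the standard Linkerd export mechanism, but the service name emoji-svc, the file name, and the failover label and annotation keys shown here are my best reading of the repo, so double-check them against the actual failover config files.

```bash
# Export emoji-svc from each cluster; the multicluster extension only mirrors
# services that carry this label.
kubectl --context=east -n emojivoto label svc/emoji-svc mirror.linkerd.io/exported=true
kubectl --context=west -n emojivoto label svc/emoji-svc mirror.linkerd.io/exported=true

# A moment later, east should show its own service plus the mirror of west's.
kubectl --context=east -n emojivoto get svc    # expect emoji-svc and emoji-svc-west

# Roughly what the emoji-split-east file contains: a TrafficSplit the failover
# extension is allowed to manage, with all traffic pinned to the local primary.
cat <<'EOF' > emoji-split-east.yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: emoji-split
  namespace: emojivoto
  labels:
    failover.linkerd.io/controlled-by: linkerd-failover   # opt-in for the operator
  annotations:
    failover.linkerd.io/primary-service: emoji-svc        # the "home" backend
spec:
  service: emoji-svc
  backends:
    - service: emoji-svc        # local primary takes everything for now
      weight: 1
    - service: emoji-svc-west   # mirrored service takes nothing, for now
      weight: 0
EOF
```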
Happy to help. A little bit further down, a little bit further down, to the install-the-traffic-split section again. There we go. So we're just going to go ahead and apply that. This is the first thing in this whole readme where we're only doing it on one cluster. We're only applying that split to the East cluster, because for purposes of this demo, we're going to demo a failure of the East cluster's emoji service and have it fail over seamlessly to the West cluster's emoji service. After applying that TrafficSplit, you should not see anything change, because once again, all the traffic is still going to the one in the local cluster. And you can check that again with that linkerd viz stat command. That one, I'm actually running watch in front of it, just to sit there and watch it for a few seconds to make sure that there's nothing going over.

All right, scroll down a little bit further to the fail-a-service section there. We're not doing anything profound here at all. We literally just scale the emoji workload to zero replicas in the East cluster. And at that point, you should instantly see the weights flip in the TrafficSplit. If you get the TrafficSplit back from Kubernetes, you'll see that emoji-svc will have a weight of zero and emoji-svc-west will have a weight of one. If you then go back to the browser, everything will still work, because it's just going ahead and sending all the traffic over to the still-running service in the West cluster. If you run that watch linkerd viz stat there, you will see, over the course of a few seconds, the traffic move from emoji-svc to emoji-svc-west. That's actually all there is to it. If you then rescale the emoji deployment back to one replica, you'll see the weights flip again, you'll see the stat show you the traffic moving back, and that's really all there is to it. It's like 80% careful, careful setup, and then things just start working.

One really important thing to point out here is that the way these extensions are stacked on each other gives you an enormous amount of flexibility to do things differently. Linkerd failover's notion of a failed service is actually fairly simple: as long as there are running endpoints, it won't fail over. If you want to do something more sophisticated, you can certainly do that. All you need to do is figure out how, in your environment, you can arrange it such that when you see a service that's wrong, you change the weights on the traffic split; and you can also then rely on the multi-cluster extension providing you services that link across to the other cluster. So that actually goes through everything that I wanted to show you. So I think we still have a little bit of time for questions if there are any, or Taylor, if anything comes to your mind. I'm gonna try turning my video back on. We'll live a little dangerously here. I'll do that and I'll stop screen sharing.

If you do have any questions, please feel free to throw them into the chat where you're watching this live stream and we will be more than happy to answer on that front. And there's our first question, from Michael: what happens on a split brain between clusters? So I think that might be the same question that Carlo just asked. Split brain can mean a couple of different things, but I'm gonna assume for the moment that we're talking about a network partition.
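Before the Q&A, here's a sketch of the failover exercise just walked through, reusing the names and the TrafficSplit file sketched above; the exact linkerd viz stat invocation in the readme may differ.

```bash
# Apply the split in east only; east's emoji-svc is the primary there.
kubectl --context=east apply -f emoji-split-east.yaml

# Watch where the traffic is actually going (100% local, at first).
watch linkerd --context=east viz stat trafficsplit -n emojivoto

# Break the primary: scale east's emoji deployment down to zero replicas.
kubectl --context=east -n emojivoto scale deploy/emoji --replicas=0

# The failover operator flips the weights: emoji-svc drops to 0, emoji-svc-west goes to 1.
kubectl --context=east -n emojivoto get trafficsplit emoji-split -o yaml

# Bring the primary back and watch the weights, and the traffic, flip home.
kubectl --context=east -n emojivoto scale deploy/emoji --replicas=1
```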
The simplest version of the answer to that question is that Linkerd is going to trust that your network is functioning. And if it is not, then you should absolutely be using health checks and maintaining... yes, you should absolutely be using health checks for that. I believe that if the multicluster extension simply loses contact with the other cluster, the mirrored services go away. I just realized that I was going to check on that, so I'll make a note of it and we'll update the readme. But I believe that that is part of the answer there as well.

I think one question that I had for you, Flynn, was: as teams go about getting services up and running, it can be kind of a feat in and of itself, whether it's lifting and shifting applications, starting to containerize, or moving something to Kubernetes. And then, like you said, implementing a service mesh can mean one of many things as you start to up the comfort level with Kubernetes and add things on, add extensions, et cetera. Are there any things that you see teams go to do or adopt that might not work as well as they would expect? Or do you have any tips, tricks, or insights for people looking to adopt a multicluster kind of setup? Keep it simple. I like this. I mean, there's obviously a lot that could be said there, right? But yeah, you're 100% right: there are a lot of ways you can get in trouble. So for example, I would strongly, strongly encourage things like: pick a service to do this with, pick a setup, you know, and don't do it in production to start off with. I think the right way to phrase this one really is that as you do more and more complex things with Kubernetes, it becomes more and more important to play around with it and to really try to understand what you're getting yourself into before you just go ahead and flip it on. I'll take this opportunity to link that back to the question about Istio versus Linkerd and point out that one of the things we hear over and over and over and over again is that one of the things people really like about Linkerd is that it's entirely possible to just fire it up in a K3D cluster and have a proof of concept running in an hour, as opposed to several days. And I cannot emphasize enough how important it is to do that sort of just playing around with things, right? I obviously had to go through and get all this stuff working, which now you can't see, thank you, Faas. But I obviously had to go through and take the time to become accustomed enough to multi-cluster to talk about it and to get it running. And yeah, you know, it was tricky at first, until I kind of wrapped my head around what it was really doing. Does that answer your question?

Absolutely. Yeah, a lot of the stuff that we do as tech evangelists is learning about things and then teaching other people about them. And it's hard to overstate the importance of doing things iteratively and really getting a handle on something before you go ahead and try to roll it out across your entire world, all at once. I like that advice in terms of keeping it simple and just, like, you know, you don't have to stress yourself: iterate, take it slow, make notes, validate your approach, and those kinds of things too. I completely agree. I think that in a lot of cases we tend to kind of jump over that, right into the code or the configuration or the solution.
And, you know, you can take that time to actually think it through or validate that this is the right approach. Kubernetes is complex enough all by itself. We don't really need to make it harder than it needs to be. Yeah. Hugo asks if multi-cluster is in beta or if it's ready for production. Multi-cluster is production-ready; it is not beta. The SMI extension is production-ready. The failover extension is, you know, production-ready, but it's also simple, and so it's the one that I suspect you might want to look the closest at, to see whether it's gonna meet your needs or to see what you need to do to make it meet your needs.

And Carlo asks about the BigQuery service. So here's the interesting thing about things like BigQuery: the big question I have about BigQuery is not, is it possible to fail over that kind of service? The big question I have is, how are you managing all of the data behind it? Are you trusting that both of your clusters are, you know, mounting the same volume somehow? It might work if they're in the same regions. I don't know, but most of that to me is less a question about permission and more a question about state and data. In terms of permission, though, if you do a little bit of digging, you will find that the multicluster link command actually defines a service account and sets up some RBAC and things like that. And so I would approach permission first by looking at that, and then I would approach it further by looking at user authentication in the application itself, if that makes any sense. I don't really talk in the repo about that RBAC stuff, so I'm gonna make a note of that, because that was the thing that was very surreal until I understood what was going on. The mechanisms by which multi-cluster is arranging for cluster permissions are normal, plain-vanilla Kubernetes stuff, and it works by allowing the linkerd command to read credentials and move them around, you know, tokens and things like that. So that would be another good thing to talk about a little bit, I think.

Boris asks: it looks simple when you have static data; how do you manage dynamic database data? That is literally an entire separate talk. The short answer is that as soon as you are dealing with persistent data, to my way of thinking that's no longer just a service mesh problem, and it may not be just a Kubernetes problem. So, you know, we can talk about database replication, for example, but that's a whole different world. Yeah. I would love to have a simpler answer for you than that, but yeah... wouldn't that be cool? Wouldn't it be awesome if your service mesh could handle database replication for you? I would love that. That would be so cool. I would outsource that in a minute. So that'd be fantastic. Yeah, I mean, just, you know, talk about a business opportunity. Wow. Any other questions? Anything else on your mind, Taylor? I think that's it. Honestly, it's given me a lot to think about and I'm really excited to go and test this out myself. I think that in the previous, the N minus one through five, places that I've worked, having a multi-cluster setup has been one of the biggest and most difficult technical things to accomplish with those teams, in terms of, you know, idempotency, all the queues, all those fun things. So this is gonna be fun to try out. Yeah, there's some really, really tricky stuff that happens when you're dealing with that. Yeah.
I think one of my most favorite ones was a RabbitMQ cluster, which experienced split brain on its own. So, like, yeah, I've got stories, also for another talk. Yeah, two or three jobs ago, was it? We had a guy who was just completely enamored of RabbitMQ and wanted to use it for everything. So we got a lot of really good experience about things RabbitMQ is great for and things it's not so great for. So yeah, it'd be really interesting to hear, though, as you play around with it. I'd love to hear about how that stuff goes for you. And on that note, I put in the chat, hopefully it'll show up soon, the URL for the Linkerd Slack and how you can find me there. I'm also gonna put one more link in there. Yeah, so that's slack.linkerd.io. I am @flynn there, always around, almost always around. And one more thing I wanna put in there as well: I just added the link to the Buoyant Service Mesh Academy site. So every month, we at Buoyant do a free workshop, in a very similar format to this one, where we go through and pick a topic and try to tear it apart, and hope that the network is working. The next one coming up is actually a deep dive into mTLS, and we would love to see you there as well. And if that link doesn't make it up, it's at buoyant.io slash service-mesh-academy. So other than that, many thanks. It's always a pleasure to be here. I'm sorry this one was so rocky, but hopefully we were able to salvage something out of it and people got something out of it, I hope. Okay. No, thank you so much for joining us today, Flynn. It's always a pain with ISPs, CDNs, networks, but you were able to make it happen. Thank you so much. I'm telling you, I think the real moral of this story is that you should always have two internet connections to your house. We just talked about multi-cluster, so multi-ISP would be great. Exactly, right? Fully redundant upstream connections, that's what we all need. Thank you so much, Flynn. Thanks very much, and I hope to see you all again soon. Thank you, thank you. Thanks, everyone, for joining the latest episode of Cloud Native Live. We really enjoyed your interaction and discussions today. Thanks for joining us, and we hope to see you again soon. See you later, everybody. Bye. Ah, and there's the link to the Service Mesh Academy. Great.