Hello, great. Yep. Excellent. Welcome everyone to this panel, "Two Houses, Both Alike in Dignity: Gateway API and MCS API." I'm Stephen Kitt; I work for Red Hat on the Submariner project. I'm Rob Scott; I work for Google on Kubernetes networking, and I'm a Gateway API maintainer. And I'm Mike Morris; I work for HashiCorp on Consul, and I'm one of the co-leads of the GAMMA initiative. You may notice that unfortunately Laura could not be here in person, but she was able to pre-record what I think is a very creative intro.

This is Laura, coming at you from the past, as I was not able to travel to KubeCon EU this year. Instead I have the privilege of sending my digital form to you to give a quick intro to our panel topics for "Two Houses, Both Alike in Dignity: Gateway API and MCS API." What we want to talk about today is these three general projects, the MCS API, the Gateway API, and service meshes, and where they overlap and how they are coming together. We've been thinking a lot about these overlapping joints here in the Venn diagram. All of the projects in the center here bundle some backends together, using some higher level of abstraction, like a service, to bundle endpoints together. MCS and service meshes are both very concerned with providing ways to discover a service, especially across cluster boundaries. The Gateway API and service meshes both do traffic shaping, with different levels of expressibility and sophistication. Meanwhile, service meshes usually have a whole other suite of things that they are able to do independently, like securing traffic, for example. Since all of these projects are still a little bit in flux, it's informative to know at least a little about their history. The most important part is that these efforts have all been evolving at the same time, in the face of shared problems that were really clarified in 2019: users needed more sophisticated traffic solutions than were available in
Kubernetes at the time. But the solution space, which was pioneered by all these different projects collectively known as service meshes, was too fragmented. So a few of those projects connected on their own attempt to converge, called the Service Mesh Interface, or SMI spec. At the same time, SIG Multicluster and SIG Network respectively approached individual pieces of the problem in the form of the MCS and Gateway APIs. The MCS and Gateway APIs went through a very exploratory period in 2020, but have standardized a lot, especially in 2021, both in terms of API maturity level and the breadth of implementations that were using the APIs. Then in 2022 the successor, kind of, to the SMI spec, the GAMMA initiative, picked up the mantle to try and bridge the gap between the converging service meshes and these maturing Kubernetes-native API standards.

We were really attached to the title of our talk, so we're going to quickly demonstrate how these APIs all come together with a little bit of Shakespeare. I don't know how many of you watched the 1996 version of Romeo and Juliet. We have Leonardo DiCaprio cast as Romeo Montague over here on the left, and Claire Danes as Juliet Capulet over here on the right. If you're familiar with the story, you know that Romeo and Juliet fell in love despite being in desperately feuding families that didn't want them to be together; in this case today, we're going to pretend that they're in desperately feuding clusters instead. So let's say Juliet says her classic line, except in a Kubernetes kind of way. Instead of "Romeo, Romeo, wherefore art thou Romeo?" it's "romeo.montague.svc.cluster.local, wherefore art thou romeo.montague.svc.cluster.local?", right? In the story, Romeo meets her at a party at the Capulet house.
He later visits her at her window. So we need to give Romeo a way to respond to this request at her window, at her cluster, instead of being isolated in his own cluster. We can represent this with the MCS API. If we have Romeo over here as a Deployment local to the Montague cluster, with a Service in front of it, also local to the Montague cluster, we can create a ServiceExport from the MCS API to export it to the Capulet cluster, and a ServiceImport representing that Romeo service will then appear in the Capulet cluster, so that the Juliet Deployment there can interact with it. Part of this is that services exported across clusters this way can be queried by a familiar DNS name, so we can change Juliet's line: now she can ask "wherefore art thou romeo.montague.svc.clusterset.local," and Romeo will be able to respond.

But in the story, apparently it's not enough just to meet up on occasional evenings at a windowsill. Instead, Romeo and Juliet get a third party, Friar Laurence, to marry them ASAP. To marry them, Friar Laurence has to address each of them individually to ask that ceremonial question, "do you, Romeo, take this woman Juliet," right? So let's make him an API resource as well. We introduce the Gateway API here, because we can imagine Friar Laurence as a client using an HTTPRoute from the Gateway API that splits traffic based on URL paths: /romeo is backed by the ServiceImport matching the Romeo service over here, and /juliet is backed by the ServiceImport matching the Juliet service over here. So when he instead asks "do you, /romeo, take this pod /juliet as your lawfully wedded wife," each service can respond even though they're in totally different clusters.
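Putting the analogy into manifests, here is a minimal sketch of the pieces just described. The namespaces, names, gateway, and port (montague, romeo, juliet, wedding-gateway, 80) are invented for the analogy; the API versions are the ones current around the time of the talk; and the ServiceImport is normally materialized by the MCS implementation rather than created by hand.

```yaml
# Montague cluster: export the existing "romeo" Service to the cluster set.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: romeo            # must match the name of the Service being exported
  namespace: montague
---
# Capulet cluster: the MCS implementation materializes a matching
# ServiceImport, reachable as romeo.montague.svc.clusterset.local.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: romeo
  namespace: montague
spec:
  type: ClusterSetIP
  ports:
    - port: 80
      protocol: TCP
---
# Friar Laurence as a Gateway API client: an HTTPRoute that splits traffic
# on URL path, with ServiceImports (rather than Services) as its backends.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: friar-laurence
  namespace: capulet
spec:
  parentRefs:
    - name: wedding-gateway   # a Gateway assumed to exist already
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /romeo
      backendRefs:
        - group: multicluster.x-k8s.io
          kind: ServiceImport
          name: romeo
          port: 80
    - matches:
        - path:
            type: PathPrefix
            value: /juliet
      backendRefs:
        - group: multicluster.x-k8s.io
          kind: ServiceImport
          name: juliet
          port: 80
```

Note that `kind: Service` is the default for a backendRef; explicitly setting the group and kind is what points the route at the multi-cluster import instead.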
I do have to give up the analogy for the sake of slide space now, but I want to introduce one last recent concept to give an example about GAMMA. Service meshes that conform to the GAMMA specification now have a way to interact with multi-cluster services from the MCS API and the traffic routing rules of the Gateway API, by respecting routes that have a ServiceImport as their parentRef. Binding this ServiceImport in the parentRef indicates that the service mesh should intercept traffic destined for this ServiceImport and apply these Gateway API-expressible rules. So here we have a rule to match this path and add this header. If you're following along with the API spec here: if only the Romeo service had been configured to add a potion-antidote header when attempting to /wake-up the Juliet ServiceImport upstream, perhaps the tragedy of this age-old story could have been averted.

So here we are, four years along and a convoluted analogy later, but we're really excited now that the GAMMA initiative is helping tie all of this together in this overlap area, bringing all of these projects together and crystallizing them into a unified ecosystem that addresses users' needs and gives you a way to use all these projects together. Now we'll hand it off to the real live in-person folks to take your questions and elaborate a bit more on these themes from our abstract. So Rob, Stephen, Mike: take it away.

Yes, I love that intro; she did an amazing job. We're going to spend the next few minutes going through these questions, and then we'll have lots of time at the end to take any questions from the audience. So if you have any, we'll definitely have time for them, but we'll start by going through these. Yeah, so I guess to start, Rob and Stephen: how do these APIs work together?
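Laura's last example, the GAMMA-style binding, could be sketched roughly like this. The /wake-up path and potion-antidote header are invented for the analogy, and the exact parentRef conventions were still evolving in the GAMMA initiative at the time, so treat this as illustrative only.

```yaml
# A mesh-consumed HTTPRoute: binding a ServiceImport as the parentRef tells
# the mesh to intercept traffic destined for that import and apply the rules.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: wake-juliet
  namespace: capulet
spec:
  parentRefs:
    - group: multicluster.x-k8s.io
      kind: ServiceImport
      name: juliet
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /wake-up
      filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
              - name: potion-antidote
                value: "please"
```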
Well, we've been spending a lot of time talking about just that, as you can imagine. These APIs have a history of a few years at this point. I've been working on Gateway API for three and a half years, since KubeCon San Diego, and as Laura mentioned, the MCS API has a similar timeline. Some of the maintainers have been working on both; for example, Jeremy, one of the maintainers in SIG Multicluster, contributed, I think, TCPRoute to Gateway API, and I've been, you know, helping review some of the MCS API work. So there has been some crossover. But as strange as it sounds, multi-cluster services are not a SIG Network thing; they're a SIG Multicluster thing. So this is kind of a cross-SIG effort, and it has taken some proactive effort to make sure that we're working together across SIGs to make these make sense. More recently, you know, we've had these discussions, we've had these high-level ideas, but those ideas were largely stuck in maintainers' heads and not written down anywhere. So we've tried to take that step in Gateway API. We have a concept called a GEP, a Gateway Enhancement Proposal, which is like a KEP, a Kubernetes Enhancement Proposal, and we have one that defines in great detail every little nuance of how Gateway API and MCS API work together. I forget the exact GEP number, but it's easy to find on Gateway API's website. And of course we made sure SIG Multicluster, GAMMA, and Gateway API worked closely together, so we're all collaborating and all have the same ideas for how these APIs interact. But maybe, Stephen, you can provide a bit more context.
Yeah. So one of the things I find interesting is that, as Laura said, all these initiatives have been going on for approximately the same number of years, but there have been very different dynamics. As Rob just explained, the Gateway API is such a big initiative in practice that they have their own enhancement proposal system, whereas SIG Multicluster is much smaller, so we just use KEPs. Another difference is that the Gateway API came out of a lot of frustration with earlier APIs; you know, it's rising from the ashes of Ingress v1. So there was a lot of knowledge already before the Gateway API started, and also a lot of actual real-world users who knew what they wanted. That makes things easier in some ways; getting lots of people involved can also create more complexity, I guess, but it also means that there's a better idea of what's going to be a good API. Whereas in the multi-cluster space, especially in 2019, the SIG was really inventing things, and probably still is to a large extent, because this is still a very research-heavy space. But one of the things I find fascinating about what Laura explained is that the whole Gateway API and MCS API interaction revolves around an object, the ServiceImport, which from the MCS perspective is an implementation detail. When you interact with multi-cluster as a user, all you do is create a ServiceExport, and at some point in the near future that results in you being able to access a service through DNS from your other clusters. You don't care about the ServiceImport, you don't care about the EndpointSlices, and so on. So in theory we could have written the spec without it, but then Gateway API would have had a harder time, and it turns out the ServiceImport was a really useful object just in terms of being able to consume it.

Yeah, absolutely. One of the other things that's been going on during this time is that service meshes have largely been evolving kind of on a parallel
track, outside of the KEP or GEP process. That's given them a bit more freedom to innovate. But now we're at a point where we're realizing that a lot of our meshes are doing fundamentally similar things, at least at the core. We all of course have specializations and different features, but there's an interesting benefit here where we could have a common configuration language for at least the base set of stuff. And looking at what was happening in Gateway API, it started to make a lot of sense: we're doing traffic shifting, and HTTPRoute in Gateway API is basically doing the same thing. So how can we start applying some of these concepts to this east-west use case? And I think, even eyeing what's been happening in SIG Multicluster for a couple of years now, it's been really interesting to follow. I think one of the challenges is that it started a bit earlier than we saw a lot of end users having that multi-cluster use case, and really feeling that need as a pain point. So I guess, just as a show of hands here: how many folks are using the MCS API currently? And how many folks are managing multiple clusters in production currently? A lot more. So yeah, what we're hoping to do is make it easier for everyone to manage multiple clusters in production, as part of integrating all three of these APIs together. Yeah, and how many people use service meshes? Yeah, all right, quite a bit more than use MCS.

All right, we'll move on to the next question: how have they evolved separately, and as part of a larger upstream initiative to make the multi-cluster experience feel more native? Stephen, maybe, do you want to start that? And yeah, I'm sorry.
We did, we did prepare this. Yeah, right, so this ties back to some extent to what Mike was just saying about where the MCS API came from. As you might imagine, and as we saw from the show of hands, lots of people have multiple clusters, a smaller subset use service meshes, and a smaller subset still are aware of the MCS API and use it. It started out from one view of multi-cluster, I guess, which came from Google, where you have all these clusters that are considered to be the same, from an administrative perspective and from how you want to use them. That isn't necessarily the case in many other contexts, and that's where service meshes come in. And then there was work going on in the API without actual concrete implementations, which I think is a challenging situation to be in. Certainly when we started working on Submariner and integrating it with the MCS API, or at least trying to implement the MCS API with it, we were able to go back to the SIG and say: these things don't quite work as well as you thought, or maybe we're not quite understanding what you wanted to do with this. That helps improve the SIG and make it more real-world. And I guess another big difference is where the outcomes of the SIGs fit into how you use clusters. Gateway API has a direct representation in how you get your traffic to where you want it to go and how you make that available to users. Service meshes also apply there, with policy between services and making services discoverable, whereas the MCS API is a somewhat lower-level, more infrastructure-oriented API. So it's more a concern for infrastructure providers, perhaps, than actual end users.

Yeah, so I can follow up on that a little bit. The MCS API offers two really valuable building blocks.
That's kind of how we see it from the perspective of a service mesh. The MCS API in isolation can be used purely for the service discovery case, by creating services and ServiceExports and using the clusterset DNS to route traffic across clusters. ServiceExport and ServiceImport are actually really useful objects. We had a use case that didn't quite fit in Consul, but we definitely took inspiration from the design of these, and hopefully at some point we can converge with upstream in the future. But exporting services also allows you to do something across cluster or administrative boundaries. There are two different multi-cluster use cases that we've seen in the service mesh world. One is where you have multiple clusters or data centers that are logically identical and are basically serving the same services, and you want them geo-redundant or highly available around the world; you just want to be able to fail over to them or shift traffic between them as necessary, but they're all the same. In larger organizations you often have that within a single team's boundary. But then, thanks to some of the cloud vendors that have made Kube clusters very easy to create and manage, we've seen a challenge of cluster proliferation. In some large organizations, instead of sharing one or a few Kube clusters, you end up with each team managing several of their own clusters, and then the organization as a whole needs to figure out: how do we do service networking across these different administrative boundaries of teams that are all managing their own clusters? That's where I'm hoping to see the MCS API evolve in the future: to support both the seamless use case and the use case across that cluster set boundary.
So yeah, I think there's a lot of interesting work that could happen here. Yeah, great answer. You know, just for the sake of time, I'll keep moving here. But just real briefly I'll say that, as you've mentioned, ServiceImport has been a tremendous building block that we can build incredible things on top of with Gateway API. It just slots in interchangeably with the Service API. In Gateway API you can use either a Service or a ServiceImport as a target, and they both generally work the same, except one is targeting a multi-cluster set of endpoints. It's really neat that, the way multi-cluster services were designed, they just fit in naturally without any extra thought; it just feels like it works.

I'll move to the next one, because it's a bit of a mouthful, but, you know, we're going to talk about CRDs. These APIs are official Kubernetes APIs, but they're CRD-based, and CRDs and this whole process are different and new to Kubernetes. So before I go much further here: how many of you have installed a CRD in your cluster? OK, how many of you have had an issue installing a CRD or working with CRDs? Yeah, yeah, they can be challenging. I've had a talk before where I complained about CRDs, so I have my frustrations with them. They are not a perfect thing by any means; you know, there are things that we miss. I've worked on both upstream APIs, like EndpointSlice and Ingress, and of course on Gateway API, which is CRD-based. Despite our best efforts with upstream APIs, it's been very hard to really collaborate on API design. The way the enhancement process works, it's pretty rigid; there's a very clear timeline.
Everything has to go through KEPs. We've tried, and at best we've had maybe five people able to collaborate on an API. Gateway API has a hundred and forty plus people that have contributed to it so far, and that number keeps on growing. So CRDs have enabled us to collaborate more than we've ever been able to before, with any Kubernetes API that I'm aware of. Also, as you might expect, because these are CRDs, when we release a new version of Gateway API it's immediately available, and we say we support the trailing five Kubernetes minor versions. So wherever you are, whatever cluster you have, you're probably within the latest five Kubernetes versions, hopefully, and in that case you can go ahead and install the latest version of Gateway API, the latest controller, the latest implementation, and you're good to go. Again, one of the problems that we had with upstream APIs is this huge delay: I'm going to try something, then wait a year, get some feedback; oh, let me change that one thing, wait another year, get some feedback. This is a much faster feedback cycle, and that's been really helpful for us to iterate. And, you know, we've also got lots of implementations, more than 20 now, and that's helped provide very helpful feedback and ensured that the API we're building is truly portable across lots of different vendors. Maybe one of you wants to chime in? I'm not sure.

Yeah, so one of the keys to this portability is conformance tests. This is something that lets us know that all of these different implementations are actually behaving in a standard, expected way. This has been enormously beneficial in enabling implementations, and I'm also really excited that MCS is starting work to add conformance.
Yes, so yeah. This is a really big deal, because it enables implementations. As an example, when I started working on Consul API Gateway's implementation of Gateway API, the conformance tests for Gateway API were still largely being written at that time. So while we had the spec, and we could read it and figure out, OK, this mostly makes sense, this is how these resources map to each other, sometimes there were unclear behaviors. In that case, one of the things that was really, really helpful was opening PRs upstream to add or propose new conformance tests. That drives the conversation upstream, across multiple implementations, on what the actual expected behavior is, what it should be, and whether some implementations do different things today. And sometimes we say, OK, we can have a little bit of flexibility here, and we'll end up making some compromises so that everybody agrees that the behavior makes sense. So this has been an enormous boon for driving consensus across implementations, as well as being a tool for understanding: am I building a thing that actually implements the spec the way it is designed? So yeah, that's been really helpful, and I am so excited that MCS is starting work on adding conformance tests to the project. I think it'll be really beneficial for enabling additional implementations. Definitely.
Yeah, and it's also interesting to measure the spec itself, because as we implement conformance tests we realize that there are some features that none of the implementations implement the way we thought the spec meant they should. Sometimes that means that we need to go and revisit the spec, or perhaps go and educate all the implementers. You know, one of the first implementations of the MCS API, as you might expect, since it came from Google, was in GKE, and for a long time the GKE implementation was actually quite far from being spec-compliant. When you see that happen, obviously it means that things take a while to settle. It's also a bit easier to almost TDD a project when, instead of trying to read through the KEPs with a fine-tooth comb and figure out what they actually mean... Yeah, the MCS spec is quite short. Yeah, we don't have hundreds of GEPs.

All right, we'll move on to our last question. We'll keep this short, so we have time for your questions as well. But finally: where do these all fit in the ecosystem of service discovery solutions, service meshes, and vendor-specific tooling?
Maybe Mike, you can start us off. Whew, um, I guess I feel like I touched on this a little bit in the beginning, where we're starting to see this kind of convergence between these different APIs, and finding in them a common set of tools to solve problems that most users of any of these individual projects are facing. I think one of the real boons here is that, for as much excitement as there is around Kubernetes, as many people as there are at this conference today, it's still on the leading edge of the adoption curve. We know that there are a lot of folks that are not doing this yet, and one of the biggest challenges there is education: building the skill set to be able to work on these projects. Having this portable configuration language helps enable that. If, instead of learning one of 20 different ingresses, you learn one Gateway API, then if you move to a new job, or if you have to train somebody new joining your workplace, there's one thing that you can get started with that will get you a significant part of the way toward understanding how to do this work. And there are also ecosystem benefits: we know that there are folks developing learning courses and things like that. So yeah, hopefully this convergence helps grow the community as a whole. It's one of the things I'm most excited about. Yeah, yeah, I don't have much to add, really. Great.

Yeah, I'll just say real briefly, you know, obviously, coming from the GKE perspective, we've loved the features that this unlocks for us. Earlier this KubeCon,
I think on Wednesday, I had a talk with Lee Wynn from AWS, and we talked about how cloud providers, them on EKS and us on GKE, are using these two APIs together to enable some really powerful routing across clusters. Of course, Gateway API enables lots of advanced routing functionality that hasn't been in Kubernetes yet, and multi-cluster services allow us to bridge that multi-cluster service gap. So combined, they're a really cool set of APIs to work with. But with that, I want to make sure we have time for questions, so if anyone has questions, please line up at either of these mics and we'll be happy to continue the discussion. Thanks, everyone.

So I had a talk with a few friends yesterday about all of these new CRDs that are coming in. How do you manage dependencies across the CRDs? That's been an issue with a lot of other systems in the past, so I can see that it might be an issue with versioning. How do the controllers fit together, and so on?

Yeah, that's been a huge challenge. In Gateway API we've been really focused on backwards compatibility. Similar to what you'd expect upstream, we want to ensure that you can upgrade to the latest version of a CRD, and even if the controller or controllers you have in your cluster are expecting an older version of that CRD, it'll still work; and similarly, if your controllers are expecting a newer version of a CRD, that should also still work. So we're really, really focused on ensuring that whatever version of the API you have installed, it should be compatible. Of course, we're encouraging everyone to keep on the latest version. On GKE, for example, we'll just manage that for you and install the latest CRDs across all versions. That's what we recommend most people do, because again, this is backwards-compatible; you're not going to lose anything by upgrading to the latest set of CRDs. They're purely additive changes. Great question.
I don't know if anyone... Yeah, well, MCS avoids that by only having a single version so far.

Related to his question: have you given any feedback to API Machinery, or come up with ideas for how we should be advancing CRDs themselves? It's another example of using CRDs, which are a very generic concept, for something very specific. I'm thinking that through both of these APIs we might be looking at, OK, for Kubernetes, what comes after CRDs?

Yeah, absolutely. We have a wish list that we've shared with API Machinery; there's a doc and, I think, an issue tracking that. And, you know, I want to say that although CRDs have their issues, they've enabled so much here that, as much as I can complain about them, the things that they have done for us are amazing. And they are only getting better. There have been some talks at KubeCon that have mentioned the updates to CEL that enable a lot of really advanced functionality right in CRDs. It will take some time, especially because we're supporting the trailing five releases of Kubernetes, so it will take longer for us, but there is a lot of good work going on, and we are trying to collaborate with API Machinery to ensure that CRDs only get better over time.

Yeah, the CEL stuff in particular, for folks who haven't heard about it, allows synchronous validation of CRDs. Right now Gateway API is dependent on installing an admission webhook, and that's a thing that you have to assume a user does; if it's not there, or doesn't catch something, then, if you're building a Gateway API controller, you have to figure out what to do when something sneaks by it. So yeah, CEL: we can't use it today because of the version policy, but it was really exciting to see how we'll be able to make CRDs more robust in the near future.
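For folks curious what that CEL support looks like, here's a hedged sketch of a CRD schema fragment; the minReplicas/maxReplicas fields are invented for illustration. The `x-kubernetes-validations` extension lets the API server enforce cross-field rules synchronously at admission time, with no webhook to install.

```yaml
# Fragment of a hypothetical CRD's openAPIV3Schema using CEL validation.
# In a rule, "self" refers to the object at the level where the rule is
# declared (here, spec).
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        minReplicas:
          type: integer
        maxReplicas:
          type: integer
      x-kubernetes-validations:
        - rule: "self.minReplicas <= self.maxReplicas"
          message: "minReplicas must not be greater than maxReplicas"
```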
Yeah. Yeah, that makes me think of, you know, the usual appeal that you'll get from talks about SIGs, which is: if you have frustrations with CRDs, do voice them. Join the SIG calls. From the outside it might seem like there's a big, intimidating set of cabals that run Kubernetes; in practice it's just a few people, and we would like more people. For example, in SIG Multicluster, what we see is that there are maybe four or five people involved over the long term, and we see people pop up every now and again. Someone will come along and give us a demo of an implementation, and we'll never hear about that implementation ever again. And then we'll hear from other people that there's this implementation we've never heard of that's actually quite good or is gaining traction, or this cloud provider has added the MCS API to their offering without telling us about it. So there's a big disconnect between what the SIGs perceive and what happens in the real world. So yeah, more input from end users, and it doesn't have to mean getting involved, just coming along and saying: hi, we're using your stuff.

And one of the benefits, whether you're an end user or an implementer of either of these specs, is that participating in the discussion in these meetings is a way to help influence the shape of them. Both of these are still rapidly evolving, so if there's something that your implementation needs to do, or wants to advance upstream, or if you want to figure out how to make the spec support a specific use case that your users care about, now's the time to do that. Come join the SIGs, join the discussions. Gateway API meets Monday evenings, US time. GAMMA meets Tuesdays, biweekly, swapping between an EU-friendly time and a US-friendly time on alternating weeks. And SIG Multicluster meets Tuesdays at a European-friendly time, so mornings for the West Coast and late afternoon for Europe.
Yeah, biweekly. And yeah, don't be afraid to just pop into a call, or, if you see something that doesn't make sense, open an issue. That was how I first got engaged with Gateway API: lurking on the calls with my camera off, just paying attention and seeing what was happening, and then, when we started implementing it, filing issues for things I saw in the spec that didn't make sense. So it's definitely an accessible project, both of them, oh, all three of them if you count GAMMA, where we want all new users to join and help determine where these are going. Yep, and to avoid an age-old tragedy. All right, I think we've hit time. Thank you so much, everyone.