Hello, EnvoyCon. I'm Harvey Tuch, and I work at Google. I'm the TL of the Envoy platform team. Today I'm going to be co-presenting with Mark Roth, a colleague of mine who leads the gRPC C++ efforts at Google. We're going to be talking about the future of xDS, and in particular some of the changes around versioning and the xDS transport that are going to be landing over the coming year. This talk is in two parts: first we'll look at xDS transport evolution, and then we'll cover versioning. So, getting going with transport evolution. I want to start by exploring some of the limitations in the xDS transport that exists today, in particular when we want to address some of the more sophisticated use cases coming down the pipe. But before I do so, I'd like to point out that the xDS transport is actually pretty remarkable, in that it's brought huge improvements to how Envoy receives its configuration from control planes. Going back to 2017, all we had was polling-based REST for xDS delivery, and that had significant issues with control plane performance, scalability, and update latency. The versioned xDS transport, essentially a versioned pub-sub scheme, which we've introduced since then, has brought about huge improvements, and it's working incredibly well today. So we want to evolve this forward incrementally, without breaking the world and without affecting anyone who's using it today, but at the same time we want to address some of these limitations, which I'm going to talk about. So what are these limitations? The first is that there are issues around cacheability of xDS, and this impacts xDS as we want to scale it up to many, many different endpoints. This could be, for example, in Envoy Mobile, as depicted on this slide, where we might have millions, or tens or hundreds of millions, of endpoints. 
We need a good story for how xDS resources can be distributed and fanned out. There's the xds-relay project, which I was just checking out, which is building the first steps of this. But to do this kind of caching and relay, there are issues. Let's go through some of these. The first is that the xDS resource names that exist today are just opaque strings. They're not really a unique cache key, because they don't include a bunch of useful contextual information, things like node ID or metadata, which are actually provided on the xDS stream but out of band with the xDS resource names. So if you're trying to cache just on xDS resource name, that's not sufficient. As a result, even if you want to use something like traditional CDNs with the xDS HTTP transport, things don't work particularly well today. The next major next-gen use case is federation. This is when you want to have multiple control planes managing xDS. Today we have some limited support for having more than one control plane with Envoy, but it's at a very coarse granularity; you can't, for example, delegate a single route configuration to a different server, and that kind of thing. So in this world, we want to think about having multiple xDS servers. They might be in different clusters, different clouds, on-premise, and so on. We want to be able to provide fine-grained delegation of authority over resources, and failover between servers as resources come and go. Ultimately, what folks probably want to do as they scale up is disaggregate the control plane into a number of microservices, which is a form of federation this supports. So today, as I mentioned before, we lack some of this support. We have no explicit notion of what an authority is, or of what a node in the kind of graph I'm drawing here looks like. 
Resource names are not only opaque and uncacheable, they're also global. There's no way to qualify them by specific authorities, or to have each authority manage its own resource namespace. There's no support for redirection or failover alternatives. And our existing config sources are pretty coarse-grained. Okay. Then there's the issue of collections. xDS today really doesn't have a great distinction between singleton and collection resources, and we'd like there to be a better way of thinking about this. A lot of the issues around this were surfaced while gRPC was switching to xDS — Mark had a great talk on gRPC adopting xDS. In that effort, it became clear that there's a lot of strangeness in the corner cases xDS has around collections, which we'd like to resolve and make work better. And this really feeds into my final slide, which is that we have quite a bit of accreted technical debt in xDS. Some of these things made sense at the time, but as we've gone to stretch and test the parameters of the xDS transport and its use, it became apparent that there really are issues here that need mitigation. We want to fix this to simplify control planes, make things easier to implement, and make it more robust and reliable. There are all those surprises I mentioned before: the weird treatment of collections for LDS and CDS; the fact that we effectively have multiple wire protocols now, for State-of-the-World and Delta; and the issue with resource aliases, which we use especially with VHDS. So our proposal, and I have links to it in the final slide of this talk, goes about systematically addressing all of these. And with that I'm going to hand things over to Mark, who's going to talk about one of the core concepts that's really fundamental to doing this: the idea of a structured resource name, which takes the form of a URI with this UDPA scheme. 
So I'll hand things over to Mark at this point. So the key to addressing the limitations that Harvey spoke about is a new naming scheme for xDS resources, in which resource names are represented as a UDPA URI. Let's talk about the anatomy of a UDPA URI. The first interesting part of the URI is the authority. This indicates who's authoritative for the resource, which is not necessarily the same as the xDS server that's used by the client. The client's bootstrap file will indicate which server or servers to use to access resources for a given authority, which might or might not be the same as the actual name of the xDS server, right? An example of a case where you might want them to be different is a large distributed infrastructure: for scalability reasons, you might have a local caching xDS proxy in each data center that has clients, and you might have those clients access the data through the local caching proxy, even though the authority is still some single centralized point across the entire global infrastructure. The authority acts as a global namespace for resource names. So if you have an authority with a bunch of resources in it, you can organize them however you want, and you're not going to have any naming conflicts with resources that anyone has created in other authorities. The authority is an optional part of the URI. If you have a non-federating use case, just a bunch of local servers that don't interact with anything else, you can omit the authority and have everything under the empty string as the authority name. In the future, the authority might be used to authenticate the resources themselves using some sort of signing mechanism. That's not something we actually have today, but it's a possible future direction we could go in. 
The next part of the URI is the resource type, which is listener, cluster, route configuration, that sort of thing. The ID is essentially the path part of the URI. It can be any string you want; it's totally up to you as an authority owner how you want to lay out your resource names. Context params are similar to query params in an HTTP URI. They provide a way to serve multiple variants of the same resource, and we'll talk more about that in a minute. The final part of the URI is the directive. This is similar to the fragment in an HTTP URI. It's a directive for how the client should interpret or use some part of the resource, but it's never actually sent to the server; it's only interpreted on the client. Next slide, please. So here are some examples of UDPA URIs. First, we have a fairly basic URI. It's got authority xds.example.com, and so Envoy would look in its bootstrap file to determine which server to query for this authority. The resource type here is listener, and the ID is service_mesh/sidecar. Again, this can be any path you want; this is just one arbitrary example of what you could put in here. The second example shows the use of context params. In this case, the context param is node_type=frontend. Note that the context params are actually part of the resource name; they're part of the unique identity of the resource. So two resource names that vary only by context params — for example, if you had another one that was node_type=backend — are, from the xDS perspective, actually different resources. From the human perspective, in practice, resources that vary only by context params are usually two variants of the same basic resource: they tell the client to do the same thing, but in slightly different ways for slightly different scenarios. Context params actually come from several different places, not just the URI. 
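To make the anatomy concrete, here's a minimal sketch of splitting a UDPA-style URI into the parts just described. This is an illustration only, not the actual Envoy or gRPC parsing code, and the scheme and the short type/ID strings in the example are assumptions for readability:

```python
from urllib.parse import urlparse, parse_qsl

def parse_udpa_uri(uri: str) -> dict:
    """Split a UDPA-style URI into the parts described in the talk:
    authority, resource type, ID, context params, and directive."""
    parsed = urlparse(uri)
    # The path is "/<resource type>/<id>"; the ID may itself contain slashes.
    _, resource_type, resource_id = parsed.path.split("/", 2)
    # Context params are part of the resource's identity, so sort them
    # into a canonical order suitable for use as a cache key.
    context_params = dict(sorted(parse_qsl(parsed.query)))
    return {
        "authority": parsed.netloc,    # may be "" in a non-federating setup
        "resource_type": resource_type,
        "id": resource_id,
        "context_params": context_params,
        "directive": parsed.fragment,  # client-only; never sent to the server
    }

print(parse_udpa_uri(
    "udpa://xds.example.com/listener/service_mesh/sidecar?node_type=frontend"))
```

Because the context params are sorted into a canonical form, the whole parsed result can serve as a self-contained cache key, which is the property the opaque string names lack today.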
There are some additional sources of params that get added on when requests are made to the xDS server, and we'll talk more about that in a minute. The final example here shows use of a directive. In this case, there's a directive called alt, which specifies a fallback resource. So when the primary resource doesn't exist, or the server can't be reached, the client knows to fall back to this alternative resource. Next slide. All right, so let's talk more about context parameters. As mentioned earlier, the purpose is to allow servers to provide multiple variants of the same resource. This is the key to making all kinds of information a first-class part of the cache key for resources. Context params come from multiple places, so let's talk about these in order. First, there are node identity parameters, which can be populated from the node message in the bootstrap file. These are prefixed with the string udpa.node. This is a key part of replacing the wildcard queries in LDS and CDS that Harvey mentioned earlier. The idea here is to use these context params to encode things that used to live only in the node message. So instead of the server looking at the client's node identity to decide which resources to return, the client can explicitly ask for what it wants in a flexible way. The second source of context parameters are the ones encoded in the URI itself, like the example I showed earlier of node_type=frontend. These can have any prefix; there are no restrictions on the content. They're applied after the node identity params, so they can actually override them — you could specify something in the URI under udpa.node.something, and it would override whatever comes from the bootstrap file. The third source of context params is client features. These use the prefix udpa.client_feature. These are not user controlled; they're added automatically by the implementation. 
Unlike the client features that exist on the node message today, these client features are resource-specific. So a client feature that is specific to, for example, EDS will only be added to requests for EDS resources. This prevents unnecessarily polluting caches with duplicate copies of the same resource, which is what would occur if the same CDS resource were accessed by two different clients that had different values for a client feature relevant only to EDS, right? You wouldn't want to have to make two copies of that, so this avoids it. The final source of context parameters is per-resource-type attributes. These use the prefix udpa.resource. Just like client features, these are not user defined; they're added automatically by the implementation, and they're defined for each resource type as needed. The one concrete example here is VHDS: we plan to use this to allow the client to request the specific virtual host it wants. This would replace the aliases mechanism that Harvey mentioned earlier. Next slide. Now, another thing we're introducing is first-class support for collections. There are two types of collections: list collections and glob collections. A list collection is a resource that contains a collection of resources of a particular type. The collection resource itself has its own type. So, for example, a resource of type listener collection is a resource that contains a collection of listeners, and a resource of type cluster collection is a resource that contains a collection of clusters. As those examples suggest, this is another key part of replacing the wildcard queries for LDS and CDS. Each resource in a collection can be either an inlined resource or a UDPA URI referring to another resource, which the client must fetch separately. 
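As a sketch of the list-collection model just described — with made-up field names, since the actual message definitions live in the UDPA proposal — a collection whose entries are either inline resources or URI references might be resolved by a client like this:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class CollectionEntry:
    """One entry in a list collection: either an inline resource
    (with a name the client can use to refer to it) or a UDPA URI
    reference that the client must fetch separately."""
    name: Optional[str] = None        # for inline entries
    resource: Optional[dict] = None   # inline resource body
    reference: Optional[str] = None   # UDPA URI of an external resource

def resolve_collection(entries, fetch: Callable[[str], dict]):
    """Return the collection's full resource set: inline entries
    as-is, references resolved through a separate fetch call."""
    resolved = []
    for e in entries:
        if e.resource is not None:
            resolved.append(e.resource)
        else:
            resolved.append(fetch(e.reference))
    return resolved

# Hypothetical listener collection: two inline listeners, one reference.
entries = [
    CollectionEntry(name="http", resource={"port": 80}),
    CollectionEntry(name="https", resource={"port": 443}),
    CollectionEntry(reference="udpa://other.com/listener/extra"),
]
print(resolve_collection(entries, fetch=lambda uri: {"fetched_from": uri}))
```

The `fetch` callback stands in for a real xDS request; the point is only that inline and referenced entries produce one uniform resource set on the client.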
So here in this example, we have a listener collection resource with authority xds.example.com and ID sidecar1 — just an arbitrary name, as far as the ID goes anyway, that I made up for the example. In this case, the collection contains three resources. There's one inline listener for port 80, another inline listener for port 443, and then a reference to an external listener resource, which the client would have to go and fetch separately. Note that the inline resources have names attached to them. These names can be used in an entry directive on the client to refer to one of the inline resources in the collection. Next slide. The other type of collection we're going to support is glob collections. These are an alternative to list collections for cases that need additional scalability. The way this works is that the client requests a resource where the last component of the ID is star, and it works basically like a simple shell glob pattern. The server returns all resources in this quote-unquote directory, and the client then knows which resources in the response match the request by virtue of them all being in the requested directory. So it's sort of a path-prefix match. The reason this is more scalable than a list collection is that with a list collection — say you have one with a whole bunch of inline entries — if any one of them changes, you have to resend the entire list collection. One of the limitations we've had in EDS is that an EDS resource has all of the endpoints listed in it. We know there are cases with really large numbers of endpoints, where one of them changes and you have to resend the whole thing, which uses a lot of bandwidth and is not really the way we want things to work. 
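The glob-collection semantics described here amount to a path-prefix match on the resource ID. A toy server-side selection — with hypothetical resource names, and assuming the glob matches only direct children of the "directory", like a simple shell glob — might look like:

```python
def glob_match(resources, glob_id):
    """Select the members of a glob collection: the request ID ends in
    a '*' component, and a resource matches if it lives directly in
    that quote-unquote directory (a path-prefix match)."""
    assert glob_id.endswith("/*"), "glob collection IDs end in a '*' component"
    prefix = glob_id[:-1]  # drop the '*', keep the trailing slash
    return [r for r in resources
            if r.startswith(prefix) and "/" not in r[len(prefix):]]

# Hypothetical LEDS-style endpoint resources, one locality per "directory".
resources = [
    "backend/us-east1-a/endpoint-0",
    "backend/us-east1-a/endpoint-1",
    "backend/us-west1-b/endpoint-0",
]
print(glob_match(resources, "backend/us-east1-a/*"))
# → ['backend/us-east1-a/endpoint-0', 'backend/us-east1-a/endpoint-1']
```

Because each endpoint is its own resource under the prefix, changing one of them only requires resending that single resource, not the whole set — which is exactly the scalability win over a giant inline list.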
So with glob collections, we're going to be able to build something we're going to call LEDS, the Locality Endpoint Discovery Service, which will be based on glob collections and will allow each individual endpoint to be updated on the client independently of the others. Next slide. Next, let's talk about the different ways you can use UDPA URIs to address various federation scenarios. There are three different ways for a resource in one authority to delegate to resources in another authority. First, the delegation can happen at the normal handoff point between resources. We know that listener resources can tell the client which route configuration to request via RDS; that RDS resource can actually be in a different authority than the listener resource. So that's one way of handing off. The second type of authority delegation is redirection. This is similar to an HTTP redirect: the client requests a specific resource, and the server tells it to use a different resource instead, and that resource can be in a different authority. This is a way for one authority to say, I want to delegate this particular resource to some other authority. The final form of authority delegation is having a list collection that includes references to resources in a different authority. In this example, we have a listener collection in the authority xds.example.com, but it references, inside of it, resources in other.com and mumble.com. Next slide, please. Now, a different type of authority handoff is failover. This is where we use that alt directive we talked about earlier, but in this case the alt directive points to a resource in a different authority than the original resource. This sort of thing can be useful in cases where you want to fall back to a different configuration from a local xDS server when the remote xDS server is not reachable. 
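A sketch of how a client might act on the alt directive for cross-authority failover — the directive syntax and the fetch behavior here are illustrative, not the actual protocol machinery:

```python
def fetch_with_fallback(uri, fetch):
    """Try the primary resource; if it doesn't exist or its server is
    unreachable, fall back to the resource named by the #alt=
    directive, which may live in a different authority."""
    base, _, fragment = uri.partition("#")
    alt = fragment[len("alt="):] if fragment.startswith("alt=") else None
    try:
        resource = fetch(base)
        if resource is not None:
            return resource
    except ConnectionError:
        pass  # primary authority unreachable; try the alternative
    if alt is None:
        raise LookupError(f"no resource and no alt directive for {base}")
    return fetch(alt)

# Hypothetical scenario: the remote authority is down, and a local
# authority serves the fallback configuration.
def fake_fetch(uri):
    if uri.startswith("udpa://remote.example.com/"):
        raise ConnectionError("remote xDS server unreachable")
    return {"served_by": uri}

print(fetch_with_fallback(
    "udpa://remote.example.com/listener/app"
    "#alt=udpa://local.example.com/listener/app",
    fake_fetch))
```

Note that the directive never reaches the server: the fragment is stripped off before the request is made, and the fallback decision is entirely client-side, matching how the directive was described earlier.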
Next slide. And let me hand back to Harvey at this point. Thanks, Mark. Okay. So in summary, the real win here is we've got the ability to cache xDS resources, we can delegate and fail over between authorities, we've got better support for collections, we've eliminated a bunch of technical debt, and we're now ready to go, I think, for things like federation. So this is all pretty exciting. Now, in terms of the implementation roadmap, we've just started on this, and we have the first three items in the roadmap underway, but at very early stages. We're planning on landing, probably in Q4, at least items one through four on this list. Once we've landed that, which is basically support for the core UDPA URIs and the glob collections, the rest of the implementation can be added somewhat incrementally, distributed amongst other folks who are interested in contributing. But this is the current plan, and we'll be working on it in the coming months. In the second part of this talk, I'm going to look at xDS versioning. I know this has been a source of considerable friction in the community in the past year, so I'm hoping the plan I present here reflects some of the feedback we've received, and provides a path forward to achieving the best properties of versioning while still being thoughtful and mindful of the costs that control plane operators bear when implementing some of these schemes. Let's start with a recap of versioning in Envoy — this takes us back to last year's EnvoyCon. One of the first things to ask is: why are we even doing versioning? The basic reason is that we previously had just a single unversioned API, which people would randomly remove fields from, which doesn't work as a stable API. 
If you're a control plane operator, or if you're implementing an xDS client that isn't Envoy, these breakages are problematic, and we really needed a pretty serious strategy for not breaking the API. So we introduced major versioning, which would actually allow us to turn down entire APIs and bring up new ones. We were initially planning on doing this on a yearly cadence: V2 in 2019, which was the existing API; V3 in 2020, which we did roll out, deprecating V2; and then in 2021 we were planning on introducing V4, removing V2, et cetera — one per year. And that was problematic. We decided to stop the clock right there, because it turned out there's just too much control plane operator and developer pain, which I'll go into; I have a mini retrospective in the next few slides. But that's essentially the situation we're in today: we're not getting rid of major versions, but we're not going to issue one per year, and we're going to come up with a more incremental approach to versioning, which I'll explore in the next few slides. So, a quick retrospective — the good, the bad, the ugly. The good: major versioning did solve our breaking-change problem within major versions. We built sophisticated tooling in Envoy to automate most of the upgrades, so it was a lot smoother than it would have been if we'd done it manually. And there were a bunch of technical improvements in V3 over V2, made possible by the ability to break the API over the version bump. What went badly? Well, we had a fair bit of code sharing; that's not too bad, though. We didn't finish Bazel integration, so there's this generated API shadow thing that you may be familiar with if you're an Envoy developer who's had to run the proto format tooling. We didn't have all parts of V3 ready to go on day zero, and there's some complexity around handling versioning in Envoy. 
Now, the really ugly part: control plane developers and operators had a lot of pain, and they didn't have the same tooling and libraries that we had in Envoy. There was a lot of confusion; they weren't ready on day zero with upgrades and documentation. Some of this, I would say, reflected a lack of engagement from control plane operators in what we were doing while we were putting this proposal together and socializing it in the Envoy community, but some of it we probably should have been better prepared for and ready to go at day zero. There was also a performance overhead from upgrading resources from V2 to V3 inside of Envoy, which internally operates on V3. And amongst the Envoy community there was a lot of conflation over what is a transport version versus a resource version, and confusion as a result. Okay, so what's the actual new plan? Well, we're going to slow down major versions. It's not that we'll never do a major version in the future, but we're not going to do one until the benefit really outweighs the identified costs, and this has been a useful learning experience to capture that. We're going to switch to a scheme called minor and patch versioning to make things more incremental, borrowing from semantic versioning terminology — though this isn't semantic versioning. Major versions are when we want to break the world; they'll happen rarely, and we can actually remove fields from the API. Minor versions will occur once per year, and what happens at these clock ticks is that nothing is removed from the API, but xDS clients can remove support for deprecated features. And finally, we'll have patch versions, which will bump with basically every API change; this lets you know exactly what version of the API a client is at, which is useful for feature discovery. Okay, so Matt's got a very detailed plan of record around this, and I recommend checking it out if you're interested. 
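As a rough illustration of the minor/patch scheme just described — using assumed version strings, not any actual Envoy API surface — a client deciding whether a deprecated feature can be dropped at the current yearly tick might reason like this:

```python
def parse_version(v):
    """Split a 'major.minor.patch' string like '3.1.0' into ints."""
    major, minor, patch = (int(x) for x in v.split("."))
    return major, minor, patch

def can_remove_support(deprecated_in_minor, current_minor):
    """Under the scheme in the talk, nothing is removed at the minor
    tick where a feature is deprecated; clients may only drop support
    once at least one yearly minor version has passed since then."""
    return current_minor >= deprecated_in_minor + 1

# A feature deprecated in 3.0 can be dropped by a client at 3.1.0,
# but not by a client still on 3.0.x.
_, minor, _ = parse_version("3.1.0")
print(can_remove_support(deprecated_in_minor=0, current_minor=minor))  # True
```

This is only the clock-tick rule; the actual coordination between clients and control planes would go through the version and feature negotiation work mentioned next.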
It has the details, and Adi, a Googler who's going to be working on this in Q4, is going to be looking at things like how we do version negotiation and feature negotiation, which are really important for coordinating xDS clients and control planes in this world. Okay, now what are the implications? Well, if you're a control plane operator, you're going to have to state, and support, a policy around which minor versions you support. If you're a control plane developer, you need to support a range of minor versions and support negotiation of features and versions. xDS client developers need to do pretty much the same. And anyone who's willing to put in the work of being an API shepherd can join the process of deciding what gets deprecated and when, to help guide these minor version changes. Okay, so the new versioning roadmap looks like this: the first two points, 2019 and 2020, are the same. Next year in Q1, we're introducing the 3.1.0 minor version, and we will remove V2 — that is going away. So if you are still on V2, you should be moving to V3, because V2 is going away from the Envoy code base. And each year from then on, we're going to introduce minor versions and remove things that have had deprecations in place for at least a year. Okay, so there are a bunch of resources linked on this slide; if you download the slides and click through, you can follow up on them. At this point, I think we're done with the presentation part of this talk, and we're happy to take questions. Thank you. Hello. All right, so should we tackle the one about collections including collections? I think that's something we could support. I'm not sure; we'd have to look at the use cases and make sure there are no implications, but just thinking about it at first, it seems like we could do that. Yeah, for list collections, sure. 
I mean, we already have xDS clients going back and forth to the server multiple times when fetching a collection, so this is essentially just adding a bit of hierarchy there. There's no reason we can't support it. I guess we would just need to understand the use case for it first, because, looking at the existing proposal, we already have a fair bit of conceptual complexity around collections, with the list collections, and then we also need the glob collections for performance and scalability. So ideally we land these first, and then we add some notion of nested collections. Yeah. I mean, I think the nice thing about it is that this would just be a change to the data model for list collections, because list collections are just a resource, right? So it doesn't really have any transport protocol implications, which is nice. But yeah, I agree — let's do the basic stuff first, and then we can see where we go from there. Okay. Yeah, I think that makes sense. So we had another question around how we're managing trust amongst authorities, and things like SPIFFE and so on. Right now, I think this ties into another related concern, which is how you actually sign and attest to the integrity of xDS resources. That doesn't really exist today, but it seems that at the resource level it would make sense to introduce some notion of signing, so you can have an idea of who was responsible for producing a resource, and whether it's in its original, intended condition. Tied into that would be some idea of what an authority is and how we actually identify them, using certs and that kind of thing. So I think we have an idea that this is where we want to head, and there are various points in the API where it makes sense. 
If you look at the proposal, the config sources — essentially what identifies and maps an abstract idea of an authority down to the concrete transport — seem like a good place to also attach certificate information. And the xDS resource objects themselves are now in a wrapper object, which is essentially where the signature would go. But this scheme has not been fully designed yet; we're at the stage where we understand how we would probably go about building it. If anyone's actually interested in helping to drive that, it would be very interesting, because I think it's a really important part of making this work for real in situations where you have mutually untrusting parties and that kind of thing. Yeah, I think one of the challenges with the whole signing thing, which we'll have to think about when we really start looking at signing in detail, is that it could work fairly well if what we're signing is the xDS resources themselves. But if you've got some sort of control plane infrastructure behind the management server, where what users actually configure is not that form but a different form which then gets converted, then there's a whole chain-of-trust question through that workflow that makes things a little more complicated, depending on how much you're mutating the data. I guess this goes back to the original point about the idea of a disaggregated control plane. Today, control planes are mostly monolithic, so this is probably fine while they're simple. But you can imagine breaking up the control plane — and even amongst those who are, let's say, fronting their control plane with go-control-plane, they probably have a pretty sophisticated configuration pipeline behind it, which runs on different servers and that kind of thing. Okay, I think we have one minute if there are any other questions, or we can hand the minute back to the next session. Once, twice. Okay. 
Thank you. Thank you. Thank you.