Hi, everyone. I'm Steve Sloka, and today I'm going to walk you through how to build your own Envoy control plane. We'll introduce some core Envoy concepts, look at the different ways those concepts can be used to configure Envoy, and then go ahead and build our own control plane. We'll use Go for our source code, and there will be lots of examples available for you at the end. Let's get started.

A quick bit about me: again, I'm Steve Sloka, and I'm a maintainer of Contour at VMware. Contour is an open source ingress controller for Kubernetes, and it's also a CNCF project. So enough about that — let's dig into the topic.

So what is Envoy? Envoy is an open source edge and service proxy, designed for cloud native applications — that's a quote from the Envoy proxy website. It's a service proxy, and we use it all over the place: Contour uses it as its data plane, or data path, component, and plenty of other projects do as well. It's becoming a more and more popular service proxy in the industry, used by a long list of companies, and — if I move myself out of the way — you'll see all these different projects leveraging Envoy under the hood too. These lists are taken from the Envoy community website, and I'm sure there's much more out there.

Let's dig into some core concepts and terminology in Envoy. The first thing I want to talk about is the difference between upstream and downstream. Any host that Envoy routes a request to — some endpoint or other destination — is what we call an upstream. And anything that sends a request to Envoy from outside is what we call a downstream. So requests flow from a downstream client to Envoy, and then from Envoy to an upstream.
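In Envoy's configuration, this split maps directly onto two objects we'll meet in a moment: listeners accept downstream connections, and clusters describe the upstream hosts Envoy routes to. As a rough sketch — the names, addresses, and ports here are made up for illustration, and this uses a simple TCP proxy filter just to connect the two sides:

```yaml
static_resources:
  listeners:                 # where downstream clients connect
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 9000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: ingress_tcp
          cluster: echo      # hand the connection to the upstream below
  clusters:                  # groups of upstream hosts Envoy routes to
  - name: echo
    type: STATIC
    load_assignment:
      cluster_name: echo
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 9101 }
```
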
It's important to get this right, at least for the next 20 minutes, just so we're all on the same page: when Envoy says upstream or downstream, this is what it means.

OK, so starting on the downstream side, the first thing we're going to introduce is a listener. A listener is a named network location that downstream clients connect to, over TCP or UDP. You can also apply filters to a listener and chain them together. Filters are Envoy's extension magic: they let you take those TCP connections and do different things with them. In our demo, we're going to build out an L7, or HTTP, proxy, which is the kind of proxy Contour is. To do that, we'll take the TCP connection and run it through a filter, and that filter will turn the byte stream into HTTP headers, requests, and responses, which makes it much easier to do L7 routing. There's a lot more to filters and filter chains that you can dig into on your own; for now I just want to introduce them as concepts you can apply to a listener.

Next, we have routes. Routes are referenced by listeners: a listener hands requests off to a route configuration. In our L7 model, we're going to have a bunch of different virtual hosts that we can route on. For example, stevesloka.com and vmware.com could be two different virtual hosts, and from there I can route each one to different places, modify headers on each, and do a bunch of other things. Again, this is a very quick introduction, but routes are what decide where requests should go within your deployment.

Once you have a route, it has to route somewhere, and that somewhere is a cluster. A cluster in Envoy is a group of logically similar upstream hosts. I always think of this in Kubernetes terms, right?
Kubernetes has services, and to me a service is like a cluster: it's just a name that points to a set of endpoints. Envoy works the same way — a cluster has members, and members are discovered via service discovery. There are a few ways to implement that service discovery.

The first way is to be static: you define a cluster, define all of the endpoints that exist in that cluster, and move along. You can also use DNS, in either strict or logical mode. They're similar in that they both look up endpoints dynamically and asynchronously, but strict DNS uses all of the endpoints that resolve from a DNS query. Say three addresses come back for a DNS entry — Envoy will say, all right, there are three members in this cluster, and it'll load balance across those three endpoints. Logical DNS will similarly look up the IP addresses, but it'll only use the first DNS entry in the result, so it'll just proxy to that one rather than load balancing over all of the endpoints behind that name. There's also original destination, which I'm not going to cover today, and custom resolvers, which we'll skip as well.

The one I really want to focus on is the endpoint discovery service, or EDS. This is the first xDS-style protocol we'll introduce. For clusters, again, we can statically define the members, or we can have the members come from a lookup service — kind of like how DNS works, except it's a service that we control and can use to feed Envoy the members of a cluster. EDS is what we're going to use in our demo today.

All right, now that I have these core concepts, I need a way to configure Envoy with these types, right?
These routes and listeners and clusters, et cetera. There are a couple of ways you can do this. The first way is a static file: I can build a file, write out all of the different routes and listeners and everything I want, and pass that off to Envoy; Envoy will load that file and serve traffic based on it. I can also point Envoy at a place on the filesystem — hey, look in this directory for your files — and it'll load them in dynamically from there. Or I can use a management server, over either REST endpoints or gRPC, and gRPC is what we're going to do today. The advantage gRPC has over REST is that REST has to poll, which is slower and carries a lot of overhead because it's constantly polling for changes. gRPC is a long-lived connection, so I can stream changes bidirectionally very easily. That's what we're going to implement today in our xDS example.

And there are two ways to implement these management servers: the first is called state of the world, and the second is called delta. State of the world says, hey, say I have nine clusters in my configuration and I add a tenth. The management server is going to pass down all ten clusters to Envoy and say, hey Envoy, here's the state of the world — these are all the clusters I know about, so this is what your configuration should be. And it works the same way for all the different resource types. Delta works like you might imagine: instead of sending all ten clusters down to Envoy, it sends just the one cluster that was added. Same with a deletion — it sends down just that one removal. If a cluster is deleted in the state-of-the-world model, the server just omits it from the list — so going back to our original nine clusters, it would send eight clusters down instead of nine.
And then Envoy would say, hey, this cluster is missing, and it would delete it out of its configuration.

All right, let's talk about xDS. We've hinted at this already: we've introduced these things called listeners, routes, clusters, and endpoints, and each one has its own discovery service protocol that we can implement. Over gRPC, each of these services returns a list of listeners, routes, clusters, or endpoints, just like we said. And if you strip off the first letter of each name and replace it with an x, you can see where the name xDS comes from — because there are a bunch of these protocols you can implement. These are just four that I'm introducing; there are more, and I encourage you to go research them afterwards. My point here is simply that this is where you get xDS from.

Now, there are four variants of the xDS transport protocol. Just like we talked about, updates can be state of the world or incremental. We can also decide how we want to set up our gRPC streams. The default is to have one gRPC stream per resource type — so for listeners, clusters, routes, and endpoints I'd have one stream for each type, four in total. That's the top version on this slide. Alternatively, I can implement ADS, the aggregated discovery service, which gives me basically one stream for all of those resource types, and I can send every type of update over that single stream. For today's purposes, we're going to use the simple options: a stream per resource type, with state-of-the-world updates.

So that's a lot of background — all these different concepts. Let's take all of that, put it together, and go build an example, and we'll see what this looks like if you had to go build it yourself.
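Before we do, here's a small, self-contained Go sketch — not part of the demo code — just to make the state-of-the-world versus delta distinction concrete. It computes what each style of server would send when a tenth cluster is added:

```go
package main

import "fmt"

// diff returns what a delta-style server would push: only the
// resources that were added or removed between two versions.
func diff(oldState, newState []string) (added, removed []string) {
	oldSet := make(map[string]bool)
	for _, c := range oldState {
		oldSet[c] = true
	}
	newSet := make(map[string]bool)
	for _, c := range newState {
		newSet[c] = true
	}
	for _, c := range newState {
		if !oldSet[c] {
			added = append(added, c)
		}
	}
	for _, c := range oldState {
		if !newSet[c] {
			removed = append(removed, c)
		}
	}
	return added, removed
}

func main() {
	oldState := []string{"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9"}
	newState := append(append([]string{}, oldState...), "c10")

	// State of the world: every update carries the full set.
	fmt.Printf("state-of-the-world update: %d clusters\n", len(newState))

	// Delta: every update carries only the changes.
	added, removed := diff(oldState, newState)
	fmt.Printf("delta update: added=%v removed=%v\n", added, removed)
}
```

A state-of-the-world server resends all ten clusters; a delta server sends only `c10`.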
This diagram is what we're actually going to demonstrate today. We obviously need an Envoy first, and that Envoy gets passed a bootstrap config — a static file. What's in that bootstrap config is essentially anything you want to preload into Envoy: static clusters, static listeners, any kind of static resource. But it can also reference dynamic resources, and that's where we're going today. Dynamic resources basically say, hey, instead of loading my listeners and clusters and all those different xDS types statically, load them dynamically — they're going to come from this xDS server here, the box in green. What we'll do is create one static cluster pointing at that xDS server, so that when Envoy starts up it has enough information to go look at that cluster and pull all of its configuration down dynamically.

Now, the xDS server needs some source information — a source of truth that tells it what routes and clusters and endpoints it should configure Envoy with. This varies by implementation. Contour runs on Kubernetes, so Kubernetes is its source of information: Contour watches things like services, endpoints, secrets, and ingress objects, takes all of that information, turns it into an Envoy configuration, fills in the xDS caches, and passes them down to Envoy. That, in a nutshell, is what we're going to do. Let's go ahead and poke around.

The first thing I want to look at is a project called go-control-plane. Go-control-plane lives in the Envoy project organization, and it's a Go implementation of the data plane API.
The data plane API in Envoy is a bunch of protobufs that represent all of the objects we just described in the last couple of slides; here, they're all Go structs, so you can import them into your Go project and use them from there. The second thing this project gives you is a sample xDS server implementation, which we're going to utilize today. Instead of having to build out all of the gRPC connections and all the extra plumbing and routing yourself, you can leverage the implementation in this project to build your own control plane. You can read through everything here, but down at the bottom there's this example server — shout out to the user who helped build it out; I know we used to have to read the unit tests to find a good example, so this is great. It gives you more than enough information to spin up your own server, as it says.

I'm going to show you basically that same example — I took it and reused it, but I added a little more dynamic configuration: instead of being statically configured, my configuration comes from a YAML file, which we'll look at next. So this is my project; I'll have the link at the end of the slides if you want to check it out yourself.

In here I have a main.go, and to get started, we create a snapshot cache. This cache is the core of the go-control-plane xDS server: it holds all of the different snapshots that we've passed down to Envoy. Remember, we're sending Envoy state-of-the-world configuration, so what we'll do is build up our configuration from our source information — a list of listeners, routes, clusters, and endpoints — and then create a snapshot out of that.
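In code, that build-a-snapshot flow looks roughly like the outline below. This is a hand-written sketch against go-control-plane's v3 packages, not the demo's exact source: the `makeCluster` and `makeListener` helpers and the node ID are hypothetical, and the `NewSnapshot` signature has changed between releases, so treat the details as approximate and check the module's docs:

```go
import (
	"context"
	"log"

	"github.com/envoyproxy/go-control-plane/pkg/cache/types"
	cache "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
	resource "github.com/envoyproxy/go-control-plane/pkg/resource/v3"
	server "github.com/envoyproxy/go-control-plane/pkg/server/v3"
)

ctx := context.Background()

// The snapshot cache holds per-node snapshots of the full config.
snapshotCache := cache.NewSnapshotCache(false, cache.IDHash{}, nil)

// Build a state-of-the-world snapshot from our source information.
snap, err := cache.NewSnapshot("1", map[resource.Type][]types.Resource{
	resource.ClusterType:  {makeCluster("echo")},        // hypothetical helper
	resource.ListenerType: {makeListener("listener-0")}, // hypothetical helper
})
if err != nil {
	log.Fatal(err)
}

// Hand the snapshot to the cache for a given node ID; connected
// Envoys with that ID receive the update over their streams.
if err := snapshotCache.SetSnapshot(ctx, "envoy-node-1", snap); err != nil {
	log.Fatal(err)
}

// The server serves the xDS APIs over gRPC from the cache.
srv := server.NewServer(ctx, snapshotCache, nil)
```
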
Then we pass that snapshot down to Envoy, and Envoy loads it in. Whenever any of those objects change, we generate a new set of caches, make a new snapshot, and pass it down to Envoy — tell Envoy, hey, here's a new version, go load this in — and Envoy parses it. That's essentially what we're trying to do here.

Like I said, I have this config file we created, and it's there to let us configure the server more dynamically — it's my source information. For Contour this would be Kubernetes; depending on your environment, it's whatever it needs to be. For me, it's this static file.

OK, so to watch this file for changes, we set up a watcher. We first created the snapshot cache; now we create a watcher, and any time the file changes we get a notification back over this channel. That's all we're doing — it's a callback saying, hey, the file changed, and that's our signal to go rebuild the configuration.

Now that we have that set up, we start our xDS server. We build the server out and pass it the cache — again, the snapshot cache we created up on line 56. Then we run the server, passing it a port, which for us comes from a flag and defaults to 9002.

So let's run this. OK, we ran the server, and you can see we're listening on port 9002, like we said. And here, this debug line says: this is the snapshot we're going to serve up. The information in it matches our configuration file — if you look in here, you see we've got this cluster named echo, and here — let me move myself over for you.
Here I have this cluster named echo, and if we dig in, we can see a listener-0 in the snapshot and a listener-0 here in the file. So the snapshot came from this file, and any time we change the file, a new snapshot is generated. Let's take a look at how that happens.

Here in the processor we have this processFile function, and what gets passed in is the file that changed — the notifier sends us an event whenever the file changes and tells us which file was updated. The first thing we do is parse that YAML into a Go struct; out here I have an API package, which defines that struct in Go. Coming back to our processor, after we parse the file, we actually go and generate Envoy types out of that YAML. We loop through every listener that exists and build up a set of listeners, passing in the name, the routes, and the address and port. From there, we build a set of routes — so here's our listener cache, and here's our route cache. Then we build out our cluster cache, adding all the clusters, and then we add all the endpoints. By the time we get to this point, we have a cache full of listeners, endpoints, routes, and clusters — our point-in-time, state-of-the-world view.

Now we generate a new snapshot. We pass in a snapshot version — this just increments the version by one from whatever the previous one was — and we pass in the cache contents. The cache contents call converts my local types into Envoy xDS types. If we look at something like listeners, you can see this resources package, and it builds a new resource of type listener.Listener.
These are actual Envoy types — you can see I'm in the listener protobuf now, so we're digging into go-control-plane itself, and this Listener Go struct is what lets us create Envoy listeners. So, going back in here, we build out a listener, and one interesting thing is that we build out its filter chain: this is where we add the HTTP connection manager, which gives us the L7 router. Same thing for routes — here's makeRoute, which returns a route configuration, and each route has a match and an action. Again, these are Envoy specifics, but this is how we build out all the different Envoy protobufs. We create endpoints here, and we create clusters here.

So: the processor builds a cache out of that YAML file, then generates a new snapshot against it, converting all those local types into Envoy types. We then make sure the snapshot is consistent — meaning, do all the routes reference proper clusters, and so on. If it is consistent, we create the new snapshot and set it on the cache, and that's where we actually tell Envoy: hey, go load in this new snapshot. Envoy then processes the update and refreshes its configuration.

With all of that done, the next thing to do is spin up Envoy. Let's create a new tab and take a look at what this Envoy looks like. As I mentioned, we need this bootstrap YAML, and the bootstrap config loads in static resources. We talked about loading in a static cluster pointing at our xDS server, and that's this — on localhost 9002, which is where our xDS server is running.
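A trimmed-down version of that bootstrap file might look something like this. The node name, cluster name, and exact shape are illustrative rather than copied from the demo repo, and the HTTP/2 protocol options stanza reflects the current v3 API, which older Envoy versions spell differently:

```yaml
node:
  id: test-id
  cluster: test-cluster

dynamic_resources:
  cds_config:                       # clusters come from the xDS server
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
      - envoy_grpc: { cluster_name: xds_cluster }
  lds_config:                       # listeners come from the same place
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
      - envoy_grpc: { cluster_name: xds_cluster }

static_resources:
  clusters:
  - name: xds_cluster               # the one static cluster: our xDS server
    type: STRICT_DNS
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}   # gRPC requires HTTP/2
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 9002 }

admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 9003 }
```
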
We tell Envoy to create a cluster pointing at it, and once we have that cluster, we set up all of the xDS endpoints. So we set up our CDS config for clusters, using the gRPC type, and point it at the xDS cluster created in the previous step. Envoy now knows: hey, I have dynamic clusters, and they come from this endpoint, or set of endpoints. It's similar for LDS, the listeners — they'll come from that same cluster name. Here we also set a cluster name and node ID, and down here we expose an admin web page, which is helpful for debugging Envoy and looking at all the information we've loaded into that instance.

OK, now that this is in place, we'll go ahead and start Envoy. We find the binary locally on our machine and pass in that bootstrap YAML file using -c, so I can go ahead and say hack start envoy. The first thing you'll see when Envoy starts up is that it loads all the extensions that are compiled in — we're using the upstream build here, so by default you'll see all of the extensions that come automatically compiled in, logged for your information. Down here, you can see that we've spun up the runtime layer and loaded one static cluster — that's the xDS server we referenced earlier. And then you can see that we connected to our server, added a cluster, and added a listener: here's listener-0, which again matches our config, and here's the echo server — this update for cluster echo. So that got added, as well as my listener, and we connected to our server.

Let's go verify our configuration real quick. We'll go look at the admin page, on localhost 9003. I'll refresh — here we are.
What we can see in here is a set of listeners — listener-0 bound to 0.0.0.0 on port 9000, which matches our config. We can also see clusters: we have this echo cluster with two endpoints, .244 on 9101 and .244 on 9102. Again, that matches our configuration — the cluster's called echo, .244 on 9101 and 9102.

We can also look at the config dump, which shows all of the running configuration in Envoy. It shows all the different extensions that are compiled in, and a bit further down we can see the static configuration we loaded: here are the static resources, with our static xDS cluster, again pointing at port 9002. Then you can see our dynamic resources — the LDS config pointing at gRPC over that same static cluster we created. Coming down a little further, we now see our dynamic clusters — this is what got loaded in dynamically through our management server. Here's that cluster called echo, which came from the xDS server; its endpoints come from EDS, which we mentioned back in the service discovery slide, served from that same xDS server. And here's our dynamic listener, because we load that dynamically as well, and in it we're loading those filters — here's our filter chain, loading the HTTP connection manager, which gives us that L7 proxy routing.

OK, that's enough to get running — let's query it and see what happens. If I curl localhost 9000, I get a response, and you can see it's that simple echo server we have running, echoing back the host that responded. We have two endpoints, so if we query again we should get a different one — maybe 682, and then here's 3fc, right? So there are those two.
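The echo backend itself isn't part of the control plane, but if you want something local to point a demo like this at, a minimal stand-in is easy to write. This is a hypothetical sketch, not the exact server from the demo — it just answers every request with the host that served it, which is what makes the round-robin load balancing visible from curl:

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// greeting builds the response body; keeping it a pure function
// makes the handler trivial to test.
func greeting(host string) string {
	return fmt.Sprintf("Hi there, I am served from %s!\n", host)
}

// echoHandler replies with the hostname of the machine serving
// the request, so each backend identifies itself.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	host, _ := os.Hostname()
	fmt.Fprint(w, greeting(host))
}

func main() {
	http.HandleFunc("/", echoHandler)
	// Run two copies on different ports (e.g. 9101 and 9102) to
	// watch Envoy alternate between them.
	http.ListenAndServe(":9101", nil)
}
```
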
So I can run curl in a loop, once every second, and we just load balance between those two endpoints — 682, 3fc, you can see that, right? Now if I come in here and remove an endpoint, then as soon as I hit save, the configuration changes: our callback fires in Go, we create a new configuration, create a new snapshot, and pass that snapshot off to Envoy, and Envoy updates — you'll see Envoy down here update as soon as I save. If we come back to our other window, now we only ever hit the same endpoint, 682, because there's only one endpoint left in our cluster. Right? Cool.

Now we can add another cluster and a route if we like. I'll go ahead and copy this one and call it new, and put its endpoints on 9103 and maybe 9104. Then we'll add a new route here, pointing at the cluster new, and maybe we'll give it the prefix /foo — so we can route on / or on /foo. Again, our goal here is an L7 load balancer. We save, our file changed, we create a new internal state-of-the-world configuration, create a new snapshot, and pass it back off to Envoy. Now if we do our curl, nothing has changed for the / endpoint, but if we curl /foo, we get two different endpoints from the new cluster. There we go — we've got / and /foo both running.

I think that's all we have for slides. If you're interested in learning more, there are some resources here. The top one is the sample we just looked at — the xDS server I wrote, which derives from the go-control-plane example but adds the YAML file parsing. Go-control-plane itself is another place to look, and of course envoyproxy.io. So again, I'm Steve Sloka — please reach out with questions.
I'm happy to answer questions and discuss this further. I know this was quick, but hopefully you've got a good understanding of how Envoy works and how you can build your own management server. Thank you.