I'm Flynn from Ambassador Labs, and I'll be talking a bit about our experiences working with Envoy as part of the Emissary Ingress project. In particular, I'll talk about some things we learned along the way, some of the things that helped us out, and, of course, some of the challenges we ran into. If you're not familiar with Emissary, it's an open-source, self-service, Kubernetes-native, resilient API gateway built on top of Envoy. It can be used in a lot of different ways, but its bread and butter is managing access at the edge of your cluster. Envoy does all the heavy lifting of wrangling your data, and Emissary, in turn, wrangles the Envoy. Emissary is a CNCF incubating project, and it forms the open core of the Ambassador Edge Stack. If you're familiar with the Ambassador API Gateway name, Emissary is the same thing. The name changed with the CNCF donation. To lay my biases for this talk on the table, the way the Emissary project uses Envoy is that we take an Envoy image, wrap custom control plane logic around it, and then that's the Emissary image. Just to complicate things further, it's a custom-built Envoy, since there are still a couple of things that we need to use that are not in mainline Envoy. So that's the lens through which I tend to look at Envoy. However, I expect that this talk should be applicable to other use cases as well. We actually started learning lessons about Envoy before the Emissary project existed. Back then, before Ambassador Labs was even called Ambassador Labs, we used to host the Microservices Practitioner Summits, and at our summit in January of 2017, we heard a talk about this shiny new Envoy proxy thing by this guy named Matt Klein that you may have heard of by now. Envoy came across sounding pretty cool, and when I got back to Boston, I ended up taking a close look at how well it took to Kubernetes. It turned out that it worked pretty well, but it was considerably more painful to set up than we would have liked.
So that was our first lesson about Envoy. It was extremely powerful. It was extremely performant, but it could be painfully complex. And of course, you can say pretty much all the same things about Kubernetes as well. So in pain, there are opportunities, and it looked pretty clear to us that we had an opportunity to make that experience of using Envoy as a Kubernetes edge proxy really much more pleasant. That was the actual start of the Emissary project. We wanted to go through and set things up so that developers could use Envoy in Kubernetes without having to spend time becoming an expert in either Envoy or Kubernetes. And really, just making it possible wasn't good enough. We wanted to make it easy. To state the obvious, this is a pretty big ask. Taking a powerful, flexible, complex tool like Envoy and making it accessible to non-experts is hard. Part of the reason it's hard, just to state the obvious again, is that it doesn't make any sense to just take Envoy, slap the name Emissary on it, and then leave everything else the same. That's kind of pointless. Instead, Emissary really needed to be its own thing, distinct from Envoy. And given that Envoy is very powerful and very flexible already, the things that could really differentiate Emissary from Envoy or from other tools built on top of Envoy were not likely to be things dealing with our expertise with Envoy or cool little technical things that we added. They were much, much more likely to come from our choices about the things Emissary should make easy, the things Emissary wouldn't support, and what exactly the user experience with Emissary was going to be like. Another counter-intuitive thing we learned here was that in many cases, adding options really just adds confusion. An early version of Emissary allowed configuration using files on the file system or Kubernetes ConfigMaps or Kubernetes CRDs. We had all three choices because we thought that people would appreciate having the options.
And in fact, some of the more expert users did. But after a little while, we realized that, overwhelmingly, having those three options discussed was just confusing newcomers. And things really got much smoother with the product once we removed the choices of files on disk or Kubernetes ConfigMaps and just drove everybody towards CRDs. So given that Envoy is a complex, powerful thing, we were trying to make a complex, powerful thing accessible to non-experts in a meaningful way that would actually add value for them. And we realized that that requires being opinionated. So that was our second real lesson. Although we needed to go out and gain enough expertise to form good opinions and to be able to act on them, the expertise was going to end up being secondary to those opinions. That kind of meant that we needed to shift gears from thinking hard about the technology to thinking hard about the humans that would be using it. We ended up borrowing the idea of persona-based design from the user experience world for this. We created a developer persona named Jane in mid-2017. Her ops counterpart Julian came a little bit after that. But all throughout the project, we've used them to help us think about how Emissary is going to be used and to help us drive that process of forming opinions and distilling expertise down into value we can add. To this day, the toughest thing that we do regularly working on Emissary is working out semantics that simultaneously make sense to Jane and Julian, are explainable to Jane and Julian, and still fit with Envoy's semantics as well. And designing for the humans first always sounds obvious, but we constantly relearn how hard it is to really do it well. So the first few lessons here are mostly things that we learned while we figured out what our goals were going to be and put together some tools to give us a place to stand while gaining the expertise that we needed in order to actually execute toward those goals.
Actually gaining the expertise had its own challenges and its own lessons. The first one we ran into is that we as an industry have a big problem around the learning curve that we tend to throw at novices. Imagine for a moment a new cook going over to the range, hauling out their cooking documentation and finding something like this that gives them detailed explanations about what every control does and exactly what setting it's changing. This is not what they need. It's all true information that's very important to have available, but what the novice really needs, at first, is a sense of how to accomplish a goal that they have in mind. They need a set of instructions. Even better, they need a guide that can talk to them about the kinds of things that are even possible with the thing that they're working with, because they don't have that yet. I'm going to show a simple example illustrating this with Envoy because this is a talk about Envoy, but I want to be very clear here that this is not a place where Envoy stands out. It's a ubiquitous problem in the whole industry. I could just as easily have pulled examples from Kubernetes or from Emissary itself or from Unix or really any piece of technology I've ever worked with. But grabbing one from Envoy, there's a thing in Envoy called a filter chain, which is able to act on an incoming network connection. Within a filter chain, there's a thing called a filter chain match, which specifies which connections are going to match a given bit of Envoy configuration. One thing you can match on is the server name, and there's excellent reference documentation on all of this stuff. What the reference documentation will not tell you is that the server name match doesn't work at all for cleartext HTTP. Once you get a more complete picture in your head of what's really going on, this makes perfect sense.
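As a rough way to picture that, here is a toy Python model of the idea. It is only a sketch under stated assumptions: `Connection`, `FilterChain`, and `pick_filter_chain` are hypothetical names invented for illustration, not Envoy's API or its real matching algorithm. The only point it makes is that chain selection happens with connection-level data, so the sole server name available is the SNI value from a TLS handshake.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Connection:
    """Hypothetical model of what is known when a connection is first accepted."""
    uses_tls: bool
    sni: Optional[str] = None  # server name from the TLS ClientHello, if any


@dataclass
class FilterChain:
    """Hypothetical stand-in for a filter chain with a server-name match."""
    name: str
    server_names: List[str] = field(default_factory=list)  # empty = catch-all


def pick_filter_chain(conn: Connection, chains: List[FilterChain]) -> Optional[FilterChain]:
    """Pick a chain using only connection-level data.

    No HTTP headers have been read at this point, so a Host header
    cannot take part in the decision.
    """
    # The only server name available this early is the SNI value.
    server_name = conn.sni if conn.uses_tls else None
    if server_name is not None:
        for chain in chains:
            if server_name in chain.server_names:
                return chain
    # Without a server name, only a catch-all chain can match.
    for chain in chains:
        if not chain.server_names:
            return chain
    return None


chains = [FilterChain("example-only", ["example.com"]), FilterChain("catch-all")]

tls_conn = Connection(uses_tls=True, sni="example.com")
# Cleartext HTTP: "Host: example.com" only shows up later, in the headers.
plain_conn = Connection(uses_tls=False)

print(pick_filter_chain(tls_conn, chains).name)    # example-only
print(pick_filter_chain(plain_conn, chains).name)  # catch-all
```

In this model the cleartext connection can never reach the `example-only` chain, no matter what Host header the client sends afterwards, which is the surprise a newcomer runs into.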
At this level, the filter chain that we're working with is operating on a new connection. A new HTTP connection that doesn't use TLS and SNI doesn't have a server name on which to filter. You have to wait to get the server name until you've read all the HTTP headers, and that's later. That's not the stage that's going on yet. So the challenge here is not that the information is wrong, because it's not. It's perfectly correct. The challenge is that it relies on concepts that a newcomer to Envoy will not have internalized. And really, when you get down to it, one of the most important differences between a novice and an expert isn't that the expert knows how to do more things than the novice knows. It's that the expert has a mental model about what is possible, so they know where to look to figure out how to do it. The novice doesn't have any of that. And again, I want to stress one more time, this is by no means unique to Envoy. We could pull examples of this straight out of the Emissary documentation as well. It's really not about Envoy, or not just about Envoy. The lesson here is that, again, the expert already has the mental model to make good use of the reference materials, but the newcomer needs a curriculum to guide them while they form those models, and creating that curriculum is really hard, although it's also really valuable. First recommendation here, if you're building on Envoy: start by remembering that Envoy is not likely to be something you swap into and out of your product. It's gonna be more of an investment. So it'll be up to you to decide which areas you spend more time in and which you spend less time in. Make the docs a place to spend more time, both in the sense of creating good documentation for your own product, but also in the sense of spending time with the Envoy documentation itself to learn what's going on.
Specifically with the documentation, Envoy's introduction and getting started sections are really good, but they tend to get overlooked a lot. You know, go in, spend some time with them. Any time you're trying to build your own curriculum out of reference materials, it's an iterative sort of thing. You should plan to keep prowling around in the docs repeatedly while you're learning, and you should expect to have your brain feel awfully full every so often as well. It's all part of it. Building the mental models takes time. Anything that we can do to improve the experience of building those mental models is a good thing for our users. The next challenge we ran into is that Envoy moves very, very quickly. It's not something where you just become an expert and you're finished. You have to maintain your expertise over time. This has to do, again, with the fact that it's an investment. Just in the time that I have been working with Envoy, I count 62 releases. We've gone through three major versions of the Envoy API with a fourth in the works, and I literally have lost count of the number of new features that have been added. Things move very, very quickly. Given that speed, you will need to pay more attention than you might expect to bug fixes and new versions and things like that. If, like us, you're an Envoy distributor, meaning that you build a product that incorporates Envoy so that you're shipping an Envoy binary with your product, then this goes at least double. Because in the case of a distributor, you have to pay extra attention to security releases, given that they always have a deadline and they almost always require action. So the most important recommendation here is that you always invest in keeping up with mainline Envoy, especially if you're a distributor. I can tell you from sad experience that it's not really much fun at all to scramble to manage a security release when you've diverged significantly from the mainline.
In turn, that implies that it's worth thinking carefully before you commit to shipping a custom Envoy build. So that also raises a question. Why did we decide to ship a custom Envoy build? Historically, this is because of auth. When we started, there was not an auth service in Envoy that did what we needed, so we wrote our own. Since then, our auth protocol has been added. It's the HTTP protocol in the mainline ext_authz filter now. But by the time that happened, we'd added some other small tweaks, mostly around backward compatibility for our users. And those can be kind of challenging to get into the mainline. So although we're moving closer and closer to being able to use a stock build, it's gonna take a little while yet. The custom build permitted us to do a lot of things, but it has definitely been costly in other ways. If I were starting now, I'd probably look very, very carefully at Wasm before committing to hacking on Envoy's C++ code. If you decide to go for a custom build anyway, keeping up with the mainline goes at least triple. It's difficult to manage security releases and track new features and things like that if you have to jump across many Envoy commits at once. It's much, much easier to just go through and handle small numbers of commits at a time. You should also be aware, when you're thinking about this one, that you will need to spend some time coming up to speed with Envoy's code base. It's pretty complex. Envoy uses C++ quite carefully, but every C++ project has its own project-specific idioms that you have to sort of get used to, and Envoy's no exception there. Envoy's code is also heavily, heavily performance-tuned, enough so that certain things that you would expect to look really easy actually look quite a bit more complex so that they can be faster. Another thing to be aware of is that building and testing Envoy tends to really benefit from beefy hardware.
Within Ambassador Labs, we tend to run Envoy builds and Envoy tests on a pretty beefy GCE VM with lots of cores and a ton of RAM so that we can run the tests from a RAM disk. That makes it really fast, and it really drives home how painful it would be to try to run it all on my Macintosh. And you know, I said it before, but it's worth repeating. Think carefully about whether you really need to commit to diving into the C++ and shipping your own custom build. You can contribute to the mainline without pinning your product to a custom build, and that may turn out to be a better plan, especially as Wasm and Lua are better supported and more performant. So finally, I would be remiss if I didn't mention a very positive lesson here, which is that the maintainers and the community around Envoy have both been really, really wonderful. Back in the time before COVID, Ambassador Labs was located closely enough to a bunch of the people working on Envoy in Cambridge, Massachusetts that we could and did get together with them, talk over things, get to know people. It was always a huge win. The maintainers and the community have always been very gracious and very helpful, and overall, I strongly recommend getting to know Envoy's people. It'll be a big help, and besides, it's very pleasant. So to recap, lessons learned. Yes, Envoy can be very complex. If you want to use it without a team of SREs devoted to it, you will need to be carefully opinionated, and you'll need to be opinionated in ways that resonate with the humans that you want to support. It is gonna take a significant amount of time and energy to sort out all of that. You're gonna need to start right now by taking the available reference material and building a curriculum for yourself that will let you develop the expertise that you need to do all this, and you'll need to do it while the product is moving very rapidly under you. On the other hand, you'll have a great set of people around you to help out with that.
In terms of recommendations, first, recognize that you're making an investment here with Envoy. Be careful how you approach it, obviously, but always invest in the documentation, both in the sense of studying the Envoy docs, being prepared to dive back into them more than once, and in the sense of building your own docs to help your own users. Likewise, always invest in keeping up with mainline Envoy. If you're shipping your own custom Envoy, this is an especially big deal. Even if you don't, you're gonna need to do it because the project is moving very quickly and it will help you a lot to not be behind. Finally, get to know the people. They're great, they're a great resource, always a good thing. So, thanks much for joining in. If you are looking for more information, you can find Emissary on GitHub. You can always join the Ambassador Labs public Slack or just drop me a line at flynn@datawire.io. Thanks much.