Hi, so for any of you who don't know me, my name is Alyssa Wilk. I'm one of the senior maintainers of Envoy, as well as one of the maintainers of Envoy Mobile, and I'm going to give you an overview of what we've been up to for the past year. So, Envoy highlights for 2023. I'm going to start by going through who's who on the project — even folks who contribute aren't always familiar with all the roles and who's doing what. We'll go through what we've accomplished in the last year in terms of features (there's still a ton of development going on in Envoy), releases, and project improvements, and then finally I'll go through what little we know of what's next for the project. So first I want to start off with the Envoy senior maintainers. I suspect almost everyone here knows this project would never have happened without Matt Klein, who wrote and open sourced the Envoy server. Harvey joined from Google shortly thereafter, took ownership of the Envoy APIs, and massively refactored them to really make the project a success. I joined shortly after that. Stefan joined, working on clusters and load balancing. Greg picked up and rewrote basically the entire TCP proxy and the TLS stack. Lizan joined, starting with gRPC and moving on to Wasm and other things. Yan has been with us for years, with an emphasis on security and reliability. We have Ryan Northey, whom most of us know as phlax — and I'll call him phlax from here on to avoid the Ryan disambiguation problem — who now owns all of our CI, our release tooling, and everything else. Ryan Hamilton joined and got our HTTP/3 stack up and working. And then we have our newest senior maintainer, Baiping, who started out doing performance work and has recently done a huge refactor of Envoy's upstream load balancing. Then we've got the Envoy maintainer crew — a lot of long-term, awesome people here. Josh has always been, and hopefully always will be, our go-to for stats.
We have Adi, who's taken over the control plane from Harvey. Kevin joined and has done a ton of work on overload, making sure that Envoy scales safely. Keith picked up a bunch of our Bazel work and knows Bazel better than even a lot of the Googlers. We have Kuat, who is our go-to contact with Istio. And then this year's new additions: we've got Raven over at Dropbox, who's picked up a bunch of work on caching and file system I/O, and Alex, who did our io_uring support and is continuing data plane work. So these are really the people who keep the project going. The Envoy maintainers do all the code review; senior maintainers have to review all changes to core code. But as you know, a lot of the core value in Envoy is actually in Envoy extensions, and the maintainers own and can do sole review of the extensions. We have a rotation, and we've tried to add a bit more visibility into it, so the Envoy maintainer on call is now publicly visible in our Slack channel. The maintainer on call gets daily updates on which PRs need to be assigned, which issues need to be triaged, and which reviews are getting stale or aren't getting responses, and we really try to keep things running smoothly for our contributors. So again, a ton of work goes into keeping things going, but if folks have things they think we can improve, please reach out to us — we're all on Slack. There's a lot that keeps Envoy working other than just the maintainers, though, so we have the Envoy API shepherds. Not everyone is aware of this, but the Envoy APIs aren't an Envoy-specific API. We actually plan on moving them out of the Envoy repo, though no one's gotten around to doing it yet. The Envoy APIs are used by several other projects, so they get their own separate round of reviews by the API shepherds, and we do not merge any changes to the Envoy APIs until they've been signed off by one of these people. Some of these folks, again, are Envoy maintainers you know pretty well.
Some of them are from other projects, like Ali from Envoy Mobile or Mark Roth from the gRPC project. Unlike the data plane — where if someone submits a bug, we can just fix it — APIs are forever, so we have to do a really, really good job of vetting them, and I appreciate the folks who do that extra pass. We've got the dependency shepherds crew. Back when we open sourced Envoy, we kind of pulled in dependencies willy-nilly; we didn't make sure that the key library components Envoy depends on were as secure and reliable as the actual Envoy code. We've fixed that over time, and we're still working on improving the dependency chain for some of the data plane, which I'll get to later. But basically, the folks on this list make sure that we avoid dependency bloat, that the dependency chain is trusted, and that updates are safe and sanely done. We've got the Envoy security team, and these people really are the unsung heroes of the Envoy community. This list includes the Envoy senior maintainers as well as a bunch of volunteers, and we could not do the security releases without these people. Envoy security is in charge of triaging all the issues sent to the Envoy security list. If you go to file an Envoy bug report with a crash, it warns you in big letters: please do not file crash reports in public. Please, please, please — email the Envoy security list, and then we'll let you know whether it can be fixed in the clear. If it's not part of the Envoy security policy — say it's an issue with an API or a startup crash — often we can just fix it right away in the open; or, if it actually is something that could be a threat to people running Envoy, we fix it under embargo and release it with our coordinated security releases. The people on this list respond to every email — we've got an SLO we try to stick to — and they're responsible for actually sorting out these issues, and this is where we really rely on the community.
Some people email us a stack trace and then never respond again, but ideally we have a back and forth: they help us reproduce the issue they've encountered, we come up with patches, and then there's a whole annoying backports-for-release process that takes weeks and weeks of frustration behind the scenes. And then we have patch day, and magically you see security releases appear. So huge, huge thanks to everyone who volunteers here, because the work isn't seen by the community, but it is seen by the maintainers. Then we have the extended Envoy community. Envoy also has senior extension maintainers — folks we trust to own an extension but who aren't actually doing Envoy maintainer work — so we have a couple of people who own some of the extensions that are seeing rapid movement. We've got the Envoy Mobile maintainers, of which I am one, who keep Envoy Mobile CI up and working and address any issues between the Envoy project and the Envoy Mobile project. We've got our distributors, who get our security patches early and test out high-risk changes. Again, these Envoy patches do not always go smoothly — sometimes we roll out a fix that we think is going to fix some crash and it adds another crash — so it's really, really important that we have people who are willing to try these out early and give us early feedback. And again, Envoy could not exist without all of our community contributors. Probably everyone here has contributed in some way — triaging on Envoy, filing a bug report, asking questions on Slack, or emailing security — so a huge thank you to everyone who does all of that. So, exciting stuff. Matt keeps joking that Envoy is boring, but when I look at what we've done in 2023, there's still a lot of work being done on this proxy. Just looking at new filters alone — I didn't list them all here, but I listed almost all of them — there's a lot of cool functionality.
So, things added this year: we have the custom response filter. If you're running Envoy, say, in a multi-tenant environment, your cloud and individual customers will want their own custom responses to 404s or 503s. The custom response filter can serve these responses either via local configuration or by fetching them from remote sources. We've added a file system buffer. If you have large downloads — fast upstreams and slow downstreams — and you don't want to use Envoy flow control and keep everything in network buffers, you can use the file system buffer to spool all that data to disk, so you're not holding these large responses in memory and potentially OOMing your machine. We've got the header mutation filter. Envoy has always supported a lot of header changes — adding headers, removing headers, editing headers — but this filter lets you do it at any point in the Envoy filter chain. Instead of doing it at the end in the router filter or at the beginning in the HTTP connection manager, you can plug the mutation filter in anywhere in the pipeline and configure more custom changes. We've got a JSON-to-metadata filter that lets you tag requests with metadata derived from JSON, and that metadata can then be used to enhance load balancing or to select which queries you want to log. We've finally got GeoIP. This is a huge addition to Envoy that people have been asking about for a long time: you can geolocate your client requests as they come in, and it's pluggable — you can plug in your own GeoIP data or use the MaxMind geolocation data, and it works seamlessly end to end. We've got a UDP capsule filter that allows you to proxy UDP over HTTP, essentially taking each UDP packet and encapsulating it as part of that TCP stream. And we've now got Golang support, so you can write L4 and L7 filters — network and HTTP filters — as Envoy Go extensions.
It's not just Lua and Wasm anymore. And last but not least, we've added a stateful session filter. Its one current use is cookie-based load balancing — hard sticky routing based on a cookie is something Envoy had been missing for a long time, and we now finally have that feature up and working. Outside of filters, there have been a lot of improvements to logging; one of the wonders of Envoy has always been its observability. First off, there are dozens and dozens of fields added to the access logs. I didn't even try to list them — you can look at the various release notes — but there are larger improvements than just adding fields here and there. We've added more flexibility to the L7 access log. For HTTP, you can now trigger access logs when you start a request (when you receive a stream from a client), on upstream assignment (when that stream is associated with a host), and you can do periodic logging as well — so if you have long-hanging requests, you can log at a preconfigured interval, which is nice. L4 already had some of this, but we added parity, so you can also do L4 logging on upstream connection assignment. So, much more flexibility in when you do your logs. We also have more flexibility in what your logs look like: we have Common Expression Language (CEL) access log support, and you can output your access logs in JSON, which a lot of people appreciate. Not access-log related, but for standard error logging we added the ability to do glob-based control. We've got Envoy fine-grain logging — there's a tech talk about it later, so I won't go into too much detail — but basically, when you're debugging your Envoys in production and want to turn up logging, you used to have an all-or-nothing choice. Well, you could set different log levels, but sometimes the sheer amount of logging can overwhelm the CPU on your machine. Now we have much more fine-grained flexibility.
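Jumping back to the stateful session filter for a moment, here is a minimal sketch of what cookie-based session affinity looks like in an HTTP filter chain. This is from memory of the v3 API, so verify the type URLs against the docs for your Envoy version; the cookie name and TTL are illustrative.

```yaml
http_filters:
- name: envoy.filters.http.stateful_session
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.stateful_session.v3.StatefulSession
    session_state:
      # Cookie-based session state: pin a client to the upstream host
      # recorded in the cookie for subsequent requests.
      name: envoy.http.stateful_session.cookie
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.http.stateful_session.cookie.v3.CookieBasedSessionState
        cookie:
          name: session-affinity   # illustrative cookie name
          path: /
          ttl: 120s
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```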
And then we added a bunch more SSL error metrics, which will hopefully help with live debugging. For security, reliability, and performance — something near and dear to my heart — there have been a ton of improvements. I mentioned our dependency chain: one of the things we've been working on over at Google is making our data plane as secure as possible. We've moved off of http-parser, which is an abandonware library for parsing HTTP/1.1 requests, and over to Balsa for HTTP/1.1 traffic. So now we have an underlying codec where the project has control: it has a standard security release model, and we can actually get bugs fixed — the last http-parser issue we had took something like six months to resolve. We get to add features too. I literally got a ping today asking, can we get custom methods now? And the answer is yes, we can; it's just a simple matter of code now. We promoted the TLS inspector for both upstream and downstream use. That functionality had been in Envoy for a long time but wasn't vetted for use in all production environments, so it's pretty exciting that it's now much more secure. We have better overload management: Kevin has put a ton of work into more flexibly rejecting requests under resource pressure — again, making sure your Envoys won't OOM is really important. We had a fascinating series of attacks in the wild called "Frameshift", if you ever want to Google that, and this quarter we ended up with an emergency security release to deal with the rapid reset attacks. We now have the option of limiting requests per event loop, so one client cannot overwhelm an Envoy worker thread, as well as rejecting connections with too many suspect requests matching the profile of this attack — and it's all tunable. We've got lazy stat creation, you can optionally do lazy cluster creation, and we have Maglev table compaction.
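As an illustration, the rapid reset mitigations just mentioned are runtime-tunable. A sketch of a static runtime layer in the bootstrap, using the key names I recall from the 1.27.x advisory (verify them against your release notes — the values here are illustrative, not recommendations):

```yaml
layered_runtime:
  layers:
  - name: static_layer
    static_layer:
      # Cap how many requests one connection can dispatch per event-loop
      # iteration, so a single client cannot monopolize a worker thread.
      http.max_requests_per_io_cycle: 10
      # Close connections that churn through many streams reset almost
      # immediately -- the profile of a rapid reset attack.
      overload.premature_reset_total_stream_count: 500
      overload.premature_reset_min_stream_lifetime_seconds: 1
```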
Lazy creation and Maglev compaction let you fine-tune Envoy to have a much better memory profile, especially if you're running lots and lots of clusters like some of us do. There's been significant hardening of ext_proc, and that's still ongoing work. The goal is that you can use ext_proc to talk to completely untrusted sources and still end up with completely safe changes to headers, body, and trailers — you don't have to trust the external source; you trust that Envoy will do the right thing. Another cool feature, which I think may be underutilized, is pluggable custom config validation. We think this is really cool: you can basically add your own rules and say, if I get an update where I don't have 500 clusters, something is wrong — reject this config. I know I need at least 500 clusters, I know I need at least this many endpoints; if I get something else, there's a bug in the system, so reject it, because it's going to cause 500s. And it's very flexible, so you can really add your own business logic to guard updates and make sure they work for your particular production environment. Now for things that don't fit in these buckets but that I still think are awesome. Upstream HTTP filters are now GA. Envoy now has the ability to do the normal header, body, and trailer transformations after the router filter, so you can attach an L7 filter to a cluster. If you're doing shadowing — say you're shadowing something to a customer upstream, but you're also shadowing it to a test upstream of yours — you can apply different filters and different sanitization to each. This has been helpful in our own security use cases, where we might treat an internal service differently from an external service and want to sanitize headers differently. It also enables some auth use cases.
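To make upstream HTTP filters concrete, here is a rough sketch of attaching a filter chain to a cluster via its protocol options. The cluster name and filter choice are illustrative; one real constraint I'll note is that the upstream chain has to terminate with the upstream codec filter. Double-check field names against the current HttpProtocolOptions docs.

```yaml
clusters:
- name: shadow_test_cluster   # illustrative name
  connect_timeout: 1s
  type: STRICT_DNS
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      explicit_http_config:
        http_protocol_options: {}
      # Upstream filter chain: runs after the router, per cluster.
      http_filters:
      - name: envoy.filters.http.header_mutation
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.header_mutation.v3.HeaderMutation
      # The upstream chain must end with the codec filter.
      - name: envoy.filters.http.upstream_codec
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.upstream_codec.v3.UpstreamCodec
```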
Once you know what the upstream connection is, you can apply checks based on which endpoint you're connected to, which can be really helpful from a security point of view. We have ECDS, the extension config discovery service, which is really exciting: it lets you update the configuration for a specific filter. So if one of the filters in your Envoy filter chain is, say, Wasm, and you want to reload that Wasm, you don't have to reload the entire listener filter chain — you can update just that one filter by itself, for listener and downstream filters. We added support for CONNECT-UDP. There's going to be a tech talk about that later today, but basically, if you've heard of Privacy Relay, Envoy now functions as a hop in a privacy relay, so you can use it in a deployment to join in improving privacy for the greater internet. We've got common expression matchers, so the Envoy matching language is now even more flexible and robust than it used to be. We added support for custom load balancer plugins, making it easier to add your own custom business logic there. And we have EDS Happy Eyeballs support: when you're getting your endpoints via EDS, you can now get both v4 and v6 addresses. This is great when your Envoys are talking to the open internet and you're not quite sure whether the v4 or v6 endpoints are going to be up or down. So, a lot of cool features. Moving on to releases. Envoy is on track with our four quarterly releases, and they were all within SLO. This is normal and boring — there's a bunch of work, but not a ton of work, to do this — and I'm glad there weren't any hiccups. We had planned security releases in April, July, and October; we're working on bringing that back to a quarterly cadence, so hopefully we'll have four security releases going forward. But we also had two zero-day releases, in July and October. July was unplanned.
One of the distributors accidentally broke embargo, and we had to create a process for what to do about that and postmortem the issue. They've since improved their internal process, so we're confident it won't happen again. And then in October, we had the Frameshift release — the planned, industry-coordinated one. This was an Envoy release where we weren't just trying to coordinate with our distributors; we were also trying to coordinate with Google, Cloudflare, and all of the other companies that were under attack and doing their own major releases. There was a lot of back and forth on what the release dates were, on how we were allowed to communicate and how we could communicate, and a ton of work put in, especially by Yan. I think it went out as smoothly as it possibly could, so that went pretty well — and hopefully you were all happy with your Envoys no longer being subject to live internet attacks. In 2023, we also saw a lot of tooling improvements. If you're on the Envoy developer side, hopefully you saw some of the massive improvements that phlax made to CI. A lot of them were caching improvements that really improved CI runtime. For example, coverage used to regularly take two to three hours; now it takes five minutes in the common case, though it can still take two hours if you're making enough changes. But I will say, as a developer, it's so nice not having to wait two hours to hit a CI flake and then have to retest — and he's continuing improvements along that line. We also did significant deflaking. This was a month or two of a very unpleasant experience as an Envoy developer. All of our CI tests were set up to retry because they were so flaky that they often failed, but as we turned up the number of retries, we ended up with more and more flaky tests, to the point that even with multiple retries CI wouldn't be clean. So what we did was turn off all the retries — and everything failed.
And then we fixed all the tests. We deflaked dozens and dozens of Envoy tests over those weeks, went from a 20% CI success rate back up into the 80s, and it's been fairly stable since. So thank you for your patience during that time — I think it really did make a positive difference. We're also gradually shifting CI from GitHub and RBE over to EngFlow. Instead of having our CI spread out over three or four different locations — where if any provider goes down, we lose one or two of our CI shards — it'll all be in one place, with fewer dependencies. That's the hope for next year. phlax also did a lot of work on a faster release process. We've dramatically improved our release publishing: it used to be 50 different steps of do this thing, touch this file, change this thing; now it's run the script, wait, run the other script, which is awesome. That tooling brought other improvements too, like having our debs and our multi-arch binary, so hopefully it's a better end-user experience as well. So, what's next? The next thing I'm excited about is that nghttp2 is being replaced with oghttp2 for HTTP/2 parsing. For those of you who have not memorized the issues and the explanation behind this: oghttp2 is probably going to be the biggest improvement to our dependency chain in the history of Envoy. nghttp2 has caused at least two zero days that I'm familiar with — possibly three or four. It's a project that does not have a policy of doing fixes under embargo and letting downstream consumers know, so when they have a query-of-death bug, they fix it in the open. Then we get to do an emergency zero-day release, everyone has to push emergency binaries, and it sucks. So we are moving over to oghttp2. We've tested it internally at Google, Pinterest has been running it, and it has worked successfully for them. Hopefully that will get flipped this quarter, and our supply chain will be stable to my satisfaction.
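For those who want to opt in before the default flips, the codec choice is guarded by a reloadable feature. Assuming the runtime key name I remember is still current (worth verifying against the release notes for your version), a bootstrap override would look roughly like:

```yaml
layered_runtime:
  layers:
  - name: static_layer
    static_layer:
      # Use the oghttp2 codec instead of nghttp2 for HTTP/2.
      envoy.reloadable_features.http2_use_oghttp2: true
```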
After we switch over to oghttp2, we're also moving to unified header validation, another really exciting project Yan has been working on. Essentially, if you look at Envoy CVEs over the last two years, a number of them were due to bad validation and sanitization code — issues like path normalization. The Envoy sanitization code is spread all over the code base, kind of helter-skelter across the different codecs, so we're unifying it: putting it in one place and making it really standards-compliant. The changes between sanitization today and standards-compliant sanitization will all be rolled out in accordance with the Envoy data plane principles — we will not break you without warning; any change will be announced and runtime-guarded. At the end of the day, this is going to be way more secure and way safer, and hopefully the end of major data plane sanitization changes, fingers crossed. We've got caching for EDS: basically, the endpoint discovery service and the aggregated discovery service didn't play well together, and you could end up with problems where you'd end up serving 500s. That's been fixed. We have a bunch of improvements planned for security release tooling. I mentioned our unsung heroes doing security releases — there's still a ton of scut work involved, some of it automatable, and phlax is getting on that. In my own wheelhouse, I'm working on Envoy Mobile. We have our first production Envoy Mobile deployment at Google, so if you use Google apps, some of them should be using Envoy Mobile by the end of the year, which I'm very excited about — having Envoy end to end. And we're adding xDS failover support: essentially, if your control plane endpoint goes down, you have a failover source with the latest cached configs, so that if your Envoys restart, they restart with roughly the latest and greatest config and have somewhere else to go.
If you noticed, there's a really big difference between the number of features we shipped last year and the number of features I have listed for next year. That's because, as always, Envoy's roadmap is community driven. About 90% of Envoy improvements land without significant premeditation: someone will just send out a pull request and say, here's a feature I want. If they're more organized, they might file an issue saying, here's a feature I want to do, and then send out the pull request. But very rarely does anyone file something saying, okay, in Q4 I have this planned — it's not like the internal business processes many of you deal with at your day jobs. I happen to know this list because I grabbed a couple of the maintainers who have that kind of internal process and know what they're working on. But again, most of the changes to Envoy don't come from just us, so I fully suspect that next year we'll have just as long a list — I just don't know what will be on it. So with that vagueness, I will open it up for questions.

Audience: Impressive work, congrats. You did mention — or maybe not — io_uring. What's the state of that? The reason I ask is because I read a Google thing where Google was kind of banning io_uring from all products, so I was wondering whether that also affects Envoy. There was some controversy around security issues and stuff — anything about political issues between io_uring and Envoy?

Alyssa: Yan, do you want to step up and take that one?

Yan: Google is actually moving towards using io_uring. It showed modest but very good improvements in performance, so there's active work. It's been driven by the Intel engineers for the most part, with a certain amount of Google participation, but there are no real conflict issues or anything.

Audience: Is it something that might be delivered next year?

Yan: It's very likely, but there's definitely a lot of work.
It's a fairly significant change in how events are processed within Envoy.

Audience: Thank you for your talk. You mentioned at one point that there's a plan to move the API out of the repository. Could you talk about the status of that and what implications it might have for development?

Alyssa: Yeah. The status is that everyone — the CNCF, the Envoy maintainers, and the other users of the Envoy APIs — agrees the APIs should live in their own repo. A couple of years ago it was decided they were going to be project neutral, and I believe we actually tried to forklift them out, and it broke like 18 things in the Envoy build because everything was very hooked in. We said, okay, we need to do some detangling and then we'll move them out — and then no one signed up to do that work. Some of the downstream people who use those APIs are very interested in it happening but don't have the staffing, and no one on the Envoy side is particularly motivated to go do the work, since it would mean a two-stage commit: you do the API change, and then in Envoy you bump a GitHub hash to pick it up. We've had similar workflows before, and they're really not that painful. It actually might make the review process slightly faster, because right now one problem for Envoy developers is that we have to do the API review before the code review, and the tooling is not where we want it to be in terms of reminding the right people. That's something on my task list to work on this year, and I've been working closely with the API shepherds so they get better notifications when they need to do a review. Currently, if they're associated with a review, they get pinged every time anything changes, so they tend to ignore it, and it tends to be a slow, painful process. If it were just the API files, they'd pay attention, because it would be very clearly in their bailiwick.
So yeah, I don't think it'll be a hardship, but I also wouldn't bet money on it getting done in the next year, just because it doesn't directly benefit anyone — it's just a cleaner abstraction. So I'll say: no concrete plans and no roadmap, but it is being discussed. Any more questions? Okay, thank you so much, Alyssa.