Hi, I'm Ryan Hamilton, an engineer at Google working on Envoy. I've been at Google for a little over 10 years, during which time I was involved in the design, development, and deployment of HTTP/3 at Google. I'm here today with Alyssa to talk about HTTP/3 and Envoy. This talk is in three parts. I'll cover the first two: an overview of HTTP/3 and the benefits it provides, and the experience we gained deploying HTTP/3 at scale. Then Alyssa will take over and cover the third part, Envoy integration. Let's dive in.

We'll start with an overview of HTTP/3 and the benefits it provides. But first, why is HTTP/3 so exciting? Quite simply, HTTP/3 is faster. Let's look at two metrics from Google that show how much. The first is search result page latency: how long it takes for search results to arrive. When HTTP/3 is used instead of HTTP/2, the page loads 8% faster. That's pretty significant when you consider just how optimized the Google search page already is for fast delivery. The second metric is from YouTube. You probably recognize the spinner that appears when you're paused waiting for more of your video to load. YouTube calls this rebuffering, and the mean time between rebuffering events is the MTBR. The longer it goes between rebuffers, the better. With HTTP/3 enabled, YouTube's MTBR improves by 30%. That's huge.

OK, but what is HTTP/3? HTTP/3 is simply the next generation of HTTP, following, creatively, from HTTP/2 and HTTP/1. It was created at Google about 10 years ago and recently standardized at the IETF. It's supported in all major browsers, in many mobile apps, and in server stacks and server deployments. According to Cloudflare Radar, 17% of internet traffic is currently HTTP/3. And services like Facebook, which have leaned heavily into HTTP/3, are seeing something like 75% of their traffic coming in over HTTP/3. Quite simply, it's designed to be fast on the internet.

So let's look at how HTTP/3 delivers these benefits, starting with the stack. Traditionally, with HTTP/1 and HTTP/2, we have IP packets at the bottom, over which we run TCP, on top of which we stack TLS, and then finally HTTP/1 or HTTP/2. With HTTP/3, this is quite different. Instead of TCP, we use UDP. On top of that runs a new protocol called QUIC, which combines the reliability features of TCP with the multiplexing features of HTTP/2, and internally makes use of TLS 1.3 for the handshake and for packet encryption. And on top of that sits HTTP/3.

Let's look at two features that give HTTP/3 its performance wins. The first is faster connection establishment through the zero-RTT handshake, the zero-round-trip handshake. With TCP, there's the common SYN/ACK exchange, the three-way handshake between a client and a server: after one round trip of packets between the client and the server, the client is able to send application data. But of course, these days we don't just want to send data between clients and servers, we want to send it securely. So we need to run a security layer on top, typically TLS, which adds its own set of round trips to the handshake. With TCP plus TLS 1.2, we're looking at somewhere between two and three round trips before application data can be sent to the server. With QUIC, which is equivalent to TCP plus TLS, we're down to between zero and one.
If a client has spoken to a QUIC server before, the server will have given that client an encrypted token, which the client can use in subsequent connections. So it sends the ClientHello and can immediately follow it up with encrypted data. Round trip times on the internet can, not uncommonly, be in the range of 250 milliseconds. So if you're doing two or three round trips before you're able to send data, that's somewhere between half a second and a full second waiting for the connection to be established. That's a long time. With QUIC, if you can get that down to zero, it's a big win.

Next, let's talk a little bit about head-of-line blocking and the evolution of HTTP. Back in the HTTP/1 days, each request would go on a dedicated TCP connection. If you wanted multiple requests in flight at once, you needed multiple TCP connections. If you wanted a lot of requests in flight, you needed a lot of TCP connections. But of course you can't have an infinite number, so browsers impose limits on the number of sockets they'll open to a particular site. Unfortunately, each of these TCP connections competes with the others for congestion control, and they often sit idle in the gap between a request and a response. So this leads to really suboptimal resource utilization.

This is where HTTP/2 comes in: a single connection is created between the client and the server, and all of the requests are multiplexed onto that single TCP connection. This is great because it allows us to do new things with HTTP, like compress headers for increased efficiency and introduce prioritization, by which low-priority requests can be delayed in favor of high-priority ones. However, because it's a single TCP connection, a single in-order sequence of bytes, packet loss can be quite problematic. When packets make their way onto the internet, the internet sometimes decides to drop them, and on a TCP connection, any data after the lost packet can't be delivered up to the application until that data is retransmitted and arrives. This is called head-of-line blocking, and it's a real problem for latency.

And here's where HTTP/3 comes in, using QUIC. QUIC runs over UDP, which doesn't present a single in-order sequence of bytes. Instead, it's a series of datagrams sent from the client to the server, each capable of being processed independently. So in the case of HTTP/3, each stream is a series of QUIC packets which go off onto the internet. Some of them may be dropped, but when they are, the data in the other streams can still be processed. This removes the head-of-line blocking completely.

All right, let's switch gears and talk a little bit about the deployment experience we gained when we turned HTTP/3 up. The biggest lesson is that fallback is simply required. HTTP/3 is new; TCP is not. The internet is old, and TCP has been a part of it from the beginning, so basically all internet infrastructure is expected to allow TCP traffic to pass. If you're on a network and your TCP traffic doesn't get through, your network is basically broken and it's time to complain to somebody about it. Unfortunately, that's not true for QUIC and HTTP/3. There are lots of devices out there that block all sorts of things, including QUIC and HTTP/3, so an HTTP/3 client needs to be able to fall back to HTTP/1 or HTTP/2 in these situations.
Some data we have from Chrome suggests that currently something like 10% of all users will never be able to make QUIC connections and will always need to fall back to HTTP/1 or HTTP/2. There are a couple of network pathologies that lead to this behavior. The most common is that all QUIC traffic is blocked: when a client tries to establish a QUIC connection, the QUIC handshake fails, the connection is never established, the client notices this, switches to a TCP connection instead, and sends requests over that. Clients are often smart and will notice that QUIC is broken and avoid attempting subsequent QUIC connections as a performance optimization.

However, some networks are worse. Some networks will cause failures after the handshake by blackholing all packets, or connectivity will randomly drop in the middle of a connection. So clients need mitigations for this. The first thing they need is detection: clients need to be able to notice that they're sending packets into a black hole, that everything they send results in nothing coming back. But what do you do once you've detected that? It turns out a really simple thing you can do as a client is change the source port your traffic is going out on. Because QUIC isn't running over TCP, it isn't tied to a single source port, and when the client switches to a new port, for some reason, magically, traffic seems to get through much of the time. Another case is connection migration: when you walk from your house to your car, sometimes you walk out of Wi-Fi coverage and experience a connectivity gap. QUIC is able to notice that this isn't working, notice that your cell connection is available, and swing the connection from Wi-Fi to cell, and your connection seamlessly continues. All of these problems are solved today, or will be solved very shortly, in Envoy and Envoy Mobile. So this is a great set of technologies to use for doing HTTP/3 in production. Let me turn it over to Alyssa to talk about Envoy integration.

Hey, I'm Alyssa, one of the Envoy senior maintainers and a longtime server and protocols developer at Google. Before I dive into how this all works, I want to do a call-out to Dan, the Google engineer who did the heavy lifting for all of this work. She's actually heads-down closing out the last few MVP blockers to get Envoy QUIC to GA for an upcoming launch, so as I've been doing much of the QUIC client-side integration, I'm standing in to talk about how it all hangs together.

Envoy has this amazing series of API abstractions with a clean separation of function. As new connections land on the machine, they get accepted and assigned to a worker thread. Data is read from and written to a transport socket, for example the TLS socket, where the transport socket is responsible for decrypting data from the wire and encrypting data that it writes out. As data is read, it's passed through a configurable series of L4 filters, which can inspect the data or do transformations on it. For HTTP, the last L4 filter in the chain is the HTTP connection manager, which does the transformation from L4 to L7. It does this by handing the L4 data to a codec, previously only HTTP/1 or HTTP/2, and the codec is responsible for understanding the framing for that wire format: reading in data and spitting out streams with headers, body data, et cetera. Each of the streams has its own series of L7 filters, culminating in the router filter, which ships the data upstream. Roughly, that pipeline looks like the sketch below.
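To make that concrete, here's a minimal sketch of the listener pipeline Alyssa just described, for a plain HTTP/1 and HTTP/2 listener. Names like ingress_http and some_service are placeholders, and the config is trimmed to show the shape rather than a complete working file:

static_resources:
  listeners:
  - name: ingress_http
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      # The HTTP connection manager is the last L4 filter in the chain:
      # it hands the bytes to an HTTP/1 or HTTP/2 codec, then runs each
      # stream through the L7 filters below.
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO   # pick HTTP/1 or HTTP/2 based on the wire
          route_config:
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: some_service }   # placeholder cluster
          http_filters:
          # The L7 filter chain ends with the router, which ships the
          # request upstream.
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router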
Things don't work quite the same way for HTTP/3. To start with, getting the packets to a connection, and from a connection to a worker thread, is a little more complicated. Whereas accepting a TCP connection makes it really simple for TCP data to land on the right worker thread, the kernel doesn't know about HTTP/3. The default kernel mechanism for shipping UDP packets to threads is hashing on IP addresses and ports, which is great if you have a stable connection, but less good if you want to handle port migration or connection migration. Instead of worrying about these misrouted packets, we install a Berkeley Packet Filter which effectively teaches the kernel how to hash on QUIC connection IDs, ensuring packets land on the right worker thread.

Now that we've solved the problem of landing packets on the right threads, we need to get them to the right connections. All the packets for this thread's connections will land on the same listener, and something needs to sort them into individual connections. That entity is the dispatcher. It takes in all the HTTP/3 packets for that thread, creates new logical connections for new handshake packets, and for established connections just routes the packets to those connections.

QUIC has one option for crypto, and it's TLS 1.3, so the server simply won't start up if you configure a different transport socket for QUIC. As the dispatcher handles forwarding packets to the right connections, there's no need for the transport socket's traditional read/write support. There's also currently no support for L4 filters for HTTP/3: each QUIC stream is logically both an L4 connection and an L7 stream, and we treat them as the latter. There's still the same need to turn the stream of packets for a connection into a series of streams. The Envoy QUIC server session does this, creating new server streams for each HTTP/3 stream. These map one to one with the streams inside the HTTP connection manager. So as headers and body data arrive on the QUIC server streams, they're passed to the Envoy stream representation and through the L7 filters, just as data is for HTTP/1 and HTTP/2.

OK, now that we've established how HTTP/3 works in Envoy, the real question is: how do we use it? I'm going to run through three example configurations, using HTTP/3 downstream and using HTTP/3 upstream. All of these examples are linked in the Envoy repo, so if you have an Envoy instance, you can run them pretty easily just by pointing the --config-path flag at one of the files listed above.

The first thing we're going to run through is how to use Envoy to terminate HTTP/3 traffic. If you have edge Envoy deployments today, you can pretty trivially get the latency and quality-of-experience gains for all major browser traffic just by adding these sections to your config file. For HTTP/3 deployments, you still need to maintain a TCP listener doing TLS for HTTP/1 and HTTP/2, and responses from that listener should inform clients that you now support HTTP/3. We do this here by sticking an alt-svc header in the route's response_headers_to_add; a trimmed sketch of both pieces follows. So for this route, browsers will now try to do HTTP/3 for all the hostnames they encounter, and they'll direct that HTTP/3 traffic to port 10000. Normally in production that would be 443, but for the example configs we wanted folks to be able to play around without needing root credentials, so we picked an alternate port. Of course, you also have to have the HTTP/3 listener, or rather a UDP listener, configured to do HTTP/3.
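Here's a heavily trimmed sketch of those two pieces, in the spirit of the example configs in the Envoy repo. Certificate paths, listener names, and the cluster name are placeholders:

# 1) On the existing TCP+TLS listener, advertise HTTP/3 on the route:
route_config:
  virtual_hosts:
  - name: backend
    domains: ["*"]
    routes:
    - match: { prefix: "/" }
      route: { cluster: some_service }
      response_headers_to_add:
      - header:
          key: alt-svc
          value: 'h3=":10000"; ma=86400'   # "HTTP/3 lives on UDP port 10000"

# 2) The companion UDP listener that actually terminates HTTP/3:
- name: listener_http3
  address:
    socket_address: { protocol: UDP, address: 0.0.0.0, port_value: 10000 }
  udp_listener_config:
    quic_options: {}   # tells the listener code it'll be doing QUIC
  filter_chains:
  - transport_socket:
      name: envoy.transport_sockets.quic
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.quic.v3.QuicDownstreamTransport
        downstream_tls_context:   # QUIC crypto config, i.e. TLS 1.3
          common_tls_context:
            tls_certificates:
            - certificate_chain: { filename: certs/cert.pem }   # placeholder paths
              private_key: { filename: certs/key.pem }
    filters:
    - name: envoy.filters.network.http_connection_manager
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        stat_prefix: ingress_h3
        codec_type: HTTP3
        # route_config and http_filters as on the TCP listener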
On the HTTP/3 side, it's a standard UDP listener, with QUIC options to tell the listener code it'll be doing QUIC, and then a transport socket with a QUIC crypto config to make sure it does TLS 1.3 correctly. The listener's running on port 10000, since that's what the alt-svc advertisement is set to. You can also get fancier with the alt-svc headers, directing QUIC traffic to a set of canary clusters rather than having your QUIC and TCP servers in the same process, if you want to scale up more slowly and carefully.

Getting Envoy to do HTTP/3 upstream is even simpler: you just configure the cluster's protocol options to do HTTP/3 and configure a QUIC transport socket. Note that this particular setup is great for a lab environment or a local service mesh where you can guarantee that HTTP/3 will work on that network. It's not a sensible thing to do for internet-facing deployments: it's hard-coded to do only HTTP/3, and there's no failover to TCP if UDP 443 is blocked.

The next example shows Envoy negotiating the upstream protocol. In this case we've switched from the explicit HTTP config to the auto config, meaning Envoy will use the best available protocol. It'll start up doing TCP and TLS, and use ALPN to negotiate whether it's speaking HTTP/1 or HTTP/2. If it gets a response with alt-svc headers advertising HTTP/3, Envoy will cache that the endpoint supports HTTP/3 and attempt to use it going forward. If HTTP/3 fails consistently despite the alt-svc advertisement, Envoy will keep on using TCP connections, and eventually HTTP/3 gets marked broken. The important part is that with this configuration, Envoy will try to use the best available protocol while making sure that, at the end of the day, the queries get through as quickly as possible. There's more detail in the Envoy docs about the timing of trying HTTP/3, failing over to TCP, and how connections get marked broken, so feel free to check those out if you want the gritty details. We've largely set up the default timings to match what Chrome's been doing for failover, which has been tested at scale in the real world. It does a great job of balancing the desire to use HTTP/3 where possible while minimizing time spent waiting on HTTP/3 connections on networks where it's blocked. Sketches of both upstream variants follow at the end of this section.

One last thing I want to call out before we go to Q&A: the QUIC GA is coming really soon. The config has stabilized and is being tested at scale in production as we speak. We're down to four blockers, which are all pretty small. I was honestly hoping it would land this week rather than in a week or two, so I could announce it here, but we'd really rather have it solid than rush anything. After that, our team is going to be heads-down getting fully functional HTTP/3 for Envoy Mobile. This is largely making sure that things like the alt-svc entries and zero-round-trip credentials are cached persistently across application restarts; you lose some of the latency benefits if, every time your app loses focus, it forgets how to use HTTP/3 and has to do a full handshake. But there's also work to be done on the black hole detection and port migration things that Ryan mentioned earlier. Hopefully we'll be able to finish that work quickly and start using HTTP/3 for Envoy Mobile early next year.
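For reference, here's a rough sketch of the two upstream variants from a moment ago. The cluster name, endpoints, and SNI are placeholders, and the negotiated variant is abbreviated; check the HTTP/3 upstream docs for the full transport socket wiring in your Envoy version:

# Variant 1: hard-coded HTTP/3 upstream (lab / service mesh only).
clusters:
- name: backend_h3            # placeholder name
  connect_timeout: 5s
  # type / load_assignment (endpoints) elided
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      explicit_http_config:
        http3_protocol_options: {}   # HTTP/3 only, no TCP failover
  transport_socket:
    name: envoy.transport_sockets.quic
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.quic.v3.QuicUpstreamTransport
      upstream_tls_context:
        sni: example.com      # placeholder

# Variant 2: negotiated. Swap explicit_http_config for auto_config and
# Envoy starts on TCP+TLS with ALPN, then upgrades endpoints that
# advertise HTTP/3 via alt-svc, remembering them in a cache:
      auto_config:
        http_protocol_options: {}
        http2_protocol_options: {}
        http3_protocol_options: {}
        alternate_protocols_cache_options:
          name: default_alternate_protocols_cache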
We should have time for Q&A after this, but if you're watching this online or have questions we didn't get to, feel free to reach out to us on Envoy Slack. We keep an eye on the #envoy-udp-quic-dev channel, so it's a great place to keep in touch with what's going on or ask questions about what's coming. So thanks, and any questions?