The GRPC Route to Success. My name is Arko Dasgupta. I'm a software engineer at Tetrate and a maintainer on the Envoy Gateway project. And I'm Richard Belleville from the GRPC team at Google. This talk is about the new GRPC route resource, part of the Gateway API. This presentation is a special milestone for GRPC route because, after two years, it is finally heading to V1. So we are excited for it to make its way into the hands of a bunch of people in the near future, hopefully some of you. In this talk, we'll be covering the basics of GRPC and the Gateway API for people who are unfamiliar. We'll lay out a few great reasons to start using GRPC route today. And then we'll tease you with a look at where GRPC route may be heading in the future.

Before we jump into that, though, I have one thing to plug. Normally, this is where we like to show you all of the great related talks at KubeCon this year. But as you know, we are very close to the end of things, so we only have one talk to plug right now, which is "What's New in GRPC." This is a maintainer talk run by the GRPC team, covering everything that's happened with GRPC recently, including a brief mention of the subject of this talk, GRPC route. And that is happening immediately after this talk. So if you want to follow me down the hall, I'd love for some of you to join me there.

All right, let's dive right in. I think most of you are probably familiar with GRPC, from the show of hands I saw. But for those of you who are new to it and curious, let's start with a brief recap. GRPC is a high-performance remote procedure call, or RPC, framework. You write what are basically function signatures in a language called protobuf, and we will generate code for you that will efficiently serialize requests, send them over the wire, deserialize them on the other end, and then hand them off to a server that you have written. GRPC has a special place in the Kubernetes ecosystem because it's the basis for many of Kubernetes' own APIs, for example the Container Storage Interface, the Container Network Interface, and the upcoming KNI, or Kubernetes Networking Interface. Beyond the Kubernetes control plane, GRPC is used by thousands and thousands of engineering organizations to get their bytes where they're going and then make sense of them on the other end.

For those of you coming from the REST-or-JSON-over-HTTP world, let's call out a few key differences between GRPC and HTTP. The first big difference is that GRPC is schema-first. That means that you write your API definition first, rather than as a documentation step. Your boilerplate client and server code then gets generated from that definition. GRPC uses protobuf for serialization and deserialization. This is a fast, space-efficient binary encoding. It's great for speed and efficiency, but it makes it a little bit more difficult to debug and to send RPCs manually. Then there's the transport. GRPC is built on top of HTTP/2, which solves some of the blemishes of HTTP/1.1. GRPC does things a little differently from browsers with HTTP/2. Browsers and REST clients generally only use HTTP/2 with TLS. For plaintext connections, which are useful for debugging or when running over a secure tunnel, browsers and HTTP clients generally only use HTTP/1.1. But GRPC is all HTTP/2, all the time. It's totally normal with GRPC to use HTTP/2 cleartext without first upgrading the connection from HTTP/1.1. This also factors into the Gateway API. And the final difference I'll highlight is GRPC streaming.
Stream-based APIs provide a ton of benefits, including making it much easier to build reactive applications. Streaming RPCs can last for hours or longer, but long-lived connections can also be a little bit more difficult to route; we'll jump into that later. And even if you're not using a streaming RPC, the GRPC library keeps TCP connections around a lot longer than most REST clients. We reuse that connection to send multiple RPCs and to save you the latency of the TCP handshake. Again, you need to make sure that your routing handles this well. And with that, we'll hand it over to Arko.

Thanks, Richard, for the introduction on GRPC. Now, here's a brief introduction to the Gateway API. The Gateway API is a Kubernetes project that started in 2019 and focuses on layer 4 and layer 7 routing. These are just APIs; they require an implementation to be installed in the cluster that can consume them and implement the routing behavior defined in the API. It's a successor to the Ingress API, which focused on exposing a service inside Kubernetes to the outside world. As the next generation of Ingress, the Gateway API has managed to make many improvements. It supports multiple protocols, such as HTTP, TCP, UDP, and GRPC. It has strongly typed fields for traffic capabilities, such as header transformations and traffic mirroring. And it splits up the Ingress intent into multiple smaller resources: the GatewayClass, which is linked to the controller and holds common configuration for a set of Gateways; the Gateway, which links to the GatewayClass, holds listener configuration, and defines how external traffic can be accepted into the cluster; and then the collection of route resources defined for each protocol type, like HTTP route, TCP route, UDP route, TLS route, and GRPC route, which link to the Gateway to define routing and traffic-shaping filters, allowing users to match on specific attributes within the traffic flow so traffic can be routed to the service backends.

These smaller resources can now be mapped to roles within an organization. For example, the Gateway resource can be used to expose a hostname on a specific port and also specify the TLS certificates needed for TLS termination and authentication. This is something only an app admin or a cluster operator should be able to configure. Similarly, only app developers have the knowledge to define routing and traffic-shaping rules for traffic meant for their service backends. This split now allows the organization to have isolation at the resource level while still sharing the same implementation infrastructure under the hood.

Let's talk about Gamma. Gamma is short for Gateway API for Mesh Management and Administration. It was created in 2022 as part of the Gateway API to focus on roles and traffic management for east-west communication. There's a big overlap between the routes and policies defined in the Gateway API for north-south and east-west, so this move makes a lot of sense. This slide shows one of the API semantics defined by the Gamma working group: there's the service producer, the server, and routes, like an HTTP route, that can be defined for it. The route links to the producer Service using the parentRef field and defines specific traffic rules for any traffic meant for this producer. Now, for any service consumer or client that's trying to reach this producer using the Service's FQDN as the front end, that traffic will undergo any matching or transformations defined in that route.
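To make those Gamma producer-route semantics a bit more concrete, here is a minimal sketch of an HTTP route attached directly to a producer Service via parentRefs rather than to a Gateway. The Service name, namespace, and port are hypothetical, and exact support varies by mesh implementation.

```yaml
# Illustrative Gamma-style route; the "echo" Service and namespace are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo-producer-route
  namespace: echo              # same namespace as the producer Service
spec:
  parentRefs:
  - group: ""                  # core API group, i.e. a Service rather than a Gateway
    kind: Service
    name: echo
  rules:
  - backendRefs:
    - name: echo
      port: 8080
```

Any consumer that addresses the producer's FQDN (for example, echo.echo.svc.cluster.local) would then have the matching and transformation rules from this route applied by the mesh.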
This background now brings us to the main topic of this presentation, the GRPC route. This API was introduced by Richard from the GRPC team in 2022. The API proposal, also called a GEP, highlights the need for a dedicated API for routing GRPC traffic to the back end. The focus here is on routing GRPC over HTTP/2 as the transport, with protocol buffers as the message format. The existing HTTP route could technically be used to route GRPC traffic, but the addition of a new route type significantly improves the user experience. As you can see on the right side, the proposal introduces support for method matchers, allowing users to match on GRPC-specific constructs, such as the GRPC service and method. This API is currently in alpha, but it's planned to go to V1 next month. It's great to see so many implementations already supporting GRPC route today. Conformance tests for GRPC route have recently been added, and we're happy to share that they pass on Contour, Cilium, and Envoy Gateway. With multiple implementations passing conformance tests, we've met the last graduation criterion for GRPC route. And with GRPC route going to V1 soon, we're hoping more implementations add support for it so more users can access it.

I admit I might have packed too much information onto this slide, but I wanted to highlight the additional features within an implementation that can be enabled knowing that GRPC route is configured in the system. The implementation could emit additional stats for GRPC. If the protocol is set to HTTPS on the Gateway listener, it can choose to only advertise the h2 ALPN and not support an upgrade from HTTP/1.1. Similarly, it can directly connect to the back-end pods over HTTP/2 without an initial upgrade from HTTP/1.1. GRPC-Web may be enabled to support GRPC-Web traffic. Least-request load balancing can be enabled by default to spread the load across pod endpoints. Richard will elaborate later in this talk on why this is important.

Thanks, Arko. In addition to those functional benefits, GRPC route also provides some great UX improvements for routing GRPC traffic. The biggest improvement in the current version of GRPC route is method matchers. GRPC is, again, a remote procedure call system; it more or less enables you to call functions on a remote machine. And while GRPC uses HTTP/2 URIs under the hood, users of GRPC shouldn't have to know or care about the way that GRPC maps to those URIs. Method matchers let you route traffic for individual GRPC services and GRPC methods without having to use those implementation details of the GRPC protocol. It's pretty common to route an entire GRPC service to the same back end, so if you omit the method field, that will route the whole service to that back end. It's somewhat less common, but you may have multiple services that have the same method names, for example a v1 and v2 version of the same service. In that case, you can omit the service name to route all RPCs with the same method name to the same back end. There are a few other UX improvements that we have planned for GRPC route, and we'll touch on those in just a little bit.

Now, diving into the service mesh side of things. Arko touched on this a few slides ago with the Gamma API slide. One of the most confusing things for folks using GRPC on Kubernetes is the long-lived connection aspect I mentioned a while ago, and there have been several talks on this in the past, at KubeCon and outside of KubeCon.
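As a rough illustration of those method matchers, here's a minimal sketch of a GRPC route. The Gateway name, GRPC service, and backend are placeholders, and the apiVersion assumes the V1 graduation discussed above.

```yaml
# Illustrative only; the Gateway, GRPC service, and backend names are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: GRPCRoute
metadata:
  name: echo-route
spec:
  parentRefs:
  - name: example-gateway
  rules:
  - matches:
    - method:
        service: example.v1.EchoService   # omit "method" to route the entire service
        method: Echo                      # omit "service" to match this method name across services
    backendRefs:
    - name: echo-v1
      port: 9000
```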
When your client creates a new connection for each outgoing request, as many HTTP clients do, you'll go through a full DNS and routing flow for every request. But since GRPC keeps connections around for as long as possible to improve performance, you only go through that process perhaps a single time over the course of an hour. Now, the most common way to route requests to a collection of server pods in Kubernetes is a Service with type ClusterIP. Services with type ClusterIP get a virtual IP, or VIP, allocated. This is the single IP for that Service, and when a TCP connection is made to that IP, kube-proxy will balance that connection to one of the server pods associated with that Service. When your client creates a new connection for each outgoing request, this will result in all of your requests being balanced across your server pods, so the request load and CPU utilization should be pretty uniform across all servers. For example, suppose you had 10 backends and 100 clients, each of them with the same request load; then the request load across your backends might look something like this. In this case, the HTTP client gets a relatively uniform distribution, but GRPC's is much less uniform with ClusterIP.

So what's happening? We've got a basic setup here: one HTTP client, in this case configured to send requests to a service named Foo. The client starts its first request to Foo, and the first thing it does is send a DNS request for Foo. KubeDNS gives the client the virtual IP I mentioned, in this case 10.0.1.1. Then the client uses that IP to establish a new TCP connection for the request. Now Kubernetes will see the VIP as the destination of the TCP connection and balance it to one of the backends. In this case, the client got connected to backend one, and backend one returns the result of request one. Good. Then the HTTP client sends its second request. It may or may not do another DNS request, depending on caching, but it will almost certainly initiate a new TCP connection, and that connection goes to a backend, in this case backend two. Great. We have already seen a spread across backends. Third request, same story: yet another new TCP connection, yet another chance at hitting a different backend. Okay, so we have recapped how the vast majority of HTTP traffic in Kubernetes works.

What happens when you drop a GRPC client into this picture instead? DNS looks the same. The GRPC client asks for Foo, and it gets back the VIP 10.0.1.1. The first request also looks the same. It's when we hit the second and third requests that we realize things aren't working the same way as with the HTTP client. GRPC keeps TCP connections around for quite a while, because setting them up and tearing them down is costly in terms of latency, CPU, and network congestion. But because GRPC keeps them around, the ClusterIP routing mechanism just isn't super effective. As a result, you get some pretty bad balancing. Coming back to this graph, this is really just the law of large numbers. If you repeatedly roll a die, the chart showing how many times each face has shown up is not going to look uniform until you hit a surprisingly large number of dice rolls. In the example here, GRPC got 100 rolls of the die, while HTTP got 10,000. That's a big difference, and that difference clearly has implications for load distribution. But of course, I'm telling you about this problem because GRPC route makes it better, and the answer is service mesh.
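For reference, the setup described above is just an ordinary ClusterIP Service; a minimal sketch, with hypothetical names and ports, might look like the following. Each new TCP connection to the VIP is balanced to one backend pod by kube-proxy, which is exactly the step a long-lived GRPC connection only goes through once.

```yaml
# Hypothetical Service "foo" fronting the server pods; ClusterIP is the default type.
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  type: ClusterIP            # allocates a single virtual IP (VIP)
  selector:
    app: foo                 # the server pods behind the VIP
  ports:
  - port: 50051              # illustrative gRPC port
    targetPort: 50051
```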
Service mesh moves all of the routing decisions directly into the client pod, so you will get rock-steady uniform load if that is what you want. And if you don't, you can configure the load balancing to happen exactly the way that you do want it. In the past, when folks have reported this issue to the GRPC team, we've recommended service mesh to them, and I've often thought, wow, that is a really heavy hammer. Service mesh involves a lot of setup, a lot of routing configuration, and that routing configuration might actually be done outside of the cluster. But now, with GRPC route, setting up a service mesh in Kubernetes has become easy enough that I no longer feel any hesitation in recommending it as the solution to this problem. It really is the simplest way to solve this issue. There are a couple of other, more invasive solutions that don't require service mesh, and if you're interested in those, I invite you to check out the previous talks on this issue at KubeCon and elsewhere.

So, those are a handful of reasons why you would choose to use GRPC route today. But as soon as the idea of a GRPC route resource was brought up, the community was eager to suggest a bunch of features, not all of which made it into the first revision. We've already touched on a few of those. The first is a feature that's already somewhat real: GRPC-Web transcoding. For those of you who aren't familiar, most browsers do not implement the full HTTP/2 spec, and therefore they can't support the GRPC protocol as it's run in data centers. As a result, there's a sibling protocol called GRPC-Web that lets you use the same developer flow as GRPC from the browser. Historically, you've had to configure a proxy to do this transcoding, but several implementations of GRPC route, including Envoy Gateway, will actually take incoming GRPC-Web traffic and automatically transcode it to GRPC proper. These implementations are running ahead of the spec; there's actually nothing in there yet about GRPC-Web transcoding. The next step is to use what's out there in the wild as the basis for an extension to the GRPC route specification that includes GRPC-Web transcoding.

Very similar to GRPC-Web transcoding, there is HTTP-plus-JSON, or REST, transcoding, for very similar reasons as GRPC-Web: some people just prefer to use REST from the browser. These people put HTTP annotations in their proto. These add extra information that allows a proxy to map certain HTTP URIs to GRPC methods. So, REST in and GRPC out. In the example here, you can see we've mapped our Echo RPC to /v1/example/echo, so you should be able to hit the gateway with that URI and trigger a GRPC RPC to the back end. At the moment, the most common way that people accomplish this is to generate proxy code from their proto, compile that into a binary, deploy that into their cluster, and configure traffic through that proxy. Exhausting, right? Well, the idea here would be to add a single boolean knob to GRPC route that says, hey, give me REST in and GRPC out. And voila, the Gateway API controller that you've installed will take care of all of that for you.

And then there's reflection. For those of you who aren't familiar, protobuf is binary on the wire, so it's difficult to craft requests by hand. You can do it if you have the proto definitions on hand, but that's a real pain. Instead, most people make use of the reflection protocol. Reflection is a secondary service you run on your server that tells clients the type information for that server. With a reflection server, tools like grpcurl just work.
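Because reflection is itself just another GRPC service, it can be matched like any other, which matters for the routing issue that comes up next. Here's a hedged sketch of what a catch-all rule for the reflection service might look like, with the same caveats as the earlier GRPC route sketch: the Gateway and backend names are hypothetical, and the apiVersion assumes the V1 graduation. The service name shown is the v1alpha reflection API that most servers still expose; newer servers also expose grpc.reflection.v1.ServerReflection.

```yaml
# Illustrative only; the Gateway and backend names are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: GRPCRoute
metadata:
  name: reflection-route
spec:
  parentRefs:
  - name: example-gateway
  rules:
  - matches:
    - method:
        service: grpc.reflection.v1alpha.ServerReflection
    backendRefs:
    - name: echo-v1        # the server that should answer reflection requests
      port: 9000
```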
Unfortunately for folks running GRPC servers on Kubernetes, this means that in addition to your application's standard RPC services, you also have to route the reflection service to those servers. You wouldn't believe the number of times I've solved people's routing issues by telling them to add a catch-all route rule for reflection. In the future, we'd like to extend GRPC route to automatically add routes for reflection, probably as an opt-in feature. So stay tuned for that.

And last, but certainly not least, there's payload routing. This would let you route RPCs to different back ends based on the contents of your protobuf requests. This sort of thing isn't super standard with REST, because in REST the payload is human readable; you'd have to parse the JSON to do this, which is pretty expensive. But with protobuf, checking the value of a single field is actually a really inexpensive operation, so routing this way isn't prohibitive. This sort of thing is done all the time, and it would be a natural extension of what we currently have within GRPC route. So in this aspirational example, you see we're matching traffic for the login method only if it's a particular user logging in. Of course, this will require type information, so reflection will be required for this. There aren't currently any open source implementations of this, so it's a little further out than the other potential features that we've mentioned.

We've been super happy to see how much Gateway API implementations have embraced this resource, and now I'm looking forward to seeing users put GRPC route into action routing production GRPC traffic. As always, we hope that this talk was helpful to you. If you have any feedback that you want to share on this talk, please follow that QR code and leave your feedback. Also, Envoy Gateway has just hit V1 with support for GRPC route. And then there's GRPC Conf this August in Sunnyvale, California. We'd love to see you out there in a couple of months if you want to come discuss GRPC in a little bit more detail. Also, if you want to follow me to the GRPC maintainer talk after this, we'd be happy to have you there. So with that, let's move on to any questions you might have.

Thank you very much. It was a great talk. I have a question: are any of these features actually available, or do they all still have some limitation? For example, I'm particularly interested in this payload routing. Can you tell us in a few words how it works? Sure. So this is a future direction. This is not implemented or in the spec yet. It would be an additional piece that we'd build on top of this, and it would rely on loading the type information to route on the payload. But, yeah, that should theoretically work out of the box.

Got another question back there. You mentioned that service mesh was now a no-brainer solution to the load distribution issue. But it wasn't entirely clear to me why; you briefly mentioned it, but I must have missed it, why it was so much easier to use service mesh now that this has been implemented. If you could elaborate on that. One, in the past, service mesh has been sort of implementation dependent. So, like, if you want to use Traffic Director, you've got to use GCP APIs. If you want to use Istio, you have to use Istio-specific APIs. And that is an additional piece that you have to learn on top of things. The onboarding process for Gateway API service mesh implementations is very turnkey.
And so now it's one or two kubectl apply commands to get that load balancing logic for your client. The other thing that I'd say is that, for a lot of people, service mesh has this connotation of sidecar proxies. That's not at all the case here. GRPC supports proxyless service mesh, and so you're able to not have a sidecar but instead have the client process do this load balancing logic for you, entirely within your client. So that's through the GRPC route type? Correct. And now there's a very simple way to configure that. Yeah. Okay. Cool. Thanks so much.

All right, one more question. Hello. I have a question related to the service mesh that you mentioned, Gamma, at the beginning. I wanted to know, maybe I missed it, but what is the link, or what are the special features, related to the service mesh? I would like to understand the link between the Gamma service mesh that you mentioned and GRPC. What features, or what is the... sure. What do we gain from that? Thank you. Sure. So, in general, I would say service mesh solves the problem of load distribution, and Gamma gives you a way to configure service meshes, here specifically a way to configure GRPC service meshes, via the Gateway APIs. Just to add to that, you have an additional data plane that can do least-request load balancing, so you can actually distribute the load rather than doing least-connection. Right. I think that might be it, unless anybody else has another question.