Hello. My name is Sanjay Pujare. I am in Google Cloud engineering and I am a lead on the gRPC team. In this session, I'm going to talk about what gRPC is and some history behind it. Then I'll discuss microservices and service mesh, and gRPC's role in that evolution. One of the main stages in the evolution of the service mesh is the proxyless service mesh, and I'll cover that. Then there are a few slides about the main tenets of the proxyless service mesh, traffic management and security. I'll mention xDS and how traffic management and security work in the proxyless service mesh with xDS. I'll describe how to use proxyless gRPC with an example in Java. Then I'll talk about one of the main developments in gRPC, observability, and what's happening there. And then we'll end with questions and answers.

So what's gRPC? gRPC was created by Google based on their experience with, and as the next version of, Stubby. Stubby is the internal RPC framework Google has been using successfully to operate Google-scale microservices. One of the numbers I heard mentioned is on the order of 10 to the power of 10, about 10 billion RPCs per second. One of the improvements of gRPC over Stubby is the use of HTTP/2, and the benefits we get are binary framing, multiplexing, streaming, and HPACK compression for headers. I'll also talk about protocol buffers in a later slide. Protocol buffers is also known as protobuf, and that's the term I'll use in the rest of the presentation. Protobuf is used as the serialization framework with gRPC.

So how does one use gRPC? At a high level, you define your interface using the protobuf IDL in a .proto file; .proto is the extension for such files. You use the protoc compiler to generate client and server stubs. By the way, protocol buffers is a separate project, and the protoc compiler uses a language-specific plugin to generate code for a target language. For example, there is a Java plugin for protoc, and that's how protoc generates Java code. Anyway, after the code is generated and the stubs are extended or implemented, you can use the client stub to simply make a call to the server. Strictly speaking, gRPC can be used with another serialization framework; protobuf is not mandatory, but it is the most popular and the de facto framework for gRPC.

So what's protobuf? Protobuf predates gRPC, since Google had been using it with Stubby for quite some time. As you can see, its advantages are that it is strongly typed, it has a binary format for compactness, and it is highly extensible in a backward-compatible way. I already mentioned the code generator plugins for various gRPC languages like Java, C++, and Go; there are code generators for other languages as well. I have shown here a sample snippet of protobuf IDL that shows a message with string, enum, and int fields.

This slide captures in a nutshell why gRPC is so good. Most of these things follow from what I have discussed so far, so there are just a couple of things I would like to add. gRPC is not only used for unary RPCs, where you send a single request and get a single response; it is also used for the streaming paradigm, where you do client streaming, server streaming, or bidirectional streaming RPCs. A stream is an unbounded sequence of messages, and for using streams, gRPC provides an asynchronous framework for processing them. Another thing that I would like to talk about is extensibility and customizability using what we call interceptors. These interceptors are present both on the client and the server side.
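To make that concrete, here is a minimal sketch of a client-side interceptor in Java. It is only an illustration: the class name, the header key, and the println logging are assumptions for the example, not anything gRPC prescribes.

```java
import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ForwardingClientCall.SimpleForwardingClientCall;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;

// A minimal client interceptor that logs the method name and attaches an
// illustrative header before the call reaches the transport.
public class LoggingAuthInterceptor implements ClientInterceptor {
  private static final Metadata.Key<String> AUTH_KEY =
      Metadata.Key.of("x-example-token", Metadata.ASCII_STRING_MARSHALLER); // hypothetical header

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    System.out.println("Calling " + method.getFullMethodName()); // simple logging
    return new SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
      @Override
      public void start(Listener<RespT> responseListener, Metadata headers) {
        headers.put(AUTH_KEY, "example-token"); // decorate the call with a header
        super.start(responseListener, headers);
      }
    };
  }
}
```

You would register such an interceptor on a channel via ManagedChannelBuilder.intercept, or wrap a service with ServerInterceptors on the server side; the observability support described later is built on the same mechanism.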
Using an interceptor, a client or server can intercept RPCs to decorate the calls, or it can perform some cross-cutting functionality such as authentication, authorization, or logging. I'll talk more about using interceptors for logging when we get to observability. Now this wraps up the introduction-to-gRPC part of the presentation. In case you want to know more about gRPC, here are some of the links or resources you can look up; the first one is a link to a very useful presentation covering an introduction to gRPC at a previous QCon conference.

Let's move on to the next section of the presentation. gRPC makes remote procedure calls almost as simple to use as in-process direct calls. This has enabled the microservice architecture, where a monolithic application is broken up into multiple microservices. What used to be in-process communication inside the monolith is now RPCs over the network. In this picture, the yellow layer talks to the red layer, which talks to the green layer, and these RPCs cross network and infrastructure boundaries; the dotted lines in the cloud-like shape show the various boundaries the RPCs have to cross. One advantage of the microservice architecture is the ability to scale: various infrastructure resources such as VMs, clusters, or networks are added as the application's scaling requirements go up, and the RPCs have to be routed, load balanced, and secured as part of that scaling up. So how can all this be automated or managed?

This is where the service mesh comes in. The service mesh has a control plane, like Istio shown here in the yellow box, that manages the data plane, that is, the bunch of microservices that make up an application. The control plane maintains and manages policies for traffic management and security. These policies are enforced or implemented by data plane entities called proxies. This is the proxy mode: in this picture, gRPC traffic flows through these proxies, and the proxies are responsible for routing the traffic and enforcing security based on policies from the control plane. The proxies are transparent, that is, the gRPC application is not aware of their presence and communicates as if the proxies are not there. The proxies are also used for HTTP and other application-level traffic such as Redis or MySQL.

But with gRPC we can do away with the proxies if we implement the same functionality in the gRPC layer. This is the so-called proxyless model. We enhanced gRPC to have the same traffic management and security functionality as the proxy, specifically the Envoy proxy, which is most commonly used in Istio or xDS based service meshes. The control plane, Istio in this case, sends the same policies to gRPC instead of the proxies, and the gRPC client and server enforce these policies within the gRPC library. The services in the mesh talk to each other directly, without the proxies. This is of course true for gRPC traffic, but if you have other kinds of traffic, say HTTP or Redis, then proxies might still be needed in your service mesh.

So let's look at the advantages of the proxyless service mesh. In this slide, you can see the latency gains as the result of the proxyless mode. Here is some background on this experiment setup. It was set up using Fortio, which is a Go-based load testing app. The infrastructure resources used are as follows: a GKE cluster, version 1.20, with three e2-standard-16 nodes; these nodes have 16 CPUs and 64 GB of memory each.
The experiment uses the Fortio client and server apps in the application container, which has 1.5 virtual CPUs and 1000 MB of memory allocated to it. And we also have the sidecar container for the Istio agent and the Envoy proxy, which has 1 virtual CPU and 512 MB of memory allocated to it. The experiment included both the with-mTLS and without-mTLS modes, and with and without the Envoy proxy. Now, a little bit about the sidecar container: if this is proxyless mode, why is there a sidecar container? Note that we ran the experiment to measure the performance for both cases. In the proxyless case, there is no Envoy proxy, but we still have the sidecar container for something called the Istio agent. So even though we don't need the Envoy proxy, we need the Istio agent.

Now let's compare the performance. Compared to the Envoy case, there is a massive improvement with mTLS at 64 connections: the latency improvement is from 6 to 16 times, that is, about 500% to 1500%, depending on whether you are looking at the p50 or p99 latency numbers. Note that this still supports advanced traffic management and mTLS, but without the Envoy proxy. Let's also look at the resource usage. As mentioned earlier, the proxyless setup still requires an agent, the Istio agent. Even so, the agent uses less than 0.1 of a full vCPU and only 25 megabytes of memory, which is less than half of what Envoy requires. Note that these metrics don't include the additional resource usage by gRPC in the application container, but this serves to demonstrate the resource usage impact of the Istio agent when running in this mode.

I think I mentioned xDS before, so let me expand on that a little bit. xDS is a protocol for the control plane to talk to the data plane entities; that's why it's called the data plane API. In xDS, the x stands for some data plane resource, like x in algebra, and DS stands for discovery service. For example, CDS is the discovery service for clusters, RDS for routes, LDS for listeners, and so on. xDS was developed for Envoy, but it's pretty open and extensible for any kind of service mesh, and gRPC adopted it and extended it for the proxyless service mesh.

This slide mainly shows how xDS works for traffic management in the service mesh. LDS, the listener discovery service, is used to discover the configuration root for any gRPC service a client is trying to reach, such as payment-service.mydomain.com. LDS is the root of the configuration and it points to other artifacts, specifically RDS for routes. RDS contains the routing information, things such as how to process the host, path, and other HTTP headers to route a request; this is the place where routing and traffic policies are enforced. RDS further points to CDS. CDS has cluster information. A cluster is a set of backends that are part of the same infrastructure, such as the same geography or the same network, and all the backends in the cluster share the same security configuration, such as certificates and keys. CDS has references to what is called EDS, the endpoint discovery service. EDS has the actual backends that a client connects to. So CDS and EDS together are used for load balancing.
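As a rough sketch of how a client plugs into that LDS-RDS-CDS-EDS chain, here is what the client side can look like in Java, under the assumption that the grpc-xds artifact is on the classpath and a bootstrap file pointing at the xDS control plane is available in the environment; the target name reuses the payment-service.mydomain.com example from the slide.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class XdsClientSketch {
  public static void main(String[] args) {
    // Requires the grpc-xds artifact on the classpath and an xDS bootstrap file
    // (referenced through an environment variable) that points at the control plane.
    // The "xds:" scheme makes gRPC resolve and route this target through
    // LDS -> RDS -> CDS -> EDS instead of plain DNS.
    ManagedChannel channel =
        ManagedChannelBuilder.forTarget("xds:///payment-service.mydomain.com")
            .usePlaintext() // transport security is covered in the security section below
            .build();
    // ... create stubs on this channel and make calls as usual ...
    channel.shutdown();
  }
}
```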
Having covered traffic management, let's look at security. So why is security so important in a service mesh? Remember that a service mesh is the result of breaking up a monolithic application. What used to be in-process communication inside the monolith is now RPCs over the network, and as a result, they need to be secured. These RPCs are routed and load balanced as part of the service mesh orchestration, so we need security that's well integrated with things like routing, load balancing, and service discovery. For example, each endpoint needs to be able to validate its peer's certificate and identity using the information provided by the control plane. Security also includes authorization, where a server authorizes a client before accepting its RPCs. So how is all this done, and who does it? The answer is the service mesh, with security built in.

This is a diagrammatic representation of how it all works. You have the client in the yellow box on the left sending RPCs to the server in the red box over a secure channel; the blue lock on that line indicates a secure channel. Both client and server are xDS enabled and get the security configuration from the xDS control plane shown in blue at the top. The client and server need certificates and keys, and those are provided by the certificate provisioning and deployment infrastructure to make it all happen. The green box represents the various infrastructure components, which include certificate authorities, also known as CAs, to issue certificates; a process to continuously generate CSRs and use them to obtain the certificates; and a mechanism to make the certificates and keys available to gRPC workloads using gRPC's certificate plugin feature. For implementing security, we implemented a certificate plugin feature in gRPC, and we also added the required extension points for the plugins in xDS. When all of these things are in place, the client and server secure their gRPC traffic.

Let's recap service mesh security. The control plane uses a transport socket abstraction within CDS or LDS to configure security in gRPC, but there are external components, like the security infrastructure, to provide certificates and keys. When mTLS is involved, gRPC uses the provided certificates and certain other bits from the transport socket configuration to create the mTLS configuration, or to create the mTLS connection. We get authentication, encryption, and something called server authorization with mTLS; server authorization is somewhat like the hostname check in HTTPS/TLS. Then, as part of security, we also have regular authorization, or client authorization, implemented on the server side, where a user can use authorization policies, also known as RBAC, to authorize RPCs based on various things, including client identities.

So how does one use this stuff, say in Java? The example in this slide is from the security gRFC mentioned here, A29 xds-tls-security.md; a gRFC is an RFC, or a proposal, in the gRPC ecosystem. This example is in Java, but the usage in C++, Python, and Go is similar. There is something called XdsChannelCredentials that you supply to your channel builder, and this credential tells gRPC to use the xDS supplied security configuration. And there is something called XdsServerCredentials on the server side, which instructs gRPC to use the xDS supplied security configuration for the server. You might wonder what the InsecureChannelCredentials.create() used inside is doing: that is a fallback credential, and what that is, I will come to in a bit.
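Here is a minimal sketch of that usage in Java, along the lines of the gRFC A29 example; the target name, the port, and the commented-out service implementation are placeholders for this illustration.

```java
import io.grpc.Grpc;
import io.grpc.InsecureChannelCredentials;
import io.grpc.InsecureServerCredentials;
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.xds.XdsChannelCredentials;
import io.grpc.xds.XdsServerBuilder;
import io.grpc.xds.XdsServerCredentials;

public class XdsSecuritySketch {
  public static void main(String[] args) throws Exception {
    // Client side: opt in to xDS-supplied security config; fall back to
    // plaintext if the control plane sends no security configuration.
    ManagedChannel channel = Grpc.newChannelBuilder(
            "xds:///payment-service.mydomain.com",
            XdsChannelCredentials.create(InsecureChannelCredentials.create()))
        .build();

    // Server side: same idea, with XdsServerCredentials and a fallback.
    Server server = XdsServerBuilder.forPort(
            8080, XdsServerCredentials.create(InsecureServerCredentials.create()))
        // .addService(new PaymentServiceImpl())  // hypothetical service implementation
        .build()
        .start();
    server.awaitTermination();
  }
}
```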
Note that the XdsChannelCredentials is the way for a caller to opt in to the use of the xDS security config. A caller can also use a different credential, for example a plain TLS credential, with the channel, in which case the xDS supplied security config is ignored even if the rest of the configuration from xDS is used. And now back to the fallback credential. The XdsChannelCredentials also takes something called a fallback credential, which kicks in if xDS doesn't supply any security configuration. So instead of choosing to treat that as plain-text, insecure communication, a caller can tell gRPC to fall back to TLS, if they had passed a TLS credential as the fallback in the XdsChannelCredentials.

This wraps up the gRPC and service mesh part of the presentation. For more information, you can look up a previous KubeCon presentation that talks about the proxyless gRPC service mesh in some detail. I have also included another KubeCon presentation from last year that focused on security in the proxyless service mesh. Both of these talk about using the proxyless gRPC service mesh in Google Cloud using Google's service mesh products. I have also included an Istio blog post that talks about the proxyless service mesh in Istio.

Now let's look at another aspect of microservices, or the service mesh, where effective use of this paradigm can be hampered by what we call observability, or the lack of it. Observability, in simple words, is visibility into the internal goings-on in your service mesh, to the extent that you need that visibility for reliability and efficiency. For example, if something breaks in your complex service mesh, or it is performing poorly, how will you figure out where the problem is? Is there any way the infrastructure or the software components can provide that required visibility?

So let's talk about gRPC observability and how it comes into play here. We in the gRPC team are about to release gRPC observability, which consists of logs, metrics, and traces, the three main pillars of observability. I talked about gRPC interceptors before; we use the gRPC interceptor framework to inject our observability interceptors, which generate the required raw data for the three pillars. This is then integrated with exporters and backends so that the raw data is massaged and sent through the exporter pipeline to an appropriate analytics backend. This integration provides the end-to-end gRPC observability.

Some more information about gRPC observability, or o11y as we like to call it. You have the app developer, the character icon on the left-hand side in green. The developer is building gRPC applications using the latest gRPC artifacts that support observability. The running apps are provided with the appropriate observability configuration by the developer or the SRE. The gRPC interceptors pump the relevant raw data, I mean the logs, metrics, and trace information, through the related exporters into the respective backends. Here you see a logging backend, a metrics backend, and a trace backend. These backends then produce the required dashboards that are used by the consumer, in this case the SRE, to get visibility into the internal state of the application, specifically as it relates to gRPC traffic.

The gRPC observability product runs on Google Cloud Platform. On the producer side, it has plugins for logging, metrics, and traces. On the producer side, you also include the required exporters in the application, namely the Stackdriver exporters configured to send data to the Google Cloud Ops backends. The product also has an admin console to enable or otherwise administer the feature. And then we have consumer dashboards that provide some popular canned views, or allow you to configure or customize the views with customizable queries.

Let's see how all this works in Java. In Java, there is an artifact called grpc-gcp-observability that you use in your application. The artifact also pulls in other required dependencies, such as the Stackdriver exporter. In your Java app, you call an init function at the beginning. When the application is running in GCP and an appropriate configuration is provided, it automatically sends the required raw data to the Google Cloud Ops backend, and you have observability. Let's look at the Java code snippet for an actual example. In your main application class, there is a main method. You call GcpObservability.grpcInit to start observability. In this case, we are using try-with-resources, so when the try block exits, close is automatically called. The whole application execution is inside the try block. You may choose to call close explicitly if you don't want to use try-with-resources.
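Here is a minimal sketch of that snippet, assuming the GcpObservability class from the grpc-gcp-observability artifact; the runApplication method is just a placeholder for your actual server or client code.

```java
import io.grpc.gcp.observability.GcpObservability;

public class ObservabilityMain {
  public static void main(String[] args) throws Exception {
    // grpcInit() reads the observability configuration supplied through the
    // environment and registers the logging/metrics/tracing interceptors globally.
    // try-with-resources guarantees close() runs when the try block exits,
    // flushing any buffered observability data.
    try (GcpObservability observability = GcpObservability.grpcInit()) {
      runApplication(); // placeholder for building and running your gRPC server or clients
    }
  }

  private static void runApplication() throws Exception {
    // placeholder for the actual application logic
  }
}
```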
This is what the gRPC observability configuration looks like. There are enable/disable flags for the three pillars. For logging, there are filtering options to limit or filter the kinds of RPCs you want to log, and we have also provided a filter for the kinds of events you want to log. And finally, there is a probabilistic sampler configuration for tracing, where you specify the sampling rate; with 0.5, only 50% of the calls, randomly selected, are used for generating the trace data.

One more thing about observability: there is something called tags, which are labels attached to the logs, metrics, and traces data. We automatically attach location tags, which identify the location or source of where the data was generated, for example the VM name, the Kubernetes cluster name, or the namespace. We also allow the user to provide additional custom tags that can be used to provide additional identification information, such as the app ID or the data center, and so on.

This is a screenshot of the consumer dashboard. As you can see, we automatically provide suggested queries based on the ingested data, and clicking on a suggested query will generate that dashboard, for example, log records of gRPC calls that had errors. This is a screenshot of a log record of one of the gRPC calls that failed with the deadline-exceeded error. The screenshot shows the location and custom tags, called labels here. The location tags have the cluster name, pod name, project ID, and so on. The custom tags contain the app ID and data center as supplied by the user's environment or the configuration.

For gRPC observability, we have already released the logging feature in private preview. The metrics and traces are coming soon, almost as we speak; these will be integrated with Google Cloud Monitoring and Trace. This is a screenshot of the upcoming metrics part of observability. It can be filtered by specific workloads, such as VM instances, or by location or custom tags.

So that wraps up the whole presentation. Before closing, for those of you who are interested in knowing more about anything I mentioned here, or who just want to meet the gRPC maintainers, you can go to grpc.io and submit a form to get started. grpc.io is a good starting point to get more information about gRPC. Thanks, everyone. Let me know if you have any questions.