Good morning. Good morning. Thank you for being here and taking an entire day out of your busy schedules, if not more for folks who have traveled here. It's a very generous gift, and I hope we'll all get good value out of it. My name is Abhishek Kumar. I've had the privilege of working on the RPC stack here at Google for several years, and I've been on the gRPC team since the inception of the project. Today I'll talk a little bit about where things came from, where we are today, and all the great things the team is working on.

RPC originated as an idea back in the early 1980s, and RPC systems have been built and distributed in open source ever since. At Google, in the very early startup days, the systems were built using open source frameworks. But early on, some of the first engineers at Google, folks like Jeff Dean and Sanjay Ghemawat and a number of other big names, developed an RPC system called Stubby, and they developed protocol buffers. Over time, these became the foundation for all sorts of distributed systems built at Google. Fast forward to the 2010s: as the industry started moving toward cloud, we had the opportunity to take everything we had learned building, maintaining, and improving Stubby internally, and implement it from scratch in the open source for everyone to use. That's what we attempted to do with gRPC. And here we are, with all of you in this room, using and extending gRPC.

I'm going to offer three different descriptions of what gRPC is. For people who are focused on the protocol and what's really happening on the wire, it's a protocol specification for how to communicate between clients and servers, how to serialize data on the wire, and how to do all of it with flow control and the right layer-seven negotiations between the two endpoints. But that's not really what most developers think of when they think of gRPC. They think of the middle definition: gRPC as a framework that provides idiomatic APIs for communication between distributed systems. It provides those APIs in the languages they choose and abstracts away a lot of the low-level networking details, so developers don't have to worry about them and can focus on building the actual business logic they are interested in. But gRPC is also part of the movement to build microservices, to decompose large monoliths and large problems into smaller self-contained services that communicate with each other and work together to provide complex functionality. That's the third way of thinking about gRPC.

I mentioned that Google itself standardized in the early days on an RPC layer with Stubby, the internal version. Today we are seeing a lot of organizations in the industry standardize on gRPC. When we talk to organizations that are exploring whether they should embrace gRPC, questions come up about the value for an organization, and the value is some of these points. Because gRPC is polyglot, it allows you to pick the language that is right for the problem at hand. It allows you to work in C++ if you have to deal with a graphics library that's only available in C++. If you have a machine learning application in Python, you can use gRPC in Python. If you have some business logic that's already implemented in Java, you can use Java. And you can stitch all of it together.
You can have these different services talking to each other and still provide a higher-level, more valuable service to your users. Having a standard RPC layer in an organization also helps with developer productivity. Developers become familiar with it, they get used to a nice higher level of abstraction, and they are able to take those skills across projects and teams within the organization and apply that experience to everything the organization works on. And finally, I know there are some folks in the room here who work on platform teams. As an organization grows, there ends up being a need to ship infrastructure improvements; maybe there's a new observability solution that needs to be rolled out across the organization. By having a standard solution like gRPC, it's easy for a platform team to roll out that kind of solution without having to go work on every single application and every single service that the organization runs. So that's the third big value that organizations get out of a standard RPC layer.

There are also a number of ecosystem benefits. The stack is very actively maintained. You can see the huge amount of traffic we get in terms of issues and comments on GitHub. We get vulnerability disclosures from the community, and we have a good, engaged team of maintainers across the industry working hard to keep up with these and to keep the foundations of the gRPC implementations solid.

Performance has always been a focus. At Google, for example, gRPC is used as a core building block for all Google Cloud services, and it's used at many other organizations as a building block for their distributed systems. Because of all that use, the gRPC stack goes through a lot of performance tuning, and it gets run through the specific scenarios that people have. Maybe someone has a huge data-heavy workload where they are focused purely on throughput; maybe someone else has a workload that is very latency sensitive. All of those optimizations, to the extent that we see them, we are able to bring back upstream, and the entire community benefits from them. Of course, the industry will keep moving to faster network speeds and more data-heavy workloads, and we are committed to keeping pace with that and continuing to deliver a very high performance stack.

Talking about how gRPC is cutting edge: researchers working on things like application networking, service mesh, or distributed systems are increasingly using gRPC as the de facto reference platform. Because it's open source, they pick it up, extend it, play with it, experiment with it, and try out all sorts of new things. Some of the good ideas from all of this work will, I'm sure, percolate back into the gRPC implementations, and over time everyone using gRPC will benefit from the new research happening on top of this stack. We also have some vendors at the conference here; a number of people are building projects that are themselves open source, or building tools that support use and development on top of gRPC, and we love that. It strengthens the ecosystem, it strengthens the community, and it gives more choice to everyone. And the last point I'll make is talent, right?
This is us, this is everyone in this room, and thousands more people out there who put gRPC as a core skill on their resumes and who are available to lend their expertise to whatever organization is embracing gRPC, so it can move faster and build at a higher level of abstraction.

I'll switch gears and talk a little bit about the features that come with gRPC. Well, I won't be able to cover all of them, but I'd like to touch on some of the classes of features. I'll start with protocol buffers. Protocol buffers provides the interface definition language; it also provides the basis for data modeling in applications, and it provides the wire protocol for encoding and transmitting actual application data. Now, if people choose, they can actually use gRPC with alternate solutions instead of protocol buffers, although protocol buffers is probably the most heavily used option.

Security has always been a priority for gRPC. Since the beginning, gRPC has supported pluggable transport and application security solutions for things like TLS and mTLS, various OAuth flavors, and plugins for authentication and authorization. Through the course of the day today, we'll hear from speakers talking about security. We've also been fuzz testing the gRPC implementations, and again, we are thankful for any vulnerability disclosures we get; we have tried to keep a very good response time for any disclosures we receive.

Talking of networking, the RPC team here at Google has traditionally been part of the networking software organization, and we take good pride in all of the networking features we have managed to build into gRPC. It comes with good solutions out of the box for discovery and load balancing, and for some of the low-level networking details that not everyone wants to deal with unless they are forced to, such as tuning the parameters for low-level TCP interactions or exposing APIs to tune QoS at the network layer. There are also higher-level networking features like retries and traffic management, which we'll hear more about in a moment when we talk about service meshes.

To touch briefly on flow control: TCP has a very ingenious solution for flow control, but because we put multiple RPCs on top of the same TCP connection, we had to build a flow-control layer on top of TCP. This token-based flow control mechanism was actually developed simultaneously by the edge proxy team and the RPC team at Google almost a decade ago; it was published as SPDY and then eventually made its way into the HTTP/2 standard. gRPC makes heavy use of that flow control while at the same time embedding it into the API in a way that's very unobtrusive. So as a gRPC developer, you are probably not even aware of all the flow control happening under the hood, and unless you're running a really data-heavy application and you care about fine-tuning the memory utilization of your application to the last degree, you can continue to be blissfully unaware of gRPC's flow control while still getting good benefit out of it.

And finally, gRPC has great support for observability. It has pluggable support there, and we'll certainly spend a little more time talking about that. I mentioned service meshes: a lot of the kinds of things that are included as part of the service mesh feature set, we have traditionally built into the RPC layer, and I'd like to invite Arun to talk more about service mesh.

All right, thanks, Abhishek. My name is Arun. I'm a software engineer at Google.
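To make the protocol-buffers and flow-control points above concrete, here is a minimal Go sketch of a server-streaming service. The LogService definition, its message fields, and the generated package path pb are illustrative assumptions, not anything shown in the talk; the .proto in the comment would be compiled with protoc and the Go gRPC plugin to produce that package.

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"

	// Hypothetical package generated from this .proto:
	//
	//   syntax = "proto3";
	//   package logs;
	//   service LogService {
	//     // Server-streaming RPC: one request, many responses.
	//     rpc Tail(TailRequest) returns (stream LogLine);
	//   }
	//   message TailRequest { string source = 1; }
	//   message LogLine     { string text   = 1; }
	pb "example.com/logs/pb"
)

type logServer struct {
	pb.UnimplementedLogServiceServer
}

// Tail streams log lines back to the client. Each Send writes one
// protobuf-encoded message onto the underlying HTTP/2 stream; if the
// receiver falls behind, Send blocks until the flow-control window
// opens up again, so the handler never manages backpressure itself.
func (s *logServer) Tail(req *pb.TailRequest, stream pb.LogService_TailServer) error {
	for i := 0; i < 1000; i++ {
		if err := stream.Send(&pb.LogLine{Text: "line from " + req.GetSource()}); err != nil {
			return err // the client went away or the RPC was cancelled
		}
	}
	return nil
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	srv := grpc.NewServer()
	pb.RegisterLogServiceServer(srv, &logServer{})
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve error: %v", err)
	}
}
```

The same .proto drives the generated stubs in every supported language, which is what makes the polyglot story work, and the flow control described above stays entirely below the Send and Recv calls.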
I'm the tech lead for the service networking infrastructure area and the service mesh product within Google. Abhishek talked quite a bit about gRPC features. I'm going to switch gears a bit and talk about service mesh, and within that, my thoughts on how gRPC is evolving to be a data plane that's well integrated into the rest of the service mesh ecosystem.

Most of you probably already know what a service mesh is, but I'm going to give my own simple definition of it to kick things off. As Abhishek mentioned, application owners often split their application into a set of composable services and workloads. As they develop these individual services, there is a natural need for the services to communicate with each other, and as they communicate, there is a need for them to discover each other, load balance, and meet many other service networking requirements that come into the picture as part of that service-to-service communication. When services talk to each other, they also need to talk to each other securely, which means they need to authenticate and authorize each other. The communication also needs to be observable, meaning it needs to lend itself to a supportable and debuggable system through that observability. The definition I always give is that a service mesh is the solution and product that represents the enablement of these capabilities, and it's architecturally enabled by a combination of data plane and control plane capabilities. I'll look at some of the options here in the next slide.

Let's look at some of the ways that service owners typically deploy a service mesh data plane. One of the popular ways, and you probably all know this pretty well already, is to place a sidecar proxy right next to the application. This proxy sits right next to the application and offers service networking, observability, and security features so that the application binary need not be burdened with implementing these capabilities. This obviously makes the resource management and operational management of the proxy visible to the service owners. The gRPC data plane, on the other hand, has the gRPC core integrated into the application binary and implements these capabilities directly there, including the control plane interaction, all within the application binary. That's a clear distinguishing factor for gRPC in a service mesh deployment, because you do not need to run another data plane component next to the application to enable these capabilities.

There are also some interesting evolutions of architecture happening in open source, primarily in the Istio community. The Istio open source community is looking at ways to remove the operational cost of the sidecar, and again, you may have heard this term: there's an architecture in Istio called ambient mesh that attempts to move the proxy out of the operational view of the service owner, hiding it in the infrastructure, to minimize infrastructure complexity and cost. The local data plane capabilities are stripped down to layer four plus authentication and authorization, with the layer-seven capabilities enforced through a remote proxy.
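As a small illustration of the proxyless data plane Arun describes, a gRPC client can be pointed at an xDS control plane instead of a fixed backend address. This is a sketch under stated assumptions: the service name hello.example.com is hypothetical, and an xDS bootstrap file is assumed to be supplied by the deployment environment via the standard GRPC_XDS_BOOTSTRAP environment variable.

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	_ "google.golang.org/grpc/xds" // registers the xds:/// resolver and balancers
)

func main() {
	// With the xds resolver registered, this channel asks the control plane
	// (located via the GRPC_XDS_BOOTSTRAP bootstrap file) only for the
	// listeners, routes, clusters, and endpoints relevant to this one
	// target, not for the whole mesh.
	conn, err := grpc.Dial("xds:///hello.example.com:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to create xDS-managed channel: %v", err)
	}
	defer conn.Close()

	// Generated stubs are then used exactly as with a plain DNS target;
	// load balancing, traffic splitting, and fault injection policies
	// arrive from the control plane and are applied inside the library.
	_ = conn
}
```

No sidecar process is involved; the control-plane interaction and the resulting policies all live inside the application binary, which is the distinction Arun draws above.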
And if you really think about it, at least from the perspective of operational cost and direction, these architecture initiatives are attempting to get to where gRPC already is, by being a core that's natively integrated with the application. So that's something I wanted to draw your attention to.

Let's look at some of the capabilities that gRPC provides integrated with a service mesh. The list is too long, so I'm not going to go over all of them, but you'll see all the table stakes we just talked about: the ability to offer a rich set of service networking capabilities, discovering services, load balancing to services, and advanced traffic management. For example, if you are a service owner and you want to test how resilient your service is to its dependent services failing, you can enable fault injection. If you are a service owner who wants to safely roll out new binaries, you can use traffic splitting. Each of these advanced traffic management tools maps to an operational, SRE, or service owner requirement. You can also enable security: authentication is supported through mutual TLS, and you can layer authorization on top of it as a combination of L7 RPC properties and identity. And of course, there's a rich set of observability tools.

But there are two things I want to draw special attention to, because they are distinguishing factors for gRPC. First, given that it's an RPC framework, it allows easier instrumentation and collection of server-side costs that you can use in your load balancing decisions, be it CPU or, in an ML workload, custom TPU cost, whatever that may be. The collection of server costs and the ability to use them in load balancing is pretty critical for very large-scale service deployments. The other distinguishing factor is that since it's part of the application, it natively knows the services of interest and asks the control plane only for those services, and what that does is decouple gRPC fundamentally from the scale of the mesh. You can have thousands and thousands of services in your mesh, and the overall deployment scale of the service mesh will not impact the scale that you need to take on in your data plane. That's a characteristic that's harder to achieve with a sidecar, because it's decoupled from the application, and unless you do some inference-based implementation, it's pretty hard for a sidecar to load only the services of interest the way gRPC is natively able to do. This allows us, even within Google production, to run it with some 100,000-plus services, and the number of services can keep growing, but the footprint you see on the gRPC side is limited to the services of interest to your application instance, as opposed to the number of services in your mesh.

I want to close by talking a little bit about how gRPC is integrating with the rest of the open source service mesh ecosystem. At the core of it are two APIs. The first is the reasonably well-known data plane API, xDS. It was initially implemented by Envoy and later generalized, with a lot of contribution from the gRPC community, so that the xDS data plane API could support gRPC as an RPC framework, and that has allowed gRPC to integrate with any compatible xDS control plane. Two good examples: I'm from Google, so I'm going to take the Google example first.
Traffic Director is a service mesh product solution based on this that is GA, and there is experimental support for gRPC with Istio, one of the open source xDS control planes. So that enables gRPC, at the data plane level, to be compatible with any compliant xDS control plane that's out there. The second, relatively newer initiative: Kubernetes has been driving a user-facing API ecosystem through the Gateway API. The Gateway API initially targeted load balancers, but there is now an initiative to extend that open, user-facing API surface to work for mesh solutions. The initiative is called Gateway for Mesh; not very innovative, but it enables gRPC to be a well-supported data plane in the broader Kubernetes ecosystem. So now we have a data plane that can pair up with a compliant control plane, and a data plane that can be driven through an open Kubernetes API, and for all the functions that you expose as part of your Kubernetes application platform, gRPC can be one of the data planes that enables those capabilities, transparently, at that user-facing API level. Those are a couple of things I wanted to call out in terms of ecosystem integration. So with that, this is the link for you to try the gRPC proxyless service mesh solution. I hope you can give it a try. I thank you for the opportunity, and I'll pass it back to Abhishek. Thanks.

Thank you, Arun. Whether you think of your deployment as a service mesh or just as a plain client-server deployment in whatever environment you have, you probably do need to think a little bit about observability. The RPC library is a logical integration point for observability, because if you instrument the RPC layer, you can directly get information about what's happening on the network and in the application, right at the point where those things are actually happening. We have set ourselves a goal to build out support for observability so that it provides a good batteries-included experience, without requiring users to become experts in choosing metrics or in understanding what distributed tracing is and how it works. We want it to be a seamless experience, we want the defaults to be right, and the intent is to build out observability solutions for metrics, logs, and tracing that people can use with very little work as developers or service operators.

When we think of observability, we look at the different stages. It starts with the producer stack, which is the instrumentation that goes into the gRPC libraries themselves. Next comes the exporter, the layer that takes the observability data collected from the stack and exports it, typically via a pipeline, to some solution that stores and analyzes this observability information and eventually presents it in a consumer UI to the operators of the service, be they app developers themselves, a DevOps team, or an SRE team that's actually operating the service. We have built a solution in this space, but there's a lot more to come. The team is working hard on figuring out our roadmap for OpenTelemetry support.
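To make the producer-stack idea concrete: gRPC exposes a pluggable stats handler, and the OpenCensus plugin mentioned next is one implementation of it. A minimal server-side sketch, with the exporter that would ship the metrics to a backend, and the actual service registration, omitted as assumptions:

```go
package main

import (
	"log"
	"net"

	"go.opencensus.io/plugin/ocgrpc"
	"go.opencensus.io/stats/view"
	"google.golang.org/grpc"
)

func main() {
	// Register the default server-side RPC views (request counts, bytes,
	// latency distributions). An exporter such as Prometheus or Cloud
	// Monitoring would then ship these out of the process; that wiring is
	// omitted here.
	if err := view.Register(ocgrpc.DefaultServerViews...); err != nil {
		log.Fatalf("failed to register gRPC views: %v", err)
	}

	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	// The stats handler is the "producer stack": it observes every RPC
	// from inside the library, with no changes to handler code.
	srv := grpc.NewServer(grpc.StatsHandler(&ocgrpc.ServerHandler{}))

	// pb.RegisterYourServiceServer(srv, &yourServer{}) would go here.
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve error: %v", err)
	}
}
```

The same pattern applies on the client side with ocgrpc.ClientHandler, and an OpenTelemetry stats handler can be swapped in as that support matures.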
The observability solutions we have in gRPC today work with OpenCensus, which is the predecessor to OpenTelemetry, but the industry is moving pretty fast toward OpenTelemetry, and we want to keep pace with that. We are also getting requests to support Prometheus and other collectors, and we want to explore how we can do that. We have an opportunity here to provide deeper instrumentation and insights into what's going on at the lowest levels of networking, because of recent advancements in the kernel, TCP timestamps, and so on, so we would like to do a better job of exporting low-level networking information. And finally, gRPC has a pretty rich channel layer; that's where all of the service mesh functionality is implemented. We have something called channelz, but it's not very well integrated with the observability stack today, and we would really like to do a better job of exporting all kinds of channel-level data into the observability stack.

I'll end with an appeal to all of you to give us more feedback. Tell us more about how you're using protocol buffers and what the pain points are. Tell us what you are doing for CI/CD, what the challenges are, and what we can do as a team to help mitigate some of those challenges. We definitely want to hear about your microservice deployments, the operational headaches in managing them, and any problems or ideas that we could incorporate into gRPC to make it better for everyone. I'll wrap up with that. Thank you very much. We do have a YouTube channel; please go and subscribe. And Kevin, if we have time, we can take a couple of questions.

Is it possible to go back to the gRPC observability slide? This is the same architecture we have right now: we have many services, and many clients are calling our gRPC services. I'm sorry, my name is Lakshmi, I'm from Equifax. We have many services up and running, a lot of clients access our services, and using our interceptor we log a lot of requests and responses. It's a large volume of transactions. Now we want to get that into an analytics environment. What we are struggling with is that we are using our home-grown way of logging and monitoring these things, using gRPC streams. But when we use Cloud Logging, we're missing some of the messages, some of the transactions. In the early stages, a couple of years ago, we built our own gRPC streams from every service; we send this information to another gRPC service, which writes it into GCS buckets. That's the approach we took, and that's what we have right now. What we want is for the same service to log directly using Cloud Logging or Cloud Monitoring. How do we achieve it? We need confidence in it. So, have things improved a lot on the Cloud Logging and metrics side, or are there still problems with missing messages? Thank you.

Thank you for the question. Certainly, things have moved quite a bit over the past couple of years in Cloud Logging. We'll follow up with you; we want to understand what challenges you ran into in your prior attempt. As we move toward OpenTelemetry, the logging parts of OpenTelemetry are actually the ones lagging the furthest behind, but as the standardization there takes place, we will eventually...

By the way, we are using OpenTelemetry. We use OpenTelemetry. We're also logging all the events, which go back to BQ.
The way we are getting data into BQ right now is roundabout; what we want is for all these services to write directly to BQ, which should be much faster as well. We do not want to spend a lot of time on logging.

That makes sense. I think we'll need a deeper conversation to understand what challenges you ran into. Trust me, you're not unique in running into these kinds of challenges, but oftentimes we need to drill deeper into the situation to truly understand it.

Can I ask one more question? Now we have all these gRPC services up and running. How is this integrated with Cloud Dataflow? For example, these services should work in batch mode. We have a huge volume of transactions which we want to push through the gRPC services. With one of the approaches we took, we package the gRPC service in a batch container image and use it inside the Dataflow container. It's a complex setup, and the scalability or throughput will be a little lower. How do we achieve this externally, not just with Kubernetes Engine alone? I don't know whether the gRPC team is thinking about how to use it in batch operations.

That's a great question. To be honest, we don't have great answers to it. gRPC, I know, is used heavily in batch workloads as well. Before you finished your question, I was going to say, well, Kubernetes has autoscaling support and perhaps that's worth exploring.

There's a cost involved in it, right? The cost between GKE and the Dataflow engine is different. I see. So that's why we want everything to be done inside the Dataflow containers. Got you. Yeah, we should definitely follow up. Absolutely. Thank you.

All right. If there are no other questions, we'll wrap this up. There is a question over there.

Hey, my name's [inaudible]. Good presentations, thank you. Kevin had the star graph with the core SDK languages on it. It doesn't include the non-Google SDKs like Rust and C#, and I think it would be good to include some of the others. There's a follow-up question: how do those other language SDKs that are out in the wilderness become more beloved as part of the core? C# I don't use, but I know it's a significant language and Microsoft's doing a lot with gRPC; Rust I do work with, and I see lots of growth; but they're not being included in these discussions, and I think both, and probably more, should be.

Yeah, I can take that one. So there are gRPC implementations that are not necessarily supported by Google. You mentioned C# from Microsoft, and Swift is another notable one from Apple. We certainly talk very frequently with our colleagues from Microsoft, and in fact we coordinated the handoff of usage from the core-based C# implementation that we started out with to the new native C# implementation. I'm not sure if we missed that one; we did have the Swift numbers there. In general, the standard is that we have a set of interop tests, and as long as an implementation passes that suite of interop tests, we are able to admit it into the gRPC ecosystem. If it wanted to go into one of the core gRPC repos, we would probably want to go over the governance of that code base and so on, but certainly, as a CNCF project, we welcome contributions and implementations from everywhere. I'm not sure if I got all of it. Thanks.

Thank you for your presentation. I'm from the OpenTelemetry project, a maintainer as well as on the GC; I'm Alolita Sharma. I was curious about a couple of things.
Very excited about all the work that Google is doing with OpenTelemetry, contributing a lot of support for metrics. I'm super interested in a couple of things. One, performance benchmarks for being able to actually ingest gRPC-layer data through OpenTelemetry: are there any tools that you recommend, which we could also recommend back from the project itself? And the second part, again related to the previous question, is API support, and making clear which languages are supported as first class versus what is coming in the future, because a lot of folks are using OTel. Is that published someplace, available anywhere, so we could communicate it to them?

On that second question, we should certainly have that information available; perhaps people are having to read between the lines a little bit. So we'll take another look at our documentation and make sure we do a good job of identifying where we have official support and full governance as part of the CNCF gRPC project, where we have implementations that are in other repositories but part of the ecosystem, and then, occasionally, we run into gRPC implementations that are not really part of the ecosystem either. We should do a better job of cataloging and surfacing that information.

Again, can we help or collaborate together? Absolutely, yes. We also have users of OTel, a lot of large users, who are excited. We welcome any help we can get; documentation is an area where our documentation repo is, of course, available in the same way as our source code.

Do you also follow some kind of progressive path for graduation, that is, which library is at what state of support, language support for different languages, say at release candidate versus GA? I think what we have done until now is have just one bar, around the interoperability tests, but you are raising a good point: because there is so much feature richness, we could probably go beyond interoperability and actually measure things around feature support and grade implementations based on how much of it they support. I think that would be very helpful. Thank you.