Hello everybody, and welcome to our journey into the land of microservices. I will give an opinionated talk about different ways of communication between services these days, and I'll give an introduction to protocol buffers and gRPC by covering their main concepts. As I mentioned, I will generally talk about the things needed for the communication process between microservices. That includes the serialization and deserialization of messages, their transport over the wire, and, of course, the wide diversity of your services in terms of the technological stack they're running on, from the variety of languages to the variety of platforms you have to support.

In a real-world use case, you'll have some services that communicate with each other. In this case, service A relies separately on B, C and D. However, things could also look like this. The essential difference is that there are more interdependencies between those services that are not exposed to the user, denoted by node A. At the same time, the user should not even have to care much about those dependencies. In a nutshell, simplifying the view, there is usually client-server communication. The communication can be done over HTTP, with the messages serialized to JSON strings. Alternatively, the communication could go over some proprietary protocol, and instead of serialized objects in the payload you'd have references to the actual remote objects. We'll come back to that one a bit later.

Now, considering JSON for object representation, let's take a brief look at its advantages and disadvantages. First of all, it's human-readable, so it's easy to perceive and debug. It's also schema-less: you have the liberty to form your JSON in any way, and nothing forces you to follow any particular structure. And it's language-agnostic, as serializers and deserializers are available in nearly all programming languages. Speaking of disadvantages: first of all, it's human-readable. Wait, isn't that a benefit? Well, not really. Human-readable means it's not very compact, so it's more expensive in terms of size. Also, it's schema-less, so you ask yourself again: isn't that a benefit as well? Actually not, as you always have to map the contents of your JSON to meaningful objects, not only doing more work but also introducing lots of repetitive boilerplate code, for instance when you have multiple clients consuming the same endpoints. In effect, you implement an artificial schema at the application level, whereas it could, and should, be done at a lower level. Also, there is no type safety whatsoever within the message, which can lead to serious problems when a value is interpreted in different ways by different implementations.

As an alternative to JSON, we could consider protocol buffers. Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think XML, but smaller, faster and simpler. That's from the official documentation; I'll add a small correction there: think JSON, but smaller, faster and simpler. I may refer to protocol buffers later using terms like proto, proto messages or protobufs. The interface definition language looks like this. The entire content of this slide defines a message. It's called Person and it's introduced by the message keyword. It has three fields: the first one is called name and is of type string, the second one is an integer, and the third one is a repeated field.
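As a sketch, that slide's message could look like the following; only the name field was spelled out in the talk, so the id and email field names here are my assumption:

```proto
syntax = "proto3";

message Person {
  string name = 1;            // field 1: a string
  int32 id = 2;               // field 2: an integer
  repeated string email = 3;  // field 3: a repeated field, i.e. a list of strings
}
```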
A repeated field is essentially a list of items of the denoted type. Take a closer look at the numbers on each field: that's the field identifier, and it's used for the binary encoding and decoding. It is very important to keep in mind that these identifiers must be unique and must not be reused, even after a field has been deprecated. To help you with that, to keep track of the deprecated fields, there is the reserved keyword, which denotes fields that are prohibited from being reused. If by any chance you try to reuse one of the reserved fields, the protoc compiler will give you a specific error. In broad strokes, that's the entire theory behind protocol buffers.

Let's see some of their benefits. First of all, they're binary-encoded, which means they're very compact, and the encoding and decoding process is very fast. Messages can also be serialized to JSON, Thrift or other formats if needed. The schema is enforced at the IDL level, and the messages are strongly typed, which is definitely a benefit over volatile untyped messages. Moreover, having a single way to define messages, you save lots of boilerplate code for serialization and deserialization. It is also language-neutral, with official support in 10 different languages, and of course the list could be even longer than that. And it gives us some out-of-the-box backward compatibility features, avoiding this kind of manual compatibility code. In the real world, you'll have various clients and you won't be able to guarantee that all of them are running the latest version of the app. With protocol buffers, it's easy to add new fields, deprecate some existing fields or even rename fields. And it's generally faster: from the network perspective, smaller RPCs consume less space and are transmitted faster, and the memory and CPU usage is lower because less data is read and processed while encoding or decoding a protobuf.

Now, this is an IDL message definition, similar to the example shown earlier. From this IDL, code is auto-generated, and it's rather simple to create new objects like that. Sorry, this is Java. The generated code provides builders, setters, getters and so on. In Python, however, things look pretty much the same, and for even more elegance you can use the keyword-arguments constructor to build your objects.
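For instance, with the generated Python module for the Person message above (person_pb2 is an assumed module name produced by protoc), construction and a binary round trip might look like this:

```python
import person_pb2  # assumed name of the protoc-generated module

# Field by field, mirroring the Java builder style.
person = person_pb2.Person()
person.name = "Jane"
person.id = 1234
person.email.append("jane@example.com")

# Or, more elegantly, via the keyword-arguments constructor.
person = person_pb2.Person(name="Jane", id=1234, email=["jane@example.com"])

# Binary round trip: compact encoding, fast decoding.
data = person.SerializeToString()
same_person = person_pb2.Person.FromString(data)
```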
Now that you have a rudimentary understanding of the data that's generated and consumed by the services, let's take a brief look at the way these messages are exchanged between the services. In a RESTful-ish API, entities usually have distinct URIs: you fire an HTTP request at them, you get back some plain-text-encoded data, and you parse it and do whatever is needed. This is how an HTTP/1.1 request looks: you send a bunch of plain-text headers down the socket, and this is just a small subset of them; you'll usually also solve authentication with additional headers, and perhaps there are some other parameters. The response starts with a bunch of headers again, and again this is just a small part of them. Surprisingly, you can sometimes receive more bytes of headers than of the payload itself. And this is your actual response payload. With HTTP/2, you get some performance improvements out of the box: requests become cheaper, as the average request overhead is reduced with multiplexing, header compression and so on. Also, even though HTTP/2 doesn't force you to use TLS, it is encouraged as the only correct way of doing things in HTTP/2, so you'll gain some extra security features as well.

I won't go any further into details, as it's not in the scope of this talk, but if you're interested in learning more about HTTP/2 and grasping the entire HTTP evolution, I would definitely recommend watching Ana Balica's talk on HTTP history and performance. It is a great talk covering lots of technical aspects.

I mentioned distributed objects earlier, so let's get back to them for a while. In the late 90s and onwards, the concept of distributed objects became more and more popular. In theory it looks very nice: you deal with some objects and you don't care much whether they are local or accessed through a network. The concept of location transparency implies that the remote objects have the same look and feel as the local ones. From my own perspective, the term should rather be location opacity, as your awareness of the universe is really limited. With this in mind, I'd like to quote Martin Fowler: the first law of distributed objects is, do not distribute your objects.

First of all, we should acknowledge the fact that there is a huge difference between calling some procedure locally and going somewhere remote to do so. The most obvious difference is latency: a simple local call could take something like a few nanoseconds to produce its output, while for a simple network call you'd expect it to be tremendously slower, on the order of tens or even hundreds of milliseconds. Another difference is network reliability: a network call may, and eventually will, fail, whereas a local call will always succeed. So hiding these facts from the user behind a transparent object is not the ideal thing we could do. Also, in the early 90s, a list of fallacies of distributed computing was compiled by the engineers at Sun Microsystems. They are the following: the network is reliable; latency is zero; bandwidth is infinite; the network is secure; topology doesn't change; there is one administrator; transport cost is zero; and the network is homogeneous. Even though networks get faster, bandwidth gets wider and so on, it looks like the entire list is still accurate and still relevant, and this is not going to change in the near future. So let's keep in mind Murphy's law, which says that if something can go wrong, it will go wrong eventually. Do not ignore any of the possibilities, and always be prepared to handle them.

Now, gRPC is an open source remote procedure call framework that can run anywhere. It enables client and server applications to communicate transparently and makes it easier to build connected systems. gRPC promises to solve the issues I mentioned before. gRPC is a recursive acronym: it stands for gRPC Remote Procedure Call. It is mainly developed by Google as a rework of their internal framework called Stubby. The first principle of gRPC is to have services and messages instead of references to remote distributed objects. A message is a static container with typed data that respects its schema, and that's pretty much all of it: messages don't have any behavior whatsoever. The service has all of its business logic inside, so you give it an input and expect a static output from it. In this sense it's pretty similar to plain RESTful services, with some extra features like streaming, but let's leave that for now. Another important principle is that the stack should be available on every popular development platform, and it should be easy for someone to build on their platform of choice.
It should be viable as well on devices with limited CPU and memory, and for the full list of the gRPC principles you can follow the linked article.

A service in gRPC looks pretty similar to a RESTful service: it has some endpoints and you can pass messages over them. The one difference is that the entity identifier is not part of the endpoint, as it usually is in other approaches like REST. In the diagram, the service is implemented in C++ and the clients in Ruby and Java, and that's obviously just an example, as any part can be implemented in any of the supported languages.

Now let's design a service that will provide routes between two specified points. First of all, we need a service; ours is called RoutePlanner. A service can have multiple RPC endpoints, and in this case it has just one, called GetRoutes. The GetRoutes endpoint takes a GetRoutesRequest message and returns a GetRoutesResponse message. This is the proto definition of the service, and it's kept in .proto files just like any other protocol buffer messages. The request and response messages are defined just like any other proto messages, and Location and Route are some user-defined messages. To generate Python code from the protocol buffers definition, all you need to do is run the protoc compiler, and that can be done through Python as well: we specify the proto path, the output path for the generated messages, the output path for the generated gRPC-specific code, and finally the path to our proto file. This results in two files: the first one holds the code specific to the proto messages, and the second one the gRPC-specific functionality.

As we already have the code generated from the protos, let's dive into implementing the service, in Python this time. This is basically our entire service: the class implements the RoutePlannerServicer that was generated by protoc, and each method of this class implements a specific RPC endpoint. It gets the request and the context as parameters and returns the response message. The context holds RPC metadata like deadlines, cancellations and so on; I'll get to that a bit later. To actually make use of our implementation, we create a gRPC server entity with a thread pool executor, bind the actual implementation to the server, then specify the socket for our service and start it. And that's basically it.

As we already have the service in place, let's implement the client, in Python again. To access the service, a client must create a channel to the listening socket, then create a stub from the generated code and form the request message, just as you'd form any other protocol buffer object from generated code. You then make the call with that request in a blocking manner. There's also the possibility to make asynchronous calls to the service, and it can be done like this: from an asynchronous call you get some sort of a future monad, in the form of a Python 3 future, from which you can get the result, attach some callback, check if it's done and so on. The futures module is backported to Python 2 as well, if for any reason you still have to use Python 2.

Also, you can play with your service using the gRPC command line tool. To use it, you just issue a grpc_cli call to your socket; after that you write the RPC endpoint name, and then goes your request proto. This is just a heredoc; you could also store the request message in text files and provide those to the CLI tool if needed.
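To make that walkthrough concrete, here is a minimal sketch of how the pieces could fit together. The package name, the exact field layout, the port and the coordinates are my assumptions for illustration; the talk's actual slides may differ.

```proto
syntax = "proto3";

package routes;

message Location {
  double latitude = 1;
  double longitude = 2;
}

message Route {
  repeated Location points = 1;
}

message GetRoutesRequest {
  Location start = 1;
  Location end = 2;
}

message GetRoutesResponse {
  repeated Route routes = 1;
}

service RoutePlanner {
  // A single RPC endpoint: request message in, response message out.
  rpc GetRoutes(GetRoutesRequest) returns (GetRoutesResponse);
}
```

Assuming the file is called routes.proto, generating the two Python modules and wiring up the server could look roughly like this:

```python
# Code generation (run once):
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. routes.proto
# This produces routes_pb2.py (the messages) and routes_pb2_grpc.py (the gRPC plumbing).
from concurrent import futures

import grpc
import routes_pb2
import routes_pb2_grpc


class RoutePlanner(routes_pb2_grpc.RoutePlannerServicer):
    def GetRoutes(self, request, context):
        # The context carries RPC metadata: deadlines, cancellations and so on.
        route = routes_pb2.Route(points=[request.start, request.end])
        return routes_pb2.GetRoutesResponse(routes=[route])


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    routes_pb2_grpc.add_RoutePlannerServicer_to_server(RoutePlanner(), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()
```

And the client side, in both the blocking and the asynchronous flavor:

```python
channel = grpc.insecure_channel("localhost:50051")
stub = routes_pb2_grpc.RoutePlannerStub(channel)
request = routes_pb2.GetRoutesRequest(
    start=routes_pb2.Location(latitude=47.01, longitude=28.85),
    end=routes_pb2.Location(latitude=47.02, longitude=28.83),
)

response = stub.GetRoutes(request)       # blocking call
future = stub.GetRoutes.future(request)  # asynchronous call, yields a future
routes = future.result().routes
```

Finally, poking at the running service from the shell with grpc_cli might look like this (the fully qualified method name depends on the assumed package):

```
grpc_cli call localhost:50051 routes.RoutePlanner.GetRoutes \
  "start: {latitude: 47.01 longitude: 28.85} end: {latitude: 47.02 longitude: 28.83}"
```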
As a result of the grpc_cli call, you'll get the text representation of your response; in this case it's a repeated field called routes with the corresponding data.

Let's get back to our route planner service and imagine a specific use case. What if, depending on the time and some external factors, the service wants to reroute you as soon as some newer routes become available? This can be done with streaming: in the proto definition you just add the stream keyword before your response message and implement it accordingly. Also, let's say our client is accessing our API from a mobile phone while travelling around the city, and their coordinates constantly change. It would be nice, after some threshold, to stream the new location to the service so the routes can be recalculated as well. This, too, can be done with streaming. Now that we have response streaming and request streaming separately, why not have both? Of course we can: we just add the stream keyword before both the request and the response, and voila, you have it. You do have to implement it, and unfortunately I won't go into the implementation details, as it's a bit out of the scope of this talk, but I hope I've managed to give you a feel for those features.

Now that we know the basics of RPC definition, let's get into some more sophisticated features of gRPC, and let's keep in mind that things will go wrong and we should be very well prepared for that. When firing a request, it is not fit to wait indefinitely for a response: there should always be a timeout set. But how do we determine the proper timeout for different calls in a chain? Let's try different approaches. We could put a uniform timeout on all the subsequent calls; let's see it in action. We have set this 500 millisecond timeout on all of the subsequent calls in our chain, and the first three calls go pretty well: there is still some time remaining in each of those timeouts. Node B, for some reason, is quite slow, so by the time it responds the client has hit its timeout and fails accordingly. None of the other nodes are aware of that, so they continue doing the now useless work until their own timeouts are exceeded. And that's obviously not the best approach.

In reality, we have different expectations of different services in terms of their response times, which means we can have custom timeouts, just like that. Again, for the first three calls everything looks good so far, but when node B responds, it violates its timeout by a very small amount of time, so the corresponding node fails. We could have done a better job, as the entire chain would probably have succeeded and taken less than those initial 300 milliseconds. So let's try to adapt the timeout somehow. An adaptive timeout sounds like a better option, adaptive in the sense of cascading the timeout from the first call down to all the subsequent ones. Let's retry the previous example in this new way. The initial timeout is 200 milliseconds and the first call takes 20 milliseconds; that means the next timeout is 180 milliseconds. The next call takes 30 milliseconds, so the subsequent call gets a timeout of 150 milliseconds. Now all the timeouts are derived naturally from the initial one, the response chain goes well, and the client is happy with its result. This looks like a good approach, but it's rather difficult to operate with timeouts. A timeout is a relative value, a delta from a specific point in time, whereas a deadline is an absolute value, say a timestamp in milliseconds. gRPC operates with deadlines; however, the user specifies them in the form of a timeout, and the deadline is computed internally. The deadline is propagated automatically within gRPC to all the subsequent calls, and you can access it at any point from the context that I mentioned before.
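For reference, here is a hedged sketch of both sides of the deadline story in Python's grpcio, reusing the stub, request and servicer names from the earlier sketch; the 200 millisecond budget and the 50 millisecond threshold are illustrative:

```python
import grpc
import routes_pb2_grpc

# Client side: grpcio takes a relative timeout in seconds and converts it
# into an absolute deadline that is propagated to downstream gRPC calls.
try:
    response = stub.GetRoutes(request, timeout=0.2)
except grpc.RpcError as err:
    if err.code() == grpc.StatusCode.DEADLINE_EXCEEDED:
        pass  # fail fast: retry, degrade gracefully or surface the error

# Server side: the remaining budget is available from the context, so you
# can skip work that could never finish in time.
class RoutePlanner(routes_pb2_grpc.RoutePlannerServicer):
    def GetRoutes(self, request, context):
        remaining = context.time_remaining()  # None if the client set no deadline
        if remaining is not None and remaining < 0.05:
            context.abort(grpc.StatusCode.DEADLINE_EXCEEDED, "not enough time left")
        ...
```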
For instance, if you do some heavy lifting before making another RPC call and you want to check whether the remaining time is within some normal threshold, or if you make a plain HTTP call and want to manually propagate the deadline there, it's rather simple, as you can just get the deadline from the context.

Let's try an example with deadlines, and for simplicity let's use a fictitious starting timestamp in milliseconds. The timeout value would be, say, 200, and the deadline would be the sum of those two. Because the deadline is an absolute value, all we need to do is ensure at each step that the current timestamp is smaller than the deadline. So at every step we just compare the current timestamp with the deadline, and if the current timestamp is smaller, we proceed. The entire request-response chain goes well, at the end the last timestamp is still smaller than the deadline, and the client is happy with its result. Now let's see an example where the deadline gets exceeded at some point. We have the exact same setup as in the previous example, but probably a slightly slower network, so by the time the request gets to node A, the deadline is exceeded. A deadline-exceeded error is propagated back to the client, and we failed the entire call at an early point, so node B was never touched. This is obviously a good thing to do.

In the same way the deadline-exceeded error was propagated back to the client, a manual cancellation can be propagated. The cancellation can be initiated by both the client and the server. It immediately terminates the RPC and all the pending subsequent RPC calls, and keep in mind that it's not a rollback: if you made some changes to the database or to some state, you have to remember to do the rollback yourself. And it's automatically cascaded, as I mentioned before.

When migrating, for instance, a JSON RESTful API service to gRPC, a backward compatibility layer can be ensured temporarily with the grpc-gateway, which translates the gRPC endpoints into a RESTful API. Keep in mind it won't work with streams and other nifty features of gRPC. gRPC has tremendous language support, with C++, Python, Java, Go, Ruby, C#, JS, Android Java, Objective-C and PHP, and it also supports most of the widely used platforms, like Linux, Mac, Windows, Android and iOS.

Speaking of success stories, I'd like to mention Google, as it's the original creator of gRPC, and gRPC is the evolution of Stubby, the Google-internal RPC framework that has been widely used there for quite a long time. The external adoption of gRPC is also evolving rather fast: companies like Docker, Square, Netflix, CoreOS, Cisco, Carbon3D, Juniper Networks and others are already using it extensively. For instance, Docker's containerd implements a gRPC API, and their swarm communication between nodes is done over gRPC only. Another interesting example is Juniper Networks: they do mostly software-defined networking, and they implement OpenConfig on top of gRPC. Now, summing up, let's outline the benefits of gRPC and protocol buffers.
First of all, you focus on the design of your API and you establish a strong contract with your clients, and you have the schema defined in one place, at a different level than your business logic. Also, HTTP/2 is awesome, and it's supported by gRPC out of the box. You get bidirectional streaming for free, you have the freedom to pick any suitable language for any specific service, and you have the freedom to change that choice for a specific service at any point. It is service-to-service and service-to-mobile friendly, and, even more importantly, it's production-ready. So just give it a try. Thanks for your patience. I believe I won't take any questions at the moment, but you can catch me at any point later and I'll be happy to discuss anything with you. And I believe lunch is already served. Enjoy.