Hi, nice to see you all here. How was the coffee? Great. Nice. Okay, so I'm a person that always wanted to be an IT guy, a computer science guy. I worked for 15 years in the academic world, doing science and teaching students. In the meantime I did my PhD, but the practical things were always the most important for me. So even when doing science, I was always trying to make it more practical, available for developers, so you can Google something about it too. Now I'm working for FLYR, doing microservices for them. I was always interested in distributed systems, so it's perfectly aligned with what I was doing before. I'm also one of the people that organize a Ruby user group in my country, so please be friendly. Okay, what about FLYR? What do we do at FLYR? We build a revenue management system for airlines. We take a lot of data from them, so we do big data, ETL pipelines and so on. Then we do machine learning on this data and we tell them at what prices they should sell the tickets to earn the most. That's our goal. We have an office in San Francisco and an office in Kraków, Poland, so you can work for FLYR from a European time zone and have a work-life balance. Okay, so these are some of the things we use, not everything. And one more thing that my colleagues from Kraków told me I must bring to you. Some of us were recently at a Python conference in Czechia, and we met a very interesting guy there. So we hired him. The guy in the middle of the photo. Who knows this guy? Yeah, that's right: Krtek; in Polish, Krecik. For the ones that didn't see it, it's a mole, a character from a cartoon of our childhood, very popular in our part of Europe. So we met this guy and took him to the conference. He attended different talks. He even gave his own lightning talk, so if you are wondering whether you should or not: he did it, you can do it. Okay, okay.
So he is actually an exceptional data digger, as a mole. That's why we hired him. But to the point. The history begins like a year ago, a little less than a year ago, when I started to work at FLYR, and by then all microservices in our product were communicating over HTTP. FLYR felt quite comfortable with this solution, but there was also a feeling that for some applications it wouldn't be the best one, and that at some point we would probably have to switch. We already had some places where services communicated over RabbitMQ, but it was not perfectly implemented, and we knew it. So there was a feeling that we should find an asynchronous way of communication. And the thing that actually caused the decision was a new requirement related to our e-commerce use case. The requirement can be described more or less like this. We had a UI interacting with the user, and there was a backend for this UI. At some point, the interaction between the user and the UI backend caused the UI backend to issue a query to the service our team was maintaining. And to fulfill this request, we knew we would have to fan out a number of requests to other services, most of them external, so we knew it would be time-consuming to get the responses. But we wanted to have something to show to the user whenever we had anything useful for him, and whenever we got anything better, we would update whatever we were showing. That was the use case. So we actually wanted to implement something like this: we get the original query, we fan out the sub-queries, and whenever we get the first response to a sub-query, we send a partial response to the original query, to show something to the user.
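This fan-out with partial responses could be sketched roughly like this. Everything here is invented for illustration (the function names and the simulated external calls are not from the talk), and the real system would run the sub-queries concurrently, with responses arriving in arbitrary order:

```python
# Toy sketch of the fan-out / partial-response flow (names invented).
# One incoming query fans out into sub-queries; every sub-response that
# arrives is immediately turned into a partial response for the caller.

partial_responses = []

def send_partial_response(data):
    # In the real system this would push an update back to the UI backend.
    partial_responses.append(data)

def plan_sub_queries(query):
    # Decide which sub-queries the original query fans out into.
    return [f"{query}/price", f"{query}/availability"]

def simulate_external_call(sub_query):
    # Stands in for a slow call to an external service.
    return f"result-of-{sub_query}"

def handle_query(query):
    for sq in plan_sub_queries(query):
        # In production these run concurrently; here we simulate arrival.
        result = simulate_external_call(sq)
        send_partial_response({"query": query, "update": result})

handle_query("flight-123")
```

After the call, `partial_responses` holds one update per sub-query, in arrival order, which is exactly the behavior the UI wanted: show something as soon as anything useful exists, then keep updating.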
And whenever we get another response to a sub-query, we send another partial response to update the information shown to the user, and so on. Okay, so that was the use case we wanted to implement. That's the first thing. The second thing was related to performance. I can't give you the precise numbers, but what I can tell you is that we wanted persistence in our communication, and when I looked at RabbitMQ, at around 5,000 messages per second that it can handle when you turn on persistence, it was definitely not enough. The requirements were quite high. So that was the second important thing. Of course, you could do all this using HTTP, but we already felt that we would need asynchronous communication anyway, so this was a good point to start. And how did we approach this situation? Okay, we decided: yes, we need to do it. But what, and how? We have an HTTP-based infrastructure and it works. We have experience with it. Developers have experience with this way of communication; they have habits around implementing it. And you know, competence is important, but old habits die hard; that's the hardest thing to overcome in some situations. Of course, we knew we lacked experience with this kind of communication, because we had always done it over HTTP. One more requirement: we must do it well, and do it well the first time. Hard to do it well the first time, but maybe. And of course, we knew we would get all these goodies when we switched to asynchronous communication, and even more, for example, more opportunities. Anyone knows what's the first opportunity we get in this situation? Anybody? What do you think? Sorry? Also, sure, you can always refactor the code. But I have some plans for you. Can you catch? Here's a cup. Thanks. Oh, sorry. But it's still perfectly operational. Anybody else tries?
It was probably on the previous slide. Okay. That's not along my line of thinking. Sure. Yes, you do. You can have all these things. But also, you have the perfect opportunity to make new mistakes. Isn't that true? Completely new mistakes. Completely new things can go wrong when you move to asynchronous communication. We can have different concurrency issues, race conditions, because we do asynchronous things, in places where we didn't have them before. There is the problem of choosing a broker. We don't have experience. We can read a lot, we can do research, but there is always a chance we will choose the wrong broker, because we didn't research the right things. If we choose the right broker, there is probably more than one driver we can choose. So which one should we choose? On what basis? How should we decide? We can choose the correct driver but use it incorrectly. If it's just in your one simple service, that's fixable. But if it spreads all over your system, and then you realize, well, you have to do this and that to make this communication stable, and now you have to find all the places where other people just copied and pasted the incorrect code, that's hard to overcome. And finally, we can have the correct driver and the correct broker, but use the broker incorrectly. A lot of different things may also go wrong. So we decided to contain all these horrible things in one place: a library, and we called this library asynccalls. And that's another hard question. I have a cup for you. I won't be throwing it. I won't be throwing it. I'll walk, I'll deliver it, you know, not by plane. Why asynccalls? Okay? Don't answer. Don't answer. Don't answer. There's a cup for you. No, we don't use asyncio below, for some reasons. Okay. The reason I just wanted to give you the cup is for you to try to answer, because the answer is so strange that you didn't have a chance, okay?
There's always this naming thing in computer science. Okay. So we wanted to create a library that meets our functional requirements. We wanted this library, wherever possible, to resemble what developers already saw, what they knew. And of course we wanted asynchronous communication below. So wherever possible, we wanted to join these three requirements. And why a library? You know, for maintainers of a library, or of this switch: Sauron, you know, the guy. One ring to rule them all. So yes: one place to fix all the bugs. One place to change decisions, so it's much easier to change decisions; you don't have to trace them through all the microservices' implementations, just one place. And if we need to apply good patterns, it's not that we teach all these people how to use, say, the Kafka driver well; we just use it well in one place. And we don't have to change what we taught people, because that's harder than just updating the code. But you know, Sauron had to sell those rings to people somehow. So how do we sell it to developers? There must be something in this for them. They can put Kafka on the résumé. Yeah, sure. Well, not really Kafka, because it's hidden. But they do. The complexity will be hidden: if we build a good abstraction above it, they won't have to think about all the difficult things related to it. And a lower entry barrier: it's a very different way of communicating than HTTP, so let's try to do it this way. So the decisions we had to take were easier, because they were no longer final. We were more comfortable with the thought that we would have to change these decisions at some point. The decisions were that we would choose Kafka as the message broker, not surprisingly, for the performance it offers just out of the box. And we chose Confluent Kafka as the driver, also for performance reasons, and because we hoped that something supported by Confluent would be really stable and well maintained.
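For context on the durability-versus-throughput trade-off mentioned above: with the confluent-kafka driver, durability is mostly a matter of producer configuration. The keys below are standard Kafka/librdkafka configuration names, but the concrete values are my illustration, not something stated in the talk:

```python
# Hypothetical producer settings one might pass to confluent_kafka.Producer
# when durability matters. The keys are real Kafka/librdkafka config names;
# the values shown are illustrative, not FLYR's actual configuration.
producer_config = {
    "bootstrap.servers": "kafka:9092",  # broker address (placeholder)
    "acks": "all",                      # wait for all in-sync replicas
    "enable.idempotence": True,         # avoid duplicates on retries
    "compression.type": "lz4",          # cheap way to win back throughput
}

# The config would then be used like:
#   from confluent_kafka import Producer
#   producer = Producer(producer_config)
```

Unlike RabbitMQ's per-message persistence toggle, Kafka always writes to its log, which is part of why it keeps high throughput with durability enabled.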
Well, that's what we use. We wanted to make it just a library, no framework approach, no put-everything-inside. Just a communication library. Make it simple; if we need something more complicated, maybe we will put another library above it. That was the first thing. We wanted to make it testable. First, okay, this Kafka is nice, but how do I curl to this Kafka? Well, you don't. You can't issue an HTTP request to a Kafka queue. So how do I test it? We needed to give people a way to ad hoc send something, just to let them test their service. And we wanted to make it testable automatically, so we wanted to provide some reasonable mocks for unit testing, to make implementing tests easier. And if possible, maybe we could make it resemble Flask, just to let developers get used to the new approach more easily. Okay. But it's like half of my time and I'm talking, talking, talking, and it's a developer conference. That's probably what you think. Okay. So I'm showing you the code. How do we use it? We create an object, giving the service name as a parameter; it's just an identifier of the service, and it should be unique across your system. When you have this object and you want to create a server endpoint, so an endpoint that will respond asynchronously to some requests, you just create a function and decorate it with the server's callback_for decorator. The parameter of the decorator is the name of the endpoint. It resembles an HTTP endpoint, but it doesn't have to; the slash is not necessary there, it's just a convention. This function will get the request object as a parameter. You can do with the request whatever you wish, and you can use this request object to create a response, or more than one response. Each of these responses can be sent back, and they will be delivered to whoever sent the original request. So you can send zero or more responses to a request. Well, to send a request... oh, that's about identification.
That's an ID of the service, an address of the service, and this is the name of the endpoint. So you send the request to the service ID and a specific endpoint; you can have a lot of endpoints in a single service. To send a request, you use the asynccalls client to create a new message that will be sent. There's a destination ID in this message, a target endpoint, and of course a payload; maybe some more things. And you send this request. But wait, it's all asynchronous. The sending is asynchronous, it is not blocking. So how do we get a response to this request? Before we send the request, we should define a callback to handle the response we expect. So we define a function, this time decorated by the asynccalls client, not the asynccalls server. And we define the callback for the service that will be sending responses to us; that's the first name, and then the endpoint that will be queried and will respond to our queries. So we can handle the response this way. So that's just how you can use it, in the most basic approach. Of course, the last thing you should do is start listening, in the client and in the server. Actually, you can have client and server endpoints in a single service. So you can get some requests, send some queries to fulfill them, and when you get the responses, you can respond to the original request. It's all feasible. So you just call the asynchronous listening at the end of your program, and it will make asynccalls receive messages and route them to the correct callbacks. So what do we have? We have a server which is like an HTTP server: event-driven, with the callbacks we know from Flask, for example. We also have a client which is not like an HTTP client, because it's not blocking; it's asynchronous. You send the request and just nothing happens. You should have a callback to handle a response.
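The decorator-based shape described above could look something like the toy stand-in below. To be clear: the real asynccalls library routes these messages over Kafka and is not public; the class, method names, and synchronous in-memory delivery here are all invented just to illustrate the server-callback / client-callback style:

```python
# Toy, in-memory stand-in for the API shape described in the talk.
# Everything here is invented for illustration; the real library
# delivers messages asynchronously over Kafka.

class ToyBus:
    def __init__(self, service_name):
        self.service_name = service_name       # unique service identifier
        self.callbacks = {}                    # (service, endpoint) -> fn

    def callback_for(self, service, endpoint):
        # Decorator registering a handler for messages aimed at
        # the given service and endpoint.
        def register(fn):
            self.callbacks[(service, endpoint)] = fn
            return fn
        return register

    def send(self, service, endpoint, payload):
        # Real sending is non-blocking; here we deliver synchronously.
        fn = self.callbacks.get((service, endpoint))
        if fn:
            fn(payload)

bus = ToyBus("money-broker")
received = []

# "Server" endpoint: may send back zero or more responses.
@bus.callback_for("money-broker", "/quote")
def handle_quote(request):
    bus.send(request["reply_to"], "/quote-response", {"price": 42})

# "Client" callback: handles responses to a request we sent earlier.
@bus.callback_for("ui-backend", "/quote-response")
def handle_response(response):
    received.append(response)

bus.send("money-broker", "/quote", {"reply_to": "ui-backend"})
```

Note how the response handler is registered before the request is sent, which is exactly the inversion the talk describes: with a non-blocking client, the callback must already exist when the response eventually arrives.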
So if your request requires a query to another service, you get the request, you send the query to the other service, and your process is not blocked waiting for the response; it can actually serve another request while waiting for the response to the previous one. A single process can be a server and a client, of course, and we can handle one request and a number of responses. So we can have more than one response to the original request, but we can also have no response to the original request. We can just send notifications this way, and if the receiver expects just to handle notifications, it's not a problem; you don't have to send a response. So, okay, now that I've shown you the basics, I could talk for a long time about different things and the details inside, but I just want to tell you about one thing, which for me is one of the most important: how do you test it? Is there a way to easily test this in unit tests? Yes: asynccalls has a testing mode. You need to enable testing mode, and then you can use it in your unit tests. To enable testing mode, you just set the testing flag to true, then you import your application, which defines all these callbacks, and finally you must have a fixture that resets the testing mode between tests, because we need to clear some buffers. And when you have testing mode enabled, you can start testing, okay? We can have different use cases that we want to test. The most basic is: we have a server endpoint and we want to check if it gives correct responses. So we want to send it a request and verify the responses we got, and whether they are correct. The most basic thing. So we need a way to send a test request to an arbitrary service. We can do it, because if you turn on testing mode, you get what is shown here: a test client. And you can use this test client to create arbitrary messages and send them wherever you want in your unit tests, okay?
So that's how you send a request to a tested service. When you send it, the test client will receive the responses for you. So when you send the request, you can immediately check what responses were received, and then just assert the correctness of these responses. You don't have to think about the Kafka below, about messages or anything; you just send a request according to business rules and verify the response according to business rules. That's all. That's the simplest thing to do. A more complicated case is when there is another service, and we are testing the service in the middle, say a money broker. This service, when it serves our original request, is expected to notify another service, a notification receiver, about some things. So we are testing the money broker, but while testing it, we want to verify that the correct notifications would be sent outside. Of course, we don't want to spin up all the infrastructure; we want to have it mocked in our unit tests. So we need a way to mock this other service, just to verify what it would receive if it were really running. So aside from the test client, asynccalls in testing mode also gives you a test server. In this test server, we can register an endpoint for the service we want to mock; it was called notification receiver, and there was a notify endpoint. This test server will just receive these messages for us and allow us to retrieve them and check if the correct messages were sent. So we register it, then we trigger the endpoint we want to test. We can make some assertions about responses, as in the previous example, and we can use the test server's received requests to obtain the requests that were received by the mocked service.
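A unit test of this style might look like the sketch below. The test-client idea comes from the talk, but since the library itself isn't public, the small stub here is invented so the example runs stand-alone; the point is only the shape of the test, not the real API:

```python
# Sketch of a testing-mode style unit test. StubTestClient is an
# invented stand-in for the test client the library's testing mode
# provides: it delivers a request to the endpoint under test and
# collects every response the endpoint sends back.

class StubTestClient:
    def __init__(self, endpoint_fn):
        self.responses = []
        self._endpoint = endpoint_fn

    def send(self, payload):
        # The real test client routes through the library's in-memory
        # dispatch; here we call the endpoint directly.
        self._endpoint(payload, respond=self.responses.append)

# Endpoint under test: sends one response per requested item,
# so a single request can yield several responses.
def list_prices(request, respond):
    for item in request["items"]:
        respond({"item": item, "price": 10})

test_client = StubTestClient(list_prices)
test_client.send({"items": ["a", "b"]})

# Assertions stay on the business level: no Kafka, no broker.
assert [r["item"] for r in test_client.responses] == ["a", "b"]
```

The test never touches messaging infrastructure; it only sends a business-level request and asserts on business-level responses, which is the property the talk emphasizes.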
And the most complicated case is when the third service is actually expected to respond to some queries, and these responses should be used to generate the response we are testing. So we want to mock the whole service together with its responses, to check that the correct output will be produced. Here we need to mock that service, and to do it we just define a generator function. When we register the endpoint in the test server, we give it the fake-responses generator; it will generate the responses, and we will be able to verify that the payloads on the output are correct. So we have testing tools out of the box. The dispatched calls in testing mode are made on the stack, so they are deterministic; the tests are deterministic. We don't need a message broker, and we don't have to think about how this IPC is actually done below. We just think on the business level in testing. We also have many more features, like before-receive and before-send hooks, and endpoint context managers: if you want to measure the performance of your endpoints, you can hook a context manager around an endpoint. We have error handlers for endpoints, Kubernetes health checks, because we run it on Kubernetes, a curl-like client, and a lot more. Of course, if we hide some complexity, we also hide some opportunities, not only opportunities to make errors. So if you want Kafka streams, for example, sorry, we won't be able to deliver that, because we hid the specifics of Kafka below the abstraction. Okay, so we still can have problems like concurrency issues; we will have them, because we have asynchronous communication, we won't run away from that. But for developers it's all on the level of the business logic they are implementing. They don't have to think about Kafka usage patterns, driver usage patterns and so on, and if there are problems below, they can be solved in one place, and actually were solved in one place, without bothering a lot of developers.
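The fake-responses-generator idea could be sketched like this. Again, the generator-per-mocked-endpoint concept is from the talk, while the stub test server and every name in it are invented so the example is self-contained:

```python
# Sketch of mocking a third service with a fake-responses generator.
# StubTestServer is an invented stand-in for the library's test server:
# it records incoming requests and answers each one with the next
# value the registered generator yields.

def fake_responses():
    # Each yielded value becomes the mocked service's next response.
    yield {"rate": 1.1}
    yield {"rate": 1.2}

class StubTestServer:
    def __init__(self):
        self.received_requests = []
        self._responders = {}

    def register(self, endpoint, responses=None):
        self._responders[endpoint] = iter(responses) if responses else None

    def handle(self, endpoint, request):
        self.received_requests.append((endpoint, request))
        gen = self._responders.get(endpoint)
        return next(gen) if gen else None

server = StubTestServer()
server.register("/rate", responses=fake_responses())

first = server.handle("/rate", {"currency": "EUR"})
second = server.handle("/rate", {"currency": "USD"})
```

Because the responses come from a plain generator rather than a real service, the test is fully deterministic, which matches the point above about testing-mode calls being made on the stack.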
So switching from HTTP to asynccalls is straightforward for the server; for the client it's a little more complicated. We support one-way communication. If we use more complex use cases, it's a matter of doing it well. We have callbacks, so we can always end up in callback hell, but we also know there are patterns to handle that well, and we can build something above it if we need to. We have easily testable services, because we have the tools for it, and we now have a standard project-wide layer for asynchronous communication between the services. Thank you. I have three more cups. I won't be throwing them, but please grab them and don't make me take them back, you know, by plane, to my place. Any questions? We have time for a few questions, one or two maybe. "Yeah, thank you for the talk. An interesting thing is how to design exceptions and errors. How did you approach that?" As I said, we have exception handlers, so you can register an exception handler for specific exceptions for your whole service. An exception handler is a function that will be called when an exception occurs in your endpoint. For the HTTP developers: they don't see much of the Kafka specifics; there's the library layer between them and the specific errors being thrown. Exception handlers are for the exceptions that come out of the callbacks. As for Kafka exceptions, well, they shouldn't see them, because that's the library's job; if they do, we must fix the library. "Yeah, like message length is fixed and stuff like that. If you have too much payload, you need to abstract that." Actually, you know, in message-driven communication the errors are handled in a different place: rather on the level of the receiver than on the level of the sender of the request. That's also the tricky thing when switching from HTTP: you don't get error 500 because the service is down.
Your message is just waiting; when the service comes back up after, like, ten minutes, it will serve your request, and maybe it will also send the responses to you. Unfortunately, that's all the time we have, but maybe you can grab... I'm around, so... One big round of applause. Thank you very much.