Well hello and welcome to another DevNation Live. I'm here today in New York City. I'm always traveling around, but we're gonna be hosting the show today and bringing you another great session and great content. We're gonna be talking about tracing, a key technology for your distributed microservices. We know everyone here wants to do microservices, and therefore you're gonna need some tracing to figure out where all those service calls are going. We have two experts on the line with us today. They're gonna be walking us through the presentation. Make sure that you use the Q&A tab to ask questions. You can also see me in chat, but use the Q&A tab to ask questions, and we'll get to those questions more towards the end unless we can answer them in line. But right now we're ready to get started. So we have Juraci and Pavel. So can you guys jump on the line here and say hi? Hey, hello. Hi. Excellent, so I turn it over to you. All right, so while Pavel shares his slides. So my name is Juraci Paixão Kröhling. I'm a software engineer at Red Hat working on the Kiali project, more specifically on the distributed tracing team. Pavel is a colleague of mine, also working on the distributed tracing team for the Kiali project. And before I hand over to him for the demo, I would like to set up some context and explain the problem that we are trying to solve with Jaeger and OpenTracing. Now, in the world of microservices, tracing is very important, right? So we may have tens, perhaps hundreds or thousands of microservices. It's also very common in microservice architectures to have canary releases, A/B testing, and blue-green deployments, right? So it means that we have different versions of the same service running at the same time. Now, it also means that it's very hard for us to figure out which versions of which services are affected by a given request. Now, this is not a new problem.
We've been having this problem since we started doing distributed systems, but it is now exacerbated by microservices. Now, we have a name for this problem, and this name is observability, right? So we have an observability problem. In simple terms, a system is observable when we can collect metrics from it, when we are able to see logs in a centralized place, and when we are able to analyze tracing information, right? Some people also include alerting in the mix, but we're not getting to that today. If you're interested in the topic of observability, I do encourage you to take a look at the Kiali project. The Kiali project aims to bring observability to service meshes like Istio. Now, back to tracing. Tracing is one of the pillars of observability, and we achieve tracing by doing code instrumentation. We do code instrumentation by marking our code. So let's say that we want to trace a specific algorithm. That means that we add an annotation before that algorithm saying "I start measuring here" and then one annotation afterwards saying "I stop measuring there", right? So what happens in between those two markers is stored in a data structure called a span. A span is a wrapper around a unit of work. We store information like when that unit of work started, how long it took, and perhaps even some other metadata, like the IP address that I'm running on, the host name, and so on and so forth. A span can reference other spans. So one span may have a reference to a sibling span, but what we usually have is a span being a child of another span. So we have a parent-child relationship. Now, I mentioned code instrumentation, and we have explicit and implicit instrumentation. Explicit instrumentation is very similar to what I mentioned before: we have tracing code very close to the business code. And then we have implicit instrumentation.
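The span structure Juraci describes can be sketched in a few lines of plain Java. This is an illustrative model, not the actual OpenTracing types: a unit of work with a start time, a duration, some metadata tags, and a reference to its parent.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Minimal model of a span: a wrapper around a unit of work.
// All names here are illustrative, not the real OpenTracing API.
class SimpleSpan {
    final String operationName;
    final String spanId;
    final String parentSpanId; // parent-child reference; null for the root span
    final Map<String, String> tags = new HashMap<>();
    Instant start;
    long durationMicros = -1;

    SimpleSpan(String operationName, String spanId, String parentSpanId) {
        this.operationName = operationName;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    // "I start measuring here"
    void startSpan() { start = Instant.now(); }

    // "I stop measuring there"
    void finish() {
        durationMicros = Duration.between(start, Instant.now()).toNanos() / 1000;
    }

    // Extra metadata: IP address, host name, and so on
    void setTag(String key, String value) { tags.put(key, value); }
}
```

A child span simply carries its parent's ID, which is how the parent-child relationship is reconstructed by the tracing backend.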
Implicit instrumentation, in Java, could be done by performing bytecode manipulation at runtime, by making use of a Java agent. And then in between those two, we have network instrumentation or framework instrumentation, for instance. For framework instrumentation, one example would be when we are building an application using Spring Boot and we add one dependency to our project and we suddenly have rich instrumentation for components that are commonly used on that stack. Framework instrumentation is also available for individual frameworks like Servlet, EJB, and CDI, but also for other stacks like MicroProfile. It is also available in other languages and for frameworks in those languages. Now, tracing itself, while very useful, is not sufficient in a microservices world. In a microservices world, what we need is individual traces for individual services, but tied together. And we achieve that by making use of distributed tracing techniques. And in that technique, we add the idea of context propagation. Context propagation, in very simple terms, can be done just by passing a correlation ID. So we generate an ID at the very first point of contact in our system and we pass this ID along downstream. Now, of course, solutions nowadays do far more than that, but that's a simple way of understanding what's going on there. The whole idea of distributed tracing is to tell the story of a request across all the services. Distributed tracing is also not really new. There is a paper from some eight years ago by Google describing Dapper, which is their distributed tracing system. And since then, there have been solutions and projects built either based on Dapper or inspired by it. And this works quite well. So if you want to use distributed tracing today, you can choose a vendor and instrument your application, and suddenly you have all those distributed tracing capabilities.
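The simplest form of context propagation Juraci mentions, a correlation ID passed along downstream, can be sketched like this. The header name `X-Correlation-Id` is illustrative; real tracing systems propagate richer context, but the mechanism is the same.

```java
import java.util.Map;
import java.util.UUID;

// Sketch of the simplest form of context propagation: a correlation ID
// generated at the very first point of contact and passed downstream
// via an HTTP header. The header name is illustrative.
class CorrelationIdPropagation {
    static final String HEADER = "X-Correlation-Id";

    // Called at the edge: reuse the ID if one is already present,
    // otherwise generate a new one for this request.
    static String extractOrCreate(Map<String, String> inboundHeaders) {
        return inboundHeaders.computeIfAbsent(HEADER, k -> UUID.randomUUID().toString());
    }

    // Called before every downstream call: copy the ID forward,
    // so every service in the chain logs the same ID.
    static void inject(String correlationId, Map<String, String> outboundHeaders) {
        outboundHeaders.put(HEADER, correlationId);
    }
}
```

With this in place, every log line and measurement across all services can be tied back to the one request that caused them.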
Now, if you decide to change vendors, that usually means that you have to change your code. You have to remove instrumentation from the old provider and include instrumentation from the new provider. Now, to solve that... Okay, so hello. Juraci, thank you for a great introduction. And I will continue with some examples, starting very simple and showing you how you can start with instrumenting your microservices, then talk a little bit about tracing in Istio and what the challenges are there, and then show a little bit more advanced topic: how you can leverage distributed context propagation for routing rules in Istio. So first of all, I would like to tell you more about the instrumentation side. On this slide, there is your application. In the application process, there is more stuff running: there is the application runtime with your business logic, then instrumentation, the OpenTracing API, and the Jaeger client. So what is instrumentation? Instrumentation is basically an implementation of a specific runtime API, which calls the OpenTracing API when certain events happen in the runtime. By certain events, I mean, for example, an incoming HTTP request. So if there is an incoming HTTP request, the instrumentation will call the OpenTracing API and create some event in OpenTracing terms. For example, start a span, add some metadata to the span, or log something to the span. When the request is finished, the span is finished and reported to the tracing system. On this slide, the application takes the same space as the other parts, like the instrumentation, the OpenTracing API, and the Jaeger client. But usually the application code is the biggest part of your deployment, and the instrumentation is just a very small library sitting between the OpenTracing API and the runtime. I will not go into details and describe the Jaeger deployment; you can find it in the Jaeger documentation.
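The event flow Pavel describes — request comes in, a span is started, metadata and logs are attached, and on completion the span is finished and reported — can be sketched with the tracing concerns reduced to plain Java. A real implementation would be a servlet `Filter` or an interceptor calling the OpenTracing API; here the "reporter" is just a list collecting finished spans.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch of what a filter-style instrumentation does. Real instrumentation
// would call the OpenTracing API and report to a Jaeger client; this model
// collects finished spans in a plain list instead.
class TracingFilter {
    record FinishedSpan(String operation, long durationNanos, boolean error) {}

    final List<FinishedSpan> reporter = new ArrayList<>();

    <T> T trace(String operation, Supplier<T> handler) {
        long start = System.nanoTime(); // "start span" on the incoming request
        boolean error = false;
        try {
            return handler.get();       // the actual request handling
        } catch (RuntimeException e) {
            error = true;               // tag the span as errored
            throw e;
        } finally {
            // "finish" the span and report it, whatever happened
            reporter.add(new FinishedSpan(operation, System.nanoTime() - start, error));
        }
    }
}
```

Note the `finally` block: the span must be finished and reported on every path, including exceptions — exactly the kind of corner case Pavel warns about next.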
So the instrumentation is really just a simple implementation of specific framework APIs, usually something like a filter or an interceptor. You may think it's simple to do, because the code base is not usually huge. But what happens is that there are a lot of corner cases which you have to implement, and you have to handle them in a correct way. So doing instrumentation can be a little bit tricky, because you are not looking at the traces when everything works well; you are looking at the trace when something goes wrong, and usually it's a corner case. So these filter and interceptor implementations are very low-level instrumentations. To go further, we also have runtime instrumentations for different application stacks, right? Like Spring Boot or MicroProfile. I will talk about Spring Boot on the next slide. Here I would like to tell you something about MicroProfile OpenTracing. MicroProfile OpenTracing is a specification which builds on top of OpenTracing, and it defines tracing for specific Java EE technologies. At the moment, it is JAX-RS and CDI. So if your vendor implements MicroProfile OpenTracing, you can enable tracing just by a configuration change, and you will get visibility into your JAX-RS endpoints and CDI bean invocations without doing any manual filter installation or anything like that. So now I would like to show you a simple Spring Boot application. It's called the preference application, and it's instrumented with OpenTracing. As you can see, this is the main class. I'm not adding any filter or anything related to tracing. Then there is a preference controller, which is the next and the only other class. There is one REST endpoint, which is calling some external endpoint. There is some logging going on, and it returns some status code. So you may ask, where is the tracing integration, right? It's only in the pom.xml. I specified two dependencies. The first one is the OpenTracing Spring Cloud starter.
A starter is a Spring Boot-specific thing. When you include it on your classpath, it will automatically configure and install new technologies into your app. So this starter is just a collection of different instrumentations for Spring Boot. It includes tracing for Spring Web, Spring Messaging, MongoDB, RabbitMQ, and more. But it's vendor-neutral, right? It's not tied to Jaeger or Zipkin or any specific vendor. So to make it work with Jaeger, you have to specify the other starter, which is the Jaeger starter. It just provides a Jaeger implementation, which is then used by the Spring Cloud starter. You may also ask, where is the Jaeger configuration? It's in application.properties, where you can define the Jaeger URL and the sampling configuration. So let's try to run this app. I will also start the Jaeger all-in-one distribution. It seems it started, so let's call the preferences endpoint. And we can see something bad happened. Preferences responded with some input-output error. We can go to Jaeger and find out more about what is happening. As you can see, there are two spans. Let's look at the first one, and it tells you that this is a server span. So this span is modeling a server invocation, what just happened. The span was reported from the servlet integration. You can also see the URL, the method, the status code, but there are also logs. So let's look at the logs, and we can see an event, preHandle. What is interesting is the class simple name and the method name. This is actually the class name of the controller and the method of the controller which has been invoked: PreferenceController, getPreferences. We can jump back to the code and see if it matches. Yeah, it's exactly like that. So what else happened? There's also a log from the logger, level warn, the thread in the message, and some exception object. And as you can see in the code, if there is an exception, we are logging it to standard logging, to SLF4J, right?
But this log is also attached to the current active span. So you really get contextualized logging. Okay, so this was the server span. Let's look at the client span. In the client span we can see the component, the RestTemplate, the URL, and there is an error tag. So something bad happened. If you look into the logs, you will see the error object. So you not only see the basic metadata like the URL and the method; you can also see exactly which controller has been invoked, and the exception if one happened. So let's move on. Next, I would like to show you how tracing works in Istio. For this demo, I will use the Istio tutorial. You can find it in the Red Hat Developer demos. It deploys a bunch of Java microservices, then of course Istio, Jaeger, Prometheus, but also Kiali. Kiali is a service mesh observability tool, which allows you to see your service mesh, but also metrics associated with your services, routing rules, circuit breakers, and so on. The Istio tutorial is deployed on OpenShift. There are three microservices: the customer, preference, and recommendation services. So how does tracing work in Istio? Maybe some people are confused because they think that Istio can trace microservices out of the box, which is true, but it's not without any effort, as I will explain. If there is an incoming request to the preference service, for example, Istio will create a span corresponding to that request, a server span. But if the preference service is calling the recommendation service, then Istio will also create a span corresponding to that request. It is a client span, because you are calling from preference to recommendation, right? But you have to correlate these two requests together. And to do that, you have to propagate the headers from the inbound to the outbound request. I will show you a simple Spring Boot REST controller, and it's very simple. You just inject the headers and then you add all these tracing headers to the outbound request.
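The manual propagation Pavel shows boils down to copying a fixed set of tracing headers from the inbound request onto the outbound one. The header list below is the B3/tracing set named in the Istio documentation; the `Map`-based request model is a simplification of the real controller code.

```java
import java.util.List;
import java.util.Map;

// Manual propagation of the tracing headers Istio needs in order to
// correlate inbound and outbound requests (the B3 header set from the
// Istio docs). Requests are modeled as plain header maps for brevity.
class IstioHeaderPropagation {
    static final List<String> TRACING_HEADERS = List.of(
            "x-request-id",
            "x-b3-traceid",
            "x-b3-spanid",
            "x-b3-parentspanid",
            "x-b3-sampled",
            "x-b3-flags",
            "x-ot-span-context");

    // Copy every tracing header present on the inbound request
    // onto the outbound request; leave other headers alone.
    static void forward(Map<String, String> inbound, Map<String, String> outbound) {
        for (String h : TRACING_HEADERS) {
            if (inbound.containsKey(h)) {
                outbound.put(h, inbound.get(h));
            }
        }
    }
}
```

This is exactly the step the sidecar proxy cannot do for you: only the application knows which outbound call belongs to which inbound request.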
This looks pretty simple and straightforward, right? But if the application code gets more complex, you will probably not call the other services from the controllers; you will call them from your business methods and from other parts of your application. So you will have to propagate these headers to those parts where you call additional services. And to do that, there are different solutions, right? You can pass some kind of context object, or you can store these headers in thread locals. But there are different problems with these, right? If you pass a context object, you have to change all your business logic to accept this context object. If you use thread locals, there are different problems associated with threading, right? You may start different threads, you may use async APIs. So the thread local wouldn't work without basically instrumenting the threading, without properly setting the context in a newly started thread. And this is exactly what tracing libraries solve. As Juraci mentioned, it's called context propagation. And basically, at the moment, you can use any tracing library to do this task for you. So, back to the demo. You have two options: you can propagate the headers manually, or you can use tracing instrumentations with context propagation to propagate the headers for you. And in addition to that, you will also get visibility into the process, like we've seen in the Spring Boot example. And as I mentioned, Istio will create some spans for you, right? And the instrumentation also creates some spans. So you will get two spans for the same event. You may say it's duplication, but in some cases it's useful, because you will see the difference between handling requests inside your application and handling requests at the proxy level. So let's try to invoke some requests. I will just call the customer service with curl.
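The thread-local pitfall Pavel describes is easy to reproduce in plain Java: a value set on the request thread is simply not there on a newly started thread, unless the task is wrapped to capture and restore the context — which is what tracing libraries do for you under the hood.

```java
// Why thread locals alone are not enough for context propagation:
// a plain ThreadLocal set on the request thread is invisible on a
// newly started thread. The fix, which tracing libraries apply for
// you, is to capture the context when the task is created and
// re-install it when the task actually runs.
class ContextPropagationDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Capture the caller's context now, restore it inside the new thread.
    static Runnable wrap(Runnable task) {
        String captured = CONTEXT.get();
        return () -> {
            CONTEXT.set(captured);
            try {
                task.run();
            } finally {
                CONTEXT.remove(); // don't leak context into pooled threads
            }
        };
    }
}
```

Without the wrapper, any span started on the worker thread would lose its parent and show up as a disconnected trace.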
And we can see the customer is calling the preference service, and preference is calling recommendation. I can go to the Jaeger UI and find some traces. So there are a lot of spans, right? The spans from Istio are tagged with the component "proxy". And you can see there are almost the same tags as we've seen in the Spring Boot example, like the HTTP URL and HTTP method. And the child span of this span is the span reported from Spring Boot, right? It contains basically the same metadata as the proxy span. But if I close this, you will see there is a timing difference, right? The server proxy span usually takes longer than the one in the process, which is logical. Let's look at some client span. Here is the client span from the RestTemplate, and here is the same client span reported from the proxy. And you can see the timing difference, right? So for example, if you have a really slow implementation of the client, then the proxy can show you that everything works fine and the timing is okay, but the request would take longer because of, I don't know, class loading or something like that. Also, the last service is Vert.x, and there is no instrumentation, so we can see only the span coming from Istio. In this case, it's fine because the recommendation service is the last one, but imagine if it's not; then you have to propagate the headers again. And it's certainly useful to use instrumentation to propagate the headers, because imagine somebody new comes and implements some new method in the recommendation service that calls an additional service, and they can just forget to propagate the headers, right? So that was tracing in Istio. And the last part is how you can leverage distributed context propagation for routing rules in Istio. As you may have noticed, the recommendation service is deployed in two versions, and, for example, version two contains a specific fix for Safari users.
And you would like to redirect all the Safari users to version two of the recommendation service. How can you do it? You can use the user-agent HTTP header, which is set by the browser, and use it somewhere here in the routing rule. But to do that, you would have to change all the services which are in the middle, right? In this case, it's only one, but it could be hundreds or thousands of services. But if you have instrumented your services with tracing instrumentation — it doesn't have to be only OpenTracing; it can be any instrumentation which supports distributed context propagation, or so-called baggage — you can use it to propagate the user agent for you automatically. And that's exactly what we have done here. So let's look at the customer service. This is again a Spring Boot application, and the instrumentation is done only via the starters. In the controller, to use the baggage, you have to inject the tracer, which is an OpenTracing interface, and just add a baggage item to the current active span. So you see, I'm just injecting one header, adding it to the baggage, and then creating the outbound request. As you may notice, I'm not injecting this user agent into the outbound request; it's done automatically in the instrumentation layer. Then if I go to the preference service, I'm not manipulating any headers there. I'm not touching the baggage, and the baggage is automatically propagated to all downstream services. So let's create the routing rule. It's been created. Let's call the customer service, and we can see the response is as before: recommendation version one and recommendation version two. But we are not specifying the Safari user agent, right? So let's add it. Now I'm only redirected to recommendation version two. I will just go back to Kiali, and you can see there is a badge showing that there is a routing rule defined. So let's look at the routing rule. It's called recommendation-safari. The match is very simple.
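The baggage mechanism in this demo can be sketched in plain Java: an item set once on the span context is injected into every outbound request as a `baggage-`-prefixed header and extracted again by the next service. The `baggage-` prefix here mirrors the header the routing rule in the demo matches on; real tracers make this prefix configurable, and the class below is an illustrative model, not the OpenTracing API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of baggage propagation. An item added once flows through the
// whole call chain: the instrumentation layer injects it into every
// outbound request and extracts it from every inbound request.
class BaggageSketch {
    static final String PREFIX = "baggage-"; // mirrors the demo's routing rule
    final Map<String, String> baggage = new HashMap<>();

    // What tracer.activeSpan().setBaggageItem(...) does conceptually.
    void setBaggageItem(String key, String value) { baggage.put(key, value); }

    // Done automatically by the instrumentation on every outbound call.
    void inject(Map<String, String> outboundHeaders) {
        baggage.forEach((k, v) -> outboundHeaders.put(PREFIX + k, v));
    }

    // Done automatically by the instrumentation on every inbound call.
    static BaggageSketch extract(Map<String, String> inboundHeaders) {
        BaggageSketch ctx = new BaggageSketch();
        inboundHeaders.forEach((k, v) -> {
            if (k.startsWith(PREFIX)) {
                ctx.baggage.put(k.substring(PREFIX.length()), v);
            }
        });
        return ctx;
    }
}
```

So once the customer service puts `user-agent` into the baggage, every hop downstream carries a `baggage-user-agent` header that the Istio routing rule can match on, without any service in the middle being changed.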
If there is a header with the name baggage-user-agent and the value matches Safari, it will forward the request to version two. If you would like to propagate, for example, a baggage item foo with value bar, the header would be baggage-foo with value bar. So you see the structure. Okay, this is everything that I have for today, and let's continue with the Q&A. All right. Well, I'm back online, and Pavel, we've got to get you back on the webcam here, but there is an open question. If you could provide the URL to your sample code — someone wants to look at that baggage code a little bit more thoroughly on their own. So if you have a GitHub link you can provide us, that would be awesome. Juraci, you're back. Yeah. Can you hear me? Can you see me? Yes, okay. Yeah, fantastic. So one question that I think is a great question is: for microservices overall, why do I need tracing? Why do I need instrumentation? It's kind of a basic question, but let's make sure we answer that well. So who wants to take that one on? I can answer that one, yeah. Okay. So when we are talking about microservices, we usually have a great number of microservices and different versions, right? So if we don't use a tool like OpenTracing, or tracing in general, we don't know which services at which versions were touched by a given request, right? So you have this observability problem where you don't know what's going on behind the scenes. So this is where tracing can help you. Okay, and the point I always make to people is: if you have dozens or hundreds of microservices, you have no idea where that transaction went. So having the traceability, having the observability, is incredibly important. If you only have two microservices — and actually I talk to a lot of people who only have one or two or three — you're not really doing microservices. You're supposed to have dozens if not hundreds. If you just have one or two, that's really a monolith.
Yeah, even in those cases, I would argue that tracing is useful, because then tracing gives you visibility into your application as well — not only into your infra, but also inside your application. So you know how much time you spend during a database call, for instance. Yeah, we didn't have that demo today, but I know Bob McWhirter has been working on that specific demo, right? The traceability all the way through a monolithic application stack, including seeing how long the JDBC driver is taking. I don't know if we have that demo cooked in such a way that anyone else can see it at this point. Like, Pavel, do you have a link to that on GitHub or anything of that nature? No. Okay, we've got to work on that. One more — we did have technical difficulties, right? We lost Juraci for a period of time. So I think you made a point here in the chat that I think is still critical. The OpenTracing API — speak to that one more time. The OpenTracing API versus Jaeger, the OpenTracing API versus Zipkin — can you explain that one more time, please? Of course. So the OpenTracing API is the outcome of this vendor-neutral specification, right? So it's only an instrumentation API. So if you don't know which vendor or which solution to use, then you should at least instrument your code using the OpenTracing API and later on decide which concrete solution to use. And it can be Jaeger — I hope it is Jaeger — but it can also be Zipkin, or it can be any other OpenTracing-compatible solution. Okay, and one last question. It was from John, and he's going back to Juraci, and he says: what is the difference between tags and baggage, and how is that propagated across child spans? Maybe I can answer that. So the tags are some metadata attached to a span, and the baggage is also metadata attached to a span, but the baggage is propagated to all downstream services.
Also, it depends on the OpenTracing implementation, but in Jaeger, for example, you cannot search for spans or traces based on baggage. Oh, okay. That's good to know. I didn't understand that myself. Okay, one last question, and then we have to get out of here — we're over time — but how much of a performance hit do I take by adding this instrumentation, adding this layer of observability? Any thoughts on that? I can answer this one. So we aim to have very little overhead. Let's talk first about OpenTracing, right? OpenTracing is only the API, so it should have no overhead on your system. Now, an actual tracer will of course incur overhead. The Jaeger clients are built in a way that is non-blocking. As soon as possible, they delegate the delivery of the spans to another thread that runs in the background, and that sends them to the agent running on the local host, right? So it should be quite lightweight on your host, on your application. Now, of course, you should expect some performance hit, but you can adjust that hit by adjusting the sampling strategy. So you can say: I wanna measure only 0.1% of my requests. And if I do have bandwidth, I can increase that to: I wanna trace every single request. So you can adjust that based on your requirements. Okay, well, we are out of time and I — oh, Pavel? Do you wanna say something? Yeah, I do need to add something. With OpenTracing, the instrumentations are usually open source, so it's easy for you to look inside at what it is doing and what it is reporting, and it's not some black-box agent doing some magic with your bytecode. No, thank you. Yeah, very good point. Well, we are out of time. I wanna thank you guys so much for your presentation today. I really enjoyed it. We had some technical difficulties. I even had problems with my laptop here in the Red Hat New York office. Something went awry and I got disconnected myself.
So I even missed part of the demo portion as well, because I was disconnected. Pavel, great job with the demos though. Those came through loud and clear, at least on my end, and hopefully we had a number of people who were able to watch all that. Feel free to email me. You guys should have my email from the announcement emails that go out to everybody to let you know what the next DevNation Live is. We have scheduled all of July as well, and we're in the process of scheduling August and September. So there's a lot of good content coming your way. But if you have other questions or wanna know where the recordings are, just email me. Thank you guys so much.