Konnichiwa, everyone. My name is James Faulkner. I work at Red Hat; I've been there for about seven years, and I focus on technical product management and marketing for our hybrid platform products like OpenShift and JBoss, our application development platform, and a number of components on top of that. So with me today is Daniel. Daniel, want to introduce yourself? Sure. Thanks, James. Hello, everybody. My name is Daniel. I work for Red Hat, like James, on the same team. I'm a developer advocate and also a CNCF ambassador, so I'm pretty much focused on Kubernetes, serverless, and service mesh, as well as container runtimes and cloud runtimes like JavaScript and so on. We're more than happy to be here today, and for the next 30 minutes we're going to talk about OpenTelemetry: how to make a better experience with OpenTelemetry and serverless, specifically for Java applications. I'm going to hand it back over to James. Thank you, Daniel. So I'll speak for about 10 or 15 minutes, and then Daniel has a demo to showcase how OpenTelemetry can solve a very important problem for Java developers in the world of cloud native computing. So here are pictures of Daniel and myself and our Twitter handles. That picture on the right is me, a few pounds lighter, a few more hairs, in Kamakura, south of here, in 2002. That was my first time visiting the country, back when I worked at Sun Microsystems. This is my fourth time in Japan, so I'm really happy to be here today to speak to all of you. Thank you all for coming. I know I'm competing with a keynote speaker from earlier this morning, so thank you for that. So I want to get right into it and talk about observability. I want to step off the edge here. So what is observability? From the English word, it means you can see something. 
You can observe it, but it also means that you can understand it: you can draw conclusions from what you're seeing and process it in your mind. It's really important to be able to do that in software as well, and it has been since the beginning of the industry. We really need to understand the systems that we build so that we can improve them and make them safer, more compliant in some cases, better, and more efficient. So observability is super critical. One of the challenges in observability is knowing what you are going to possibly ask in the future, before the time comes to ask that question. And so observability is really more about capturing as much as possible, within certain constraints, and not limiting yourself to one specific angle or one specific facet of your system. Because if you only look at that one facet, you're only going to be able to draw conclusions based on limited data. So observability is about trying not to define the questions up front, but, more importantly, defining the types of data that you can collect and observe, and doing something interesting with that. This challenge also exists outside of the world of software. One recent example is this; you may recognize it. According to this picture, the ball was clearly out and Japan should not have scored that goal, right? Because you've got a few centimeters of green there. If you only look at this one angle, this one piece of signal, you're going to draw the wrong conclusion. Having more angles and being able to observe multiple different facets of your system can help you infer a more accurate conclusion. So in this case, they were able to conclude that the ball actually was in, because there were multiple signals to draw from. And in software, the same exact thing applies. 
So, getting down into details as a developer, the kinds of questions you might want to ask are: Is my software running? Is my program starting but not actually accepting requests? If something goes wrong, how do I get to the root cause of what went wrong? If my application is running, why is it slow? These are the kinds of things you might consider as a software developer, but having those multiple angles can open up the aperture to other stakeholders, right? In the soccer example, it opens the aperture up to the fans and others, not just the officials. So it's really great to have those different angles in your telemetry solution. The industry has made many different attempts to define what observability and telemetry are. One easy, sometimes controversial, way for someone who's new to observability is to put the different types of signals into three buckets: metrics, logs, and traces. Metrics you're probably familiar with; this is like how much memory my program is using. Logs are an immutable list of things that occurred in the past. And traces, in particular distributed traces in the age of microservices, trace the path of a single invocation or single request through the different services that it visits. Historically, in the world of mainframes and monoliths, the signals were log files and stack traces or, in earlier cases, core dumps from the kernel. An SRE would take that and try to figure out, based on that one snapshot, what happened. Imagine if they had a full set of distributed traces and logs and a set of metrics leading up to an issue; it makes it a lot easier to figure out what went wrong. And there are, in fact, tools that can help you do that today. So one interesting aspect here is that logs and traces grow linearly with the request volume. 
So as you get twice the number of requests coming into a system, generally you're going to get twice the number of logs and traces, because twice as many things are happening. That's unlike metrics, which depend on how the metric is actually defined and don't always scale linearly. That's important to understand when you're designing an observability system, because you cannot capture every single possible facet of every single microsecond of your system; you would spend all your time writing logs to a file and not actually doing real work in the system. So there are some trade-offs there, and OpenTelemetry makes some of those trade-offs for you. In fact, a lot of the prior art in the world of observability recognized this same phenomenon. So how do you get these metrics, logs, and traces? Well, the typical process is: you instrument the code, you collect the data and put it somewhere, you process that data, and then you infer conclusions or draw visualizations to help humans understand this mountain of data, right? Traditional APM vendors cropped up probably many decades ago; as PCs became popular and a lot more people started to build software, a lot more vendors popped up to attack this problem, which is fantastic, right? You love to see advances in software. But the problem is, of course, they each had their own solution for instrumentation, their own solution for data processing, their own solution for storage, their own solution for visualization, so you sort of get locked in. And once you do that, it's really, really hard and very costly to switch. So that's what the traditional APM vendors offered, and I could name names, but I won't; I'm sure you can think of several in this space. And the reason we're here today is to talk about open source. 
So as the state of the art in software development advanced, as we started to build more and more distributed systems, the cost of providing and keeping up with the state of the art became very high, and it continues to rise. So these APM vendors decided: we don't really want to spend resources competing on creating APIs. We want to spend resources competing on things like visualizations and AIOps and higher-level concerns in the world of software and observability. Let's collaborate instead of siloing the solutions. And so a number of open source solutions became known and started to be developed in the early aughts. Early instances you've probably all heard of include Zipkin and Jaeger. Even before that, Google started a project called Dapper, and there are many others that attempted to bring an open source solution to the world of computing. So as a developer, how do you get started? If you're a new developer, you're new to observability, and you want to add this to your program or your container runtime or whatever you happen to be working on, the first thing you're going to do is go to the CNCF website and do a search for monitoring, logging, and tracing. And you'll get something like this, right? It's a little bit overwhelming. There are lots of different solutions. That's the good and the bad part about open source: anyone can develop a solution and get listed on a quote-unquote marketplace, and now you're faced with a number of somewhat competing, somewhat complementary solutions. So it's really challenging as a developer to understand how to get started. So one of the creators of the Dapper project recognized the same issue and said: there are too many logging standards, there are too many tracing standards. We need one standard. 
So they invented OpenTracing. OpenTracing was great because it was the first attempt at defining what a trace is. They were working in the distributed tracing area, so they defined what a trace is, right? A trace is a directed acyclic graph of spans of work, with different metadata associated with each of those spans. And they created an API, which is fantastic for tracing. There are no metrics, there's no logging, just tracing, but it was a fantastic attempt. You may recognize this cartoon, right? This is exactly what happened: they said, there are 14 different standards, that's nonsense, we just need one. So let's create one. And then soon there are 15 standards for tracing. So it was another valiant attempt. At the same time, other vendors, like Google itself, were working on bringing additional functionality above and beyond just tracing. They wanted to add metrics. They wanted to not only provide an API, but actually provide an SDK, right? A binary, downloadable thing that you can use with your program. They also wanted to provide full-stack support: not just instrumenting your code and emitting traces or metrics, but also collecting those metrics, storing them, processing them, and providing hooks for other vendors to do additional work on that data. So this was OpenCensus, the next evolution of an observability quote-unquote standard in open source, which is awesome. But again, right? This is a different one: in some ways complementary to something like OpenTracing, in some ways it competes with OpenTracing. So they had a lot of overlap there. The good news is they got together and realized this, and the two projects have merged. The result was actually released this year, but the merge happened in, I think, 2019. 
Both of these projects were sort of EOL'd, or slowed down, or deprecated in some way, and a new project was born from them, called OpenTelemetry, which is what we're here today to talk about. OpenTelemetry essentially takes the best of both OpenTracing and OpenCensus and combines them into a full-stack observability solution for a number of different programming languages, libraries, and frameworks. Again, it's full stack, meaning there is a way for you to instrument your code, there's a way for you to collect and store the data, and there's a way for you to analyze the data and draw some basic conclusions about it. And it provides the ability for vendors, like, I don't know, Datadog or Snyk or a number of other ones, to compete in that higher-level space. So it's kind of a win-win for everyone. In fact, at last count there were 800 different contributors from 150 different companies. If you go to the OpenTelemetry website and look at the list of companies that are contributing, you'll see all those APM vendors, right? They all want to get out of the business of trying to maintain and track the state of the art in terms of how you collect data, and move into a higher-level competition with each other. So instead they're basically contributing back and working with each other to develop and move the standard forward. So OpenTelemetry brings metrics, logging, and tracing; all three of the observability pillars that I mentioned earlier are represented in OpenTelemetry. Now, the logging part is not quite finished yet; it should hopefully go GA sometime early next year. But it's a fantastic way for you to instrument your code, and, as I said, it does metrics, logging, and tracing, and all that data can be correlated, with unambiguous correlation. 
So if you emit a log line as part of a method, it can immediately, automatically be attached to a span and then a trace, and then the visualization tools can help you deep dive into a problem. If you're looking at a trace that has some performance issue, you can kind of double-click and get down into the logs and into the metrics associated with that trace. So it's a really well-thought-through solution. And it's really powerful for, again, giving you the ability not to define the questions up front, but to answer many, many different types of questions after the fact, when something goes wrong or when you just want to understand the state of your system. So, very powerful. And Daniel's going to show you a lot about how it works in a moment. I just wanted to point this out: this is the graph of contributions, PRs, and commits on CNCF projects, and as you can see at the upper right there, OpenTelemetry is almost as active as Kubernetes itself. This is a testament to the power of open source and the power of communities and vendors coming together to work on a common solution; this benefits everyone in this room, I hope. So we hope to see that continue going forward. What OpenTelemetry is, as I said, is a library with metrics, logs, and traces. It provides not only the specification, like what OpenTracing provided, but also SDKs in various languages for binding to your language. It defines what a binding even means: how the different native types in a language map to concepts defined by OpenTelemetry. It also defines a binary protocol for transmitting the data, collecting it, and massaging it. And, more importantly, it provides a way to maintain compatibility with existing systems. 
So if you're using things like Jaeger or Zipkin or Dapper or any of the other common solutions, you'll be able to use OpenTelemetry with them, because it has a way to collect that data in its native format, transform it to the OpenTelemetry format, and then store and process it later on. A very powerful piece there is the thing on the right, the OTel collector. It's sort of a Swiss army knife of tools, and it's provided as part of the project. So you can use not only the standard protocol, but also the existing protocols you may be using. Okay, so I think that's it. For Java, this is a particularly challenging point in time: as the cloud has matured, Java has traditionally been thought of as heavyweight and slow, and so not compatible, if you will, with modern microservices. We at Red Hat are doing a lot of work in this space, and Daniel's going to show you a solution for Java in the context of serverless, which is where that historical, traditional Java problem is really magnified. So I'll pass it over to Dan. Thank you so much, James. So in the traditional way, with OpenTelemetry or the other tracing tools, you already know where your application is, the target environment, so you can gather or collect your telemetry data, tracing data, logs, etc. However, a serverless application goes up and down at any time based on your network traffic, so it's very hard to find it and collect your telemetry data. That's why I really want to focus on how to solve this kind of problem for developers, so that in the end you can collect that insightful telemetry data and logging, etc. in your infrastructure layer. So, Quarkus: it's a new Java framework, just in case you've never heard about it before. 
So a lot of people are actually moving to JavaScript or Go to build serverless applications on top of things like AWS Lambda, because Java is pretty heavyweight. It was born almost 27 years ago, and it has dynamic behavior: you could run a Java application in any app server over internet technology, which was super awesome at the time. However, things change. Everybody is doing Kubernetes, just like you, and your application containers go up and down at any time; they're killed and then scaled out, like 10,000 or even a million containers, at any time. In that case, Java's dynamic behavior is not good behavior on a container platform, so a lot of people just moved away from Java technology. However, we cannot ignore the data: there are more than 16.5 million Java developers out there, developing business application services every single day. So Quarkus actually gives those Java developers some shiny things, with a bunch of performance tuning. We changed the fundamental architecture, and we give more developer productivity, et cetera. So I'm going to get right into the demo and give you some more interesting stuff. I feel you got a little bit sleepy after a good lunch, so I'm going to stop the boring slide deck, and here's my application. I already created my application this morning. It's pretty simple, just a REST API. You can see a REST endpoint that returns "hello, good evening," and I added "Welcome KubeDay Japan 2022." And the other is a REST endpoint for OTel, OpenTelemetry, and my session title. Let me run it in my local environment, just to make sure my application is running and working locally and to show how it's done. And then I'm going to stop it. 
So I'm going to run my local tracing infrastructure, which is Jaeger and the OpenTelemetry collector, and I'm going to use a container runtime like Docker. So here's my Docker Compose file. It's pretty simple: here's Jaeger, where I'm going to just run the Jaeger image, and here's the OpenTelemetry collector image, with the backend port it uses to gather messages and send them back to the Jaeger server. In order to run that, I'm going to run the docker compose command, and it just pops up instantly. Let's check that it's running. Okay, I have two container processes here: OpenTelemetry and Jaeger. And now I'm going to run my Java application. Once I run this application, it automatically connects to OpenTelemetry, because when I go back to my project, here are my key and value properties. Let me zoom in a little bit. Here we go. Here's my application name, which identifies the service in the Jaeger server. Here's my OpenTelemetry exporter endpoint, which points to the OpenTelemetry collector and port I just ran in the Docker container. And this is all I need to do. Actually, using Quarkus, you can skip this kind of configuration because it's just the default configuration, but I want to showcase explicitly what kind of configuration you need to set on the application side to connect to OpenTelemetry. So now I'm going to the browser to open the Jaeger web UI. As you can see, there's just the default Jaeger service, and there's no tracing data at this moment. I'm going back to the terminal window, the empty terminal window, and let's try to access one of the endpoints: hello, good evening. And now you can see "Welcome KubeDay Japan 2022" with my name. And I'm going to reload the Jaeger UI. 
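For reference, the kind of Docker Compose file described here might look roughly like the sketch below. The image tags, ports, and config file path are assumptions based on common defaults for Jaeger all-in-one and the OpenTelemetry collector, not the exact values shown on stage.

```yaml
# Sketch of a local Jaeger + OTel collector setup (assumed defaults).
version: "3"
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"   # Jaeger web UI
      - "14250:14250"   # gRPC endpoint the collector exports spans to
  otel-collector:
    image: otel/opentelemetry-collector:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"     # OTLP gRPC receiver the application sends spans to
    depends_on:
      - jaeger
```

Running `docker compose up -d` then brings up both containers, matching the two processes shown in the demo.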
Now you can see the new service came up; it just collected your data. And it's not just Jaeger directly: OpenTelemetry actually grabs the data, collects the telemetry data, and then sends it back to Jaeger. So now you can check it out in Jaeger. You can actually plug the Jaeger server in alongside other tracing servers, just like James mentioned earlier. Click on the trace, and you have one trace here. You can find more detail if you already have some experience using Jaeger and you're familiar with how to use it. If you call it one more time, you now have two traces. Then let's try the other endpoint, the OpenTelemetry one, which demonstrates distributed tracing integrating Quarkus, Knative, and OTel. I'm going to reload the web page again, and now you can see the other endpoint here: hello, OTel. Click on OTel and find traces, and now you have one. So I just really quickly showed how to connect my Java application to OTel, OpenTelemetry, and then back to Jaeger, which is pretty simple. My application is pretty simple, just a REST API, and I just need three keys and values, which allow me to use OpenTelemetry and put all the tracing data together into the Jaeger server. Now I'm going to deploy this application into production, which is Kubernetes serverless based on Knative, because sometimes I don't want my application running all the time, to save money on the infrastructure layer. For that, Quarkus actually provides one great feature here, the OpenShift extension, which allows me to skip a lot of manual steps. So let me build my application first: I'm going to stop my local runtime and then just build my application. Behind the scenes, it packages the application as a JAR artifact and then containerizes the image. 
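The "three keys and values" mentioned here might look roughly like the following in `application.properties`. Note that the exact property names have changed across Quarkus versions, and the service name and endpoint below are placeholder assumptions, so treat this as a sketch rather than the demo's exact configuration.

```properties
# Sketch of the Quarkus OpenTelemetry configuration described in the demo.
# The service name and collector endpoint are assumed values.
quarkus.application.name=otel-demo
quarkus.otel.traces.enabled=true
quarkus.otel.exporter.otlp.traces.endpoint=http://localhost:4317
```

As noted in the talk, recent Quarkus versions apply sensible defaults for most of this, so explicit configuration is mainly useful for pointing at a non-default collector.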
Then it pushes that image into a container registry; in this case, I'm going to use the integrated container registry inside Kubernetes, in the OpenShift cluster right there. And finally, it will deploy to a Knative service on top of Kubernetes. From the developer's standpoint, I just need to run one single command line, and the other steps and tasks happen automatically. What happened? When I go back to my project, in the target directory there is a kubernetes directory with generated YAML files for the Knative service and for Kubernetes and OpenShift. Once it's deployed, I go to my OpenShift cluster, which is based on Kubernetes. I already installed Jaeger and OpenTelemetry using our operators. This is not local anymore; this is in the cloud, on Kubernetes. And there's no trace at this moment; it's brand new. As you can see, it's just the default Jaeger query UI, and there are no services here. In the meantime, my application is deployed as a Knative service here, and it will be coming up soon. Once it comes up, it's running, and it looks just like a normal pod. Back here, your application is running as one pod, and it will scale down to zero as long as you don't have any incoming network traffic for the next 30 seconds, which is the default scale-to-zero configuration in the Knative project. So let's give it a moment. Once it's down, we're going to invoke the REST API based on its fully qualified domain name, and we're going to see it automatically collect the telemetry data with OpenTelemetry and Jaeger. To show what happened there, let me showcase the configuration I actually set up. For example, let me change to the Administrator console, then Operators, then Knative Serving. Here is Knative Serving; I already created the CR. 
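The Knative Serving tracing configuration described here might look something like the sketch below. The apiVersion varies by operator version, and the collector service name, namespace, and sample rate are assumptions; Knative emits spans in Zipkin format, which is why the backend is configured as `zipkin` pointing at the collector.

```yaml
# Sketch of a KnativeServing CR with tracing pointed at an OTel collector.
# Service/namespace names and sample-rate are assumed values.
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    tracing:
      backend: zipkin
      zipkin-endpoint: "http://otel-collector.observability.svc:9411/api/v2/spans"
      sample-rate: "1"
```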
When you go to the detail YAML view, you can see I already set it up. Let me make it bigger. Here is the tracing backend configuration, which actually points to our running Jaeger pod; so Knative Serving is already connected to my backend Jaeger server. Then, back in the OpenTelemetry namespace, here is the OpenTelemetry collector CR; go to the otel resource and its YAML file. Now you can see the receiver, and the exporter to the Jaeger server. So this is how to make it happen when you deploy serverless applications in Knative: it automatically connects to OTel, and OTel grabs the telemetry data and sends it back to the Jaeger server. You don't need to worry about how to communicate with a serverless application that goes up and down, the way you would with an agent-based approach, because serverless scales up and down at any time; you don't know when it will go down and come back up, it's random, based on your network traffic. Let's go back to the topology view and take a look: our application actually scaled down to zero; there's no pod anymore. So let's go back to my local environment and try to access the endpoint: hello, good evening. I just invoked my REST API, and now you can see the pod is going up, just like serverless behavior, like AWS Lambda. In the meantime, let's go to the Jaeger UI, and I'm going to reload it. When you reload the Jaeger UI, a new service pops up, named after your serverless deployment. Here is our serverless service name; it's already detected automatically. When you click Find Traces, you can find the hello endpoint. If you invoke it several more times, you get more traces here, and you can also call the other REST API, reload the Jaeger UI, and find a new operation created under the same service. 
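The collector CR with "the receiver and the exporter" shown here might look roughly like the sketch below, using the OpenTelemetry Operator. The resource name, Jaeger endpoint, and namespace are assumptions; the Zipkin receiver accepts the spans Knative emits, while OTLP covers the Quarkus application, and the exporter forwards everything to Jaeger.

```yaml
# Sketch of an OpenTelemetryCollector CR receiving spans and exporting
# them to Jaeger. Names and endpoints are assumed values.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc: {}
      zipkin: {}          # Knative Serving sends spans in Zipkin format
    exporters:
      jaeger:
        endpoint: jaeger-collector.observability.svc:14250
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp, zipkin]
          exporters: [jaeger]
```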
And when your serverless function goes down and comes back up based on new network traffic, you can find the service automatically in the Jaeger UI. So that's a pretty easy way to build up your OpenTelemetry stack, as well as a serverless application based on Java running on the Quarkus framework, using Jaeger as the backend server. You can add more polyglot languages later on, like maybe Rust and Python and, you know, JavaScript. But the challenge for developers is always to figure out, as James mentioned earlier, which SDKs, APIs, and other pieces you need to use to integrate this kind of stack for your serverless application on top of Kubernetes. Okay, that's all I have today, James. I just want to mention my YouTube channel; feel free to subscribe. I've already put up similar videos, tutorials, and a bunch of stuff. And we're more than happy to address questions if you have any. There we go. Can you pass my mic over there? Thank you for the presentation. One question about the traces. Let's say we have some health checks generating a lot of API calls. How would you filter those out, so that we don't generate traces for such unnecessary API calls? Yeah, there are a number of different ways to do it. You can do it in the application, of course. If you're looking at something like a health probe from Kubernetes, you can also filter those out through the OpenTelemetry collector configuration. You probably did see the probe service there in the trace that Daniel showed. So yeah, you can filter those out in the OpenTelemetry collector configuration. Or, if you're using something like a service mesh, you can also eliminate those endpoints from being emitted as a trace, or as part of a trace. So there are a number of different configuration layers where you can eliminate those. 
So it would be on the collector side. Or is it possible to configure it also, for example, in Java? We can use the auto-instrumentation library; can it be configured on that side too? Yes, you can. Absolutely. Okay, thank you. Okay, I think we're out of time. If you have any other questions, we'll be around afterwards. Yeah, we're going to stay around the hallway; feel free to reach out and ask any question. We're more than happy to address them. Thanks for joining today. Have a good rest of the day. Thank you very much.
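The collector-side filtering mentioned in this Q&A might be sketched with the collector's filter processor, shown below. This assumes a collector version with OTTL-based span filtering; the attribute key and the `/q/health` path (Quarkus's default health endpoint) are assumptions, and the processor must also be added to the traces pipeline.

```yaml
# Sketch: drop health-probe spans in the OTel collector (assumed syntax
# for recent collector versions with OTTL span conditions).
processors:
  filter/health:
    error_mode: ignore
    traces:
      span:
        - 'attributes["url.path"] == "/q/health"'
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter/health]
      exporters: [jaeger]
```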