All right, I guess I'm going to get started since the room is full here, and hopefully those of you that are online, or couldn't get in, can at least listen or catch the recording. I think there were about 700 people that registered for the event, and when I saw the room I said this is going to be a lot of unhappy people. So thanks everyone for joining and coming in person; they gave me lots of room up here, but you guys are pretty crammed in out there. There's only one person on stage, so I'll explain a little bit about why Pavol's not here, and introduce him. He was also able to pre-record a few sections of the presentation, so you'll hear directly from Pavol. My name is Jonah Kowall. I work on Jaeger, and I'm the CTO at Logz.io. One of the tools that we offer is Jaeger, along with Prometheus and OpenSearch. I spend a lot of time working, but when I'm not working I try to be underwater; these are some photos I've taken recently, as I spend a lot of time diving. I live near Miami, Florida. I have Pavol's bio here, which is why I'm pulling it up. Pavol is a senior software engineer at Red Hat working on distributed tracing and observability for modern cloud-native applications. He contributes to several CNCF projects, including OpenTelemetry and Jaeger. Pavol spends a lot of time in the mountains, which is why you see these awesome pictures; he's always sending me photos of himself skiing and climbing and making his dog do uncomfortable things in the mountains, so it's always enjoyable. Pavol is great to work with; he's definitely part of the team working on Jaeger, and it's great to work with him there and on OpenTelemetry. So you'll hear from Pavol a bit, and I'll also talk through parts of the presentation. Today we're going to cover a few different things.
I'm going to talk about some of the basics. I'm not going to go into a whole tutorial about distributed tracing, because you're going to get that in many of the OpenTelemetry talks going on this week; I know my colleague Dotan is speaking tomorrow about OpenTelemetry. But I'm going to go into a few basics just for context, because I'm sure many of you aren't tracing people and you're just trying to see what Jaeger can do for you and what this tracing thing is all about. Then Pavol is going to go a little deeper into Jaeger and OpenTelemetry and how they work together. I'm going to talk about a really cool feature, probably the biggest Jaeger feature that's been built in the last couple of years: how we've integrated Prometheus into Jaeger, so that you can now do application performance monitoring inside of Jaeger. It's nice to have the compatibility with the Prometheus ecosystem, so I'm going to talk about that. Then Pavol will discuss the operator. He spends a lot of time on it, and it's obviously super important for those running on Kubernetes or OpenShift to use the operator, so he's going to talk about that. Then I'll touch on the roadmap and where Jaeger is going, for those of you that are using it and want to see what the vision is, and then we'll do some Q&A. So without further ado, I'm going to jump in. A few semantics first: Jaeger was a project created by Uber and then donated to the CNCF. It's one of the few graduated projects of the CNCF, so it's considered mature; it's been around for a while.
It was actually built around OpenTracing, which is the predecessor to OpenTelemetry, so some of the things in the UI and some of the semantics might use slightly different terminology. But I'm going to talk in OpenTelemetry terms, because that's what you're going to instrument your applications with, and it's by and large very similar to OpenTracing. So a trace is basically an end-to-end transaction that's going through a bunch of different microservices. Each of those microservices, and the pieces they're executing, is a span. Even if you're not in a microservices architecture, any time you're calling a service which is calling a service which is calling a service, you're going to get trace data. The span itself is basically the work that one component is doing in your application. Inside of the span you can have a bunch of different types of data, but tags are extremely useful. If you're running on Kubernetes, you can auto-populate the data about your Kubernetes infrastructure inside the span, so that when you're troubleshooting a problem you might see that a particular pod is having an issue, because all of the transactions with that tag are showing an issue with that pod name. That's what tags are used for, and people stuff all kinds of things into tags; you can even stick a whole log message into a tag.
It's not always the best idea, but I've seen some crazy stuff. People also create a lot of different types of span data related to a trace, because it's your code: you can essentially instrument and build out that visibility in any way that you want. OpenTelemetry supports a lot of different types of instrumentation, from SDKs to APIs to auto-instrumentation, and you'll hear about that in the OTel discussions and talks. Another way to look at this visually, and I'll show you how it looks in Jaeger: what we call the root span is the first part of the transaction, up at the top, and then the work can cascade through asynchronously or synchronously. As these things execute, you basically build a tree of what the trace looks like. Another way to look at this is over time: as I progress from left to right, what things are executing, what things are executing at the same time, and where are there potential blockers? That's really what tracing is useful for: seeing where errors, latency, or other problems are within the trace itself. So in the Jaeger UI here, we're basically just visualizing over time. This is a relatively short transaction, 700 milliseconds, and you can see where the time is spent. A good example of a potential issue is when you see things that are stacked up like this; that shows there are potentially ways you could improve this by threading it or changing your application. Things like errors also show up, and you can actually drill in inside the UI. So that's basically how Jaeger visualizes traces. Obviously in this case we're looking at a front-end service, and then you can watch as it goes through the back-end services.
So it's just kind of an example. Pavol is going to explain a little bit about Jaeger and OpenTelemetry for a few minutes here in the video, and I will touch on that a little later, because it is an important part of our roadmap. So here is Pavol. Hello everyone, my name is Pavol, and in this part I will talk about Jaeger and OpenTelemetry. First of all, I want to explain the differences between these two projects and ecosystems; then I want to introduce you a bit more to OpenTelemetry; and last but not least, I want to talk about the integrations between these two projects. So what is the difference between OpenTelemetry, or OTel as we call it, and Jaeger? OpenTelemetry is all about data collection. It provides capabilities to gather telemetry data from our applications, process it, and then send it to any kind of vendor of our choice. On the other hand, Jaeger is a platform; it's a server that you can run on your machine or in the cloud, and it has APIs to ingest as well as query and visualize tracing data. Before, I think, 2021, the Jaeger ecosystem also provided client libraries that we could use to instrument our applications, but this part of the ecosystem is now deprecated, and the main purpose of Jaeger right now is just the platform: the server that we can actually deploy somewhere. So let's talk a bit more about OpenTelemetry. I'm pretty sure there is a separate OpenTelemetry presentation at this conference, but just for the context of this talk I want to introduce you to it. As I mentioned, OpenTelemetry is all about data collection, and data collection is a very complex and difficult problem. If we want to do observability right, then we need to collect as much telemetry data as possible, and the problem is that there are many programming languages.
There are also many frameworks in all of those languages, and if we want to understand what is happening in our applications, we really need to instrument all those frameworks. This is a huge job to do, because there are just too many of those frameworks, database clients, and whatnot. So the first part of the OpenTelemetry ecosystem is the instrumentation libraries for various RPC or database clients, for different languages. The second part of the ecosystem is the OpenTelemetry Collector, which is able to receive data from those instrumentations, process the data, and then send it to an observability platform. And the last part of the ecosystem is a specification that basically standardizes what telemetry data should be collected, to be consistent across languages and frameworks. So let's now take a look at the reference architecture when instrumenting an application. On this slide, what we see in the upper part is the user application process, so our application in other words, and it is instrumented with some instrumentation API to collect tracing data. This instrumentation code then sends the data to the agent and collector, and finally the data reaches the platform. For the instrumentation API, we used to use the OpenTracing API, which the Jaeger clients implemented, but this is now deprecated, and we encourage our users to use the OpenTelemetry API and SDK to instrument applications. For the agent and collector, you can still use the Jaeger agent and Jaeger collector, but you also have the choice to use the OpenTelemetry Collector as a replacement, especially for the Jaeger agent. You still have to use the Jaeger collector if you want to use Jaeger, but in some scenarios you can use the OpenTelemetry Collector and then send data to the Jaeger collector, which will store the data to persistent storage. So let's talk a
bit more about the OpenTelemetry Collector, because by now you may be confused about the difference between the Jaeger collector and the OpenTelemetry Collector. The OTel Collector is essentially a pipeline that can receive data in multiple formats, process it, and then export the same data to a different system. The Jaeger collector is a lot different, because it can only receive data in Jaeger or Zipkin format and then store it to a persistent storage supported by the Jaeger platform, which is Cassandra, Elasticsearch, Badger, or Kafka. The OpenTelemetry Collector doesn't provide any storage or query capabilities; it just receives data, processes it, and then sends it to a platform, which in our case is Jaeger. In the OTel Collector, the functionality is divided into three main components. The first is the receiver. Then there is the processor, which processes the data, for example to do PII filtering or data normalization; a processor can even extract metrics from traces, and Jonah will talk about that in more detail. Last but not least, the exporter sends the data to a different system. So it's like a pipeline, a very flexible one, that allows you to process the data. What is super cool is that you can take a single stream of data, process it, and then export it to multiple systems, so you can keep some data in a Jaeger that you have deployed locally in your environment, but also export the same data to an observability vendor. All right, so this is the collector, and now let's talk about the integrations between Jaeger and OpenTelemetry. The first set of integrations is done in the instrumentations,
especially in the OpenTelemetry SDKs. As I mentioned before, the Jaeger ecosystem used to provide Jaeger clients that we could use to instrument our applications. However, those clients are deprecated now, and we encourage users to use the OpenTelemetry SDKs and APIs. To better support the migration to OpenTelemetry, we have done a bunch of work on the OpenTelemetry SDKs. The OTel SDKs support the Jaeger context propagation header format, so if you're already using the Jaeger clients and Jaeger context propagation, you should be able to deploy a new service instrumented with OpenTelemetry and use this piece of functionality to make sure that traces are not broken and that context propagation works across the old and new services. The second integration in the OTel SDKs is a Jaeger remote sampler implementation, so if you are using the Jaeger remote sampler, you should be able to use it with the OpenTelemetry SDKs as well. Now let's talk about the OpenTelemetry Collector and Jaeger integrations. In the OTel Collector, there is a Jaeger receiver that can receive Jaeger data in pretty much all supported formats. Then there is a Jaeger exporter that can export data to Jaeger. There is as well a Jaeger remote sampling extension, so you can essentially grab your sampling configuration from the Jaeger collector and use it in the OpenTelemetry Collector. Last but not least, the OpenTelemetry Collector integrates with Kafka, and you can configure it to use the Jaeger format as the payload. And the last integration is in the Jaeger query component: we have added the Jaeger v3 query API, which is very similar to the v2 API, with one major difference: the payload is not a Jaeger type but comes from OpenTelemetry, and it is OpenTelemetry resource spans.
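As a rough sketch of how these collector-side integrations fit together, assuming the contrib distribution of the OpenTelemetry Collector (all endpoints here are placeholders, and the exact fields of the remote-sampling extension vary by collector version):

```yaml
# Hypothetical collector config: receive spans in the classic Jaeger client
# formats, scrub an attribute, and fan the same stream out to a local Jaeger
# collector and to an observability vendor.
extensions:
  jaegerremotesampling:            # serve sampling strategies to SDKs,
    source:                        # fetched from the Jaeger collector
      remote:
        endpoint: jaeger-collector:14250

receivers:
  jaeger:                          # accepts legacy Jaeger wire formats
    protocols:
      grpc:
      thrift_http:
      thrift_compact:

processors:
  batch:                           # batch spans before export
  attributes/scrub:                # example PII filtering: drop an attribute
    actions:
      - key: user.email
        action: delete

exporters:
  jaeger:                          # the Jaeger collector that owns storage
    endpoint: jaeger-collector:14250
    tls:
      insecure: true
  otlp/vendor:                     # the same data, sent to a vendor over OTLP
    endpoint: vendor.example.com:4317

service:
  extensions: [jaegerremotesampling]
  pipelines:
    traces:
      receivers: [jaeger]
      processors: [batch, attributes/scrub]
      exporters: [jaeger, otlp/vendor]
```

The single `traces` pipeline with two exporters is the fan-out described earlier: one stream of spans, processed once, delivered to multiple systems.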
I highly encourage you to take a look at this query API if you want to query OpenTelemetry data from the Jaeger platform. Okay, thank you very much, and this is everything from the Jaeger and OpenTelemetry integrations part. Cool, thanks, Pavol. Hopefully he's listening, but he has one other section, so you can hold the applause until he's actually done. Pavol's done a lot of work, and I'll also talk about where things are going between Jaeger and OpenTelemetry, because there's definitely overlap. For those that are really into tracing: the sampling implementation that Jaeger allows for is head-based sampling, but it can be dynamic. It was built at Uber specifically for them to change instrumentation dynamically based on load or other characteristics, but it's predetermined at the beginning of the transaction, not at the end like a tail-based sampler. There are actually better sampling methods in OpenTelemetry, but that's a whole other talk. So I wanted to talk a little bit about Jaeger and Prometheus and the new monitoring tab that's part of Jaeger. We're always refining it, but it's been out for many months now, so if you grab the newest version of Jaeger, it's in there. What it does, basically, is move Jaeger from a distributed tracing or debugging UI into a bit more of a monitoring UI, and it allows you to actually see, operationally, what's happening with your service. APM is typically a combination of tracing and metrics together, so that you can understand when your service is starting to show errors going up, or latency going up, or some other characteristic showing that it's unhealthy or having an issue. This opens up use cases for monitoring, alerting, and things like capacity planning as well; whether we're going to implement alerting in Jaeger is still TBD, we'll see. There's also a new capability that we built; aggregated trace metrics is the old name that we called it.
It's called service performance monitoring, SPM, in the Jaeger documentation now; we renamed it a little while ago. The idea is: how do we integrate these two things? Instead of building metrics into Jaeger, we said everyone's using Prometheus, so let's make this part of the way people already work. Once it's in Prometheus, not only can you visualize it in Jaeger, but you can use whatever front end you're used to, probably Grafana, for the UI. The thing that we tried to solve is how to generate the metrics from the traces, and this is where OTel comes in. What we built is a processor: when Pavol was talking about the pipeline, your trace data comes in from the application, and then it flows through a processor that we built, which is called the span metrics processor, and I'll have a link to that. The idea is that you export the metrics from OpenTelemetry into any format that you want; even if it's not a Prometheus back end, you can export it, but in Jaeger we only support Prometheus-based back ends for the query side. In our case, at my company, we use M3DB, but if you're using Thanos or Cortex or VictoriaMetrics or whatever kind of Prometheus-compatible back end, as long as it supports PromQL and remote read,
it'll work seamlessly. The idea is that we derive the metrics from the traces automatically, and then we're able to store and visualize them. So I'm going to show you how this works in configuration. This is an OpenTelemetry Collector sample config, and basically in here we're talking about that span metrics processor, up at the top. You can define what kind of buckets you want for the histograms: depending on how you want to segment the data and how you want to store it, that's totally configurable based on what makes sense for you, and these are the default histogram buckets that we've set up. Down towards the bottom is the pipeline, where you would listen for Jaeger traces, in this case, process them to create the metrics, and then define the exporter. Down here at the bottom we send it to Prometheus, but you can send it to whatever back end you want. The presentation PDF is in the scheduling app, so you can always download it, and the link in it will work; it goes to the collector code base, and this is part of the contrib collector distribution, if you're using OpenTelemetry. You can visualize it a couple of different ways; in this case, as an example, we visualized it inside of Grafana, just to show you how you can use the histograms. Then inside Jaeger we created a new service called the metrics query service, and that will query any back end that understands the PromQL query language. Today it supports, you know, whatever kind of Prometheus-compatible back end.
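As a minimal sketch of the collector side of this setup, assuming the contrib distribution (the bucket values and endpoints are illustrative, and the spanmetrics fields have shifted between collector versions):

```yaml
# Sketch of the SPM pipeline: traces arrive, the spanmetrics processor
# derives request/error/duration metrics, and a Prometheus exporter
# serves them for scraping while the spans continue on to Jaeger.
receivers:
  jaeger:
    protocols:
      thrift_http:
  otlp/dummy:                      # unused; a metrics pipeline needs a receiver
    protocols:
      grpc:
        endpoint: localhost:12345

processors:
  spanmetrics:
    metrics_exporter: prometheus   # which exporter gets the derived metrics
    latency_histogram_buckets: [2ms, 6ms, 10ms, 100ms, 250ms]

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889         # scraped by your Prometheus server
  jaeger:
    endpoint: jaeger-collector:14250
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [jaeger]
      processors: [spanmetrics]
      exporters: [jaeger]
    metrics/spanmetrics:
      receivers: [otlp/dummy]
      exporters: [prometheus]
```

On the Jaeger query side, the metrics back end is then selected with settings along the lines of `METRICS_STORAGE_TYPE=prometheus` and `PROMETHEUS_SERVER_URL`, as described in the SPM documentation, so the Monitor tab has data to show.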
I put a few examples up, and others can add more. So if you use, let's say, InfluxDB or a proprietary metrics solution from a vendor, you could actually modify the metrics query service for the vendor or solution that you're using, and it would still work in the Jaeger UI. So in the monitoring tab: there's basically a new tab that says Monitor, and in it we're visualizing the top services, plus a few out-of-the-box columns that we've defined. It's relatively basic, but at least it starts to get people thinking about how you can operationalize trace data, which not a lot of open source tools can do, so this makes it a bit more useful. And we hear great things from users; a lot of people have actually taken that OpenTelemetry piece that we built and are using it for all kinds of other use cases, because there's a lot of useful data when you create metrics from traces. Cool, and I'm going to pass it back to Pavol to talk about the operator. He's done a ton of work on this over the last few years, and it's definitely one of the main ways that people are deploying Jaeger and other parts of the Jaeger ecosystem. So Pavol's going to explain a short bit about the operator. Hello everyone, it's Pavol again, and in this section I will talk about the Jaeger Kubernetes operator. The Jaeger operator is essentially a Kubernetes operator that you can deploy in your cluster, and it will take care of Jaeger configuration and installation on your behalf. If you are interested in the operator, you can find the source code at jaegertracing/jaeger-operator, and there is good documentation on jaegertracing.io as well. First of all, why should you be using the operator instead of plain Kubernetes manifest files? The answer is that the operator is more sophisticated, and smarter about how the Jaeger deployment should look, and probably the biggest advantage of the
operator is that it will help you with day-two activities like upgrades, scaling, and monitoring of Jaeger deployments. The operator also automatically recognizes what you have installed in your Kubernetes cluster, as well as which distribution of Kubernetes you are using, and based on that it will unlock features that are specific to that API or platform. If for some reason you cannot use the operator, you can use the Jaeger operator as a binary to generate plain Kubernetes manifest files from a given CR. Which brings us to the CR. I'm pretty sure that almost everybody here is familiar with the term custom resource definition, and on this slide we see the custom resource for Jaeger. It is essentially a YAML file where you describe how the Jaeger deployment, or Jaeger cluster, should look. We will take a look at parts of this resource to explain the operator's features, and the first part we will explain is the strategy. There are three strategies right now. The first one is allInOne: with the allInOne strategy, the Jaeger operator will deploy Jaeger as a single binary, and the single binary will then talk to the persistent storage; you can even skip the persistent storage if you use the in-memory option. The second strategy is production: the production strategy essentially splits the Jaeger collector and query into two separate deployments that you can scale independently. If you hit scalability limits with that deployment strategy, you can switch to streaming, which essentially decouples the Jaeger collector from the persistent storage by putting Kafka in between, and this is by far the most scalable Jaeger deployment that you can achieve with the Jaeger operator. So now let's take a look at how we configure the storage in the CR. You can configure the storage under the spec.storage node, and there are two most
important configuration fields. The first one is type, where you can define which storage type you want to use: there is in-memory, Elasticsearch, Cassandra, Badger, even a gRPC plugin, and Kafka. The second one is options, which is loosely typed, and here we put the storage-related flags for the given storage that we want to use. You may ask how to find out which flags Jaeger supports. There are two ways that I can think of: you can either go to our documentation on jaegertracing.io, or you can run the Jaeger collector or query Docker images or binaries with SPAN_STORAGE_TYPE set to your storage of choice and pass the --help flag, and it will print all the supported flags. So for example, in this CR we are configuring the collector options, and under the options node we can put any of the supported collector options. Okay, let's take a look at another Jaeger operator feature, this one related to the Jaeger agent. The operator allows you to inject the Jaeger agent into your workloads. For this, in the CR you set the agent strategy to sidecar, which is the default, so you can even skip it. Then, if you want to inject the sidecar, you provide an annotation on the deployment: with the value true, the operator will choose the right Jaeger instance where the data will be going; you can set it to false to disable injection; or you can specify a Jaeger instance name, and then the Jaeger instance with that name will be used to send data to. The other agent strategy is daemonset; in this strategy the Jaeger operator will deploy the agent as a DaemonSet on every Kubernetes node in your cluster. Last but not least, the Jaeger operator integrates with two other operators.
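Before moving on to those integrations: pulling the pieces above together, here is an illustrative CR plus a workload opting into sidecar injection. This is a sketch only; the instance names, the Elasticsearch URL, and the container image are placeholders.

```yaml
# Illustrative Jaeger CR: production strategy, Elasticsearch storage with
# storage-related flags under options, and the (default) sidecar agent.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  strategy: production
  agent:
    strategy: sidecar                  # default; daemonset is the alternative
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://elasticsearch.default.svc:9200
        index-prefix: my-jaeger
---
# A workload opting in to agent injection via the annotation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    sidecar.jaegertracing.io/inject: "my-jaeger"   # or "true" / "false"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:latest
```

Setting the annotation to a specific Jaeger instance name, rather than "true", is useful when more than one Jaeger instance lives in the cluster.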
The first one is Kafka, which comes from the Strimzi operator, and the second one is Elasticsearch, which comes from the OpenShift cluster logging operator. If you have, let's say, the Elasticsearch operator installed in your cluster, the Jaeger operator will recognize that, and it is able to provision an Elasticsearch instance for you when you are creating a new Jaeger instance, and the same for Kafka. The second integration is monitoring of the Jaeger operator itself: the operator is instrumented with Prometheus for metrics and OpenTelemetry for traces, so it emits telemetry data that will give you visibility into how the Jaeger operator behaves and what it does. So thank you very much, and this is everything from the Jaeger operator part. So thank you for that, Pavol; appreciate it. Cool, and so I did want to touch on the roadmap. I know I'm running up on time, but this is quite brief. I mentioned adaptive sampling: there's this new sampling type that we're still working on, but we're making some good progress, and it allows for dynamic sampling control. This is something that came out recently in OpenTelemetry as well, which Pavol touched on. The service performance monitoring, which I touched on, is also a new feature that's key. And then we've actually decided as a project to really move towards OpenTelemetry; we've gone through deprecation of the Jaeger clients and the SDK. That leads me to where we're going from here, which obviously has a lot to do with OpenTelemetry in the second part. We're doing some work on the dependency graphs. Those of you that are using Jaeger know there are kind of three different dependency graphs in Jaeger, and they're all kind of mediocre, or let's say not that useful. So we're actually going to be normalizing these together and really creating a nice service topology view that's interactive and has metrics overlaid on it. So we're going to really
So we're gonna really You know improve the way that the that the service graph the topology view looks and what it does and We also are thinking about moving the calculation of that dependency graph Which today actually requires that you deploy spark or Kafka streams Potentially into the open telemetry collector. So there's some work going on there Of how to calculate these topology graphs and we're hoping we can do that in open telemetry and reduce a component in the Yeager architecture and then Secondarily, which is the bigger one is we're actually moving and Pavel's actually already built a POC of this But we're gonna basically implement the Yeager storage exporters inside open telemetry and Create a distribution of open telemetry. That's basically a Yeager collector So the history of Yeager is of open telemetry is that actually was a fork of the Yeager collector That then had the stuff built around it and now we're going to actually bring it back in and make Yeager collector a distribution and Then another piece where there's active work going on is native support for open telemetry line protocol So that'll allow you to write directly To open telemetry for many Anything that supports otlp and we can take it into Yeager natively, so it's definitely more coming And I would at offer questions, but I don't know if I have time for it And I'm gonna put up a few links. Do I have time to take a couple questions? I think I do so Does anyone have a question and I will bring a mic over to you? Hopefully this works. Yeah Yes, a question about the service performance monitoring would it also expose those metrics as Prometheus metrics to be scraped instead of being forwarded Just for simplicity because it's calculated inside open telemetry I don't believe you can scrape a collector, but maybe you can so if you can it would work the same way It's working another question Thanks, is that a strategy for handling long-running background tasks as part of a trace? 
How do you handle that, usually? You can do it, but the problem is that sometimes it won't visualize properly, so it's actually more of a UI problem, not a problem with the actual storage of long-running traces. Part of it is that the UI starts to look weird when you have something running for a week: it's going to be these little teeny things happening. So the UI is not really built to visualize traces that run over days or many hours, but the data in the back-end storage is there, so you could write your own visualization and use it, totally. So there's no plan to add some kind of linking between traces? It is in the data, it's just not in the UI, so there's no plan to make the UI work with that. Okay. If you're using something like OpenSearch or Elasticsearch, you could do something in Kibana or OpenSearch Dashboards to visualize it. And I know the OpenSearch team, which is open source, is building some trace analytics capabilities that are going to be compatible with Jaeger, which might actually solve that. So I would check out OpenSearch: they have a trace analytics plugin that might do a good job, and you can use it together with Jaeger. Soon, not quite yet; right now there are two formats in the storage, but they are going to support the Jaeger format, it's part of the plan there. So yeah, sure, thanks. I guess one more until they kick me out; I'm going to keep taking questions. So: I'm working on a rather
unusual use case. I'm instrumenting a CLI app that has gRPC plugins, and I want to visualize the interaction between the CLI tool and its plugins. What would be the minimal set of OpenTelemetry and Jaeger components I need to ask the users to have available when they run this thing, so they can ship me spans for debugging, for example? So, when you build your tool, which is a CLI tool, you would instrument it with an SDK in the tool, and that would emit the span data. Yeah, but the user has to have something to collect it; what would be the least amount? That's going to be tricky, because the user might not be connected to the network. Essentially, right; but could they collect locally and then ship me some sort of file? You could do something like that. When Pavol was explaining the all-in-one: there's actually a back end called Badger, essentially an open source embedded database that Jaeger is compatible with, and it's stored in a file. So you could potentially store the spans in a Badger database locally, and then they can send you the file, and you should be able to visualize it with the Jaeger query service and the UI. So that might work for you, with Jaeger. Sure, yeah; sounds like a fun project. I guess I'll take one more; do you mind passing this back? Thanks. So it kind of depends on your back end, and that's more of an OpenTelemetry thing. We're not going to visualize exemplars inside the Jaeger UI, but you might get those inside the Grafana UI, for example. So we're not planning on implementing it in the Jaeger UI, but it's supported as long as the exporter you're using supports it within OpenTelemetry; we're just not going to visualize it. And I think, yes: there's a hyperlink in that monitoring tab that allows you to move from the service
So it's similar to an exemplar, but it doesn't use it from the metric data It's inside the Yeager UI because it knows that those things are linked together I Think that's all the time I have for but I'll be up here or tweet message And I'll try to answer the Q&A that was sent in online Through our slack channel, but thanks everyone for joining