Hello everyone, glad to be here, and amazing to see a full house. There are a few more seats at the front, so if you can fit some more people there, there are definitely more seats. Amazing to see everyone here in a full house; shame we can't see the people on the virtual platform. Really excited to be here at KubeCon. Today I'd like to talk to you about OpenTelemetry, a project that I'm very passionate about: the vision, the reality, and how to get started with it.

Let's start with a question: how many tools does a company use, on average, to collect telemetry data from its systems? Think about logs, metrics, traces; think about your front-end app, your back-end app, your infrastructure, everything. How many tools? Ten? Any other guesses? Well, recent surveys show that companies use, on average, between five and ten different tools to collect telemetry data from their systems. Five to ten. You can reduce it to one unified, standard platform. That's the story of OpenTelemetry.

I'm Dotan Horvits, principal developer advocate at Logz.io. At Logz.io we provide a cloud-native observability platform based on popular open source tools such as Prometheus, Jaeger, and obviously OpenTelemetry, among other projects. I'm an advocate of open source and communities in general, and of the CNCF in particular; that's why it's very exciting to be here on the KubeCon stage. I co-organize the local CNCF chapter in Tel Aviv, I co-organize DevOpsDays, and I have a podcast, OpenObservability Talks, on open source, DevOps, and observability. So if you're a podcast fan, do check it out.
In general, you can find me everywhere as @horovits.

Let's talk about observability. As you all know, observability is essentially the ability to understand the state of our system based on the telemetry data that it emits. The vision talks about unified observability across different signal types, classically the three pillars (logs, metrics, and traces), but also across different sources. So again, think about your front-end code, your back-end code, your Kafka, Redis, and other open source projects, your cloud services: essentially fusing all of these together to gain insights and observability into your system. That's the vision.

The reality, however, is much more fragmented, and the reason is exactly what we asked at the beginning. The reason is that we use so many different tools for our observability, and each tool and each vendor has its own API and SDK for instrumenting different programming languages, its own daemon, collector, or agent for collecting, aggregating, and processing the data, and then its own data model and protocol for transmitting it and sending it over. All of that is not only an operational headache, running so many different tools. More importantly, it creates tight coupling between the telemetry collection side and the storage and analytics back end, and most importantly, it makes it very, very difficult to correlate the data and gain that unified observability across these data silos. That's what OpenTelemetry comes to solve, in a nutshell.

OpenTelemetry, or OTel as its nickname goes (so if I say OTel, do forgive me, that's what the group usually calls it internally), is an observability framework for generating, capturing, and collecting telemetry data from cloud-native systems, across logs, metrics, and traces. One framework to rule them all, if you'd like. It's a project under the CNCF, an incubating project, and it is in essence a merge of the OpenTracing and OpenCensus projects of the CNCF.
So if you're familiar with those, that's the future path you'll need to migrate to, which makes this talk especially interesting for you. I'm very happy to be here, especially on the KubeCon stage, to say that OpenTelemetry has been vastly adopted across the industry. You see all the cloud vendors, all the monitoring and observability solutions; everyone has been aligning behind OpenTelemetry, which is fantastic for us as a community and for the CNCF. It's also the second most active project in the CNCF today. In fact, it's the second most active after Kubernetes itself, which is astounding if you ask me. That shows you how much excitement and how much activity goes on in this project. By the way, this is taken from the CNCF DevStats dashboards, so you can go check it out yourselves and slice and dice the data, but essentially it has been pretty consistent for a long time already. This is important because this is how projects become de facto standards: when you see the industry aligning behind them, and when you see they are very active and moving. So I'm really excited, and I truly believe this project is going to converge the industry, and I hope I've managed to convince you that it's at least interesting.

So let's dive deeper into what OpenTelemetry gives us. In essence, OpenTelemetry provides us with the APIs, SDKs, and tools for generating telemetry data (logs, metrics, traces) from our own applications; that's the bluish part on the left-hand side. Then a unified way of collecting and processing that data across different sources, both our apps and other types of sources, infrastructure and others; that's the green part in the middle. And then a standard way of exposing and transmitting that telemetry data; that's the orange part, sending it off to whichever back-
end you choose. OpenTelemetry does not take any stand on the back end; that's up to you, it's your choice. OpenTelemetry has lots of integrations, as we'll see, but the back end is not part of OpenTelemetry's scope. So that's OpenTelemetry in a nutshell, and don't worry, I'm going to explain each and every one of these components in greater detail now.

Before going into the components, I'd like to talk a bit about the OpenTelemetry specification, which is not a component in itself, but it governs all the OpenTelemetry implementations out there. It essentially provides a specification describing cross-language requirements and expectations across the implementations. It defines the API spec, the SDK spec, and the data spec, things such as semantic conventions, annotations, and so on, across logs, metrics, and traces. Though as end users you probably won't interact directly with the specification on a regular basis as much as with the other components, it is important to understand it, because it solves the exact problem we talked about before: the fragmentation of each tool, each vendor, and each programming language having its own APIs, SDKs, formats, and so on. Moreover, having all of that on one platform then allows correlation between the data. It's not just one spec; it's one that can correlate across signals and across sources. That's very important to be aware of.

The first component that most people encounter is the client libraries. OpenTelemetry provides one API and one SDK per language, with which you can instrument and extract logs, metrics, and traces from your application, of course adhering to the specification we talked about before. It also provides integrations with popular libraries and frameworks.
These integrations cover RPC clients, storage clients, web frameworks, and so on, per language of course, so that they can extract more information from those frameworks, propagate context, and so on. There are also auto-instrumentation agents per language (depending on the language) that allow for low-code, or even no-code, instrumentation, so that you don't need to modify your code to start extracting at least table-stakes metrics and data.

Essentially, OpenTelemetry's mission is to cover the full range between fully manual instrumentation and fully automated instrumentation, and anything in between. It's important to say "anything in between" because you can combine them; you don't need to decide either-or. You can start with auto-instrumentation to get some table stakes and a lower barrier to entry, and then add custom instrumentation on top for fine-grained zooming-in on the areas you find important enough. It also accommodates different situations: if I have something where I can't modify the code and need black-box monitoring (because I can't modify it, I'm not allowed to, or it's third party), then auto-instrumentation would be the only option; it would be a must-have. So you get the full range of instrumentation capabilities. That's the OpenTelemetry client libraries.

Next up, I would like to talk about the OpenTelemetry Collector, which can collect telemetry both from the SDKs we talked about before and from other sources. It can collect from infrastructure components, open source components, and cloud services. Say you have a Kafka running and you want to send its data, or you have a MySQL you want to monitor.
Or AWS, Azure, or Google cloud services, whatever it is: you can collect it, and the collector can support that, process it, aggregate it, and then send it to whichever back end you choose. As you can see, it's built as a typical data-processing pipeline: it has receivers in multiple protocols, it has processors, and then it has exporters in multiple protocols. Just as an example: if my code emits traces in Jaeger format, I just plug in the Jaeger receiver; if I want metrics from Kafka, I plug in the Kafka receiver, and so on and so forth. That's the receivers. Then there are many types of processors to do different kinds of aggregation and processing (batching, filtering, sampling, and so on), and I can also chain different processors to create more elaborate logic. And then, based on the back-end analytics tool I use (and it doesn't have to be a single tool, I can use several), I plug in the relevant exporters. If I want to send to AWS X-Ray, I have an exporter for that; if I want to send to Google Pub/Sub, I plug that in; and the same for sending to Kafka, to Prometheus, or to my company Logz.io as an exporter. Virtually any vendor and any tool is supported with exporters. So that's the collector.

The last component I would like to talk about is the OpenTelemetry Protocol, or OTLP. OTLP is essentially a general-purpose telemetry data delivery protocol. It can be used, as you can see here, to send between the OTel SDK and the OTel Collector; it can be used to send between the collector and the back-end analytics tool, if the tool supports it; and it can be used between intermediary nodes, such as two different collectors, or for any other purpose.
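Going back to the collector for a moment: the receiver, processor, and exporter pieces are wired together in the collector's configuration file. The following is a rough sketch, not a production config. The endpoints are placeholders, and the exact component names and fields should be checked against the collector documentation for your version.

```yaml
receivers:
  otlp:                      # ingest OTLP from SDKs over gRPC and HTTP
    protocols:
      grpc:
      http:
  jaeger:                    # example: accept traces emitted in Jaeger format
    protocols:
      grpc:

processors:
  batch:                     # batch telemetry before export

exporters:
  otlp:
    endpoint: backend.example.com:4317   # placeholder back-end endpoint
  prometheus:
    endpoint: 0.0.0.0:8889               # expose metrics for Prometheus to scrape

service:
  pipelines:
    traces:
      receivers: [otlp, jaeger]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
```

The same pattern extends naturally: chain more processors (for example, filtering before batching) or list several exporters per pipeline to fan out to multiple back ends.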
It's really a general-purpose protocol. It's client-server, request-response, and as you can see here at the bottom, it's based on gRPC and HTTP/1.1 for the transport, so you have OTLP over HTTP and OTLP over gRPC. It currently supports the binary protobuf encoding, and there are plans in the works to also support JSON encoding over HTTP, so you'll have JSON over OTLP over HTTP. It's also agnostic: you can actually take the proto file and generate your own gRPC clients yourselves, if you so choose. That's part of the freedom you have when you use this; you're not locked into even using the implementations that OpenTelemetry provides. It's part of the mindset of OpenTelemetry.

Another important point I would like to make about the mindset is that OpenTelemetry as a project does not mandate that you use the OTLP protocol. As I've just shown you with the collector, the collector supports many protocols for the receivers on the ingest side and many protocols for the exporters on the egress side, so you're not bound to it. However, as a holistic framework aiming to provide a unified way of generating and collecting telemetry data, the purpose is to get the industry onto one unified protocol, which then also has the benefit of enabling correlation: if we send logs, metrics, and traces together with a unified data model, we can also correlate them and align the semantic conventions, and so on. That's the mindset and that's the goal, and the same goes, by the way, for the other components as well.
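To give a feel for the data model that OTLP carries, here is a heavily simplified, hand-written fragment shaped like the OTLP/JSON encoding of a single span (resource, then scope, then spans). The field names follow the JSON mapping of the protobuf definitions, but the IDs and values are dummy data, and this is an illustration of the structure, not a spec-complete payload.

```python
import json

# A simplified OTLP-style trace payload: one resource, one scope, one span.
# All identifiers and timestamps below are dummy values for illustration.
payload = {
    "resourceSpans": [{
        "resource": {
            "attributes": [
                {"key": "service.name", "value": {"stringValue": "checkout-service"}}
            ]
        },
        "scopeSpans": [{
            "scope": {"name": "manual-instrumentation"},
            "spans": [{
                "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",
                "spanId": "051581bf3cb55c13",
                "name": "process-order",
                "kind": 2,  # SPAN_KIND_SERVER
                "startTimeUnixNano": "1544712660000000000",
                "endTimeUnixNano": "1544712661000000000",
            }],
        }],
    }]
}

# Over OTLP/HTTP such a payload is POSTed to the collector's trace ingest
# endpoint (protobuf-encoded today, with JSON encoding planned).
encoded = json.dumps(payload)
span_name = json.loads(encoded)["resourceSpans"][0]["scopeSpans"][0]["spans"][0]["name"]
```

Notice how `service.name` lives on the resource rather than on each span: that shared resource context is part of what makes cross-signal correlation possible.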
You can use the OTel SDK without the collector, for example, sending from the SDK directly to a back-end analytics tool. You can use the collector without an SDK, as in the example I gave of collecting from Kafka directly, and so on. So it's a loosely coupled, yet holistic, framework aiming to provide a standard, unified way of generating and collecting telemetry data. That's about OTLP, and maybe the mindset.

These are the main components I'll cover. There are other components and other elements (there's the OpenTelemetry Operator, for example, that allows you to install all of that easily on Kubernetes, among others), but for the sake of this discussion I will keep it here.

Now I would like to talk about the state of the OpenTelemetry project, and most importantly: is it GA? Can I use it in production? That's the most interesting question, right? So, show of hands: who thinks yes? Okay, and who thinks no? And who thinks it depends? Great. Like any interesting question, the answer is: it depends. The reason it depends is that OpenTelemetry is in fact not a monolithic project but an aggregate of multiple groups, each working on a different part of this huge endeavor. You have the metrics specification group, and the tracing one, and the logging one, and then you have the Java one, the .NET one, the Go one, and many more. If there are maintainers here, they can probably name a dozen more, so I'm oversimplifying, but you get the drift. Each SIG (special interest group), each working group, and each such component has its own development lifecycle, which means that different parts of OpenTelemetry may be in different states of the maturity lifecycle, which in CNCF terms is: draft,
experimental, stable, and deprecated. Just for those who are newer (we have something like 67 percent first-timers here this year at KubeCon, which is very exciting), and to align these with the common terms in the industry: stable would be GA, generally available, what you'd be looking for to run in production, and it comes with guarantees like backwards compatibility and so on; experimental would be beta, something you can start a proof of concept on. So just to align terms, and again, apologies to the maintainers, I will stick to the common terms to make sure everyone understands the state.

So, okay, we understand that it's complicated. Thank you, Dotan. But still, what's the state of OpenTelemetry? For that, I'd like to break it down by signal type. The first signal, and the most mature one, is traces. OpenTelemetry has been generally available for traces since last year, which means that the tracing API, SDK, and protocol specifications are stable, the collector is stable, and we have many client library (SDK) implementations at version 1.0 or above. Version 1.0 is when the tracing implementation is complete, and as you can see here, we have Java, Go, .NET, Python, C++, JavaScript, Ruby, Erlang, and Swift, with more in the works; it keeps advancing so rapidly that it may even have advanced since the last time I made this slide. Most importantly, as I said, GA means it comes with guarantees for long-term support, backwards compatibility, and dependency isolation: what you'd expect in order to run it in production.

Next up is metrics, and that's actually one of the most exciting pieces of news, at least on the observability front, from this KubeCon. I'm happy to say, for those who missed the news, that metrics has reached release candidate; the announcement was just made at KubeCon. Release candidate, again to align terms, means it's practically GA: the teams are now collecting feedback from users, and if nothing major or critical comes up, it
should be turned GA (the same versions will be turned GA) within a matter of a week or a couple of weeks; that's the time frame. So the API, SDK, and protocol specifications are stable. The API and SDK specifications are already implemented in Java, .NET, and Python as release candidates, JavaScript is just a week or so away, and many more languages will join in the coming month or two. In terms of the collector, the collector supports metric pipelines. And it's very important to say, because, as you also heard in the previous talk here in this hall, Prometheus is a de facto standard on the metrics side, that there is Prometheus support that has been worked on in collaboration with the Prometheus community. I think it's a great example of collaboration under the CNCF, so applause to the teams involved. It means that the SDKs have exporters in Prometheus format, the collector has receivers and exporters in Prometheus formats, and the OTLP-to-Prometheus specification and data model are aligned. So that's all there.

The least advanced signal, I guess, is logs, which are still experimental; we hope to have them GA this year. With logging, it's important to understand that logging is the most long-standing signal. Everyone has logging, every system has logging, it's been there for ages, so we obviously can't just ignore that. The first focus is therefore to align with and support existing logging sources and logging systems, and for that there is also work around log appenders, which are under development in many languages, so that even existing logs, which are typically text-based and unstructured,
maybe even file-based, can still be augmented with additional data, such as a trace ID and other important data, to allow correlation, and obviously sent over OTLP. So even for existing logging sources, the goal is to be able to ingest them, model them over OTLP, and send them alongside metrics and traces. That's the first priority. But then, following up, as a holistic framework (as we said in the mission statement), there is work to build a new, strongly typed, machine-readable format for logs as well. So it's not discarded; it's definitely the end goal, to try and converge the industry around that too. If you look at the components, you'll see, for example, that the protocol is stable and can send logs. That's actually also relatively recent news: less than a week ago, I think, it was announced that the OTLP protocol for logs is already stable and supports the data model that was agreed on a month or two ago, and the SDKs are experimental and can already transmit over OTLP. One thing that is important: the API, for example, is left for last; it's still draft, because, again, inventing new APIs hasn't been prioritized there. However, there is now a shift of focus, or an added focus, to the specification side: getting a specification.
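The appender idea, enriching existing text-based logs with trace context so they can be correlated with traces, can be sketched with nothing but the standard library. This is a hand-rolled illustration of the concept, not the OpenTelemetry logging bridge itself, and the trace ID below is a dummy value standing in for the active span's ID.

```python
import logging

# A filter that stamps every log record with the current trace context.
# In a real OTel setup the ID would come from the active span; here it's faked.
class TraceContextFilter(logging.Filter):
    def __init__(self, trace_id: str):
        super().__init__()
        self.trace_id = trace_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self.trace_id  # attach correlation data to the record
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s"))

logger = logging.getLogger("legacy-app")
logger.addFilter(TraceContextFilter("5b8aa5a2d2c872e8321cf37308d69df2"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order processed")  # the line now carries a trace ID for correlation
```

Once every log line carries the trace ID, a back end can join plain-text logs to the traces they belong to, which is exactly what the appender work aims to enable for existing logging systems.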
What should a well-structured, strongly typed log look like? For that, there is also exciting news from recent months: the collaboration with ECS, the Elastic Common Schema, for those who know it. It's essentially a collaboration between the open source communities to take all the aggregated knowledge and work that has been done around the Elastic Common Schema, merge it together with OpenTelemetry, and join forces around that. So really exciting news there as well.

That's about the state, and in the bit of time I have left, I first want to applaud the main achievement here. Let's give a big round of applause to everyone: a lot of hard work went into this, and we have some maintainers here in the audience, so we all owe them a great deal of gratitude for reaching this very important milestone, and also to those who worked on logging, as I said, for getting the data model and protocol stable. For those interested in more about the roadmap and the future path, it's beyond the scope of this short talk, but there was a very interesting maintainers-track talk by the OpenTelemetry community sharing what's up next: making the operational side easier, maybe even adding more signals beyond logs, metrics, and traces, with discussions around continuous profiling. If you are interested, do check out the recording of that session; it's fascinating.

For the last bit of the talk, I would like to discuss how to get started with OpenTelemetry, and I'd like to offer my bit of advice. You should start by knowing your stack. You need to figure out four basic questions first: which programming languages does your organization use, especially if you're a polyglot organization?
Front end and back end. Also: which frameworks do you use with them? For example, say you use Java with Spring on the back end, and Node.js with Hapi and Express on the front end. This is important because it will help you determine which SDKs and agents you can, and need to, use for your application. Then: which signals do you intend to collect from your system (logs, metrics, traces), and in which formats? That's particularly important if it's a brownfield project: you may already have components out there emitting telemetry, say traces in Zipkin format or whatever, and you need to accommodate that, which will determine the receivers you will use in the OpenTelemetry Collector. And finally: which analytics tools are you going to send to in the back end, which will of course determine the exporters you may want to use.

Once you've analyzed and figured out your stack, the vertical stack that is relevant for you, just go and check the status of the relevant components and follow the guides. For that, there's a very useful page that was set up as part of moving from the sandbox to incubation under the CNCF: opentelemetry.io/status. That's a very good starting point for the high-level status overview. There are many other useful resources on that site, like the docs, obviously, and there's also a relatively new registry where you can actually search for your stack. Say you realize you need .NET: you search ".NET" and it will take you to the specific links in the GitHub repos that pertain to it, so it's a really helpful way to navigate the massive GitHub organization. I'd also like to invite you to check out the beginner's guide to OpenTelemetry that I wrote. It doesn't replace anything I've shown; it's just maybe a higher-level, beginner's hello world.
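On the "follow the guides" point, it's worth knowing that the SDK specification defines standard environment variables (such as OTEL_SERVICE_NAME and OTEL_EXPORTER_OTLP_ENDPOINT), so the stack you've mapped out can usually be wired up without code changes. Here is a small stdlib-only sketch of reading them, just to show the convention; real SDKs do this for you, and the fallback values below are illustrative placeholders rather than authoritative defaults.

```python
import os

# Standard OTel SDK environment variables (defined by the specification).
# The fallbacks here are illustrative; consult the spec for the real defaults.
def read_otel_config() -> dict:
    return {
        "service_name": os.environ.get("OTEL_SERVICE_NAME", "unknown-service"),
        "endpoint": os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"),
        "traces_exporter": os.environ.get("OTEL_TRACES_EXPORTER", "otlp"),
    }

# As you might set it in a container spec or Deployment manifest:
os.environ["OTEL_SERVICE_NAME"] = "checkout-service"
config = read_otel_config()
print(config["service_name"])
```

Because these variables are part of the specification rather than any one SDK, the same deployment manifest works whether the service behind it is Java, Go, Python, or anything else on your list.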
I created a short link so that it's easy to remember: bit.ly/otel-kubecon. There you have the explanation I gave, in greater detail, about the different components; it has sub-guides for different programming languages (again, just a hello world, not a deep dive), and also the links for the deep dive into the massive material within the OpenTelemetry community. I hope you find it useful. By the way, if you have feedback on the guide, or on this talk, or any questions, we're now going to have a Q&A, so don't go; there's time for you to ask. And for the people on the virtual platform, feel free to reach out to me afterwards as well, at @horovits. With that, thank you very much, and don't go.

Thank you. If you just raise your hands, we'll pass the microphone so you can ask questions, and for the virtual audience, please write in the Q&A box on the platform and we'll also take some questions from there. Any questions? Now is the time. We also have maintainers here; I think I've spotted a few familiar faces, so there may be authorities far greater than me around if you have specific questions: how is it in Go, how is it in .NET, how is that one exporter doing? You have the right people hanging around here.

[Audience] Hi. Sorry, it's difficult to see with the lights. Are there any plans to support error tracking as well, something like Sentry?

Error tracking is an interesting area. As I said, currently it's logs, metrics, and traces, and the next signal that is intended, or at least is now in the early phases of discussion, is continuous profiling, with things such as Parca, an open source project, if you're familiar with it, or others. Sentry actually doesn't work that way; it's based on, what do they call them, snapshots?
But signals that can be modeled as events, similar to logs, can be converted and then relayed over the same mechanism that is being built for logs. So it may be interesting to explore adapters that could transform that, but I don't know of any specific goal to adhere to those formats in particular. Again, if one of the maintainers wants to add any more detailed information, feel free to jump in. I hope I answered the question. Yeah, thanks.

[Moderator] We've got an online question with a few upvotes, which is: how do things like synthetic monitoring, real user monitoring, and application performance monitoring fit into OpenTelemetry?

That's an amazing question. I actually didn't know if I'd have time for it in this talk, so it's great that we get it in the Q&A, because one piece of work that has started in OpenTelemetry is a new working group dedicated to client instrumentation. Most of what you've seen is really back-end oriented. Of course, you can take the existing JavaScript SDKs and APIs and make them work on the client side as well, but the client side has its own constraints: you need the session ID propagated, and things like that, which haven't been modeled into the specification. So there is a new working group for client instrumentation that aims squarely at these use cases of instrumenting web pages, web apps, mobile apps, and so on. I think this would fit perfectly into both synthetic monitoring and real user monitoring, the things that look at your system from the outside as a black box. So if you are interested in that, do get involved; check out this new working group. It's really in the early phases, so you can actually get involved and influence it.

Do you have more questions from the virtual audience?
[Moderator] Another one here is: if we already use a Prometheus exporter, is there a convenient migration path to the OpenTelemetry Prometheus exporter from the application?

So, as I said, the OpenTelemetry Collector supports the Prometheus format, so you can actually collect it with the OTel Collector and relay it over OTLP; there has been collaboration to support Prometheus on the collection side. And as I also said, you can use Prometheus as a back end, so you can use a remote-write exporter to send the data to Prometheus as a back end. So you're supported both ways.

Any other questions, about the future, where we're heading, ideas? No? So, I'm Dotan Horvits, thank you very much for listening, and see you at KubeCon.