No test. Oh, there we go. All right. Sorry, everyone. So the next session is on OpenTelemetry. I want to provide an overview of what the project is and talk about some of the backwards compatibility for those of you coming from OpenTracing or OpenCensus. First, a quick introduction, maybe. See, all right. Cool. My name is Steve Flanders. I'm head of product and experience at a stealth startup called Omnition. We are in the observability space, based out of California in the United States. If you're interested in learning more about observability or what I do, I've provided some links here at the bottom.

All right, so some background. I'm going to start by talking about OpenTracing and OpenCensus. You may have heard of OpenTracing because it's a CNCF project. You may have heard of OpenCensus because it's very popular in Google Cloud Platform, GCP, or in Microsoft Azure. Both of them use OpenCensus today.

So let's provide some background here. When we talk about observability, or you might hear the word telemetry, there are really three different verticals. You may have heard about the three pillars of observability, which are basically tracing, metrics, and logs. But there are also a few different layers for each of those verticals. You have the instrumentation APIs: how you actually get tracing data, metrics, and logs. You have the implementation detail of how you're going to pull that data out and send it to the back end of your choice. You have the data infrastructure to support that and actually be able to query over it. And then, of course, the various different formats. As it turns out, this is a pretty complex problem, because on top of the verticals and layers there's also a polyglot of different languages that you could have written in, and you have to solve that problem for each of those languages.

So there were two projects that existed. One was called OpenTracing. The other was called OpenCensus. They were similar.
They were not the same. So OpenTracing came out first. I think it came out in about 2016. Basically, it solved one vertical, tracing. All it did was tracing. It had one layer, an API, just the API for the instrumentation aspect. It was loosely coupled, supported a ton of different languages, and was broadly adopted. It's already part of CNCF, so that shows that it was widely used.

On the flip side of the house you have OpenCensus. This came out in early 2018. It actually attacked both traces and metrics, so it solved two different verticals. Not only that, but it addressed both the API as well as the implementation. So it was the first project to provide an end-to-end solution to this problem space. It was more tightly coupled in terms of the frameworks and instrumentations and libraries that it supported. It did support a broad set of languages, five of them in beta, and it was broadly adopted.

Now, just looking at this slide, you can see that there's some amount of overlap. And these each had pros and cons. Let's talk about the cons. OpenTracing only addressed tracing. But I'm assuming most people use metrics and logs today, so you probably want more than just tracing. On the flip side of the house, OpenCensus was tightly coupled. Well, vendors specifically would prefer looser coupling so they can differentiate on the different analytics that they're providing. So these were kind of some problem areas in each of them. They of course each had strengths as well.

Now, from an adoption perspective, both had very healthy ecosystems. They were very active on GitHub.
There were a lot of contributions, and they were backed by a wide range of contributors. On the OpenCensus side, Omnition was one of the companies, with Google and Microsoft, to basically form OpenCensus and build the client libraries as well as the OpenCensus Service. The service was made up of an agent and a collector; that was basically part of the implementation side of the house that was missing from OpenTracing. And other providers and users were also getting actively involved in OpenCensus as well.

Sorry, one second. It sometimes cuts out. It'll come back. There we go.

So both projects were well adopted. Great. Well, kind of. The problem is that there was no clear winner. Both were doing very well. And that's great, except for the fact that as a user you might be asking yourself: well, do I use OpenTracing, or do I use OpenCensus, or do I use both? I'm confused as to what to do. This actually is very well represented in a Hadoop ticket. I have a link in the slides that you can go take a look at, where someone basically asked: do we use OpenTracing or do we use OpenCensus? It's been open for a few years and there's no answer. Not good, right? People don't want to have confusion. They want a single solution. Ideally they take one dependency and it just works. And at the end of the day, if we have an open standard, if we have an open API, if we have an open implementation, that's possible. But as long as we have two different solutions that are doing similar things with a little bit of overlap, that's never going to happen. This is a problem.

So to solve that problem, OpenTracing and OpenCensus are merging together, and the new project is called OpenTelemetry. So it's the best of both worlds.
It's not starting from scratch. Basically, we're taking lessons learned from both OpenTracing and OpenCensus. We're taking contributors and members from both of those communities and we're forming OpenTelemetry, so we can solve some of the pain points that I showed in the previous slide and come out with a single standard and a single solution that everyone can consume. This is beneficial for end users because they now know what dependency to take. And it's beneficial for vendors because they now know the standard that needs to be followed. They don't have to write their own or come up with side solutions.

This is actually kind of unique, in my opinion. If you look at CNCF, there are a lot of competing products, or even outside, in open source communities in general. You don't typically see a merge like this. But I think this is extremely beneficial to the community. This is a case where, if I have to instrument my app for a bunch of different languages, I don't want to have to mix and mingle and decide what to go choose. I want to have one standard. So I'm glad that we could come together and form a single solution here.

All right, so OpenTelemetry overview. Going back to this slide: what is OpenTelemetry trying to solve? All of it. The whole thing. We want to be able to supply traces and metrics and logs. We want to give you the API and the end-to-end implementation. We want to handle the wire format. All of it. Because if that's done, and it's done well, then all you need is the analytics back end, and you'll be able to point to that or reconfigure to that without changing your code. It'll save software developers a ton of time. It'll make it possible to solve availability and performance issues with ease.

So let's walk through each of these. Instrumentation APIs: what's going to happen?
Well, the goal is to standardize around context propagation, but give you choice. So if you want to use a certain context propagation format, you can. The standard here is going to be W3C Trace Context. The first version should be coming out later this year. It's already in its final release, and OpenCensus as well as OpenTelemetry will have native support for W3C. It's likely to be the standard that you see going forward for trace context propagation, but it will support others as well. So if you're using Zipkin's B3, it'll support that as well. So it'll be easy for you to configure the context propagation format that you care about.

While the initial focus will be around tracing and metrics, the goal is to extend this to logs as well. If you look at the OpenCensus project, it already started to tackle some of the logs, at least in the Java client library. So what it provides today is the ability to add trace ID, span ID, and sampled information as tags to your log message. And why that's powerful is because now your logs will have context and correlation. I can say: for this given trace and all of the calls that it made, show me all the logs. Or: from a given log, show me the entire trace for that log. That's very powerful, because one of the problems with metrics and logs is that in distributed, microservices-based architectures they're missing context and correlation. They are symptoms. They are signals. They are not root cause.
They cannot provide problem isolation like they could in the monolithic world. So with distributed tracing as the foundation, you can actually enrich metrics and logs and make them more powerful.

And so basically what will happen here is OpenTelemetry will provide a tracing and a metrics API. And then it will provide a client library implementation that you can leverage. And then, depending on what you're leveraging from an application or an RPC perspective, it'll go ahead and hook up and integrate in there for you automatically. The net result is you just take a dependency and it should just work. And if you're using a framework or library that's already instrumented, or if you have a service mesh like Istio, you can take advantage of that as well.

From an implementation standpoint, the goal is to provide one reference implementation for every language. The first version of OpenTelemetry, which will be coming out later this year (I'll talk about that in a minute), will focus on five primary languages. I'll try to get them right: Go, Node, PHP... no. Go, Node, Python, Java, and .NET. There we go, second time and I got it. But other client libraries will be supported going forward. SIGs, or special interest groups, are firing up right now if you're interested in contributing to them. In fact, I think I just saw one for C++ and C# that's just getting started. So if you're interested in another language, please get involved.
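To make the W3C Trace Context and log-correlation points above concrete, here is a minimal sketch in Python using only the standard library. It is not the OpenCensus or OpenTelemetry API. The names `parse_traceparent`, `TraceContextFilter`, and `make_logger` are hypothetical and invented for illustration, and the parser only handles the version `00` layout of the `traceparent` header (`version-traceid-parentid-flags`).

```python
import io
import logging
import re

# W3C Trace Context "traceparent" header, version 00 layout:
#   00-<32 hex trace id>-<16 hex parent/span id>-<2 hex flags>
_TRACEPARENT_RE = re.compile(
    r"^00-(?P<trace_id>[0-9a-f]{32})-(?P<span_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return a dict with trace_id, span_id and sampled, or None if invalid."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id = match.group("trace_id")
    span_id = match.group("span_id")
    # All-zero identifiers are not valid trace or span ids.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # The lowest bit of the flags byte is the "sampled" flag.
    sampled = bool(int(match.group("flags"), 16) & 0x01)
    return {"trace_id": trace_id, "span_id": span_id, "sampled": sampled}


class TraceContextFilter(logging.Filter):
    """Hypothetical helper: stamps every log record with the trace context."""

    def __init__(self, context):
        super().__init__()
        self.context = context

    def filter(self, record):
        record.trace_id = self.context["trace_id"]
        record.span_id = self.context["span_id"]
        record.sampled = self.context["sampled"]
        return True  # never drop the record, only enrich it


def make_logger(name, context, stream):
    """Build a logger whose messages carry trace_id, span_id and sampled."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s "
        "sampled=%(sampled)s %(message)s"
    ))
    handler.addFilter(TraceContextFilter(context))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    # Example header value in the version 00 layout.
    incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    ctx = parse_traceparent(incoming)
    out = io.StringIO()
    make_logger("checkout", ctx, out).info("payment authorized")
    print(out.getvalue(), end="")
```

With the IDs on every log line, a back end can answer both questions from the talk: all logs for a given trace, and the full trace for a given log line.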
I mentioned the W3C Trace Context propagation. And the goal is to have a wire format for tracing, metrics, and logs. From a tracing perspective, OpenCensus already has a wire format. OpenTelemetry will have one as well, based on the best practices that we've learned so far.

Data infrastructure, data collection if you will. So OpenCensus had something known as the OpenCensus Service. I mentioned that it had an agent and a collector. OpenTelemetry will have a similar piece of functionality that you can leverage if you want. The goal is that it will support all popular open source software automatically. So for example, for tracing that's Zipkin and Jaeger. From a metrics perspective that would be Prometheus and StatsD. So that'll just work out of the box. But it's a pluggable architecture, so commercial vendors can also provide what we call receivers and exporters into that architecture, and you can take advantage of that as well. Many of them are already available. You can of course write your own if you have your own custom implementation as well. The goal is to support both tracing and metrics to start, but logging will come. So there are questions about, like, Fluentd and what does that mean? We'll get there. We're not there yet, but those questions will be answered.

And then the agent and the collector provide a ton of additional benefits and feature functionality. One of the questions earlier at the Jaeger session was: I don't want to use an agent.
Well, the agent is actually very powerful. From a client instrumentation perspective, I can now point to localhost, which means I don't have to reconfigure my application ever. Localhost is always there, so I can deploy this application anywhere and I'll know that it gets to the agent. In addition, the agent provides buffering and retry capabilities and batching capabilities. Those are things you'd otherwise have to add to your client libraries in every language that you're running in. That's overhead for your applications. You probably don't want overhead for your application. So ideally you defer that to the agent, let the agent handle it locally, and then it can send to a collector. The collector can be used for sending this over a WAN, so if you have latency or high throughput of traffic, and you need to buffer or cache more of that data.

The collector also offers advanced functionality like tail-based sampling. Sampling is very common in the distributed tracing world, but to date it's been mostly head-based, at least from an open source perspective. The OpenCensus Service was the first open source software to have tail-based sampling open-sourced and available. Typically you'd have to get that from a proprietary vendor. OpenTelemetry will offer tail-based sampling as well, if you're interested in that functionality.

Now, OpenTelemetry goals. So let's talk about the elephant in the room, because this is kind of important. I mentioned OpenTracing. I mentioned OpenCensus. I mentioned OpenTelemetry. I said we're going to add the two, one plus one, and we're going to end up with not two but one. Well, this kind of sums it up perfectly: the fear is we just invented another standard called OpenTelemetry, and now you'll have three choices. No, no, that's not the goal. One of our goals is to prevent this from happening. So how are we going to prevent this from happening?
And the answer is we're going to provide backwards compatibility and we're going to sunset OpenTracing and OpenCensus. What does sunset mean? Those repositories are going to go read-only by the end of the year. That's the plan. So we're basically working right now to make the first version of OpenTelemetry. That'll be available later this year. Right now we're targeting September for the five major languages that I already mentioned. And then by the end of the year, hopefully by KubeCon North America, we will announce the sunset, or the read-only version, of OpenTracing and OpenCensus. But what we're going to do is provide a bridge that has full backwards compatibility. So if you're on OpenTracing or OpenCensus, you don't have to do anything by the end of the year. It will still work with OpenTelemetry. You have nothing to worry about, and we will support backwards compatibility for at least two years. That's the end of 2021. You have plenty of time to migrate over to OpenTelemetry as it becomes available. So our goal is to provide enough runway for people, but at the same time put some pretty firm dates in so we don't end up with three standards. We really want to consolidate. We want OpenTelemetry to be the future. We want there to be one API and one implementation. There's actually a Medium post on this that has more details, if you're interested in more specifics.

So: one project, not two, definitely not three. The goal is to attack tracing and metrics, API and implementation, for five major languages in the first release. The goal is to make it loosely coupled so you can take what you want. You don't want to use an agent or collector? That's fine. You already have your own client instrumentation? That's fine. You want to use a different API implementation than we have? That's fine.

That's the wrong presentation. That's not fine. Here we go.

So it'll be loosely coupled, and you can take only the pieces that you care about. And then, probably most importantly, there's open governance.
In fact, we're following kind of the Kubernetes model here. So there'll be representation for many orgs, and there'll be fixed timelines on how long someone can serve on the governance board, so the project isn't pulled in any one particular direction. Again, there's a blog post on this on Medium as well.

So, governance. I just want to talk about this really quickly. There are special interest groups, or SIGs, that are getting started right now. So now is a great time to get involved if you're interested in contributing to this. There are GitHub issues open on open-telemetry on GitHub. That is a gotcha: it's not opentelemetry, it's open hyphen telemetry on GitHub. Otherwise you won't find it. Sorry about that. Someone has opentelemetry. We're trying to fix that. So there's a special interest group for each of the languages, and then there are different community membership levels. You can become a member, an approver, or a maintainer. There are certain requirements in order to get approved for that. You have to have a certain number of contributions, you have to be active in the community, and you have to be recommended by someone that's already part of the community. If you're interested in learning more, go to open hyphen telemetry slash community. That's where the repository is with all of these details.

And then finally, we're using the CNCF code of conduct. OpenTelemetry is a CNCF project. The goal is to represent as many different companies and organizations as possible. We don't want to have unequal representation or one company kind of driving the project. We really want the community as a whole to drive the project. Right now elections are based on the Kubernetes model. So there'll be nine seats, with limits on over-representation, so no one company kind of controls the governance board. Only active code contributors get to vote, which is kind of important, and there are maximum term limits. So again, the idea is to rotate people and companies in and out of the
project.

All right. So next up, I want to talk about backwards compatibility. So if you're coming from OpenTracing or you're coming from OpenCensus, what does it mean? Let's start with OpenTracing. So OpenTracing has a few changes to the API itself. Apologies in advance: these slides for OpenTracing are a little technically deep, so I'm going to summarize them for you, and then you can review them if you care about the specifics.

So the three major changes for OpenTracing are around: sampling, so there'll be formal definitions on how sampling is handled; terminology changes, so for example, in OpenTracing metadata was known as tags, but in OpenTelemetry that'll be called attributes, so there'll be some naming changes to get used to; and then the trace ID and span ID type changes, to make them more standard for the W3C and how context correlation is handled, given the W3C Trace Context that's coming out. That'll formally make some changes to the OpenTracing API spec. But again, it will be backwards compatible. You don't necessarily need to worry about this as you move to OpenTelemetry. These are just things to be aware of.

As I mentioned, there are some naming changes. So tags are attributes; logs become known as events. These slides are attached to the session on the schedule builder, so it should be easy to get them if you're interested. And all the other notions, like follows-from and linked spans, will all be included. So you'll have feature parity with what you're used to from an OpenTracing perspective.

The tracer itself will also have some changes. So for example, instead of ending a trace, that terminology, that function call, will be removed. Hopefully you'll see it in a second. Instead, it will probably support flushing.
You see the word probably: we're still defining the new API and specification. That's actively going on right now. But consensus right now is that trace.close will probably go away and trace.flush will probably be introduced instead. Again, you can get involved in the community and you can follow this live if you're interested in the specifics for OpenTracing.

For OpenCensus, the story is a little bit easier. The primary change is that there will be a clear separation between the API and the implementation. That's really what was missing in OpenCensus. So the idea here is you'll be able to replace the API with another one if you want to. If there's some other API specification you want to follow, you can. Previously that was not possible. It was tightly coupled with the actual implementation itself. So dividing that and clearly separating the two provides more choice for end users. That's a much more flexible model, and basically what the inherent ask was. Other than that, the instrumentation API is going to change a little bit, but none of the specifics are going to really impact you.

I do want to kind of walk through what a migration might look like from OpenCensus. It's kind of interesting visually. So today, let's say that you have a service or an application written in whatever language. Let's use Go. And you go ahead and you leverage the OpenCensus client library. It actually has an API and an implementation, and they're tightly coupled. That's how OpenCensus is today. So once the first version of OpenTelemetry comes out, what you would basically do is take a dependency on the API and the implementation. But that will happen automatically.
You'll just basically upgrade OpenCensus. So what OpenCensus is going to do very soon is a cut-over. So when you upgrade to the latest version, you'll still get the OpenCensus API, but the OpenCensus API will just go and call the OpenTelemetry API, and the OpenTelemetry API will call the OpenTelemetry implementation. It'll be transparent to you. You won't make any change other than upgrading your OpenCensus version. Then you will eventually take a dependency on the OpenTelemetry library itself. So that's a code change that you make. That gives you a clear separation between using OpenCensus and OpenTelemetry, and provides the bridge as you make this migration over. And then finally, you would remove your dependency on the OpenCensus library once you're ready to do the full cut-over. So you can make it a four-step process. You can make it a one-step process. Again, the goal is to provide choice and flexibility on your migration path. But given this flexibility, it should be very easy and very non-destructive when OpenTelemetry is available.

And now let's talk about the OpenCensus Service. So I mentioned that's made up of an agent and a collector. If you're not too familiar with the architecture, there's a notion of receivers, how you get data into the OpenCensus Service, and exporters, how you get data out of the OpenCensus Service. And all the popular open source solutions are supported. So for tracing that's Jaeger and Zipkin; for metrics that's Prometheus and StatsD. There's also an OpenCensus endpoint as well. And then commercial vendors have provided their own exporters. Traditionally, they don't typically provide receivers today. So what's going to basically happen is, we've already ported the OpenCensus Service to the OpenTelemetry Service. So very soon we will cut a new release, and you'll basically just deploy that release.
It'll look exactly the same. It's basically just a rename. And then over time we are going to release the OpenTelemetry receiver and exporter, the new data format that merges the best of both worlds. Over time you will then migrate to that OpenTelemetry format. Once you have fully migrated over, then eventually, in two years when backwards compatibility gets removed, we will remove OpenCensus as a receiver, as it will no longer be necessary. So again, the transition should be pretty smooth and seamless for you, especially from the service perspective. Given that OpenTracing doesn't have a service today, this is basically a port over from the OpenCensus Service itself.

With that said, there are some architectural changes being made. They'll be transparent to you, but today the agent and collector are based on the same code and are not identical. We're moving to an identical binary for both, with just different configuration options. And we are looking to overhaul the configuration that was used in the OpenCensus Service. So in OpenTelemetry there'll be a new standard way of defining it that's much more scalable and enterprise ready. So that should be available by next month, very soon. We'll be cutting the next release of this. You can follow the OpenTelemetry Service to get more details.

Okay, so let's summarize, because I threw a lot of words at you. What's next? The first version of OpenTelemetry is expected by the end of this year. We'll have support for five languages. I'll try again: Go, Node, Python, Java, .NET. Hey, first try, nice. There will be a bridge between OpenTelemetry and OpenTracing and OpenCensus. So if you're using OpenTracing or OpenCensus, keep using it. It will work. It will be okay. In fact, if you plan on instrumenting today, like in a greenfield environment, you have to use OpenTracing or OpenCensus. OpenTelemetry is not ready yet. We'll be ready.
We're working on it, but it's not ready yet. So if you need instrumentation today, go look at OpenTracing and OpenCensus. We try to mark that in the repositories. So for example, if you go to the OpenTelemetry Service, it clearly says: please go use the OpenCensus Service. So we're trying to make it clear to the community what's going on, but at the same time be transparent as new features are available.

The goal is to sunset, and that means read-only. So going forward there'll be no changes to the OpenTracing or the OpenCensus GitHub repositories after the end of the year. That's the current target. Those repositories will remain in read-only mode for at least two years with backwards compatibility. That backwards compatibility should last through 2021. Of course, if dates slip here, then we'll give you specific dates, but right now these are our targets.

And then OpenTracing has some changes to the terminology, the context propagation, and sampling, as I already mentioned, and OpenCensus will have a clear separation between the API and the implementation. Those are probably the key takeaways here from the differences, as well as the compatibility and the merging between them.

Now of course, we'd love to have you be involved in this community. Now is a great time to get involved. Right, so if you're interested in a particular language, if you have feedback on direction, if you want to contribute to the website or the documentation, we'd love to have that as well. The OpenTelemetry community is very active right now. There is a debate on whether we're going to be on Gitter or Slack. Right now we're on both. But you can come find us in one of those two forums and definitely post your questions there. Or of course you can open GitHub issues as well.
All of the SIGs are documented on the OpenTelemetry community page. They have weekly or bi-weekly meetings, with meeting links in there. And you're definitely welcome to join. If it's not in a friendly time zone, please open a GitHub issue. We want to make sure that the community can be involved, and if that means having every other meeting be in a different time zone, we're definitely open to that. So the more feedback you provide, the better we can make the community for everyone. And with that, I'd like to open it up to some questions.

When is Fluentd? Nice. I think there's actually a GitHub issue open for this. So again, when is Fluentd coming is the question. The initial focus is traces and metrics. So for the end of this year, I wouldn't expect to see any logs. It's possible, if we are able to consolidate and get everything done quickly, that we'll get it by the end of the year. But when is it coming? I would guess next year. That'd be my hypothesis. What you may see is the ability to add trace ID, span ID, and sampled information like you saw in OpenCensus. That might come early. But as for any native integration with Fluentd or what that looks like, today our guidance would be: just continue using Fluentd like you have been, and you can enhance the log messages, if you want, inside of your application using OpenTelemetry. Long term, we can see if there will be changes to things like Fluentd. Probably by KubeCon North America I'll have a better answer as to timelines. That's towards the end of the year. But right now that isn't the highest priority. The highest priority is really sunsetting OpenCensus and OpenTracing.

Can you talk a bit about OpenMetrics and how that relates to the OpenTelemetry project?
So the OpenMetrics project is primarily a data format project. So it's the format for sending metrics out over the wire. Initially the thought was to keep that separate, but OpenCensus actually has a format for sending trace data. So we are actually in active talks with the OpenMetrics and Prometheus folks about this. The talks are still early, but there may be some ability to join efforts there as well. So that is likely to happen sooner than, say, logging, primarily because tracing and metrics are the top priority to start with. But there is some amount of overlap there. With that said, my understanding, which may be incorrect, is that OpenMetrics is still a little early. So we're just trying to make sure we're involved in the conversations and they understand the direction that we're going in as well, to see if there are some synergies there and if we can help.

OpenTracing and OpenCensus provide a capability to, you know, configure the application from the environment itself. Right, so will OpenTelemetry have that kind of capability?

When you say from the environment itself, what do you mean?

So say I want to connect to a Jaeger collector for all the traces, right? So I have my application that can read stuff from the environment, right? Which auto-configures the application to send it to the given collector.

You're asking about, like, environment variables.
Yeah.

Okay. So yes, Jaeger supports environment variables today. OpenCensus supports environment variables today. The expectation is that OpenTelemetry will as well. In addition, the goal is to provide more integrations. So for example, OpenCensus, in the ecosystem repository, actually supported a webhook admission controller for Kubernetes, where it could actually tag in pod information via environment variables. So for a given application, I could figure out what pod it was running on when it generated that trace. What we're looking to do is to provide even better native hooks. Like, wouldn't it be ideal if the agent could just tag that on and I didn't have to modify the app pod? So another reason to potentially consider the agent is that it will actually collect host metrics from, say, Kubernetes natively, instead of you having to, like, get Dropwizard metrics only from your app and not being able to get down to the host itself. The agent can get down to the host level. So yes, the goal will be to support environment variables and ideally provide as much native integration as possible. So again, you have flexibility and choice. You can turn off and turn on what you care about.

So this is more of a terminology question. What is your definition of, or the difference between, telemetry and observability?

Good question.
So observability is a common term that came out in the cloud native era, or as you move to more microservices. The way I like to think of it is, the term observability came around because in the previous generation, with monolithic applications, it was known as monitoring, and the problem space was different. You could use just metrics and logs and you knew where the problem was: it was in the monolith. But as you move to a distributed architecture, that's not the case anymore, and as errors propagate upstream and downstream from a call, traditional monitoring doesn't work. The term observability kind of came out of that.

Telemetry, on the other hand, that's more of a loaded term. I think we went back and forth on naming for the project. Given that OpenTracing was taken, open observability wasn't an option. The goal wasn't to use OpenTracing or OpenCensus. We didn't want to cause confusion in the community. We wanted a new name. So telemetry kind of came out of that. Telemetry data in general is just signals, you can think of it, from your application, which is very similar to observability. With observability, you typically think of the three pillars of observability, so you just think traces, metrics, and logs. But as I mentioned, we're also trying to attack different layers: API, implementation, wire formats. So telemetry is more encompassing in a way. And so we felt that that better represented the name of the project, because the goal is not just to provide signals. We want to get you to root cause, context, and correlation through those signals.
I hope that answers your question.

Is there any plan to support an OS like Linux or Google [inaudible]?

Yeah, so the binaries support Linux distributions. They did support, at least the OpenCensus Service did support, Windows. I think we dropped support for that, but we can get it re-added. But yeah, from a language perspective, it's supported. From a cloud perspective, OpenCensus was supported by Google Cloud Platform and Microsoft Azure. AWS has their own solution, and we're also talking to other cloud providers as well. What we're hoping is that overall, cloud providers will standardize on OpenTelemetry as well. Because then I could answer the question of: is it my app, or is it my cloud provider, that's causing the problem? Today you can't do that. And so if you rely on a platform service from a cloud provider, you can't easily tell whether it's slow or not, and that results in support calls. So we're working with other cloud providers to try to get them on board with the OpenTelemetry project.

One more minute. Any more questions? One more, then.

Do you also care about the long-term storage of the data, or is that out of scope?

Yeah, so basically after it gets to the collector, at least today, it's out of scope. Provide your back end of choice. So whether that's an open source back end like Jaeger, or a commercial vendor like Omnition or Google Stackdriver, or whatever you're going to use, that's kind of outside scope. With that said, while it's in the agent and the collector, there's some amount of persistence that's needed there. Typically today that's done in memory, which means it's a little lossy if you have to restart. There are actually tickets open to add, like, a disk-backed queue for that, so that you could actually retry that data should you for whatever reason lose that node, or if you want to have longer-term persistence.
What we typically see is it's not entirely the end of the world if you lose a trace or a metric, as long as you get most of them. So having it be in memory is typically sufficient. But then the back end storage of that, that's outside scope. Features to get it to the back end, though, are not. So for example, the OpenCensus Service supports, as I mentioned, tail-based sampling. That's a way of reducing the amount of data that I send to a back end, and hopefully increasing the amount of valuable data, like error traces, that I send to it versus the number of non-error traces that I send to it. But actually storing it and persisting it, that's typically considered the back end problem, outside the scope of OpenTelemetry.

All right, that's all the time that I have. I'll be here for a couple of minutes afterwards, though, if there are additional questions. Thank you so much for joining.
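The tail-based sampling idea from the last answer (decide after a trace is complete, keep the error traces, keep only a fraction of the rest) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the OpenCensus Service implementation: the `Span` and `TailSampler` names are hypothetical, and the caller is assumed to signal trace completion explicitly through `finish`, whereas a real collector would typically have to decide after a timeout.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical minimal span: just enough to make a sampling decision."""
    trace_id: str
    name: str
    is_error: bool = False


class TailSampler:
    """Buffers spans per trace and decides once the whole trace is known."""

    def __init__(self, keep_ratio, rng=None):
        self.keep_ratio = keep_ratio          # fraction of non-error traces to keep
        self.rng = rng or random.Random()
        self.buffer = defaultdict(list)       # trace_id -> spans seen so far

    def add(self, span):
        self.buffer[span.trace_id].append(span)

    def finish(self, trace_id):
        """Return the spans to export for this trace (empty list if dropped)."""
        spans = self.buffer.pop(trace_id, [])
        # Head-based sampling could not know this yet: the error may occur
        # in the last span of the trace.
        if any(span.is_error for span in spans):
            return spans
        if self.rng.random() < self.keep_ratio:
            return spans
        return []


if __name__ == "__main__":
    sampler = TailSampler(keep_ratio=0.1)
    sampler.add(Span("t1", "GET /cart"))
    sampler.add(Span("t1", "db.query", is_error=True))
    sampler.add(Span("t2", "GET /health"))
    print(len(sampler.finish("t1")), "spans exported for the error trace")
    print(len(sampler.finish("t2")), "spans exported for the healthy trace")
```

The trade-off is the buffering itself: spans have to be held somewhere until the decision is made, which is one reason the talk places this feature in the collector rather than in each client library.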