So, welcome to the maintainers track talk on OpenTelemetry. Thank you for coming; I wish we had a larger room, because it looks like there are other people trying to get in. My name is Morgan McLean. I'm one of the co-founders of OpenTelemetry, and like my colleagues here I'm on the governance committee. I'll let them introduce themselves and then we'll get started.

Yeah, I am Daniel Dyla. I work for Dynatrace, I'm also on the governance committee, and I maintain OpenTelemetry JS.

Hi everyone, very happy to be here. I'm Alolita Sharma. I'm on the OpenTelemetry governance committee, and I work on the Collector as well as the Go library. I'm from Apple, where I also lead observability for AIML.

Hi, my name is Ted. I work at Lightstep, I'm also on the governance committee, and I mostly herd cats around the specification working group.

And I forgot to mention: I work at Splunk. So, it's been a big year or two for OpenTelemetry. The project has gotten even bigger than it was before, and its impact is quite considerable.
I think we've been coming in various incarnations to KubeCons since OpenTelemetry was announced in 2019, and certainly there was OpenCensus and OpenTracing prior to that. Ted, I'm trying to remember when the first time you would have come was; probably 2016 or 2017, something like that. So there's been a long journey in these projects to success, but we're really proud of what the community has achieved.

We're going to start by giving people a brief overview of where OpenTelemetry is right now, then the roadmap, and then we're going to dive into the semantic conventions as promised.

As I mentioned, we were announced in 2019. In mid-2020 the tracing specification and implementations hit stable in OpenTelemetry; that's when the project really started to gain adoption. Last year we announced at KubeCon EU that the metrics specification, the second signal type in OpenTelemetry, had reached 1.0 status, and then at KubeCon Detroit last October we announced that a number of our major languages had achieved their 1.0 implementations of metrics. Since then we've seen adoption of that really grow.

But we've achieved a lot more than that as a community. There are other data types coming in. It's probably pretty well known at this point that logging is being added as a new signal to OpenTelemetry, and you can see on this timeline when we started logging and when we started to achieve things like data model stability. We're hoping that later this year we'll achieve 1.0, or stability, for logging across all the different components. Obviously it's an open source project, so dates TBD, but that's where we are with logging.

There are also other things we'd like to talk about, including the OpenTelemetry demo, which is a demo environment you can use to test out OpenTelemetry and experiment with it. It contains about 20 different applications written in different languages, and you can use the demo to go and look at our language instrumentation and the Collector, and basically use it as a sandbox to make changes. If you are instrumenting your own services, you can use it as a guide to best practices for OpenTelemetry. So it's a very, very useful resource, and I think it just hit version 1.4 in the last week, which is exciting.

There's also profiling, which has started; we'll get into that in a moment. Another thing we wanted to mention on this timeline is that we now have, or are about to have, community Lambda layers. So if you use AWS Lambda, the OpenTelemetry community now has official Lambda layers that you can subscribe to and use, which is also fantastic.

Oh, and one last thing there, I almost forgot: OTLP, the OpenTelemetry protocol, is now fully stable for traces and metrics. It achieved stability some time ago for the gRPC and HTTP/protobuf formats, and that was recently achieved for HTTP/JSON as well. So OTLP is completely stable. Not that it has taken any changes for a long time, but we now have that stamp of approval on it.

All right, I promised to talk about logs. Logs are very, very exciting this year. We have traces, we have metrics, and logs are the third of many signal types that will be added to the project. There are two paths we're taking in OpenTelemetry to approach logging. This is somewhat distinct from how we've done other signal types in the past, because logging has a lot of prior art.

The first phase of this, which is not stable yet but is actually relatively mature within the Collector, is using the OpenTelemetry Collector as a primary agent to go and tail log files, you know, human-readable flat-text files, from disk, just like every other logging agent in the world does today. The performance of this is actually really impressive.
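As a rough sketch of what that file-tailing setup can look like (receiver and operator names as in recent opentelemetry-collector-contrib releases; the paths, regex, and endpoint here are placeholders for illustration), a Collector pipeline that tails a log file and forwards it over OTLP might be configured like this:

```yaml
receivers:
  filelog:
    include: [/var/log/myapp/*.log]     # placeholder path to tail
    operators:
      - type: regex_parser              # parse flat text into structured fields
        regex: '^(?P<time>\S+) (?P<severity>\w+) (?P<message>.*)$'

exporters:
  otlp:
    endpoint: backend.example.com:4317  # placeholder OTLP destination

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [otlp]
```

Check the filelog receiver's documentation for the current operator set; the point is simply that tailing, parsing, and OTLP export are all native Collector features.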
It's just a native part of the Collector, and it will parse these log files and convert them internally into the OpenTelemetry format for logs. It's nice because it appends all of the OpenTelemetry resource data and metadata that you would expect to those logs, and then you can send that data to whatever destination you choose using OTLP or any other exporter. That's fairly straightforward, and it's a great way to take existing applications that write logs and bring those logs into OpenTelemetry without redeploying, updating, or making any changes to your app.

There's also work being done in parallel that will allow you to capture logs directly from your applications, natively, in an OpenTelemetry format. There are numerous benefits to this. Probably the most prominent is performance: these logs don't have to be written as human-readable text files to disk, they can just be sent in a binary format directly to the OpenTelemetry Collector, so there's definitely better performance there. There's also the benefit that the metadata and structure of your logs are guaranteed to be strongly typed, because they're not just being written as human-readable text.

We're taking an interesting approach here that also deviates somewhat from other signals: the OpenTelemetry SDKs, APIs, and language instrumentation will of course have these hooks for logging, but they're not generally meant to be used by end users. I'll give an example for Java. In Java a lot of people use Log4j to write their logs, amongst other logging libraries. We don't necessarily want OpenTelemetry to compete with that, or to offer a confusing second choice. So what we've actually developed is a logging bridge: if you're in Java, for this example, and using Log4j, you would just have Log4j export those logs in-process to OpenTelemetry, which would then process them and deal with them from there.

In some languages, in the future, we may choose to have a developer-facing logging API. I think C++ is sometimes raised as an example of a language where there isn't a super-native logging API that people already use, so in those cases we may begin to offer one. But in most languages, like Java, I think we'll simply adapt to whatever APIs people are already using. Of course, if you have feedback or thoughts on this, please let us know at the end of the session or in the community.

All right, we'll go a little quicker through the other items here. We've achieved metrics stability, as I mentioned, and the implementations of metrics across most languages are now 1.0. I think Go just hit 1.0, or was about to, and we'd already achieved that for Java, Node.js, Python, and various others. There's still a bit more work on metrics. It is stable, but we are of course building more integrations for more technologies, because it's fairly new and we want to provide a nice turnkey experience for people. There's also work to be done implementing exemplars, which was explicitly out of scope for 1.0; we still need to finish the job on that, as well as on implementations of high-resolution histograms.

Finally, for my section, profiling is a growing topic within OpenTelemetry. I think last year in Valencia was probably the first time this really started getting discussed seriously, but since then we've created a profiling SIG.
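For context on the high-resolution histograms mentioned a moment ago: the idea behind OpenTelemetry's exponential histograms is exponentially sized buckets whose width is controlled by a scale parameter. Here is a tiny Python sketch of the bucket-mapping idea; it illustrates the concept only, and is not the SDK implementation.

```python
import math

def bucket_index(value: float, scale: int) -> int:
    """Map a positive measurement to an exponential-histogram bucket index.

    Bucket boundaries sit at powers of base = 2 ** (2 ** -scale), so a
    higher scale means narrower buckets (higher resolution). A value
    lands in bucket i when base**i < value <= base**(i + 1).
    """
    base = 2 ** (2 ** -scale)
    return math.ceil(math.log(value, base)) - 1
```

At scale 0 the buckets are (1, 2], (2, 4], (4, 8], and so on, so a value of 5 lands in index 2; raising the scale subdivides each of those buckets.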
There are a lot of new community members; I think roughly 20 or so people have joined OpenTelemetry specifically to focus on profiling. Our goal is to be able to capture application profiling information, which under the covers generally means stack traces, or data from pprof or JFR or other existing profiling technologies, include it in OpenTelemetry, and allow that data to come in, be processed, gain the same metadata, and have the same export capabilities as all the other data that comes through OpenTelemetry, and thus be correlatable later with your traces and metrics and everything else.

Just a very short background on profiling: this is aimed at distributed profiling, often called continuous profiling. It's an extremely low-overhead way to get profiling data from a distributed system. This is not the kind of profiling where you instrument every single function call; that is very valuable, but it also comes with a very high performance impact. If you are running multiple instances of an application, continuous profiling is very valuable because you can build a snapshot of its performance over time without really any cost, and you can do this in your production environments continuously. That's the kind of profiling we're aiming at. Our current status is that we're still working on the draft of the specification. It's going to take some time to close on that, and from there, throughout later this year and well into next year, I imagine we'll be working on finalizing that spec and then pushing towards implementation.

Okay, I guess I'll talk about client instrumentation. This is something that, as the JS maintainer, I get asked about a lot, and in the past I've always had to say it's something we're working on, but there are a lot of moving parts and we haven't had much time and bandwidth. I'm happy to say that now we actually do have time to focus on this, and things have started moving quite quickly recently.

There is a RUM SIG that has been meeting for, what, the past year or so, maybe a little more, which has been working on setting up the specification and semantic conventions for RUM. The logs data model, which is finally stable, is a huge part of that; logs and events are very important for RUM, and the data model in particular was super important. There's also the experimental event API, which uses the logging data format under the hood; we consider them to be sort of the same thing, but events are a little different in that they are meant to be called by instrumentation. So if you're interested in that, it's something you should ask the RUM SIG about.

Further, we also have the web JS SDK sandbox. This is not a new web SDK; it's something we're using to explore new JS ideas that are very browser-specific. Right now the JS SDK is very Node-focused, very focused on the back end. It does work in the web, but we've gotten feedback that things could be improved. So that's what we're doing: trying to explore those ideas, see what could be changed and improved, and then fold them back into the main JS SDK.

Also, obviously, when you talk about clients you have to talk about mobile. There's ongoing work in both iOS and Android, but particularly in Android there's a large code donation that's been made recently, or will be made soon, something along those lines.
Yeah, right now is about where I expect Android to start moving very quickly, and in the near future to head towards a working implementation.

Okay, another thing we've been working on a lot recently is configuration. A lot of the feedback we get in OpenTelemetry is that it is confusing to configure, particularly with instrumentation where you're working in multiple languages. It can be very difficult to configure it in a way that is consistent across multiple languages and multiple teams, which may themselves use multiple languages. The same goes for the Collector, particularly if you're maintaining a lot of Collectors.

So, if you've been following the OTEPs repository recently, you may have noticed that the configuration OTEP merged. This is the first step towards specifying a configuration file format which will be used across all of the SDKs and eventually, I believe, also the Collector. This is important because it should solve a lot of these problems. It is early days, but that work has been moving along very well.

Important points: it is structured config, so instead of flat environment variables and the like, you can have typed configuration. It will also have a JSON schema for validation, so when you write your configuration you'll know whether you've written it correctly, and if you write it incorrectly a validator can tell you where you messed up. It will have environment variable substitution, which is important for moving over from the existing environment-variable-heavy configuration style to the new style, and also for secrets management and things like that.

And there's also OpAMP, which is our remote configuration protocol. I believe the protocol is more or less done, but implementations are underway. I don't think the protocol is stable yet.
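To make the environment-variable substitution idea concrete, here is a minimal sketch in Python. The real file format and substitution syntax were still being specified in the OTEP at this point, so the `${VAR}` syntax and the config shape below are assumptions for illustration only.

```python
import os
import re

def substitute_env(node):
    """Recursively replace ${VAR} references in a parsed config tree
    with values from the environment (empty string if unset)."""
    if isinstance(node, dict):
        return {key: substitute_env(value) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute_env(item) for item in node]
    if isinstance(node, str):
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: os.environ.get(m.group(1), ""), node)
    return node

# Hypothetical SDK config fragment with a value pulled from the environment
config = {"exporter": {"endpoint": "${OTLP_ENDPOINT}", "timeout_ms": 5000}}
```

With `OTLP_ENDPOINT=collector:4317` set in the environment, the endpoint string resolves to `collector:4317`, while non-string values like `timeout_ms` pass through untouched.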
It's not like 1.0 or anything, but the work is largely complete enough to make it usable, and there is also a Go implementation, which will be used in the Collector. We expect additional implementations soon.

All right, hi everyone. I wanted to cover the integrations that OpenTelemetry has today, both within the project, meaning baked into the project itself, and outside, that is, other projects or other teams adding support for OpenTelemetry. This is super important, because the more integratable OpenTelemetry is, in terms of the collection environments that are supported and the types of telemetry that are ingestible, the more pervasive a standardized technology like OpenTelemetry becomes.

With that said, within OpenTelemetry, as you can see, there are several streaming protocols and projects supported, such as Kafka and SkyWalking. These are very exciting areas: we all collect literally petabytes of telemetry data at scale in our environments, and Kafka is very typically used for streaming, so Kafka and SkyWalking are very important there. Then Elasticsearch and Fluent Bit, again very pervasive in the observability world. Graphite support. Jaeger for tracing. OpenCensus and OpenTracing: part of our lineage in OpenTelemetry is OpenCensus and OpenTracing, so full compatibility with both of these projects and the integrations that already exist in the field is super important; it enables teams who have implemented with OpenCensus or OpenTracing to easily swap out their modules and use OpenTelemetry out of the box. OpenMetrics, which, as many of you know, is the standard associated with Prometheus specifically, guaranteeing interoperability between the two protocols. Prometheus itself, which is very pervasive in the Kubernetes world, as well as other environments where you already have metrics being collected. Zipkin, and of course the W3C trace context, which is fully supported.

On the outside, these are just some of the integrations that have happened in the last couple of years. Kubernetes, of course, out of the box; that one jumps out at me because it's a great and very large example. But also containerd, CRI-O, Docker. And Jaeger: sorry, I should say Jaeger, I misspoke a moment ago; it's Jaeger, not Docker, that under the hood runs the OpenTelemetry Collector, and that's been huge. Micrometer for Java, Quarkus for Java, very popular projects, also fully integrable. Next.js in the JavaScript world. .NET, with Quartz.NET as well as ASP.NET, where Microsoft's teams have done a tremendous job, not only contributing to the project but also with other work they've done within their teams. NGINX, all the flavors of NGINX. WildFly from Red Hat.

So again, these are examples of some of the integrations that have already happened. If you see something that is not integrated and you use it, please feel free to make a proposal on the project; we're happy to work with the teams on the other projects, or even with other vendors, to provide that support. It's very easy: you can just file an issue, and we typically go through and review those and see what we can support.

Also worth calling out: these are the native integrations, where these projects are starting to use native OTLP and native OpenTelemetry APIs. That is in addition to the hundreds, if not thousands, of integrations we have through our contrib repos that make existing technologies work with OTel. And that's a very good point, Morgan, because if you go into the contrib repos for each of the language libraries, there are literally hundreds of supported APIs which are fully interoperable with OpenTelemetry. When we say native instrumentation, what we mean is that the instrumentation is actually baked in: you don't have to install some separate package written by somebody who is not the maintainer of the actual open source project. Good call-out. And again, if you don't see something here, please just ask; we're happy to either find it for you, if it's in the contrib repos, or take a request.

The other area I wanted to call out, and this is super interesting to OpenTelemetry as well because this request was made at an OpenTelemetry community meeting at KubeCon Detroit, is how you get a common, standardized query language. As you know, many vendor implementations, as well as other long-standing implementations, have query languages which are very specific to their implementations and the optimizations thereof. So this is something that bridges those islands: being able to provide at least a baseline specification. It came up as a discussion and an end-user request from eBay and Netflix at the OpenTelemetry meetings, and it was deferred by the project to be held as a discussion working group in the CNCF Observability TAG. So if you're interested in this discussion, as well as the work on identifying use cases and supporting this long-term, please do join that working group. It has been submitted as a formal working group to the CNCF TOC, and we hope to have a formal working group shortly.

With that said: Ted, you wanted to cover this? Yes.
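Since the W3C trace context standard came up in that list: the context travels between services in a `traceparent` HTTP header, and a minimal parser for it is easy to sketch. This is a stdlib-only illustration; real OpenTelemetry SDKs ship complete propagators with more validation.

```python
import re

# version - 16-byte trace id - 8-byte parent span id - flags, all lowercase hex
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

def parse_traceparent(header: str):
    """Parse a W3C traceparent header into its parts, or None if malformed."""
    match = _TRACEPARENT.match(header)
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_span_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),  # bit 0 is the sampled flag
    }
```

For example, the header `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01` (the example from the W3C spec) parses to that trace id with the sampled flag set.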
All right, so we're going to do a section here about semantic conventions in general, but the biggest news on this front is that there is another project that has been maintaining a set of semantic conventions. What we mean by semantic conventions is the schema of the actual data. You have OTLP, which is the protocol, but then you want a standardized way of describing an HTTP request, a database call, all of the things your program is doing. We want all of that normalized, because that really, really helps, especially when you get into lots and lots of microservices, and also when you get into machine learning and other automated forms of analysis; having regularized data is huge.

We've been doing our own work on this for a while, but there's been another project, Elastic Common Schema (ECS), that's been around for a long time, focused mostly on logging, and they came to us with the idea that we could merge these two projects together. So rather than having two separate standards that might go their separate ways, we should just have one standard. So, Alex, if you wouldn't mind standing up: everyone, give a round of applause to the Elastic and ECS folks. Merging projects is human stuff, and that's harder than code and engineering, in my opinion.
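To make "normalized data" concrete: under the HTTP semantic conventions, a span describing a server request carries well-known attribute keys rather than ad-hoc ones, so every backend and analysis tool can rely on the same names. A rough illustration (the keys below reflect the pre-stabilization conventions around the time of this talk, and the values are made up):

```python
# Hypothetical attributes for an HTTP server span, using well-known keys
# from OpenTelemetry's (then-experimental) HTTP semantic conventions.
span_attributes = {
    "http.method": "GET",
    "http.route": "/cart/{id}",   # illustrative route template
    "http.status_code": 200,
}

# Because the keys are standardized, any consumer can answer questions
# like "is this span a server error?" without per-service parsing logic.
is_error = span_attributes["http.status_code"] >= 500
```

The point is not these particular keys, which are exactly what the stabilization effort described below in the talk may still change, but that the keys are shared across every instrumented service.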
So I always appreciate this. Also, between OpenTracing, OpenCensus, and ECS, everyone jokes about that XKCD comic, you know, "one more standard," but I think we may be the only project that has actually merged more standards than it has created at this point, which is pretty awesome.

As part of this, we're going to make some changes under the hood. In particular, right now all the semantic conventions are part of the OpenTelemetry specification. We are going to fork that out into two separate repos, so that we have two separate version numbers and two separate sets of maintainers. This will make a lot more room for the ECS maintainers to come over, and it will also, I think, create a lot more clarity about changes to semantics and data versus changes to our SDKs and APIs. Still part of OpenTelemetry, though, just a different repo. Yes: still part of OpenTelemetry, but separate repos with separate maintainers.

The other thing we have to figure out is how to integrate all of these semantic conventions. Luckily, OpenTelemetry has defined only a small subset of the things that have been defined in ECS. ECS has also done a lot of great work on security and SecOps, so a lot of great new stuff will be coming to the project as a result.

However, even though our semantic conventions have been in use in production for a very long time, they are still technically experimental; if you look at any instrumentation package we provide in most languages, they're zero-dot-x. We have always had it in mind that we would do a final pass on all of these things, really review them based on user feedback and subject matter experts, and make any final changes to ensure it's the highest-quality instrumentation we can provide. This is good timing with the ECS merger, because part of that work is looking at the areas where the two projects overlap and figuring out the best way forward there. And this is where we really, really need feedback from the community.

We are definitely not going to roll out some kind of hard breaking change on this front. Even if we do come up with a better 1.0 that we mark stable and that's different from what's currently offered, we would still maintain versions of our instrumentation with the current conventions in perpetuity. However, we are still very, very nervous about anything that's a change of this magnitude. Changing the data that's going into people's systems seems bad: it seems like it could break dashboards and alerts and other things if someone unwittingly imported a newer version of the instrumentation without realizing it. We can prevent that from happening, but we also want to understand what other problems might arise from having old, experimental instrumentation eventually be replaced with something stable going forwards.

In other words, we're looking at one final set of changes to make everything perfect, but we are really worried that perfect might be the enemy of good enough, especially for things that are really widely adopted, like HTTP and networking. So this is an area where we really need feedback from the end-user community. If you have any concerns, if you can see ways this would cause problems for you, we want to collect all of that. It may be that some of these are so widely deployed that good enough is just good enough, and we won't touch them. So if you have strong opinions about this, please open an issue in the spec repo or the community repo; there's already an issue discussing HTTP in particular. We don't want to move forward with this without the blessing of the community. We do think it's a positive thing to go through these one more time, and we have seen places where we can really improve them, but OpenTelemetry really, really values stability and backwards compatibility, and that is why we don't want to do this without feedback. So please let us know; you can also tackle me after this talk if you're really concerned about it.

Cool. So, there are a lot of semantics, and it's actually going to take a long time to get through all of them; if we spent a total of three months reviewing each domain, with the domains we have, that's a year and a half of work. The ones we are currently working on: HTTP, which is where we need the most feedback. We're developing new conventions with RUM, as Dan mentioned earlier. We also have a messaging SIG; there's a lot of heavy lifting going on there, because messaging includes large asynchronous systems and distributed queues of any kind. If you're familiar with distributed tracing, it really works best when you're talking about synchronous transactions, things that look like a tree. When you get into big asynchronous messaging systems, you start to see other patterns: things like batch processing, things like merge and join, that are just a little more complicated than a regular old web transaction. So there's a bunch of work going on there to figure out exactly how we want to model that domain; if you're interested in that, definitely join that SIG, they'd love more feedback. And last but not least, there's a functions-as-a-service SIG. They're working on not just semantic conventions but also improving Lambda support in particular, trying to get our Lambda support to the next level, though I believe we're also targeting other serverless environments.

Cool. On to the community update. Yes. So, let's go over this pretty fast.
This is CNCF DevStats, a great resource, for OpenTelemetry. In the blue bars you see the number of contributions to the project; I think this counts commits, comments, and PRs, sort of the overall GitHub activity for the project. That is growing, though it certainly started showing some stability there in 2021, but it's still growing slightly. And if you look at the red line, I don't know how visible that is, that's the number of contributors to OpenTelemetry per month. We're well across 900 monthly active contributors, so this remains, I think, the second-biggest project in the CNCF, which it has been for some time, and that's a very healthy signal of community growth. So if you're considering adopting OpenTelemetry, know that there are a lot of people engaged on it and working on it. And to those of you who contribute to it: as a community, thank you very much for your contributions. It's really exciting to see how motivating the challenge of extracting information from services and infrastructure and everything else is to so many people; it certainly is to me, and I think for everyone up here, and probably for you if you're in this room. So thank you.

As well, we have a list of some of the major companies contributing. Lots of cloud firms and observability firms here, which is probably not a surprise. But I think what's really changed in the last year or year and a half is the companies I've put in bold: these are companies investing in OpenTelemetry primarily for their own usage of it, rather than for their customers. If you look at this list now, I think well over half are end users of OpenTelemetry who are making contributions. This is, I believe, the list of the top 40 contributors to OpenTelemetry. So this is very exciting: it's no longer just a bunch of vendors with strong commercial interests, although those interests have generally been very aligned with the community. It's very exciting to have so many end users, people who benefit from OpenTelemetry, and organizations who have to sign paychecks to invest in it, choosing to do so because they're getting value out of it. I think that means there's a lot of staying power in this community, and it's a very healthy atmosphere, so we should all be very proud of that.

Before we go to "get involved," was there anything else anyone wanted to add? No? Okay, great. So, if you do want to get involved: as Morgan just mentioned, we have over 900 monthly active contributors, which is huge. If you'd like to be the 901st, you're too late, but you can be somewhere in the first 1,000. Here's how. From a development standpoint we are primarily on the CNCF Slack; a lot of the people here are probably already on it, and if you're not, you should be. It's a great way to get involved with not just our project but many other CNCF projects. We're also trying to get better about following the Stack Overflow open-telemetry tag, so if you're not sure how to get involved from a coding perspective but you're an enthusiastic or knowledgeable user, it would be a huge help if you followed that tag and helped others who are having problems; and if you're having your own problems, that might be a good way to find solutions. And then finally, we are on GitHub like everyone else. Join a working group via our community repo.
You can find a list of the working groups and special interest groups there, along with when they meet; typically they meet weekly, some of them bi-weekly, it changes. That's probably the best way to get involved if you're really serious about making contributions.

I might add just one thing there. Slack is a great way to ask questions, but don't be afraid to join the meetings. More than most open source projects, I feel, we are kind of face-to-face, Zoom-focused; we find that's a great way to resolve questions that are taking time in the GitHub issues. Sometimes I get feedback from people saying, well, I'm not a maintainer, I'm not a serious contributor, I'm brand new, so I don't want to go to the meeting because I'm not sure if I'm allowed there. You definitely are allowed there. It's great to show up, it's great to ask basic questions; we'd love to see your face. And we are a global community: Sean, for example, joins in from Australia. We have APAC-friendly as well as EU-friendly times, so please don't be shy; there are alternate meetings for different time zones, and in general, if you need a meeting on a specific topic, please just ask for it. The meetings are very supportive; the contributors are very supportive of people getting involved.

All right, with that I think we're done. We have two QR codes here if you're interested: the first is for general project feedback on OpenTelemetry, if you're a user of it; the second is the standard feedback form for the session. Thank you for coming. I think we have a few minutes to take live questions, so Ted and I can run around with the microphones if people raise their hands.

Oh, and one last thing: we have some maintainers here. Please, Aaron, Jacob, sorry, Anthony. And I think Jacob had a request: he's actually a maintainer on the OpenTelemetry Kubernetes operator, and he is looking for other contributors. Do you want to add anything else? Yeah, after this session is done I'll be hanging around if people have questions about contribution, or about the operator in particular, so just come on by. Cool.

All right, do you have a question? Please raise your hand and Ted and I will pass you a microphone.

Do you have any thoughts on using the logging signal to collect metrics? I think that's used by Istio and Envoy; I've heard about it, but I don't know. Is that something...

So, I guess everybody heard the question. I don't know if we have any official thoughts on that. Logs are often used to capture metrics, and they can be, but I think generally the preference is to use native metric types if you can: they tend to be more efficient, and they go directly into time-series databases. If you have existing sources that are emitting logs, though, you can metricize those. Yeah, in the Collector. Yeah, I think in terms of the instrumentation we're providing, we want to provide a suite of tracing, logs, and metrics out of the box.
However, there's absolutely nothing stopping you. One of the advantages of having structured logs with a regularized format is that it becomes much more feasible to generate metrics out of them. The harder problem with generating metrics out of traditional, flat-file, human-readable logs is the parsing: they may not be standard across different service types and things like that. So that's definitely a thing you can do. You can also generate metrics out of traces, and that's actually a pretty common thing people do today. Yeah, and that's what I was going to say: you can generate metrics both from traces and from logs using the Collector; there are components in the Collector for that. So if you're asking about deriving metrics from logs, that's possible today and expanding. But if you control the source, I would stick with the actual core metric types if you can.

So, we've got time for probably two more questions. Anybody? Going once... we'll take trivia questions. All right, well, you've been a great audience. Oh, we've got one, sorry.

Given that you're working together on the Elastic Common Schema: the thing I am missing with ECS is a JSON schema to validate against. Is that on the roadmap, or have you thought about it?

Yes, yes. So right now, phase one is to get the conventions locked in and stable in the spec. But once we've done that, we have to roll out an implementation, and something we've discussed as part of that is making sure we have a validation and testing harness we can use to ensure that everything out there actually conforms to the spec. We think that will make it a lot easier to keep track of this stuff. So yes, that is something on the roadmap.

We could probably do one last question if we see a hand.
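As an aside on that earlier answer about deriving metrics in the Collector: in recent opentelemetry-collector-contrib releases this kind of derivation is done with connectors, components that act as the exporter of one pipeline and the receiver of another. The component names below (`spanmetrics`, `count`) are from collector-contrib and worth verifying against current docs; receiver and exporter definitions are omitted for brevity.

```yaml
connectors:
  spanmetrics: {}   # derive request-rate/duration metrics from spans
  count: {}         # count log records as a metric

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    logs:
      receivers: [otlp]
      exporters: [count]
    metrics:
      receivers: [spanmetrics, count]
      exporters: [prometheus]
```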
Yes In terms of backward compatibility, I wonder why open telemetry at least in the metric schema doesn't allow the slash character in the metrics names because That's that was allowed in open sensors, but not allowed in open telemetry. I mean, it's a very low level question But yeah, yeah, Josh sir, it would be the person to answer Josh sir to also works at Google actually would be the person asked I think the the the answer has to do with interoperability with Prometheus, which also does not allow slashes and names And interoperability with that project is a core goal of ours like right from day one But I would talk to Josh and internally because you're Google as well. Yeah, hello Okay, I think We are out of time. Thank you everyone