Hey, everybody. Thank you for taking the time to come to my OpenTelemetry Community Day lightning talk, "Five Minutes to Value: Strategies for Faster Observability Onboarding." Real quickly, here's what I'll discuss today: a little introduction, what "five minutes to value" means and how it relates to onboarding in OpenTelemetry, some really useful client configuration options for onboarding, and then a few suggestions for future contributors on how to treat onboarding when writing OpenTelemetry instrumentation.

A quick introduction: my name is Eric Mustin. Hello, nice to meet you all. I work at Datadog as a software engineer on the APM integrations team, and I'm also active in the OpenTelemetry community: I'm a contributor and approver on OpenTelemetry Ruby, and I've committed to a number of other OpenTelemetry repositories. So whether it's at work, helping customers onboard to the vendor I work at, or in the Gitter channel or in SIG meetings, listening to users talk about how they're getting started with OpenTelemetry, I've had a chance recently to learn a lot about onboarding pain points and what we can do as a community to make things easier for users.

First, let's talk about this term "five minutes to value." It's a catchy term, so what does it mean? First off, it's a goal in the OpenTelemetry Collector roadmap, but more broadly it's an extension of the onboarding concept "time to value": how long does it take for a user to start realizing value out of a product from the moment they begin to set it up? For the OpenTelemetry Collector specifically, a lot of the roadmap goals are about just working out of the box: distributions for common targets, whether you're running on Docker or Windows, that give you easy ways to get up and running.
It's also about giving you data out of the box to start playing with: automatic collection of cloud provider metrics and tags, Kubernetes telemetry, or even just host metrics. And it's about gathering application-specific metrics. If you're running Kafka, Hadoop, or Sidekiq, a lot of these popular pieces of open source software emit data out of the box, and we can collect that information automatically and populate it in your backend of choice. So that's one concept of five minutes to value: magically working out of the box.

But as we all know, onboarding is never as easy as just working out of the box. The difference between theory and practice is always real. On the left, you can see onboarding in theory: you invest a little bit of time, you get this great value, and then you get incremental gains as you invest more time. But in practice there are peaks and valleys. You invest a little time and get things working, but then you hit an unusual situation where things break, or the data isn't exactly how you want it, so you invest more time and gain more value, and then something else goes wrong; it's this constant push and pull. Each of those valleys in the graph on the right is an area where the user might churn, throw up their hands, and say, "you know what, I'll just stick with what works for me." For OpenTelemetry, we want to limit those situations. Why does onboarding matter for OpenTelemetry? Because onboarding speed, fewer minutes to value, helps adoption.
The faster a user can onboard and get the data exactly the way they want it in OpenTelemetry, the more time they have to actually prove the value of all this rich data to their stakeholders internally, to dig in and start diagnosing performance issues and understanding root causes, and to configure their setup for specific business use cases. If they're a financial services firm, the things they might have to regex out of their data are going to be different than if they're an e-commerce shop. So onboarding matters for OpenTelemetry, and we want to make that onboarding process as seamless as it can be.

A lot of the way this has been done historically is with magic defaults, which was touched on in the roadmap earlier. But there's more to onboarding than magic defaults. Magic defaults are wonderful: they give you a really nice demo view, a getting-started view. But a user is only actually onboarded when they have all the data they need, formatted exactly the way they need it. And anyone from a vendor can tell you that the questions don't stop once something starts working; they only stop once everything is exactly the way the user wants. For OpenTelemetry clients that's difficult, because we as authors and contributors simply can't know ahead of time everything a user is going to want; we're not going to be able to capture every piece of metadata up front. So instead, what's really important is exposing easy ways to augment, modify, and configure telemetry data to fit any use case, and doing so without forcing users to maintain tons of custom-built code on top of the OpenTelemetry tracing APIs, the "glue code" as it's commonly called. We don't want users to have to write software on top of the software we provide them.
We just want them to be able to use OpenTelemetry software out of the box with these really great configuration options. So I want to quickly highlight a few of the most useful ones: request/response configuration hooks, allowlists and denylists, and the ability to toggle tracing of middleware or unimportant spans on and off.

Let's start with the one I think is most useful: configuration hooks. I'm using OpenTelemetry JS, specifically its Node packages, as the example here. Say a user is making requests to the GitHub API. The GitHub API is heavily rate limited, so they want to know whether a request is returning an error response code due to rate limit issues. The problem is that the automatic instrumentation of Node's HTTP client doesn't capture every single response header. Normally, they'd have to write a wrapper around their HTTP calls, or hook into some sort of after-request hook in their HTTP client and grab the current span context. Instead, there's this really great configuration option called `applyCustomAttributesOnSpan`. It's a hook that takes the span plus the request and response objects as arguments, so it's as simple as setting an attribute on the span from the response header that contains the rate limit information. We don't have to pull in the current span context, and we don't have to store the response object anywhere to access it later. And now, as you can see in my vendor backend, I have that `x-ratelimit-remaining` span attribute available, so I can filter and slice and dice based on rate limits. This is a really great option that makes it easy for users to extend their instrumentation to any use case; we don't know ahead of time that it's going to be rate limits.
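A minimal sketch of what that hook might look like. The `applyCustomAttributesOnSpan` option itself comes from `@opentelemetry/instrumentation-http`, but the attribute key and function name here are my own illustration, not a semantic convention:

```javascript
// Hook with the (span, request, response) signature expected by the
// applyCustomAttributesOnSpan option of @opentelemetry/instrumentation-http.
// The attribute key below is an illustrative choice of my own.
function addRateLimitAttribute(span, request, response) {
  const remaining = response.headers['x-ratelimit-remaining'];
  if (remaining !== undefined) {
    span.setAttribute('http.response.x_ratelimit_remaining', remaining);
  }
}

// Wired up roughly like this when registering the instrumentation:
//   new HttpInstrumentation({ applyCustomAttributesOnSpan: addRateLimitAttribute })
```

Because the hook is just a plain function of the span and the raw request/response, there's nothing to wrap and nothing to store; the instrumentation calls it for you on every outgoing request.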
We just want to make it easy for users to define the information they need without having to roll their own custom code. Besides augmenting your telemetry with useful extra metadata, another really important aspect of onboarding configuration is giving users flexibility around security, performance, and data ingestion costs. That means being able to control really precisely what actually gets instrumented, and doing so via configuration. It means more than just dropping spans once they've hit the Collector, because we don't want to add network costs, and it means making sure that sensitive information doesn't leave our environment.

Allowlists and denylists are a great example of this. OpenTelemetry Python's Django instrumentation has an environment variable you can set, `OTEL_PYTHON_DJANGO_EXCLUDED_URLS`, that excludes URLs based on regexes. In the example here, we're excluding URLs for a specific client's metadata, which might be a particularly sensitive client, and we're also not tracing health check endpoints, because those are relatively low-value pieces of telemetry data. And in OpenTelemetry JS, the Express instrumentation has options to not instrument specific pieces of middleware. Here, for example, using the option called `ignoreLayers`, we've chosen to ignore the body-parser middleware. This is really important if you're getting charged, or you're storing the data yourself, so that every additional span you store has a cost to it; something like body-parser middleware is almost never going to have any value, and it's just not necessary to trace. So you can optionally choose what you want to instrument and what you don't. Those are just a few examples of areas where client configuration that lets users shape their span data can help onboarding tremendously.
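A sketch of that Express configuration. The `ignoreLayers` option is real in `@opentelemetry/instrumentation-express`, and its entries can be strings or regexes matched against layer names, but the specific layer names below are assumptions that depend on the middleware functions registered in your app:

```javascript
// Config object for @opentelemetry/instrumentation-express. Layers whose
// names match an ignoreLayers entry produce no spans at all, so they never
// reach your backend (unlike dropping them later at the Collector).
const expressInstrumentationConfig = {
  ignoreLayers: [
    'jsonParser',            // skip a body-parser middleware layer (name assumed)
    /^(query|expressInit)$/, // regexes work too, e.g. Express's built-in layers
  ],
};

// Registered roughly like:
//   new ExpressInstrumentation(expressInstrumentationConfig)
```

The design point is the same as with the hooks: the exclusion lives in officially supported configuration, not in a hand-rolled wrapper the user has to maintain.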
So for future contributors here who want to add their own instrumentation, some lessons to think about. One: treat onboarding as a first-class citizen. Users don't want to write glue code; they want to use officially supported configuration options when available. Two: let users augment their span data. We're not going to know ahead of time every single piece of metadata to set as a span attribute, but if you give users an easy, happy path to augment that data themselves, that can be just as good, if not better. Also, when you're writing instrumentation, take a look at what other language client implementations have done for similar libraries; it's important to borrow best practices from the other languages when you can. That also means asking: what are the existing naming conventions and environment variable conventions here? We don't want to add to the cognitive overload of folks who might be running multiple clients in multiple languages in their production systems. They want to be able to remember specific names for config options and specific environment variable names, and set those across all their languages when possible.

So thanks for coming. I hope that was valuable to you. If you have any questions, feel free to reach out during the conference or reach out to me on Gitter; I'm happy to chat. I hope you enjoy the rest of OpenTelemetry Community Day.