Hi, everyone. Thank you for being here at a 5:30 session. Even though it's the first day of the conference, I wanted, as co-chair of the TAG for Observability, to chat a little bit about what we have been doing on the TAG, some of the ongoing discussions and projects, and some of the cool projects we kick-started. With that said, many of you probably already know me. I'm Alolita Sharma. I have been on the OpenTelemetry Governance Committee for a few years now, I think my third year. I have also been co-chair of the Observability TAG for a year now, and I've been very happy to drive some of the initiatives we've been leading in the TAG and to work closely with all the observability projects. I also lead observability engineering at Apple for all the AI/ML projects, which includes Siri. With that said, I also recently joined the CNCF Governing Board on behalf of Apple, and again, I hope we can all represent our observability projects and support the developers and engineers on them. Today we have two co-chairs on the TAG: Matt Young, who couldn't be here, and myself have been the active co-chairs. Our tech lead, Bartek, has been part of the community for a long time and joins in whenever he can. But this is also an opportunity for others to join as technical leaders and subject matter experts in the TAG. There are also a few working groups I wanted to call out. There is a Kubernetes observability working group where Kubernetes-specific observability projects and components are discussed. That's a regular working group that has already been approved by the TOC, and folks are actively working there.
There are a couple of other initiatives and key discussions in progress right now that may result in working groups, specifically the query standardization effort, which I'll be talking about in more detail. There has also been a lot of good work on profiling in the TAG, which kick-started much of the profiling effort and profiling support on the OpenTelemetry project itself. With that said, I wanted to give you a picture of the landscape of observability projects that CNCF has. Obviously, there are many more projects in the larger observability space that are vendor-based yet open source, but they interoperate with many of these key projects. As you know, only three projects in this space have graduated in the CNCF world: Fluentd, Jaeger, and Prometheus. The incubating projects, in incubation status, believe it or not, include large projects such as OpenTelemetry on one hand, and other very popular projects on the other: Cortex and Thanos, as well as OpenMetrics, which is used by Prometheus and is fully supported for interoperability on the OTel, the OpenTelemetry, side. Thanos and Cortex also use OpenMetrics, so there is a lot of good collaboration across those projects. Then you have other projects like Chaos Mesh, which are really focused on chaos engineering but still related to the observability and analysis space. This is just CNCF's definition, and it has evolved over time, but analysis-focused projects have landed in this specific category. And then you have Sandbox projects such as Pixie and OpenCost, both very popular projects today, as well as ChaosBlade, which is also chaos engineering, Skooner, which is Kubernetes observability, and others such as Kuberhealthy, which is also Kubernetes-related observability.
So there are different projects within the CNCF landscape, with developers working on different specific areas but also interoperating to make sure the larger projects support them. With that said, I wanted to dive in a little. We gave an update at KubeCon in Detroit in October of last year, and this is a six-month fast-forward, so if you want to find out about some of the projects that were worked on in the TAG last year, you can go catch up on the recordings. This is what we have focused on, based on discussions and real participation by different developers, projects, and end users on the TAG. And this is the cool thing about the technical advisory groups set up within CNCF: they give an engineer who has a great idea, but no project yet to land it in, the ability to have a larger community discussion, coalesce ideas from an end-user and product perspective, and then actually find implementation in the projects themselves. Observability query language standardization has been a big topic for a while, but we had a lot of good discussion around KubeCon in Detroit six months ago, and that led some of the key movers from the end-user community to come and say, hey, let's draft a proposal. This was actually raised in the OpenTelemetry project discussion at KubeCon, and I'll dive into where that led a little later. Another area that came up, of prime interest to end users, especially those using observability components from our projects, has been continuous cost measurement and optimization.
It is a specialized segment of observability, but nonetheless very valuable, especially when you're using public cloud infrastructure and you are very interested in measuring the resources being used and the costs associated with them. So it falls into the observability space, but it is nonetheless specialized. There's profiling in OpenTelemetry, as we talked about, and graphs in observability; that's another area being worked on collaboratively across projects. Exceptions as another telemetry data type: some of you may have heard the ongoing discussions in the community about whether telemetry data is really only the top three signals, that is, logs, metrics, and traces, or whether it's more than that. Because as you get more data and make your systems faster and smarter, you get different types of data, and is that observable or not? So exceptions have definitely come up as another area being looked at as a different telemetry type. Correlation, too, is very valuable. So I just wanted to call out some of these areas, and then, of course, the working groups and activities. There's a lot happening in this area, and I'd really urge folks to get more involved. There is a lot of good participation both from end users, who share requirements and join discussions, and from producers, that is, the vendors who are writing or contributing great features in the products they build, as well as from the open source projects themselves. That said, I wanted to dive into three of the key areas we have driven in the last six months.
First, as co-chairs we have been reaching out to different end users and subject matter experts, in both the product and vendor space and the end-user space, to come and talk about areas where they have innovated, improved upon existing open source projects, or added scalable ways of building out an observability solution. To that point, Vijay Samuel from eBay has done some foundational work and presented at the speaker series on the TAG, and Zain Asgar, whom many of you may know from the Pixie project, has presented on some of the foundational work that has happened in Pixie to improve eBPF support, specific implementations around eBPF, and interoperability with projects such as OpenTelemetry. So please catch up on these links; they'll be available. Pixie has also applied for incubation. One of the good practices we have tried to encourage is for projects, as they graduate from one level to the next, to come present an update to the TAG, where the larger community can participate, provide comments, review documents, review the code, and share anything that's missing before the project goes to the next level. This works well for smaller projects; for much larger projects like OpenTelemetry, you'd really have to hone in on a specific area in order to deep-dive on feature sets. The other part I would really urge you to dig into, and something we are super proud was brought up in the OpenTelemetry project meetings and then collaboratively suggested to land in the TAG, was profiling and profiling support. Many of you may know Ryan Perry of Pyroscope, who is now part of Grafana Labs after the acquisition of their company.
He was one of the key drivers in presenting why profiling support is useful and in proposing an OpenTelemetry enhancement proposal, developed from the discussions in the TAG meetings, for how profiling support can be implemented in OpenTelemetry. So there is a very harmonious relationship between the projects and the TAG, where many of these discussions, brainstorming sessions, and explorations of different use cases for identifying feature support happen on the calls. You might want to take a look at this presentation; it's a slide deck on the OpenTelemetry profiling architecture and a proposal for its implementation. If some of you attended the OpenTelemetry project meeting yesterday, there was some discussion about profiling, but you also heard that on the project's side there is some delay, I think just for lack of enough maintainers or contributors actively working on the profiling implementation. So for those of you interested in profiling, please get involved on the OTel side. But this is a good presentation for a deep dive into the architecture, the assumptions, and the design tenets that have been proposed. The other area, which is super interesting to all of us and which I think many of you are very curious about, is, as I said, the observability query language standardization effort. This effort came about in the Detroit meeting at OpenTelemetry, and we worked closely with the project.
It was recommended by the project maintainers and the project's governance committee to continue the discussion in the TAG, since it was not deemed to be in the scope of OpenTelemetry at this moment; the project is very large but very focused on being a framework for collection and instrumentation of observability data, not so much on defining and inventing a new query language, which could be a different project. This has been very useful, because the primary use case, as many of you know, is that end users, and especially large end users like eBay and Netflix, who have been key in taking the lead on this, have the foundational problem of too many implementations and frameworks for monitoring that have been set up and instrumented over time. Over many years you accumulate these islands of data, data collection pipelines, and data analysis pipelines for observability, which creates a lot of fragmentation. It is a huge engineering cost for each of these large organizations: they have distributed telemetry data, distributed sources of data, and hence distributed query languages they need to know and adapt to, and a heavy amount of engineering sits on top of all that just to make sure all the data can be fetched, analyzed, correlated, interoperated upon, and visualized in a useful way. That's a very high-level description, but there are very specific use cases at large scale where you need to address these islands and bring them together. Based on that discussion and that use case, eBay and Netflix stepped up, and Chris is here from Netflix, who will be walking through the why and the what of the query specification.
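That "heavy amount of engineering that sits on top" can be pictured as a per-backend translation layer: every island speaks its own query language, so the organization maintains a translator for each one. The following is a minimal illustrative sketch, not any real system; the class names, the logical metric name, and the generated query strings are all hypothetical.

```python
# Illustrative sketch of the per-backend translation layer large end users
# end up maintaining. Backend classes and query shapes are hypothetical.
from dataclasses import dataclass

@dataclass
class QueryRequest:
    metric: str   # logical metric name, e.g. "http_errors"
    window: str   # lookback window, e.g. "5m"

class PrometheusBackend:
    def translate(self, q: QueryRequest) -> str:
        # PromQL-style: rate of a counter over the window
        return f"sum(rate({q.metric}_total[{q.window}]))"

class SqlBackend:
    def translate(self, q: QueryRequest) -> str:
        # A SQL-ish store: same question, entirely different syntax
        return (f"SELECT SUM(value) FROM {q.metric} "
                f"WHERE ts > now() - INTERVAL '{q.window}'")

def fan_out(q: QueryRequest, backends) -> dict:
    # One logical question becomes N backend-specific queries --
    # this translation code is the toil a common standard would remove.
    return {type(b).__name__: b.translate(q) for b in backends}

queries = fan_out(QueryRequest("http_errors", "5m"),
                  [PrometheusBackend(), SqlBackend()])
```

A shared query standard would collapse the `translate` methods into a single dialect, which is exactly the fragmentation the effort is targeting.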
I just listed some of the timeline milestones that went through the TAG, but we do have an open issue to get a working group set up to actively work on this area. We think it might even become something similar to SQL in the long run, but today it's at a very early inception stage. The great news is that not only have end users collaborated on this effort, but many of the vendors who have their own query languages, such as Lightstep, Google, and others, have also started collaborating on the same effort. So with that said, this is the TOC issue. I'd like to invite Chris to dive into some of the areas in the query specification doc. If you haven't looked at it, it's here; I've linked it in the presentation. Please provide your feedback. This is just the initial charter of what we found to focus in on. But with that said, Chris, come on up and deep-dive.

Hi, my name is Chris Larsen. I've been in observability quite a while now. I was at Yahoo for quite a bit, and I maintain the ancient, creaky old OpenTSDB time series database, which is being supplanted by Prometheus and all the new time series databases and observability vendors out there. Now I work at Netflix, which has its own time series database internally, plus a log system and a tracing system and whatnot. So I've seen the problem of query languages over that time, and of trying to correlate all of these signals. I wanted to jump back into the industry and try to corral everybody, all the vendors and end users, together and see if we could come up with some kind of query standardization across the industry that might unblock migrations and customer acquisition going forward: essentially, do for the egress of telemetry data what OpenTelemetry has done on the ingress side. That's the goal of the working group. We want to help reduce developer toil.
As Alolita said, we want to work towards a SQL-ish standard that every developer knows. Maybe it will be SQL, who knows; or PromQL, it could be something like that. But we want to help end users correlate data and port their data, their queries, and their dashboards across systems, so you don't have to write a new dashboard and a new alert every time you switch vendors or products. That would be great for end users. The goals of the working group itself are pretty narrow. We are going to go old-school and do a ton of research and documentation instead of jumping in and saying, oh, is this syntax cool, are these semantics cool? We want to go to the developers of all the modern query languages for metrics, logs, and traces, and maybe work with vendors and users to look at query languages for profiling, exceptions, and other kinds of data that haven't really been utilized heavily yet as far as correlation. We want to research all this and compile the findings. We want to chat with end users and compile a use-case database of sorts that we can cross-reference these query languages against, to see what matches and what doesn't. We'll look for all the commonalities we can find, of which there may be very few, but there should be a couple, and then work as a group and a community through the differences and see if we can arrive at a recommendation for future working groups or projects to actually implement. So the scope of this working group is just research, analysis, and recommendations. What we need is support and help from everybody in the industry to come up with use cases and document them in our database; once the group's up and running, we'll have that ready, and then we can chat and work together towards a standard. That's all I have. Thanks.

All right. Thank you, Chris. It's a very interesting area, right? Because it's very large and yet it's very focused.
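The cross-referencing step described above could be as simple as a capability matrix: rows of query languages, columns of use-case capabilities, then intersecting to find the common core and differencing to find the gaps. This toy sketch is purely illustrative; the languages listed and the capabilities attributed to them are made-up placeholders, not the working group's actual findings.

```python
# Toy capability matrix for the working group's research step.
# Entries are illustrative placeholders, not real survey results.
support = {
    "PromQL-like": {"rate over window", "label filtering", "aggregation"},
    "SQL-like":    {"aggregation", "joins", "label filtering"},
    "LogQL-like":  {"label filtering", "rate over window", "text search"},
}

# Capabilities every surveyed language already offers -- a candidate
# common core for a standard to start from.
common = set.intersection(*support.values())

# Capabilities only some languages offer -- the differences the group
# would have to work through as a community.
gaps = set.union(*support.values()) - common
```

Even this trivial intersection/difference split mirrors the stated plan: find the (possibly few) commonalities first, then negotiate the rest.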
There are two things I'm super excited about with this initiative. One is that you see the convergence of open source and standards coming into play. Traditionally the standards world has been very different and very separate from open source, and open source has been known for ad hoc standards; Linux is a prime example, where the standard is set by the implementation and by what end users adopt. So this is a practical, open discussion about where some of the pain points lie, what we can do to make observability data more analyzable, and how to bridge the gap from a correlation perspective, making it more efficient and more transformable. I would say: think of it as querying as code. At the end of the day, people talk about dashboards, it's code; people talk about analysis with ML and other techniques. But the primary problem is that we are emitting large amounts of data, literally petabytes, not only from our cloud infrastructure but also from the applications and services that ride on top, and then from the networks of end-user devices and components talking to them. When you look at that solution end to end, how does an end user actually solve for understanding system health and performance? Which is different from business analysis, by the way. At that layer, how do you have an intelligent, queryable standard that bridges all these types of data and, at the same time, all the implementations too? Because it's a handshake; it's like USB-C, maybe. So at the end of the day, I hope that with this discussion, and with the observability community working together, we can identify the end-user use cases, which is beneficial for everyone.
Also understanding some of the existing query languages and which of their best components could perhaps be reused interoperably, and perhaps developing a set of semantic conventions around this that could be applied in a way that is standardized not only from an ingestion standpoint but also from an egress standpoint. So this goes back, Ted, perhaps, to your pet project right now. But again, I'd really urge folks to get more involved. The specification and everything we have discussed so far is all in docs; you can go and read them, the links are there. We are just in the process of getting approval from the TOC to do this work in the working group, and it's open to everyone. We do need to document use cases, and we'll do it diligently, including the different use cases you might have run into as you've implemented observability tools, and work towards setting this standard for all of us together. It's just at inception, so I really invite folks to get involved. And the impact of this is huge. You may not see it, but it's like an iceberg: it really is massive in terms of the end-user space, because the amount of toil that goes into engineering just to compile the data, and the armies of data scientists and others who look at this data and ask, okay, how does this result in good root-cause analysis, at a minimum, is incredible. So again, I cannot overstate the importance of this.
Going forward, some of the collaborative workstreams in the TAG are that we're also working with the CNCF teams across the TAGs, with the security TAG, for example, to collaborate on identifying security observability use cases, as well as supply chain discussions where there are supply chain implications, especially with some of the SBOM, software bill of materials, dependencies that are kicking in. There, observability solutions are used not only for traceability of provenance, but perhaps also for other data that can be conveyed through collection, implementation, and analysis in a way that's easy to correlate. So there's a lot of discussion happening there. Another area is building a standardized observability glossary. It's amazing, beyond the core maintainers who actually work on the code and the implementations, how differently the world interprets some of the terminology. Having a clear glossary would really help, and it can evolve over time, because it's a living document and we can evolve it all together; this is open source. We also want to drive more collaboration with the TOC, which is the central body in CNCF, and build close collaboration with the projects. The TAG serves as that binding glue, and I really would invite anyone who's interested to get more involved. If there are any areas you want to brainstorm about, the TAG is a good place to do it. All right, moving on. I just wanted to call out some of the places you can get involved. We typically meet 16:00 to 17:00 UTC twice a month, so it's only two calls a month, and the working groups of course meet on their own cadence and can determine their own cadence. Everything is very well documented and published very transparently; you can find a lot of detail in the tag-observability repo on CNCF's GitHub.
And of course there's a lot of conversation on the Slack channel. So if you have any areas you want to discuss, or something you don't see happening on the projects, this is a great place to bring it up. The good thing is that if you want to convey it to the TOC, the TAG can help do that; if you want to convey it to any of the projects, the TAG can also help do that. The TAG can also take different proposals, as you saw with the profiling proposal or with this query specification discussion, to the point where there is a clear recommendation or a spec, and then creating a project is based on the contributors who gather around it. So with that said, please give us your feedback. That's the end of my presentation. Happy to take any questions. Questions? Come on, Ted. Maybe. But we're not implementing anything, right? It's just a recommendation. Yes, Jacob. Yeah, all right. That's true. And I think it's a good question, because we're asking what implementation really means here. We've thought a bit about it, and Chris was here too, and we'd like to at least see the requirements and use cases clearly specified, along with some of the pain points that exist today, because that gives us a foundational understanding of what the language is even trying to address and whether there is anything common there. Also, the evaluation of the different languages that exist today, taking the best out of them to arrive at an interoperable proposal, is key. Product managers at companies do this all the time, but on the other hand, that's not open source, and it's not necessarily an open spec. And toward the work that all of us have done in the open source observability space, it's very important to realize that certain layers need to be standardized and commoditized.
Because at the end of the day, what's the value you're looking for in a product? And that's complementary to what we are building in open source. To answer your question specifically, and it's a long-winded answer, it's very important to address not only the ingress, the semantic conventions there, and a common schema, but perhaps also compliance tests at the same time. I would think of it as a test suite to vet out: okay, if this were really a common query language recommendation, how would you actually comply with it, and how would you make sure an implementation is conformant? It could be as basic as that: you have an ingress schema definition, you have the use cases clearly outlined, and then the ability to extend in the future. This is an iterative process; we don't expect to identify all the use cases on day one. They'll come over time, just like in any open source project. But hopefully that provides some answer, yes. And I think that discussion will evolve, because you're right: obviously not only are the individual elements in a schema important, but also the categorization of them, which transforms into objects in a programming language or in a design. But it's also about first understanding the complexity of the queries. For example, today in the logging space we have Elastic's query language, and the ECS, the Elastic Common Schema, donation into OpenTelemetry will greatly help resolve some of the inconsistencies and make the ingress of data more standardized. But it needs to carry through all the way; it needs to carry through into the analysis.
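The "test suite" idea mentioned above could be as basic as a table of canonical queries with known answers over a fixed reference dataset, run against any implementation that claims conformance. This is a hedged sketch only: the `execute` adapter interface, the reference dataset, and the canonical query strings are all invented for illustration, not part of any published spec.

```python
# Hypothetical conformance harness: an implementation supplies an
# `execute(query)` callable; the suite checks canonical queries
# against known answers over a fixed reference dataset.
REFERENCE_SERIES = [1.0, 2.0, 3.0, 4.0]

CONFORMANCE_CASES = [
    # (canonical query in the would-be standard, expected result)
    ("sum(reference_series)", 10.0),
    ("max(reference_series)", 4.0),
]

def run_suite(execute) -> list:
    """Return descriptions of failed cases (empty list = conformant)."""
    failures = []
    for query, expected in CONFORMANCE_CASES:
        got = execute(query)
        if got != expected:
            failures.append(f"{query}: expected {expected}, got {got}")
    return failures

# A trivially correct toy implementation, used only to exercise
# the harness; a real backend would parse and run the query itself.
def toy_execute(query: str) -> float:
    fn = {"sum": sum, "max": max}[query.split("(")[0]]
    return float(fn(REFERENCE_SERIES))
```

The design choice mirrors the point made in the answer: conformance can start small (schema in, canonical cases out) and grow as new use cases are documented, because adding a case is just appending to the table.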
It needs to carry through into the querying, and into the visualization or reporting of the data, the visualization of the relationships between system components and the data within or related to them. So think of this query language as another component of that whole pipeline, addressing how you reconcile what is consumed in a standard way, say with OTLP, with what is emitted at the analysis level and then transformed into visualizations. Yes, yes. Right. A standardized way of understanding and optimizing a language would help greatly in making that whole process more efficient, because it's very fragmented today. Any other questions? Yes. Right. And I think there is always going to be great product value in the ability to provide the analysis that is required from these queries. What this addresses is a lower level, where you are actually understanding the data and have a common language for correlating all of it, because that's actually very expensive. Even for a product to do that takes many years of work, hundreds, even thousands of hours of engineering, on the product side too. And in the world we live in, especially on standardized cloud native infrastructure, that world is converging, and you've seen the solutions evolve tremendously on the product side as well. I think we're at time. So thank you, everyone. And thank you for joining.