Rott said it was good feedback because he thinks the same way. No, I didn't expect that from him either, but that he... that I'm not so loud. It's kind of funny. It's a bit like he wants to give me the feeling that I don't know what I'm talking about. And I'm loud because I don't understand the whole thing. I'm already in a meeting, in my case.

No, we cannot hear you Richie, not yet. Hello everyone. Hello. Okay, looks like people are still joining. I know Richie is trying as well, but Zoom doesn't always work best with Linux. Oh, Richie is here, okay, so I was wrong. Anyway, welcome everyone. Please make sure you open the doc, and there is an attendees list, so please add yourself. Let me actually send you the doc as well in case you don't have it open. We have a couple of agenda items today; if you have anything else, just drop it there. The first one is from Steve, I suppose. Are you here with us?

Yes, I am here. I'll just share my screen and we can kick off. Cool. So based on the conversation last time there was a request around semantic conventions. So I figured I would do a quick introduction of what the project is, in case people were not aware, and then talk about semantic conventions at a very high level. This is only meant to take about ten minutes here, so I'm sure there will be action items. I'm going to link a presentation I did earlier this year; some of these slides are actually from that presentation. If people are interested in learning more about the project, it's about an hour long and it dives into all the different aspects.

So, OpenTelemetry is the joining of two other projects in the CNCF. You may be aware of OpenTracing, which was an incubating project. There was also another one called OpenCensus, which came out of Google, Microsoft and Omnition. These two projects combined to form OpenTelemetry, which was announced at KubeCon Europe last year. The way you should think about it is that OpenTelemetry is the merger of the two: all the lessons learned and best practices are being added to OpenTelemetry. There is backwards compatibility with OpenTracing and OpenCensus through the use of shims, but going forward, all the development will be on OpenTelemetry.

Now, what is the OpenTelemetry project trying to solve? I like this table representation. If you've heard of observability, you've heard of the three pillars of observability: traces, metrics and logs. These are just different data sources. What's actually more interesting is that for each of these data sources there is a variety of different layers: the APIs to actually generate and emit the telemetry data you care about; the implementation, which is typically referred to as the client libraries that you add to your application code itself; all the infrastructure aspects, so think agents, gateways, collectors, or whatever you want to call them; and then a variety of different formats. In the case of tracing you have context propagation, for example the W3C Trace Context, and you also have different wire formats for how you actually send this data over the network. OpenTelemetry is looking to basically solve all of this. The best way to think about it is: anything you do to instrument, generate and emit telemetry data in your environment, OpenTelemetry is trying to provide a solution for. Where it draws the line is that it does not provide a backend.
It supports a variety of different open source and even commercial backends, but it does not actually provide a backend, and that is not part of its scope. The initial focus for OpenTelemetry was very much on traces and metrics, so log support is still in its early days, but it is part of the charter and is starting to be incorporated into the project currently.

Taking that table and mapping it to what you'll actually find in the repositories, the project org of OpenTelemetry: you have the specification, which is the foundation. That's actually where the semantic conventions live and where we'll be spending time today. There are three basic components there: the API, the SDK, and then all the data stuff, which is the semantic conventions and the protocol. On top of the specification there are two other main components. There's the collector, which is a way of receiving, processing and exporting generated telemetry data. It's actually a single binary that can be deployed either as an agent or as a standalone service, and it is the default destination for the OpenTelemetry client libraries. And then the client libraries are just a single way to instrument your app and to emit traces and metrics, and eventually logs. They support both manual instrumentation, which means you go in and make code modifications, as well as automatic, which means you change runtime parameters or add dependencies. The automatic aspect only works for traces today, but the goal is to add that for metrics and logs as well.

The project as a whole is officially in beta. Beta was announced in early March, and GA is planned before the end of the year for traces and metrics specifically. So there is a lot of active work going on right now, as GA is pretty imminent at this point; we're probably about four to five weeks away. An interesting data point for everyone: OpenTelemetry is actually the second most active project in the CNCF today, behind only Kubernetes. This is according to CNCF DevStats, which you can actually go look up; it's basically a Grafana front end to some Prometheus metric data that the CNCF collects. So the project is very active.

And then the final slide is basically that there's a lot of contribution and adoption going on here. We're seeing cloud providers, vendors and end users all get together, which I think speaks to the problem that OpenTelemetry is trying to solve, and we're seeing other industry projects get behind it as well. For example, Jaeger already announced that they are moving from the Jaeger collector to the OpenTelemetry collector. Fluent Bit has added log support to the collector. There are roadmap items to add OpenTelemetry client library instrumentation into Envoy and Spring. So a lot of cool work is going on here. There are links to some additional reading as well.
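For readers who want to see what the manual instrumentation path described above looks like in practice, here is a minimal Go sketch: an OTLP exporter pointing at a local collector (the default destination mentioned above), a tracer provider, and one hand-created span. The package versions, the semconv import path, the service name, and the localhost:4317 endpoint are assumptions about a typical setup, not anything prescribed in the meeting.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func main() {
	ctx := context.Background()

	// Exporter that sends spans to a local collector over OTLP/gRPC.
	// The endpoint is an assumption; adjust it for your deployment.
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("localhost:4317"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	// service.name is the attribute the resource conventions treat as required;
	// anything else on the resource is optional metadata.
	res, err := resource.New(ctx,
		resource.WithAttributes(semconv.ServiceNameKey.String("checkout-service")),
	)
	if err != nil {
		log.Fatal(err)
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
		sdktrace.WithResource(res),
	)
	defer func() { _ = tp.Shutdown(ctx) }()
	otel.SetTracerProvider(tp)

	// Manual instrumentation: wrap one unit of work in a span.
	_, span := otel.Tracer("example").Start(ctx, "process-order")
	// ... do the work here ...
	span.End()
}
```

The automatic path mentioned above would replace the hand-created span with instrumentation pulled in as a dependency or runtime agent, but the exporter and provider wiring stays roughly the same.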
Question in the Slack: what is the OpenMetrics roadmap, and how do we better incorporate OpenMetrics into this model as well? So today the focus has been on supporting the Prometheus endpoint, but nothing specific to OpenMetrics. Question: which vendors do you have on board? You name the vendor, I'm sure they're on board. New Relic, AppDynamics, Dynatrace, Honeycomb, Splunk, LightStep, Elastic, Sentry; I don't know, there's a lot, but pretty much any major vendor in the space, and all the open source ones are supported too: Jaeger, Zipkin, Prometheus. Pretty much any vendor in the tracing or metrics space has a way of consuming this data today.

There are actually a lot of vendors that don't and that aren't involved, just a heads up. Yep.

And then for next steps, if people are interested: all the conversations happen on Gitter, so you can go join Gitter. There's a whole bunch of SIGs, or special interest groups, for OpenTelemetry, and a lot of GitHub issues have been labeled with good first issue and help wanted labels. So that's some quick context. Any questions before I jump into semantic conventions? Let's do it.

As I mentioned, there's a specification, and that lives in the specification repository. There's a huge table of contents here, but one section is all around data specifications and semantic conventions. When you click that link, you'll see that there are three different types of conventions defined today: spans for spans, metrics for metrics, and then this thing called resource. I should explain resource. Basically, OpenTelemetry's approach is that anything infrastructure-like is considered a resource. So if I have an application that emits a metric or a span, that application is running on something; maybe it is a Docker container on a Kubernetes cluster in an AWS region as part of an EC2 instance. That chain of infrastructure is considered resource information, and OpenTelemetry has conventions around tagging what a resource is, so you can identify objects regardless of their data source. Spans and metrics are kind of self-explanatory, and eventually logs will be added; as I mentioned, it's still early days for that in the collector.

From a resource perspective, there's a bunch of unique names here. I won't have time to cover all of it, but as you can see, for each subcategory, like what a service is, there are what are called attributes. Attributes are like labels, key-value pairs that you add onto things. For each of these conventions you'll see a description, what type it is, and whether it's required, which means it needs to be included as part of the telemetry data that's emitted, or whether it's optional and you can't count on it being there. The primary reason these semantic conventions exist is that normalizing how you emit telemetry data gives you a vendor-agnostic solution. If everyone knows that, in this case, the service name and the instance ID are required and the other fields are not, then a backend can take a dependency on that data being sent in. As for the non-required fields: some vendors actually do care about them, so for them they are effectively required, and some of the non-required fields are just additional metadata that can be beneficial but could also increase cost or cardinality depending on your use case. If you drill into these, you can get down to very specific things. Like I mentioned, for AWS or other cloud providers, or for Kubernetes-native and CNCF things, you'll see there are different naming standards around namespaces or pods or controllers. All of this is defined in a variety of README files here, and you can drill into the specifics.

There are also tracing semantic conventions. These are things like: how do I know that thing over there is a database? How do I know I'm making a RESTful call? How do I know I'm talking to a serverless function? What is that message queue? These are normalized ways to convey that information. One of the reasons this is powerful for distributed tracing is that I can actually infer services that are not instrumented.
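The next paragraph walks through exactly that case, an uninstrumented third-party database. As a rough sketch of what it looks like from the instrumented side, here is a client span carrying database attributes along the lines of the conventions just described. The attribute keys shown (db.system, db.name, db.statement, net.peer.name) are a plausible subset, and the function names and values are illustrative assumptions, not taken from the talk.

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/trace"
)

// runQuery stands in for the real database call in this sketch.
func runQuery(ctx context.Context) error { return nil }

// queryOrders wraps an outgoing database call in a CLIENT span annotated with
// the tracing semantic conventions, which is what lets a backend infer the
// remote, uninstrumented database: that it exists, that it errored, its latency.
func queryOrders(ctx context.Context) error {
	ctx, span := otel.Tracer("example").Start(ctx, "SELECT shop.orders",
		trace.WithSpanKind(trace.SpanKindClient))
	defer span.End()

	// Attribute keys follow the database span conventions; the exact
	// required/optional split is what the specification pages above define.
	span.SetAttributes(
		attribute.String("db.system", "postgresql"),
		attribute.String("db.name", "shop"),
		attribute.String("db.statement", "SELECT * FROM orders WHERE id = $1"),
		attribute.String("net.peer.name", "orders-db.internal"),
	)

	if err := runQuery(ctx); err != nil {
		span.RecordError(err)
		span.SetStatus(codes.Error, err.Error())
		return err
	}
	return nil
}

func main() { _ = queryOrders(context.Background()) }
```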
If my application calls a third-party database that I do not control, and I use these conventions, I can actually infer that it is a database, that it had an error, that it has this latency, and I can show that in a backend of my choice. Which is kind of cool.

So, last but not least, a quick look at where the metric conventions stand. That part has not been merged yet, and what I would encourage folks to do is look at what are called OTEPs. OTEPs are OpenTelemetry Enhancement Proposals. There are several different proposals across all the categories, with traces, metrics and logs being the primary ones. In here there is an OTEP covering metric conventions as well as labels, labels being the dimensions, the metadata you enrich the data with, and the proposal outlines how to do this in the different places, whether it be the collector or the client libraries. There are a bunch of links showing how this is currently implemented in OpenTelemetry, but there is no finalized version of the metric semantic conventions, which I'm assuming this audience primarily cares about from a Prometheus perspective. There is a proposal up, though. All of these proposals people can comment on, and eventually they get approved, merged and implemented across the client libraries and the collector itself.

I want to be respectful of time, so I'll open it up to questions, and I'm happy to do follow-ups if we want to drill into specifics, but hopefully this provides a good foundation on what OpenTelemetry is, what semantic conventions are, and what is currently available today.

This is an amazing overview. So, maybe a first question: can you link the PR into our document so we can take our time and review? I think this is really nice. I love the idea of proposals; it's super powerful on GitHub. Yeah. Another question was around releases: how does OpenTelemetry release all those different things, support for vendors and so on; really, how does it work? Do you have a release process or cadence?

Yeah, yeah. So the way the project operates today is that every repository has its own set of approvers and maintainers, and you should think of each one as an independent project in many ways, shapes or forms. Every team can decide what they want their cadence to be. In general you'll find that the client libraries release anywhere from once a week to once a month; it falls somewhere in that realm. The collector is once every two weeks today, and the specification is updated as needed. All the SIGs meet at least once every two weeks, and some meet every single week.

The more interesting question is actually the one around vendor support, so let me show that real quick. I'll use the collector as an example, because I'm most familiar with that aspect. There is this repository called opentelemetry-collector; it is the core collector repository. Everything in here is open source, so you'll find it has support for Zipkin, Prometheus, Jaeger, Kafka, but there is no vendor stuff in this repository. Instead, what OpenTelemetry does is have this notion of, oops, I guess I didn't go to it, a contrib repository. For every core repository there is a contrib. The contrib repository is where vendor or commercial third-party stuff lives, or where less common open source things live. So if we go in here and look at the exporters: I mentioned a whole bunch of companies that are supported.
You can see all the vendor stuff listed here. Contrib is a superset of core, so you get everything in core within contrib, plus all the additional stuff on top of that. Most of the client libraries do the same thing: you'll see a Java and a Java contrib, a Python and a Python contrib.

Nice. No, it makes sense. Yeah, how, I mean, it's always kind of tricky, especially with Golang and stuff, how you add those plugins; there's basically a wrapper that wraps all of them, right? Right, you can't do auto-instrumentation with Golang, so you just wrap the packages that you're using. Yes, exactly.
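To make that Go point a bit more concrete: since there is no auto-instrumentation for Go, you wrap the libraries you already use. Below is a minimal sketch using the otelhttp wrapper from the Go contrib instrumentation packages; the import path reflects the current contrib layout as far as I know, and the handler, operation name and port are made up for the example.

```go
package main

import (
	"fmt"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})

	// No agent and no bytecode rewriting: the stock handler is wrapped, and
	// every request then produces a server span through whatever tracer
	// provider has been registered globally (for example one that exports to
	// a collector, as in the earlier sketch).
	wrapped := otelhttp.NewHandler(mux, "http.server")

	_ = http.ListenAndServe(":8080", wrapped)
}
```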
I have a question that I've gotten from a number of others. Folks have an existing Prometheus, Thanos or Cortex, or some sort of OpenMetrics, remote-read and remote-write based durable metrics backend. I understand it works the other way around as well, but for things that are instrumented and emit OpenTelemetry metrics through the OpenTelemetry APIs, with collectors and so on: is there a document you can point us to, or is there a short path to go from OpenTelemetry metrics to having those metrics land in a remote-write compliant backend, whether that's Prometheus or the other players that comply with that interface?

Yeah, so the remote write capabilities are not all there yet. Our answer today would be to look at the official Prometheus support. Oh, I'm in the wrong place, let's go to the exporters. Exporters are how you get data out, and you can see there is a Prometheus remote write exporter here, so this is how you send data with remote write support. You read the receiver documentation for how you get data in, and the exporter documentation for how you get data out. It doesn't matter whether OpenTelemetry sits in between, whether you have an app that sends to a collector and the collector sends to Prometheus, or an app that exposes a Prometheus endpoint which the collector scrapes and then sends the data on. Both mechanisms are supported. But there is no real documentation today beyond the receivers and exporters themselves. That is something that has been talked about in OpenTelemetry, but it's not available yet.

I see, and in the receiver list earlier, that was not the remote write exporter but the Prometheus exporter. Is that a Prometheus-style exporter that then has to be scraped? Yes. Okay, cool.

Yeah, I'm happy to spend more time on this; I know we have other topics for today. Feel free to ask questions in the Zoom chat and the Slack channel, or in the Google doc, and we can definitely do follow-ups on this. Yeah, I think we want to do some webinars and introduce a couple of projects, and this was a first step to introduce OpenTelemetry. We can keep a bit of time for questions, so nobody be shy. If not, we can actually talk a little about the metric label semantics. Yeah, I'll paste that back into the Slack channel.

And are these semantics just suggestions, or are they actually enforced somehow? Yeah, maybe a good example is the tracing semantic conventions, since those are already in use. We'll use RESTful calls as an example. This page is going to be long, but here is an example for spans, for the trace semantic conventions. If you make a RESTful call, these are the pieces of metadata we have conventions for. You can see that http.method is a string field, you can see the type of each attribute and whether or not it is required, and there are notes, for example around what should and should not go into the URL. Those are the trace semantic conventions.

The conventions come from many different vendors, the vendors involved with OpenTelemetry of course, not the ones that aren't, and many vendors bring in how they handled this in their backend implementations and discuss why they did it that way. Then the vendors talk through what the best approach is when trying to land a semantic convention. End users do the same thing: many end users have their own clients and their own telemetry data today, and they're looking at how to move to a more open-source, open-standards approach.

In the case of metrics, the OTEP is not really a PR like that one; that one is merged. Yeah, this isn't actually in a pull request. So this has been merged as a starting point for standard metrics, but it's not something that has actually gone through the metrics SIG and been merged into the conventions, I believe. That's why the metrics page here still has work to do. That could also be because the metrics specification is not yet in the stable state that tracing is in, so it may come once metrics is stable. I believe there's a PR open for that now. I don't know if it's too specific, I'd have to look, but you can see it. The OTEPs typically link to the prior implementations: you can see there was a previous OTEP that tried to define the conventions, you can see things that are actually happening in different projects like the collector or the client libraries, and then, based on that, a proposal is made for which conventions to adopt, which units are supported, things like that.

Hm, okay. So this is not necessarily just about labels themselves; it's also about the semantics of what each metric looks like, right? So, it's both. It's supposed to cover names, labels and conventions for common instrumentation. But I think if you look at it, most of them are metric names. Names, names, names; yeah, metrics, metrics. For example, the Go runtime ones. I mean, it looks like it talks about metric names more than it does about the actual labels. Yeah, that makes sense.

Okay, so my first question: I come from the Prometheus world, I'm a Prometheus maintainer, and for us it's always about the names, for example if it's a counter then maybe a _total suffix and so on. So these names are totally new to me, but then I probably wasn't a heavy user of other monitoring systems either, and I guess this reflects the experience of many, many vendors. And for me, there is a project that already has at least some of these naming, let's say, rules, which is OpenMetrics.
So how is that relevant here? Are we somehow going to collaborate on this, or maybe not? What is your goal here?

Yeah, so the OpenTelemetry team has, do I have it open? I need to go to the community page. There it is, community members. Okay, cool. So, the OpenTelemetry project has a governance structure like most do. It also has a Technical Steering Committee. Where is that? Maintainers, trace approvers... here we go, Technical Steering Committee. This is the current Technical Steering Committee, and I know this group has met with the OpenMetrics team multiple times. I do not know the outcome of those discussions. I am most familiar with Bogdan, so I would recommend reaching out to Bogdan; he would probably have the latest on what is going on with that. I know there's been a lot of back-and-forth communication, but I haven't actually seen anything come out of it, so I don't know the answer to your question.

Yeah, no worries, thank you. To be honest, SIG Observability is, I don't know, a very good place to actually have those discussions around somewhat taboo topics, the overlaps between projects and why we should collaborate. But yeah, I think we should definitely talk about this a little bit more. Those semantics are one level of overlap, something that OpenMetrics already specifies in some way, but not everything, not the actual naming of, you know, CPU utilization. So I would be curious whether they even conflict, because that would be kind of annoying, really. That would be my concern and something to discuss further. But overall, even in Prometheus there are many semantics, recommendations let's say, that are just spoken and never actually written down. So it would be amazing; I would be super curious to see that and to review it. Yeah, thank you.

And Richie is typing in the channel with some more information; I guess he cannot talk: I joined the metrics calls several times. My last status was that the plan for OpenTelemetry metrics was to support OpenMetrics as a first-class wire format, and naming was a huge part of the discussion. I was under the impression that OpenTelemetry would follow OpenMetrics, Prometheus and Kubernetes as the standard. So I guess there is some kind of miscommunication, or something we can discuss further.

Yeah, there is a dedicated metrics specification SIG for people that are interested; there are links here that people can use to join and talk about that particular aspect, or there's the general specification meeting as well. But yeah, I can follow up with Bogdan on this and try to figure out what the status is. Who on the OpenMetrics side would be a good point of contact? I think Bogdan would probably know. Richie, Richie. I was not involved in those discussions, but I'm super into community and just collaborating on what we do. I mean, you did that as well with OpenTracing and OpenCensus, how you were able to actually merge and not fight each other. We are doing the same with Thanos and Cortex; we are literally, slowly, reusing each other's stuff. It's hard, but the discussion is worth having. So yeah, I'm happy and curious how we can move that forward. Yeah, and having that remote write exporter is a really good sign, right? I mean, that by definition is that wire protocol.
So, I'd love to see that. Well, remote write has nothing to do with OpenMetrics though, right? And, yeah, to be honest remote write is even more concerning for backend systems like Thanos, Cortex and M3DB, or, I don't know, CloudWatch, which probably someday will support remote write or maybe even supports it now, things like that. Because I assume this works like a sidecar: next to our application there will be a sidecar with OpenTelemetry that just pushes metrics directly to those backends. As far as I know that's not very scalable, but I guess for some use cases it makes sense. So, but anyway.

Yeah, maybe it would make sense for Richie or someone to do an overview of OpenMetrics and what specifically it is. I was just meaning more in terms of interoperability: we use remote write as our standard internally, in our own infrastructure. Granted, that's not OpenMetrics specifically, to your point, but it's the closest thing we could find quickly to be able to interoperate across systems for metrics, in any event. I know Richie can't respond with a mic, but I'll take that; that would be a good topic, and I would definitely be interested in it for sure. Yeah, I'll let Richie cover it maybe next time around, but there are some very long-running things that, as I understand it, could be coming to a head in the near term around OpenMetrics as an actual standard with the IETF and so on. So, anyway.

But I just want to summarize: having nice semantics, like a metric name for overall CPU usage that is defined the same way for, let's say, both Golang and Python and different environments, is golden, because then all those dashboards and other things would work across different backends, different applications, different languages, maybe different orchestration systems. That's really, really amazing, and we have to have those. So that's a really good step. Okay, so it looks like an action item: talk about the OpenMetrics specification and the metric semantics. OpenMetrics definitely doesn't define everything, but, for example, having a dot in the name of a metric, I don't even know if that's allowed by OpenMetrics, so how can people leverage these semantics here? Those small details I'd be curious about. I added an action item for Richie to maybe show us and talk about OpenMetrics more at some point as well. Cool.

Yeah, one thing worth noting, and we didn't cover the specifics of it, but one of OpenTelemetry's goals is to be format agnostic. It has something called OTLP, the OpenTelemetry protocol, which is meant to be a superset of capabilities. So the collector can actually receive Prometheus and export in whatever other format; maybe Zipkin to Jaeger is a better example: Zipkin in, Jaeger out. It does that by converting Zipkin to OTLP and then OTLP out to Jaeger. So if OpenMetrics doesn't allow periods, it doesn't matter: as you convert out of OTLP, you convert into the OpenMetrics convention. So there actually is a path forward. That's part of the goal of OpenTelemetry: there will be a new standard tomorrow, we need to be ready for it, and how are we going to be ready for it? Yeah, makes sense, totally.
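To make that format-agnostic point slightly more concrete, here is a tiny sketch of the kind of name translation it implies when data leaves OTLP for a Prometheus or OpenMetrics style backend: dots become underscores, counters pick up a _total suffix, and so on. The function and the exact rules here are illustrative assumptions, a simplified picture of the idea rather than the collector's actual translation code.

```go
package main

import (
	"fmt"
	"strings"
)

// toPrometheusName sketches the convention mapping described above: a dotted
// OTLP-style metric name is rewritten into a Prometheus/OpenMetrics-compatible
// one on the way out of OTLP. Simplified illustration only.
func toPrometheusName(otlpName, unit string, monotonicSum bool) string {
	name := strings.ReplaceAll(otlpName, ".", "_")
	if unit != "" {
		name += "_" + unit // e.g. append "seconds" or "bytes"
	}
	if monotonicSum && !strings.HasSuffix(name, "_total") {
		name += "_total" // counters conventionally carry a _total suffix
	}
	return name
}

func main() {
	fmt.Println(toPrometheusName("http.server.duration", "seconds", false)) // http_server_duration_seconds
	fmt.Println(toPrometheusName("http.server.requests", "", true))         // http_server_requests_total
}
```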
Cool, let's move forward. We have two more topics and 20 minutes left. So, okay, we have a topic from Jonah. Do you want to cover that?

And maybe before this, I just want to mention: for the whole kind of meeting we have right now, and I think we are doing it already, the goal is to talk about status, synchronize, and advertise our working groups, but to do the work, the actual work, offline, so that here we can just summarize, talk about the details and synchronize. Because otherwise we don't scale. That's why I think the most important point is that we are not walking through the doc as we used to. So yeah, Jonah, what's the plan with the working group? Good to go? All right, cool.

So I've been chatting with Matt about the user survey that we did and trying to make some improvements on, let's say, the methodology or process, to really understand what users are doing and where they're going with their strategies and their use of open source tooling, I would say. And obviously there was the release of the landscape document, I guess a couple of months ago. So I started putting together a document with Matt, just suggesting some of the things that we could do to improve the methodology for the data collection, so that maybe we could get a broader view of what people are using from an end-user perspective. I outlined some of the changes, and Matt put together sort of a template for us to start with. The idea here, I think, would be to try to put this into some type of document, track it in an ongoing way through GitHub issues eventually, and come up with a survey that we can execute, just to get a better handle on what people are doing and where they're going. The discussion we just had about overlapping standards and what people are doing is a perfect example: what are users thinking, where do users want to go, what do they want from the community? I think these are some of the questions that could help us make better decisions as a community, and maybe force us to collaborate a little bit instead of building multiple standards all the time. So that's the thinking. Please leave comments on the doc, feel free to go into suggestion mode or add anything in there, and then I'd like to collaborate on the doc, as Bartek was saying, and hopefully come up with something more meaningful that we can use. I don't know if you had anything else to add, Matt, about how you wanted to drive it forward.

No, I think this is great. I like the idea at the end of the doc where there's a rough timeline: publish an initial draft, as has been done now in the Google doc, and keep it open for four or five weeks. My only input would be that before we actually go launch the survey and start executing it, we'll need a formal proposal for the working group formation that is approved by the TOC. I don't think that's a big hurdle or anything like that, and we can iterate on this within the SIG for as long as we think we need. Five weeks seems reasonable to me; I'm curious if others like that timeline. But at that point we can have a formal proposal that the TOC would vote up or down, I would expect up, and then we're off to the races. Yeah, thank you for putting this together. I did very little: I copied this in and added some stuff at the top. But I would encourage everyone on the call, or watching the recording later: this is what we can do as a SIG, so please do engage, and let's work in this doc and hammer out something that we come to by consensus.
And although we're all technologists, I think it's important to also highlight people and process and other kinds of things that we don't tend to think about so much, but that are really important to end users, to understand where they want to go and where they need to go, because the vendor side is all technology and I think it's important to understand how it fits into the culture and the organization and the processes. So, that's all.

Yeah, that makes sense. So, to motivate you all: in this document there is a really nice list of potential questions that we want to ask, and that's definitely something you can contribute to, both what you would like to know and also which questions would be fair and not exhaust the reader. One more thing to add: I would also be curious how to motivate people to actually answer, right? If you have any ideas, let us know.

I would also suggest, come to think of it, that we find some time. Jonah, if you want to organize it, or we can try to organize it on the CNCF calendar. Again, the working group isn't ratified yet, and we could do all of this async using Slack in the SIG Observability channel, but it might make sense to have some working sessions like we did for the charter, maybe weekly for the next five weeks or something like that, where we can discuss these comments. We found that to work pretty well, I thought, for the charter document. So maybe we could do the same sort of process, so that, again, this call, the SIG fortnightly call, is just a status check, but we've actually got time outside of it to move this forward in a structured way.

So I guess it's time to ask for volunteers, or who wants to participate in this type of thing. Is that sort of how it proceeds, as someone that hasn't done this before? I'll volunteer to participate, but I would prefer to be a facilitator here, not a dictator. It would be good to get some users involved too that want to answer certain questions. You could also put something out to the mailing lists, the CNCF SIG Observability mailing list, or the broader end user community mailing list, to say: hey, even if you haven't been engaging with SIG Observability, here's a place where you can not only give us your feedback, but also go to the connections in your own personal network and help drive this forward. I think the big takeaway I had from Cheryl a couple of weeks ago when we were talking about this, or four weeks ago now, time flies, is that it's probably going to be personal networks and personal connections that really allow us to horizontally scale the effort, versus just blanket emails and things like that.

Yeah, totally. But I think what you're talking about is just one part of the work, actually creating the survey and designing it in a nice way. Another part is execution; I think those could be coupled as well. Yeah, makes sense.

So, we have the last topic then. Simone, are you here to introduce that to us? I'm here, yes. So, from last week, we were trying to have a discussion about what we want to do in the SIG, or at least to have some direction. I can turn on my video just to be a bit more friendly here, since you guys have video; I don't have a background, I just have a kitchen. So, my idea was just to write something down. Prabha was the first one to respond, and she would have more of an end-user perspective.
In my case, when I started with the document, I was looking more into use cases, at least for the things that I work with. We are also reshaping and doing a lot of things internally in Ericsson around how to operate and expose data from the network to operators, or for Ericsson, or for other users. Based on these users we have different use cases: which sort of data, which sort of observability, and what comes with that. So my idea with the document was to have something like that, as I'm not sure this exists in the CNCF or under the observability umbrella, something that defines a little bit what observability is in a more generic way. There are other sources as well. For example, Google had this software, not software, the Site Reliability Engineering book, where they talked about the four golden signals, mainly for metrics I think; it was more for the infrastructure, for the site reliability engineering folks at Google. Then there is a lot of discussion and a couple of good blog posts from Cindy Sridharan, where she talks about distinguishing monitoring from observability, what is one and what is not the other. And a lot of the talks that I have been following in this space are about companies that are walking this route, working on their observability internally. Either they are writing their own apps, or they run the infrastructure, or sometimes they do both; in the case of Google they do both, in other cases they are just writing apps but deploying on some public cloud. So they usually have a lot of stories, more like lessons learned, about scaling data ingestion and consolidating tools, for example which tools worked better together for their use cases, or migration challenges they had moving from one tool to another. I guess a lot of that may be covered in the OpenTelemetry group already, I'm not sure. But I thought about having something that sets the stage a bit. As we would say at the IETF, usually you don't start a working group without a use cases document. In our case we don't have such a document, but we are trying to take all this observability information that we have in other CNCF projects, consolidate it, and have a starting point here for users and people working in this area. So I thought this could be a structure for us to start working on something that could become a public document, a public white paper or something.

No, it makes sense. Actually, the same topic popped up very early; I think even Steve mentioned it. So let me add the issues. There's issue 16 on our GitHub, let me open it again, which is essentially the white paper about observability. What we were referring to as a kind of book around the CNCF observability spectrum could be something like this, I assume. And in the same area there is also issue number 19, which talks about an entry-point page or documentation for someone totally new to the CNCF world. Let's say they grab Kubernetes and now they need to migrate, or they want some observability, and they don't know where to start or what the offering is; something that routes them to the proper documentation for the different projects and maybe carries some global information. So we are definitely looking for some good ideas on how to solve those.
And also I think there was an idea to create a working group around writing this kind of paper. But we definitely need to frame it better, so as not to duplicate the work that all those smaller projects have already written about observability topics, and to fairly compare those. So yeah, any other thoughts?

It makes sense. I was not sure how to start, actually. I saw that a lot of you already have practical experience implementing, running and working with different frameworks and observability. I come from a slightly different corner: we have legacy things that are being moved to more cloud-native, containerized environments, and we have a lot of, let's say, regulations, for example 3GPP, which decides which sort of data has to be available to which entities. We have, for example, lawful interception. We have operators that need to have their own data sets. We have us, as troubleshooters, needing different data. So I was wondering: how do we put this together as a common work base, a common base for us to do something here?

Yeah, anyone have any thoughts? Hey, everyone. Sorry, Matt. Hey everyone, this is Karthik. I had a chat with Matt last week, a few days ago, about SIG Observability, and it was really interesting to join. I just wanted to build on some of the points that Simone made. I come from a project called LitmusChaos, which is a chaos engineering project for Kubernetes, and the notion of observability within chaos is something that we are trying to build on and understand. I think any document like this that talks about observability use cases in general, or an index page laying out the various options, could be really instructive; we could get a good idea of how to go about things. I was also listening in on the earlier topics about standards and conventions around naming the metrics and things like that. So, just getting a sense of what all observability covers, and then finding out how to go about doing it. I'm probably a good example of a beginner end-user case in that respect. So yeah, I'm really looking forward to that kind of material.

Yeah, the intro that I left in the document is a bit about things that I have seen in other communities, people working with different stuff; we also have more of an SDN hat on here. And everybody is basically talking about the same thing: exposing data or collecting data in some form to have better ways to monitor, troubleshoot, or whatnot, the system. If I talk about observability with my telco hat on, a lot of people go back to the 3GPP legacy way of doing observability, which defines data sets that are telco-specific. And those data sets are only about system health. It's only about saying, okay, the radio is running, the core of the network is running, but there is no connection to applications, no connection to performance, none of the more refined things that you would like to do with the system. It's just a very high-level overview that has to do with collecting and exposing this data. But things are changing now, so we are going in a different direction; we still have to educate a lot of people, though, about what observability means compared to what they already know.

That makes sense. The best way to eat an elephant is one piece at a time; that's a terrible analogy, I suppose. But I heard a couple of things which I think would be good to start work on.
One of the things that you mentioned is defining the different operators, what types of job functions or people need access to observability data, enumerating those, and calling them out almost in a requirements-based way. As a small, concrete example: at my day job we work in insurance and financial tech, so we have a lot of auditing and regulatory compliance issues, not issues but concerns and requirements, in terms of how we store data that might contain sensitive information. So I think there could easily be a white paper, today, that we could do collaboratively in the context of the SIG, that calls out all of those stakeholders or roles and what their concerns are. That provides a starting point, a shared set of nouns or personas. Then, in terms of what projects exist in the CNCF and where their gaps are, and there are obviously a lot of vendors and a lot of overlap, we just talked about it earlier this hour, that same analysis, enumerating not so much the operator side but the technology or project side, identifies the gaps where the CNCF does not have a comprehensive end-to-end solution. That's one of the primary motivations for creating special interest groups in the first place, from the TOC's perspective. I don't think one blocks the other, but I see both as really important. I didn't create a GitHub issue for the paper intro that you started, because I didn't want to lead the witness, to Bartek's point. I think if we can either firm up its scope or split it into maybe those two things, or maybe three, I don't know, then yeah, let's work on it.

Yeah, I think we should start with a skeleton at least, or have an idea of which sections, which things we want to write about. As you said, retention and regulation are one aspect that might affect one industry but not another. When I talked about use cases, it's more about the end user at the end. If a system that I deliver crashes, we have the operator side that has their own OSS on site, but if something crashes that needs support, for example from Ericsson, then we need troubleshooters from Ericsson who have a completely different view of the system. So those are the sorts of things I was trying to structure: whether we could generalize that for your employer, for my employer, for other industries, and which challenges we have when we talk about observability. Lawful interception and data retention policies are completely different depending on where we deliver something. So I certainly need some help writing that, but my idea was to get this skeleton going, if the group is interested in that, of course.

Definitely. To me it feels like the white paper idea we had, so we can just start it off. Like we did with the working group for the survey and the state of observability, we should gather people who would be passionate to help, and create the structure, the questions and some placeholders, for this kind of document, some kind of overview that we are looking for. So I would say let's take issue number 16, the white paper, and just get this started: start a document and start some working group, or rather some focused group, on this one. Yeah, let's do it. I can make a Google Doc, or anyone can. Again, we have the long-running effort to get our own doc store; it's still coming, I think people have just been busy, and we should have that in the next couple of weeks.
But for now, I can make one, or you can make a Google Doc. I noticed we have a Word doc there, so we can more easily collaborate in a Google Doc. I think we're almost out of time. We are, it's 1:01. But if you want to start something in the Google Doc and start writing or brainstorming, and then other people come and add their own ideas, then maybe we get a better structure than my Word document in Slack. No, no, no. Yeah, I'll put something together; I'll take what you've already done, put it into a Google Doc, and that will be a starting point. Thank you very much for putting that together, that's awesome. I gave it a read earlier today. Thank you very much.

Okay, looks like time is up. Thank you so much for your time, thank you so much for your contributions, and please get involved and help with the working groups if you are interested. See you in two weeks. Bye-bye. Bye, thank you. Thanks.