So, let me share my screen. Can you guys see my slides? Yep. All right. Thanks, Ben, for the introduction. Once again, my name is Noman Abbas. I work at Pinterest on the Visibility team, and our team builds tools like the metrics pipeline, log search, and distributed tracing. Before Pinterest, I worked at companies like Netflix and Microsoft, where I built components for their cloud platforms. In this talk, I'll cover how we use our tracing data at Pinterest to solve performance challenges.

Here's the agenda. We'll start with the Pinterest infrastructure, basically how our distributed tracing works. Then I'll go over some data visualization tools we've built to look at this data, then some more tools we've built for deeper analysis, and then some future plans for this project.

Okay, so this is what our tracing pipeline looks like. We have instrumented most of our services, starting from the clients like iOS, Android, and web, and including the CDN, the front-end API, and all the backend services. When a request passes through these services, they log trace data to our Kafka pipeline. In this diagram, the black arrows are the user request path and the dotted arrows are the trace data path. All this data from all these services goes to Kafka. From Kafka, we have a Spark collector that picks up the data and pushes it to Elasticsearch. On top of Elasticsearch, we've built a bunch of tools to view this data, to search it, and to do deeper analysis on it. In this talk, I'll quickly go over these tools and explain why we built them and what their different use cases are.

Before I move forward, I want to highlight a few things. The first is that it's end-to-end: tracing is enabled from the very start, from the client all the way to the backend, so we capture every step of the process. This includes time spent on the network between the user and the CDN, time spent on the CDN, and time spent by all the backend services. The second is that it's all real-time. Our users, basically our engineers, can access this data within a few seconds of the request happening; there's no lag in the pipeline. In the worst case, all event data reaches our backend within one minute. The third is the Spark collector we built. Its basic job is to consume data from Kafka and push it to Elasticsearch, but we added extra functionality to do post-processing there as well: feature extraction, data cleaning, data blacklisting, that kind of thing. One more thing to highlight: the Spark collector currently does all its processing at the span level, and we're planning to add windowing in Spark so that we can also do post-processing at the trace level. We need windowing because different spans of a trace can arrive at different times; unless we have a window of a minute or so in which to gather all the spans for a given trace and process them together, we can't do trace-level processing. The last thing is scalability. Our current pipeline is very scalable: right now we handle around 20 million data events per day, we've seen 10x spikes in data volume, and so far the system has scaled well through those spikes. Towards the end of the presentation, I'll circle back to this point and explain why having a scalable system is so important.
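To make the collector and windowing idea above concrete, here is a minimal sketch, not Pinterest's actual collector, of consuming spans from Kafka with Spark and grouping them in a roughly one-minute event-time window keyed by trace ID so that all spans of a trace can be processed together. The broker address, topic, and span field names are illustrative assumptions, and a console sink stands in for the Elasticsearch writer.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, LongType

spark = SparkSession.builder.appName("span-collector-sketch").getOrCreate()

# Assumed span schema; real spans carry more fields (annotations, parent IDs, etc.).
span_schema = (StructType()
               .add("trace_id", StringType())
               .add("span_id", StringType())
               .add("service", StringType())
               .add("duration_micros", LongType())
               .add("timestamp_micros", LongType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "kafka:9092")   # assumed broker address
       .option("subscribe", "trace-spans")                 # assumed topic name
       .load())

spans = (raw.select(F.from_json(F.col("value").cast("string"), span_schema).alias("s"))
            .select("s.*"))

# Group spans of the same trace inside a one-minute event-time window so that
# late-arriving spans still land with the rest of their trace before processing.
per_trace = (spans
             .withColumn("event_time", (F.col("timestamp_micros") / 1000000).cast("timestamp"))
             .withWatermark("event_time", "1 minute")
             .groupBy(F.window("event_time", "1 minute"), "trace_id")
             .agg(F.count("*").alias("span_count"),
                  F.sum("duration_micros").alias("total_duration_micros")))

# The real pipeline writes enriched spans and traces to Elasticsearch; a console sink
# keeps this sketch self-contained and runnable.
query = per_trace.writeStream.outputMode("update").format("console").start()
query.awaitTermination()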
So, we'll start with how we view our traces. To view traces, we use the Zipkin GUI. It's a really nice tool for looking at the details of a single trace: it gives you a timeline of all the events, and you can look at metadata, annotations, and so on. One thing we did was extend this tool to add an archive feature. Currently our Elasticsearch cluster retains data for 10 days, and we had requests from users to save certain critical traces that they want to attach to bug reports or reference later, so we added the ability to archive a trace.

I'll quickly show a couple of screenshots of how we use the Zipkin GUI. In this screenshot, you can see that we highlight errors in our traces. We also add in-process spans. For example, here there's a sub-span for authentication and a sub-span for request handling and processing. This really helps us performance-tune our services: we know exactly how much time is spent in user authentication, how much in request processing, and how much in encoding the response. We also annotate our requests heavily. Here you can see that we add errors and stack traces, so if a request fails, our engineers have a lot of rich data to look at and can see the exact line of code that triggered the failure.

Zipkin also comes with a service dependency graph. The out-of-the-box dependency graph did not work at our scale, so we had to rewrite a few of its parts to make it scale for us. This graph is really useful for getting a bigger picture of our infrastructure. For example, we were recently deploying our services to a new AWS region, and it helped us find the upstream and downstream dependencies of every service, so that before deploying a service to the new region we knew exactly which other services had to be deployed there first. Here's a zoomed-in view of the dependency graph. It shows how a request comes from the client, goes to the CDN, and from there to NGAPI, our front-end API service, and from there the request fans out to all our backend services. We have around 80 services and still growing, so that gives you a sense of the scale of our microservice infrastructure as well.

Okay, so now we know how to look at a trace. The next problem we wanted to solve is how to find the most relevant traces for a given issue. The Zipkin GUI gives you some functionality for finding traces, but you still don't get a visual representation. So the first thing we did was integrate our tracing data with our metrics visualization tool. We have a tool called Statsboard, and what we did was overlay the trace data on top of the metrics data. From there, users get a really good visual representation of spans. For example, in this case we're looking at the metrics for one endpoint of a service, and we're overlaying the corresponding spans for that same endpoint. The user can see which traces correspond to the P99 latency of the endpoint; they can quickly see that there was a spike at a given time, pull up that set of traces, and those will be good representatives of the spike.
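Under the hood, the overlay workflow boils down to a query like the following. This is a hedged sketch, not Pinterest's actual code, with an assumed index pattern and field names: it fetches the slowest spans for one endpoint inside the spike window so an engineer can jump straight to representative traces in the Zipkin GUI.

import requests

ES_URL = "http://localhost:9200"   # assumed Elasticsearch address

def slowest_spans(service, endpoint, start_ms, end_ms, limit=20):
    """Fetch the slowest spans for one endpoint inside a latency-spike window."""
    query = {
        "size": limit,
        "sort": [{"duration_micros": {"order": "desc"}}],
        "query": {"bool": {"filter": [
            {"term": {"service": service}},
            {"term": {"endpoint": endpoint}},
            {"range": {"timestamp_ms": {"gte": start_ms, "lte": end_ms}}},
        ]}},
    }
    resp = requests.post(ES_URL + "/spans-*/_search", json=query)
    resp.raise_for_status()
    hits = resp.json()["hits"]["hits"]
    # The trace IDs returned here are what you open in the Zipkin GUI for the full timeline.
    return [(h["_source"]["trace_id"], h["_source"]["duration_micros"]) for h in hits]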
The next thing we wanted to look at is trends in our traces. Trends like: is the latency increasing, is the error rate increasing for a service? For that we use Kibana dashboards. Since our data is in Elasticsearch, and Elasticsearch comes with the Kibana plugin, these dashboards were really helpful for giving us high-level trends in our traces. We have three main dashboards. One is for services, where service owners can look at their trace trends. Another is for the clients: iOS, Android, and web. And the third is for ourselves, the developers and operations people, so that we can see how the tracing pipeline itself is doing and whether there are any latencies or delays in our data.

For example, here we can see the dashboard for a service. The service owners can see how many spans are coming in for their service, the breakdown per endpoint, and the latency per endpoint. The most useful thing here is error rates: they can see what kinds of errors are occurring per endpoint, click on an error, and that filters down to just the spans or traces relevant to that error and endpoint. Again, it gives them an easy way to explore their trace data. The next dashboard is for clients, and it's pretty similar. You can see the breakdown of the different clients and their versions, latency comparisons between versions, and which endpoints are used most frequently by each version.

One really useful thing about these Kibana dashboards is that it's very easy to find outliers. For example, in the first case, on the left side we have a graph showing the top traces by span count, and we see a spike at a certain point where traces were coming in with tens of thousands of spans. There was clearly a bug in our instrumentation causing this, but without a tool like this it would have been really hard to figure out the source of that bug. Similarly, on the right side we see a spike in spans coming from our dev servers: someone had a problem on their dev server, it started sending too many spans, and we were able to catch it quickly.

Okay, so now we have a way to look at traces and a way to look at trends. The next thing we wanted was deeper analysis of our traces, answering more complex questions like: how much time is spent in a single service for a given user operation, or what's the performance difference between two versions of the software? For this we built a trace analyzer; this is the tool we described in our blog post as well. The requirements for this tool: first, we wanted an aggregated view of traces. Looking at one trace might not be the right way to go, because it might not be representative for a service, so the first requirement was a way to look at an aggregated view. The second requirement was to compare two sets of traces, for example traces from last week versus this week, or between two versions, or between two devices, say Android versus iOS. We also wanted a way to do root cause analysis. Let's say there's a performance regression: with 80 services in our backend, it's really hard to find out which service is causing it, so we wanted an automated way to surface the root cause of any performance problem. And then we wanted it to be scalable, since there's a huge volume of data, and easy to use.
The tool has to scale with the data volume, and users shouldn't have to understand the details of tracing; they should be able to quickly run a report and get an answer.

Here's the basic architecture of the tool we built. For the UI we use Jupyter notebooks, a tool that's already used at Pinterest, so people are familiar with it. From this UI, users provide their input, the parameters for the analysis. The notebook then triggers a Spark job. The Spark job reads spans from Elasticsearch, does the heavy processing and analysis, builds a report, and writes it back to Elasticsearch. The Jupyter notebook then reads the report from Elasticsearch and displays it to the user. The whole process can take anywhere from seconds to minutes; once the report is done, the user gets an email and can go look at it.

Here's the UI for running a report. Basically, you specify the time period you want to look at, the service, the endpoint, and some other binary annotations, and you can provide these for two different sets of traces if you want to compare them. And here's a sample report summary. In the summary you can see things like the average latency, the calls per trace, and the total number of calls made. The most interesting part is the three graphs at the bottom. This report, by the way, is for NGAPI. The first graph is a histogram of NGAPI's self latency, and we can see that between batch one and batch two, the two sets of traces we were comparing, there wasn't much difference. But in the second graph, the overall latency, we see a clear shift of the histogram towards the right. This highlights that while NGAPI did not add any delay to the request, the backend services did, because the overall latency increased. That shows how this tool can be used for root cause analysis, to find where the extra latency is coming from.

The report has lots of other things as well. One is downstream service latency: a table listing the latency introduced by each downstream service. This is their self latency, so it does not include latency incurred by any further downstream calls those services made. We have a similar table for the number of downstream calls. In these tables, the tool highlights the services with the highest impact, and engineers can then quickly look at the dashboards for those services and see how they're performing. The last table is per-endpoint latency impact: each row shows a span name, client, and server, giving a more detailed view of which endpoint added the most latency in the set of traces we're looking at. From there, engineers can see that a particular endpoint of a particular service had the most impact, go look at the code changes made for that endpoint, and see what added the extra latency.
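To make the report's key comparison concrete, here is a hedged sketch, not the analyzer's actual code and with illustrative field names, of comparing a service's self latency against its overall latency across two batches of spans.

from statistics import mean

def self_latency_micros(span):
    # Self latency approximated as total duration minus time spent in child calls;
    # this ignores overlapping or parallel children, which a real analyzer must handle.
    return span["duration_micros"] - sum(c["duration_micros"] for c in span.get("child_spans", []))

def summarize(spans):
    return {"avg_total_micros": mean(s["duration_micros"] for s in spans),
            "avg_self_micros": mean(self_latency_micros(s) for s in spans)}

def compare(batch_1, batch_2):
    before, after = summarize(batch_1), summarize(batch_2)
    # Positive differences mean batch 2 got slower; negative values mean it improved,
    # which is why a raw-difference column in such a report can go negative.
    return {key: after[key] - before[key] for key in before}

If the self latency barely moves while the overall latency shifts right, the extra time is coming from downstream services, which is exactly the NGAPI example above.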
Okay, we're kind of running out of time, so I'll go through my last few slides quickly. The trace analyzer I just described isn't very flexible, so some of the engineers requested a library where they can just pull all the traces and do ad hoc analysis. We built a Python client for them that has all the helper functions for that analysis. Here's the sample code for this client: these few lines let engineers pull all the spans and traces for a service, and then they can do their own ad hoc analysis.

And some future plans. So far we have done trace instrumentation, visualization, and analysis. The next step is automation. We want to automate the tools I described so that they run after every build and release and produce daily reports, so an engineer doesn't have to run them; the tools run themselves, and whenever anything breaks they send an email to the engineer saying, hey, your service is having this problem. One more tool we're working on is a visual dashboard. I don't know if you're familiar with Netflix's visualization tool, but it lets you view the real-time state of your services. We're using this kind of tool to display a real-time view of our trace data and the output of these analysis tools, so if one of the services is having issues, errors or a latency spike, it will clearly show where the spike is coming from and you can see its root cause.

All right, I'm two minutes over time, but I think that should be okay. So let me know if you have any questions. Go ahead.

Yeah, I do have one question if we have time. On the trace analyzer report slide, you have negative values in the raw difference column. Why is that?

So the negative values appear if the latency is decreasing for some endpoint. If you're comparing two sets of traces, the latency can decrease for certain endpoints too, right? That negative value shows that the performance improved for that endpoint.

I see. Okay, makes sense.

Sorry, I went pretty fast through these slides, but I can come back on the next call if you have any questions later as well.

I was curious about Kafka, actually. Given that the consistency guarantees it tries to provide are probably stronger than what you need, I was wondering if that's a painful part of the stack, or if you don't really have trouble with Kafka.

No, so far we haven't had any trouble with Kafka. We have other services at Pinterest that have higher requirements for Kafka, and even for them it's working well; ours is not one of the services that pushes Kafka to its limits. So far we haven't seen any problems with it.

Great, good to hear.

So I have a couple of questions; I'll try to make them quick. First of all, thanks so much, awesome presentation. I'm curious, almost as much from a product standpoint as a technology standpoint: how do most people who use the various tools you've shown get into them? Do they go to an internal URL in their browser, or are they being directed there from some other tool, like a metrics tool, or something else? And my second question, as a follow-up: the last time I spoke to someone there about this was probably a year and a half ago, and he hadn't managed to get the kind of instrumentation coverage that you've gotten at this point.
And I was curious, from an organizational standpoint, what you all did to get such high-quality coverage of your system. It's a challenge for a lot of companies, and I was curious to hear about that as well. Thanks so much.

Sure. For the first part, we have internal URLs for these tools, and people are pretty familiar with Pintrace and Jupyter notebooks, so they know where to go to use them, and we have all of these tools documented internally as well. For the second question, how we got such good instrumentation coverage: what we did was delegate the instrumentation work to the service framework owners for each language. If just me or my team were doing all the instrumentation, it wouldn't be scalable, and we don't have the skill sets for iOS and Android and the CDN and all those places. So we worked with other teams and gave ownership for every language and framework to the right team. Since they have the skill sets, it wasn't too hard for them to instrument their code and their frameworks, and that's how we were able to get such good coverage. Moving forward, those teams maintain their own instrumentation. From our perspective, we give them guidelines, like let's use this standard, let's use this schema, and we help them debug issues as well: when they start implementing the instrumentation they usually run into some problems, an encoding problem or that kind of thing, and we work with them. But in the end those teams own the instrumentation, and that's why we were able to scale so well. Did I answer your question?

Yes, you did, thank you. Some companies don't have the benefit of a central framework team for each language. That's probably something that helps a lot in Pinterest's case; you've actually factored it into your org chart in some way, which is really smart. So that's great, thank you so much.

Sure, sure.

I had a question. Thanks for presenting, by the way. You mentioned the retention period; I was curious whether you keep statistics on how many traces are viewed and how many are archived.

So we don't keep statistics; I haven't looked at how many traces are viewed, but once in a while I go and see how many are archived, because they're in a different index, so it's really easy to go and view them. We don't have a whole lot, but every day we get a few. The biggest use case is that someone finds a bug in their code, or sees an interesting trace, and they want to save it because the timeline of fixing the bug and then deploying the code can take more than seven or ten days. Our retention policy for regular traces is 10 days, but for any such use case we want to keep those traces longer. People are using it; it's maybe a few traces a day, but those are really critical traces that we want to save for users.

That makes sense. So they get saved in JIRA, or wherever that bug is being tracked?

Yeah, so the link doesn't break, and we just keep them forever.

Awesome. Well, thanks for presenting. You're welcome.

Okay, so it's 9 a.m. and we've got a couple of other items on the agenda. Unless there are any last questions, I'd like to move on. All right, let's move on.
Next up on the agenda is a report back from the W3C Trace Context workshop that took place in Seattle last week. A few things happened there, and a couple of people on this call attended; I'll give a high-level report and then maybe Erica and others can chime in with their thoughts.

The main thing discussed was the trace context specification, which really has two parts. There's trace context, which is information about the trace itself being propagated, potentially from one tracing system to another. And then there's another header called correlation context, which is something like baggage propagation: a way to transport a set of key-value pairs down the stack through a header that's been whitelisted by the various proxies. There was a lot of discussion about both. In general, it seems like more and more agreement is being reached on the trace context and trace context extension headers; I'd encourage people to go look at the repo and read up on where that's currently at. It felt to me like it was getting into hair-splitting territory, and as far as the OpenTracing project is concerned, I think it's reaching the point where we should start thinking about it. In particular: do we want to expose some of these fields on the span context API? There's going to be a span ID and a trace ID that come with this new header, and people have been asking for the ability to correlate spans, doing span observers and things like that. Not having an exposed span ID or trace ID has been a blocker for us to do some useful things. To me, that was the most important thing that came out of the workshop.

The second most important thing, which relates to the correlation context header, was a discussion on security. When you add baggage in your program and then have some third-party piece of instrumentation serializing that baggage onto your HTTP calls or message queues, in particular HTTP calls, there isn't really any API mechanism for indicating whether a request is outbound to a third-party system or staying within your own system. That makes this a security issue: without a clear way of identifying when the correlation context header should be populated, and without any kind of encryption on the data inside of it, it's really handing a footgun to application developers. In fact, in the OpenTracing API right now there's literally no mechanism for deleting baggage or indicating that it should not be propagated. So I think that's a real problem we should think about.

Some other things happened there: OpenCensus showed up, which was nice to see. But those two points, exposing the trace context fields on span context and thinking hard about baggage in terms of security, were the big takeaways for the OpenTracing project. Would anyone else like to chime in on what they saw there?

Somehow I volunteered myself to write a trace context implementation for the basic tracer. Adrian was going around the room after you guys left asking who would write reference implementations; someone volunteered for Zipkin, someone volunteered for OpenCensus, I think, and I said I'd do it for the basic tracer.
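Purely as an illustration of the span-context point above, and emphatically not the OpenTracing API or the final W3C header format, here is a toy sketch of a span context that exposes its trace and span IDs and carries correlation-context style baggage; all header names and the ID format are placeholders.

from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SketchSpanContext:
    trace_id: str                           # exposed so callers can correlate spans
    span_id: str
    baggage: Dict[str, str] = field(default_factory=dict)

def inject_headers(ctx: SketchSpanContext) -> Dict[str, str]:
    # One header carries the trace identity, the other carries key-value baggage.
    headers = {"trace-context": "00-{}-{}-01".format(ctx.trace_id, ctx.span_id)}
    if ctx.baggage:
        headers["correlation-context"] = ",".join(
            "{}={}".format(k, v) for k, v in ctx.baggage.items())
    return headers

ctx = SketchSpanContext(trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
                        span_id="00f067aa0ba902b7",
                        baggage={"project_id": "1234"})
print(inject_headers(ctx))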
Yeah, one interesting piece of the OpenCensus API was the ability to propagate baggage without ever starting or finishing a span. That was interesting to think about as a concept: is that something users might want to do, and is that the job of OpenTracing? I remember there was an original version of OpenTracing that had a public open-context layer or something like that, and tracing was built on top of it; it was the pure element of key-value pairs, propagate baggage, who cares whether there's a trace ID or not. I thought that was interesting, and it fundamentally challenges some of the model that all the trace vendors are working with.

I found it convincing, specifically, that we're trying to solve lower-level problems at a higher level, because some form of standardized context propagation doesn't really exist. We need it for tracing, so we bake it in, and then everyone comes over to the tracing system, which is an observability system, saying: can I please ride shotgun on this thing? Can I just tag my thing onto here, because you're doing context propagation in a way nobody else is? And that's because the tracing system is the only one that gets propagated through the proxies and, once inside the process, has the right kind of information to know which outgoing calls are related to which incoming calls. So we're doing all the hard work of wiring up this context, but without a lower-level primitive underneath it. I can see why we would not want to be in the business of providing all of that on top of providing a standard tracing API, but the problems keep circling around each other.

Yeah, there's sort of a separation of concerns there. In previous places I've been, we had context propagation before we had tracing, so there wasn't a need to couple the two together, because we had one before the other. But in a new situation, when I'm thinking about bringing this into a new place where I have neither, I'm faced with a much larger barrier to entry if they're not coupled. You might as well do the work at the same time, and it's maybe a nice carrot for the instrumentor: if they have to propagate context for tracing anyway, you can tell them, oh, you can also use this channel for all this other wacky stuff you probably shouldn't rely on for your application. It's amazing what shiny graphs do for buy-in.

Yeah, I don't think you should teach developers to use it for their core application logic, but the OpenCensus stuff uses it just for monitoring tags. And I know of companies that do put application logic in there. When I was going up the hill to Brown University to watch them talk about all these exotic applications of baggage, I was often thinking about whether we could somehow productize the propagation of context on its own. And then it seemed like a bad idea, because we would be at fault, as a provider of automatic instrumentation, if we missed a spot, and then everyone would be mad at us. I think it's much safer to give that to the manual instrumentor, the engineer at their own company, rather than have automatic instrumentation vendors promise to hit every context propagation opportunity. But it's kind of neat, I don't know.
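A minimal sketch of that "context propagation as its own primitive" idea, assuming a single made-up header name and using Python's contextvars for in-process propagation; nothing here starts or finishes a span, and no tracer is involved.

import contextvars
from typing import Dict

_ctx: contextvars.ContextVar = contextvars.ContextVar("propagated_context", default={})

def set_value(key: str, value: str) -> None:
    # Copy-on-write so concurrent tasks do not share a mutable dict.
    _ctx.set({**_ctx.get(), key: value})

def to_headers() -> Dict[str, str]:
    items = _ctx.get()
    if not items:
        return {}
    return {"x-propagated-context": ",".join("{}={}".format(k, v) for k, v in items.items())}

def from_headers(headers: Dict[str, str]) -> None:
    raw = headers.get("x-propagated-context", "")
    _ctx.set(dict(pair.split("=", 1) for pair in raw.split(",") if "=" in pair))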
I think if in five years everyone is using context propagation for all kinds of exotic use cases, that'd be cool.

I will say, in a previous life we had a very large microservices deployment, probably on the order of 2,000 microservices, where engineers put a lot of stuff into context. We kind of let it be free-form: if you want to use it, go ahead, and there wasn't much governance around what should or should not go into context. At one point I think we incurred something like 50% extra latency because we were sending that much unnecessary context through the microservice architecture. We had to actually dial it back, because there was just so much in there. But it's an interesting problem to have at that point. Yeah.

Something that was brought up that I think is relevant, and I'm interested in what people on this call think, is sort of the opposite direction: there's no way to tag traces at the trace level. You can tag spans, and you can attach baggage, and baggage is sort of like trace-level context in a way, right? You're saying this trace has this project ID associated with it, things of that nature. But because there are getters on baggage, the semantics are that it's a way of propagating in-band information. That implies there's a cost to adding baggage to your system, because you're going to be propagating it, so you have to think about its size. Something various tracing vendors brought up at this workshop was: if you're only doing it for the purposes of monitoring within your tracing system, there's no need to propagate that information in band. It would be much better to tell your tracing system, hey, I'm just tagging this trace with a project ID, and then out of band it indexes it or does whatever it does with that, and you're not worrying about whether it will all fit in a single header or something of that nature. That was a point I thought was potentially interesting. I'm not sure whether OpenTracing needs such a thing, but it was interesting to think about: if tracing systems were going to start indexing traces by baggage, does having getters on baggage complicate that?

Baggage seems to be doing double duty, right? Either the tracing system is trying to use it as information, to provide users with indexing based on it, in which case you've got this spurious propagation overhead associated with it; or baggage is not for the tracing system at all, and it's just the mailman delivering it to some other system, in which case tracing vendors were saying: that is not my job, that is not why people are installing my system into their system, and I don't want to be on the hook when the mail gets lost. So I thought those were two things that were almost at odds with each other.
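The distinction being debated looks roughly like this in the OpenTracing Python API; the no-op tracer below means the snippet runs without any real backend, and the operation and key names are made up. A tag stays with the span and is handled out of band by the tracing system, while a baggage item is serialized in band into every downstream request and therefore has a per-hop cost.

import opentracing

tracer = opentracing.Tracer()                 # no-op tracer; a real tracer would be swapped in
span = tracer.start_span("checkout")
span.set_tag("project_id", "1234")            # out of band: stored or indexed by the tracer only
span.set_baggage_item("project_id", "1234")   # in band: copied into every downstream request
span.finish()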
I enjoyed talking to the other vendors, but as a vendor I also enjoyed talking to customers and trying to really nail down why they wanted baggage, because I was hoping the answer would be that they just want to tag their traces, not that they actually need to use it at every single hop, because that would make my life a lot easier.

Yeah, it seems like that API really is doing double duty. Then you throw in the extra added fun of security: propagating these things out of your system by accident without any kind of encryption on the data, or adding encryption to the data and thereby potentially inserting a lot of overhead into the whole affair. I don't know; I think baggage needs to get reviewed in a very serious way. Some of these things may show up in the specification issue backlog.

A couple of things we should start thinking about based on that workshop: first, exposing span ID and trace ID. Assuming trace context is going to come through the pipe, what would we like that API to look like in OpenTracing, and would it be problematic? That's one issue we're going to raise again; it was called correlation ID or debug correlation ID in the past, but we're going to bring it back up. And then I think an issue around reviewing baggage, around security and some of these other things, will get opened. I bring this up on this call because the OTSC is supposed to be reviewing and driving that stuff, and we haven't really been doing a lot of it because we've been focused on in-process context propagation; there hasn't been much movement on the spec outside of that. But now that we've passed that particular gallstone, I think the time is ripe to pick up some of these other issues and start moving on them quickly. So when those things pop up in the specification issue backlog, I'd ask that members of the OTSC pay attention to them. I don't think we have an official SLA on responding to and resolving those issues, but we should operate as if we did, just to make sure things aren't dying on the vine. So have a look for those coming soon.

The next item here was the baggage and span context relationship, where the getters live. In the interest of time, I want to hop over that for now; I think we can discuss those nuances on GitHub. It's basically about where the getters live and whether span context is immutable, but we should discuss that on GitHub, since it's very dry.

Moving down the list, I think this next one is fairly non-controversial: the CNCF has requested that we switch the licenses for all of our repos to Apache, I believe Apache v2, or whatever the standard Apache license is. We have a number of repos currently under the MIT license; the rest are already Apache. I don't think anyone has any particular attachment to the MIT license or thinks changing it would be a problem, so very quickly, if anyone does think it's going to be a problem, can you speak up now? I didn't think so. So I think we're going to go ahead and start. We should probably open an issue and let it sit for a few days; it's not as if everyone who cares is on this call. I agree that no one's going to care. It's an API; it barely makes sense to license it.
I don't think it'll be controversial. The reason, for people who are curious, is just that the MIT license has some loopholes that can become problematic; it's the submarine-IP thing, a long story, just super wonky legal stuff. For OpenTracing it makes literally no difference. The only reason we went with MIT is that it was more popular in some of the languages, and we didn't want to add additional licenses to people's software where it wasn't necessary. But Apache is ubiquitous, so it shouldn't matter.

We went through this process with Jaeger when we joined the CNCF. The way we approached it, after consulting with our legal team, was to open an issue in the particular repo, ping every contributor in that repo's GitHub history, and let it sit for a couple of weeks; if no one objects, then feel free to upgrade to Apache 2. We also adopted, by the way, the developer certificate of origin from the CNCF, like the one Linux uses, since the Jaeger code used to live in an Uber repo which had a CLA and we killed that when we moved to an independent org. So every commit that happens in the repo has to be signed off, which is just the -s switch on git commit. It's not much, but there is a checker which verifies that all commits in the PR are signed. We may want to consider that as well.

Great. Yeah, and I didn't mean to imply we were just going to do this by fiat, just that I don't think we need a big debate on this call about whether it's a good idea; it seems pretty non-controversial. Does anyone have any further licensing-related comments? Otherwise, let's spend the last 10 minutes talking about some new project structure that's coming up.

In order to keep the OTSC focused on steering-committee, specification-level matters and not fill this time with a lot of nuts and bolts, while still getting those nuts and bolts put into place, we've created a new set of working groups. One is called the cross-language working group, which is tasked with figuring out the day-to-day project management around the various backlogs: the language API backlogs, OT core, and the backlogs for all of the contributed instrumentation. There's a certain amount of process we want to have in place around handling issues and PRs that currently isn't nailed down: things like templates; making sure that when people open a PR or an issue it's focused on a single subject; who should be assigned as reviewers, not that other people can't comment, but who should be assigned to ensure things get shepherded through to completion; and if something can't reach a completed state and has to be set aside, where we put it. A bunch of that day-to-day project management needs to get sorted out, so this group is focusing on it. As far as API decisions are concerned, it's also focused on making sure all of the language APIs implement the spec. For example, now that we have context propagation as a concept, making sure it actually goes into all the languages that need it, and that the resulting APIs feel coherent as a whole if you're doing cross-language OpenTracing. That's the mandate of that particular group, and I'd welcome people to join it.
There's a new Gitter channel under OpenTracing called cross-language, and we'll be having regular workshop meetings to get this stuff moving. Any questions or comments about that working group? Cool. So do ask questions on Gitter related to that.

There's also a new documentation working group that's been kicked off, and it's tasked with overhauling our documentation. Basically, we're missing a bunch of styles of documentation that would be really helpful, and there are a lot of recurring questions people have when they come to the project that we could clear up; making that a lot cleaner would help. We'd also like to create a sort of OpenTracing cookbook: cross-language examples of all the different tasks and scenarios you end up doing in OpenTracing. I think that would be very helpful for people. And finally, some kind of searchable index of instrumentation, some way of really focusing people on the fact that what OpenTracing is for is instrumentation, for having a bunch of standardized instrumentations that all work with each other and have some amount of guarantees attached to them. It would be great to have something like npm or RubyGems or CPAN, where through the OpenTracing website you can search and find all of this. You can dig around the OT contrib GitHub organization right now and find things, but it's just opaque enough that it's not doing the ecosystem justice. So that's the other project I think that group is going to work on. Any questions about the docs working group?

I would highly encourage OpenTracing members who come from organizations large enough to employ technical writers to consider having your technical writers join this group and donate some of their time, and to also think about how the OpenTracing documentation could get worked back into your own vendor-specific documentation, because a certain amount of your API is of course the OpenTracing API. So that's another reason to join this group. And that's all I have on the new project structure. You'll probably see a lot more movement in these various repos as these groups get off the ground.

We finished early for once. Congratulations on fast talking. Yeah, so, open floor. Anyone have any other questions or topics for discussion?

I had just one thing. As I mentioned, I didn't think it was that important and figured we'd run out of time, but since we didn't, I did want to mention it. I met with Louise, who is doing a lot of the trace context work and works at Dynatrace, with a fancy title there, a couple of days ago in San Francisco. It's unfortunate that the next distributed tracing workshop, where the trace context stuff is being discussed, is concurrent with KubeCon in Europe, on May 2nd and 3rd, I think. So he was talking about trying to get that workshop moved a couple of days, or maybe to have some separate thing just before or just after KubeCon,
to try to get together and talk a little more about how to consolidate a bunch of APIs. Dynatrace is also pretty interested in being involved in OpenTracing, in the capacity of contributing higher-level APIs that build on top of OpenTracing. They have a bunch of APIs that track individual users and things like that, which wouldn't make sense at OpenTracing's layer of abstraction but would make a lot of sense one level up from it. That was very interesting, so we'll probably try to incorporate them, and him, into the work we're doing. I'm looking forward to it as well. I just wanted to mention that.

Great. Speaking of higher-level APIs, that was actually another piece of the report back from the trace context workshop. There were requests from various tracing implementers for higher-level APIs in OpenTracing around common tasks, such as tracing an HTTP request: some kind of higher-level API that forces you to supply certain key pieces of information. Right now we're trying to use testing to ensure some form of uniformity, but there was a request to bake that into the APIs, and maybe other high-level APIs would be useful now that the lower-level ones are settling into place. So that's worth considering. We haven't spent much time focusing on tags, since we've been so focused on the APIs: how should things be tagged, what counts as an official HTTP request span, things like that. I think that's a really useful task for us, maybe at the OTSC level, or we could start another working group to focus on the tag taxonomy.

The only caveat is that this mostly makes sense in the context of Go, where there is a single HTTP request API; in pretty much any other language that sort of thing becomes fairly impossible. You can't really provide helper methods. Or you can, but they're not going to automatically schlep the information out of the request object for you; they'd just be shaped in such a way that you're forced to extract all the information the helper needs to tag things correctly. Yeah, I think that was part of their feedback: why can't you just define a struct that users need to populate, instead of saying, no, here's a document describing tags. Or some kind of special begin-span method that has required arguments and then variadic arguments afterwards, something like that. Well, I think the struct is one way to approach that; it's extensible.

The one thing I was skeptical of was people demanding compilation guarantees; a lot of that was brought up, and I don't know. I would be just as satisfied with very clear guidelines, and then ensuring that for stuff that's in OT contrib there's a review to prove these things, just like, if you contribute something you have to have a test that proves you're assigning these various fields and tags. I'd be really satisfied with that. Tests that validate your conformance with the spec sound fairly reasonable, right? Yeah, you could have a testing tracer whose whole job in life was to help verify these things. Right.
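For what a helper like the one discussed above might look like, here is a hypothetical sketch, not part of the OpenTracing spec, using the Python API's standard tag constants; the required positional arguments play the role of the "struct you must populate", the keyword arguments play the variadic role, and the operation and service names are made up. A test or testing tracer would then just assert that these tags are present on every such span.

import opentracing
from opentracing.ext import tags

def start_http_server_span(tracer, operation_name, http_method, http_url, **extra_tags):
    """Start a server-side span where the standard HTTP tags cannot be forgotten."""
    span = tracer.start_span(operation_name)
    span.set_tag(tags.SPAN_KIND, tags.SPAN_KIND_RPC_SERVER)
    span.set_tag(tags.HTTP_METHOD, http_method)   # required by the helper's signature
    span.set_tag(tags.HTTP_URL, http_url)
    for key, value in extra_tags.items():         # anything optional rides in the variadic part
        span.set_tag(key, value)
    return span

# No-op tracer keeps the example runnable; callers cannot omit the method or URL.
span = start_http_server_span(opentracing.Tracer(), "GET /v3/pins", "GET",
                              "https://example.com/v3/pins", peer_service="ngapi")
span.finish()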
You know, it seems like the kind of thing where it would be easy to make some standardized tests or a test harness that runs against this stuff. It's worth thinking about; or we just have a review process where things can't come in without someone reviewing and asking: do you have a test, and does the test show you've tagged this properly? All right. Well, it was a good meeting. I think we're done. Yeah, and I want to thank Noman again for the talk. All right. Lovely seeing all your lovely faces. See you next time.