for Red Hat, and I also act as a tech lead for SIG Instrumentation. And today with me there is Patrick, who works for Intel and is the Structured Logging Working Group organizer, and he will be talking later in more depth about logging. So today we will do an introduction about the group, what its purpose is and what we are doing upstream, and we'll cover the different activities that we are responsible for, which are, for example, some sub-projects related to observability in Kubernetes, as well as the different observability signals that are really common nowadays, such as logging, tracing, and monitoring. And then I will show you how you can also contribute and help us build observability in Kubernetes in a better way, and where you can find us if you have any question or anything really. So what do we do? First, if you are not really familiar with how development works in Kubernetes, we are split into groups that are called Special Interest Groups, there are many of them, and all of them have a particular focus. In the case of SIG Instrumentation, the main focus is observability. All these groups have a charter that they define at the creation of the group, which basically gives the baseline of what the purpose of the group is and what it is going to cover. In our case, that's an extract of the official charter that we have for the group: we are meant to cover the best practices for cluster observability across all Kubernetes components, so in the Kubernetes project itself, as well as develop new components and new projects around observability in Kubernetes as a whole. So we are not only working on the Kubernetes repository; we're also working on many sub-projects that cover observability in Kubernetes, such as kube-state-metrics, which produces metrics based on the Kubernetes state, so based on the API objects, as well as, for example, klog, which is the logging library that we use a lot in Kubernetes, and of course in many projects that use Kubernetes as a library, for example through the client. There is also metrics-server, a project used for auto-scaling based on resource metrics, which is part of our responsibilities. And there are many, many more. So we are kind of split between working on the Kubernetes project itself and on many other sub-projects. And we are also working a lot on all the different observability signals that you might have already seen in Kubernetes: metrics, logs, events, and traces. So how do we do that? Well, really like any other group, we triage and fix the relevant issues that are assigned to us. The way it works is that we have a label that identifies issues or PRs related to a particular group, and then we just go through them and try to see what we can do. Sometimes it's the person that opened the ticket who assigned it to us, and sometimes it's someone doing triage who put us on the list of people that should review the thing. We also generally try to review all new additions to metrics, and all changes related to metrics, because we really want to have high-quality metrics in Kubernetes. But the review is more or less strict depending on the stability of the metric: if it's a new alpha metric, we might not review it, but if it's a metric that will become stable, we'll definitely review it.
We are also involved a lot in feature development and enhancements related to observability. So if anyone in the project has any issues with observability, or if there are any gaps, then we will try to cover them. And as I mentioned before, we are maintaining a lot of sub-projects. I will speak about three of them because they are the ones that I'm the most familiar with and that I contributed to: kube-state-metrics, metrics-server, and prometheus-adapter. For kube-state-metrics, for those who don't know it and are not too familiar with the Prometheus terminology, it's meant to be an exporter. The goal is to create and expose metrics based on a third-party application that isn't natively instrumented for monitoring, such as the Kubernetes API. We have a lot of objects in Kubernetes and sometimes you want to have metrics about them. Let's say you want metrics about your Deployments or your StatefulSets: kube-state-metrics is the tool that generates Prometheus-style metrics based on those. For example, on the slide we have two deployment-related metrics. The first one gives you the number of replicas that the Deployment should have according to its spec, and the second one is the status of the recent updates of the Deployment, so it basically tells you whether an update, a rollout for example, has been completed or not. That's really useful when you want to investigate issues related to a Deployment that hasn't completed, or things like that. It does so by watching for all kinds of events, so creations and updates of objects in Kubernetes, and then it exposes the metrics, and your platform will ingest these metrics. Since it's a text-based format, any platform can really ingest the metrics and use them. metrics-server is a slightly different project. It's an implementation of a metrics API, the resource metrics API. We actually maintain three such APIs, which are meant as a way to communicate between the auto-scaling pipeline and the metrics; it's a way to have auto-scaling based on metrics, basically. And over the years, we've seen many use cases for this that it wasn't really designed for. For example, kubectl top, which is used a lot nowadays to track the usage of your pods and containers a bit like the Linux top command, was created based on this API, as well as resource-based auto-scaling. So if you have an HPA and you want to auto-scale your pods based on, say, reaching 50% CPU usage, then you will use metrics-server. prometheus-adapter is a fairly similar project. It's also meant for auto-scaling, but it supports a larger set of metrics: instead of only having the resource metrics for auto-scaling, you will be able to auto-scale on any kind of metric. For example, if you want to auto-scale your microservice based on the number of queries per second that it is receiving, then you can use a project such as prometheus-adapter to do so. What it does, basically, is adapt the query that comes from the API server. Let's say you have an auto-scaling request that goes through this API: prometheus-adapter will convert it into a PromQL query, query Prometheus with that, and then return the results. So you can imagine that you can then, via this adapter, convert really anything into a Prometheus query and auto-scale based on it.
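To make the resource-based case concrete, here is a minimal sketch of an HPA that scales on 50% CPU utilization, which is exactly the kind of request that metrics-server answers through the resource metrics API; the name my-app and the replica bounds are made up for the example:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                 # hypothetical workload name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # scale out when average CPU usage exceeds 50%
```

The same kind of spec with a Pods or External metric type is where prometheus-adapter comes in, since it serves those custom and external metrics APIs by translating the metric names into PromQL behind the scenes.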
And then I will let Patrick talk about the logs.

Okay. So as you heard, SIG Instrumentation owns two different things: the metrics, and, as another pillar of observability, the logging infrastructure and, to some extent, the log output. Mostly we just maintain the infrastructure and help other SIGs write good log calls into their code and guide them towards producing better log output. One of the initiatives that was started a while ago was to rethink what a log message from a Kubernetes component should look like. Originally it was just what we inherited from klog: a plain string with no format whatsoever. If you need to look into that string, you basically need to do regular expression matching or grepping, and it's fairly informal. What we are trying to achieve now is so-called structured logging, where the output contains a clearly separated log message, a string that is ideally constant, plus key-value pairs in an easy-to-parse format. The text output still exists, it's useful for debugging for developers, but we also now support outputting the same data as JSON. The idea here is that any kind of log ingestion pipeline will be easier to implement if we use the JSON format. So that's structured logging. It's implemented by rewriting source code, replacing an unstructured klog log call with another variant that takes key-value pairs, and the rest is taken care of inside klog. But it means touching a lot of code, and that's what we are organizing. With the help of quite a lot of contributors and helpful people, that work was already completed for the kubelet in 1.21, and we also completed it for kube-scheduler in the current Kubernetes 1.24. There was one final blocker related to multi-line output of strings: it became unreadable when it was a quoted string, and we changed that in the klog text output for 1.24, so that it remains useful also in the text output. With that change we converted the remaining log calls in kube-scheduler, and we consider it done. That is just for the kube-scheduler code itself. It will still call into functions from client-go, for example, that don't use structured logging, but those appear as a single string as they did before. So you can turn on JSON and you get structured JSON output from these two components. You can also turn it on in all of the other components, it just won't be very structured: it will still have only that string. And that's where we need help going further to continue this effort, but I'll get to that when I present the working group that I'm a part of. We also own, or are kind of the de facto maintainers of, the klog implementation itself. One thing that we observed in Kubernetes is that, for historic reasons, we have lots of command line options in klog that we don't think fit into a container, like log file handling, which really should be handled by whoever ingests the data. So we wanted to deprecate, and we have deprecated, all flags related to log file handling, for example. There's a longer list of things that only make sense in very specific scenarios. These are now marked deprecated in Kubernetes but still available: you get a warning if you still use them, and we will remove them in 1.26. So the clock is ticking: if you are calling Kubernetes components with some of these flags, you had better start removing those options.
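As a rough illustration of what that rewrite looks like in code (the message, pod name, and key names here are invented for the example), the migration replaces format-string calls with the structured variants that klog/v2 already provides:

```go
package main

import "k8s.io/klog/v2"

func main() {
	klog.InitFlags(nil)
	defer klog.Flush()

	// Unstructured: everything is baked into a single string, so a log
	// pipeline has to parse it back out with regular expressions.
	klog.Infof("Pod %s/%s failed after %d retries", "default", "my-pod", 3)

	// Structured: a constant message plus key/value pairs. The same call
	// can be rendered as text for developers or as JSON for ingestion.
	klog.InfoS("Pod failed", "pod", klog.KRef("default", "my-pod"), "retries", 3)
}
```

With the JSON backend enabled, that second call comes out as a single JSON object with msg, pod, and retries fields, which is what makes the ingestion side so much simpler.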
The main goal here is, or was, to get rid of code that we don't really need in Kubernetes. But we rethought our approach here, and we will continue to support these options in klog itself; it's just that the Kubernetes components won't use them. So if you are a user of klog, rest assured, we're not breaking your application that is based on klog. It will just make Kubernetes simpler. The other purpose is that if we have a JSON backend that writes log messages, it's a really different code path that is being taken, and things like increased verbosity per module, which is one of these klog options, won't work if you use JSON. But that's now fully documented: you can see which options are supported and which are not, and there's a KEP linked here with more details. Now, the new feature that I've been working on in 1.24 is basically a continuation of that structured logging work. With structured logging, you still call a global klog function to output a log message, and it uses the globally configured logger; it is the same for all goroutines in your program. Contextual logging goes one step further. It uses an abstract logging interface, go-logr, and there is some utility code in klog to extract that logger from the context that gets passed into your call chain. Then you can do calls through that local logger, which is specific to the current call chain. The caller basically determines what that logger does. It can do things like adding a certain key-value pair, and that will be printed by every single log message inside that call chain. That's useful if you are doing request processing and you have concurrency, so multiple things going on at the same time printing logs: now you will see what each log message is about, and that was not possible before. We're going to use that in kube-scheduler, for example, to include the pod that is being scheduled in all calls, and that also works in code that doesn't even know that it's working on a pod, because it's part of the context. My own favorite use case is logging in unit tests, because now the output can be redirected into the testing.T log, and go test will just show me the log messages for my failed test case, not all the rest of the log messages that are of no interest to me. As I said, we are not going to break klog users: all of the code, if you are importing a Kubernetes library, will work with the global klog logger as the default. But we make it so that if code, a library, is instrumented to support this, it can also work without depending on klog output. You can have your own logger, you can completely replace the logging backend in your binary, and rest assured that klog-based code, Kubernetes code, will pick it up and use it. We started that work because it, again, needs to touch all of these source code lines that do log calls in Kubernetes, and going forward we'll combine both: we will go straight to the contextual logging function calls, which also have very similar parameters. So the previous work on structured logging was the first step; now we need to take another step and replace the global logger. That's what we are now going to do also in other components. So, Damian introduced the SIG. This whole work on structured logging is organized as a working group. That's another concept in Kubernetes: it's basically a smaller subset of people who meet independently of a larger SIG.
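Here is a small sketch of the pattern, with made-up function names; the only real APIs assumed are klog.FromContext, klog.NewContext, and the logr-style WithValues and Info calls:

```go
package worker

import (
	"context"

	"k8s.io/klog/v2"
)

// process is the caller: it derives a request-scoped logger, attaches the
// pod as a key/value pair, and stores that logger in the context.
func process(ctx context.Context, namespace, name string) {
	logger := klog.FromContext(ctx).WithValues("pod", klog.KRef(namespace, name))
	ctx = klog.NewContext(ctx, logger)
	handlePod(ctx)
}

// handlePod doesn't know (or care) which pod it is working on, yet every
// message it logs automatically carries the "pod" key added by its caller.
func handlePod(ctx context.Context) {
	logger := klog.FromContext(ctx)
	logger.Info("starting work")
	// ... do the actual work ...
	logger.Info("finished work")
}
```

For the unit-test case, a testing-oriented logger (klog ships one in its ktesting package, if I remember correctly) can be put into the context the same way, so the output lands in t.Log for the failing test only.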
It has separate organizers; I'm one of them now, and the other one is not here today. We don't own the code, so any kind of triaging and code ownership still rests with SIG Instrumentation. But it's a good way to get started and get your feet wet with doing some Kubernetes work, because it's a smaller group and it's less distracting. I don't know much about metrics, for example; I've used them, but I don't usually attend the SIG meetings, so I learned something today about metrics. This is basically a different subset of the work that we are doing, and you're welcome to join our Zoom meetings and show up on our Slack, because we do need help. This is, to some extent, busy work, but it's also interesting because you get to see a lot of code and get a lot of experience and exposure in the community. It's a good way to get started in Kubernetes, in my opinion. And for contextual logging, you actually need to understand the code a little bit; it's not just cut and paste. You really need to investigate what's the best way to pass a logger into a certain code base. I think it's a good way to get started, and we are looking forward to contributors. Thanks.

And now I will present the work we've been doing on tracing recently. The experience around tracing has been an issue for a while, and we've been trying to improve that over the years by designing this feature and starting to implement it. The original goal was to be able to trace any kind of request that is made to the API server and to know where it went: how long it takes, for example, to return from etcd, how long etcd took to do the work there, and things like that. So we wanted to have tracing available for the whole control plane. We added tracing to the API server in 1.22 as alpha. We are currently still working on promoting it to beta; it should happen in the upcoming releases, and it will then be available by default in Kubernetes. It was also added to etcd, because we wanted a full experience of distributed tracing in Kubernetes. Support was added to etcd in v3.5.0, but it's still experimental, the same as in Kubernetes, so you will need to enable it yourself if you want to start using this feature. We've also had use cases where people wanted tracing elsewhere, because it was working well in the API server, so why not add it somewhere else. So some effort started on the pod lifecycle, and there is an ongoing effort to add tracing to the kubelet. It was supposed to land in 1.24 and start being available now, but we were a bit late, so maybe in the upcoming releases you will be able to see that. Also, when we started this discussion, some people from the container runtimes were interested in this feature, and it's now available in CRI-O and containerd, and you can enable it in your Kubernetes cluster. And we've had some users of tracing that started seeing gaps in the current solution and had some ideas to improve it. One of them was to add context to every controller so that we can follow the actual requests that are made from one controller to another and propagate that into the tracing, so that we really have an overview of what happens. This is still in discussion as well. So I will give you a quick example of how it works today to have tracing in the control plane.
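As a sketch of the setup described next, the kube-apiserver is started with the feature gate and a flag pointing at a tracing configuration file; the exact field names below are from the alpha TracingConfiguration API as I remember it, so treat them as an approximation rather than a reference:

```yaml
# kube-apiserver flags (alpha):
#   --feature-gates=APIServerTracing=true
#   --tracing-config-file=/etc/kubernetes/tracing-config.yaml
#
# /etc/kubernetes/tracing-config.yaml
apiVersion: apiserver.config.k8s.io/v1alpha1
kind: TracingConfiguration
# OpenTelemetry collector running as a sidecar next to the API server
endpoint: localhost:4317
# sample roughly 1% of requests
samplingRatePerMillion: 10000
```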
So since the feature is still alpha and still experimental, you first need to enable the APIServerTracing feature gate. Then, to configure it, it's pretty easy: like a lot of configuration in the API server, you add a CLI flag that points to a config file specifying your tracing configuration. In this case, the example just shows a very basic tracing configuration with a one percent sampling rate. For etcd it's quite the same: you need to set a CLI flag in order to enable the feature, since it's not available by default. Then the only thing that you really need to set up in order to start tracing your cluster is to have OpenTelemetry collectors running as a sidecar of each of the components, for example the API server container and the etcd container. And that's about it: you can then send the traces to whichever backend you want and start using this feature. Here is an example, done with Jaeger, of the result that can be achieved, but you could really use anything. We can see different spans. The first ones are from the API server, so you can see where the API server spent time when the request was received. Same for etcd. And some requests may be hitting a webhook, for example, so you will also have that, and you will be able to see really where the time was spent. If you ever had to investigate issues with slowness of a microservice, or any kind of latency issue, you most likely had a bad experience, because it's really hard to figure out where the problem is coming from. It could be the network, it could be your application, and if it's your application, where exactly in the application did it happen? With tracing, this is something that we can see more easily, I would say, because you can really see what was the factor that made your request slow. So that's the key idea behind this feature.

And now I'll talk about metrics. As you might have seen as Kubernetes users, we have a lot of metrics in Kubernetes, and that's because most of the components in Kubernetes have a default Prometheus integration: they all use the Prometheus client to expose metrics in a text-based format on their metrics endpoint. That can be scraped by really anyone: you can use any kind of time series database that is compatible with the Prometheus format and that would work, but in most cases it's still Prometheus that is used, because it's the most common nowadays. But over the years we've encountered many issues with metrics, and you might have faced some of them. What we saw is that there was a recurring problem related to memory: there were spikes in memory usage sometimes, there were memory leaks that we detected, and the issue was that we were seeing metrics that were growing, growing, and growing, never stopping. The problem was related to metric cardinality, which is a concept where, for example, you take this metric, which could be a metric from any microservice: the metric has some labels, and you can consider the labels as dimensions, so the number of label names is the width of your metric. Let's say in this case we have a width of three: we have the verb, code, and path labels. Then for each label there will be label values, and that's the height of each label. So for example, here the verb has a height of four, same for the code, and the path has a height of one.
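In the Prometheus text format that these components expose, such a metric would look roughly like this (the metric name is made up for the illustration):

```
# A hypothetical metric with a width of three labels: verb, code, path.
myapp_http_requests_total{verb="PUT",code="200",path="/pods"} 17
myapp_http_requests_total{verb="PUT",code="201",path="/pods"} 4
myapp_http_requests_total{verb="POST",code="200",path="/pods"} 9
# Every distinct combination of label values is its own time series.
```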
And then, if you really want to estimate how many time series your metric will generate, well, it's pretty simple. A time series is basically a unique set of label values. So for example, PUT 200 on the pods path will be one time series, PUT 201 on the pods path would be another time series, POST 200 on the pods path yet another one. You can see where this is going: each value will multiply with one another. So here the theoretical cardinality would be four times four times one, so sixteen time series. That can be problematic if at some point you have a label that has an unbounded number of values. A URL seems simple, but if someone hits your API and, no matter which path is queried, you produce a metric, then that means they have control over the actual values that the path label can take. And the problem is that there is a security concern with that, because they can basically create metrics that will be stored in your monitoring platform by themselves. If they query a million paths that don't make any sense, then you will store all of them in your monitoring platform, and in most cases, if you don't have any kind of safety features enabled, your monitoring platform will go down and you will be in the dark in your cluster. So it's something to be considered. And since it was hurting a lot of people in the Kubernetes community, we've tried to put something in place to cover that. We introduced a framework that is meant as a way to declare that a metric is an immutable API, in the sense that if someone ever were to add a new label that could take a thousand values, they won't be able to do so, because the metric would have been declared stable and it wouldn't be possible. That's what we call the metrics stability framework. The idea is that we added some checks to verify that whenever a metric is stable, it's impossible for a developer to add new labels to it; we have checks in CI that prevent that. And we see it as an immutable API, like any other API that reaches stable at some point. But we noticed that it wasn't enough. We were still seeing these issues, because you cannot foresee everything, and even some alpha metrics were causing cardinality explosions. The problem is that when you need to fix this kind of issue in the project itself, it sometimes takes a while just to get the fix in. Sometimes the problem was introduced two releases ago, so you need to do a backport, which takes a lot of time, and as a user of Kubernetes you might only see the fix one or two months after you encountered the issue. So we added a metrics cardinality enforcement tool that allows you, on each component of Kubernetes, to specify via a CLI flag the metrics that you want to turn off in case they are exploding. Because as developers we might make mistakes, and mistakes do happen, but we want to give you a way at runtime to fix this kind of problem if it ever occurs. So this is what we built to solve this issue. And we have some other ideas around that, because the stability framework is great, but we want to use it to improve the quality of the metrics that we have.
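For reference, this is roughly what declaring a metric through the stability framework looks like in the component-base metrics wrapper; the metric name and labels are invented for the example, and the point is the StabilityLevel field that the CI verification keys off:

```go
package apimetrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// requestCount is declared as STABLE: the stability framework's static
// analysis in CI parses declarations like this one, and a change that adds
// or renames a label on a stable metric fails the verification job.
var requestCount = metrics.NewCounterVec(
	&metrics.CounterOpts{
		Name:           "myapp_requests_total", // hypothetical metric
		Help:           "Number of requests, partitioned by verb and code.",
		StabilityLevel: metrics.STABLE,
	},
	[]string{"verb", "code"},
)

func init() {
	legacyregistry.MustRegister(requestCount)
}
```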
In the Kubernetes development process we have different stages, which are alpha, beta, and stable, but the metrics stability framework only has alpha and stable. So we want to add beta in order to bring a bit more expressiveness to this framework, because today either the metric is alpha and we can change anything about it, or it is stable and we can't change anything anymore; there is nothing in between. We also want to address a big problem in Kubernetes that you might have faced: there are many metrics, and there is no actual documentation to really know which metric does what, or even which metrics are available to you. So we want to build automated documentation that gives you the power to really use the metrics. And yeah, get involved. We are a very welcoming community, we are always looking for new contributors, and the best way to start is to attend our SIG meetings, because that's where we usually meet and onboard people. There are many ways to contribute; you don't have to just contribute code. You can do reviews, you can file issues, you can contribute to the documentation, and so on. And since at the beginning of the talk I said that we have many sub-projects as part of SIG Instrumentation: many of them are looking for new contributors, new reviewers, and so on, so if you are interested in one of them, feel free to reach out to the person in charge and they will be very happy to onboard you. For the meetings that we are running, the SIG meetings, we have a regular one that is bi-weekly, where we discuss general topics for the group. It occurs at 9:30 a.m. Pacific time on Thursdays, and it alternates: one time it's the regular meeting, the other time it's triage, and triage is more about going through the issues and PRs and making sure that someone is assigned to them and looking at them actively. Other than that, you can reach out to us on the SIG Instrumentation Slack channel; we should be fairly active there. Or if you have any questions that you want to ask the leads directly, feel free to reach out to any of them. They will be very happy to answer all your questions and start onboarding you. And yeah, that's about it for the talk. Thank you everyone.

So we don't have any questions online. If you want to ask a question, raise your hand and I will bring the microphone.

Hi, first, thanks. You mentioned tracing, setting it up and running a collector as a sidecar, which doesn't seem very easy when using managed Kubernetes such as EKS. Do you have a solution for that, or are you working with providers to provide those kinds of solutions?

Off the top of my head, I don't think we've considered supporting it by running the sidecar ourselves in Kubernetes, because we don't want to take a position, like choosing for example the OpenTelemetry collector over any other future collector that might be created at some point. But I think some vendors are actually trying to onboard tracing in general. We will see; it's still very early to take that kind of decision, and even for vendors to onboard this feature. So the more stable it becomes, the more likely it will be that, for example, GKE, Azure, or whichever cloud will start supporting this feature.

And regarding tracing, I was wondering: is the idea that the user should enable it whenever they see an issue, or is the idea to have it always enabled, and if so, what is the overhead?
I think the overhead is still quite big. It depends on the sampling rate. But it's mostly a way for us, developers of components in Kubernetes, to see if we have any latency regressions and then resolve them, or for you, users, to really empower you to know where your problem is. Tracing, to me, is a means for investigating your SLOs: if you have a problem with your SLOs, your availability is degrading, then it is a means for you to notice, hey, the problem is coming from etcd, or the problem is coming from the API server, and then you can know exactly where it is coming from. But the overhead is still quite big. So I think it's more like, if you encounter this kind of issue at some point, it's a good signal that maybe you need to enable tracing. It also depends on the budget that you have and how much you can afford, because the more signals you have, the easier it will be for you to investigate any kind of problem. But it depends on the budget; sometimes it's something that you need to enable, so you need to choose whether it seems worth it or whether you don't have the budget for it. Thanks.

Hey, thanks a lot for the talk. I have a question related to logs, but perhaps more to the logs of applications that run in the cluster rather than the logs of the control plane components themselves. This information may be out of date, but last time I checked, it was not super easy to collect logs from applications running in the cluster, right? You either need to go to the CRI, which requires quite some privileges that may not be available everywhere, or modify the application code to send the logs somewhere. So my question would be: are there any plans or ideas on how to make it a bit easier to collect logs from applications running in the cluster?

So, conceptually at least, containerized applications should be writing to standard out or standard error, and then it's the job of the container runtime and the container orchestrator to do something with that data. Marek, my co-organizer, is interested in that topic. He wants to investigate how to make it more performant, but there are no specific plans at this time. So if you are interested in that low-level part and how to feed that data into local collection agents, Marek would be the person to ask, because he's probably going to do something or look into that fairly soon.

Okay, we have one more minute for a last question. Anyone? I think we are done. Thanks. Thank you.