So let's begin. Hello and welcome, everyone, to this maintainer track session on revolutionizing Kubernetes logging. My name is Naman Lakwani, and I am a final-year student from India. Previously, I was an intern at VMware. I started my open source journey two years ago, in 2021, with the Google Summer of Code program in the CNCF Thanos project. In the same year, I got the chance to contribute as a Linux Foundation mentee in the Kubernetes project, which was my first introduction to the Kubernetes community and the software. And today, here I am, giving my first conference talk. So thank you, everyone, for coming and joining me here. It is going to be a great session.

On the agenda today we have logging, structured logging, and contextual logging. We will start with a basic introduction to Kubernetes logging, and then we will dive deep into structured and contextual logging one by one, looking at various code examples, performance metrics, and the design decisions that were discussed during this work.

So let's begin with Kubernetes logging. We all know it's a crucial aspect of containerized applications: we can monitor the health and performance of our applications with the help of Kubernetes logs, and it's a crucial part of the whole Kubernetes ecosystem. Logs can be generated from various sources: from application containers, from Kubernetes system components like etcd and the API server, and also from the nodes.

We all know Kubernetes logs are a bit messy. If we want to troubleshoot an issue, we have to go through all the logs in the terminal to get an idea of what is happening. And what if the logs are coming from various sources? Log aggregation is a technique used to centralize all the logs in a common place, so that queries can be run in a single place, and we use solutions like Elasticsearch for that. Logs can also be used for monitoring.
We can set alerts on certain log events, so that when those events happen, developers and engineers get alerts by email or Slack that something has broken. For that, tools like Prometheus and Grafana are heavily used in the ecosystem. And for all these solutions, like Prometheus or Elasticsearch, to work, we need some standard log formatting. Previously, everything was plain text, and it was very difficult to build a solution that could work in all these scenarios. So standard log formatting is necessary for these solutions to work, and for that, structured logging was proposed in the community.

The main motivation behind structured logging was that parsing, processing, and querying the logs was hard. It forced developers to rely on ad hoc solutions, like regular expressions, and they couldn't build a proper solution that worked in every scenario. So structured logging was proposed. The main ideas behind it were to define a standard structure for Kubernetes logs across all the Kubernetes components, to add methods in klog that enforce this new standard log formatting, and to configure Kubernetes components to produce logs in JSON format. We will see why JSON and nothing else in the coming slides.

The goals are very much in line with the proposal: to make the most common logs more queryable, to introduce new klog methods, and to simplify ingestion of logs into third-party solutions like Elasticsearch or Prometheus. With this, we are not replacing klog or the way it is used, and we are not structuring all the logs in Kubernetes, only those in the main components, which see the heaviest use.

So this is the log message structure that was finalized after reaching consensus in the community: the message, followed by key-value pairs, like key1=value1 and key2=value2.
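As a rough illustration of this message-plus-key-value-pairs structure, here is a minimal stand-in formatter. This is a hypothetical helper written just for this writeup, not the actual klog implementation; it only shows the shape of the output.

```go
package main

import (
	"fmt"
	"strings"
)

// formatStructured mimics the shape of a structured log line:
// a quoted message followed by key="value" pairs.
// Illustrative stand-in only, not the real klog code.
func formatStructured(msg string, kv ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%q", msg)
	for i := 0; i+1 < len(kv); i += 2 {
		fmt.Fprintf(&b, " %s=%q", kv[i], kv[i+1])
	}
	return b.String()
}

func main() {
	fmt.Println(formatStructured("Pod status updated",
		"pod", "kube-dns", "status", "ready"))
	// "Pod status updated" pod="kube-dns" status="ready"
}
```

Every component emitting lines of this shape is what makes the logs mechanically parsable, which the plain-text format never guaranteed.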
So this is the standard log formatting we are talking about. Two methods, InfoS and ErrorS, where the S stands for structured, were introduced in the klog library. You can see the declaration of the InfoS method: the first parameter is the message string, followed by the key-value pairs. In the example, we are calling the InfoS method somewhere in the Kubernetes code. "Pod status updated" is the message; pod is one key with kube-dns as its value, then status with ready as its value, so corresponding key-value pairs. And you can see the result we get in the terminal when we inspect the pod: "Pod status updated" is the message, then pod=kube-dns for key one, value one, and status=ready for key two, value two.

Similarly for ErrorS, in the declaration the first parameter is the actual error, then the message string, and then the key-value pairs. In the example, we are passing the error as the first argument, and then the message, "Failed to update pod status". In this particular example we don't have any key-value pairs, but we could have some. As the result, you can see we get the message string and err="timeout", where the error value can differ in different scenarios.

The idea is to use a Kubernetes-API-first approach to get at Kubernetes objects. So two more methods were added in klog: the first one is KObj, the second one is KRef. The KObj method takes object metadata as input, and KRef takes a namespace and a name, as strings, as input. Both methods return an ObjectRef struct, which has Name and Namespace as its fields. Let's see the example. In the first line, we are creating a pod object with the pod name kube-dns and the namespace kube-system. And instead of passing the pod name when calling the InfoS method, we are now passing the Kubernetes object.
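To make the KObj/KRef idea concrete, here is a minimal sketch that reimplements the ObjectRef behavior for illustration. The real ObjectRef type and the KObj/KRef functions live in k8s.io/klog/v2; kRef below is a hypothetical stand-in with the same shape.

```go
package main

import "fmt"

// ObjectRef mirrors the struct that klog.KObj and klog.KRef return:
// just enough metadata to identify an object in a log line.
type ObjectRef struct {
	Name      string
	Namespace string
}

// String renders the reference as "namespace/name", or just the
// name when the object is not namespaced.
func (r ObjectRef) String() string {
	if r.Namespace != "" {
		return r.Namespace + "/" + r.Name
	}
	return r.Name
}

// kRef is a stand-in for klog.KRef: build a reference from strings
// when we do not have the object itself in hand.
func kRef(namespace, name string) ObjectRef {
	return ObjectRef{Name: name, Namespace: namespace}
}

func main() {
	fmt.Println(kRef("kube-system", "kube-dns")) // kube-system/kube-dns
}
```

Because ObjectRef carries structured fields rather than a preformatted string, a JSON backend can emit it as a nested object while the text backend prints namespace/name.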
So, klog.KObj, and in the brackets we have the pod; you can see it highlighted in red. In the second example, instead of passing the object, we are passing it by reference: the first argument is kube-system, which is the namespace, and kube-dns is the name. The general formatting is namespace/name, so in the output you can see pod="kube-system/kube-dns". If we already have the pod object, we can pass it with KObj; otherwise, we can pass it by reference with KRef.

You might be wondering: why JSON and nothing else? There are many pros to using JSON. It is widely adopted by logging libraries, with very efficient implementations. It is easily parsable, transformable, and human-readable as well. We also have tools like jq that we can use with JSON. So JSON was finalized as the standard output format. You can see the example here: ts is the timestamp, then the verbosity is 4, and you can see the pod object, with name kube-dns and namespace kube-system. It is very easy to parse JSON and use it.

Now let's see the performance metrics: we migrated from plain-text logs to structured logs, so did performance improve or not? You can see that for text output, the new InfoS implementation is 9% slower than Infof. But for JSON output, the InfoS implementation is 77% faster than Infof: JSON Infof takes 1406 nanoseconds per operation, while JSON InfoS takes only 319 nanoseconds per operation, which is 77% faster. So a huge improvement has been made.

Now let's discuss contextual logging, which is based on the logr Logger API. The logr API is designed around structured logging and supports attaching additional information to a logger. There are two design decisions made for contextual logging. One is that we can attach the logger as a value to the context.
The other is that we can retrieve the logger from the context. So these are the two design decisions that were made.

Here are some use cases of contextual logging. We can add a prefix to the logger with the WithName method, and we can add key-value pairs with the WithValues method. Also, when we are running unit tests and a test fails, we will not see the error logs from all the tests; we will only see the logs for the currently failing test. That's another use of contextual logging. We can also change the verbosity of log messages.

Here is a practical, real-world example of contextual logging. There is a developer, John, who wants to know which pod, which operation, and which scheduler plugin the log messages are associated with. Without contextual logging, it is very difficult to find out which plugin the log messages belong to. But with contextual logging, we can use WithValues, like logger.WithValues, to attach the pod object to the logger. And in the final output we will get a prefix in the logs, like NominatedPods/Filter/VolumeBinding. With this, we will be able to tell that something is happening with the volume and the storage, and we can investigate further with that help.

The goals were to remove direct log calls that go through the k8s.io/klog library, and to grant the caller of a function control over the logging inside that function. But we are not removing the klog text output format, and we are not deprecating klog: the text output format is still present, and we are still using klog.

There are various risks with contextual logging. One is an uninitialized logger: we can't use an uninitialized logger; we have to initialize it properly. Then there is performance overhead: we are passing a logger to every function, so we also have to consider the performance impact.
It should not degrade much. We can pass the logger in two ways: as an explicit parameter, or attached to the context.

Here is the code example. Initially, Snapshot was not taking any input, but we have changed it to take the logger as an explicit parameter, and in the last line you can see we are passing the logger to this Snapshot method. But where is this logger coming from? We are retrieving it from the context, assuming ctx is present in that particular function. Then we are attaching the pod object to this logger, like klog.LoggerWithValues with the old logger and the pod object. This line updates the old logger to the new one. We can now use this logger, but we also have to update the context, so we do that with the klog.NewContext method, passing the old context and the logger. This new context will contain the new logger, and we use it from there.

The output will look like this. You can see the last line: it is a structured log, but we can't say where it is originating from. But if you look at binder.go, the second line in red, you can see PreFilter/VolumeBinding is the prefix attached to this log. With this, we can say there is an inline volume that cannot be created because storage is exhausted.

Contextual logging works like this: say there are three functions, a, b, and c. Function a calls function b, function b calls function c, and there is an error in function c. Without contextual logging, it is very hard to say why it is failing and which function is calling function c. But with contextual logging, the logger is passed from a to b and from b to c, and in the output we will see the prefix. Then we can determine that function c is failing because it is being called by function a.
So, the current status: structured logging is in GA, general availability. Contextual logging is ready to be promoted to beta; there is a recent PR that has been opened, and currently it is in the alpha stage. If this is something that excites you and you want to get involved in the community, you can join the structured logging working group in the Kubernetes Slack. We also have bi-weekly meetings every Thursday at 3:30 PM UK time. And if you want to see the current and past work on GitHub, you can search for the label wg/structured-logging, and you can help us by solving issues and reviewing the PRs. KEP-1602 is about structured logging, and KEP-3077 is about contextual logging, so if you want to read further, you can follow these links and check them out. When migrating from plain text to structured or contextual logging, we have to keep certain instructions in mind, because as we saw, there are performance overheads and various risks, and this document collects all the instructions for that.

So thank you. And if you have any questions, we have Mengjiao and Shivanshu from SIG Instrumentation here. So feel free to ask any questions.