So hi, everyone. Sorry for the small delay, we had some technical problems. Before we start, I just want to say thank you to the Fluent folks in the community who are hosting us here. I'm Eran Reichten, and this is Prangel Gupta. We are from IBM Research, and we work closely with the Red Hat team, so IBM Research and the Red Hat team are working together. The area we focus on is observability and the observability stack of OpenShift, and Prangel will talk about it and elaborate a lot more.

What we are trying to promote is treating logs as first-class citizens, like the rest of the resources in the cloud, in Kubernetes, and in OpenShift. Just like CPU and memory, which are managed and controlled resources in such distributed environments, we are trying to get to a situation where logs are treated the same way, so that we can control them: control the amount of logs being generated and the amount being collected with Fluentd and the logging stack, so that this is managed and logs are not treated as a free resource in the system. I think Eduardo touched on that at the beginning when he said that there are too many logs in the world and everyone is sending logs out. This is exactly the situation we are trying to handle here, making sure that everything is under control. I will hand the stage over to Prangel to give the rest of the talk and elaborate. Thank you.

Thanks, Eran, for the introduction. Hi, everybody. First, I would like to introduce what OpenShift is. OpenShift is Red Hat's flagship platform-as-a-service product, built on top of Kubernetes, and it allows you to deploy and manage your containers more easily than in a plain Kubernetes environment. OpenShift Logging is the subsystem for logging and for configuring logging in your OpenShift cluster. It provides high-level semantics, in the form of an API, so that you can configure your logging architecture. This is an example of how you can control your cluster logging through a custom resource definition. This simple and intuitive API generates the complex Fluentd configuration for you, and on top of that it adds normalization, metrics, and buffering to your cluster logging.

This slide is a very high-level view of how our logging pipeline looks. Logs from containers, in the form of stdout and stderr streams, are written to log files on disk by the container runtime interface. The logs in these files, which are stored under /var/log/containers, are read by Fluentd, normalized, and then sent to persistent storage, such as Elasticsearch, Loki, or a Fluent Forward endpoint.

However, this seemingly simple architecture has many bottlenecks, and we will only talk about the ones we can control from the log collection side. Think of situations where you don't have control over the amount of logs being collected, where a lot of logs are being generated by each application and you want to troubleshoot, you want to debug, and you don't know what is happening. In those situations you need a very good picture of what is happening in the cluster. But when there is a CPU and memory resource crunch, the output buffer can overflow because logs are not being flushed regularly to your endpoint. This causes back pressure on the connected components, and you start to miss out on logs.
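To make that buffering point concrete, here is a minimal sketch of the kind of Fluentd output buffer settings involved, assuming the fluent-plugin-elasticsearch output is installed. The endpoint, paths, and values are illustrative, not the exact OpenShift Logging defaults:

    # Illustrative Fluentd output with a file buffer. If chunks pile up faster
    # than they can be flushed, overflow_action decides what happens next:
    # "block" pauses the reading side (back pressure), "drop_oldest_chunk" loses data.
    <match kubernetes.**>
      @type elasticsearch
      host elasticsearch.example.com   # hypothetical endpoint
      port 9200
      <buffer>
        @type file
        path /var/lib/fluentd/buffer/es
        chunk_limit_size 8MB           # size of each chunk before it is enqueued
        total_limit_size 512MB         # total buffer size; smaller values overflow sooner
        flush_interval 5s              # how often queued chunks are flushed
        overflow_action block          # push back on the input instead of raising an error
      </buffer>
    </match>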
So we studied these two bottlenecks and came up with a feature in the in_tail plugin, which is one of the most widely used plugins in Fluentd, to control how much log data is being sent, and to know how much log data is being lost and from which sources.

Now let me formally define the two problem areas. The first is log loss, meaning the difference between what was collected and what was generated by the workload applications. When Fluentd misses a log rotation, you start to lose logs, and the loss can be accounted as the number of missed rotations times the size of each log file. The second is data clogging. When very little memory or CPU is available to Fluentd's output side, the output buffer starts to overflow, and Fluentd starts to push back and slow down its reading so that it can send what it has already processed. That is data clogging. These two are related: data clogging can cause log loss.

Given these scenarios in our architecture, here is our motivation. During worst-case scenarios, when you want to debug and troubleshoot what is happening in your cluster, you want to prioritize log collection at the input level so that you don't miss important logs. And as part of the aggregation process, you want to make sure that your crucial resources, like network bandwidth and persistent storage, are not saturated, given the resource constraints.

As part of our research, we have developed an open source benchmarking tool which allows you to generate and measure log stress conditions. We used this tool to perform our experiments and to make them reproducible. One key feature of the tool is that it lets you configure the log rotation pace in the cluster, so you can control the rotation pace and check how much log loss occurs in your cluster.

Before moving on to the observations, I will give you an overview of our experimental setup. In the general scenario you have two groups of containers. One group is very important, and you don't want to miss logs from it. The other is the less important containers, which are chatty and noisy, and it is OK if you lose some logs from them. The objective is to preserve logs from the very important containers so that you can troubleshoot, which is what matters to you as a developer or an SRE trying to get back to a stable state. The approach we are following is that we can afford to lose some logs from the less important containers and preserve more from what is important to us.

As a baseline, we are using an existing open source plugin called the throttle plugin. It allows you to control the rate of log flow in your pipeline, and if the rate of incoming logs exceeds the limit, it starts dropping logs. In all of these experiments we have two graphs: one without throttling, which is the normal situation, and one with throttling applied to the less important containers. When you apply throttling you start to lose logs, but you also get some benefits, so there is a trade-off in what you choose. In this case, as you can see, when there is no throttling, the rate of collection from each group of containers is pretty much the same.
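For reference, applying the throttle baseline to the chatty group might look roughly like this. It is a sketch based on the documented options of the open source fluent-plugin-throttle; the tag pattern, group key, and limits here are illustrative, not the exact settings used in these experiments:

    # Illustrative fluent-plugin-throttle filter used as the rate-limiting baseline.
    # Records are grouped by container name, and once a group exceeds its bucket
    # for the period, further records from that group are dropped.
    <filter kubernetes.var.log.containers.less-important-**>
      @type throttle
      group_key kubernetes.container_name   # field used to form rate-limit groups
      group_bucket_period_s 60              # length of the rate window, in seconds
      group_bucket_limit 3000               # max records per group per window
      group_drop_logs true                  # drop records over the limit
    </filter>

Note that this throttling happens in a filter, after the records have already been read and parsed, which is exactly the CPU cost that the in_tail feature described later tries to avoid.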
But when you apply throttling to the less important containers, the rate of collection for the important containers, which is the blue graph, increases, and the green graph stays controlled at whatever you set in your configuration. This is what we want, right? In exceptional situations where you otherwise have no control over which logs are collected and which are lost, you are preserving more from the important containers and doing the best you can. In a way, you are increasing Fluentd's capacity to collect more logs while proactively dropping others, so that you stay current with what is happening in your system. The takeaway is that if you reduce Fluentd's CPU usage at any point, whether input, filter, or output, you can collect more.

The second observation is more related to the implementation of the in_tail plugin in Fluentd. This is an experiment we did to test the impact of the output buffer size on in_tail's reading behavior. We varied the size of the buffer, whether for the file output, Elasticsearch, or any other common output plugin you use in your cluster logging, and we saw that the peaks differ. What I mean by a peak here is the instantaneous rate of lines read from each file. Different peaks denote different workloads, and each peak shows how many lines are read from that file, that is, from that workload. When you have a large buffer size, say 1 GB, the number of lines read from each file is different. When you have a smaller buffer size, the peaks are of roughly equal size: irrespective of each workload's generation rate, you are reading an equal number of lines from each.

Why is this important? Because in worst-case scenarios, when you don't know what to do, you need a good amount of logs to know what is actually happening in your system. You need a clear snapshot of the whole system, so you need some information from all of your pods. If one workload goes haywire and generates thousands of logs per second, you no longer have a good snapshot of the other pods, so you cannot debug what is happening. In other words, you need some form of fairness in reading so that you keep a good snapshot of your system.

Based on these observations, if we can control the rate of flow as early in the pipeline as possible, we can save CPU cycles and increase collection, and we can ensure some form of fairness so that we have a good way of debugging the system.

So here comes the feature, which is called group-based throttling in in_tail. You can define groups in the in_tail plugin and define rules for assigning each workload's file to a group. Then you can rate limit the logs collected from those files. The feature ensures that groups are read equally, and rate limiting is done at read time, which saves CPU cycles because you are dropping as early as possible in the pipeline. This is an example for the generic, default use case for Kubernetes. By default, it extracts information from the file path: the workload paths under /var/log/containers follow a specific pattern where the first part is the pod name, followed by the namespace, then the container name and Docker ID. You can specify your parameters in the form of regexes.
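The rule discussed next would look roughly like the following. This is a sketch of the in_tail group syntax described in the pull request, so the exact parameter names and match format should be checked against the PR; the path pattern and the namespace and pod-name regexes here are placeholders:

    # Illustrative in_tail source with group-based rate limiting. The pattern's
    # named captures (podname, namespace, container, docker_id) come from the
    # /var/log/containers file path; each rule matches captures against regexes
    # and caps the lines read per group per rate_period.
    <source>
      @type tail
      tag kubernetes.*
      path /var/log/containers/*.log
      read_from_head true
      <parse>
        @type none
      </parse>
      <group>
        pattern /^\/var\/log\/containers\/(?<podname>[^_]+)_(?<namespace>[^_]+)_(?<container>.+)-(?<docker_id>[a-z0-9]{64})\.log$/
        rate_period 30s                  # window over which the limit applies
        <rule>
          # placeholder namespace and pod-name regexes
          match {
            "namespace": "/namespace[123]/",
            "podname": "/app-.*/"
          }
          limit 200                      # total lines per group per rate_period
        </rule>
      </group>
    </source>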
So this rule says: match all containers that are in one of the namespaces namespace one, two, or three, and whose pod name starts with "app" followed by anything, and rate limit them to 200 lines every 30 seconds. The limit of 200 lines is per group, so 200 divided by the number of files, that is the number of workloads in the group, is the number of lines read from each file; for example, with four matching files, each file contributes at most 50 lines per interval.

However, the in_tail plugin is not only used for Kubernetes, right? So we have made this grouping pattern generic so that you can use it for other files as well. You can define named captures in your group pattern and then specify a hash of matching keys and values in the match parameter of the rule directive. This allows you to customize and generalize your grouping rules so that you can use them anywhere you want. The feature will be available in the 1.15 release of Fluentd, which I think will be released at the end of May. You can have a look at the PR; it has not been merged yet, so feel free to look at it and leave reviews if you want.

Now, coming to what we are doing as part of Red Hat and IBM Research. Just a reminder, this is the API we use to configure our operator. What we are now doing is defining policies through which we can control different components of our cluster logging pipeline. For example, we can simply limit the rate of logs being sent to Kafka to one gigabit per second to avoid saturating the network link, because, as I said, persistent storage and network bandwidth are crucial. Similarly, you can control certain pods from a namespace with certain labels and define a per-container limit, which is a rate limit per file, or a rate limit for an entire group. Or you can simply ignore certain pods, meaning you don't collect from those pods at all. In this way you are again saving crucial CPU resources and concentrating on what is important.

So how does this API look when we apply these policies? In the red box you can see the limit reference: we are dropping logs if the incoming log rate exceeds a maximum of 50 records per second. And we apply this rate limit to an application input where we have defined custom groups, in this case: collect all lines from pods named "less-important" in the "log-stress" namespace. In this way you can control different aspects of your cluster logging pipeline, whether it is the input, the output, or the filter components.

To summarize this talk: we identified the different bottlenecks in our cluster logging pipeline, and we showed you a benchmarking tool we developed for generating and measuring stress conditions. Through our experiments we saw how to increase collection with the throttle plugin, and what the impact of the output buffer size is. Finally, we came up with a new feature in in_tail which allows you to control log loss and add throttling at the input level. As part of our work with Red Hat, we are working on policy-based log flow control so that you can control different aspects of your pipeline, including Fluentd and Elasticsearch.

This is our team of five members, and if you have any questions, please feel free to reach out to us at this email. Thank you. If you have any questions, we are here to answer. Yeah.
So my question is: is it normal to think of dropping logs instead of increasing the capacity of the aggregator? I mean, I have never been in a situation where I wanted to drop logs; I want to improve things so I can avoid dropping them.

So in Kubernetes, the CPU, memory, and other resources of the logging stack, like those of any other set of applications, are also limited. We don't want to take them to infinity. When there are applications that emit a lot of logs, really a lot of logs, it does make sense to put some threshold or limit on the amount of resources that the logging stack itself takes from the system, because otherwise it will start to affect the other applications that you have on the cluster. That is why it does make sense, when there really are a lot of logs, to start to see log loss. And this is exactly where you want to see that log loss: on the containers that are not the most important containers in your system. We want to balance the effect, and that is exactly what we are doing here.

Okay, so basically it is about deciding which logs to drop, because you can also put limits on the containers at the Kubernetes level, but then you cannot decide which logs to drop, right? Yeah, exactly. Thank you. Any other questions? We're good? We're good, thank you. Thank you very much. Thank you.