Okay, now let's get going. Thank you for joining us, everyone. Welcome to today's CNCF live webinar, Dynamic Right-Sizing of Kubernetes for Cloud Cost Savings. I'm Libby Schultz, and I'll be moderating today's webinar. I'm going to read our code of conduct, and then I will hand over to Varsha Naik, DevOps Engineer at Tryg, and Chip Huang, Technical Product Marketing Manager at OCI. A few housekeeping items before we get started: during the webinar you are not able to talk as an attendee, but you can leave all your questions in the chat box, and we will get to as many as we can at the end. This is an official webinar of CNCF and as such is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct, and please be respectful of all of your fellow participants and presenters. Please also note that the recording and slides will be posted later today through the CNCF online programs page at community.cncf.io under Online Programs, and they are also available via your registration link. The recording will also be available on our Online Programs YouTube playlist on the CNCF channel. With that, I will hand it over to Varsha and Chip to kick off today's presentation. Take it away.

Cool. Thanks, Libby. My name is Chip Huang. I'm with Oracle Cloud Infrastructure, and today I'm joined by Varsha Naik from Tryg, one of the largest insurers in all of Scandinavia, to discuss a topic that's top of mind for many of you when it comes to running your Kubernetes applications in the cloud: optimizing the cost of your cluster without compromising application performance. That is why right-sizing your Kubernetes cluster is so important, because if it's done right, you can effectively achieve both objectives. Next slide, please.

So when we talk about right-sizing a Kubernetes cluster, we're really talking about a few things. The first is allocating the right amount of resources for the cluster, and that means memory and CPU. It's important that the cluster has enough resources to run your applications effectively, but you don't want to over-provision and waste resources. So by right-sizing the Kubernetes cluster, you save money because you only pay for the resources you actually utilize. The second aspect is selecting the right type of hardware, the node types. Not all applications behave the same: some need more CPU, others need more memory, some are I/O intensive, and some require specialized hardware in order to run effectively. By providing the right node types and the right hardware to your application, you allow it to perform optimally. The third is that when you right-size a Kubernetes cluster, you maintain a balance of resources across the different applications. That balance is critical because it minimizes resource contention between applications, which can cause instability. So if you right-size your Kubernetes cluster effectively, the cluster operates more smoothly and becomes more stable. A key aspect of right-sizing is also looking at the best way to scale your applications. Depending on the application, that might be the vertical pod autoscaler, the horizontal pod autoscaler, or a combination of both, and you are also looking for the right metrics to drive that scaling.
So just by going through the exercise of right-sizing a Kubernetes cluster, you essentially make your applications more scalable and allow them to run more efficiently. Next slide, please. But right-sizing a Kubernetes cluster for the cloud does come with some challenges. First of all, workloads in Kubernetes are dynamic. The amount of resources your application needs in order to run effectively changes with the load on the application, and because of those changes, the way you right-size your Kubernetes cluster needs to be dynamic as well: you need to adjust along with the load on the application. And before you can even start right-sizing your Kubernetes cluster, you really have to understand how resources are being used by the applications in the cluster, which means you need to find the right way to monitor your cluster. But what are the right tools, and what is the right methodology? These things can be hard to determine. Right-sizing a Kubernetes cluster is also complicated. First of all, you need to understand how the application behaves. But even knowing how your application behaves, you still need to understand, for your cloud provider, what hardware and what types of nodes are available so you can match these up correctly. And finally, right-sizing a Kubernetes cluster can affect application performance, so you have to keep that in mind when you're doing dynamic right-sizing and choose a method that does not interfere with application performance. Considering these challenges, it's crucial to address them effectively, and to shed light on the topic today we have Varsha from Tryg, who will share her experience and expertise on right-sizing Kubernetes clusters and what she did for Tryg. So Varsha, I'll turn it over to you.

Thank you, Chip. Hello, this is Varsha Naik, a DevOps engineer from Tryg Forsikring. I also have my platform manager present here, Piotr Haikovsky, also from Tryg; he's one of our panelists today. So let's get started. Actually, before we get started, let's talk a little about who we are. We are Scandinavia's largest non-life insurance company, headquartered in Ballerup, Denmark. We have over 5.3 million customers and over 7,000 employees. Where do we stand in the market? Across Scandinavia we hold top-three market positions spread across Denmark, Norway, and Sweden, and in Denmark we hold the highest position. We have a broad variety of insurance products available to our customers, spread across various business sectors. For example, in the private sector we have accident insurance, home insurance, pet insurance, health insurance, and various others. In the commercial sector we have insurance for small and medium-scale businesses: workers' liability insurance, property insurance, motor insurance, and the like. And in the corporate sector we have group life insurance, as well as property insurance, transport insurance, and so on. What I'm trying to stress is that we have a huge variety of insurance products across many different business sectors, which means we have a huge amount of data flowing in in real time, both structured and unstructured. We have to collect it from many different sources, and many of these sources hold data on the order of terabytes.
We have to structure that data, model it to the ACORD standard for insurance, and centralize and streamline it so it is available in one place in real time, so that we can feed our analytical and business intelligence services as well as our straight-through processing flows. This is all to improve the customer experience and speed up the insurance process itself. Of course, this involves a lot of technologies, and all the applications are deployed on Kubernetes. And that's what we're trying to achieve today. So the agenda for this session: we'll first have a quick overview of how we do a deployment. Then we'll talk about the challenges we initially faced; there were quite a few, and I'll try to touch upon the major ones. Then the solutions, meaning what we did to actually right-size our Kubernetes clusters. Those solutions span two stages: first, right-sizing the worker nodes in the Kubernetes cluster, and second, the various autoscaling techniques we leveraged to optimize utilization and save cost. Finally, we'll share the statistics and results we achieved before and after these optimizations, and a quick summary.

Coming to the deployment overview: we are using Oracle Cloud, and we have Oracle's Kubernetes Engine (OKE) provisioned using Terraform, infrastructure as code. We use this because it's easy for us to collaborate within the team and maintain the code. We have cloud-native applications deployed as containers inside pods in Kubernetes, and this is done using Helm charts, through GitLab CI/CD pipelines. To start with, we used a basic standard VM from Oracle's VM list and tried to deploy all the applications onto the same type of nodes in one node pool.

Now, talking about the challenges we faced with this architecture. We had a mix of workloads: a few CPU-hungry workloads, a few memory-hungry workloads, a few performance-intensive workloads. And we understood that we could not put all of these application pods under one umbrella and use the same kind of host node for all of them. The second issue: we have a very large deployment, and every Kubernetes cluster comprises at least 2,100-odd pods. That basically means we stress the underlying hosts quite a lot, and we hit a few edge cases. To get around them, we had to use a few customization scripts, and this we were able to achieve using something called cloud-init. cloud-init is basically a customization script that runs as soon as the VM comes up. It is not available on every cloud platform, but it is available in OKE, Google's GKE, and Amazon's EKS. As an example of how we used cloud-init: we wanted to change a few arguments of the kubelet service on all the worker nodes, for instance to increase the system-reserved resources on every worker node, and we were able to achieve that with a cloud-init script (a sketch follows below). Then, coming to the third point, we had a diversified workload, in the sense that there were busy hours and there were idle times, and we had provisioned resources for the busy hours alone.
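To give an idea of what that kind of cloud-init customization can look like, here is a minimal sketch. It assumes a node image whose kubelet systemd unit honors KUBELET_EXTRA_ARGS via a drop-in; the file path, values, and how it interacts with the provider's own node bootstrap differ per platform, so treat this as a shape rather than our exact script.

```yaml
#cloud-config
# Illustrative only: raise the kubelet's system-reserved resources on node boot.
# Assumes the node image's kubelet unit picks up KUBELET_EXTRA_ARGS; on managed
# node pools this has to run alongside the provider's own bootstrap script.
write_files:
  - path: /etc/systemd/system/kubelet.service.d/20-system-reserved.conf
    content: |
      [Service]
      Environment="KUBELET_EXTRA_ARGS=--system-reserved=cpu=500m,memory=1Gi"
runcmd:
  - systemctl daemon-reload
  - systemctl restart kubelet
```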
And this meant that when the load was basically idle, we were still paying for the same amount of resources regardless, and that was costing us too much. Of course, like every other business, we also had budget constraints and had to bring the cost down somehow. I keep stressing that we have a very large deployment: for every Kubernetes cluster we have in our production environment, we have approximately 5,000 vCPUs, approximately 13 terabytes of memory, 300-odd terabytes of storage, and 2,250 pods. And this is for one Kubernetes cluster, one production deployment, and we have at least four of them running at any given time. So there was a dire need to optimize the cost and the resources here.

So we started with right-sizing our worker nodes. We explored what options our cloud provider gives us. Apart from just selecting the host OS type or the VM flavor, we also thought about the processors used by the compute instances. For example, in our project we had a few Java applications that were AMD64-based, and we had Kafka clusters also deployed as pods within the cluster using the Strimzi operator, and we used AMD64-based compute nodes for those. We had a few other Java applications, as in the second box, where ARM64-based images were created and we could easily deploy them onto ARM64 compute nodes. The advantage is that they are highly performant and also fairly cheap in comparison with the AMD64 ones. These ARM64-based compute instances are available on most major cloud platforms, be it Azure, Google, Oracle of course, or Amazon, so we could leverage that wherever possible. Then we had a special requirement for running a database as a pod inside the Kubernetes cluster. You might ask why: it's because we wanted our applications to be able to talk to the database as frequently as possible without incurring much latency. Oracle provides a VM shape that has a local NVMe-based SSD attached to it. This is not available on every cloud platform, but it is available in Google's GKE and Amazon's EKS: you can look for ephemeral storage local SSDs in GKE, or local disk provisioning in EKS. Furthermore, most cloud providers offer demarcations of VM types, such as CPU-intensive, memory-intensive, I/O-intensive, and sometimes balanced, so we could select the VM type of the underlying host based on our workload's needs. Now that we had chosen the nodes to deploy on, we had to make sure that every application pod goes to the right, stipulated worker node. To make sure this happens, we use Kubernetes node affinity, node selectors, and taints and tolerations (a sketch of such a pod spec follows below). And talking about ARM64 again, we were able to build ARM64-based images using Docker's multi-architecture builder, buildx, in case you're interested. Furthermore, now that we've decided on the flavor and have these VM types to select from, we also have to consider how to architect the Kubernetes cluster itself. From our learnings, we suggest discrete node pool planning, basically meaning that every node pool should serve just one purpose and one kind of workload.
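As promised, here is a minimal sketch of what pinning a pod to its stipulated node pool can look like. The pool label, taint, and image name are hypothetical, not our production values.

```yaml
# Illustrative pod pinned to a dedicated ARM64 node pool via a node selector
# and a toleration. The "pool" label, "dedicated" taint, and image are made up.
apiVersion: v1
kind: Pod
metadata:
  name: arm64-app-example
spec:
  nodeSelector:
    kubernetes.io/arch: arm64      # well-known label reported by the kubelet
    pool: arm-apps                 # custom label we would put on the node pool
  tolerations:
    - key: dedicated
      operator: Equal
      value: arm-apps
      effect: NoSchedule           # matches a taint placed on that node pool
  containers:
    - name: app
      image: example.registry/app:latest   # hypothetical multi-arch image (e.g. built with buildx)
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
```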
It'll be easier to manage, and it also makes more sense this way. Consider that I choose a very large node, with a huge amount of memory and CPU, and I decide to deploy all my pods onto this node. There is a disadvantage to this: if we have block volumes attached to the pods on that node, there is a limit on the number of volumes that can be attached to each compute instance, and this is true for all the cloud platforms, so we have to watch out for that. If I take the other extreme, where I have a very small worker node with a very small amount of CPU and memory and I try to deploy the same set of pods, then I'll need more of these worker nodes, which means I'll need more IPs, and there is a chance I run out of IP assignments for the worker nodes in the Kubernetes cluster. So we have to watch out for that as well. Of course, not all cloud providers have this flexibility of selecting the memory-to-CPU ratio. Oracle provides it in the form of flex virtual machine shapes, while a few other cloud providers, Azure for example, provide an exhaustive list of standard VM shapes and sizes to choose from. That also helps in choosing the right size for your kind of workload. And we suggest keeping a limited set of node pools so that each Kubernetes cluster stays easier to manage.

All right, so now that we have decided what hosts to deploy our pods on, let's venture into autoscaling. Just before we start, I'd like to mention a prerequisite for scaling, which is the metrics server. This is basically a server that is deployed as an add-on in most of the cloud providers. The metrics server collects the container resource metrics from all the kubelets on the worker nodes and sends them to the Kubernetes API server, and from there they become available to all our autoscalers, be it horizontal, vertical, or the cluster autoscaler. So that's a prerequisite. It's also what's used when you run a kubectl top command: it reports how much CPU and memory every pod is utilizing, and you get to see all those stats once you install the metrics server.

Now let's venture into the first kind of autoscaling, the horizontal pod autoscaler. As the name suggests, it scales the number of replicas of a particular controller horizontally: when the load is high, it increases the number of pod replicas belonging to a particular Deployment or StatefulSet, and when the load decreases, it scales down again. There are two ways you can drive this. One way is using CPU and memory: based on how much CPU or memory the pods of a deployment are using, it scales out or scales in. The other way is to use custom metrics: your application pod exports some metrics that make sense for scaling to a Prometheus server, which may be installed on the Kubernetes cluster itself or somewhere outside, and the Prometheus adapter then makes those metrics available to the horizontal pod autoscaler as a feedback loop. The horizontal pod autoscaler then decides whether it has to scale out or scale in.
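To make the custom-metrics path concrete, here is a sketch of an autoscaling/v2 HorizontalPodAutoscaler driven by a Pods metric served through the Prometheus adapter. The deployment name, metric name, and target value are hypothetical.

```yaml
# Illustrative HPA on a custom (Pods) metric exposed via the Prometheus adapter.
# "app_processing_lag_seconds" is a hypothetical application metric.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-processor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream-processor
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: app_processing_lag_seconds
        target:
          type: AverageValue
          averageValue: "30"      # scale out when the average lag per pod exceeds 30s
```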
So there are two ways of using the horizontal pod autoscaler, and there's a note to consider: if you're using the horizontal pod autoscaler with CPU and memory, you will not be able to use the vertical pod autoscaler; if you're using the horizontal pod autoscaler with custom metrics, then you can use the vertical pod autoscaler alongside it. We'll get to what the vertical pod autoscaler is a little later in the slides; I just thought I'd mention it here as well.

Okay, before we talk about this graph, a quick introduction to the graph itself. It comes from a custom tool that derives all the metrics and stats from our OKE (Oracle Kubernetes Engine) cluster, and this was collected from our production environment. I will be showing you many such graphs, and they're all collected from production, in line with the deployment figures we showed initially, with 2,000-odd pods and 300 terabytes of storage. Here the x-axis is time and the y-axis is the number of pods, and you can see the number of pods varying over time because the workload demands it. At one point it went below 350, and at another point it even went beyond 400. So it varies with time, as opposed to a flat line parallel to the x-axis, which is what it was before the horizontal pod autoscaler was introduced. Now, with this, are we saving anything? No, unfortunately, because we are only varying the number of pods in the cluster. That doesn't change anything with regard to nodes: the nodes are still static, and the nodes are what we pay for. So with just the horizontal pod autoscaler, we're not achieving cost savings.

So let's bring in the cluster autoscaler. To explain this, I'll take an example. Consider node one and node two, of the same size for convenience, and say each can accommodate a maximum of three pods of the same type, and both node one and node two are fully occupied. Now two more pods try to get scheduled on the same cluster. The cluster autoscaler senses this, provisions a new node, and those two pods, pod seven and pod eight, get scheduled onto that node. That's how upscaling works with the cluster autoscaler. For downscaling, consider the same node one, node two, and node three, and say node two and node three are not optimally used and have enough spare capacity to accommodate two more pods of the same type. The cluster autoscaler senses this, marks one of the nodes for deletion, moves the pod scheduled on that node over to node two, as you can see in the figure, and then deletes node three. This is provided that constraints like pod disruption budgets allow it. So there are a few constraints the cluster autoscaler respects, but if there are no blocking constraints, it will reschedule the pods onto whichever node can accommodate them and reduce the number of nodes. The cluster autoscaler is provided out of the box on a few cloud platforms, like Azure, where we have the liberty to choose the minimum and maximum number of nodes in every node pool, and that's about as much as we can configure. On a few other cloud platforms, like Oracle, we have to install it as an add-on.
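When it is installed as an add-on, it is essentially a Deployment whose container arguments carry those tuning flags. The sketch below uses flag names from the upstream cluster-autoscaler project; the values, the provider name, and the node pool ID are placeholders rather than our production settings.

```yaml
# Illustrative cluster-autoscaler add-on Deployment (abridged).
# Flag names come from the upstream autoscaler; values and IDs are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.2   # match your cluster version
          command:
            - ./cluster-autoscaler
            - --cloud-provider=oci-oke                    # provider name for OKE (assumption)
            - --scale-down-utilization-threshold=0.5      # node must be under 50% use to be a scale-down candidate
            - --max-node-provision-time=15m               # how long a new node may take to become Ready
            - --nodes=1:30:<node-pool-id>                 # min:max for each node pool it should manage
```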
And that gives us the liberty to fine-tune the cluster autoscaler for our needs. I have highlighted a few of the flags we configured and tweaked to make it fit our needs better, and I'll quickly take a few examples. You can see the scale-down utilization threshold, which tells the cluster autoscaler how low a node's utilization should be before that node is considered for scale-down. You can also specify how much time you give a node to get provisioned and reach the Ready state; here we've kept it at 15 minutes. There are many such parameters you can play around with, and you can find them in the GitHub repository, of course. You can also select the node pools on which the cluster autoscaler should act, and provide the minimum and maximum node count for each of those node pools. This way you can pick exactly which node pools are monitored by the cluster autoscaler.

Now, this graph is with the horizontal pod autoscaler and the cluster autoscaler together; this is what we achieve. It's a similar graph from the same tool, but on the y-axis you now see the node count, the number of nodes in the node pool. You can see the node count initially went up to 100, then gradually came down and stabilized at around 60. This is because of our workload: since we are D&A, data and analytics, we have a huge amount of historical data to load initially, and once it catches up to the current time, to CDC, the load is pretty much stable. That's why the node count goes up and then down. Are we saving anything cost-wise here? The answer is yes, because nodes are what we pay for, and with fewer nodes we pay for fewer of them, as opposed to a flat line parallel to the x-axis where we would have to pay for all 100 of them at all times.

So that's an advantage, but is it enough? Our graphs say otherwise. These are graphs of the CPU and memory stats of the same production clusters, the same workload, at the same time. Let me point out a few things in this graph. For CPU, the y-axis is the number of cores; for memory, the y-axis is gigabytes of memory, both over time. The red line is the actual resource utilization of all the pods put together, and the blue line is the sum of the resource requests from every pod's manifest. The requests are the determining factor for the cluster autoscaler when scaling the number of nodes up or down, because a request is the amount of resources guaranteed to the pod. So the blue line is what we pay for, and the red line is what we utilize, and you can see a huge difference between the two. You might ask why. The answer is: if I am a developer and I have an app to deploy, I write the pod manifest so that the resource request is high enough to cater for the busy hours of my application, and I keep a bit of buffer beyond that, because it's an estimate at the end of the day. That's for one app, and we're talking about 2,150 pods. When we put all of them together, this is the cumulative disaster: a huge difference between the utilization and the requests. And we are paying for the requests, by the way.
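To make that gap concrete, this is roughly what an over-provisioned request looks like in a single pod's manifest; the numbers and names are made up purely for illustration.

```yaml
# Made-up example of the request-versus-usage gap for one pod.
apiVersion: v1
kind: Pod
metadata:
  name: over-requested-app
spec:
  containers:
    - name: app
      image: example.registry/app:latest   # hypothetical image
      resources:
        requests:
          cpu: "2"        # sized for the busy hour, plus a safety buffer
          memory: 4Gi
# Off-peak, `kubectl top pod` might report something like 200m CPU and 800Mi memory,
# yet the scheduler keeps the full 2 CPU / 4Gi reserved on a node, and that
# reservation is what drives the node count (and the bill).
```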
And the vertical pod autoscaler comes to our rescue. Again, the vertical pod autoscaler can work together with the horizontal pod autoscaler only when the horizontal pod autoscaler works on custom metrics, not on CPU or memory. As opposed to the horizontal pod autoscaler, which increases or decreases the number of replicas of a pod, the vertical pod autoscaler increases the size of the pod itself. By size I mean the CPU and memory requested by the pod, which is increased based on the pod's actual utilization. This is done at a regular interval, which is configurable, and in steps, which are also configurable in the vertical pod autoscaler. Sorry about that, we had a small glitch there, so I hope you got through the slide where I discussed why we need the VPA: we saw that utilization and the requested resources are way off, and we have to bridge that gap. In this diagram, just for the sake of simplicity, I've chosen some random numbers to depict how this works. Initially you set a minimum and a maximum amount of CPU and memory that the VPA can play with. It starts the pod with the minimum configured CPU and memory, and as and when it senses that the pod is utilizing more than that, it steps up the requested CPU and memory, and steps up further again based on the requirement. Scale-down happens in the same steps, at the same regular interval configured in the vertical pod autoscaler.

This is, of course, installed as an add-on in the Kubernetes cluster, and here I have shared a few flags that might be of use to you. There are many others that can be found in the GitHub repo, but these are the few we'd like to highlight. You can ignore the actual numbers we have configured, because we had to do a few iterations to arrive at them, and so will you: if you want to tailor the VPA to your workload, you'll have to run a few iterations before you get to the ideal values for your project. To start with, a few flags. One is the recommendation margin fraction, which is basically the amount by which the CPU or memory gets stepped up or stepped down; we chose 30%, so every step is a 30% change in CPU and memory. There are also parameters for how long to retain the CPU and memory history before it is discarded, and for which namespace you want the VPA to act upon. We use Prometheus for storage here, so we provide the Prometheus URL, and so on and so forth. This is for the VPA recommender component.
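For reference, this is roughly what the recommender side of that configuration looks like as container arguments. The flag names come from the upstream vertical-pod-autoscaler project, while the values and the Prometheus URL are placeholders, not the numbers we actually run.

```yaml
# Illustrative args for the VPA recommender container (abridged from its Deployment).
# Flag names are upstream VPA flags; values and the Prometheus URL are placeholders.
containers:
  - name: recommender
    image: registry.k8s.io/autoscaling/vpa-recommender:1.0.0
    args:
      - --recommendation-margin-fraction=0.30    # safety margin added on top of the recommendation (the 30% mentioned in the talk)
      - --storage=prometheus                     # keep resource-usage history in Prometheus
      - --prometheus-address=http://prometheus.monitoring.svc:9090   # hypothetical in-cluster URL
      - --history-length=8d                      # how much usage history to load and retain
      - --vpa-object-namespace=my-workloads      # only act on VPA objects in this namespace (hypothetical)
```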
We also have the VPA updater component, where you can provide things like the interval at which you want the VPA to run; we set that to 10 minutes, and you could choose your own. You can also set the minimum number of replicas a controller needs before the VPA will act on it at all. And if you see too many evictions of the pods and want to reduce that, we use, for example, the eviction rate limit, eviction rate burst, and eviction tolerance flags. These basically tell the vertical pod autoscaler that a certain percentage of the replicas must be running at all times when it considers evicting one of the pods of a controller. So those are a few of the flags we touched upon, but there are various others that might interest you.

And with the VPA together with the cluster autoscaler and the horizontal pod autoscaler, we get this graph. It is the same workload and the same production deployment. We see that the red line, the actual utilization of the resources, is now very close to the blue line, the requested resources. You might ask: there is still a big gap initially, at least in the first few days. That's because we configured it to be that way, purely because our workload behaves this way. We have huge historical data in the beginning, and we want to reduce the number of evictions caused by the vertical pod autoscaler, so we configured it like this. You can refrain from doing that, of course. And are we saving with this? Combined, we are saving a lot: our pods are optimized, our pods are scaling, and the nodes are also scaling. The nodes are what we pay for, and that's also optimized, so we save hugely on cost.

The table you see on screen is from the same workload in two different production environments, one without any of these optimization techniques and one, at the bottom, with them. You can see that we had used 4K vCPUs, and after optimization we've come down to 1.7K vCPUs; with memory, we had used 12 terabytes and came down to 6.5 terabytes, almost a 50% cost saving. You may notice a few differences between the tables. That's because we tried to get wiser, so to say: we increased the number of node pools, defined the nodes discretely, and also changed the shapes here and there. That's why you see the differences, but the load being carried is the same. You can see these DenseIO, A1 Flex, and E4 Flex shapes: these are standard VM shapes provided by Oracle. DenseIO is the one with the locally attached NVMe SSD, A1 Flex is the ARM64-based compute instance, and E4 Flex is the standard AMD64 compute instance.

So with this we come to the summary. To summarize: first, explore your cloud provider's standard VM sizes and shapes before deciding what the node type for a particular node pool should be, based on your workload requirements. Then there are the autoscaling techniques: the cluster autoscaler, which scales the number of nodes in the cluster's node pools, and the horizontal pod autoscaler, which ensures the pod replicas scale so that the cumulative footprint of all the pods decreases whenever the extra replicas are not necessary.
And the VPA, the vertical pod autoscaler, which ensures the pod itself is optimally sized, meaning the resource request stays as close to the pod's actual utilization as possible. I hope this was of some help to you. Thank you for your patient listening, and sorry about the technical glitch. We are open to questions and answers; maybe I should get back to the chat. Quite a few.

We are not doing GitOps as yet, for the first question. Okay, let me get to the next question: "If you do a new deployment, the latest value will get overwritten by the value that was in the manifest saved in Git. Do you do anything special to sync to Git?" When I say deployments in production, we do fresh deployments from GitLab CI/CD every time. So the values are in the GitLab repo, and once we update them there, that's what goes to master and that's what gets deployed. So we don't have this issue as of now, but when we get to GitOps, I'm pretty sure we'll run into it.

"With the VPA configuration, resizing may happen every 10 minutes based on one decaying histogram; how much was stability affected by restarts across pods?" This is the reason we had that initial gap you saw in the graph. We start the VPA a little later, and we initially configure the resource request parameters to a very high value, because initially, at least for our workload, that's when most of the loading and the busy time happens. We have a huge amount of historical data to load before we get to steady-state CDC, and until then we don't want too many evictions of the pods, which would just delay the loading process. So we deliberately leave that gap, and we start applying or patching the VPA a little later in the deployment.

"So the horizontal pod autoscaler can't use memory or CPU; what metrics are you using to know when to scale up to multiple pods?" Yes, we use custom metrics. We have applications that export a particular kind of metric which is the deciding factor, because it basically tells us how much we are lagging in time. I won't go into the details, and it may differ in your project, but that value should be the deciding factor for knowing whether your workload is heavy or your applications are busy. If you have such a metric in your application, you can of course export it to the Prometheus server and have the Prometheus adapter make it available to the horizontal pod autoscaler.

"How does the cluster autoscaler handle it when there are one or two pods preventing it from removing the node? Is there any way to flag some pods as not important, or as safe to reschedule if needed?" Yes, this is something we also faced with the cluster autoscaler. We had pod disruption budgets, and in one case we had a single instance of a pod while the pod disruption budget said there should be a minimum of one. In that case the cluster autoscaler effectively raises its hand and says, okay, I can't do anything with this node, it has to stay, because the pod is still there and there's no second replica of it. That is, so to say, a wrong configuration, because pod disruption budgets should start with at least two replicas, in my opinion.
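As a sketch of that recommendation: with two replicas and minAvailable set to one, the cluster autoscaler can still evict a pod and drain the node without violating the budget. The names below are hypothetical.

```yaml
# Sketch: pair a pod disruption budget with at least two replicas so the
# cluster autoscaler can evict one pod and remove a node. Names are made up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: important-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: important-app
  template:
    metadata:
      labels:
        app: important-app
    spec:
      containers:
        - name: app
          image: example.registry/app:latest
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: important-app-pdb
spec:
  minAvailable: 1                 # one pod may be evicted while the other keeps serving
  selector:
    matchLabels:
      app: important-app
```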
And to get around this, I would say you need a bit of manual intervention if you really want the cluster autoscaler to act, or you have to make sure you decide wisely when you configure pod disruption budgets.

"Do you have KEDA along with the VPA? Is it possible?" I am not aware of KEDA, to be frank. Piotr, do you have any inputs on this? This is for collecting various metrics. We could use it, yes, but in our setup we don't have it; maybe we will use it in the future. For example, we use custom metrics from Kafka, so it would be a perfect fit for the purpose. Yes, it is possible.

"So does each app have to push metrics to Prometheus, or does Prometheus pull?" There are two ways Prometheus works: one is push and one is pull. We are doing push, where we push the metrics to Prometheus. And there is configuration in Prometheus itself for how frequently the metrics should be pulled or pushed from the various apps; those are configuration parameters you get when you install Prometheus onto the cluster.

"When the HPA and VPA are used together, what is the difference between their metrics?" Okay, if the HPA is used together with the VPA, I'm presuming that's because the HPA is used with custom metrics, and custom metrics basically means your application exports some metrics that are very specific to your application. The VPA, on the other hand, uses the actual resource utilization: what you see when you run a kubectl top command, how much the pod is actually using regardless of what you've configured as the pod's resource request. That is what the vertical pod autoscaler considers. And for the HPA you use the custom metric, which is your application saying, here is the threshold at which to scale the number of replicas out or in. Hope that answered it, and I hope I didn't miss any questions. I hope the session was of some help to you. If you have further questions, you can of course leave them with the organizer and they will communicate them to us over email. Thank you for your patient listening.

Okay, thanks so much. Are there any other questions before we let everyone go? All right. Well, excellent job. Thank you, Varsha and Chip, and thank you everyone for joining us. You know how to reach our speakers if you have any additional questions. Thanks for joining our CNCF Live webinar today. As a reminder, everything will be online later today or early tomorrow. Just let us know if you have any questions, and we'll see you again next time. Thanks so much.