In this video, we will explore how cluster scaling works for worker nodes as the workload running on the cluster changes. We can scale an IPI-provisioned OpenShift cluster in two ways: manual scaling or autoscaling. Both types of scaling are managed using custom Kubernetes objects, in a Kubernetes-native way.

Let us understand how manual scaling works first. Every worker node corresponds to an object called a machine. These machines are managed by what we call machine sets: just as replica sets manage pods, machine sets manage machines. You need a machine set per availability zone. So if you want to increase the number of machines in an availability zone, you edit the count for that zone's machine set.

Let's look at these objects using the CLI first. As I said, these are custom Kubernetes objects, so we have custom resource definitions for machines and machine sets, and we are looking for custom resources of those types. Machines and machine sets both live in the openshift-machine-api namespace. So let's get a list of machines in the openshift-machine-api namespace. This shows us three machines that represent masters and three machines that represent worker nodes. Let us also look at the machine sets. There are four machine sets, one for each availability zone, but we have machines running only in the first three availability zones, which are us-west-2a, us-west-2b, and us-west-2c. Note that machine sets are defined only for the worker nodes.

Let us see these on the web console now. We'll go to the Compute section.
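For anyone following along in a terminal rather than the console, the CLI lookups described above can be sketched in Python. This is only a minimal sketch: it assumes the `oc` client is installed and logged in to the cluster, and the helper names `list_command` and `run_if_available` are made up for illustration.

```python
import shutil
import subprocess

# Namespace named in the video for machines and machine sets.
NAMESPACE = "openshift-machine-api"


def list_command(resource: str, namespace: str = NAMESPACE) -> list[str]:
    """Build the argv for `oc get <resource> -n <namespace>`."""
    return ["oc", "get", resource, "-n", namespace]


def run_if_available(argv: list[str]):
    """Run the command only if the binary is on PATH; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Print the two lookups from the walkthrough without executing them.
    for resource in ("machines", "machinesets"):
        print(" ".join(list_command(resource)))
```

Calling `run_if_available(list_command("machines"))` against a live cluster would return the same listing shown in the video, assuming the logged-in user can read that namespace.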
So we have six nodes, three masters and three worker nodes, and we have six machines corresponding to those nodes, along with the four machine sets that we saw from the CLI. The us-west-2a, us-west-2b, and us-west-2c availability zones are running one machine each.

We'll edit the count for the first machine set, in us-west-2a, and change this number from one to two. This will scale up the cluster. As of now, one out of two machines is available. In a few seconds, we'll have a new virtual machine coming up in AWS in us-west-2a. Let's look at the machines now. We have an additional machine that has come up in us-west-2a. It will take a couple of minutes for this VM to be ready. You can see that the new virtual machine is getting initialized, and that virtual machine is now running. If we go back to check the nodes, that newly spun-up machine is now getting ready to become a worker node. The status has now changed to Ready, which means that it is ready to accept workloads.

So to summarize: just by changing that count, the machine set created a new machine, the corresponding controller spun up a new virtual machine on AWS, and then it installed OpenShift to make it a worker node.

To scale down, we do the exact reverse. I'll edit this count from two to one and, within no time, the number of worker nodes is reduced from four to three. You can see there are only three worker nodes running.

While the entire scaling mechanism of spinning up a new virtual machine, installing OpenShift on it, and making it a node on the cluster is automated, I still have to manually edit the count in order to scale the cluster. Can we automate this part as well? That is, based on the amount of workload running on the cluster, can the cluster adjust its size automatically, so that it spins up additional worker nodes when needed and scales down when they are not needed? That is where cluster autoscaling comes in.
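The manual scaling step above boils down to changing one field, the replica count on a machine set. Here is a minimal Python sketch of that change expressed as a JSON merge patch and applied to a local copy of the object. The machine set name `example-worker-us-west-2a` is hypothetical, and the manifest is trimmed to the fields that matter here; a real machine set carries much more.

```python
import copy


def replicas_patch(replicas: int) -> dict:
    """Build a JSON merge patch that sets spec.replicas on a machine set."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return {"spec": {"replicas": replicas}}


def apply_merge_patch(obj: dict, patch: dict) -> dict:
    """Apply a JSON merge patch (RFC 7396 semantics) without mutating obj."""
    result = copy.deepcopy(obj)
    for key, value in patch.items():
        if value is None:
            # A null in the patch removes the key from the target.
            result.pop(key, None)
        elif isinstance(value, dict):
            # Nested objects merge recursively; a non-dict target is replaced.
            base = result.get(key) if isinstance(result.get(key), dict) else {}
            result[key] = apply_merge_patch(base, value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Trimmed, illustrative machine set; the name is an assumption.
machine_set = {
    "apiVersion": "machine.openshift.io/v1beta1",
    "kind": "MachineSet",
    "metadata": {
        "name": "example-worker-us-west-2a",
        "namespace": "openshift-machine-api",
    },
    "spec": {"replicas": 1},
}

scaled_up = apply_merge_patch(machine_set, replicas_patch(2))    # one to two
scaled_down = apply_merge_patch(scaled_up, replicas_patch(1))    # back to one

if __name__ == "__main__":
    print(scaled_up["spec"]["replicas"], scaled_down["spec"]["replicas"])
```

On a live cluster, the same change is typically made by editing the count in the console as shown in the video, or with `oc scale machineset <name> --replicas=2 -n openshift-machine-api`.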
To enable autoscaling, we have to set up two more kinds of objects: a machine autoscaler for each machine set, which manages the number of replicas (that is, it edits the count on the machine set based on workload needs), and a cluster autoscaler, which manages the machine autoscalers to scale the number of machines up and down. There is just one cluster autoscaler for the cluster. We'll have a minimum and maximum number of replicas configured for each machine autoscaler. Why? Because we don't want the cluster to go out of control and keep spinning up additional nodes if a bad job consumes a lot of resources. So we set limits within which the cluster scales up and down.

So let us set up machine autoscalers for our cluster. Under the Compute section, let's go to Machine Autoscalers and create one. I'm creating a machine autoscaler for the machine set in us-west-2a, and I'm setting the maximum number of replicas to four. In the same way, I'll create two more machine autoscalers for us-west-2b and us-west-2c. We now have three machine autoscalers, one for each availability zone in use, with the minimum number of machines as one and the maximum as four.

Now, let's create a cluster autoscaler. I'm capping the maximum number of worker nodes on the cluster at 10 and enabling scale-down. Let's create the cluster autoscaler. We have now set up everything needed for cluster autoscaling.

At this point, we are ready to generate some workload that will scale up the cluster. I'm creating a new project, and I'll deploy a workload generator to this project. As this job gets started, let's look at the number of pods. A bunch of pods get spun up, and each of them consumes a lot of CPU, so it will force the cluster to scale up. As you can see, some of the containers are getting created, and the others are pending and waiting. Let's look at the number of machines.
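The autoscaler objects created through the console above correspond roughly to the manifests built in this Python sketch. Treat it as an illustration under stated assumptions rather than the exact objects from the video. The machine set names are hypothetical, and the API versions and field names reflect my understanding of the OpenShift autoscaling API, so check them against the documentation for your cluster version.

```python
import json

NAMESPACE = "openshift-machine-api"


def machine_autoscaler(machine_set: str, min_replicas: int = 1,
                       max_replicas: int = 4) -> dict:
    """Build a MachineAutoscaler that bounds the replicas of one machine set."""
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    return {
        "apiVersion": "autoscaling.openshift.io/v1beta1",
        "kind": "MachineAutoscaler",
        "metadata": {"name": machine_set, "namespace": NAMESPACE},
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "scaleTargetRef": {
                "apiVersion": "machine.openshift.io/v1beta1",
                "kind": "MachineSet",
                "name": machine_set,
            },
        },
    }


def cluster_autoscaler(max_nodes_total: int = 10) -> dict:
    """Build the single cluster-wide ClusterAutoscaler with scale-down enabled."""
    return {
        "apiVersion": "autoscaling.openshift.io/v1",
        "kind": "ClusterAutoscaler",
        "metadata": {"name": "default"},
        "spec": {
            # Cap of 10 as set in the video. To my understanding this field
            # counts all nodes in the cluster, not only workers.
            "resourceLimits": {"maxNodesTotal": max_nodes_total},
            "scaleDown": {"enabled": True},
        },
    }


# Hypothetical machine set names, one per availability zone in use.
MACHINE_SETS = [f"example-worker-us-west-2{zone}" for zone in "abc"]
manifests = [machine_autoscaler(name) for name in MACHINE_SETS]
manifests.append(cluster_autoscaler())

if __name__ == "__main__":
    # Emit one List document that could be saved to a file and applied.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests},
                     indent=2))
```

The minimum of one and maximum of four per zone match the values chosen in the video. Keeping the bounds in one function makes it harder to create a machine autoscaler whose minimum exceeds its maximum.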
We can see that new machines have already come up, so cluster autoscaling is working. In a few minutes, all these newly spun-up machines will become OpenShift nodes.

Let's switch over to Grafana to see how the workload is increasing. I'm in the namespace where the workload is being generated, and this shows the CPU usage and the memory usage. The machine sets view shows that the cluster is trying to scale up to four machines in us-west-2c and two machines in us-west-2b. The machines that were added are now mapped to nodes. So if we look at the nodes, almost all of them are now ready. This shows how the cluster can be set up to scale up automatically. The job takes a few minutes to complete, after which the cluster will scale down automatically.

We can see that the workload has now gone down, both in terms of CPU usage and memory. This happened a few minutes back, and the cluster is gradually scaling down. Zones us-west-2a and us-west-2b are running just one machine each now; us-west-2c is still running three, and it should go down in a few minutes. The total number of nodes has now reduced to five. All the pods are now in Completed status.

I waited a few more minutes, and the cluster has now completely scaled down. You can see that all the machine sets are using just one out of one machine, and the number of machines is down to six again: three masters and three worker nodes. The number of nodes is also down to six again.

To summarize, in this video we have seen how to handle scaling manually, and we also saw how to set up machine and cluster autoscalers so that the cluster scales automatically based on the amount of workload running on it. I hope you enjoyed this video. Thanks a lot for watching.