Kubernetes, Kubernetes, Kubernetes — Kubernetes is such a popular topic it became a joke, right? And my presentation is about Kubernetes as well. But it's a short one, so don't worry too much. My name is Matt, and I work at Affinity. If you would like to learn a bit more about the company, you can go to lematri.com — I think that page will show you better what the company is about. Today I'm not focusing on the things we are doing at Affinity; I'm focusing on more basic stuff, related to Kubernetes autoscaling and Karpenter. By the way, this QR code is just my LinkedIn page; there is nothing to win there.

Right, Kubernetes autoscaling. We have the classic autoscaler, and we have the new autoscaler I'm going to present today. And at the end I have a very short demo.

In Kubernetes we have the control plane, and we have our workloads. Workloads are deployed as pods on instances — the nodes here. Things are very easy if our workloads are static: we don't have to worry too much about adding or removing nodes. But things get complicated if that's not the case. There might be a situation where we deploy a new workload and, in the classic approach, a new pod cannot fit onto the existing instances — the control plane will just keep it pending and keep trying to schedule it until there are resources. So something has to provide those resources. Kubernetes itself does not manage the nodes; we manage the nodes. But part of the Kubernetes project is the classic autoscaler, the Cluster Autoscaler. It is universal — it works with many clouds; it's part of the Kubernetes project, so it's not cloud specific. On Amazon we can use managed node groups or autoscaling groups, and the Cluster Autoscaler will add or remove our nodes through those. The way it works is fairly simple: we have pending pods; the autoscaler detects them and decides, "we have these pods, we need to schedule them, let's add a new node." And maybe there is a node with no pods on it — in that case it will try to remove it. A very simple, effective approach that we've been using for a long time.
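To make that concrete, here is a minimal sketch of the kind of workload that triggers a scale-up — the name and pause image follow the "inflate" example from the Karpenter getting-started guide, and the numbers are illustrative. The container just idles while holding its CPU reservation; if the existing nodes cannot fit all eight one-core replicas, the leftover pods sit in Pending, which is exactly the signal an autoscaler reacts to:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate                  # illustrative name, borrowed from the Karpenter docs
spec:
  replicas: 8                    # more cores requested than the current nodes offer
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2  # does nothing, cheaply
          resources:
            requests:
              cpu: "1"           # each replica reserves one full CPU core
```

The same shape of workload comes back in the demo at the end of this talk.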
And now we have this new thing: Karpenter. Karpenter is also an autoscaler — a capacity manager, very similar to the classic one. But the difference is that this autoscaler really looks into the pods. The classic autoscaler only cared about what had not been scheduled yet; it didn't care what kind of pod it was. It also uses node groups, so we have to define our node groups before it can do anything. Karpenter is a fairly new technology — Q4 2021. It has been developed by folks from Amazon, and currently the support is for AWS, but in general the concept should work with other clouds too, if someone does the implementation. And it's open source. The first interesting thing: no node groups, no autoscaling groups. So how does it work then? It works by talking directly to the EC2 API — the nodes that will handle our workloads are managed directly by this controller. What's the benefit of that? There are several. It's a bit faster, and we are able to provision very diverse instance types: with the classic approach, if we have 20 different types in mind, we probably have to deploy 20 different node groups.

Faster scheduling means better availability. I'm a reliability engineer, and for me it's very important to have a reliable service and to be able to provide the infrastructure to run the pods. Another factor that makes this autoscaler interesting is that it can minimize operations by removing that middle layer. And smarter scheduling means a better return on investment — basically, we pay less. We're not paying for resources we don't use; our nodes are only as big as the pods need.

Installation is fairly simple. We need some existing nodes in the cluster — we can even use Fargate for that; the point is that when we deploy the cluster, we need a place to put the controller. Beyond that, it's very similar to the previous approach: we need IAM permissions, and we need to tag the resources the autoscaler will use, such as subnets and security groups. The tool itself can be deployed using Helm, and it's configured through resources inside the cluster, including a ConfigMap.

Now we have two control loops. The first is very similar to the existing autoscaler: some pods are unschedulable because we don't have enough nodes, and both the classic Cluster Autoscaler and Karpenter will provide that capacity. But there is another loop which is unique: with existing capacity, Karpenter can reschedule things onto better-optimized nodes. In this example we start with three similarly sized instances, and after the optimization we have one big instance and a smaller one — the wasted space is gone, so we are paying less.

This is a good moment to take a step back and think: why do we care about this stuff? We are professionals, and this is a new project. For me personally, Karpenter is attractive — I would like to evaluate it and maybe introduce it into production in the future. But for now I would like to learn more about it: how will this technology actually benefit the company I'm working for?

Right. So there is one important resource, called Provisioner. It's a custom resource introduced by Karpenter. I'm sorry if you can't see it — there's a lot of small text — but basically it's a Kubernetes resource where we specify what kind of instances Karpenter can provision. Previously we defined this through managed node groups: what's the maximum size of our cluster? Let's say I only have budget for 10 nodes, so my maximum is 10 nodes. Here, instead, I can specify that my maximum resources are 100 CPU cores and maybe one terabyte of memory — and in the example I also put a GPU. We also have the ability to specify what kind of instances to use: spot instances, different families, different sizes. There is a lot of customization possible, and of course for different workloads we want to use different instances.
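As a hedged sketch of what such a Provisioner can look like — this uses the karpenter.sh/v1alpha5 API from the Karpenter versions of this era, and the exact limits and values are illustrative, not the speaker's actual file:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # The budget ceiling -- Karpenter's analogue of capping a node group at 10 nodes.
  limits:
    resources:
      cpu: "100"            # at most 100 CPU cores in total
      memory: 1Ti           # at most 1 TiB of memory in total
      nvidia.com/gpu: "4"   # illustrative GPU cap
  # What Karpenter is allowed to provision, expressed as node requirements.
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64", "arm64"]
```

Anything not constrained here — instance family, size, zone — is left for Karpenter to choose based on what the pending pods ask for.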
It would be very nice if Kubernetes provided an abstraction layer that is completely universal, but the truth is there will always be some application that needs tweaking — maybe GPU instances or TPU instances, maybe an application that performs better with a certain type of CPU. So we want the ability to taint our provisioner so that only certain pods will schedule there. Another important thing: if you deploy to production, you want to deploy across different zones and different instance types, so that if instances go down, your service is still operating.

So what happens if we have multiple provisioners? The recommendation is to have different provisioners for different workloads, so that they are clearly distinct. If provisioners are similar, they will basically be used at random.

The Provisioner is one of the custom resources; another, introduced recently, is the AWSNodeTemplate. When you're using Kubernetes — EKS specifically — it's best to use the Amazon-provided node images, meaning the template is made by Amazon; they usually work very well. But you have the ability to bring your own images with your own user data, and this custom resource is how you do it. It's a fairly new addition to the tool.

I already mentioned that we want to connect our pods with specific nodes, and we have different ways to do that. We can use the standard Kubernetes specification for node affinity: we label our provisioner, and the scheduling happens the way we want — there's a sketch of this right after the caveats below.

When dealing with autoscaling, there are some things you have to be careful about — it is a new technology. First, networking. Historically, it actually made sense to have a lot of smaller instances, and the reason was a limitation of Amazon networking: there is a limit on how many IPs a single instance can get, so there is effectively a limit on how many pods we can put on a single instance. This has improved in the last few years, and maybe with IPv6 it will no longer be a problem, but it was an issue and we need to be aware of it. Another thing: when our instances have attached volumes, like EBS volumes, we need to be careful about availability zones — an EBS volume is tied to one zone. If you want to move between zones, there are alternatives like Elastic File System or FSx. I also said there is the ability to customize node images, but my recommendation is to use the Amazon-provided ones, because they usually don't have issues. I did have issues where the kubelets on my custom nodes were failing — Kubernetes could not connect to them, and it turned out to be a bug in the Kubernetes code base. Very hard to debug. Spot instances are very nice, but one thing to remember: you want the ability to choose from a wide range of instance types for spot, because that gives you the best chance of getting a good price — or of getting an instance at all. And pod sizing: autoscaling is all nice, but if our pods are not using resources as we planned, we are basically wasting money.
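Coming back to pinning pods to a provisioner, here's a minimal sketch, assuming a hypothetical GPU provisioner — the taint key, names, and image are illustrative. Karpenter (v1alpha5) labels the nodes it creates with karpenter.sh/provisioner-name, so standard node selection works against that label, and the taint keeps every other pod off those nodes:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: gpu                              # hypothetical provisioner for GPU workloads
spec:
  taints:
    - key: example.com/gpu               # illustrative taint: only tolerating pods land here
      effect: NoSchedule
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
---
apiVersion: v1
kind: Pod
metadata:
  name: training-job                     # hypothetical pod that must run on GPU nodes
spec:
  nodeSelector:
    karpenter.sh/provisioner-name: gpu   # pin to nodes created by that provisioner
  tolerations:
    - key: example.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: main
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.2  # stand-in workload
      resources:
        requests:
          cpu: "1"
```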
Right, so we have all this cool stuff — what can we do next? I've been focusing a lot on the nodes, but of course what we really scale in our applications is the pods. The typical approach is horizontal pod autoscaling. Of course, some applications are not so nice — we cannot really scale them horizontally, for different reasons. So we still have vertical pod autoscaling, which means we basically schedule a new pod with different requests for CPU and memory — we have to create a new one.

Scaling from zero, right? If we want to save money and our service is not being used over the weekend, maybe there is a way to have zero pods, so we are not paying. How to achieve that? We probably want some kind of queue, and we schedule pods based on queue size. And for some workloads, scheduling based on time is really cool: on the weekend you scale your cluster in — or maybe out, depending on your workload.

A really cool thing about Karpenter, which I personally like very much: the typical, classic autoscaler has "expire when empty" — when a node runs out of pods, the autoscaler keeps it for some time, because a new workload might come in. Karpenter has that as well, but it also has a node time-to-live, which does not exist in the classic autoscaler. I like it very much because it allows us to do a little piece of chaos engineering: we can define that our node will basically die after some time — and not die in a way that affects our workloads; it's smart about it, so the pods are rescheduled onto new nodes. I like this very much because I want the ability to frequently refresh my nodes, for different reasons — updates, or my application itself: I want to be sure my application works well, and rescheduling pods is a good way to ensure there are no surprises.

Pod customization, right? Different architectures, like arm instances, are great — I highly recommend Graviton. And different availability zones: for some applications you want to stay in one zone, because maybe availability is not as important to you as the cost of moving data between zones. And of course there is the ability to say that a pod should not be evicted at all.

Cool. We have a lot of different resources — the Karpenter website, of course. I highly recommend the documentation: it's fairly short and good, not very complex. I think a person with four hours can go through Karpenter and learn how it works.

Thank you very much — now for the demo. No, no, it's not done yet; I'm just depending on my slides a bit too much. I have a short demo. It's short because I prepared a lot of things beforehand. What I have here is, again, a Kubernetes cluster. I can do kubectl and list my nodes: I have one node. I have deployments, and there is one. I can get pods across all namespaces. So this is a very typical setup — I have the Karpenter controller installed, and there are some things that need to run in the cluster. What I will do now is scale my deployment — maybe not five replicas; let's go crazy and do eight. I already showed that there is only one node; now there is another one coming up. But what's important for Karpenter is this resource: kubectl get provisioner. We have one provisioner, default, and this is really cool. This provisioner is very simple: my specification says a maximum of 1000 CPU cores, I want spot, and the amd64 architecture. I don't specify any instance type — I don't say I want a t3a.xlarge or a 2xlarge.
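A hedged reconstruction of a provisioner in that spirit — spot-only, amd64, a 1000-core ceiling, and no instance types named. I've also folded in the two node-lifetime knobs discussed a moment ago; the field names are Karpenter's v1alpha5 API, and the values are illustrative:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  limits:
    resources:
      cpu: "1000"                 # overall budget: at most 1000 CPU cores
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]            # spot only
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]           # amd64 architecture
    # Note: no node.kubernetes.io/instance-type requirement here --
    # Karpenter is free to pick whatever instance type fits the pods.
  ttlSecondsAfterEmpty: 30        # the classic "expire when empty" behavior
  ttlSecondsUntilExpired: 604800  # node time-to-live: recycle every node after 7 days
```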
It's a very simple configuration. So what I have right now is two nodes: the basic node where my controllers are running, and this new node. Let's take a look at the console — at Compute. I can see a t3.medium, which comes from my managed node group, and a c5.4xlarge. I didn't decide that I wanted that instance; I just specified that I want spot, and something that will fit my workload. And my workload is very simple: it's a deployment that does nothing and consumes one CPU core per pod. So basically I said I want to consume eight CPU cores, and what Karpenter did was find me a spot instance that can handle this workload. Thank you very much — that's all from me, and it's time for questions.

[Audience, partly inaudible:] Karpenter is great in terms of proactive node scaling, but one point is that it doesn't do rebalancing — so for momentary spikes, the cluster is tailored exactly to the data it has. Karpenter proactively creates a new node, but nothing more. I think this is one of the considerations why it may not be ready for production yet. But other than that, yeah — it's a fantastic idea.

I know what you mean. I think what you just said is that for some things, the autoscaler is not smart enough. We have to accept the fact that just booting up instances takes time, so if we have a sudden spike and no prior data to predict it, it might affect our workload. If we know there are spikes, we probably don't want this exact fitting — we want to have some spare capacity, right? That's true.

[Audience:] To define what, exactly? [inaudible] Okay. Right. So I think what you mean is that we want to pair our pod with a specific instance, right? [Audience:] No — in terms of specific machine types. I want to prefer machine type A before it goes to machine type B, and there could be a good reason — I'm not interested in spot, for example; I just want to run my workloads on type A. Suppose type A runs out and there are no more nodes available — only then go to type B.

So in the example I gave, there were C5, M5, and R5, and what you're saying is, for whatever reason, you would prefer C5. I'm not sure what the use case would be; what I can imagine is defining different provisioners. But honestly, I do not know the answer to your question — I didn't think about this. It's a good question. What I do know is that for topology spread constraints we have the ability to say "schedule anyway," so we do have preferences there. Maybe there is a way to have preferences for instance types as well. Good question.
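A side note to this exchange: what a Provisioner definitely supports is restricting instance types — a hard allow-list, not a ranked preference — through the standard node label. A minimal sketch with illustrative values:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: c5-only                    # hypothetical provisioner limited to one family
spec:
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5.large", "c5.xlarge", "c5.2xlarge"]  # allow-list, no fallback ordering
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]        # the questioner was not interested in spot
```

Whether a soft "prefer A, fall back to B" ordering is possible is exactly the open question above.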
So, it's getting late. One hand up: who is using Kubernetes? One hand up — and shake it; shaking is mandatory. We are not shaking. The other hand if you are doing autoscaling. Oh, so few? Wow, your companies must have so much money for all these instances. Cool.

[Audience:] Great talk. When you were doing the demo and you scaled your application — you scaled it to eight, actually using up the CPUs — it gave you a C5 box. You mentioned that you didn't tell it to give you a C5 box; it just happened to be a C5, a compute-optimized type. Does Karpenter consider that? Does it know that you're constrained on CPU more than memory, for example, and that's why it primarily picked C5? Or was that coincidence?

Okay, I think it's a similar question to the previous one. I'm sure that Karpenter is not that smart. We had a presentation about Copilot — if you use Copilot, it's almost like mind reading. Believe me — sorry for the digression — if you do test-driven development drunk, and you first create the code and then the tests, Copilot really knows what you're doing and can tell you exactly what test you want. In this case, there is no artificial intelligence, no machine learning. So to answer your question: I'm pretty sure you have to specify in your Provisioner resource what you really want to get. I'm not familiar with any functionality where it would look at the workload shape. Of course it's aware of CPU and memory requests, but we are using spot — it just chooses spot capacity that is available at a low price.

[Audience:] So essentially it could just as easily have given you a different family — an R5, for example? Yes, I believe so. Okay, thank you. Thank you. Sorry, that was the last question — I'm available afterwards. Thank you.