Hello everybody, we're going to get started with the next session. Hi everyone, my name is Daniel, I'm a CNCF ambassador, and I'm very happy to be moderating this session today — and really happy about the return to in-person events since the pandemic. Today we're going to talk about Kubernetes event-driven autoscaling: KEDA. Please welcome our next great speakers: Jorge from Docplanner — sorry for my bad Spanish pronunciation — and Zbyněk from Red Hat, a Principal Software Engineer. Please welcome them. — Thanks. We're going to talk about KEDA. First of all, I'm one of the KEDA maintainers, and also a Microsoft MVP. No problem at all with the pronunciation. Now Zbyněk will introduce himself. — Hey, my name is Zbyněk. I know the name is pretty hard to pronounce, but you both did a pretty good job, so that's fine. I'm from the Czech Republic, I'm a software engineer at Red Hat working on OpenShift Serverless, which is the Knative stuff, and I'm also a KEDA maintainer. And yeah, we can go ahead. So today we will talk about KEDA. This is the agenda, and we can start. So, what is KEDA? What is the project, and what is its main goal? If there is just one sentence, one thing that you need to know after this presentation, it's that we are trying to make Kubernetes event-driven autoscaling that simple. So maybe we can finish the presentation right here — what do you think? Okay, okay, I'll try something more. So what is the use case? Let's say you have a very simple application: a consumer application that consumes messages from some external service. Say you have a Kafka consumer application that consumes messages from a Kafka topic, and you would like to autoscale this application. How can you do that on Kubernetes? You have the HPA, right?
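As a reminder, a plain HPA scaling that consumer on resource metrics might look like this (a minimal sketch; the deployment name, replica counts and CPU threshold are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafka-consumer
  minReplicas: 1        # a plain HPA cannot go down to zero
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```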
But with the HPA, your options are to scale the application based on CPU or based on memory. And now it is an event-driven world, right? Everything is event-driven, so this might not be the best way to drive the autoscaling, because CPU or memory usage might not correlate with the actual need to scale. So here comes KEDA to the rescue. This is the same example application, just redesigned to use KEDA. You don't have to change the application, you don't have to change the deployment — you just plug KEDA in, and KEDA will scrape metrics from the external service. In this case that is the Kafka topic, and based on the number of unprocessed messages it will automatically scale the application; and if there are no messages, it can also scale down to zero. That is something the HPA cannot do — there is an alpha version of that already in Kubernetes, but currently you cannot do it with the HPA. So this is the main goal of the project. And this is a quote from one of our users: there were already ways to do autoscaling based on custom metrics, but the work needed to actually configure a custom metric is very, very complex, and we try to do the same thing, but very easily. Let me explain the underlying problem. In Kubernetes we have three different metrics APIs. One is the resource metrics API, which is already occupied by the normal metrics server. The second one is the custom metrics API, and the third one is the external metrics API. Usually, a metrics adapter for each monitoring system or event source occupies the custom metrics API. What's the problem? Imagine that you are starting your project and you say: okay, I have RabbitMQ.
So let's install the RabbitMQ adapter directly, for scaling based on queue length. What's the problem? Imagine that later you say: oh, but I would like to get some metrics from Prometheus, or from another monitoring system. The problem is that you cannot, because the custom metrics API is already taken by your RabbitMQ adapter. So you have two options: move everything to Prometheus, scraping the metrics from the different sources into one place, and use that single adapter — and either way you would still use the normal approach of deployment plus HPA. Okay, so why is KEDA useful, at least in my opinion? Who here could imagine that KEDA is useful? Okay. The point is that we have more or less the same scenario here: we have the deployment, but we replace the HPA with the ScaledObject. The ScaledObject is just a CRD that KEDA registers in the Kubernetes API — don't worry, we will see an example of one — but basically it is another object that tells KEDA how we want to scale our application. KEDA will read this ScaledObject and internally prepare everything needed to expose that metric through a single metrics server: KEDA registers itself as the external metrics API server for serving the metrics. And with KEDA we cover all the different sources — we could have one scaler based on Prometheus, another based on Kafka, and there are a lot of different scalers that KEDA supports. Just to give some highlights about KEDA: the only intention of KEDA is to make the autoscaling simple. We don't want to manage anything else — the scope is really narrow. Right now KEDA is a CNCF incubation project; we joined the CNCF two years ago, and after version 2.0, the current version is now 2.7. We usually release new versions every three months. And some highlights about how KEDA is run — is KEDA a project done by a group of friends?
Of course — I expect I can consider you a friend, nothing wrong with that. Also, our users are quite diverse, and if you use KEDA, don't worry, you can be listed; we will be proud of you. At the bottom you can see keda.sh/community, where you can find all the information about the community and about how we manage the project. We try to be as close as possible to the community, because it's a community project, not a business project from a company, and every two weeks we have our open community meeting with all the interested people. Okay, so let's talk about the concepts and the architecture: how KEDA is designed and what it actually does. As we mentioned, we autoscale Kubernetes deployments; you can also spawn Kubernetes jobs based on events; and you can also target a custom resource, if that custom resource implements the scale subresource — so, for example, an Argo Rollout. As we said before, we have 50+ scalers, so different services: AWS, Azure, RabbitMQ, Kafka, you name it. The main concept is that we scale based on the events in the target system. An important aspect is that KEDA itself does not manipulate the data; you need to handle the data transfer to your application yourself — we just do the scaling. Okay, so this is, let's say, the architecture. There are two main components: there is the KEDA operator, which monitors the custom resources, and then there is the metrics server, which provides the metrics to the HPA. Under the hood, for each ScaledObject KEDA creates an HPA: the HPA does the scaling from 1 to n, and the KEDA operator does the scaling from 0 to 1. That way we achieve the full range of scalability. Okay, so this is an example of a ScaledObject. As you can see, it's pretty simple: we just need to tell it which workload we would like to scale.
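A minimal ScaledObject along those lines might be sketched like this (a hedged example; the broker address, consumer group, topic and lag threshold are made up for illustration):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-so
spec:
  scaleTargetRef:
    name: kafka-consumer   # defaults to an apps/v1 Deployment
  minReplicaCount: 0       # scale to zero when the topic is drained
  maxReplicaCount: 30
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: my-kafka:9092
        consumerGroup: my-consumer-group
        topic: orders
        lagThreshold: "50"
```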
That is the scaleTargetRef. Then you define the minimum and maximum replicas, and then there is the triggers section, where you can specify multiple triggers. In this case it is just Kafka, so you specify: okay, this is my Kafka broker, this is the consumer group, this is the topic, and this is the lag — and based on that lag it will automatically scale my application. There are multiple additional options, but this is just a highlight of how you can do it. And this is a ScaledJob example — the other CRD, for spawning new jobs. Basically it's a very similar thing: here you put your standard Kubernetes job specification, and based on the events in the system it creates new Kubernetes jobs. It's very good for batch processing, and especially for long-running executions. If you would like to scale an application that handles some long-running process, the HPA approach might not be ideal, because once you pull the messages from the system and the processing starts, the metric already goes down, so the HPA might scale down your application in the middle of processing. But if you scale jobs, you can just spawn new jobs to do the processing. Now, some advantages. Kubernetes already has a super good piece of software named the HPA controller, and we just try to reuse it. We could merely expose the metric and then configure the HPA to request it — but why couldn't we extend that functionality in KEDA? Why not? Because we can implement some other features, like a fallback mechanism. What is a fallback? Imagine that your Kafka broker, your Prometheus server, whatever, is down, so you cannot read the metric. What do you want to happen in that case? Using this feature, you can specify fallback replicas.
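The fallback is a small section of the ScaledObject spec; roughly (the threshold and replica values here are illustrative):

```yaml
spec:
  fallback:
    failureThreshold: 3   # after 3 consecutive failed metric reads...
    replicas: 5           # ...report 5 replicas for the failing trigger
```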
Okay: in case the external system is unavailable for a number of checks in a row, instead of scaling up or scaling down on bad data, KEDA just reports the configured fallback figure as the value for that trigger. For instance, if you have more than one trigger, only the failing trigger will report the fallback amount of replicas, and the other trigger will expose its real value, so the HPA will proceed based on those metrics. Another important thing: I hope you never need to customize the HPA scaling behavior, because it's a pain — but maybe you need to. KEDA supports specifying those values in a section of the ScaledObject, and it will propagate them to the HPA. So if you need to customize because you need to scale down, or scale out, faster or slower, you can do it. It's a good thing. And one point that is super nice — trust me, I have been on call, and this feature, for me, means a good night's sleep: the capability of pausing the autoscaling. What does that mean? Imagine that you are under maintenance and you need to scale to zero. With other mechanisms or other tools you would need to remove the HPA, because if you simply scale your deployment to zero, the HPA will scale the deployment up again. With this feature, you can just add an annotation to your ScaledObject and say: okay, never mind the metrics, I want to have this fixed number of instances, and I take the risk because I know what I'm doing.
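The pause mechanism is a single annotation on the ScaledObject; something like this (the value is the replica count to hold while paused):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-so
  annotations:
    # hold 0 replicas and stop autoscaling until the annotation is removed
    autoscaling.keda.sh/paused-replicas: "0"
```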
It's really handy for doing maintenance, or just for dealing with problems; for me it's one of the best features. Obviously KEDA, like the majority of software nowadays, exposes Prometheus metrics — because if you don't do that, you are doing things really wrong, in my opinion at least. Right now we have more than 50 scalers, but the most important thing is that the scalers are not opinionated. That means they don't contain any logic: they go to the upstream system and request a metric. If you need to apply any logic based on the metric — "I would like to have that metric plus one", whatever — you can have more advanced scenarios: KEDA supports being extended by implementing your own gRPC server that implements the external scaler interface, or, if you are more comfortable with a REST API, the metrics-api scaler is there, so you can just implement your own REST API and KEDA will get the value from you. So you can extend KEDA easily to fulfill your requirements. One example of these extensions is the HTTP add-on. (Oh, yeah, I always break the microphones, sorry. — You speak too loud, probably. — Yeah, sorry.) The HTTP add-on is just an external metrics server that we use for extending KEDA with scaling based on HTTP traffic. It's in beta, and it's a good example of an external scaler implemented inside the KEDA org. And on the last line — don't worry, the slides will be shared, so you don't need to copy it down — is the link to the ScaledObject definition, where you can see all the capabilities inside the ScaledObject. And what about authentication?
This is an important topic; in my opinion it's one of the most important things in KEDA, because, given the work that KEDA does, it usually needs high privileges on different systems — KEDA needs permissions for listing Kafka brokers, for listing messages. Those are privileges that are quite risky if they are not handled with good manners. To solve that, first of all KEDA allows you to reuse credentials. You don't need to copy and paste credentials in every place where you need them; you can tell KEDA: just go to the workload and take this environment variable, where the secret already is. No problem at all. And how can you go further and reuse even more? KEDA has another two custom resources for this: one is TriggerAuthentication, and the other is ClusterTriggerAuthentication. You can imagine the difference: one is namespaced, and the other is cluster-scoped. Using them, we can extend the security capabilities even more. Why? Because we can specify a secret source — okay, take this value from the deployment, or from that Secret resource — but we can also pull from pod identities. I don't know if you know what pod identities are: basically, pod identity is a mechanism from several cloud providers for using system-managed identities on their side.
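The secret-referencing flavour of this might look like the following sketch (the resource names, trigger parameter and Secret key are illustrative):

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
spec:
  secretTargetRef:
    - parameter: host          # trigger parameter to fill in
      name: rabbitmq-secret    # Kubernetes Secret to read
      key: connectionString    # key inside that Secret
---
# referenced from a ScaledObject trigger:
#   triggers:
#     - type: rabbitmq
#       authenticationRef:
#         name: rabbitmq-auth
```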
So that's the most secure option — like system-managed identities in Azure, or IAM in AWS. But if you don't use them, you can use HashiCorp Vault, directly integrated inside KEDA: without any other integration, KEDA can go to Vault and just read from it, and the same goes for Azure Key Vault. — Yeah, just to add to this: the ClusterTriggerAuthentication is good because, for example, you have an administrator who holds some credentials and doesn't want to share them with the developers. The administrator can define the ClusterTriggerAuthentication object, and the developer just references it from the ScaledObject. So we can do that kind of thing as well. Okay, so to sum up with a highlight: I will show the KEDA side, and you show the Prometheus side, because I don't like it. — Sure. Yeah, so again, this is the ScaledObject. As you can see, it's very simple: just the scaleTargetRef — if you don't specify the apiVersion or kind, it's automatically a Deployment, but as I said before you can use custom resources, for example an Argo Rollout, and then you specify the apiVersion and kind — then the minimum and maximum replicas, and the triggers section. You see, it's just a couple of lines. — (Yeah, you caught me drinking. We need to prepare better for next time.) And on this side is the plain HPA — nothing new under the sun. They seem quite similar. The problem is that, to use this approach, you also need to configure the metrics adapter: just for renaming the metrics and exposing them in the namespace where you need them. So the HPA manifest is quite similar, but the work under the hood is much higher — it's really higher.
With KEDA, you don't need to do any of this: you only need to install KEDA in the cluster, and that's all. — Yeah, so for example, if you would like to scale one deployment, you need to define the rules; and if you would like to autoscale another deployment, you will need to reconfigure the whole adapter again. It is not like the CRD approach. Okay, and what about now? Yeah, demo time. Talk is cheap, right? I can talk, maybe, and you can drive. So as you can see — oh, they don't see it. Okay. So this is the deployment, right? — Yeah, I can do it, don't worry. Basically, just to speed up the demo — and so it's not always me presenting — Jorge will drive. To speed things up, we start in a scenario where KEDA is already installed in the cluster, via the Helm chart, and obviously a RabbitMQ server too, just so we can use the RabbitMQ scaler as the sample. After the demo I will show you where you can download the whole thing. Basically, we have already deployed this file in the cluster. The file has a Secret with the connection secret for the RabbitMQ host, and a super simple deployment — just a sample RabbitMQ consumer: don't worry, it eats the messages from the queue and does absolutely nothing with them. But the most important thing — scroll down — is this: the ScaledObject. In this ScaledObject we are basically specifying, in the scaleTargetRef, what we want to scale.
These values are not important — they are just set to speed up the demo — and with the maxReplicaCount we say we want 30 replicas at maximum. Then the triggers: this is where we introduce the different scalers, the different rules we want to use for scaling. Importantly, it is an array, so you don't have to scale based on only one; you can specify as many as you want, without any limit. — Without any limit. Yeah, just double-checking. — And the TriggerAuthentication: why a TriggerAuthentication? Just for specifying where the connection string for the RabbitMQ server comes from. We define the TriggerAuthentication, and we reference it inside the trigger, to say: this trigger should use this authentication — go there and take the secret parameters from there. And, to make things totally transparent, I will deploy this job. It's a super simple publisher job that enqueues messages and generates the load. (You should mirror the screen, right? As you can see, I'm not the best with macOS. — And you're a Microsoft person, finally, right? — Not me; my bosses are more the Microsoft haters. — You have a typo over there. — Yeah, that's good, thanks.) So Jorge just created the publisher job, which is the thing generating the load, and now we are going to watch the consumer application with kubectl — watching the workload. As you can see, it's already scaled down to zero. I don't know why... — I know why, because I played a joke on him. It's a trap: I added the annotation that I explained during the presentation, the one for pausing the autoscaling. — Okay, sorry, I will remove it live, because otherwise it would be a pain. Best demo ever. As you can see, the annotation is there — this one — so I will try to remove it.
Oh my god. Sorry. — Nice. — Okay, no problem. And now, automatically, the autoscaling has started one instance, because we were in a scale-to-zero scenario — we were scaled down to zero. Remember: the operator has increased it from zero to one, and now the HPA takes control and scales the workload. So in a few moments we will see... okay, right now we have four instances. — Yeah, because we are generating the load: we are publishing messages to RabbitMQ and the application is processing them. We will reach 30 instances, and then, once the queue is drained, sadly it will scale down to zero again. — Okay. So maybe in the meantime we can check the additional stuff, because we don't have much time. — Yeah, in the meantime: this is the URL. This is a public demo that we have, using RabbitMQ. If you go there you will see, step by step, how to reproduce this demo on your own infrastructure. You can try it in minikube, in a local cluster, no problem — that's why we use RabbitMQ. They don't pay us for using RabbitMQ, trust me, or at least not me. But basically, that's the URL: you can find it on GitHub in the kedacore org, the RabbitMQ sample. (Okay, no, the screen is not mirrored. — So, can you move to the slides? — Yeah, sure. As you can see, everything has already scaled down to zero.) So it is pretty simple, and maybe we can go on with the slides — that sounds easier. So, what about the future of the project? We can already autoscale the stuff, but we would still like to stay focused and improve several things. One of them is caching the actual metric values inside KEDA, so you can save the traffic to your external service.
So, for example, to the Kafka broker. Or then maybe we can do some nice analysis on those values and do some predictions — maybe plug some AI/ML stuff in, monitor the metric values, and based on that start scaling a little bit faster, or a little bit in advance. Another thing: as Jorge said at the beginning, there is the limitation that there can be only one extension point for the external metrics server in a Kubernetes cluster. That means you can have only one KEDA installation per cluster, and we will need to somehow solve this — it's another big thing we would like to do. Another cool thing is CloudEvents, because CloudEvents are everywhere right now. We would like to emit some CloudEvents, so you can integrate KEDA with your own portfolio and, based on the events about scaling, do some additional stuff. And that's probably it. Do you have anything else? — Yeah, I noticed we dropped one of the important features: KEDA supports the ARM architecture in the latest version. It's not critical, but if you are on AWS and using Graviton nodes, you can use KEDA there too. — So, do you have any questions? We have five minutes, and we have a bunch of t-shirts here, so if you have questions — good questions — go ahead. We have one over there; maybe we address that first. Okay. So: can you scale a pod according to the load of a different pod? — You need to somehow get the metrics from that other pod. Once you expose the metrics from the other pod, maybe through Prometheus, then you can scrape the metrics for that pod through Prometheus.
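Concretely, that answer corresponds to something like a Prometheus trigger (a sketch; the server address, metric name, query and threshold are illustrative):

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      metricName: http_requests_total
      # request rate of the *other* pod, exposed via Prometheus
      query: sum(rate(http_requests_total{app="other-pod"}[2m]))
      threshold: "100"
```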
So basically, to do the scaling we always need to have a source for the metric. So yeah, this is doable, I suppose. — Cool. Yeah, thanks for the answer; next question. — Okay, yeah. First, I think at the start you mentioned you can create multiple scaled objects for the same deployment? — No, no, there can be only one ScaledObject per deployment. But you can specify multiple triggers in one ScaledObject, so there could be a trigger for Kafka, for Prometheus, for RabbitMQ, or for CPU, all targeting the same deployment. The HPA then works this way: it selects the greatest value to drive the scaling. So it takes the greater value; there's no fight between them. — So I can, for example, scale by metrics and then say: on Wednesday I have an event at 4 p.m., I want to scale it way up — and they will not fight? — Yeah, exactly. But it must always be only the ScaledObject targeting the deployment; you cannot combine ScaledObjects and your own HPAs. It must be only one object, so KEDA knows what it manages. — Yeah. Thank you. — I'm going to go back there. Go for it. — How does it work, for example, in the scenario where you are upgrading an application — so you are triggering a Kubernetes deployment rollout — and it's scaled to zero. What happens? — Nothing, because if you update the deployment, KEDA will, after a few seconds, notice that the deployment has changed — that the replica number doesn't equal what is provided by the metrics server — so it will again scale it to what it should be.
— Well, that's the thing I'm wondering: for example, if you have it scaled to zero, and you are constantly checking, say, the number of messages in RabbitMQ, and then you upgrade the version of that application... — No problem. Because you are always targeting the same deployment, it doesn't matter that the version was changed; it's still the same deployment, so KEDA doesn't care. It should work. — Okay, thank you. — Thank you. So I think we have one last question there. — Hey, so — I'm thinking specifically around Kafka, but it probably applies elsewhere — does KEDA support partition-level scaling? — No, KEDA itself does not scale Kafka; it scales just the consumer application. But, for example, if you are consuming messages from Kafka, it doesn't make sense to scale out to a number of consumers larger than the number of partitions. So it's capped: basically, it will scale only up to the maximum number of partitions. You can override this — there is an optional setting if you would like to scale to more consumers — but it usually doesn't make sense. So yeah, this is covered. — Okay, thanks. — Maybe we can take another question; there was one over there. — Hi. I saw in the demo that you went to, like, 18 replicas, and it went up super fast. Do you have throttling? — This is the stuff that you define at the HPA level: it's called scaling behavior, and there you can define all the options. This is everything that's provided by the HPA. — Can it also be driven by external services? Can my limit be based on the metric of an external service? — I'm not sure I get the question. — So, in my case: I am, I don't know — I am using AWS services.
I am running a batch of pods, and I don't want to hit the AWS throttling limits. Could I put a limit based on that — taking one metric from AWS to see whether I am reaching the throttling limit or not? I don't know if that makes sense. — Well, maybe we can discuss this offline, because I'm not sure I understood it properly. — Okay. Thank you so much. — Yeah, so thank you both for a great presentation and demo today, and thanks everyone for attending. Enjoy the rest of KubeCon!