We will be talking about the KEDA project, and let's start with the introductions. Okay, I'm going to introduce myself. I'm Jorge Turrado, the first one, obviously; Zbynek doesn't match with my face. I'm an SRE expert at Lidl International Hub, aka SCRM Group. I'm a KEDA maintainer, and I'm also a CNCF ambassador and Microsoft MVP in developer technologies and Azure. You have there my handles for Twitter, GitHub and LinkedIn; you can send me a feature request or write me whatever you want. And the stage is yours.

Thank you. So my name is Zbynek Roubalik. I'm from the Czech Republic. I'm an engineer working at Red Hat in the OpenShift Serverless team, and a long-time KEDA maintainer. I'm also active in the Knative community, as a member of the Knative TOC. So let's start, right?

So, we will quickly tell you what KEDA is, we will describe some new features that we have, some best practices, and we will see a demo, hopefully. And before we start, actually, I would like to ask you a question. Who knows KEDA in this room? Can you please raise your hand? Awesome. And who of you are actually KEDA users? And is there anybody who is considering using KEDA? Okay, cool. I will ask again after the presentation.

So, what is KEDA? Let's describe it as a problem. We have an application, and this application is a consumer of some data. In this case, it could be a RabbitMQ consumer: it consumes messages from RabbitMQ. And I have a problem: I would like to autoscale this application, because the application is not doing well in my setup. So, what are my options? If you use Kubernetes, there is the HPA, right? You can autoscale a Kubernetes Deployment with the HPA. But this has a problem in this setup: if you want to autoscale your application based on CPU or memory, which are the only metrics the HPA provides out of the box, these might not correlate with the actual needs of our application, because our application is consuming messages from some external service, in this case from RabbitMQ. We would like to autoscale this application based on metrics from that external service.

So, let's see the solution. The solution is very simple: you plug KEDA into this setup. What KEDA does is scrape metrics from the external endpoint, from RabbitMQ, and based on those metrics it makes the actual autoscaling decision for this application's Deployment.

Let's see this in action, because talk is cheap; let's start with a demo. Okay. Why do we still see the presentation? Oh, okay, I need to stop the presentation. Yeah, that's it. Okay. I'm not going to invest much time here, because basically it's an application that consumes messages from RabbitMQ, and I'm going to deploy a ScaledObject, which is the wrapper on top of the HPA that KEDA uses. So, let's do it: kubectl apply -f simple-demo, and let's deploy the consumer. The publisher, sorry. If I go to that namespace, suddenly my deployment is growing: it had one pod, but now it has four pods. We are not going to wait until the demo ends, because you can already see that it's working: I have published some messages and my workload has scaled out just to consume them. Don't worry, in the final demo we go deeper and check the different pieces; this one is just saying, okay, we are going to talk about KEDA, and KEDA works. This is the demo to make sure we are not lying, okay?
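As a rough illustration of what a demo like this deploys — not the exact manifest from the talk, and all names here (deployment, queue, authentication reference) are hypothetical — a ScaledObject for a RabbitMQ consumer might look like this:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler   # hypothetical name
spec:
  scaleTargetRef:
    name: rabbitmq-consumer        # the Deployment to autoscale (hypothetical)
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      metadata:
        queueName: demo-queue      # hypothetical queue name
        mode: QueueLength          # scale on the number of messages waiting
        value: "5"                 # target messages per replica
      authenticationRef:
        name: rabbitmq-auth        # TriggerAuthentication holding the connection string
```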
KEDA works, so maybe we can finish right now, right? That's it, thank you, thank you all for your time... Now, let's continue. Okay, so let's continue; sorry, I will keep you here a bit longer.

So, as you know, KEDA is a project that aims at Kubernetes event-driven autoscaling, made simple. This is our motto, and we try to stick to it. It allows you to autoscale your applications — your Deployments, your workloads, your Kubernetes Jobs — based on events happening externally, not just CPU or memory, which is what's built into Kubernetes. To achieve that, we have 60-plus integrated triggers, or event sources: components that talk to these external services, so it could be Prometheus, RabbitMQ, and so on. This is our web page.

And this is the community slide. Do you want to cover it, as an ambassador? Give me my opportunity. Okay, basically, the KEDA community is more than a small project. We have several big companies like Microsoft, Red Hat, SCRM Group, Reddit and IBM contributing code, but also using KEDA. That's the list of listed users who are running KEDA in production, and there are really huge names like FedEx and Zapier; there are big players here. And before continuing: we are now trying to better understand user requirements, and that's why we are asking for your help with a very small survey about how you use, or want to use, KEDA. If you can, take a picture; and don't worry, we will show it again at the end of the session, but we wanted to introduce the survey now. Thank you.

So let's go and see some details. For those of you who don't know how KEDA works, I will briefly explain the architecture. There are two main components: the KEDA controller, or operator, and the metrics adapter. Imagine that I would like to autoscale my application. What I do is create a ScaledObject, which is the custom resource that we provide. The KEDA controller is watching for those ScaledObjects or ScaledJobs, and based on that it connects to the external service — for example the RabbitMQ instance — and it creates an HPA. It then provides metrics to the metrics adapter, which the HPA uses to make its decision, because the HPA can use external metrics besides CPU and memory, but you need to find a way to provide those metrics, and that is tricky to do yourself. We try to solve this issue for you.

Also, the HPA doesn't allow you to scale down to zero — you cannot scale your workloads to zero replicas — so we work around that by giving this capability to the operator. The KEDA operator scales from zero to one, and then the HPA takes over and scales from one to however many replicas I have defined in my ScaledObject. So this is very simple, no big deal. We also have some admission webhooks and some other stuff, but this is the main concept: there are two main components, the operator and the metrics adapter, and they do most of the job.

And this is an example of a ScaledObject, one of our custom resources. With a ScaledObject you can target your Deployment, or StatefulSet, or any custom resource that exposes the scale subresource. If you have your own custom resource and you would like to autoscale it, you just need to provide this endpoint; it's very simple, and then we can target it with KEDA, or with the HPA actually.
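For illustration, targeting a hypothetical custom resource that exposes the scale subresource might look like the sketch below; the CRD group, kind, and all names are assumptions, and the cron trigger is just one example of an event source:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp-scaler              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: example.com/v1    # hypothetical CRD that exposes /scale
    kind: MyApp
    name: my-app-instance
  triggers:
    - type: cron                  # scale on a schedule, as an example trigger
      metadata:
        timezone: Europe/Madrid
        start: "0 8 * * *"        # scale out at 08:00...
        end: "0 20 * * *"         # ...and back at 20:00
        desiredReplicas: "5"
```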
So if we look at the ScaledObject, we see that we are referencing the example Deployment in the scaleTargetRef; then there are minimum and maximum replicas, and then the triggers section. In the triggers section we specify — not credentials, but the metadata — for connecting to the external service, and there can be multiple triggers specified for a single ScaledObject. There are other optional fields, but these are the essential ones.

And this is the ScaledJob, our second main custom resource, and this one is for scheduling Kubernetes Jobs. If you look at this ScaledJob, it's very similar to the ScaledObject — it also has the triggers section — but instead of referencing an existing Deployment, you put a whole Kubernetes Job spec into this field, and you can schedule new Kubernetes Jobs based on the events in RabbitMQ (there's a sketch of this below). This is particularly useful for processing long-running executions. Imagine that our consumer application is consuming those messages from RabbitMQ, and based on each message it is doing some expensive calculation which may take hours. Then the HPA style of autoscaling is not ideal, because once we consume the messages and the workload starts processing them, the metrics go down, so the workload will be scaled in while still in the middle of an execution. If you need to run really long workloads, the ScaledJob is the right option, because you can run them as Kubernetes Jobs created in response to those events.

You could imagine a use case for this ScaledJob — go back, please — for instance: do you use GitHub Actions, Azure DevOps, those kinds of CI systems? Imagine that suddenly the HPA decides that the pod that is running your specific pipeline should be evicted and removed, so your pipeline will die and fail. And it's horrible, because why has it failed? It has failed because of the HPA. This specific tool, the ScaledJob, is for that: it ensures that the work started by a pod runs to completion, which is important in cases like CI processes, or long-running processes in general.

So these are our two main custom resources. We also have other custom resources for credentials handling, because obviously you don't want to put your credentials directly in the ScaledObject; you might want to reference them from a secret or a vault instead, so we have special custom resources for this. Okay, so this was KEDA in five minutes, I would say. Six minutes. Speed up.
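Coming back to the ScaledJob from a moment ago, a rough sketch might look like this — the image, queue, and names are hypothetical — with a full Job pod template embedded instead of a reference to a Deployment:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: long-task-runner            # hypothetical name
spec:
  jobTargetRef:
    template:                       # a regular Kubernetes Job pod template
      spec:
        containers:
          - name: worker
            image: example.org/worker:latest   # hypothetical image
        restartPolicy: Never
  maxReplicaCount: 20               # at most 20 Jobs in parallel
  triggers:
    - type: rabbitmq
      metadata:
        queueName: jobs-queue       # hypothetical queue
        mode: QueueLength
        value: "1"                  # roughly one Job per pending message
      authenticationRef:
        name: rabbitmq-auth         # hypothetical TriggerAuthentication
```

Each spawned Job runs to completion on its own, so an expensive, hours-long calculation is not cut short by a scale-in decision.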
So now let's talk about some new features. We'll go through some architectural changes that we've done recently, and we'll talk about certificate management, about webhooks, about metrics, and other cool stuff. So let me start with — okay, this is cool, I don't know — first the old architecture, then the transition, and at the end the new architecture. He just changed the slides, so I don't know.

So basically, this is the old architecture. If you recall from my introduction at the beginning, we have the two main components, the controller and the metrics adapter, and they are responsible for the scaling of our workloads. What we did was open a connection to the external service — the external trigger source, which could be RabbitMQ — from both the controller, the operator, and from the metrics adapter. So basically we had two connections, and they were independent of each other.

We changed this architecture a little bit, so voila: we now open just one connection, from the controller, and we moved the majority of the logic out of the metrics adapter, so the metrics adapter is really just a proxy that talks to Kubernetes, and we do the majority of the work in the KEDA operator. Why did we do it? Because we wanted to, and... no — we will talk about this in more detail later, but here are a couple of examples. First, we reduced the number of open connections. Imagine that you have a large cluster with a lot of deployments whose ScaledObjects are scraping metrics from Prometheus. With this change we just cut the number of connections in half, which could be a lot, so we are reducing the load on the external service.

Second, the KEDA operator does the scaling from zero to one, but the HPA does the rest of the scaling, and previously we had no power over the scaling behavior. Imagine that you have two scalers, two triggers, defined in a ScaledObject. What the HPA does is ask for metrics for each trigger, and then make the scaling decision based on the largest number: whichever metric reports the largest replica count, that is the final replica count. So imagine you have, for example, a Prometheus trigger and a RabbitMQ trigger specified (see the sketch below); the default behavior is that the larger number wins, and we had no power to change that. But with this change, since we are now in control of the whole metrics loop — the operator sends the metrics to the metrics adapter over gRPC, so it should be relatively fast — we can modify those metrics before we send them on. We would like to add new capabilities: for example, with multiple triggers, being able to specify not the highest number but maybe an average, or even more complex logic for evaluating the scaling target. Another cool thing is caching of those metrics, but I will talk about that later.

Is it my turn? They are bored. Sorry for them. Okay.
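Before we get to certificates, here is a rough sketch of that multiple-triggers setup — the server address, query, and queue name are assumptions. The HPA evaluates each trigger independently, and today the highest resulting replica count wins:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: multi-trigger-scaler        # hypothetical name
spec:
  scaleTargetRef:
    name: my-app                    # hypothetical Deployment
  triggers:
    # Each trigger is evaluated separately; the HPA takes the maximum
    # of the replica counts they propose.
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # hypothetical
        query: sum(rate(http_requests_total[1m]))          # hypothetical
        threshold: "100"
    - type: rabbitmq
      metadata:
        queueName: demo-queue       # hypothetical
        mode: QueueLength
        value: "5"
      authenticationRef:
        name: rabbitmq-auth         # hypothetical TriggerAuthentication
```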
If you are not bored: maybe certificate management is not the most interesting and enjoyable topic in the world, but I will try to at least make it funny. Okay. Due to this change that we have made in the architecture, we discovered a strong requirement to encrypt all the internal traffic between the KEDA components inside the cluster. Why? Because otherwise anyone can drop in another application as a man in the middle, and suddenly your AWS or GCP bill grows and grows and grows. And why? Because you are scaling out more than expected, or less than expected. So we needed to introduce a trustable way of communicating, and that's why we introduced a mechanism for automatically generating TLS certificates — self-signed, obviously, we are not a trusted CA — but we also support the capability of providing your own certificates with your own CA, however you have generated them. And they are used not only for the communication between the different KEDA components, but also for the communication between KEDA and the cluster, because the cluster needs to ask KEDA: okay, KEDA, this metric, what value does it have? For that communication we also use that CA. So we have increased the security, but it requires managing certificates. If you are lazy like I am, the KEDA operator can manage them alone; but obviously, if your enterprise security agreements and your enterprise policies require using your own CA, it's doable and it's supported.

Okay, I'm talking about CAs because of how certificates work on the internet. If you are using your own CA, it's really common that, okay, I'm using TLS, and the server — the Prometheus or RabbitMQ server — is using my own certificate. The standard libraries in any language raise an error if the certificate is not issued by a CA in the trusted CA store of the container. So in KEDA we have support for providing your own CAs; this way you can use your own CA in the scalers, with fully validated, encrypted communication, which is another improvement from the security point of view.

And the last but not least point about certificates: we have made the minimum TLS version configurable. The latest versions of KEDA ensure that your encrypted communication uses at least TLS 1.2, which is the minimum TLS protocol version still considered secure. You can set your own value — it's an option you can modify — but by default you are safe. We are moving forward, designing the roadmap to make KEDA safe by default: without configuring anything, KEDA should be safe and fulfill all security requirements, because this configuration is not interesting for users, and if we can do it for you, we delegate nothing to you. Why should we force you to do this boring stuff?

Webhook validation. One really common topic in the GitHub repository, in the issues, is things that, for me as a KEDA maintainer — as a person really focused on autoscaling — sound super clear, but my experience says they are not obvious at all. How many of you know that a workload in Kubernetes — it doesn't matter if it's a Deployment, a StatefulSet, a custom resource or whatever — can only have one HPA? You can have multiple, but if you have multiple, it won't work correctly, because those HPAs will be fighting each other. Exactly. And if you want to scale — not only with KEDA, in Kubernetes in general — based on CPU and memory, you should mix both metrics in the same HPA; if you deploy two HPAs, the behavior will be crazy. These things could seem obvious, but they are not obvious at all. So another feature that we have introduced is validating webhooks for those cases. The validating webhook will block any ScaledObject that tries to scale — that tries to control — a workload that is already controlled by another ScaledObject or by another HPA. The same validating webhook also checks that, if you want to use the CPU or memory scalers, you have those values set in the resource requests of the pod (see the sketch below). If you don't have them defined, the HPA will do nothing — nothing at all — so why wait until that scenario if we can catch it at deployment time, thanks to this validation?
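To illustrate that last point with a sketch (all names are hypothetical): a ScaledObject using the CPU scaler, where the target Deployment must declare CPU requests for the utilization math to work at all:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaler                  # hypothetical name
spec:
  scaleTargetRef:
    name: my-app                    # this Deployment's pods must set
                                    # resources.requests.cpu, otherwise the HPA
                                    # cannot compute utilization -- one of the
                                    # things the validating webhook checks
  minReplicaCount: 1                # CPU/memory scalers cannot scale to zero
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "60"                 # target average CPU utilization, in percent
```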
Too many slides. Now it's my moment. Okay, we have also been working on the observability, because — okay, I have shown you, it's not a joke, we don't record the demos, it's a live demo — but how can I check it? What can I do if KEDA is failing? How could I even notice that KEDA is failing? During the last year we have been working hard on the observability of KEDA, and we have added some metrics. You are in the middle, bro. I need more, it's not enough. Now, Israel — I'm joking, come here. We have added a lot of metrics, because we think — and we are not the owners of the truth, that's why we prepared the survey, and why we attend to issues and Slack discussions, because we want feedback — we have added these metrics just so you can verify that KEDA works, and they are available by simply scraping them from the metrics endpoint.

Cool, let's move on. So these were some new features that we introduced recently; now let's talk about some best practices, real quick, and then let's go to the demo. So, the polling interval. This is an option in the ScaledObject, and — if you recall, there are those two components, the operator and the metrics server — the polling interval applies only to the operator. It sets the frequency of the checks from KEDA to the external service: let's say, I would like to check my RabbitMQ every 15 seconds. That's what the polling interval is for, but it applies only to the 0-to-1 scaling; it is not related to the HPA, to the 1-to-n scaling, because as I told you before, the HPA requests the metrics on its own. That HPA period is 15 seconds by default, and there is an option for it, but that option needs to be set on the Kubernetes platform itself, so if you are not managing the platform on your own, you cannot change this period. It means that if you would like to make the autoscaling quicker — you would like to scale faster, so you would like to decrease this interval — you have no option if you don't manage the cluster.

So how can you bypass this? That's another reason why we made the architectural change: now we are in full control of the metrics flowing into the system, and we have a new feature called metrics caching. Basically, if you enable this setting — and you can enable it per trigger — the only requests for a metric to the external system happen at the polling interval, and we cache the metric in the operator. So when the HPA requests the metric, for example every 15 seconds, but we have the polling interval set to one minute, we only query the external system every minute and the cache is hit in between, and we are also saving network traffic (see the sketch below). It's useful to think about these options when you are designing your solution, because some solutions, for example, don't require such a fast reaction time for scaling — some delay is fine, so you might want to extend these intervals — while other workloads require a higher frequency. So you really want to tweak these settings.
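A sketch of those two settings together (the address and query are assumptions): pollingInterval is a top-level ScaledObject field, while the caching flag is set per trigger:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cached-scaler               # hypothetical name
spec:
  scaleTargetRef:
    name: my-app                    # hypothetical Deployment
  pollingInterval: 60               # the operator queries the external service every 60s
  triggers:
    - type: prometheus
      useCachedMetrics: true        # HPA metric requests are served from the operator's
                                    # cache instead of hitting Prometheus every time
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # hypothetical
        query: sum(rate(http_requests_total[1m]))          # hypothetical
        threshold: "100"
```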
Then there is another cool recommendation, which is the HPA scaling behavior. It's an advanced field in the HPA settings — it's built-in HPA stuff — and there are two options. There is the stabilization window, which is good if you want to prevent flapping of the number of replicas: let's say my metric says scale to one replica, then to ten, then one, then ten. You would like to avoid this situation, so you can specify the stabilization window and it prevents this kind of behavior; it makes the scaling smooth. Also, you can define scaling policies, both for scaling out and in, and with them you can tune the algorithm to scale maybe a little bit faster or slower when increasing or decreasing the number of replicas. So I definitely recommend using these settings in your environment (see the sketch below).

And this is an interesting topic, and I would like you to talk about it. This is another interesting topic, and maybe it's not exactly a best practice — we didn't know the best place to put it, or to explain it — but it's important, because we have introduced support for floating-point numbers. If you don't know, I can tell you: Kubernetes doesn't support floats. Floating-point numbers are not supported at all; Kubernetes changes the scale instead. If you want to say 1.5, Kubernetes will use 1500m — in general milli-units: millicores, milli-whatever — the scale is changed. And that's why you could see in your HPA something weird like 4800m. What does that mean? It means 4.8. Why do we explain this? Because we have noticed in our issues and in our community channels that there are a lot of questions about this, and it comes not from a limitation but from this design in Kubernetes. Thanks to the support for floating-point numbers as a value, you could say, for instance: the value I want to reach is 0.5 potatoes per replica.

And why per replica? Because there is another important topic that we support and want to explain: the metric type. This is not part of KEDA itself, because it is entirely an HPA concept — you will see this in the HPA directly, not in KEDA. Do you know the metric types in Kubernetes? Which metric types are available in Kubernetes in general? Nobody ever knows; don't worry, that's why I wrote them down. We have the AverageValue type, which is the most obvious: it's like, okay, my pod can process 5 messages at once, in parallel, so I want to have 5 messages on average per pod — if I have 10 messages, I will have 2 pods. Easy to understand. The second one is Value. Value is not averaged per pod; it uses another formula — I don't remember the exact one, but you have the link to the documentation where it's properly explained — it's just another way to calculate, because maybe your workload needs a different scaling calculation than a simple average. And if you are using CPU or memory, you probably recognize Utilization, because it's the common metric type for CPU and memory. For instance, if you see a percentage in the HPA, it's because you are using Utilization; if you use any other type, you will not see a percentage. So this is not exactly a best practice, but it's important to share this knowledge, because you can run into these values and metric types, and it's important to know how they work; it's part of the internal workings of Kubernetes and the HPA. Exactly.
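As a sketch combining both of these points (all names are hypothetical): in KEDA, the HPA behavior is set under the ScaledObject's advanced section and passed through to the HPA, and the metric type is set per trigger:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: smooth-scaler               # hypothetical name
spec:
  scaleTargetRef:
    name: my-app                    # hypothetical Deployment
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:                     # plain autoscaling/v2 HPA behavior
        scaleDown:
          stabilizationWindowSeconds: 300   # wait 5 min before scaling in, avoids flapping
          policies:
            - type: Percent
              value: 50             # remove at most 50% of replicas...
              periodSeconds: 60     # ...per minute
        scaleUp:
          policies:
            - type: Pods
              value: 4              # add at most 4 pods...
              periodSeconds: 15     # ...per 15 seconds
  triggers:
    - type: prometheus
      metricType: Value             # compare the raw metric, not the per-pod average
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # hypothetical
        query: sum(rate(http_requests_total[1m]))          # hypothetical
        threshold: "100"
```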
Awesome. So I have to do another demo? Oh yes, I don't earn enough. Oh, sorry, go ahead. Okay, now I'm going a bit deeper into the process. We have a Deployment — a really normal Deployment, you can find this example in the KEDA organization on GitHub — and I have the ScaledObject, as he has explained. But I have another important point: as he explained, we have other CRDs, and in this case the other CRD is the TriggerAuthentication. What are we doing with that TriggerAuthentication? We are using a native integration, in this case with Azure Key Vault, for pulling the secret, so the connection string for that RabbitMQ isn't in the cluster; it is pulled on demand by KEDA. And how can we connect to that Azure Key Vault? We are using managed identities. If you don't know what managed identities are: they are known as workload identity federation in Azure and GCP, and it would be IAM role assumption in AWS. It is the most secure way you can use to connect to the cloud provider infrastructure, because it generates a single token based on the identity. We should speed up? No problem at all.

So yeah, this is the RabbitMQ. If you check it, this RabbitMQ is "not secure", because I have generated my own self-signed certificate, so, as I have explained, KEDA would normally complain — five minutes? Nice. Sorry — but in this case I have a secret with the CA, the CA is registered in KEDA, so KEDA trusts the CA, and therefore the certificate. In this case I am going to publish to the other queue, and I can see the scaling start again, now from 0 to 1. I could keep it running, but we are in a bit of a rush, so I will jump to the next thing: how can I check it? I showed you that we expose metrics; we want to expose metrics so you can monitor how KEDA is working, because it's important — it's critical. If you need to respond, and your pod is down because KEDA hasn't scaled your workload, you are failing in production, and that's not acceptable; we need a way to notice it. In this case we expose metrics, and this dashboard is purely a demo dashboard — let me set it to 30 minutes — but these are a sample of the metrics that you could use. You only need to scrape them with Prometheus and use them in your alerts, in your monitoring dashboards, wherever you want.

Maybe let's go back to the presentation. This is just the future topics; we don't have time, so this is just the list, but basically it's related to our architectural changes, so this is what's coming. And this is the end of the session. Please, if you are KEDA users, take some time to fill in the survey, because it will help us to prioritize the features that you would like to see in KEDA and what we have to improve. And there is also the QR code for the session feedback. So please, if you have any questions, feel free to ask them now. I'm sorry for speeding up the end of the demo; my fault. One microphone here; can we have a mic in here? Does anybody have a question in the meantime?

Hello. So, you mentioned the validating webhook stuff. What happens if I'm deploying everything with a Helm chart, with Flux or whatever, and it's checking the deployment for CPU and RAM being there — what happens if the KEDA resource is deployed slightly before the Deployment? Does it fail?

In general, Helm first creates the Deployment and then the ScaledObject, because custom resources are applied after the built-in kinds. But if you have an edge case that is not covered, you could use webhooks — no, sorry, Helm hooks — to delay the ScaledObject until the rest is deployed to the cluster.

Does anybody else have a question? Yeah — well, I'll go next while the mic gets there. Regarding the Prometheus metrics — thank you — regarding the Prometheus metrics, you showed the dashboard; do you have a community dashboard that is available?

We have a dashboard in the repository. It isn't this exact dashboard — I created this one for this demo — but it's based on that one, so you have the starting point and you can iterate from it. We plan to extend this dashboard.

Alright, thanks.