Are you enjoying the speakers? I think they're amazing. Do you know the one about the Italian, the German and the Englishman who walk into a bar? I'm joking, I'm not going to tell it. We have Salman up here next. Please give it up for Salman.

Welcome everybody, good morning Amsterdam. How are we all doing? This is the last session before lunch, so I'll try and keep it light. Imagine you have a website. It could be any website. There's some traffic coming to it and it's working fine, but at some point you launch your product, an iPhone or whatever, and now there are a lot of people coming to your website, there's not enough capacity to serve them, and people are not happy. What should happen is what Kubernetes has been promising us: the autoscaler. You can autoscale, but it doesn't do everything by default. There's some work we have to do, and that's what we're going to look at today: how do we autoscale based on events, scale out the website, and let people do what they need to do?

But before we start, since you've all come here, I thought we should reward you with something. Everybody knows this tux, right? The Linux Tux. It's a Lego set, and I have it right here. We like Lego. Who likes Lego?
Not everybody is going to get one, though, so there'll be a bit of a quiz: just one question, shout out the number. You know the CNCF landscape? No, not that one. This one. This landscape shows, as Sarah was saying this morning, how many choices we have. There are a lot of projects on there, a lot of cards. So here's a simple question: how many cards are in the CNCF landscape? Some of you might already know this, but I'm not unreasonable, so I'll give you a hint: this is what 20 cards look like. That's the 20 cards for service mesh. So shout a number, and whoever's closest, my colleague William here will hand them this. 400? Higher. 600? Keep going up. 1000? Keep going. 1200? Let's stop here: 1174. Very good, a round of applause for that gentleman. Can you pass it on? 1174. Enjoy the Lego. I don't know how long it takes to build, but apparently the instructions are as good as what we see in the landscape.

A few more things. That's 1174 projects on there, and some more stats: as of last night, 3.6 million GitHub stars, but more importantly, countless community contributions from people working on these projects in their own time. That was as of last night, 1174, but when I woke up this morning it looked a bit different; it keeps increasing every day. I know you're a bit disappointed you didn't win the Lego, but don't worry: I work for Appvia and we have a booth downstairs. If you'd like to win some of this stuff, we have some cool things, so just come through.
We're downstairs: Mark, Lucy and Rory are down there, and I'll be there myself, so come and have a chat. I work for a company called Appvia as an MLOps engineer. We're a startup, a consultancy, and we also have a product that helps you deploy to the cloud more easily. I also work as a Kubernetes instructor for Learnk8s; check out our website, learnk8s.io, where we have a bunch of blogs. You can find me on Twitter at @soulmaniqbal, and I also have a YouTube channel under the same name, so check that out. And that's Appvia: appvia.io. I have to make sure I mention that, because I need to get paid.

Before I start, I really want to thank the hosts and the organizers. They've done an awesome job putting this event together, wouldn't you agree? Thank you so much. I also want to thank my friend Daniele Polencic from Learnk8s, who helped me put this presentation together. Thanks, Daniele.

Here's what we're going to do: a brief look at Services and a quick look at Ingresses. I'm sure you're already aware of what those are, but just to make sure we're all on the same page. Then we'll look at KEDA and how it all fits together, and then I'll show you how to autoscale the ingress. Even though we'll be scaling the ingress, whatever I show you also applies to any other workload you have, so it should work fine.
So, a quick recap. Imagine you want to deploy a website, just a static page. If you want to deploy this on Kubernetes, you usually deploy the website and you have to stick a load balancer in front, because you need to make sure the traffic is distributed evenly. You can have multiple load balancers, because you might be running multiple applications inside, and if you're running multiple applications you want to make sure you route each request based on what it's asking for. In Kubernetes, as we all know, the internal load balancers are Services. They pick the right pod to send traffic to, and the top-layer load balancer is called an Ingress. It sends requests from outside the cluster to inside the cluster, and underneath we have pods. I'm sure we all know this.

We're going to focus on the ingress today, but before I can talk about ingresses, it's important to quickly point out what services are, their limitations in the context of running websites, and why we need an ingress at all. Every time you deploy an application, the pod, with one or more containers, gets an IP address. Then we use a Service as an abstraction layer, because it's convenient to have one. You send a request from another pod, or from somewhere outside, and only the service knows how many pods live underneath it. You could have two, and if a pod dies or comes back up, it doesn't matter: the service keeps track of it.
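A service like the one just described can be sketched as a minimal manifest; the names here are illustrative, not from the demo:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: website          # illustrative name
spec:
  type: ClusterIP        # the default, internal-only service type
  selector:
    app: website         # matches the labels on the pods it load-balances
  ports:
    - port: 80           # port the service listens on
      targetPort: 8080   # port the container actually serves on
```

Any pod whose labels match the selector becomes an endpoint of the service, which is how it keeps track of pods coming and going.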
It will pick a healthy pod, and whether it scales up to ten pods or back down, the pod sending the request doesn't need to know anything; it's the service that keeps track. That's a very high-level definition of a service, probably the one ChatGPT would give you.

But there are different kinds of services. We have the headless service, which is a building block and the most basic kind: all it gives you is endpoints, the pod IP addresses, and it's not very good at load balancing. What Kubernetes also provides is the default service type: if you create a service in a Kubernetes cluster without specifying a type, you get ClusterIP, which builds on headless and acts like a load balancer, so it does a better job than headless at load balancing. The two services so far are both internal; you can't expose them outside of the cluster. Then we have NodePort, which is an external service: it opens a port on each node. But you might have multiple nodes running, so Kubernetes provides another kind of service called the LoadBalancer service. It sits on top of a NodePort and, if you're on a cloud provider, it provisions an actual load balancer and attaches it to the nodes. If you're doing this on-premises and you use the LoadBalancer service type, you need some kind of project like MetalLB to make it work, but it's quite useful if you're with a cloud provider.

There are a few gotchas with services. The good thing is the automatic provisioning: we're talking about external services because we want to access the website from outside of the cluster, and the LoadBalancer type provisions the node ports and attaches them to the load balancer for you. But as I said, it's cloud-provider specific, and on-premises there's extra work to do. And if you've done this before, I'm pretty sure you're aware that cloud load balancers are
not cheap. Imagine everybody spinning up a load balancer for their application; that could cost a lot of money. The other limitation is that it's basically just layer 4: services are fine if you want to expose things that speak TCP or UDP, but for a web application we need something like an ingress, something that understands HTTP requests and can deal with them and forward them to a service. That's why we have the ingress sitting on top of the services. I'm sure you all know this, but that's what it is.

What we can do with an ingress is route traffic based on the request. For example, here I have a cluster with multiple applications: the red, the green and the yellow. You can do path-based routing: if somebody asks for example.com/account, send them to the red pods; if somebody asks for /checkout, send them to the yellow pods. This is where an ingress comes in handy. You can also do domain-based routing with subdomains. That's the point of an ingress. Everybody happy so far? Are we all good? Okay, cool.

So far I've shown you the ingress as just a black bar. I'm sure you already know this, but it turns out an ingress is just a deployment, a regular deployment that differs depending on the type of ingress controller you pick (there are different ingress controllers), plus a service, usually of type LoadBalancer, because it's an external service.
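The path-based routing just described can be sketched as an Ingress manifest; the hostnames and service names here are made up for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  ingressClassName: nginx       # assumes an NGINX ingress controller
  rules:
    - host: example.com
      http:
        paths:
          - path: /account      # /account goes to the "red" pods
            pathType: Prefix
            backend:
              service:
                name: account   # illustrative service name
                port:
                  number: 80
          - path: /checkout     # /checkout goes to the "yellow" pods
            pathType: Prefix
            backend:
              service:
                name: checkout
                port:
                  number: 80
```

Adding a second rule with a different `host` gives you the domain-based routing variant.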
You can expose it, and in this diagram you can see that the black bar has disappeared. Now we have a service of type LoadBalancer and an ingress pod, in this case NGINX, but you can pick any controller you like. There are quite a few, and which ingress controller you should pick depends on your requirements; there's some research on that you can check out on learnk8s.io. So it's basically a regular deployment, and an ingress comes in two parts: the ingress controller, the pod that's running, which understands what to do with the rules you write, and the rules themselves, those Ingress manifests, the YAML files. Who has written an ingress YAML file before? One person in this room; good, I'm sure there's more than one.

An ingress manifest defines the rules for what should happen to a request: where should it go, which service should receive it. We write this YAML file and submit it to the Kubernetes cluster. The cluster takes that request and stores the information in the etcd database. There are a couple of other components, a scheduler and a controller manager, but the information goes ahead and gets stored in etcd. That's where we keep the state, the desired state and the current state, and the controllers do the magic of making sure the current state always converges towards the desired state.
So that's what it does. When you deploy this rule, the ingress YAML file, the ingress pod, the controller, will pick up the information from etcd and configure the rules. If you use NGINX, it writes them into the NGINX config: if the request looks like this, send it to this service. That's all it does. And if you write another ingress, it will take that rule, update its config, and understand it.

So that's the ingress and service back story. Here's the thing about running a website: we now understand the request has to come from outside the cluster, so it has to go via the ingress pod, in this case NGINX. If a lot of traffic comes in, you might end up with a bottleneck if you have just one replica running. You might say: if a ton of traffic comes in, we should scale out. You could just start with ten ingress pods running at all times, which is fine, you can do that, but of course it's going to cost you money. If you want to save some money in these times, especially in the UK where the pound is very down, we need to make sure we can scale down when we don't need the capacity.

As I said, the ingress deployment is a normal deployment. If you deploy it in your cluster as a Helm chart or however else, you can change its settings. There's a line in there called replicas: I can open the file, change it to three replicas, submit it to the cluster, and I'll have three replicas. But that's not very scalable; maybe you have to wake up at 2 AM to do it. We need something better, right?
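If you install the controller with the ingress-nginx Helm chart, the replica count just mentioned is a chart value; a values-file sketch (the `controller.replicaCount` key is from the ingress-nginx chart, but double-check it against the chart version you use):

```yaml
# values.yaml fragment for the ingress-nginx Helm chart
controller:
  replicaCount: 3   # run three ingress pods instead of one
```

Changing this by hand is exactly the manual step the talk is about to automate.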
We don't want to wake up at 2 AM; we want it to happen automatically. So enter our helper, the horizontal pod autoscaler. It's a resource in Kubernetes that scales out pods. The vertical pod autoscaler is different: it changes the CPU and memory, so it can give a pod more CPU and memory; the horizontal pod autoscaler spins up more pods.

In this case, imagine there's a deployment for the ingress: we deploy this ingress controller in our cluster and it's running with three replicas. What the horizontal pod autoscaler can do is query some metric, say the number of active connections in each pod. That sounds like a reasonable metric. We can query that metric and write a rule: if each pod has more than 100 connections, we need to scale up. A reasonable rule, let's say. The horizontal pod autoscaler then does its calculation, and if you breach the threshold, it automatically adjusts the deployment: it can scale up and add another pod, and later scale back down and remove the pods we don't need. So what we want to do is scale based on incoming traffic. With me so far? Everybody good? I know it's 15 minutes to lunch, but we're on track, don't worry.

We basically want to scale on HTTP requests coming in, but you can do it on any metric you prefer; in our case we'll do it on active connections. First of all, we need to have metrics available and we need to expose them, so the horizontal pod autoscaler can take those metrics and do something with them. Usually in your application you serve /metrics, write key-value pairs there, and make it available so something like Prometheus can scrape it. Then we take those metrics.
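The calculation the horizontal pod autoscaler runs is roughly `desired = ceil(current * metric / target)`; here is a small sketch of that rule (simplified — the real HPA also applies stabilization windows and tolerance bands):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA formula: scale proportionally to metric/target,
    clamped to the configured min/max replica bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# 3 ingress pods averaging 150 active connections, target 100 per pod:
print(desired_replicas(3, 150, 100))  # 5 -> scale up
# traffic dies down to an average of 20 connections per pod:
print(desired_replicas(5, 20, 100))   # 1 -> scale back down
```

This is why "more than 100 connections per pod" is enough of a rule: the autoscaler works out the replica count from the ratio on its own.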
We need to store those metrics, because the requests come in over time and we need history to run our calculation. So: collect and store, and then we do the autoscaling based on those metrics; we write some rules and we autoscale. That's what you'll see in the demo as well.

Luckily for me, which makes the demo easier, NGINX already provides some metrics, so I don't have to write anything from scratch. The NGINX pod has a /metrics endpoint that serves these metrics, and we can take this one: nginx_connections_active. That's all the connections coming in; that's what we're going to use. The metrics are available, but we need to scrape them, and for that we can use Prometheus. Anybody like Prometheus here? Yeah, it's a great project. There's also something called the metrics server in Kubernetes that can be used to pull metrics from resources: it takes them from the kubelet and provides them to other resources like the horizontal pod autoscaler. And this is where we use Prometheus: with Prometheus we can scrape the metric and then store it wherever we like. You don't have to use Prometheus, but in this demo we will, and I have a cluster running locally that already has Prometheus installed.

Then we're going to use this project called KEDA, Kubernetes Event-Driven Autoscaling. Have people heard of it or used it? Excellent, quite a few, so if I get stuck you can come up and give me a hand if the demo doesn't work. But it should be good. So here's the thing: we need to expose the metrics.
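One common way to get Prometheus to pick up a pod's /metrics endpoint is the scrape-annotation convention. This is a sketch: the convention is supported by the community Prometheus Helm chart's default scrape config, but whether it works depends on how your Prometheus is configured, and the port shown is the one the NGINX ingress controller commonly serves metrics on:

```yaml
metadata:
  annotations:
    prometheus.io/scrape: "true"    # ask Prometheus to scrape this pod
    prometheus.io/port: "10254"     # metrics port (check your controller's config)
    prometheus.io/path: "/metrics"  # metrics endpoint
```

With that in place, Prometheus both scrapes and stores the time series, which gives us the history the autoscaling calculation needs.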
We need to store them so we can scrape them with Prometheus; then we need some kind of adapter that can take that information and feed it to the metrics API so we can act on it; and then we need something that scales based on the requests coming in, wherever they're coming from. This is where KEDA, Kubernetes Event-Driven Autoscaling, comes in. I'll show you a couple of diagrams of what it looks like in a second. It lets you drive the autoscaling of Kubernetes based on events: events that happen inside the cluster, or events that happen outside the cluster too, which is amazing. If you have a bunch of requests outstanding in a Kafka queue, you can scale on that. There's a bunch of scalers that KEDA already provides that we can plug into, even things like SQL queries: run a SQL query, and if the result is above some number, act on it. So that's why we're going to use KEDA. It's quite flexible, it's open source, and it drives the horizontal pod autoscaler; something needs to feed all this information to the horizontal pod autoscaler for it to scale, and that's what we'll use KEDA for.

So what is it? We have the metrics server, as I mentioned briefly before, which collects metrics from the kubelet running on each node and exposes them to the Kubernetes API; it plugs into the aggregator API. But this metrics server is not enabled by default, so you have to enable it.
When you install KEDA on the cluster, though, that part is taken care of. Then we need an adapter: like the Prometheus adapter, KEDA provides one that can collect and serve custom and external metrics to the horizontal pod autoscaler. And there's a controller that sits in the cluster and acts upon the events. With the controller come a bunch of custom resource definitions, things that KEDA defines, and I'll show you an example of one; a controller is something that extends Kubernetes with extra functionality. And then we have what are called scalers. We're not going to talk about them in depth today, but I briefly mentioned them: for example, you can define rules based on requests coming in from RabbitMQ and scale out your application on that. There are a lot of them: AWS CloudWatch events, Pub/Sub in your cloud provider. It's very neat and quite easy to set up. So this is where KEDA comes in handy, and that's what we're going to do a demo on.

We have 10 minutes, and a lot can go wrong in 10 minutes, but here's the demo. We're going to do it live; if it fails, I have a video as well, and we'll pretend it was live. But I'm sure it'll work. If you want to check it out, Daniele wrote the blog for this: bit.ly/kcd-scaling. It runs on minikube, so you can run it yourself, or in any cluster you like; the explanation and the code and everything is there, so definitely check that out. What I've got, which I'll show you in a minute, is a minikube cluster that's running. I've installed the NGINX ingress on it already, because I do not trust the internet during demos, so it's already running. And I've also installed Prometheus and KEDA.
I used Helm for it, but you can install them however you like; it's just one command line that spins up the pods, and I'll show you the pods in a second. Then we need a way of generating some load, because otherwise you'll think I'm making it up. We're going to use Locust, an open source project, to generate the load, and hopefully, if everything works, we should see the ingress autoscale. Are we ready for the demo?

We'll give the mirrored screen a second to pop up. All right, we're good. Here are a few things I've got, and this is all real. There's a deployment file: I've deployed Stefan Prodan's podinfo pod, I'm sure you've seen it, and everything is running locally. I trust the internet, but not during demos. So that's the pod that's running, and it has a service as well. Is that big enough to read? So we have a service and a pod already deployed, and we have an ingress. In the ingress you can see there's a rule: if the request is for example.com, send it to the podinfo service. I'm not making any of this up, because if I do a clear and then k get pods, you can see there's some stuff running. We have our podinfo deployment; we have KEDA installed, the operator is running, with a bunch of CRDs to go with it; we have the NGINX ingress running; and we've got some of the Prometheus stuff. No Kubernetes demo is complete without running kubectl get pods, so I just wanted to get that out of the way. That's done now. Instead of me showing you lines of text, we have a dashboard here, right?
These are all the pods running in the cluster. Let me see if I can zoom in. It's not a photo, it's an actual dashboard, and we have one NGINX pod running. That's all good; I'll put that to one side and step over to here. Prometheus is also running. We've got five minutes, which is good, that's all we need. Let me quickly zoom out for a second: if I execute this, this is the NGINX ingress active connections metric, basically what we were talking about. It's quite small, but there's only one active connection, because only one request was running.

I've also deployed Locust. Let's open this up: it has its own deployment, and in there you define a ConfigMap. In the ConfigMap we define this locustfile.py, written in Python, and you can give it a task. All it's doing is allowing me to send requests to example.com. And because it's Locust, I can do a bunch of things: give it the host to send requests to, how many users I want to spawn per second, and so on. So we can start swarming the host and sending requests. Let's hit the charts tab. Something is bound to go wrong, but let's go. There you go, charts. Okay, let's go with a hundred users. I'm just going to stick with that, and with that, I think I need the NGINX ingress. The spawn concurrency is 10, so it should start sending requests. Right, it has started sending requests, and I should see the numbers go up on here in a second.
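The ConfigMap's locustfile boils down to something like this. This is a sketch: it assumes the `locust` package is installed in the Locust image, and that sending a Host header of example.com is what makes NGINX match the ingress rule:

```python
# locustfile.py -- runs under the Locust load-testing tool
from locust import HttpUser, task

class WebsiteUser(HttpUser):
    @task
    def index(self):
        # Hit the ingress; the Host header makes NGINX match the example.com rule
        self.client.get("/", headers={"Host": "example.com"})
```

The user count and spawn rate mentioned above are then set in the Locust UI, not in this file.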
We'll see more active connections outstanding on the ingress, so let me do a quick refresh here; sometimes it takes a second to come up. Basically, the number of users is increasing, requests are going in, and the response time will keep going up. What we need is to scale up. For that we have to define something called a ScaledObject, which is a CRD. In here I'm saying what kind of deployment I want to target, the main NGINX ingress deployment, and I define how many replicas I need, and that we're gathering the metrics from Prometheus, because that's where it's scraping from. This is the metric name I'm looking for, and then the query. All we're saying is: if the NGINX ingress active connections metric is more than 100, that's the threshold, do the scaling.

Let's execute this. You can see in here, it's a very faint line on the edge there, but there's a number of requests going in, and there's only one pod. Now we're going to focus on this part here. We'll deploy the ScaledObject, and we should see the number of NGINX pods increase. If it does, go wild; if it doesn't, still go wild. Let's do it: kubectl apply -f to apply the ScaledObject. Okay, that's created. What this will do is look at the metrics, and it should start increasing the number of pods. Let's do a k get deploy and just make sure. There you go: the pods have started to spin up, and it will keep spinning them up based on the rule that we set. So now we've got it working.
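Pieced together, the ScaledObject from the demo looks roughly like this. The deployment name, Prometheus address, and metric name are assumptions based on a typical ingress-nginx plus Prometheus install; check the actual names in your own cluster:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
spec:
  scaleTargetRef:
    name: ingress-nginx-controller     # assumed deployment name
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc:80  # assumed address
        query: sum(nginx_ingress_nginx_connections_active)      # assumed metric name
        threshold: "100"   # scale up once active connections pass 100
```

Under the hood, KEDA turns this into a horizontal pod autoscaler fed by its external metrics adapter, which is why the replica count on the deployment starts moving on its own.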
We got there. As you can see, the number of requests per pod is now going down, because we scaled up based on the traffic, based on an event. And if I keep this running, once all the traffic is served, it'll scale back down. That's what the ScaledObject does, and you can do this not only for an ingress: you can do this for any workload you need.

One last thing: let's hop back to the slides. Where's my pointer? So that's what you saw. We talked about services and ingresses, then we exposed some metrics, gathered them, and scaled on them. Kubernetes doesn't do this by default; there's some work we have to do to get all of this running, and you saw it working. If you want to try it out, as I said, there's already a blog for this: bit.ly/kcd-scaling. Check out learnk8s.io, where we have some awesome blogs on Kubernetes, and we do some training as well. Also check out appvia.io. If you have any questions or anything like that, I'm downstairs all day; our booth is in one of the areas, so come and grab me, and let me know whether this was useful or not. If you want to stay in touch, my name is Salman Iqbal; you can find me on Twitter at @soulmaniqbal, and also on YouTube. Apart from that, thank you so much.
Have a great day and a great conference.

Thank you, Salman, that was pretty intense, very good; the demo gods have been listening to you. While I was watching Salman, I started thinking of an experience I had. Excuse me, people who are leaving the room, one thing: there is a workshop going on, so please go outside and to the left, to the sponsor area. Lunch is going to be served in the sponsor area, and the lightning talks are in the workshop area.

Back to us. What I was thinking is, I always thought I'm a very intelligent guy, above average, certainly the average in this room, but I realized it wasn't the case when I had to marshal YAML into Go. Anybody know what I'm talking about? Did you hate your very existence when that happened? There are online tools for it; that's what I thought, why shouldn't there be an online tool? But I went the stupid way. That's when I realized I'm not above average, quite below, and I wrote my very own tool to marshal it. Two days later a colleague of mine said, but I use this one online tool. Yeah, you're not very kind.

Anyway, thank you very much, Salman, it was very good. And I would like to ask you again: please let the workshop people go on, go through the smoking area outside, grab a smoke if you like, or don't, and then go through to the sponsor area, where you can grab your lunch. We'll have the lightning talks in the workshop area, and I'll see you here at quarter to two, at 1:45, with Robin Sipman from ING. Thank you very much. Koen, can you please give us the screen for a moment? We want to test one laptop.