Good evening and welcome to this last session of today. My name is Bastian Hofmann, and my name is Simon Pierce. Bastian and I both work for a company called SysEleven. SysEleven is a fully managed hosting provider based in Berlin with a strong focus on managed hosting; we've been offering managed hosting to our clients for the last eleven years. One product we started this year is managed Kubernetes on OpenStack, and today we want to talk about how you can scale and autoscale these Kubernetes clusters on OpenStack.

The good thing about Kubernetes as a container orchestration platform is that it makes it very easy to scale your applications and services. Why do we want to scale in the first place? There are many answers to this question, and one of them could be: to handle increased workload. Some of you may have encountered the unfortunate event that one of your marketing guys sets up a big campaign and forgets to tell your department. What is going to happen?
Most probably your application is going to crash. This is one of the situations where autoscaling and Kubernetes can help you. Another reason is maybe your financial department: you only really want to pay for what you actually need to use. Resources are expensive on public clouds, and you probably don't want to book too many. Another reason could be that you want to help save the environment.

So now that we know scaling is very important, to reduce costs and to reduce energy consumption, we have to think about how we can scale our application. There are two ways. The first one is horizontal scaling: you either increase or decrease the number of instances of your application that are running, or the number of nodes in your cluster. The second one is vertical scaling: you either increase or decrease the CPU or memory available to one instance of your application, or you increase or decrease the CPU and memory capacity of one node in your cluster.

If you want to do this on your bare-metal infrastructure, the first thing you have to do is order a new server: you probably write a ticket to your data center, and a couple of weeks later, if you're lucky, you have a new server delivered. Next you have to put the server into a rack in the data center somewhere, wire up all the cables, and install an operating system on it, hopefully not Windows 95. After that is done, you can start provisioning the server with all the necessary dependencies, with Puppet or Chef or similar tools, to get the right Java version or whatever your application needs to run. Only then can you get to the fun part: deploying your actual service and code to this new server, and probably also reconfiguring the load balancer to get traffic to this new node and instance. These are a lot of steps, and they take time, sometimes days in between. If you want to scale up very quickly because you have a TV commercial coming up, this is way too slow.

Next we would like to look at the cloud provider approach. What does this process look like if we're using one of the public clouds? We could start off with creating a VM from an OS image, which could be something pre-configured by your cloud provider and already available. Then we would also need to provision the server with the necessary dependencies needed to run the application, so the same step again, and we also need to deploy our services to the new server. Reconfiguring the load balancer also needs to be done. So we still have a lot of steps to perform before we actually get to our end result.

Of course, cloud providers say: we have these things called auto-scaling groups, which are kind of nice and in theory can help with this by automatically booting up new VMs if the load increases. The problem is that a lot of these auto-scaling group APIs are very proprietary, meaning they are different for every cloud provider. If you're on AWS, for example, and you now want to move to OpenStack, or move some of your workloads to OpenStack, you have to implement all of that again. Kubernetes makes this a lot easier by standardizing the APIs for this.

To show you, also in a demonstration later on, how this works, let's talk a bit about how Kubernetes actually works internally. To start off, it's quite important that we're talking about the same terminology and that we all understand the Kubernetes basics. So first I would like to explain what a container actually is. A container normally runs a Docker image.
It doesn't have to; there are other container runtimes around that can be used. As good standard practice, a container should run one process and one process only. There are of course child processes that a process can fork, but otherwise we should be talking about one process. There are a lot of bad examples around on Docker Hub, which maybe some of you have encountered, but this is normally the way it should be done.

In Kubernetes terminology we often come across something called a pod. A pod can be one or multiple containers. Normally we assume that they share certain things, like networking, because they need the same networking. They could also share the same storage if you wanted to; that's not necessary, though.

Here's an example of what a pod could look like. On the left-hand side we have the PHP-FPM container, and in front of it is our nginx container running our web service. So nginx would typically connect to the PHP-FPM container and forward the PHP requests to it. And just to make sure we have a logging facility, we have a sidecar that also belongs to this pod, which is used to send our logs on to Logstash, for instance.

Just to give you a short overview of what a deployment YAML file may look like in Kubernetes: the kind is Deployment, we've got the API version, we've got the metadata of what we're actually going to deploy, and the actual container image that is going to be pulled is the nginx demo in version 0.2. If we don't state anything else here, this image will automatically get pulled from Docker Hub.

If we now have this deployment and want to scale it horizontally, we can use a feature in Kubernetes called a ReplicaSet. In this case we have one pod running here and we want to have it three times in our cluster. So in a ReplicaSet we can tell Kubernetes: please run it three times, and Kubernetes will then ensure that it's running in the cluster three times. Even if a node goes down where these containers are running, it will reschedule the containers on some other node where there's available space.

To give you an idea of how this looks in code: we have our deployment, and this deployment basically includes a ReplicaSet, which includes a pod definition, and with this replicas count we can tell Kubernetes to run it three times.

The next thing we would like to talk about is vertical scaling. How can I use vertical scaling for my containers? We're normally talking about one container that we need to scale up because the process needs more CPU or maybe more memory. In general we're looking at a few fields that Kubernetes uses for scheduling; we'll come back to this later in one of the demonstrations we're going to show you today. One of the important fields here is the resource request. The resource request is something that Kubernetes uses for its scheduling decision, and that's important: before a pod is actually scheduled, Kubernetes checks the CPU and RAM entered in these fields and looks for a node that has that capacity. If it doesn't find one, it will not be able to schedule that container or pod. Limits, on the other hand, cap the CPU and memory a container can use, so it will not be able to go over them.
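The deployment described above, with a replica count plus resource requests and limits, could look roughly like the following sketch. The name `hello-world` and the concrete CPU and memory values are illustrative assumptions, not taken from the talk's slides:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world                 # hypothetical name
spec:
  replicas: 3                       # Kubernetes keeps three pods running at all times
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: web
        image: nginxdemos/hello:0.2   # no registry given, so pulled from Docker Hub
        resources:
          requests:                 # used by the scheduler to find a node with capacity
            cpu: 100m
            memory: 128Mi
          limits:                   # hard cap; the container cannot exceed these
            cpu: 500m
            memory: 256Mi
```

If no node has the requested CPU and memory free, the pod stays in pending state, which is exactly what the node-autoscaling demo later in the talk relies on.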
These are values that can be changed manually: you can basically edit your deployment, which we'll show you later, and change them. Of course, changing this manually is nice; you can decide at some point "I need three replicas" or "I need five CPUs for this service". But what you actually want is to change this automatically, based on usage, based on the traffic your services are getting, so that you don't have to get up at night when there's a spike in traffic, and so that you don't pay too much for capacity you don't need. In the end this is all about focusing on what's really important, and what's really important is not your infrastructure but the applications running on it, because that's what's making the money.

So let's show this in a live demo. First we need a few preparations to get this demo up and running. We're going to need a Kubernetes cluster, which we'll try to get up and running in a few moments, and we would like to show you the difference between doing it yourself and using a managed Kubernetes solution.

Setting up and maintaining Kubernetes is hard. Who of you has set up Kubernetes themselves, like really the hard way? Okay, keep your hands up: who liked it? Okay, that's what I expected. It's definitely a difficult and very tedious pastime, and most people forget a lot of crucial components as well, or they don't back up etcd correctly and run into certain problems in the end. Monitoring is also a crucial component that you're going to need. A managed offering hopefully takes care of a lot of these things, gives you a certain amount of carefree usage, and takes away jobs you just don't want to do. Because who actually wants to focus on Kubernetes? Most people don't; most people want to focus on their application.

To name a few of the popular managed Kubernetes offerings: GKE, one of the first managed offerings, on the Google Cloud; Amazon with EKS; SysEleven with MetaKube.

So what are the advantages of a managed solution? One of them should definitely be easy upgrades. If you run your Chef or Ansible playbook to try and upgrade your Kubernetes cluster, you're probably going to sit there with sweaty hands, hoping that everything works and that your containers are still running afterwards. I've definitely broken a few clusters trying to do this.

Easy scaling is also something you should take into consideration. You don't have to go to the rack like Bastian was telling us earlier, or pre-order a server because your capacity has run out; on most public clouds you can just get more resources when you need them.

Load balancing is also a big thing people need to take into consideration. You need to be able to expose your services to the outside world, because what are your services worth if they're only reachable inside the Kubernetes cluster? So you're going to need some form of load balancing, which you probably don't have if you don't run on some form of managed offering.

Distributed persistent storage: so you need to manage all these hardware nodes and the Kubernetes installation, and on top of that a persistence layer, which you also need to take care of. That's a lot of things to manage. And one of the last things people often forget is backup, and even worse than that, recovery. Do they actually test their backups? Do they actually work? We've done a lot of this for you.

Premium support is also something that can take away a lot of pain, if anything goes wrong with your application or you need to talk about scaling, how things are done properly, and best practices.
You can come to us and talk about these sorts of things, and mostly we have answers.

Monitoring is also one of the key things you need in order to properly set up and maintain a Kubernetes cluster. If you're not looking at your KPIs and at what's going in and out of your cluster, it's just not going to work. And are you actually receiving those alerts? Are you getting paged correctly when things are triggered? So basically the key message is: you can focus on what is important.

Just to give you an idea of how this can look in a managed offering: to set up a Kubernetes cluster, instead of doing it the hard way with a thousand steps and lots of documentation, you can just log in, sign in, and create a new cluster. Give it a name, choose the Kubernetes version, click next, choose the region you're running in, provide your OpenStack credentials, choose the OpenStack project your nodes should run in, choose how many nodes of which flavor should be running, maybe add an SSH key so you can connect to them manually later if you need to, click create, and a few minutes later you have the nodes up and running. Which is a lot less than Kubernetes the hard way.

So now that we have a Kubernetes cluster, let's talk about how we can scale our workloads on it. The first thing I want to show you is how we can manually scale a pod, a single instance, with a ReplicaSet, in a horizontal way. Okay, to do that we're going to switch to our first demo, hopefully.

Okay, in our first demo we would like to show you how you can scale up a deployment. So let's see... no, not delete. While Simon is typing: on the top right side we're showing the pods we are running in our namespace for one deployment; at the moment we have one instance. On the lower right side we're showing the CPU and memory usage of every pod, which is automatically measured by Kubernetes.

Okay, so we can see our deployment here; we've basically got a deployment with one replica at the moment, and we should also have a service if we have a look. If we look at the service, we have a service of type LoadBalancer. The type LoadBalancer should automatically give us a public IP; we can quickly check this, and we have this public IP running here. You can also see that the server name shown is the name of the pod, so it's actually a live running demo, we're not faking it.

If we want to scale this up now, we can use the kubectl scale command. We can scale this deployment fairly easily from one replica up to three replicas. What you're going to see now is new containers, or new pods, getting spawned on the right-hand side. What's happened? Let's copy and paste this. Okay, sorry about that. Okay, so you can see the containers coming up here on the right-hand side, and now our deployment should have a desired state of three, with three current pods running. We could also scale it back down at any time, and we could also edit the deployment. Yep, edit the deployment hello-world.
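The manual scaling shown in the demo boils down to a couple of kubectl commands. The deployment name `hello-world` follows the demo; this is a sketch of the workflow, not a transcript of the exact commands typed on stage:

```shell
# Scale the deployment from one replica to three
kubectl scale deployment hello-world --replicas=3

# Watch the new pods appear and check desired vs. current state
kubectl get pods --watch
kubectl get deployment hello-world

# Scale back down, either directly or by editing spec.replicas interactively
kubectl scale deployment hello-world --replicas=1
kubectl edit deployment hello-world
```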
So if we look through this YAML, we should see our replica count from earlier, now set to three; the scale command has done nothing else than edit this. So we can now change this number back to one, save the file, and we should see the pods getting terminated again. Yeah, that's happening.

This was manual scaling of pods, which is already very nice. The next thing we want to show you is how you can do this automatically when the load on the server, or on the pods, increases. To do this we can use something that's basically already built into Kubernetes: the HPA, the horizontal pod autoscaler. The pod autoscaler can be set to check certain metrics; the standard is to look at CPU. We're going to set this to a very low value now, 5%, for demonstration purposes. Of course you would not do this in production, please.

Then we're going to create some load on this pod with ab, the benchmarking tool from Apache, and hopefully the metrics API will pick up the increased workload, which should fairly quickly hit the 5% target we're looking for. Once this happens, we should see the horizontal pod autoscaler kick in. At the bottom left we're running a describe on that HPA, and you should fairly quickly see this start to happen, hopefully. It takes a few seconds, because it needs to get the metrics data, and it's still at zero. So hopefully this will happen shortly; let's see. Because we're trying to run this over a mobile connection, something could be limiting the number of requests we can make in parallel. Let's try the Wi-Fi instead; with a mobile connection, something could limit the amount of parallel requests we're making to one host.
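Setting up the HPA with the demo's 5% CPU target can be done directly with kubectl. The deployment name, replica bounds, and ab parameters below are illustrative assumptions:

```shell
# Create an HPA targeting 5% average CPU (demo value only; far too low for production)
kubectl autoscale deployment hello-world --cpu-percent=5 --min=1 --max=10

# Generate load against the service's public IP with Apache's ab tool
ab -n 100000 -c 50 http://<load-balancer-ip>/

# Watch the autoscaler pick up the metrics and add replicas
kubectl describe hpa hello-world
```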
So we need to get a certain number of requests to that one host. Okay, we're connected to the Wi-Fi; let's go then. Yeah, we can see the CPU cores used have already increased, so there's already a certain amount of load, which is over our 5%, and you can see on the other side that the first pods are already in state pending, so they're being started now. So we've already increased the number of connections our application can accept, without really doing anything other than setting up the HPA with a certain CPU target.

Okay, so this is horizontal pod autoscaling. The next thing we want to show you: what if we actually want to vertically scale a process because it needs more resources, because we're getting OOM errors all the time and one of our containers definitely needs more memory? For that we need manual vertical scaling.

So what we can do is edit the deployment again, and if we look down here at the pod definition, we see the resource requests for CPU and memory, and we can just increase them. That's the request we were talking about, which Kubernetes of course uses for its scheduling decision. So if we edit those and increase the CPU to half a CPU and the RAM to 700 megabytes per container, we should find a certain number of containers stuck in pending state.

Yeah, so what Kubernetes is now trying to do is a rolling deployment of these containers, from the specification with lower CPU and memory requests to the specification with higher ones. And now we see that for this new ReplicaSet all these containers are pending. If we run kubectl describe pod on this pod here, we also see why: it actually can't schedule these pods, because there is insufficient memory and insufficient CPU available in the cluster. In the current state of our cluster, Kubernetes of course will not be able to solve this problem, because we have unschedulable resources; we actually need to add more nodes to our cluster to schedule additional pods.

So what we could do is manually add more VMs to the cluster. This is of course a bit cloud-provider dependent. In SysEleven MetaKube you can either do this over the CLI or in the nice interface: we can say add a node, choose the number of nodes and the flavor. What we're actually using underneath is the Kubernetes Cluster Management API, which defines how you can interact with nodes in a cluster and abstracts the cloud-provider-specific things away. And with the machine controller, which knows how to talk to OpenStack, we can easily provision VMs on the fly.

What we're going to show you in a minute is that we have a MachineDeployment there. The MachineDeployment manages rolling updates on a MachineSet; in the MachineSet it's defined how many node instances we want to have in our cluster, and every instance is a Machine resource in Kubernetes. Then we have the machine controller, which listens on these Machine resources and on demand creates or deletes OpenStack VMs, provisions Kubernetes on them, and makes them join the Kubernetes cluster as nodes.

You could of course do this manually again, but we also want to show you how to do this automatically, with node autoscaling. In MetaKube for OpenStack we have implemented and extended the Kubernetes cluster autoscaler to talk to this Cluster Management API. This is ready to the point that we can show it to you now; it's not yet ready for us to open-source it, but we hope to do that next week or so.
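As a rough sketch, a MachineDeployment with autoscaler bounds like the ones described in the talk could look as follows. The API group, annotation keys, and all OpenStack values are illustrative assumptions; the talk's actual integration had not been open-sourced at the time:

```yaml
apiVersion: cluster.k8s.io/v1alpha1
kind: MachineDeployment
metadata:
  name: autoscaled-workers            # hypothetical name
  annotations:
    # Min/max node-group size read by the cluster autoscaler;
    # the exact annotation keys depend on the integration
    cluster-autoscaler/min-size: "0"
    cluster-autoscaler/max-size: "15"
spec:
  replicas: 1
  selector:
    matchLabels:
      node-group: autoscaled-workers
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1               # how many nodes may be down during a flavor/image change
  template:
    metadata:
      labels:
        node-group: autoscaled-workers
    spec:
      providerSpec:                   # OpenStack-specific machine settings (sketch)
        value:
          flavor: m1.medium
          image: ubuntu-18.04
          network: k8s-network
          securityGroups:
            - k8s-workers
```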
So this will be available soon, at least as a pull request to the autoscaler.

What we're going to show you next is how we can try to get these pods out of their pending state and deliver enough resources for the Kubernetes scheduler to actually schedule them. So what is the node autoscaler doing? It's looking for exactly this: pods in a pending state that don't have sufficient resources to be scheduled. Once it finds these pods, it will try to provide us with new machines.

On the left side here we have a watch on all the Machines in our Kubernetes cluster, which also correspond to nodes in the cluster, and now we are adding a new MachineDeployment. With annotations we can say what the minimum and maximum size of this deployment is for the autoscaler; in this case we have a minimum of zero nodes and a maximum of 15 nodes. We can also define rolling-update strategies, like how many of these nodes may be unavailable at most if I change the flavor, for example. And I can define all the necessary fields: what operating system image I want to run, what flavor of VM, which network and security group these nodes should be in, and so on.

So once we apply this, we'll have a new MachineDeployment created, which in turn will watch this group, this machine set, and once it's needed it will start spawning new machines. Which has already happened, because we left a number of pods in pending state. What you can also see: booting up a new VM is quite fast, but creating the VM from the image and provisioning it with a new kubelet means apt-get installing a couple of packages, installing, I think, Python and other things, so this is going to take something like two to five minutes.

You can also see, if we now run openstack server list here, that the first machines, the VMs, are already there in my project; they got public IPs and are now just waiting to join the cluster. Yeah, so it might take a while until everything is provisioned, up and running, and they've actually joined the master. So we'll finish off with our slides first and then come back to this demonstration at the end. We saw the Cluster Management API and the demo.

Okay, so we'd like to give you a brief summary to wrap up what we were talking about today, so that you have a fair understanding, when you leave our talk, of what this is all about. With autoscaling you can save valuable resources if you need to, and most people will want to do that. You can maybe even help save our environment and make this a better planet. Kubernetes makes autoscaling a lot easier because it has all the things on board that you actually need. We saw the horizontal pod autoscaler, we saw vertical scaling, and we also saw node autoscaling, which can easily be integrated into a cluster. And one of the best things about it all: you may save yourself from getting up at night because one of your applications is not working as it should.

Thank you very much. We have a couple of minutes for questions, and we can definitely also look at the nodes later on and see them up and running. Questions are welcome now, feel free, and you can also join us in our SysEleven lounge for beers later on. We also have a booth outside where we can answer any questions, and you can also send us emails and reach us on Twitter. Okay, so if anyone would like to ask any questions later on, please drop by our booth; we'll be offering some beers later. Just come around. Thank you.