We have quite a sleepy audience, and I do tend to help people sleep. One of the things that I really liked about this Rootconf venue is that there is a rope lying there; I am not really sure why. It is a big rope bundle, I don't know what they are planning to do with it, but I thought it was nice to point out. Alright, so I am going to talk about Kubernetes today, and we are going to look at some very common deployment strategies.

A little bit about me: I have been a systems engineer doing DevOps since 2011, and I led the systems engineering team at BrowserStack before this. Currently I run my own systems engineering and DevOps consultancy called DevOps Nexus. I am a contributor to open-source projects including Kubernetes and the Fedora Project, and I am a published author and a speaker at various conferences including Rootconf, FOSDEM, Flock and FOSSASIA.

That being said, I actually lied when I told you that I am going to talk about deployment strategies; that was my ploy to get into the schedule, because they were not letting me in, so I thought that if I put "deployment" in the title they would allow me. A very clever way to fool the editors. What I am really going to do is talk about certain Kubernetes concepts that will help you with deployment, that will help you design your own deployment strategies. By understanding these concepts, labels and selectors and scheduling and so on, we will learn deployments as a byproduct.

Before we go ahead, we need some initial setup. I basically need a Kubernetes cluster, and I need an image with which to demonstrate how this works. I am going to use the nginx image, because that is a very common thing to use. I request you not to follow the demo right now, because that will saturate the internet here and consequently my demo will not work so nicely. So please don't try to download the nginx image right away.

With that, I want to introduce you to labels. How many of you here are familiar with Kubernetes, or at least have heard what Kubernetes is? Right. How many of you have used any cloud provider out there, any one of them? Quite a few. In any cloud provider there is a way to identify resources, and the most common way is to use something called a label or a tag, depending on what your cloud provider calls it. Kubernetes has a similar thing, known as labels. Basically, you pick any artifact, like a pod or a deployment or a service, assign a label to it, and subsequently you can identify that particular resource using that label. We are going to use this to our advantage: when we want to do anything of the sort of deployment, or routing traffic, or whatever, we need to make sure that we have the right labels in place. I am going to use quite a few labels here; the most common ones I have written down, like a label for environment, for application, for service. The color label is something I will talk a bit more about later, but yes, that is another label.
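As a minimal sketch of the idea (the resource and label names here are placeholders, not the ones from the demo), this is how a label gets attached to an artifact and how resources are then picked out by label:

```sh
# Attach labels to an existing deployment (names are illustrative)
kubectl label deployment my-app environment=production app=my-app

# List only the pods that carry a given set of labels
kubectl get pods -l environment=production,app=my-app

# Any artifact can be filtered the same way, for example services
kubectl get services -l environment=production
```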
The second thing I want you to do is imagine how your traffic flows from a client's machine to the service you want to reach. Starting at the very beginning: somebody sends a GET request, or something like that, and it hits the load balancer. Now, what happens to the traffic at the load balancer? The load balancer basically acts as a central brain. It tries to figure out where it should direct this traffic. The load balancer, or the router, or the service, is supposed to find the resource for which the request is meant and send the traffic there, and then that resource, that particular server, takes care of processing the request. This entire thing is quite standard; we have all been doing this. If you have worked with any cloud provider, be it AWS, GCP or DigitalOcean, all of them have their own load balancers, and if you have used HAProxy, it works on almost the same principle. Now that we have this key concept in mind, that we can actually tweak the routing of our traffic, let's start with the deployment strategies we want.

Before I go ahead, let me bring up the initial cluster which I'm going to use. I have two machines with me: this one, if you notice, is the Kubernetes master, and on that side is the Kubernetes node. I'll keep toggling between the two, but I'll make sure that the one I'm typing into is maximized; you can always see that "master" is written here and "node" is written there. Kubernetes, for those who are not familiar, is a very standard client-server architecture: there is a master, against which you issue commands, and there is a node, which actually processes the commands in the sense that it is the one responsible for running the containers. Just to get a hang of it, let's see: I have one node ready with me. If you look at the initial setup, we have a two-machine setup, one master and one node, and I'm going to use nginx 1.7.9 for this demo.

I'm going to create a deployment and a service. The deployment will create the Docker containers on the node, and the service is responsible for routing the traffic from the user to one of the two containers that will be created. Let's quickly look at what we have here. I have an initial deployment with me; I'm basically starting two replicas of nginx 1.7.9. The interesting thing to note here is that I have applied certain labels to it: there's an app label, an environment label and a color label. Please ignore the color label for now; we'll come to it later. The most interesting labels here are environment and app. So let's go ahead and create the containers. Just to make sure we don't have any residual things running here, if I do a docker ps, yes, there's nothing running. So I'll create the initial deployment. It's created.

Let me also show you the service we have here. Remember that we tagged our resources; I'm going to use the same tags, environment production and color blue. What this service is going to do for me is receive the traffic on port 80 and direct it to any resource that has those particular tags. So all the pods, all the containers, that have the production and blue tags are going to receive the traffic. I'll just create it quickly. Once I create the service, I get an endpoint; I'm just going to copy it, and I'm getting served nginx 1.7.9 here. This is my initial setup; right now I've not done any deployments.
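To make the setup concrete, here is a sketch of roughly what that initial deployment and service could look like, written against the current apps/v1 API. The resource names and file layout are my assumptions; the labels (app, environment, color) and the selector (environment=production, color=blue) follow what the demo describes.

```sh
# Initial deployment: two replicas of nginx 1.7.9, labelled so the
# service below can select them
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-blue               # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
      environment: production
      color: blue
  template:
    metadata:
      labels:
        app: nginx
        environment: production
        color: blue
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
---
# Service: listens on port 80 and forwards to any pod whose labels
# match the selector, regardless of which deployment created it
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc                # illustrative name
spec:
  selector:
    environment: production
    color: blue
  ports:
  - port: 80
    targetPort: 80
EOF
```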
This is, you know, the first stage. Going back to the presentation: the most common thing that you do when you start playing with any application, or when you want to do an upgrade, is to test. Before deployment, you always test, and one of the most common ways to test is a canary deployment. A canary deployment means that you route a very small portion of your production traffic to the thing you want to upgrade to. So what we are going to do is route a part of the traffic there. Can anyone suggest a way to do that? We have discussed that there are ways to route traffic using labels, right? So can anyone here tell me a good way to route only a portion of the traffic, not the entire traffic, to a different machine or a different container? Any ideas? Is it too early in the morning? Okay, too early in the morning, I guess.

So what I'm going to do is create another deployment, just like I created the nginx 1.7.9 deployment, but I'm going to make sure that I use the same labels. What happens because of that is that the selector my service uses will still match, and the service will route a proportionate amount of traffic to the new set of machines as well. So if you have a cluster of, say, 10 or 100 nodes and you want a canary of just one node, you create a single deployment of one replica and do exactly this. Is it visible, or is the font too small? Good morning. I increased the font before coming, but I didn't realize it would still be small. Is it visible now? I'll move it up. Is it visible now? All right.

So now I'm going to run a canary deployment with one replica. For me, that's 33% of my infrastructure, because I already have two and I'll add one more; that's 33%. But if you want to do a real canary in a production kind of setup, probably make it less than 5%. I'm going to update the image: I was using nginx 1.7.9, now I'm going to make it 1.9.1. The thing to notice here is that I have added a few labels, like a type label. But remember the selectors that the service had? The service had two selectors, environment production and color blue. The service does not care what pods it is sending traffic to, as long as the selectors and labels match. You should take advantage of that concept and route a part of your traffic this way.

Before I do that, wait, let me just show you: I'm going to fire about 1000 requests to see what happens. It's all nginx 1.7.9. Now I'm going to create the canary. My canary is created; it'll take about a second. If I fire the requests now, I should see certain occurrences of 1.9.1. Can you observe that? In fact, we can go a bit further: fire the requests in the background and put a counter on it, and we get 328, which is approximately 33% of what we were sending. So this gives you a very good way to do canary deployments and test out a release before actually pushing it production-wide. This is what happens: the service sees where the matching labels are and sends the traffic across.
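Here is a sketch of that canary step, under the same naming assumptions as before. The key point is that the canary pods carry the same environment and color labels the service selects on, plus an extra type label so they can be found and deleted on their own; the request loop counts versions from the nginx Server header, with the endpoint left as a placeholder.

```sh
# Canary: one replica of the new image, matching the service selector
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-canary             # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
      environment: production
      color: blue
      type: canary
  template:
    metadata:
      labels:
        app: nginx
        environment: production
        color: blue
        type: canary             # extra label; the service ignores it
    spec:
      containers:
      - name: nginx
        image: nginx:1.9.1
EOF

# Fire a batch of requests and count how many hit each nginx version
ENDPOINT="<service-endpoint>"    # placeholder: the endpoint printed when the service was created
for i in $(seq 1 1000); do
  curl -sI "http://$ENDPOINT/" | awk '/^Server:/{print $2}'
done | sort | uniq -c
```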
Now, once you're done with this, the next step is a rolling deployment, which is basically deploying to the entire cluster. And that is slightly easier, because Kubernetes supports rolling deployments out of the box, so you don't need to play around with labels much. I'm just going to quickly show you that. But to show the rolling deployment concept, I need to increase the size of my cluster, because right now we have just two or three machines running, which is not very helpful. So I'm just going to increase the size of my cluster to six. Now I have a lot of machines running, and what I'm going to do next is simply set the image to the new one that I want. As soon as I do that, the rollout happens, and it looks like this: if you look at it, not everything is replaced at once. Only four of them were replaced so far; the first one got replaced, and now it's terminating some of the older ones. It rolls through, to ensure that your users will not see downtime; the entire cluster is not taken out in one shot, it is rolled. While your users will not see downtime, you have to understand that there might be some latency issues. So when you actually do this, keep in mind that your users will not see downtime but might see slightly degraded performance. The best approach is either to increase your cluster size, like I did before doing the rolling deployment, or to do it at a time when you know you have very few customers around, so the rolling deployment will not hurt their experience.

Rolling deployments actually have a lot more around them, which I quickly want to show. For example, you can check out the revision history; right now there is an open bug because of which this shows as none, but what you can do is check the revision history by revision number, and if you do that, you'll see that there was a previous revision with 1.7.9 deployed. And if you think that things are not working your way and you want to roll back to a previous version, that's easy as well: you just do a rollout undo, and that puts you back on the previous version. So instead of getting 1.9.1 here, you will get 1.7.9 back. That's basically rolling deployment.
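The whole rolling-update and rollback flow above boils down to a handful of kubectl commands; the deployment and container names here are the illustrative ones from the earlier sketches, not necessarily what was typed in the demo.

```sh
# Scale up before rolling, so the temporary capacity dip hurts less
kubectl scale deployment/nginx-blue --replicas=6

# Changing the image triggers the rolling update: pods are replaced
# gradually rather than all at once
kubectl set image deployment/nginx-blue nginx=nginx:1.9.1

# Watch progress and inspect the revision history
kubectl rollout status deployment/nginx-blue
kubectl rollout history deployment/nginx-blue
kubectl rollout history deployment/nginx-blue --revision=1

# Go back to the previous revision (1.7.9) if things go wrong
kubectl rollout undo deployment/nginx-blue
```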
Now, lastly, I want to talk about blue-green deployment. This is a different case, which doesn't fall under rolling. What you do in blue-green is maintain two separate sets of clusters, one blue and one green; that's why I used the color label, if you noticed. You say that your traffic is going to go to blue, but when you want to update, you set up a new cluster called green, and when you're satisfied and your tests are done on green, you route your entire traffic to green. This is also very helpful if you are into immutable architectures, where you don't want to change what has been deployed once: in that case you create a new cluster, do your tests on it, and just route the traffic over.

So here is what I'm going to do. Right now we are on 1.7.9. Let me first delete the canary; we don't need it anymore. Once I delete the canary, I am going to create a green deployment, and green is basically 1.9.1. If you look here, whenever you hit this you get 1.7.9, because that is the default right now; that's the blue, and that's where our traffic is going.

So now, to get our traffic to the green cluster, what do we need to do? Any ideas? Basically, we need to edit our service: instead of the selector blue, we'll use the selector green, and that should do the trick. Kubernetes gives you the facility to edit a live artifact, so I'm going to do just that and edit it live. I'm just going to go here; these are the selectors that we have been using, color and environment. I'm going to set the color to green now, and as soon as I save this, my traffic will start going to green. Where is the mouse? Alright, now if I go here, I should see 1.9.1. Yes. So my traffic is now going to the green cluster, and I can take down the blue cluster once the pending requests are served. With that being said, that's all I have on clustering and labels in Kubernetes. Do you have any questions?

Yes, a question: while changing from blue to green, since we are humans, what if you make a mistake? Is there a way to handle that in Kubernetes? No, if you make a mistake, then yes, your traffic will go to the wrong place. It's generally a bad idea to make mistakes there. What I did here manually is probably something that you should not do manually; it was done for the demo, but ideally you should fire commands. For example, if you notice, when I changed the image to 1.9.1 while doing the rolling deployment, I could have done that using edit as well, changing it live, and if I had made a mistake, it would have failed. So using a continuous integration solution, probably Jenkins or something, and having those commands in place rather than firing them manually, usually helps. While I'm saying that doing it live is possible, I certainly do not recommend doing it like this. This is a demo, and just a way to showcase the features. Ideally you should use a continuous integration system like Jenkins, or a deployment tool, to make these changes.
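To make that "fire a command instead of editing live" advice concrete, here is one way the blue-to-green selector switch could be scripted, say from a Jenkins job; the service name is the illustrative one from the earlier sketches.

```sh
# Flip the service selector from the blue pods to the green pods.
# The strategic-merge patch only touches the "color" key of the selector.
kubectl patch service nginx-svc -p '{"spec":{"selector":{"color":"green"}}}'

# Switching back to blue is the same command with the other value
kubectl patch service nginx-svc -p '{"spec":{"selector":{"color":"blue"}}}'
```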
That's slightly peculiar, because a container is basically the same process running in a different namespace, with its own isolation, so it doesn't really... that's actually an interesting use case, I did not realize that happened. But have you tried using Java restrictions like -Xmx and so on? If Java is not honoring those as well, that needs to be looked at. I'm not entirely sure how that works.

Hey, my name is Satish. Suppose we have deployed microservices across multiple containers, and these microservices want to talk to each other internally. How can we make sure the traffic is not hitting the load balancer, so that containers can talk to each other directly in the Kubernetes world? You're saying that you want containers to talk to each other without involving any sort of service or load balancer? Yes. I would probably not recommend that; I will tell you how, but I would not recommend it. I would want you to use something like Kubernetes services, because that is helpful in case one of your containers decides to fail on you.

That being said, if you really want to do that, pick any overlay network that you like; Flannel is very useful for this. In that case you have to pick an overlay as well as a DNS add-on, because if you pick just the overlay network, your applications will have to know the IP addresses, which is not advisable because IP addresses change. If you pick an overlay along with a DNS add-on, then you just need to know the name, and the containers should be able to hit each other directly.

Even so, I have a presentation layer which is separated from my microservices; does that mean my presentation layer will always talk to the load balancer to reach the services? Ideally, it should. I'm not sure how your system is set up, but a load balancer like a Kubernetes service is actually very lightweight. By default it is based on iptables, so you're just routing traffic using iptables rules, and that normally works very well with minimal overhead. I've never seen a delay of even two or three milliseconds. Something that doesn't even add a couple of milliseconds is, I think, a worthy option if it increases the reliability of your application. Thank you.
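As a quick illustration of the name-based approach just mentioned: with a DNS add-on (kube-dns or CoreDNS) in place, one pod can reach another workload through its service name rather than an IP address. The service and namespace names below are the illustrative ones from earlier, not something from the demo.

```sh
# Launch a throwaway pod and call the service by its cluster DNS name
kubectl run tmp-curl --image=curlimages/curl --rm -it --restart=Never --command -- \
  curl -sI http://nginx-svc.default.svc.cluster.local/

# Within the same namespace the short name works too: http://nginx-svc/
```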
Okay, cool. Yes, it does. So blue-green, if we talk about the by-the-book definition, means that you have to have enough capacity to run two clusters in parallel. Now, I know that for most organizations that is a waste of money; for me it's a waste of money. If you are on a provider like AWS or Google Cloud or DigitalOcean or something like that, then it's usually not a problem, because you basically just pay extra for an hour or so, whatever they bill you for, and that's not a big cost. But if you are with a data center provider, if you have your own physical hardware that you are managing, then it becomes a cost issue, and yes, then it will be a problem. A rolling blue-green, a hybrid of rolling and blue-green, helps there. But technically speaking, that's not exactly blue-green, because at some point in time your capacity will be compromised while you are in the transition state. Compromised in the sense that it will be reduced while you are in the process of rolling out: either you have to reduce capacity, or you have to make sure that users are getting served from both sets simultaneously. So you have to handle either that situation or the reduced-capacity situation; it's your pick which one you want to handle.

Last question. Right, yes. It can handle that to some extent: if you are hitting your CPU limits, then Kubernetes can scale automatically. There are autoscalers available which work out of the box, built into Kubernetes. That's one way. I personally have found it to be limited, because it does not cater to all the parameters that I want to autoscale on. So some time back, for a client, I ended up writing a custom solution which hooks into Graphite and then scales, because scaling is just a command, and there are Kubernetes APIs; you can use the REST API as well and hit it directly instead of using the command line.

So what I would recommend: if you are just looking for very basic, CPU-based scaling, then you have the built-in Kubernetes autoscaler, and you can look at that. If you want something more advanced, if you have more parameters to consider, then you'll have to get a little hands-on with the code and the API. It's not too difficult: just have your data shipped to Graphite or whatever graphing engine you have, where you can see which metrics are being received and so on, and based on that you can call the Kubernetes API to scale up and down whenever you want. It usually takes a little time to understand, but coding it is not very difficult. You can do that.

Oh, if you are on a data center, then, I mean, you are not on a cloud provider. Okay. The reason cloud providers came into the picture and gained popularity was exactly this: scaling with a hosting provider is very difficult, and I don't think Kubernetes or any other tool will scale hardware for you. For hardware, you have to do capacity planning.

No, it will still direct the request; your application will time out. That is the standard thing. Forget Kubernetes, forget containerization or anything like that: if you have a website running and you bombard it with requests, at some point it is going to time out, for you or for some of your customers. It might serve a partial number of records, but obviously for certain customers it will time out. That's not a Kubernetes thing; that's your application's behavior.
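Going back to the basic, CPU-only autoscaling mentioned above: the built-in mechanism is the Horizontal Pod Autoscaler. A minimal sketch, with the deployment name and thresholds as assumptions; it also needs CPU requests set on the pods and a metrics source such as metrics-server in the cluster.

```sh
# Keep between 2 and 10 replicas, targeting 80% average CPU utilisation
kubectl autoscale deployment/nginx-blue --min=2 --max=10 --cpu-percent=80

# Inspect the autoscaler's current targets and replica counts
kubectl get hpa
```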