Hello and welcome to Choose Wisely: Understanding Kubernetes Selectors. My name is Chris and I'll be taking you through this session. I work for a company called RXM. We are a cloud native training and consulting firm, so I do a lot of training, some consulting, and a lot of speaking. You may have seen me last year at KubeCon here in Europe. Well, I suppose it was virtual because of COVID. I talked about rollouts, rollbacks, and behaviors with controllers. Today's session is related, in that we're going to talk about controller behavior with selectors. So without further ado, let's get started. Before we talk about how controllers use selectors, we need to talk about what they're selecting. In a Kubernetes cluster, most objects have some sort of label, and they may even have some sort of annotation. This could be any object: a pod, a node, a service. The nice thing about labels is that they're selectable, and so we can use kubectl commands to create labels and review labels. As you can see on the slide, we've got the --show-labels flag. We can add a label after the fact: if an object already exists, we can use the label command to add additional labels. We can overwrite existing labels if we want a new value for a particular key, and we can remove them as well for existing objects. In terms of using patterns to select things, there are three patterns. There's equality-based, which means that objects returned by that query have satisfied all of the constraints. They may have additional labels, so if you just selected on one label, but there are several others, the object would still get returned by the kubectl command, as we see here. There's set-based, where you may want to filter based on a key with several values, and you want to see the objects that have one or more of those values. And then there's the key-name pattern: you can select on just the key and match every object that has that key, whatever its value. And you can see some patterns here.
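The label operations described here can be sketched as kubectl commands. A minimal sketch, assuming a live cluster and a pod named web (the pod name and the team key are hypothetical stand-ins, not from the slides):

```shell
# List pods and display their labels
kubectl get pods --show-labels

# Add a label to an existing object after the fact
kubectl label pod web team=scarif

# Overwrite the value for an existing key (requires --overwrite)
kubectl label pod web team=bespin --overwrite

# Remove a label by appending a dash to the key name
kubectl label pod web team-
```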
We've got the team key, and we're using team in (scarif, bespin), separating the two values with a comma, which logically ORs them. So we get back all of the objects that have either team equals scarif or team equals bespin. We can expand upon that by combining a set-based selector with an equality-based selector. The second example here is team in (scarif, bespin) with app equals rideshare, and of course that narrows the results that get returned. We can even use labels with things like the logs command. If I don't know the name of a pod, but I want to get its logs, as long as I know a key-value pair that is on that pod, then I can just use logs with -l. You can see here that we've got three different pods answering, with IPs ending in 36.3, 40.2, and 40.3. Each has a different IP address, so we can tell the responses come from three different Apache web servers. And we can even delete things with selectors; in this case we delete a deployment based on a team plus a release key-value pair. Just like we can use labels and selectors in kubectl commands, controllers use selectors to find and manage other objects. Selectors are used by application controllers like deployments, replica sets, and others to target sets of pods to manage. Deployments, replica sets, jobs, and daemon sets support both equality-based and set-based selectors. Okay, so I've got a several-node cluster here that I'm going to be working with. We'll just do something simple. First let's start a watch. So we're going to watch deployments, replica sets, and pods, and I'm going to show their labels so that we can actually see those. I don't have anything deployed yet, so it's just going to be empty for now. Then we'll create a deployment, and we'll just call it kc-deploy. Very simple: use a particular image, the hostinfo image that we're going to use throughout the demos here.
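The selector patterns just described look roughly like this in kubectl; the app=httpd and release=stable labels in the last two commands are hypothetical stand-ins:

```shell
# Set-based: pods whose team label is either scarif or bespin
kubectl get pods -l 'team in (scarif,bespin)'

# Set-based combined with equality-based to narrow the results
kubectl get pods -l 'team in (scarif,bespin),app=rideshare'

# Key-name pattern: any pod that has a team label, whatever its value
kubectl get pods -l team

# Fetch logs by label instead of by pod name
kubectl logs -l app=httpd

# Delete a deployment matching a team plus release label query
kubectl delete deployment -l team=scarif,release=stable
```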
When we query it directly it will give us its IP address and its host name, which will be valuable for demonstration purposes. We're just going to run three replicas in this case. So we see the deployment, the replica set, and the pods get deployed, and we can see the labels that are used, right? The deployment has the app equals kc-deploy label, and it has that because we used the create command, which is imperative. When you do create deployment with a name, the key-value pair becomes app equal to that name. We called it kc-deploy, so we've got app equals kc-deploy. But we can also see that an additional label was generated, the pod-template-hash, on the replica set, and that also shows up on the pods. So what that means is, if I run another pod, right, so I'm going to run this kc-deploy pod, using the same image, and then label it with app equals kc-deploy, what we would expect is that the replica set is going to ignore it, because it doesn't have that extra key-value pair that's being used by the replica set's selector. Even though we're using the exact same label, because of that extra dynamic label we avoid that kind of situation. So what that means is that a deployment controller will use a dynamically generated key-value pair to avoid any potential overlaps with standalone pods. The pod-template-hash key-value pair really isn't intended for a user to create; it's managed by the automation, and with every revision of the pod template that gets deployed, the hash is going to change. The label's value is going to change too, and this allows us to control only the pods that we intend to control and avoid any potential collisions. So let's take a look at another type of controller. Let's look at the RS directly. If I use an RS directly rather than through a deployment, the behavior is a little bit different. So let's take a look at a demo using replica sets.
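The steps in this deployment demo can be sketched as follows; the hostinfo:latest image reference is a placeholder for whatever image the demo actually uses:

```shell
# Imperative create: the generated label and selector become app=kc-deploy
kubectl create deployment kc-deploy --image=hostinfo:latest --replicas=3

# The replica set and its pods also carry a generated pod-template-hash label
kubectl get deploy,rs,pods -l app=kc-deploy --show-labels

# A standalone pod with the same app label is ignored by the replica set,
# because it lacks the pod-template-hash value the RS selector also requires
kubectl run kc-deploy-pod --image=hostinfo:latest --labels=app=kc-deploy
```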
All right. So this time around we're going to look at replica sets to see how they behave, and we'll do another watch. This time we're going to watch just replica sets and pods, again taking a look at their labels. Nothing running, cleaned up after the last demo. First off we will run a pod that is going to use the same label we'll use with our replica set. So we start with the pod this time and see that it's running first, rather than running the pod after the fact as we did before. Then I've got a replica set, and I've done it in YAML form because there's no way to do that imperatively. If you take a look at the slides, you'll see the YAML in the deck. So I'm just going to kubectl apply that; it's just called rs.yaml. What we're going to see is that in that replica set's manifest the desired count was three, and so when we deploy it, only two pods actually get created when the RS comes up. And we can see that it has now taken control of our original pod. So unlike deployments, using a replica set directly means that we do have potential for overlap: if the label selector is basic like this and matches the labels of an existing pod, the replica set will take and maintain control of it. Now, let's say we had a second RS. I'm going to create a second one called clash-rs. It's an identical YAML file to the first one, with the same selector, just with a different name. And we're going to see that it is going to create two replicas, and it does that without adopting the original pod. It essentially ignores our original pod, because that has now come under the control of our original replica set. So what that means is they're not going to fight each other. Now, if I kubectl delete, and we'll do both of these, we should see that everything gets garbage collected, so all of our pods now go into a terminating state.
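Since a replica set can't be created imperatively, the rs.yaml from the deck would look roughly like this. The exact names, label key, and image are assumptions based on the demo:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: kc-rs
spec:
  replicas: 3              # desired count from the demo
  selector:
    matchLabels:
      app: kc-rs           # basic selector; also matches the standalone pod
  template:
    metadata:
      labels:
        app: kc-rs
    spec:
      containers:
      - name: hostinfo
        image: hostinfo:latest   # placeholder for the demo image
```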
Even though that pod was created first, independently, because it's under the control of our RS controller it is now terminated along with everything else. So what that means is that replica sets on their own don't create that extra dynamic key-value pair to try to avoid collisions. It is possible that a given replica set could inadvertently begin controlling a standalone pod whose labels match the replica set's selector. However, once that happens, if we have a second or third or additional RS using the exact same label query, the pod gets ignored by them. Only the one replica set will pay attention to it, and that way we don't have any sort of fights happening between the two RSes. If one had a replica count of two and the other had a replica count of three, you could imagine they could start to fight each other and lead to thrashing. So that raises a good question in general: what about not standalone pods, but controller-based pods that have overlapping labels? Again, the example being two RSes using the same selector, one with a replica count of two, one with a replica count of three. Would they thrash, adding and deleting pods, until we went into crash loops? In current versions of Kubernetes, no, because controlled objects actually have a reference to their owner. Around the 1.6 time frame, this was added to a number of the controllers, and you can actually see it. If you use kubectl explain, you can see that there's an owner reference, and in the highlighted text here we point out that if the object is managed by a controller, then there's a subfield called controller that is set to true; of course we use kubectl explain again to show that. So really, label overlaps only occur with standalone pods, not pods that are under a controller from the very beginning. But then of course, once a standalone pod is taken over by a controller, it receives the owner reference too.
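You can inspect the owner reference and its controller subfield yourself. A sketch, with <pod-name> standing in for an actual pod name:

```shell
# Field documentation for the owner reference and its controller subfield
kubectl explain pod.metadata.ownerReferences

# Show which controller (if any) owns a given pod
kubectl get pod <pod-name> \
  -o jsonpath='{.metadata.ownerReferences[?(@.controller==true)].name}'
```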
And that's why we saw, when we deployed the second RS, that the first RS had updated the pod to give it that owner reference, so there's no way the second RS could have taken control over it. Now what about other controllers, things like daemon sets, stateful sets, jobs, and cron jobs? Each one of these controllers has implemented some kind of labeling scheme that allows it to avoid collisions with standalone pods, and of course pods that have been deployed by a given controller have that owner reference as well. So it's not just deployments and RSes; each one of these controllers has had that owner reference added over time. But we can see that, like deployment-based pods, daemon set pods are given a hash label; stateful set pods are given both a hash label and a pod name label, so an extra label for stateful set pods; and for jobs and cron jobs, the pods they control are given a unique label related to their controller's ID. Just some thoughts about how we could use this: could those dynamic labels be used on a standalone pod? Let's say you wanted to bring a standalone pod under a controller. Of course the controller is trying to avoid inadvertent control, but maybe this is a desired behavior, right? Yeah, you could. You could add a dynamic pod-template-hash, for example, to a pod that you wanted to add to a deployment after the fact. So you could use the label command on that pod, with some caveats. If the deployment was created first and all of its pods were deployed first, then when you ran the standalone pod subsequently and labeled it, the replica set is basically going to kill it, because it's last in, first out. So you actually have to create the pod first, and you saw me do that in the demo with the RSes: I created the pod first and then I created the RS with the three replicas, right?
That's because if I created the RS first and then ran the pod, the RS would have seen the mismatch between desired and current and just killed whichever pod was newest. You have to make sure that the pod you want to add to your deployment is older. And then, if you're using something like a deployment, a rolling update will essentially replace that standalone pod with a pod based on the template. So you'll lose that pod, and one of the ones from the template will come in its place with the change. A workaround could be to basically disable that, right? You can do a rollout pause on the deployment; even if config changes happened, the rolling update would not trigger. But it's kind of a lose-lose, because either you're going to lose your standalone pod when the rolling update triggers, or you're basically losing the rolling update feature. So yes, you can do it, but there are lots of caveats in place. Moving on from application controllers, we want to look at another type of controller, which is the service's endpoints controller. In this case, the controller is used to target a set of pods and create a pool of IPs for load balancing, so that we can load balance client and peer requests to a set of backends. Unlike the other controllers we've looked at, the endpoints controller only supports equality-based selectors, so you can't do set-based selection here. In this demo, I've broken the terminal into three parts, because first I want to establish a watch for all the objects we're going to use. We're going to use a deployment, we're going to expose that deployment with a service, and then we're going to use a standalone pod. And we're again going to have label overlaps on purpose, to show how the behavior of service selectors works. So, of course, nothing just yet. We will run our client pod; in this case it's just a BusyBox, so I can get a shell and run a loop.
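Putting those caveats together, adopting a standalone pod into a deployment might look like this sketch. The names and the hash value are hypothetical, and remember the ordering caveat: if the pod is the newest when the RS scales down, it is the one that gets killed:

```shell
# The standalone pod has to exist first (it must be older than the RS pods)
kubectl run extra --image=hostinfo:latest

# Read the current pod-template-hash off the deployment's replica set
kubectl get rs -l app=kc-deploy --show-labels

# Label the pod so the RS selector matches it (hash value is made up here)
kubectl label pod extra app=kc-deploy pod-template-hash=5d4f8c6b7d

# Optional: pause rollouts so a config change doesn't replace the adopted pod
kubectl rollout pause deployment kc-deploy
```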
Now in this case, I'm going to grab the loop here and paste it in. We're just going to query the service called kc, and of course we don't have one yet. So let's create the deployment. The deployment is just going to be called kc, again using the hostinfo image to give us the host names, with three replicas. Just a basic deployment, and then we'll expose it. In this case we're just exposing the kc deployment, mapping the port that it listens on to a standard port to make things simple. Now that I have the service established, I can start my query. But what I want to do first is run a standalone pod that has a matching label. We can see this service is using app equals kc as the selector, so we'll run another pod, using the same image because we want to see the host name as part of the demo, and then label it so that it falls underneath the service. And so we'll run this loop. Oh, and the very first request goes to our standalone pod. Of course, we're randomizing our load balancing with a normal service. So what that means is that, unlike the other controllers we've seen, the endpoints controller does not try to avoid any sort of selector overlap. This could be problematic if an unrelated pod, or unrelated groups of pods, inadvertently end up under a given service. But the nice thing is we can actually use this to our advantage, because application controllers support the implicit config change trigger and the rolling update feature, but they don't really expose other strategies like blue-green, Highlander, dark, or canary. The stateful set controller does have a partition setting you could use for a canary, but what if you don't have a stateful application? If you have a stateless one and you want to use a deployment controller, this is where we can use the service selector to our advantage. For the blue-green strategy, we will start with an existing deployment of pods, but we're not going to use the deployment controller.
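The service demo steps can be sketched like this; the image and port numbers are assumptions, since the exact values aren't given:

```shell
# Deployment of three hostinfo pods; the generated selector is app=kc
kubectl create deployment kc --image=hostinfo:latest --replicas=3

# Expose it: a service selector is equality-based (here, app=kc)
kubectl expose deployment kc --port=80 --target-port=8080

# A standalone pod with a matching label joins the endpoints pool
kubectl run rogue --image=hostinfo:latest --labels=app=kc

# BusyBox client looping requests against the service by name
kubectl run client -it --image=busybox --restart=Never -- \
  sh -c 'while true; do wget -qO- http://kc; sleep 1; done'
```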
This is because we're going to need to make subsequent edits to the pod template. If you're familiar with the rolling update, you know that it comes with the deployment: when you make changes to a pod template in a deployment controller, the implicit config change trigger fires and you're going to get a rolling update. But we don't want that strategy. So we're going to use replica sets directly, create a service that uses a selector that can be shared between two groups of pods, the blue and the green, and then subsequently add to the pods the common label that the service selector uses. You can see the replica set on the left here has a single label, demo equals blue-rs, and then we're just going to use the label command to add the pod labels directly. Once we've got the blue pods in place and they've been running for some time, right, we want to do an update to the application. Then we create a second RS, and in this case we're changing the image from the alpine version to the latest version. We deploy the green RS and pods. If you want to do a dark strategy here, we just don't immediately add the labels to the green pods, and we let them warm up in the background. Then subsequently we add the labels to the green pods using the label command, and we'll have a big pool where the blue pods and the green pods are all rolled up under the service. We can't immediately remove the label from the blue pods, because remember that controllers in Kubernetes have eventual consistency. The endpoints controller is going to take a minute, not a literal minute, but it won't be immediate, right, to update itself and get all the pod IPs. And kube-proxy is going to need to implement this. In our case, we're using the default configuration of kube-proxy, which is using iptables, and of course kube-proxy has to go and do that work. There needs to be a little bit of overlap.
During this period, you're going to see that both blue and green pods are behind the load balancer. Once we're satisfied that we have enough endpoints behind the service, then we can simply use the label command again to remove the labels from the blue pods, and the traffic will just be directed to the green ones. We can retain the blue pods for rollback or undo if something is wrong with green, or if you want to go the Highlander strategy route, you can immediately remove the blue RS pods after removing those labels. So let's take a look at a demo of this particular strategy. Again, we're just going to watch deployments, replica sets, pods, and services to see all the things that we have deployed, nothing right now. We're also going to run a client so we can actually see, again using the hostinfo image, those host names come back. I'll start the client right now; you'll see it creating, and then we'll again have another loop going. Just copy that. The service is going to be called blue-green, so we'll query that, but we're going to wait. First, let's apply our blue RS, and it's identical to the blue RS we saw in the slides; it just has the one label right now, demo equals blue-rs. We will then also apply our blue-green service, and the selector we're using is going to be method equals blue-green. So right now the service isn't going to have any pods behind it, and we can confirm that with the endpoints: the blue-green service has no endpoints. But if we label, let me try that again, we're going to label pods that have the current label demo equals blue-rs, which is our RS pods. So we're going to pass that query to the label command and then add method equals blue-green, and we'll have some pods behind that service. Down below you can see that now we've got the blue-green labels, and we can repeat our get endpoints, and of course we've got some endpoints for the blue-green service now.
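The labeling steps in this demo boil down to a few commands; the names follow what's described in the demo:

```shell
# Put the blue pods behind the service by adding the selector label
kubectl label pods -l demo=blue-rs method=blue-green

# Verify the pool
kubectl get endpoints blue-green

# Later, once green pods are labeled and serving, cut blue out of the pool
kubectl label pods -l demo=blue-rs method-
```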
And I can start my client loop, and of course we get the blue RS pods. So we've got our blue version of the application deployed, and we need to then add the green. Let's go back to the apply and use the green RS, like we saw on the slides. Again, we've just got the demo equals green-rs labels for these pods, so of course we wouldn't expect them to be under the service right away. Now we can use that same label command that we used here; just change demo equals blue-rs to green-rs. This will add the method equals blue-green label to our green pods, and we'll start to see them show up in the load balancing with the client on the right once I hit enter. Here we go. So now we've got some green mixed in with our blue. The endpoints controller has updated, kube-proxy has implemented the new targets, and we've got the blue and the green mixed in. Then we can go ahead and remove the label, so we'll do this one. The syntax is just the label command with method minus; the trailing dash removes the label that we don't want from the blue pods. And there we go: now our load balancer is sending traffic to only our green pods. Lastly, we can do a canary experiment using a similar technique. In this case we can actually use deployments, and you absolutely can use RSes too. Let's say your production set of pods is using a deployment, and we're not going to do many, or really any, edits to production. What we're going to do is add a baseline and a canary. It sort of depends on how you want to do it: we can use deployments for baseline and canary and deploy them with the labels that the endpoints controller is using, and they'll just be added to the load balancing pool. Or of course you could use RSes, deploy the pods first, and then use the technique we used in the blue-green, adding the labels subsequently. It's really up to you how you want to do it, but either way you can achieve a canary experiment.
Okay, again we're just going to watch deployments, replica sets, pods, and services. In this case I'm going to use the deployment method, so I'm not going to use RSes, which means that the YAML manifests for my deployments already have the correct labels. We'll run this, and again we'll run another client here, so we can see the load balancing across the production, the baseline, and then the canary. Let that go for a minute. We'll set up the loop but not start it, like we did before. In this case the service is going to be called canary-demo. So let's actually take a look at some of these. Look at our prod first: the common label that we're going to use is app equals demo, and each one of the deployments is going to have its own release version. So if we wanted to, say, query for a particular deployment by its release type, we can do that, and delete them too; I could use a delete -l and pick a release that I wanted to delete. Same image that we've been using up to this point. So we'll go ahead and apply that, and we've got some production pods rolled out. Let's also look at the service. It is service-canary here, and again it's just going to be using the selector of app equals demo, which will be a common label across all of our deployments. Just a basic service using a cluster IP. So we'll apply that. It would help if I used the word apply. So we've got our service, we've got our production deployment, and we'll go ahead and start the loop here for the client. And of course we've got our production pods, aptly named so we know how the mix shakes out. Now we can look at our baseline, and it's very similar: it's using the same release, which is the alpine release. Whatever experiment you're going to do, the baseline should be identical to the production version. And then we'll also look at the canary. In this case we're again doing the update of the image from the alpine version to the latest.
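The canary-demo service described here would look roughly like this; the port numbers are assumptions, since the demo doesn't show them:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: canary-demo
spec:
  type: ClusterIP
  selector:
    app: demo          # common label shared by prod, baseline, and canary
  ports:
  - port: 80
    targetPort: 8080   # placeholder for the hostinfo container port
```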
And you can see that these pods are going to have the app equals demo label too, so they can fall underneath the service. Now in this case I'm going to do the applies at the same time so we have an honest experiment, because if we deployed one first, that would not be a proper experiment. So we'll apply baseline and apply canary at the same time. Now we've got a whole bunch of pods, some of which are coming off the screen a little bit, so we'll move this up. We can already see in the load balancing here on the right that the canary pod has been queried a couple of times, and there's our baseline. So we've got a mix of canary, baseline, and production, and of course while the clients are communicating with the various releases, we can collect all of our metrics, measure, and then determine whether the experiment is a pass or a fail based on whatever we're trying to prove. Does the change to the image introduce more latency to the application? Are there more errors, or whatever it might be? At the end of this, what we'll do is simply say, okay, let's say the canary experiment was a success. This is a long command, so I'm just going to copy and paste it from some notes I've got here. We're going to delete the baseline, delete the canary, and then set the image for the production deployment to the latest, because the canary was successful and we want to update the version. What you'll see is the RS hash for the production pods in the load balancing transition from the 5f hash to the new hash. So let's go ahead and trigger that change. You know what, I need to use the right path for this delete. Well, we're doing the rolling update, right, so there's the new hash for production, and I just need to give the full paths for these guys, so we'll get them out of there.
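Wrapping up the experiment, that copied command amounts to something like this sketch. The manifest filenames and the container name are assumptions:

```shell
# Remove the experiment pods from the pool
kubectl delete -f baseline.yaml -f canary.yaml
# (or by label, e.g.: kubectl delete deployment -l 'release in (baseline,canary)')

# Promote: trigger a rolling update of production to the tested image
kubectl set image deployment/prod hostinfo=hostinfo:latest
```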
So ideally, of course, we'd do that all at the same time, though it really didn't hurt that it wasn't quite simultaneous there. We delete the baseline and the canary, and we've got our new production deployment based on a successful test. And that brings us to the end. I just want to say thank you for your time. Come find us over at rxm.com. We have quite a few training courses that are available as open enrollments or as private trainings with teams. We even have certification boot camps and self-study, so if you're looking to take the CKAD, the CKA, or the CKS, we've got you covered. Once again, thank you, and enjoy the rest of the show.