Hello, everyone, and welcome to Cloud Native TV. My name is Saiyam Pathak. I'm a CNCF ambassador, and I work as director of technical evangelism at Civo. So welcome to Cloud Native TV, and welcome to the Certs Magic show. This is an official livestream of the CNCF and, as such, is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct. Basically, please be respectful of all your fellow participants and presenters.

Certs Magic happens biweekly, but there are shows running every day on Cloud Native TV, so make sure you subscribe now. Certs Magic is a show about Kubernetes certifications, where I come with new concepts and new certifications and talk about the favorite ones, which are CKA, CKAD and CKS. We go through the curriculum, try to cover some of the topics, explain them in a different way, and do some hands-on scenarios as well. In between, if there are any tips and tricks, we go over them too. Sometimes we have a guest; sometimes it's all by myself.

This is the fourth episode, and the past three episodes are on YouTube, on the CNCF YouTube channel, so make sure you check them out. What we have covered till now is an introduction to certifications: why certifications are important and why you would need them. Then, in the second episode, we went through the cluster setup and the basic Kubernetes architecture. We went from zero in the Kubernetes architecture to an end-to-end explanation of each component and what it means. We also went through a Kubernetes setup, which was the kubeadm and CRI-O setup. And in the last stream, which is episode three, I discussed the Kubernetes objects.
Let me just show you that. We did the Kubernetes objects: pods, deployments, DaemonSets, StatefulSets. I mean, we didn't do examples of all of these, but at least we discovered what a pod is, what a pod spec is, how we create a pod and how we run it, and how we run it in a different namespace. What is a deployment? What do you mean by replicas? How do we run deployments? How do we do a dry run and save the YAML files? How do we scale up and scale down? How do we do a rollout? How do we record the deployment? How do we undo the changes with a rollback? All these things are very, very relevant from the certification point of view. So this is what we have covered till now.

Today, before we go more into the CKA stuff, I actually want to cover something which is related to certification itself. I thought we should take the certifications to the next level, because the show is Certs Magic and it's about Kubernetes certifications, and at the moment it is not limited to CKA, CKAD and CKS. There are a few other certifications in the industry that you can actually take up for free, so we'll talk about those first.

First is the Certified Calico Operator: Level 1. This is a very interesting certification provided by Tigera, and it's free of cost. You get to learn about Kubernetes networking, then installing Calico, what a network policy is and how to use it, network policy best practices, network policy for hosts and NodePorts, and then everything you need to know about networking: pod connectivity and how pod-to-pod networking works, the eBPF data plane, which is the next-generation data plane, encryption, IP address management, and peering with BGP. Also everything you need to know about Kubernetes services.
So, an introduction to Kubernetes services, what kube-proxy is, Calico native service, and advertising services. This is actually a free certification, and there's also a book, which is more about Kubernetes networking, written by Alex. I did a stream with Alex on my channel as well. A great guy, so a big shout-out to Alex. His explanations are actually very simple to understand. So make sure you take that course. It's very, very good and free of cost. I'll drop the link in the chat. Just give me a minute.

The next one is getting certified on the essentials for Istio. Istio, again, is the most popular service mesh out there, and Solo.io has come up with this certification where you learn the essentials: what the Istio data plane is, what Envoy proxy is, how to install Istio, the data operations of Istio, and how to slowly introduce Istio in your organization. Then there is observability on top of that, adding Kubernetes services, mTLS, and debugging networking. Again, this workshop and the certification are free of cost.

I think there are also a lot of certifications by Sumo Logic, so make sure you check those out as well. You can just log into Sumo Logic, and there are fundamentals certifications and a lot of good things like advanced metrics, security and compliance for Kubernetes, and monitoring and troubleshooting. I think this one would be really interesting, so you can take that.

Another topic in the cloud native ecosystem which is getting popular is chaos engineering. A chaos engineering certification is also there, and you can take that certificate as well.

The next one, and by far I think the coolest one, is the Rancher operator level one. You will get to understand all the concepts of Rancher and RKE, which is Rancher Kubernetes Engine, a CNCF-certified Kubernetes distribution by Rancher.
Actually, we have RKE2 also now, but I think this is a really good certification for learning the concepts of the Rancher Kubernetes Engine: what Rancher is, how to install Rancher with Docker, Rancher with Kubernetes, and all that stuff. Then designing and provisioning the clusters, the cluster roles, RKE templates and how you can play with them, troubleshooting clusters, the Rancher API server, the container runtime, node conditions and all those things. So I think they are really, really good. After that you have advanced things like enabling advanced monitoring, configuring the notifiers, alerting, namespaces, projects and what a project is in Rancher, running Kubernetes workloads, persistent storage, ConfigMaps and Secrets. So I think that's a very complete course that will give you a lot of understanding of the concepts, and it's also a certification that you get at the end after doing a sort of test. So yeah, that was pretty much it for the certifications. I think they are pretty good, and I hope you take them all.

Also, while attending this stream, you get a chance to win a 50% discount coupon on the Kubernetes certifications, which are CKA, CKAD and CKS. I have two vouchers to give away today. In order to win, tweet out what you learned and be interactive in the chat, because that's how simple it is to win.

Today we'll be discussing a very interesting topic which actually came up in the previous stream, where people were asking about taints and tolerations: what they are and how they work. People usually get confused by the concept of taints and tolerations, so I'll try to simplify it and make sure you understand it. Apart from that, we'll also look at node affinity.
So first node affinity, then taints and tolerations. Let's get started with that. It falls under the category of workloads and scheduling, so we will be covering node affinity and taints and tolerations.

First of all, node affinity. We are talking about scheduling. Scheduling means that whenever you create a pod or a deployment, something decides which node that pod actually goes to and runs the application on. For that we have various concepts, and one of them is node affinity. Node affinity is basically where your pods are scheduled based on the labels of the node. Now, you might be thinking that it serves the same purpose as nodeSelector. Yes, it does. It is similar to nodeSelector, but with a much more expressive language, and you can specify hard and soft rules for your nodes.

We have two things in node affinity. The first is requiredDuringSchedulingIgnoredDuringExecution. This is the hard way of doing it. The next is preferredDuringSchedulingIgnoredDuringExecution, and this is the soft way. The required one means: only run the pod on nodes with XYZ labels. So it is required. The preferred one means: prefer running the pod on these nodes, but if there is no other option, it can run on different ones. The IgnoredDuringExecution part is the same in both, but I think in the future there will be more options with respect to this particular word. It means that if labels are changed at run time, nothing actually happens. So if you change the node labels at run time, and some pod is already running on the node and the node no longer has the labels the pod asked for, that pod will keep running.
So IgnoredDuringExecution means that the labels you change on the node during execution time will be ignored for a pod which is already running on that node. I hope that makes it clear.

Next is an example. You have a pod, and you have metadata for it. In the spec section of the pod, we define something called affinity, and in that we define nodeAffinity. In nodeAffinity we have defined requiredDuringSchedulingIgnoredDuringExecution, and under it there are the nodeSelectorTerms. You can have multiple nodeSelectorTerms, and at least one of them should be true. Inside a term we are matching expressions. How do we match an expression? We give the key, the operator, which is In, and the values. This means a pod can be scheduled onto a node that has this particular label with any one of these values. So here, the label key is kubernetes.io/e2e and the value can be any one of these, and this particular pod will be scheduled on a node where you have those labels.

Now, if multiple nodes meet the above criteria, the preferred section decides. If multiple nodes meet the criteria for the pod to be scheduled, then prefer the one with the label and the value given in the preferred section. So that's how node affinity works.

People often get confused between node affinity and taints and tolerations. According to the official definition, node affinity is a property of pods that attracts them to a set of nodes: you are defining the affinity, so the pod is attracted towards the node. Taints are the opposite. They allow a node to repel a set of pods. It is basically restrictive: I am telling the node not to take up this set of pods. Now let us try to understand this with an example.
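
The node affinity rules described above (at least one required term must match, the In operator checks a label against a list of values, and the preferred section only breaks ties among nodes that already qualify) can be sketched in a few lines of Python. This is only a simplified model and not how the real scheduler is implemented. The label keys `zone` and `disk`, the node names, and the helper names `term_matches` and `pick_node` are assumptions made up for illustration; the integer weight mirrors the `weight` field that preferred terms carry in the Kubernetes API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MatchExpression:
    key: str
    operator: str            # only "In" is modelled in this sketch
    values: Tuple[str, ...]


def term_matches(labels: Dict[str, str], term: List[MatchExpression]) -> bool:
    """Every expression inside a single term has to hold for the node."""
    return all(
        expr.operator == "In" and labels.get(expr.key) in expr.values
        for expr in term
    )


def pick_node(
    nodes: Dict[str, Dict[str, str]],
    required_terms: List[List[MatchExpression]],
    preferred: List[Tuple[int, List[MatchExpression]]],
) -> Optional[str]:
    """Return the chosen node name, or None if no node passes the hard rule."""
    # Hard rule: at least one required term must match the node's labels.
    candidates = [
        name for name, labels in nodes.items()
        if any(term_matches(labels, term) for term in required_terms)
    ]
    if not candidates:
        return None

    # Soft rule: add up the weights of the preferred terms a candidate matches.
    def score(name: str) -> int:
        return sum(w for w, term in preferred if term_matches(nodes[name], term))

    return max(candidates, key=score)


# Hypothetical cluster: node name -> node labels.
nodes = {
    "worker-1": {"zone": "az1"},
    "worker-2": {"zone": "az2", "disk": "ssd"},
    "worker-3": {"zone": "az3"},
}
required = [[MatchExpression("zone", "In", ("az1", "az2"))]]
preferred = [(1, [MatchExpression("disk", "In", ("ssd",))])]

print(pick_node(nodes, required, preferred))  # worker-2
```

Both worker-1 and worker-2 pass the required rule, and the preferred term tips the choice to worker-2. In this sketch an empty list of required terms rejects every node, which is a simplification: a real pod without a required rule is not restricted by node affinity at all.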
I don't know how relevant it will be, but I just came up with this. Let's say you have a party, a birthday party or any other party, or maybe a KubeCon party. In the party, the dress code mentioned is red. So you have a dress code which is red. Now there are three friends, blue, green and red, and they decide to go to the party, but out of them only one person has a red dress. That means only the person with the red dress can attend the party. You are following, right? So treat the party as a node. This particular node has a taint, which I'll explain in the next section, called dress=red. All the pods, and in this case that is all the people, which do not have the toleration will be rejected. Blue comes, it's rejected. Green comes, it's rejected. But any pod with the toleration dress=red will be admitted. So that's how you can relate it. It may be a weird example, but it might stick in your head.

Now let's try to understand it in the proper Kubernetes way. You have Kubernetes: let's say this is the control plane and these are your worker nodes. This is one of the nodes, and on this particular node we have a taint applied. That taint has foo as the key. A taint is in the form of key, value and effect: it is written as key equals value, then the effect. So in this particular case, there is a taint set on the node which is foo=bar with the effect NoSchedule. That means any pod that comes without a matching toleration will not be scheduled to this particular node. Now suppose that, in a pod, we define a toleration. Obviously there are use cases for this, which we'll definitely talk about.
Now we have a pod with a toleration. In the toleration we have specified the key, which is foo, the operator, which is Equal, the value, which is bar, and the effect, which is NoSchedule. So everything is actually matching: the key foo matches, the value matches, the operator is Equal, and the effect is NoSchedule. So this particular pod can be scheduled on this particular node.

Now, we are restricting which pods can enter the node, but we are not saying that a pod with this particular configuration has to, or definitely will, come to this node. It has the toleration. If the scheduler picks this node and tests it against the taints and tolerations, the check will be satisfied and the pod will be scheduled. But there can be a scenario where you have a pod and you have multiple nodes. This particular pod has a toleration for the taint on this node, but these two other nodes do not have any taints at all. So the pod can be scheduled there as well.

If we want the pod to be scheduled specifically on a particular node, then we define node affinity. In the taint scenario, we are just telling the node: you will only accept pods which have the toleration for this particular taint. If a pod does not have it, do not accept it. If a pod with the toleration comes to you, accept it. But that pod can also go to other nodes which have no taints, which is also fine. So with taints we are restricting which pods may be scheduled on the node, and with affinity we are telling the pod to be scheduled on a specific set or group of nodes that we have defined. With that, I think it is now clear. A few other things on the operator.
There are two operators, Exists and Equal. By default, it's Equal. When you specify Equal, or do not specify an operator at all, you have to provide a value, and the taint's value has to be equal to it. Exists is for when the taint just exists: if a taint with that key exists on the node, then you tolerate it and the pod can be scheduled. So for Exists, no value is required.

Now, there are three effects: NoSchedule, PreferNoSchedule, and NoExecute. NoSchedule means: please do not schedule pods that do not have the toleration for this taint. PreferNoSchedule means: please try not to schedule pods which do not have the toleration for this taint, but if there is no other option, then you can. Just prefer not to. NoExecute is the same as NoSchedule with one extra thing, which is pod eviction.

Take NoSchedule first. Say this is node one and it has two pods running already. Now you apply a taint with NoSchedule on this node. These two pods will still be running even if they don't have the toleration. Now suppose you don't use NoSchedule but NoExecute. As soon as you put the taint on the node with the effect NoExecute, the pods which are already running and do not have a toleration for that taint will be evicted from the node. That is the difference between NoSchedule and NoExecute.

So what we have learned till now is, first, we have node affinity, where we supply information to the pod so that it can be scheduled on specific nodes. Second, we have taints and tolerations. Taints are applied on the nodes, and we tell the node that any pod without the toleration should not be entering it. And the toleration is applied on the pods.
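
The matching rules covered so far (key, operator, value and effect, with Equal versus Exists, and NoExecute adding eviction on top of NoSchedule) can be written down as a small Python model. This is a sketch under simplifying assumptions, not the scheduler's actual code: the class and function names are invented for illustration, and it requires the toleration's effect to match the taint's effect exactly, ignoring the looser forms the real API also accepts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str  # "NoSchedule", "PreferNoSchedule" or "NoExecute"


@dataclass(frozen=True)
class Toleration:
    key: str
    effect: str
    operator: str = "Equal"  # "Equal" is the default; "Exists" ignores the value
    value: str = ""


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """A toleration matches when key and effect agree and the operator is satisfied."""
    if tol.key != taint.key or tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def untolerated(taints: List[Taint], tolerations: List[Toleration], effect: str) -> List[Taint]:
    """Taints with the given effect that none of the tolerations match."""
    return [
        t for t in taints
        if t.effect == effect and not any(tolerates(tol, t) for tol in tolerations)
    ]


def can_schedule(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """New pods are blocked by untolerated NoSchedule or NoExecute taints.
    PreferNoSchedule is only a preference, so it never blocks here."""
    return not (
        untolerated(taints, tolerations, "NoSchedule")
        or untolerated(taints, tolerations, "NoExecute")
    )


def is_evicted(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """Only an untolerated NoExecute taint removes a pod that is already running."""
    return bool(untolerated(taints, tolerations, "NoExecute"))


node = [Taint("foo", "bar", "NoSchedule")]
print(can_schedule(node, []))                                               # False
print(can_schedule(node, [Toleration("foo", "NoSchedule", value="bar")]))   # True
print(can_schedule(node, [Toleration("foo", "NoSchedule", "Exists")]))      # True
print(is_evicted(node, []))                                                 # False
print(is_evicted([Taint("foo", "bar", "NoExecute")], []))                   # True
```

The last two lines show the difference between the effects: a running pod without a toleration survives a NoSchedule taint but not a NoExecute one.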
So on the pods we specify the toleration. This particular pod has the toleration, so it can be scheduled on the tainted nodes.

Now the use cases, which are very important. First, dedicated nodes. You can have dedicated nodes for specific purposes, and only the pods or applications which are required to run on those dedicated nodes should be on them. So we'll have dedicated nodes, we'll put taints on those dedicated nodes, and then we'll apply the toleration to the pods which we want to be scheduled there. That should be clear.

Second, special hardware. You can have specialized hardware with more CPU and more RAM, and you want heavy applications to be scheduled on those nodes and not any other applications. So you will put a NoExecute or NoSchedule label, sorry, a taint, on that particular node, and only the memory-hungry or CPU-hungry pods will be given the toleration in their spec section.

The next one is taint-based evictions. This I already explained: maybe something has changed, or some policy has changed, where you are required to apply a taint on the node and you also want to evict any pod on that node which does not have the toleration. So you will put NoExecute there.

Now, remember that I showed you the kubeadm setup, kubeadm plus containerd. It's a very simple setup. I took four instances from Civo and then ran each of the commands from the gist that I showed previously. I can share the link again, no issues with that. Then we get a four-node cluster where one node is the control plane and three are the worker nodes where your workload actually runs. You must have wondered: why doesn't my pod get scheduled on the control plane node? Why do the pods not get scheduled on the control plane node?
There are some other things as well, and there is one extra thing that you can define here, which is called toleration seconds. Toleration duration or toleration seconds, something like that. I'll confirm just after we close this presentation. What it means is this: take NoExecute as the example. NoExecute will obviously evict the pod. But if you have the seconds defined here, like 3600 seconds or some other number, then the pod will keep running for that duration, and then it will be evicted. So it is helpful in some scenarios.

Okay, so I was talking about the default taints. This small snippet is from the docs; I will show you where the docs are. The node controller automatically taints a node when certain conditions are true. The following taints are built in. First, node.kubernetes.io/not-ready: the node is not ready. This corresponds to the node condition Ready being false. This is automatically added by the node controller. There is also a NoSchedule taint on the control plane, which is put there when you initialize the cluster. I will show that as well when we move to the demo section. Then we have unreachable. We have memory pressure: the node has memory pressure, so this taint is added. The node has disk pressure, this taint is added. The node has PID pressure, this taint is added. The network is unavailable, this taint is added. All of that is taken care of by the node controller.

There are different controllers: the node controller, the DaemonSet controller, the deployment controller, and so on. The node controller is the one which is responsible for adding these taints to the nodes. Obviously it has other responsibilities, but this is one of them. So these were the default taints. I hope the confusion between node affinity and taints is cleared up, and I hope the concept of taints and tolerations is clear. Like, what is a taint?
A taint is applied to a node. What is a toleration? It is applied to the pod. When a taint is applied and there is no toleration, the pod will not be scheduled. When a taint is applied and there is a toleration which matches the taint, then the pod can be scheduled on that particular node. There is no guarantee that, if you have a node with a taint and a pod with the toleration, the pod will definitely go to that node, because there can be other nodes which do not have any taints, and it can go to those as well, depending on what the scheduler chooses. So yeah, that's pretty much it from the theory point of view. We now move to the demo section.

Before that, I will quickly show you the taints and tolerations docs. These are the docs. And please, if the concept is clear to you, just say in the chat that it is clear, because it took a little bit of time to arrange it in this manner. I would actually feel happy if you say the concept of taints and tolerations is clear to you now. It will be even more clear when you see the demo. I see a lot of folks in the chat. So hi Saloni, hi AJ, and hi Girish. I hope you are doing well, and please keep sharing everything that is happening.

Now, these are all the concepts, and obviously we'll apply the taints and the tolerations and see how it actually works. And this is the taint-based eviction that I was talking about. I just want to show, yeah, it is tolerationSeconds that you can define. Kubernetes automatically adds a toleration for node not-ready and unreachable with tolerationSeconds of 300, unless you or a controller set those tolerations explicitly. And if you are tolerating a NoExecute taint specifically, you can specify tolerationSeconds on that toleration. tolerationSeconds is a good thing to have. Then, yeah, this is the node affinity page, so I'll paste the link for that as well. For node affinity, we have covered all the things. Now I'll switch my screen to my terminal.
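
The tolerationSeconds behaviour just confirmed from the docs can be modelled roughly as follows. This is a simplified sketch, not Kubernetes code: tolerations are reduced to a mapping from taint key to seconds, and the function names are made up for illustration. The two taint keys and the default of 300 seconds are the ones the docs describe for not-ready and unreachable nodes.

```python
from typing import Dict, Optional

NOT_READY = "node.kubernetes.io/not-ready"
UNREACHABLE = "node.kubernetes.io/unreachable"
DEFAULT_TOLERATION_SECONDS = 300


def with_default_tolerations(tolerations: Dict[str, Optional[int]]) -> Dict[str, Optional[int]]:
    """Add the automatic 300-second tolerations unless the pod set them explicitly.

    The mapping is taint key -> tolerationSeconds, where None means
    "tolerate forever" (no tolerationSeconds given on the toleration).
    """
    merged = dict(tolerations)
    for key in (NOT_READY, UNREACHABLE):
        merged.setdefault(key, DEFAULT_TOLERATION_SECONDS)
    return merged


def seconds_until_eviction(tolerations: Dict[str, Optional[int]], taint_key: str) -> Optional[int]:
    """How long a running pod survives after a NoExecute taint with this key appears.

    0    -> no toleration, evicted right away
    None -> tolerated with no time limit, never evicted by this taint
    N    -> tolerated for N seconds, then evicted
    """
    if taint_key not in tolerations:
        return 0
    return tolerations[taint_key]


pod = with_default_tolerations({"foo": 3600})
print(seconds_until_eviction(pod, "foo"))      # 3600
print(seconds_until_eviction(pod, NOT_READY))  # 300
print(seconds_until_eviction(pod, "other"))    # 0
```

A pod that explicitly tolerates not-ready with its own tolerationSeconds keeps that value, since the default is only filled in when the key is missing.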
Just give me a second. Okay. You can see the terminal window now, and I will show you all the concepts of taints and tolerations. So we'll be doing the demo for taints and tolerations. First of all, kubectl get nodes, the fancy and famous command. You can see this is not the same cluster that we created. Interesting. I need to log into a different cluster, because this one is a different one. Just give me a second and let me go and grab my IP. kubectl get nodes. So this is the one that we actually created based on the script. I can even show you the script once again. We have a control plane node, and we have three worker nodes where the pods are scheduled. There can obviously be some pods already scheduled, which is okay. I'll remove this particular one: kubectl delete pod with --force. Okay.

So, kubectl get nodes. We have four nodes. We can do a kubectl describe node. You can see we don't have any taints: in the Taints section, it shows none, so we don't have anything. If we want to find the taints from all the nodes, the simple trick that I use is this. I run kubectl get nodes, and I get the nodes in this order. Then I do a kubectl describe node and grep for Taint. The output will be in the same order. So this one will be for the control plane, this one for worker one, this one for worker two, and this one for worker three.

Now let's have a pod. I already have a pod spec, so let's see it. We have a pod spec which does not have any toleration. What we'll do is first taint the third worker, because it has no taint, and see how to taint a node. The command is very simple: kubectl taint, then node, then the node name, then the taint we want to apply. Let's say saiyam=certsmagic. Okay.
And for the effect, we can choose the same effect that is there on all the others. We can see that the node is tainted. Now if we rerun the earlier command, the describe with grep, we should see another taint appearing in the last position.

This first one is for the control plane. I told you the control plane already comes with a taint. It is a default taint that is added during the installation of the cluster. We have node-role.kubernetes.io/master with NoSchedule, which means that none of the pods will be scheduled on the master node. And that should actually be the case. In reality, too, you should not schedule any pods on the master node, so this is a good thing. The next ones are the taints that we applied manually.

Now there are taints on all the nodes, so if we create a pod, it should not be scheduled. Let's see. kubectl apply -f pod.yaml: the pod is created. kubectl get pods: the pod is pending. kubectl describe pod: it says FailedScheduling, there is no node available. One node has the taint foo bar that the pod didn't tolerate. One node has a taint foo, double o, bar, that it also didn't tolerate. One node has the master taint that it also didn't tolerate. And another one has certsmagic, which it also does not tolerate. So we have four nodes, and all four are tainted. What to do now? It's very difficult, right?

What we'll do is use another pod spec that I have, where we've added a toleration: pod2.yaml. This is the one, and this is the section in the spec. You can specify the tolerations in the pod spec section. You have your key, foo, the operator is Equal, the value is bar, and the effect is NoSchedule. So it should tolerate the node which has the taint foo=bar. Let's apply that: kubectl apply -f pod2.yaml. The pod is configured. kubectl get pods: it is in ContainerCreating, because we added a toleration. Now it should go on, I think, node, sorry, worker one.
Let's describe that: worker one has the taint foo=bar. Okay, so now let's see kubectl get pods -o wide. Absolutely, it went to worker one. It had the toleration, so it can be scheduled on a node which has the taint foo=bar.

Another interesting thing. We can see that we have a few pods which are on worker one. So we'll look at another scenario, the eviction scenario, and put a NoExecute taint. How to do that? First, let's remove the existing taint. We'll run the same taint command again. Not sure how far back it is in the history, okay. We will not choose this one, because we didn't do this. We have worker one, this one, okay, and we have the taint as foo=bar, perfect. In order to remove the taint from a node, we just have to add a minus symbol at the end of the same command. So the node becomes untainted. Let's do kubectl describe node and check the taint. You can see the taint is removed from worker one. Now we'll add a taint with NoExecute. So it is tainted with NoExecute. We see the pods are terminating because they are being evicted from this node. That's what I was telling you: if you have a NoExecute taint, the pods which do not tolerate it will get evicted from the node.

I hope you are now able to understand the concept. You know how you can apply the taints, how you can remove the taints, how you can apply the tolerations on the pod, as in pod2.yaml, and how you can apply the NoExecute taint and see how the eviction happens. I think we have covered all the scenarios, and that was the main goal: to make you understand how taints, tolerations and the eviction process actually work. Most people get confused about this. I hope this particular stream helps you understand the taints and tolerations concept in detail. And for more reference, obviously you can go to the documentation.
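
The whole demo can be replayed as a small Python simulation to check the outcomes: the first pod stays pending, the second lands on worker one, and swapping the taint to NoExecute evicts it. This is a sketch of the logic only and does not talk to a cluster. The node names are placeholders, the second worker's taint is written as `fooo=bar` purely as an assumption (the stream only reads it out loud), and tolerations are compared as exact key, value, effect triples, which covers the Equal operator used in pod2.yaml.

```python
from typing import Dict, List, Tuple

Taint = Tuple[str, str, str]       # (key, value, effect)
Toleration = Tuple[str, str, str]  # (key, value, effect), "Equal" operator only

HARD_EFFECTS = ("NoSchedule", "NoExecute")


def schedulable_nodes(cluster: Dict[str, List[Taint]], tolerations: List[Toleration]) -> List[str]:
    """Nodes whose NoSchedule/NoExecute taints are all tolerated by the pod."""
    return [
        name for name, taints in cluster.items()
        if all(t in tolerations for t in taints if t[2] in HARD_EFFECTS)
    ]


def evicted(node_taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """A running pod is evicted when the node has a NoExecute taint it does not tolerate."""
    return any(t[2] == "NoExecute" and t not in tolerations for t in node_taints)


# State of the demo cluster once every node carries a taint.
cluster = {
    "control-plane": [("node-role.kubernetes.io/master", "", "NoSchedule")],
    "worker-1": [("foo", "bar", "NoSchedule")],
    "worker-2": [("fooo", "bar", "NoSchedule")],  # assumed spelling, see lead-in
    "worker-3": [("saiyam", "certsmagic", "NoSchedule")],
}

pod1: List[Toleration] = []                        # pod.yaml: no tolerations
pod2 = [("foo", "bar", "NoSchedule")]              # pod2.yaml: tolerates foo=bar:NoSchedule

print(schedulable_nodes(cluster, pod1))  # []  -> the pod stays Pending
print(schedulable_nodes(cluster, pod2))  # ['worker-1']

# Remove the NoSchedule taint from worker-1 and add the same key/value as NoExecute.
cluster["worker-1"] = [("foo", "bar", "NoExecute")]
print(evicted(cluster["worker-1"], pod2))  # True -> the pod is terminated
```

The second pod is evicted in the last step because its toleration names the NoSchedule effect, so it does not match the new NoExecute taint even though the key and value are the same.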
I have gone through the documentation, and the documentation itself talks about all these concepts: PreferNoSchedule, NoSchedule and NoExecute, which are the effects, and then the operators, which are Exists and Equal. All of that is there. But example-wise, I think this stream should clarify some bits. So if you liked it, just shout out in the chat that it was useful and that you were able to understand the taints and tolerations concepts.

So that was pretty much it that I had for today, and thank you so much for tuning in. The last section is the certification voucher thing. I think there are two people who were active. I just want to ask Girish: did he get the coupon previously? Because I don't want to give two coupons to the same person; it would be unfair to the new folks who have been joining. So Girish, if you have got the coupon before, please drop a message in the chat saying so, so that I can announce the winner.

In the meanwhile, like I told you before, on Cloud Native TV there are different shows that run throughout the week. I also want to plug my own channel, saiyampathak.com/youtube. If you want in-depth videos on cloud technologies, you can subscribe to my channel, which is saiyampathak.com/youtube. It's my own YouTube channel, where I keep doing live streams with other folks in the cloud industry on different topics. Also follow Cloud Native TV, because we have shows running each and every day. It's not only my show, which is biweekly, so you have to tune in for that, but there are shows running every day.
This coming Friday, which is tomorrow, there's a Spotlight Live show on gRPC, where we'll be having April from Google to discuss the project. I think that would be really cool. Also, registration for KubeCon + CloudNativeCon North America 2021 is open for in-person and virtual attendance, so explore all the registration options. I'll drop the link in the chat for that as well. And Cloud Native TV is actually now on the CNCF store. I'm really, really happy that there is a decal pack for my show as well, so make sure you check that out and get the CNCF Certs Magic sticker. You can go to store.cncf.io and get all the collectibles for all the shows which are happening on Cloud Native TV. And make sure you hit that subscribe button.

Since Girish has not responded, I'm not sure, but I have a doubt. Anyway, I'm giving based on the folks who joined first, and I saw some of the tweets as well. So for today, I'll be giving the CNCF certification 50% discount coupons to AJ and Saloni. AJ, please do reach out to me on Twitter, because I don't know how to contact you. This is my Twitter handle, which you can see on the screen: SaiyamPathak. And Saloni, please reach out to me on Twitter and I'll hand over the 50% discount coupon on the certifications. Thank you for tuning in. See you next time.
This video will be uploaded on YouTube as well after 10 or 12 days or something like that; until then it stays on Twitch. Share it with friends so that you can all have the knowledge about the certifications and the concepts. I'll try to simplify them, and we'll also have more guests coming up on the next shows. We'll talk about some of the other modules, like the troubleshooting one and the volumes one, because I think those are also very confusing for some folks. I want to explain them in a way that you understand them from the certification point of view and from the regular working point of view as well. So with that, I hope you enjoyed today's show with the set of certifications. We talked about node affinity, taints and tolerations, and the way they work. Thank you for joining in. Always try to be interactive, follow Cloud Native TV, and enjoy the other shows. Thank you so much. Bye all.