Hey folks, I guess I'll start with the session. This is my very first Qtcon talk, so welcome to my first talk, on a very chilly evening, I must say. We'll be talking about the positive side of pending pods, with Keptn, which is also a toolkit for the application lifecycle. Initially this was a duo talk, but it got converted into a solo one because Adam couldn't make it, for reasons outside his control. So let me quickly tell you about myself. I am Kubernetes and Cloud Native Associate (KCNA) certified, a Keptn contributor, and an open source developer. I have worked as a program manager at CMA software and as a junior developer evangelist at Mariko. I love taking part in communities and attending technical events, and I believe in the power of collaboration. As for Adam: Adam is a developer at Dynatrace (Keptn is a project by Dynatrace), he is also a Keptn contributor and an open source advocate. These are our socials, Twitter and LinkedIn; in case you want to chat about this session or anything around DevOps, I would be really happy to have a chat with you.

Now, moving on to the agenda. We'll start with pending pods: what are pending pods, and why are they so common in Kubernetes? Then we'll see how pending pods can be beneficial. This is the heart of our talk, because in day-to-day application deployments you don't usually see pending pods as beneficial; to developers they are always a threat. But we'll see how they can be useful. Then we'll look at problem indications: when is the right time to become aware that pending pods signal a problem? There might be a real threat to your application, so you have to know which error is causing the pods to stay pending.
Then we'll look at Keptn, an open source tool, and see how we can use it to make pending pods more helpful to us. Then we'll see the demo. I have a partially recorded demo here, because this talk was originally one hour but at the last moment we had to cut it to half an hour, and it was a pretty long demo; so half of it is recorded, and I will continue live from where the recording ends. Then we'll conclude the session.

So let's start with what pending pods are in the first place. Pending pods are nothing but Kubernetes pods that have been created but not yet been scheduled to a node. Let me give you a quick example. You have an application; you have all your YAML configurations written, all the deployment files ready. You run `kubectl apply -f` with the particular file name, and your terminal shows that the pod is created, but you see that your application is still not running. So you debug: you run `kubectl get pods` and see that, okay, the pods are created, but they are in the Pending state. All these pods that are created but not scheduled to a node, and because of which the application is not running, are pending pods.

Now, why are pending pods so common in Kubernetes? Here is an overview of some common errors that might leave your pods pending. The broad categories are insufficient resources, node and scheduling constraints, and networking and image pull delays. Let me quickly walk you through each of them. The first one is insufficient resources: pods are scheduled to nodes by Kubernetes according to their resource requests.
So let's say you have a deployment, and a certain application needs a certain amount of resources to run; all the pods are scheduled onto nodes based on those resource requests. A pod will stay pending if it demands more resources than any node can offer. A basic analogy: you have a glass of water, and you can pour water only until the glass is filled to the top. In the same way, you can only schedule a pod where its request does not exceed what the node has. This typically happens in clusters that are under-provisioned or experiencing excessive demand.

I have a YAML configuration here: a Pod with a single container using a locally built image, requesting a very large amount of CPU, something like 10,000 millicores. That is a whole lot for a simple configuration, and it's a perfect example of insufficient resources. When you go ahead and apply this configuration, the pod goes into the Pending state, and when you try to debug it you'll see that the default scheduler reports FailedScheduling: zero out of two nodes are available, insufficient CPU. We asked for far more than a single node can provide.

Moving on, the next point is node or scheduling constraints. Pods can indicate which node they want to be scheduled on, and a pod will stay in the Pending state if no node satisfies the constraints it needs.
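As a minimal sketch of the over-requesting pod described above (the image name and the exact request value are illustrative, not from the slide):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: my-local-image:latest   # illustrative, locally built image
    resources:
      requests:
        cpu: "10000m"              # far more CPU than most nodes can offer
```

Applying this on a small cluster leaves the pod Pending, and `kubectl describe pod resource-demo` shows the FailedScheduling event with the "insufficient cpu" reason.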
Anything you want to run needs some sort of allocation to make it run, right? In clusters with a wide mix of nodes with various capabilities, such constraints can cause scheduling issues. To handle node or scheduling constraint errors in Kubernetes, you specify the constraints using node affinity and node anti-affinity; we'll see that in the next slide, where I have the YAML configuration for this particular error. Once you have identified the constraint and the pod meets all the demands, you can have the pod running.

So this is the YAML configuration. It's quite different from the previous one, because we have nodeSelectorTerms and requiredDuringSchedulingIgnoredDuringExecution. This particular case will again throw an error, because we are not giving the scheduler constraints it can actually satisfy.

The last, and perhaps most common, category is image pull and networking delays. There are a few related errors, such as CrashLoopBackOff and others, but these are some of the common ones. If you're downloading container images from a container registry and it takes longer than expected, that's when you know you're having a networking or image pull delay. You've applied the configuration, the pods are pending, and the application is still trying to retrieve the image from the registry you are requesting it from. In the example here there is a deliberately broken image reference, so the pull fails and the deployment gets stuck at that stage.

There are also custom conditions, and these are enabled by Keptn. Keptn lets you extend the definition of "not ready to deploy" to a much broader set of answers.
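The affinity manifest described on that slide might look roughly like this (the disktype key and the ssd value are illustrative); if no node carries a matching label, the pod stays Pending:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype          # illustrative label key
            operator: In
            values:
            - ssd                  # only nodes labelled disktype=ssd qualify
  containers:
  - name: nginx
    image: nginx
```

With the "required" form, the scheduler will never place the pod on a non-matching node; the softer "preferred" form would let it fall back.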
Just because you can go for a deployment doesn't mean you should always go for it. Maybe there are cases where you need to test a few things first. Of course there are narrow technical considerations, like third-party systems or infrastructure not being ready, but more important is the broader lens: you may be technically able to deploy, but the bigger question is, should you? Perhaps there are marketing campaigns running that cannot be interrupted. On the technical side there are things like third-party systems being overloaded, full data storage, or a temporarily high failure rate on a dependent service. And then there is a whole different section that is completely business-centric: no humans ready to support, a very short maintenance window, ongoing marketing campaigns, change freezes, too many unsolved tickets, a call center that is too busy, low stock, and various others.

Say the deployment is a promotion page for a new product. A company makes the deployment so that its users can access the page and use the product the developers built. But if you have very low stock available, you know that ordering will fail and users will be frustrated, so you need pre- and post-deployment checks.

That was all for that section. Now we'll see how pending pods are beneficial, and this is the main section of our talk. I've taken three broader perspectives on this: first, predictable scaling; second, improved observability; and third, early detection of problems. We'll be quick with this section because we have a big demo to show. So, first: predictable scaling.
Pending pods may happen during times of high demand. Say a high number of pods is being deployed for a single application; there might be a resource constraint where all the pods request resources at the same time and some pods don't get enough. You need to make sure you have sufficient resources before making the deployment. Thanks to Kubernetes' ability to automatically grow your cluster by adding extra nodes when the workload rises, pending pods can be advantageous here: they are the signal that triggers the autoscaling, which aids in preserving the performance and availability of the program.

Moving on, the next point is improved observability. You can learn more about the behavior of your Kubernetes cluster by deploying an observability tool alongside your application. You'll get all the data, all the metrics, and then you can see how the project has been behaving in the past and what measures you can take to scale it up. You can grow your cluster appropriately by using the right observability metrics and log services, the three pillars, right? For instance, you notice a pattern of pods getting stuck due to resource constraints; this is where observability can help you.

The last point is early detection of problems. You can improve much faster if you catch the problem at a very early stage. Pending pods can indicate an issue with your Kubernetes cluster: a pod doesn't go into the Pending state without a reason. There has to be some error causing it to fall into the Pending state, such as a lack of resources.
Again, we have talked about lacking resources, image pull difficulties, and so on. You can stop these problems from taking your application down if you have the right metrics, or the capability to debug the pod. So when do pending pods indicate problems? Pods stuck in the Pending state for a long time can be a huge problem. Groups of pods failing at the same time can indicate a big problem with your deployment, as can seeing the same failure reason even after debugging the errors many times. And pods that stay pending even when you have all the resources and all the conditions are met: that again can be a very big problem for your application.

So how is open source tooling helping us overcome these challenges? Keptn does not depend on any particular GitOps tooling; it works with everything: Argo CD, Flux, GitLab, and so on. KLT, the Keptn Lifecycle Toolkit, which is now Keptn itself, emits signals at every stage: Kubernetes events, OpenTelemetry metrics and traces. It ensures your deployments are observable. There are several available steps, applicable to both your workloads and applications: pre-deployment tasks, such as checking for dependent services or checking whether the cluster is ready for deployment; pre-deployment evaluations, where you evaluate metrics before your application gets deployed; and post-deployment tasks, which can perform any action. These tasks are based on TypeScript, so you can write pretty much any custom action.
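As a sketch, a TypeScript task in Keptn is declared as a KeptnTaskDefinition with inline Deno code. The apiVersion shown is from the v1alpha3 era of the Lifecycle Toolkit and may differ in newer releases; the task name and URL are illustrative:

```yaml
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTaskDefinition
metadata:
  name: check-dependent-services   # illustrative name
spec:
  function:
    inline:
      code: |
        // Deno/TypeScript: fail the task (and block the deployment phase)
        // if a dependent service does not respond
        const resp = await fetch("http://dependency.example/healthz");
        if (!resp.ok) {
          throw new Error("dependency not ready: " + resp.status);
        }
```

A thrown error marks the task as failed, which is how "not ready to deploy" gets extended to arbitrary checks.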
And there are post-deployment evaluations as well. So Keptn helps you run checks both before the deployment is made and after it, in order to scale your application correctly. The things we talked about in the previous slide are pictured here: you make the deployment using `kubectl apply -f`, deploying the complete application. There are pre-deployment evaluations, which check if enough CPU resources are available; there are workload pre-deployment tasks, which check whether dependent services are reachable; and the same works for post-deployment tasks and post-deployment evaluations, which can send a notification, a Slack notification or something, helpful to the folks who are using or building the application.

Now I'll start with the recorded part of the demo and then continue with the live part. I'll try to explain everything I'm doing in this video. As you see here, I started by creating a directory, and I created a kind cluster. To make Keptn work you need a cluster, and you need Docker installed to make kind work. I have all the steps written down here, so that you can also follow along with me in case you get lost at any step. Next, we deploy the Helm charts of Keptn: this installs Keptn, which helps you observe and get data out of your application. So I have added the Keptn (KLT) Helm repo, run a repo update, and also created a namespace for the keptn-lifecycle-toolkit-system.
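For reference, a workload opts into the flow pictured on that slide through Keptn annotations on its pod template. A sketch along these lines (names and versions are illustrative, and exact annotation keys depend on your Keptn release):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keptn-demo-app
spec:
  selector:
    matchLabels:
      app: keptn-demo-app
  template:
    metadata:
      labels:
        app: keptn-demo-app
      annotations:
        keptn.sh/app: keptn-demo-app       # groups workloads into one KeptnApp
        keptn.sh/workload: keptn-demo-app
        keptn.sh/version: "0.0.1"
        keptn.sh/pre-deployment-tasks: check-dependent-services   # illustrative task name
        keptn.sh/post-deployment-tasks: send-event
    spec:
      containers:
      - name: nginx
        image: nginx
```

The task names in the annotations refer to KeptnTaskDefinition resources, so the checks run automatically around every rollout of this workload.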
Moving on, after this section is completed (it takes a little time to deploy Keptn into your cluster), we'll create a few configuration files to make the complete Keptn setup work. As you see here, we create a collector-config YAML with certain configurations: it targets the keptn-lifecycle-toolkit-system namespace we just created, and it has an OpenTelemetry data URL. I have created the file with the YAML syntax, and then I just go ahead and apply it. Next, we create a namespace for the demo application from which we'll be retrieving the data; I've created and applied that namespace as well. We'll do the same for a few more configurations.

Now, this is the demo application. We'll be using Nginx for this demo, and you see that we have a namespace here named keptn-demo. This part is done; it takes a little time, because we are going to deploy the complete Nginx controller, and after it's done we'll visit the app and check whether it is running. What I'm doing now is checking the namespace for the Keptn demo I'll be working on: we see that it is created. Now I'm checking the KeptnAppVersion; you see it's currently in the AppDeploy phase, and again it takes some time to complete, so I'll skip that part, and you see that it is completed now. So we check the KeptnAppVersion in the wide output: the name is keptn-demo-app-0.0.1, the app name is keptn-demo-app, the version is 0.0.1 as specified, and the phase is Completed.
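The collector-config mentioned above is essentially an OpenTelemetry Collector configuration wrapped in a ConfigMap. A minimal sketch, with illustrative endpoints, might be:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: collector-config
  namespace: keptn-lifecycle-toolkit-system
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:                      # Keptn emits OTLP traces/metrics here
    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"     # scraped later by Prometheus
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          exporters: [prometheus]
```

The collector receives the signals Keptn emits at each lifecycle phase and re-exposes them in Prometheus format for the dashboards used later in the demo.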
We have a pre-deployment status of Succeeded, pre-deployment evaluations Succeeded, and the post-deployment tasks and evaluations have all Succeeded as well. So we have our version one app deployed, and we'll check it by visiting localhost:8080.

Now we move forward: we'll be using DORA metrics here to collect the telemetry data and use it for our own benefit. This is the command used for it, and then we visit the localhost metrics path to see all the metrics being exposed. So this is our application, the one we just deployed, and this is our DORA metrics page: the metrics we got from our first deployment.

Moving on, we'll not only be using DORA metrics; we have certain other tools as well, such as cert-manager, Jaeger, and various others. The first tool we deploy is cert-manager, and I have skipped the dashboard and localhost parts for this one, because covering all of that would make the demo half an hour long by itself. I have already deployed the cert-manager part, and we move on to the Jaeger deployment; again, we create a config file for Jaeger. These deployments take a little time, which is the reason I recorded this section, just so that I can cover the complete demo in half an hour. We create a namespace for observability, deploying Prometheus, Grafana, and other tools as well, to get the right data and know your application in a better way. All these things are deployed; I'll just skip this part. So we are now deploying the data source: it has the configuration for the Grafana part.
Once you have reached this section, what you can do is create a dashboard using Grafana and Prometheus, so that you have a pictorial view of your deployments and data. So we deployed Grafana here, and you can see various other tools used in the deployment, like Odigos, KLT (Keptn), Gloo by Solo, Prometheus, and others. So the observability part is deployed.

Now, we had the first version of the application deployed, right? What we do next is change the version: we upgrade from version one to version two, to check whether our data is retrieved from the application and to see whether the post-deployment check works. You see here that for now I have just changed it to 0.0.2 and haven't applied it yet; I'm waiting for the Prometheus part to get deployed, and once that section is deployed I change this and apply it. Again it takes some time for the KeptnAppVersion to move through the AppDeploy state, and then it completes. So that was the recorded part of the demo; now I'll continue live from here, okay?

To run the post-deployment check, to know whether your application is working well after the deployment, you need a webhook. I applied a few configurations before the talk: what you see here is a webhook setup. I have created a deployment for Redis, a deployment for the webhook, and various services as well. When you get the pods, you see there are a few pods created for the webhook, and the containers were still in the creating phase; that would have taken time, so I did this beforehand. Now all of these are created, and we see it is up and running. I'll continue the demo from here.
After this, what you have to do is deploy certain configurations again, to see whether the post-deployment check is working. We have a separate link for this part, so I'll quickly go ahead and show you. If we head to localhost:8084, we have the webhook we just deployed, and you see we have a unique URL here; this is completely unique to your personal deployment. We will verify this webhook sink to check whether the post-deployment tasks ran. Before verifying, we have to make a KeptnTaskDefinition. We have localhost running; now we'll add a post-deployment task. That is a configuration, and we can name it anything.

So I head back to my demo folder and quickly create a file; let's name it check.yaml. This is the configuration: as I said, we create a KeptnTaskDefinition which basically reports that the post-deployment check worked. We are sending an event for the keptn-demo, and we basically just ping the unique URL of the webhook we just deployed; it's a small task that sends a "keptn send event" message. We go ahead and apply this with kubectl. Hmm, it's showing an error: "apiVersion not set". Maybe I got the apiVersion wrong in the YAML configuration; let me just check. Okay, there was a typo; trying again. Okay, so the KeptnTaskDefinition is created now.
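A check.yaml along the lines described might look like this. The webhook service name and unique ID are placeholders for your own deployment, and the apiVersion may differ by Keptn release:

```yaml
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTaskDefinition
metadata:
  name: send-event
  namespace: keptn-demo
spec:
  function:
    inline:
      code: |
        // ping the locally hosted webhook so its dashboard records
        // that the post-deployment check ran
        await fetch("http://webhook-service:8084/<your-unique-id>", {
          method: "POST",
          body: JSON.stringify({ message: "keptn send event" }),
        });
```

When this task runs, a new request shows up under your unique URL on the webhook dashboard, which is exactly the verification done later in the demo.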
Now we have to make another configuration to check whether that message was sent. We'll deploy another YAML configuration, named check1.yaml, and verify with it. This is another Keptn task configuration which basically runs the previous one: it uses the keptn-demo namespace and just triggers the send-event from the previous configuration. This one is for version one, but remember, we have already changed the app to version two. You can continue with version one as well, but it's cleaner to keep the getting-started part and this post-deployment check demo separate; since I had version one here and have already changed to version two, it will likely go to version two. I save this and apply it.

Now we check whether the job is created, and I hope you're not finding it hard to see the screen. Looking fine? Okay. We have to wait a few minutes for this job to be up and running. Until then, let me explain what we have done with these task definitions. We deployed a webhook sink, against which we'll be making the post-deployment check. We went to localhost, where we had our unique URL. We verified the webhook sink, and we made two configurations: one sends the message, and the other verifies that the message was sent. Now let's go ahead and check whether it's working: it has completed, you see it went from zero out of one to one out of one, so we have the job running. Now, remember the app we deployed in the getting-started section? There are a few labels in it.
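A check1.yaml that manually runs the previous definition might be sketched as a KeptnTask. Field names here follow the v1alpha3 API and may vary between releases; the context values are illustrative:

```yaml
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTask
metadata:
  name: run-send-event
  namespace: keptn-demo
spec:
  taskDefinition: send-event   # the KeptnTaskDefinition created above
  context:
    appName: keptn-demo-app    # illustrative context values
    appVersion: "0.0.2"
    objectType: ""
    taskType: ""
    workloadName: ""
    workloadVersion: ""
```

Applying this creates a Kubernetes Job that executes the task's code, which is the job we watch go from zero out of one to one out of one.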
We have a different set of labels which we'll use to make that app work with Keptn. So I open the app in vi; this is the part I'm talking about, you see the labels here. We are using just the standard Kubernetes labels for now, but we'll add the Keptn labels now to enable the pre- and post-deployment checks. I go ahead and add this section: as you see in the last line, I have post-deployment tasks set to send-event, which we had in the check1 configuration. Now I go ahead and save it. Okay, an indentation error; let me quickly fix it. I think it should work here... I guess the demo gods are not with me today, but no worries, we'll try again. Okay, it's still showing an indentation error, but since we have very little time left, I'll just tell you what would happen: you can go ahead and check the webhook sink we just deployed. Yeah, this is the URL. You would see a new "keptn send event" entry, which tells you that your post-deployment check worked. The pre-deployment check we already covered in the getting-started section.

And that was all for the talk. I have mentioned all the resources here; these are the Keptn docs. If you want to join the community, or check out the GitHub, you can go ahead and do that, and you can also scan the QR code. Thank you. Do we have questions?

[Audience] You mentioned post-deployment evaluation, and I have a simple question around it. Is there any particular duration for which the post-deployment evaluation will run? I have seen in a lot of cases that once our pods are deployed, it shows successful, but after some time, say two or three minutes, it fails. In that scenario, how will Keptn help?
As long as you have Keptn deployed in your application, it will do the post-deployment evaluation, yes. And if you want, you can trigger it manually as well.

[Audience] Can we customize that post-deployment evaluation?

Yes. As I mentioned, you can customize your checks using TypeScript.

[Audience] All right, thank you.

Any other questions? Okay then, I guess this is it. Thank you so much, and thank you for being at my first Qtcon talk. Looking forward to networking with all of you folks.