Welcome everyone. I'm Shash Reddy, a software engineer at Pivotal. Today we are going to talk about Kubernetes-native style resources for building CI/CD pipelines. I appreciate everyone being here. I understand we're at the tail end of the conference and we all want to go home and relax, so bear with me for the next 30 minutes and we'll be a step closer to that.

Let's jump into the agenda for today. The first thing I'm going to talk about is terminology: let's make sure we're on the same page about what CI is and what CD is, because I'm going to be using these terms throughout this presentation. The next section is why Kubernetes fits the bill as a platform to build CI/CD on. Then, very briefly, I'm going to touch on how we are accomplishing this. And for the majority of the presentation, I'm going to concentrate on the different resources and the product itself. I do have a demo planned. However, the live demo did not work out as well as I expected, so I have a recording of the demo and I will play that. I hope to end this presentation in time so there is room for Q&A. But if that does not work out, I'm going to stick around, and there are other members of the project here to answer any of your questions.

All right. What is CI? CI stands for continuous integration. This is a process where you have multiple team members, all of them committing changes to a shared repository. You have a CI server which is continuously picking up changes on the shared repository. It builds your application and battle-tests it against multiple test suites. And once the commit has passed through, whether it's a failure or a success, it pipes the results back to the committer.

What is CD? CD stands for continuous deployment. This is an extension of the CI process: once you have a commit which has been battle-tested against multiple test suites, you can actually deploy it to production.

Why Kubernetes? Why do we want to use Kubernetes as the underlying platform for building this? I know we're at KubeCon and obviously we love Kubernetes, but that's not the only reason.

Containerization. Running CI jobs in isolated containers reduces the number of external factors which can influence your CI builds, and this is very important. Kubernetes provides a great way of running containerized applications or builds via pods.

Orchestration. At any given point, when you're running a CI/CD platform, you're not going to have one CI job; your CI is going to run multiple jobs. So you need your platform to support a scheduling algorithm which can balance the CI jobs across multiple machines. Kubernetes provides this for free, because it has been battle-tested over the last few years, has developed good scheduling algorithms, and offers multiple ways to tune the scheduling as well.

Observability. There are two aspects of observability. The first is, as a platform operator, you want to ensure that your CI software is up and running at any given point, and you want to manage that. The second is, as a user of a CI/CD platform, you want to access the logs of your builds. Kubernetes by itself does not provide both of these, but it has been around long enough that there is a whole suite of products which help you with logging, tracing, metrics, and alerts.
So as a platform operator, you can plug in any of these vendors' products to meet the observability needs you have.

Cloud platform agnostic. This applies to any piece of software: nobody wants to be tied down to a cloud vendor. Kubernetes provides a great abstraction layer on top of cloud vendors, where you can move your platform from one Kubernetes cluster to another without being tied to any specific vendor.

Identity management. A lot of CI/CD builds contain very sensitive information, and as an operator you want to ensure that the right people have access to the right resources. Kubernetes provides great APIs, like RBAC, where you can set policies on which resources can be accessed by which users.

So I have cherry-picked a very few reasons why Kubernetes fits the bill, but there are a ton more reasons why this is a great piece of software to use.

I'll jump to the next part of the presentation: how we're accomplishing this. CRDs are the solution. I'm not going to do a deep dive on CRDs in this talk, because there were a lot of talks over the last couple of days which did much more justice to this topic. I'm going to just briefly explain what custom resources are. Kubernetes provides a great way to extend the existing set of resources by adding custom resources, and the CRD API is enabled by default by most cloud vendors today. The real power lies in custom controllers, which are continuously watching these custom resources and perform an action when there is a change.

What does this mean in the CI/CD world? The desired state of the object is: I want my CI job to be completed. The current state represents the list of actions that I want to perform. And the reconciliation loop, the controllers here, continuously tries to reconcile your given set of actions to ensure the job is completed. Whether it's successful or failed depends on what the actions themselves are, but the goal is to reach the completion stage.

All right. I've talked so much about why Kubernetes, so for the rest of the presentation I'm going to talk about the actual project itself. Tekton CD Pipeline is an open source project which defines CRDs to build CI/CD pipelines. A bit of history on the project: it originated in the Knative organization as Knative Build, and then it evolved into Knative Build Pipeline. Very soon the scope of the project outgrew Knative, so it spun off into a separate organization and has now found a home in the CDF, the Continuous Delivery Foundation, which is part of the Linux Foundation umbrella.

All right. I'm going to dive into the individual CRDs that are defined in this project, what they mean, and how you can configure them. Hopefully you'll get enough information on how to construct a basic pipeline using Tekton CD resources.

The first resource is the PipelineResource CRD. Pipeline resources are a set of inputs and outputs that you can define for your CI jobs. There are several currently supported types of pipeline resources in the Tekton CD project today. I'm going to just briefly cover git, as this is the most common resource we have all come across, so we can relate to it easily. Let's look at a sample YAML of a git pipeline resource for clarification. The type here refers to the type of the pipeline resource itself, and the next section is the parameters.
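The slide YAML isn't captured in this transcript, but a minimal sketch of such a git pipeline resource, under the v1alpha1 API of the time, might look like this; the resource name, revision, and repository URL are hypothetical placeholders:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: my-app-source          # hypothetical name
spec:
  type: git                    # the type of the pipeline resource
  params:
    - name: revision
      value: master            # a branch name, commit SHA, or version tag
    - name: url
      value: https://github.com/example/my-app   # hypothetical repository
```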
Parameters are the defining characteristics of the pipeline resource itself. In the context of git, this refers to the revision, which corresponds to a branch name, a commit SHA, or even a version tag; and the URL, which is where the project is actually hosted, so the resource can be fetched. The parameters change depending on which resource type you're talking about. For example, in the context of a Docker image, this is going to be an image digest and a URL which corresponds to the registry where the Docker image is hosted.

The Task CRD is the next CRD. A task refers to a collection of sequential steps, and each step corresponds to a container definition. A task can have a series of input resources, a number of output resources, and also a few parameters defined for the task itself. Like I mentioned, it can have a number of input resources, and all the steps are executed serially. As an artifact, or an output, you can have a number of output resources that are produced as a result of the execution of these steps. I like to think of the Task CRD as a template, where you dictate what the input parameters and resources are. Adding a Task resource by itself does not actually create any Kubernetes resources; there is no custom controller which is watching Task resources and acting upon them. I'm going to explain in the next part of the presentation how this template is used.

So let's take a sample task and see how the YAML looks. In this example, the input is a GitHub repository, so the task is expecting an input git resource. The task itself is building an image: it takes the source code, builds an image, and the output artifact produced by this task is an output Docker image.

In the sample YAML, the first section is inputs. Inputs are divided into parameters and resources. Like I mentioned, this is a template, so it dictates what the input parameters are. Someone configuring this template can override the default value if theirs is different; but if they follow this pattern, where the Dockerfile is present at the mentioned location, they don't have to configure the parameters at all. The second part of the inputs section covers the resources, and if you look, it only mentions the expected type; it does not actually bind a pipeline resource. It's very similar for the output resources: it mentions the type, but it does not bind a pipeline resource.

The next section is steps, and each step corresponds to a container definition. You can have as many steps as you want, but in this example I only have one, where I am building a Docker image using Kaniko. Kaniko is an open source tool by Google for building Docker images from source. And if you notice, the Dockerfile path is read from the parameters using the templating syntax, so if you override the value, you don't have to change the steps, because they pick up the change through the variable.
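Again, the slide isn't reproduced here, but a rough sketch of such a Kaniko build task might look like the following; the task name, parameter default, and paths are illustrative, and the variable syntax follows the v1alpha1 API Tekton used at the time:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: build-docker-image-from-git-source   # hypothetical name
spec:
  inputs:
    params:
      - name: pathToDockerFile
        description: Path to the Dockerfile to build
        default: /workspace/git-source/Dockerfile   # override only if yours differs
    resources:
      - name: git-source       # declares only the expected type; nothing is bound here
        type: git
  outputs:
    resources:
      - name: builtImage       # again, only the expected type
        type: image
  steps:
    - name: build-and-push
      image: gcr.io/kaniko-project/executor
      command: ["/kaniko/executor"]
      args:
        - --dockerfile=$(inputs.params.pathToDockerFile)     # read via templating
        - --destination=$(outputs.resources.builtImage.url)  # push target from output resource
        - --context=/workspace/git-source
```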
All right, let's look at the TaskRun CRD. This is where a custom controller is continuously watching, and where the Task template is used. The TaskRun CRD binds the input pipeline resources, and if there are any output pipeline resources, it attaches them as well. Also, if the pipeline resources require any identity, for example if you have a GitHub repository which is private, it attaches a service account to provide that identity.

And what does this end up creating? It ends up creating a pod. A TaskRun translates to a pod with the service account attached. All the input pipeline resources are mounted on a shared volume under the workspace, and all the steps defined in the task are executed serially against the workspace resources. Once the steps have finished execution, the output pipeline resources are executed. Let's take an example where you have a ZIP artifact generated as an output of the steps: at the end of all the steps' execution, your output pipeline resource will upload the artifact to cloud storage.

The sample YAML is where we bind actual resources and create Kubernetes resources. The first section refers to the task reference; this is the same Kaniko task we saw previously. The next section is where it binds. I mentioned I have the template on my right hand and the runtime object on my left hand. If you notice, the template mentions that it needs a resource of type git, and the runtime object actually binds my-app-source, which is a pipeline resource, as the input. In a very similar fashion for the output resources, the output is of type image, and the output resource runtime configuration binds the actual image.

If I've lost you in the YAML so far, I have a summary which hopefully provides the bigger picture. As a user, when you submit the TaskRun YAML file, the TaskRun controller, which is continuously watching these resources, fetches the Task CRD, which is your template object, and validates that all the parameters and resources requested by the template are provided in the runtime configuration. If that matches, the TaskRun controller goes ahead and fetches all the input pipeline resources, builds a pod, and executes all the steps. And once the steps have finished execution, it uploads whatever output pipeline resources you have defined.
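Here is a sketch of a TaskRun that binds those resources to the Kaniko task above; the service account and resource names are hypothetical, and the field names follow the v1alpha1 API of the time:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: build-docker-image-run
spec:
  serviceAccount: build-bot     # provides identity, e.g. for a private repository
  taskRef:
    name: build-docker-image-from-git-source   # the Task template to use
  inputs:
    resources:
      - name: git-source
        resourceRef:
          name: my-app-source   # binds the actual git PipelineResource
  outputs:
    resources:
      - name: builtImage
        resourceRef:
          name: my-app-image    # binds the actual image PipelineResource
```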
I do have a demo to show this in action, and the demo uses all three CRDs I have described so far, so I'm going to play the video. For context, this demo is using a k3s cluster; k3s is from Rancher Labs, and I have installed it on my Mac. I have not sped up the video, so hopefully this shows how lightweight this CI/CD layer is on top of Kubernetes, and how it can perform powerful actions using something very simple.

I have a Hello World application, a very simple server which has a handler for printing Hello World when you query it. I have a Dockerfile, and this Dockerfile is not doing anything crazy: it uses a Golang base image, compiles the binary, and runs it. This is the registry I'm using; I'm using GCR, and I don't have the demo app image there yet. I don't have anything installed on my Kubernetes cluster: no pods, nothing in any of the namespaces. This is the installation of the Tekton CD resources. Now I'm applying the resource and the task YAML. For context, I'm using the same example that I've described so far, where I have a git resource, an output which is a Docker image resource, and one task, which refers to Kaniko. And I'm applying the TaskRun, which is actually going to bind all of these resources together, and watching the pods.

This is where I've actually edited the video, because building the image and uploading it takes about two and a half minutes, so I have cut that part out; you can clearly see from the timestamps that it took about two and a half minutes. All right, so here, voila, we have an image. And that's the end of my demo. Sorry for the hiccup.

So far we have talked about the PipelineResource CRD, the Task CRD, and the TaskRun CRD. But this was a very small execution. When building pipelines in an enterprise, you need something much more, because the pipelines are going to be complicated: you need a fan-in, fan-out structure, and you're going to have multiple parallel test suites that you want to run. The Pipeline CRD sits at a layer above Task and TaskRun. I think of a pipeline as a template which orchestrates multiple tasks. As I mentioned, a pipeline has a number of resources that you can declare and a series of tasks, and the pipeline dictates the flow in which the tasks need to be executed.

Let's take a simple pipeline, where the first task is running tests, and the second task is building the image we're all familiar with. The output of this pipeline is going to be the built Docker image. Let's look at the YAML. The first part is defining the resources. Very similar to how the Task CRD is a template, the Pipeline CRD is also a template; it does not actually bind any pipeline resources. Here it declares two types, git and image, because those are the same two resources we're still using in our pipeline. The second section is where we define the tasks. The first one is where we want to run the unit tests: we refer to a task called unit-test, which dictates what the steps are, and we define what the input resources are. The next section is building the app image itself, and here is the important part, runAfter, which dictates the flow. When this Pipeline CRD is submitted, a directed acyclic graph is constructed based on this keyword: the flow of the tasks depends on runAfter. Since the unit-test task does not have any dependencies, it will be executed first.

The PipelineRun CRD, very similarly, is a runtime configuration that references a pipeline and binds the pipeline resources, and the PipelineRun controller creates a number of TaskRuns, depending on what the expected flow is. Again going through the YAML: the first section refers to the pipeline; a service account is how you attach the identity for the pods; and in the next section you actually bind the resources, very similar to what we saw in the TaskRun, where this references the actual pipeline resources.

If I have lost you already in the YAML, again I have a summary slide which hopefully puts it all together in a good picture. As a user, when you submit the PipelineRun YAML, the PipelineRun controller, which watches for these resources, takes in the Pipeline CRD, checks what the expected resources are, and checks whether the runtime object has provided all of them and everything matches. The PipelineRun controller then tries to find the set of eligible TaskRuns which can be spun off now, because they don't have any pending dependencies; it is responsible for creating the TaskRuns. And as the TaskRuns go through their lifecycle, the PipelineRun controller updates the status to reflect whether the pipeline run is completed or not; it continuously watches the pipeline run status.
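To make the flow concrete, here is a hedged sketch of that two-task pipeline and a PipelineRun binding its resources; the names are hypothetical, and a unit-test Task is assumed to exist elsewhere:

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: test-and-build
spec:
  resources:                    # template only: expected types, no bindings
    - name: source-repo
      type: git
    - name: app-image
      type: image
  tasks:
    - name: run-unit-tests
      taskRef:
        name: unit-test         # assumed to be defined elsewhere
      resources:
        inputs:
          - name: git-source
            resource: source-repo
    - name: build-app-image
      taskRef:
        name: build-docker-image-from-git-source
      runAfter:
        - run-unit-tests        # this edge defines the DAG
      resources:
        inputs:
          - name: git-source
            resource: source-repo
        outputs:
          - name: builtImage
            resource: app-image
---
apiVersion: tekton.dev/v1alpha1
kind: PipelineRun
metadata:
  name: test-and-build-run
spec:
  pipelineRef:
    name: test-and-build
  serviceAccount: build-bot     # identity for the pods
  resources:                    # runtime bindings to actual PipelineResources
    - name: source-repo
      resourceRef:
        name: my-app-source
    - name: app-image
      resourceRef:
        name: my-app-image
```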
All right, I don't know if this question has occurred to you so far, but each TaskRun corresponds to a pod. So how is information shared between multiple pods? The answer is a PVC. PVC stands for persistent volume claim, and the PipelineRun manages the lifecycle of the PVC: after the execution of each TaskRun, the state of the world of the pod is transferred to the PVC, and before the next task starts, that state is transferred from the PVC to the new pod before execution. Very soon we realized that not every IaaS can provision volumes at the rapid rate we want, so support for using storage buckets like GCS is already in the project today. You can either use a PVC, or you can configure a storage bucket to use as a scratch space for the PipelineRun controller to make sure the state is transferred from one pod to another.
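For reference, in the Tekton of that era the bucket was configured through the config-artifact-bucket ConfigMap; here is a sketch, assuming a GCS bucket and a pre-created credentials secret (both names hypothetical):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-artifact-bucket
  namespace: tekton-pipelines
data:
  location: gs://my-tekton-artifacts              # hypothetical bucket used as scratch space
  bucket.service.account.secret.name: gcs-creds   # secret holding the service account key
  bucket.service.account.secret.key: service_account.json
```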
All right, how are we doing on time? I think we have about 10 minutes, so if you have any questions, I'm happy to take them. I will leave the references slide up here. I have the link for the project documentation, and there's a catalog of tasks: if you want to build something new, there's a list of tasks that you can reference. I learned very recently that the Tekton project releases using Tekton CD resources itself, so that's awesome. There's a list of features upcoming in the project. And if you're interested in what you've seen so far and you want to contribute, there's a contribution guide as well. Yeah, thank you.

Question: when we have a Jenkins pipeline and a Kubernetes pipeline, how do we choose between them in production?

So your question is about whether to choose a Jenkins pipeline versus using the Tekton project? Am I understanding your question correctly? Yeah, how different is this? Okay. James Rawlings is here, and I think he can answer the question, because he is actually a creator of Jenkins X. Do you want to take the question? No, no, go for it.

Hi, so in Jenkins X we have a simpler syntax that Jenkins users are more familiar with, but it all generates Tekton resources underneath. All the Tekton tasks, TaskRuns, pipeline resources, everything, we generate from the Jenkins X YAML. So it's really up to you whether you want to use Tekton resources vanilla, like the Tekton project itself and many other people are doing, or, from a Jenkins point of view, we wanted to create a syntax that was familiar to existing Jenkins users, if that helps. So the choice is yours. Does that help? Is that okay?

That's okay. Don't worry, I can speak Chinese. The follow-up, in Chinese: given that we're in a production environment, should we use the Jenkins and Kubernetes plug-ins? Somebody wants to translate that for us? I need some translation, sorry. Oh, I can actually listen to the translation, I see. Do you want to repeat the question so I can try, or we can try this after, too? Okay: we know that Kubernetes itself can also provide a pipeline, and Jenkins also has a pipeline. When we're choosing, how do we quickly use these pipelines to solve our problem? Which one should we choose?

Okay, so I think the question is more about at what level we should consider using this. As a user, should we build pipelines at the level of Tekton CD, or should we use something like Jenkins X, which is a layer above and generates Tekton CD resources itself? It is completely up to you, because as a user, if you're already familiar with the Jenkins X style of writing pipelines, then you can continue to do so using Jenkins X; but if you are okay with adopting something new and you want to try it, then the Tekton CD project is available too. But again, at the underlying layer, Jenkins X constructs Tekton CD pipelines.

Question: I have a question about Tekton's role. What kind of role will Tekton play in the future? Will Tekton grow more and more tools that Jenkins X may already have?

If I'm understanding the question correctly: the goal is an ecosystem of integrations built on the Tekton project. The goal is definitely to have Tekton CD as the underlying execution engine for CI/CD resources, so that other CI/CD products can translate their YAML into Tekton CD pipelines. That would be the end goal.

Follow-up: are there any existing integrations currently? Will alternatives come out for technologies that Jenkins X already has today?

Do you want to take the question? This is Christie Wilson; she's a tech lead on the Tekton CD project.

Hey, so I think the question you're asking is whether Tekton will add features and tools which are also provided by Jenkins X, so that people would have to choose between one or the other, or maybe it would even obsolete some of the things offered by Jenkins X. Is that sort of what you're asking? Yeah. So I think the goal is that Tekton would be building blocks that Jenkins X would be consuming. And James is nodding vigorously. The idea is that if we start providing functionality in these tools, Jenkins X would then also be using it, so I don't think there would be a conflict. And to touch on the previous question about choosing between the Jenkins X YAML and using the Tekton resources underneath: one thing we really want is the catalog of tasks that Shash linked up there. We want to get to the point where people can choose items from that catalog and use them with Jenkins X or with any other system that is using Tekton underneath. So hopefully you wouldn't be choosing between them; it would be more like you can use the features of Tekton with Jenkins X or whichever CI/CD product you're using. And James is confirming that Jenkins X will be adding support for the catalog.

Any other questions? I want to know about the product's performance: how long does it take to finish a pipeline? That's the first question. And second, I want to know how to roll back a version quickly.

Okay, so the first question was about the performance of the pipeline. The performance of a pipeline totally depends on what the pipeline itself is, because if you have a series of parallel tasks, then all those TaskRuns are spun up in parallel, so obviously you're going to get the most performance.
But there's no way to measure performance in general, because there is no single canonical pipeline out there for which we can say it executes within n seconds and then measure different CI/CD products against it. For the second part of the question, sorry, I forgot. Rollback options? The rollback options, again, depend on how you have configured your tasks. Are you talking about rollback of the Tekton CD project itself, or rollback of the application? Deploying the YAML. Then it depends on how you have configured your tasks, because if you've configured a task to deploy, let's say, a newer version of the application, then to roll back you would have to run another task which performs the rollback.

Can we use something like a helm rollback command in Tekton? Are you talking about the Helm feature, or something like Spinnaker, which is much more intelligent about doing deploys of the application? The server keeps the versions I deployed, one, two, three, and rolls back with a single command. Okay, yeah, I understand your question. The Tekton CD project provides resources to build the pipeline, and what you're asking about is the next level: intelligent deployments of the application. This is another use case, which is solved by Spinnaker. Spinnaker is actually under the same foundation and targets a different set of problems; Tekton CD is not solving that problem as of today.

Any other questions? You're using multiple CRDs for the tasks and the pipeline. How do you get one place to find the logs and the results, to make sure that if something fails in the middle of the pipeline, you can find where it is? Because it's different containers, different logs.

Yeah. When a pipeline creates TaskRuns, it attaches itself via the owner reference, so you can track which TaskRuns were created by an individual pipeline, and each individual TaskRun creates pods and attaches its own owner reference to the pod. So if something fails, you can trace back through the parent references to which pod failed and which container failed, and the status of the object itself holds information on which pod and which container actually failed for you.

Which means you're using kubectl to look for the results? Yes, no special sauce. Okay. And one more question about the shared storage: does it require ReadWriteMany? Yes, because if you have a pipeline which spins off multiple pods, and multiple pods at a given point want to access the PVC, then ReadWriteMany is required. That was another reason why we added support for cloud storage, because this is much more easily solved at the storage bucket level.

I think we're out of time. If you have any questions, I'm going to stick around, and I think James and Christie will be around here as well. Thank you so much.