All right, thank you everybody for coming, for showing up today. Very happy that you made it this afternoon. This talk is a new way to roll: we're talking about supply chain choreography for enterprise Kubernetes. My name is Steve Watkins. This is my colleague, Kirti Apte. We're both solution architects with VMware in the Tanzu business unit, so that means we work with all things modern apps and Kubernetes. Very happy to be here. Let's kick it off. So first thing, why in the world would we even talk about supply chains? Our most important concern is our developers. Developers cost a lot of money, right? We want them to write code. What do developers want to do? They want to write code. How do they want it deployed? They don't care. While we're at KubeCon, we want it deployed in Kubernetes, kind of obviously. But for those of you that have a little gray in your beard like me, if you remember back, all the way back about 15 years ago, the book Continuous Delivery by Jez Humble and David Farley nailed what is the preeminent challenge for us: if somebody has a good idea, how do we quickly get that in front of our users? How can we make that happen? And that is still the challenge. That is the challenge that we're all moving towards: get developers to build something smart, something useful, and get it in front of users. So that was 15 years ago. Even up to today, we're still running into this issue. 15 years later, we've built a bigger cloud native infrastructure. We're doing a lot more cool things. But Joe Beda, if you know Joe, one of the co-founders of Kubernetes, called out specifically that our job is to manage complexity and unlock velocity, because for all the good features that we get out of Kubernetes and cloud native operations, they can still create some drag. So we're still struggling with velocity, with getting things out.
Okay, so the traditional way of approaching it, or the way we have been approaching it, is using CI/CD. Pretty straightforward, right? I identify some tasks that I need to do: there's code, build an image, run some tests, build a release, get it deployed, and operate and monitor. And this has been around for a while. It depends on an external orchestrator. The external orchestrator will start and stop each one of these tasks and control them. And there are some advantages for sure. It's very mature; it's been around for a long time; there are a lot of great products out there. You've probably seen a lot of them on the Expo floor. It's also pretty easy to monitor. If you want to monitor uptime and delivery and really drive metrics and improvements, it's pretty good. And for basic or straightforward tests, it's actually really fast and really quick to set up. But it's not perfect. There are a couple of things that could be improved. For one thing, it's very tightly coupled. What does that mean? It means that because I have an external orchestrator controlling each one of those tasks, if I want to change one of those tasks, or if I want to add a new one, it impacts my orchestrator and it impacts my upstream and downstream tasks. So things are a little more difficult, a little less flexible. The second thing is that even though it's easier to monitor, it also creates a single point of failure. This can take down production and slow down getting code in front of users. And finally, it can be rigid. What does that mean? Well, let's say, for example, we discover there's a CVE in our container image or in our libraries. We want to fix that CVE, so we've got the patch. In order to roll out that patch, sometimes I have to fake a new delivery of a code commit just so we can run through end to end. So it's a bit rigid that way, not so flexible. Okay. That's all well and good.
A lot of people can live with that, but where things really get challenging is when we start to look at orchestration at scale. So here's what happens a lot of the time. We'll start out with one team, maybe they're developing in Go or something like that, and we'll build a pipeline, and that pipeline will go through A, B, C, D and E. Okay, so far so good. But then it turns out we have another team, and this team is doing data work in Python, and they don't use A. They just use B, C, D and E. Okay, well, we're gonna build another pipeline for them. And then it gets more complicated because the Java folks are always more hassle, and they're gonna come in, they're not gonna use A, and they're gonna add another step F. This is just an example, but what happens? We end up having a library of pipelines to try to maintain and orchestrate instead of one. And secondly, a lot of this work on customizing the pipeline gets pushed down to the individual development team. So now I've got developers working on pipelines and doing DevOps instead of writing code. And the point is to get code out as fast as possible. Okay, so how can we solve this? Well, as with so many things in agile and in DevOps, we can look to the practices that came out of lean manufacturing. And if you look at lean manufacturing, when they look at this problem, it is a supply chain problem. And here's the thing about manufacturing; here we are in Detroit, by the way. Back in the 70s and 80s, it was the Toyota quality system that taught Chevy and Ford how to build a car properly, and we're still applying those lessons today. So let's look at each one of these tasks not as a task that needs to be controlled, but as a downstream part of my supply chain. And the thing about manufacturing is that the output of one step becomes the input for the next. So let's say I have raw rubber; I use the rubber and I create a tire.
And then the next step is we'll take that tire and mount it on a rim, and now I have a wheel. Now that wheel goes down to the next step and it gets screwed onto a car, hopefully quite securely, with luck. So the point here is that the output of each step in that supply chain is the input for the next, and one step's output becomes the trigger for the next step. We start to establish pull, which is again one of the lean principles. So if we look at lean manufacturing, this gives us a way to approach the problem. Okay, here's the other thing that's really important, and that was a critical part of the Toyota quality system for sure, and that is establishing provenance, trust, and transparency. Provenance: where did it come from? Who touched it? Which parts were added on? I need to have complete trust and an agreement with my suppliers all the way down the line, and I should be able to pull up and view what happened each and every time. All right, so let's take that model and apply it to deploying some applications. So I'm gonna start out here, and this is a typical collection of folks you might have: SREs and SecOps, Devs and Ops. All right, so let's start with our traditional flow. We're gonna watch the repo, build an image, run a test, do configuration, and deploy it. Those are my steps. All right, let's start by making each one of those steps, instead of externally controlled, an independent step. Let's encapsulate them and say: you are self-sufficient now, okay? And then what we'll do is we're gonna wrap this in a template, okay? Now, by the way, we're gonna go into this in a lot more detail a little bit later on; I just wanna give you the high level version first. So we'll wrap this up in a template so it's no longer a very specific application, it's no longer Flux or Argo, it becomes "pull my repository," okay?
Good, so then I'll say, all right, I know that you are now a template and you are independent, so your input, your raw material, will come from the previous step. So test: look to the build-image process to feed you what you need to run a test, okay? So far so good. So we'll put all of those together. They're all independent, they all fire independently, they all know to look for each other, and we're gonna put that in a single definition, and we're gonna call that a supply chain, okay? A single blueprint. All right, good, so far so good. So now, somewhere, I have to be able to find a definition for what this supply chain is and be able to call it and kick it off. All right, so let's bring it back to our people. First of all, our SREs and our SecOps are the ones that are going to help define and maintain this supply chain. And by the way, this is all Kubernetes-native stuff. So it's not an external app that sits out there; it lives right inside of your Kubernetes cluster, right? And then my developer is going to say, yes, I've finished my code, and here's the workload YAML. Now, we're gonna get into this a little bit more as well, but here's what's in a workload YAML: the Git repo for my code, and the type of app, or the name of the supply chain. And the name of the supply chain could be something like "it's a web app," "it's a data app," something like that, very straightforward. So the Cartographer controller will say, yes, I can map that particular definition to a supply chain; the supply chain executes, runs itself, and the delivery team is also able to maintain and match that supply chain definition in their output. So we're all able to organize and manage this all at once, okay?
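To make that concrete, here's a minimal sketch of what a workload YAML might look like, using Cartographer's `carto.run/v1alpha1` Workload kind. The repo URL, name, and label value are made-up placeholders:

```yaml
# A hypothetical Cartographer Workload: the developer's "here's my code" handoff.
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
  name: my-web-app                              # placeholder name
  labels:
    apps.tanzu.vmware.com/workload-type: web    # matched against a supply chain's selector
spec:
  source:
    git:
      url: https://github.com/example-org/my-web-app   # placeholder repo
      ref:
        branch: main
```

Notice there are no pipeline steps in here at all; the label is the only hint the Cartographer controller needs to pick which supply chain runs.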
So we've gone from having an external controller that's rigid and requires external management to something that's a little more organic, that has more of a pull feel, and it borrows very heavily not only from manufacturing but, for those of you who write code and are familiar with event-driven architecture, that's a good way to think about this: I have publishers and subscribers, okay? All right, so that's the high level view; now we're gonna double-click and go into a little more detail on each of these components. The first thing we wanna do is introduce you to Cartographer. Here. Okay, so hello everyone, and first of all, thanks everyone for showing up for the session. So now we are going to go deep into Cartographer and try to understand what Cartographer is. Let's take a step back and look at this somewhat messier slide. What I'm showing here is the CNCF landscape. There are lots of projects and technologies getting added to the CNCF landscape. But if you zoom into a specific area, a very important area, CI/CD, there is an abundance of tools available in the CI/CD space, such as Tekton, Argo CD, and lots of build tools, kpack and others. And how many of you use CI/CD tools from the CNCF landscape? Are there any? Yeah, lots of you. So basically the community has provided these tools to implement CI/CD. But when you're trying to make CI/CD work, how do these tools come together? And that's the motivation for Cartographer. So Cartographer is an open source project. It was initiated by VMware and it joined the CNCF landscape as of October 2021, with a fairly active community in that space as well. So now let's look at Cartographer-based CI/CD and how you can implement it in your organization. Cartographer offers two levels of abstraction: ClusterSupplyChain and ClusterDelivery.
So you can implement your CI, or continuous integration, with ClusterSupplyChain. What do these abstractions do? They help you define a standardized way to implement your workflow, the CI workflow as well as the CD workflow, in a Kubernetes-native, declarative format. That's the advantage of Cartographer. So what operators will do is define this CI workflow as a supply chain. Typically developers will commit the code, and it will go through the CI pipeline. This is continuous integration, which we do through a supply chain running on a shared Kubernetes cluster. You can view that as a build cluster, where my supply chain is installed and the entire workflow is executed. And the typical outputs of the supply chain are images as well as application configuration. Images you push into the image registry; application configuration you push into a Git repository. Again, it's a reusable asset you are pushing to the Git repository. When you move to higher environments, such as staging and production, you need to worry about deployment configuration, such as the NFRs for your application: how are you going to scale your application, how many replicas do I need for my service? All those decisions are baked into that deployment configuration. And environmental configuration, such as how my service is going to connect to external systems, such as a database; the production database endpoint will be different from the development database endpoint. So delivery, the CD side, is similar to the supply chain: it is a resource which defines the workflow to deliver your product to multiple workload clusters.
It will take inputs, such as the different configurations and the image registry, and it will define a workflow; Cartographer will implement that workflow and install Kubernetes resources onto multiple clusters, a multicluster environment as well. And the final product is the running app. So this is the entire CI/CD flow which you can implement with Cartographer. And what you're getting with this is reusability. You are getting full end-to-end automation; many of these processes are manual today, but here you get full end-to-end automation, and also reproducibility, so you can reproduce errors anytime. These are the advantages of this CI/CD process. Now let's take a simple workflow, where I'm pulling source code, building an image, and deploying it to the Kubernetes cluster. Now, this is not a real-world scenario; the real world is going to be complex, but let's build up the concept. What we are saying here is: we are pulling source code using community-provided tools such as Flux CD, to build an image I'm using kpack, and then we're deploying it to the Kubernetes cluster. The Flux CD definition that I have given here is the normal Kubernetes construct, which is a GitRepository. Now, in the supply chain world, the operator defines this workflow and the steps inside that workflow; each step is autonomous in nature, has inputs and outputs, and each step is implemented using the tool of your choice. And then each step can be templatized, and that's where Cartographer provides you this abstraction in the form of templates. There are four types of templates: ClusterSourceTemplate, ClusterImageTemplate, ClusterTemplate, and ClusterConfigTemplate, and there's also a Runnable interface available.
So let's take an example of one template. Each one has different inputs and outputs, and based on your task or step requirement, you can choose which type of template to use. For example here, to pull the source code using Flux CD, I'm using a ClusterSourceTemplate, and it's very simple, right? The Flux CD definition, I have wrapped it under the template section, and the output of that is the urlPath. Flux CD pulls the source code and generates the source code URL, and that is exposed through the urlPath. Similarly, a ClusterImageTemplate has a different output: it exposes the imagePath, so the generated image reference is exposed through the imagePath, and a ClusterConfigTemplate has a configPath as the output. And yeah, sure. Okay, quick point: one of the questions we get from folks is, well, we just implemented Argo; can't we just do this with Argo? What Kirti was showing is that we happen to be using Flux; if you're using Argo, you insert Argo in there. The goal of wrapping things in a template is that you can swap your tools out. Remember when I talked about different teams wanting different tools and different steps? The idea behind the templates is to make that very, very simple. I can use the same supply chain with different tools; the tool is the only thing that changes. And it also allows you to have a little extra diversity, so if I wanna make a change and try something out, we can work it in. Yeah, so one advantage of templatizing is that instead of Flux CD, I can use an Argo CD template and just get it working as well. So how do you stitch all this together? The ClusterSupplyChain defines that workflow, the CI workflow or the CD workflow, and you stitch it together with the ClusterSupplyChain. It is nothing but a combination of different templates and the wiring of inputs and outputs.
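As a sketch of the idea, here's a hypothetical ClusterSourceTemplate wrapping a Flux GitRepository. The template name is a placeholder, and the `$(workload...)$` interpolations follow Cartographer's template syntax; treat the details as illustrative rather than a definitive manifest:

```yaml
# A hypothetical ClusterSourceTemplate: wraps Flux's GitRepository and
# tells Cartographer where to find the fetched source on the stamped resource.
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
  name: source-provider                  # placeholder name
spec:
  # Cartographer reads the step's outputs from these paths:
  urlPath: .status.artifact.url          # where Flux publishes the source tarball URL
  revisionPath: .status.artifact.revision
  template:
    apiVersion: source.toolkit.fluxcd.io/v1beta1
    kind: GitRepository
    metadata:
      name: $(workload.metadata.name)$   # stamped out per workload
    spec:
      url: $(workload.spec.source.git.url)$
      ref: $(workload.spec.source.git.ref)$
      interval: 1m
```

Swapping the tool means swapping only the `template:` body; the urlPath contract to the rest of the supply chain stays the same.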
For example here, Flux CD generates the source URL output, which becomes the input to the image builder. Then the image builder generates an image, which becomes the input to the next step, deploying that image. Another piece is that the developer typically creates the workload resource; that is also abstracted out. It is nothing but the application configuration. The developer knows the application parameters and configuration really well, so those are stored in the workload, and then the supply chain selector is matched: Cartographer matches the supply chain's selector with the workload's labels, and that's how the right supply chain gets executed. So if you have multiple supply chains for different purposes in your organization, you can use that selector-matching mechanism provided by Cartographer. Now, security: you cannot ignore it. Security compliance needs to be embedded into your supply chains and into your organization. There are lots of vectors through which your applications and systems can be compromised; for example, CI/CD itself can be compromised, and builds can contain vulnerabilities, so scanning and other tools are necessary. These are the best practices that we recommend: signing the images is really important; put security guardrails in place in terms of signing the images and scanning for vulnerabilities, your images as well as your source code. And the best practice is to store all these scanning and vulnerability results in a centralized data store.
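Here's a hedged sketch of how that wiring and selector matching might look in a ClusterSupplyChain. The template names reuse the placeholders from the earlier sketches, and the field names follow Cartographer's v1alpha1 API, but check the docs for your version:

```yaml
# A hypothetical ClusterSupplyChain: wires source -> image -> deploy
# and selects workloads by label.
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
  name: supply-chain-web                       # placeholder name
spec:
  selector:
    apps.tanzu.vmware.com/workload-type: web   # matched against the Workload's labels
  resources:
    - name: source-provider
      templateRef:
        kind: ClusterSourceTemplate
        name: source-provider                  # e.g. the Flux-based template
    - name: image-builder
      templateRef:
        kind: ClusterImageTemplate
        name: image-builder                    # e.g. a kpack-based template
      sources:
        - resource: source-provider            # the urlPath output feeds in here
          name: source
    - name: deployer
      templateRef:
        kind: ClusterTemplate
        name: app-deployer                     # placeholder deploy template
      images:
        - resource: image-builder              # the imagePath output feeds in here
          name: image
```

The `sources:` and `images:` entries are the publish/subscribe wiring: each step declares which upstream outputs it consumes, rather than an orchestrator pushing work to it.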
For example, with VMware Tanzu Application Platform, which is based on open source Cartographer, we provide an out-of-the-box supply chain called test-scan-and-store, an enterprise-grade supply chain with scanning ability, using the Grype scanner and others, which stores the scan results in the metadata store. That's a centralized store where you can query the results and then generate an SBOM, which is really, really important from the auditing perspective for your security. So here, the output of the SBOM is shown, where you see the packages inside the image and the CVEs detected from them. You can query that at any time, give it to developers, submit it to auditors; there are multiple advantages to that. So with that, over to you, Steve. Okay, so one more thing on that. One of the advantages of this model of having inputs and outputs is that you can capture those outputs and use them for your security purposes. So it's very cool that way. What we're gonna do for you is take you through a live demo. It's gonna be great, but it's gonna be a lot of YAML, so I'm gonna give you a quick flyover first. This is how things actually work in real life. I start out with a dev, and what he's gonna do is write some code and do a commit to some kind of a Git repo. As part of that, as we saw, he'll call out the workload. The workload will kick off the supply chain; the supply chain will execute, and as it executes, some of the outputs will be an OCI-compliant image that gets pushed out to some kind of a container repo, and a configuration that gets put into some kind of a Git repo as well. So that'll be out there, and then, remember, we do push and we do pull, just like in manufacturing; you establish pull.
So we'll pull that into a test scenario, and the idea here is that my developer can have real-time access to it and be able to pull it in and run his tests. The goal here is to iterate as fast as possible, because remember, we wanna get good ideas out to end users as fast as possible. Once we're happy with the code, we do the publish, and then the same thing runs through, only this time we'll pull and deploy into production, and we end up with happy users, okay? So at a high level, that's what's gonna fly. You'll notice that we're using all open source tools in this. If you recognize some of the icons: we're using Flux, we're using Tekton, we're using Grype, we're using kpack and Cloud Native Buildpacks, we're using Cosign for our signing, we're using Knative to do scale-to-zero, and we're using Carvel to do some of the extra templating. Okay. All right, who wants to see a demo? Yeah? All right, good stuff. Hey, so it's going to be a live demo, so bear with me, and I hope the demo gods are with us. Yeah, switch it to mirror, you can see it now, okay? Okay, so it's going to be a live demo. I like a scripted demo, actually, so I have created a script for the demo. What that means is the commands are scripted, because everybody gets nervous up here and fat-fingers something. So they're all scripted, but it's actually running live, and you'll see it running live, okay? We're not trying to pull a fast one on you. Okay, yeah, that's a great explanation. Okay, so let's consider a basic GitOps workflow here: pulling the source code with Flux CD, building an image with kpack, then generating a ConfigMap and pushing it to the GitHub repository using Tekton. That's a very basic workflow. Now, what is installed on the cluster? Let's see that real quick. And I'm using the Tanzu CLI, which is provided by VMware Tanzu Application Platform.
So here, I have Cartographer running, and I have also installed the community plugins: Flux CD, Grype, and also the Tekton pipelines. Now, Cartographer will choreograph the passing of results from one resource to another using the graph described in the ClusterSupplyChain object. Let's look at the developer perspective. My developer is in charge of creating a workload. What does that workload look like? Basically, what I have done here is I'm using a Golang project, and this is the GitHub URL, and I'm using the main branch, and then the labels: there are many supply chains running on this cluster, so I'm teaching Cartographer to use supply-chain-web-demo for this GitOps flow. And there are specific build-related parameters that my application architect has defined that I have included as well. So let's try to deploy this workload. It's very simple to deploy with kubectl apply. I'm going to deploy it. Okay, so the workload is created; let's see the status of the workload. Okay, so these are the resources created after deploying that workload. Supply-chain-demo is used there, and then the source provider; these are the other resources which are created. It's waiting for status.artifact.url; basically, Flux CD is trying to pull the GitHub repository and generate the URL. So let's see that one more time. Okay, so now I think Flux CD has generated the URL here, and the others are still in progress, so it will take some time. In the next window, I'm just going to watch it a little bit more. Okay, looks like it is ready. So what it has done is generate the Flux CD URL; this is the Flux CD URL which was generated. Then the image and digest are created; I'm pushing that image to the Harbor registry, but it can be any image registry of your choice.
Harbor, by the way, is an open source product from VMware as well, and it is used heavily at VMware. So let's switch back to this window now, and let's see, from the operator perspective, how I put together the supply chain, or which supply chain I used. In this cluster, all these supply chains are running, and for this GitOps flow, I used supply-chain-demo. So let's look under the hood of this supply chain and see what's in there. These are the resources: multiple templates which I have used, and I have defined the execution order of the templates in such a way that the source provider's output is input for the image builder, the config provider uses the ClusterConfigTemplate, and the image builder's output is input for that step. So this is just the wiring that I have done. And if everything went well, I should have the config uploaded into my GitHub repository. Let's see that live. Okay, so this manifest.yaml was created two minutes ago, and this config is uploaded. So this supply chain got executed; the result we were expecting, uploading the config to my GitHub repository, is done. Now, these are the templates in the supply chain. Like I said, there are templates available, and these are the API resources Cartographer uses. Just to show you the ClusterImageTemplate: the output of the ClusterImageTemplate is the imagePath. I'm teaching Cartographer to use the imagePath with the value status.latestImage. kpack builds the image and puts it into status.latestImage; I'm teaching Cartographer to take that image path, set it as the input, and pass it to the next step. Now let's look at the enterprise-grade GitOps workflow. We just saw the basic workflow. Here I'm adding testing: a source tester, which is a Tekton pipeline, basically a Tekton job.
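A hypothetical ClusterImageTemplate wrapping a kpack Image might look like this. The names, registry, and builder are placeholders; the key wiring being described is `imagePath` pointing at kpack's `status.latestImage`:

```yaml
# A hypothetical ClusterImageTemplate: wraps a kpack Image build and
# exposes the built image reference to the next supply chain step.
apiVersion: carto.run/v1alpha1
kind: ClusterImageTemplate
metadata:
  name: image-builder                # placeholder name
spec:
  imagePath: .status.latestImage     # kpack writes the built image:digest here
  template:
    apiVersion: kpack.io/v1alpha2
    kind: Image
    metadata:
      name: $(workload.metadata.name)$
    spec:
      tag: harbor.example.com/apps/$(workload.metadata.name)$  # placeholder registry
      serviceAccountName: default
      builder:
        kind: ClusterBuilder
        name: default                # assumes a ClusterBuilder already exists
      source:
        blob:
          url: $(sources.source.url)$   # fed from the source provider's urlPath
```

Again, the only tool-specific part is the `template:` body; the imagePath contract is what the downstream deploy step consumes.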
And then source scan with Grype, image build with kpack, image scan with Grype. So I'm going to scan the image and push the scan results into the metadata store. The config provider generates the ConfigMap, then the config writer uses a Tekton PipelineRun, and the deploy is a Knative Service deployment plus an HTTPProxy. These are the Kubernetes resources which are created. To save some time, I have already deployed this. So what you see here: the supply chain is source-test-scan-to-url; that's the supply chain I used for this, and these are the resources created by that supply chain. It has already deployed it. Another thing to point out: delivery. Basically, what I'm doing is taking those configurations and deploying them into a Kubernetes cluster. So there is a source provider; Flux CD will pull that source ConfigMap and then deploy it with the deployer, which is a cluster deployer in this case. And a Knative Service is also created, and as you can see, there is an external endpoint created for the application where end users can access it. And just to show you a little bit more, if we have time: Tanzu Application Platform, which is based on open source Cartographer, gives you visualization of that supply chain as well. What you're seeing here is the UI, based on Backstage, and it gives you the supply chain visualization, and you can see the health of each component inside the workflow in that UI. So here is the test: if I click on that, my developers have written unit tests; the test is run and the test is successful. Then, just to show you the image scanner: these are all the CVEs detected by my Grype scanner, and I found one critical CVE. I can define a policy on the cluster where I say: don't promote the code to higher environments if critical CVEs are found. You can define those policies as well. And so this is about it. This is the choreography.
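The delivery side described above could be sketched as a ClusterDelivery, which is Cartographer's CD-side counterpart to the supply chain. All names and the label key here are placeholders, and the exact wiring should be checked against the Cartographer docs:

```yaml
# A hypothetical ClusterDelivery: pull the config produced by the supply
# chain and deploy it onto a workload cluster.
apiVersion: carto.run/v1alpha1
kind: ClusterDelivery
metadata:
  name: basic-delivery                   # placeholder name
spec:
  selector:
    app.example.com/deliverable: web     # placeholder label, matched on a Deliverable
  resources:
    - name: source-provider
      templateRef:
        kind: ClusterSourceTemplate
        name: delivery-source            # e.g. Flux watching the config Git repo
    - name: deployer
      templateRef:
        kind: ClusterDeploymentTemplate
        name: app-deploy                 # e.g. a kapp-controller App applying the config
      deployment:
        resource: source-provider        # the pulled config feeds the deploy step
```

The symmetry with the supply chain is the point: the same template-and-wiring model covers CI on the build cluster and CD onto the run clusters.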
I know I rushed through the demo, but it was a live demo, and I hope you got something out of it. We hope we get some extra bonus points for doing a live demo. All right, so I know we're at time, but a couple of quick things. First of all, as you walked around the floor, you've seen a lot of supply chain talk coming up. It's really starting to pick up, especially in the last six months, and some places like Google are way ahead of the curve. We think that supply chains need to have security built in, so we're actually working towards the Google-initiated standard, the SLSA standard. I know in our team we're working towards level three, and we expect to be there very soon; we're already at level two. So I just wanted to bring out that even though we did diverge and talk a little bit about our particular product, our product is built on all of these open source tools, and we're embracing those open standards. The next slide, please. And then finally, a couple of things I want to leave you with. Please, please, please go check out Cartographer, and it's cartographer.sh. If you go to .io, it'll take you to completely the wrong place; don't make that mistake. Go to cartographer.sh. You can download this, and one of the really cool things they did is they packaged a subfolder called hack with the ability to run and test this in a kind cluster right on your laptop, so you don't even have to deploy it on a full Kubernetes cluster. Makes things a little bit easier. Of course, there's always GitHub; please check out our Tanzu Cartographer repo and pull some things down, as well as the Slack channel and, of course, Twitter. That's what we have for right now. I know we're right at time; I don't know if we have any questions, but actually, Kirti and I will hang around at the end, and we'll be happy to take questions as well. Yeah. Thank you.