This is a maintainer track talk about Cloud Custodian and where things are today. To do a little bit of intro for people who may or may not be familiar with it: Cloud Custodian is an open source project, Apache 2.0 licensed, and a CNCF incubating project. It's a rules engine for managing your cloud accounts, and it can also enforce those policies on your IaC code before deployment — so both runtime (shift right) and shift left. It's a simple YAML DSL, very declarative, very easy to read: you query a particular resource, you apply some arbitrary set of filters, and then you take a set of actions. It's best in class at actually fixing problems, i.e. remediation. It supports real-time enforcement by integrating with the cloud providers' serverless and event runtimes, so that as API calls are happening you can introspect them and make sure they're compliant with policy. It's a stateless tool, and we support multiple providers — most of the big public clouds, as well as Kubernetes, OpenStack, and Terraform, which we'll go into in a minute. It's used in production by thousands of companies; we even occasionally get reach-outs from the cloud providers before they ship an incompatible update, so that they don't break their own users. We've got about 1,200 people in the Custodian Slack and about 3,000 in the FinOps Slack's Cloud Custodian room as well.

To see what some use cases are, I'm going to cover a couple of categories. One that's been popular lately is the FinOps use cases: find the old things, take out the garbage on snapshots, set up lifecycles on S3 buckets, turn things off when not in use, find the underutilized things and get rid of them. Super helpful for cutting down on waste and being more efficient about your cloud spend.
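As a minimal sketch of what that query/filter/action shape looks like — the resource type and filter here follow the AWS provider docs, but verify the exact names before use — a FinOps policy that deletes old EBS snapshots might be:

```yaml
policies:
  - name: delete-old-ebs-snapshots
    resource: aws.ebs-snapshot   # the resource type to query
    filters:
      - type: age                # built-in age filter for snapshots
        days: 90
        op: greater-than
    actions:
      - delete                   # the remediation action
```

You run it with `custodian run`, and the same policy file can hold many policies across providers.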
Security has always been one of Cloud Custodian's core use cases, and it allows you to do all kinds of things: making sure everything is encrypted with customer-managed keys, making sure things are not accessible from the network, making sure resources with embedded IAM policies are only accessible to the right audience. And then of course being able to do incident response. It ties into the cloud providers' native tools — this is all about being the easiest way to take advantage of those native capabilities, be that AWS Security Hub or Google Cloud Security Command Center, and to actually use them for enforcement and remediation purposes. So you might get a notification from GuardDuty that an instance has been popped, and you might set up a policy to remove its IAM role, take a forensic snapshot, and shut it down.

Then there are the operations use cases, which are sort of a grab bag for lots of different things: centralized logging, out-of-region backups, making sure you're not getting cross-AZ traffic from your NATs or your ASGs to your database. Lots of different use cases pop up for operations as well.

All right, so that's a brief overview of Custodian. So what's been going on? This is only since the last maintainer track update, which was at KubeCon in Amsterdam. I did look at the numbers for the full year since the last KubeCon North America, and oddly, the six-month numbers are almost exactly half of those — so our pace has been pretty steady. In this year total we've added four new maintainers; in this past six months, we've added two. They're actually both in APAC — our first APAC maintainers. The work is pretty evenly spread; there's been a lot done on GCP and Azure in particular this past six months, and there's definitely been a lot of new resources in general.
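That GuardDuty incident-response pattern can be sketched roughly as follows — the role ARN is a placeholder, and the detail that `set-instance-profile` with no profile name detaches the instance's role is my reading of the docs, so verify it:

```yaml
policies:
  - name: ec2-guardduty-response
    resource: aws.ec2
    mode:
      type: guard-duty                 # triggered by GuardDuty findings
      role: arn:aws:iam::123456789012:role/custodian  # placeholder ARN
    actions:
      - snapshot                       # take a forensic snapshot first
      - type: set-instance-profile     # no name given: removes the IAM role
      - stop                           # then shut the instance down
```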
This is pretty common for Custodian: we're keeping up with the pace of innovation from the cloud providers, so a lot of new resources, new filters, and new action capabilities emerge from that. But a few things I wanted to highlight — we have two new cloud providers. One of them is OCI, contributed directly by Oracle. It's still fairly early; it's the newest provider that was added, and it's currently being used for things like security use cases as well as some off-hours cost FinOps use cases. We're looking at supporting event-based policies for it. I'd currently call it alpha/beta in terms of having the full capability set that we like to support across providers. The other new provider is Tencent Cloud, contributed by Tencent, and it supports a fuller variety of things, including several databases, networking, container registries, and object storage. I can never actually figure out what the product names mean until I look at the docs — like, what is COS? It's actually object storage — but I'll leave that aside. This is also in use, and it also does not support event-based policies yet.

So I was actually at a FinOps meetup yesterday, talking to someone who had been using Custodian for years, and they were saying how they wanted to apply their policies before the resources get deployed. I'm like, well, we've actually had that capability for a year. So I want to take a moment to look at some of our capabilities around shift left. This is a fairly new capability, from roughly last September, and it's a little bit different from standard Custodian policies: it has its own CLI front end, because it's really targeted at running those policies on a developer workstation or in your CI/CD. And in the context of a developer workstation, we want to have developer-friendly output.
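Back to the new providers for a moment — as a rough illustration only, since I'm treating the `tencentcloud.cvm` resource name, the tag-filter shorthand, and the `stop` action as assumptions to check against the c7n_tencentcloud docs, a Tencent Cloud policy might look like:

```yaml
policies:
  - name: stop-untagged-instances
    resource: tencentcloud.cvm   # CVM is Tencent's compute instance service
    filters:
      - "tag:owner": absent      # instances missing an owner tag
    actions:
      - stop                     # assumed action name; verify in the docs
```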
We directly tag the source lines that are non-compliant with a policy. In CI, we're actually focused on the code reviewers: we do direct annotation of pull requests so that the reviewers can see exactly which resources are flagging against which policy. We'll go through a few slides on this in more detail. From a policy-language perspective, it's got a few more capabilities, because we're operating entirely on an in-memory graph. The example policy here is: how do I write one policy, across multiple cloud providers, that enforces my tag standard? In regular Custodian we'd actually have to write a separate policy for each resource type, but in shift-left we can write one generic policy for this use case across all resources, which I think is pretty cool. We've added a few more things in the last six months around tagging as well, and we're looking at adding support for CloudFormation and for SARIF security output. On the CI integrations, we currently support GitHub, GitLab, and Azure DevOps — the latter two primarily via JUnit XML, although we also support GitLab's SAST security report format if you're using the right edition of GitLab.

Just to give a quick feel for the flavor of what it looks like — actually, I wonder if I want to go a little bit wild and do a live demo with no prep. Why not? What could go wrong? All right, so just taking a run: we've run some real-world Terraform against a bunch of policies, and it lets us see, on an individual-resource basis, exactly which lines are flagging, and we can go look at it that way. If you're looking for more of a coverage semantic, you can also output a summary by resource instead of by policy and see which resources are being evaluated.
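That cross-provider tag-standard policy can be sketched like this in c7n-left — the `terraform.*` wildcard and `taggable` filter follow the c7n-left docs as I understand them, so treat the exact names as assumptions:

```yaml
policies:
  - name: check-tag-standard
    resource: "terraform.*"      # match every resource type in the plan
    description: All taggable resources need Env and Owner tags.
    filters:
      - taggable                 # only resources that support tags
      - or:                      # flag if either required tag is missing
          - tag:Env: absent
          - tag:Owner: absent
```

One policy covers every taggable resource in the graph, instead of one policy per resource type as in runtime Custodian.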
As an example of the integration with CI, there are a couple of additional capabilities here. I mentioned the multi-resource policies. We also have built-in policy testing — a built-in unit test runner where you basically provide examples of resources that should pass and resources that shouldn't, with a simple assertion language. And we support doing arbitrary traversals: any two resources that have a reference to each other can be resolved, so you can go from, say, an EC2 instance to a security group to a rule on that group — arbitrary chaining to get from one node to the other, and then make assertions about the attributes on the final node.

All right — yep, we're going to finish way, way ahead of time. So I wanted to look at some of the provider roadmap. Generally with Custodian, I think all these providers have had at least 75 pull requests in the last year. Like I said, a lot of it is just keeping up with the capability set that's emerging from the cloud providers. In terms of more generic capabilities, on AWS we've added another provider — it's a little bit different. We call it AWS CC; it's based on the Cloud Control API. For us, what it primarily does is give you coverage across a wider set of resources than the ones we've handwritten before, and it has generic update and delete capabilities. In keeping with the theme of moving some of this into shift-left contexts, we also want to add support for CloudFormation as an event-based execution mode. It's primarily a CD enforcement technique — it doesn't apply to pre-commit or pre-merge — but it's still very helpful for catching things before they're deployed.
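The traversal capability mentioned above can be sketched with c7n-left's `traverse` filter — the filter type exists, but the specific resource hop and attribute path here are illustrative, not taken from the talk:

```yaml
policies:
  - name: no-open-ingress-on-instance-sgs
    resource: terraform.aws_instance
    filters:
      - type: traverse                  # follow references in the graph
        resources: aws_security_group   # hop: instance -> security group
        attrs:                          # assertions on the final node
          - type: value
            key: ingress[].cidr_blocks[]
            op: contains
            value: "0.0.0.0/0"          # flag groups open to the world
```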
We also want to add event modes for post-deployment, real-time detection, and we're looking at providing additional IAM identification capabilities generically across IAM entities using Access Analyzer. A common request has been the ability to tag resources from the resource's own attributes, so that's something we're also looking at.

For GCP: originally, when the provider was written, most of the GCP APIs were effectively global, so we went with that. Many of the newer GCP APIs are primarily regionally targeted, and we're retrofitting that support in. It should be backwards compatible — we'll default to all regions when none are provided, but we will now start respecting the region flag when it's passed on GCP policy sets.

For Azure, we continue to look at how we can slim the Azure SDK down; this has been an ongoing thread with the Azure team for a few years. The Azure SDK for all the APIs we use is well over a gigabyte at this point, because it's mostly generated code, and we're trying to get that to something reasonable — it bloats our Docker images in particular. We also want the ability to copy tags from a subscription or a resource group down to the underlying resources.

On release engineering, we've made a strong effort toward fully automating releases. Our goals — like SLSA level 3 — are no humans in the loop, with a release going out on a regular cadence automatically, assuming no blocking issues. We're mostly all the way there: release artifacts are now fully built in CI, inclusive of unit and functional tests. We had originally started experimenting with the Wolfi and Chainguard images, but they started commercializing the registry. Wolfi itself is open source, so we're waiting for Wolfi-based images to be available from a public, non-commercial registry before we do that.

And then finally, we've started working on our graduation process.
Like I said, Custodian is used by thousands of organizations and has 400-plus contributors, so it's time to start moving to the next step in the CNCF process. We have just completed our security review with a third-party audit, and we start reviewing the audit results next week. So hopefully we get that done — I'll be optimistic and say maybe summer next year. And that's all I had. I'm happy to take questions, or we can continue on to the pub crawl — sorry, the, what is it called, the expo hall crawl thing. Any questions? There's a mic in the middle of the room, or you can just stand up and use a loud voice. Take your pick.

Does Custodian support event-driven policies for GCP? It does. I've used it in AWS with Lambda — could you describe how that works in GCP a little bit? Sure. In GCP — I forget what it's called, Stackdriver? No, I think they've renamed it — effectively there's an API audit log that gets relayed through Pub/Sub to a GCP Cloud Function. So in the same way you do it in AWS, you write a policy, you do `custodian run`, and it provisions all the event sources and the serverless functions for you behind the scenes. So do you have to provision anything yourself? Nope — it's all taken care of for you; you basically just run it. It creates the log sink, the forwarder to the Pub/Sub topic, and the function — those are the three pieces that get created. And the P99 on it in GCP is milliseconds; I mean, it's fast. For Azure, I believe we use Event Hub — I used to have a slide for this in the intro deck. We use Event Hub and Azure Functions and hook them up. Azure Functions are a little bit different from other serverless runtime environments, let's say, because they're built on top of web apps, but effectively it's the same general concept: you write the policy, and we provision the event sources and the function for you.
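Going back to the GCP event mode for a second — a hedged sketch of what such a policy looks like (the `gcp-audit` mode type and `methods` key follow the c7n_gcp docs as I recall them; the specific method name is illustrative, and this detection-only policy simply reports matches):

```yaml
policies:
  - name: bucket-versioning-check
    resource: gcp.bucket
    mode:
      type: gcp-audit               # audit log -> Pub/Sub -> Cloud Function
      methods:
        - storage.buckets.update    # API method that triggers evaluation
    filters:
      - type: value
        key: versioning.enabled     # flag buckets with versioning off
        value: false
```

Running `custodian run` against this provisions the log sink, Pub/Sub topic, and function described above.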
For Azure, I think we're going off of ARM resource logs in that context. And for AWS, Azure, and GCP, we also support multiple ways of getting and querying the cloud resources: both through the native describe APIs and through the respective cloud providers' asset-inventory stores — AWS Config, Azure Resource Graph, and GCP Cloud Asset Inventory. Go for it. So — in IaC, because we have the whole graph in memory, we can do arbitrary traversals at essentially zero cost. Against the cloud providers, we don't do arbitrary traversals; we have implemented hops — we call them related filters — where you can go from, say, an instance to its IAM role, or from an instance to its security group or subnet. So for cloud provider resources we have that capability, but it's expressed on a per-hop implementation basis. And AWS Config is not fully comprehensive — KMS keys, for example, and especially newer resources; even Config's ability to do select-resources queries is inconsistent, covering only a partial subset of its resources. That's part of why we've also chosen to implement querying directly. So that's it. I bid you all a happy KubeCon. Take care.