Welcome everyone to cert-manager: past, present and future. I'm going to keep the audience interaction to a minimum. It's Friday morning and there were obviously the parties last night. I hope everyone's feeling okay. Anyone else at Spukenetys last night? Any hands? There's some nods, there we go. Thank you all for coming. We'll start off by introducing ourselves, as you might have expected. I am Ash, or Ashleigh, I don't mind either way. I am a senior software engineer at Jetstack and also a cert-manager maintainer, which is useful for this talk. I am super interested in cryptography, certificates, TLS, all that kind of stuff. That is the kind of angle I approach it all from. Obviously some interest in Kubernetes as well, given the nature of cert-manager. I'm Jake, I'm a platform engineer at G-Research; we're a London-based financial research company. I'm also a cert-manager maintainer, and I'm interested in highly scalable secure systems where everything has an identity. Yes, aren't we all? So yeah, we thought we'd talk a little bit about cert-manager and where it came from. In May 2016, there was a project called kube-lego, which was open-sourced by Jetstack, my old employer. The idea was that you'd have your ingress into your cluster and you'd want a publicly trusted certificate for it, so you'd stick an annotation on your ingress. kube-lego would contact Let's Encrypt, and only Let's Encrypt, it would solve the HTTP-01 challenge and it would give you a publicly trusted certificate. This was a really popular thing to deploy on Kubernetes because the workflow was so easy. We liked this, thought it was a good idea, and tried to make it a little bit more cloud native. So we started to create some custom resource definitions that would hold the state of your certificates and your challenges, so you could store all your state in the Kubernetes API.
So people started using this over the next two years, and we thought, well, we'd better make our CRD API stable, because we have a lot of production users now and we're not going to break it. Is anyone here a production user of cert-manager? Yes, that's what we like to see. It had 15 million pulls a day when I last looked, so there's a lot of production users. So yeah, we can't break them. And even if you don't know that you're using it, you might be, because it's included by default in a few cloud distributions. For example, the VMware people reached out to us a while back, showing how it was embedded in Tanzu. I've actually spoken to a few people this week who told me for the first time that they were using it as part of a distribution. I was like, oh, neat, cool. So yeah, we ended up with our v1 CRD API in September 2020, and we promise we'll never break this API. We've done well so far. We'll do our best. So fast forward to February this year. We kicked off the incubation process, and we officially announced just last week that we've joined the CNCF incubator. Woo, yay. Thank you. Got to give some shout-outs. So thanks a lot to Ricardo Rocha from the Technical Oversight Committee for sponsoring us. Oh, and thanks to Ash and my old colleague Joachim for kicking off the process, and everyone for seeing it through. There was a whole bunch of work there, and thank you for the shout-out, Jake. But there are so many other people we could thank as well. It was a really great process, and I think there is a possibility that in the crowd today there are people from the TOC who gave us a thumbs up for incubation. Thank you specifically, obviously, or if anyone watches this back. And also just thank you to everyone that's here, because, this is so cliche, right?
This is what everyone says, but the community is everything, and we genuinely love talking to people about this stuff. It's people contributing and raising issues and pull requests, just getting involved, that really drives everything forwards. That's why we are where we are. And yeah, just a huge thank you for all of that. It's a great train to be on, and we hope it continues. So that was the history lesson, and we'll just talk briefly about what cert-manager is, although maybe people know already. We're basically a cloud-native operator for certificates. We've got our CRDs, and cert-manager, which operates over them. It's probably one of the first things that you'd install on a production cluster if you were a cluster administrator. We basically just want to make managing TLS in your cluster simple and easy. Our workflow is really simple. The final certificates are available in Kubernetes Secrets. Originally they were just in PEM, but apparently real users need things like PKCS #12 or Java KeyStores, so we make all of the formats available, and you can just mount them in your pod and consume them. And you don't need to learn ASN.1 or manually make your certificate request and email it to your certificate authority like before. Certificates are just renewed in good time. Fairly often, even now, you see big companies go down, and it turns out one of their key certificates buried somewhere in their infrastructure has expired, and it's really embarrassing for everyone. If you're using cert-manager, that should never happen. We promise. Asterisk, we promise, asterisk. And yeah, all of the components are modular, so we have a lot of integrations, but I think we'll talk a little bit about those in the forthcoming slides. Yeah, we have this. We threw the number of our GitHub stars on here. Every time we write slides, this seems to have to be updated. We'd love to get to 10,000, so maybe this is a mini call to action.
If you could star the repo after this, we'll get there. I'm going to do a really quick overview of the general gist of how cert-manager works. We're not going to dive into code or anything, because who wants to do that, right? Generally, we have a few CRDs which model the core ideas of issuing a certificate. At the top there, we have issuers, and that's how you describe how to get a certificate: from what service, from where, things like authentication for that service, or whatever it may be. We've got some examples of different issuers there. Let's Encrypt is a big one. The big one; I'll talk about that in a sec. HashiCorp Vault is also very popular. A lot of people are managing all of their most secure secrets in there. Venafi TPP is another option that you see a lot of people, especially enterprises, using. But there's more than that. As Jake alluded to, there are external issuers as well, there's all kinds of things. I actually had a guy come up to me yesterday and ask about an issuer that I'd never personally heard of, but we were able to find a cert-manager issuer that someone had written for it. It's always cool to see that. In the middle, there's cert-manager, right? cert-manager will talk to whatever issuer you ask and try to get the certificate that you request, and you specify that certificate in a different CRD. It's really pretty simple from this high-level overview. The end goal, ideally, if everything went well, which usually it seems to, is that you end up with your certificate inside a Kubernetes Secret, and you can use it however you want to use it. That might be identifying a service, like a website or something, Web PKI. That might be identifying a client for mTLS or something like that. But once the certificate's there, you can use it and you can hopefully rely on it being renewed.
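For anyone reading along who does want to see roughly what that overview looks like in practice, here is a minimal sketch of an issuer and a certificate, written as Python dictionaries that mirror the manifests you would apply to a cluster. The names (`example-issuer`, `example-com`), the `demo` namespace, the domain and the durations are all hypothetical, and the helper functions are only for illustration. The field layout follows the `cert-manager.io/v1` API as we understand it, so treat it as a starting point and check it against the documentation.

```python
import json


def make_issuer(name, namespace):
    """Build a self-signed Issuer manifest, the simplest kind of issuer."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Issuer",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"selfSigned": {}},
    }


def make_certificate(name, namespace, dns_names, issuer_name):
    """Build a Certificate manifest that asks the named Issuer for a cert."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The issued key and certificate end up in this Kubernetes Secret.
            "secretName": f"{name}-tls",
            "dnsNames": list(dns_names),
            # Illustrative values: a 90 day certificate renewed 30 days early.
            "duration": "2160h",
            "renewBefore": "720h",
            "issuerRef": {
                "name": issuer_name,
                "kind": "Issuer",
                "group": "cert-manager.io",
            },
        },
    }


issuer = make_issuer("example-issuer", "demo")
certificate = make_certificate(
    "example-com", "demo", ["example.com"], issuer["metadata"]["name"]
)

if __name__ == "__main__":
    # Print each manifest as JSON so it can be inspected or saved to a file.
    for manifest in (issuer, certificate):
        print(json.dumps(manifest, indent=2))
```

The certificate only refers to the issuer by name, which is the separation described above: one resource says where certificates come from, the other says which certificate you want.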
If you don't care about this overview, we kept the very simple "just annotate your ingress and you get a certificate" workflow, which we think 90% of people are using. Yeah, it's very popular. I mentioned the ACME stuff, and ACME is by far the most popular issuer as far as we can tell. We're not collecting telemetry or anything, so we don't really know unless people tell us. But certainly ACME seems like the main one, because the certificates are publicly trusted. That tends to take the form of Let's Encrypt. That's the most popular, and it's super awesome. They got the ball rolling on making TLS certificates more obtainable for everyone. But, like I said, we have all these other issuers as you might need them. It's automated, and we drive that point home, and it's extensible, which we also want to drive home. We don't want to leave anyone out in the dark. That's the power of an open source project: we can just define those APIs and people can go and run with them. There's at least one person I spoke to this week who's running an entirely private PKI, and they've written an external issuer just for their private internal organisation PKI, and that's totally how they do it, right? A huge part of the current status of the project is obviously security. Given the state of Kubernetes generally and the aims of the project, security is pretty important. We really need to focus on making sure that the private keys are actually kept private and all that stuff. We do view cert-manager as being critical security infrastructure, and we would argue that if you're using it in production, or even in test environments, then you should probably view it the same way. It really is important, and if something goes wrong, people can do horrible things like impersonating your services and your clients, and nobody wants that. Towards that end, we're actually pretty early adopters of Project Sigstore, which we believe has had a v1 release this week, so yay to them.
We use Cosign from Sigstore to sign all of our container images. You can validate those when you download them. All the details for that are on our website, and we would encourage more people to do that, because it's just an extra reassurance that you're actually getting your containers from us and not someone random. We have seen third-party mirrors of our container images on the web, and you've got no way of knowing if they've been tampered with, and obviously you don't want your certificates to be tampered with. So we would encourage more use of that. There's a certified US government container, and we have no idea where it came from. It's not by us though, so whether you trust that is up to you. We hope you trust us. We're also self-certified, we're saying, as SLSA 2, and SLSA, said "salsa", is the best-named project ever, I think. It's a way of specifying how much time and thought we've put into our supply chain security. How can you verify that the things you get are actually from us? We think we're at level 2, and I think we're nearly at level 3. The differences between them don't really matter for this talk. We're not talking about SLSA here, but generally it's a pretty good indicator of us taking the time to look into this, and if you're interested in that or you've not heard of SLSA, I would recommend checking out slsa.dev for more information. Of course, another aspect of security is actually asserting things like policy about the certificates that you're issuing, which leads nicely into the next part. Yeah, so obviously we've got a lot of production users, and I'm just going to briefly say that cert-manager is not just the cert-manager certificate renewal operator. It's an entire project, and anything that interacts with TLS workflows in the cloud native space can live under the cert-manager GitHub org. So one of the things that we've put there is a policy engine.
So with cert-manager, if you just deploy it, it will approve anything in your cluster, which is probably not what you want. If you happen to have a cluster issuer that can sign for your production web services, anyone could just make a certificate in their namespace, reference that issuer, and cert-manager would just sign it. Someone has now taken your security structure and ruined it by getting a valid certificate. So yeah, we have a policy engine. It's fairly simple: it matches on DNS names or common names, so it answers "is this person allowed to issue this certificate?" The requester information is embedded into the certificate request, so you can say that only this service account in this namespace is allowed to sign our production certs. If you're using cert-manager for your credentials, like mTLS credentials, you probably want them short-lived. Right now we support certificates down to one hour in duration and have them automatically renewed every 20 minutes or so. One hour is not quite short enough, but that was the one mistake we made in our v1 API, so we're not sure whether we should change it or not. We're trying to get it lower. And of course there's the private key rotation policy. If you can, you should always rotate your private key when you reissue your certificate. If you remember Heartbleed, a bunch of private keys got leaked and everyone had to reissue their certificates, and a very large percentage of certificates got reissued with the same private key, completely invalidating the point of reissuing them. So make sure you always do that. But it turns out that some legacy apps rely on the private key staying the same, so we allow you to turn it off, and we're sad about it.
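To make the policy idea above a bit more concrete, here is a rough sketch of the kind of check being described: is this requester allowed to ask for these DNS names? This is not the policy engine's actual implementation or API. The `Policy` class, its fields, the `is_approved` function and the example names are all hypothetical, and the glob matching uses Python's `fnmatch`, where `*` also matches dots, which is looser than real wildcard certificate rules. The `system:serviceaccount:<namespace>:<name>` form is how Kubernetes names service account users.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: who may request certificates for which names."""

    allowed_requesters: tuple
    allowed_dns_patterns: tuple


def is_approved(policy, requester, dns_names):
    """Return True only if the requester and every requested name are allowed."""
    if requester not in policy.allowed_requesters:
        return False
    if not dns_names:
        # Nothing to match against, so deny by default.
        return False
    for name in dns_names:
        # Every requested name must match at least one allowed pattern.
        # Note: fnmatch's "*" also matches dots, so "*.prod.example.com"
        # would accept "a.b.prod.example.com" in this sketch.
        if not any(
            fnmatchcase(name.lower(), pattern.lower())
            for pattern in policy.allowed_dns_patterns
        ):
            return False
    return True


# Hypothetical policy: only the "web" service account in the "prod"
# namespace may request certificates under prod.example.com.
prod_policy = Policy(
    allowed_requesters=("system:serviceaccount:prod:web",),
    allowed_dns_patterns=("*.prod.example.com",),
)
```

With that policy, `is_approved(prod_policy, "system:serviceaccount:prod:web", ["shop.prod.example.com"])` is allowed, while the same request from a service account in another namespace is denied, which is the "anyone could just make a certificate in their namespace" problem being closed off.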
We already talked about integrations, but every part of cert-manager is modular. It can be disabled and you can drop in your own component. So our approval policy is pretty good if it does what you want, but if you have a really complicated internal security posture, you might want to put your own approver in. We had an example with OPA at some point; I think we still have it. As a cryptography person, I'm super interested in things like policy which can say, hey, you must have a private key for your certificate which is of an acceptable security level. You don't want people issuing keys that are going to be broken in like a week on a Raspberry Pi. This stuff is important. We've got lots of integrations with other cloud-native projects. We can replace Citadel in Istio, so cert-manager can be your CA as well. We've got really good integration with Linkerd, and I think you gave a workshop on it. If anyone wants a link to the workshop, I can provide that after the talk. While we started with Ingress, we have support for SIG Network's Gateway API, and there was a really good talk yesterday on the Gateway API. I don't know if people went to that. Shout-out to Jake for adding all that support, because it seemed like a really hard piece of work and he nailed it. Thanks. We have rich Prometheus metrics, which is really important for observability. If you have hundreds of thousands of certificates in your cluster and you're checking them, cert-manager will tell you when a certificate is due for renewal, and you could alert on the certificate going past that time without being renewed, because then something has gone wrong. That's the asterisk from earlier, when we said we promise that your certs will get renewed and you won't go down because of it. You have to check the logs for it as well. It could be someone else's fault. If it was a paid-for issuer, you could have run out of money. That's true.
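The alerting idea above, catching a certificate that has gone past its renewal time without being renewed, can be sketched like this. It assumes renewal is scheduled a fixed `renew_before` interval ahead of expiry; the function names, the five minute grace period and the timestamps are all made up for illustration, and in practice the inputs would come from whatever your monitoring exposes rather than from hard-coded values.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before, not_after, renew_before):
    """Return when renewal is due: renew_before ahead of the expiry time."""
    lifetime = not_after - not_before
    if renew_before >= lifetime:
        raise ValueError("renew_before must be shorter than the certificate lifetime")
    return not_after - renew_before


def is_overdue(now, not_before, not_after, renew_before, grace=timedelta(minutes=5)):
    """True if the renewal time plus a grace period has passed.

    This assumes the caller passes the validity window of the certificate
    currently in use, so "overdue" means it has not been replaced yet.
    """
    return now > renewal_time(not_before, not_after, renew_before) + grace


# Hypothetical one hour certificate that should be renewed 20 minutes in,
# that is, 40 minutes before it expires.
issued = datetime(2022, 10, 28, 9, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=1)
renew_before = timedelta(minutes=40)
```

Ten minutes after issuance this reports nothing wrong; thirty minutes after issuance, with the same certificate still in place, `is_overdue` returns `True`, and that is the point where an alert is worth firing.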
We also have two CSI drivers. A sticking point sometimes with enterprises is that they don't want secrets stored in the Kubernetes API, and a normal TLS key pair has a private key in it. We have a regular CSI driver which will expose the certificate to your pod, but the private key will only ever be in memory. We also have a CSI driver that implements the SPIFFE X.509 standard. I think there was a talk on that yesterday as well. That's the thing about coming last: you can reference all of the previous events. Maybe at the next KubeCon we'll be able to talk about more of these things. We have a lot of integrations, as we just said, and we really, really want to keep on doing this, because it's such a great way of showcasing what we can do and what other projects can do. I don't think it's crazy to say that a lot of projects have a need for some sort of certificate management. It comes into a lot of things just because TLS comes into a lot of things. Ideally it's used everywhere nowadays. So if you have an idea, if your project maybe could integrate with cert-manager, we could have a blog post or a workshop or a talk in Amsterdam maybe about it. We'd love to talk to you about it and maybe work together, who knows. So please do drop by and chat to us, especially at our booth, which we'll talk about a bit later, if you've got any ideas on that. Cool, so that was the present, and now we can move on to the future. We keep talking about our integrations, but the CNCF landscape is growing every year. There's lots of meme templates about it. We want to integrate with any CNCF project that talks about TLS certificate management, especially other service meshes. There's a couple of posts about Cilium which we're excited about.
I'm also personally very interested in SPIFFE and SPIRE. While we support a very small part of the SPIFFE spec in our CSI driver, it would be nice to at least implement the Federation API, so we could actually join cert-manager and SPIRE trust domains. A lot of work recently has been going into the developer experience. There's a small number of core contributors, but we all have other jobs, so we need to make it as easy as possible for the community to get involved in the project. Recently there's been a lot of work on fixing our flaky CI, so now if you make a PR and the tests fail, it's because you didn't test it, not because our test infrastructure is terrible. At least in theory. We also participate in Google Summer of Code every year, and recently Google Season of Docs, so we're really trying to bring new contributors in and get them up to speed with cloud native. There's lots of things on our roadmap. We just threw some examples on the slides; if any of them jump out at you, please reach out on Slack or anywhere. Further to that, we are pretty convinced that roadmaps are crucially important for an open source project, because for people to rely on that project they need to know what's coming in the future. The way we're achieving that is to have a roadmap available in the main cert-manager repo, cert-manager/cert-manager.
We can't promise any timelines or anything because, as Jake said, we have other jobs and there's only so much time in the world for adding all these features. But we do have a sort of aspirational list of the kinds of things we'd like to work on. If there's something there that would be really useful for you, a plus one would be appreciated. Talk to us about it. If you think there's something missing on that roadmap that we should add, we're happy to chat. I'm saying that a lot, but we are really genuinely happy to chat. Another thing on the future: this is kind of becoming my baby and I'm super enthusiastic about it. This top line, I've said it so many times already at this conference, but getting a certificate is only half the problem. I think I'm going to get that tattooed across my chest. What I mean by that is, it's all well and good if you've got the certificate that identifies your website. That's great. If you use ACME, clients will trust it automatically because it's a publicly trusted cert. They probably already have a bundle of certificates on their machine which will allow them to validate that cert. That's great. But the other half of the problem is, generally, how do clients know how to trust certificates? I think it's an area that, as an industry, we're not looking at enough. There is upstream work on this kind of stuff, so clearly people are looking at it, but I think we really should be looking at it more. Without trust, none of TLS works, and getting the certificate doesn't really do much good if you don't know how to trust it. By default in Web PKI you trust pretty much every CA in the world, and you don't know if any of them have been compromised. Or maybe you don't actually trust a certificate from a country that you're at war with. There's all kinds of spooky things going on here. This isn't a Halloween-themed talk, but there is some scary stuff going on, and we'd like to help solve the problem if we can. I mentioned the upstream work, but as a non-core project we can move fast, discover things, learn things, and hopefully work with people on this. If you're interested in a bit more of a Halloween scare, I did a talk a few weeks ago where I go into trust-manager, which is what we hope will be our solution to this problem, the kinds of things that we think about, and how you might want to consider what could go wrong in your cluster. So yeah, please feel free to scan that QR code. The slides will be available; they've been uploaded. Thank you, Jake, for remembering that, because I would have forgotten. So you can get that there as well. If you fancy a fright, there you go. Another thing that we'd really like to highlight is that we do get a lot of community engagement on this project. We mentioned the 9,500 GitHub stars. We get a lot of issues and pull requests and messages on Slack and emails. That's great. We're not saying we don't want that; the community engagement is fantastic. But there's a lot of work to do and, as we said, we have jobs, and we'd love to get more maintainers. I'm sure this is standard for a lot of projects, but if you think, hey, certificates are interesting, maybe I'd like to get involved in some way, that doesn't have to be code. It could be docs, it could be community stewardship, which is a tricky thing to say on a Friday. Then yeah, please do get in touch. Again, you don't need to have any experience in doing this already. There's no barrier to entry that you need to clear. We can help you out and we want to talk to you about it, so please do reach out. I've said that so many times. One great way of reaching out is to visit our booth. Obviously it's Friday morning, so people might be going home, but we're going to be there all day today. I'll be there most of the afternoon. I'll try and find some time to eat something, but yeah, please do drop by the booth to talk about anything. Some people just come by and say they like cert-manager, and that does wonders for my ego, so please do that as well.
Maybe more of a draw than inflating my ego further is to get your own souvenir certificate. This is a super cool thing. We print out an actual X.509 certificate on a little piece of card, and we do a little wax stamp with a cert-manager logo. It's so much fun. We've done tons of them. Please come and get your own, and please use it to cryptographically verify that you were actually here at KubeCon. No one will trust you otherwise. It really is so much fun. Also so much fun is this line, which I'm so pleased with myself for: we'd love to have a TLS handshake with you in person, or an elbow bump if you'd prefer. So yeah, please do come along. We'll be there until the end of today. We threw loads of links on this slide. Obviously you can't click them in person, but the slides will be available and we encourage you to check out all of this. I'd highlight as well the meetings and the Slack channel. We have tons of ways that you can reach out to us, and our meetings are public. If you're US based, I was going to say fortnightly meeting, but that's not a thing that people say here. The meeting that we hold every two weeks works for US time zones, though the stand-ups are early morning in EU time zones, so whatever works best for you, but please feel free to join. There's also an email address if you want to go a bit more old school than Slack. That's totally cool too. We do check those. So yeah, we'd like to thank you all for turning out today. As we said at the beginning, there were obviously the parties last night. I had a very intense game of big chess with someone in the tent outside. Thank you for turning out. It's been great to have you all here. There's a feedback link here in this QR code. We'd love to take any questions that people might have, and if you don't get time for a question or you'd rather ask in person, again I'd direct you to the booth or Slack or any of the methods I've just talked about for like five minutes. So thank you all so much. Now's the moment of truth. Any additional questions for the gentlemen? You've blown them away. Thank you guys so very much. Thank you.