Hey, I'm Tomas, I work for Red Hat on the OpenShift team, and even though it may sound like it, this project is not actually connected to what I do there. It started a bit over two years back, something like half a year after Let's Encrypt went GA. I was running this small web server at home, an nginx reverse proxy and a few containers using Docker Compose, and learning from my mistakes and all the downtime, I was trying to move to Kubernetes and OpenShift. At the time I was using the usual self-signed certificates, and it was really a pain to explain to someone how to import them and all the stuff connected to it. But those were my pet projects, I didn't really get paid for them, so I just couldn't afford to buy a certificate from a certification authority. And then Let's Encrypt came, and they said they would give me a certificate for free, and all I have to do is expose one token at a very specific path on my website, and that's it. So I was like, well, I want this. It's free, and it seemed very easy. And then I read the next paragraph, which stated that the validity of those certs is only 90 days. Assuming I would renew those certificates somewhere between one half and one third of the lifetime, that makes it something between 30 to 60 days, and I'm not going to commit myself to doing this manually every two months all over again. I'm an engineer; I'll write a program to do it for me.

Well, learning from my previous mistakes, I actually started by searching for other projects that do it, because someone has almost always thought about it and done it before you. So I found this one project called kube-lego. The "lego" stands for Let's Encrypt Go, which is, by the way, a great language to write something like that, and it was only for Kubernetes. So I kept searching for something meant just for OpenShift, because OpenShift has routes and Kubernetes has ingresses, which don't differ much, but you still want native support. And the only thing I found was this one little bash script, which I neither had the confidence to run to manage my certificates, nor do I think I could actually fix it if something went wrong. So I went back to the kube-lego project (by the way, it's now called cert-manager, I think, or something like that), and I started looking into how it works. And the first step they asked me to do was to replace my router with their own one. Well, it was an ingress controller, but that's just different naming between Kube and OpenShift. Given I had some operational experience, I knew this was never going to fly with the admin managing the cluster, because he won't just swap his battle-tested router for this special one just because I need it for my app. Moreover, I wanted to run this on OpenShift Online, which is this big shared cluster, like thousands of nodes, where I don't even know who the admin is, but I know for sure what his answer would be if I asked him to exchange his battle-tested router for this piece of code. Even more, the router doesn't actually have to be a pod running in the cluster. It can be hardware, like an F5, sitting somewhere in the data center, and I for sure can't ask the manufacturer to put a shim in there to make my project work. So this was a deal-breaker, and it's what actually made me start this project and write it with a different approach.
So I was trying to find a solution for how to expose the token without modifying the router, and in a pretty seamless way. There is this one feature in OpenShift and in Kube called path-based routing. What it means is that if you have two routes with the same host that differ only in the path, the most specific one wins, which is exactly what we want here. We want to steal just the traffic that Let's Encrypt will send back to verify the token, redirect that part of the traffic to our controller, where we expose the token, and have no downtime for your application, which is exactly what path-based routing gives you.

So this is the current architecture of the OpenShift ACME controller. As I said, there are the two routes, and based on path-based matching the router will split the traffic: the normal traffic will go to your pods the whole time, like nothing happened, and only the traffic for one very specific path, /.well-known/acme-challenge/ plus a hash based on the token that Let's Encrypt wants to validate, gets sent to our controller. It's a pretty simple idea, and you don't need any modifications to the cluster if you do it this way.

Well, then I actually started implementing it, and it turned out you can't just direct traffic for a route to another namespace, which is a bit problematic, because our controller is running in a different namespace. And we can't just create the route in the namespace where the controller is, because there is this rule that only one namespace can claim a host; otherwise, in a multi-tenant environment, users could just steal each other's traffic, which wouldn't really work, right? So we need to create the route in the namespace where the actual user route is. I found this little trick where you also create a service there, a headless one. The thing is, in Kubernetes, a headless service like this is a service that selects no pods. Normally a service is composed of two objects in Kubernetes: one of them is the Service, and the other one is Endpoints. Endpoints is the object where, under normal circumstances, when you are actually selecting some pods, the internal Kube controller puts the IPs of those pods. Here we don't get it created, and that's actually the trick: our controller introspects its own IP address and creates its own Endpoints object next to the service, and that's how you direct the traffic to another namespace. So that's the bit of magic behind the controller and how it works. Also, I forgot to tell you, the ACME controller has an embedded HTTP server which it uses to expose the token. This works great; well, there was this one bug where I actually put the wrong IP in there, but it's fixed now.
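To make that concrete, here is a minimal sketch of the three objects the controller would create in the user's namespace. All names, ports, the host, and the IP are hypothetical values picked for illustration; the challenge path is the real ACME one.

```yaml
# Route with the same host as the user's route but a more specific path,
# so path-based matching steals only the ACME validation traffic.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: acme-challenge            # hypothetical name
  namespace: user-app             # the namespace that owns the host
spec:
  host: www.example.com           # same host as the user's route
  path: /.well-known/acme-challenge/
  to:
    kind: Service
    name: acme-exposer
---
# Service with no selector ("headless" in the talk's sense):
# Kubernetes will not create an Endpoints object for it on its own.
apiVersion: v1
kind: Service
metadata:
  name: acme-exposer
  namespace: user-app
spec:
  ports:
  - port: 80
    targetPort: 5000              # assumed port of the embedded HTTP server
---
# Endpoints the controller creates by hand, with its own introspected
# pod IP, pointing the service at a pod in a different namespace.
apiVersion: v1
kind: Endpoints
metadata:
  name: acme-exposer              # must match the service name
  namespace: user-app
subsets:
- addresses:
  - ip: 10.128.2.42               # example controller pod IP
  ports:
  - port: 5000
```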
But while this works great, it turns out the contributor experience kind of sucks, because if you want to run this controller on your own laptop, you need to somehow get the traffic from the cluster to your laptop, so that Let's Encrypt can reach back to your controller, which is exposing the token, and validate it. There is this command in Kubernetes called kubectl port-forward, which sounded like a great way to do it, except it can only forward traffic the other way around: from your laptop to the cluster, which is the opposite of what you want. So to actually develop on this project right now, you have to create an SSHD pod in place of the ACME controller and SSH into that pod from your localhost, because SSH can port-forward traffic both ways. This actually works, but I'm not proud when I have to explain it to a new contributor. So I came up with this new design; well, it actually revives one of the old designs that competed with the one we have right now, and it is to run a pod at the place where we create the route. I didn't want to do it at the start, because actually running pods is kind of complicated and you have to handle all the error conditions and cases that can go wrong, but given I now maintain a lot of other controllers in Kube and OpenShift, I think I can avoid a lot of those beginner's mistakes. This is something I already have working in my own branch. It still needs a bit of development time, but it will make the developer experience of contributing to this project much better. So if after this talk you want to send a PR, just bear with me, the experience is going to get better, right?

So this was the architecture. The controller will provision the certificate, validate it with Let's Encrypt, and put it into your route. It will also sync the certificate into a secret matching the route's name, or you can override that with whatever name you want. So you can mount the secret into your pod and gain end-to-end encryption (there is a small sketch of that below), and you can have passthrough routes carrying a different protocol than just HTTP and HTTPS. Say I want something like IRC to be encrypted: it's another protocol, but you can make it work with a passthrough route. That's why we sync the secret.

So I was maintaining this project, I actually became part of the OpenShift team shortly after, and I started pushing things, and at this point we are set to become part of the OpenShift GitHub organization; the move is going to happen shortly after we get CI working. As it turns out, I spent like half of the time on this project trying to get CI working, or at least trying to. It turns out this is pretty hard if you don't have any budget: you need to spin up OpenShift clusters, you also need the cluster to be publicly accessible, and all of those things you normally pay for. So I started hacking stuff around. At the beginning it was just a Travis instance, and I was using this tool called ngrok to route traffic from a random ngrok subdomain into the cluster to make it all work. But it was spinning up clusters with oc cluster up, which is a separate installer from how you install OpenShift clusters normally, and it doesn't install the cluster fully, with all the settings. That's one of the reasons we are now trying to move to the OpenShift CI. I opened the PR a few months back, and I was trying really hard for the past week to make it work for this talk, but it seems like GCE hates me, and it's killing the connection if you try to reach your own router from inside the cluster. This is not actually specific to this project running in the OpenShift CI; it's the CI OpenShift cluster being broken, but we'll fix it. So if you see some PRs waiting there, I'm sorry, and I'm working on it.
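Here is that secret-mount sketch: a minimal pod spec assuming a hypothetical app and route both named hello-app; the only behavior taken from the talk is that the synced secret's name defaults to the route's name.

```yaml
# Pod mounting the certificate secret the controller synced from the route.
apiVersion: v1
kind: Pod
metadata:
  name: hello-app
  namespace: test
spec:
  containers:
  - name: server
    image: example.com/hello-app:latest  # placeholder image
    volumeMounts:
    - name: tls
      mountPath: /etc/tls                # the app reads tls.crt and tls.key here
      readOnly: true
  volumes:
  - name: tls
    secret:
      secretName: hello-app              # defaults to the route's name
```

With the key and certificate inside the pod, the application can terminate TLS itself, which is what makes passthrough routes (and non-HTTP protocols like IRC) work.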
All right, this all sounds great, right? But I bet most of you are people who just want to use it, and I think the best part about this project is how simple it is for you to set up. It's like the killer feature. You have a route, right? And all you have to do is put this one annotation on the route. That's all. The controller will provide you with the certificate, manage the validation, manage renewals. You just put this one annotation there and you never ever have to take care of your certificates again; you just get the green lock in the browser. The annotation is kubernetes.io/tls-acme with the value "true", and that's all you have to do (you can see it in a full route manifest in the sketch after this section). So I bet this sounds great, right? I actually have a demo showing this, so let's do it.

I pre-recorded this, sorry; I kind of managed to break my networking while working on OpenShift. This actually even shows how you deploy the controller: it's like two lines of oc create to create the deployment and the other stuff you find on GitHub; there is a great README showing you how to do it all. So you just deploy the controller, and we check the log to see that it's up. It looks like any other Kubernetes controller: all the shared informers and all the fancy API machinery you would use in a proper controller. Then we create a new project called test, to deploy our application there. It's just a very simple application returning "Hello Universe", because Hello Universe is so much better than Hello World, right? And we expose this simple HTTP server using a route. So this is your project, basically, just a very simple one. (Okay, I was typing slowly.) Then we validate it: we get the SSL warning, because we don't have a trusted certificate there, just the default one from the router. And this is the thing you have to do: you can do it from the command line, or you can write the annotation in the editor, but it's just metadata.annotations, and you set the value I described earlier. That's all. Now we just watch the events, because the controller communicates with you using events, so you have a nice way to find out what's happening, and it's telling us it has provided the route with a new SSL certificate. And as you can see, we no longer get the SSL warning, because this certificate is trusted by the system and even by the web browsers. So with that, I hope I will never ever see a website running on OpenShift with an untrusted certificate again. Right? Thank you.

Seems like I made it, so we can do Q&A.

[Audience question about securing the management console.] I wish our management console were actually exposed by a route, then you could just annotate that route with this. Oh, you can expose it by a route? Okay. I think it's now on the API server; actually, with 4.0, I think there is a route, but I'm not sure. But you can always create a route and direct the traffic to where it was going before. I actually plan to do this even for the control plane itself, because right now this secures just applications, and there is this whole control plane underneath. The two biggest worries are the console and the API server, which are the ones usually exposed publicly. And in 4.0 it will get much easier to set those certificates, because they will be Kubernetes objects; they will likely be a secret.
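This is the sketch mentioned above: the single opt-in annotation as it would appear in a route manifest. The route name, namespace, host, and service are hypothetical; the annotation key and value are the real ones from the project.

```yaml
# The one line under metadata.annotations is all the user adds; the
# controller then handles provisioning, validation, and renewals.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: hello-app                   # hypothetical route name
  namespace: test
  annotations:
    kubernetes.io/tls-acme: "true"  # the single opt-in annotation
spec:
  host: hello-app.example.com       # placeholder host
  to:
    kind: Service
    name: hello-app
```

The same effect can be had from the command line, for example with oc annotate route hello-app kubernetes.io/tls-acme=true.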
And I plan to add secret support for this, so we can annotate those secrets the same way you do it for routes, and those secrets will be injected with the certificates (there is a sketch of what that might look like at the end). The control plane will update immediately, and you will have valid certificates even for the API server and whatever else runs underneath that is usually publicly exposed, because for the internal components, like the kubelet and the controller managers, there is this chain of trust inside.

[Audience question, inaudible.] Could you speak a bit louder? Oh, yeah, behind a proxy. You mean separately from the cluster? Oh, yeah, yeah, I know what you mean. There is actually a PR open for that; I'm waiting for the CI to get working, so I don't break anything. Inside, it's actually just the Go HTTP client, and it supports proxies; we are just not setting the variable, which is a very simple fix. So for now, yeah, you need to have the cluster publicly accessible or redirect the traffic in there somehow.

[Audience question about other validation types.] So, Let's Encrypt actually supports more types of validation. This is the one they started with, but they have added DNS validation now, and it's one of the features I want this project to have: you can prove you own the domain by inserting a record into DNS. But it's a bit harder to implement, because you have to support all the providers, so you end up doing an interface, say something like webhooks, and implementing a few of the providers; if people have different ones, they can contribute, or just use it on their own, because there will be an API they can use. So yeah, it's not a very simple change, but I want to do it, because with the DNS challenges Let's Encrypt will give you wildcard certificates, and that's what you need for the internal router: if you want to provide certificates for it, you need wildcards. So yeah, it's on the roadmap; help is always welcome.

Are there more? Okay, I guess. Thank you. Have a good one. Enjoy the rest of the event.
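As for the sketch promised above: the secret support is only planned, so nothing here exists in the project yet. Purely by analogy with routes, and with entirely hypothetical names, annotating a serving-cert secret might look like this:

```yaml
# Hypothetical: a secret opted in to ACME management; the planned
# feature would inject the provisioned certificate and key here,
# and the control plane would pick them up immediately.
apiVersion: v1
kind: Secret
metadata:
  name: api-serving-cert            # example name, not a real object
  namespace: openshift-config       # example namespace
  annotations:
    kubernetes.io/tls-acme: "true"  # assuming the same annotation as for routes
type: kubernetes.io/tls
data:
  tls.crt: ""                       # would be filled in by the controller
  tls.key: ""                       # would be filled in by the controller
```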