Who's ready for a week of arguing about how to pronounce kubectl? I've pronounced it the correct way, the one correct way, so y'all can head home early. I'm Jeremy Eder. You may know me from my past work in the community, but I'm working in Red Hat Service Delivery now, the team that's responsible for running OpenShift as a service. Today I'm going to talk to you about how we do that. Maybe it comes as a surprise that Red Hat has SRE folks. I traded my performance engineering shirt for a site reliability engineering shirt, and everything else remained the same, kind of.

So here's the first "did you know." By the way, who's old enough to remember when ESPN, at the end of every SportsCenter, used to have a "did you know" moment? That's the shtick. Did you know we have fully managed OpenShift? You can choose, right now (sorry, thought that had a lapel mic), whether you want to run that on AWS, which is called OpenShift Dedicated, or go with the offering we run with Microsoft, called Azure Red Hat OpenShift. Who's heard of ARO? Yep. And of course, you can always manage OpenShift yourself.

Another did-you-know: one of the talks earlier mentioned someone running OpenShift 2. Some of our team have been around that long: running OpenShift 2, then OpenShift 3, and now OpenShift 4. So there's a fair amount of history, even though OpenShift has changed fundamentally several times along the way, as you're aware.

Next did-you-know: right now, the only place to get OpenShift 4 in a managed way is OpenShift Dedicated. Who here has software as a service as part of their portfolio? I want to say that's over 50% of the group, so that's good to know.

Our team is not only responsible for SREing customer clusters; we also run some of the fundamental services that underpin a lot of the value you saw Derek and Clayton talk about this morning. In particular, Telemeter, which backhauls data to us for analysis, proactive troubleshooting, and whatnot. We also have a SaaS front end called cloud.redhat.com, which I think showed up at last year's Summit for the first time; it might have been the previous one. Has anyone seen cloud.redhat.com? A couple of people, 20, 30%. I'll show it to you in a second if you haven't been there. And then, maybe more importantly, and what I think is one of the coolest things we're doing right now: we're building a way for customers to interact with Red Hat via API. I'll show you a little command line utility that we have to stand up OpenShift clusters that are managed by us, and we have a Go SDK for it as well.

So what does it take to run OpenShift as a service? First of all, we're foundationally bought into the fact that operators and the design of Red Hat CoreOS allow us to scale OpenShift as a managed services business. Clusters are self-driving and self-healing, for the most part. That's a very fundamental bedrock change. What does it take to actually run OpenShift 4? The platform itself is pretty snazzy: day-two operations live in operators, and our team has gone all in on that. There are at least eight operators we've written, maybe more, to help operate OpenShift 4 in a managed way. I'll go through them in a minute.
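If you want a feel for that API and CLI before the demo later, here's a minimal sketch. I'm assuming you've pulled an offline token from cloud.redhat.com, and the exact flag names may differ between versions of the tool, so check the built-in help:

    # Log in to the OpenShift Cluster Manager API with an offline token
    # (grab one from cloud.redhat.com; token handling here is illustrative).
    ocm login --token="$OCM_TOKEN"

    # Show the account you're logged in as.
    ocm whoami

    # List the clusters this account manages.
    ocm list clusters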
Advanced monitoring. The upstream monitoring folks, the Prometheus team, which we work very closely with, as you can imagine, take our feedback and ultimately roll it into the product. That feedback loop is very important to us, because sometimes there are either bugs in the wild or just gaps in whatever monitoring we have, and we send those changes back to the monitoring team for inclusion in OpenShift, the enterprise bits. Of course, we collect telemetry data from all these clusters. And something that cannot be overstated: the interface between a cloud provider and OpenShift is an area of great importance. Man, that cloud can change out from underneath you at any given moment. It can be having a bad day. We have to harden our platform to handle a cloud provider's bad day.

So, API-driven OpenShift cluster management. I'll show you the guts of this in just a second, but here's how it's strung together currently. If you've heard of GitOps-based cluster management, you may have heard some of that this morning. Our team's been doing GitOps for many, many years now, and bringing that to OpenShift would certainly lighten our load quite a bit. Anyway, here's what it looks like architecturally. cloud.redhat.com/openshift is a set of microservices. One of them, called cluster service, handles standing up the cluster. The second microservice, AMS, just handles your subscription stuff. Not too glamorous, but highly important. What the cluster service does is render what's called a cluster deployment, a custom resource that we've invented, for a set of controllers called Hive. Hive is how we spin up OpenShift clusters: its job is to talk to the cloud provider API and ultimately run the OpenShift installer. It can install any version of OpenShift 4, so 4.whatever. From there, we have a cluster. We don't do anything to OpenShift itself in the front end; OpenShift 4 principally wants you to do everything as a day-two operation, and that's where those operators come in. So we have a way to create that cluster, wait until it's done, and then lay down a bunch of operators. And then we have our OpenShift Dedicated, or managed OpenShift, product. That's how that works. We good so far? I'll show you a cluster deployment YAML file in just a sec; that'll hopefully help clear things up.

Here are some of the operators that the managed services folks on the OpenShift product team have written. On the left-hand side of the screen, you'll see we have a centralized back end that handles common operations that aren't tied to the metadata of any particular cluster. Hive, as I mentioned, just runs the OpenShift installer. We have integration with PagerDuty so that we can actually put SLAs on these things. We lay down certificates: whenever you install OpenShift 4, or OpenShift 3 for that matter, and go to the management console, you get a self-signed certificate. So we wrote an operator that goes out, fetches a Let's Encrypt certificate, attaches it to the ingress and to the console, and then refreshes it every, I think it's 45 days, so you don't have to worry about certificates expiring. The last one is a bit of DNS stuff that we're doing. If you've been to try.openshift.com, you'll see it's going to ask you to create a DNS domain. And it's always DNS, by the way, if anyone's ever been in SRE. I'm sure you're well aware of that.
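Just to make that certificate piece concrete before we get back to DNS: here's a quick way to see what that operator hung on the console. This is plain openssl, nothing product-specific, and the hostname is made up for illustration:

    # Inspect the cert served by the console route (hostname is illustrative);
    # the issuer should show Let's Encrypt rather than a self-signed CA.
    HOST=console-openshift-console.apps.mycluster.example.com
    echo | openssl s_client -connect "$HOST:443" -servername "$HOST" 2>/dev/null \
      | openssl x509 -noout -issuer -dates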
We currently use Dyn for DNS, but the point is that we take care of that DNS juggling for you. So that happens on the shared side. On the right-hand side of the diagram, there's a handful of operators that run on each cluster: one that lays down security, one that lays down some config changes that we make. We do a couple of things, and then those get folded back into the product. Or maybe a change is too specific to us, so it never gets folded back into the product. So we've got this customization layer; we call it the managed config layer. That guy will, for example, bump up system-reserved to a gig of memory and one CPU, which is something we do. We also run backups of etcd using Velero, if you're curious. And then we have a set of custom alerts, which I mentioned earlier. So that encompasses what you get with OpenShift Dedicated at the moment. When we release ARO running OpenShift version 4, it will look something like this: certainly we'll use operators, maybe with a slightly different architecture and whatnot, but ultimately the same kind of end goal.

Let me see if I can switch screens here. Since this is glass, I can't use my mouse, and I'm so anti-touchpad, but let's give it a go. Can you see? Is that font big enough? OK. I fetched this code already, lest I be yelled at by Diane. We have a command line utility called ocm. It's on GitHub; it's the OpenShift Cluster Manager CLI tool. You can do a bunch of things to clusters here: you can look at your account information, you can create clusters, of course, and to a certain extent you can modify the configuration of those clusters. Real simple. But it may seem simpler than it is; actually, this is quite a big deal, and here's why. Interacting with a vendor via API is fairly commonplace these days. AWS, GCP, everybody's got their own SDK. And here's an SDK that Red Hat can offer you for managed services. That's actually a first, and pretty cool.

OK, so I've got quota for 15 multi-AZ clusters, and, well, I've got three single-AZ clusters running right now. So I can find out a little bit of that information. Let's see here. I'm going to create a cluster. It takes a half hour to install, so I've already created one. Super simple arguments. I'm going to create it in us-west-2, because my favorite, least finicky AWS region is Oregon. And if there is still internet, this command will return. There's the cluster deployment object that was just created. What Hive is going to do now is pull down the installer image and begin provisioning a cluster for me, in this case on AWS. This one's from earlier today; trust me, they look roughly the same.

We couldn't do any of this, incidentally, without CRDs. This is all extensions to Kubernetes itself, so we use Kubernetes, or rather OpenShift, to stand up OpenShift. It's quite a cycle. Clusters get labels, for example whether they're in production, stage, or integration. We have some knobs, like whether we turn off PagerDuty; we don't want the SRE folks who are on call to get paged for development clusters, for example. So there are a couple of things in here that are related to the operators I showed you earlier. Whether it's managed is in here as well. On line 24, you can see the randomly generated DNS name for this particular cluster. We went out to Dyn and fired up the zone file for it and so forth.
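For reference, here's a stripped-down sketch of what a Hive ClusterDeployment can look like. The field names follow Hive's upstream API and may not match the exact version from this demo, and all the names and secrets are illustrative:

    # Hand Hive a ClusterDeployment; its controllers run the installer from here.
    oc apply -f - <<EOF
    apiVersion: hive.openshift.io/v1
    kind: ClusterDeployment
    metadata:
      name: my-cluster
      namespace: my-cluster
      labels:
        environment: integration        # the production/stage/integration labels
    spec:
      clusterName: my-cluster
      baseDomain: example.com           # the generated DNS zone lands here
      platform:
        aws:
          region: us-west-2
          credentialsSecretRef:
            name: aws-creds
      provisioning:
        imageSetRef:
          name: openshift-v4.2.2        # any 4.x release Hive can install
        installConfigSecretRef:
          name: my-cluster-install-config
      pullSecretRef:
        name: pull-secret
    EOF

The real object in the demo carries a lot more than that: the labels, the PagerDuty knobs, the generated DNS name, and so on.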
We got some certs and provisioned a bunch of workers for you. In this case, the masters have io1 volume types; if you've seen some of the best practices we recommend for etcd, that kind of stuff is baked in. Not a lot of rocket science going on here, but this is what it looks like. Cool? So this was OpenShift 4.2.2 at the time I deployed it, and I just checked before getting up here, and it looks like the SRE team has upgraded this cluster in the last couple of days. So yeah, that's what the cluster deployment looks like; Hive takes this, and out comes a cluster on the other side.

Can I show you what cloud.redhat.com looks like? Yeah. Maybe zoom in. So here's the cloud.redhat.com OpenShift Cluster Manager. Right now I've only got two clusters stood up here. One of them is the one I nuked earlier today, and there's a third one actually coming up as I'm speaking. I deployed a 4.2 earlier, and then I've got this jeder-commons one which is currently being deployed; it'll be this way for a half hour or so. I wanted to spin one up separately so that I could show you some of the telemetry data actually being used. This resource usage, and all the details about the cluster, are backhauled from the cluster into our telemetry system, and one of the microservices behind cloud.redhat.com calls out and fishes that data out of our telemetry system to display it. So yeah, 4.2.4 is what's currently running. As a managed service, we want to make sure you have the latest versions, so we're doing upgrades to z-streams on a weekly basis right now. If you purchase OpenShift Dedicated, it would get upgraded every, I don't know, Wednesday or something like that, maybe later in the week.

We backhaul some of the monitoring data as well. No alerts firing on this one. Status of all the cluster operators and so forth. All of that comes from the telemetry system that we pull data back from. And here's the install. This is the pod: when I mentioned earlier that Hive was going to kick off the installer, the installer runs in a pod, and these logs are the standard out from that pod. Towards the end it says install succeeded. So that's what it looks like. And then of course we can click Launch Console, I end up at a login screen, I've configured GitHub as an authentication provider, and I am in. Hopefully it'll work. Anyway, it's logging in. Not a lot to show there.

All right, so that's what we're doing right now, and I reserve the right to change it in a moment; that's where it is today. Questions so far? Yes: that's the openshift-install binary, the product installer. openshift-install is actually the name of the binary. We kick that off in a pod, and that's how the cluster gets set up.

OK, a couple extra things. Yes sir, sorry. It's available. It's not productized, but it is all open. We're going to save the questions for the AMA session, otherwise we'll never get to beer. Oh, how am I doing on time? (You're doing good, you're on time.) All right, keep going. I just have a couple more. I want to let you know some of the other stuff we lay down on top of OpenShift, things we've learned along the way, a couple of things we have to keep an eye on. Cloud provider storage may get wedged. Someone giggled, because that's a fact of life. DNS for sure can be problematic.
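As a sketch of what spotting that kind of trouble can look like from the CLI, here are a few standard oc commands. Nothing here is specific to our tooling; it's just the sort of signal you'd look at:

    # Volumes stuck unbound or attachments piling up can mean the cloud's
    # storage API is having a bad day.
    oc get pv | grep -v Bound
    oc get volumeattachments

    # Warning events often surface both storage and DNS weirdness.
    oc get events -A --field-selector type=Warning | grep -iE 'volume|dns'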
We wanna know about those things, whether they've resolved on their own or it's something else entirely; sometimes they do. And then we wanna keep an eye on, and this is something funny we noticed, there was no alert at the time for whether the number of machines matched what the operator thought there should be. When there's a mismatch there, that's something we need to alert on. So we're currently carrying that alert, and it'll eventually get back into the product.

A couple of other things we're learning and trying to feed back through to the product team, and then eventually to customers who want to use OpenShift on-prem or just manage it themselves. I mentioned earlier we're doing our own GitOps thing that's totally built in-house, like most of you might have; we're keeping an eye on some of the developments in the GitOps and CI/CD for Kubernetes world. A lot of customers ask about monitoring: we have a Prometheus that runs on every OpenShift Dedicated cluster, and eventually you'll be able to stand up a second copy of Prometheus for what's called user workload monitoring. We're keeping a really close eye on that, because it's probably the number one feature ask right now: letting you scrape your own Prometheus metrics. That'll come into OpenShift in one of the upcoming releases, and we'll certainly support it. We wanna make sure you can customize your own DNS. For example, if you've got a VPC peered into your OSD cluster, or a VPN tunnel into your cluster, you may want to configure DNS; we're gonna make that happen. And then another thing we would sincerely like to have is the control plane of OpenShift 4 as part of a machine set. Right now it's stood up as static pods, if you've ever seen the architecture diagrams, while the worker pool is a machine set. What we'd like is for the control plane to also be a machine set, so that we can do things like change the node type, and potentially scale out the number of masters, although that would be pretty rare.

A couple of tips on how we're debugging these clusters. How many of you have seen the oc debug command? Not enough. So now there's no excuse: go type oc debug node/ and then the name by which Kubernetes knows your node, and it will launch a privileged pod on that node that you can use for local debugging of the system. That's a fun one. Of course, I mentioned telemetry earlier, and then there are the Kubernetes audit logs. So there's a ton of data being emitted by the system itself. Cool.

I have just one or two slides on some of the roadmap stuff; I picked off the product manager's list the kind of stuff I thought might be interesting for this audience. Not exposing the API publicly: you'll have basically no publicly routable DNS name for your cluster, for what we're calling private clusters, I should say. Every cloud provider seems to define these differently. Bring your own cloud: you can give us your AWS account, and we will provision all that stuff into your account, so whatever existing business relationship you have with the cloud provider in terms of cost and whatnot will obviously carry over. We're doing a bunch of compliance work. And then in the next, I'll say, year or so-ish, we'll have support for running OpenShift Dedicated on GCP.
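Circling back to that oc debug tip for a second, here's a minimal sketch; the node name is made up, but the commands are standard:

    # Find the name Kubernetes knows your node by.
    oc get nodes

    # Launch a privileged debug pod on that node (name is illustrative).
    oc debug node/ip-10-0-1-23.us-west-2.compute.internal

    # Inside the debug shell, chroot into the host to use its binaries.
    chroot /host
    journalctl -u kubelet --since "10 minutes ago"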
So anyway, when we round out that GCP story, we'll have coverage across, at least in the US, the three major cloud providers: Azure Red Hat OpenShift, OSD (OpenShift Dedicated) running on AWS, and then OpenShift Dedicated also running on GCP. And the front end will look something like, remember the OCM cluster I created earlier? Imagine an additional flag that says cloud provider equals GCP, something like that. Put it wherever you'd like.

I think one of the more fun projects we've got going on right now is this: right now you can stand up OpenShift. Imagine being able to provision not only OpenShift Dedicated or ARO or something else, but other Red Hat products that run on top. Or take it the other way: maybe you just bought some middleware from Red Hat that runs on top of OpenShift. Imagine being able to provision that and automatically get a managed cluster underneath it. That kind of stuff is being worked on right now.

On the ARO side, just in the last couple of weeks, I think at Ignite, which was two weeks ago, they finally removed the requirement for reserved instances, which made you pay for a bunch of compute in advance. Now we're back to, I think, hourly billing. We're tying Azure Log Analytics into that service. And, sorry, those first two items are actually not roadmap, they're in right now. The next one is supporting OpenShift 4: ARO runs OpenShift 3.11 today, and soon it will run OpenShift 4 as well. And then we're also gonna do multi-AZ cluster support on ARO. On the cloud.redhat.com side, like I said earlier, we're potentially surfacing other Red Hat products in there, we're gonna do multi-cluster dashboards, and we're gonna let you import your on-prem clusters if you'd like. We're not gonna be able to manage them, but we'll at least be able to visualize both ARO and disconnected clusters. And then finally, instead of logging in to every individual cluster and clicking the upgrade button, you could potentially do that from a single pane of glass inside OCM.

So that's all I've got for you today. Any questions or thoughts, observations? Go ahead, ask. I know you wanna. (It is open. Can you run one?) I highly doubt it. Yeah. Something's coming. Just go for it. And he's over here, so he has the right to say that. So you heard it here first. We're gonna have an AMA session, so hold your thoughts. Thank you very much for coming and doing this today. It was very good to hear and to see in person. So thanks a lot. We have one more talk. Thank you.