Good morning, good afternoon, good evening, and welcome to another edition of the OpenShift administrator office hours. We are going to be talking about what's new in OpenShift 4.7 today, and I am joined by the one and only Andrew Sullivan, the cuddliest curmudgeon I could find to talk about it. You know, it comes with experience, right? The beard is gray for a reason. Right. So hello, hello, hello, welcome everyone. This is the OpenShift administrator hour, or, we are in the process of rebranding to the Ask an OpenShift Admin office hour. Yeah, I mentioned last week that our marketing team, our branding team, has very kindly agreed to help us out, help promote the show, and just generally improve things overall. So thank you to them. But same great hosts, same great time, same great show, just a slightly different name and probably some new colors and new images coming in the future. Yes, all that fun stuff. Yeah, we're getting official. Kind of. Yeah, official-ish. So this is one of the office hour shows, right? Really, the goal here is for you, our audience, all the people listening and watching, to be able to ask us questions about whatever is on your mind, whatever you would like to know about anything OpenShift related. I'd love to say that we can field any of those questions regardless of the topic, but Chris and I both have an administrator background, so if it's dev related, we'll probably have to take a note and follow up with you. Yeah, that was yesterday's show at 11. But don't let that stop you from asking those questions. Yeah, seriously. We are dependent on you, our audience, for those questions, for that interaction, so please don't hesitate at any time to ask any of the questions that you have.
However, in the absence of that, if you all are feeling shy or quiet, or all of your questions have been answered already (I'm sure that's the case, right?), we do have a topic. We always come prepared with a topic, and today that happens to be, drumroll please, OpenShift 4.7. And why today? Because today is the GA day. Today is GA day. So if you go to openshift.com, you can see there's at least one, if not five or six, blog posts that were all put up around various aspects. If you browse to cloud.redhat.com and go to the OpenShift install page, you'll be able to download all of the binaries and everything else. It is not in the update channels yet; we'll talk about that in just a moment. But yeah, 4.7, it's here. Someone asked yesterday, "when is the 4.7 release?" and all I said was "soon," knowing the whole time that it's today. You say that, but OpenShift 4.7 was GA last night when I went to bed. Well, you know, we were doing this APAC virtual tour thing, so I went to bed about 1 a.m. and it was GA, and I woke up this morning to "we found a really late bug, so we're going to have to push by a week." So yeah, you think you know, but you never really do. It's entirely possible for these things to come up at any time. Exactly. Despite the best efforts of our engineering, QA, and QE folks to find and fix those things, they still happen. Yeah, it's software, folks, and we are merely humans. Well, most of us anyway. Oh yeah, good point. So today's topic is 4.7, and I've got a list of things for today, a lot of different things in various aspects that I think are going to be important for us as administrators. So I'm going to go through those, but again, don't let that stop you from asking whatever questions happen to come to mind.
You know, if there's something that's bugging you, something on the top of your mind, don't hesitate. All right, I am going to share my screen. Today's screen share, if I could talk now, is probably going to be documentation heavy, and there are a couple of different reasons for that. One, I like to think that the documentation is the source of truth, and if it's not, if there's something wrong, then our goal is to get that fixed and updated. If you aren't familiar, there's this really great "open an issue" link up in the top corner. At any point in time, if you find something that's wrong, click that button and it'll open a GitHub issue, and the docs team is phenomenal about responding to those. It doesn't take a Red Hat employee, it doesn't take reaching out and poking us; literally anybody can do that, and you don't have to provide a fix. It can just be "hey, this looks wrong" or "this didn't work for me," and they'll track it down and take care of it. Exactly. A couple of comments: very interested in the compliance operator updates, and Pipelines, are they GA? I forget if that was part of it. I thought Pipelines were already GA. Well, I thought they were in tech preview for 4.6. Let me double check. I'll double check, you keep going. So, Christian: "do you want to install the operator, or just want it managed by OLM?" Pipelines are not GA. Thank you, Christian. I hope your workout is going well. You're handy on the keyboard right now, that's for sure. For those who don't know, Christian usually listens to the show as he's doing his Wednesday morning workout. Okay, I like that question. Yeah, please. All right. So, as I said, documentation is super important.
And I know that the docs team puts a huge amount of effort into this release; they're always doing lots of work. Sometimes I find it funny to look in the version selector: if you go all the way back to, like, 4.1 and look at what the docs looked like, they're pretty dramatically different, just in the amount of content that's here, that they maintain, that they add. Like you see over here, there's no day-two thing, there's no post-installation configuration; none of that existed until later versions. So the body of work here is absolutely, incredibly massive. Yeah, and I really can't give enough kudos to the docs team. I know it's a Herculean task that they take on. So, a couple of things to look at here, a couple of things to be aware of. As always, the release notes are going to have a pretty substantial amount of information about any release. You probably can't see it on your screen, but my little scroll bar over here is quite tiny because this is a massive page. I definitely recommend that you read through it, look through it, check out all the things here; in particular, pay attention to the deprecated and removed features. There's this lovely chart here of things that have been changed or removed. Additionally, and I'm going to jump back up to the top, apologies, I would recommend checking out the known issues. Known issues are always an important thing, right? Like this one all the way back from 4.1, making a recommendation on how to improve security. So always, always, always be sure to check those two things, at least, in the release notes, to make sure that when you do an update, or if you're deploying a new cluster, you're not going to accidentally break things, because users love it when we accidentally break things. So, I saw the GitOps operator was released in 4.7; it is still tech preview.
So I actually have that way out here on one of my tabs: getting started with OpenShift GitOps. So it is tech preview. I would definitely recommend, if you're interested in GitOps and the Argo CD stuff, checking out Christian's live stream; it's every other Thursday at 4 p.m. Eastern. 3 p.m. Eastern. 3 p.m. Eastern, too late for Europe. Yeah. So definitely check out Christian's live stream, it's a great one. I always learn something. This GitOps thing is new to me as well and I'm learning as I go, and he's really good at explaining those concepts. All right. The first thing I want to highlight with 4.7 is that we have updated the underlying Kubernetes; it is now 1.20. There's a great blog post that was done by Gourav (I'm sure I'm butchering that name) on the product management team, around what's new, or what has changed, in Kubernetes 1.20. And this being on the OpenShift blog, it is going to cover things that are important and relevant in OpenShift especially, and not just Kubernetes generically. So, a couple of the interesting ones here: storage, we can see things like snapshot objects are now GA; they're also GA inside of OpenShift. Additional network things; all kinds of stuff inside of here. I'm not going to go through and read this blog. Again, definitely check that out, because with Kubernetes being the core of OpenShift (remember, OpenShift is built on top of Kubernetes), all of this stuff is going to be very, very relevant. So the next thing I wanted to highlight, and I'm sort of keeping up with chat here, I see it blinking by out of the corner of my eye, so if you see me looking over this way, that's me reading chat. The next thing I wanted to talk about, if you haven't heard, and we've done a couple of streams on this; I know we did one in the early days of this show.
I think we've also done another dedicated live stream for the assisted installer. So if you're not familiar, all I've done here, this is cloud.redhat.com, right? From our clusters here, if I go to "create cluster," it takes me to this page, and then we can come down here to the bottom to this platform-agnostic option, and I have this assisted bare metal installer. The page here still says developer preview, but I think it's now beta, or whatever the next step on the path to GA is. So it has been promoted, essentially, straight to beta. I can select here, and this walks me through the process of deploying a cluster on premises, or in the cloud, I think, depending. No, I think it's on premises only. We'll do this with the assisted installer. Yeah, it's on prem, right? Well, you define your on prem. Yeah, my brain is only partially functioning; we had a long conversation about DHCP already this morning. I'll talk about that more in just a moment as well. And somebody pointed out, "oh yeah, that only applies to IPI," and I'm like, wait, where does it say that? Oh no, it's right across the page: bare metal IPI only. Okay. Yeah, my brain... anyway. So the assisted installer is a really great way of very simply deploying a cluster. Essentially, you plug in your information, you hit this "generate discovery ISO," and it gives you an ISO that you boot your servers to. So again, physical or virtual bare metal installation. They boot that ISO and then register themselves here in the interface. It's really cool. This is what I use to stand up my cluster here at home, and I routinely reuse it to rebuild the cluster when we destroy it on the show. It makes life really easy, folks.
Yeah, yes, the nodes show up here, and you can assign their role. I don't have any nodes; all my resources are currently in use for 4.7 demo clusters and all that other stuff, so I don't have anything to show here. But when they show up, you can assign their role, you can assign all kinds of things. You don't need a dedicated bootstrap host; it takes one of the nodes and uses it to bootstrap the others, and then reloads that node to join the cluster as a normal host. Lots of cool stuff happening inside of here. And they're doing constant improvements to this as well. I was just talking with them about it; I wanted to integrate OCS deployment into this as well, so basically you tick a box saying "I want OCS" and it'll automatically deploy it out of the box for you. I don't think that's available today, but that's one of the things that they were talking about. So lots of interesting things, great things coming. Next question: does OLM even manage non-supported operators? So, yes. OLM, the Operator Lifecycle Manager, is used for all operators that are part of a catalog inside of OpenShift. It doesn't matter if it is a Red Hat-provided operator or one that you provide; if you add the catalog, it will register it, you'll be able to deploy it, and it will manage it throughout the lifecycle. The difference is, if it's your operator, one that Andrew created or you created, Red Hat wouldn't support the operator itself; we would support OLM, but we don't support your operator. Hopefully that makes sense. Right, like your custom operator that we don't know about. Yeah. Although I will say that the Operator Framework is fully supported, so if you're creating your operators and you have an issue with the Operator Framework, we can help with that. But again, whatever your custom logic is, we can't help you write your logic, for better or worse.
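As a point of reference, registering your own catalog so OLM can manage your operators comes down to creating a CatalogSource object. This is a minimal sketch; the catalog name and index image below are hypothetical placeholders, not anything shown on the stream:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-custom-catalog            # hypothetical catalog name
  namespace: openshift-marketplace   # the namespace OLM watches for catalogs
spec:
  sourceType: grpc
  # hypothetical operator index image (e.g. built with opm)
  image: quay.io/example/my-operator-index:latest
  displayName: My Custom Operators
```

Once the catalog is registered, its operators show up in OperatorHub alongside the Red Hat-provided ones, and OLM handles install and lifecycle for them the same way.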
You wouldn't want Andrew writing your logic. Same for Chris. Let's see, moving on. One of the things I wanted to talk about with regard to the docs team: this is probably my favorite new page in the documentation. I'm on the 4.7 documentation here, just the very top-level About page, and we have this "learn more about OpenShift Container Platform" section. What the docs team is calling this is essentially persona-based documentation. Depending on your role, what your job is with regard to, or in relation to, the OpenShift cluster, this links directly to the things that are relevant for you in the various stages of the cluster lifecycle. So, I'm an administrator: what do I need to know before getting started? What do I need to know to deploy my particular cluster? And so on and so forth. If you're new to this, or even if you're not, you may find a lot of useful information here. Every time I look at the docs I swear I find new pages that I didn't know existed. Yeah, tell me about it. Let's move over here. Oh, another one that I just learned about this morning, speaking of the docs team: validating an installation. For a long time, effectively, we documented how to deploy the cluster and we documented how to use the cluster, but we never had anything that said, here's how to figure out whether or not the deployment was successful, whether or not it really did the things it was supposed to do. Our generic response was, well, if the installer finished successfully, then the cluster deploy was successful. But the reality was often that a cluster operator was still in the process of doing something, or maybe it was successful but there was still a lingering error. I always used to point out the registry: the registry shows that it's good, but unless you've provided it some sort of storage, it's not actually there.
So anyway, this page here walks through a number of different steps to do a high-level validation that the cluster is deployed, the cluster is ready, and you can begin going through all of the day-two configuration. So this section in the documentation, let me move it up on the screen here, the post-installation configuration: the cluster successfully deployed, that's kind of the day-one operations. Day two is, now what do I need to do in order to get it fully ready and prepared to be used by the applications? This one is near and dear to my heart because I helped to create this section of the documentation. But just going through and evaluating, what are the things that I need to do on day two? Do I need to configure time synchronization? Probably a good idea; there's documentation on how to do that. Do I need to add any kernel arguments? Do I need to change something with those nodes in order to make them conform to my particular usage, like the real-time kernel, any of the vRAN stuff, or anything like that? Anyway, it's a huge section of the documentation; you can see there are lots of different tasks. So I like to think of it this way as an administrator: if installing is sort of the Old Testament, if you will, the post-installation is the New Testament. It goes through and covers everything that I need to know to basically hand off: here are your accounts, developers and application teams, go forth and do great things, be awesome. Yeah. They should be good at that, right? They should be. All right, quickly catching up on chat here. Yes, the assisted installer, I definitely recommend giving that a poke and trying it out. And it says bare metal; remember that bare metal is an installation method, not an infrastructure type.
So if you're doing the non-integrated bare metal installation with virtual machines, it will work there as well, which was my slight confusion when I was talking about it initially. Yeah, like mine is all running on the server, just, you know, seven KVM VMs and off it goes. You know, it's Christian who has a server literally sitting next to him; yours is in the basement, right? Yeah, mine will be in the basement forever, just because it's loud. All right. Oh, so one of the other links. We talked about this a couple of weeks ago. Normally I open the show and talk about recent developments or things that have come to the top of my inbox; the big one we'll talk about, I'll circle back to in a few minutes. But one of the things we talked about was, when did nodes get rebooted? Nodes can be rebooted for a number of different reasons, and it's not always expected. During that episode, and I think we were talking about registries, yeah, the registry show, I highlighted that if you change the insecure registries, if you add an insecure registry or remove an insecure registry, it will result in the nodes being rebooted. And there are a number of those reasons. Internally, I know we're working to document all of those conditions, all of the times that the machine config operator will reboot the node to apply some sort of configuration. But right now we have it in this section of the documentation. And I need to get better about posting all these links into the chat here. Yes, that would be very helpful for me. Yeah, so, my apologies for not posting them into the chat already. We do follow up each one of these shows with a blog post that will have all of these links, as well as links to the specific times in the video for various topics. Usually those come out Friday morning Eastern time, so just keep an eye out for that if you happen to miss a link or anything like that.
We'll have all of them on openshift.com/blog. Yep. Sorry, rabbit hole, or rather tangent, achieved. So down here in the documentation, underneath "understanding the machine config operator," we have some changes as to things that no longer cause a node reboot. Things like the SSH key, or the SSH authorized keys rather, being changed no longer reboots all of the nodes. I didn't realize that it did. Yeah. So, from an administrator perspective: with a sufficiently large cluster, you know, several dozen nodes, you can almost reasonably expect something to be changing, updating, or modifying at any point in time. I had this conversation with someone recently: we effectively release a z-stream, so 4.7.1, or 4.6-point-whatever the next one is, .19 or .20, every two weeks, I think, is the official release cadence. So you can reasonably expect to come in every other Monday and see that I have an update to apply, and that's going to trigger all of the nodes in the cluster to reboot. And with a sufficiently large cluster, especially with physical servers, which take three, five, sometimes eight minutes to reboot, that's a lot of capacity to potentially be unavailable. So you can pause those updates. Basically, it's an MCP, a machine config pool, setting that says pause updates, or pause the rollout, and lets them pile up so that there are fewer reboots; but there are still reboots that have to be involved in many cases. This just helps to identify and alleviate those, like you said: would you have expected changing the authorized keys for SSH to result in a node reboot? Probably not. So identifying those, reducing them where we can, that type of stuff. Yeah. So here's the one I was referring to, updating the registry settings on the nodes: some of them no longer trigger a node reboot.
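As an aside on pausing those rollouts: the pause is just a field on the MachineConfigPool object. A minimal sketch, assuming the default worker pool:

```yaml
# Pause machine config rollouts for the worker pool; roughly equivalent to:
#   oc patch machineconfigpool/worker --type merge -p '{"spec":{"paused":true}}'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker
spec:
  paused: true   # flip back to false to resume, batching up the pending reboots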
Now that one I didn't know; messing around with that file would definitely have triggered one in most cases, but yeah, no, it's good. So from here we get into a little more. Those were high-level, relatively small changes; I've got a whole bunch of other ones that we can talk about here. I have a sticky note to help me out. So, for KVM-based deployments: if you're deploying to Red Hat Virtualization or OpenStack, the QEMU guest agent is now included in RHCOS. So if you were ever concerned, or if you noticed in your RHV or OpenStack GUI that you didn't have any details, like, I don't know, the guest IP address, all of that should be there now. I think we have officially documented that it is supported, with OpenStack, to deploy a cluster that spans both physical and virtual. This is a question that comes up quite a bit, actually. Normally, you can only deploy a cluster that consumes one infrastructure type. What do I mean by that? If I do a UPI or an IPI installation, basically all nodes have to be deployed using that same method, all in the same infrastructure. So I can't deploy some nodes that are UPI to Red Hat Virtualization and some nodes that are UPI to vSphere, because they're two separate cloud providers, two separate infrastructures, and it just doesn't work. And that's a Kubernetes limitation, not an OpenShift limitation. The way to get around that is a non-integrated, a.k.a. bare metal, UPI installation. Essentially, there's no integration with the underlying infrastructure; it's unaware, it doesn't know that it's on vSphere, which also means you can't do things like use the vSphere CSI plugin. With OpenStack, because of Ironic, we can effectively use the same cloud provider, the OpenStack cloud provider, to talk to both virtual machines and physical servers. So as a result, you can mix infrastructure types there as much as you would like.
Maybe a virtual control plane with physical worker nodes. Or go all physical: create a compact cluster, with day-two schedulable control plane nodes that are all physical, and then dynamically scale that up and down, all using that OpenStack cloud provider. "We have now added wheels to our car." Thank you, Christian. Let's see, what else have I got here. CSI snapshots are generally available. If you haven't seen that yet, I'm going to switch over to here. We come down to Storage, and, which cluster am I in? The wrong cluster. So I come down here to Storage. In here I've got a CSI storage class; this one is very simply called Lab Silver. And I've got a few PVs and PVCs inside of here. I can just go over here and say "create snapshot." And it has logged me out of my particular cluster, so it's going to be difficult. I prepared, remember I said I was up till like 1 a.m. because of our APAC virtual tour, so I prepared this page very early last night. It's not cooperating. Come on, buddy. From chat: is there any benchmark about the latency added by IPsec? On that, I think we talked, we did a show. Let me go find it. It was with Mark, was it Mark or Don? I forget. I'll find it real quick here in our archives. What has happened to my cluster here? Your cluster is gone. It is not liking you today. We'll try it from a different browser and see if that helps. Persistent volume claims. We'll create a snapshot. Hey, there we go. So, creating a volume snapshot here; this is just going to use the CSI provisioner to literally create a snapshot. These GUI elements existed in 4.6, but they had a big red banner across the top that said tech preview. It is no longer tech preview; it is now generally available. You can go in, you can see all of the snapshots, you can manage the snapshots, you can revert to a snapshot, you can create new volumes off of a snapshot.
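Under the covers, that "create snapshot" button is creating a VolumeSnapshot object; the API is v1 (GA) as of Kubernetes 1.20. A sketch, with hypothetical names in place of whatever was on screen:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snap                        # hypothetical snapshot name
spec:
  volumeSnapshotClassName: csi-snapclass   # hypothetical class backed by your CSI driver
  source:
    persistentVolumeClaimName: my-pvc      # the PVC being snapshotted
```

Restoring, or creating a new volume off of a snapshot, is then just a new PVC whose `spec.dataSource` references the VolumeSnapshot.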
Do all of the great, fun things that you would like to do with a snapshot, as the case may be. While we're talking about this: that cluster, the one that I was just showing, is one that I use for OpenShift Virtualization testing. OpenShift Virtualization will do CSI clones of volumes when creating virtual machine disks, which is really cool. So if your storage supports it, the offloaded clone function essentially, it takes like two seconds to create a whole new virtual machine in there. Really interesting stuff. Other things: upgrades are blocked if a machine config pool is degraded. So now, I have way too many browser windows open. Do you have as many as I do? That's the question. So this is just the macOS desktop and screen that I'm sharing; you can't see the other virtual desktop, nor the other window that has dozens of tabs open. If I come down here to Compute and go to machine config pools, we can see that neither of these is degraded. But essentially, if I had an MCP, a pool, that was degraded, it would effectively immediately disqualify the cluster from applying upgrades. If we were to come over here to cluster settings, it would essentially say "unable to apply" because of a degraded machine config pool. So while I'm here, you'll notice a couple of things. First and foremost, this is a GA 4.7 cluster deployed into Azure. Awesome. I deployed it literally two hours ago, something like that. Yeah, so if you go to cloud.redhat.com, you can pull down the openshift-install and oc client binaries, and you can go and deploy a new cluster using 4.7 right now if you would like. Nothing wrong with that. So a couple of things to note here. First, you'll see this update status: "version not found." Well, that's only sort of true. Yes, 4.7 is available; yes, it's fully supported; yes, the version really does exist.
What this is saying is that the update channel, the Cincinnati update channel, is not seeing 4.7. We've looked at this before, a few weeks ago. If I go to github.com/openshift and find the cincinnati-graph-data repository: this repository, and specifically the channels data inside of it, is where it looks. You'll notice that there is no stable-4.7, and that's why it's complaining, that's why it's saying the version is not found. And if we look, there's a PR here that says "enable 4.7 in stable channels." It usually takes a little bit of time after the bits are released for them to make it into the channel, for the PR to be approved, et cetera. That's why we're getting that particular error. The other thing you'll notice, if you're on a 4.6 release today, is that there is no update path to 4.7. That's fine, though; it's expected. It should be available in fast relatively quickly. But remember, fast basically indicates that we don't trust it in stable yet, and "don't trust" might be a bit of a strong phrase; essentially, we haven't gone through all of the normal testing, all the normal validation, for all of the potential upgrade paths. So keep an eye on, and now I'm not going to have the link. Where's the link? Back to that other desktop, of which you probably have 20. No, I'm looking for the tool that we just released that shows upgrade paths. Oh, shoot. Hang on, I've got it somewhere. cincinnati-graph-data is the repo, but... see, this is what happens; I've got to log in, hang on. Anyway, while Chris is digging: a week ago, maybe two weeks ago, one of the labs teams published a tool to access.redhat.com that will show you the exact upgrade path from your current version to whatever version you want to go to. I'll just drop it in chat. Thank you. No problem.
So if we look at this, and again, this probably won't reflect 4.7 because 4.7 is not in Cincinnati yet, I can select. So right now I'm on stable-4.6, and say I'm currently on 4.6.9 and I want to go to 4.6.17. It tells me exactly how to get there. But 4.6.9 to 4.6.17 is kind of boring, so let's pick something earlier. So this is telling me that if I were on 4.5.1 today, to get to 4.6.17 I would have to go through an intermediate 4.5.24. So when 4.7 is in Cincinnati, all of this will automatically update; all of these things rely on that repo for their data, and you'll be able to see those paths. Just remember, stable often takes two to four weeks after the release to reflect the new version. We go through this with every 4.x release, so don't be alarmed, don't be surprised, if you're using the stable channel and you don't see the 4.7 update for a little while. Also keep in mind that it is going to be dependent on which source versions, which current versions, are eligible for that upgrade. So, let's switch back over here, let's say that you're on 4.6.17 today. It might be that you have to update to 4.6.18 or something like that to be eligible to then upgrade to 4.7. So just keep an eye on this page. Christian said put in 4.2. I guess I could put in 4.2, to, let's just say, the latest 4.6 or whatever. Okay, fine. Yeah. Now do all the way to the latest. Yeah, why not? So that's quite the tree. That's the chain. Yeah. Wow. Okay. So anyway. But I feel like you could do that in a weekend, right? Like, that's not as heavy a lift as it used to be.
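For what it's worth, if you don't want to wait for stable, moving to the fast channel is a one-field change on the ClusterVersion object. A sketch:

```yaml
# Switch the cluster to the fast channel for 4.7; roughly equivalent to:
#   oc patch clusterversion/version --type merge -p '{"spec":{"channel":"fast-4.7"}}'
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
spec:
  channel: fast-4.7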
I guess that would depend on your apps, the size of the cluster, that kind of thing, but yeah. This is one of those things with Kubernetes and OpenShift and the applications that are deployed: it's super important for us as administrators to work with those app teams, because, for example, if they don't define an appropriate pod disruption budget, then us just going about doing an update or an upgrade can potentially break something, right? Those pod disruption budgets, and the other things that are in place to protect the workload, are important. And with a virtualization solution like RHV, we pretty much own that; we understand, hey, I'm just going to live migrate this around, it's going to be non-disruptive. Containers aren't that way. Remember, they terminate; they don't migrate. So, and I feel like I'm probably preaching to the choir with this audience, but it's important to understand that we do have to communicate. Yes. Communication is key. Talking to your coworkers is, like, most of DevOps, right? No offense, but it kind of is. It's also the hardest part. Exactly, the culture is the hardest part. Circling back around: as you pointed out, Chris, you did a live stream with Mark or Don, or there was a dedicated live stream, about IPsec encryption with OVN-Kubernetes in 4.7. The crux of it is that it will automatically encrypt all traffic in certain conditions. In particular, you want to read this section of the documentation; let me post that in here. So, for example, when IPsec is enabled, the following network traffic flows between pods are encrypted: traffic between two pods on the cluster network.
So without using, you know, an application-level or pod-level TLS certificate or something like that, essentially that communication is encrypted by default. You don't have to do anything; you turn it on at the cluster level and it's automatically happening. But note the exceptions: traffic between pods on the host network is not encrypted. And if traffic never leaves the host, it doesn't get encrypted either, right? If both pods are on the same host, that traffic would not be encrypted. So read through this, please, make sure you pay attention to when it does and doesn't apply and how that affects whatever your security stance may be. You know, I understand security teams can be, shall we say, curmudgeonly sometimes. I'm familiar with that word. Yeah, I mean, I wouldn't say curmudgeonly; it's just that they have very strict requirements that they have to live up to, right? So cut them a little bit of slack. Yeah. And in my experience, when I used to have to certify or work with the security teams, oftentimes I had to explain the technology to them and help them. I think that's one of the reasons why I feel like I do okay at being a TMM, you know — being a TMM is basically explaining engineering-level technology to us regular people and asking a lot of stupid questions of engineering. It's translating. And security folks sometimes need that help. So anyways, I digress. In the interest of time I'll keep moving along here, because you can see I've got a bunch of tabs still to go through. I'm happy to say that the NMState operator is being moved into OpenShift proper. So if you're not familiar with the NMState operator: this has been deployed and supported with OpenShift Virtualization since it was released. What the NMState operator does is use kubernetes-nmstate, which is the Kubernetes integration with NMState.
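As I recall, in 4.7 IPsec for OVN-Kubernetes is turned on at install time by dropping an extra manifest in before running the installer; a hedged sketch of what that manifest looks like (double-check the docs for your exact version):

```yaml
# Sketch: saved as manifests/cluster-network-03-config.yml before
# running "openshift-install create cluster". The empty ipsecConfig
# object enables IPsec for the OVN-Kubernetes pod network.
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  defaultNetwork:
    type: OVNKubernetes
    ovnKubernetesConfig:
      ipsecConfig: {}
```

Note the encryption scope caveats discussed here still apply regardless of how it's enabled.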
So NMState — NetworkManager state, right? Effectively, if you're familiar with using nmcli to create, configure, and manage the network configuration on your host, NMState does that in a stateful way. The NMState operator and kubernetes-nmstate introduce all of that into Kubernetes. So why is this important? Why is Andrew excited about this? If you have more than one network interface, or if you need to manage, control, or change the networking of your hosts, NMState is a great way to do it. For example: I deploy my cluster. I use the kernel parameters, or the live ISO, to configure my first network interface for the machine CIDR, right — the one that's going to run the SDN and all that other stuff. But maybe I've got two or four or however many other interfaces, and I need to create a bond, and I need to put some VLANs on that bond so it can connect to, you know, maybe storage, or this particular app network, your DMZ, whatever. NMState makes that super, super easy. Essentially, if I go to the "updating node network configuration" docs — let me paste this into the chat. Christian Hernandez: "I am not a people person." So if we look inside of here, here's how to create a VLAN interface using the NMState operator. Essentially, I'm creating a NodeNetworkConfigurationPolicy that says I want to create a VLAN named eth1.102 with this VLAN interface information. When we apply that, it uses the node selector to automatically apply it to the nodes. So, for example, if I had this node selector as a role — right, maybe the node selector is workers plus some other arbitrary label associated with my MachineSets, something like that — then any time a node joins the cluster, it's going to implement whatever this configuration happens to be.
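The policy being described looks roughly like this — a sketch modeled on the docs example just mentioned (the policy name and worker-role selector are my illustration; eth1.102 and VLAN 102 come from the example itself):

```yaml
# Sketch: a NodeNetworkConfigurationPolicy that creates VLAN 102 on
# top of eth1 on every node matching the selector. New nodes that
# join with a matching label get the same configuration applied.
apiVersion: nmstate.io/v1beta1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan-eth1-policy
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  desiredState:
    interfaces:
      - name: eth1.102
        description: VLAN 102 using eth1
        type: vlan
        state: up
        vlan:
          base-iface: eth1
          id: 102
```

The same desiredState schema covers bonds, bridges, routes, and DNS, which is what makes it so much nicer than hand-rolling ifcfg files through MachineConfig.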
So, very useful, very helpful for configuring all of that additional network stuff in a much more robust, much easier to understand way than trying to do it with MachineConfig, which can get very cumbersome. Do note that it is in Tech Preview. I was telling Chris before the show, I found a bug: I was trying to move a secondary network interface onto a Linux bridge when that interface had already been given an IP via DHCP, and it didn't like the way the routes worked. Interesting. It's a bug in that it's unexpected but explainable behavior. The workaround is to first remove the default routes that it creates for that network interface and then move it to the bridge. But yeah, that team is super helpful. Peter is the engineer I was working with — super, super helpful. Again, much like the docs team, I can't say enough good things about them. Exactly. Oh, the Horizontal Pod Autoscaler. So if you're not familiar with HPA: historically, it uses the metrics to gauge when a pod exceeds whatever the defined CPU threshold is. So maybe it's using more than, I don't know, two CPUs worth — 2000 millicores worth — of resources. Hey, it exceeded this threshold; I'm now going to automatically deploy two, three, five, ten additional instances of the pod to help spread that workload out. Historically, it's only been CPU based. As of this release, as of 4.7, we now add memory-based utilization metrics as well. So yeah, very helpful. You can still use custom metrics as well — I think that's GA — so defining custom metrics to trigger pod autoscaling is a thing if you so choose. Awesome. Like, that's super powerful. Yeah. So another one that I'm strangely excited about: the descheduler. The descheduler, which sounds kind of scary, is an effort to balance the workload in the cluster according to your policy. Right.
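A memory-based HPA is the same shape as the CPU-based one, just targeting a different resource. A minimal sketch (the Deployment name and 75% target are made up for illustration):

```yaml
# Sketch: scale a Deployment between 2 and 10 replicas based on
# average memory utilization across its pods.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
```

The metrics list can hold multiple entries, so CPU, memory, and custom metrics can all drive the same autoscaler.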
So when we think about the Kubernetes scheduler, essentially it works off of a bin-packing algorithm, right? This pod needs this much in resources, right, all of my nodes have this much in resources available, right, how do I choose which one it goes onto? And that works great and fine, to a point. And that point is effectively when nodes start getting heavily utilized. The pod eviction policies don't take effect — right, it won't start removing pods from a node to free up resources — until there is active contention, whatever that threshold happens to be; I think it's 90 or 95% by default. Node autoscaling doesn't take effect until pods fail to schedule. So the descheduler essentially looks for conditions, and we can define those conditions using these policies, or profiles, and basically have it say: I want to reschedule this pod. So effectively it terminates the pod whenever it meets a condition, prior to that eviction threshold, prior to that running-out-of-capacity threshold. The profiles combine a bunch of different strategies to try and make it behave the way we want it to. So what do I mean by that? Affinity and taints. If I enable the descheduler using the AffinityAndTaints profile, it is going to look for pods that, as we can see here, are violating a pod anti-affinity rule or a node affinity rule. And the reason for this is relatively straightforward. You think: well, I set an anti-affinity rule, why would it be in violation? And it can happen for a number of reasons. Maybe it's a soft anti-affinity, and you want to go back and re-enforce that anti-affinity. Or, particularly with taints, maybe the node was given a taint after the pod was scheduled, right — stuff like that. So it constantly checks, and it looks for those things, to go through and ensure compliance with whatever your profile wants.
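Enabling this is done through the descheduler operator's custom resource; a sketch from memory of the 4.7 operator (field names should be verified against the release docs before applying):

```yaml
# Sketch: enable the descheduler with the AffinityAndTaints profile,
# re-evaluating the cluster once an hour.
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  deschedulingIntervalSeconds: 3600
  profiles:
    - AffinityAndTaints
```

Additional profiles can be appended to the same list, which is what the next section walks through.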
So there's a couple of different profiles here: AffinityAndTaints, TopologyAndDuplicates, as well as LifecycleAndUtilization. You can apply more than one of these if you so choose; you can have all three if you like. The LifecycleAndUtilization one, I think — if you're familiar with how resource balancing happens in, like, Red Hat Virtualization, or vSphere with DRS, etc., this is kind of a similar concept. So low node utilization, right, finds nodes that are underutilized and evicts pods. Effectively what this is doing is saying: I have, maybe, two nodes in my cluster. One of them is at 10% utilization, one of them is at 80% utilization, so I have a node that is below my low-node-utilization threshold. So even though the other node is happy and healthy as far as the resources are concerned, I can move some of those workloads and make them more balanced. So it will terminate pods on the highly utilized node — or nodes, in this instance — in order to, hopefully, have the scheduler put them onto the lower-utilization nodes, again balancing that out. And then the pod lifetime, right: you can say, hey, I want to constantly have pods that are less than 24 hours old, stuff like that. So, kind of in conjunction with that, in Tech Preview, is scheduler profiles. This is an interesting one, because it works with — and you can see, like, this LowNodeUtilization scheduler profile. So the descheduler works off of what I think the technical term is: hope and prayer. Effectively, if I use the LifecycleAndUtilization profile, it's going to say, well, I've got, in my example, two nodes; this one's highly utilized, this one's underutilized; I want the workload to move from high to low.
So it terminates pods on the high node with the hope, and the assumption, that the scheduler will make the right decision and put them onto the lesser-utilized node. But that's not a guarantee. A pod could get rescheduled right back onto the same node. So the scheduler profiles assist with making those types of decisions: hey, I want you to target nodes that have the lowest utilization; or maybe I want you to pack as many pods as possible into as few nodes as possible; whatever that happens to be. So check these out. Maybe useful, maybe interesting to you. I'm going to paste the link in here. Remember, they are Tech Preview. But this is something that I found, particularly when used in conjunction with the descheduler, to be a pretty powerful way of either balancing the workload across your cluster, if you want that even distribution, or compacting it into as small a space as possible, depending on your preference. Right — maybe you want to have as few nodes as possible; maybe you want to have extra capacity. Like if you need all your nodes running at 80%. Right. Yeah, that's the idea. So, the last thing I've got to talk about — I skipped over that tab; that was the getting started with GitOps tab, which we talked about earlier. So the last thing I wanted to talk about, and I kind of alluded to this earlier: we had a lengthy conversation in one of the internal chat rooms about DHCP and IPI. So, all IPI methods require DHCP for all nodes. It's always been that way since day one. You know, how else can we dynamically create nodes? The cluster will reach out to the infrastructure provider — this is IPI — hey, vCenter, create me a new VM. We rely on the intrinsic infrastructure functionality to do things like hand out IP addresses. But there can be issues there.
And this is what turned into a 200-plus-message thread with a bunch of our field folks as well as a bunch of us BU folks. And, you know, Chris pointed out that it's really great that Red Hat has this culture of very open feedback, and yeah, I have to say I appreciate it as well. Everybody's opinion was considered, right? Yeah, it was pretty awesome. So, one of the things — I believe it's documented in a KCS; I'll have to dig up the KCS — one of the things that we recommend is setting static DHCP reservations for the control plane nodes after deployment. So when the cluster deploys, it comes up, it pulls those DHCP addresses, and then, day two, for the control plane, you set those as static. The rationale makes sense, right? Maybe I have a small DHCP scope, or maybe I have a DHCP scope that is almost full, so there's a lot of churn. So if my host goes down for something as simple as a reboot — or maybe I turn it off because I need to do something and it's down for 45 minutes — and it comes back up and pulls a different IP address on a control plane node, this is bad. And the reason for that is, as you might expect, etcd. So etcd is configured by the etcd cluster operator to use IP addresses for discovery. And the operator will recover, right? Basically, if it says, hey, there's a new control plane node, there's a new etcd node, I need to point at all of that, it'll reconfigure everything, and it's great — so long as only one node at a time changes. If I have two nodes change, or all three nodes change, effectively etcd can't find its peers. And that leads to disaster: if etcd can't come up, then the cluster can't come up, and therefore the operator can't come up to then fix the original situation. So by default, right, the blanket recommendation is: day two, after deployment, set static DHCP reservations for your control plane nodes.
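What that reservation looks like depends entirely on your DHCP server; as one illustration, with dnsmasq it's a one-liner per node (the MAC addresses and IPs here are made up):

```text
# dnsmasq sketch: pin each control plane node's MAC to a fixed IP
# so a reboot or outage can never hand the node a different address.
dhcp-host=52:54:00:aa:bb:01,192.168.10.11   # master-0
dhcp-host=52:54:00:aa:bb:02,192.168.10.12   # master-1
dhcp-host=52:54:00:aa:bb:03,192.168.10.13   # master-2
```

An optional lease-time field can also be appended to a `dhcp-host` entry (e.g. `,infinite`), which is relevant to the infinite-lease approach in the bare metal IPI docs.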
So, we went through this debate literally at, I don't know, nine o'clock last night — again, I was up late because of the APAC virtual tour thing. And it turns out that, unbeknownst to me, I found out this morning that our smart engineering folks are already starting to address this. Obviously this has been in the works for longer than we had been talking about it last night, but if we look in the documentation here — and I'm going to post this; it's the bare metal IPI installation prerequisites documentation — essentially there's a little blurb in here that says: if you set the DHCP reservation to have an infinite lease, then there's a script that will actually go in and convert that to a static IP configuration on the control plane node. This was really interesting to me, because if we think about it, DHCP leases are set so that they can eventually be reaped and reused, etc. But if the lease is infinite, then the host can basically assume: well, this is always going to be mine. It doesn't expire, it's not going to get reaped, it's not going to be given to something else. So I went and did some digging and found the script that actually does it. Really cool, really interesting. Basically, it uses nmcli to figure out the connection type — a connection show — and whether the lease is infinite, and if it determines that it is an infinite DHCP lease, it modifies the interface to a static IP. And this works for both: this is the IPv4 version, and it also works with IPv6, just a different file for IPv6. I thought this was a really amazing, interesting way to solve that problem, because the alternative is some kind of day-two thing to go through and set those reservations. Or I've seen some people use the Machine Config Operator to go through and effectively do the same thing, set a static IP, or use the NMState operator to do the same thing.
My recommendation has always been, for the primary interface — right, whichever interface sits on the machine CIDR — that I don't want to modify it through mechanisms other than, effectively, the boot-time kernel parameters, or the install-time kernel parameters, rather. Because if you break that interface, you've broken the node: the SDN can't stand up, it can't talk to the rest of the control plane, it can't do all of those other things. So this is an interesting way that they've worked around this. Whether or not it will be expanded to other installation types — right now it's bare metal IPI, but the same principle, I can see how it could be applied to others. So, anyways. Friendly reminder, three minutes left. Last thing. Yeah, I know, we have a hard stop today for the OpenShift Commons folks. Well, as I said, this is the last one, so. Oh, good. Yeah. That's all I've got. Wow, you nailed the timing on that one. It's like you do this for a living. And I will, once a week-ish. So, with that, please don't hesitate — again, if you have any questions, drop them into the chat and we'll try to address those in these next few minutes. I will summarize by saying 4.7 is a huge release. It's funny, I had this debate with my product marketing counterpart. Product marketing says, oh, there's not a lot going on, right? You know, we'll do a press release, but there just hasn't been a lot of things. Like, have you guys looked at what's going on? There's a huge amount of things that have gone on inside of here. Definitely check out the release notes, look through all of the things that have changed inside of there, and please don't hesitate to reach out to ask questions. You don't have to wait for the admin hour every week. You can reach out to me directly on Twitter, @practicalAndrew, or via email, andrew.sullivan@redhat.com. So, any questions with any of that?
I know Chris will also volunteer his contact information here in a moment. cshort@redhat.com, and @ChrisShort on Twitter. You might get more responsiveness out of Twitter, to be honest with you. Matt asks about the compliance operator. Yes. So the compliance operator is, as far as I know, GA. I'll have to find the documentation page. I know that they are working diligently to add additional profiles so that you can basically apply more of those security settings. So keep an eye out — I know we generically said, I think, Q1 calendar year '21 — we're expecting, I think, three additional compliance profiles to be released, including, I think we're wanting to look at the CIS benchmark; so one of those is the CIS benchmark for OpenShift. But remember, release dates are flexible. We talked about this at the beginning of the show, and literally, like, eight hours ago we had a feature get bumped by a week. But keep an eye out for additional compliance profiles to get integrated inside of here. I am working on a show focused on the compliance operator. It will probably be in the mid-to-late March timeframe — it's actually April; I'm trying to remember the schedule. So we'll go in depth on the compliance operator in the not-too-distant future and talk about all kinds of things inside of there. If you have questions about that, please feel free to reach out to us now; we'll do those kind of one-off, but we'll summarize all of that in a show as well. And with that, we're at the top of the hour. So thank you so much. Really appreciate everybody who joined today, and all the questions. Please keep an eye on openshift.com/blog for our follow-up that has all the links and other information from today's stream. Yeah, and stay tuned for the OpenShift Commons briefing we have — I think it's Datadog today. I could be wrong, so don't hold me to it. But see you over there in just a few seconds, folks. Thanks.