All right, welcome to the Commons, KubeCon. This is one of our favorite Commons of all of them, because we're at KubeCon. We thought to make it a little more special, we're going to do this between two ferns. Who didn't buy the ferns? We're going to do this between two curtains style. All right, here we go. All right. Mike's very excited this morning, you can tell, being in sunny San Diego. Let me get into it; hopefully somebody has seen Between Two Ferns. All right, so Carlton Coleman from the hiking franchise. It says here, because the word Kubernetes is everywhere, that you only speak ancient Greek. And Derek Carr, you're in the shipping business from the looks of you, the dock worker, I believe, yes. And my people did get the note about the spicy foods. We won't have any. But it says, is this why we're on the boat, by the way? I have no idea what you're talking about. Don't plug your projects yet. Don't plug your projects. That comes after this point. I will say Derek's actually going to change his last name legally to Carr with a K, because you can't launch anything these days, and now that he's got children, he doesn't want people to think that he's not contributing to the Kubernetes ecosystem. All right, in all seriousness, Clayton Coleman, Derek Carr. Clayton is a thought leader in the Kubernetes community. If you haven't met him, definitely shake his hand at some point today. Derek Carr is a steering committee member and also a co-chair of multiple SIGs. So thanks for having us in this busy week, because I know you guys are having another conference over at the convention center today, right? What? Yes, the Committers Conference or something like that. Yeah, that's right, the Contributor Summit is today. There are a ton of other events going on around KubeCon, but we're really glad to be here; ignore the unusual time. I guess we are getting somewhere. So I thank everybody for coming here. It's really exciting to us to get a chance to talk to folks. And I know that there's so many people here, but if you see us and we're not talking to anyone, do not hesitate to come up and ask us a question if you've got something on your mind. There's plenty of other engineers here. There's a ton of PMs. All of Mike's PMs are here today. This really for us is about making connections with the folks who use OpenShift, who think OpenShift isn't good enough yet. And we want to hear what it is that is going to make a difference for you in your transformational journeys. Yeah, thanks for that. So on to the news, Clay, and I hear you have something pretty exciting to announce. That's right. So today, I just tweeted about this about 25 minutes ago: the OKD 4 preview is out. And so this is roughly based on OCP 4.3. Same open source code that's been there, but it uses Fedora CoreOS. We'll have someone up on stage in a little bit who will do a demo of it. But we're really excited about this. And we apologize to everybody in the community who rightfully took us to task for taking so long to get OKD out there. It was a little bit more of a journey than we thought it was going to be when we got OpenShift 4 ready. And I'm not going to blame it all on the Fedora CoreOS folks, but communities sometimes take time, and we had to make sure we were doing the right thing in Fedora before we moved OpenShift to Fedora CoreOS. Thanks for that announcement. So let's start from the beginning: install and upgrade.
So Derek, when you walk down the corridor back in engineering, there's a saying that the first cluster is always the most important cluster. Can you go into that for a moment? Yeah. So typically, everybody starts their Kubernetes journey somewhere. Usually, it's your first cluster. And that cluster might act as a management service for other future clusters you create, and we'll talk about some of those projects later. But if that cluster doesn't stay up and running and upgraded and lifecycled, then you're kind of left high and dry. So with OpenShift 4, when we took the technologies from CoreOS and we integrated them with Red Hat, we asked ourselves, what can we do to make sure that when you create that first cluster, you have a pleasant experience, and that you can create that cluster in as many environments as possible? Enterprise IT environments are very, I guess I would say, diverse. Hybrid. Or, yeah, diverse is a good word. And so we tried to make sure that once a customer or client gets that cluster up and running, their experience upgrading and lifecycling that cluster is as consistent as possible. And we kind of took it from the perspective of the whole stack out. It's one thing to get a cluster up; then, as we'll talk about a little later, you have to keep it updated. So we did a lot of work, in the spirit of always keeping security top of mind, to ensure that you could not only install that cluster but also lifecycle the operating system afterwards. But you can't do much work without a cluster. And oftentimes, you need to create that first cluster, or create your second through your thousandth. So I was going to add here: how many people were here at KubeCon last year in December when Derek and I did the demo of OpenShift 4 on stage? So it's been about a year now, a little bit less than a year. And we had a very early version of OpenShift 4 running at that demo at KubeCon. Actually, a lot of the fundamental interface hasn't changed, right? The same concepts: operators, having the OS tied to the cluster, how we were doing updates. But what we spent most of the last year on is finding all the places where that breaks down. So I don't have the exact numbers, but Mike accused me of being a stats nerd now. So what we've tried to do is apply a much broader set of data capture and analysis on CI. And I think we've launched somewhere on the order of 500,000 to 600,000 clusters since we stood on stage last year. And that's about 10,000 clusters a week in CI. Inside that set of clusters, we've probably done something on the order of 100,000 or 200,000 upgrades from December of last year to November now. And that process, it's not just about making sure the features are there, but doing it over and over and over until we're absolutely certain, even at the really small statistical levels, that there's not some weird race condition somewhere in Kubernetes, that even though everything else worked correctly, something in Kubernetes holds us up. And actually, that takes a long time to flush out. And I think we're actually starting to see the benefits of that, both in Kubernetes and in OpenShift, as we've gotten into the practice of updating, right?
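For context on what that practice of updating looks like mechanically: in OpenShift 4, an upgrade is itself a declarative edit to a cluster-scoped resource that the cluster-version operator reconciles. A minimal, illustrative sketch; the channel and version values are placeholders, not recommendations:

```yaml
# Illustrative ClusterVersion resource; the cluster-version operator
# watches this object and drives the whole cluster, OS included,
# toward the desired version.
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version            # the single cluster-scoped instance
spec:
  clusterID: 00000000-0000-0000-0000-000000000000   # placeholder
  channel: stable-4.2      # which update channel to follow
  desiredUpdate:
    version: 4.2.4         # placeholder target version
```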
It's not just about whether you can do it; it's whether you do it all the time. There's something else to bring up, which is that the first cluster is the most important cluster because if you actually get that first one running, you have hopes of doing it a second time. So frankly, when 4 first came out, we had some work to do as we were still closing out the year to get integrated with folks' disconnected environments, your proxied environments. I get plenty of emails from Mike talking about customers who have the world's most complicated proxy configuration, right? And there's no great benefit in being able to go and start talking about clusters as cattle and creating thousands and thousands of these things if you can't even get the first one created. So it was really important to us for this year to focus on ensuring that we get the platform right, that we can integrate it in your disconnected, your proxied environments, your exotic network configurations with clouds, basically all the real world things you hit when trying to actually install and lifecycle the product in the enterprise. And speaking of the lifecycle of the real world, you both go to work and you work in the upstream Kube and you see all the projects on updating Kube and keeping Kube a simple installation. Why is OpenShift different? Why did we move the boundary of what we consider our responsibility to be the storage, the network, the security posture, the registry? Why have we taken responsibility for that, and why is that more difficult? So for me, and I think we said a little bit of this last year, we kind of made a bet with Kubernetes, which was that people want this kind of declarative approach to running applications, right? At the end of the day, it's about, can you run those applications? And containers help more than they hurt, right? There's plenty of things that all of us would look at in what we do and say, oh, you know, this isn't perfect yet. And so there's that core model of, OK, you're going to see what my application looks like, and sure, I'm going to have giant piles of YAML everywhere and I'm going to have five layers of CI/CD, but we were going to have that in some form anyway. Some of what we've tried to do is, there's always going to be some level of abstraction; how can we take a lot of the abstractions we're doing, whether it's continuous development or containerization, and bring them together? And for us, infrastructure was a really natural part of that, because at the end of the day, Kubernetes is trying to make it so that all of these machines kind of fade into the background, right? If you're dealing with a Kubernetes cluster and you're in there every day and you're like, man, this node's broken or this node's broken or this node, you've got a pet node in there that's always causing you problems, you're spending too much time operationally on something that Kubernetes is supposed to abstract. And so we looked around the community and the ecosystem. There's a number of promising open source projects, like the initial version of the Cluster API in SIG Cluster Lifecycle upstream, which focused on having APIs for machines and bringing the infrastructure under control. So that was kind of a starting point for us: we know that machines are a big problem. We know that cluster infrastructure is incredibly complicated in the way that people are deploying it in the enterprise, especially when you talk about things like which VPCs you are running.
And that was a key bit of feedback for us that was kind of surprising. The way I always looked at it was, most organizations are actually at the point in their cloud journey where there's a team, which may not be the team deploying Kubernetes, that owns networking infrastructure. So there was an early phase of Kubernetes where we're like, oh, we're gonna take that all the way from these crusty old admins who know how to run networks, and the developers are gonna run it. And that's not the reality, right? The reality is you still need firewalls, you still need security protocols, your CISO needs to know who has access to which clusters. And so for us, taking that feedback, even as we went and said, well, let's put networking under control of the cluster, we got a lot of feedback that was, well, there are specific things that are managed by another team; how can we work with you so those stay under that team's control? And that's the detail: there are lots of open source projects for managing clouds. For us, it wasn't about any individual project, it was about the sum total, which is that Kubernetes is your abstraction, and you work with the teams within your enterprise or within your company, or the people within your team, to split up responsibilities. And it's still an evolving journey. I think you said a lot there. I look at this more simply, which is that the upstream work on Kubernetes is awesome. I think Kubernetes should be everywhere and anywhere. But on the Red Hat side, when I have client conversations, they're not just interested in lifecycling the kubelet or the scheduler or the API server. The second question becomes, well, how do I integrate that with my LDAP? How do I do authentication to that service? Clayton and I have a bunch of healthy debates on just how thin a cluster should be. And so we have these discussions on, all right, well, just what should be installed by default and what should come optionally afterwards. And many times you get into conversations where everyone says, well, I want Kube, but then I also need to integrate with my LDAP, so I need authentication. Or, my central container registry is not perfect, so I actually like the integrated container registry in OpenShift because it provides some type of replication. And I need monitoring, so I need Prometheus installed out of the box. You know, storage is really important to me; the default storage in Kube isn't as great as I need it to be, I need container storage. Security is really important to me, so I need a way of lifecycling Twistlock or similar types of products. And so when we looked at the distribution as a whole, it's more than just Kube as the kernel. It's, what can we do to make it easier for you to lifecycle the whole stack top to bottom? And so that's not just Kube, but all the additional operators afterwards, which kind of reinforces why we went so deep on the operator pattern. Well, speaking of those operators, Clayton, talk for a minute about the leap that happened from controllers, to CRDs, to what we're shipping as operators now. Yeah, and it's funny. So I think it was the second public community meeting that we'd done for Kubernetes. So this was in 2016, I wanna say, 2015, 2016. And Brendan Burns from Microsoft, one of the folks who was at Google and helped start Kubernetes, and I were talking about how we wanted to have this library of patterns that make it easier to run applications. And you can think of what's in Kubernetes as that first wave of patterns.
So we said, okay, well, you implement patterns in Kubernetes using controllers and APIs. That wasn't enough, right? There was a ton of feedback from everyone, which was, well, that's really hard. So we went down this path of making Kubernetes easier to extend. That took a couple of years. I think we're at the phase where now... We also took a lot of heat. We were one of the very early extenders of Kubernetes, and we lived the journey hard, right? And so I think about ingress. In the beginning of OpenShift, a lot of people would criticize us, saying, well, why do you have these different objects than the upstream? And now we're at a phase where, if you go to KubeCon and there's 12,000 people and hundreds and hundreds of vendors, they're all basically doing what we started in OpenShift 3 with the API extensions, now through CRDs, where no longer are you having discussions on, well, is the upstream ingress or OpenShift routes the right way to do ingress? Everybody recognizes that it's more than just the 10 or 12 core API types in Kubernetes that you need to be successful, and CRDs are really the enabler for that. And with CRDs, you have to lifecycle those, because if your applications depend on this controller and CRD working, and all of your applications are using those, what happens when you do an upgrade and no one actually tested that that version of that controller and that CRD works on the next version of Kubernetes? And maybe at some point in the magical future, Kubernetes is just gonna stop changing completely, but I don't ever believe that. And so we have a duty as an ecosystem, as an upstream, as one vendor, as a vendor who works with other vendors, to build tools and patterns that help people manage this journey. Because the benefit of declarative application infrastructure is just trying to reduce the separation between what I have to worry about as a developer, as an operations team, as a systems admin, and what somebody else can do for me. And whether it's done through operators, or whether it's done through somebody running something developed by their own internal teams on their clusters, we wanna make sure that the whole ecosystem moves forward, and that takes time. One last question on this area. We moved the boundary to include the operating system in the platform. Can you talk about why we made that move for our customers? I think this is the easiest one. This was, I think, one of the hardest discussions after the CoreOS acquisition at Red Hat. So CoreOS had made a big investment in Container Linux, which had this idea of an immutable operating system, and it ran well with containers, but it wasn't really tied to Tectonic, which was CoreOS' distribution of Kubernetes. And we had a lot of early discussions which were, let's go a step further. Let's fundamentally tie machine lifecycle and how machines are updated, because at the heart of it, and Red Hat has dealt with this for a long time, containers just run on a Linux kernel. And so we're tying that lifecycle together with all of the different parts of the system: it's not just the Linux kernel starting a process, it's cgroups. cgroups is going through a major version transition in another six months or so. Sorry, starting in Fedora.
We knew that these kinds of changes would keep coming to the container ecosystem, and so it was a natural time and place to say, let's make this bet, let's go a little bit further than CoreOS had gone, and let's tie everything in the cluster to the lifecycle of the cluster. And I think we've seen a lot of success with that this year: when anyone who's running OpenShift 4 upgrades the cluster, every machine in that cluster is rebooted and updated to a new version of the OS and a new kubelet and a new container runtime. I've seen lots of people open bugs about details of how these systems work. I've never actually seen someone say, you know what, I didn't like that that machine got updated atomically and I didn't have to worry about it. And so what we've been trying to do is actually learn where there might be gaps. People want to make sure that they can control how fast upgrades are rolling. But it's been really surprising to me that we just haven't had that challenge. I look at this really practically, which is that as a community, we have to make opinionated choices on how we look to offer Kubernetes in a broader distribution. I help maintain the kubelet upstream, and for much of the lifecycle of OpenShift 3, you'd get a bug. It would say, so-and-so is not working, my pod won't start, it's stuck in some crash loop backoff. And then you'd have this back and forth with either a member of the community or a client: all right, well, what version are you running? What version of RHEL are you running? What version is the container runtime at? And you were spending more time gathering information about how to help that client than actually addressing their problem. What I really like about what we've done with OpenShift 4 is that, as Clayton said, if you go and download 4.2.3, everybody knows what that means. I know the exact level of the OS that's running at that point, the exact level of the kubelet, the exact level of the container runtime. It is immensely easier to go and actually solve and fix actual customer problems. Similarly, as Clayton noted right before I came up on stage here, there is a huge shift happening in containers right now with the transition from cgroups v1 to cgroups v2. And one of the things I hope that we can do within the OKD community, when we integrate Fedora CoreOS, is start to get that cgroups v2 real world practical experience out in the broader community much earlier. Because when you think about it, it's kind of strange that Kube has been in the wild for five years or so and nobody quite realizes that the entire set of isolation primitives that guide your containers is just about to change underneath the covers for you. And so to me, that introduces a lot of opportunity for us to engage with the community and do it safely and with some sense of empathy. I would hate for everyone to have to lifecycle the OS at the same time that we're going through this major change. So to me, it's just a no-brainer: why would you want to make your life hard? You guys must have argued for a good 30 days about how to implement the VM image or an alternative boot strategy. Can you talk about why you decided on what you did? So a lot of our focus has been on how we make sure that Kubernetes can behave the same everywhere, because that's some of the promise of containerized applications: you can make it work in a production environment, and you can make that run on your laptop or your data center. You can run it in a cloud.
And there's a lot of nuance. There's a lot of real subtle detail. But at the heart of it, we knew that we wanted something that works on bare metal all the way up to cloud environments or on virtualization platforms. We wanted to be consistent. And so actually the easiest part of the technical decision there was, if you're running on bare metal, it's harder than it has to be today to re-image those machines. And we'll actually do some demos today of the work that's going on to bring some of that flexibility to bare metal. Right now, by the way. What? Right now. Right now? Now. Angus, come on up. So, and that sort of flexibility on bare metal is also something we can take advantage of in the cloud. And so the technical details actually helped us provide that kind of experience consistently across both cloud and bare metal. Hello, is anybody able to hear me? Excellent. All right. So my name is Angus Thomas. I'm a senior engineering manager at Red Hat, and I'm going to talk about bare metal installer-provisioned infrastructure, which has been added to OpenShift in 4.2 as a developer preview. For lots of reasons, partly because of Wi-Fi but also because anything with bare metal takes a long time, I'm not actually going to try and demonstrate this live. I have a video, which I put together over the weekend with a guy called Steve Hardy, who's one of the lead engineers, and which I'm going to go through. I will start this off and I'll try and speak in line with what's happening and see how we get on. So we're actually using virtual machines for the demonstration, which is a little bit odd. It certainly is simpler to set up, and it lets you see the power state of all the machines on the right as we go along. The machines are going to be managed using VirtualBMC. The provisioning is done with PXE and the management is done by IPMI. So it is actually all of the same processes that you'd use for bare metal provisioning, except with VMs behind them. So there's a table there of the IPMI ports for all of these instances, which we're going to work through. And I actually want to add something to Angus's point. Some of what we've tried to do, and what Angus's teams have tried to do, is that at every level where we can build that linkage so something that Kubernetes or OpenShift is doing works the same everywhere, the better off we are. And so we're using VirtualBMC here, but we know that we want that to be consistent with how people actually deploy metal. And if there's one thing about metal, it's that it's hard to get metal and bring it onto a boat. I'm sure if we brought it on, we'd have to move people to this side of the room because we'd be tilting that way or something. But that consistency, and trying it, that is a fundamental part of everything that we're talking about here: trying to be consistent between all of your environments, not just cloud versus virtualization, but virtualization versus metal. Sorry, Angus. No, that's a very good point. And as the name suggests, with bare metal installer-provisioned infrastructure, the point here is that the installer doesn't just deploy the application bits, it actually brings the underlying infrastructure into existence. And when you do that on EC2, I think people are fairly familiar with creating VMs and so on, but we're going through the same processes, as Clay said, using IPMI and PXE and a lot of the things that people who are familiar with bare metal management will recognize. And it's all about consistency.
So here is the install config which is going to be consumed by the installer. A lot of this is the same as when you're installing anywhere else. There's a bunch of parameters that define the cluster we're going to deploy, the name and so on, and a specification of a number of workers. We're not going to deploy any workers in the first instance, but we are going to deploy three masters. There's a little bit of bare metal specific information: the VIP for the DNS service that we run. We also have a load balancer service which provides some of the underlying capabilities that EC2 or something would provide on a different platform. And then below that you get into the details for the individual machines. And here we've got the access credentials for the bare metal management. I don't know how many of you are familiar with bare metal management, but machines have a DRAC or iLO card, or whatever it might be, which listens on an IP and has a username and password, and that lets you log in and manage the machines. That's how we're going to do this. There are also hardware profiles, which we don't use directly, but this is a sort of future feature where we will be able to do profile matching and use machines for different purposes based on their exact profiles. And then there's the pull secret, as in any other deployment. All right, so that's the install config, and then we go off and run the install. As Clayton was saying, this is a consistent experience. This is very much the same invocation of the initial install you'd be doing on any other platform, with create cluster. So that runs off, and in a moment on the right-hand side you should be able to see, there we go, down at the bottom, the creation of, excuse me, hold on just to drive this correctly, the creation of the bootstrap VM. Now the bootstrap VM is part of the deployment. It is the control plane for the initial deploy of the cluster. In this instance, it is hosting a bare metal provisioner, which is the encapsulation of the interface that does the actual management of hardware. We are using a project called Ironic for that, which I'll talk about a little bit more later on. So at this point, the cluster is deployed. There is an API endpoint for the cluster itself and the three masters up and running. So if we go in, we can see that there are three machines, excuse me, and there is a machine set for the workers, which currently doesn't have anything defined in it. And then there is this bare metal host group, and this is a new thing. This is part of Metal³, the bare metal implementation. This lets you see details around the actual physical machines themselves and their API endpoints and so on, and the fact that they've been externally provisioned. At this point, we're running on the cluster. These were initially provisioned by the bootstrap, so they're external. But yeah, those are the actual physical machines. And it would be fair to say that those bare metal hosts are essentially your hardware inventory. They can be part of the cluster or not, and they reflect the state of the underlying hardware. And that abstraction is actually the same way we think about machines in the cloud, which is that the cloud provider, or a virtualization provider, has some representation of what that machine is.
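For reference, the install config Angus is walking through looks roughly like this. A trimmed, illustrative sketch of a bare metal install-config.yaml; exact fields vary by release, and all names, addresses, and credentials are placeholders:

```yaml
apiVersion: v1
baseDomain: example.com
metadata:
  name: demo-cluster
controlPlane:
  name: master
  replicas: 3
compute:
- name: worker
  replicas: 0                  # workers come later, by scaling the machine set
platform:
  baremetal:
    apiVIP: 192.168.111.5      # VIP for the cluster API
    dnsVIP: 192.168.111.6      # VIP for the internal DNS service
    ingressVIP: 192.168.111.7
    hosts:
    - name: master-0
      role: master
      bmc:
        address: ipmi://192.168.111.1:6230   # a VirtualBMC endpoint in this demo
        username: admin
        password: password
      bootMACAddress: 52:54:00:aa:bb:01
      hardwareProfile: default
pullSecret: '...'
sshKey: '...'
```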
A bare metal host is just extending that representation to physical hardware, and it is capable of doing all of the same things that we would want if we made a cloud provider call to some remote VM. Yeah, exactly. And part of the implementation of all of this bare metal support is being done within a sort of overarching project of Kubernetes-native infrastructure. The point is that even though the implementation that is talking directly to the bare metal is a particular bare metal provisioner, the encapsulation of all that is done in operators, so that it is Kubernetes-native, and things show up and are managed in the way that you would expect as elements within Kubernetes. So at this point we have our three nodes which are running the masters, the cluster is up, and the cluster itself has its own embedded bare metal provisioner. The next thing that we are gonna do, the file that is on the screen at the moment, is the definition of a bunch of additional machines which we're gonna be able to load in order to make these instances workers. Essentially there are two fundamental things here: there are the IPMI credentials again, the address to be able to access the interface to manage these machines, and the secrets, the username and password to be able to manage them. And in order for these things to be available within Kubernetes, so we can provision them and make them workers in a moment, they need to be registered, which is what is about to happen. There are 10 machines, and in a moment we will flip over and actually register them all. The problem with the video is you can't pressure the presenter to make them go faster when there's dead space. Last year, after you saw me, I badgered Derek endlessly while we were doing the demos. It was very satisfying, so he was smart enough not to do that this year. All right, so those 10 machines were registered, and now if we look at the bare metal hosts you can see the full state of all of them. There are the three which were the original set, which are hosting the masters, and then there are all the ones we have just registered, two of which, at the top there, are being inspected. One of the capabilities that Ironic, the bare metal provisioner, has is this whole business of going off and doing in-band inspection of machine hardware profiles and finding out exactly what the capabilities of this hardware are, exactly what's present, which can be very useful in terms of finding faults and so on, but also for differentiating between precise hardware profiles. We will see some of that hardware profile information again in a bit, but there we are: the machines are now registered within Kubernetes as bare metal hosts. Right, so now that the cluster is up and we have our three masters working, we switch over to the console, where you can see a lot that is common. And as we already said, this is about having a consistent experience regardless of the underlying infrastructure, so the representation of nodes and machines is exactly the same. There is this new thing, which is the representation of bare metal hosts. This is obviously specific to bare metal management, and as you can see, we've got our three provisioned masters and a whole bunch of machines which are reported as being available, which we will be using shortly.
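The registration step boils down to creating resources like the following. A rough sketch using the Metal³ BareMetalHost API; the host name, addresses, and credentials are illustrative:

```yaml
# The BMC credentials live in an ordinary secret...
apiVersion: v1
kind: Secret
metadata:
  name: worker-0-bmc-secret
  namespace: openshift-machine-api
type: Opaque
stringData:
  username: admin
  password: password
---
# ...and the host itself is registered as a BareMetalHost that references
# them. Once registered, the host can be inspected, provisioned, and
# power-managed entirely through the Kubernetes API.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: worker-0
  namespace: openshift-machine-api
spec:
  online: true
  bmc:
    address: ipmi://192.168.111.20:6230   # the machine's IPMI endpoint
    credentialsName: worker-0-bmc-secret
  bootMACAddress: 52:54:00:aa:bb:20
```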
There is the ability to individually add machines via the UI, which is what this screen is about. We had a YAML file and uploaded a whole load of them in one go, but we can add them individually. So now, the machine sets. This is the key point. What we're gonna do now is increase the count on the machine set for the number of workers from zero, I think, to 10, which will immediately invoke the provisioning of a bunch of workers in order to reconcile the desired state and the actual state. So if we go over to the bare metal hosts, all of these machines are now showing up as provisioning, and this is the point at which the cluster is using its own internally hosted instance of the bare metal provisioner that, as I keep saying, does all of the stuff under the hood to provision machines. Over on the right you're gonna start to see machines power cycling and being deployed with RHCOS and all of the stack of software that is needed to turn them into workers. It's fair to say you're autoscaling metal here? Yes. Exactly. It's pretty cool. The only real difference is you do have to pre-register the existence of the metal ahead of time. And I think if you think about this at larger scales, a lot of folks start with small bare metal clusters and might grow out, but if you're planning a big data center rollout, because these are APIs, because these are tools, you can use existing config management or content management systems. You could use Ansible, you could use a number of other technologies to say, I want to distribute the inventory for these machines so that they can function independently, or I might centrally manage that and make sure that all of these individual discrete clusters are sharing the same pool of hardware. In both cases, it's the same kind of patterns that you could use in a cloud environment. We're just trying to make sure that it's Kubernetes first and that the infrastructure that supports it is consistent everywhere. All right, so here we go. Now, as the provisioning is going on, we click into the details of one of these workers, and you can see some of the specific hardware state: the processors and the amount of RAM and so on. This is the information about this machine that was gathered when they were registered and inspected. And because there's a database of exactly what hardware there is, all kinds of clever things will become possible at some point around workload placement: if you've got particular jobs which require graphics cards for offloading calculations, or particular NICs, or whatever it might be. There's a whole load of stuff about physical infrastructure where you might want to use a particular machine for a particular job. So we have this discovered hardware profile. Meanwhile, our provisioning, I think, is still going on. There's going to be a leap in the video to it having happened at some point. Back over on the bare metal hosts, it's still waiting for the provisioning to complete. This is the process where the time is spent on power cycling, deployment, and so on. So there will be a moment, there we are. As if by magic, they've all deployed. And so now in the nodes view, there are all of these workers, with links to the machines associated with them. Again, in the machines, you can see each of the provisioned instances and the node that is related to them. All of that is consistent across all the various infrastructures. The machine set is showing that the state has been reconciled, with 10 instances created.
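That scale-up is just an edit to the machine set's replica count; the machine API controllers then claim available bare metal hosts to satisfy it. A trimmed sketch; the machine set name is assumed, and the selector and provider-specific template are omitted for brevity:

```yaml
# Equivalent to:
#   oc scale machineset demo-worker-0 --replicas=10 -n openshift-machine-api
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: demo-worker-0
  namespace: openshift-machine-api
spec:
  replicas: 10   # was 0; the controller provisions hosts to reconcile this
  # selector and template omitted for brevity
```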
Then, over in the bare metal hosts, you can see that all of these machines have now been provisioned, and the node role that is associated with each physical machine. And I think that's it. And actually, we talked about consistency. So there's a really important feature coming in OpenShift 4.3 that we're all really excited about, which is machine health checks: the ability, as an operations team, to define criteria. So there was a joke that went around on Twitter that I retweeted this weekend that I really liked: Kubernetes is just turning things on and off at scale. That was all we basically invented in the last five years. And I was like, okay, that's actually really great. Operators turn everything else that's not run by Kubernetes on and off. And a key part for us is that if we can control the infra, we can turn those things on and off for you. So as an example, we have statistics on the health of the OpenShift fleet. One of those is that about 0.5% of nodes are down at any one time. And it's actually a really common problem for someone to say, hey, I tried to do an upgrade and it failed. Well, why did it fail? Oh, because one of my nodes was down, but I didn't go do anything to recover it. And so machine health checks are a great feature coming in OpenShift 4.3 that don't care about the underlying infrastructure. They use the abstractions that we're building, that we've had for cloud and for virtualization, as well as for bare metal now. And those machine health checks can do that simplest thing of: this thing was supposed to be working, it stopped reporting in, why don't we try turning it on and off again? And that sounds like the dumbest thing in the world. I'm kind of embarrassed I'm standing up here saying that to people. But I think we all kind of have that secret suspicion that most problems can be fixed by power cycling something. And so we have brought you power cycling as a service in OpenShift, in OpenShift 4.3. It's opt-in, so for a given pool you'll be able to say, I want this (there's a rough sketch of the resource just below). Let's give Angus a round of applause. Thank you, Angus. So it really was amazing what he ended up showing. Angus is being modest. He's a leader in the OpenStack community. We've taken the OpenStack knowledge, that Ironic API, we've podified it, we run it on Kubernetes, and then we use it to target machines as if they were EC2 instances. Bare metal, bare metal, bare metal. Mike's very excited about this. And honestly, it's so hard to get stuff working. The first time you see this, Derek and I always talk about those magical moments in Kubernetes. The first time someone demoed this to me, I was like, that's magical. That is the next level of, I don't have to think about this anymore. And the reality is, of course, you have to know some of it. If you give it a bad BMC password, it's not gonna work, but we can tell you. And that power cycling, I feel like those are the kinds of magical things that are becoming even easier as we've gone through Kubernetes. It took a long time to get Kubernetes to the point where we could just trust it to run our software. And now we're starting to get into that phase where people have a good idea, they can use automation, they can use the tools that people are building in the ecosystem, and they can turn that stuff from, hey, this was hard before, to, oh, I don't have to think about this ever again. That's some of the power of modern infrastructure. Day two management. Let's move on to the next topic.
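The machine health check resource Clayton sketched in words looks roughly like this; the pool selector, timeouts, and threshold below are illustrative assumptions, not shipped defaults:

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: worker-remediation
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: worker   # the opted-in pool
  unhealthyConditions:
  - type: Ready
    status: Unknown      # the node stopped reporting in
    timeout: 300s
  - type: Ready
    status: "False"
    timeout: 300s
  maxUnhealthy: 40%      # stop remediating if too much of the pool is unhealthy
```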
We made a bold decision to make OpenShift represent a completely different profile between the 3.x series and the 4.x. Can you talk about how we minimized the core and made all other components a day two operation? Yeah. So over the life of OpenShift 3 and the development of Kubernetes at that time, the configurability surface of the overall platform just grew immensely. And depending on when each client got engaged, what they chose to turn on and off, and how in sync it was with the upstream at that state and time, that often led to a very broad support matrix and not always the happiest of outcomes. And so we made a conscious decision in OpenShift 4, and I think this is a real trend we're seeing across the Kubernetes ecosystem, to try to minimize the set of configuration knobs that you expose, into profiles or groupings that guide end users to ultimate success, so that you're not all having to think about all the various 600 or 700 flags just in the core Kubernetes components, and then beyond that, the 1,000 additional flags that come through in every add-on that one lays onto the platform. But then the even more critical thing was, once you get a cluster up, you don't want to always be messing around with bash scripts and that type of thing. And the discussions around CRDs and the like are: how do we make the state of the cluster always reconciling, always knowing what you want? And we saw a lot of trends emerging in the ecosystem around GitOps and practices like that. And frankly, we think OpenShift 4 right now is a great target for integrating GitOps solutions too, because if you want a YAML definition of how you configured your IdP, or a YAML configuration of how you deployed or tweaked Prometheus on the cluster, that is your interface now to the cluster; that is the API definition. And then you just trust that the operators running inside will make it so. And that's really the guiding principle behind our day two management. Now, Mike, as a product manager, is upset that we didn't bring over all 1,000 flags, because many of you, I apologize, we might have missed a handful. And we're working rapidly to bring them in. But we think, in the end, our ability to understand how people are using the system, and our ability to test transitions across those states, is just gonna be way better. And it's an interesting thing we talk about. If you do something all the time, you're familiar with it and it works. If you only do something rarely, oftentimes you find out it doesn't work when you try to go change it. And so that reconciling model for config actually has a lot of nice properties, because it makes it really easy to test config changes: you stand up a cluster and you apply the config, and if it doesn't work, you apply the old config. And very, very rarely it might break. And that's something that we can do. But it's easier for us to test OpenShift if you can boil the details of getting a new cluster down to some standardized process that you have; for us it's CI and our integration systems, for someone on-premise it might be an automated deployment system, or a continuous delivery pipeline that deploys your clusters. But if you've got that configuration, you can change it and then change it back, and it happens quickly. And I think that's a key point too: it happens quickly.
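To make that YAML-definition-of-your-IdP idea concrete: on OpenShift 4, identity provider configuration is a cluster-scoped resource that the authentication operator reconciles. An illustrative sketch with an LDAP provider; the hostnames, DNs, and secret name are placeholders:

```yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster            # the single cluster-wide OAuth configuration
spec:
  identityProviders:
  - name: corp-ldap
    mappingMethod: claim
    type: LDAP
    ldap:
      url: "ldaps://ldap.example.com/ou=users,dc=example,dc=com?uid"
      bindDN: "cn=svc-openshift,dc=example,dc=com"
      bindPassword:
        name: ldap-bind-password   # secret in the openshift-config namespace
      insecure: false
      attributes:
        id: ["dn"]
        preferredUsername: ["uid"]
```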
Quick, reversible config changes also mean it's a lot easier for you to bring those into your own environments and test those changes, with the confidence that, well, I can test it, but I can then move back to the previous one. That allows us to do things like: if you wanna roll out a change to your clusters and it fails halfway through, just go change the config back to what it was previously. And OpenShift's job, because OpenShift isn't doing something one-off for that particular config, is just rolling out a config change to machines, or rolling out a config change to the control plane. That means that that process is also very well tested. And so it helps us get away from, this parameter over here causes this system to fail; it doesn't work like that anymore. Speaking of changing a cluster, talk to me a little bit about upgrades and why you went as far as to create a SaaS service to help our customers. Yeah, so we have a pretty cool dashboard that we can see now in OpenShift engineering, covering everyone who uses OpenShift 4 and opts in to provide telemetry data back. We can see all the clusters that are being created in the world, with anonymized data about how they're configured and whether they're being successful or not. And frankly, that's what our operators are striving to do: give us that information so that we can see problems at scale. I've always been envious of cloud environments that have a standard platform or operational view. And we at Red Hat, working with many of you, don't have that luxury, right? So we have to innovate and ask, well, what could we do if we were the SRE team for everyone's data center? And so in the back end, we have this massive telemetry system that is getting data sent back from clusters. And our engineers wake up every day, log into a Grafana dashboard, and say, hey, what's looking odd here, right? And you might see that a 4.2 customer had configured a proxy, and we now know that that proxy's misconfigured, and we're able to proactively engage with that user to say, hey, we see you did something weird, and oftentimes they didn't even know they did something weird, and try to help them out. The other thing around day two config management is some new components, and maybe Clayton wants to talk about this: our insights operator, which can gather basically that configuration of the cluster, anonymized. Do you want to talk through that a little bit? Sure, and this is always an interesting topic. So the analogy that I would draw is, as Derek said, when you run on the cloud, there is an SRE person who is watching that system. They may not have all of the details, but as part of your relationship with that cloud provider, you're trusting them to help you succeed and to make those details go away. And so one of the things that we wanted to offer as that option was the ability to help us ensure that you succeed, whether it's the telemetry component or the insights operator component, which are really just what we think of as the remote health checking part of OpenShift. We can bring someone like Derek to that dashboard in the morning as part of his warmup for starting the day, and seeing what's working and what's not working helps Derek frame how he engages with the open source community. So for instance, the most recent one I was struggling with: we were actually seeing during upgrades that there was a problem with some of our Kubernetes pods that was resulting in an unusual behavior.
We dug into that; it was triggered by some of this health data that we were seeing. It turns out there's a bug in the kubelet that's been there for about four years now that's actually fairly serious. We get that kind of insight through bug reports, and we get it through support tickets, and we get it through discussions with customers, but here we were able to short circuit some of that process. So the insights operator is tied into our insights service, and it helps us collect a little bit more, properly anonymized. But at the end of the day, we're trying to save customers and users and people trying out these systems when you hit a new configuration that we might have slipped up on, or that an upstream release has regressed. It's not enough just to ship software. We have to help people succeed. And so this is an evolving journey, obviously, but we've been pretty pleased by the results people have brought to us. Let us be your SREs. And speaking of SREs, I was at a customer site who was upset that we made the console look as nice as it does for Kubernetes configuration, because it was really driving their customers, their end customers, to touch it. And what we're finding is customers don't want to touch the cluster anymore. They want to use GitOps. Can you talk about this craze with GitOps? Yeah, so a little earlier in the year, actually, all of us basically did a broad survey of the state of the GitOps world. I don't know how many projects we went and met with. At least 15, I think. It felt like a lot. And we were pitched a lot of them. GitOps, to me, is a nice way of saying, what can I do to decouple my app from the fate of that cluster? So that if that cluster did have a momentary issue, if that region had an outage and I need to shift something elsewhere, GitOps is the pattern that decouples your application delivery from ultimately successful deployment to production. And so it's obviously a use case that is broadly applicable to the world, and one that we in OpenShift will have to provide native support patterns for. So we looked at a number of projects. One of the ones that stood out to us was Argo. We felt the community was really solid. We've been doing some work to demonstrate reference architectures integrating Argo with OpenShift, both to provide cluster configuration as well as application definitions. So in the case of Mike over here talking about, yeah, we have a great web console that lets you go and change the setup of the cluster: if you did tie it to an Argo pipeline or an Argo GitOps flow at the end, even if Mike was a nefarious individual and changed something locally, you can trust that the GitOps solution is just gonna undo it and put it back as it should be. And at the end of the day, this is like everything we hope for, which is just: what can we do to eliminate drift from the platform? Because drift is ultimately what causes bugs, ultimately causes unhappy apps, ultimately causes challenges for me and Clayton and others to support. And speaking of Argo, let's bring Ryan up and take a look at what Argo looks like against OpenShift. I hope I don't have to be this close. It's gonna be awkward. All right. I hope I can cover Argo CD as well as Derek covered GitOps; he pitched the idea much better than I possibly ever could have. This microphone, there we go. Let's see if I can move it here and up. Okay, so what we're gonna do is this.
We're gonna take a look at Argo CD as a sysadmin as well as a developer. So we have two new clusters that have no authentication set up besides kubeadmin. Pretty common issue. So what we're gonna do is use OAuth. We're gonna use GitHub for authentication, and we're gonna apply the same configuration to two clusters. So I'm gonna choose my Git repository. And then, through the use of Kustomize, I have commons east-1 and west-2, and the only difference between them is just the secret that lies underneath for GitHub. So we'll choose commons east-1, and we're gonna create this application. So what that's actually gonna do is call back to Argo CD and tell it to run a Git sync and apply what is in the repository now. So to make this more interesting, I'm actually gonna do the same exact steps in our west-2 environment, using the same repository, using the overlay this time of west-2, which contains our secret for GitHub for the west-2 cluster. Choose west-2 and apply it. So what we're gonna see here is that auth one, well, auth one, which is cluster one, has already synced. So what we see within that cluster is that we have different things actually applied to it. And you can actually directly look at the manifest that's being applied to the cluster. We see that, let me scroll down here. Three scroll bars, I think that's a new record. I know, right? That's pretty good. We see that I am a cluster admin of this directory, of this cluster that is. And we also see that Derek is an admin as well. Congratulations, Derek. Thank you. So this is an extremely common pattern, which is just: I have 10, 15 clusters and I need them all to have a similar shape and structure, some default namespaces, some default role bindings, and I need to ensure that as a new one comes up, it stays in sync. Yep, that's exactly it. And as you can see here with the last scroll down, I have the organizations openshift and multicluster applied. So that means any member of those organizations can go ahead and work on this application. And we see that west-2 is also synced. So if you know anything about the OpenShift GitHub organization, there is a ton of people. So Clayton, we're gonna give you a new job as well. You get to be an admin on there as well. I appreciate that. So what we're gonna do is go into our base. Kind of like Derek was saying, there are some common things that you're gonna apply to every one of your clusters. So we're gonna modify our admins file, and as you can see here, Clayton is added. So we're gonna go ahead and ship this change adding Clayton as an admin. You're gonna give me the talk about how with great power comes great responsibility, or? I feel like at this point, you probably know that better than any of us. So again, we're just gonna go and hit the sync process. We have the ability to prune, so if, for example, Derek had added a user as an administrator, it would go ahead and remove that for us, because it's not part of the Git repository. So we'll synchronize that and do the same thing on our west cluster. And now you see Clayton is a cluster admin. Just that fast, the configuration was applied both to east-1 and west-2. So now let's look at this from a developer standpoint. We have an application. It's just a game. And we'd actually like to run it in both of our data centers. So we'll use the sync policy again of automatic, the same repository, and our east-1 overlay.
We use the east-1 overlay because we're actually using OpenShift routes within our repository. So as they were saying earlier, everything that is configurable within OpenShift is accessible and manageable through, you know, a Git repository. So when I click create here, it's gonna go ahead and create everything that we need for our application, from the namespace and service accounts to even our deployment. As you can see, the application is coming online now. So let's go ahead and add our west-2 cluster as well. You know, I have to ask, Ryan, can I put all of the config that you're doing here in a GitOps repo to drive GitOps on my GitOps? Actually, you can. So you could put GitOps in a GitOps if you wanna make a meme of it. So the really cool thing about Argo CD and GitOps in general is it has this idea of applications, and all that means is it's just YAML. So in theory, I could take this manifest here, plug it into a Git repository, and then I could have Git managing Argo, managing another Git repository and our applications. So it's kind of a multi-layer fun story. So as you see, our game deployed. You see it's running in east-1. So what I'm gonna do is, right now we're on the West Coast, so why even run it in east-1? So we're gonna modify our game config and actually set our replicas to zero. And this is through the use of overlays; we can say that we want no pods running in east-1. And this makes it so we don't have to interact with a load balancer. We don't have to worry about external services to manage shutting that cluster out of the rotation. So we'll go back in. Hopefully we don't get caching. We have a little bit of caching. So what I'm gonna do is actually force sync this application to remove that pod. So as you see, right here is the deployment of our application, our Pac-Man game. And it's synchronizing, and we see out of sync, and it is terminating the pod. That took seconds. That's just so slow. Do you think you could speed that up any? I know, you said you were gonna crash it. That's why you have to be admin of that one cluster. It's amazing, too. If you think about a lot of these demos, and we've done, you know, I think Kelsey Hightower's first demo of Kubernetes at one of the meetups in the San Francisco Bay Area, this is still such a valuable pattern. And the reason these demos matter is because it's about trying to quickly tie intent to action. And if we talk about everything that we've built in Kubernetes over the last five or six years, it's about trying to get to this declarative point, which is: if you say something's true, then you can have relatively lightweight processes like Argo that take the truth from Git and move it to a cluster. And then the cluster can handle those details and summarize them up. And you've kind of broken the problem down into individual steps. It doesn't mean you can't do it other ways, but this is the pattern that I think keeps coming back as we work with Kubernetes. It happens so fast because we've broken the problem up, and that's how organizations scale. That's how developers and operations teams split responsibilities, and you put tools and APIs in place between them. Exactly. So, just to show that we're in line, it's running only in west-2. And just as an extra addition for your development flow, I wanna show swapping an image in the Pac-Man deployment.
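The overlay trick Ryan used to take east-1 out of rotation looks roughly like this in Kustomize; the pacman deployment name and directory layout are assumptions for illustration:

```yaml
# overlays/east-1/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base                 # the shared application definition
patchesStrategicMerge:
- scale-down.yaml
---
# overlays/east-1/scale-down.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pacman
spec:
  replicas: 0                # no pods in east-1; no load balancer changes needed
```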
So applications require different images at certain times. So what I will do is make a change to the application that we're running, the game. And the very cool thing is, we will go back to our console, go to our west-2 data center, click the right app, and force a sync. And that's gonna actually do our standard deployment flow of replacing the current deployment config that defines the old image, putting the new image in place, and our application will still be running. That's awesome, thank you. Can you bring those slides back up? Ryan works in our solution engineering team, and that team is writing several reference architectures for a lot of different GitOps engines. We don't think customers necessarily want us to ship the GitOps engine, like Argo or Razee or Flux or whatever the case may be. They just want guidance on how to target OpenShift with those solutions, and he's documenting those for us. So thank you, Ryan. Let's skip ahead to multicluster. Somebody talk to me about Hive. Yeah, so I don't know if everybody knows this, but OpenShift also runs as a dedicated service. You can go and ask Red Hat to get you a cluster, and we will SRE it on your behalf. And Hive is actually the project that is the backbone of that managed service offering. We've yet to integrate it into a Red Hat product outside of the OpenShift Cluster Manager, which you see as a SaaS offering if you have subscription quota to go and create new dedicated clusters. At the end of the day, when you're creating those clusters or manipulating them, you're interfacing with a component called Hive. And Hive is basically our multicluster lifecycling API that can create and destroy and resize clusters, as well as deliver config down to those clusters. We have a lot of operational experience now with the project over the life of 4, running it in production on AWS and a growing set of cloud platforms. And some of the work that you saw from Angus we're integrating now so that you can use Hive as a lifecycling solution for bare metal. But the interface of the project is pretty simple. You say, I want to create a cluster, and it has the same basic structure you see in the OpenShift install config to describe the shape and structure of your cluster, and then a few minutes later, out comes a cluster, and you can get all sorts of operational detail, like the logs. It can manage DNS for you. It basically is the backbone of much of our dedicated offering. So I can ask for a cluster just like I'm asking for a pod. Yes. It basically brings declarative cluster management to OpenShift, and the benefit of this is that as we improve the Hive project, we also improve our managed service offerings. A little bit earlier, you were talking about all these clusters sending telemetry back, and then us looking at some of that telemetry to find patterns. That is actually building a massive Prometheus cluster in the back end. Can you talk about what we learned and how we're pushing that out to the customer base? Sure. For folks who've used OpenShift, we've had the cluster monitoring operator in OpenShift since 4.x launched; it was a tech preview in 3.11. That's really the continuation of the investment that CoreOS was making in Prometheus. After the acquisition, we'd already been looking at Prometheus, and we were super excited to continue to put more and more resources into this area of Kubernetes, because monitoring is such a fundamental part of understanding what's going on.
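Before the monitoring thread picks up, here is roughly what asking Hive for a cluster looks like; a trimmed sketch assuming the hive.openshift.io API, with names, region, and secret references as placeholders:

```yaml
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: commons-east-1
spec:
  baseDomain: example.com
  clusterName: commons-east-1
  platform:
    aws:
      region: us-east-1
      credentialsSecretRef:
        name: aws-creds                        # cloud credentials for provisioning
  provisioning:
    installConfigSecretRef:
      name: commons-east-1-install-config      # the same install-config shape
  pullSecretRef:
    name: pull-secret
```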
But it really comes down to what details are truly important. And so it's about connecting, a little bit like we said before: it's about trying to connect the people who understand a particular subsystem, making sure that they have great metrics and are surfacing those up. So we've continued to invest in Prometheus and in exposing it in OpenShift; in OpenShift 4.2 there was a metrics viewer, and we've had the alert UI as well. We're taking a lot of steps to continue to grow that on a single-cluster basis. But there's kind of a reality of scale that comes when you start talking about either huge deployments within enterprises, when you're talking about hundreds or thousands of clusters, or when you're talking about hundreds of thousands of applications, because to the monitoring stack a Kubernetes cluster is really just a pretty simple, straightforward application. You might have 50 nodes, you might have 1,000 nodes, but some of the world's largest applications run on top of Kubernetes. You might have thousands of pods, or tens of thousands of metrics per service, and that might include metrics from your load balancers or the storage engines or databases. And so one of the things that we looked at, and worked in the open source community to make an investment in, was a project called Thanos. Thanos takes the idea of Prometheus but builds it out in a cloud-native fashion so that it's infinitely scalable, or as infinitely scalable as any software is. The idea was that there was a very naive, or let's call it simple, federation aspect of Prometheus that we knew there were limits to. And so we worked within the Thanos community to say, we wanna be able to ingest millions of time series records a second from thousands of clusters or thousands of applications. And so the first parts of what we consider the productization of that super-high-scale observability aggregation is what we're using in our managed services today, so that individual clusters send very small anonymized chunks of Prometheus data about the functioning and health of the cluster directly to Thanos. And we use this as a really tight loop between Red Hat, our customers, and the open source communities, feeding back learnings about how Prometheus works. We send that through an API that was developed for exactly this: the remote write endpoint in Prometheus was spearheaded by Red Hat engineers who knew that aggregation was a challenge Red Hat customers were facing, and that allowed us to focus our efforts on centralizing those metrics. And that's really the underpinning. There's a lot of exciting things that'll be coming in the next year or two in high-scale observability across applications and clusters. Our work in OpenShift 4.3 brings a tech preview of user workload monitoring and observability, so that you can tie in all the mechanisms without having to run your own Prometheus servers. That's all going to come together over the next year or so, and it was led by this effort to bring together the data on our managed clusters. So Derek, we've been working upstream on a lot of different multi-cluster activities for quite some time now. I think in the last year we noticed a trend of a lot of people running, sort of not an agent, but something in another cluster to allow you to see multiple clusters. And we ran into some great partners this year in the upstream. Can you talk about some of the areas where we ran into each other, like IBM's MCM? Yeah, sure.
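As a sketch of the remote-write plumbing just described, assuming a Thanos receive endpoint at a made-up URL; the metric names in the relabeling rule are examples, not the actual telemetry allowlist:

```yaml
# prometheus.yml (fragment): forward a filtered stream of samples to a
# central, remote-write-compatible aggregation tier such as Thanos Receive.
remote_write:
  - url: https://thanos-receive.example.com/api/v1/receive   # assumed endpoint
    write_relabel_configs:
      # Ship only a small, anonymized subset of series, in the spirit of the
      # telemetry described above (these metric names are illustrative).
      - source_labels: [__name__]
        regex: "cluster_version|cluster_operator_up|ALERTS"
        action: keep
```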
So generally, when I think about the multi-cluster space, and I think there's a discussion at KubeCon about this, might be on a panel, people often look at it in isolation, right? You need a way of lifecycling and deprovisioning clusters, and in the OpenShift context that's more than just the core of Kubernetes, it's everything that goes with it. So that's our Hive project. Then you need a way of monitoring all these clusters, right? And so that's what Clayton was discussing. But then we've seen an emerging trend around wanting to run a hub-and-spoke model, essentially, where you would define a central hub service or a management cluster that can then go and reach into these spoke clusters to drive configuration. And frankly, if you look at Argo, it's just another one of those similar types of tools that's delivering config. In the partner community, IBM with MCM had a solution that's been in market for over a year now. And as we started to work with the IBM team, we realized, oh wow, we've been working in a lot of common projects together that we didn't realize. So we've made a large investment in the SIG Multicluster community upstream. And from there, we've had adoption of some of the components around a cluster registry. Not every cluster can be lifecycled by some automatic cluster provisioner. Many of our clients that have to go and deploy clusters on pre-existing infrastructure still want a way of aggregating that into their multi-cluster view. So that was really the role of this SIG Multicluster cluster registry project: no matter how clusters came into existence, giving a listing of all of them and some data about what's there. And so when we met with the IBM team, we were like, oh, cool, you're really using that? We thought we were the only ones looking at using that. And so they're adopting that from the SIG Multicluster community. And then there's a lot of interesting trends around agents, wanting to stick agents into clusters and feed data back. So today we have a telemetry agent in every OpenShift cluster that feeds data back to Red Hat if you opt in. But if you're running this on-premise and you can't give data back to Red Hat, you're running an air-gapped environment, you wanna be able to still aggregate data back to your on-prem hub or your data center hub. So one of the cool things that the MCM team has is this agent called a Klusterlet. And what it's capable of doing is distributing application content across spoke clusters. And one of the application concepts that it's using from the upstream community was developed in SIG Apps, which is the application CRD. And so there's a lot of interesting things happening in the multi-cluster space, and a lot of the piecemeal projects that are being developed, both in the upstream Kubernetes community as well as the broader ecosystem, are starting to get integrated into some cool products. And we've been working to make that successful with the OpenShift team. Yeah, I have not seen it. I don't know if you guys have seen it. Let's bring Tim up. Tim, come and show us this IBM MCM. No pressure. No pressure, no pressure, go up. Wow. You need a taller microphone. Okay. All right, so my name is Tim Boyer and I work for IBM. I don't work for Red Hat, although maybe, now that we're merged, we're all one big happy family. So I used to run the CI/CD team for IBM Cloud Private, which used to be a friendly competitor with OpenShift.
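For reference, the SIG Multicluster cluster registry object Derek mentions is deliberately small; a registered cluster looks roughly like this, with the name and endpoint assumed for illustration:

```yaml
apiVersion: clusterregistry.k8s.io/v1alpha1
kind: Cluster
metadata:
  name: web-dev-1                 # hypothetical spoke cluster
  namespace: mcm-web-dev-1
spec:
  # The registry records that a cluster exists and where to reach it,
  # regardless of who provisioned it or how.
  kubernetesApiEndpoints:
    serverEndpoints:
      - clientCIDR: "0.0.0.0/0"
        serverAddress: https://api.web-dev-1.example.com:6443
```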
Obviously, you guys know who won that battle, which is fine, because now we get to focus on other things, things that add higher value, like MCM. So like you've heard this morning so far, a lot of the things around multi-cluster management really have two different sides to the story. One is, how do you provision everything? How do you get the infrastructure up quickly and easily and pull it in so that you know about it, you know its status, you're aware of it. The other side of that is, how do I govern this? Once it's up and running, how do I ensure that my security policies stay in place, or that my applications are being deployed from a particular repository, or I know for sure that my images have been signed before they've been allowed to be deployed in a particular cluster. So I'll just jump in a little bit. This is the MCM console, and I'll look down here at clusters. So this is a list of clusters that we pull in and understand that we're managing. Now, we never started on the provisioning end. We discovered that maybe we weren't so good at that, and there were other people that were really good at provisioning clusters and setting them up. So we always took the point of view that you have a cluster somewhere and you want us to help you manage it. So this comes back to the cluster registry work that you were mentioning earlier that we decided to use. So we could easily import a cluster, but now with Hive, which we really just started discovering last week in RTP, we're very excited about integrating Hive into this product so that we can start standing up clusters and then having them automatically be imported and managed by MCM. So these clusters, this is not a very complicated model. We do have an integrated web terminal, since when you're working with multiple clusters it does become a little bit more difficult to switch contexts between them. This comes from an open source project, or a project that we've open sourced, called Kui, which is a web-based terminal emulator that is very easy to use and easy to switch context with. So now that I've got this open in my web console, I can say oc get clusters, oops, sorry, not clusters, cluster. Or not. Gotta use the namespace flag. All namespaces. So now I can see all the clusters being managed by MCM. If I want to take a look at this, you can see that the YAML will show me that I'm literally using the cluster registry object type. Once I have these, once I understand what clusters I'm running, then the next step is applications, right? So I know that, being Red Hat folks, you probably aren't big fans of Helm. No, no, no, that's not true. Not true. Okay, we love Helm 3. The perception maybe is that you might not be big fans of Helm. That was just Helm 1. We love security. Okay, okay, that's a great way of putting it, all right. So the one thing that Helm did bring to the table that is very useful is this logical binding around an application: what are the different pieces that make up an application? So if I did a Helm release, I could easily go and see, under that release, all the different objects that were created in my Kubernetes cluster related to a single, quote unquote, application. If any of you have ever gotten a call from your CIO asking why the app is not running, he usually doesn't say "why is the microservice not running" or "why is the object not running." He usually says, why the heck is the app not running?
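Tidied up, the terminal interaction from a moment ago amounts to something like this; the cluster name and namespace are hypothetical:

```shell
# List every cluster registered with MCM, across all namespaces
oc get clusters --all-namespaces     # or the shorthand: oc get clusters -A

# Inspect one entry: it is literally the cluster registry object type
oc get cluster web-dev-1 -n mcm-web-dev-1 -o yaml
```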
So we like to think about things in a logical sense of, I have applications. So we've adopted the application model from the Kubernetes SIG Apps. So if I say oc get application, again, I have to say all namespaces so that I am being secure here. Uh-oh. Add an s. Add an s. Oh, namespaces, thank you. You can just use dash capital A. That is part of the value Red Hat brings to the community. We're like, I hate typing dash dash all namespaces, so we add a dash capital A. I'm still getting used to oc, okay? I did everything with kubectl before. kubectl got it too, that's the value of open source. So we can look at things like stock trader. Let's pull in this model here, and basically you can see it comes from the application type. We can then go a little bit deeper on this app, this stock trader app. It'll be a little bit easier to explain if I jump out here and look at it from, let's call it, a prettier view, right? Sometimes CLIs are really easy but very difficult to understand when looking at things. So if I look at the stock trader app from this perspective, you can see I have kind of a topology view of what this looks like. Ultimately, I have an application. This is my stock trader app. This application has a subscription. Now, this is a concept that we've introduced to application management within MCM. Everything is built off of an application, but an application has channels and subscriptions. So there's an open source project that we put out there where we're building this in the community. So I guess you can teach a blue dog red tricks. So we're putting this out there in the community to get feedback and develop it further, but ultimately you have a channel, which describes where it's coming from: what is it that I want to deploy? This could be, like, a Helm repo with a specific Helm chart, or it could be a GitHub repo or Git repo pointing at some YAML file definition, right? It doesn't really matter what it is. It can also be a namespace with secrets. If I contrast what you're showing here versus the prior demo, right, when Ryan came up and he showed Argo integrating with particular target clusters: with Argo today, you have to basically describe your cluster and bind your Git source to that logical cluster. MCM is just taking it one layer beyond that, which says clusters may come and go. They might get imported into your hub, and the application model you're showing here is basically saying, I'm going to be able to dynamically choose which cluster I assign that app to, in case that cluster is no longer healthy and I need to rebalance my app and keep workloads up. That's correct. That's basically what I view as the major point of contrast, and then Argo itself would be a great tool to push into your hub, right? Yeah. So we have this idea of subscriptions and channels, which have placement policies, as sketched below. So in Kubernetes, when you deploy an application, the scheduler just decides which node to put each pod on based off of criteria. Well, our placement policies are somewhat of a similar concept, in that I can choose which cluster to place my application in based off of some criteria. In this case, I'm just matching a label. So we labeled a cluster and we're just gonna match it. But there is the possibility of creating much more complex and much more elaborate placement policies for how to manage the distribution of my app across clusters. Now, in this particular case, there's only one cluster, which is web dev one, but we've got eight deployment resources here.
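Roughly, the model being described stacks three kinds of objects. The application below uses the SIG Apps CRD schema; the channel and subscription follow the shape of the operator IBM later open sourced, so treat those API groups and fields as illustrative rather than exact:

```yaml
# SIG Apps application CRD: a logical grouping of everything that makes up
# "the app", selected by label.
apiVersion: app.k8s.io/v1beta1
kind: Application
metadata:
  name: stock-trader
  namespace: acme
spec:
  selector:
    matchLabels:
      app: stock-trader
  componentKinds:
    - group: apps
      kind: Deployment
    - group: ""
      kind: Service
---
# A channel says where deployable content comes from (Helm repo, Git repo, ...).
# API group and fields here are illustrative.
apiVersion: apps.open-cluster-management.io/v1
kind: Channel
metadata:
  name: stock-trader-charts
  namespace: acme
spec:
  type: HelmRepo
  pathname: https://charts.example.com/stock-trader   # assumed chart repo
---
# A subscription binds a channel to the app and defers the cluster choice to
# a placement rule, so the workload can move as clusters come and go.
apiVersion: apps.open-cluster-management.io/v1
kind: Subscription
metadata:
  name: stock-trader-sub
  namespace: acme
  labels:
    app: stock-trader
spec:
  channel: acme/stock-trader-charts
  placement:
    placementRef:
      kind: PlacementRule
      name: stock-trader-placement
```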
If I can scroll, zoom out here: we've got four services over here on the left, and then we've got four pods on the right. These are all deployed to a single cluster. Let's look at another application that at least attempts to do what we would call a hybrid deployment. So by hybrid deployment, I mean not everything is contained on a single cluster. Once we've opened up the ability to manage multiple Kubernetes clusters and we've separated out the deployment of an application from the cluster itself, we now have the capability of taking an application and deploying it across clusters. If the application is built appropriately and is using the right tools, you can then take an application, in this particular case a ticketing application, and split it: I've got the front-end web app portion of this application running on one cluster, web dev four, and then I've got a back end, which is just a Redis master, running on social dev one. But the key is that both of these pieces, both of these deployments on web dev four and social dev one, have to work together in order for the application to work. So how do I know how my entire application is performing when it's deployed across clusters? So again, we are using Prometheus. Now, we may need to take some notes from you guys on how to handle federated Prometheus better, because we do federate Prometheus at this point from all the spoke clusters. But once we have this pulled in, this is a dynamically generated Grafana dashboard that gives me all the information about the pods in my deployment. And it's giving me information about the cluster that my deployment is deployed on. And this is dynamic. So if I change the placement policy of my web dev or my Redis master, so let's take a look at that. It is, yeah, the wifi's a little, okay. So let's switch over. I think I'm in, let's see what project I'm in right now. Should be in Acme, yes I am. oc get, let's go, subscription, subscriptions. So these are my subscriptions for the ticketing application that I'm in. So I've got one here for the ticketing Redis master. And if I look at the placement policy, I have a placement policy rule called Acme Railways Ticketing Redis Master. So if I edit that, oc edit placementrule, paste that in. So now I'm gonna be editing the placement policy. And instead of social dev one, I'm gonna change this to social dev two. And what we should see is that this policy is automatically applied, and we should end up with this application, the Redis master, automatically being deprovisioned from social dev one and provisioned on social dev two. So if we jump back out to our application and we look at our ticketing app again, now you'll see that the Redis master is on social dev two instead of social dev one. And if I pull back open this Grafana dashboard, what you should see is that instead of monitoring social dev one, since no piece of my application is on social dev one anymore, I'm now gonna be looking only at the clusters that I actually have application pieces on, which is social dev two. Cool, thank you. That was amazing. Tim, let's give him a round of applause. And that's gonna conclude our session. I think we talked about a lot of technology. We saw a lot of demos, a lot of exciting stuff. MCM's available right now against most of your Kubernetes clusters, including OpenShift. Any closing thoughts?
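What Tim edits in that step is, give or take the exact schema, a placement rule like the one below; again the API group and fields are illustrative, but the one-line change of the cluster label is the whole rebalancing operation:

```yaml
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: acme-railways-ticketing-redis-master
  namespace: acme
spec:
  clusterReplicas: 1              # run on exactly one matching cluster
  clusterSelector:
    matchLabels:
      name: social-dev-2          # was social-dev-1; changing it moves the Redis master
```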
So I'm really excited about where we're going with managing clusters: working closely with the IBM team, finding ways that we can share technology in the open source communities, helping drive upstream projects in directions that make it easier for people to not only build and manage the clusters they have today, but build and manage the coming sets of clusters that they will have in the next several years. I'm actually really excited about the opportunity to participate in going to that next level above Kubernetes. We've spent a lot of time in the last couple of years building the ecosystem and working to make these things possible, and I'm really excited about starting to put a single-pane-of-glass mindset on the whole Kubernetes ecosystem. Yeah, I mean, to me, the type of demonstration you saw there just shows years of hard work on a lot of our parts to make what you just saw possible, and kind of not that big of a deal. But I'm excited, as Clayton said, to continue to engage with everybody in the open source community to drive useful outcomes out of all the projects that we're trying to work on together and assemble. Great, well, thank you for the conversation. And back to you guys.