Welcome to the session, everyone. It's become a bit of a tradition for me to sit down with Derek Carr and Clayton Coleman around Thanksgiving time, at KubeCon North America Commons, and really dig into what the year ahead looks like, why we're doing the things we're doing, and how it's connected to you. With that, gentlemen, introduce yourselves.

Yeah, thanks, Mike. It's always a pleasure to get together with you, in person or virtually, here at KubeCon in the Commons. My name is Derek Carr. I'm one of the architects on the OpenShift engineering team, and I've had the pleasure of talking about Kubernetes and OpenShift for the audience for many years now. I'm excited to be here again. Clayton?

And it's also great to be here, Mike and Derek. I love having this conversation with y'all, and it's just a great time of year to gather together virtually and talk about our favorite subject, which is OpenShift. As everyone knows, we can talk about that forever. I've worked on OpenShift for almost eight years now, and every year we come up with something new because the industry is changing. As both OpenShift architect and hybrid cloud architect at Red Hat, I'm totally committed to seeing this awesome experiment and this great ecosystem continue, and I love being here to talk about it.

All right. Thanks, gentlemen. And it's not just the two of you. We have Rhys Oxenham, Annette Clewett, and Paul Morie popping in and out to show some product demonstrations and concepts of what we're going to be discussing.
So with that, let's dig in. I think it's best if we start our discussion around all the innovation that's happening around bringing Kubernetes out to the edge and, at the same time, closer to the hardware or infrastructure, whatever word you want to use. I think it's probably fair to say that you two spent the bulk of 2018 and 2019, and truthfully even some of 2020, moving Kubernetes into this self-managed control plane, and a lot of people don't necessarily understand what that means. Maybe they're just using Kubernetes, or OpenShift 4 and Kubernetes, and they're not necessarily picking up on the complexity of the use case that we solved for them with that 2018, 2019, 2020 investment. So Derek, why don't you talk about that a little bit and look under the hood, if you will?

Yeah, sure. Thanks, Mike. It's always good to come back to Commons and reevaluate the progress we've made, and you alluded to the big pivot we made in OpenShift 4 around providing a self-managed and self-updating Kubernetes platform. I think last year when we met, we talked about how the first cluster is the most important cluster to get into a customer environment. But once that cluster is in your environment, you want to ensure that all the software that is used to support and run that cluster upgrades together and continues to work as a full platform. One of the unique capabilities we have at Red Hat is being able to provide a platform from the hardware out, right?
And so within OpenShift 4 today, you install your cluster and you get an immutable version of RHEL that we call RHEL CoreOS, which provides a platform for not just the cluster infrastructure services themselves but also your end-user workload applications. With OpenShift 4 you can install in multiple cloud platforms or on-premises in your environment, both virtualized and on bare metal. We like to talk about how just taking Kubernetes isn't enough, right? To build a viable container platform, you have to surround it with supporting infrastructure services. That might be your ingress, your DNS, your monitoring stack, your logging stack; a whole host of applications is needed to support that. So when you deploy OpenShift 4 today, we have an opinionated set of core services that we deliver with every cluster and test together as a stack in what we call a release. In our 4.1, 4.2, 4.3, 4.4, 4.5, and hopefully, by the time we're meeting here, our 4.6 release, we've been able to install that stack into our users' environments and upgrade the platform as a whole: not just the Kubernetes assets, but the underlying operating system and all those supporting services.

Last year when we met, we talked about how a lot of innovation in the Kubernetes ecosystem is happening outside the core of Kubernetes itself. Clayton and I always like to talk about how we want to make Kubernetes boring and allow the innovation to happen above and around us, and I think 2019 and 2020 are evidence of that happening. One of the things we've put at the center of OpenShift 4 is what we call our OperatorHub, or the whole notion of operators: how you can deliver extensions or add-on capabilities on top of or around that cluster to get incremental value. I know today we're going to talk about a lot of those things, but I think it's worth taking a step back and just being like, man.
There's a lot there, right? In the last year it was OpenShift Virtualization, so you can manage VMs on top of bare metal OpenShift. Who would have thought of that, right? Containers and VMs together, managed by the same common orchestration framework. It's crazy. If you had talked to Clayton and me six years ago, we'd have said it's amazing that OpenShift has grown to take on that use case. Or service mesh, right? A lot of people are doing interesting things around the Istio community to monitor traffic happening across their apps or, in some cases, across clusters. And functions, which we'll talk about later with serverless. These are all components that we see evolving and iterating above and beyond that core stack. Today with OpenShift 4, when you install your cluster, aside from our opinionated set of core services, our users are able to pick and choose individual elements that they want to augment their platform with, and we provide a very elegant lifecycle solution for those additions to ensure that not just the core platform is lifecycled and kept up to date and secure, but the whole software stack, whether that's the core add-ons, the core OS, or the core Kubernetes orchestration layer. So I think it's fair to say OpenShift 4 has grown a lot. We've proven its ability to upgrade and maintain stability in our user community today, and we look forward to doing that as we continue to evolve the platform moving forward.

Yeah, thanks for that, Derek. What I really like about that concept is that all those features at the edge that you had to build on top of Kubernetes to give somebody a valuable platform, you've made them self-aware.
You've moved them to CRDs. You leverage the basic functionality of Kubernetes to maintain them, to tell them when they're out of sync, that they need to come back into sync, and that they need to find consistency. That was just a really great way to have a self-managed platform that leverages its own technologies. Really great stuff.

So Clayton, starting in 2020, now that we've conquered that world, we move into this explosion of people who want the benefits of Kubernetes but maybe want a smaller footprint or a more compact cluster. They want it crammed into places that Kubernetes had maybe never been before. Can you talk about some of the things we're seeing on that front?

Absolutely. Over the last few years Kubernetes itself has grown. Everything we've talked about that's a core, fundamental requirement of a Kubernetes cluster is something that consumes resources, and if you're bringing it along you need to watch it and make sure it's still healthy. We've also added extensibility to Kubernetes. We went from a really simple initial state, where we just had a few binaries that we ran and everything was fine, to a world where Kubernetes is actually managing a lot of the components on the cluster. We run components on the node, like software-defined networking and the proxy elements that make sure apps can talk, as layer upon layer of value has been added to Kubernetes.
We do need to spend just as much time optimizing the platform, optimizing the components of the ecosystem around the platform, and looking for new, efficient footprints that keep all the benefits Derek mentioned while still giving us the flexibility to deliver that consistent lifecycle, because the worst thing that could happen is that we over-optimize for a particular scenario and break a use case that we don't want to break. So this has been a pretty long journey. I've been involved with performance in Kubernetes since the very beginning; I'm guilty of some of the early projects in Kubernetes that made the control plane more efficient. In a lot of the work done by Red Hat and others in the ecosystem, we looked for ways to slim down the footprint of the core Kubernetes control plane. We then applied that to the kubelet and to elements of the operating system. Some great work has been done in the CRI-O project and in the kubelet over the years to slim down the resource footprint of Kubernetes itself, even in the most complex and demanding environments.

As we went, we built a lot of infrastructure in the Kubernetes project and the OpenShift engineering ecosystem around it to make sure that regressions were caught. We watch how many resources are used by the core platform and by the components of the platform. As every release goes out, Red Hat picks up new versions of Kubernetes very early compared to many of the distributions in the ecosystem, so we often find and catch those regressions and then make sure that they're fixed upstream in the proper way. And as ecosystem components are pulled in, we're always looking for ways of making sure that we can deliver a reliable, stable, solid platform. So as we reached the limits of what you can do by optimizing Kubernetes itself, we started looking at additional footprints.
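The resource-regression watching described here can be pictured with a small sketch. It is only an illustration of the idea of comparing per-component usage between releases: the component figures and the 10% tolerance are made-up assumptions, not measurements or thresholds from the Kubernetes or OpenShift test infrastructure.

```python
# Hypothetical per-component memory usage in MiB for two releases.
# All numbers are invented for illustration.
BASELINE_MIB = {"kube-apiserver": 900, "etcd": 400, "kubelet": 120}
CANDIDATE_MIB = {"kube-apiserver": 940, "etcd": 520, "kubelet": 118}


def find_regressions(baseline, candidate, tolerance=0.10):
    """Return {component: relative growth} for every component whose usage
    grew by more than `tolerance` between the baseline and the candidate."""
    regressions = {}
    for component, before in baseline.items():
        after = candidate.get(component)
        if after is None or before <= 0:
            continue  # nothing comparable to measure against
        growth = (after - before) / before
        if growth > tolerance:
            regressions[component] = round(growth, 3)
    return regressions


if __name__ == "__main__":
    # Only components that grew past the tolerance are reported.
    print(find_regressions(BASELINE_MIB, CANDIDATE_MIB))
```

A check like this only flags a candidate; deciding whether the growth is a real regression, and fixing it upstream, is still a human step.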
So one of the first steps that we took in OpenShift 4 was allowing you to run a full, highly available Kubernetes cluster that had three control plane instances but no workers, and actually letting people schedule workloads onto those control plane instances. That's a fairly common configuration in a lot of retail and edge deployments, where power or space is at a premium and you want no more hardware than is necessary. And obviously, as we've all learned over the years as we've gotten better with distributed systems, there are really just three topologies that you can run. You can have three of something, take a vote, and whoever has the majority is in charge. That's the system Kubernetes uses with etcd, and it actually provides the strongest possible failure mode: you can afford to lose a single instance. And so in OpenShift 4 we added the compact mode, where those three control plane instances run the control plane as well as user workloads. We tried to make it easy for people to switch between these, so if you started with three you could add more. A lot of that work ties into things Derek mentioned: building a control plane and an API system that make it easy and natural to start simple, in your data center or in the cloud, and grow your cluster.

There are two simpler configurations beyond the three-node control plane: having a single node, and having two nodes in an active-passive kind of setup. Our focus since OpenShift 4 has been ensuring that CodeReady Containers, which is our single-node, development-focused experience, works really well in a single-node environment. There are a lot of optimizations that go along with that, not just installing the cluster, which sometimes takes time because we have to make sure everything is set up. In the CodeReady Containers scenario we can optimize for starting that VM image as quickly as possible and simplifying some of the steps. It's not a real production environment, but it actually sets the stage for some work
that we're doing, both in the ecosystem and in moving to full single-node support in OpenShift at a future time. In addition, we've been working with groups within Red Hat and the ecosystem on active-passive setups. This is actually a pretty common concept for administrators of enterprise Linux, and we've been doing this for quite a long time with some of the components of Linux that do active-passive failover, and we think there's a lot of interesting stuff there. It's not our primary focus right now, but it is something you can accomplish today using technology supported by Red Hat. So we're trying to balance the needs of resource-constrained environments while keeping all of the flexibility and power of the platform, and keeping the configurability, the self-management, and the self-monitoring, because as you move toward the edge, if one of those three hardware instances fails, you still need a solution. We've also begun working at higher levels of the stack, like the Red Hat Advanced Cluster Management (ACM) product, to make sure that as you start proliferating this large number of smaller clusters, we have the necessary support to manage the lifecycle, monitor those clusters, and distribute workloads. So this is an ongoing project, but I'm pretty excited that we've continued to simplify and ship a fully production-grade version of Kubernetes and to keep all of the benefits with none of the downsides.

All right, that is absolutely correct. We do have some customers that have ventured forward and are running OpenShift in remote locations, and there are some really exciting requirements popping up that, personally, I didn't think of when they first brought them to us. It really is around: hey, I don't even have the staff in those locations, or it's a different staff in that location with a different sort of charter, or maybe infrastructure services as simple as DNS or NTP aren't there. Derek, I want you to talk a little bit
about how we're dealing with those.

Yeah, sure. Thanks, Mike. I think it's been an interesting learning experience as we gather more feedback from customers on the unique constraints under which they may choose to install OpenShift. And of course, as we go further out to the edge, or as you deal with the realities of putting clusters in many locations around the world, you end up with different ways of looking at problems than you previously held. One of the things I'm happy we're doing at Red Hat is looking to see if we can blend connected services with that broader installation story, so that, for example, the administrator who is putting a piece of hardware into an environment may only have to boot an ISO, just power it on. Then we can start to explore interesting concepts: what happens when that machine powers on? Can we have it connect to a different endpoint and allow a different persona, a different actor within the overall enterprise solution, to stitch these pieces of compute together to form a cluster, right?
So I think that's a really interesting, unforeseen evolution around the whole installation, the day zero and day one activities, that we see more and more as we get to the edge. There's some exciting stuff that we'll look to show, and one of these things is our assisted installer. A lot of times today when folks think about installing OpenShift on metal or in a virtualized environment, they know that you not only pull down our installation binary, but you also pull down the ISO to boot RHEL CoreOS. One of the things you'll see that we're exploring in our assisted installer program is that rather than pull down both the installer and an ISO, you pull down just a bootable artifact that knows how to connect to a remote cloud service. Today that's a service we're offering on cloud.redhat.com, and in the future we'll look to see what we can do to make it available elsewhere. But what's exciting is that when you boot that first artifact, that ISO, you get a way to connect and phone home to a remote location, so that the persona, the engineer, or the actor who's working in a retail location, on a telephone pole, or anywhere else you can imagine we'd want to run OpenShift just delivers the hardware, turns it on, and knows that it will phone home to a location that allows a different persona or actor to stitch it together into a larger solution. This idea of blending together a pool of compute, so that a second individual can come in and say, "I want to stitch this machine and this machine and this machine together into a cluster," is super powerful. What's also interesting is that we can minimize the errors that can occur as users are understanding their environment, by making this bootable artifact know how to pull out and understand its environment, right? What is the nature of the host itself? Does it meet our minimum requirements for CPU, memory, and storage?
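A pre-install host check of the kind described can be sketched as follows. This is a minimal illustration under stated assumptions, not the assisted installer's implementation: the `HostInventory` shape, the function names, and the threshold values in `MINIMUMS` are all hypothetical and are not the product's documented minimums.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostInventory:
    """Facts a discovery agent might report for one machine (hypothetical shape)."""
    hostname: str
    cpu_cores: int
    memory_gib: int
    disk_gb: int


# Illustrative minimums only; assumptions for this sketch, not product values.
MINIMUMS = {"cpu_cores": 4, "memory_gib": 16, "disk_gb": 120}


def validate_host(host):
    """Return a list of failure messages; an empty list means the host passes."""
    failures = []
    for field, minimum in MINIMUMS.items():
        actual = getattr(host, field)
        if actual < minimum:
            failures.append(
                f"{host.hostname}: {field}={actual} is below minimum {minimum}"
            )
    return failures


def validate_pool(hosts):
    """Check every discovered host before any installation is attempted."""
    return [message for host in hosts for message in validate_host(host)]


if __name__ == "__main__":
    pool = [
        HostInventory("node-a", cpu_cores=8, memory_gib=32, disk_gb=400),
        HostInventory("node-b", cpu_cores=2, memory_gib=32, disk_gb=400),
    ]
    for problem in validate_pool(pool):
        print(problem)
```

Returning messages rather than raising on the first failure lets every problem across the pool be reported at once, before anything is written to disk.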
So we can do a lot of validations and verifications that the cluster will be successful before it's ever installed. We think this is a really exciting emerging pattern that we'll see more of in 2021. And so with that, Mike, I think it'd be great to go to the demo.

Yeah, I completely agree. I got to see this. And with that, Rhys, why don't you come on up and take us through the assisted installer?

Thank you, Mike. My name is Rhys Oxenham. I run the field product management team here at Red Hat, and it's my pleasure to introduce the new OpenShift assisted installer to you all. We really wanted to make the deployment experience of OpenShift even easier, and whilst we recognize that there are many deployment options and target installation platforms for OpenShift, we wanted to provide our customers with a more streamlined and guided workflow for deploying clusters. The OpenShift assisted installer is a web-based utility, live on cloud.redhat.com today and ready for our customers to start using. Although it was primarily designed for the deployment of bare metal clusters, where automation can sometimes be a little more challenging, it can also be used to deploy clusters across a wide variety of infrastructure and cloud platforms. We set out to develop a solution that had very minimal external requirements and was incredibly easy to use. I'm going to walk you through a deployment using this new offering and hope to demonstrate some of the design goals we had in mind.

Now we're into the main workflow. We have to provide a cluster name, which is important as it forms part of the DNS domain name for the cluster, so I'll choose a name that matches the environment
I want to deploy into. Next it asks which version of OpenShift I want to deploy, and for this I'll stick with the default, the 4.6 pre-release. Finally, it provides an option for me to input my pull secret, an authentication key for pulling the container images required for both installing and operating any OpenShift cluster. This has been pre-populated for me based on my cloud.redhat.com login.

The next page is where the bulk of the configuration will take place, and the cluster name has been carried forward from the previous section. Next it's going to give us the option to download a discovery ISO, and this represents one of the most important design principles of the assisted installer. Every node that we want to become part of the cluster needs to be initially provisioned via a discovery ISO, one that has been dynamically generated for us by the assisted installer. This was chosen for its simplicity: we need only boot the target machine with the discovery ISO, and it has a phone-home mechanism through which it can receive all of its instructions from the assisted installer, bypassing any manual configuration by the administrators. So let's go ahead and select the "download discovery ISO" button.

For troubleshooting, it asks us to provide a secure shell public key, so in the event of the nodes not appearing as expected we can do some troubleshooting. This section also confirms whether we want to use an HTTP proxy, and if so we can ensure that it gets injected into the ISO, but in our case we don't need one. Behind the scenes, the assisted installer platform will now generate our custom ISO, already pre-configured for the cluster we're creating, and will make it available for download.
This takes about 20 seconds or so.

I'm going to be deploying a three-node cluster where all nodes are simultaneously master and worker nodes, demonstrating a use case that is in significant demand from our customers and partners, especially for edge scenarios and where bare metal affords customers the ability to also run virtualized workloads or workloads that can exploit hardware functionality such as GPU acceleration. For access to the cluster, I'm using a jump host that I'm accessing over VNC, primarily because I need to attach this discovery ISO to my bare metal machines over a virtual media interface.

A brief overview of the setup I'm using here: I've got three Dell FC430 blades, and I've opened up a virtual console to each of them that will allow us to monitor the progress and also attach the discovery ISO directly. We're only using three bare metal machines here to demonstrate the converged master and worker configuration, but it would be absolutely possible to have additional nodes in this configuration.

So let's first get the ISO downloaded into this environment. Now that I have that locally, I can attach it to each of my bare metal machines through the virtual media interface. I'll speed this process up for the demo, but I'm just attaching this ISO directly to the virtual media interface of my three bare metal machines and instructing them to boot via the virtual CD-ROM. It's important to note that the ISO doesn't have to be used with the virtual CD-ROM; as it's a common type of boot media, it's entirely possible to use PXE, virtual USB, or direct virtual media interfaces too.

After a few minutes, these nodes should start appearing in the UI and will go into an initial discovery phase, where the agent running on the machine reports information about the system: CPU count, memory, disk availability, and so on. If we dive into one of those nodes, you'll see that it has been correctly identified as a Dell FC430, and it correctly shows the 400 gigabyte solid-state drive and
the network interfaces and IP addresses assigned via DHCP. The status of all of these machines is now "pending input", as we have to provide further configuration before the machines will be ready for cluster deployment.

Now that we have our machines ready, we can make some configuration changes to suit our needs. We have the ability to use automatic role assignment, where there's logic built into the platform to help with best-fit role placement based on the available nodes and their respective configuration. Here we're going to leave it on automatic, as we only have three nodes and hence they'll default to both master and worker, but we can override that if we need to. There are also some additional options for each of the hosts on the right-hand side of the pane. Here you can override the host name if it's not provided via DHCP. It also provides the ability to disable a node, view host events, or delete it entirely from the discovered hosts list.

Next I've got to enter the base domain of my cluster, and as this is a real environment, I'm going to match the base domain with the host names of my machines. Combined with the cluster name, this will form the full internal Red Hat domain of my new cluster. There are a few networking configurations available for us to choose from: we can go with either basic default networking or a more advanced configuration where we can override the default subnet allocations, but here I'll stick with the basic option. There's only one network subnet that's been discovered, and that's what's showing here. My bare metal machines have multiple interfaces, but only one with DHCP. I'll use this as the base network for the whole cluster, where my API and ingress services will listen.

We're asked to define the IP addresses that we want those API and application ingress services to listen on, and they'll become virtual IP addresses, self-managed and made highly available by the OpenShift cluster itself, further reducing any external environmental requirements. I have the option of
entering them if I had already reserved IP addresses pre-populated in my DNS infrastructure, but if there are no reserved IPs, we also have the ability to automatically allocate virtual IPs from the DHCP service, if permitted to do so. The DHCP server in our lab will happily allocate IP addresses to any device on the network, so I'll use this approach. Now I'll select "validate and save changes" and check that it's able to get a lease for two addresses. Here you can see I got .177 for the API and .182 for the ingress traffic.

Now I'm ready to install the cluster. You see that when I previously selected "validate and save changes", it checked that everything is in order and that the configuration I've requested is valid. The machines move into a "known" state when they're ready. Let's proceed and select "install cluster". Here you can see it takes us to the next pane, where we can see that it's preparing for installation. There's also a cluster events pop-up, which gives you real-time, deep-dive insight into what's happening under the covers, and we'll return to this over the course of the deployment to see the type of content that gets pushed here.

As you can see, the nodes have now progressed into a "starting installation" phase, and you'll also notice that one of the nodes has been selected as the bootstrap node. This is another incredibly important design principle: we wanted to minimize the hardware footprint to reduce complexity. Here we do not require an additional, separate bootstrap machine for installation. The deployment process simply deploys a temporary two-node cluster on two of the nodes, while the third node temporarily acts as the bootstrap machine where the installation process is executed. Then, when that two-node cluster is up, the bootstrap node pivots and gets deployed as the third full cluster node, restoring full high availability and quorum to the cluster. Again, for the purposes of fitting as much as we can into this demonstration, the recording has been sped up considerably.
But behind the scenes, the two nodes that were selected as the standard master nodes will now get provisioned with Red Hat CoreOS onto the root disk and will be rebooted so they can start deploying the two-node OpenShift cluster itself, orchestrated by the temporary bootstrap machine. Throughout all of this, the cluster events pane can give a much more in-depth view of what's going on, and it can be filtered if required. Here you can see the disk write process and the state changes. Like before, if we look at the console log, you'll see the detailed list of steps that the provisioning process has taken. You can also download the installation logs for all of the machines, which could of course be incredibly useful for troubleshooting or gathering detailed information about how the cluster was deployed. Simply select the link at the bottom and your browser will download a tarball in which each node has its own set of log files and various other elements that may be useful.

And there we go: installation has been completed successfully. We're immediately provided with the console URL to connect to first, we're provided the username and password that we'll need to authenticate against the cluster, and we're also able to download the kubeconfig should we want to use the CLI tools against the cluster. However, as we used dynamically assigned IP addresses, I need to quickly update my /etc/hosts file so I can access my cluster. This wouldn't have been required if I had used IP addresses that were preconfigured in the environment's DNS server. Like usual, I'm going to need to force my browser to accept self-signed certificates, but I should eventually get through to the login page, confirming that my cluster is operational. The username is the standard kubeadmin, and I can copy the password directly from the assisted installer page. As you can see, the cluster is still coming up and starting all of the required pods, so in the meantime let's quickly jump over to the CLI and ensure that the kubeconfig is working
properly. I'll use the file that we just downloaded and ask the cluster for a list of nodes, here clearly showing that we have three nodes, each of which is both a master and a worker node. We can also verify the version, here being a 4.6 nightly or pre-release version. From here the cluster is fully operational and ready to serve workloads, and we can go on to deploy any other operator. As an example, because this is a real bare metal cluster, we can exploit the nature of bare metal performance and deploy OpenShift Virtualization. And there we have it; this concludes the demonstration. I hope that it was useful and clearly demonstrated the new OpenShift assisted installer. Thanks for watching. Passing it back over to you, Mike.

All right, thank you, Rhys. That was a great demo, and I'm excited about where that entire project is going to take us in 2021. Let's switch topics. Let's get into application design across multiple clusters. We've talked a lot about Kubernetes, but Kubernetes without an application, or a reason for it to exist, doesn't really make a lot of sense. So let's get into how customers have been leveraging Kubernetes in their application services. Clayton, can you take us through some of those topics?

Absolutely. On the journey we've been on from the beginning, Kubernetes was initially successful because it was a set of concepts in application infrastructure that made it easy for you to run applications. It helped you deploy a container: a set of code mixed with your application runtime environment, whether that was Java, a compiled binary from Go or C++ or Rust, or source code paired with a runtime environment like JavaScript or Python. And that image needed a set of criteria for how it ran, whether that's environment variables or volumes matched up to it. Kubernetes was kind of that first system that gave you just enough of an abstraction that you could write and run almost any kind of containerized application on top of it. With that came services.
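The abstraction described here, an image plus the criteria for how it runs, maps onto the fields of a Kubernetes Pod manifest. The sketch below builds one as a plain Python dict. The helper name `pod_manifest`, the image reference, and the choice of `emptyDir` for every volume are assumptions made for illustration; the field names (`containers`, `env`, `volumeMounts`, `volumes`) are the standard Pod fields.

```python
import json


def pod_manifest(name, image, env=None, volume_mounts=None):
    """Build a minimal Pod manifest as a plain dict.

    env:           mapping of environment variable name -> value
    volume_mounts: mapping of volume name -> mount path
                   (each volume is backed by an emptyDir in this sketch)
    """
    env = env or {}
    volume_mounts = volume_mounts or {}
    container = {
        "name": name,
        "image": image,
        "env": [{"name": key, "value": value} for key, value in env.items()],
        "volumeMounts": [
            {"name": volume, "mountPath": path}
            for volume, path in volume_mounts.items()
        ],
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [container],
            "volumes": [{"name": volume, "emptyDir": {}} for volume in volume_mounts],
        },
    }


if __name__ == "__main__":
    manifest = pod_manifest(
        "web",
        "registry.example.com/shop/web:1.0",  # hypothetical image reference
        env={"LOG_LEVEL": "info"},
        volume_mounts={"cache": "/var/cache/web"},
    )
    print(json.dumps(manifest, indent=2))
```

The same three inputs (image, environment, volumes) describe a Java service, a Go binary, or a Python script equally well, which is the "just enough abstraction" point being made.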
We added concepts like deployments and ingress around those two simple ideas. Deployments let you manage the rollout of your code and pause things from rolling out further if stuff stopped working. Ingress allowed us to get traffic into the system. Alongside that, we then added StatefulSets, which allowed you to run the stateful parts of applications. We added Jobs and CronJobs. As we went through this process, we started to recognize that not every simple abstraction was going to be great for all users. We had already, from the beginning of Kubernetes, thought about extensibility and how we could broaden the reach of Kubernetes to new types of concepts, whether that was how you run the app, like a workload, or an integration, like the way service mesh provides high-level primitives for traffic splitting and matching on parts of URLs or on the body of a request. We knew all of these concepts would be really powerful. We started working on extensibility in part to provide new ways to run workloads and in part to provide new ways of binding workloads together, and then, on top of that, concepts we really hadn't anticipated: policy and new types of integrations into both the nodes and the cloud environments that ran around Kubernetes.

And so over the course of years, as we've seen with applications, a lot of people started with really simple twelve-factor style applications on top of Kubernetes, and they got very big. They were able to run from the very early days of Kubernetes 1.0 and OpenShift 3.0 because we focused on that core problem. It wasn't the highest-scale system in the world, right? I talked earlier about the improvements we've made over the lifetime of Kubernetes. One of the big improvements was going from 100 nodes to 1,000 nodes to tens of thousands of nodes in very specialized environments, or hundreds of thousands or millions of pods. As these workloads grew, people gained confidence in Kubernetes and OpenShift, and they began
extending to new types of workloads. And as each new workload came in, Kubernetes was so important to the ecosystem that we really couldn't say no. So today in OpenShift 4 we bring in a bunch of extensions and standard parts of the Kubernetes ecosystem. We're up to about 150 individual Kubernetes resources, from the very simple; I think we had 15 at the very start in Kubernetes. So there's been almost a tenfold increase in five years in the number of concepts that even a fairly standard Kubernetes distribution would have. And that doesn't even get into the complexity that people build on top of Kubernetes in their pipelines for how they build and deploy applications. So I actually think we're right at the edge of a really important transition where we start thinking beyond the cluster. What are the policies and patterns and integration points that will let us run applications naturally across clusters? Everyone can do that today, right? Nothing prevents an organization from building and running parts of their application in different places. But I really think the next frontier for OpenShift and for the ecosystem is how we take some of those core Kubernetes concepts and add concepts to the mix that let people quickly and easily run those applications in many places, and, if you will, bring orchestration on top of Kubernetes from something we all build ourselves to something we all share, which will allow that next level of workload to really materialize for Kubernetes.
Yeah, and the next topic that always comes up once they've mastered whatever workload API they're using, whether it's ReplicaSet or StatefulSet or any of this variety of combinations: they always want to know, how do I cross a boundary, whether it's a rack or a data center floor or a cluster or whatever the case may be? Derek, can you talk to us a little bit about how to think about a stretch cluster versus individual clusters?
Yeah, sure. So I think it's fair to say that there's no one size
fits all solution to every problem, right? Even in our dialogue so far we've talked about how we can support running large-scale Kubernetes clusters or needing to fit within resource-constrained environments. And I would say, from my experience thus far, it comes down to probably three things. One, what is the required or acceptable failure domain for your deployment model? The second would be, in the case of failure, what is the overall blast radius that you feel is impacted by that? And the third, which we don't always give enough attention to but is very important: what are the security boundaries where the cluster meets your app that need to be taken into account? And so for that first question: if you start out with one cluster, you're going to put that cluster in a given data center, right? Or you might put it in a given region in a particular public cloud. And even within one cluster we get questions on, well, should I run that cluster HA? And should I run node pools within each availability zone within that public cloud, or within racks within my data center? And so it's probably first important to recognize that we have ways of managing failure domains within the cluster itself before you even decide to run more than one. But oftentimes you end up having to run more than one. You might run an application that's globally replicated, so you need to run that cluster in more than one region or more than one data center, because you want to bring your workload closest to your end user. And so that first real-world recognition typically makes users go from one cluster to two, because they want to deliver that app globally or as close to the users as possible. When folks run on-prem they might have a contract to have two data centers, like an east data center and a west data center. And sometimes we run into other variations of that situation where they might have two data centers that are co-located
within particular latency windows, where we run into the question of whether to stretch a cluster across them. So there's a lot of variables that come into this, but from a simple perspective, if you have an east and a west data center, you probably view these things as separate failure domains so you can control your blast radius, and so you might choose to run two clusters. And then the nature of that is you've got to put your app on both clusters, and your apps will then probably run in either an active-active or an active-passive setup. And then a lot of real-world ramifications come out of that, right? Does your app need data? And do I need to make sure that that data is replicated to both environments? So today in Kubernetes you have primitives to handle things like persistent volumes and persistent volume claims, but there's nothing innate in Kubernetes itself that says how storage is replicated, right? We have to look at a layer below the orchestration platform and say, how can we handle that problem? Similarly, above the orchestration level we have to say, how do I load balance to my app? How do I handle getting traffic into one cluster versus another? And so a lot of the unique situations around individual use cases will motivate how people choose to do things. As Clayton talked about earlier, we have evolved OpenShift over the years to include a multi-cluster management experience, which we call Red Hat Advanced Cluster Management. And no matter which choice a customer may make, whether they're running one cluster or many clusters, there's a lot of exciting primitives introduced in that solution to lifecycle clusters, that is, to provision and deprovision them. And then once those clusters are provisioned you can group them, right? In the same way that a ReplicaSet or a DaemonSet can group pods by label, we're starting to look at innovations around how we label clusters themselves and then define applications and
policies above that. So what's exciting about that is we can start to treat the cluster as an atom, and because OpenShift 4 is fully declaratively configured, we can propagate that config across clusters. So if you have east and west data centers that each have GPUs and need to have things configured consistently across them both, well, that ACM solution can go and provide the management plane to do that, whether you choose to do one cluster or many clusters. So it's really interesting when you work through the details of why and how you end up with either multiple availability zones within your cluster or multiple clusters. But no matter your choice, I think we have tools today to meet users where they are. And that last point is really key, I think.
Yeah, it is. And you mentioned some east-west, north-south topics, and some customers get a little confused when they hear us throwing that jargon around. So Clayton, when a customer decides that they want to deploy an application, maybe a replica of itself, in completely different clusters that don't know about each other, sometimes we end up talking a lot about north-south: leaving the cluster and coming back into the cluster. And then there's also some research being done about east-west: staying within the same network, or the pod network if you will. Can you talk a little bit about the differences there?
Absolutely. These two phrases, north-south and east-west: imagine a map where your data centers or your clouds are set up horizontally. North is traffic coming into a cluster, and south is usually a way of orienting for it going back out or going to another cluster or another data center. Whereas east-west is most commonly the term we would use to refer to traffic inside of a data center. When we talk about east-west, we're thinking about Kubernetes as, as I like to call it, a virtual computing plane where all clusters, no matter where they are, might have
partitions to separate them out into different security levels or different geographic regions because of latency, but they're all kind of peers of each other. So east-west is cluster to cluster, and north-south is coming into clusters and leaving clusters. So there's a bunch of ways that people solve this problem today in Kubernetes. As Derek talked about, there's a lot of standard patterns that have carried forward into Kubernetes. One of our very early OpenShift customers actually ran a geographically distributed set of clusters all around the world on Kubernetes 1.11. And they actually did a networking configuration within their enterprise that ensured that each cluster had a unique set of pod addresses and a unique set of service addresses, and every cluster could reach every other service. So that's maybe the simplest level of east-west: network-level configuration. But that requires a lot of pre-planning in an organization. There's another level above that, which might be specific kinds of integrations that you can do either with your network stack or your software-defined networking. Or, at the level above that, something we might refer to as VPN or tunneling. So the Submariner project, which Red Hatters have been working on, helps you build VPN tunnels from cluster to cluster. That is a level above the network and usually sits a level below something that's getting a lot of discussion recently: service mesh. So there's service mesh within a cluster, but one of the goals of a service mesh is to give every service a unique identity and let richer applications talk without really thinking about the details. A federated service mesh, of course, can sit on top of that at the next level up and hide the details of where a service runs, whether parts of it are on one cluster or another. And then finally, at the top layer of east-west, you have what I might call virtual application networks. Red Hat has been exploring this space for a while through a project called Skupper. And
Skupper lets you connect up individual application components without controlling the clusters underneath. So each of these layers offers options and different performance tradeoffs. One of the things that, as Derek alluded to, Red Hat would like to do is work within these ecosystems and within the Kube community so that no matter what abstraction you use, it can be standard for all of those. So if you define a service for your Kubernetes application and you want that service to be accessible on another cluster, there should be a simple way to do that, both at an administrative level and an application level, to ensure that whether you're on the development side of the house or the operations side of the house, you can get your applications done depending on how much control you have. And likewise for north-south, whether it involves cloud load balancers or on-premise data centers, we'd like to help standardize some of those central mechanisms, so that when an application where part is running on one cluster gets moved to another cluster, traffic follows the application and is not tied to location. Standardizing these patterns in Kubernetes and the ecosystem will actually help us build automation in standard ways for multi-cluster apps to move from cluster to cluster without an administrator having to take action, and I'm pretty excited about that.
Yeah, those are, I mean, unbelievable points. And with that, let's bring up the second demo. What the second demo does is take the concept of the east-west network and show the performance differences that you would gain going east-west as opposed to ingressing and egressing a cluster through a routing tier. And what they've done is they've taken the replication of Ceph on the back end, this is our data services team here at Red Hat, and they've figured out how to bring up an application on a different cluster that is feeding from the Ceph replica. And with that, Annette, please take it away.
Thank you, Mike. So we're going to talk today about OpenShift and how you could do disaster recovery, either for a particular project or namespace or for an entire cluster. On the left we have the current active cluster, region one, and on the right we have a second region or a second data center. We have OpenShift resources and persistent volumes, and the data is being asynchronously replicated over to site 2. So in this demo I have a load balancer, as shown in the previous slide. And we have a common Git repo that will be used via webhooks, so that when a webhook is triggered it will cause a Tekton pipeline to run with a particular set of activities that will then be able to both scale the application and promote and demote the storage. We see on the bottom there that this is supported with Ceph, and Ceph is a component of OpenShift Container Storage. We will take a look now at the application, which is WordPress and MySQL. Right now the application is active on site 1. So just to test that I can recover, I will go ahead and put in a comment. And the comment, once we are on the other side, should still be there. We will now look at the two sites. On the top left is site 1 and on the bottom left is site 2. And then on the right I have logged in to the Ceph cluster on site 1, top right, and bottom right I have logged in to the Ceph cluster for site 2. So if we look now at the information for what we would call an image, it will show us how the mirroring is set up. The mirroring in this case is enabled, and it is set up to do something called snapshot, which means it would be asynchronous. And on site 1, primary is currently true, meaning the storage is being used on site 1. On site 2, where the data is being replicated to, the difference is that the storage is not currently primary. I am going to run a script now that will scale down the application on site 1, which you can see on the top left is currently running. And when this runs it will trigger the
webhook, because there is a change in the repo, and a pipeline will run. We can see these webhooks, which I configured in my multi-site application repo. There is one for site 1 and one for site 2. So now let's go ahead and promote the storage on site 2. We took down the application on site 1 and demoted the storage, and now we are promoting it on site 2. When we do that, again a webhook fires and a Tekton pipeline is run to do that task. The application, though, is still unavailable. So let's go ahead and scale site 2 up by increasing the replica count. We can watch the pods coming up on site 2. And as we see there, MySQL is already up and WordPress is almost online. On the storage side it has now flipped over, and the primary mirror is now site 2. So the storage and the application are now being accessed from site 2. Now if we go back to look at our application, we should find that it actually is back online. And we can see the comment that was added, and we can actually add a new comment now. Posting the comment shows us that the storage is working on site 2. So now that we've successfully failed over to site 2, we want to see how we would fail back to site 1. Right now the storage is primary on site 2. So we'll again run a script that will trigger the webhook to run a pipeline. And in doing that we can watch the pods on site 2, and soon they will start to terminate. We can see they are terminating. So now we're going to swap the storage over from site 2 to site 1. And to do that, again, we're going to run a webhook. The webhook will run the pipeline, and in a few seconds the storage will be swapped over. So we can check that. And the next thing we're going to do, now that we have swapped the storage, is promote it again on site 1. Again this will cause a pipeline to run. The application is still unavailable. And our load balancer is actually critical right now, meaning both sites are not taking connections because we have not scaled up the application yet. So let's go ahead and
do that. And now we can also see that our storage is primary on site 1. So we'll use a script to scale the replicas on site 1. And again we'll see that that triggers a pipeline to run. And then we can come back and do a watch on the pods coming up on site 1. And we can actually see that they're already running. So WordPress is now back online on site 1. Let's take a look at our WordPress application. And we can see that we do have the comment that we added when we were over on site 2. Over to you, Mike.
And that was incredible. I cannot wait until that makes it into the product. Let's go, 2021. I'm rooting for 2021. So let's switch gears. Last topic. The last topic is going to be around workloads, around developer experience. And an interesting part of this is that the CNCF has really exploded in 2020 around pipelines and GitOps and build techniques. But Clayton, we've been there for quite some time. I think we got involved towards the end of 2015 and into 2016. And from day one we felt it was imperative that we be able to build applications inside of the cluster. And when we were looking around, there weren't too many people thinking that that was the right way to go. Why don't you take us through some of our journey and how we got to where we are today?
Sure, Mike. And this is a great topic because it talks about how software evolves over time. And part of Red Hat's commitment is trying to help people evolve their organizations over time and still benefit from the latest and greatest open source, while taking into account that there have to be solutions for this problem. So in the very early days of Kubernetes, and I know this is really hard to imagine now, most people didn't know what containers were. And the way that you built images was you pulled up your developer laptop and built those and shipped them out. And so we felt that it was really important that OpenShift, as a comprehensive distribution of Kubernetes, had a way to help developers take source code and convert it into container
images, because in Kubernetes you can't run anything without a container image. And that process is actually a pretty common one in organizations. The development team needs to use a standard runtime environment that's properly patched, that has the right security rules around it, that might be scanned at an interval determined by the operations team. Combine those two together and run it. And we wanted it to be easy and reproducible. And so we worked on a number of technologies that both helped you do this on a cluster and used Kubernetes as a jobs engine. And again, in the early days of Kubernetes the jobs concept was very new, and co-developing the build feature on OpenShift with Kubernetes the platform actually helped us identify places where we needed to improve Kubernetes security. And over the years, as different ways of combining these technologies have emerged, the Linux kernel has been improved to offer some new capabilities that make building images in user space much more achievable, because you can offer those same fast primitives that the kernel provides for the container runtime to an end user securely. We've really evolved this story, and I'm looking forward to the very wide range of ecosystem components out there that meet the different requirements. And we'll continue to evolve OpenShift and support the build API, so you really get the best of both worlds. You get choice and flexibility, and you've got the option of new technologies that we'll bring in and support alongside those existing concepts.
Again, I mean, at the end of the day it's really just an application service on the platform. They want an API endpoint, they want their application facing the world. And in that regard, Derek, in this last year we have exploded in the number of services that we are offering as a company, as Red Hat, out on cloud.redhat.com, on the Kubernetes platform, on the OpenShift platform. Can you talk to us a little bit about that experience?
Sure, Mike. So we often like to talk about how OpenShift depends on OpenShift. There's a lot of backend services that we've had to build at Red Hat just to support delivery of our product out to our users: that whole supply chain of building our own software, packaging it, putting it into an image registry, and making it available to the world. A lot of folks might not appreciate that that's actually all running on OpenShift behind the scenes. So today if you go to cloud.redhat.com you'll see an ever-growing list of SaaS-style services that our Red Hat SRE teams are managing. Some of the key ones are things like the OpenShift Cluster Manager or the OpenShift Dedicated service, as well as some of our supporting infrastructure like the Quay image distribution system itself, and our remote health monitoring system, which is running a really large metrics sink data store that lets us know if everything in the platform is running well together. And we can go and make upstream communities better, as well as the projects that we consume, and that makes our users happier, right? So with all these backend services, when we talked about failure domains and blast radius, these things apply to Red Hat just as much as to our end users, and so we've naturally had to build multi-cluster services just to support the business of OpenShift. What's interesting, and what I admire about Red Hat, is, you know, things do go wrong, right?
We try to minimize our outages like every other production system in the world. But when a single cluster goes down, we can rally together the kernel engineers, the Kubernetes engineers, and the networking engineers and try to solve that problem at a very deep level, to ensure that those who run that cluster on premise or by themselves don't have that issue. But naturally we don't want a single point of failure to take down our production systems, so we have to run more than one cluster. And so I think we've started to see an emerging set of patterns and practices around how to build multi-cluster services, which I would summarize as first deciding if your service is global or regional, right? But at some level you put a regional microservice out there that says, this is how you interact with your solution. And then along with that regional microservice you need some persistence. We're running that on Kubernetes, so you might be using OpenShift Container Storage. And just like Annette's prior demo, where she showed data go active-passive across locations, you need to make sure your data is not homed to any one individual cluster. And then you have to tie load balancing into your clusters, as we talked about earlier. But at some level, depending on the work and how you navigate users to clusters, you're ultimately going to figure out a way to pin your workload, and that instance of your workload, to a particular cluster where the job is actually done, right?
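The idea of pinning a workload or a user to one cluster can be sketched in code. This is an illustration only and not a description of how Red Hat's services do it: it assumes rendezvous (highest-random-weight) hashing as the pinning technique, and the cluster names and the `pin_to_cluster` helper are hypothetical.

```python
import hashlib


def pin_to_cluster(key, clusters):
    """Pick one cluster for `key` using rendezvous (highest-random-weight) hashing.

    Every (cluster, key) pair gets a pseudo-random weight; the cluster with the
    highest weight wins. The same key always lands on the same cluster as long
    as that cluster stays in the list.
    """
    if not clusters:
        raise ValueError("at least one cluster is required")

    def weight(cluster):
        digest = hashlib.sha256(f"{cluster}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(clusters, key=weight)


# Hypothetical cluster names, for illustration only.
clusters = ["us-east", "us-west", "eu-central"]
home = pin_to_cluster("tenant-42", clusters)
print(home)
```

One reason to reach for this kind of scheme is that removing a cluster only moves the keys that were pinned to it; every other key keeps its home cluster, which keeps the blast radius of a cluster failure small.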
And so if you're running more than one cluster, eventually you have a way of pinning a request, or pinning a user's desired state, to an individual cluster and taking action afterwards. And on each of those clusters, one of the reasons that Red Hat was so deeply invested in the operator pattern is that we want to keep the intelligence for how to run that application, that concept, within the cluster itself, so that the same logic applies everywhere we replicate it and we know it works consistently. And so today within OpenShift you see tons of operators appearing in OperatorHub that represent content we actually run today in production. So if you get Quay through OperatorHub, you're starting to learn the patterns that we use to run Quay live itself, and those inherently span clusters. But with all these things together, I would say in the end we realized that we need multiple clusters to run reliable services, because our services are globally distributed and we don't want any individual cluster to be a single failure domain. And what's exciting is that we're starting to learn a lot of patterns, and we can start to codify these things as we bring them out in the future. And at the same time Kube keeps evolving. So when we talked earlier about having an active-passive setup of your app, in your passive data center you might not want to dedicate all of your compute to that thing that's not yet being run. These things take power and cycles, and you ultimately want to drive costs down reliably. And so some of the exciting stuff I see going on is everything around serverless and eventing. So how can we scale things down to zero even when they're replicated across clusters?
I think a lot of unique innovation is going to come out when we even look at what it means to host Kubernetes itself as these things evolve. It's very meta, but for sure we're always looking here at Red Hat at how we can run the services that support OpenShift on more than one cluster, reliably and at good cost.
That's a good point, Derek, bringing up serverless and eventing. So we had the CNCF explosion of all the on-platform building, all the integration with pipeline tools, all the integration with GitOps. That flowed nicely into people building a lot of APIs, a lot of services, and wanting to really have a SaaS offering within their own company to each other, to really build this framework, this community of application processes. But really the flip side then becomes, how do I consume that? I don't want to talk containers all the time. I just want to talk API. I just want to have a serverless experience where I'm just talking about functions, if you will. And functions without eventing are kind of boring, and we've GA'd eventing in serverless. So let's bring up Paul Mori. Paul, can you take us into the serverless world and show us what's going on?
Thanks a lot, Mike. Hello everyone, I'm Paul Mori and I lead our serverless engineering team. As you may know, OpenShift Serverless is based on the Knative project, which has two key functional areas: Serving and Eventing. I'm really excited today because Eventing is joining Serving as GA in OpenShift Serverless. So we're going to concentrate today on some concepts from Eventing. Another exciting piece of news is that we now have a developer preview of functions, so we will be working that into the mix today as well. As I said, we're going to explore some key concepts in Knative Eventing. Knative Eventing is about addressing common needs in cloud native development, and it provides composable primitives to enable late binding between producers and consumers of events. Let's talk about the central goals of Eventing. We want to facilitate loose coupling and allow services to be developed and deployed independently on a variety of platforms. We want producers and consumers to be independent, meaning that any producer should be able to generate events before there are active consumers, and any consumer can express an interest in an event or class of events before there are producers creating them. We want other services to be able to be connected to the eventing system to create new applications without modifying existing producers and consumers. And we want to enable cross-service interoperability using the CloudEvents specification that's developed by the CNCF Serverless Working Group. So we're looking at the topology view in the OpenShift Developer Console. And this thing called default that we're looking at here is a broker, an eventing broker. The broker resource is a powerful tool to achieve loose coupling and independence between producers and consumers of events. Brokers basically provide buckets of events that can be selected via their attributes to send to consumers. The trigger resource is what we're going to use to express selection criteria for which events
a consumer wants to see. But to start, we're going to make a service that consumes and logs events so that we can illustrate how these concepts work. We're going to do this by writing a very simple function. Let's check this out. So we're doing kn faas create display, and we're going to use the event type to activate it. What this is going to do is make a template function definition, which we'll look at in just a moment, build an image for it, push the image to Quay, and deploy the function wrapped in a Knative service, so that we get all the awesomeness of autoscaling from 0 to n and back to 0 again, and the power of immutable application revisions. So let's go ahead and check out that function body. It's basically one handle function. So if you are familiar with Go, and probably even if you're not, this is probably seeming pretty approachable. We're basically just going to print out the event that we got. All right, so that's what the function body looks like. Let's take a look at this back in the developer console. And we can see here that we got a Knative service for display. And just a quick refresher: Service is a very high-level concept from Knative Serving. It's a type of controller, similar to a ReplicaSet or Deployment, that creates other resources to do its work. Service allows us to specify the spec for what the pods that are run should look like. It creates a Configuration resource that produces Revisions, which are immutable snapshots of an application. Every time I make a change to my service, I get a new revision. Service is also a point of control for traffic splitting and creates the Route resources that control how traffic goes into our revisions. So what I did was pretty neat, because with a single command I got my function up and running and I didn't have to mess around with YAML or really any details. So now our display function is deployed and we can start pumping some events into it. Remember how I said that we can register an interest
in events before there is a producer that makes them? I'm going to do that right now by creating a trigger that will give us all of the events that go into the default broker. So I'm going to go and just add a trigger, and I'm going to keep the filter clear for now. And we can now see that there's a connection between the default broker and our display function, which, as you'll see, has been scaled down to zero because nothing is going to it right now. So what I'm going to do next is create a type of event source called a ping source, and that's going to give us some events to display. What this is basically doing is every minute it's going to pump out this "Hello OpenShift Commons, hey there everyone" event. And we're going to see that going into the logs of the display function. So check that out, we've already got our first event in here. And if we look at the topology view, we can see that there's the ping source connected to the broker, and that the display function is getting events from the default broker via that trigger that we just set up. One thing I want to note here is that the channels that the events are moving through in this demo are backed by an in-memory implementation. In a real production scenario, we would want an implementation backed by some durable store so that we don't lose events. On OpenShift, we'll be using Kafka for this purpose in future versions of OpenShift Serverless and integration. So while our ping source keeps on happily producing events, let's do something a little bit more advanced. I'm going to run a pod in the cluster that lets me run curl so I can send events to the broker interactively. What I'm going to do is run a curl command that produces a cloud event that says hello kubecon. And we can see that went into our display function. So what if we only want to see the ping source messages?
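For readers following along without the video, here is a rough sketch of what the curl step sends and of how an attribute filter on a trigger treats it. It is not the command used in the demo. It assumes the CloudEvents 1.0 HTTP binding in binary mode (context attributes carried as `ce-` headers) and exact-match attribute filtering; the event type `com.example.greeting`, the source `curl-pod`, and all three helper functions are hypothetical names chosen for illustration.

```python
import json
import uuid


def binary_cloudevent(event_type, source, data):
    """Build headers and body for a CloudEvents 1.0 HTTP request in binary mode."""
    headers = {
        "ce-specversion": "1.0",
        "ce-type": event_type,
        "ce-source": source,
        "ce-id": str(uuid.uuid4()),
        "content-type": "application/json",
    }
    return headers, json.dumps(data).encode("utf-8")


def event_attributes(headers):
    """Recover the CloudEvents context attributes from the ce- prefixed headers."""
    return {k.lower()[3:]: v for k, v in headers.items() if k.lower().startswith("ce-")}


def trigger_matches(filter_attributes, attributes):
    """Exact-match filter: every attribute named in the filter must be equal."""
    return all(attributes.get(name) == value for name, value in filter_attributes.items())


# The hand-made event, with a hypothetical type and source.
headers, body = binary_cloudevent("com.example.greeting", "curl-pod", {"msg": "hello kubecon"})
attrs = event_attributes(headers)

no_filter = {}                                    # trigger with the filter left clear
ping_only = {"type": "dev.knative.sources.ping"}  # trigger restricted to ping events

print(trigger_matches(no_filter, attrs))  # an empty filter accepts every event
print(trigger_matches(ping_only, attrs))  # type differs, so this event is not delivered
```

Under these assumptions, sending the same event with its type set to `dev.knative.sources.ping` would satisfy the `ping_only` filter, because the filter looks only at the attributes and not at who produced the event.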
We can add filtering based on the attributes of the message to do this. So what I'm going to do is edit the trigger and add attributes. We're going to say we only want to see the dev.knative.sources.ping type. We'll just head back to the display function log. And now if I send the same event, it's not going to have the right type, and you can see those events are not going into this log. So what we'll do is trick the trigger by changing the event type so that we see this event show up. And you can see now we did get this event, because we changed the type to dev.knative.sources.ping, and it says hello kubecon. So hopefully this gives you a fairly good idea of how these broker and trigger primitives work in practice. They're very powerful tools for achieving loose coupling between producers and consumers of events. We hope that you will check out Eventing as GA in OpenShift Serverless and check out our developer preview of functions. Thanks a lot. Back to you, Mike.
All right, Paul. Thank you for that demonstration. We had Annette with the DR of the application. Can you imagine connecting that as a back-end component of a larger serverless framework? It's really coming together. I can't wait to see this all in action. And that really brings us to the end of our conversation. It's been a really up-and-down 2020. With that, are there any parting thoughts, Derek or Clayton, that you want to leave us with?
What do you see happening in Kubernetes?

Okay, well, hopefully Mike you can hear me, and I'll say it again. I think at Red Hat we're all passionate open source engineers, and we work in a variety of upstream communities. And I think we're all just proud on a human level that during the COVID crisis we've been able to stay productive as a community and get new capabilities into all of our upstreams. And so as we look ahead to 2021, I hope that we can keep sustainably working within our upstream communities to bring innovation to all of our users together. But I think even with all the things that are happening in the world, innovation never truly stops. Stuff is always happening around us and things are evolving. So I would say I'm particularly interested in a number of the hardware innovations we see happening across the data center today. Everything you could attach to a computer is getting some level of intelligence associated with it, and we want to be able to take advantage of that both inside the cluster with our workloads and potentially outside the cluster to orchestrate these systems. So there are a lot of interesting innovations I see coming in that space that I think will drive change in Kube, Linux, and the overall OpenShift distribution itself. So I think that's one of the areas that I'm particularly interested in seeing evolve in 2021. And Clayton?

Yeah, and I agree with that, Derek. There are a lot of areas in the ecosystem that are going to grow heavily over the next few years. People have made huge investments in Kubernetes and in building out this cloud native ecosystem. There are always new capabilities that people are dreaming up to better connect their apps or to connect to data. I think for us, where we're going to place some of our investment and our bets over the next few years really comes down to dealing with the complexity. None of this stuff is getting any simpler, and I think it's up to the open source community and the Kubernetes
community and the folks who every day go and build these applications to build the simplifications that make building reliable services at scale easy. Because it's not just getting harder every day; the needs on us and the requirements keep growing, whether it's in industries that are heavily regulated or industries where people's lives are at risk. We need to think about the design and the reliability of everything that we help people build. So I'm super excited about the work that we're doing around multi-cluster resiliency. I talked about the interconnections between clusters. I'm hoping in another year or two I'm going to be standing up here and you'll be programming applications to the Kubernetes model, but you won't think about where they're going, because your operations team, your cloud provider, your service provider, Red Hat as a provider, we're all going to be working together to make your applications run where you need, and they will stay running through screw-ups in config changes, application disasters, or COVID-19. No matter what it is, we want Kubernetes and the ecosystem around it to stay resilient. So I am extremely excited about that.

Great points, both of you. And with that we're at the end, and I want to thank both of you and everybody watching for keeping our tradition alive through all these virtual sessions. Have the best KubeCon North America that you can, and we'll see you next time.