Hello everybody, and welcome again to what's becoming a very regular happening: another Kubernetes update on the OpenShift Commons briefings. This time we're going to be talking about the Kubernetes 1.8 release, which should be out sometime later today. I've got both Derek Carr and Clayton Coleman with me from Red Hat to walk us through everything that's in Kubernetes 1.8, and there's a lot, so I'm going to let Clayton kick us off right away. Thanks, Clayton and Derek, for joining us.

Thank you, Diane. Good morning, everyone. My name is Clayton Coleman, architect for OpenShift and Kubernetes at Red Hat. With me I have Derek Carr, who's the lead engineer on Kubernetes at Red Hat. This talk is focused on giving a little glimpse of what's coming in Kubernetes 1.8, which will eventually make it into OpenShift 3.8. The features here are a sampling, because there's far too much going on in the Kubernetes project for us to ever cover it in a mere hour, but we'll do our best to give you an idea of the things we think are important, the things you should pay attention to, and the things you may want to learn more about. We'll leave some time at the end for questions, and we will get started.

So what's new? Kubernetes 1.8 was the biggest release ever, which is no surprise for a growing project. We had over 2,000 pull requests and 2,500 commits merged between June 30th, which was the Kubernetes 1.7 release date, and the Kubernetes 1.8 release date, which is hopefully today, although that may change as circumstances demand. We had over 380 committers. There were 39 features added to Kubernetes, which is actually a fairly low number if you compare it to the total number of pull requests, so don't underestimate the amount of change in Kubernetes. There were 29 SIGs and five working groups. A SIG in Kubernetes is the organizational unit of either a functional area or a user-focused area, and working groups are a fairly new concept that try to bring together people with dispersed interests across many different parts of the Kubernetes code base to effect real change and drive important initiatives.

In this release we moved four features to stable, and that's a term I can expand on a little: stable in Kubernetes usually means we consider the feature ready for general use, that we have strong API guarantees around it, and that we won't break the API going forward.
We moved 16 features to beta, and there was a large number of new alpha features in various areas, which I'll go into in a bit. The important takeaway is that Kubernetes is growing very rapidly, and as part of that growth this release has also seen a lot of focus inside the Kubernetes community on stabilizing not just the things we deliver, the code and the documentation and the examples and the tutorials and what it enables for Kubernetes users, but also the meta-work of making sure that Kubernetes is a successful community, that it is efficient, and that people can orient themselves and work within it.

If you followed along with the Kubernetes release, or with the Kubernetes mailing list, you may have seen that the steering committee elections are happening. The steering committee is a body that will help adjudicate, at the highest level, disagreements between various parts of the project, as well as help us put in place a formal structure that is community owned and community driven. It ensures that everybody is able to participate in the ecosystem, that there's clear ownership between areas, and that when different SIGs and different working groups are pulling in different directions those can be unified, so that we can help people make progress toward deliverable features and ensure Kubernetes is a stable place for people to run their software.

There was a new SIG added in the 1.8 timeframe, SIG Architecture. This is the SIG that tries to mediate between the different Kubernetes SIGs and help organize the direction of the project: to take the philosophies and principles that exist, from the technical "what is Kubernetes" to API conventions and general patterns we try to instill across the platform, and to give everybody in all the different SIGs a place to go when coordination between SIGs is not as efficient as it could be, or when something impacts multiple SIGs. SIG Architecture is the place to bring issues and get answers, as well as to help identify when others need to be brought in to move the process along. Part of SIG Architecture's responsibility will be to formalize the proposal process, which is how large changes to Kubernetes are proposed, discussed, designed, and iterated on. That process just got a new name in the last couple of days, the Kubernetes Enhancement Proposal, although since it's a process, that itself may be subject to change over time.

We've also doubled down across the project on investment in things like testing, contributor experience, and documentation, ensuring that the business of Kubernetes is flowing smoothly. As always, our goal with Kubernetes is to build an inclusive ecosystem that is able to serve as a powerful core for distributed applications, for microservices, for legacy apps moving into the cloud. Anything containerized should be able to run on Kubernetes, and part of that is making sure that not just the core of Kubernetes works well, but that others are able to orient themselves to what Kubernetes is doing and fit within the ecosystem well.
So with that goal, across the entire 1.8 release we really did try to focus on what was most important in Kubernetes. As Kubernetes has evolved, as we've built out new features and new areas that Kubernetes helps users accomplish, across the board in the community we felt it was a good idea to begin focusing more strongly on stability, as well as taking features we had added in previous releases and putting extra effort into moving them to a stable state.

On the Red Hat side, our focus in Kubernetes has always been on making Kubernetes boring: ensuring that it is a stable place for people to run applications, and that Red Hat can deliver that through the OpenShift Origin project and through OpenShift Online and OpenShift Dedicated in a stable and predictable way. So with this release the overarching theme was stability and graduating features from alpha to beta or beta to stable, but there were also some very specific things we did in the 1.8 timeframe to make Kubernetes better at scale.

The first of these is just straight-across-the-board bug fixing. I talked about maturing features: there was a lot more emphasis on taking existing features and closing out the loose ends that get them to the next step, versus adding new features, and on making production work well, taking feedback from regular Kubernetes users and from the various deployments of Kubernetes, both in the cloud and on-premise, and synthesizing those down into a couple of core areas for each special interest group. The first of these is, I think, one of the most impactful, and Derek, if you'd like to talk about this one, since this is really your baby.

Yeah, so as Clayton said, one of our biggest priorities here at Red Hat is to demonstrate that you can run Kubernetes and OpenShift clusters at large scale, and one of the items that made Kubernetes 1.8, which we think will broadly benefit the community, was around what happens when things go wrong. As most users of OpenShift and Kubernetes know, a great debugging tool to find out what's happening in your cluster over the life of your resources is events. Events tend not to be a problem when everything goes well, but consider environments where things may not be going well: say your pod can never start, or the image can never be pulled, or you're in some crash-looping scenario. At small scales it's great to know these things, it's great to know that my application is not working and I can look at the event. But in the long tail it's really problematic to keep being told, over and over, that your application is not working or that your pod can't pull its image.

One of the things we've observed operating our OpenShift Online offerings is that oftentimes applications get defined and then might never be able to converge on their desired state. It's great that the system tells you about it when it first happens; it's bad when it keeps telling you about it constantly over the next week, two weeks, three weeks. At scale, what we observed was that if some percentage of your applications could never run on the platform, because they were poorly configured or something like that, you ran into a spam problem against your master, which ultimately could really deteriorate cluster performance. So one of the things we did to address this was to define an event budget: a given resource has an initial budget of 25 events, and then a refill rate of roughly one event every five minutes. This had a really dramatic impact on the reliability of our clusters.
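To make that budget concrete, here is a minimal, illustrative token-bucket sketch in Go using the burst and refill numbers from the talk. This is not the actual Kubernetes implementation (the real spam filter lives in the client-side event recording machinery and keys a bucket per source object); it only demonstrates the mechanism:

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket approximates the per-object event budget described above:
// a burst of 25 events up front, refilled at one event per five minutes.
type tokenBucket struct {
	tokens     float64
	maxTokens  float64
	refillRate float64 // tokens per second
	lastRefill time.Time
}

func newTokenBucket(burst float64, refillEvery time.Duration) *tokenBucket {
	return &tokenBucket{
		tokens:     burst,
		maxTokens:  burst,
		refillRate: 1 / refillEvery.Seconds(),
		lastRefill: time.Now(),
	}
}

// allow reports whether one more event may be sent for this object.
func (b *tokenBucket) allow() bool {
	now := time.Now()
	b.tokens += now.Sub(b.lastRefill).Seconds() * b.refillRate
	if b.tokens > b.maxTokens {
		b.tokens = b.maxTokens
	}
	b.lastRefill = now
	if b.tokens < 1 {
		return false // suppress or aggregate instead of sending
	}
	b.tokens--
	return true
}

func main() {
	budget := newTokenBucket(25, 5*time.Minute)
	sent, suppressed := 0, 0
	for i := 0; i < 100; i++ { // e.g. a crash-looping pod emitting 100 events
		if budget.allow() {
			sent++
		} else {
			suppressed++
		}
	}
	fmt.Printf("sent %d events, suppressed %d\n", sent, suppressed)
}
```

With these numbers, a crash-looping pod gets its first 25 events through, and after that updates trickle out at roughly one per five minutes instead of flooding the master.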
So if you imagine hundreds of nodes, where each node has one or two pods that may not be successfully running, by design, we were able to dramatically reduce the long tail of events being sent to our masters: from an environment that was getting hundreds and hundreds of events per second down to approximately three events per second. At scale, hundreds of nodes where some percentage of those nodes have pods that don't actually run successfully, we were able to use our experience to find something and address it, so that for cluster operators who run these application platforms on behalf of other users, users who make mistakes don't wake the cluster operator up at night.

I think this was a really good data input to the community. The PR unfortunately didn't land in Kubernetes 1.7, but it did make it into 1.8, and then along the 1.8 cycle we did things to improve events further. In addition to the client-side rate limiting, we also looked at the actual event sources themselves and started to question whether a given event was actually valuable to a user or not; in some cases we made changes where appropriate. On top of that, we added an admission controller that allows server-side control to guard against event spam in practice. So basically, a lot of work was done here to allow cluster operators to feel more secure in giving application users access to the platform, so that when those users do something incorrectly it doesn't get the cluster operator up at night. I think this was a great area of focus for us, and it demonstrates our experience running Online.

Yeah, and it's very important to be able to take actual customer and user scenarios and translate those back into meaningful fixes. Part of this is closing the loop at very large scales: making a concerted effort to identify the top blockers, from users, from customers, and from community members at large who have had similar problems, and trying to synthesize an overarching effort out of it. That's something we think is a unique value in how we contribute to Kubernetes.

A second part of this, very large cluster scaling, also came up when you are running very, very dense development clusters. That's a scenario a lot of folks in the Kubernetes community don't necessarily deal with day to day, but it's something OpenShift users very commonly see: clusters with tens of thousands of applications, or tens of thousands of users who are quickly spinning up and tearing down applications that may be running as development or test environments, offering playgrounds to developers with a shared pool of resources where, at a fairly low cost to the overall organization, they can experiment in a fashion that's really going to resemble their production environment. Obviously Kubernetes tries to be as simple as possible and no simpler, and one of the improvements that went into the alpha state in 1.8 is the ability to take the API calls these very large, dense clusters make, fetching very large numbers of pods or namespaces or user information, and add capabilities to both compress those responses and break them into smaller chunks.
This was driven by actual experience from users as well as from the OpenShift Online clusters. We had already planned to do this in Kubernetes, and the 1.8 timeframe was really the right opportunity to take on these features. Compression went in in 1.7 but wasn't enabled, and we began really stressing it in the 1.8 timeframe. The chunking of very large API calls into individual results benefits the cluster itself, because a lot of integrations into Kubernetes involve listing everything and then watching for changes. It's a very powerful pattern, because you can continually check that you're at the right state, but those very large requests were having an impact on other operations on the cluster. By chunking the results, instead of asking for all pods on the cluster you can ask for the first 500 pods, and once you've processed those, get the next list. We actually plan to enable this by default in Kubernetes 1.9, and it should also be available in OpenShift, which will likely adopt it fairly aggressively. We think this is going to be really valuable for bringing down the tail latencies on API requests, and honestly the number one impact is that, for an end user, large administrative queries become much more responsive, because you start getting data almost immediately without giving up any of the consistency that you want.
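For the curious, here is a rough sketch of what a chunked list looks like from the client side, using the Limit and Continue list options that back this feature. This uses a current client-go signature, trims error handling, and assumes a kubeconfig in the default location:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	opts := metav1.ListOptions{Limit: 500} // at most 500 pods per page
	for {
		pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), opts)
		if err != nil {
			panic(err)
		}
		for _, pod := range pods.Items {
			fmt.Println(pod.Namespace + "/" + pod.Name)
		}
		// The server returns a continue token while more results remain;
		// passing it back fetches the next chunk without losing consistency.
		if pods.Continue == "" {
			break
		}
		opts.Continue = pods.Continue
	}
}
```

Each call returns at most one page plus a continue token, so the server hands the work out incrementally instead of materializing tens of thousands of objects in a single response.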
And I would note, as with the previous item, our goal with events was to keep events very useful; we actually made events more useful by focusing on the things that were actually showing up. This sort of refinement is Kubernetes maturing into an environment where you can really trust it with all of your applications.

As a corollary to both of the first two, observability of Kubernetes as a platform is very important to us. Prometheus, an open-source metrics-gathering project that became part of the CNCF last year, has a really great user experience for application authors and operators to easily pull ad hoc metrics from many different components of a scale-out system. The Kubernetes community has worked really closely with Prometheus, both to expose metrics to be gathered and, on the Prometheus side, to build in support for the kind of dynamic, rapidly changing environments that Kubernetes represents. We're really excited because we spent a lot of the last six months or so working with Prometheus in these very large clusters, with Kubernetes and other parts of the Kubernetes ecosystem, taking some of the things we were seeing and driving that level of reliability and functionality into Kubernetes.

Kubernetes comes with a stock set of monitoring, and in the 1.8 timeframe we did a lot of work to harden the edge cases, to make sure that operators and administrators looking at these large clusters can get back the data they need about how well the APIs are responding and how accurate those APIs are. We've added new metrics in a number of places, and we're really excited because, as this becomes more and more formalized, you're seeing a large move in the Kubernetes ecosystem to expose these metrics everywhere. That sort of simplification of the ecosystem, to where you can very easily get these metrics, is going to really improve the operation and running of large Kubernetes clusters. And Derek, I think this is something you're deeply familiar with as well: the changes we actually saw as we moved from etcd2 to etcd3.

Yeah, the general idea here, to me, is that there's a difference between what we can monitor in an ivory-tower environment upstream, whether in PR tests or scale tests, versus what actually happens when you monitor real-world production data. Generally, Prometheus has been an invaluable tool for us to get a handle on figuring out where there's smoke, where there's fire, and where we should focus our energy to improve general reliability across the platform. As Clayton mentioned with the etcd3 migration, we were very clearly able to see dramatic improvements in network and memory use from that move. And just sitting here thinking about it, there are other areas up and down the stack where improvements were made that we haven't even called out. Generally speaking, the OpenShift use of Kubernetes is slightly more directed than what you see in the general upstream community; we're slightly more opinionated about people following particular best practices. With things like quota, Prometheus was invaluable in identifying areas where quota was making more calls than necessary, and we were able to draft those fixes into the upstream. Our experience with monitoring in the online environments has been really beneficial for understanding what happens when real users use the platform, and it helps focus our decision-making and inform where we go to make impacts in Kubernetes around stability. Generally, I think it's just been great all around.
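The integration pattern being described is deliberately simple on the component side: register metrics with the Prometheus client library and expose an HTTP /metrics endpoint for Prometheus to discover and scrape. A minimal sketch in Go, with a made-up metric name purely for illustration:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A hypothetical component counter, just to show the pattern: register a
// metric, increment it along your code paths, expose /metrics for scraping.
var reconcileCount = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "controller_reconcile_total",
		Help: "Number of reconcile passes, labeled by result.",
	},
	[]string{"result"},
)

func main() {
	prometheus.MustRegister(reconcileCount)
	reconcileCount.WithLabelValues("success").Inc()

	// Prometheus scrapes this endpoint; in Kubernetes, finding it is
	// handled by Prometheus' built-in service discovery integrations.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```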
Yeah, and to close out this section on stability, because I know everybody's really eager to go see the exciting stuff, which is the features: the whole premise is working in the ecosystem to make sure that not just the core Kubernetes components work well, but that the supporting components people are increasingly relying on work well too, that they can be monitored, and that the due diligence people need as they build not just simple clusters but more complex clusters is there. This focus in Kubernetes 1.8 is incredibly practical, and everyone will eventually see the benefit of it, regardless of where you are today on your journey with Kubernetes.

So without further ado, we'll move to the thing everyone is really excited about, which is new features. Even though we said this was a stabilization release, it's very difficult to convince 380 people not to go do specific, targeted features that make people's lives better. Derek and I will go through the different areas of Kubernetes and highlight some of the top-level features. As I said before, a lot of these are pretty deep, and there's a ton of things that don't even show up on this list; we'll leave time at the end for questions about areas you may not have seen or are unclear about. So let's start on the exciting stuff, Derek.

Yeah, so a lot of interesting stuff happened in Kubernetes 1.8 in the SIG Autoscaling space. Some of these represent areas where we as a community, and especially from Red Hat, have tried to represent what our users were asking for and drive core features into the platform. The first item is that, as folks might have seen, there's a new incubator project around a metrics server and its metrics API. This is really setting the groundwork for a future replacement for Heapster, targeting 1.9 in the community, so it's a good foundation for us to grow on moving forward.

The second item, which is of real interest to me and something we've been trying to push through the community for close to two years now, has slowly evolved from alpha to beta in this release. Oftentimes we get requests that users want to scale their applications horizontally on a custom metric source. Initially the platform just had CPU as a scaling target, then memory got added, and now in the 1.8 release the horizontal pod autoscaler can target custom metrics as a scaling signal. As an input to that, there's a custom metrics API that third parties can implement to integrate with the horizontal pod autoscaler natively, which gives you additional signals to choose when and how you might want to scale your applications, where the particular resources Kube supported natively might not fit. So it's really exciting for us to see this graduate into beta.
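To give a flavor of the surface involved, here is a hedged sketch of a HorizontalPodAutoscaler built against the autoscaling/v2beta1 Go types from this timeframe, targeting a custom per-pod metric. The metric name and target values are hypothetical and would need to be served by a custom metrics adapter:

```go
package main

import (
	"fmt"

	autoscalingv2beta1 "k8s.io/api/autoscaling/v2beta1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	minReplicas := int32(2)
	hpa := autoscalingv2beta1.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{Name: "frontend"},
		Spec: autoscalingv2beta1.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2beta1.CrossVersionObjectReference{
				APIVersion: "apps/v1", Kind: "Deployment", Name: "frontend",
			},
			MinReplicas: &minReplicas,
			MaxReplicas: 10,
			// A Pods-type metric: the average of a custom per-pod metric,
			// served through the custom metrics API by an adapter.
			Metrics: []autoscalingv2beta1.MetricSpec{{
				Type: autoscalingv2beta1.PodsMetricSourceType,
				Pods: &autoscalingv2beta1.PodsMetricSource{
					MetricName:         "http_requests_per_second",
					TargetAverageValue: resource.MustParse("100"),
				},
			}},
		},
	}
	fmt.Printf("%+v\n", hpa.Spec)
}
```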
In addition, we've had a lot of feedback that people are curious what's going on with their horizontal pod autoscaler: why is something scaling or not scaling? So we did a lot of work to improve the visibility into the status of a particular HPA, so that when things are going right, you know why, and when things are going wrong, you can better pinpoint exactly why. In general, a lot of exciting things happened in this space, and a lot of this work lays the groundwork for more advanced things in the future, in particular around usage-based scheduling concepts. So a lot of fun stuff in this release.

Sorry, guys. It wouldn't be a presentation without some technical difficulties. Can you see my screen still? Yep. Sorry.

So with the 1.8 release we're also continuing a lot of the extension work we've been focused on from the very beginning of Kubernetes: making it easier for people to plug their own pieces into Kubernetes. We see this as fundamental to the success of Kubernetes as an ecosystem. It's not just someone who takes the code, forks it, and adds some tweaks who can change how a Kubernetes cluster is monitored or controlled or limited; those things can be easily done by people who plug in on top. In the 1.8 release there were several areas of extensibility that we continued to mature, and one of the more interesting ones is flex volumes. Flex volumes are a concept added quite early in Kubernetes to release the pressure around "I want to integrate my new storage provider and have it injected into the pods running on the cluster." The normal process is that you build some code into Kubernetes, you change the Kubernetes APIs, and you wait a couple of years; once you get to that point you're in the Kubernetes API, but obviously that won't scale to the kinds of new and interesting technologies people will add in the future. Flex volumes were the first approach that let people dynamically add new volume types to Kubernetes pods on the fly, after a cluster has been installed. There's ongoing work in the community to standardize this as the Container Storage Interface; that work will probably last throughout the next few releases, and in the meantime what we really wanted to do was ensure that flex volumes are easy to use on a cluster. So there's been work in Kubernetes 1.8 to make it very easy to inject a new flex volume provider into all nodes on a cluster through a DaemonSet: the flex volume provider is containerized and can be easily deployed on top of Kubernetes, which allows administrators both to experiment with new flex volumes and to try them out.

There's a lot of really exciting stuff being discussed that would let us do more sophisticated secrets and security integration by leveraging flex volumes. You can imagine a flex volume that injects into your pods a secret that is provided by the platform but never stored on the platform, such as a Kerberos principal, or another form of private key injected by a vendor integration for doing security across the cluster. We've also worked to make flex volumes something that can be controlled via security policy in Kubernetes. So that's one concrete example of an area where there are many different ways you may want to provide content into a pod, and we're working to make sure those APIs are stable and easy to use.
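As a rough illustration of why flex volumes lowered the barrier: a driver is just an executable the kubelet invokes with subcommands, replying with a JSON status on stdout. The skeleton below sketches that contract in Go under those assumptions; it is not a complete or production driver, and the exact subcommand set grew over releases:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// result mirrors the JSON the kubelet expects on stdout.
type result struct {
	Status  string `json:"status"` // "Success", "Failure", or "Not supported"
	Message string `json:"message,omitempty"`
}

func reply(r result) {
	out, _ := json.Marshal(r)
	fmt.Println(string(out))
	if r.Status == "Failure" {
		os.Exit(1)
	}
	os.Exit(0)
}

func main() {
	if len(os.Args) < 2 {
		reply(result{Status: "Failure", Message: "no subcommand"})
	}
	switch os.Args[1] {
	case "init":
		reply(result{Status: "Success"})
	case "mount":
		// os.Args[2] is the target mount path; os.Args[3] carries JSON
		// options from the pod's volume definition. A real driver would
		// attach, format, and mount its backing storage here.
		reply(result{Status: "Success"})
	case "unmount":
		reply(result{Status: "Success"})
	default:
		reply(result{Status: "Not supported"})
	}
}
```

Because the driver is a standalone binary with this call convention, packaging it in a container and dropping it onto every node with a DaemonSet is all the "installation" a vendor needs.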
Custom resource definitions, which are the replacement for third-party resources, got a couple of improvements this release, including the ability to do validation, and that work will continue. We want custom resources to be the easiest way to add new APIs to the cluster, and there's a ton of work continuing at both the node and the API server level to mature how you can hook into the platform as an administrator at the very lowest level. Right, Derek?

Yep, so along that line, some of the new features coming out of SIG Storage that we want to call out: generally, the initial size requested on a persistent volume claim might not be the right long-term request, so work was done in the 1.8 release to let you dynamically resize your PVC. You could grow a 10 gig PVC to a 100 gig PVC as your needs grow. The increase is intended to be transparent, and work is being flushed out to make sure it plays well with quota and all the other pieces, but basically a lot of good progress was made in the 1.8 release around this feature. In addition, it's a common request that people want to be able to snapshot their volumes and then potentially create a new PVC from that snapshot. Work was done in an alpha phase to support this in 1.8, and it's probably representative of what you'll see in future releases as it progresses to beta and stable. So a lot of the basic primitives people look for around PVCs got nice attention in the 1.8 cycle.

Networking is also an area that's continued to evolve, though we're not trying to be too aggressive here. IPVS, IP Virtual Server, is a Linux kernel feature that's been there for quite a long time; you can think of it as a bit of an upgrade over the iptables-based version of kube-proxy. There's an alpha implementation in 1.8, contributed by the folks at Huawei, that will go through some hardening and testing, and we're hopeful that over the next few releases this will let us improve service connections between pods. Network policy also continues to evolve: some of the concepts from OpenShift, like egress policy, are making their way into Kubernetes network policy, along with CIDR rules for matching, plus a lot of small incremental improvements and stabilization that make it easier for users and operators to find the right balance between running containerized applications and preserving security.
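Here is a small sketch of the network policy additions just mentioned, an egress rule with a CIDR (ipBlock) peer, built with the networking/v1 Go types. The selectors and CIDRs are purely illustrative:

```go
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	policy := networkingv1.NetworkPolicy{
		ObjectMeta: metav1.ObjectMeta{Name: "allow-egress-to-db"},
		Spec: networkingv1.NetworkPolicySpec{
			// Applies to pods labeled app=frontend in this namespace.
			PodSelector: metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "frontend"},
			},
			PolicyTypes: []networkingv1.PolicyType{networkingv1.PolicyTypeEgress},
			// Allow egress only to a CIDR block, carving out one address.
			Egress: []networkingv1.NetworkPolicyEgressRule{{
				To: []networkingv1.NetworkPolicyPeer{{
					IPBlock: &networkingv1.IPBlock{
						CIDR:   "10.0.0.0/24",
						Except: []string{"10.0.0.1/32"},
					},
				}},
			}},
		},
	}
	fmt.Printf("%+v\n", policy.Spec)
}
```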
Yeah, so another major area that has been close to my heart is how we're going to broaden the set of workloads that can run on the Kubernetes platform. Folks might have seen a recent blog post that went out about the Resource Management Working Group. This was an effort we at Red Hat kicked off, I want to say at the beginning of this year, to try to really formalize how we can take resource management in Kubernetes to the next level. From that, we've had a lot of great community participation from Google, Intel, NVIDIA, and others around how we can support more workloads, with better performance, without sacrificing node reliability. As folks have probably seen, we did a lot of work in the community through Kubernetes 1.6 to stabilize the node and provide improved reliability around things like QoS and eviction, and after that point we felt it was the right time to pivot and ask how we can make things better and support more workloads. In the Resource Management Working Group we spent a lot of time identifying which key focus areas we could tackle to drive iterative improvements into the platform, and for the 1.8 release we focused on three. The first was how we better improve CPU management on the node. The second was how we support device plugins; for folks who've been tracking the community, we had alpha support for GPUs, but in order to really graduate that to a beta or stable foundation we needed a model where you weren't having to integrate support for particular hardware devices natively into the core platform, and device plugins are a model to address that. And finally, certain workloads have particular memory requirements, and we were hearing over and over that it would be nice if you could consume things like huge pages, to broaden the set of workloads that can run well on the platform; work was done in that area as well.

So, diving a little deeper on CPU pinning, which is an exciting feature for me: folks might be familiar with the quality-of-service model we have in Kubernetes today, with its concept of best-effort, burstable, and guaranteed service tiers. In the best-effort tier, a pod can use as much resource as it can scavenge. In the burstable tier, a pod has a minimum request for a particular amount of resource but can burst above that request as resources become available. One of the things we'd observed as a community was that there wasn't a huge performance benefit in going up to the last tier, guaranteed QoS. So one of the major things we worked on at Red Hat, with our friends at Intel, was making it so that you don't take a performance penalty by running in the guaranteed QoS tier, and instead actually get improved latency. The way we chose to tackle this was a new feature in the kubelet, the node agent, where you can configure a node with a particular CPU management policy. The one policy we've been exploring heavily in 1.8, which is alpha, is what we're calling the static CPU pinning policy. What that allows you to say is: if a pod requests one core of CPU, or two cores of CPU, that pod will get access to exclusive cores for its life, and it will never move. That has a lot of latency benefits, because it's not fighting any noisy neighbors on the same cores.
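For orientation, exclusivity under the static policy keys off two things: the pod sits in the Guaranteed QoS tier (requests equal limits) and asks for a whole number of CPUs. Here is a hedged sketch of a pod shape that would qualify; the name, image, and sizes are made up:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	cpus := resource.MustParse("2") // a whole number of cores
	mem := resource.MustParse("4Gi")
	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "latency-sensitive"},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.example.com/app:latest",
				Resources: corev1.ResourceRequirements{
					// Equal requests and limits place the pod in the
					// Guaranteed QoS tier, making it eligible for
					// exclusive cores under the static policy.
					Requests: corev1.ResourceList{
						corev1.ResourceCPU:    cpus,
						corev1.ResourceMemory: mem,
					},
					Limits: corev1.ResourceList{
						corev1.ResourceCPU:    cpus,
						corev1.ResourceMemory: mem,
					},
				},
			}},
		},
	}
	fmt.Printf("%+v\n", pod.Spec)
}
```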
I think, generally speaking, this is a broadly applicable performance improvement for a lot of workload types. It's also a bit intelligent about how it chooses to assign a particular core: it will actually inspect your physical processor topology to find the best fit for your workload on that node. We don't expect this to be the be-all and end-all CPU management policy that nodes might be configured to support; other options have been explored for future analysis, for example a dynamic policy, rather than a static one, where pods might get an exclusive core for a momentary period of time. But basically a lot of great work was done to improve CPU latency.

Moving on to the next item, device plugins: as I said, GPU support has long been alpha in the project, and as a community we were trying to think about how best to move it to beta or stable. A key tenet of that is that, generally speaking, there's nothing special about GPUs versus any other hardware device, and at Red Hat it's really important to us that if you can get access to a device on your Linux host, we want to let you get access to that device in your pod. The first step toward that is what we're calling the device plugin model, and this has been another great community effort across Google, NVIDIA, and Red Hat. The idea is that we want a vendor-neutral way of allowing plugins to be deployed on the cluster, so that we can support discovery of a vendor-specific device, make that device visible up to the scheduler so that pods can actually make requests for it and get scheduled, and then, at the end of the day, ensure that when you're running in production, where devices may fail or be temporarily unhealthy, we have all the right mechanisms in place for what will happen to remedy the situation if your device does fail. So in the 1.8 release we have an alpha plugin model, and we've been validating it against particular classes of devices, in particular GPUs and certain custom NICs. This is an area we'll continue to explore moving forward, to ultimately get things like GPUs to beta or stable. So great stuff there.
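From the workload side, a device surfaced by a plugin is consumed like any other resource on the container. A small illustrative fragment; nvidia.com/gpu is the resource name NVIDIA's plugin came to advertise, and other vendors use their own prefixes:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	container := corev1.Container{
		Name:  "cuda-job",
		Image: "registry.example.com/cuda-job:latest",
		Resources: corev1.ResourceRequirements{
			// Extended resources like devices are requested via limits;
			// the scheduler places the pod only on nodes whose device
			// plugin has advertised capacity for this resource.
			Limits: corev1.ResourceList{
				corev1.ResourceName("nvidia.com/gpu"): resource.MustParse("1"),
			},
		},
	}
	fmt.Printf("%+v\n", container.Resources)
}
```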
And then the last item: let me talk about huge pages briefly. This is one of those things where, when you want it, you really want it, and the lack of it can be viewed as an impediment to bringing certain workloads onto the cluster. If you're an operator of an application that's managing a very large memory set, whether it's a particular database management system or a large Java middleware application, oftentimes those applications have been tuned to take advantage of huge pages, and not having huge pages support on the platform was an impediment to bringing on those workloads. In the 1.8 release you now have the ability to let your pod make a huge page request, and your application is properly isolated and accounted for in using those huge pages. You can consume them through all the metaphors you'd expect, both via shared memory and as an emptyDir volume. I suspect that in 1.9 and beyond this will evolve into a stable state.
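As a sketch of that surface: the pod requests a quantity of a page-size-specific resource and can mount an emptyDir backed by huge pages. Resource names follow the hugepages-&lt;size&gt; convention; the names and sizes below are illustrative:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	container := corev1.Container{
		Name:  "db",
		Image: "registry.example.com/db:latest",
		Resources: corev1.ResourceRequirements{
			Limits: corev1.ResourceList{
				// Request 1Gi worth of 2Mi huge pages, alongside
				// ordinary memory and CPU.
				corev1.ResourceName("hugepages-2Mi"): resource.MustParse("1Gi"),
				corev1.ResourceMemory:                resource.MustParse("2Gi"),
				corev1.ResourceCPU:                   resource.MustParse("1"),
			},
		},
		VolumeMounts: []corev1.VolumeMount{{Name: "hugepage", MountPath: "/hugepages"}},
	}
	volume := corev1.Volume{
		Name: "hugepage",
		VolumeSource: corev1.VolumeSource{
			// An emptyDir backed by the huge pages medium.
			EmptyDir: &corev1.EmptyDirVolumeSource{Medium: corev1.StorageMediumHugePages},
		},
	}
	fmt.Printf("%+v\n%+v\n", container.Resources, volume)
}
```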
Yeah, and in general this sort of targeting of application workloads is about being practical. There's a bit of a gap between the idealized Kubernetes many people use as a microservices platform, where you may care less about some aspects of performance and be more focused on the development efficiencies you can gain, and the broad range of applications that we want to make sure can not only run on Kubernetes but get advantages from running on Kubernetes. As we mentioned during the metrics section, one of the long-term goals is to be able to tie things back to concrete results. People use Kubernetes and love it because it's easy to deploy applications on, and people love OpenShift because it's easy to iterate on applications on top of Kubernetes. To us, closing that loop means you can run all these different types of applications, get your benefits and good performance, and then, closing the loop back, the cluster operator and people running clusters for other people can actually get good utilization, see these applications stack up and be used efficiently across an entire cluster, and see higher utilization numbers in very large, dense clusters as well as in production clusters, so that you think less about how to run your applications and they just work.

That really ties back to one of the things we care very much about at the lowest levels of the platform: ensuring that the interface between the kernel, the application user space inside the container, and the cluster is as efficient and as easy to support as possible. Ultimately, at the end of the day, containers are just Linux, and the work we do in the kernel, in device drivers, overlay, user namespaces, SELinux, and security, is all about ensuring that an application workload that runs on one Linux cluster runs consistently. When we talk about the reasons we focus so strongly on the kernel and the low levels of Linux, tying up through Kubernetes into OpenShift, it's so that we can ensure applications work correctly across the board.

And so CRI-O is a big investment area for us: a container runtime running under Kubernetes, designed to work with Kubernetes, that works on top of OCI standard containers and is able to run all Docker images that exist today. The focus for us is cutting out the parts of the container runtime that hurt our ability to ensure applications are portable and reliable. We've been working pretty hard as a team, in the community with others in the ecosystem like Intel and SUSE, to get CRI-O to its first release candidate. It is certified against Kubernetes 1.7, sorry, not 1.8 yet; it has passed all the tests there for quite a while, and we're working on getting 1.8 support shortly after the release, and then on moving CRI-O to production status. Not everyone may choose to use CRI-O. Our hope is that we can really demonstrate the value of a simpler, Kubernetes-focused container runtime and how it ties into Kubernetes as a platform that's excellent for running applications on top of Linux, which Red Hat arguably knows as well as anyone. You'll see a lot more about CRI-O in the coming weeks and days; our goal is to make sure there's a diverse ecosystem of container runtimes that can trade off different advantages for end users, while focusing on the thing that just works well for Kubernetes.

So those are the high-level features in Kubernetes 1.8. There's a ton more detail we could get into; I urge everyone, when the Kubernetes 1.8 release is announced, to go look at the release notes. There will be an infinite number of blogs and blog posts from everybody in the community talking about the things they personally care about most, and to me that's a sign of the success of Kubernetes. It's hard to point to someone today who hasn't realized the same thing Red Hat realized almost three years ago: that Kubernetes was going to be the future. We're really excited to have everybody join us on that.

Kubernetes 1.9 is still fairly early; there are a lot of things individual SIGs are still working through, and over the next few weeks you'll see that coalesce into SIG-specific goals. At the top level across the project, these are some things we've talked about in SIG Architecture and at community meetings already. Stability and bug fixes continue to be key. Continuing extensibility: we have to keep chugging through extensibility, and our goal on the OpenShift side is that extensibility in Kubernetes isn't done until OpenShift runs as an extension on top of Kubernetes. Setting that path means something as powerful and complex as OpenShift can run as an extension on top of Kubernetes; we think it's possible, and it's going to take some time to get there, but it's a key goal for us, both so that all the right mechanisms are in place and so that all the use cases we see on the OpenShift side from administrators and users, the specific controls they want, are also available, with OpenShift as a proof point. And finally, scaling improvements across the board: continuing to refine our approach as we add more and more components to the ecosystem, and that's scaling not just from a performance perspective but from a community, ecosystem, integration, and security perspective. It should be easy to extend the platform and preserve security, and each of these builds off the others. Derek, do you want to talk about some of the more specific ones?

Yeah, I'll call out a couple. Generally speaking, there's always a lot of interest in how we can provide more extensibility around patterns like initializers and custom webhook admission plugins, where admission controllers being in-tree is a challenge. I think that's an area we're continuing to invest in and validate against, and hopefully we'll be able to proceed on that in 1.9. The de-scheduler is an interesting topic area for me.
As folks might have seen, there's now a new incubator project around the de-scheduler, which is basically asking: once things have been scheduled, is there now a better place for that pod to be scheduled, today versus last week? This is the next step beyond the work that's been going on in the scheduling community around asking "do I have capacity to run this workload now?"; it asks "were my previous decisions my optimal decisions?". So the de-scheduler is an interesting focus area. Priority and preemption is another interest area within SIG Scheduling, and I think it's really critical for being able to run more cluster services as DaemonSets; hopefully we'll be able to get that over the hump in 1.9. Generally speaking, on the node-level improvements, I touched on a lot of the things that got added as alpha features, around CPU pinning and device plugin support and that type of thing; in the 1.9 release I think you'll see a lot of stabilization of those features in preparation for getting to beta in 1.10 and beyond. As Clayton touched on with CRI-O, we're passing all of the 1.7 node e2e and cluster tests, and we encourage folks to check that out. Very shortly after the Kubernetes 1.8 release you'll see a branch of CRI-O that will meet the same bar on 1.8, and what's nice about that is that all the features you'd expect just work, so all the metrics gathering you get from cAdvisor today now works with CRI-O as well. Generally speaking, though, 1.9 is a very short release window, so I think the best thing we can do is focus on stability in that timeframe and responsibly grow the things we've started to completion over the next one to two releases.

And in closing: across the board, Kubernetes is a long-haul project for us. We want Kubernetes to be the best place to run containerized applications. We want it to be transformational to how large organizations build and develop software. We want it to be a stable ecosystem that allows people to orient themselves, to provide value, to build solutions that work for other people, and to make all of that easy to run, secure, and manage. We think that just as the operating system, Linux, was transformational in making possible the world we have today, Kubernetes will help build the world we'll see tomorrow. So expect us to keep walking this path of making it a predictable and excellent place to run applications. And we'll take questions if anyone would like to ask.

There are a couple, Clayton and Derek, and thank you for this. I think your point stands that the best feature is the community: a lot of great work has been done by lots of organizations and individuals on this release, so it's a pretty notable one. One of the questions was: has there been any notable progress on the service catalog in 1.8?

You know, I knew there was something really important I was forgetting, so there's a slide that's missing; thank you for that. Service catalog went through a ton of work in 1.8. A number of people from quite a few companies worked extremely hard to bring it to beta status, and there are kind of a few loose ends.
The goal is to make it beta very shortly after the Kubernetes release. The service catalog, in a lot of respects, is the first extensible part of Kubernetes: it's something that runs on top of Kubernetes and plugs in, but it's not actually tied to the Kubernetes release. It has been a test bed and has helped pave the way, and the folks involved have certainly jumped through some hoops, but it's going to make everything else in the Kubernetes ecosystem better. So the goal is to get to beta and have that available for people to consume and use very shortly.

And one other thing, maybe I missed it and maybe you missed it too: was there anything on federation support, and is it still being worked on? Derek?

Yeah, I'll touch on that. Over the Kubernetes 1.8 release there was a federation face-to-face where a lot of the core contributing companies got together, sat back, and asked whether federation was going in the right direction and what we could do to accelerate it. Folks might have seen some announcements that went out: SIG Federation will be renamed to SIG Multicluster, and one of the challenges we're looking to address, looking at things like service catalog as a proof point, is how we can decompose federation into a smaller set of items geared to particular use cases, rather than one large monolith called federation. Out of that, there's an effort going on right now, and hopefully we'll see more in 1.9, to move federation out of the Kubernetes tree proper. In that move out of the tree, it's being decomposed into pieces. There will be a cluster registry component: for folks who've been tracking federation, Cluster was the unique API resource offered by that project, and generally speaking everybody thought the cluster registry was a useful concept and a foundational tool on which to build a lot of other tools. So out of that SIG there's an effort to decompose the cluster registry into a standalone deployable artifact, using the federation code base as a base to kick that effort off. And then, generally speaking, one of the other things we at Red Hat have been pushing hard on in federation is ensuring it has a stable lifecycle and release cadence. I think there had been some confusion across the community about what was meant when federation said a particular API resource had reached beta status.
It wasn't always clear what was meant by that, so what we're trying to do, by moving the federation code base out of tree, is get into a cadence where a particular release of Kube goes out the door and then federation verifies that it functions against that stable release. In the same way that the service catalog will say "I work well on a given Kube platform," federation will probably start to trail mainline Kube releases, so that some period after a Kube release, plus two weeks, plus three weeks, the exact number still to be figured out, there would be a federation release. That gives you the hardening you need to know that, as things land in a Kube release at the very last minute, federation has had the chance to respond and validate against them. So generally speaking, federation is still incubating and growing, and it's starting to decompose so that we can accelerate getting its use cases out to the community.

All right, and there are a couple more questions if you have time. This might be a little detailed, but on the autoscaling feature, someone's asking if you can scale an app based on a combined condition of multiple metrics. Their example: scale when CPU is above 80% and memory is above 75%. Have you taken a deep look at that yet?

Yeah, so right now that combinatorial scaling target is not something we support natively, but it's interesting to sit back and reflect on, so I'll take that back to SIG Autoscaling and dig into that use case a little more to see where it may or may not break down.
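For context on what the API does support: the v2beta1 HPA accepts a list of metrics, but the controller computes a desired replica count per metric and scales to the maximum, which behaves like an OR across thresholds rather than the AND asked about here. A hedged sketch of that shape, with illustrative thresholds:

```go
package main

import (
	"fmt"

	autoscalingv2beta1 "k8s.io/api/autoscaling/v2beta1"
	corev1 "k8s.io/api/core/v1"
)

func main() {
	cpuTarget, memTarget := int32(80), int32(75)
	// Two resource metrics on one HPA: the controller evaluates each and
	// takes the larger recommendation, so either threshold can trigger a
	// scale-up on its own. There is no way to require both at once.
	metrics := []autoscalingv2beta1.MetricSpec{
		{
			Type: autoscalingv2beta1.ResourceMetricSourceType,
			Resource: &autoscalingv2beta1.ResourceMetricSource{
				Name:                     corev1.ResourceCPU,
				TargetAverageUtilization: &cpuTarget,
			},
		},
		{
			Type: autoscalingv2beta1.ResourceMetricSourceType,
			Resource: &autoscalingv2beta1.ResourceMetricSource{
				Name:                     corev1.ResourceMemory,
				TargetAverageUtilization: &memTarget,
			},
		},
	}
	fmt.Printf("%+v\n", metrics)
}
```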
So much more of an open shift use case on the Those metrics and the alerts within it, you know integrating with Prometheus really well our goal would be We will offer an out-of-the-box set of Collectors that gather all the metrics of the platform And we do want it to be possible to take and collect additional metrics now obviously with metrics You know every I think this is a little bit like backups or Or security everybody has a slightly different approach, but everybody's trying to accomplish the same goals We're going to try not to be too prescriptive on exactly how the metrics operationally are calculated We want to take advantage of the flexibility of Prometheus To slice and dice metrics in a couple different ways Our goal will be to take most of the elements of the Prometheus ecosystem that work well and begin to support them in Out-of-the-box gather those for the cluster and for the components running on the cluster The next step would be making it easy to use Prometheus within a namespace or in a set of namespaces So just like we have a Jenkins image and open shift that you can use that integrates well with the platform We'd like to have a fairly simple integration there for what you might call tenant Prometheus And then the third step the third step down the road would be a Prometheus that can do multi-tenant metrics at scale And actually that's as part of that custom metric stuff that we talked about We actually anticipate that being one of the first paths where that Prometheus would be used But multi-tenant Prometheus is a somewhat complicated project. We don't want to jump too early into it So we're gonna we're kind of gonna take baby steps through the custom metrics work To gather metrics from all the applications on the platform at a very high level for the purposes of autoscaling And then slowly make it easier for operational teams to run their own Prometheus together I would say most people using Prometheus today You'll see our new config file. You'll see our images. We have a set of tools around Securing that Prometheus. It should be a fairly easy switch and we'll definitely want to work with people who have complex Rules and configs that we may not have thought about and make sure that's easy for people to integrate into the cluster monitoring I Think that's all we have time for in terms of questions as always great job Clayton and Derek on on this one and If you have more questions, or if you want to take a watch this video again, we'll just post it on our YouTube channel And post it on blog that openshift.com So it'll be up probably by the end of day today and as we said We'll put links to the Kubernetes 1.8 release notes so that you can reach that and if you are coming to kube-con You can hear probably both Derek and Clayton talking again on the release. That's hopefully 1.9 ish by then on December 5th The OpenShift Commons gathering will send you things and register for that too. So again Derek for all your work and everybody in the community for all the work that's gone into Kubernetes We'll keep in touch Thank you. Thank you