Hello, KubeCon. Thank you all for coming out here. This is an amazing and overwhelming week. My name is Tim Hockin. I'm a software engineer at Google. I work on Kubernetes. My name is Michael Rubin. I'm a tech lead and a manager at Google, also for Kubernetes.

As Michael and I were getting ready to submit proposals for KubeCon, we had this topic that we wanted to bounce around, and we thought, hey, if we submit a proposal then we'll be forced to think about it. And it almost didn't work. So we submitted this proposal with a very rough sketch of an idea we've been chewing on for nine months-ish. The question is: how do we manage Kubernetes? This project has grown exponentially and it's going to continue to grow, and it's sort of an amazing thing, unprecedented in many ways, the speed at which this is happening. We look at the way we manage Kubernetes, we look at the things we hear from our users and from our developers, and we're asking: how do we refine this, how do we make it better?

So I want to start real quickly with a little bit of a retrospective on where Kubernetes is today. Everybody here has heard of Kubernetes, right? All right, good. Today, Kubernetes is a fairly large system. We're pushing three million lines of code, which makes it one of the bigger Go projects in the world. By most metrics, it is either the most active or one of the top couple of most active projects on GitHub. We have thousands of developers, hundreds of meetups. And on top of all that, it's still growing. But it's not a monolithic system. It's not one binary that you can use and it does everything. It is, in fact, microservices in some sense: a bunch of cooperating pieces that are very loosely coupled through APIs and work together.

So this is a very high-level architecture diagram. You can see a little bit of how the components work. On the left, you've got our master components. On the right, you've got our node-side components. But these pieces are all individual binaries and they're all talking to each other. This slide is a total lie, because the system actually looks more like this, and I had to leave out all of the smaller components just to make sure that you could actually read the text. This is just some of what's involved in running a Kubernetes system.

The result is that we want Kubernetes to work in many different ways and have many different functions, and so it needs to be extensible. This has been one of the core principles of Kubernetes. The approach that the community and the technology have been taking is to put things out of tree in order to make things scale and to have velocity. The idea is to take the functionality, the drivers, and other pieces of Kubernetes and move them out of the central repo, where they can be worked on by the specialists and have their own release trains and their own constraints. I'm most aware of this with the container runtime, storage, and network abstractions, things like CRI, CSI, and CNI, but it's been happening across the entire system: cloud provider integrations, operators and add-ons, not just runtimes and drivers. The idea here is that we have work that goes on in the central repo and we can extend it outside of that repo.
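To make that pattern concrete, here's a minimal, hypothetical sketch of the idea. The names below are invented for illustration; the real interfaces, CRI, CSI, and CNI, are gRPC services with much richer contracts. The shape is what matters: the core programs against a small, stable interface, and vendors ship implementations out of tree, on their own release trains.

```go
package main

import "fmt"

// PodSandboxConfig is a stand-in for the much richer configuration the
// real Container Runtime Interface (CRI) passes over gRPC.
type PodSandboxConfig struct {
	Name      string
	Namespace string
}

// Runtime is a hypothetical, simplified version of the idea behind
// CRI, CSI, and CNI: the core of Kubernetes talks to this interface
// and never links the vendor code directly.
type Runtime interface {
	RunPodSandbox(config PodSandboxConfig) (id string, err error)
	StopPodSandbox(id string) error
}

// fakeRuntime is one possible out-of-tree implementation; a containerd
// or CRI-O shim would live in its own repo and version independently.
type fakeRuntime struct{}

func (fakeRuntime) RunPodSandbox(c PodSandboxConfig) (string, error) {
	return c.Namespace + "/" + c.Name, nil
}

func (fakeRuntime) StopPodSandbox(id string) error { return nil }

func main() {
	var rt Runtime = fakeRuntime{}
	id, _ := rt.RunPodSandbox(PodSandboxConfig{Name: "web", Namespace: "default"})
	fmt.Println("started sandbox", id)
}
```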
But if we take a look, you know, the goal is still to have a fully formed system that users can use when they download it from the central repo. If we look, though, at where this trend goes, we're going to see increasingly that there's some assembly required in this system. And if we keep looking even further at the effects of this trend, or if we keep clicking, we find that you're going to have to download Kubernetes from the central repo, and then you're going to have to find the other components that you need in order to make it work in your environment. And then you're going to have to download those components. And keep clicking. Then deal with the version skew and make sure they're all at the right versions. And keep clicking. And then you're in test hell, or presentation hell; you can see the pain this presentation software has given us just at the prospect. Getting them all to work together, pulling them all into your own environment, and then testing them to make sure they work before you put them in front of a customer. The result is... it's unusable to most users, and apparently to most presenters. For the bare user, the end user who should just be able to grab a version of Kubernetes and get going, it's going to become untenable. It's not going to work. And we're pushing out the very people who don't have infrastructure teams and large amounts of resources to make Kubernetes viable.

So as we stare at this problem, and I think it's a real problem, the idea that we're going from this thing that works to this thing that is a piece of a thing that works, a mental model started to emerge. And I thought it was interesting to do some comparative analysis to another software system that you guys might have heard about: Linux. It's a little successful. And it's really the only project I could identify that compares, I think, to this ecosystem and the way this thing is evolving.

So Linux, as a system, is a decentralized system, right? There's the kernel, which is the heart of it and really the thing we mean when we talk about Linux, but there's installers, there's shells, there's programming languages, there's tools, there's GUIs. There's all these pieces that you need to actually make a Linux system usable. A kernel on its own isn't very useful, right?

So taking a little deeper look at the components: the kernel is one big C program, right? Anybody ever contributed to the kernel? It's kind of fun. It's a little daunting, but it really is just a C program. It lives in its own Git repository. It has its own world of Git tooling and process and workflows and people, but none of the tools, none of the add-ons live in that Git repo. It is its own thing. And it releases fairly frequently for such a big piece of software, which is pretty amazing. I took a look at the release stats for the 4.x releases, and it cuts a kernel release about every 65 to 70 days. And it only versions x.y, major, minor, right? There's a 4.2, and there's a 4.3, and there's a 4.4, and if there's a bug in 4.4, it'll be fixed in 4.5. The community picks up the slack and says, we're going to offer 4.4.1 and 4.4.2. There's a split there: the main kernel developers are focused on progress within the kernel, and there's a different set of people focusing on stability.
So everything else that you need to make the kernel useful is developed separately, outside of that development environment and outside of that cadence. The do-it-yourself kernel system is really not tenable, much the same way as the Kubernetes trend we just described. Around 1992, the concept of a Linux distro emerged, and a lot of people got together and put in a ton of effort to amass the tooling and the other parts around the kernel in order to make this technology viable. Today, everyone uses a distro. In fact, looking online, it looks like even Linus uses a distro.

Distros serve different needs. You can take the same Linux kernel and configure it differently with a distro and have it focus on an embedded processor, maybe sitting inside a phone or controlling a roller coaster, or you can have it run your desktop. You can even have two different distros that serve the same function in very different ways, with different opinions: one can be fully featured, one can be very lean and very hands-on. Most people using distros care more about the distro version than the kernel version. Distros release very slowly, quarters to years, in order to get it right, depending on the stability of the distro. Like I said earlier, the distros embody the different use cases of the kernel.

What are the specifics inside a distro? What is it? From some quick Wikipedia searches and some long late-night conversations between me and Tim, we put it in these four categories. First, platform support: what are the drivers, what is the hardware environment that the kernel should be configured with in order to operate successfully and easily? Second, packaging and component lifecycle: all of the applications and libraries that sit on top of the kernel, and how you install, discover and manage them. Third, support, security and testing: distros do a really good job of making sure that if there's a black swan event or a bunch of CVEs relevant to them, they're patched quickly, and also that the whole system is integrated and works stably for the user. Fourth, simple installation: installing Linux used to take tons of floppy disks, a lot of time, and a lot of expertise. Today, most distros make this a very, very simple process.

So where does Kubernetes sit in this spectrum? Is it a kernel? Is it a distribution? Where do we want it to go? That's really what we wanted to talk about today. In the vein of the same comparative analysis, what does Kubernetes do? Distros release every 6 to 12 months. Kernels release every 6 to 12 weeks. Where does Kubernetes fit? It fits closer to the kernel side. We release every quarter, and we've been pretty successful with that. It's a little slower than the average kernel, a lot faster than the average distribution.

Kubernetes today includes most of what you need to run on the major clouds, right? Google, Amazon. But the list of clouds in the world is changing, and the list of clouds in Kubernetes is not changing nearly as fast. In the future, I'm not sure it will be true that you can download Kubernetes and have it run on most cloud providers. Likewise, Kubernetes today supports a bunch of drivers, because everything's in tree, but as we get more vendors and the ecosystem grows, I'm not sure that will stay true. I think you will have to get vendor drivers for different environments.
Kubernetes also ships with some third-party add-ons. It ships with DNS, it ships with a UI, it ships with logging and monitoring out of the box. But it doesn't ship a whole lot of those. There's a whole ecosystem that Kubernetes doesn't bundle; there are some really cool projects out there that a lot of people probably want, but we don't put them into the Kubernetes release, the thing we cut every 12 weeks. One thing we don't do at all is carry patches. We don't have an upstream project that we apply security fixes to and then ship. So it's different from a distro in that way. Kubernetes does release medium-term-support x.y.z patches: we have several live branches that we keep going, and we release patches for them over the course of several quarters. So in this sense, it is more like a distribution. And from a management perspective, Kubernetes lives in a small number of Git repositories, but not one. In fact, the trend and the direction is that we're pulling it apart; Ryan quoted today, I guess, that there are actually about 90 Kubernetes repositories. But the majority of the core of the system lives in one monorepo. So in this sense, it's somewhere in the middle between a kernel and a distribution.

So if we use that same model we discussed earlier for looking at Linux distros, what does upstream Kubernetes look like as a distro? We're going to use red-yellow-green, because it's simple enough for me to understand, where green is great, red is not a lot of progress just yet or some confusion, and yellow is maybe on its way, with mixed status.

Platform support, I think, is yellow; it's in the middle. There are a lot of drivers in tree that are useful for folks, maybe some of the core ones, but there's increasing demand every day for more functionality, more cloud support, and new environments inside the Kubernetes upstream, and it's going to become increasingly difficult to maintain even the current level of support.

Packaging and component lifecycle: there doesn't seem to be a simple, discoverable way to find other packages, or even a reliable way to make sure that when you install and update packages, it works portably across clusters, or even across distros.

Support, security and testing: there have been some heroic efforts to stay on top of security issues inside the upstream distro. There are a few mailing lists here and there and some very, very impressive efforts from people volunteering their time from different companies, and sometimes other issues get tackled too. But it's still a little bit best-effort. And in testing, again, heroic efforts are being put in, but given all of the possible configurations, it is hard to really know how hardened the system is when a release comes out.

Simple installation: a lot of great work has been going on over the past year, a lot of things that have been checked into the tree. I want to give a shout-out to kops, because Justin's here in the front row and we love embarrassing him. But still, we don't have a standard solution, and it's often one of the first questions new users have: how do I install this system onto, let's say, my laptop, and how do I get things up and running so I can really start to explore Kubernetes?

Then there are the other distros. There are at least 30 other distros today, and more and more growing.
You come to KubeCon and you see new names I didn't even know about, coming in here to add this value and fill in this gap. Many of them have made their own decisions about either productizing the platform support that's inside the upstream distro or adding their own. Packaging and component lifecycle still seems to be a challenging area, and I think this is one where upstream possibly has an opportunity to lead. That would help these derivative distros figure out a good way for developers to create add-ons and make them discoverable and manageable, and for users to be able to find, download, install and update them. Support, security and testing is something that distros simply have to do: if you're creating a distro and it's not secure and it's not stable, well, it's not going to be successful, so there's a very strong incentive to make that work, and they generally do. And simple installation falls in the same category. Whether you're a cloud company, where all you really need is a few clicks and then boom, you've got a Kubernetes cluster, or you go another way, like OpenShift does, where they not only bundle Kubernetes but even include things like a whole SDN in order to make it very easy to set up your cluster, installation is being approached in many different ways by the other distros.

So with all these distributions out there, we have some real risks. We have the risk of fragmentation, of terrible user experience, of non-portability. Kubernetes is, at its very core, about portability. We really want people to be able to depend on the portability of their applications if they're building them on Kubernetes. So we have a conformance program for Kubernetes that launched a few weeks ago. It's called the Certified Kubernetes program, and there are people from Google and from many other places working to robustify it, to make it a real thing. I've got stickers. Apparently we've got stickers now. Yay, stickers. This refers to the API, though; this is all about the Kubernetes API. If you build your applications to the conformant API set, then you should know that you can take your Kubernetes application, pick it up out of your on-premises data center, and drop it onto Google Cloud or onto Amazon or onto any other conformant implementation. And all of these distributions, Docker and Amazon and Oracle and Google and Microsoft, have all committed to staying conformant.
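To make the portability point concrete, here's a minimal sketch, assuming a recent client-go and a kubeconfig in the default location, of a program written purely against the core API. Nothing in it knows or cares which conformant distribution is serving that API:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load ~/.kube/config the same way kubectl does. Which conformant
	// cluster this points at, a cloud provider or kubeadm on a laptop,
	// is invisible to the rest of the program.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	// List pods through the standard, conformance-tested API surface.
	pods, err := client.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d pods in the default namespace\n", len(pods.Items))
}
```

Point this at any cluster that passes conformance and it behaves the same; that guarantee is the whole product of the program.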
So Michael and I, we sat and we talked about this, and we argued about whether distros were a good thing or a bad thing, whether we could stop them, whether we could have prevented them. And I came to the conclusion that distros were inevitable. They were always inevitable. It was going to happen; there was nothing we could do to stop it, and we were sort of naive to think that maybe it wouldn't happen. The question in front of us now is: how are we as a community going to organize our project, organize ourselves, in light of this fact? So I'm going to present you three options. This is where we start the conversation. Unlike most talks, I don't actually know the answers. This is different.

So, option the first: we could just ignore it. This is sort of what we've been doing to date. Right? La la la. There's no such thing. This is not going to play out real well for us. Other people are going to continue to make distributions. We will not have any coordination across the distributions. We, as a community, will have no say in what happens with them. They are going to invent tools. We will have the great apt-versus-RPM war all over again. I'm not sure that's a good thing. The net result is going to be a lot of contention, people competing on where they're different. We've seen this in other communities, and I'm not sure it's what we want for Kubernetes. But it's a valid option. It's going to be a bad user experience, and maybe, maybe, maybe, if we're lucky, at the end of three or four years there will be a couple of major players. But how many distributions are there for Linux right now, Michael? For Linux? Yes. 600, I think, was the number we came up with when we tried to count them all. 600. I'm not sure we want to do that with Kubernetes.

The second option is we could go all in on this. We could make the one true distro. I totally had an awesome joke here and Jessie Frazelle scooped me on Twitter, so I'm not going to do it. But this is going to be a big effort. If we're going to go down this road, we're going to need a team of people to focus on it. We're going to need marketing. We're going to need UI people and UX research. It's going to be a huge effort, and it's going to be a major distraction from the Kubernetes project. And it's a different skill set, a very different focus. You don't want me designing your UI, I promise. And you know what, at the end of the day everybody else is going to do their own thing anyway, because people have opinions. For every opinion we take, there's an equal and opposite opinion. The result of this, I predict, is the same as option one: people are going to do their own thing, and it's going to be fragmented and politicized.

So what if there's a happy medium? When you present a Goldilocks problem, too hot, too cold, what does the middle ground look like? What if we formalize what we already do? What if we focus upstream on creating a correct and stable and controllable distro based on the functionality we've already got in Kubernetes? The goal is not to cannibalize or stop any of the other distro efforts. We need them, the community needs them, and the technology needs them to continue doing the great work they're doing. But what if this is an opportunity for the open source community to lead and provide guidance and some standardization? I'm betting a lot of these derived distros would actually appreciate not having 16 choices, and maybe limiting it down to one or two. The result could be that it forces us to not just keep these questions in the closet, but to clean up our thinking and put it front and center. It would help us define tools and standards. I know installation is hard, but I also know this community can tackle it once we set our minds to it, and I think that if we centralized our efforts, it would become a thing of the past; an upstream distro, or an upstream effort, could make it very easy to install Kubernetes. The neat thing is that the derived distros would benefit from staying close, and we'd decrease fragmentation in areas beyond the API. So when people said, I'm using Kubernetes, there would be less confusion over what that really means.

So we have some concrete ideas, and we're going to throw them out there.
These are not proposals. These are not direction. They barely rise to the level of ideas. But I thought we would be remiss to do this talk and not have some concrete calls to action.

So, step one: let's pick an installer. Let's put all of our wood behind one arrow and actually try to make one installer that works really well. We've got about two dozen installers now, and I don't harbor the belief that we'll get to one that everybody uses, but we can make one that we believe works really well in almost every case. People will still choose otherwise, but we can have the one that works well. Alternatively, maybe we don't need one installer; maybe we need one set of infrastructure that all the installers use, so that we get the same net result no matter which installer you choose.

I remember the day when I was able to easily install a whole catalog of programs on my Linux distro, and I thought that was just such a huge piece of functionality. With Kubernetes today, it's really hard to even know what the options are and who's doing what in the world. What if we could formalize an add-on system to make add-ons discoverable, either through a central repository or through some system that at least allows us to enumerate them, make the ownership of each one clear, and even know the last time each one was worked on: is this fantastic add-on stale, or is it up to date and ready to go with the latest version of Kubernetes? These are basic questions that are sometimes hard to answer today, let alone managing, updating and deleting add-ons on your cluster. I showed this slide to some people today, and they said it's really important to be able to say that upgrading your add-ons should be as reliable, and make you as unworried, as upgrading your Kubernetes cluster itself. You should never be afraid to type the word upgrade and hit return, unless you're doing something pretty fundamental, and in that case the system should ask: have you backed everything up, have you strapped in your seat belt? I'm not sure we have a system to do this today, and wouldn't it be great if we had something with that level of simplicity? We could start by just looking at the cluster add-ons we have and trying to figure out how to track them upstream, as the sketch below suggests.
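As one sketch of what "discoverable" could mean, here is a hypothetical shape for an add-on registry entry. Every type and field name here is invented for illustration; nothing like this exists as a Kubernetes API today:

```go
package main

import (
	"fmt"
	"time"
)

// AddonManifest is a hypothetical registry entry answering the basic
// questions raised above: who owns this add-on, where does it live,
// when was it last touched, and which Kubernetes versions does it
// claim to work with?
type AddonManifest struct {
	Name        string    // e.g. "dns" or "dashboard"
	Owners      []string  // people or SIGs accountable for it
	Repo        string    // where the source lives
	LastUpdated time.Time // staleness signal
	Kubernetes  string    // supported version range, e.g. ">=1.8, <1.10"
}

// Stale flags add-ons nobody has touched in roughly two quarterly
// release cycles, one crude answer to "is this add-on abandoned?"
func (m AddonManifest) Stale(now time.Time) bool {
	return now.Sub(m.LastUpdated) > 2*12*7*24*time.Hour
}

func main() {
	m := AddonManifest{
		Name:        "dns",
		Owners:      []string{"sig-network"},
		Repo:        "https://github.com/kubernetes/dns",
		LastUpdated: time.Now().AddDate(0, -2, 0), // two months ago
		Kubernetes:  ">=1.8, <1.10",
	}
	fmt.Println(m.Name, "stale:", m.Stale(time.Now()))
}
```

Even metadata this thin would let a central repository answer the ownership and freshness questions mechanically instead of by word of mouth.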
Tim loves this picture, by the way, so I'm just letting you know that if you want to make him happy after the talk, just say you love this picture and you really appreciate it, and it will warm his heart. Woodrow Wilson as a dentist.

We drew this line earlier: are we a kernel or are we a distro? The spoiler is, we're both. We have both properties, both problem sets, within our project. So, a concrete proposal: let's extract the kernel and manage it differently. It's going to hurt. That's why the dentist. If we manage the kernel as a kernel, we can get velocity for our kernel developers, and I apologize to all the real kernel developers in the audience. We can get them to move faster, to focus on progress, to focus on the technology we need to drive the system. We could release it faster. What if we could cut a Kubernetes kernel release every six weeks? Just to put a stake in the ground, six weeks sounds like a pretty good number. We could even manage it tick-tock style: six weeks of adding features, cut a release, then six weeks of stabilizing it. We already have our code slush and stabilization freeze, which we're in right now; it's sort of break it for six weeks, fix it for six weeks. We could formalize this. We could manage the release process. I think it would change the way we think about the tools, the process, code review, and feature development if we had a faster release train. It would also change the way we think about compatibility and API deprecations. The main point here is that the kernel is not the distro, and we don't need to manage them the same way.

So what is this distro going to look like? What would an upstream distro look like? This one took me a while to wrap my head around. The idea is that we've got this other repo, or set of repos. You take that kernel Tim just talked about, something more minimalistic, focused on the core technology but not on those other four towers we discussed earlier, and we put everything else in one place, to spare our users the scavenger hunt every time they want to run an upstream version of Kubernetes. We include the add-ons and the installers. Another thing to think about is container images. Today, if we upgrade the version of busybox that, I believe, kube-proxy uses, say because we find a security vulnerability in busybox, does the whole version of all of Kubernetes have to roll to a new patch version? That's probably not the right thing to do in a kernel, but in a distro it makes a lot of sense to batch these things together. The constraints are different and the focus is different. It's a different product, so to speak, with a different result, and it can make things a bit more nimble.

So if we're going to do this, if we're going to take control and start to think of ourselves as a distribution and manage it like a distribution, there are a few other tactical changes we should undertake as a project. Aside from the fact that we should only ship code that we own and can reproduce, we need to make it really clear when versions change. Today we have Kubernetes x.y.z, and if we need to change the packaging, not the binaries, but the packaging of some of our components, we have no real option except to roll to x.y.z+1. That's not great. Every Linux distro out there has a concept of distinguishing the version of the thing from the version of the packaging of the thing. We never did that, and we should fix it. This is a big deal, and it's actually significantly harder than it sounds, because we've now got to retrofit it into the system. It hurts velocity; it makes it really hard sometimes to make quick decisions or to add functionality to the system. It's hard to overstate the obstacle this creates.
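Debian-style packaging makes exactly this distinction with a revision suffix: "1.9.0-2" means the second packaging of upstream 1.9.0. A hypothetical sketch, not anything Kubernetes ships today, of what that could look like:

```go
package main

import "fmt"

// DistroVersion separates the version of the thing from the version
// of the packaging of the thing, the way Debian's "1.9.0-2" does.
// This scheme is invented here for illustration.
type DistroVersion struct {
	Upstream string // the Kubernetes x.y.z the distro wraps
	Revision int    // bumped for packaging-only changes
}

func (v DistroVersion) String() string {
	return fmt.Sprintf("%s-%d", v.Upstream, v.Revision)
}

func main() {
	v := DistroVersion{Upstream: "1.9.0", Revision: 1}
	fmt.Println(v) // 1.9.0-1

	// A fix to a manifest or a bundled busybox image re-rolls only the
	// packaging; the Kubernetes binaries themselves are untouched.
	v.Revision++
	fmt.Println(v) // 1.9.0-2
}
```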
So along with that, I'm proposing that we start to slow the project down. I've been quoted several times this week talking about boring infrastructure; people keep hearing us talk about how Kubernetes is getting boring. Quick show of hands in the room: how many people are frustrated at the pace of Kubernetes, that it's just too damn fast? Yeah. Wow. I'm with you. I agree. We've been cutting a release every quarter for a couple of years now. Nobody wants to be six releases behind, and nobody wants to update their clusters every 12 weeks. The natural result is that you fall behind. Worse, we pin our deprecation policies on releases, which means if you're more than a couple of releases behind, I can't even guarantee you a safe upgrade from 1.5 to 1.9 unless you touch 1.6, 1.7 and 1.8 along the way. So I've turned your one-week upgrade into a one-month upgrade. I don't think that's tenable.
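The arithmetic of that upgrade path, under the rule that you can only safely move one minor version at a time (which is effectively what pinning deprecations to releases forces), looks something like this; the function is a sketch for illustration:

```go
package main

import "fmt"

// upgradePath returns the sequence of minor releases you must pass
// through, assuming each hop can only cross one minor version.
func upgradePath(fromMinor, toMinor int) []string {
	var path []string
	for m := fromMinor + 1; m <= toMinor; m++ {
		path = append(path, fmt.Sprintf("1.%d", m))
	}
	return path
}

func main() {
	// Four releases behind: one upgrade becomes four.
	fmt.Println(upgradePath(5, 9)) // [1.6 1.7 1.8 1.9]
}
```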
If we're going to manage this thing like a distribution, we need to be talking about quarters to years for distribution releases. Think Ubuntu LTS: they offer multiple years of support and they don't cut releases that often. But it would be neat, too, if we offered the option. There are machines in my house that are very LTS-based, and they've got the photos that I never, ever want to get damaged. And then there are other machines where I need to be on the latest, where I'm playing games and I want the latest feature of something, and it needs to be updated. I know that we work with users, and sometimes even customers, who do want to be on the latest, who understand the risk and want to have that option. There needs to be an opinion, and flexibility, to do that.

I think the cool part here is that if we decouple the kernel development from the distro development, we have a skeleton that other people can derive distros from. They can pick up the intermediate x.y.z kernel releases between the distro releases, but they don't have to. We're not forcing people's hands. You don't need to take all of the things to update some of the things.

This is going to create a need for almost a new community and a new set of roles. Managing a distro generally involves a different set of satisfaction loops for the people who do it. At least in Linux, what I've noticed is that distro people are often satisfied and excited by, and have skill sets for, different things than the people implementing RCU or hugepages and pieces like that. Like Tim said earlier, I'm not really sure we want Tim managing your monitoring UI, but he's exactly the person you want managing your deep network innards. Finding ways for these different groups to work together deliberately, and creating a whole new group of people who can contribute and focus in these areas, is inevitable. It's already happening. And if we're going to do this deliberately instead of organically, now is the time, because if we don't, we're probably going to fall into those option-one and option-two results we alluded to earlier.

So this talk is both a description, in some detail, of the situation we're in, a look further down the road, and, to some degree, a call to arms: there's opportunity here and need here, and I think it would be much appreciated by everybody if these things are interesting to you. I talk to people all the time who would love to do these sorts of things and don't yet feel empowered to get involved. So, with that, we're done. Our terrible ideas are out there now. Please get out your biggest guns and shoot them down. I would love to make this just the start of a conversation; I think as a community leader it's my job to throw these things out and see what sticks. I would love to talk to anybody who wants to talk about it. We've got some time for questions now, but afterwards we'll be around.

Somebody raised the point today that we are on an exponential growth curve. We're going to have more users than we do today, and we will never have more freedom than we do today. So if we're going to make big changes to the community and the way things operate, now is the time to do it.

Questions? I think there are mics down in the middle, so if people want to queue up at those mics, or you can just yell.

The question is: this sounds like something for the steering committee to head up; have they been involved? This topic has definitely come up; it's come up no less than a dozen times this week already in various conversations. We had the community's Dev Summit on Tuesday, the day before yesterday, and it literally came up in every session I was at. Literally, I put out a plug for this talk in every single session because it kept coming up. So I definitely think it's on a lot of people's minds. It is a big topic. The steering committee, the first thing they're going to do is delegate. As someone who's been watching the steering committee for a while, I think they'll be the first to say: don't look to them for all of the answers. This conversation has been coming up among customers, among other distro folks we interact with, among many of the SIGs. We can all see the progression, and feel that there's got to be a better path.

Question down here. I heard you talking earlier, and it seemed like you were building a model of the cloud integrations as the hardware drivers that drive the kernel. But then it sounded like you were talking about maybe pushing them out and letting them be something that distros do, whereas in Linux kernel development they actually make a big effort to pull those into the maintained kernel. Is that something you have a strong opinion about, one way or the other? Or is that to be answered later? I think that is to be answered. We had some of that material in this slide deck originally and we cut it out, because I didn't want to dig too deep on that topic. It definitely warrants discussion. We're getting the stop signal, too, so I'm going to have to take other questions out in the hall afterwards. One sentence I want to add, though: we ended up talking a lot about that, and one really important thing to remember is that even though we made a comparison to the Linux kernel, Kubernetes fundamentally is not Linux, especially when you get into the details. Linux is one large image with kernel modules and some very strict APIs; Kubernetes, at its core, is a bunch of cooperating containers and pods working together, and that changes the flavor and the tenor of a lot of these questions and what the answers will be.

Thank you all for your time and attention. We'll be hanging out in the back for a little while afterwards, so please come and chat.