They fight Balrogs and fork Linux distributions. When I came up with the name, this whole systemd drama was going on, and Debian got forked over systemd. And I thought: all right, if they fork over systemd, what are they going to do with containers and Kubernetes? Because Kubernetes is actually systemd done right, in a way. That's where the title came from. The first time I gave the talk, in Brno, I hadn't shaved for quite a bit, but I can't pull that off here.

So what are we talking about? Traditionally I go a little bit into the history of the role of the operating system and why I think these things matter. Inside Red Hat and our community you can often have this discussion: what is the operating system? A lot of people say, well, RHEL is the infrastructure. I never really agreed with that. From my point of view, infrastructure is network cables and hardware and things like that. The point of the operating system is actually to run applications. It's the application runtime. Because no one runs an operating system because they want the infrastructure. They run an operating system because they want to run an application, and they want to abstract the infrastructure so they don't have to deal with it.

But it's true, there are two views. There's the hardware-centric view, where we're thinking about how to make the hardware work, how to bring it up. And there's the application-centric view, which is: how do I give an application a common runtime?

If you look at the historic role of Linux, and specifically, in our world, of Enterprise Linux: we came from a world where everything was about the hardware. Computers were big and scarce, you called them mainframes, and they came from a big vendor who controlled the hardware, the operating system, and the software you could run on the operating system. You leased hardware and ran black-box services. Not every mainframe worked that way, but the biggest, dominant players in the market worked that way. You didn't really have much control and everything was vertically integrated. It was leased hardware, and if you wanted more performance, you called the vendor, they logged in, the modem made funny noises, and then they activated more capacity in your mainframe.

Then we had Unix and minicomputers, which I will ignore here to make this shorter; someone will call me out on it. With Unix we had a little less vertical integration. You still had the operating system and the hardware from the same vendor, usually. And you had, depending on the vendor, more or less control over the ecosystem. Some of the Unix vendors had very tight control over the software you could run on top of Unix; some had a very liberal view, and in many cases you would install the GNU environment anyway, because you wanted tab completion, or at least a better shell. That was called open systems, and it was the big deal. You ended up with controlled hardware, vertical integration of hardware, operating system, and often also the toolchain and some of the applications, but you had some choice of third-party software vendors.

But then that was still too controlled for many people, and with Linux we broke both of those models. We created a model where you have free choice of hardware from different vendors, you have free choice in the ecosystem, the ISV software you're going to run on top, and you have full transparency and insight into the operating system itself.
The operating system today, Linux today, becomes an abstraction layer across different types of hardware, across the different footprints of what we call the hybrid cloud: bare metal, virtualization, private cloud, public cloud. It allows you to build a binary once and run it in all these footprints, and it gives you that transparency. It also gives different players in the market, different ISVs, a common way to build and run binaries, an implemented standard. And it gives you the same interface across different hardware architectures. So if you're doing ARM or Intel or Power today, or, really interesting from my point of view, RISC-V, you can use Linux as the abstraction layer to have one way of looking at that. They're not binary compatible, but at least at the source code level the architectures are compatible. You don't have to write your software multiple times, and a lot of the additional work of making things work in different environments goes away.

So Linux is the neutral runtime across different types of infrastructure, different hardware. It abstracts from that and allows you to be efficient as an application developer, to not have to deal with the underlying infrastructure, or have dependencies on it, or get locked into vertical stacks. That's the role that Linux played, and in our world that Red Hat Enterprise Linux played, for the whole industry. The snarky comment is that we allowed people who couldn't afford a Sun server to run their IT like they could. That's what we did back in the day.

Now, early on, the way we managed software stacks in Linux we inherited from Unix. When I went to university we had a bunch of VAX machines and a bunch of big Sun servers, and then a whole bunch of individual Linux PCs showed up. Early on, what we used were primarily terminals to the Sun servers, and we had multiple admins per server. You would compile software locally on these machines into /usr/local, use stow to put it somewhere it could be found in your binary path, and cross-mount /usr/local, or even /usr, depending on what you did, from one machine to multiple machines. So you shared your binary runtime across multiple machines by cross-mounting it over NFS; that was common practice, and we did a lot of it, even on the Linux machines early on. Linux distributions came up as a way to easily install Linux, because figuring out how to make 500, 600 different components work together in specific versions was really hard. The Linux distribution was meant to make that easy.
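To make that concrete, here's a minimal sketch of that old /usr/local workflow, assuming GNU stow and an NFS export are available; the package name, version, subnet, and hostname are all illustrative:

```sh
# Build from source into a stow-managed prefix instead of straight into /usr/local
tar xzf foo-1.2.tar.gz && cd foo-1.2
./configure --prefix=/usr/local/stow/foo-1.2
make && make install

# Symlink the package into /usr/local/bin, /usr/local/lib, ... so it lands in $PATH
cd /usr/local/stow && stow foo-1.2

# Share the result: export /usr/local from the build host...
echo '/usr/local 192.168.1.0/24(ro)' >> /etc/exports
exportfs -ra

# ...and cross-mount it read-only on every other machine
mount -t nfs buildhost:/usr/local /usr/local
```

The point is that the binaries exist exactly once, and every machine that mounts them sees whatever state the build host happened to be in when you compiled.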
The first Linux distributions basically used tar files with pre-compiled binaries for the initial install. Slackware was what we used early on, but there were others, SLS and things like that. Then we would get additional software, and that usually was compiled into /usr/local, and we used stow to symlink it into the path, and every time there was a new version you recompiled it, and all that good stuff.

The problem with that was that you had a lot of dependency on the state of the machine. You couldn't expect the software to behave the same way everywhere: the state of the machine at the time you installed the software, when you compiled the software (because you didn't install it, you compiled it), really defines how your software is going to behave. That worked fairly well if you had a few big machines with admins taking care of them, but it became very clear very quickly that it wouldn't work with many PC-type machines running Linux. It just didn't scale to compile everything on every machine.

The solution was to create binary reproducible builds, to package things in a different way, not just tars, but with managed dependencies and things like that, which originally you couldn't do and which was painful. You manage the problem by capturing more of the state of your software stack in the binary package you create. So RPM and deb packaging came along: you build once, install it everywhere, it behaves the same way, and it manages dependencies and all this context. Debian added apt as a transport model for that; on our side there was something called up2date, later replaced by yum. What that gave you was a really good way of dealing with stack complexity.

Note how it changed the binding model, though. The original model was a very late-binding world: you bind at the source code level and rely on that leading to the same behavior. Here we still do late binding, but we only bind dependencies late: we use shared libraries extensively and rely on ABI stability between the libraries to update parts of the stack and keep it reproducible. It worked fairly well and got us really far, 20-ish years of this, I'd say.

The problem is that dependencies get hard to manage over time, and a lot of people in the room who are at Red Hat are painfully aware of that. You start backporting things to older versions because you need to keep binary compatibility, because you locked in this ABI contract in this dependency model. That's a downside, and things break if you don't do it. And there's an interesting side effect that I didn't realize was a problem for a long time. Back in the day, when you compiled into /usr/local and stowed things, you could have multiple versions of the same binary, multiple instances of the same code, on a machine in different versions: a multi-instance, multi-version environment, because it was all just separated under /usr/local. In the move to binary packaging, both in the Debian and in the RPM world, we implicitly turned Linux into a single-instance, single-version system. You can only cleanly install a single instance and a single version of every binary package. If you want multiple versions at the same time, you have to rename the package, do some naming tricks and relocation. If you want to experience it, look at the Python stack. Some of the languages support this better than others, but it can be really painful, and we evolved around it with some nice tooling, with Software Collections and things like that.
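To see the single-instance, single-version constraint and the Software Collections workaround in practice, a hedged sketch; rh-python36 is one collection name from that era, and the details vary by release:

```sh
# There is exactly one 'python3' the package manager will cleanly own:
rpm -q python3                 # e.g. python3-3.6.8-...

# A second version under the same package name has no clean install path.
# Software Collections work around that by renaming and relocating:
yum install rh-python36        # lands under /opt/rh/rh-python36, not /usr

# Run a command with the collection's paths put in front of the system ones
scl enable rh-python36 'python --version'
```

The renaming is the tell: the version had to move into the package name because the packaging model itself only holds one.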
So we have additional tooling to work around single instance, single version. And I never really saw the bigger problem, because, of course, we want one version of everything; we moved away from those complex multi-service machines. But the crowd that never really got on board with it was the Java folks, because they wanted their own version. "It runs everywhere as long as you use the same JVM," and they always wanted their own version of the runtime in their home directory. They might be okay with a shared JDK or JVM for the machine, but then they want multiple instances of everything else. And we had these arguments: you know, JBoss should be in an RPM package, and no one ever wanted to use that, and we were like, why wouldn't you want that, and actually get control over your software stack and your dependencies, scrap those ZIP files? I never understood why they didn't want it until I realized: oh, they want multiple different versions on the same machine, and that's why they hate this, because RPM doesn't allow them to do that in a clean way, only with these additional steps. An interesting side effect, and I think we'll see how it's relevant in this discussion.

With all these PCs, this scale-out architecture, more and more machines, RPM and yum weren't good enough, so other tools were invented. One of the big things for Red Hat was Satellite and Kickstart. Kickstart allows you to recreate a software stack at installation, at deployment of the machine: automated deployment. Around Satellite we built instrumentation and a central service for content management. The Satellite server basically was, and still is, content management for software streams: you do your standardized, approved build, you deploy it through a Kickstart file, and then you centrally manage what gets updated where. So you have a large number of machines and you centrally manage the software updates for them. Worked like a charm. And you have things like CFEngine, and later newer implementations like Puppet, which is CFEngine re-implemented in Ruby (whether that makes sense is a different question), arguably a newer model. So: centralized control for configuration management and for orchestrating these machines. That got us to really big deployments that worked really well, under the constraint of single instance, single version, with some additional workarounds, and under the constraint of a binary ABI contract that you have to maintain.

Now, the problem in that model is still that whenever a dependency changes, whenever you want something newer, you can't install different versions of it at the same time without additional work. And with the complexity of software stacks it gets harder and harder, just because of the amount of software and the amount of change. I managed streaming for this conference and the presentation laptops, so I installed them one day, and two days later I ran the same script, which did a yum update, and 86 packages got updated, on Enterprise Linux. That shows you the amount of software that gets updated in a Linux distribution in two days: the kernel, some SSL thing, and a bunch of other stuff. 86 updates. So basically, if you have a large cluster, when you're done patching it you're going to start over, because it's always moving.
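You can watch that churn yourself; a small sketch, assuming a yum-based system (the exact counts will obviously differ):

```sh
# List updates that have appeared since the machine was installed or last patched
yum check-update

# Rough count of pending package updates (minus a few header lines)
yum check-update -q | wc -l

# Apply them; two days after install, this was 86 packages in my case
yum -y update
```

Multiply that by a few thousand machines and the "done patching, start over" loop writes itself.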
And it runs into problems, because with late binding we expect people to do this in production. I have a production cluster, I have a running application, and I'm going to update a library, like SSL, which is kind of important, in production, in that cluster, and I expect all my applications to keep running. I might have to restart them so they load the new version, and I do that constantly, because when I'm done the next update rolls in, if I have 5,000 machines. And over time that breaks. We do an update because Satellite wanted an update, and we break OpenStack, because they used the same shared Python library and ABI compatibility wasn't maintained. I'm making that up, that never actually happened, it's just a theoretical example. The problem is that with the amount of software, the amount of change, the scale, managing stability, ABI stability, gets really hard. We call that dependency hell.

So what happened is that people moved to VMs. VMs came around as a big problem solver, and one big thing we inherited there came primarily out of Windows. Because on Linux, early on, we ran multiple services per machine. You had a cluster with a bunch of servers, you ran your database, your business logic, and your web application in the same cluster, and you had cluster managers that would schedule your services. By default you had three tiers, typically, so you had at least three machines: the database on one, your business logic on the next server; things moved around and consolidated, and you could run multiple things in the same cluster. It was really nice. On Windows they never could really do that. You never ran multiple services on the same Windows server, and that was a real problem for them: if you had to buy a piece of hardware for every service, you couldn't multiplex hardware like we could, and it got too expensive and took too long. So VMs took off as a solution to that. With a VM you had virtual hardware, so you multiplexed your hardware by running multiple VMs, one instance of the Windows operating system per VM. That worked like a charm. And we inherited it in Linux, because suddenly our customers started operating Linux like that.

That solved some of the dependency problems. The dependency problem, if you're running one service, is kind of manageable: if you're running Satellite on one machine, we test Satellite when we update RHEL, we test RHEL and Satellite together, and that usually works. The problem comes in when you run your own application on top of that, with some different version of a Ruby package, next to another Red Hat package. You have a single namespace of dependencies for multiple applications, written by different developers who want to use different versions and aren't aware of the same level of changes in API and ABI: you get dependency conflicts. A VM solves that, because you're basically saying: okay, I'm using one instance of the operating system, I'm binding it to my application, and I just run a lot of VMs. It's a bit heavyweight, but it isolates the dependency namespaces and reduces the problem.

It has two downsides. One is what we call VM sprawl: now you have ten times the number of servers to update, so if you had 500, now you have 5,000 servers, and you have to update all of them, and the whole problem of "you're done at one end, you're starting over at the other" gets worse and worse. The second downside is the way people actually used VMs.
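Coming back to that in-place SSL update for a moment: here's roughly what the dance looks like on one machine, assuming lsof is installed; the service name is just an example:

```sh
# Update the shared library in place on the running system
yum -y update openssl

# Every process that mapped the old libssl still runs the old code;
# deleted-but-still-mapped libraries show up as DEL entries in lsof
lsof -n | grep 'DEL.*libssl'

# Restart each consumer so it picks up the fixed library
systemctl restart httpd    # ...and repeat for every affected service
```

Now repeat that for every library, on every machine, continuously, and you have the treadmill described above.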
In theory it would have worked; in practice, they just screwed it up. When VMs came along, everyone said: oh, you're going to use image-based deployment. I'm going to make my golden image and use it like an appliance, we called it a virtual appliance, and you're going to update the golden image and roll it out across your cluster, and that's going to be super clean: you don't actually have to update things in production anymore. The problem of "I update a library and the application breaks because the developer didn't know that behavior changed, ABI changed, API changed": you won't have that. In reality, that never worked. Most people, even if they use virtual appliances, use them for the initial deployment, and then they go back and run yum update in the VMs. I've not seen any... I shouldn't say that; of course I've seen it once or twice, but not at scale. At large, people are not running VMs efficiently with imaging; they're treating them like pets. So it's terrible. In the end it bought us a couple of years on dependency hell, because it reduced the side effects of different stacks running in a shared namespace, but it sprawled out too much and just got too heavyweight.

Then cloud took over. It abstracted us from the hardware: we don't care about hardware anymore, performance doesn't matter anymore, it's all about scale, and we are elastic. And it made this even worse. Now there's no control at all anymore. You don't have to request a VM, wait a week, and justify it; everyone deploys things all the time, you press a button in a self-service environment, you get a VM, you install something, you run it. Many companies have some kind of workflow around that, with golden images and some control, and if you're the application owner you're not supposed to have root in production. But in reality, what most of them do is give you root when you install your software, because it's too hard to figure out how to install without root, and then you're supposed to give it back. And that means your operations people have no idea what this VM is doing. No one knows what's in there. So it's great, it kind of works; things keep changing, you create a CI/CD process around it, you create DevOps around it, works great.

But here's the problem: because you're treating it like a pet, because you're not using image-based deployment, you develop against one version of the stack; then you move to testing and someone runs yum update, so you test a different version of the stack, with SSL updated and some Ruby packages updated; and then you run in production and it changes again. So what you run in production is not the stack you tested, and in production you keep updating it, still with yum update, and SSL changes again. It's still a late-binding model, where production changes every time there's a security fix or an update. People try to control that, but customers are genuinely afraid of security updates and bug fixes, because they introduce churn, and there's no way around it: there's only a single path forward.
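One way to make that drift visible is simply to diff package manifests across stages; a minimal sketch, with hypothetical hostnames:

```sh
# Snapshot the exact package set the application was tested against
ssh test-host 'rpm -qa | sort' > tested.manifest

# Snapshot what production is actually running today
ssh prod-host 'rpm -qa | sort' > prod.manifest

# Any output here means production runs a stack you never tested
diff tested.manifest prod.manifest
```

In a late-binding world with continuous updates, that diff is almost never empty.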
In RHEL we did some interesting things to cope with that, like Extended Update Support, for which I profoundly apologize to our engineers. It's my fault; I was the PM for it. But it was a workaround: customers wanted the new features for new hardware and new software, so they wanted us to keep backporting things into versions within a major release of RHEL, so RHEL 7 runs for, 12 years I think? I'm not the PM for it anymore. But then they also wanted, once a year, a more stable release that gets only critical fixes, because they didn't want to absorb the change anymore. So we added another full layer of lifecycle to work around the problem. It works, but it's very heavyweight.

Now, while all this was going on, a couple of things changed out there in the industry. Everyone became a software company, because software was eating the world. There's no business that's not a software business. That's not quite right, my landscaper will disagree, but in larger businesses everything is defined by software; your car is a freaking data center on wheels. Software is where business value is created. We used to have a certain control over what happens in IT, because there was an IT department that, in the end, could enforce standards, could enforce a golden image. But that went away with the line of business becoming a software business, writing its own software, and getting the power, because they're driving revenue. In the end they will always win the argument against IT over what to run, with few exceptions.

And we also had a change in culture. I grew up with three TV stations, and you had to follow a schedule, and if your show wasn't on, that was kind of weird, but it makes you a bit more tolerant about waiting for things. That completely changed. Newer generations of end customers are not going to deal with that; we have an on-demand culture, and it has trickled through. It has changed line-of-business behavior and corporate behavior. Bring-your-own-device is just one example: who today would accept the company telling you that you can't bring your own phone and connect it to the corporate server? Are you crazy?
"That's insecure!" If you need a proof point that those things don't work anymore, me having my own laptop and my own phone connected to corporate is it. You can't stop it. We have a much more disaggregated, service-oriented architecture. It's kind of a buzzword, but it's reality: we are aggregating services. And it led to a preference for consuming the most current version, because developers are driven by delivering end-customer value in the line of business, so they are not going to standardize on old versions of anything; they want the newest version of everything. Interestingly, open source became the default, but that removed even more control, because everyone can get everything, everywhere. So customers now have the problem: how do I control what they download? And that's the biggest challenge. It's very creative and very fast moving, but it's hard to control. Back in the day, with proprietary software, it was easy: if the vendor didn't publish it, you couldn't get it. In our world, that's not the case: if someone writes it, you can get it, and someone is going to use it. And we had cloud taking off, DevOps changing how we do things, and then application centricity, and I wanted to delete that point because I'm making it later, so I'll ignore it.

One of the effects of all this is that the amount of software available and being used is exploding. I got this on the internet, so it must be true: modulecounts.com gives you a picture of how fast software is being created. npm is going to create a black hole if they keep going like this. This could all be forks of the same package, and someone is going to use every single version of it. If you think we had a problem with dependencies before, to put this into perspective: I think Fedora has between 20 and 30,000 packages in the ecosystem that we package as RPM, and I think Debian has 10,000 more or something, so that would be somewhere down here on the chart. We can't repackage all of this in RPMs. We have this nice solution that got us really far, but it's not going to keep up with the software that line-of-business application developers want to use. No chance, you can't keep up with that. And most of these languages have their own package managers. And it's not as if this never goes all the way back to the old world; it kind of does, depending on what you're doing. If you're using pip install, it actually is going back to compiling locally: when pip can't find a wheel for what you're doing, it compiles on your local machine, and you depend on local behavior again. I'm sure other languages have similar problems. It's a huge problem.

So, because of the complexity, we are back where we started. I would argue that even with complete binary packaging, the fragility of the dependency chain right now is basically as bad as the fragility of compiling from source on every machine back in the day, because there's too much change and the ABI contract doesn't hold end to end. Higher up in the application stack, we should give up on ABI contracts. There are places where they're very valuable, at the low level, at the root of stability, which is the kernel, glibc, GCC: those you want to control. But above the interpreters you can't keep up. So that puts us in a bad spot: our traditional distro approach to application centricity is at an impasse. We can't keep up with the amount of packaging, we can't even make all the packages available, and customers just don't care.
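That pip fallback is easy to demonstrate; a hedged sketch, where some-c-extension stands in for any package without a pre-built wheel for your platform:

```sh
# Default behavior: if no matching wheel exists, pip downloads the source
# distribution and compiles it locally, against whatever toolchain and
# headers this particular machine happens to have
pip install some-c-extension

# Refuse source builds, to see where you would have been compiling locally
pip install --only-binary :all: some-c-extension   # errors out if no wheel exists
```

Which is exactly the "behavior depends on the state of the machine" problem from the /usr/local days, back in a new wrapper.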
Solomon Hykes, the founder of Docker, the inventor of the Docker container thing, said as much a couple of years ago in a talk at their conference in Europe: no developer using Docker uses the distribution inside the container. He's probably right that developers will use the language-native packaging inside the container, because the stuff they want, in the versions they want, is just not available from the distribution: too much content, too much change, too many different versions. And I would also argue that repackaging things that are already packaged, in a binary format or in a format that straddles binary and locally-compiled dependencies, is of limited use. If you have a PyPI package, wrapping it in RPM doesn't solve an additional problem; it solves the same problem in a slightly different way. So we see a change there.

There's also the problem that, higher up in the application stack, testing isn't valid if it's not done with your application. Most of our customers have more developers than we do, and they write applications that use features in these libraries that we will never be able to test. It's not only the amount of software but also the depth of functionality in the software that we can't cover in testing across the whole ecosystem. To validate that a certain Ruby library didn't change behavior in a way that breaks your application, you actually have to test it with your application, and that further devalues the Linux distribution. Again, lower in the stack I don't think things have changed; but as you go up the stack it gets harder and harder, because it's not a high-value target. Validating the kernel is a high-value target; validating glibc is a high-value target, because everyone is using them. Almost no one compiles their own kernel anymore; the value of doing that versus the risk you incur is just not worth it. If you're some Ruby app, that's different.

So containers came along, and we thought they might solve the problem. We now had the ability to run things in different namespaces. LXC was there. It gave us multi-instance, multi-version environments: we could run different namespaces, different versions, at the same time. Linux-VServer was a project, Virtuozzo was a project, that people were using around 2011, 2012. And that actually solved a lot of problems, because now I could run different things on the same machine without dependency interactions. I could compile things locally without polluting my system; I could install things from pip or RubyGems without pushing them into the shared namespace that all of my applications share. So the problem of "I'm updating this Ruby library and I'm breaking Satellite and OpenStack" (OpenStack doesn't actually use Ruby, I'm making this up for the sake of argument) is not happening anymore. That was the first iteration: containers seen as a nice packaging or operations tool.

But then Docker took it and revolutionized how we do this, and it's brilliant. We had containers, which are basically cgroups, namespaces, SELinux; Docker combined that with a transport format and a copy-on-write layering model. They basically added the concept of aggregate packaging and the concept of distribution. And these are two things we had before in other forms: with RPM we had packaging of applications, and we had some concept of aggregate packaging.
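The multi-instance, multi-version point from above is easy to demo with any container runtime; a sketch using podman (docker behaves the same way), with image tags as examples:

```sh
# Two different interpreter stacks side by side on one host,
# with no shared dependency namespace between them
podman run --rm docker.io/library/python:3.8  python --version
podman run --rm docker.io/library/python:3.12 python --version

# A pip install inside a container changes only that container's filesystem,
# not the single /usr that every application on the host shares
podman run --rm docker.io/library/python:3.12 pip install requests
```

That isolation alone is what the first iteration of containers was valued for. As for aggregate packaging, we had been improvising that by hand for years: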
Back when I was in sales engineering and consulting, a big deal was installing Oracle databases on RHEL, and they had this Java installer that was just terrible; you couldn't really run it headless. So what you did was run it once on a test machine and then build a binary RPM of the whole Oracle install, to make it redistributable. You used RPM for aggregate packaging of the whole software stack, just to make it reproducible. That's kind of the same thing Docker did: you build once, so you inherit frozen, binary-reproducible builds in an aggregate for the whole software stack, per application. And for distribution, where we had yum, they added push, which didn't make sense in the broadcast world but totally makes sense in the modern, on-demand, peer-to-peer world: you can push and pull things, and they manage the installed artifacts.

So in a way you can look at this as reinventing the static binary. It's early binding: you bind the full stack at one point, when you create the container, and then you have a fully controlled binary distribution with exactly the same behavior everywhere. The only interface, the ABI interface, moves down to the userspace-to-kernel interface, which is the one we can absolutely keep stable: even upstream keeps it fairly stable, and Red Hat has a great track record of keeping it stable. And it's a valuable target, because it's the right interface to keep stable; everyone buys into that. It also lets you move the same artifact through the development, test, and deployment process: you build it early, then you test the same thing and deploy the same thing, and we build orchestration around that to make it fast.

Now, we've heard enough about CRI-O today, so I'll skip this, but it's containers done right. Docker was great, they invented a lot of things, but they also had a lot of overlap with the underlying system that wasn't necessarily useful. It's not a natural experience for a sysadmin when you use Docker; it's probably good for a developer, but for a sysadmin it's not ideal, and there's too much in the daemon. With CRI-O we solved most of that, and if you want to know more, Dan's and Mrunal's talks on that earlier today were awesome; we recorded all of them, so you can find them on YouTube if you missed them.

Another point: an application today is not a single container. We have multi-container applications, and that's where Kubernetes comes in. It orchestrates multiple containers into a consistent application. And of course we're moving away from the single server as the default deployment target. Traditionally the single server was what we cared about, and the cluster was the exception: you would aggregate multiple servers into a cluster. Nowadays everything is a cluster, and the single server is the exception. The cluster is the computer again, and Kubernetes is how we deal with that. We call it the meta-kernel, but actually it's the meta-systemd; we call it the meta-kernel because everyone likes a kernel and no one likes systemd. Kubernetes gives you both: the application orchestrator, which takes multiple containers and puts them together into a multi-tier application, managing how many instances of what you are running and how they connect, and the cluster manager, which gives you a scale-out cluster. The next thing is full service delivery. With Operators, the service broker API, Helm, things like that, we are putting together the full transport capability.
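A sketch of that early-binding, build-once flow end to end, with a hypothetical registry and image name:

```sh
# Bind the full userspace stack into one artifact at build time
podman build -t registry.example.com/myapp:1.0 .

# Push once; dev, test, and prod all pull the identical bits
podman push registry.example.com/myapp:1.0

# Hand the same artifact to the cluster: Kubernetes runs and scales it
kubectl create deployment myapp --image=registry.example.com/myapp:1.0
kubectl scale deployment myapp --replicas=5
```

The only contract left underneath is the userspace-to-kernel interface, which is the one everyone agreed to keep stable.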
So the problem statement is: today I can do a yum install ipa, then run the IPA install script with some config options, and I have a running, orchestrated instance that comes back the next time I boot my machine. With plain Kubernetes, that's a bit hard to do: I have to copy files around from Git, edit files, run 15 commands to get to the same experience. With a combination of things like the Open Service Broker and Operators, I get to a similar experience, where I can take a full application, see it in a service catalog, an app store, click a button, and it gets deployed in my cluster. So I get full application deployment and portability, which is really important for scalability and reproducibility.

And I'm out of time, so I'll quickly talk through two more slides before Langdon kicks me out. I'll put a protest in here: someone had the stupid idea, sorry for that, to put 35 minutes on a technical presentation. You can't say anything in 35 minutes. Clap if you agree.

So, three use cases that matter. First, fully orchestrated applications that are portable, orchestrated by Kubernetes. Second, you still have applications in areas with what I call loose orchestration: you're deploying one container somewhere, using something like a unit file to start it up on a traditional server. And third, you'll still have pet containers, which is: I want to do what I did on a single server, but with multiple versions of it. I want to run different versions of the RHEL userspace on a newer version of the kernel; that's where pet containers come in, and you're basically operating inside such a container. It's an exception, but it's valuable if you are on single servers. Or you just want to pip install things, or install a different version of an RPM than the one in the shared namespace; that's simply namespacing the machine, still valuable. And with Buildah it's awesome.

So where does that leave us? With a new architecture, where we have an application platform, Linux plus containers plus Kubernetes, as a new meta operating system, extending the old model of Linux into a cluster-wide, application-centric view. It's awesome. And if you think this isn't real: unicorns are real. Thank you very much. So come back tomorrow for... We have to clear everybody out so they can flip the venue for the party; that's why we're so aggressive right now.