Hi. Yeah, can you hear me? Perfect. Well, thanks for attending this talk. I'm really excited to share these ideas with you today. We're going to walk through the why of what we're doing, so it's going to be a little bit philosophical, hopefully with some new ideas that get your head spinning afterwards. That's the idea. (There's a little bit of a delay, but okay.) I'm Lucas Käldström, studying at Aalto University at the moment, but I've already been doing quite a few different things in the Kubernetes community, starting with Kubernetes on ARM, which is also my Twitter username. If you want to tweet at me and give feedback, please do; I'm definitely looking for feedback on this. Currently I'm looking into Kubernetes research: why do we do what we do, and how can we continue and sustain the development of the community by going back to first principles? I've also been doing kubeadm and other interesting upstream maintenance work in the past few years. But yes, let's take a look at the why. When we come to a KubeCon conference, we get a lot of the what: a lot of YAML, a lot of "run these commands in this order", "write this script and it will do its thing", "click this button". Then we might ask how: how does it work? Well, when you submit a pod, when you submit some application to Kubernetes, it schedules the different pieces onto the different nodes in a certain order, and we have a lot of talks about that too. But now I want to go even deeper, to the core, and ask why we chose the particular patterns that we now have as de facto standards in the Kubernetes community. This is what I don't want to happen.
If we don't look at why we're doing things, we might end up missing the problem we're actually solving. There are two kinds of complexity, and I thank Joe Beda for this distinction. There's necessary complexity, which is the one we're going to look into today, and there's accidental complexity, which we hope to remove over time so that users aren't annoyed in vain. The necessary complexity is there for a reason, and we're going to go into that reason: what is it, and what necessary complexity might we not even know we had? As you might know, Kubernetes stems from development inside Google. It's a community project now, as you've seen at this conference; it's not just Google anymore, but in 2014 it was. It stems from two scientific papers about two systems within Google, Borg and Omega, which had great success and are still running the search engine, Gmail, and everything else. Kubernetes, being the next iteration, improved on some things from those earlier systems, but its designers also introduced new ideas and new concepts needed to make it work in a larger community, not just within a single company. But then some new users of Kubernetes ask: is this too complex? Why is this so complex? That's exactly what we're going to distill: what is necessary complexity, what is accidental complexity, and when we build new things on top of Kubernetes, how can we figure out which necessary complexity we need to take into account and which we can handle in other ways? So here is an example. As a small-scale user, you might say: oh, this is too much for me, and I didn't have any problems with the simpler system I was using before.
But that might be because you haven't run it long enough to experience all of the randomness, all of the problems that come with time. Kubernetes is designed to mitigate exactly those problems. Google runs a lot of servers this way, so it takes just a couple of days before they hit those growing pains, all the things that affect their systems and need fixing. For a small-scale user, it might not be apparent until after a year or two that these are real, hard problems you weren't thinking about before. So I'd like to start with an analogy; there will be a lot of analogies in this talk. Kubernetes as an orchestra, or more precisely, as the conductor of an orchestra. We want to play some piece of music, and we have a composer writing it. But if we have no conductor, maybe we never figure out who is playing what, in what arrangement, how to split it up, and who actually makes sure that everything happens in the right order. So we need a conductor; otherwise things easily become inconsistent and uncontrolled. It might work, it might not. The conductor can maybe be away for 15 seconds while the orchestra keeps playing, but if there were no conductor for the whole performance, they would drift out of sync. And this is another interesting idea that Joe Beda has actually written an article about: Kubernetes can be seen as a conductor for your servers, the one orchestrating, making sure everything goes to the right place. But the musicians themselves are good too, right? Say there is some noise from outside: they improvise and bring the noise into the music, adapting it to the environment.
And Kubernetes is extensible and adaptive to the different conditions it operates in. So we need a conductor, but we also need to be able to improvise. We're going to go through four principles today, and this is the first one: we have a control plane, the API server and so on, at the very heart of the system, orchestrating the nodes, but the nodes can also improvise themselves and adapt to their environment when they see things going sideways. Then we have claims. What is the music we want to play? It's written in notes, in normal sheet music, and we need to hand that sheet music to the musicians. That might look something like this: we have a claim, some description of the music we want played, and then the conductor, or whoever does the arrangement, hands out specialized parts to the piano, guitar, violin, whatever we have in the orchestra. There is a difference here, because the declarative claim can be more general: "this is roughly how it should sound." There can be many different fulfillments of that claim, many arrangements of the same piece of music, depending on who performs it, in what environment, and so on. The imperative version, then, is the exact commands, the exact notes you should be playing. We'll come back to that in a moment. But this is one of the key ideas: the declarative side specifies claims, while the imperative side specifies actions, sequences of operations. Declarative means making a declaration of something, and then some actuator, some person or process, fulfills the claim by performing operations.
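To make the claim-versus-actions distinction concrete, here is a minimal Python sketch. All names in it are hypothetical, invented for illustration, not actual Kubernetes API calls: an imperative caller spells out every step, while a declarative caller states only the desired end state and leaves an actuator to work out the actions.

```python
# Imperative: the caller spells out the exact sequence of operations.
# (All names here are hypothetical, for illustration only.)
def deploy_imperative(runtime):
    runtime.pull_image("web:v2")
    runtime.stop_container("web")
    runtime.start_container("web", image="web:v2")

# Declarative: the caller states only the desired end state (the claim)...
desired = {"name": "web", "image": "web:v2", "replicas": 3}

# ...and an actuator computes whatever actions close the gap.
def reconcile(desired, actual):
    actions = []
    if actual.get("image") != desired["image"]:
        actions.append(("update_image", desired["image"]))
    diff = desired["replicas"] - actual.get("replicas", 0)
    if diff > 0:
        actions.append(("scale_up", diff))
    elif diff < 0:
        actions.append(("scale_down", -diff))
    return actions

print(reconcile(desired, {"name": "web", "image": "web:v1", "replicas": 1}))
# → [('update_image', 'web:v2'), ('scale_up', 2)]
```

Many different actuators can fulfil the same claim in different ways, which is exactly the many-arrangements point from the orchestra analogy.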
So the declarative model focuses on the what: what should the result be? Its opposite, the imperative model, focuses on the how: the exact sequence of steps. And that's why Kubernetes is essentially all declarative. When we submit something to the Kubernetes API, when we tell Kubernetes to do something, that's declarative. Consider these analogous pairs. In an SVG file you don't encode specific pixels; you can zoom forever and it stays sharp, whereas a PNG file gets pixelated fast. In the same way, when you talk to a typical database today, you just say SELECT something FROM some table and get it back; you never have to dig into the database's indexes and hunt for the primary key yourself. You ask for something, and an actuator, your database, gives it back according to a specified language. The last example is C versus a functional programming language like Haskell. In the functional language you specify high-level, mathematical operations, and the runtime figures the rest out depending on the environment, with no defined execution order, whereas C is sequences of operations interacting directly with the hardware. So what you submit to Kubernetes is really a declarative claim: what do you want to see? This reminds me of those children's shape-sorting toys, which are the first lesson in what an interface is versus an implementation. I had these as a child, so maybe they taught me something about APIs already as a three-year-old. And this is actually how the whole cloud-native community, the whole ecosystem, is built up: we start by defining interfaces, by defining ways of doing things without specifying exactly how to do them.
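The shape-sorter idea, an interface kept separate from its implementations, can be sketched like this in Python. This is a loose analogy to specifications like CRI, with invented names, not the real CRI API:

```python
from abc import ABC, abstractmethod

# Hypothetical interface: a specification of *what*, not *how*.
class ContainerRuntime(ABC):
    @abstractmethod
    def run(self, image: str) -> str: ...

# Two interchangeable implementations of the same specification.
class DockerLike(ContainerRuntime):
    def run(self, image):
        return f"docker-style container from {image}"

class CrioLike(ContainerRuntime):
    def run(self, image):
        return f"cri-o-style container from {image}"

# The caller depends only on the interface, so implementations can be swapped.
def launch(runtime: ContainerRuntime, image: str) -> str:
    return runtime.run(image)

print(launch(DockerLike(), "nginx"))   # → docker-style container from nginx
print(launch(CrioLike(), "nginx"))     # → cri-o-style container from nginx
```

The caller never changes when the implementation is swapped, which is what makes specifications of this kind so powerful for the ecosystem.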
Here are logos of some CNCF and Linux Foundation projects that are just a specification: a description of how to do something, without telling you exactly how to do it, which means you can swap out the process that actually does it. Is it Docker or CRI-O running your containers? It doesn't directly matter from the point of view of someone who just wants a container. The same goes for metrics, or storage, or updating software; it goes on and on. So this is really the second why. Why does Kubernetes do these things? Why does Kubernetes have YAML at all? Why don't we just run commands, SSH into the machines directly? This is why: we want to tell Kubernetes what the end state is, what we logically want to see in an ideal world, and then Kubernetes will try to make that happen. (How Kubernetes makes it happen is the next thing we'll get into.) Through these APIs we can swap implementations for one another, achieving portability: Kubernetes runs on everything from the smallest Raspberry Pi on ARM to the largest cloud or on-premise servers. But now let's go in a different direction and think about the real world and the problems we face simply from the nature and physics of our universe. If you clean your desk on a Friday evening and then go home, on Monday you arrive at a clean desk at work. How does it look at the end of that week, even though it was fully clean on Monday? Is it messy again by Friday? At least it is for me, I'm sad to say. And that's actually one of the laws of nature, of the whole universe: the second law of thermodynamics. I mean, yes, I am to blame for my messy desk, but it's also a physical phenomenon that holds much more generally, for everything in the universe, even black holes.
So we start from something orderly, and with time it just gets messier and messier, more chaotic. Entropy is a measure of that chaos. Anyone who has made lunch at home has seen the same thing: you start with a clean kitchen, and after you've made lunch, your kitchen is messy. Again, that's how it is for me, and then I need to do my dishes to fight the added chaos. Although it's inevitable that things spontaneously become messier, we can do something about it and restore order to our systems. But my sense of orderliness in my house might not match the sense of orderliness in yours, so the definition is quite subjective as well. I declare a desired state: I want my kitchen to look like this, and when it looks like this, I call it orderly. That's what I'm doing when I do the dishes. So what does this have to do with Kubernetes? Well, we start with a server we just bought, at time zero. We install everything and put it into production. Oh nice, it's perfect. But then a week in, a month in, a year in, it's all messy. Hopefully it doesn't look quite like the broken server photo I found on Google, but you get the idea. Luckily, we have something called Kubernetes that actually fights this, that does our dishes for us and makes sure the system strives toward the desired state we have declared to be order. It fights the entropy, the chaos. You could call Kubernetes the dishwasher of servers: you put in messy things, and in a couple of minutes, sometimes even seconds, you get back a clean server. Of course this doesn't hold for Kubernetes at a physical level, since it's software, but it holds at a conceptual level.
And the server that comes out will match whatever you declared the desired state to be. One more fun side note on entropy: do you know why turning it off and on again actually works? It works because it minimizes entropy. When you start up your computer, you have clean RAM, a clean state everywhere, but over time it gets messier: lots of files open, lots of RAM in use, and programs can't take every combination of conditions into account, so at some point that accumulated messiness bites. When you shut it down and turn it back on, it's more likely to work, because you're back to a clean, low-entropy state. So it's an infinite game between the universe and our sense of order. Our sense of order also changes with time, but we always need to periodically fight against this entropy, against the chaos and mess. That's where you see the corrective actions in this picture. We might have a long-term policy: the green line is our desired state over time, which evolves, while the blue arrow is where the system actually is, not matching where we want it to be. That's why we always need to keep nudging the system back, making things the way we want them. The Infinite Game, by Simon Sinek, is a good book about this way of thinking. We can't just say: okay, on Sunday we're done, we're never going to update our systems again, there will be no security vulnerabilities, I promise. We need to do it continuously. So, to conclude this part: systems become less ordered, so there is a need for periodic corrective action toward whatever we have designated as order. That is the third principle: we take corrective action periodically to guard against this law of the universe.
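The periodic corrective action can be sketched as a tiny loop. The names are hypothetical, and real controllers watch and patch API objects rather than dicts, but the shape is the same: entropy degrades the actual state on each tick, and the corrective step nudges it back to the declared order.

```python
# A tiny sketch of periodic corrective action (hypothetical names;
# real controllers watch and patch API objects, not dicts).
desired = {"web": 3, "db": 1}           # the declared order: replicas per service

def drift(actual):
    # Entropy at work: the real state spontaneously degrades over time.
    degraded = dict(actual)
    degraded["web"] -= 1                # e.g. a pod crashed
    return degraded

def correct(desired, actual):
    # Corrective action: nudge reality back toward the declared order.
    fixed = dict(actual)
    for name, want in desired.items():
        if fixed.get(name) != want:
            fixed[name] = want          # e.g. restart the missing replicas
    return fixed

actual = dict(desired)
for tick in range(5):                   # an infinite game, truncated here
    actual = drift(actual)              # mess appears...
    actual = correct(desired, actual)   # ...and is cleaned up again

print(actual)                           # → {'web': 3, 'db': 1}
```

No matter how many times drift strikes, the periodic correction keeps pulling the system back to the declared state; that is the whole trick.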
Then there's more to it: randomness. From physics, there is Heisenberg's uncertainty principle, which puts a hard limit on how much we can know about what a single atom is going to do, which means we can't fully predict how it will evolve. By extension, if we can't even look at a single atom and say what it will do in the future, how are we supposed to predict large systems? We can't; we cannot fully predict the future at all, and I really recommend The Black Swan, by Nassim Nicholas Taleb, for really ingraining that. It's a really good book. So we can't predict the future; we don't know how things will evolve. And Google saw this in their Borg paper, where, as I said, some of the theory behind Kubernetes comes from. They even wrote in it, and I'm now quoting: failure is the norm, so deliberately leave significant headroom for all kinds of failures, such as workload growth, occasional black-swan events (if you read The Black Swan, you'll know what that means), load spikes, machine failures, hardware upgrades, whatever. You really need to be prepared for what you don't know will happen. That is really hard; there is no silver bullet. But designing for failure is the first step. If you believe that adding one plus one in a computer program will always give two, it won't: there might be background radiation flipping a bit in your RAM while you add, and then one plus one is one. There is all kinds of randomness we can't control, and we need to assume failure will happen and deal with it from that perspective. And it's really hard. Here is one example of why it's so unintuitive to us. Say we have a failure that we have measured to happen once in every 10,000 runs.
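That one-in-10,000 rate compounds faster than intuition suggests. A quick calculation, using only the numbers from the talk:

```python
# Compounding a 1-in-10,000 failure rate (99.99% per-run success)
# at 10 runs a day. Pure arithmetic, no assumptions beyond the talk's numbers.
p_success = 0.9999
runs_per_year = 10 * 365

# Probability of at least one failure within one year:
p_fail_year = 1 - p_success ** runs_per_year
print(round(p_fail_year, 3))    # → 0.306, i.e. about a 30% risk

# Probability of zero failures across eight years:
p_clean_8y = p_success ** (runs_per_year * 8)
print(round(p_clean_8y, 3))     # → 0.054, i.e. only about a 5% chance
```

A per-run success rate that looks like "never fails" turns into near-certain failure once enough runs accumulate.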
So that's a 99.99% probability that a single run succeeds. But when we start compounding these, if we think that this uptime is high enough that the system will never fail, we're wrong. If we run it just 10 times a day, then after one year there is already about a 30% risk that it has failed at least once, and after eight years there's only about a 5% chance that we haven't had a single failure. So it's a matter of time, and this goes back to what I was showing earlier. Google and the other hyperscalers have so much mass, so many interacting units, that they see the effects of this randomness instantly, and consequently they have to design for it in the software they build. As a small-scale user, it will take a long time before you see the effects of randomness, long enough that at some point you'll believe they aren't there, and then it crashes the next day and you have an audit. So it's all about the mental model, and this is one of the mental models key to how Kubernetes was designed, and why it looks overly complex at first glance when it's actually necessary complexity. Of course, there is accidental complexity in Kubernetes as well. I'm not saying that will all fix itself over time, and I welcome you to contribute to fixing it, because we need everyone's perspective on going back to basics and finding a better model. But let's think about how Kubernetes solves this, in the manner of a taxi driver. If you call a taxi, you say: I am here, at the conference center in Valencia, and I want to go to the city center this evening. I state two things: my actual state and my desired state. The taxi driver takes my desired state and my actual state, and then makes an action plan.
They take the difference between the two: I'm here, I want to go there. Having computed the difference, they compute the path we need to drive, then pick me up and drive me to the place. But remember, it might not work out. We might get a flat tire, there might be a roadblock; there is all kinds of randomness on the way. So even though we had an action plan to drive to the city center, it's not a given that we'll end up there. Maybe Google Maps is wrong and we drive to Madrid. Could happen. Then, when we have some kind of result, we may want to report it. Say I'm meeting a friend in the city, and the taxi gets a flat tire and we only make it halfway; I'll probably need to call my friend and say, sorry, I'll be half an hour late, we got a flat tire. That is an actual-state update again, and it's why Kubernetes updates the status field on its objects, if you're familiar with that. And once the taxi's tire is flat, what do I do? Well, I call a new taxi, and the process starts from the beginning: I'm now halfway between the conference center and the city center, but I still want to reach the city center, so we compute a new action plan, carry it out, and inspect the system again. This is actually how all Kubernetes controllers work. Kubernetes has somewhere between 30 and 50 controllers in the controller manager doing this constantly, 24/7: checking whether the state is what the user said it should be. If yes, check back in 30 seconds. If not, take action.
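The observe-diff-act loop of a controller, taxi included, can be sketched like this. It's a toy model, not real controller-manager code: each iteration tries to close the gap between actual and desired, randomness sometimes sets us back, and we keep reconciling until the two states match.

```python
import random

# Toy model of a controller's observe → diff → act loop (not real
# controller-manager code). Distance 0..10: conference centre to city centre.
random.seed(7)                  # deterministic randomness for the example

desired = 10                    # desired state: at the city centre
actual = 0                      # observed actual state: at the conference centre

def act(position):
    # Each action tries to close the gap, but randomness interferes:
    # a "flat tire" means we barely move this round.
    if random.random() < 0.3:
        return position + 1     # setback
    return position + 3         # normal progress

steps = 0
while actual != desired:        # reconcile until observed state matches desired
    actual = min(desired, act(actual))
    steps += 1                  # like a controller writing back the status field

print(actual, steps)            # converges: actual equals desired after a few steps
```

Individual actions may fail or fall short, but because the loop re-observes and re-plans every time, the system still converges.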
This is Kubernetes' way of dealing with randomness and failures: even if we weren't successful on the first try, maybe because of rate limits, an outage, or some other reason, we try again and again, and eventually we get there. That's also the notion of eventual consistency, which has been popularized in recent years. And these four principles together give us a new way of operating. Just as the Industrial Revolution meant we stopped producing goods directly by hand, and instead built machines to produce them while we operate the machines, we no longer manage servers directly. We don't SSH into them anymore; we don't run commands on a particular server. We just state the end state, the key goal of the system, which will change over time as new security vulnerabilities and new business requirements appear, and we have something like Kubernetes fulfill it for us. That is the industrial revolution of server computing. To end with, here is some further reading you might want to do. This talk is a brief introduction just to get started; there is a lot more to unpack. The work I'm presenting is based on my bachelor's thesis, so I encourage you to check that out; it has loads more reading and context. My aim with this work is to be able to onboard a new set of contributors and users into the cloud-native communities, so we can scale our communities without people lacking the context they need to make the decisions that have to be made. The thesis is written as cloud-native educational material.
So, just like that: after seeing this talk, you should be able to read the thesis and get a walkthrough of the more detailed reasons why Kubernetes works the way it works. These are the fundamental principles, but they only scratch the surface; there's a lot more to unpack. Over time I also hope we can put a summarized version on a web page or something similar, in an ideal case. There is also a lot more interviewing to be done with the founders and the people who were early in the project, so that we don't lose that context for the next generations who will build cloud native further in the years to come. So that is the thesis. Then there are two other influential theories behind the principles I've mentioned today. The first is control theory, which I recommend looking into; it's usually used in, say, process automation or electrical engineering, but it's also applicable to server systems and is heavily used in the Kubernetes way of thinking. In fact, the controller model I just described is exactly a closed-loop control system, the same model you use, for example, when you engage the cruise control in your car. So I recommend reading up on control theory a bit more; there is a good walkthrough from Kubernetes contributor Vallery Lancey, from QCon. The second, even mentioned in the Kubernetes documentary, is promise theory from Mark Burgess in Norway, who invented that theory and has written about many other things around space-time in computer systems and distributed-systems theory. Those two have also influenced the Kubernetes design. So, to wind up: we have the control plane orchestrating the nodes, saying what everyone should do, but also letting the nodes themselves, the musicians, improvise when needed.
We get portability from declaring the desired end state. Then we fight entropy, the inevitable spontaneous chaos, by periodically taking corrective action. And finally, we have designed the controller model according to control theory in order to mitigate the effects of randomness over the long run. So that's it. If you have questions, you can ask them; I think we have time for one question. If there's anyone... no questions. People are going to the party, I guess. It was all clear. Oh, one question. [Audience member]: Hi, Lucas, great talk. I can't help but think: everyone wanted to contribute, and there are these really beneficial patterns produced by the Kubernetes project and the whole cloud-native ecosystem around it. Is there anything you can say about how these values in our software also affect the way our community works together? [Lucas]: One good thing about the extensibility, focusing on the APIs and the desired state first, is that it allows the community to find different good patterns for implementing the same problem space. It means we don't end up with just one winner, one solution with a monopoly on everything; instead we can see that this implementation is better for this case, if you optimize for this, while another way is better for other use cases. So that is one concrete way the declarative approach is good for our community, and it also lets people build on top of each other's work.
Kubernetes is a foundational piece of software for distributed systems, and a typical user might want to use a platform built on top of it, but most of the work is still shared in the Kubernetes platform underneath, and the platform on top can specialize in whatever opinionated, more specific requirements its user persona has. Cool. Thank you very much, Lucas. Thank you very much.