Hi, it's good to see all your faces. I should apologize for the title of this talk, because I realized afterwards that unless you've already seen the talk, you don't know what the title means. Hopefully we'll fix that by the end.

Let me tell you about the motivation for this talk and why I'm giving it. My name is Nick. I work on a product called Tilt, and we interoperate a lot with Kubernetes. As we interoperated with Kubernetes, we started to realize how much the design decisions Kubernetes makes were influencing the design decisions we were making in our own app. When we thought about why that was happening, we said: well, if you squint really hard, Kubernetes is just a framework for solving control loop problems. And if you squint really hard, Tilt also needs to solve a lot of control loop problems. We really admired the way Kubernetes broke those problems down into pieces and solved them. So we said: Kubernetes actually has a pretty good opinion on how to solve these kinds of problems. Why are we building all these libraries ourselves just to build control loops in our own app? Why don't we pull in more of Kubernetes as a library, take the controller runtime and the API server, pull them into our own app, and use them instead of building them ourselves? I can't see your faces, but you probably look horrified. Maybe by the end of the talk I will convince you to at least be less horrified by this idea, and I will tell you what happened and whether you should also use Kubernetes as a library yourself.

Let me start by talking about what I mean by the control loop problem. Control loops show up in a lot of places in the Kubernetes documentation, and they are a very old engineering concept. The idea of a control loop is that in an environment where things can change at runtime, where you need to react to changes and steer to avoid obstacles, you need an engineering system that can self-regulate, that can continuously respond to changes and keep the system in a consistent state. There are many mechanical control loops in the world, but we're going to focus on software control loops. These systems look very different from traditional web apps, where you send a request and get back a response that tells you whether your request succeeded. Software control loops are constantly adjusting their behavior in response to the environment around them, whether that's the amount of bandwidth available, the amount of CPU available for scheduling work, or a file system or operating system that has changed around you.

A lot of this talk is inspired by the book Cloud Native Infrastructure, which Kris Nóva and Justin Garrison wrote a few years ago. One of the reasons I really admire that book is that it tries to abstract away from the implementation choices that different cloud-native systems make and asks: what are the patterns, what are the abstract problems that any system tackling this space is going to have to handle?

The reason this is relevant now is that, for a long time, the Unix philosophy was organized around files, but the cloud-native philosophy is organized around servers: servers that have lots of runtime inputs, that can be swapped out, brought down, and reloaded, and that change their APIs at runtime. Kubernetes in many ways is driving this trend by making it easier to manage lots of servers, but in many ways it is also just responding to it: we as a software community have decided we're going to build apps out of lots of servers, and we need systems that help us keep those services in a stable state.
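To make the shape of the problem concrete, here is a minimal, self-contained sketch of a software control loop in Go. It is not from Tilt or Kubernetes; the type and field names are invented purely for illustration. The loop observes the current state, compares it to the desired state, takes a step to close the gap, and repeats.

    package main

    import (
    	"fmt"
    	"time"
    )

    // state is an invented example: what we want the world to look like
    // (desired) versus what it actually looks like right now (observed).
    type state struct{ replicas int }

    func main() {
    	desired := state{replicas: 3}
    	current := &state{replicas: 0} // stands in for "the environment"

    	// The loop: observe, compare, act, repeat. A real control loop runs
    	// forever and usually blocks on change notifications instead of polling.
    	for i := 0; i < 5; i++ {
    		observed := *current
    		fmt.Printf("observed=%d desired=%d\n", observed.replicas, desired.replicas)
    		if observed.replicas < desired.replicas {
    			current.replicas++ // take one step toward the desired state
    		} else if observed.replicas > desired.replicas {
    			current.replicas--
    		}
    		time.Sleep(10 * time.Millisecond)
    	}
    }

Everything that follows is about the machinery Kubernetes provides so you don't have to hand-roll loops like this one.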
So with that short introduction to what a control loop is, let's talk about how Kubernetes solves this particular problem. I like to say that when a lot of people encounter Kubernetes for the first time, they find it very scary and just want to hide in a corner. There are CLIs, there are HTTP APIs, there are gRPC APIs, there are client libraries, there are pods, there's DNS, there are services, there are all the resources that Kubernetes manages, and it can be pretty overwhelming.

I wrote a blog post in the spring, kind of a troll post, that got shared a lot more than I expected, called "Kubernetes is so simple you can explore it with curl." What I wanted to break down in that post is that when we talk about Kubernetes, we're actually talking about two separate things. One part of Kubernetes is a huge ecosystem of tools that manage cloud infrastructure, and that is cool; I don't want to denigrate it, it is an important problem. But the other part of Kubernetes, the part I find super exciting, is a very simple, very elegant, honestly genius library for managing control loops. Because that library is so simple and that API is so elegant, it has allowed us to manage very complex cloud resources; the simplicity of the core API is what has allowed Kubernetes to become so complex and handle so many complex problems. I really want to help people differentiate between the simple, consistent API that is the same for every resource and all the things you can do with it.

So let's break down how Kubernetes is structured. This is my basic illustration of the hierarchy of needs you have to climb if you want to reach Kubernetes API server self-actualization. I also want to give a big shout-out to the Kubernetes community, who have done a lot of good work breaking this up into small repos, repos with really clear divisions of responsibility that layer on top of each other, so that each repo builds on the repo below it. We're going to go right through them; if you contribute to Kubernetes, this will all sound very familiar.

The first layer, the basic meat and potatoes of Kubernetes, is the Go structs. A lot of the Kubernetes documentation is auto-generated from the Go structs, but I really enjoy just reading the structs themselves, because they are very simple, they are very well documented, and they all have a consistent structure: a spec that defines the desired behavior of the resource, and a status that records the observed state of the resource, which other things that need to respond to it can watch.
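As a rough illustration, and not any real Kubernetes type, a resource in this style looks something like the sketch below. The field names are invented just to show the spec/status split; real Kubernetes objects also embed type and object metadata from the API machinery layer described next.

    package main

    import "fmt"

    // FooSpec holds the desired state: what the user or another tool asked for.
    type FooSpec struct {
    	Replicas int `json:"replicas"`
    }

    // FooStatus holds the observed state: what the controller last saw, which
    // other components can watch and react to.
    type FooStatus struct {
    	ReadyReplicas int    `json:"readyReplicas"`
    	LastError     string `json:"lastError,omitempty"`
    }

    // Foo is the resource itself. Real Kubernetes types embed TypeMeta and
    // ObjectMeta (from apimachinery) rather than a bare Name field.
    type Foo struct {
    	Name   string    `json:"name"`
    	Spec   FooSpec   `json:"spec"`
    	Status FooStatus `json:"status"`
    }

    func main() {
    	f := Foo{Name: "example", Spec: FooSpec{Replicas: 2}}
    	// A controller would work to make the world match Spec and then report
    	// what it observed back into Status.
    	f.Status.ReadyReplicas = 2
    	fmt.Printf("%+v\n", f)
    }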
Going one level up from the basic Go structs, we have API machinery. Once we have a system with objects, we need a way to refer to those objects: a way to refer to their types, a way to name them and look them up by name, and a way to group them together. API machinery says: okay, we have a bunch of objects, let's have a standard, well-understood way of referring to them. And this is all broken out into its own little library that you can use for managing any Go structs.

The next level up, once you've covered how to refer to objects, is HTTP APIs for each struct. I honestly believe one of the genius moves here is that Kubernetes does not make you define your own HTTP routes. It just says: you know what, we know what HTTP routes you need, and we're going to auto-generate them. For every object you need to manage, we know how to refer to it by name and by label, so we auto-generate HTTP APIs to create, update, get, delete, patch, list, and watch. This is a little more complicated than a basic CRUD app, and we will come back to that. But for now, what's really nice is that the API server can generate new routes for new resources at runtime.

Once you have HTTP APIs, you need prototyping tools: tools that let you interact with those objects from a CLI. That's what the kubernetes/kubectl and kubernetes/cli-runtime repos are, the command-line tooling for working with these structs. kubectl get, kubectl describe, kubectl edit, kubectl apply, and kubectl patch are good examples, and they are all really ways to prototype things on top of those HTTP APIs.

Once you have a prototyping framework, you need a way to make it robust and actually build your control loop: to take this set of HTTP APIs for managing resources and continuously respond to changes in those resources. Controller-runtime is the framework for that. It turns HTTP calls into a fully realized control loop: it listens for changes, caches things appropriately, manages the caches for you, and doesn't really expose the HTTP calls it is making under the hood; it just exposes them to you as events, as changes you need to respond to. The important bit is that to build a control loop you need more than the classic CRUD web app. You can't just create, read, update, and delete; you also need list and watch, and those are the verbs that let you build these standard control loops that can manage any resource.
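Here is roughly what that pattern looks like in practice: a minimal sketch, assuming a recent controller-runtime release, of a reconciler that just logs Pod changes. This is not Tilt's code, just the standard shape. The manager owns the list/watch machinery and the caches; your code only gets told "something changed, go look at it."

    package main

    import (
    	"context"

    	corev1 "k8s.io/api/core/v1"
    	ctrl "sigs.k8s.io/controller-runtime"
    	"sigs.k8s.io/controller-runtime/pkg/client"
    	"sigs.k8s.io/controller-runtime/pkg/log"
    )

    // podLogger is a reconciler: controller-runtime calls Reconcile whenever a
    // watched Pod changes, and Get reads from the manager's shared cache.
    type podLogger struct {
    	client.Client
    }

    func (r *podLogger) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    	var pod corev1.Pod
    	if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
    		// The Pod may have been deleted; nothing to do until the next change.
    		return ctrl.Result{}, client.IgnoreNotFound(err)
    	}
    	log.FromContext(ctx).Info("observed pod", "name", pod.Name, "phase", pod.Status.Phase)
    	return ctrl.Result{}, nil
    }

    func main() {
    	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
    	if err != nil {
    		panic(err)
    	}
    	err = ctrl.NewControllerManagedBy(mgr).
    		For(&corev1.Pod{}).
    		Complete(&podLogger{Client: mgr.GetClient()})
    	if err != nil {
    		panic(err)
    	}
    	// Start runs the shared caches and keeps calling Reconcile on changes
    	// until the process gets a shutdown signal.
    	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
    		panic(err)
    	}
    }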
Okay, let me stop there for a minute. Those are the basic building blocks of a control loop as Kubernetes breaks them down. Why does this apply to Tilt? As I said, if you squint hard enough, Tilt is also a control loop. We're not going to talk much about Tilt, but just so you have the context: Tilt watches your source code, it auto-deploys and restarts services when the source code changes, and it raises alerts when services are unhealthy. It's really just responding to a lot of environment. When we first wrote Tilt, we ended up building our own control loop infrastructure. We just hand-rolled it: this isn't too complicated, we can build our own control loop, it's just a loop, right? Let's just have a little loop that reacts to changes to a data structure, and it will be simple. The problem is, once we built that very simple control loop, we realized that as an infra tool we needed a lot of tooling around it, and people were asking us for a lot of tooling around it.

They needed an API, an HTTP API, to make changes to the data structures. They needed CLIs to diagnose problems. They wanted to write new dashboards and new tools that responded to the changes and displayed updates. We ended up writing a lot of those things ourselves for our own internal tooling, but it was a pain, it didn't work very well, and we were pretty lazy about it. So we said: you know what, Kubernetes has already solved these problems, and solved them really well. We don't need to build an API server. We don't need to build our own CLI framework. We don't need to build our own control loop runtime with appropriate caching and appropriate retry mechanisms. We can just use what the Kubernetes community has already built.

So I'm going to talk about two examples where we used this in practice and the kinds of problems we tried to solve with it. We said, okay, we're going to test this theory by trying it with a few obvious objects. The first object is Tilt's FileWatch. There's actually a long history of tools that have implemented file watching as a control loop: entr is a big one, nodemon is a big one, Facebook's Watchman is a big one. They are all daemons that run on your machine. They set up file watches using whatever API your OS supports, whether that's inotify, kqueue, FSEvents, or ReadDirectoryChangesW, and they say: okay, I'm going to continuously watch these files and expose the changes to other services that need to know about file changes. This turns out to be a really great fit for how Kubernetes thinks about declarative data models: you can declare, hey, I want to watch all the files matching *.go, I want to continuously react to those changes, and I really only care about the current state of things, the most recent stuff.

This is what the FileWatch spec looked like at first. We said: this is a pretty well-understood problem, there are good libraries for watching files, so we're just going to create a control loop, an API server wrapper, around one. So we defined some structs. The spec has watch paths, the directories we're going to watch recursively, and a list of ignores, files we're going to skip. And we have a FileWatch status, which gives us both low-level operating system errors, like "hey, you ran out of inotify watches," and file events: here are the most recent files that changed, which you should be responding to.
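A rough approximation of what that looks like, with invented field names rather than Tilt's exact API, built on the fsnotify library. The reconciler's job is to turn the spec into OS-level watches and to surface both errors and recent events in the status, where anyone can inspect them.

    package main

    import (
    	"fmt"
    	"log"
    	"time"

    	"github.com/fsnotify/fsnotify"
    )

    // FileWatchSpec is the desired state: which paths to watch and what to ignore.
    type FileWatchSpec struct {
    	WatchedPaths []string // directories to watch (Tilt watches recursively)
    	Ignores      []string // patterns of files to ignore
    }

    // FileWatchStatus is the observed state: OS-level errors and recent events.
    type FileWatchStatus struct {
    	Error         string // e.g. "ran out of inotify watches"
    	RecentEvents  []string
    	LastEventTime time.Time
    }

    func main() {
    	spec := FileWatchSpec{WatchedPaths: []string{"."}}
    	status := FileWatchStatus{}

    	w, err := fsnotify.NewWatcher()
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer w.Close()
    	for _, p := range spec.WatchedPaths {
    		// fsnotify's Add watches a single directory; a real reconciler would
    		// walk the tree and apply the ignore patterns.
    		if err := w.Add(p); err != nil {
    			status.Error = err.Error() // surface OS-level failures in the status
    		}
    	}

    	// The reconcile loop: copy file events into the status so users and
    	// other controllers can see them.
    	for {
    		select {
    		case ev := <-w.Events:
    			status.RecentEvents = append(status.RecentEvents, ev.Name)
    			status.LastEventTime = time.Now()
    			fmt.Println("changed:", ev.Name)
    		case err := <-w.Errors:
    			status.Error = err.Error()
    		}
    	}
    }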
Before we built this, a lot of our support load was helping people diagnose file watching problems. Tilt really is just a file watcher at heart, and a tool like this needs to be good at watching files. But people would have all sorts of problems: they'd written the pattern wrong, or misunderstood the pattern, or mixed up relative versus absolute directories, or screwed up the syntax, all sorts of weird errors. Diagnosing those bugs was really difficult; we'd have to run through a list of twenty questions to figure out what the problem was, and we'd often have to have people share their whole project.

Once we moved this to a control loop, we realized that the control loop view of the world really lets you expose the state to people and lets them run their own interactive experiments. We would just say, hey, this works just like Kubernetes: you can run tilt describe filewatches. And they'd say, oh, this is perfect, this is exactly what I need. I'm going to change the spec, touch a file, watch the status, and run my own interactive experiments to see if I got this right. It made it way easier for people to diagnose their own problems, and the number of file watch issues we got just dropped off.

Another very early migration we did was port forwarding. If you've used kubectl port-forward, it's a great tool: it binds to localhost on a port, creates a tunnel to a pod in your cluster, and dynamically steers traffic from localhost into your pod. kubectl port-forward is another thing that has to continuously maintain that tunnel and continually steer to keep it up. We used to have all sorts of problems where people would say "port forwarding isn't working," and we'd ask, okay, what's going on, and they'd say, I don't know. We'd have mega-issues that were just comment after comment saying "port forwarding isn't working." We'd copy and paste the error messages into Google, find a Kubernetes issue thread, and it would be the same thing: forty people on an issue saying "port forwarding isn't working, please help me."

So we said: this is frustrating, could we do better? Could we adapt this to a reconciler? Could we have it publish its spec and status and help people understand which parts of their port forward are failing? So we forked the port forwarder in client-go and adapted it to this pattern. Once we did, exactly the same thing happened: suddenly people had the ability to diagnose why port forwarding was failing. Not all the time, but they could much more easily bisect the problem: is the local binding working? Is the remote binding working? Does it know which pod it's supposed to be binding to? And people could use the CLI to delete port forwards and self-heal. This let us triage issues much better and meant users didn't have to pair with us on them.
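Again, purely as an illustration of the idea rather than Tilt's actual API: the value comes from giving each stage of the port forward its own observable status field, so a user can see which stage failed instead of filing "port forward isn't working."

    package main

    import "fmt"

    // PortForwardSpec and PortForwardStatus are hypothetical shapes. Each stage
    // of the forward (resolving the pod, binding locally, keeping the tunnel up)
    // gets its own status field that the CLI can show.
    type PortForwardSpec struct {
    	LocalPort     int
    	ContainerPort int
    	PodSelector   map[string]string
    }

    type PortForwardStatus struct {
    	TargetPod   string // which pod the selector resolved to, if any
    	LocalBound  bool   // did we manage to bind localhost:LocalPort?
    	TunnelReady bool   // is the tunnel to the pod currently up?
    	Error       string // last error, e.g. "address already in use"
    }

    func main() {
    	pf := PortForwardSpec{LocalPort: 8080, ContainerPort: 80, PodSelector: map[string]string{"app": "web"}}
    	st := PortForwardStatus{TargetPod: "web-abc123", LocalBound: true, TunnelReady: false, Error: "connection refused"}
    	fmt.Printf("spec=%+v\nstatus=%+v\n", pf, st)
    }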
We've since done a bunch more of these migrations, making more things in our app follow the basic Kubernetes reconciler pattern. When you run Tilt today, you really do run a Kubernetes API server locally, and more and more of our primitives use it.

Let's zoom out a bit and talk about the high-level lessons, and whether you should also throw out parts of your app and replace them with parts of Kubernetes. Joy number one, which is probably not surprising, is that kubectl is really great and has figured out a lot of the right verbs for diagnosing problems. Don't underestimate the social conventions of the tooling: people can say, oh yeah, this works like kubectl, I'm used to kubectl, and if every tool worked like this it would make my life a lot easier.

There's a second, related thing, which is that kubectl and the rest of this stack just make a lot of decisions for us. Kubernetes makes a bunch of these control loop decisions; we probably would have had slightly different opinions, but the fact that someone handed us decisions that were good, and good enough for us, made it much easier to build these control loops.

The last thing, which is something I just really admire about the Kubernetes architecture in general and where we started to see the benefits, is that if you use this kind of architecture, an API plus a CLI plus a controller runtime, then your users, your CLI, your in-binary code, and the external code interacting with you are all using the same API. I've wanted to write a blog post called "write Kubernetes controllers in bash," but instead I just wrote some fun Tilt extensions in bash. The Tilt CLI ends up being a wrapper around kubectl, so my bash extension just runs tilt get cmd --watch, which streams the list of commands as they change to a script that reconciles them. Every time a command changes, that script creates a button for the command, so it ends up creating a bunch of buttons in the Tilt UI dynamically, based on what you're running right now. This has let us prototype things really quickly; bash is such an underrated technology, and prototyping things in bash always gives me some joy.

I want to talk about a couple of changes and problems we ran into. The first big one: I know it's always popular to complain about Go dependency management. Jason gave a talk about this at KubeCon EU this year; I didn't actually know he was going to be the track host, so I'm really glad he's here and I can praise him while talking about this problem of pulling Kubernetes in as a library. There are certain dependencies that are really problematic, and mostly they're not Kubernetes dependencies; they're mostly Kubernetes's upstream dependencies. I'll let the slide speak for itself. I don't have a good solution to this, but pulling Kubernetes in as a library definitely does pull in some problematic projects that break their versions often.

The other big pain of building on Kubernetes as a library is that Kubernetes brings in a lot of things we don't need. Kubernetes really likes etcd, and really likes to run an etcd server. Tilt swapped that out with an in-memory store, which took some work just to replicate what etcd does and the etcd semantics that Kubernetes relies on. And lastly, Kubernetes really likes TLS certs, and has its own ways of doing auth and access control, which we inherited. For a local dev tool, it probably doesn't make sense to do auth and security the way Kubernetes does it; it caused a lot of pain, but we ended up getting through it.

Lastly, if you want to learn more about this problem: we are not the only team doing something like this. There were two talks at KubeCon EU that both touched on it. There's the kcp talk (kcp is, I think, a Kubernetes-like control plane), which really talks about how to better expose an API server as a library that other people can use for non-Kubernetes management.
And then there's "bad idea," which is Jason's talk about doing a similar thing; they actually set up CRDs, which I have not done yet. And lastly, Kubebuilder, which has a lot of great documentation on how to write these sorts of tools. I hope to see more tools built with this architecture in the future.

Thank you very much; that is the end of the talk. I want to give a shout-out to Ellen Körbes: this was originally meant to be a joint talk that I was supposed to give with them, but instead they decided to give a Kubernetes action movie on Friday, which you should go see; it's called Beyond Kubernetes Security. And also to the Tilt dev team, who gave a lot of notes on this talk and made it a lot better. So thank you, everybody. Now we have time for questions; if you have any questions, please raise your hand and we'll get a microphone over to you.

I want to get into the weeds a little bit on what specifically gRPC does that makes it such an ugly dependency, and maybe I should just go look at the other "bad idea" talk. I mean, pulling in client-go is huge, maybe not necessarily problematic in the way I'm getting the sense gRPC is, as far as pulling in a bunch of things you don't need and not playing nice, but client-go is massive. It's gotten better, but I remember back in the glide days, before go mod, people passing around giant lists of replaces and dynamically building big lists of very specific version pinning. It's great that the Go packages are split into all these different repos, but getting a cohesive set of those to build client-go was a pain for a while. So I guess the question is, short of me going to watch the whole talk, what does gRPC do that's so bad?

Wait a minute, is your question about client-go or about gRPC?

About gRPC. Like I said, tell me if I should just watch the other talk, but mostly I just wanted to dish on what was so bad about gRPC.

Okay. I'm not sure if there are any gRPC people in the room; I don't want to criticize them too hard. I will say client-go, which used to be really awful to pull in as a dependency, has gotten really good. I'm not sure who was responsible for that, but I owe them a drink. gRPC, let's just say I've had some conversations with the gRPC people where they've turned things into errors but not really provided any mitigation strategy for what happens if one of your dependencies upgrades gRPC and how you're supposed to handle that new error. That's generally the problem. I think their official deprecation policy is that they do not follow semver: they're allowed to break anything that is marked experimental, but there's no way to know whether an API is experimental unless you read the documentation, and if some downstream project uses an experimental API, you just don't know. That, I think, is a large source of the problem.

Is there any intention to open-source Tilt, the product?

Oh yeah, Tilt is an open-source product. That's basically it: you literally can check out Tilt tomorrow and see the parts of Kubernetes that we use. There's also a separate repo called tilt-apiserver, which is a kind of standalone API server, not unlike kcp or "bad idea," that just shows you how to set one up.

All right, thanks everybody for coming.