Hi, everybody. Thanks for coming. So first thing, can I get a show of hands: who is using sidecars at the moment? So all of you, OK. Who is using version 1.29 already? Oh, yeah. And are any of you using the new native sidecars feature with 1.29? No hands. OK, cool. So I guess after this presentation, maybe you'll be convinced to either try out the new feature or to convince the authors of your upstream projects to add support for it. My name is Mike. I'm a software engineer at Kong, and I work on Kuma. And I'm joined by Matei. I'm a Linkerd maintainer and a software engineer at Buoyant. And also I cannot crop pictures, as you can see; both of them have different dimensions. But yeah, thank you so much for joining us. It's great to see that we have a full audience, especially on a Friday.

So let's get started. We're here to talk about sidecars: past, present, and future. Maybe let's start with what a sidecar actually is, and we'll go to the past first. The first reference that Mike and I could actually dig up was from 2015. Sidecars were featured in a Kubernetes blog post about how to augment your main container with additional capabilities and all of that good stuff. So they're a design pattern that has essentially been there since the start of Kubernetes. But let's dig a little bit deeper: what actually is a sidecar? Well, it's just a container. In Kubernetes, you have a pod: the smallest unit of scheduling that you can deploy in a cluster to get an application running. And a pod can have an arbitrary number of containers. In this diagram, we have two containers. They're both in the same pod, and they share the same network namespace, but they each have their own process tree and their own file system. Of course, they can still interact with each other; they can share files through volume mounts and all that. But essentially, they're isolated.
And that's exactly what a sidecar container is. It's a container that runs in a pod, in the same network namespace, so it can communicate with the other containers over loopback. They share the same cgroup hierarchy, so resource requirements and resource usage are effectively shared between the two of them. And they can share the file system through volumes, but the file system is otherwise isolated. All of this matters because sidecars, as we'll see, can add additional capabilities to your main container. But they also have some very interesting properties that we're going to talk about more, such as: they run as long as your app runs, and when your app terminates, so does your sidecar.

Some examples that you might be familiar with. Sidecars, like I said, have been used since 2015. You've probably seen them with secret managers: they can pull secrets from external sources and keep them in a volume mount shared between containers. Or you might have seen them with log aggregators: you can use sidecars to collect logs or ship them somewhere. And of course, with service meshes; both Kuma and Linkerd make use of sidecars. They're a really great way to add functionality to your main container. If there's one point I want you to take from these slides, it's that sidecars are a way to augment your main container in a very modular way, because they can be owned by a separate team, upgraded separately, and worked on separately. There's nothing that ties them to your actual application or your application stack. So of course, I mentioned service meshes. Can I get some nods or hands up from people who have used service meshes before? All right, let the record show a couple of people. So service meshes are really cool infrastructure tools.
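As a concrete sketch of the log aggregator pattern just mentioned, here is roughly what such a pod might look like. The image names and mount path are placeholders for illustration, not from any particular project:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-shipper
spec:
  volumes:
    - name: logs                          # shared between the two containers
      emptyDir: {}
  containers:
    - name: app                           # main container writes its logs here
      image: example.com/my-app:1.0       # placeholder image
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-shipper                   # sidecar reads the same directory
      image: example.com/log-shipper:1.0  # placeholder image
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
```

Both containers see the same `emptyDir` volume, while their process trees and the rest of their file systems stay isolated, which is the property described above.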
They allow you to get observability, reliability, and security basically out of the box, without making any changes to your application code. And the way we do it is, you might have guessed it, through sidecar containers. We inject what's known as a sidecar proxy that runs in the same network namespace as your other container. It lives as long as your other container lives, and it takes over your traffic. When it takes over your traffic, it can do a bunch of cool stuff, like adding retries, timeouts, and smart load balancing; I'm sure I'm going to miss a few features, but that's basically the gist of it.

Of course, like with most technologies, there's always a trade-off, and there are always some cons. The first cons are kind of cognitive, right? You introduce a sidecar container and suddenly some of the tooling stops working as well as it should. For example, kube cuddle; and yes, kube cuddle, not kube-c-t-l. If you do kube cuddle logs, you won't be able to pull logs from the container you actually want them from; you're going to have to start learning some very esoteric arguments. And also, when it comes to visibility into resource usage, it can make it a little bit tricky to find out what's really going on under the hood. So it does add some cognitive burden.

But then there are also some operational cons. You introduce a sidecar container and you can actually introduce some sidecar bloat. With Linkerd 1.x, we used the JVM. Imagine running the JVM next to every single application that you have, in its own container; that's something like 200 megs of a runtime that you probably don't need. And pods are also immutable. So operationally, that sometimes makes it hard if your application wasn't built to be rolled out frequently: whenever you want to upgrade a sidecar, you have to restart your main application.
And if your main application is actually stateful, you know, that can be a little bit tricky. And then there are some interface issues, but I'm not going to go into too much detail there. Instead, I'm just going to pass it over to Mike.

Right, so I'm going to talk a little bit about the native sidecar feature and how it solves some of the problems that Matei just mentioned. In particular, I'm going to go into how sidecars actually worked in Kubernetes before this feature and how they work now. So you might be wondering what this is all about. I mean, like Matei said, we've had sidecars for many years, so what does this feature do? Before this feature, it's important to realize that there were actually only two types of containers in Kubernetes. There are init containers: these are ordered containers that run one after another; each one runs to completion, and then the kubelet starts the next. And then we have containers: regular containers, app containers, main containers. These are all started at the same time, and each of them keeps the pod running, because there's no inherent priority between them. So if even just one is running, then Kubernetes keeps the pod running.

So this brings us to the first problem that the feature attempts to solve, which is that you couldn't actually use sidecars with init containers. If you have a service mesh, for example, an init container can't take advantage of it, because there's just no way to run a container next to an init container. And we kind of have the same problem with app containers as well, because, like I said, they're started at the same time. If your app container is fast, it will come up before the sidecar, and it will probably start to send requests to the mesh, but it won't be able to, because the sidecar isn't running.
And this is especially problematic if you're using a CNI plugin for your service mesh, because service meshes provide transparent proxy support, which uses iptables or eBPF to rewrite all traffic to be sent to the sidecar. But this CNI plugin executes before the pod has started. So your traffic, even in an init container, is being rerouted to a sidecar that doesn't exist yet, and that just breaks traffic for your pod.

So that was startup. I mentioned that containers are started at the same time; that's not quite true. In fact, the kubelet waits for something called a postStart hook to finish executing before moving on to the next container. So that's basically how sidecars worked before version 1.29: you put your sidecar container first in the list, you attach a postStart hook, and then you at least have the guarantee for your main container that your sidecar will be running.

OK, that's startup. Let's look at termination. The first part of termination is: say you're a pod that belongs to a Job, and your main container is running, it's doing its work, and it finishes, so it exits, right? What happens? As I mentioned, Kubernetes has no knowledge of priority between containers, so it just lets the pod keep running. And this is a problem, because you actually just want your pod to exit. There are ways around this as well. For example, you can have a controller that watches your pods, and if it sees that just the sidecar is running and the other containers have exited, it can send a signal to the sidecar that it should exit. Or you can wrap your main container with some functionality that sends a signal to the sidecar over localhost, for example. But it's not a great UX. And finally, we have pod termination. If your node is going down, for example, your pod is going to be terminated. And when that happens, the kubelet sends SIGTERM to all of the running containers at the same time.
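To make that pre-1.29 ordering trick concrete, a pod spec might have looked roughly like this. The proxy image and the readiness endpoint are placeholders, and the exact hook command varies per mesh; this is a sketch of the pattern, not any project's actual manifest:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: legacy-sidecar-ordering
spec:
  containers:
    # The sidecar goes FIRST in the containers list...
    - name: proxy
      image: example.com/mesh-proxy:1.0   # placeholder image
      lifecycle:
        postStart:
          exec:
            # ...and the kubelet won't start the next container until this
            # hook returns, i.e. until the proxy answers its readiness
            # endpoint. The port and path here are placeholders.
            command:
              - sh
              - -c
              - "until wget -q -O- http://127.0.0.1:4191/ready; do sleep 1; done"
    - name: app
      image: example.com/my-app:1.0       # placeholder image
```

Note that this only papers over startup ordering; it does nothing for the termination problems described next.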
And this is also a problem for proxies and sidecars, because the sidecar gets a signal that it should terminate, but actually you don't want it to terminate yet, right? You want your main container to have the functionality provided by the sidecar while it's shutting down. It might have to send information over the network about calculations it's done, something like this. So you need to make sure that the sidecar continues to run, and every service mesh has various ways of doing this, each with their pros and cons.

So, OK, these problems are not new; we've known about them for a long time. At the beginning of 2019, a KEP was proposed to solve this. It had a target of 1.15, and the current feature went into alpha in 1.28, so it's been a while. At the time, the idea was to just take containers and declare them to have a type: sidecar or normal container. The kubelet would then start init containers, then sidecar containers, and then normal containers, and at termination it would terminate normal containers first and then terminate the sidecar containers. And for a while this was worked on. There was work on the proposal itself, there were implementations, PRs opened, but there were problems. At the time, there wasn't that much confidence in the correctness of the pod lifecycle. For example, there were flaky tests, and there was functionality missing around node shutdown that would guarantee a graceful pod shutdown when a node shuts down. Ultimately, these things led SIG Node to pause the KEP, because they wanted to be conservative about adding new functionality before there was confidence in the existing functionality. This was about the time that the KEP became a meme: it had slipped countless releases, and people, I guess, kind of lost faith in it in general. But ultimately, it was delayed with good reason.
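For reference, the original 2019 idea can be sketched like this. The field name and placement below are illustrative of the proposal as described in the talk; this API was never shipped, so treat it purely as a sketch of the concept:

```yaml
# Sketch only: this shape was proposed but never merged into Kubernetes.
spec:
  containers:
    - name: proxy
      image: example.com/mesh-proxy:1.0   # placeholder image
      type: Sidecar    # hypothetical field: started before, and stopped
                       # after, the normal containers
    - name: app
      image: example.com/my-app:1.0       # placeholder image
```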
So if we fast forward to 2022, a SIG Node subgroup for CI had been started and had greatly improved the testing infrastructure and confidence in the pod lifecycle. Graceful node shutdown had gone beta, and there was a lot of renewed interest in getting sidecars working, especially for Jobs, and AI and ML workloads in particular. So a sidecar working group was started. And actually, within a few weeks of the working group being started, there was already the proposal that is basically the current feature, which is the following: in order to get the ordering guarantees that we want with sidecars, we put sidecars in the init container list, but we declare the init container to have a restart policy of Always. And that makes it a sidecar container. This is really simple and elegant, in my opinion, and it covers a lot of use cases.

Some use cases that were brought up were either dropped or pushed out. Things like more complex types of containers, so you could have a cleanup container, for example: this was dropped. Any kind of dependency graph between containers was dropped. There were some ideas around security boundaries for sidecars, so for example allowing kubectl exec only into a main container and not a sidecar container. And there was additional functionality around resource management, since managing resources is basically different for sidecars; I'll talk about that a little at the end.

So let's look at the KEP. How does it work? First of all, startup. Like I said, the sidecar is in the init container list, so it starts in order with the other init containers. But if the kubelet sees that an init container has a restart policy of Always, then instead of waiting for the container to run to completion, it waits for the startup probe to succeed. And if it does, then it moves on and leaves the init container running.
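Put together, a minimal native-sidecar pod spec looks roughly like this. The images and probe endpoint are placeholders; the `restartPolicy: Always` field on the init container is the actual mechanism described above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: native-sidecar
spec:
  initContainers:
    - name: proxy
      image: example.com/mesh-proxy:1.0   # placeholder image
      restartPolicy: Always   # this one field turns an init container
                              # into a sidecar container
      startupProbe:           # the kubelet waits for this probe to
        httpGet:              # succeed, then moves on to the next
          path: /ready        # container while leaving the proxy
          port: 4191          # running (path/port are placeholders)
  containers:
    - name: app
      image: example.com/my-app:1.0       # placeholder image
```

Because the proxy sits in `initContainers`, any init container listed after it can also use the mesh, which is one of the two problems this solves.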
And we can see already that this solves two of the problems we had before. We can actually use sidecar functionality in init containers now, by just putting an init container after the sidecar. And we also have the guarantee that the sidecar is up when our main container comes up. An important part of sidecars is that we always want them to be running, so if, for any reason, the sidecar exits during the normal lifecycle of the pod, the kubelet will restart it. That's the restart policy Always part.

And finally, we have termination. Let's look at the normal case of termination, the best case. If your pod is going down, instead of sending SIGTERM to all of the containers, the kubelet sends SIGTERM first to the main containers, waits for those main containers to exit, and then starts sending SIGTERM to the init containers, that is, the sidecar containers that are still running. This completely solves the problem I was talking about before with termination: we know that the sidecar will be running the entire time that the main containers are doing their cleanup. One thing to note, though, is that this ordering guarantee is dropped if the termination grace period expires and the pod enters emergency termination mode. Then SIGTERM is sent to all containers at the same time, two seconds pass, and then SIGKILL is sent and the containers are finally terminated.

So what did we solve? We solved the init container problem that we had before. We no longer have to worry about any kind of lifecycle hooks or any hacks around that. We have no need for a controller that watches the pods, no need for the main container to be made aware of the sidecar, and we have a great, predictable termination story. And ultimately, this makes sidecars simpler, easier to use, and more reliable. And I'm going to pass it back to Matei.

All right, Mike is a very gracious person. He let me do the fun section: the unpopular opinions section.
So, you know, service meshes have been using sidecars for a long time now. Both Linkerd and Kuma use them, and Istio, another service mesh, uses sidecar proxies, although as we'll see, that's not always the case. But obviously this is not the only approach to deploying a transparent proxy that can take over traffic in a service mesh. And, you know, this is a pretty hot debate. Both Mike and I have worked the booth at KubeCon, and we got this question a couple of times. There are three basic approaches that have started to crop up and become mature in the service mesh space. One, you can run a service mesh with a sidecar. Two, you can run a service mesh with a per-host proxy. And three, you can use an ambient mode, which is basically a combination of one and two. I do want to note that all of these are just operational models for how you can actually deploy a transparent proxy. It doesn't change what happens under the hood, right? Under the hood, you still have a proxy, you still handle traffic, and you still do all of the stuff a service mesh does.

So let's start with host proxies. How do they work? In this model, you deploy a proxy on each node in a cluster, or on each host, and that proxy is responsible for directing all of the traffic. Examples are Cilium's mesh and Linkerd 1.x. I'm going to talk about Linkerd 1.x because it's the one that I know a little bit better, even though it's a vestige from the past. In Linkerd 1.x, we had a JVM-based proxy that sat on every node, took over traffic, and got it to where it needed to go. And there were some advantages to this. It's really good for resource consumption, especially when we consider the JVM. The JVM is really good at scaling up, but not so much at scaling down, right? So there's no point in introducing the JVM runtime in a sidecar container, but it made a bunch of sense to have it as a per-host proxy.
It is also easier to deploy per-host proxies, because you think of the proxy as part of the network. It doesn't really introduce any of the cognitive cons that I mentioned at the start of the talk. And then finally, you can generally upgrade your proxy without really disturbing any of your running pods, right? You don't need to consider that your application is stateful and that rolling it out may have some unintended consequences; you can just do the upgrade on the node.

But of course, there's a trade-off. First of all, you lose out on operational decoupling. Suddenly you have something that runs on the node, and it's impossible to know what it will affect when you actually upgrade this proxy. If you have a bunch of workloads scheduled all over the nodes, and you might use some tricky scheduling policies to ensure they're all evenly spread out, then when you upgrade something on a node, it's impossible to know which traffic will be impacted when something goes wrong.

It can also lead to something known as the confused deputy problem. With service meshes in general, we tend to generate private keys, right? So we can encrypt traffic and do mTLS. And in the sidecar model, at the very least, all of this private key material stays in memory; it never leaves the container. And again, going back to the first few slides that we had, we saw that each container has its own file system and its own process tree, and all of these things are isolated. So when you generate a private key, you keep it in memory, and it never leaks. But that's not the same on a node. Imagine a node holding all of your private key material; that's a little bit of a problem if it leaks.

And finally, you have contended multi-tenancy. This is also known as the noisy neighbor problem, and it happens when you have a bunch of workloads that all send traffic, and maybe one workload is a little bit bursty.
Suddenly, it's going to take all of the resources available in the proxy, right? The proxy has to deal with a bunch of traffic, and because one of the workloads is sending more traffic, it might lead to this contended multi-tenancy problem.

So that's the per-host proxy model. We also have the ambient model, and I'll be the first to admit that I really don't know how this works under the hood. I know how it works as an operational model, but I think it's a little bit more complicated. It does have some advantages, the same advantages that a per-host proxy has, but as a disadvantage, it is more complex. And I'm living proof of that, but of course, don't take my word for it, because I'm also biased. In the end, it's all a trade-off, right? It is a hot topic, and I know a lot of people have questions around it, but it's a trade-off. It depends on how your environment is structured, how you want to run your environment, and what tools you're running. There's no such thing as one tool being better than another; it's more about which tool is the best fit for the job. So if you ever have to pick between service meshes and between operational models, what I would encourage you to do is just do the research, compare against what your environment actually looks like, and make the decision that's the most sane for you. And with that, I'll let Mike grab the mic back.

So yeah, just one final note on the future part of the talk. Like I said, the sidecar feature is in beta in version 1.29. That means that it's enabled by default, so you can use it as long as the tools you're using support it, which I know Linkerd and Kuma do, or will in the next release. And the API is stable. There's one thing that's still in progress, and that's KEP-4438. And like I said, it's important that the sidecar keeps running; that's part of the guarantee of the sidecar.
But this is actually tricky to do at the moment when the pod has already begun terminating. There are a lot of assumptions in the kubelet code based on the fact that, normally, when a pod is terminating, you wouldn't be interested in restarting any of its containers. But now we do need to restart the containers. And this is especially important because a lot of ML jobs, for example, have grace periods of up to half an hour. So if at any time during this period the sidecar goes down, then all of the work up to that point would be lost, because the job could no longer use the mesh to communicate. So we definitely need to restart the sidecar in this case as well. But yeah, that requires a lot of refactoring and a lot of new tests.

Finally, I think it will take a little while for the ecosystem to adjust to the new model, because the basic assumption before was that, in the steady state of the pod, only regular containers are ever running. But now there are potentially containers running in the init containers list as well.

Some additional directions for the present work on sidecars. It's a bit tricky to handle this change when you're upgrading, because while you can upgrade your control plane, and that would allow you to create pods with init containers that have a restart policy, you will actually need your nodes to be running at least version 1.29.2, because otherwise they won't be able to handle the behavior and they'll just essentially ignore the field. So there's a little bit of interesting work on something called a universal sidecar injector, which would basically be a way, or a model, for injecting both a new-style sidecar container as well as an old-style sidecar container.
And then at runtime, depending on the version of Kubernetes running on the node, the two containers would basically communicate with each other and make sure that if the sidecar container feature is there, then just the native sidecar container runs, and if not, then the old-style sidecar container runs instead.

Some other things I think would be interesting for sidecars: at the moment, you can add resource limits and requests to the sidecar containers. But if you know, for example, that at only one point in the sidecar's lifetime it will actually use some resource, like storage, it basically ends up holding that request for the whole lifetime, whereas what you actually want is to somehow share these resources between containers inside a pod. And there's some work being done on pod-level resource limits as well. Yeah, and finally, I guess this is a far-future thing, but it is kind of weird that we have sidecar containers in the init containers list. So maybe, in the probably far future, there's work to be done on improving the API in general and perhaps merging these two lists together in some sense.

Yeah, and that's the end of the talk. I just want to thank the sidecar working group, and Matthias and Sergey in particular. And thank you for coming. We're happy to take questions. I think someone with the microphone will come to you.

Just one quick comment. You mentioned, I'm not sure if it was intentional, but you mentioned that it will essentially ignore the feature if it's not there yet. That is not correct. Thank you, yeah. "Restart policy Always is not supported for init containers"; that's the error message that you'll get back. OK, thank you.

Hi, so you mentioned the init container ordered process on startup. So if you had two regular init containers, then a sidecar, then another init container, will the last init container wait for the startup of the sidecar?
Yeah, it will wait. If the sidecar container has no startup probe, the next init container will be run immediately; if there is a startup probe set, the kubelet waits for the startup probe to succeed and then moves on to the final init container. Yeah, exactly. Thank you.

Hey, from earlier in the talk, I remember that init container resources are calculated as a maximum for the pod. If the sidecar continues running, how does that work? Yeah, in this case, the sidecar container is treated basically as if it were a regular container when it comes to resource limits and requests, exactly.

Going once, going twice, raise a hand. Oh no. No, I just, when I saw this, I thought this is a great feature, but I thought maybe it's a little bit buried in the init containers, so I was wondering if there's a strategy for making this more visible in the future? I think just the documentation, getting it out there. Yeah, it's a bit, I mean, it's not ideal, right? If we were to go back and start again, it wouldn't look like this. It's kind of a compromise between making too big of an API change and enabling the feature, so, yeah.

Just wanted to say it's great that it finally shipped.

Yeah, hi. I wanted to ask about how it works with logs and also kubectl exec. So can you exec into those sidecars, and do the logs show up? I don't remember, but something with init containers doesn't work there, I think. As far as I'm aware, yes. These tools have been updated to treat it essentially as if it were a normal container, so you can exec into it, and you can read logs from it, yeah.

Is it a requirement for the sidecar container to have a postStart hook? No, it's not required. These hooks are no longer necessary with the sidecar feature, yeah. You can still use them, but they're no longer necessary.

Sorry, can you go back to the slide with the three approaches for proxies? Because you talked about the sidecar, but I just wanted to see what the other approaches were.
So ambient is basically a variant of the per-host proxies, OK. Thank you. No problem.