So I wonder if anybody has tried to debug containers using sidecars. Anybody? Okay, a few people. Have you tried to create your own tool to debug other containers using sidecars? Okay, good. Have you tried to do it with Docker and Kubernetes? How about containerd? Anybody tried to do that? That's a little tricky. We're gonna talk about building a debugging tool for all three of those runtimes. They're similar, but they're different. And we're gonna talk about Docker Slim and how it was created there. And if you don't know what Docker Slim is, you'll learn more. So first I'd like to intro Ivan. This talk wouldn't be possible without him. Unfortunately, he couldn't make it. But if you're interested in containers, you should see his blog posts, his newsletter, and his amazing diagrams. His labs are awesome. Yeah, I am Saiyam, a field CTO at Civo, CNCF ambassador, and Kube Simplify founder. You can find me on Twitter, very active, I might reply to you there. And I'm joined by Kyle, the Docker Slim guy, CTO at Slim.AI; you can find him as kcq on Twitter. You can follow us, and let's get started. Okay, so a little bit about how it all started. There's a connection here, because it all started at KubeCon, KubeCon EU two years ago in Valencia. We had a hackathon, he won it, and he created the very first version of the debugging capability we're gonna be talking about. So let's first discuss the problem: why we are building what we are building, what the current landscape of debugging containers with the existing tooling looks like, and then how this new tool helps to simplify the developer experience and works seamlessly across all three: Docker, Kubernetes, and containerd. So, minimal container images don't support debugging, because they are minimal and they don't have what you normally use for debugging; the standard debugging tools won't be there. Since many of you raised hands, you are probably aware of the go-to commands.
So the go-to command when you are debugging is docker exec or kubectl exec: you exec into the container, spawn a shell, and then do ls and a bunch of other things to test what is working and what is broken. That's how you get to the application which is not working. It starts a shell process inside the target container; that's docker exec -it. So these are some of the standard commands that you would probably use when debugging across Docker, containerd, and Kubernetes. We'll come to kubectl debug and the others as well. But what about minimal images, distroless images? That is just a minimal container image containing your application and nothing apart from it: no libraries, no dependencies, no shell. Then how do you do that? Because if you cannot spawn a shell process, how would you get to the file system? How do you debug? So that's hard. This means the exec commands won't work. As you see, the standard docker exec, nerdctl or ctr exec, kubectl exec, these standard commands won't work. Some of the others still would work, like nsenter, which does require some low-level privileges to run, and kubectl debug. So taking a step back and reviewing the different types of debugging experiences, we can put the sidecar approach in the center for now. You can see docker exec can run the target binaries and install some extra tools, but it only works if there is a shell. kubectl debug, the sidecar that Kyle just asked about, can see the target processes and access the target root file system, but not as is. And nsenter can do anything, but you need sufficient access and you should know what you are doing. So can we combine the best of all worlds and get the debugging experience you would want with your containers, Kubernetes pods, and containerd, for minimal images?
Also, can we have the same user experience? User experience matters a lot here, because once you get used to writing commands via a single CLI tool, it's great if the same framework, the same developer experience, carries across all the environments. For example, on the slide the tool name is just a placeholder, tool. You have debug, you can just pass different runtimes, whether it's Docker, containerd, or k8s, and then you just define the target, which particular container you actually want to debug. So that would be cool. Let's see if it's possible to have this. So with docker exec, what you get is a new process in the target container. It shares all the namespaces, including the mount namespace. So when you run docker exec -it with a shell, you get a process alongside the app and everything is shared. If you are interested in more details about how docker exec works, you can go to the OCI open container runtime spec and the related issues and pull requests. You can almost simulate the docker exec behavior using the docker run command by attaching to the target's namespaces. So you can use docker run, share some of the namespaces (net, IPC, PID), and spawn a shell. In this case you will have a separate container which will not have the same mount namespace. So the file system of this container, when you do ls, will be different from the target container that you actually want to debug. This is what it looks like: you can see both the app container that you actually want to debug and the sh container that you just created with that docker run command. They share PID, net, and IPC, because that's what we shared, but they have different mount namespaces. And then when you do kubectl exec... sorry, kubectl debug, that's the ephemeral containers feature that gets used.
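The docker run simulation described above boils down to an argument list like the following. This is an illustrative sketch, not code from the talk: the helper name and the target name app are made up, but the --network, --ipc, and --pid flags and their container:&lt;name&gt; syntax are the real Docker CLI ones.

```go
package main

import "fmt"

// dockerRunSidecarArgs builds docker CLI arguments that approximate
// exec-style debugging with a separate image: the sidecar joins the
// target's network, IPC, and PID namespaces, but keeps its own
// mount namespace (and therefore its own file system).
func dockerRunSidecarArgs(target, debugImage string) []string {
	return []string{
		"run", "-it", "--rm",
		"--network", "container:" + target,
		"--ipc", "container:" + target,
		"--pid", "container:" + target,
		debugImage, "sh",
	}
}

func main() {
	// e.g. docker run -it --rm --network container:app ... busybox sh
	fmt.Println(dockerRunSidecarArgs("app", "busybox"))
}
```

Because the mount namespace is not in that list (Docker offers no container: mode for it), ls in the resulting shell still shows the sidecar's own file system, which is exactly the limitation kubectl debug runs into as well, as described next.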
So kubectl debug creates a sidecar container. This is your distroless app that you want to debug. It spawns a sidecar container; you have different mount namespaces, and it shares all the other namespaces. So even there, when you do ls directly, you will be getting the ls of the container that just got spawned. Not sharing the mount namespace has an impact on the debugging experience, because now you have a different root file system. Is there a way to do it so that we get to the app file system, which is what we actually want to debug? Yes, we can, with some workarounds. When you run this, you can actually use /proc/&lt;pid&gt;/root, for example /proc/1/root, and it gets you to the app's root file system. But the question still remains about the developer experience. You got to the file system; is it still possible to have the same level of developer experience with sidecar debugging? You can use the chroot command or an equivalent. The chroot command will make it look like we are in the target container, but we need to make the debugging image's binaries accessible inside the new chroot. So the workaround, in order to keep the debugger's tools reachable, is: you first create a symlink (that's the symlink creation), then export the binaries onto the PATH, and then chroot into /proc/1/root. What this does is give you the feel that you are in: when you do ls, you have access to the file system of the container that you actually want to debug, and the debugger's binaries still work. So we saw the problems, we saw where the developer experience lacks, and there are workarounds, which are different for different runtimes. Can we now start implementing our debugger? The clicker... cool. So let's take a look at what we need to do. The high-level flow is very similar for all runtimes.
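To make that workaround concrete, here is a sketch of the three steps as the shell commands they amount to, wrapped in Go only to stay in one language with the later snippets. The /.debugger location and the link through /proc/$$/root are illustrative choices (roughly how Ivan's cdebug approaches it), not the exact paths from the slides.

```go
package main

import "fmt"

// chrootSteps returns the shell steps behind the chroot workaround.
// debugBinDir is where the sidecar keeps its statically linked tools.
func chrootSteps(debugBinDir string) []string {
	return []string{
		// 1. Symlink the sidecar's tool directory into the target's
		//    root; the link goes through the sidecar's own /proc view
		//    so it still resolves after the root switch.
		"ln -s /proc/$$/root" + debugBinDir + " /proc/1/root/.debugger",
		// 2. Keep those tools resolvable from the new root.
		"export PATH=$PATH:/.debugger",
		// 3. Make the target's root our root; ls and friends now
		//    operate on the target's file system.
		"chroot /proc/1/root sh",
	}
}

func main() {
	for _, s := range chrootSteps("/opt/debug/bin") {
		fmt.Println(s)
	}
}
```

This assumes the sidecar shares the target's PID namespace (so the target is PID 1 there) and has enough privileges to write the symlink and call chroot.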
Yeah, you need to spin up a container, then you need to connect that container to the target, you wait for it, and then you can do the I/O there. The steps are similar, but the details for each runtime matter, because of the APIs and because of the design differences; for example, with Kubernetes you have pods and ephemeral containers to deal with. There are a number of low-level details you need to handle when you go beyond the happy path, just like with anything else. We won't go into great detail there, but for example, Docker Slim creates the sidecar containers as privileged containers to deal with some of those gotchas. We talked a little bit about exposing the binaries in the target container view by symlinking the debugging file system and then adjusting the PATH with the bin directories. But if those executables depend on shared libraries, then you also need to do something similar for the libraries, and that's a little trickier. So it's easier to have statically linked binaries in your debugging images, but we'll talk more about that later. So let's take a look at the different runtimes and their APIs. Obviously we'll start with the Docker API. Here we have a few methods and the relevant interface that you would use in the official Docker API client: kind of the standard stuff to work with the container API, to create a new container, et cetera. A couple of snippets of the API; hopefully you can see them. And here, this is a slightly different version, because it's using a wrapper package in Docker Slim to abstract the lower-level API, but this is what happens in the tool itself: when it needs to handle the Docker runtime, it sets up the options for the new container execution, then it creates a new container execution object.
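In the official Go client, the options in question live on the container's host config. The sketch below mirrors the three relevant fields of container.HostConfig from github.com/docker/docker instead of importing it, so it stands alone; the field names and the container:&lt;id&gt; syntax are the real ones.

```go
package main

import "fmt"

// HostConfig mirrors the three fields of the Docker client's
// container.HostConfig that matter for a debugging sidecar.
type HostConfig struct {
	NetworkMode string
	IpcMode     string
	PidMode     string
}

// sidecarHostConfig points every shareable namespace mode at the
// target container; "container:<id>" is the value the Docker API
// understands as "join that container's namespace".
func sidecarHostConfig(targetID string) HostConfig {
	ref := "container:" + targetID
	return HostConfig{NetworkMode: ref, IpcMode: ref, PidMode: ref}
}

func main() {
	fmt.Printf("%+v\n", sidecarHostConfig("target-id"))
}
```

The same host config is also where options like Privileged would go, for the beyond-the-happy-path gotchas mentioned earlier.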
And the most important thing starts on line 257, where we're setting up the modes: the IPC mode, network mode, and PID mode. We need to make sure that they point to the right container; that's the magic. So when you're building your debugger, this is the key thing that makes it happen for the Docker runtime. So let's see what it looks like for the containerd API and its runtime. It's a little more complicated, but conceptually very similar. There's the container construct, but there's also this task construct that you need to use when you're creating containers and when you're running them. It's much lower level, and the key here is the namespaces in the container spec, which you need to configure, and they're configured in a different way there. A few snippets of the API itself: they show the container, the process, and the task interfaces that you need to deal with when creating the debugging image instance and container, and then interacting with the target container and all of that. This part is a little messy and has a lot of snippets, because it's the most complicated and the lowest-level API. So first, with containerd, you need to find the container that you want to debug. That's a little tricky, because there's no nice way to look up a container by name, unlike Docker, where you just say, hey, this is the container name. Once you find the container, you need to get the main task of that container. Once you get the main task, you need to get the process ID for that task, and then you use that process ID to set up the namespaces. For example, on line 252 you set up the process ID namespace: you create a Linux namespace spec option that you later use in the call to create a new container, right there on line 328. You're passing a spec with the new-spec functional argument, and you pass the spec ops you configured. And then there's the I/O plumbing, which is very similar for all runtimes, but slightly different.
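Stepping back to the namespace setup for containerd: the flow from target task PID to spec options can be sketched as below. The real code uses oci.WithLinuxNamespace with specs.LinuxNamespace values from the OCI runtime spec; this stand-alone version imitates that struct locally so it runs without the containerd dependency.

```go
package main

import "fmt"

// LinuxNamespace imitates the OCI runtime-spec struct that
// containerd's oci.WithLinuxNamespace spec option accepts.
type LinuxNamespace struct {
	Type string // "network", "ipc", "pid"
	Path string // namespace file of the process to join
}

// sidecarNamespaces builds the namespace entries that make a new
// container join the target's namespaces. pid is the process ID
// obtained from the target container's main task.
func sidecarNamespaces(pid int) []LinuxNamespace {
	// The proc file for the network namespace is named "net",
	// while the OCI spec calls the type "network".
	procFile := map[string]string{"network": "net", "ipc": "ipc", "pid": "pid"}
	types := []string{"network", "ipc", "pid"}
	out := make([]LinuxNamespace, 0, len(types))
	for _, t := range types {
		out = append(out, LinuxNamespace{
			Type: t,
			Path: fmt.Sprintf("/proc/%d/ns/%s", pid, procFile[t]),
		})
	}
	return out
}

func main() {
	for _, ns := range sidecarNamespaces(12345) {
		fmt.Printf("%-7s %s\n", ns.Type, ns.Path)
	}
}
```

Each of these entries would become one oci.WithLinuxNamespace spec option passed when creating the new container.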
So it doesn't make sense to go into the I/O details. You create a new task; again, with containerd you have this new construct to deal with. That's how you run things: you don't run the actual container, you execute actions on the task. So you create a new task, you start waiting for its signal channel, then you start it, and then you either manage the signals if you have a terminal, or you forward the signals to the target if you don't. With Kubernetes it's a little easier, but you do need to deal with the ephemeral containers. The magic there is configuring the target container name in the ephemeral container construct. Much easier than containerd: just specify the name and that's it. The main API is the one in the pod interface; it's used to update the ephemeral containers. You can use the low-level PUT and PATCH calls as well if you want, that's doable, but Docker Slim uses the UpdateEphemeralContainers call and then the REST sub-resource call to do the actual attach. Here are a couple of API snippets; they show that the UpdateEphemeralContainers call is implemented using the PUT call, in case you cared. So here we have a whole bunch of snippets, fewer than for containerd. We set up an ephemeral container, and like I mentioned, on line 1194 we configure the target container name parameter, so we're attached to the right target container. The rest of it is the usual container spec setup with the entry point and arguments and all of that. Then we grab the pod spec that we got from the Kubernetes client API, add our ephemeral container to it, and call the UpdateEphemeralContainers method, and that will add the ephemeral container. After we wait for it to start and run, we can set up the I/O using the remote command, and we also need the attach sub-resource as well. So this is probably easier than containerd by 50 percent or even more: no namespaces to deal with.
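The Kubernetes side is almost declarative. This sketch mirrors (and flattens) the relevant fields of the corev1.EphemeralContainer API type rather than importing client-go; TargetContainerName is the real field that does the work the namespace plumbing does on the other runtimes.

```go
package main

import "fmt"

// EphemeralContainer mirrors, flattened into one struct, the fields
// of Kubernetes' corev1.EphemeralContainer that matter here.
type EphemeralContainer struct {
	Name                string
	Image               string
	TTY                 bool
	Stdin               bool
	TargetContainerName string
}

// debugEphemeralContainer builds the ephemeral container to append
// to the pod spec; setting TargetContainerName is what attaches it
// to the right target, no manual namespace setup required.
func debugEphemeralContainer(target, image string) EphemeralContainer {
	return EphemeralContainer{
		Name:                "debugger",
		Image:               image,
		TTY:                 true,
		Stdin:               true,
		TargetContainerName: target,
	}
}

func main() {
	fmt.Printf("%+v\n", debugEphemeralContainer("app", "busybox"))
}
```

The returned value is what gets appended to the pod's ephemeral containers before calling the update method, after which the attach sub-resource provides the interactive I/O.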
So a few things to mention about the debugger images. The best images, like I mentioned, are the ones with statically compiled binaries, because after you connect the PATH to the target container file system, you can use them easily. It gets trickier when your binaries have shared libraries: somehow the target container view needs to find those as well. If you're using the same base image as the target image, that makes it easier, because the same system shared libraries will be there. The next best thing is to do something similar to what Nix does. I don't know if anybody has looked into it, but Nix uses symlinks a lot, and we looked at symlinks a little bit; it does something similar, and then it does magic with environment variables, the PATH variable and the LD_LIBRARY_PATH variable, to make sure that the right libraries get loaded. You would have to do the same thing there as well. Or use a Nix-based image, because then that's taken care of for you: the shared library loading is all set up. With the Docker Slim tool, there's a set of pre-selected debugging images. There's your usual BusyBox, that's the go-to container when it comes to sidecar-based debugging, and then there are a couple of Chainguard images. Chainguard was nice enough to create a custom debugging image for Docker Slim (the mint debug image), and there are a few other popular debugging images, like netshoot and a bunch of KoolKits for different runtimes, and a few others. So feel free to experiment. Now, some of the challenges of creating these debugging tools. One of the biggest is the APIs and the libraries: they're not well documented. So if you've tried to build your own tool, you know that you kind of have to reverse engineer what they're doing. And unfortunately, ChatGPT doesn't help, especially with Kubernetes. Why is that?
Because it was trained on all the old data, and the Kubernetes API has changed significantly, so it generates a whole bunch of code that looks good, but when you try to run it, it's broken, because the API usage is outdated. So the best thing to do is to go to the source: go to kubectl, the Docker CLI, nerdctl. The main gotcha there is that it's complicated; there's just a lot of complexity, and you kind of have to reverse engineer it. Ivan has awesome Go examples, and his cdebug tool is a lightweight version of Docker Slim that also has port forwarding. So you should check it out; it's a good example, again, if you're writing your own. A few more gotchas, kind of a summary: when you go beyond the happy path, you have to deal with permissions and with gotchas around shared libraries in your debugging image. But again, like with anything else, you deal with that depending on the needs of your application. So, here's the clicker... it doesn't sound like it. This one works. Yeah, hand and mic are difficult when you do demos. So we'll try the demo now. It will be interesting to do the demo like this, but hopefully it works. The first one is to use the mint tool to debug the nginx slim image. So that's an nginx slim image, which doesn't have a shell, and we are trying to debug it. There's tab completion, so you can see we have debug, and then we can specify the runtime, and we specify that it's Docker. If you don't specify the runtime, it automatically chooses Docker; so if you are debugging Docker, you can actually omit the runtime. Then we specify the target, the nginx slim container. And you can see we are in a shell, and you can also do which nginx, so you can see we have access to nginx as well. So yeah, it's the same workflow.
So now what we'll try to do is replicate the same workflow that we talked about on one of the slides, the one where the tool was named tool; here the tool is mint. We'll try to replicate the same behavior across Kubernetes and containerd as well. So, moving to Kubernetes. The YAMLs... it's the same one. You want to show this? Yeah, just show the spec. The cursor is all the way to the right. Yeah. So it's a simple YAML file: a simple nginx pod with the Chainguard image. That's already deployed. So we exit from this and we do kubectl get pods. Okay, so you can see the example pod from the spec that I showed is up and running. So we'll use the same tool and debug, and this time we specify the runtime as k8s. We would normally have to specify the namespace and the pod, but since there is only one pod in the default namespace, it should automatically pick it up, and it does. You can see it is starting to enter the container, and there we are. It's the same tooling that we have. which nginx: so we have the same access to the root of the target container that we want to debug, and we have the same developer experience for Kubernetes as well. Now we'll try containerd. Yeah. So it's like how docker exec or nerdctl exec work: you see the target container file system. You don't see the docker-entrypoint shell script, because it's a Chainguard nginx image and they don't have it, but all the nginx stuff is there. Can you switch to the other thing? Okay, before I do that, I'll just do nerdctl ps. Yes. So right there, there's an image running, and we'll do the same thing, specifying the runtime. Okay. And we should get the same target completion. Yep. We see the nginx right there. So here we see the docker-entrypoint script. We do which nginx, which should be good.
So we can exit, and we can switch back and show the version without the target container experience, if we want; we have a couple of minutes. Yeah. So it defaults to the Docker runtime. And this parameter, we set it to false. So now, if you see, this is the root of the debugger container itself, and if I do which nginx, it doesn't work. So we can simulate that experience by setting run-as-target-shell to false. Yeah. That's what you get when you use the regular kubectl debug or docker run and other similar tools. Well, that's it. We'll point to the last slide. There are a couple of screenshots there that show the same thing: how you can disable the target shell flag and get the plain debugging image experience, and then the information about the images. And the next one, if you scroll to that... yeah, that custom debug image Chainguard created was awesome. So that's pretty much it on debugging. Yeah.