All right, thank you very much. As mentioned, my name is Derek Parker. First I'd like to thank the organizers for letting me present this talk, and everybody in the audience, in person and virtual, for checking it out. It's something I'm very excited about. The title of this talk is "Debuggers and eBPF: Bringing Debugging to Production." As you can probably tell from the title, we're getting into something a little different from most of the talks so far around networking and security. The bulk of this talk centers on my personal experience rewriting the Delve debugger's tracing backend using eBPF. If you're not familiar, Delve is a debugger for the Go programming language, and it has a trace subcommand. If you think of something like strace, it's somewhat similar, except in this case we're talking about user-space tracing. If you want to spy on your programs and see what they're doing in real time, without dropping into a full interactive debug session, this is a way to do that. I say "bringing debugging to production" because typically when you think about debuggers, you think of a slow, methodical back-and-forth conversation with your program: you ask it a bunch of questions, you expect some responses, and hopefully you figure out what's going on. What I want to talk about today is more about spying on your program — seeing what it's doing under the hood from a user-space perspective, at the level of the actual functions you've written — and doing that in a way that's performant enough that you might actually be able to spy on software running in production.
We've already done a little bit of introductions, but again, my name is Derek Parker. I'm a senior software engineer at Red Hat, where I work on upstream Go, and I work on Delve as well. Since this is my experience with the Delve debugger, some of this might be a little Go-centric, but the bulk of it is about how the tool uses eBPF. So there's going to be a lot of talk about uprobes, uretprobes, all that fun stuff, and especially how we coordinate and communicate between a user-space program, i.e. the debugger, and the eBPF program running in kernel space. First off: why? That's the essential question you have to ask with any of this. I mentioned that I'm rewriting the implementation, so the implementation is already there. Why am I spending time on this? We have ptrace. If you're not familiar with ptrace, I'll explain in a bit, but we already have an existing solution — shouldn't that be enough? Let's take a little digression. If you're not familiar with ptrace, it's short for "process trace." It's a feature of a lot of Unix-like systems, with parallels on non-Unix systems as well. It provides a means for one user-space program to take over, inspect, and control another user-space program; it's basically how debuggers work under the hood. However, there's a problem: ptrace is pretty slow. I call out ptrace specifically in this talk, but what it really comes down to at the end of the day is that syscalls are slow. I'll dig into why that's so problematic for us, but ptrace, and syscalls in general, are very slow.
When I was initially starting on this implementation, I did a little testing and measuring: I wrote a small toy example program, measured how long it took to execute, and then measured how long it took with different tracing implementations on top of it. As you can see, the program execution by itself is around 23 microseconds — microseconds, not milliseconds, so extremely quick. With the eBPF-based tracer, the time balloons a decent amount: we go from 23 to about 683 microseconds. But remember, we're still talking microseconds; that's a decent amount of overhead, but nothing too crazy. Then we look at the traditional approach of ptrace-based tracing, and we go all the way up to 2.3 seconds. For a program that executes in that small an amount of time, adding several orders of magnitude of overhead would never be viable in production. So that's the basis for this work: how can we make this kind of user-space tracing interactive enough to ask these questions of a process that's potentially running in production, in a Kubernetes cluster or anywhere else? Let's talk about why ptrace is so slow. Essentially, it comes down to syscall overhead: the context switching between user space and the kernel gets very expensive, especially when you have to do those operations multiple times per trace. Say, for example, you're tracing the entry point and the exit point of a function. When you hit the entry point, you want to get the arguments to the function, and that could potentially be a separate ptrace call for every argument. And if you want to follow pointers, now you have even more ptrace calls and you're inspecting more things.
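Putting those measurements side by side, the relative overhead factors work out to roughly:

```latex
\underbrace{\frac{683\,\mu\mathrm{s}}{23\,\mu\mathrm{s}} \approx 30\times}_{\text{eBPF-based tracer}}
\qquad
\underbrace{\frac{2.3\,\mathrm{s}}{23\,\mu\mathrm{s}} = \frac{2{,}300{,}000\,\mu\mathrm{s}}{23\,\mu\mathrm{s}} = 100{,}000\times}_{\text{ptrace-based tracer}}
```

So "several orders of magnitude" here means the ptrace approach is on the order of a hundred thousand times slower than the untraced program, versus roughly thirty times for the eBPF tracer.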
So this can balloon out of control really quickly and add a lot of overhead. Additionally, if you want entries and exits, that's two stops — you're stopping where the function starts and where it returns — and you're potentially doing multiple ptrace operations within each. Overall, there's just a lot of overhead. Our solution is: let's use eBPF. Why do we have to do any of this context switching? We can do better; the technology is there. eBPF turns out to be very fast. Looking back at the benchmarks, there is still some overhead, but compared to the other solutions we have right now, it's pretty negligible. Let's dig into why eBPF is so fast. First, it runs in the kernel, so there's no context switching; we get rid of that overhead right out of the gate. On top of that, eBPF programs are typically small, targeted programs, so they execute very quickly — we're not running huge loops or unconstrained behavior. And that single program, in a single stop, can gather all the data we need and send it back to user space, without multiple rounds of context switching. Now let's talk about the requirements for our tracing backend, because in my opinion this is where it gets most interesting. One requirement was that it has to maintain parity with the existing tracing implementation, so we need to be able to trace arbitrary functions. That's very interesting to me, because most eBPF use cases are small and targeted: you already know which syscall you're attaching to, or which thing you're inspecting, so there isn't a lot of guesswork — you can build a lot of the logic directly into your program.
For Go programs, in the context of Delve, we need to be able to retrieve the goroutine ID, so we need to know where to find it. We need to be able to read function input arguments, and we need to be able to read function return values. So let's talk about tracing arbitrary functions. To make this work we use libbpf and libbpf-go — we've heard about libbpf already, and for anybody experimenting with eBPF in Go, there are several frameworks, but we decided to go the route of libbpf-go, and it's worked out very well for us so far. We load the eBPF program, which is embedded in the Delve binary — I'll talk about that a bit more, because it has some interesting side effects — and we attach uprobes and uretprobes for each symbol. This all seems pretty standard, but in the context of Go, I'll explain how uretprobes can be particularly tricky. So first, we embed the eBPF object in the Delve binary. This is something I think is really cool — it's a feature of the Go programming language (go:embed) — and it lets us keep shipping Delve as a single binary, with no dependencies on disk and no need to figure out where the eBPF program file lives. From there it's pretty standard stuff: we load the eBPF program into the kernel and hold on to some references to it. Now we have our eBPF program loaded into the kernel — how do we interact with it? I want to get into some low-level implementation details, because that's what I think is most exciting in these kinds of talks.
To communicate back and forth between the debugger and the eBPF program, we use some pretty standard eBPF machinery — ring buffers and maps — in what I think are somewhat creative ways. We use a ring buffer to communicate from eBPF land back up to the debugger in user space, and we use a map to communicate vital information from the debugger down to the eBPF program. One of the things we communicate from user space to the eBPF program is everything it needs to know to find a function's arguments: how many arguments does this function have, and where do they live? Speaking specifically about Go: you may or may not be aware that Go recently changed its ABI from a stack-based calling convention to a register-based calling convention. Within Delve we have to support both, so we have to know where to find arguments on the stack and where to find them in registers; and if we have pointers, we need to know that they're pointers in the first place and how to follow them to get the data. We try to convey as much of this information as possible from the debugger, and we store it in the map keyed by instruction address. When the eBPF program is hit, it can take the current instruction pointer value, look up in the map the information it needs to decode everything, and then run and do its thing. As you can see here, we put a lot of information in, like the goid offset — where the goid field lives within the goroutine struct — and the g address offset, the offset of the goroutine struct from thread-local storage.
It's a lot of low-level information, but it's information the debugger already has, so instead of reimplementing a DWARF parser in eBPF or something like that, we provide as much context as we can from the user-space side, ahead of time, so that by the time the eBPF program is actually triggered, it has everything it needs to read this information as quickly as possible and send it back to user space. Again, more information: for each function parameter, input or output, we have a ton of detail — what kind of variable is this, what's its size, what's its offset from the stack pointer if this is the stack-based ABI, or which register it's in — plus the information we want to convey back to user space, like the actual raw bytes of these variables. We pass all of that around through various structs, using ring buffers and maps. So now that we have all of our information set up, from user space and in the eBPF program, let's talk about how we kick all of this off and start triggering these events. From the Delve side of things, we attach uprobes and uretprobes. If you're not familiar with these: through eBPF you can attach user-space probes — typically a uprobe at the function entry point — and there are also uretprobes, which trigger whenever a function returns. So here we update our map, which is what I was describing: we pass in all the information from user space — how many arguments, where they're located, all that — keyed by a memory address.
We key by the function entry address, which is where the eBPF uprobe is going to fire, and pass in all the information it needs — this is us updating the eBPF map from Go. From there, we get the offset of the symbol we want to probe, and we attach our uprobe and our uretprobe. Now, uretprobes specifically get a little tricky with Go programs. The way uretprobes actually work is that they modify information on the stack: they change where the function returns to, so that it actually returns into a kind of trampoline that ends up executing the eBPF program. This has a tendency to make Go very upset, because Go likes to look in the mirror a lot — it likes to inspect itself. If you're not familiar with Go, it has this concept of goroutines, very small units of execution that start with really small stacks, and those stacks grow and get copied over time. When Go does this, it needs to inspect the stack, look at pointers, and update a bunch of things, and if during that stack inspection it sees an address it's not familiar with, it's going to blow up. So we have to be really careful with how we use uretprobes, so that tracing your program doesn't just make it start panicking — because what's the point at that point, right? To handle this, we set a real breakpoint — a ptrace-style breakpoint — on the runtime function that handles this stack copying. So right when Go is about to copy the stack, we remove the uretprobes, and when it's done, we put everything back. It's kind of a weird little hack, but it's mostly working for us right now.
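A side note on "getting the offset of the symbol": attaching a uprobe takes a file offset within the binary rather than a virtual address, and the usual ELF arithmetic for that conversion can be sketched as follows. The helper name and the numbers are hypothetical, not Delve's code; the symbol address would come from the debugger's symbol table and the section address/offset from the ELF section header containing the symbol (typically .text).

```go
package main

import "fmt"

// uprobeOffset converts a symbol's virtual address into the file offset
// that uprobe attachment expects: translate the address into the
// containing section, then into the section's position in the file.
func uprobeOffset(symAddr, sectAddr, sectOff uint64) uint64 {
	return symAddr - sectAddr + sectOff
}

func main() {
	// Hypothetical layout: main.main at virtual address 0x4553c0,
	// .text mapped at 0x401000 and stored at file offset 0x1000.
	fmt.Printf("%#x\n", uprobeOffset(0x4553c0, 0x401000, 0x1000)) // → 0x553c0
}
```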
Let's talk about getting data back from the eBPF program. libbpf-go has a really nice interface where, if you're communicating over a ring buffer, on the Go side you can consume it via a channel. Within Delve we have a long-running goroutine that receives this information from the eBPF program and parses it. The nice thing about this approach is that it can do that parsing mostly at its leisure, because the hard work is already done. What we really want is to prevent the traced program from stopping for too long — to minimize the overhead — so in eBPF land we just need to gather all the data as quickly as possible and shoot it over to user space. Once it's in user space, we can parse it and present it to the user; not slowly, but under much looser time constraints, because that work doesn't affect the program being traced. So this interface is really nice, and it's been working out great for us. Now, there are a lot of upsides to this approach and this rewrite — it's been a lot of fun and very exciting — but as with everything, there are some downsides. The first, in my opinion: if you've written eBPF programs you know this, and if you haven't — you mostly write them in a constrained version of C. By constrained I mean you can't loop (at least not arbitrarily), and you have very strict limits on how much stack memory you can use. There's no concept of heap allocation, so you have to make your own heap using a map, or a ring buffer that you manage yourself.
There's a lot of nonstandard stuff you don't have to think about when you're writing regular Go, or even normal C. That cognitive overhead — I understand why it needs to be that way, but in my opinion it's a bit of a downside; there's more you have to think about, and you have to get really creative with certain things. For example, take the inability to write loops in an eBPF program: for this particular implementation, we need to be able to say "I want to parse three input arguments," or four, or five. How do you do that if you can't loop over an arbitrary count? In our case, we took advantage of C-style switch statements and automatic fall-through to make weird pseudo-loops. So there are workarounds, but it's something people should be aware of. Another thing is fighting the verifier — it's like fighting a whole other compiler. If you're not familiar: when you load an eBPF program into the kernel, it goes through a verifier, which makes sure the program will execute the way the kernel expects — that it won't do anything dangerous, that it isn't looping unboundedly, and that it will actually exit deterministically. I mentioned this already, but the small stack limit can be a hindrance and something you may have to work around a lot. Again: no loops, limited control flow — it forces you to be very creative. And as I mentioned, uretprobes do not play well with Go programs by default; if you're not really careful with how you use them, you're pretty much guaranteed to crash any Go program you use them with.
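In Delve this trick lives in restricted C, where switch cases fall through automatically; as a sketch of the same shape (Go spells the fall-through out with the `fallthrough` keyword, and the helper here is a stand-in for the real per-argument read), the bounded pseudo-loop looks roughly like this:

```go
package main

import "fmt"

// readArg stands in for the per-argument work the eBPF program does:
// reading Size bytes from a stack offset or a register into the event.
func readArg(i int, parsed *[]int) { *parsed = append(*parsed, i) }

// parseArgs mimics the verifier-friendly pseudo-loop: instead of a loop
// whose bound the verifier can't prove, the body is unrolled a fixed
// number of times, and the switch falls through to run exactly n of them.
func parseArgs(n int) []int {
	var parsed []int
	switch n {
	case 3:
		readArg(3, &parsed)
		fallthrough
	case 2:
		readArg(2, &parsed)
		fallthrough
	case 1:
		readArg(1, &parsed)
	}
	return parsed
}

func main() {
	// Two arguments requested: two unrolled bodies execute, no loop needed.
	fmt.Println(parseArgs(2)) // → [2 1]
}
```

The fixed case count is the point: the verifier sees straight-line code with a known maximum number of reads, which is exactly the property an unbounded loop would deny it.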
And that's it — thank you very much. Again, my name is Derek Parker; you can find me on Twitter at @derkthedaring, and I'll be around if anybody has any questions or comments. Thank you all very much. I'll repeat the question as I understand it: are these probes handled on a per-goroutine basis, or are they global? In the strictest sense they are global, but one of the things we do on the debugger side is figure out how to correlate all this information and present it linearly. That's why it's important — as I showed in some of the code examples — that we parse the goroutine struct and get the goid: we take that information back to user space to present a cohesive story of when a function is hit and when it returns, making sure the input and return values are all associated with the same context of execution, i.e. the same goroutine. So to answer the question: in general, yes, they're global, but we do a little work on top in the debugger to stitch those things together. That's a great question — the question was: are there any limitations on when this particular backend implementation can be used? Yes. Right now you have to be a privileged user to load the eBPF program into the kernel, or have CAP_SYS_ADMIN — I think if you have CAP_SYS_ADMIN you can do whatever you want — but that's another workaround. And this implementation is still ongoing, something I'm currently working on, so right now there's a Makefile entry, something like build-bpf, where you build a slightly different version of Delve.
So you have to do that first, and then also be a privileged user to do the rest. Great question — I'll repeat it: since we're talking about this in the context of production, a lot of people strip debug information out of their production binaries to make them even smaller. So yes, for this approach, that would be a huge hindrance and limitation. A workaround is that Delve does allow you to supply external debug information: if the binary is stripped but you still have the debug info around somewhere, you can provide it after the fact, and Delve can use that information just as it would if it were present in the binary.