All right, good morning. Everybody can see everything and you can hear me okay? Good. My name is Liz Rice, and I work for a company called Aqua Security. We help enterprises secure their containerized deployments.

So you might think it's quite a long leap from thinking about container security to thinking about debuggers. This all came about because I did a talk about how to build containers, and you use system calls for that. And I felt that I needed to understand a bit more about system calls, so I did a talk about what system calls are and how you can trace them using ptrace. And I came across this little paragraph from the man page, and it says ptrace is this very powerful system call that you can use to observe and control the execution of another process. I used it in this other talk for system call tracing, but you can also use it for breakpoint debugging, and I thought, yeah, I should try that. I should figure out how to use that to build a debugger. So that is what we're going to do this morning. We're going to build a debugger.

So before we can do that, let's just talk a little bit about what's happening when we run an executable. Oh, I should talk a little bit about ptrace in the Go code first. So I'm a Go programmer. Anybody else write in Go here? Not that many hands. It's okay. You people will be my peer reviewers when things go wrong. For everybody else, I'm sure you'll be able to follow along.

The syscall library, or package rather, in Go gives us a whole set of ptrace functions. They map down to this single ptrace system call, but actually there's a whole load of sub-commands, and this gives us a pretty good idea of the kind of sub-commands you get from ptrace. So for example, we can see things like getting registers. We can see setting registers. We can see things like single step, stepping from one instruction to the next. And if you're old enough, like me, to remember things like the Commodore 64, I remember peeking and poking data.
I know that that is to do with reading data from memory and writing it back. So that gives me a pretty good idea of the kind of things I can do with ptrace.

So before we start looking at writing a debugger, let's just talk a little bit about how executables work. I'm sure lots of you know this, but let's make sure we're all on the same page. Go is one of those languages that is compiled, so when we compile our source code, every line in the source code maps to one or more machine code instructions. And when we run the program, we have a CPU register called the program counter that's pointing at the next instruction we're going to execute. So as we go through the code, the program counter gets incremented to step through the code.

Now suppose we want to set breakpoints in our debugger. How do we do that? We do that by overwriting the instruction at the place where we want to stop. We write this byte, hex CC. And that tells the CPU, as it's stepping through the code, that if it hits that byte it should trigger an interrupt and stop execution.

Okay, I think that gives us enough to start building our debugger. Oh, one last thing, though. We need to know where in memory to set that breakpoint. Because as humans, we look at the source code, we want to be able to stop at a particular line in the source code, and we need to know where that maps to in the machine code in memory. So we need a way of mapping between addresses in machine code and their corresponding source file and line number.

Right, so this is not the best Go code you're ever going to see written. I have a lot of global variables just for convenience, so forgive me for that. This is going to be my debugger, and I also have a little executable called hello, which we'll look at in a moment. So that's going to be my target that I'm going to debug. Now, Go gives me a way of extracting information about that mapping between machine code instructions and source lines, in a symbol table.
How we actually go about extracting that is not terribly interesting, so I have written myself a little convenience function to do that. The code for this is on GitHub, and I'll give you the link later so you can see how I did that. The detail of it is not super interesting, but we could just go into my... Let's check where it is. So my hello target is in a directory called hello, and I could use a little tool. I hope I've got readelf on here. Yes, I do. So ELF stands for Executable and Linkable Format. Let's look at... So this is telling us something about that executable file. We can see that it is an executable file. We get some program and section headers, and the one I'm particularly interested in is this one here called .gopclntab, the Go PC line table. So that's written by the Go toolchain into my executable, and that's what my little function getSymbolTable reads out. We don't need to go into the details of what that is, but that information is built into the executable.

Okay, so having got that symbol table, I can do interesting things with it, like I can look up a function. Now, all Go executables have a main.main, so I'm pretty confident I can find one of those, and I'm going to get back a function structure, a structure describing that function, and I can print out some information about that. So: the function name starts at a particular address. We get the function name, and we can get Entry, which is the address in memory of the first machine code instruction in that function.

We can also map from a program counter (remember, that CPU register is called the program counter) to the corresponding line in the source code, using this symbol table. So if I start with that first address of that function that we just extracted, I can get back a file, a line, and a function structure, and I could print that out. So function whatever at line whatever in file whatever. So that's going to be the function name, the line, and the file.
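For anyone following along, the lookups described so far might be sketched roughly as below. This is not the code from the talk: the body of getSymbolTable is a guess at what such a helper could do with the standard debug/elf and debug/gosym packages, the path hello/hello is hypothetical, and it assumes a Linux ELF binary built by a Go version that debug/gosym understands.

```go
package main

import (
	"debug/elf"
	"debug/gosym"
	"fmt"
	"os"
)

// getSymbolTable builds a gosym.Table from the .gopclntab section of a Go
// ELF binary. It is a guess at the helper mentioned in the talk, not a copy.
func getSymbolTable(path string) (*gosym.Table, error) {
	exe, err := elf.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", path, err)
	}
	defer exe.Close()

	text := exe.Section(".text")
	pclntab := exe.Section(".gopclntab")
	if text == nil || pclntab == nil {
		return nil, fmt.Errorf("%s: missing .text or .gopclntab section", path)
	}
	lineTableData, err := pclntab.Data()
	if err != nil {
		return nil, fmt.Errorf("read .gopclntab: %w", err)
	}

	// No separate symbol table is passed here; for modern Go binaries the
	// function list is taken from the line table itself.
	lineTable := gosym.NewLineTable(lineTableData, text.Addr)
	return gosym.NewTable(nil, lineTable)
}

func main() {
	symTable, err := getSymbolTable("hello/hello") // hypothetical target path
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not load symbol table:", err)
		return
	}

	fn := symTable.LookupFunc("main.main")
	if fn == nil {
		fmt.Fprintln(os.Stderr, "main.main not found")
		return
	}
	fmt.Printf("function %s starts at %#x\n", fn.Name, fn.Entry)

	// Map the entry address back to a source location.
	file, line, f := symTable.PCToLine(fn.Entry)
	if f != nil {
		fmt.Printf("function %s at line %d in file %s\n", f.Name, line, file)
	}
}
```

Returning an error rather than exiting inside the helper is a choice made for this sketch, so that a missing or non-Go binary is reported in one place.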
And finally, we also have the opportunity to go in the other direction, so we can go from a line to a program counter. So we can take a file and a line number, and we get back a program counter. What else do we get? Let's check. Yeah, we get back a program counter, a function, and an error that I'm going to ignore, and we'll print that out as well. Now let's choose a line that we're going to look up. Here's my source code. I'm going to pick line 22. So let's go back to my debugger and say, let's find out what function we're in if we were at line 22 in that source code file.

So let's try that. We'll run this. So we were able to see that the machine code address of the first line of, or rather the first instruction in, main.main is whatever. But then we could map from that to see that main starts at line 5. So let's just check that that's true. There it is. There's line 5, and main starts there. Great. And we also saw that if we were at line 22, we would be inside a function called f3. And that is also true. Here's our function f3. So we've got that mapping between the machine code instruction and the location in source.

So now I'm going to get my debugger to run this executable. So I can do that with... We build up a structure. I need to say that I want to map stdin, stdout and stderr from the OS stdin, stdout and stderr, because otherwise we can't see what's going on. So this is where I get to do multi-cursors. And then I can start that executable running. So I'm setting up a structure that describes the thing that I want to run. And then when I call this Start function, that actually creates a process for running my target executable. And then I'm going to wait for something to happen. And if that gives me an error of any sort, I will print out what it gives me. So: "Wait returned" something. Okay. And I'm also just going to put in a little line here to say that the debugger finished when it completes. Right.
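The start-and-wait step just described could look something like the following sketch. The helper name runTarget and the path hello/hello are made up for illustration; only os/exec calls from the standard library are used, and ptrace is not involved yet.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runTarget starts the executable at path, wired to the debugger's own
// stdin, stdout and stderr, and blocks until Wait returns.
func runTarget(path string) error {
	cmd := exec.Command(path)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	// Start creates the child process; Wait blocks until it reports back.
	if err := cmd.Start(); err != nil {
		return fmt.Errorf("start %s: %w", path, err)
	}
	return cmd.Wait()
}

func main() {
	if err := runTarget("hello/hello"); err != nil { // hypothetical target path
		fmt.Printf("Wait returned: %v\n", err)
	}
	fmt.Println("Debugger finished")
}
```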
I'm going to just comment out the output we put in before, because it might just be kind of getting in the way. So now, by running my debugger, it should run my target executable. And it does. So my little target executable just outputs a line and returns a value. And then we see that output line telling us the debugger has finished.

Right. Now I'm going to enable ptracing on this target executable, which I do by setting a SysProcAttr, the sys proc attributes. And I say, I would like some ptrace, please. So this is saying, when you fork this process for the target executable, I want to attach ptrace to it as well. And when we do that, we see that Wait returned. I mean, it's come back as an error, which is maybe a strong word. We've got a trace/breakpoint trap. That got output. And then the debugger completed. When the debugger finished, there was nothing holding the target at that trap anymore, and our target executable was allowed to continue. So when we get this stop signal, the target executable is just sort of held, waiting to be told what to do, because ptrace has stopped it. At the moment, that's just right at the start.

So now I want to let that executable continue, but we're going to set a breakpoint at a particular line. So we want it to continue up to that breakpoint. So what do we need to do? We need to set... We're going to use PtracePokeData. And we're going to write into the memory for this particular process. So we need a process ID. We're going to stop at the program counter address that corresponds to a line we want to stop at. Why don't we use this one we had earlier? So we'll stop at line 22. We've got a program counter that corresponds to that. And there's something else I need to do here. Oh, I need to actually write in the hex CC value. So I need to... There we go. So that writes that byte, CC, at the correct address in memory. I need that process ID. We get that from here. Okay.
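The breakpoint write can be illustrated without a real traced process. The sketch below simulates it: the memory interface and fakeMemory type are invented here as stand-ins, where on Linux the two methods would wrap syscall.PtracePeekData and syscall.PtracePokeData for a given process ID. Saving the original byte is an addition in this sketch so that the patch could later be undone.

```go
package main

import "fmt"

// int3 is the single-byte x86 breakpoint instruction (hex CC).
const int3 = 0xCC

// memory is the small slice of ptrace this sketch needs. It is a made-up
// interface; a real debugger would call the ptrace peek/poke functions.
type memory interface {
	Peek(addr uintptr, out []byte) error
	Poke(addr uintptr, data []byte) error
}

// fakeMemory stands in for the target's address space.
type fakeMemory map[uintptr]byte

func (m fakeMemory) Peek(addr uintptr, out []byte) error {
	for i := range out {
		b, ok := m[addr+uintptr(i)]
		if !ok {
			return fmt.Errorf("address %#x is not mapped", addr+uintptr(i))
		}
		out[i] = b
	}
	return nil
}

func (m fakeMemory) Poke(addr uintptr, data []byte) error {
	// Check the whole range first so a failed write changes nothing.
	for i := range data {
		if _, ok := m[addr+uintptr(i)]; !ok {
			return fmt.Errorf("address %#x is not mapped", addr+uintptr(i))
		}
	}
	for i, b := range data {
		m[addr+uintptr(i)] = b
	}
	return nil
}

// setBreakpoint saves the byte at addr, overwrites it with int3, and returns
// the saved byte.
func setBreakpoint(mem memory, addr uintptr) (byte, error) {
	original := make([]byte, 1)
	if err := mem.Peek(addr, original); err != nil {
		return 0, err
	}
	if err := mem.Poke(addr, []byte{int3}); err != nil {
		return 0, err
	}
	return original[0], nil
}

func main() {
	// Three arbitrary bytes pretending to be machine code.
	mem := fakeMemory{0x1000: 0x48, 0x1001: 0x83, 0x1002: 0xEC}
	original, err := setBreakpoint(mem, 0x1001)
	if err != nil {
		fmt.Println("could not set breakpoint:", err)
		return
	}
	fmt.Printf("saved %#x, memory now holds %#x\n", original, mem[0x1001])
}
```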
Having put the breakpoint in, I now want to let the program continue. So I can do that with PtraceCont. That allows that particular process to continue until any signal is received. Then we have to wait for something interesting to happen on that process. Nil. So hopefully what's going to happen there is we're going to get another interrupt, another breakpoint trap. And at that point, we should find out something about what's happening.

Let's get the state of the registers. PtraceGetRegs. So that's the process ID, and we'll read them into a registers structure. We could look at the program counter. Now, confusingly, the program counter is also called the instruction pointer. Those are kind of the same thing. And I could print out the value of the instruction pointer. I'm calling it the instruction pointer because in the regs structure it's called Rip, the RIP register. So that's going to tell us the address where we stopped. And we should also convert that from a program counter to a line number so that we can see in human-readable form where we are. So I am going to convert the current instruction pointer to a file, a line and a function. I feel like the sound's cutting in and out.

Right, let's see what happens. Okay. So we stopped, and we got this output that told us that we actually stopped, well, exactly where we wanted to, at line 22, which was inside that function f3. We also got the output that told us what the address in memory was. So that's kind of taken us to the point of hitting a breakpoint in a debugger. But we would like to see some interesting information about the state of the target executable at this point; just knowing where we've stopped doesn't tell us anything new. So what I want to do is output the stack frame, or the stack trace rather.

What's the stack trace? Let's have a look. So we need to talk about a couple more registers in the CPU for this. There is a stack pointer that points to some part of memory and a base pointer.
And between those two addresses is the current stack frame. And that's like a kind of scratch pad for whatever function is currently operating right now. So we might have things like parameters, or space for return values, any local variables. They all get allocated space in this stack frame.

Then when we call another function, interesting things happen. The current program counter gets pushed onto the stack, and that tells us where we're going to come back to when we return from the function we're about to call. The base pointer moves to where the current stack pointer is pointing to. And then we move the stack pointer to allocate a new frame on the stack. So that's a new frame for the new function we're about to call. But because we put the old base pointer on the top of the stack, we can chain through, pointing back to the previous stack frame. And we can use that to look at the chain of functions, the hierarchy of functions that we've called to get to this point. And we can also look at the contents of that stack frame to look at things like the address that we are about to return to when we exit a particular function.

Now, writing all this stuff out would take me a little bit too long to do here and now in this talk. So I have cheated a little bit, and I've got a thing called outputStack. Again, you can see the source code for this later. And this takes my symbol table, my process ID (I've spelled that wrong), and a few registers. So we need the instruction pointer, the stack pointer, and the base pointer.

Right. So now, we not only stop at line 22, but we get this output of what's on the stack. And I've looked at things like the return address; you know how we saw that the current program counter address gets pushed onto the stack. So I've used that to see how that kind of function hierarchy got put together. And we can also see some interesting things like all the threes, all the ones, all the twos.
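The chaining idea can be shown on made-up data. The sketch below assumes the conventional x86-64 frame-pointer layout (the saved base pointer at the address the base pointer holds, with the return address in the next 8-byte slot above it); it does not read a real process and is not how outputStack is written. The addresses and the memory type are invented for illustration.

```go
package main

import "fmt"

// memory maps an address to the 8-byte word stored there. It stands in for
// reading the target's stack with ptrace.
type memory map[uint64]uint64

// returnAddresses follows the chain of saved base pointers starting at bp and
// collects the return address stored next to each one. It stops at a zero
// base pointer, an unmapped address, or after maxFrames frames.
func returnAddresses(mem memory, bp uint64, maxFrames int) []uint64 {
	var addrs []uint64
	for i := 0; i < maxFrames && bp != 0; i++ {
		ret, ok := mem[bp+8]
		if !ok {
			break
		}
		addrs = append(addrs, ret)
		bp = mem[bp] // step back to the caller's frame
	}
	return addrs
}

func main() {
	// A pretend stack for main -> f1 -> f2 -> f3. The stack grows downwards,
	// so callers live at higher addresses.
	mem := memory{
		0x7000: 0x7020, 0x7008: 0x401300, // f3's frame: saved bp, return into f2
		0x7020: 0x7040, 0x7028: 0x401200, // f2's frame: saved bp, return into f1
		0x7040: 0x0000, 0x7048: 0x401100, // f1's frame: end of chain, return into main
	}
	for i, ret := range returnAddresses(mem, 0x7000, 16) {
		fmt.Printf("frame %d returns to %#x\n", i, ret)
	}
}
```

In a real debugger each return address could then be fed to the symbol table to print a function name and line.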
If I look at my target executable, you can kind of see where that's come from. I've sort of deliberately put in values that we can easily identify. So those are the local variables and the return values. We can see those on the stack. So that's kind of cool.

But when you're in the debugger, you can step through, right? You don't just want to stop in one place. You want to be able to allow the program to carry on and look at the trace at another point in time. So we can do that pretty easily. Let's just... Now, rather than hard-coding this, I've got a little function that will let me enter whatever line number I want to stop at. And we'll put this into a loop. Now, let's just have a quick look at that. So let's say I want to stop at line 22. We get that stack trace. And now I can say what line I want to stop at next. Maybe I'll say, well, it came from line 16, so let's go to line 17. And before, we were inside function f3. I started late, so, yeah. Where are we? Right. We've moved to the next function. So we're inside f2, where before we were inside f3. So we can stop at any line we like.

But it would be really nice if we could single step. So we're going to work based on the line number. If I enter line number zero, I want to just kind of run to completion. And we can do that by just calling continue. And then I want to break out of this loop. And just because of the way it goes, I actually need to put a label in this particular case. Ask me about that afterwards. If I don't enter anything at all, let's do a single step. So we'll step through an individual instruction. And that is pretty easy: syscall.PtraceSingleStep for that process ID, and zero. And for any other line number, we will do what we did before.

If you were really eagle-eyed, you might have noticed that we overwrote some, you know, we stamped hex CC in memory here. So it might be nice to put back what was there before. So we can do that.
What we're going to do is read out the previous value. And we need something to write that into, which is a make of a one-byte slice. Right, so that's just somewhere we can write the original value. Oh, thank you. And we're going to poke that back in when we have reached the breakpoint. Okay, that shouldn't make any great difference other than it's restored our program. It should have continued. Okay, what have I missed? This looks reasonably okay. I'm just going to cheat and check whether I've missed anything. Ah, I need to wait for, yeah. So I do need to, okay.

So now I can either enter a line number or I can just hit return and it should single step. So let's go to line 22. And now just single step, and you can see the values of the local variables and things changing as we go through the program. We can see that hex 777 that we finally get, kind of propagating through the program. So if I run to completion, we see all the sevens.

So one last thing. I know I'm a bit late, but I've only been going for 25 minutes. So, right. The last thing we can do with ptrace that's really kind of fun is we can change the tracee's, the target's, memory and registers. So that seems like a cool thing to do. Now, this was the machine code that we looked at before, and it corresponded to the function f3. And if we look at this, you don't need to understand a lot of machine code to see, ah, here's this hex all-the-fours, and that gets added into, well, AX, which I happen to know is the name of a register. So it seems pretty plausible that AX is going to contain a value that is kind of important, you know, maybe it's our return value, that we could change. So let's just confirm my hypothesis by printing out that AX value. regs.Rax. So, if I go to line 22, AX currently contains all the threes.
And if I just single step through, eventually we will get to the point where, I hope... eventually... why doesn't it get to 23? I'm just going to do that. Oh, yes, it needs to be there, doesn't it? Ah, what have I done? B, there we go. I think that's it. Okay, so I'm going to go straight to line 23. AX contains all the threes, and if we step through, eventually we see it's getting updated to all the sevens.

So I'm going to say, if registers AX is all the sevens, 1, 2, 3, 4, 1, 2, 3, 4, and if we are on that line 23, let's set the registers. What are we going to set them to? We are going to set AX to, let's say, all the eights, 1, 2, 3, 4, 1, 2, 3, 4. We set the registers for that process ID to whatever we've currently got in the registers, and just so we can see it's happening, let's print "overwriting". Very important. Okay, right. So, let's go to line 23. No, let's go to line 22. And we will single step through, and we will see that AX register change. Well, it's all the sevens. It's just saying we are overwriting it. Now it's all the eights. We have been able to modify the value of this return value. And if I let it run to completion, we have changed the output value.

So, ptrace is super powerful. We can manipulate what's happening inside our target executable. Which ties us neatly back to container security. There is really very, very little reason for any containers running in production to have the capability to use ptrace. Really, just by default, if you are using Docker it will be off. Hands up if you are using Kubernetes. Right, you may or may not know this. Kubernetes does not, by default, set the same seccomp profile as Docker does. So you need to explicitly choose to use that seccomp profile and thereby disable things like ptrace. Ask me about that afterwards.

So, I promised you a link to the code so you can check out all the bits that I didn't have time to type in this morning. It's there on GitHub, under debugger-from-scratch.
I wouldn't have been able to do this talk without some blog posts by Michał Łowicki and Phil Pearl. So, hopefully that's told you a little bit about how a debugger works, and I will be around, so ask me any questions afterwards. Thank you.