We have a speaker on debugging a Python process with GDB.

Hello. Can everybody hear me OK? All right. Thank you to all of the FOSDEM organizers for helping to put this together. My name is Brian Bouterse, and I'm a software engineer at Red Hat. I'm going to share with you what I learned the hard way, so that when you end up in the situation I was in, you have some tools at your service. We had some trouble with the slides, so we're doing it like this.

I've been a Python user since 2005. I really love open source, and I'm sure you do too. I work on a project called Pulp. There are two Pulp projects, and the other one is giving a talk in here later today, but we're the one at pulpproject.org. And this is a Pulp shirt. But enough about me.

So why would you use GDB to debug Python software? The preferred tools, of course, are pdb or rpdb. Those are great tools, and you should definitely be using them. So why would you ever need GDB? Well, things happen. Maybe you've written something that works great on your machine or in your test environment, you deploy it to production, and it just doesn't work. Or perhaps there's a bug that occurs only rarely and you can't reproduce it on your system, but someone contacts you and says, "I have a system where it's occurring right now. How can you help me?" That's challenging. One, it's running in production, so you can't just start fooling around with it. Two, sometimes you can't get access to their system at all, because of network restrictions or firewalls, and debugging in that situation is practically very difficult.

So those are some reasons why you would use GDB to debug Python software. In my particular case, it was the last one. It was a deadlocking problem, which was not occurring in the code that we wrote, but in one of the libraries that we use.
So it was very difficult to find. But we did find it, using the same techniques that I'll share here.

The conceptual model is that we're going to use GDB, the GNU debugger, and connect to a running Python process, which is running in CPython. I actually don't have a lot of experience with GDB, but it's a very powerful tool for debugging compiled code of many different types, and C or C++ code is just one of those types. CPython, one of the interpreters for Python, is written in C. So we're going to use GDB to connect to CPython and inspect the state of the interpreter itself, and use that information to answer a lot of questions. What is this Python process doing? Is it doing anything? Is it in the part of the code that I expect it to be in? All those kinds of things.

Here's a really simple example. It's so simple it's trivial: it calls into a function that makes a call to sleep, sleeps for 30 seconds, and then exits.

When you connect with GDB, there are some basics. Use the man pages, look on the internet. All of this is very well documented, and there's a link to my slides at the end. You can start GDB with the path to your program and then the PID. The second option here is gdb -p, which says: connect to the running process with process ID 1234, or whatever the PID is. Then it's simple stuff: c to continue, Control-C to stop execution, and Control-D to detach, which lets your program continue. Unfortunately, we're not using my laptop, we're using a different laptop, so there won't be any demos. But it's trivial. It's really not that hard.
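The slide with the example program is not captured in this transcript. A minimal sketch consistent with what the talk describes (module-level code calling foo, which calls bar, which calls time.sleep) might look like the following. The main function and the seconds parameter are assumptions added here so the sleep can be shortened; the speaker's actual source is linked from the slides.

```python
import os
import time


def bar(seconds):
    # GDB will almost always find the process parked inside this call.
    time.sleep(seconds)


def foo(seconds):
    bar(seconds)


def main(seconds=30):
    # Print the PID so it is easy to attach from another terminal.
    pid = os.getpid()
    print(f"PID {pid}: attach with  gdb -p {pid}")
    foo(seconds)
```

To try it, save the file, add a final line that calls main(), start it with python, and attach from a second terminal while it sleeps.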
In fact, you can take the source code from up here, or any trivial program with a sleep statement that runs long enough for you to attach GDB to it, and put it on your system. It's simple, and I encourage you all to try it. It's an easy way to get started, and that's basically what the demo would have been.

If you do this, the first thing you want to do is get a backtrace. That's the bt command in GDB. When you run it, you're going to see a stack trace starting with frame zero at the top and incrementing on the way down. Normally, when you're looking at pdb or rpdb, the current instruction pointer is at the bottom. In GDB it's reversed: the current instruction pointer, the deepest level in the stack, is at the top, and that's considered frame zero.

So this is what a single function call looks like in CPython. I don't know CPython deeply, but I've learned a lot about it just by inspecting the way the code is written. This is quite difficult to read, but there are some great tools that we'll look at in a minute that make it much more usable. This is a single function call in CPython as reported by GDB. You can see a fast function call on frame 8, which is calling a function in frame 9, and it's using PyEval_EvalFrameEx. So it's kind of complicated stuff, but it's a great way to learn more about what the interpreter is doing when it's interpreting my Python code. Because again, the conceptual model is that GDB is debugging the interpreter, and we're using that to reason about our Python code.

Like I said earlier, frame zero is the current instruction pointer. The backtrace follows the Python code all the way down to the interface between user space and kernel space, so frame zero is often a call into the kernel.
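To make the ordering difference concrete, here is a small pure-Python sketch. It only mimics the numbering: Python's traceback module lists the oldest call first and the innermost call last, while GDB's bt gives the innermost frame number zero and prints it first. Real bt output shows the interpreter's C frames, not these Python names; foo and bar are illustrative.

```python
import traceback


def bar():
    # extract_stack() returns frames oldest-first, so the innermost
    # call (this function) is the last entry in the list.
    return [frame.name for frame in traceback.extract_stack()]


def foo():
    return bar()


# Order as Python tools show it: most recent call last.
python_order = foo()

# Order as GDB's bt numbers it: frame 0 is the innermost call.
gdb_style = list(enumerate(reversed(python_order)))

for number, name in gdb_style:
    print(f"#{number}  {name}")
```

The first printed entry is bar and the second is foo, which is the reverse of what a Python traceback would show.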
Here, the example had that sleep statement for 30 seconds, and you'll see that it's actually calling into select, which is a kernel-provided mechanism for delaying efficiently for a certain amount of time. So it's a great way to learn how your Python code makes its way all the way down to the kernel.

So that was hard to read. The first time I saw that, I was like, whoa, I don't know how to read any of this stuff. Luckily, there are Python extensions for GDB that make this significantly easier. They were contributed several years ago by David Malcolm, and they're now part of Python itself. At the end, there's a link to the discussion in Python where they were contributed. These extensions teach GDB about Python and make the output significantly easier to read.

When you connect to a Python process, they give you commands like py-list. It's basically like the list command in pdb: it shows the current Python code as it's written in Python. It's taking a single function call that is three stack frames in GDB and turning it back into the original Python source code, which is pretty slick for a running system. py-bt is the current backtrace as Python, so it's going to look just like pdb in terms of the backtrace that it shows you. Then there are py-locals and py-print, and with py-up and py-down you can move up and down between the Python stack frames instead of the GDB stack frames. Because again, there are roughly three GDB stack frames for every Python function call. So this makes it significantly easier.

Here's an example of py-list. If you run that program (there are links to the source at the end), connect to it with GDB while it's in that sleep statement, and run py-list, you're going to see this.
This is incredibly useful, because now you can see the Python source code as if you were using pdb or rpdb, but it's actually being provided to you by GDB. You can see that line six, which is where I stopped it and where the program spends almost all of its time, has the instruction pointer marked with that greater-than symbol.

py-bt is a way to look at the backtrace. The backtrace would be a lot longer if you were using the plain bt command, where bt says, "GNU debugger, please show me the full stack frames in C." This tool takes those stack frames and compresses them together, and you can see it's kind of skipping three at a time. The Python line in bar, which is line six in my program, corresponds with stack frame four, and you can see the statement it's executing is time.sleep. Similarly, frame seven is in foo, making the call to the function bar. And frame 10 is the outermost module, which CPython is interpreting at the module level. So this is a great way to play around and look at backtraces of programs, especially when they're in a hung state and you're not sure what they're doing, and when stopping the program means you won't be able to reproduce the problem. That's especially the time when you want to use this.

Some simple stuff: these tools work well with threads, and these are just basic GDB thread commands. I came from the Python community, and Python was the first language I really got into. These are some things you'll use to look at threads: info threads shows you all the threads, and thread followed by a thread ID switches to that thread. The current one is marked with a star.
And if you want to dump output from all of the threads, there's the thread apply all command. At times I've written little one-liner scripts that I hand out to users, saying, "If you're in a bad situation, why don't you just run this?" It gets the list and backtrace information from all the threads in the running process. So that's very useful.

This is great for analyzing local processes, but in a way that somewhat defeats the purpose, because the situations where you really need this are the ones where the process is running remotely. There's a great tool called gcore, which takes a core dump. As a Python community we don't usually think much about core dumps, but if you can get one from a Python program, you can analyze it with GDB, which is super useful. This is actually how I did my debugging. There was a remote system that I couldn't access, with a process that was experiencing the problem right then. I would ask them to take a core dump of that PID, and they would send me the core file, which is quite large because it has all of the information about the process. If you can get hold of a core file from a user, you can analyze it deeply and see exactly what each and every thread is doing. So gcore coupled with this technique is extremely useful.

Those are all great techniques if you want to know what a program is doing, whether it's doing what you expect, or whether it's hung unexpectedly, which can occur for a variety of reasons. But if you just want to know whether it's doing anything at all, and perhaps what, consider using strace. strace is a very simple utility that shows you the calls occurring between user space and kernel space. If you just want to know what kernel calls are being made, strace is a great tool. So consider strace for your local processes.
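The commands from this stretch of the talk can be collected into a small helper. This is a sketch, not the speaker's actual one-liner: the function names and the default core prefix are hypothetical, while the command-line options used (gdb -p, -batch, -ex; gcore -o; strace -f -p) are standard ones. The py-bt and py-list commands only work where GDB has loaded the Python extensions.

```python
import shlex


def gdb_thread_dump(pid):
    # Attach, run the Python-aware commands in every thread, then exit.
    return [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "thread apply all py-bt",
        "-ex", "thread apply all py-list",
    ]


def gcore_dump(pid, prefix="core"):
    # Writes a core file named <prefix>.<pid> without killing the process.
    return ["gcore", "-o", prefix, str(pid)]


def strace_attach(pid):
    # -f also follows the other threads of the target process.
    return ["strace", "-f", "-p", str(pid)]


def render(argv):
    # Shell-quoted string that can be pasted into a terminal or an email.
    return shlex.join(argv)


if __name__ == "__main__":
    for build in (gdb_thread_dump, gcore_dump, strace_attach):
        print(render(build(1234)))
```

Returning argument lists rather than strings means the same helpers can be passed to subprocess.run on a machine where attaching is allowed, or rendered as text to send to a user who has to run them.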
strace answers a simpler question: is my Python process doing anything other than just waiting on timers, a lot of sleep statements and things like that? So strace is another tool that you can use. This is a trivial demo, and you can try it with the source code at the end, because we will not have demos. There will not be a better demo.

There are a couple of gotchas. For this to work, you need to install some debuginfo libraries. C code is compiled, and GDB is not magic. In order to look at the original source, GDB needs these debuginfo libraries so that it can decompile it. I'm not sure that's exactly the right word, but so that it can translate the compiled version back to the original source. You can connect to a Python process with GDB and use the bt command without them, but you won't see very much. You'll see all the stack frames, but some of the details, like the function name, a very important detail, are not going to be there, because GDB just sees a memory address. It doesn't know what it's called. It doesn't know how to read into the compiled CPython and say, "This thing is called PyEval_EvalFrameEx," or "This is in this function call," or "This variable has this meaningful name." What you're going to see instead is a pretty terse, memory-based analysis that shows you the number of frames and the number of threads, but not a lot of detail inside.

So what you need, and this is true for pretty much any meaningful GDB usage, is to install these debuginfo libraries. They are built alongside the thing they describe. When CPython, for instance, is built and compiled, another package is generated at compile time, and it usually has debuginfo at the end of its name.
With Fedora, for example, there are the normal repositories where all of your packages come from, and then there's usually a debuginfo or debug repository. So what you typically need to do is enable the debug repository and then install the necessary debuginfo libraries. To do almost all of this, the only one you really need is the debuginfo for Python itself. Whatever package brought Python onto your system, there's an equivalent debuginfo package that gives GDB all the information it needs to debug it.

Now, if you're debugging a more complicated application, or one that calls into libraries with compiled C portions, you're going to want to install the debuginfo libraries for those portions too. Otherwise, as CPython calls into other compiled C code, GDB won't know how to read it. It'll know the basics, like the stack frame depth and the number of threads, but again, it won't give you that meaningful information. So the trick is to try to install the debuginfo libraries. Sometimes users don't want to do this, and I completely understand that. That's where it's great that they can send you a core file from gcore, and you can analyze their core dump on a test system of your own.

This sounds a little bit tricky, but GDB is a great tool. It differs across distributions, but when you connect to the process or the core dump, GDB will usually tell you right at the end: "I'm parsing all these debuginfo libraries, and you're missing these." It will literally tell you by name which packages you need.
This is how it works in things like Fedora and RHEL. I expect other distributions are similar and very likely provide a separate repository with their own debuginfo packages, but they may vary.

The other gotcha you should be aware of is versioning. GDB is very picky, and for good reason. The debuginfo library must exactly match the compiled version that you are running. It has to match perfectly, because there's no way for GDB to know whether a slight version change has caused a meaningful difference. This can be a little bit tricky, because some repository systems don't maintain or make available older packages, or at least not easily. So I run into this situation where I get a core dump, the user's system was running older packages, and those packages aren't part of the mainline debug repository anymore, so I have to go find those vintage packages or try to build them myself. For a while, I was spending a lot of time doing that while investigating my problem. Now I've moved to a simpler approach, which is to just make sure the Python version is right. GDB will tell you that you need a lot more, but just see how far you can get with it. For the most part, all the questions I wanted answered, I could answer with one single debuginfo library, the one for CPython itself. So this is a gotcha that you may run into. I used to worry about it, and I don't anymore. Let me know if you have interesting experiences when you try this.

There are also some compiler optimizations, which are great for efficiency but can make debugging a little bit harder.
Generally this hasn't been much of a problem, but the options that Python is compiled with are meaningful, and if it's heavily optimized, it can be difficult to see what the process is doing internally. What you'll end up seeing in the bt output, the backtrace output, is the words optimized out inside less-than and greater-than symbols. If you're running into a problem like that, what you want to try is recompiling Python with less optimization. I only mention this as a small caveat. I've never had to do it, but sometimes when you're analyzing stack traces you'll see optimized out, and that's what you're seeing. Maybe people who understand CPython better can come and teach me a little bit after this.

And of course root is required. This is a privileged operation, you're pretty much analyzing another process, so you'll need root, or whatever privileges allow you to read that process.

One thing to consider is that there's another way to do this. It won't solve the case where you can't connect to the system, but if you can connect, and you want to attach at a moment's notice without always leaving rpdb open for remote debugging, you can consider enabling it with a signal. rpdb supports this in more recent versions. If you want to do this in your applications at some point in the future, consider enabling it. The way it works is that you send your process a special signal, and only when that signal is sent will it initialize rpdb and give you an opportunity to use more traditional tools, which is what we want. Again, you don't ever want to have to do this, but in the event that you do, you can use GDB to debug Python. So this is a way to think ahead a little in your Python applications and allow on-demand remote debugging.
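The idea of starting a debugger only when a signal arrives can be sketched with the standard library alone. This is not rpdb's API: it uses signal and the local pdb as a stand-in, and the function name enable_debug_on_signal, the choice of SIGUSR1, and the start_debugger hook are all assumptions made for illustration. With rpdb the hook would start its network-accessible debugger instead.

```python
import pdb
import signal


def enable_debug_on_signal(signum=signal.SIGUSR1, start_debugger=None):
    """Install a handler that starts a debugger when signum arrives.

    Nothing debugger-related runs until the signal is actually received.
    Returns the previous handler so the caller can restore it later.
    """

    def default_start(frame):
        # Drop into pdb at the frame that was executing when the
        # signal was delivered (local terminal only, unlike rpdb).
        pdb.Pdb().set_trace(frame)

    start = start_debugger or default_start

    def handler(received_signum, frame):
        start(frame)

    return signal.signal(signum, handler)
```

An application would call enable_debug_on_signal() once at startup from its main thread, and an operator would later run kill -USR1 with the process ID to trigger it. SIGUSR1 is not available on Windows.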
The signal support is well documented in the rpdb documentation, so you can find more about it there.

So these are some references. This is a QR code to these slides, so that you can try the source yourself, and there are some great tutorials online. I'm really just a user here. I'm not a contributor to these things, though I'd like to contribute in some small ways here and there. I'm really just evangelizing this technique and pointing out how significantly it helped me. We had a very difficult, rarely occurring deadlocking problem that was not in our code. Without this, I think we would not have solved it. It would have been just one of those things where, oh, occasionally it deadlocks, just restart it. But we want great quality in Python software, and we can raise that quality by using these kinds of techniques. That's why I'm out here telling everyone about it, because it's a very important technique. These tutorials were written by other people, and you can click on the links if you download the slides using the QR code there.

That concludes my talk with no demos, and I believe we maybe have a few minutes for questions.

That's a great question. The question is, how does this work with PyPy, which is another interpreter, different from the CPython interpreter? It should work. I've never done it, but it should, and in fact it's a great question because the conceptual model is the same, right? PyPy's implementation is compiled, and so GDB can be used to decompile it, if that's the right term, and analyze it. Now, PyPy supports just-in-time compilation, so your stack frames are going to look different and it might be a little more complicated, but the technique is the same. Furthermore, this technique can be used on interpreters for other languages as well.
Pretty much any interpreted language that has a C-based implementation, or C++, or really anything that GDB can analyze, can be analyzed in this way.

Yes. So one more addition, a comment: there is a debuginfo-install command which can be used to conveniently install these packages.

So thank you all very much, and enjoy FOSDEM. Thank you.