Hello, my name is Frank Eigler. Aaron will be joining us shortly. Let's talk about elfutils debuginfod, which is a web file server for debugging artifacts. What exactly does that mean? We'll find out very shortly.

It always starts with a problem. Something always goes wrong; there are always bugs in the code. But once there's a bug, what do you do? Well, let's see. Sometimes you don't need to do anything special: you can just plunge right into solving the problem. But in the case of C++ programs and operating-system-level stuff, you will need a debugger or something like it. And if you need a debugger, you need data to drive that debugger. Installing this data is our topic today. I'm not going to talk about what you do with the debugger; once you have the data, that's your problem. If you use Python or Node or any other interpreted language, this talk is not for you, because for scripting languages you have more interactive debugging facilities; you don't need this. If you use Golang or Rust, you might need us, but I'm not going to talk about that part just yet. If you don't use a debugger but you use other fancy tools like crash, SystemTap, or Valgrind, they also need debug info as a prerequisite, and it's exactly the same info. So even if you don't use GDB but you use some of these other tools, listen on, because this is for you.

Debugging info, debugging info, debugging info. What is this stuff? Debugging information is a special binary format of data that a compiler generates for every program it compiles. It usually requires a special compilation flag, -g for C and similar languages, which tells the compiler to emit a bunch of extra metadata in addition to the compiled object code. This metadata is in a standard format, usually DWARF, and it maps between source-code-level concepts like file names, line numbers, variable names, all those things,
and object code: things like actual PC instructions, registers, memory addresses, and symbols. This is what allows a debugger to show you source-level context for a breakpoint, for example, and what lets a debugger know where to find variables at runtime.

The quality of the debugging data varies a great deal by compiler. You want as high a fidelity as possible, so that you can debug even a highly optimized, deployed binary, not just the toy stuff you compiled with -g and no optimization. So you need a good-quality compiler to give you good-quality debugging info, and the world class for now, as far as I'm aware, is GNU GCC, which does an excellent job of emitting debugging data even for fully optimized code. Then there are tools that will dump this data for inspection. There are tools that can compress it, either with a tool called dwz, which factors out pieces common to multiple files, or with plain gzip, et cetera. There are also other formats like CTF, BTF, and so on, which aim to give you just a tiny subset of the information, the bare essentials needed for certain purposes. But for our purposes, we want the whole thing. Often debugging info is not kept around for very long, and we're going to talk about why in just a second; that's what gives rise to our challenge today. The other thing is that when we do debugging, we usually want to look at the source code, and tooling that forgets about source code is not good enough. Expect more.

So why the heck is debugging info not always at hand? Well, the easy answer is its size. Full DWARF data is pretty large, especially when it's emitted in its full glorious detail; it can easily increase the size of a binary by an order of magnitude. Here are some sample numbers. Then again, there is source code to look at too.
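To make the -g effect concrete, here is a minimal sketch you can try yourself. The file and binary names are made up for the example; it assumes a C compiler and binutils are installed.

```shell
# Compile the same trivial program with and without -g and compare.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("hello\n"); return 0; }
EOF

cc -O2    -o hello-plain hello.c   # no debug info
cc -O2 -g -o hello-debug hello.c   # with DWARF debug info

# The -g binary carries extra .debug_* ELF sections (the DWARF metadata):
readelf -S hello-debug | grep debug_info

# And it is noticeably larger on disk:
ls -l hello-plain hello-debug
```

For a real, fully optimized application the difference is far more dramatic than for this toy, which is exactly the size problem discussed above.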
So altogether, a deployed binary might be very small, but the information needed to actually debug it is large, and you would need it, right? So let's talk about where it sits on various distros, because how you get it is a distro-specific problem. Let's see.

For Fedora and RHEL, it is usually built automatically during the package build, but then it's stripped out into separate debuginfo and debugsource sub-RPMs. If you don't have them installed, this is the kind of conversation GDB will have with you: it gives you advice as to what to install. So it's available, it's kept around, but to install it you need to become root, run this dnf command, and wait. At least it's there.

On Ubuntu, it's similar, though not as full-featured or as smooth as in Fedora land. They have separate sub-packages for debugging information, and it's kind of available. However, it's not in the default repositories, so you have to edit configuration files and hope you get it right, and then become root again. And some source packages are just not available in any useful way: you can't find the sources something was actually built from, only the original upstream sources, not the built sources. So it's so-so. Not bad, but so-so.

On Arch Linux, always an inspiration for the do-it-yourselfer, they build without the -g flag entirely, so the information is just never generated in the first place. Debugging will be challenging. Their official documentation's advice is: recompile whatever you need to debug with -g, and off you go. Those hardy Arch people, I love them.

NixOS is one of the minority distros, but it did something very clever that inspired us a lot: it solved this sort of problem via a user-space file system which mapped debug info queries to an HTTP service.
So it was actually pretty close to what we needed; we took a lot of inspiration from it, and we built something new and better. These were the requirements we had in mind, and this was our thinking back in 2019, about a year and a half ago — almost two years ago now, actually — when we started to work on this thing. We knew we needed to make it easy to get this debug info to a tool. We wanted it to work without any special privilege or special access: you shouldn't have to become root just to look at source code. We wanted it to consume raw packages from a distro, so you don't have to unpack anything or put it in a special place; it gets the data right out of the RPMs or .debs or whatever you have. We wanted it to be quick and efficient, downloading exactly the files you need and nothing else. And we wanted to make sure to include source code provision, which is very important if you ever want to look at the code you're debugging, obviously.

So we do have a tool, it's deployed, and it's pretty much ready for use. We call it debuginfod. It's a server that runs on any kind of Linux machine and speaks plain old HTTP, and it's been shipping, as I say, for about a year and a half. It's in Fedora, RHEL, and a bunch of other distros, so it's already pretty widely and easily available. There are some public servers operating, and I'll give you instructions about that in a sec, which means that to get started with this you really don't have to do very much. We'll get to that in just a moment. The server itself is just a normal process. You launch it with a couple of directories or files which contain your build tree — your personal build tree — or a bunch of RPMs you downloaded from somewhere, and it will index them and be ready to serve the content back to clients. It also has a little monitoring interface.
You can attach a Prometheus- or Grafana-type tool to it and monitor its numerous and very voluminous internal statistics, so you can actually tell whether things are going okay or not. This is important for a very big installation, where indexing a large dataset can take minutes or hours. The other neat thing the server does is that it lets you federate: you can connect lower-level servers to higher-level servers, in case you want to run one for yourself, one for your workgroup, one for your local distro engineering, and one for your external customers. You can daisy-chain these things very nicely, so that individual clients don't have to know about all of them. All right, interesting, good.

What do clients look like? Well, clients are just your normal tools. We've done a lot of work over the last year and a bit to get debuginfod client capability into a whole slew of tools. There's a webpage which tracks the current status of all of them, and it's pretty cool. All you need to do as a user is set one environment variable, DEBUGINFOD_URLS, and point it at your local server, or at other servers such as the public one I'll tell you about. It's a top-secret one; never use it for real. No, just kidding, it's a real one, and it serves a bunch of useful content; I'll talk about it in just a second. The nice thing is that one environment variable is all you need. Then you can start debugging, and all the debuginfod-consumer-type tools just look at that environment variable to find the debuginfod servers.
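The client-side setup really is that small. A minimal sketch, assuming a hypothetical local server on port 8002 alongside the elfutils project's public server (the variable is a space-separated, in-order list):

```shell
# One environment variable is the entire client configuration.
# localhost:8002 is a hypothetical local instance; the elfutils.org
# server is the public one the talk mentions.
export DEBUGINFOD_URLS="http://localhost:8002/ https://debuginfod.elfutils.org/"
echo "$DEBUGINFOD_URLS"
```

Any debuginfod-aware tool started in this shell will consult those servers, in order, whenever it is missing debug info.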
That includes GDB, elfutils (eu-stack and friends), Perf, Valgrind, SystemTap — a whole boatload of tools now have this kind of capability in them; most of the debug-info-consuming tools we know of do. So that's pretty sweet. And then you just use it, and the data flows down automatically — no root, no interruption, it just works. There's also a command-line interface, so clients don't have to use the C library; they, or a user, can just call out to this debuginfod-find program, and that works too. Good. And the way federation between servers works is that each server is itself a client, but that's just a clever little trick.

All right, let's move on. The clever stuff, the interesting stuff that makes this all work, is build IDs. Build IDs are unique hash codes that are baked at compile time into each binary made by a modern toolchain. They are hexadecimal identifiers that follow a binary along wherever it goes, right from compilation to deployment, even into core dumps. And these codes are pretty much globally unique. So we use them as the key in lookups to debuginfod servers. That's it, simple. Now let's go see Aaron's demo. We'll be right back.

Hi everyone. I'm going to give a brief demonstration of how to use debuginfod with SystemTap, Valgrind, and GDB. I'll start with SystemTap. Our goal here is to instrument a process with a SystemTap probe that prints a timed call graph with function parameters and return values. In order to do so, we'll need debug information for the process we wish to instrument. If we don't have this debug info installed, then SystemTap is not going to be able to compile the probe, as you can see when I run this command. We could manually use dnf to install the debug info, but this isn't always convenient or possible.
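Before the demo steps, here is a sketch of what launching such a server looks like. This is a configuration/invocation sketch, not a definitive setup: the paths are hypothetical, while the flags come from the debuginfod scanner options (-F for plain ELF/DWARF file trees, -R for RPM archives; port 8002 is the default).

```shell
# Index a personal build tree plus a directory of downloaded RPMs
# (both paths are made up for this example).
debuginfod -F ~/my-build-tree \
           -R /srv/mirror/rpms \
           -d /var/tmp/debuginfod.sqlite \
           -p 8002

# Federation works because a server is itself a client: point the
# server's own environment at an upstream instance to daisy-chain them.
#   DEBUGINFOD_URLS=https://upstream.example.com/ debuginfod -p 8002 ...
```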
So let's start up a debuginfod server that can index the debug info we're looking for. The executable we'll probe in this example is the tree program, and in this directory we have its debug info. We also have a directory of RPMs that we'll index for other examples in this demo. So we start up debuginfod and point it at our directories. Now debuginfod is able to serve these resources. I'll now point SystemTap at this debuginfod server by setting the DEBUGINFOD_URLS environment variable. Since executables share build IDs with their debug info, SystemTap is now able to query the server for the missing debug info using the build ID it finds in the tree executable. I'll run the SystemTap script again with debuginfod enabled. On the left you can see that the server received a request for tree's debug info, and on the right the instrumented tree process ran; you can see that SystemTap was able to successfully produce the call graph using debug info received from the server.

Another feature of SystemTap that uses debuginfod is the ability to specify probe targets by build ID. When we want to instrument an executable with SystemTap, we usually specify either the path of the executable or a process ID. Now debuginfod gives us an easy way to target a specific build of an executable. In this stap command, I've replaced the path of the tree executable with its build ID. When I run this command, SystemTap queries debuginfod for the tree executable in addition to its debug info, so it's able to compile and run the probe. This feature may be useful when you want to target a specific build of an executable or shared library, or when you want to cross-compile a SystemTap module for a different platform.

By setting the DEBUGINFOD_URLS environment variable, we're able to point all tools with debuginfod functionality at debuginfod servers. One tool to which we hope to add debuginfod support in the near future is Valgrind.
If we run Valgrind with an executable that performs an invalid write to memory, and we want to see detailed call stack information about that invalid write, we're going to need debug info for the executable and the shared libraries it uses. We already have a server running with all the necessary debug info for the example executable, so when we run Valgrind, it is able to acquire these resources automatically. Here you can see the information about which resources Valgrind acquired from the server: in this case, it successfully retrieved the main executable's debug info and the debug info for glibc and the dynamic linker. With this, it was able to produce a detailed call graph with function names and source code line numbers.

I'll now show debuginfod working with GDB as it runs inside a Fedora container. For this example, I'm simply going to run Python 3 under GDB. Since this container doesn't have any debug info or source files installed for Python 3 or glibc, and we haven't pointed GDB at our debuginfod server yet, the information we get from GDB is a bit limited. We've already indexed the Python 3 and glibc debug info and source files, so all we have to do is point GDB at our server by setting the DEBUGINFOD_URLS environment variable inside the container. Now we run GDB again, and we see information about which files GDB queried the server for. GDB now has access to Python 3's debug info, and when I run Python 3, GDB queries debuginfod for the debug info of any dynamically linked shared libraries. As I attempt to print the source associated with each stack frame, GDB queries debuginfod for any source files it cannot otherwise find. That's all for this demo. I hope you enjoyed this brief look at some of the ways debuginfod can simplify debugging with tools like SystemTap, Valgrind, and GDB.

And we're back. Let's finish up with some links, of course. The first link here is to our main project website.
It gives status about the releases, what public servers are available for your use, links to the man pages, all the blog posts we've written, and importantly, all the client code that now speaks the debuginfod protocol. So it's a good place to start looking. To get in touch with us, we have a mailing list of course, but IRC is really fast and easy. We're collecting a lot of blog posts on the developers.redhat.com website, so read us there. If you want to try things today, and your distro is new enough, set that environment variable to this very nice little URL, and you can give it a try probably right now as we speak; things may just work for you. There's also an internal Red Hat server for RHEL 8 and such, but don't tell anyone. And more distro-level public servers are coming online as soon as we can help them make it work; I hope Fedora will be one too. That's very exciting.
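To recap the mechanics described above, here is a self-contained sketch of how a build ID becomes a lookup key. The note text below mimics `readelf -n` output, and the build ID itself is invented for the example; the GET paths are the build-ID-keyed queries of the debuginfod HTTP protocol.

```shell
# Hypothetical `readelf -n <binary>` output for some binary:
note='Displaying notes found in: .note.gnu.build-id
  Owner   Data size   Description
  GNU     0x00000014  NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 2f0b2d4ef2a73d2b6e9f8a1c5d4e3f2a1b0c9d8e'

# Extract the hexadecimal build ID from the note.
buildid=$(printf '%s\n' "$note" | sed -n 's/.*Build ID: //p')
echo "$buildid"

# Clients then fetch artifacts keyed by that ID over plain HTTP:
echo "GET /buildid/$buildid/debuginfo"    # the DWARF data
echo "GET /buildid/$buildid/executable"   # the binary itself
echo "GET /buildid/$buildid/source/PATH"  # a source file (PATH is illustrative)
```

The debuginfod-find command-line tool wraps exactly these queries, e.g. `debuginfod-find debuginfo <buildid>`, which is what the tools in this talk do under the hood.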