So, yeah, I'm going to introduce a plug-in I've written that adds the concept of pipelines and iterators to GDB. This is especially handy for finding the interesting element in a data container, and it can easily be extended to work on whatever data structures you use in your program.
So, I've got a setup for the live demonstration that might need a bit of explanation. This is Vim, open on a file where I've written the commands I'm going to run. Lines like this are commands to run; lines starting with a hash are comments. I've got a little thing that goes to the next numbered comment and runs all the commands under it.
So, I've started GDB on a program that's in this extension's test suite. The help for the new command, gdb-pipe, shows the basic syntax. This is the command that creates a pipeline. It takes one long string argument, which is all the walkers (the iterators over something) separated by the text space, bar, space. Each walker can take a sequence of pointers, optionally do something with them, and then pass a possibly different sequence of pointers on to the rest of the pipeline. It should be reminiscent of UNIX pipelines, with the place of a UNIX command taken by one of these walkers.
You can find help on walkers with the walker help command, which has a few different forms. You can request help on an individual walker, but you can also list all walkers currently registered with the plugin, list the tags that walkers are categorized under, or list all walkers categorized under a given tag. I should mention there's also a walker apropos command, which searches through the docstrings for things related to a given word.
This is the basic data structure that I'm going to give some small examples on. I'm going to start off with examples on this before going on to real-world examples and then showing how you would write your own.
So here we have a linked list structure where each node has some integer data and a pointer to the next node in the list. I hope you'll just trust me that the end of this list is indicated with a null pointer in that next member. Lists of this sort can be iterated over using the linked-list walker. We can see the syntax of this walker with the walker help command, including a mention that it's simply a convenience for the more general follow-until walker, which has a more complex syntax.
So if I start the program and run until a known point where the head of this list is in the local variable named list_head, we can iterate over all nodes in this list with this walker. We can also pass that to the show walker to print out the data element on each of these nodes. The show walker takes a template of a GDB command and runs that command for each pointer in turn, with the $cur variable replaced by the address coming from the preceding part of the pipeline. That use of $cur is consistent throughout the plugin.
So we've printed out the data on each node, but we can also filter to select only those nodes that have a data member greater than some value. We can sort the nodes based on their data value, or we can find the single node or the three nodes with the greatest values, all with a minimum of modification to that initial command. And hopefully that's an intuitive modification to the initial command as well.
So there you have the gist of the plugin. We iterate over all values in something, optionally do something with each one, and finally pass that on to the rest of the pipeline.
I'm now going to go into some real-world examples. I'm just starting GCC running under GDB, which is taking a little time. I've recently been working on GCC, and I've got two real-world examples from there. I'll then use a real-world example from my co-worker's side project, where he's been rewriting an RTOS.
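The linked-list pipeline just demonstrated can be modelled in plain Python as chained generators. This is only an illustrative sketch, not the plugin's code: the Node class, the helper names (follow_until, linked_list, keep_if, top_n) and the sample values are all assumptions made for the example, and real walkers pass GDB addresses rather than Python objects.

```python
import heapq
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Stand-in for the C list node: integer data plus a next pointer."""
    data: int
    next: Optional["Node"] = None


def follow_until(start, stop, step):
    """Yield start, step(start), ... until stop(current) is true."""
    cur = start
    while not stop(cur):
        yield cur
        cur = step(cur)


def linked_list(head):
    """Convenience wrapper: follow .next until the null terminator."""
    return follow_until(head, lambda n: n is None, lambda n: n.next)


def keep_if(nodes, predicate):
    """Pass on only the nodes for which predicate(node) is true."""
    return (n for n in nodes if predicate(n))


def top_n(nodes, n, key):
    """Return the n nodes with the greatest key, largest first."""
    return heapq.nlargest(n, nodes, key=key)


def build_list(values):
    """Build a hypothetical list so the sketch has something to walk."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


list_head = build_list([4, 9, 2, 7, 5])

all_data = [n.data for n in linked_list(list_head)]
filtered = [n.data for n in keep_if(linked_list(list_head), lambda n: n.data > 4)]
sorted_data = [n.data for n in sorted(linked_list(list_head), key=lambda n: n.data)]
largest_three = [n.data for n in top_n(linked_list(list_head), 3, key=lambda n: n.data)]
```

Each variant reuses the same source of nodes and changes only the stage that follows it, which is the property the demo relies on when it modifies the initial command.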
So I've now started GCC, and if we list the categories, you can see there's a new category named cc1 just at the start there. Under this category, there are a bunch of walkers specifically for GCC data structures. These have been automatically loaded based on the name of the binary I'm debugging, using an auto-import mechanism I'll describe later.
As is often the case, the most used feature of this plugin is the simplest, and the simplest feature here is pretty much just to print everything out. GCC has dump files, which print out the representation of its internal language at each stage. Because I'm relatively new to this, I quite often find myself seeing one of these statements in the dump file and thinking, I want to know what the object that causes that statement looks like. I know the object is in my data structure somewhere, but I don't have a reference to it in GDB, so I can't access it. I tend to approach this with a creative use of breakpoints and conditions, to stop at a point in my program where that object is in local scope. But using this plugin, I can simply iterate over all statements, printing their debug output right next to the address of the node. I can then search for the one I'm interested in, copy that address, and assign it to a convenience variable for later inspection. If I know some criterion, so here I know it's a function call that I'm interested in, I can simply add an if condition so I have fewer statements to search through. This means I don't have to move my program past the point where I'm stopped in order to inspect the variable I wanted.
So, as I've said, that's the simplest action you can do, but you can also do any complicated action you can think of. So, just starting GCC again. It has a list of passes, where each pass is a transformation. As I mentioned, it prints out a dump file for each of these passes.
If I look through these dump files, I can sometimes find the pass that introduces the behavior I'm interested in, and then I just need to put a breakpoint on the function implementing that pass so I can start debugging. GCC has a data structure containing each of the transformations, so rather than going to the source and looking through how things are defined and where the method I'm interested in is implemented, I can use this plug-in to iterate over each defined transformation in the list, filter (I'll ignore this part here) based on the name the transformation has, which correlates to that dump file, extract the execute method of that transformation, and then put a breakpoint on it. If I have many passes that I'm interested in, I could also put a breakpoint on each one in turn with this single command.
So now I'm just going to give one more real-world example, from my co-worker's side project. He has an RTOS, and he's used this plug-in to inspect its running state. Here, he's iterated over an array containing data structures that describe each of the allocations this RTOS has performed. Then, for each one of those, he simply increments a GDB convenience variable by the number of blocks allocated for that allocation, resulting in the total number of blocks allocated, which he can convert to bytes. I especially like this example as it demonstrates that these are just convenient for loops: you iterate over whatever you want to iterate over, and you can do any external calculations you feel like.
That's the gist of the plug-in. You iterate over all elements in something, perform some mapping, some filtering or some action on each one in turn, and then pass a list of elements on to the rest of the pipeline. So you can usually find a way to debug whatever you're interested in using the general walkers.
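The allocation count in the RTOS example amounts to a loop with an accumulator. A rough Python model follows; the Allocation record, its blocks field, the sample numbers and the 64-byte block size are all invented for illustration, since the talk does not show the RTOS's actual structures.

```python
from dataclasses import dataclass

BLOCK_SIZE_BYTES = 64  # assumed block size, purely illustrative


@dataclass
class Allocation:
    """Hypothetical record describing one allocation."""
    blocks: int


def total_blocks(allocations):
    """Visit each record and bump a running total, the way the demo
    increments a GDB convenience variable once per element."""
    total = 0
    for alloc in allocations:
        total += alloc.blocks
    return total


allocation_table = [Allocation(2), Allocation(5), Allocation(1)]
blocks = total_blocks(allocation_table)
total_bytes = blocks * BLOCK_SIZE_BYTES
```

The running total lives outside the loop body, just as the convenience variable lives outside the pipeline, so the pipeline itself only has to supply the elements.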
Most often that's with the follow-until walker, which is essentially a for loop with a start, increment and test expression. But I've put a lot of work into ensuring you can write walkers for your own data structures pretty easily.
To demonstrate this, I'll start with another program from this extension's test suite. Here we have, again, a very simple node, but instead of one pointer to the next node we have two pointers to children, to create a nice simple tree. I start this program, run until a known point where the root of this tree is in tree_root, and then import a file defining a walker for that specific data type. Ignore the method of Python import here; this is just because it's not one of the special, automatically loaded ones, it's a demonstration. So I can now see that there's a new tag named tree-demo at the bottom of that list. Under that tag, there's a walker named tree-elements, which claims it can iterate over all nodes in that tree. And just to demonstrate its use, we can print out the data on each node, or, with a simple if condition, print out the data on all the leaf nodes of that tree.
So this is a pretty nice walker to demonstrate how you would write your own. At a bare minimum, you'll need to define a class inheriting from the walkers.Walker class. This string is the help text that you saw printed out by walker help. I'll just remove that to make things clearer. We have the name that you use on the command line, and the categories this walker is stored under, in tags. As indicated by tags being a list, each walker can be put under multiple categories. And you'll need to implement three methods: __init__, from_userstring, and iter_def. When a pipeline is created, your class is instantiated. If the user typed the name of your walker on the command line, then your class will be instantiated using the from_userstring method.
The from_userstring method takes a string, which is the entire argument set that the user gave your walker, and two booleans indicating whether this is the first walker in the pipeline or the last. The __init__ method is there to provide a nice programmatic interface for anyone who wants to build upon your walker. Some time after instantiation, your class will have its iter_def method called. This method takes an iterable over all the pointers coming into it from the pipeline, and is to return an iterable over all the pointers that pass out. Each of these values is a GDB value representing an address or a pointer to something. You're free to do pretty much whatever you want otherwise.
So once you've written your walker, you'll usually want it around when debugging the binary you wrote it for, but not around otherwise, not cluttering the help text. For this reason, we have an auto-import mechanism, which is similar to the GDB one. I believe you could use the GDB one if you want, but frankly I can't: this plugin is not well known enough to get these walkers into the main projects, so I have a separate mechanism. Here we can see the cc1 walkers, the ones that provided the GCC walkers we used before. So in order to use this mechanism, if you have written walkers in a Python file for the binary named MyBinary, you should save that Python file under the name MyBinary-gdb.py and put it in this plug-in's auto-import directory.
Now, if all that seems like a lot of work, I have good news. You may have already written the Python code to use this plug-in on your data structures. That's because this plug-in leverages the much more established GDB pretty-printer API in order to iterate over data structures whose pretty-printer defines the optional children method. This method is defined to iterate over the children of some data structure.
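To make the three walker methods concrete, here is a sketch of a tree walker written against a stand-in base class so that it runs outside GDB. Everything in it is an assumption for illustration: the Walker stub only mimics the interface described in the talk, the method spellings and argument order follow that description rather than the plugin source, KNOWN_ROOTS replaces evaluating a GDB expression, and TreeNode objects take the place of the GDB values a real walker would receive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    """Stand-in for the C tree node: data plus two child pointers."""
    data: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


class Walker:
    """Stub base class; the real plugin supplies its own."""
    name = ""
    tags = []


class TreeElements(Walker):
    """Iterate over every node of a tree, parents before children."""
    name = "tree-elements"   # name typed on the command line
    tags = ["tree-demo"]     # categories the walker is listed under

    def __init__(self, roots=()):
        # Programmatic interface: callers hand over start nodes directly.
        self.roots = list(roots)

    @classmethod
    def from_userstring(cls, args, first, last):
        # In GDB, args would be an expression to evaluate; here a lookup
        # table of named roots stands in for that evaluation.
        root = KNOWN_ROOTS.get(args.strip()) if first else None
        return cls([root] if root is not None else [])

    def iter_def(self, inpipe):
        # Walk the nodes given at construction, then any arriving on the pipe.
        for root in list(self.roots) + list(inpipe):
            yield from self._walk(root)

    def _walk(self, node):
        if node is None:
            return
        yield node
        yield from self._walk(node.left)
        yield from self._walk(node.right)


tree_root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
KNOWN_ROOTS = {"tree_root": tree_root}

walker = TreeElements.from_userstring("tree_root", first=True, last=True)
visited = [n.data for n in walker.iter_def([])]
leaves = [n.data for n in walker.iter_def([])
          if n.left is None and n.right is None]
```

Keeping __init__ free of string parsing is what lets another walker construct this one directly, while from_userstring handles only the command-line form.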
For containers, the children method is usually implemented to iterate over all the objects stored in that container. It's defined to return an iterable over GDB values, or things that can be converted into a GDB value. This is slightly different from the addresses that I pass through the pipeline, but we can do a conversion if things satisfy certain criteria.
So the walker to iterate over these pretty-printer objects is the pretty-printer walker. When you give it an object, say, for example, the container here, it will take that object, look up the pretty-printer that's supposed to print it, find the children method, iterate over all the values, and convert them into addresses. Well, it takes the address member of each GDB value and pushes those down the pipeline. The extra restriction on the children method that is not in the pretty-printer API is that you return an iterator whose elements are values with the correct address. So, for example, a pretty-printer that iterates over ones and zeros, as is the case in a Qt bitmap pretty-printer, is one you can't work with directly. But it seems the libstdc++ pretty-printers all satisfy this requirement, which means we have std::vector, std::map, and a bunch of others already defined for us. Hopefully your pretty-printer also satisfies these requirements.
So I'm just going to end on some possible future directions. The thing I'm most interested in is how many pretty-printers in the world satisfy these requirements, or behave as you might expect. So there are two different points here. One is how many work nicely. As an example of something that works but does not work quite nicely, the std::map pretty-printer iterates over alternating keys and values, whereas what you might expect when using this plugin is to iterate over the pairs.
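The key and value mismatch can be shown without GDB. In this sketch a plain list stands in for what a children method yields for a map (key, value, key, value, and so on), and a small helper regroups it into pairs. The helper name and the sample data are invented for the example; it is not the plugin's code.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def pairwise_children(children: Iterable[T]) -> Iterator[Tuple[T, T]]:
    """Group a flat key, value, key, value stream into (key, value) pairs."""
    it = iter(children)
    # zip pulls two consecutive items from the same iterator on each step.
    return zip(it, it)


# Hypothetical flattened output for a three-entry map.
flat_children = ["apple", 3, "pear", 7, "plum", 1]
pairs = list(pairwise_children(flat_children))
```

Someone iterating over flat_children directly would see six elements where the container holds three entries, which is the surprise described above.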
There's not much I can do about that mismatch other than making sure that writers of pretty-printers know this plugin exists, know what the requirements are, and hopefully believe it's useful enough to modify their pretty-printer to work, which is why I'm here. For std::map in particular, I have written a wrapper that iterates over pairs, but I can't do that for every pretty-printer out there.
The second question is how many iterate over GDB values that have the correct address assigned to them. You can use pretty-printers that don't have the correct address; to do that you pass an extra argument to the pretty-printer walker. But as soon as you do this, if the values had an address assigned previously, you lose the ability to assign into memory. That's because putting something into a convenience variable loses where the value originally was, as a reference to memory. So for the Qt bitmap example I mentioned before, you could iterate over the ones and zeros, but you'd have to convert them into something a bit more useful later, or you could just modify the whole of your pipeline. Apparently I've got ten minutes left, which is way more than I expected. Yeah, you could modify the rest of your pipeline to handle values instead of addresses, and just not be able to assign anything in the inferior.
I have thought about using values everywhere instead of these addresses, but because of this difficulty in creating a template and substituting something into it without losing the information of where it is in memory, I've not been able to do anything there. If anybody knows a better way of doing that assignment than this convenience variable, I'd like to hear about it. Other than that, I've apparently finished very early. There's the GitHub where this plugin is stored, and I'll take a few questions.
Oh sorry, the question was: what happens if you have an infinite loop or something? I've hit this a few times. Generally what happens is I panic, press Ctrl-C, and then pipe to head. So hopefully I can go, oh, there's the point where it repeats. I'm sure there is a unique thing in there which could be used to filter those out, or you could write your own walker that finds where the sequence starts to repeat, something like that. But yes, it does mess things up.
In GDB there's typically some kind of limit on iteration, so it stops when you go over 500 elements, something like that. So, just a comment from the person in the front: it's typical in GDB that there's a limit, and the suggestion is that we add some option to set a limit at however many, a few million, something like that.
In C++ there are always iterators, and since those give you the sequence of objects individually, would that work? Because that would make it much more convenient for data types where otherwise I'd have to write the pretty-printer. Yeah, so the question is basically: in C++ there are often iterators, so you have a .begin and a .end, and is there any way you can automatically use these functions to define some iterable in GDB? I've tried to use these iterators directly within GDB a few times and I always hit some problem, and I don't know why. I'm guessing it's something to do with inlining or whatever, but there's always some problem I hit. So I resorted to this.
Yeah, so the question there is: I've always demonstrated this plugin using debug information, but is it possible to use things directly with registers, etc.? Yes, you can. It just makes things a lot more awkward, as debugging without debug information always does. The follow-until walker literally just takes a GDB expression, so if you know the exact pointer math you want to perform, you can just type that in.
That looks about it. Okay, cool. Oh, that's, yeah, that's a random Vim plug-in I wrote. Thank you.
Yeah, the actual point of that plug-in is more just to help myself debug stuff, because whenever I write a command and then think, that wasn't what I wanted, I can go back. But yeah, it's under the same GitHub username, but vsh. Yeah, no worries.