Thank you all for being here for my talk on MiniDebugInfo support in LLDB. I'm Conrad Kleiner. I work for Red Hat, mostly on upstream LLDB since last year. Before that I worked on OpenShift, from 2016 until recently.

The goal of this whole project was to improve LLDB as a debugger for Fedora and RHEL binaries, where you mostly have only release binaries. You don't have debug symbols installed, which means no symbols are directly accessible, so whenever a program crashes and your tool picks it up, you only see addresses and no symbol names. The approach was to make LLDB aware of MiniDebugInfo, which is the concept we're going to talk about. I may use "MiniDebugInfo" and "the .gnu_debugdata section" interchangeably, so please excuse that. MiniDebugInfo is where the symbols we are interested in are stored.

It's helpful to talk a little about why and how it was invented before we go into how it actually looks. It was invented long before I joined Red Hat, and I only recently talked to colleagues about why it was designed this way and not some other way. The whole idea was to be able to generate a backtrace for crashes when you have the Automatic Bug Reporting Tool in Fedora, and for that one wants symbol names and probably line numbers, file names and such. Those make up an ELF file on their own, right? The idea was to put them all in, but eventually it all got too big and everything was stripped out. So you are left with only the regular symbol table, cut down in a fashion I'm going to show later: essentially just function names and that's it, no variables or parameters. The ELF format itself remained as the container, even though the information could maybe have been stored somewhere else.
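As a rough illustration of what "cut down to function names" means, here is a minimal Python sketch. It is not how the Fedora tooling or LLDB implements this; the `Symbol` record and the `mini_symtab` helper are hypothetical names invented for the example. It assumes the two rules the talk describes: keep only function symbols, and drop anything already present in the dynamic symbol table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical, simplified stand-in for one ELF symbol table entry."""
    name: str
    kind: str      # e.g. "FUNC" or "OBJECT"
    address: int
    size: int


def mini_symtab(symtab, dynsym):
    """Return the symbols a MiniDebugInfo-style table would keep.

    Assumption for this sketch: keep function symbols only, and skip
    every name that the dynamic symbol table already provides.
    """
    dynamic_names = {sym.name for sym in dynsym}
    return [
        sym for sym in symtab
        if sym.kind == "FUNC" and sym.name not in dynamic_names
    ]


if __name__ == "__main__":
    symtab = [
        Symbol("help", "FUNC", 0x1000, 0x20),
        Symbol("main", "FUNC", 0x1100, 0x40),
        Symbol("counter", "OBJECT", 0x2000, 4),
    ]
    dynsym = [Symbol("main", "FUNC", 0x1100, 0x40)]
    for sym in mini_symtab(symtab, dynsym):
        print(sym.name)
```

With this made-up input only `help` survives: `counter` is a variable, and `main` is already reachable through the dynamic symbols.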
One thing to remember is that this has nothing to do with debug information. Even though it's called MiniDebugInfo, it's just symbol tables and nothing else, so there's no relation to DWARF whatsoever.

I hope you can read this. Can you read that? Essentially we can just talk about the bubbles here. On the left you see an ELF file as I picture it. You have this green bubble with the dynamic symbols (.dynsym) and the .symtab, and usually the .symtab is a superset of the .dynsym, plus more. When you have a release binary, you essentially cut the .symtab out. It's no longer there; usually you cut it out, put it into the debug packages and install it from there. But since we're dealing with binaries that have only release information, and for Fedora and RHEL also the MiniDebugInfo, that sits at this place here. It is essentially a .gnu_debugdata section, an invented section whose data is an ELF file of its own. As you can see from the reference there, that ELF file is essentially the .symtab with all the duplication from the dynamic symbols cut out. The holes that make it look like Swiss cheese are where we've taken out everything that is not a function name, so variables and parameters, for example, are stripped out.

That has some implications for LLDB. When you start the program and LLDB reads the symbols, it usually checks whether there is a .symtab. If there is, that's enough and you have everything. If there's no .symtab, LLDB would say "I read the .dynsym" and that's it. Here the whole implementation revolved around combining those two, the .dynsym and the .gnu_debugdata symbols, to get something that is at least capable of giving you symbol names for your functions.

The way I did this was to focus not on backtraces or crashes and so forth, but on making LLDB aware of the symbols, so that you can set a breakpoint, hit it, and maybe dump the symbols. For that I just took whatever I found, the zip
binary, which is installed on almost every system, and I sort of blindly picked a function. The only hurdle was that it must not come from the .dynsym, because that's what LLDB can already read; it must come from the .gnu_debugdata section. Then we're going to do a shootout of GDB versus LLDB.

On this slide you can see that I've dumped the symbols of the zip debug-data file on line 2, which is essentially the .gnu_debugdata section that I've extracted from the zip binary. Here you can see this promising help function, the help symbol. I just looked at it and thought: maybe you can find one some other way, but this looks promising, because if you call zip --help, maybe that gets triggered. On line 12 we see that it's not in the directly accessible symbols, so it must come from the .gnu_debugdata section itself, and there's no duplication.

So let's be brave and try a demo. It's not a fancy demo, but at least it's somewhat interactive. When I fire up GDB with exactly what I showed you before, zip --help, you can see that GDB tells us it is reading symbols from the .gnu_debugdata section, and it also tells us that I don't have any debug symbols installed, which means we're not cheating. If we start the program you get what you expect, the regular zip help output. If we set a breakpoint on help, GDB can find it, and if you run again it stops there, just as you would expect. So that's nice.

Let's see how LLDB performs here. I'm talking about LLDB 9, which is what ships with Fedora 31. It doesn't tell us much, and the way you invoke it looks a little different. Same thing here: we run it, we see the output, and if we try to set the breakpoint on help, there's no way it finds it. It's just not there. That was how it worked in LLDB 9. In LLDB 10, which should be shipping or packaged soon, I guess, here's the proof: we run it, we can set the breakpoint on help, it finds it, it stops, and that's essentially the proof that, yes, it
works. Okay, that worked, and the question is: is this ready to ship? No, of course not. The main part of the work was in testing. As a word of warning, this was my first contribution to LLDB itself; like I said, I only started last year. There are a bunch of tests that I was asked to implement: take the .gnu_debugdata section and find a symbol in there using image dump symtab; issue a warning when you have MiniDebugInfo and try to decompress it but don't have LZMA support compiled in; and, you can read the rest for yourself, handle a corrupted archive. The last one was the tricky bit. There's a GDB manual page that describes how to construct a binary that has this MiniDebugInfo in it, and I needed to replicate that in LLDB, because that is the only way I can really set and hit a breakpoint.

You might wonder which parts were hard and which were not. Actually, setting a breakpoint worked more or less out of the box. The only problem was some confusion upstream about how to create this ELF object, which turned out to be very easy. But hitting the breakpoint didn't work, because I had essentially just created an object file, fetched the .symtab out of it, and stored it where we store the other .symtab. I thought: it's using it correctly, I can hit the breakpoint, everything is fine. But LLDB actually has a concept of unified section lists, and I needed to put it in there, and then it all worked.

LLDB also works with this concept of a stripped-down ELF file that is not runnable but on which you can do symtab dumps. That's not ideal if you want to say "I can hit a breakpoint", right? For that it needs to be runnable. What was a pain for me was that the LLVM tool used for that is yaml2obj, which takes those YAML files; if we have enough time, I can show you that. It always takes the YAML file and produces an ELF file, and that
made my test go nuts; in my head it exploded, because it always produced a .symtab, and I didn't realize that at first. LLDB then only sees "there's a .symtab, I'm going to read that", and it was empty: no symbols found. Then, as usual, you have the regular polishing for upstream, making everybody happy. And documentation is really an issue in LLDB, I'd say.

So, we have more time, so let's head over to some more slides. What I really liked, and actually came to love, was the LLVM Integrated Tester (lit), which consumes files like this one. It doesn't have to be a C file. Here, as you can probably guess, we just print the number of arguments on line 8, and that's about it. There are a number of tools involved. You pass this to the LLVM lit tester, and it's interested in those REQUIRES and RUN comments that you can see there. It says: I need a Linux system, LZMA support must be compiled in, and I need the xz executable. Then it's just going to execute the RUN statements one by one. You're not supposed to do what I do on line 2, calling GCC directly, but here I'm doing it for the sake of explanation. As for %t and %s: %s is this file, and %t is a temporary file just for this test. So you compile the file to %t, call %t, give it a bunch of arguments, pipe the output to FileCheck, and also pass the current file as an input to FileCheck. FileCheck is then interested in the CHECK comments, and it's going to check that the number in the output is 5.

It required a little bit of work, just some CMake canonicalization. Whenever you set a CMake option, you can say ON, TRUE, 1 or whatever, so those had to be normalized to a 1 or, I don't know, a false; I don't remember. But that was sort of it.

And this is the example I have for a sparse, non-runnable ELF file. You can essentially just describe your ELF file in the YAML format, and then, like before, we have those REQUIRES, RUN and CHECK comments, and you pass it to LLVM lit, and then
essentially all it does is try to find the "multiply by 4" symbol name that you can see at the end of line 5 in the content. And notice that on line 3 I had to manually remove the .symtab explicitly, and that caused some problems.

Yeah, that's all I have. Thank you for listening. If you have any questions... yes, please.

Audience: You said in the beginning you wanted this for a bug reporting tool, but then later you said backtraces were not a goal of this thing. Wouldn't that be important?

Speaker: Yes, sure, but I needed to have it inside of LLDB first. I mean, you can take LLDB and just use it, and I just wanted to have it understand this MiniDebugInfo. You mean like backtraces or something? Good question; I'd maybe need to forward that. [inaudible] The idea is that now you can map the addresses in your tool to function names.

Another question? No? Thanks.
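The closing remark, that a tool can now map addresses to function names, can be sketched in a few lines of Python. This is only an illustration under assumptions, not LLDB's implementation: symbols are modelled as hypothetical `(start, size, name)` tuples, and `build_index` and `symbolize` are names made up for this example. The index merges the dynamic symbols with the ones recovered from the MiniDebugInfo, mirroring the combination described in the talk.

```python
import bisect


def build_index(dynsym, mini_debuginfo):
    """Merge two lists of (start, size, name) tuples, sorted by start address."""
    return sorted(set(dynsym) | set(mini_debuginfo))


def symbolize(index, address):
    """Return 'name+offset' for the function containing address, or None."""
    starts = [start for start, _size, _name in index]
    # Last symbol whose start address is <= the address we look up.
    pos = bisect.bisect_right(starts, address) - 1
    if pos < 0:
        return None
    start, size, name = index[pos]
    if address < start + size:
        return f"{name}+{address - start:#x}"
    return None


if __name__ == "__main__":
    dynsym = [(0x1100, 0x40, "main")]
    mini_debuginfo = [(0x1000, 0x20, "help")]
    index = build_index(dynsym, mini_debuginfo)
    print(symbolize(index, 0x1008))  # help+0x8
    print(symbolize(index, 0x1030))  # None: falls in the gap between functions
```

Without the MiniDebugInfo entries, the address `0x1008` in this made-up layout would stay a bare number, which is the situation the talk describes for release binaries.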