I have a bit of a cold, so I hope my voice holds up, but okay. Well, I am Mark. Red Hat is my employer. I work in the perf tools group; I hope it is still called that, we sometimes change the name. And I work mostly on Valgrind, elfutils, SystemTap. And debug stuff.
So, when I came up with the title "Binary to Source", that is just a part of what DWARF is, because if that was all there was, you would only need the debug line mapping: you take an address, oh, that is the source line. But DWARF does, well, so much more. You want to know which functions there are, what parameters a function has, whether there are variables, what the scope of the variables is, what the types of the different things are. So, you have that. And then, if you want those variables, there are the location descriptors, which I thought worked pretty well, but then... well, you saw the last talk. You want to know the ranges of things, so you have debug ranges. You want to know how you got into a function, so DWARF provides you with unwind information, not only to know where you came from, but also to get back the values of the variables as they were when a function was called, so you can inspect the values in the previous frame. It even provides macro information, although most compilers don't emit it by default, so that when you copy and paste something from your source, the macros can be used to expand the definitions in your code.
Okay, that is a lot of things. DWARF has a lot of design goals, probably a bit too many, because they conflict with each other, of course. It is a bit interesting that the last talk was about things that depend on the ABI, because one of the design goals is that DWARF is architecture independent. So in principle, for example for unwinding, you can do it with almost no ABI knowledge; the only thing you need to know is how the DWARF register numbers map to the actual register numbers on that architecture. So it is a bit interesting to see that this isn't always how it is done.
But what is really nice, at least I think so, although when I talked to Tom, Tom said that is the worst part [unclear], is that DWARF is extensible, because there are a number of GNU vendor extensions that were later standardized. The problem with vendor extensions is that you have to implement them in all producers and all consumers for them to work. And I think the problem is that when a vendor extension gets standardized, you have to implement it again. But I think it works well.
This is one of the design goals that I picked, because I had to pick something to talk about. And when I began to list everything in DWARF 5 that came out of GNU vendor extensions, it was way too much. So I picked this design goal: the tools that have to process DWARF should not need to know about DWARF. And I think that makes DWARF very usable, because it means you can easily compose: take pieces of code with DWARF and combine them. Of course this conflicts with some of the other design goals. But I thought there are a number of clever extensions in DWARF 5 that work around some of these limitations. Yes, go look there for the whole standard, and then we just discuss a couple of the new data representations. Where am I here?
To compose DWARF from two objects which contain DWARF, you don't need that much. You need an assembler where you can reference labels in a data section. You want a way to reference between sections. For example, you have the info tree which describes a variable, and it has to reference where the location descriptor is. And you have addresses of symbols for which you don't know yet where they will end up, so you need some way for the tools to do the relocation. And it would be nice if the assembler knows how to create an LEB128, I don't know how you pronounce it, a compact form of writing out constants. Because otherwise you have to do that yourself. But that's it.
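As an illustration of the compact constant form mentioned here, this is a minimal sketch of LEB128 encoding and decoding as DWARF uses it: seven payload bits per byte, with the high bit signalling that another byte follows. The function names are made up for this example and are not taken from any assembler or library.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def decode_uleb128(data, pos=0):
    """Decode unsigned LEB128 at data[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_sleb128(value):
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7                  # Python's >> keeps the sign
        sign_bit_set = bool(byte & 0x40)
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)
```

Small values, which dominate in DWARF data, take a single byte, and only large values pay for extra bytes.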
With just that, which is basically a generic assembler and linker, you can combine DWARF. One of the nice things about that is that you can combine different DWARF producers. You have a C file and an assembler file. You get two object files and they get combined even though they come from different DWARF producers. You don't have to know anything about the DWARF. They just get combined.
So... Is this showing? Of course not. Can I... Is it too... It's somewhat readable. I wanted to have an example that wasn't completely trivial. But even this actually produces too much DWARF. And sadly it also doesn't show a size reduction. It's way too small, so the overhead just dwarfs it. All the files are almost the same size, even though I wanted to show a reduction in size.
We have a header file with a simple struct. It declares a function that takes a pointer to such a struct. We have two files. This will most likely crash and burn. One defines the fob function, and you have a main that calls the fob function. The nice thing is that you just create two .o files and you combine them together. The linker doesn't need to know what the DWARF data precisely represents.
We just look at the debug info for this. You have the first compile unit, which comes from f.c. The address of the fob function has been resolved. The statement list is the line table to use for this unit. That's at offset 0. That's in principle all the relocation you need. In this case the matching debug line data was placed at the start, because it is the first object file. You will see for the second one that the statement list is at the next offset. All the linker has to do is place the debug line pieces after each other.
The thing you see immediately here is that in both the first and the second compile unit we define that structure type. Which is a bit wasteful, especially given that one of the design goals of DWARF was that it shouldn't repeat itself too much.
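The "place the debug line pieces after each other" step can be sketched as follows. This is a toy model, not how any real linker is written: the `ObjectFile` class, its fields, and the byte sizes are assumptions made up for the illustration. The point is that the linker only concatenates opaque bytes and adds each piece's base offset to the statement list value, without interpreting the line table itself.

```python
from dataclasses import dataclass


@dataclass
class ObjectFile:
    """Hypothetical stand-in for a relocatable object with DWARF in it."""
    name: str
    debug_line: bytes   # opaque line table contribution of this object
    stmt_list: int = 0  # offset of the line table within its own debug_line


def link_debug_line(objects):
    """Concatenate the debug_line pieces and relocate each stmt_list.

    Returns the combined section and a map from object name to the
    statement list offset that its compile unit should now carry.
    """
    combined = bytearray()
    relocated = {}
    for obj in objects:
        base = len(combined)                 # where this piece lands
        relocated[obj.name] = base + obj.stmt_list
        combined += obj.debug_line           # bytes are copied untouched
    return bytes(combined), relocated


# Two objects with made-up line table sizes.
f_o = ObjectFile("f.o", debug_line=b"\x00" * 40)
main_o = ObjectFile("main.o", debug_line=b"\x00" * 60)
section, stmt_lists = link_debug_line([f_o, main_o])
```

Here the first unit keeps statement list offset 0 and the second one gets the size of the first piece, which matches what the dump in the demo shows for the two compile units.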
We want tools that are as simple as possible to combine the DWARF in object files. Let's go back. Precisely, that's the conclusion. It's a simple concatenation.
Do I understand correctly what happens if you compile with the option that makes the linker eliminate unused functions and so on? What is being concatenated then?
That's a nice question, because the DWARF isn't touched. If the DWARF in the original object files describes a function that you then eliminate later, it is still described in there. You can go look at the supposed address of the object that is defined, but you won't find much there. There are ways to deal with that, but you don't get it automatically.
Indeed, one of the questions around this design goal is: what if we put all those types in their own section? And what if we had tools that supported link-once sections or section groups? Which is a good question to ask, because those are COMDAT sections and linkers support that. That's a nice thing. Then, if we calculate some hash or checksum over the type that we put in such a special section, we could reference those sections with the hash as the name. That's basically what debug types does. It was actually an extension for DWARF 3, integrated in DWARF 4. The only thing you then need is a way to reference a type by signature or hash. The problem is that it does need a new type unit header, so you couldn't easily combine them with the other units, and for that reason and others they were put in a different section.
No, no, first example. Yes, so, lots of DWARF. GCC implements this with -fdebug-types-section. We compile it again. And this time we see the same debug info. Let's see if I can find fob in f.c. F is defined as type 2d. Okay, there is... No, no, I have the wrong type. I knew I made my example too big. Sorry. No, I have the wrong one. Okay, argument F is type 92. It's a pointer type, and it points to... Oh, well. That one. And the nice thing is that in the other compile unit we should have the same one when we pass it.
When we create the variable it has the same signature. I'm sure that's the same one. And now we have a debug types section with that same signature, and there is only one of them. Is there? Yes, there is. That is really nice, because we only rely on an existing linker mechanism, and now we deduplicate some information for free.
That was my example. I tried to show that it really saves space, but in this demo the objects are actually a bit bigger, because this one structure is too small compared to the header. It doesn't really help there. But most programs have bigger headers, more types, lots of types. It really helps there. The problem is that it makes things a bit more complex. All the different kinds of units are now in the same debug info section, which is actually simpler, but it also adds some complexity.
But yes, it is still very big. And if you look at the example, you see lots of relocations, references to other data sections. And the linker has to deal with all of that. So what can we... We really need them for the strings. It is really nice if the tools know how to merge strings, because a lot of the DWARF data is strings, and you really want to merge those. And of course for the addresses, the symbols, you need relocations. And you use relocations for the inter-section references.
So what if we do something about that? As always, the answer is that you add a layer of indirection. Of course. So what we do is put all addresses in their own section. That is basically just a table, and then you can use an index into the debug address section. And we do the same with strings. We have a new string offsets section. So instead of pointing directly at a string and asking the linker "fix this up when you move the string somewhere", you have a new offset, an indirection, so you can just say "I want string number N". And we only relocate the table of string offsets in that section. And then finally...
Instead of, for example, the location descriptions... descriptions that say, okay, my... Yes, that works out. Instead of saying "my location description is at that offset in the debug loc section", or the same for the ranges in debug ranges, what we do is add an index at the start of the section. And at the start of the compile unit we just say: I'm using the debug loc section, that part of it. And from there you again just use indexes to get to where your real ranges are.
One of the nice things is that if we use all those indirections, we can move most of the data out of the object files. So we don't need the linker to even see most of the DWARF data. It still has to see the addresses. But even the strings can be moved away. And this is kind of interesting because... Shall I first show how...
So what we do here is... This is actually the DWARF 4 GNU extension. Kind of the same. And... Oh, I should have shown that it now creates both a .o file and a .dwo file, a DWARF object file. What you see is that in the object file, instead of a real compile unit, you have a skeleton. There are some addresses there. It still has some section offsets. It has the base for indexing into the address list. And it has a, hey, another signature. And a name. And, well, that's it. And for the other compile unit it's the same. Well, different IDs, different DWO files.
And this is really nice, because now the linker doesn't have to deal with most of the DWARF data, except if you want to look at the debug info. So for elfutils readelf, I implemented info+. Which picks the... After the skeleton it shows, it should actually show... I thought I had patched that in. It should actually show from which file it got the data. It doesn't. And you see that in the DWO file it uses indirect string references. And let's see... What? This wasn't supposed to happen. So this address comes from matching up with the skeleton, seeing where the debug address section is.
And then using an index into that. It doesn't work, why not? And it does it every time. Too bad, sorry. At least it's consistent. At least it's consistent, yeah. And there is also a small difference with DWARF 5 here. It still says it uses a section offset. It doesn't really... It should use a different form in DWARF 5. But the idea is that it uses the section offsets from this DWO file. Ten minutes, okay. I'll manage that. So it is the same for the other skeleton. And you look up most of the info in the DWO file. There.
So this is really nice for your normal edit, compile, debug cycle. Because the compiler has to put most of the debug info aside, but the linker doesn't need to see it. It doesn't need to copy it. We do have lots of duplication again, even for the strings now. But that duplication was already there in the original DWO files. So DWARF consumers need to be a bit smarter. They have to do some of the deduplication that the linker would otherwise do.
It's awful for distribution. Because some programs are linked from thousands of .o files. And you really don't want to ship thousands of DWO files. I'm also not sure people have used this in anger yet, really with thousands of DWO files. Because for elfutils I tried to be smart, and it's all very lazy. So we just open the DWO files and only read them when we need to. And after a couple of thousand you don't get any more file descriptors. Oh, that's not nice. I assume GDB does something smarter. No?
So, because that is kind of a concern if you're not in your edit, debug cycle, binutils comes with the DWARF packager. I don't know exactly what it's called. And it kind of does what you would expect. It reads the DWO data. It sees all the references to the DWO files and adds them together. It acts as a kind of mini linker, because string sections make up so much of the data that you really do want to merge all the strings. But it's nice, because you only have to update the offset table.
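Why merging strings only requires updating the offset table can be sketched like this. It is a simplified model under stated assumptions: each unit is represented as a plain list of strings addressed by index, and `merge_strings` and `lookup` are names invented for this example, not functions of any packaging tool or library. The units themselves keep using the same string indexes; only the per-unit table that maps an index to a byte offset changes.

```python
def merge_strings(units):
    """Merge the strings of several units into one NUL-terminated pool.

    `units` is a list of lists of strings; position i in a unit's list is
    "string index i" as referenced from that unit's DWARF. Returns the
    merged pool and one offset table per unit.
    """
    pool = bytearray()
    offset_of = {}          # string -> byte offset in the merged pool
    tables = []
    for strings in units:
        table = []
        for s in strings:
            if s not in offset_of:          # store each string only once
                offset_of[s] = len(pool)
                pool += s.encode("utf-8") + b"\0"
            table.append(offset_of[s])      # index stays, offset is rewritten
        tables.append(table)
    return bytes(pool), tables


def lookup(pool, table, index):
    """Resolve a string index through a unit's offset table."""
    start = table[index]
    end = pool.index(b"\0", start)
    return pool[start:end].decode("utf-8")


# Two units that share the strings "int" and "fob".
pool, tables = merge_strings([["f.c", "int", "fob"], ["main.c", "int", "fob"]])
```

A consumer that asks for string index 2 of the second unit still gets "fob", even though the bytes now live only once in the merged pool.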
And because you concatenate the data sections, luckily you only have that one relocation against it. What it does is create an index section which says, for every DWO, for the info part, the abbrev part, all the debug sections, where each one starts and ends. In principle, that's all you need, because you have all those indexes, which you now know start at zero or at that offset in the concatenated DWARF data.
Of course, the next logical step is this: we're getting to a point where the tools aren't simple anymore. So why not write a tool that understands DWARF and does a lot more? That's basically what dwz does. The nice thing, if you know about DWARF and you don't mind one extra extension, which is actually also in DWARF 5 now, is that you can deduplicate between debug files. You need new forms again to follow string pointers or references. But now you can say: oh yeah, for all those debug files, just look there. I've made one large string section, or info section with types.
So, five minutes. I wanted to go into more detail, but the number of references between the various DWARF data sections is kind of daunting. Especially if you then combine it with split DWARF. I thought it was kind of funny that these figures, which come from the DWARF spec, didn't even try to show split DWARF plus supplemental files, because I don't think anybody does that yet. But it is all specified so that you should be able to have split type units, split DWARF and supplemental files together.
I just showed that there are still bugs, but I implemented most of this, at least the GNU extensions, with an eye on how they are done slightly differently in DWARF 5. So if you are writing a DWARF consumer, maybe you want to try out libdw and write documentation for it. No. It is a really nice library. Every time I say "do you want to use it?", people say "okay, where is the documentation?" There is a header file. It is a nice library. I am kind of proud that we keep ABI compatibility even while we support newer versions of DWARF.
Of course the caveat is that you often need to use new functions to really get at some of the data. For example, with split DWARF everything works, except that old programs not using the newer functions only see the skeletons. But really, it's nice. Did I do it right on time? Questions?
About this tool independence: you mentioned that the linker can combine what is produced by the assembler without understanding it. If the assembler compresses the sections, does the linker need to uncompress them?
Yes, it does need to, and there is actually another extension for that, but that is an extension to ELF, the ELF file format, where the compiler compresses all the DWARF data and the linker expands, concatenates and compresses it again. There are people who say that is useful.
So if I just want to get a compressed final executable, I only have to tell the linker to compress the data and not the intermediate tools.
Oh yeah, you can also do that. And you can also use elfutils' compress, which actually does that for you. Ah, look, it's not installed. Oh well, elfcompress, sorry. I don't even know my own tools. Sorry. Ah, stupid small screen. Ah, so by default it compresses all the data sections. Sorry, yes.
When we produce a release build we don't mind too much about the compile time and link time, but we do care that gdb, or whatever we use to debug it, is fast. But when we are developing, we of course want gdb to read fast, but we also want to compile fast and link fast. So what is your recommendation about when to use which option for which kind of usage?
Well, for the compile, edit, debug cycle I would recommend split DWARF, but that's... No, no, no, gdb supports it and elfutils now supports it, and sadly tools using the elfutils libraries need to be updated to support it, but it's really easy. Super easy API.
So, for gdb I would, but I'm not completely sure people have used it in anger, and there are some open questions: okay, instead of the linker spending a lot of time, you now have gdb opening a thousand files. But we're out of time, okay.