After all that battling I'm now in the right place to present to you all about LLD from a user perspective. I'll explain what it is in a few minutes for those who are unfamiliar. So this isn't a deep dive into what LLD is and what it does; this is more like how I would actually go about using it, on the assumption that most of you may have heard of LLD but probably haven't tried it out. You're going to branch in three weeks? Three weeks. Okay, and I just noticed something, I think it might have been this morning, that it might actually get into the LLVM 4.0 release. So it's looking like it's going to become slightly easier to get hold of, but it's actually not too difficult to build from source if you wanted to fiddle around with it. The next bit is, what can I expect if I actually try and use this thing? And then just a little bit at the end of, if you're interested and want to give it a go, how you can help out. As you'll have seen from the slides, LLD is really three linkers in one code base: there's an ELF linker, a COFF linker that mimics the Microsoft version, and a Mach-O linker that mimics the Apple version. I'm basically working on the ARM port for LLD, so my background is in ARM-based tool chains. OK, so this is a very, very basic slide for people who may not be familiar with what linkers do.
They're usually the sort of program that hides behind the compiler: the compiler driver calls them with your objects and libraries and the linker does its job. So apologies if you're already well familiar with what the linker's job is; I won't spend too long on this slide. But in effect LLD takes the place over here. So the compiler compiles source code to objects. Those objects have basically got symbolic references between each other. So they'll say, I reference some symbol called printf or underscore start or something like that. And quite often the linker's first task is to take in all of these references, with the objects saying I need X, I need Y. Then it's got to search through all the libraries that you've put on the command line and say, OK, I see definitions of function X and function Y, and these functions need Z, and so on, and basically load everything into one place. And once you've got everything loaded you've then got to resolve all those symbolic references, so that the reference to printf gets connected with the definition of printf. So that, in a nutshell, is what a linker does. There are various other bits you can do to control where things go. So people doing embedded development will often use a special linker control script. They'll say, hey, I want all of my vectors to go at address zero, which matters on certain embedded systems and the like. And at the end of it you're left with a single file that you can then use as a shared library or executable, or some slightly more concrete form than what came in at the front. So that's what LLD is trying to be. It's trying to be a sort of system linker. It's not really in a state to be a library at the moment. And I think there have been some arguments on the list about whether that's the right design, for LLD to be a program or a library. There are arguments both ways.
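To make the embedded linker-control-script idea above concrete, here's a minimal sketch of the kind of script such a developer might write. The section names and the output file name are illustrative, not taken from any real project:

```shell
# Write out a minimal GNU-style linker script (illustrative names):
cat > vectors.ld <<'EOF'
SECTIONS
{
  . = 0x0;                     /* start placing output at address zero */
  .vectors : { *(.vectors) }   /* interrupt vector table goes first    */
  .text    : { *(.text*) }     /* then executable code                 */
  .data    : { *(.data*) }     /* then writable data                   */
}
EOF
# A link step would then pass it with -T, e.g.: ld.lld -T vectors.ld ...
```

The script tells the linker exactly where in the address space each input section should land, which is the sort of stringent placement requirement embedded systems impose.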
As someone who's developed linkers over the years, I'd say linking is typically an all-or-nothing job. You tend not to want a bit of a linker; you tend to want all of it or none of it. So for using it as a library, I don't know whether anyone's come up with a really compelling use case yet. It sounds nice in theory, but I don't know yet whether anyone's actually come up with a compelling enough reason to do it. So LLD as a project has existed for quite some years, but development of it only really kickstarted in about 2015. Prior to that there was a single code base, and it was based on a sort of Mach-O concept of an atom, where an atom roughly maps to one function, and you could then move these atoms about. That works quite well in Mach-O, but ELF and COFF have the concept of a section that can contain multiple functions. The ELF and COFF linkers basically had to keep translating to and from the atom-based model, putting constraints in, and it was slowing up development quite considerably. So at about 2015 the COFF code base forked off and basically said, no, there's not a lot of point in maintaining this atom model for the COFF linker; I can streamline the development much more easily. It ended up with a much faster, simpler-to-maintain code base out of that. And then probably a few weeks to a month later the ELF back end was stripped out under the same principles. So we've effectively got ELF and COFF, which are streamlined platform linkers, and you have the Mach-O-based atom model, which is kind of like a linker toolkit for building linkers. So these linkers don't really share code, but they do share algorithms and concepts in a similar sort of style. The design decision was made that the amount of work you did to transform ELF and COFF into the common data structures and back again wasn't worth the effort, and it was better to share at the algorithm and concept level.
So that's one of the big design decision differences that you would have from, say, ld.bfd, where you've got BFD, which is the model of: translate all object formats to BFD, work on BFD, translate back. That means you can share code in the centre of the linker, but then all the back ends and front ends have to translate themselves to and from that BFD representation. Okay, and LLD has been written with an emphasis on performance, so as users that's probably the one reason why you would want to give it a try, unless you were bothered about licensing or other sort of source code areas. And I'm going to put a huge disclaimer on these numbers here: this is a machine that I've only recently got from my company's IT department, with a PCIe SSD, lots of memory and a fast processor. On my older machines I certainly didn't see anywhere near this performance differential. So I think this is very much going to be a case of having a huge amount of disk bandwidth, because in effect what a linker does is basically a slightly clever form of cat. It's concatenating lots and lots of binary data together, so you're mainly limited by how fast you can copy gigabytes of data from one place to the other. If you've got a slow hard disk, that's going to swamp all the computation you might do; if you've got a nice SSD, that will emphasise the differences. These test programs are also huge: I deliberately set out to find something that would be horribly nasty. So Clang linked fully statically with full debug information at O0 produces an output file that on my machine was roughly 1.5 gigabytes in size. That's huge, but LLD churned through it in about six seconds and BFD took about a minute.
So if you're working on a gigantic code base I would say it's worth giving LLD a look, because you might find that your compile-link-execute step could go down from a painful wait to something fairly quick. So if you've just changed one file, one string in a printf or something, and have to relink, then that could actually give you quite a bit of benefit. Not so much for, say, an overnight build or anything like that, because that will be swamped by the time of all the compilations you have to do. But that's where I would say the primary reason to use LLD is at the moment, because the way LLD has been designed is that the ELF version tries to mimic the same interface that GNU ld and gold do. So it can just be a drop-in replacement; but the other side of that is that it's not doing anything else, so the only real use is as a substitute. So why might I not want to use LLD? I've given you the good side, which is the performance, so what are the drawbacks? I'd say that LLD doesn't implement all of gold's or BFD's features yet; certainly in the linker control script area that work is in active development right now. Whether it ever gets all of the niche features of BFD I don't know. I don't know of anyone using this heavily in an embedded context right now. At the moment it's very heavily tuned towards Linux- and BSD-style operating system platform linkers. Embedded systems tend to be horrible to linkers, with things like multiple address spaces with different memories and different constraints, where you've got to have quite stringent placement requirements. So if you're on an embedded system I would say you will be one of the trailblazers if you start adopting LLD; but if you do, please do, and report bugs, and we can start working on that. It's got a limited number of users compared to gold and BFD.
The BSD community are adopting LLD probably more aggressively than anyone else, and they're doing some great work in improving it, so hopefully that's one thing that can drive forward the adoption of LLD: a big user base full of people all using it and finding out where the problems are. I guess there's limited support for a number of targets. I'd say the main target at the moment is x86-64, or AMD64; that's where the predominant number of users are. There is a good port for AArch64; I think that's now running a build bot, it's got tests, it's self-hosting. ARM needs range extension thunks at the moment, so it can't handle big links, but for smaller links it's getting there. I think the MIPS port is well maintained. There are ports for PowerPC, AMDGPU and x86-32, but they tend not to see too many commits, so I don't know too much about those. Of course, if you're happy with the performance of your linker at the moment, there's probably not a good reason to change. So how might I go about using LLD tomorrow, given that hopefully it will be in the LLVM 4.0 release? Very simple: if anyone has been used to building LLVM, you basically put lld in the tools directory, like you would put clang, and then just do the standard CMake and Ninja build and install, and you'll get an lld in your standard build/bin directory along with all the other LLVM tools. So what do you actually get out of the end of it? You'll get a tool called lld, and you'll get a symlink called ld.lld. Now that's very similar to what you would get with, say — I'm just thinking of an Ubuntu or Debian sort of system — where if you look in /usr/bin you'll get ld.bfd and ld.gold, and you then set a symlink for ld to either gold or BFD depending on what you want. So with the generic lld, by setting a flag called flavour you can say: act as if you're the Microsoft linker, act as if you're the GNU linker.
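As a sketch of that build flow — the repository URL and directory layout here are assumptions for illustration, not taken from the talk:

```shell
# Check lld out into llvm/tools, next to where clang would go:
cd llvm/tools
git clone https://github.com/llvm-mirror/lld.git
# Then a standard CMake + Ninja build from your build directory:
cd ../../build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../llvm
ninja lld
# Result: build/bin/lld, plus an ld.lld symlink alongside the other tools.
# The generic driver picks a personality from its name or the flavour flag,
# e.g.: bin/lld -flavor gnu ...
```

The flavour mechanism is what lets the one binary stand in for the GNU, Microsoft and Apple linkers.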
If you use ld.lld, or a symlink to it called ld, it'll infer that you want to mimic GNU ld. The intention is that with these types of linkers you generally tend not to call the linker directly; it's usually the compiler driver that feeds it all of the various objects. If anyone went to Brooks Davis's talk on Hello World yesterday, you'll be familiar with all of the extra objects like crt1 and crti that the compiler driver feeds to the linker to make everything work. It's generally not just your helloworld.o and libc that get involved there. So yeah, the intention is just to be a drop-in replacement. The most reliable way I've found, if you want to use LLD, is to symlink your /usr/bin/ld to ld.lld. As I mentioned earlier, that's pretty much how you select between the GNU linker, which I'll call ld.bfd, and GNU gold, which is ld.gold. I'd say beware of build systems that include their own tools, though. For example, when I built Chrome and Firefox and started trying to say, well, okay, what if I use LLD, I found out that the build system created a whole new directory just for its own binutils, copied in its own versions, and pointed the linker directly at that. So you'll sometimes find on these big projects that the build systems will try and defeat you, and you have to fiddle around a little bit at that point. Now, if you're using Clang there is a flag called -fuse-ld; it does exist on GCC as well, but GCC won't accept lld at the moment, it will only accept bfd and gold. But if you're using Clang you can use -fuse-ld=lld, and what that means is that when the compiler invokes the linker it'll look on the path for ld.lld. So that would be a way of installing ld.lld in your /usr/bin directory and then not having to fiddle around with symlinks or change things; you would just control it with that particular flag. So what are the differences between ld and lld?
So I think the main difference — and you actually have to try pretty hard to make this make a difference — is that the way the linkers search for libraries is subtly different. Basically, the GNU linkers search the command line from left to right, and they only search each library once. So if you have things like circular dependencies between libraries — where you have, say, a member A in one library and a member B in another, and they each call each other — you sometimes have to break that cycle by either using something called --start-group/--end-group, or by adding the libraries multiple times on the command line. LLD doesn't do that. It basically records the archive's symbol table each time it sees an archive, and then automatically goes back into that archive as new undefined symbols appear. It doesn't require you to put libraries on the command line multiple times. Again, there are contrived examples where this makes a difference, but you do have to try hard to actually run into them. Now, there's no default linker script for LLD. If you're using normal ld you can type ld --verbose and it'll tell you the linker script that it's using; if you do this for a standard Linux link you'll get a linker script about two pages long, full of stuff. LLD in this case has a completely separate code path to handle that default layout, so that again can cause slight differences, and there's been no attempt to exactly match the behaviour of GNU ld for linker scripts. I think the idea is to basically match where the behaviour matters; but linker control scripts are somewhat loosely specified, and they often only specify part of what's needed. For example, you can write a linker script that specifies just one portion of your image, and then you leave the rest of it to the linker.
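Going back to the library-search difference for a moment, here's what it looks like on the command line. This is a sketch; the library names and object files are hypothetical:

```shell
# With the GNU linkers, circular dependencies between libA and libB need the
# archives grouped (or listed twice):
ld.bfd main.o --start-group -lA -lB --end-group -o app
# ...or equivalently:
ld.bfd main.o -lA -lB -lA -o app

# With LLD a single mention of each archive is enough; it revisits archives
# automatically as new undefined symbols appear:
ld.lld main.o -lA -lB -o app
```

In practice command lines written for the GNU behaviour also work under LLD, since grouping and repetition are simply harmless there.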
And so ld and LLD might make different decisions in those partially specified areas of a linker script. Also, some options are just not implemented, and quite often there are command line options that will be accepted because they're passed by GCC every time you do a link but aren't actually needed, so LLD just ignores them. Okay, so here's a very quick diagram. I won't go into too much detail, but this is just showing an example of some of the design decisions that LLD has taken that are different to ld. On your left you can see two program headers, an RO one and an RW one; the green bits are the executable code and the cyan bits are the RO data. LLD groups its RO data first, before the executable bits, and has a three-program-header layout, and you can find that some tools that assume the two-program-header layout get confused by that. So that's again something to watch out for; generally this sort of thing doesn't matter. Okay, so as I start wrapping up — I started going through this earlier on, then realised that I had a slide later on — I think the general status is that the AMD64 port is looking pretty good. A lot of work's been done over the past few months to get the linker scripts working. There are several build bots, two-stage Clang-plus-LLD build bots, that check everything works, and as of January 2017 I saw a message posted to the list that the FreeBSD base system, kernel plus user space, has been linking and running with LLD. There's also some progress going through the ports: as you can see, 20,000 out of 26,000 are linking properly, and we reckon that for the ones that aren't, there aren't 6,000 individual problems; there's probably a handful of problems causing those. And there's an upstream tracking bug, if you're interested in keeping track of that progress.
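One way to see the segment-layout difference from the diagram for yourself, assuming you have a BFD-linked and an LLD-linked build of the same program to hand (the file names here are hypothetical):

```shell
# Traditional two-LOAD-header layout (code+rodata together, then RW data):
readelf -l app-bfd | grep LOAD
# LLD's layout: three LOAD headers, with read-only data split out first:
readelf -l app-lld | grep LOAD
```

Tools that hard-code the two-segment assumption are the ones that can get confused by the extra LOAD header.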
So for AArch64 we've got little-endian support only, but we have got a build bot running to make sure that it keeps at least self-hosting and linking Clang and the test suite. For ARM, again we've got little-endian support only. We're missing range extension thunks, so we can't link Clang yet, because that's rather too large for an ARMv7 system, and that's my personal project at the moment: range extension thunks. Once that's done we should have enough to get a build bot together, and that will bring ARM up to the same level of support as AArch64. OK, MIPS. Now, I don't know too much about MIPS as a system myself. It's actively maintained, so based on the last status update — that it passes all the single-source and multi-source tests in the LLVM test suite — my guess is that it will be in the same sort of state as the AArch64 port. Probably not quite as heavily used as the AMD64 one, but certainly if you're on a MIPS architecture it will be worth taking a look. x86 32-bit: I think it's complete, but as far as I know it's not being actively tested, and the last time I tried some things out while making some modifications I found some problems, so I don't know whether many people are actually using this one right now. Certainly I think if there were any bugs reported on it they could get fixed fairly straightforwardly. There are ports to PowerPC and to AMDGPU, but I've not seen any traffic on those for quite some time, so I'm not sure of their status; based on the lack of traffic it's probably best I just say status unknown. If you've got one it might be worth trying, but I don't know whether anyone's actively maintaining them right now. Okay, so what can you do to contribute to LLD?
So one of the things that I think any of the LLD maintainers would say is that what we really need right now is people to go out and use it, report bugs, and find out what's not working, because we think it's getting to the stage where you can genuinely give it a go and expect that the program should work. Well, you can give it a go and hope that it might work, and that's certainly something we would appreciate. If you want to contribute patches, of course, LLD is covered by the LLVM developer policy; you can find the owners in the various places and the readme there. There isn't an awful lot of full documentation on LLD out there. There is an LLD site with documentation, but it probably doesn't tell you an awful lot about how to go about developing it. I did a presentation at the LLVM Cauldron last year on LLD's structure, and I think most of that still holds, and Rui, the main author of the COFF port, did a presentation at the 2016 dev meeting about LLD. So those are the two main video sources I can point you to. Any links to content from before 2015 will be about the atom-based linker, so unless you're going to be working on the Mach-O atom-based stuff I would probably ignore anything from before 2015, unless you're interested in tracking through the history. OK, thanks very much everyone. I don't know how much time I've got. OK. Right. OK. So I've come up a little bit short on that. But we've got plenty of time for questions; or if you want to go and get your lunch and run out of here as quickly as possible, I won't mind. Yes. Right. OK. Oh, that's good news. Right. OK. So just to repeat that for anyone who didn't catch it: an update from the BSD side is that AArch64 and MIPS64 should be fairly close to AMD64 in terms of their coverage; MIPS64, or MIPS, may be missing some support for large programs.
The BSD base system tends to be small UNIX programs rather than giant programs. So yes, work is definitely actively ongoing in that area, and we hope it will be better soon. OK, any more questions? Is it easy to summarise why LLD is so much faster than gold and BFD? OK, I think partly it's because of what it's trying to do: it probably just does less, which is the shortest answer, and the same is true of any fast program — it's obviously doing less at that point. But what LLD does is try to read through everything only once. So it's, I guess, almost like what you'd call a backpatching assembler. As opposed to a two-pass assembler, where you read everything once to work out where all the sizes are and then do another pass to generate all the content, another way of doing it is to backpatch: you go through once working out where you'll need to patch things, then do the copy, then fix up the patches. That tends to make the program harder to write in the first place, but it does mean that you're generally passing through the content only once. So I think it started off with an emphasis on performance, and that's been actively tracked: almost every patch that goes in is assessed for performance. And partly it's probably just not had enough time to accumulate features, because I think gold started with the same goal, but over five to ten years more features have been added in and it may have slowed down over time. So it may be that LLD is skipping some steps that the other linkers are doing, and it might be that as it gains more features it ends up slowing down over time too. But at the moment there's an aggressive emphasis on performance in reviews.
So I think it will hopefully maintain that over time. Right, OK. It certainly does: it's got identical code folding, and it's certainly got some support for ELF SHF_MERGE strings, so it does deduplicate strings. I don't think it does deduplication of debug strings, though; I think debug information is very much a black box at the moment. It's got some simplified relocation handling for it, but it's not trying very hard to compress debug information — that's my understanding, I haven't looked too much into it. But yeah, I can imagine you could do some special handling of debug information. For those of you who don't know, the amount of debug information totally swamps the non-debug information. When I mentioned that one-and-a-half-gigabyte Clang executable: that comes down to about 30 megabytes without debug information. So if you compile with -g you're really hurting your linker at this particular point. And if you start actually peering into debug information — it's normally encoded in things like ULEB128 encoding, so you can't just jump into the middle of it and do things. So if you're going to process gigabytes' worth of debug information, you're going to slow your linker right down. But you might get a much smaller output, which will then help your debugger start up quickly. So a bit of a trade-off there. But as far as I know it doesn't do an awful lot. Yeah. Yes. So certainly, as I say, I would definitely take those performance numbers with a severe pinch of salt. The other thing is that LLD and gold are multi-threaded; BFD is not. It's quite difficult to get an actual fair comparison of these linkers. Anyone who's done any compiler-to-compiler benchmarking will tell you how difficult it is to match up what each compiler is doing at what optimisation level so you've got a roughly fair comparison.
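As an aside on the ULEB128 point from a moment ago, here's a toy decoder — a minimal sketch, not anything taken from LLD. Each byte carries seven payload bits and the top bit says "more bytes follow", which is exactly why you can't seek into the middle of a stream of these values:

```shell
# Toy ULEB128 decoder: arguments are the bytes of one encoded value.
uleb128_decode() {
  local result=0 shift_by=0 byte
  for byte in "$@"; do
    result=$(( result | ( (byte & 0x7f) << shift_by ) ))
    shift_by=$(( shift_by + 7 ))
    (( byte & 0x80 )) || break    # top bit clear: this was the final byte
  done
  echo "$result"
}
# The classic example: 0xE5 0x8E 0x26 decodes to 624485.
uleb128_decode 0xE5 0x8E 0x26
```

Because every value's length depends on its bytes, finding the Nth value means decoding all N-1 before it, which is why peering into gigabytes of DWARF is so expensive for a linker.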
So I would take those performance numbers as just a "you might want to give it a try"; please measure your own code at that particular point. But yeah. No, I didn't — I just chose the last link step, so there may be a case for measuring all of the intermediate parts as well. What I did was cheat: I basically went into the binutils directory the build system had created and made a symlink in there. So that's not very nice, at that particular point, and I don't know, if I'd gone from the start, whether it would have worked. Certainly more work to be done there. I'd certainly say it would stand a good chance with Chromium, because I think some Google-internal people have tried it; I think they may even use LLD on x86 instead of the Microsoft linker. Whatever. OK. So LTO: the thing is, the linker actually doesn't play much of a part in LTO. I know it's called link-time optimisation, but the linker's responsibility there is just to collect the bitcode files together and spit them back at the code generator. So yes, LTO is supported in LLD, but it doesn't actually do much more than say "I see this file is bitcode" and hand it back to the LTO code generator. Yeah. OK. Any more questions? Yes. So, well, OK — sorry, I've got to repeat the question: what is LLD like to develop on? So it's under quite heavy development, and it's quite a dense code base. There aren't actually that many lines of code, but it's quite heavy C++11, and almost every bit of code does something, so it does take a while to get your head around what the various dependencies are. And you know what I mentioned about everything being read once — one of the difficulties about that is the following.
Well, one of the most difficult bits about linker development is knowing what assumptions you can make at each point in the link, because at every point you're going from an abstract representation to a concrete one, and you might find that you've broken the assumptions that something makes later on, or made earlier. So based on my experience, it's slightly harder to get started with, just because it's quite dense. But once you crack that, or get your mind around the model it's using, it's fairly fast to work on, because there aren't that many lines of code; you can search around it very quickly, that type of thing. And it's very actively refactored, so it's probably going to be quite difficult to maintain downstream patches, although it's reasonably easy to maintain downstream features. For example, if you wanted a downstream feature where you call out to a function that implements it and don't touch any of the generic bits of the code, you'll probably be fine. But if you start saying, oh, I've got to add a bit to a data structure, you might find that in a week's time someone's completely changed that. So yes, it is not great for code stability right now, I think, but that's got a good side and a bad side, because it means it does move forward quite fast. Do you know which companies are investing in LLD? OK, so the question was: who's investing in LLD? In terms of companies, as far as I know — and this is public information — the main maintainer is at Google. I think there's some work from Sony Computer Entertainment, and I guess that's also sort of coming from the BSD base; there are quite a lot of people coming from the BSD community. So largely, from companies, there are people from processor manufacturers — Linaro on AArch64, that type of thing. The main MIPS maintainer, I don't know whether he works for MIPS.
But it's very possible that he does. So besides the processor manufacturers wanting support for their own architectures, I'd say the usual suspects among the large companies, that sort of thing. But other than that, look at the maintainers, look at their email addresses. You told us that we should not use LLD if we do not have to use it, but LLD is much, much faster; so the question is, if we try to use LLD in a production system, would we have to worry about breaking the system? Well, all I'd say is that you're at a higher risk. In theory it's all there and it's working. But with ld.bfd you've got every man and his dog linking Linux applications every day with it, so you've got a huge test space full of people. At the moment your risk is higher simply because the user base is lower. Quantifying how high that risk is, I don't know; certainly with the amount of BSD stuff being linked right now, I'd say it's reasonable on AMD64 and AArch64 if your application is not trying to do anything too clever. The bits that give linkers trouble are often complicated linker scripts, large amounts of thread-local storage or weird uses of thread-local storage, or trying to directly place things. So if you're doing a pretty much vanilla Linux application or shared library, you shouldn't have any problems; it's when you're using special linker features that you'll probably find you start getting problems, is what I would say. So, I think one more. Have you tried using a mixture of linkers — say, if I link something with LLD and then a downstream user tries to link against that shared library with the BFD linker? Is that something that's known to work, or does it just need to be tried? So the question is: can you mix linkers — say, develop a Linux shared library with one linker, and then your downstream user uses a different linker to actually consume it? So yes, I have tested that, at least interactively.
Generally that should work, because at the point when you actually link against a dynamic library, pretty much all that the downstream linker — I'll call it that, the one the user uses — is doing is reading the symbol table. And as long as LLD has created the symbol table correctly, and the versions correctly, then all the real work of generating the code to interface with that library is done by the user's linker. So it's fairly low risk. So yeah, that should be fine. Okay, I'd probably best stop there now and change over to the next one, but thank you very much for listening.