OK, right. OK, everyone, we may as well get going. So I'm here to talk to you about what makes LLD so fast. I apologise for the somewhat click-bait title. LLD, as a linker, has a reputation for being faster than the similar linkers on Unix systems, and the same goes for the Windows-compatible driver, lld-link, with COFF. I'm going to stick to the ELF port for this talk; that's where most of my experience is.

Okay, sorry, I need to speak louder. So what am I going to cover today? Before I can talk about why LLD is so fast, I have to tell you a little bit about what it does. I know some people in the room will be intimately familiar with what a linker does, so apologies for that. I'm not going to go into the fine details, just give a high-level overview of the job it's doing, so I can explain why different stages take different amounts of time. I'll give you some numbers on how much faster LLD is than GNU ld, which I might call ld.bfd, or gold, which I sometimes call ld.gold; those are the linkers you get in binutils. I'll then go into some reasons why that might be. I don't think I've got all the answers there; I've certainly got some ideas. It's generally not one thing, it's lots and lots of little things all adding up. And I'll cover some of the non-technical factors too: within every project there's the source code and the executable, but there are also the people that maintain LLD and the user community behind it that drive it in a particular direction.

My own brief history with linkers: I'm working at Linaro, which is a sort of open-source organisation doing work in the ARM and AArch64 area. So I've made some contributions to LLD, mostly adding extra support for ARM and AArch64, and I previously worked at Arm on embedded linkers.

Okay, so what job does a linker actually do? What we're going to look at here is some of the contents of the ELF object file format, because what a linker does is intimately tied to the object file format; then what you might call the generic linker design — there aren't really that many ways you could write a linker that would make any sense, so most linkers follow some form of that design — and a little bit about the individual steps. Apologies, I'm going to run through this relatively quickly because it will be obvious to a lot of you.

So the first thing: a linker binds more abstract names to more concrete names. That's the definition from the only book you can buy on linkers and loaders (John Levine's Linkers and Loaders), which is probably about 18 years old now; pretty much the only use I have for that fact is to taunt graduates that it's older than them. It actually is quite an interesting book, and it has a lot of references to how things came to be; if you want to know why something's called what it's called, this book may well have the answer. Typically a lot of linkers have moved more towards ELF since it was written. But in essence, the summary is that linkers are glorified versions of cat: they've got lots and lots of objects coming from the compiler, and they've got to glue them all together and resolve all the references between the bits.

So the three main things, if you're going to remember anything about an ELF file, are sections, relocations and symbols.
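To make that triad concrete, here's a toy illustration — not LLD code — using the real on-disk types from <elf.h> on Linux. The section index, symbol index, offsets and the R_X86_64_PLT32 relocation type are made-up values for the sake of the example:

```cpp
#include <elf.h>
#include <cstdio>

int main() {
  // "main" defined at offset 0 of .text: a symbol names a place in a section.
  Elf64_Sym mainSym = {};
  mainSym.st_value = 0x0;                             // offset within .text
  mainSym.st_shndx = 1;                               // hypothetical .text section index
  mainSym.st_info  = ELF64_ST_INFO(STB_GLOBAL, STT_FUNC);

  // A relocation lives in .rela.text (so it's associated with .text) and
  // refers to a symbol by index: "patch offset 0x5 of .text to reach symbol 2".
  Elf64_Rela rel = {};
  rel.r_offset = 0x5;                                 // where in .text to patch
  rel.r_info   = ELF64_R_INFO(2, R_X86_64_PLT32);     // symbol index 2, call-style reloc
  rel.r_addend = -4;                                  // typical addend for x86-64 call fix-ups

  std::printf("reloc -> symbol %llu, type %llu\n",
              (unsigned long long)ELF64_R_SYM(rel.r_info),
              (unsigned long long)ELF64_R_TYPE(rel.r_info));
  std::printf("symbol value 0x%llx in section %u\n",
              (unsigned long long)mainSym.st_value, mainSym.st_shndx);
}
```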
So sections are kind of the atoms of linking. The linker is generally not allowed to split them up, because it doesn't really understand what goes into them. There are exceptions to that for certain platforms, but in general you don't want your linker to have to understand what's in the sections; it allows the compiler to emit content knowing the linker isn't going to change it. So think of sections as building blocks that the linker glues together. Relocations are the points where the linker is instructed to fix up a reference, and the relocation tells the linker precisely what calculation it needs to do — and it's that precision that means the linker doesn't need to understand the contents. And symbols are kind of like labels. In that particular example, a function name like main gets put out as a symbol. These three things are linked: you define your symbols in sections (main might be offset zero in the .text section), relocations refer to symbols, and relocations are associated with sections. It's a three-way relationship.

In an executable ELF file, all the same components are used, except you have some extra things called segments, which carry flags like "loadable". So in effect, what a standard linker does for a Linux-like, SVR4-style system — where you're paging the ELF file into memory to load it — is take all the little .text sections from the object files and glue them into one big .text section in the final executable. It's slicing lots of little bits out of the input files and combining them all into one at the end. And the job of a fast linker is to do that in the minimal amount of time.

So here's what you might call the generic linker design. It's very much a pipeline structure: you've got the inputs on the right-hand side there — lots and lots of .o files, some static archives where the linker has to fetch the objects out of them, and the shared object files. The first job a linker has is that it isn't given all the files it needs on the command line; it's usually just told "find this stuff in these libraries". So the first job is to go and look for all the various content it needs to load. Once it's loaded everything, it can do what I'm calling global transforms here. These are things like garbage collection, where in order to throw something away you need to know whether anything in the program references it — and you can only know that nothing references something once you've read everything. Linker-generated content is things like your procedure linkage table (PLT), the global offset table (GOT), that type of thing; I won't go into any detail on those because they take a fairly small proportion of the link time overall. Layout is where the linker has all the content ready to go but needs to assemble it in the right order and give things addresses. Again, this can be interesting for linker performance, because once you've assigned something an address, if you then have to grow something, shrink something or remove something, you've got to recalculate all your addresses.
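Sketched in code, that address assignment is essentially the following loop. The names here (InputSection, outSecOff, alignTo) are hypothetical, loosely echoing LLD's vocabulary rather than quoting it:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Minimal layout sketch: walk the input sections of an output section in
// order, align, assign an offset, advance. Growing, shrinking or removing
// anything afterwards means redoing all of this.
static uint64_t alignTo(uint64_t value, uint64_t align) {
  return (value + align - 1) & ~(align - 1); // align must be a power of two
}

struct InputSection {
  uint64_t size;
  uint64_t alignment;
  uint64_t outSecOff = 0; // offset within the output section, set by layout
};

int main() {
  std::vector<InputSection> dotText = {{0x13, 4}, {0x07, 1}, {0x20, 16}};
  uint64_t base = 0x401000, addr = base;
  for (InputSection &isec : dotText) {
    addr = alignTo(addr, isec.alignment);
    isec.outSecOff = addr - base;
    addr += isec.size;
  }
  std::printf("output .text: 0x%llx bytes at 0x%llx\n",
              (unsigned long long)(addr - base), (unsigned long long)base);
}
```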
So there can be certain transformations towards the end of linking where you end up having to make a change, recalculate all your addresses, redo the change, that type of thing. And then finally you have to copy all of the bits from the input into the output and fix up all the references. That's the thing the linker actually does last; to start with, it's mostly just sucking things in. In general, things get less symbolic and more concrete the further right you go in the pipeline. One thing I want to highlight here is that the order of processing is significant: if you swap two files around on your command line, you'll get a different binary out. That ordering constraint can limit some of the things you might be able to do in parallel, for example.

Right, I'm going to go through this slide incredibly quickly; don't worry about the details. This is just static libraries. Typically what you have is a list of undefined symbols — things the program has asked for references to that the linker doesn't yet have definitions for. The linker has to go through each archive member's symbol table and ask: does this member define one of those symbols? If so, extract it. That member might introduce yet more undefined symbols, so the linker iterates until there are no more symbols it can resolve. At that point you either give an error, if undefined symbols remain, or you carry on to the next stage of linking. That's roughly what it's doing with archive files.

So this is what the internal state of a linker looks like after it's read everything in. It's a directed graph — I deliberately don't say acyclic, because you can get cycles of input sections that reference other sections. One of those sections might hold your main function, and it might call out to printf, which will be in another section, that type of thing. I've put a section up in the top right there that's unreachable. This is how linker garbage collection works: it starts from an entry point, traces out all of the relocations, and sees how many sections it can reach. The sections it can't reach can't be got to by your program — at least not by any legal means, short of somehow knowing the address of one and just jumping to it. Generally, you're on your own if you do stuff like that and enable garbage collection. At this stage the linker has no idea what the address of any of these sections is, so taking one out doesn't really matter for performance; you just reconnect the relocations and carry on.

Layout: this is just the general idea of what it does; I don't want to go through the process in detail. Generally you start with an output section, and you put into it all of the input sections that match its particular pattern. You start with the base address at the top and then add offsets from that base as you go, subject to various alignment constraints — as in the sketch earlier. That's generally a fairly straightforward process for a linker to do.

Okay, relocations, just to give you an idea of what one actually is. A relocation names a symbol to go and look up — find its address — gives you an addend, and tells the linker to do either an absolute or a relative calculation based on those. An absolute relocation just puts the value of the symbol in there: if you ask for the address of a global variable and you're not doing position-independent code, the linker can just splat the value of the address directly into that location. For a position-independent one, it has to add various offsets to that.
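As a hedged sketch — with my own example relocation types, not a list from the talk — the two shapes of calculation look like this, using the standard ELF formulas: S = symbol address, A = addend, P = address of the place being patched. An absolute relocation writes S + A (e.g. R_X86_64_64); a PC-relative one writes S + A − P (e.g. R_X86_64_PC32):

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

enum RelType { Absolute64, PcRelative32 };

// Apply one relocation to the bytes at `loc`. Real linkers dispatch on many
// more types, but they all boil down to small calculations like these.
static void apply(uint8_t *loc, RelType type, uint64_t S, int64_t A, uint64_t P) {
  switch (type) {
  case Absolute64: {
    uint64_t v = S + A;               // splat the symbol's address straight in
    std::memcpy(loc, &v, sizeof(v));
    break;
  }
  case PcRelative32: {
    uint32_t v = uint32_t(S + A - P); // offset from the patched location
    std::memcpy(loc, &v, sizeof(v));
    break;
  }
  }
}

int main() {
  uint8_t buf[8] = {};
  apply(buf, PcRelative32, /*S=*/0x401200, /*A=*/-4, /*P=*/0x401100);
  uint32_t patched;
  std::memcpy(&patched, buf, sizeof(patched));
  std::printf("pc-relative value: 0x%x\n", patched); // 0xfc
}
```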
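And stepping back to the garbage-collection step above for a moment: the reachability trace is essentially a worklist walk over that possibly-cyclic section graph. A minimal sketch, with hypothetical names, where the edges stand in for a section's relocations:

```cpp
#include <cstdio>
#include <vector>

struct Section {
  const char *name;
  std::vector<Section *> refs; // sections this one's relocations point at
  bool live = false;
};

// Mark everything reachable from the entry point; anything left dead can go.
static void markLive(Section *entry) {
  std::vector<Section *> worklist = {entry};
  while (!worklist.empty()) {
    Section *s = worklist.back();
    worklist.pop_back();
    if (s->live)
      continue; // already visited: this is what makes cycles harmless
    s->live = true;
    for (Section *t : s->refs)
      worklist.push_back(t);
  }
}

int main() {
  Section text{"main"}, put{"printf"}, orphan{"unreferenced"};
  text.refs = {&put};
  put.refs = {&text}; // a cycle, like the one on the slide
  markLive(&text);
  std::printf("%s: live=%d, %s: live=%d\n",
              put.name, put.live, orphan.name, orphan.live); // orphan stays dead
}
```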
So we've now gone through what a linker does; I can move on to how fast LLD is.

This next section comes with a bit of a warning on the numbers. The differences between the measurements I've done are quite large, and I've not done a huge in-depth benchmarking study under scientific conditions. But the differences are sufficiently big that machine jitter and variance aren't going to be significant. That's the machine I ran most of my benchmarks on: a fairly fast machine with a lot of memory. Your mileage will obviously vary if you have a smaller machine, lower core count, or less memory. I have done some tests on a ThunderX1, which is an AArch64 machine with quite low single-thread performance in relation to a Pentium, but it's got an awful lot of cores. And I certainly see similar differences there — in some cases bigger ones, depending on how much multi-threading the program can exploit.

Okay. So this is a table showing some results I got; it's quite a confusing thing to read at first. The programs I chose to look at were basically some things I had lying around at the time that I know to be large. Clang itself is statically linked against a lot of libraries and comes to probably about 30 to 40 megabytes of code. I've deliberately included Clang both with and without debug information, just to show the impact adding debug information has: Clang might have about 40 megabytes of code, but it might end up with 600 or 700 megabytes of debug data to go through. The difference between -O2 and -O1 is that at -O2, LLD will try to optimise strings by doing tail merging, and that process is quite expensive; at -O1 it will just merge identical strings, and that can be done in parallel. So you'll see a jump in speed between -O1 and -O2, and that jump is higher for debug builds because there are a lot of mergeable strings in debug sections. To compare against that, I've also looked at libchrome.so on AArch64 — that's actually building Chrome for Android — just to show you a different program with different characteristics. It's not that much bigger than Clang itself, but it's compiled differently and has a lot more, smaller files.

Just to give you a general idea of the speed, though: even with threading turned off, LLD comes out around twice as fast as gold, and about two to five times faster than GNU ld (ld.bfd). And with threading enabled, in many cases LLD gains more performance from it than gold does with its threading enabled.
I used four threads there because gold's threading architecture seems to top out at about that, and it can actually regress in performance when you add more threads. I used the value that the Chrome build scripts use, on the assumption that they've done their benchmarking and decided four was the best number for that particular program.

It would be really convenient if every single program had exactly the same pattern of accesses, so I could say "this bit of the linker is where the problem is". Unfortunately, different stages are stressed by different programs. For example, -ffunction-sections compiles every function into its own section, which makes it easier for the linker to throw away unused functions; if you've got, say, ten functions in each object file, that multiplies the number of sections the linker has to deal with by ten. So it does change the characteristics somewhat. And debug data, as I said before, adds a huge amount of generally quite simple information to process, but it also affects various different parts of link time. C++ programs are generally more challenging to link than C, due to the extra symbols and sections the compiler synthesises for them.

Okay. So what I've got here is a very rough indication of the sizes of some of the components; these are approximations and I've rounded them very aggressively. From on the order of 2,000 object files, you might get around 500,000 symbols — if you think that every file might have, say, five functions, each of those has a symbol, each section has a symbol, that type of thing, you can see how these things scale. You'll notice that the number of relocations is very much higher than the number of symbols: typically each function might call three or four functions and reference some global data, so on average you end up with more relocations per function than symbols. And the final amount of data — I've probably underestimated how big the program actually was; it says 80 megabytes here.

When I add debug data, I gain about 100,000 extra sections, 100,000 symbols, 100 million relocations and 1.6 gigabytes of data. So if you're wondering why your link has suddenly got slow, one answer is: you've enabled debug information. And if anyone's ever tried to build LLVM by default on a machine with multiple cores: the make system will use as many cores as it can, and because LLVM is linked statically — all the libraries are linked statically, and there are lots of binaries almost as big as Clang — you'll saturate even a 32-gigabyte machine and bring it to its knees horribly. So there are special facilities in the LLVM build system to limit the number of parallel links.

Okay, so libchrome, just as a comparison. Instead of 2,000 files we've suddenly got 21,000 files, and millions of sections and symbols, but a similar number of relocations. I think that just means it's compiled with -ffunction-sections, so those numbers look a bit different. And it's about a similar size of program, maybe slightly bigger.

Okay, so this is what happens with Clang plus debug information.
This is basically me putting some timers into LLD to find out where the time goes, and I've highlighted some of the bigger items. I don't know whether I've caught absolutely everything here, so there may be cases where this doesn't add up to 100%; if anyone wants to call me out on that, I'm aware of it.

An interesting thing to observe is that the first link is from scratch, with no file cache involved at all. For subsequent links, if you've got enough memory on your system, everything's basically cached in memory and doesn't need to go to the disk much, so you can end up with substantially faster times for second links. You may say I'm cheating by quoting how fast LLD is when everything's in memory. But typically, the way most people use a linker, it's not the end-to-end build that's important — that might even get run overnight. It's the incremental build: I've changed something in one object file, and I don't want to wait ten minutes. So I'm going to claim — or maybe hide, or duck — that it's really the performance on the right-hand side that's more important. The left-hand column is shared by all linkers; it's a load of time spent on disk. But it does skew some of the operations. The split-sections step is something the linker does when it's preparing to merge strings; that can be an expensive process if it's actually having to read contents off disk, but once everything's in memory it's fairly quick. By contrast, the merge-sections phase itself at -O2 — where it's trying to do tail merging — is single-threaded, and because of the debug data there's a huge number of strings, so it takes an awfully long time. If we go down to -O1, where it can be done in parallel, that time drops quite considerably. And where "write output file" looks bigger on the right-hand side, that doesn't mean it got longer; it's just a higher proportion of the overall link — the overall time has gone down considerably just by dropping that merge. So I guess the lesson is: if you don't care about the most minimal output, the LLD default of -O1 for string merging is probably the way you want to go.

So, Chrome for AArch64, just as a comparison. Now, something went wrong on the left-hand side, because a large proportion of the time disappeared and this is nowhere near 100% — there's a timer I've missed somewhere, and I'll need to go and look at that at some point. But as you can see, the numbers are similar: merge sections at -O2 comes out quite high. You'll notice that code layout has got more expensive. Part of that is down to AArch64 being slightly more complicated, in that you have things like thunks, where you've got to extend the range of branches. And in this particular build, the fix for the Cortex-A53 erratum is on, which means scanning bits of the code, and that can slow things down. As a word of warning, one thing I did try was Chrome for 32-bit ARM, and by default gold and BFD turn on the fix for the Cortex-A8 erratum there — and that particular fix took up 60% of the link time on GNU ld. So if you know your program's never going to run on a very old Cortex-A8, disable that workaround for faster GNU ld performance.
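As a rough illustration of why the -O1 behaviour parallelises well: merging identical strings is just hash-and-deduplicate, which can be sharded across threads, whereas tail merging also has to discover suffix sharing ("bar" reusing the end of "foobar"), which is the expensive, serial part -O2 adds. A hedged, single-threaded sketch of the -O1 idea, with the sharding omitted:

```cpp
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
  // Strings pulled out of mergeable input sections (made-up inputs).
  std::vector<std::string> input = {"clang", "lld", "clang", "llvm"};

  std::unordered_map<std::string, size_t> offsetOf; // string -> output offset
  std::string pool;                                 // merged output section
  for (const std::string &s : input) {
    auto [it, inserted] = offsetOf.try_emplace(s, pool.size());
    if (inserted) {
      pool += s;
      pool += '\0'; // strings in ELF merge sections are NUL-terminated
    }
    // Every reference to `s` gets redirected to it->second in the output.
    std::printf("\"%s\" -> offset %zu\n", s.c_str(), it->second);
  }
  std::printf("merged %zu strings into %zu bytes\n", input.size(), pool.size());
}
```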
OK. So I've gone through some of the numbers to show you roughly what the speed differences are; now let's look at why that might be. One of the things you have to do if you're going to build any fast program is look at the characteristics of that particular program. Every linker you write is going to have to splice a lot of data from a large number of files into one file. There's a principle called "smart format, dumb linker": the idea that the linker doesn't need to understand what's in a section, it just operates on those sections. So any optimisation you write has to run over a large amount of data without choking on it. That generally means being at least aware of the number of iterations, and making sure your algorithms don't go n-squared when the inputs are going to be large. Typically, most algorithms a linker comes across are fairly simple.

The other interesting thing is the internal representation. Unlike a compiler — where you start with an AST, then go to LLVM IR, then to MIR, then to MC, so your internal representation changes about five times and you never go back to the previous one, meaning you can dump it from memory — in a linker the sections, relocations and symbols stay all the way to the end, and it's very difficult to throw memory away. And there's probably more opportunity for parallelism within stages of the pipeline than across them. You might think you could do an incremental thing, where you link as much as you can of the first object file and then add the second one. In theory you could, but the amount of bookkeeping you'd have to do to make that work would probably not make it worthwhile in the end.

So, the abstractions LLD uses. This is the contrast with BFD, because BFD has a file-format abstraction covering the three or four different file formats that were around back at that time; LLD's ELF linker concentrates on just ELF. LLD, in contrast to a lot of other LLVM projects, is also structured as a program first and a set of reusable libraries second: if you want to use LLD as a library, you're effectively wrapping the program. That has caused some controversy at times, because some people want a more LLVM-like design. But in general, no one has yet come up with a use case that would really benefit from that; things may change if somebody does.

Linker-generated content is actually represented with fragments of input sections, which we call synthetic sections, as opposed to creating it in the output. That doesn't sound significant, but it can make things like adding thunks for certain architectures much, much faster. I'll skip the rest of those for time.

Another thing — I don't need to go through the whole list — is that LLVM provides the ADT library, with many optimised versions of common data structures. LLD gets a strong leg up in performance just by using those rather than the standard library types.

Okay, so the bit I mentioned earlier about memory management. What the long-lived representation effectively means is that you're reading or allocating an awful lot of memory towards the start of the program, and you can't free it until you've written the data out at the end. That's ideal for a region-based memory allocator, which in effect says: allocate me a large block of memory, and then just increment a pointer every time you do an allocation. Because you never need to throw any individual bit away, you can throw the whole lot away at the end, and that makes memory allocation very fast. I do believe gold still uses malloc and free, and you can see that in its performance profile. So this does make a difference.
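A minimal sketch of that idea — LLVM's real allocator is llvm::BumpPtrAllocator; this hand-rolled version just shows the shape, and skips details like oversized allocations:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Region/bump allocator: grab big slabs, hand out pieces by bumping a
// pointer, and free everything at once when the allocator is destroyed.
class BumpAllocator {
  static constexpr size_t SlabSize = 1 << 20; // 1 MiB slabs
  char *cur = nullptr, *end = nullptr;
  std::vector<char *> slabs;

public:
  // Assumes align is a power of two and size <= SlabSize (it's a sketch).
  void *allocate(size_t size, size_t align) {
    uintptr_t p = (uintptr_t(cur) + align - 1) & ~uintptr_t(align - 1);
    if (!cur || p + size > uintptr_t(end)) {
      slabs.push_back(static_cast<char *>(std::malloc(SlabSize)));
      cur = slabs.back();
      end = cur + SlabSize;
      p = (uintptr_t(cur) + align - 1) & ~uintptr_t(align - 1);
    }
    cur = reinterpret_cast<char *>(p + size); // the whole "free" story: a bump
    return reinterpret_cast<void *>(p);
  }
  ~BumpAllocator() {
    for (char *s : slabs) // one-shot free at the end of the link
      std::free(s);
  }
};

int main() {
  BumpAllocator a;
  int *x = static_cast<int *>(a.allocate(sizeof(int), alignof(int)));
  *x = 42;
  std::printf("%d\n", *x);
}
```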
Now, one of the caveats of this region-based memory allocator LLD uses: I don't think it's thread-safe — at least, I couldn't see any evidence that it was. That means you can't allocate memory in any of the parallel bits, which then influences how LLD does parallelism.

Okay, so, parallelism in linking. These are some of the opportunities you might have. You could potentially use it when reading the input files. Unfortunately, because order of processing is significant, you'd have to do quite a lot of bookkeeping to make sure everything comes out deterministic in the end. So LLD chooses not to do that; it reads inputs single-threaded. And as we saw before, that part of linking wasn't taking the most time, so even parallelising it wouldn't save you an awful lot. Global transforms: a lot of those are fairly cheap, though some can be quite expensive. Things like ICF — identical code folding — and string merging can be quite expensive, but they're also quite difficult to parallelise. LLD has spent the effort to parallelise those algorithms, or at least parts of them, to get as much performance out as possible. You could, in theory, assign some addresses in parallel during layout, but again the bookkeeping increases, and it's not a lot of link time anyway, so there's not much point. Towards the end, where we've got the copying of the sections: once you've actually done all of your layout, calculating any individual relocation is independent of any other, so you can do that whole lot at the end in parallel — and that's where the vast majority of the time saved by parallel linking comes from. That's particularly visible when you've got a huge amount of debug data, because it's mostly a huge load of stuff you just have to copy, so you can blat it out in parallel across multiple threads very easily.

So yeah, the general rule for LLD's threading model is: keep it as simple as possible. We don't want a complicated task-based system like gold has. In effect, all we actually use is a parallel for-each, which runs a function over an independent set of data. We can't allocate memory in there, and we can't share state in there; what we do is have single-threaded phases that partition the data into independent parts and then call the parallel for-each on those parts. We've found that model has worked quite well for us so far, and typically LLD gets a higher degree of parallelism out of it than gold's worker threads and tasks do.
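A hedged sketch of that model — a hand-rolled stand-in for LLVM's real parallelForEach in llvm/Support/Parallel.h, just to show the shape: partition up front, then no allocation and no shared state inside the loop body:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

// Run fn over each element: single-threaded code picks the chunking, then
// each thread works on its own disjoint slice.
template <typename T, typename Fn>
void parallelForEach(std::vector<T> &items, Fn fn) {
  unsigned n = std::max(1u, std::thread::hardware_concurrency());
  size_t chunk = (items.size() + n - 1) / n;
  std::vector<std::thread> threads;
  for (unsigned t = 0; t < n; ++t) {
    size_t begin = t * chunk, end = std::min(items.size(), begin + chunk);
    if (begin >= end)
      break;
    threads.emplace_back([&items, begin, end, &fn] {
      for (size_t i = begin; i < end; ++i)
        fn(items[i]); // must not allocate from the bump allocator or share state
    });
  }
  for (auto &th : threads)
    th.join();
}

int main() {
  // Stand-in for "apply every relocation independently at the end".
  std::vector<uint64_t> relocs(1000, 7);
  parallelForEach(relocs, [](uint64_t &r) { r *= 2; });
  std::printf("%llu\n", (unsigned long long)relocs[0]); // 14
}
```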
So here's an example of how things might be parallelised — I'm going to have to go through this really quickly. Identical code folding basically looks for functions that are binary-identical so that they can be merged into one. In this particular example, you can see F1 and G1 are the same, F2 and G2 are the same, and H1 and H2 are slightly different, so they can't be merged. In effect, the algorithm has to find equivalence classes among the functions and then merge the functions within each equivalence class. I'm probably not going to get time to go through this, so it's probably best I leave it for you to look up — there is actually a huge comment in the LLD source code that walks through it — but what I was going to show here is how you can split an algorithm like that into pieces that can run in parallel.

I think I've got time for questions; I've only got two more slides, so I'll run through them. What I've gone through here is some of the technical reasons why — or how — LLD gains its speed. There are also some non-technical factors. Many of the contributors to LLD are from companies with very, very large programs; they monitor its performance, try to keep LLD as fast as it can be, and report bugs when it slows down. There is a performance-monitoring bot that constantly runs LLD against a suite of large programs, so we can detect when things slow down, and code reviewers will look out for that too. That can cause controversy, particularly when people want features added and upstream says no, because you can do it some other way, or whatever. Typically it's about keeping the code as simple as possible, but sometimes that means that things — you could almost say niche things — don't get supported as fast as they perhaps could be. LLD tends to implement what people need, not necessarily everything the spec says is there.

Okay, so, the conclusion I had. There are a number of technical factors: thin abstraction layers, a custom memory allocator, use of threads, optimised data structures. It's a fairly simple answer, really — there's nothing magic about it, just lots of little things. There's a reference at the end to the gold design document, which goes through BFD's problems extensively and explains much the same story. LLD's structure is actually fairly similar to gold's; I'd say the majority of the speed difference between LLD and gold is not down to design — it's pretty much the data structures used and the extra use of parallelism. I could foresee gold being improved if somebody did want to work on that; how close it would get, I don't know. Okay, so here are some references for you if you want to go and figure some of this stuff out. But that's really all I had. So, any questions for the remaining time?

Okay, hello. "So the JIT engines in LLVM use custom static and dynamic linkers. Right. So is that a good enough use case for having LLD replace that code?" So the question was: LLD is separate from the custom JIT dynamic linkers — is there a use case for combining them? That question did come up a few years ago. My guess is probably not. Generally, a lot of LLD probably wouldn't be as fast for that particular use case.
Because in effect, the only thing those dynamic linkers are doing is resolving relocations. They could share the design as such, but typically those dynamic linkers have different data structures representing the symbols, because they have different needs; each of them is optimised for its own case. And as soon as you start trying to mix the two code paths, you end up with probably more trouble than you really gain. So in effect it is possible, but the chances are you wouldn't gain very much by doing so, I don't think.

Okay, hello. "Some numbers on how much you lose by not doing the string tail merging?" So, by not doing it at all? I didn't try that. It's one of these things: if you don't do any merging at all, then your output is much bigger, so any time you save by not doing the calculation is offset by how much more you have to write out. My guess is it would probably not be much different from the -O1 case, where it just merges identical strings, because all that algorithm does is hash each string, and if the hashes match it's basically one run through all the strings comparing them. It's not that computationally expensive. So my guess is it wouldn't save that much, but I've not done the numbers.

Okay, hello. Right. So the question was: have we looked into peak memory usage, and have we tried reducing it? I have done some anecdotal looking at that, basically checking peak memory usage with time while running some of the programs here. Typically, because all of the inputs are mmapped, that's where a lot of the virtual address space goes, and LLD and gold are roughly comparable there, slightly ahead of BFD, but it's not really that much. Pretty much all the mainstream linkers come in roughly comparable in the amount of virtual address space they use. I think you'd need a fundamentally different linker design to get significant savings; unfortunately, it's one of those cases where you'd have to keep going back to disk a lot of the time, and it would probably be substantially slower. There may be small savings you could make, but they'd be of the order of: if you're allocating gigabytes of memory, you might save a few hundred megabytes. It's probably not worth it.

Okay. Oh, sorry. Yeah. Right. So the question was: there is actually a use case someone has for linking together large amounts of objects and using LLD as a library — would upstream accept patches? I think the answer is probably yes, given a use case you can outline — "we're trying to do this specific thing, here are our specific pain points" — and patches that aren't going to derail the good bits of the program-first design. If you can say "these bits are additions and they don't conflict with anything", I think that's a reasonable thing to ask for. The things that would get pushback are, "here are the modifications, but we've suddenly made all of this much more difficult", I guess.
But yeah, fundamentally, giving it a try is probably the best bet.

Okay. So the question was about strings that you have to keep distinct for pointer-equality reasons. The strings we're talking about here, the ones that go into these mergeable sections, are things like string literals that you don't take the address of — or aren't supposed to take the address of — or debug names. And in terms of ELF, ELF allows you not to do the optimisation. But yes, I do know what you mean: stuff whose address is taken can't go into those sections, basically.

Okay. That's the last question. Well, thank you very much for listening.