Okay, I'll start now. So thanks very much. Welcome to this talk on debugging with LLVM. My name is Andrzej, this is my colleague Graham, and I'll try to showcase technologies within LLVM that will help you with solving various problems. Both Graham and myself are compiler engineers at Arm; we work on the Scalable Vector Extension support for AArch64. None of the technologies that we'll be describing today is something that we work on. It's something that we use on a daily basis, something that we're very fond of, and we feel it deserves a bit more advertising. This presentation is split into two parts: I will introduce and talk about LLDB to begin with, and then Graham will jump in and introduce you to the LLVM sanitizers.

LLDB. One of the nicest things about LLDB is that it's very compatible with GDB, and I think that's important, because a lot of people are familiar with GDB and are concerned about switching to LLDB. And this is apparent when you look at what the architecture of LLDB is. Upper-case LLDB is just a sub-project within LLVM. That sub-project implements a bunch of binaries, like lower-case lldb and lldb-server. As a user, you'll be typing commands into lldb, but behind the scenes it's actually lldb-server that does all the heavy lifting: memory read, memory write, register read, register write. The two communicate with each other using the GDB remote serial protocol, which means that, in principle, lldb-server implements the same protocol as gdbserver and can be a drop-in replacement for it, and the same with lldb. In principle it works — I've tried it in certain scenarios — but obviously there will be some clashes. Still, I think it's fairly obvious that they tried to make it compatible with GDB, and I think that's important. If you're interested in the GDB remote serial protocol, it's fairly straightforward: every packet starts with a dollar sign, followed by the data, followed by a hash sign and a checksum.
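To make that framing concrete, here is a tiny Python sketch of how an RSP packet is wrapped. The helper function is just an illustration I wrote for this, but `g` (read all registers) and `qSupported` are real RSP packet names:

```python
def rsp_frame(payload: str) -> str:
    """Wrap an RSP payload in $...#checksum framing.

    The checksum is the modulo-256 sum of the payload bytes,
    rendered as two lowercase hex digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"

# 'g' (read all registers) and 'qSupported' are standard RSP packets.
print(rsp_frame("g"))           # $g#67
print(rsp_frame("qSupported"))  # $qSupported#37
```

This is the same framing you will see on the wire when you enable packet logging, as shown next.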
And LLVM actually extends the protocol, because LLDB works on a lot of complex platforms. But what's interesting from a user perspective: you can use `log enable gdb-remote packets` within LLDB and see all the packets being exchanged between lldb and lldb-server. That can be useful, because if you're on a new platform and something doesn't work and you're wondering why, you can use it to see where packets are being exchanged and not acknowledged — or if you just want to see how, for instance, registers are being requested and read. So this can be just a fun thing to use. In fact, LLDB has a lot of logging channels, and you can list them by typing `log list`; the list is very long.

And this brings me to the structure of LLDB commands. Each command is basically a noun, like breakpoint, followed by a verb, like set, followed by a bunch of options. Another example is `process launch`, which basically means start debugging. These are a bit verbose, but there are aliases, so you don't have to worry about it, and I'll show some examples. If you're stuck and you don't know what you're looking for, or you don't know where to start, you can just type `apropos` and a keyword. The keyword doesn't have to be a command — it can be something fairly random — and LLDB will display a list of help pages with relevant info, and then you can look up the corresponding help page.

But I think it's much more interesting to see this stuff in action, so I have a few examples. The first one is not complex: it's basically a main function that declares an array of names — Andrzej and Graham — loops over all the names, and prints either "hello Andrzej" or "goodbye Graham", depending on which name it sees, or "I am confused" if it sees something else. And if I now build and run it — oh, sorry, yeah, sorry for that — it prints hello Andrzej, goodbye Graham, because that's what's implemented. Now let's run it in LLDB.
[From the audience: when you type, always say what you're typing, because I can't see it.] I'm only interested in the LLDB commands, yes. So I'm starting LLDB and typing `breakpoint set -n main` — that's the verbose way of setting a breakpoint. I set the breakpoint, I run my process, and it displays assembly, which is not what I wanted, because I didn't build with debug info. So let's do that again. Okay, and immediately you see how nice LLDB is, because it displays the source code with the surrounding context. It highlights where you hit the breakpoint, including row and column, which is useful because sometimes you can have multiple expressions on one line. And if I now step through — I'm typing `n` in LLDB, which is an alias for stepping over, compatible with GDB — it re-displays the same view and highlights exactly which line you're stopped on.

But what I really like about LLDB is the expression evaluation engine. You can basically type any C++ — modern C++ — code that you want. So now I'm overwriting the contents of the character array: I type `expr`, for expression, and then your C++ expression — in this case `for (auto &name : names) name = "Pinky"`, which is just a random name. And now you can see that it has overwritten it. So it JIT-compiled the code, did something behind the scenes, and now it's changed. And if I continue, it prints "I am confused", which is fantastic, because sometimes — and I do it all the time — you might be tricked into thinking: okay, I'm going to modify my source code, recompile, rerun, and then debug it and see what happens. You don't have to do that. LLDB leverages libclang and has access to a very powerful JIT, so you can do stuff like that live.

Let's try another example. This one — the source is not very important. It implements a very simple multi-threaded increment of three numbers.
So it increments x, y, and z up to some very big value, and it does so in three threads. If you look at it in LLDB — this time I'm using just `b` to set the breakpoint, because there's an alias; I don't have to type `breakpoint set`, `b` is fine. That's something you know from GDB. And I'm setting a breakpoint on the function that increments x, because I implemented the file, so I know it's one of the methods that's executed in parallel to do the increment. And again, it displays the relevant code. It doesn't fit on the screen, but you can see the breakpoint being hit over there. And because this is multi-threaded, I know there will be multiple threads stopped, so I'm typing `thread list`, and indeed there are three threads. I want to know what's happening with these three threads, and for that, I think it's really nice to use the GUI, the graphical user interface. I'm typing `gui` in LLDB, and it takes you here. You get this for free, out of the box, and it just works. The main window is your source, highlighting where you hit your breakpoint. Your variables from the current frame are displayed here, and there's the process pane — that one is the most interesting, because you can expand it and you have your three threads, all fine. Then you can jump between the threads, and it shows you exactly where you are in each thread. And you can also go over the different frames within each thread, which I think is fantastic and really useful.

Another feature is the embedded Python interpreter. You just type `script` and you enter a Python interpreter inside your debugger, and you can type basically anything you know from Python. What I'll show you is an example of using LLDB's Python bindings. I'm typing `print(lldb.target.GetTriple())`. If you're familiar with LLVM, you'll know that a triple is something LLVM uses internally to describe the platform you're on. So I'm just printing that using the LLDB Python bindings, and it's x86_64 macOS.
There's many more — many, many more. And as a matter of fact, you can use those Python bindings to script your debugger and extend it. In order to present that, I'll use a hobby project of mine called llvm-tutor. It's a collection of passes, and if you're familiar with LLVM passes, you'll know that each one implements this run method — but it's either runOnModule, runOnBasicBlock, runOnFunction, or just run, and then you have the new pass manager versus the legacy pass manager. If something breaks, I just want to start LLDB and hit a breakpoint, and I don't want to think about which method it needs to be; I want to just set a breakpoint. So let's try.

I'm starting LLDB, and this is a long command, so let me quickly explain what it does. I'm debugging opt, which is LLVM's tool for running passes implemented in shared objects. I'm specifying which shared object I want to load — this one, which is something I implemented — and then just a bunch of options to specify which pass I want to run. So this is a bit complex, because I'm actually debugging opt, and my pass is dynamically loaded — well, not yet, because I haven't started yet. And I'm thinking I will be lucky. So I'm setting a breakpoint: I type `b runOnModule`. And I'm thinking I'm being lucky, because the debugger is clever and it finds 73 locations. Makes sense: I'm debugging opt, which belongs to LLVM, which has a lot of passes, so there will be a lot of runOnModule methods. But I know it hasn't found the one I'm interested in, because if I do `image list` within LLDB, it tells me which shared objects have already been loaded — you have the full list, and you can check that my pass hasn't been loaded yet, because I haven't started opt. Anyway, I'm being naive, I'm thinking I'm being lucky: I type run, and I hit the breakpoint inside my pass. So what happened automatically behind the scenes: opt started running, and at some point it loaded the pass, the shared object.
LLDB notices that and goes: right, you're loading some new object; earlier you requested some breakpoints; I'm going to investigate this new object and see whether maybe I can set a breakpoint in it. And there you go — it just works, fantastically, behind the scenes. Now, you may be taking this for granted, but a few years ago stuff like this didn't work, and it's quite impressive that it just happens to work.

And that was just one pass. Now I'm doing exactly the same thing but with a different pass — it's just a different shared object and different command-line options, but otherwise I'm doing the same thing. So `b runOnModule`, exactly the same exercise, 73 locations found, and run — and it's not hitting the breakpoint. The pass runs and displays its output, but the breakpoint wasn't hit, because this time the method is not runOnModule. However, I have this command that I implemented in Python — it's not complicated — called llvm-tutor, because that's my project, and it simply parses the command line that I passed to LLDB; it has access to that via the LLDB Python bindings, so it's not really magic. It identifies: okay, you loaded HelloWorld — it knows that because it just checks the name of the file — and sets a breakpoint accordingly. And it knows where to set it because I gave it a map from pass to the method to stop on. So if I now type run, it has again automatically set the breakpoint, and it hits it.

And you can do a lot of things like — oh sorry, I removed this one. But let me show you how I implemented that command. There are two steps. The first step is the .lldbinit file, which is loaded when you start LLDB and contains a bunch of things that you want done whenever you start LLDB. First, `settings set` — again, a noun followed by a verb — which here says: display 20 lines of code around where you hit the breakpoint or where you stop. That's the setting I use on my computer. The other thing is `command script import commands.py`.
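As a minimal sketch of the idea behind such a commands.py: the plugin and method names below are hypothetical stand-ins, not the real llvm-tutor ones, and the lldb-specific calls are left as comments, since they only run inside the debugger.

```python
# Map from plugin file name to the method to break on.
# These names are made up for illustration.
PASS_BREAKPOINTS = {
    "libHelloWorld.so": "HelloWorld::run",
    "libOpcodeCounter.so": "OpcodeCounter::runOnModule",
}

def breakpoint_symbol(loaded_plugins):
    """Return the symbol to break on for the first recognized plugin path."""
    for path in loaded_plugins:
        name = path.rsplit("/", 1)[-1]
        if name in PASS_BREAKPOINTS:
            return PASS_BREAKPOINTS[name]
    return None

print(breakpoint_symbol(["/tmp/libHelloWorld.so"]))  # HelloWorld::run

# Inside LLDB, this logic would be wired up roughly like:
#   def tutor_break(debugger, command, result, internal_dict):
#       symbol = breakpoint_symbol(plugin paths from the debugged target)
#       if symbol:
#           debugger.HandleCommand("breakpoint set -n " + symbol)
# and registered from .lldbinit with:
#   command script import commands.py
#   command script add -f commands.tutor_break llvm-tutor
```

The point is just that the custom command is ordinary Python: look at what was loaded, consult a table, and hand a regular breakpoint command back to LLDB.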
So this basically says: take that file and load it. And finally, on this line, I'm just saying: take the file that you just loaded, take the method that implements my custom command, and alias it as llvm-tutor. And that's it, that's done. And that's more or less what I had on LLDB. Here are a few aliases available within LLDB — I actually copied some of them from LLDB's official documentation. I recommend you go there; the list is very long, and there are a lot of aliases like this. I showed the expression evaluation engine, and we talked about the Python interpreter and implementing custom commands. Now, a few links. The LLDB tutorial, available on the official LLDB page, is amazing: there are a lot of examples and it's written very, very well. So I recommend visiting that and trying to follow the examples. The GDB RSP protocol is really, really well described over there too. And if you're interested in my examples, you can also look at them. Now, there are a lot of things that you cannot do with a debugger, and you can deal with them using a sanitizer — and that's something that Graham will talk about.

Okay. Right, so what is a sanitizer? It's a feature in LLVM — and GCC as well — that allows you to build your program with additional instrumentation code that helps you find bugs that you didn't know were there. And it's pretty easy to start using, in that you just need this extra flag on your compile line: -fsanitize= and then the name of a sanitizer. There are several sanitizers available, for things like figuring out whether you've come up with an invalid memory address and tried to dereference it, or whether you've invoked some undefined behavior from C. And it performs this instrumentation by wrapping key operations within your program: things like loads and stores will call off to an internal routine that's linked in, building up a map of memory showing which addresses are valid and generating a log of where a piece of memory came from.
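That "map of memory" is, for the address sanitizer, shadow memory: each small granule of application memory has a shadow byte describing whether it's valid. A sketch of the classic mapping on x86-64 Linux — the offset below is the documented default for that platform; other platforms use different constants, and this is of course a simplification of the real runtime:

```python
# Classic AddressSanitizer shadow mapping on x86-64 Linux:
# each shadow byte describes an 8-byte chunk of application memory.
SHADOW_OFFSET = 0x7FFF8000
SHADOW_SCALE = 3  # 2**3 = 8 application bytes per shadow byte

def shadow_address(app_addr: int) -> int:
    """Shadow byte describing the 8-byte granule containing app_addr."""
    return (app_addr >> SHADOW_SCALE) + SHADOW_OFFSET

# Adjacent 8-byte granules map to adjacent shadow bytes:
print(hex(shadow_address(0x1000)))  # 0x7fff8200
print(hex(shadow_address(0x1008)))  # 0x7fff8201
```

Since the shadow region has to cover the whole application address space at one-eighth scale, a huge range gets reserved up front — which is where the alarming virtual-memory numbers come from.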
And if you run top while you are executing one of these, you will often see something like a virtual memory size of 20 terabytes, which looks a little alarming at first, but it's not really that big — most of that address space is just reserved up front, not actually used. It does have an overhead of somewhere between three and fifteen times, space-wise, depending on which sanitizer and which options you use, but generally it's not that bad: unless you're going to do something like launch six Electron apps at once and have them all instrumented, you're probably not going to run into any problems.

The behavior when you encounter an error is tunable. The default is to just report that an error occurred and then keep on going and hope you make it through — depending on what it was, you may crash anyway. There's an option, -fno-sanitize-recover, which says: no, as soon as you see an error, I want to stop. And if you're just using the undefined behavior sanitizer, which doesn't need the extra tracking of memory, you have the option to just plant a trap instruction and not link in the compiler runtime at all. These can be combined to some degree, so if you wanted to track whether you'd hit a signed integer overflow, but then definitely stop if you walked off the edge of an array, you can do that. But the address, memory, and thread sanitizers all want to control the heap and intercept malloc and friends, so you cannot use those together.

So let's go into a brief example of what we can actually get out of this. I've got a simple example here over two files, and I suspect a few of you have already seen what the bug is. If I compile this — we're expecting a zero here, because we're summing up all the integers in the array and we had initialized them all to zero — so something's gone wrong. If we rebuild with the address sanitizer and rerun, it'll now show you that, yeah, you've gone off the edge.
So what it shows you is that you've got a read of size four — so one integer — from your loop, and that the memory it's closest to is a 40-byte region — your 10 integers times four bytes — which was allocated in main; you've hit the red-zone region just beyond that. What this simple example shows is that this isn't just instrumenting your local code within a function; this is a global bounds-checking mechanism: memory that's been allocated in one file can be protected and guarded against overruns in another file. So the address sanitizer detects out-of-bounds accesses like that, but it also picks up use-after-free, use-after-scope, and double-free. On Linux, leak detection is on by default, so you can detect leaks in your program, and you can also detect initialization-order problems: if you have init functions for your various libraries, and they might reference memory that hasn't been set up yet by another init function in another library, it can detect that kind of problem.

Then the undefined behavior sanitizer. Undefined behavior is a class of behavior within the C standards which is not implementation-defined — implementation-defined means there's a known meaning that just depends on the vendor; undefined means anything is valid if you do these things. Signed integer overflow is a case of that, and in theory, deleting all of your home directory is a valid response to trying to run that code. In practice, the compiler generally just deletes some big chunk of your program and things go subtly wrong. It can also catch very similar cases that are not technically undefined behavior — unsigned integer overflow, say — but are nevertheless useful to catch, because you generally don't expect a very large number to wrap around to a very, very small number, or the opposite. So here we were trying to find an offset between 8 and 32, and that's not quite correct.
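To make the signed-overflow case concrete, here is a small Python model of the rule the sanitizer enforces on every instrumented 32-bit addition. This is only an illustration of the check, not how UBSan is implemented:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def checked_add_i32(a: int, b: int) -> int:
    """Add two int32 values, flagging the case UBSan would report."""
    assert INT32_MIN <= a <= INT32_MAX and INT32_MIN <= b <= INT32_MAX
    result = a + b
    if not (INT32_MIN <= result <= INT32_MAX):
        # In C, this addition would be undefined behavior; UBSan reports
        # it as a signed integer overflow at runtime.
        raise OverflowError(f"signed integer overflow: {a} + {b}")
    return result

print(checked_add_i32(2_000_000_000, -1))  # fine: 1999999999
# checked_add_i32(2_000_000_000, 2_000_000_000) would raise:
# the true sum doesn't fit in 32 bits, so the C result is undefined.
```

The instrumented binary does essentially this test in a couple of extra instructions around each arithmetic operation.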
So in this case, I'm showing that there's an additional check beyond just saying -fsanitize=undefined: you can ask for unsigned-integer-overflow checking separately, and it will track exactly that kind of thing. So there's a long list of all the undefined behaviors — and the close-to-undefined but undesirable behaviors — that it will report to you.

The thread sanitizer. This is, obviously, the one that reports data races, and I have an example for that as well. Okay. So the thread sanitizer in this case is reporting that we had a data race between the threads on `ready` and on `item`: it will tell you which thread wrote to it first, which one wrote to it second, and that there's no synchronization between them. So it catches data races like that, and that includes races on mutexes, which you'd normally expect to be okay — but sometimes you might have split off a thread, and one thread tries to initialize a mutex while, before it's done so, another thread is already trying to lock it. It catches problems where you're trying to destroy a mutex — say a scoped one — while it's still locked, and it catches problems where a signal handler might overwrite errno: if you had an asynchronous signal come in, you probably don't want it to overwrite whatever errno you've got coming back from a file open. And in cases where the thread sanitizer doesn't yet understand what it's looking at, you can help it out a bit. If you were to use OpenMP and try to run that simple example, and you'd deliberately put in an OMP flush — which acts as a fence and basically stops accesses from crossing that barrier — the thread sanitizer doesn't yet understand that. So you can use the annotations to tell it that the write to `item` happened before this flush, and that the write to `ready` happened after it, and similarly in thread two. And if you do that, then it will actually understand what's going on, say okay, that's fine, and keep on hunting for bugs elsewhere in your program.
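The ready/item race has a textbook fix: introduce a real happens-before edge so the write to item is published before the consumer reads it. The same shape, sketched with Python's threading primitives — variable names mirror the example; in the C++ original you'd reach for an atomic or a condition variable instead:

```python
import threading

item = None
ready = threading.Event()
results = []

def producer():
    global item
    item = 42    # write the payload first...
    ready.set()  # ...then publish: everything before set() happens-before wait()

def consumer():
    ready.wait()  # blocks until the producer has published
    results.append(item)

t_consumer = threading.Thread(target=consumer)
t_producer = threading.Thread(target=producer)
t_consumer.start()
t_producer.start()
t_producer.join()
t_consumer.join()
print(results)  # [42] — never [None], thanks to the happens-before edge
```

This is exactly the relationship the annotations express to the thread sanitizer when it can't infer it on its own.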
The memory sanitizer. This is to pick up issues where you haven't managed to initialize a piece of memory. In this case, some old program understood a couple of command-line options and initialized the variable in those cases, and otherwise didn't bother. Then at some point somebody adds a new flag, and something goes wrong somewhere down the line. I'd love to show you that one, but unfortunately it doesn't work on macOS — it's Linux, FreeBSD, or NetBSD only at the moment. It might work on the other BSDs, but I don't think they've been properly tested. And similar to the other ones, it can track the origins of memory, so that it knows where it was allocated and where it might have been partially — but not fully — initialized. So you get an idea of where you should focus your attention.

Now, say you've started to use a sanitizer on your program, but it's just too much overhead: you've got some really big computation happening in the middle, and you're pretty sure that bit's correct, and you want to focus on finding bugs in things like your parser, because parsers usually go wrong. So you can add an attribute in your source saying: don't touch this one with the address sanitizer, or don't touch it with the thread sanitizer, or something like that. But trying to add all these little attributes all over the code, as you find things that eat too much CPU time or memory, is maybe a bit too much. So you can actually just give it a file and say: for this source file, I don't want you to instrument anything; and don't annotate func1 with the address sanitizer. Unfortunately, C++ names have to be mangled — that's why you've got this _Z9 thing — but it does allow the shell wildcard * to match everything with that prefix.

More info. So I have definitely not covered all of this yet; there are a few other useful ones. Pointer-compare and pointer-subtract: if you've malloc'ed two separate objects and you try to compare pointers into them, that's undefined.
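The file Graham describes is Clang's sanitizer special case list, passed with -fsanitize-ignorelist= (older releases spell the flag -fsanitize-blacklist=). A sketch of one, with made-up file and function names matching the talk's scenario:

```
# ignorelist.txt -- skip instrumentation for hot, trusted code.
# src: matches whole source files, fun: matches functions.
# Mangled names are needed for C++, hence the _Z9 prefix;
# '*' is a shell-style wildcard.
src:big_compute.c
fun:func1
fun:_Z9funcTwoV*
```

Each entry keeps the named code out of the instrumentation entirely, so the big compute kernel runs at full speed while the parser stays checked.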
Of course — they may be in completely different memory spaces; you don't know. Control-flow integrity: if some sneaky bit of user data is trying to overwrite your pointers on the stack and get you into a bit of memory you're not supposed to reach, you've got CFI to catch that. The data flow sanitizer is a bit of an interesting one, because it's not an automatic one like the others: you have to manually tag pieces of memory with a label, and it will then keep that tag wherever that value flows through memory — quite interesting to watch with the aid of a debugger. And there's more in the process of being written: the type sanitizer is currently under review, if anybody wants to go and look at that, for catching strict-aliasing problems, which is yet another class of undefined behavior. There's the main Clang documentation site, which contains a lot of useful information on the sanitizers and how to invoke them, and there's an older set of documentation from Google — some of it's out of date, but it still contains some useful information. And I have seen people set these things up in public CI, like Travis, so that they get at least weekly builds verifying what they've done. Back to Andrzej.

Yeah, I'll quickly wrap up with one final example, because you might be wondering: right, LLDB and sanitizers — they're in a way connected, but a bit orthogonal; can you marry them together? And yes, the answer is you can. So I'll try — right, that's the wrong directory — yes. A very trivial example: basically, allocate memory, deallocate it in f1 and f2, and then try to write to that memory after the deallocation. Yeah, so the sanitizer should catch that, and it does. That's something like what Graham showed you — a slightly different example, but the concept is the same. Now, you want a bit more insight; you want to do it within LLDB. So let's start LLDB. I'm typing `b main`, setting a breakpoint on the main function, then run. I'm stepping over f1 and f2, which do the allocation and deallocation.
And I hit the point where I'm trying to put something into the array that has already been deallocated. And you type `memory history` and the name of the variable, and it shows you the information from the sanitizer. Well, I apologize, it's not displayed very nicely because it doesn't fit on the screen, but basically it shows you the backtraces of the allocation and the deallocation. So you have the insight from the sanitizer within LLDB, and that's all wrapped up into this `memory history array` command. You can have a look at what it is by typing `help memory history`. So this gives you access to the address sanitizer; you have access to the other sanitizers as well — it's not wrapped up in nice commands, but you have access to it.

And otherwise, I think I will just finish here. Most of all, thank you very much to the community, because this is stuff that the community has developed, and it's really nice that we can use it and leverage it to be more productive. We rushed through a few things, so if you have questions afterwards — obviously we can have questions now — but feel free to email us with follow-ups. And otherwise, thank you very much.

Yes, so we have around five minutes for questions. Please remind us to repeat the questions. Yeah?

My question would be: I'm not using LLDB, and what would be the main motivation to switch to it? Like, are there additional features which you would promote and advertise?

So the question is: what would be the reasons for switching from GDB to LLDB? That question comes up a lot, and my answer is that it would be best to ask an LLDB developer. My personal take is that I don't have one single reason. I switch between GDB and LLDB, and the way I look at it, they are complementary. If you're on macOS, GDB is sadly not as well supported as it could be, so you can really only use LLDB. With LLDB, by default, you have this nice interface that displays you the context.
It leverages libclang, so you have C++ expression evaluation out of the box, and whenever something new comes into C++, you get it immediately, as soon as LLVM gets it — I don't think you get that in GDB. Also, you might be on some exotic target for which there's only an LLDB implementation, or you might be on a target that has only a GDB implementation. So I think, the way I look at it, it pays off to know both tools. [A brief, partly inaudible exchange with the audience follows about how the lldb/lldb-server and gdb/gdbserver architectures relate.] Yeah?

Sometimes GDB has a very hard time analyzing the variables in the local stack frame, and most of the time it's not that big of a deal, because in the worst case it says "I don't know it" — but sometimes it actually starts to hang, and then basically I have to stop the debug session, because GDB just stays in this hang. And so the question would be: is LLDB better in that respect? I mean, I'm not necessarily talking about resolving everything, but at least not crashing, so I can at least start to do a different introspection than the one that's currently failing.

So I think the question is whether, in certain scenarios, LLDB will be more stable and reliable than GDB, in particular when you have heavily templated code. My personal experience: for me, both GDB and LLDB are very stable, for what I've been using them for. In the past, I admit, I was switching between LLDB and GDB, and LLDB was very flaky — but that was three, four years ago. I'm using it now on a daily basis for debugging LLVM, and it's rock solid; I'm really impressed. And the scenario that you're describing, heavily templated code — well, that's tricky for debuggers, because the stuff is compiled and you request a template that hasn't been instantiated. So there have been developments within LLDB to actually make that work better, based on modules, and I think the patches are being upstreamed.
There was a talk at the LLVM Developers' Meeting — maybe that's something that would be relevant for the scenarios you're describing. But that's just my personal experience. I've not tried it, but I think it would be interesting if you compared and came back, because I'm interested in it myself. Yeah, one more question.

What about the hardware-assisted address sanitizer? Is Arm helping out with accelerating it?

So the question is about the hardware-assisted sanitizer, and I'll just pass it to Graham. So there is a hardware-assisted sanitizer. It's mostly developed by Google; I think Arm has contributed a little bit to it. But it basically takes advantage of the fact that AArch64 can reserve the top eight bits of a pointer for another purpose and still be able to use that pointer in a plain load or store, and I think that's used to speed up the address checking, including for stack variables. But you also have an upcoming feature, memory tagging, which should give you a lot of what you get from the address sanitizer for free, by coloring memory — storing the data of what color that memory is elsewhere in memory — such that the hardware can actually detect overruns for you.