We had a thread-checking tool called Helgrind, although that doesn't work anymore, which is unfortunate. We also have a whole bunch of profiling tools. You may have used Calltree and the KCachegrind GUI; it's a KDE application. We have Massif, which is a space profiler. It doesn't get much use, but it ought to get more: it tells you where you're allocating stuff. We have Cachegrind, which is a rather nice low-level profiler that tells you about cache misses, primarily.
There have been various other experimental tools, some of them quite sophisticated, such as runtime type-checking tools, but experimental tools usually don't get to a stage where you can use them for real.

Valgrind runs on essentially any modern Linux distribution, on almost any architecture that you might reasonably want to run it on. So there's no excuse not to use it now. It's also not a toy tool. You get complete coverage of your entire user-space application, right down to the kernel level: through libc, through the dynamic linker. You see everything. You don't need the source code, so you can deal with libraries for which you don't have the source, which is actually important for debugging proprietary applications or applications that use proprietary libraries. It works on large systems. It runs OpenOffice, no problem, OpenOffice being a large system. We have people who tell us that up to about 25 million lines of code, it runs okay.

So what I'm going to talk about today is Memcheck, which is our most popular tool. We know that about 90% of our users use Memcheck more than any other tool. But first, a bit about the infrastructure.

One of the things you notice if you start building simulation-based tools is that the instrumentation you want to add to your program, to collect information like profiling data or error-checking data, is often relatively simple. You might want to count the number of instructions that have been executed, or the number of cache misses, or something. That's not very difficult. The real problem is to build an environment in which you can run your real application, with its system calls and signals and threads and God knows what else, and actually collect this stuff up. So getting the instrumentation into a program is the difficult part. What Valgrind really provides, which is so useful, is a common infrastructure that does all the really nasty crap bits of this problem.
So it hides all the details of your processor by unpicking your instruction stream into an architecture-neutral representation, which the tools can then deal with. It does this while the program is running; it's a dynamic-translation-based scheme. So you don't need to relink or recompile or anything. It's very easy to use. You can just type valgrind ls and it'll do whatever with ls. And as I said before, it covers everything, even stuff you don't have the source code for.

The common infrastructure provides threads, system calls and signal support, provides reading of debugging information, and provides all sorts of facilities which you can use to build the tools you want. It allows you to look at all the addresses that the program deals with, and at all the data that the program computes. If you add two things in your code, the tool can see what you got, if it wants to.

So for the tools, you get this nice architecture-independent code representation, which tools can add instrumentation code to, so that you write a tool once and then it works on x86 or PowerPC or whatever with almost no extra effort. The tool can see the events that are significant to it, like thread state changes for threading tools, or malloc and free calls for memory-tracking tools, things like that. And there's no obligation to do any of these things. So you can write a really simple tool which counts the number of basic blocks executed in about 100 lines of code, just link it in, and then you have a tool which will run anything and do that.

So let me talk about Memcheck, because this is perhaps the most widely used tool and there's not enough time to talk about the rest of them anyway. I really think of Memcheck as doing three separate things, which I'll go through later. Memcheck will look in detail at addresses in the program.
So it'll tell you where you're reading and writing in bad places, and this includes telling you about reading and writing freed memory. It tracks the addressability of memory on a byte-by-byte basis. It tells you when you're doing bad stuff with malloc and free, freeing stuff in the wrong order or accessing memory after you've freed it. So it's kind of like a policeman for the malloc and free interface.

And I think the most interesting thing about Memcheck, in my view anyway, is that it will find uninitialised value errors: places where you're using data which, in any reasonable interpretation, is uninitialised. Compared to other tools like Purify, and Third Degree, which is an Alpha tool, and various others, we think it actually does a better job than any commercial tool you can get. We honestly know of no other tool which will track single uninitialised bits in code, and we have often seen it find a single uninitialised bit in applications.

So perhaps the first and simplest thing it will find is addressing errors, and this is pretty simple stuff. If you allocate, say, a four-byte block, then what you get is your four bytes, plus some red zones on either side of it. If you read or write in the red zone then the tool complains, like that: it says you did an invalid write, and it tries to explain what the invalid address is in terms of stuff that you can understand. Then when you free the thing, the whole block is painted red, and if you read or write in that area it complains again, except this time it's telling you that you're dealing with a freed area. So this is probably pretty familiar stuff if you've used the tool.

There are a couple of subtle points. One is that these red zones are only of finite size; they're actually 16 bytes long in a standard build of Valgrind. So if you do a really screwed-up write and hit here or here, then you may not actually be told about it, because it can't tell you.
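The red-zone idea can be sketched as a toy model in C. This is only an illustration of the bookkeeping described above, not Memcheck's implementation: the arena size, the bump allocator and the names `toy_malloc`, `toy_free` and `toy_check` are all assumptions made for the sketch. Only the 16-byte red-zone length comes from the talk.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 256  /* size of the toy address space (assumption) */
#define REDZONE 16      /* red-zone length quoted in the talk */

typedef enum { NOACCESS, ADDRESSABLE, FREED } ShadowState;

static ShadowState shadow[ARENA_SIZE]; /* one state per byte; starts as NOACCESS */
static size_t next_free = 0;           /* bump pointer of the toy allocator */

/* Hand out `len` bytes with a red zone on each side; returns the block's offset. */
size_t toy_malloc(size_t len) {
    size_t start = next_free + REDZONE;
    assert(start + len + REDZONE <= ARENA_SIZE); /* the toy arena never grows */
    for (size_t i = start; i < start + len; i++)
        shadow[i] = ADDRESSABLE;
    next_free = start + len + REDZONE; /* red-zone bytes simply stay NOACCESS */
    return start;
}

/* Paint a block as freed so that later accesses can be told apart. */
void toy_free(size_t start, size_t len) {
    for (size_t i = start; i < start + len; i++)
        shadow[i] = FREED;
}

/* What the checker would conclude about a one-byte access at `addr`. */
ShadowState toy_check(size_t addr) {
    assert(addr < ARENA_SIZE);
    return shadow[addr];
}
```

With two four-byte blocks `a` and `b` allocated one after the other, `toy_check(a + 4)` reports `NOACCESS`, so a small overrun is caught. An access 36 bytes past the start of `a` lands on the first byte of `b` and looks `ADDRESSABLE`, which is the kind of miss that finite red zones allow.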
Obviously we'd prefer the red zones to be, conceptually, infinitely long for each block, but that's not feasible.

Another observation is that a conventional implementation of malloc and free will try to bring memory that you've freed back into circulation as fast as possible, to minimise the total working set that you have. Whereas we want to do the exact opposite: when you free something, we want to keep it out of circulation as long as possible. So when you free memory while running a program on Valgrind, the freed block is put at the end of a long queue of freed blocks, and it then has to wait to come back into use. During the time that it's out of use, you will be told about any access to it. But at some point it comes back into use, and then if you're mistakenly using it through old pointers, you won't know. So there are subtleties, and if you understand these subtleties, that helps. We get people asking, "I did this really stupid write, 55 bytes before this block, or 100 bytes before this block, and it didn't tell me. Why not?" Well, that's the sort of reason.

The second thing that Memcheck will do is leak detection. Memcheck intercepts all your malloc and free calls and provides its own implementation of malloc and free, and new and delete, whatever. It keeps track of all these blocks, and where you allocated them and where you freed them. When the program comes to an end, or just when you ask, it will scan the entire address space and look for pointers to blocks which haven't been freed. So it's a pretty standard leak check. It will classify the blocks into three classifications. This cloudy bit is intended to be the heap. If it can find a pointer to the start of a block, then it believes that block is reachable. You still have a pointer to it; you could at least have freed it up.
If it can't find a pointer to the block at all, then there's no way you could have freed it, so it's definitely leaked. If it can find a pointer that points into the middle of the block, well, it's not exactly clear whether you really had a pointer to the start of the block or whether it's just a coincidence. So that's classified as possibly leaked. At the end it will tell you that you have this many bytes definitely leaked, possibly leaked and still reachable. And you really want to get that down to zero if you can; otherwise your program is probably leaking.

There's another useful classification, which, we find from the mailing lists, often seems to confuse people who use Valgrind: it will distinguish between directly and indirectly leaked blocks. This is very useful. So we have this block here, which has no pointer to it at all, and so that is directly leaked. This block here, by the rules up here, is not actually leaked, because there's still a pointer to it; but there's only a pointer to it from some other block in the heap which has already been lost. So this is classified as indirectly leaked. The reason this is useful is for detecting cyclic garbage, you know, garbage cycles like this. By the rules up there, none of these blocks is leaked, but in fact you can't get into the cycle from anywhere, so they're leaked really. That was a later refinement.
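The classification rules above can be sketched as a small C model. It is a simplified illustration under stated assumptions, not Memcheck's actual leak checker: the `Block` structure, the `classify_blocks` function and the idea of passing in a list of root words are all hypothetical. In this toy, every block in an unreachable cycle comes out as indirectly lost, which may differ from how the real tool reports cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Ordered so that a "better" finding compares greater. */
typedef enum { DEFINITELY_LOST, INDIRECTLY_LOST, POSSIBLY_LOST, REACHABLE } LeakKind;

typedef struct {
    size_t start;        /* address of the block's first byte */
    size_t size;         /* block length in bytes */
    const size_t *words; /* pointer-sized words stored inside the block */
    size_t nwords;
    LeakKind kind;       /* filled in by classify_blocks */
} Block;

/* How does the word `w` refer to block `b`, if at all? */
static LeakKind hit(const Block *b, size_t w) {
    if (w == b->start) return REACHABLE;                  /* start pointer */
    if (w > b->start && w < b->start + b->size)
        return POSSIBLY_LOST;                             /* interior pointer */
    return DEFINITELY_LOST;                               /* no reference */
}

/* Upgrade every block that one of `words` points at; true if anything changed. */
static bool scan_words(Block *blocks, size_t n, const size_t *words, size_t nwords) {
    bool changed = false;
    for (size_t w = 0; w < nwords; w++) {
        for (size_t i = 0; i < n; i++) {
            LeakKind k = hit(&blocks[i], words[w]);
            if (k > blocks[i].kind) { blocks[i].kind = k; changed = true; }
        }
    }
    return changed;
}

void classify_blocks(Block *blocks, size_t n, const size_t *roots, size_t nroots) {
    for (size_t i = 0; i < n; i++) blocks[i].kind = DEFINITELY_LOST;

    /* Phase 1: follow pointers from the roots until nothing changes. */
    bool changed = scan_words(blocks, n, roots, nroots);
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++)
            if (blocks[i].kind >= POSSIBLY_LOST &&
                scan_words(blocks, n, blocks[i].words, blocks[i].nwords))
                changed = true;
    }

    /* Phase 2: a lost block that another lost block points at is only
       indirectly lost. */
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].kind >= POSSIBLY_LOST) continue;
        for (size_t j = 0; j < n; j++) {
            if (j == i || blocks[j].kind >= POSSIBLY_LOST) continue;
            for (size_t w = 0; w < blocks[j].nwords; w++)
                if (hit(&blocks[i], blocks[j].words[w]) != DEFINITELY_LOST)
                    blocks[i].kind = INDIRECTLY_LOST;
        }
    }
}
```

A caller fills an array of `Block` values, passes the pointer-like words found outside the heap as `roots`, and reads each block's `kind` afterwards. The enum ordering is a design choice that lets "keep the best finding so far" be a plain comparison.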
I should also point out that people seem to have the impression, if they use Valgrind's, or Memcheck's, leak checker, that there's something exact about it. In fact the whole thing is a giant kludge. The first version was hacked up in four hours. It got refined after that, but the honest truth is that leak checking in C++ really is a kludge, because there's no reliable way to tell what is a pointer and what is an integer which just happens to look like a pointer but isn't really one. If you're really unlucky, the compiler can sometimes optimise in ways which cause pointers to sort of disappear, so that's kind of weird. And it's not always clear where we should look for pointers when the program is finished and where we shouldn't. It sort of gets better, but it's not great, because it's inherently a problem with the language. You also get weird shit like glibc hanging on to pointers sometimes, and the STL causes all manner of problems because it allocates large blocks and then chops them up and hands them out itself; it has its own allocators, I think.

So, uninitialised value checking. What does that really mean? Well, it really means finding out when your program is using data which has no meaning by the definition of C. So data that comes from malloc'd blocks is considered uninitialised; if you use it before you write to the block, then you have a problem. Similarly for local variables on the stack. A simple example: this is an array full of junk, so if you do a test like this then the test is meaningless, and it says exactly this when you do that, which is kind of useful to know.

There's the question of how this is done. Roughly how all this works is: here's your original computation, above the black line. You pull a couple of values out of memory, add them, and put the result back there or somewhere. In the background, where you can't see it, Memcheck is maintaining a couple of large bitmaps, one of which tells you which addresses in memory it's okay to
look at and which are not. These bits are used for the leak checking and for the addressability checking. It's also maintaining, for each original bit of data there, a corresponding bit of data here. It uses these V bits to track the definedness of the data. So you pull the corresponding V bits out of memory and do some weird computation in the background which gives you an approximation to the definedness of the result. One upshot of this is that if you add two garbage values together then it'll decide that the result is garbage, and it won't actually complain at that point. This is a design decision.

Let's see. Perhaps one of the most problematic things is to decide when we should actually complain about your using uninitialised values. This is not an easy problem. The obvious thing to do would be just to complain whenever you pull uninitialised data out of memory, like pulling it out of your stack or a malloc'd block. That actually doesn't work at all. I think it's what Purify does, but with some tricks. The real problem if you complain about reading uninitialised data out of memory is this: if you have a struct like this, then the compiler is going to put a three- or seven-byte hole here, just so that the int is then aligned properly. If you then have a structure assignment like that, it's going to complain about the three garbage bytes that you copied from the middle of one struct into the middle of another struct. We have done some experiments to check that this is true, and you get absolutely flooded with errors which are not really errors if you complain at that point.

Another thing is the decision, which is sort of shown here, that even if you're doing arithmetic, whatever kind of arithmetic, floating point, vector, scalar, integer, we don't complain if you compute garbage values. We just track garbage through the system and only complain at some later point. So, yeah,
the summary is: if you try to report uninitialised values too early, then you get a lot of false positives. The problem is that if you do the opposite, and allow garbage to be copied around the system for a long time before you complain, then it's difficult for programmers to figure out what went wrong. It says "I'm using an uninitialised value here", but that value was passed as a parameter through several layers of calls before it really got looked at. So our strategy is to delay as long as possible, because this reduces the noise level. Essentially, we complain whenever the use of an uninitialised value would possibly cause an exception. So either it would cause a memory address to be undefined, that's the address of a location, not the contents of that location; or, let's see, yes, when you would effectively write an undefined value into the program counter, by jumping on an uninitialised value; or when you're passing garbage to the kernel. These are really the only places where it's going to complain about garbage. Particularly in this kind of situation, the point at which it complains can be a long time after the garbage was created, and that doesn't help.

There's another, minor, question as well. Suppose I allocate an array 10 bytes long and then read off the end of it. What does this mean? What's off the end of this array, A[10], well, it could be garbage, so maybe I should give an uninitialised value error. But the root cause of the problem here is that you're using a bad address, not that you're using bad data. So there are situations in which the tool has to decide between complaining about a bad address and complaining about bad data. It complains about the bad address, because in this case that's easier to understand, and then it doesn't complain about the fact that you're using bad data, even though you are. So
we get a lot of questions on the various Valgrind mailing lists along the lines of "how do I figure out what my program is really doing wrong when it complains?", and this is about the best answer we can give. If you look at the three categories again: if you have an addressing error, it says you've got an invalid read or write of whatever size, done at this point, and then it tries to describe the address in terms of the blocks that you've allocated and freed, or it says it's some address on the stack. That usually makes it fairly clear what the problem is.

For leaks it's a little more difficult, and I don't have a good answer to that either. You have to ask questions like: who was supposed to own this block, and where was the last pointer to the block overwritten?

But the real problem is finding out where an uninitialised value error came from. In this example we have some arrays, which are presumably allocated with malloc, and then you're multiplying and checking. At this point it might say you're using an uninitialised value in this conditional, and your problem is that you don't know whether the uninitialised value came from the A array or the B array. One thing you can do is look through your program logic, inspect where A and B came from, and try to figure out where those arrays contain garbage. Another thing you can do is actually ask Valgrind to tell you. If you include this header file, which comes with any installation of Valgrind, then it has a bunch of magic macros and you can force it to check. You say: check that the whole A array is defined for 800 bytes, and also that B is defined for 800 bytes. And then maybe it'll tell you that 504 bytes along the A array you have uninitialised garbage, or even that the array is not in addressable memory. That's a useful thing to do. It saves you having to look around; you can actually force it to
make checks. One thing about these little macros is that when you run your program normally, not on Valgrind, they have no effect, and they're very cheap as well; they take about four or five instructions. They're a magic trapdoor through which your program can communicate with Valgrind, to tell it stuff about memory management or to ask it questions about memory management, and that can be very useful. We use it a lot internally in the implementation, but it's also useful for power users.

Here's some other stuff which is worth knowing, but which not everybody who uses Valgrind seems to know. One of the problems is that you get errors in libraries, like glibc or whatever proprietary library you've managed to link into your application, and you can't get rid of them; you can't fix them. About the only thing you can do is tell Valgrind not to show you those specific errors. So you can create suppression files describing exactly the errors that you don't want to see. You specify a suppressions file like that, and you can ask for suppressions to be generated using the --gen-suppressions flag. So it shows you an error, and you say "give me a suppression so that I never see that error again". That's kind of useful. A big suppressions file which causes Valgrind to stop complaining about some stuff: I think a lot of people have them.

Another thing which is sometimes useful is to describe your weird memory management scheme, one that you might be using in your program which is not malloc and free. This goes back to using these magic trapdoor macros again. I think you can describe that you have your own memory manager which is creating and destroying blocks, and these will participate in leak checking. There are people who use some kind of pool-based memory manager, so there's a bunch of macros for creating memory pools. And these are low-level macros, where you can just say: this area
of memory is now off limits, for whatever reason, so tell me if you see any accesses in this area; or this area of memory is now addressable but contains garbage; or it's addressable and contains defined data. For example, if you had a garbage collector, when an area of memory goes out of use for a while you could paint it no-access, and then you could paint it as writable when it comes back into circulation.

Another thing: a lot of people want to do a leak check in the middle of the program, not at the end, or they want to do leak checks multiple times during the execution of the program. So you can use this VALGRIND_DO_LEAK_CHECK macro, which will just run a leak check at that point, and typically they will then do some kind of diff of the leak states from those various snapshot points.

If all else fails: there are lots of command-line options which subtly modify the way the thing works. It's worth playing around with those; they're useful. You could file a bug report. That's also very useful, since we actually take notice of bug reports; sometimes we even attempt to fix them. If you do file a bug report, don't just tell us that the system crashed, because that's completely useless. Make it possible for us to reproduce your failure. That's an obvious thing to say, but you'd be amazed. And mail us, because we even read our mail sometimes. Seriously, if you want your giant lardy application with bazillions of lines of code to run on Valgrind, and it doesn't for some reason, which can happen, we're willing to work with people to figure out what's wrong, but you need to work with us. That's got stuff working many times in the past.

So when should you use it? That's a good question. If you use something like GDB, well, in my perhaps rather jaded view, GDB is only really useful when the program has crashed and you want to find out why. I suppose you can set breakpoints and stuff as well. So you can use Valgrind, or Memcheck, when looking for a specific bug, but
the thing that's really valuable, and the reason why I created it in the first place, is to go looking for memory management bugs that you don't know you have yet. So you've got a bug which may cause a crash for a user sometime in the future, but you're unlucky and you don't pick up the bug during testing. Well, basically, run the thing on Valgrind, on Memcheck, and keep fixing what it complains about until it doesn't complain any more. If you do that, you'll have got rid of a certain class of memory management bugs from the application, and that tends to make the thing more stable before you release, which is good for everybody. The best thing you can do is run your regression tests, your test suite, whatever, on Valgrind as well, so that you get the odd corners of your application prodded and have the memory management watched at the same time. People don't like to do that because it takes so long, but it's worth doing.

We quote this study too much, but one of the OpenOffice developers ran some basic tests from OpenOffice on it; this was the OpenOffice 2 line, about 18 months ago. I think the thing that was really significant is that, of the bugs it picked up, a third would just crash OpenOffice if they ever actually appeared for a user. So you get rid of those bugs before the thing is ever released, and I think that's a good thing.

People sometimes ask whether the system produces false positives. We sometimes get people saying, "Well, I don't believe this thing that it's complaining about", particularly when it's saying "you're using an uninitialised value here" or "you're using a bad address". Well, there's a lot of effort gone into making sure that Valgrind very rarely tells you stuff which isn't true. So almost all of the time, 99% of the time, if it complains about something, it's right. If you attempt to run highly optimised code on it you can sometimes fool it, so I suggest you don't go above -O with GCC; but -O and Memcheck is okay.

Just before I finish: the other tools are also extremely useful. I've kind of mentioned them. Cachegrind is a great little cache profiler which can tell you information which seems to be very hard to find by other means. Basically you can find out where you're screwing up in your caches, all three of them: the instruction cache, D1 and the level 2 cache. These cause your performance to mysteriously drain away, often for very unobvious reasons. It will profile at the level of the whole program, functions, lines of code or even individual instructions. It will tell you that this specific instruction is causing 80% of the cache misses in your program, and it will print annotated source code and whatever. Nowadays we can actually run Valgrind on Valgrind itself, which means that we can profile Valgrind running stuff, and that's turned out to be very useful. We found a whole bunch of cache misses, which means it will be a little bit faster.

The space profiler, well, I mentioned that. It's useful for finding out who allocated what. It doesn't just tell you at the end; it tracks allocation along the way, and then shows you pictures of the space use as your program runs: who allocated what, who's holding on to what, and why. This is useful, I think, for dealing with space problems. I haven't actually used it myself. One of the frustrations of being a Valgrind developer is that we don't really get to use it, so we don't have a clear picture of how users use it.

Callgrind, with its GUI KCachegrind, is an external tool from Josef Weidendorfer, and you may well have used it. I think it gets used quite a lot, at least by KDE folks, for profiling stuff, and I think also by a lot of other folks. It has a nice complicated GUI which will show you all sorts of stuff about cost attribution between callers and callees.

Helgrind is the tool that we had for finding threading errors. It really looks for memory
locations for which it cannot show that there is adequate locking when the location is accessed by more than one thread, which is kind of the summary of a data race, really. Helgrind stopped working about a year ago due to some other threading-related changes. We are now back in the state where we have the infrastructure to make it work, and what we need is a person to actually push this along. So we are looking for somebody to put it back together and make it work. It's a difficult problem, because you need to know lots of stuff about threading and assembly programming, but we are looking for volunteers to fix it. So if you can do that, or somebody you know can, point them our way.

What's coming up? We are currently on Valgrind 3.1.0. We'll do a bug-fix release next week; it's kind of overdue. We're doing a new major release in about seven weeks. There are various things, but the most significant is that we're reducing the performance overheads of the Memcheck tool, in space mostly; it's also slightly faster. We're integrating Callgrind, because it's a very popular tool and it's easier for us to do quality releases if it's integrated. And we're generally improving performance and stability. We're always looking to improve stability and make stuff run which doesn't run. For example, we'll be able to support Wine in version 3.2, which means that you should be able to run whatever Windows application Wine can run, on top of it. That would be fun. It's actually been done before; it's not as crazy as it sounds. If you like writing large parallel programs using MPI, then we will have some support for you. If you want to run on PowerPC64, we can do that now as well. We will have another release of the GUI for Memcheck, which will be nice. What I'm saying here is: if you have a big application which you would like to Memcheck-ise, or generally Valgrind-ise, and it doesn't work, don't just do nothing: mail us and see if we can fix whatever needs to be fixed
in order to make it work. If you don't do that, then you're just going to wind up with a 3.2 release which still won't run your thing. So basically, if it doesn't work, complain. And further along, yes, we would like to make Helgrind work again. That would be a good thing; we get quite a lot of people asking about this thread-checking tool. So, if you only remember one thing: use it, and get rid of memory management bugs, because it's good for your end users and it's good for your sanity. You should use it.