Thank you. Ah, yes, the magic dial. Excellent. The GL transitions. OpenOffice. Excellent. So I'm going to talk very, very quickly, as you see, about IO and profiling it.

Perspective is key. Let's take Richard here, who's a very clever guy. He's got a brain where the information is maybe, you know, 20 centimetres away. If you want to find something else, maybe you go and find a book on your desk, 80 centimetres away; maybe you have to walk across the room to find something on a shelf to look at. But when we start to get a bit further away, maybe to the Middle East, what's equivalent to that distance in computing terms? Well, we have our registers in our CPU, and the numbers of course vary, but these are ballparks: maybe half a nanosecond. Level one cache, two nanoseconds. Memory, ten nanoseconds. Well, it turns out we have to go to the Middle East to get data off the hard disk: the disk seek time is about eight million nanoseconds, that is, eight milliseconds, and that's before you even scrape the data off. So it really pays to avoid seeks, where you have to go and fiddle around on the disk to get the data out of it.

So, OpenOffice. Yes, OpenOffice cold start is still not what it could be after a long time of me working on it, and it's getting better, I promise you, particularly the easy-to-measure part. But as you see, lots of the time here is really just IO time. You can measure that very easily: you run it warm and you run it cold, and you just see this. And OpenOffice does a whole load of stupid things, that's very true. You can run strace on it and see all these system calls it's doing that it shouldn't, and so on. But they all fit in the small part here; they're not IO-specific. They are clearly in the 20%-and-decreasing piece. The 80% is clearly the IO time. And my thesis is that Linux should let you do stupid things faster, right?
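To put those ballparks side by side, here's a trivial sketch; the numbers are the rough figures quoted above, not measurements from any particular machine:

```python
# Ballpark access latencies quoted above, in nanoseconds.
LATENCY_NS = {
    "CPU register": 0.5,
    "L1 cache": 2.0,
    "main memory": 10.0,
    "disk seek": 8_000_000.0,   # 8 milliseconds
}

def relative_to_register(what):
    """How many register accesses fit into one access of `what`?"""
    return LATENCY_NS[what] / LATENCY_NS["CPU register"]

for what in LATENCY_NS:
    print(f"{what:12} ~ {relative_to_register(what):>12,.0f}x a register access")
```

On the distance analogy: if a register access is 20 centimetres, a disk seek is sixteen million times further, around 3200 kilometres, which is indeed comfortably the Middle East.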
Because people go doing stupid things all over the place. The next problem is that on the Linux kernel, at least, it's very difficult to get deterministic runs of your IO time. You really want to measure this thing so that you can shrink it bit by bit, but the problem is you can't really measure it; it just doesn't work. Even across cold boots to the same point you don't get repeatable timings. You can also try all these other nice things that help here, dropping the VM caches, IO tools, but 10% is a reasonable jitter that you would see. And of course, if your win is only 5%, which is a worthwhile win, it's just lost in the noise. And why? No idea. The kernel is really broken in this regard. And there's nothing complicated happening; this is not an elevator-type thing, there's just one IO request happening after another after another. And the kernel people say things like: just rebuild your kernel with this patch set. Which is deeply unhelpful. The kernel also has tweakables, so you can set different things and it will, apparently, speed up IO. That's not very helpful either.

So IOgrind hopefully tries to address this problem by providing a cunning solution. What we do is run our application inside Valgrind, here. I have a pointer, I believe. Excellent. Yeah, here. And that generates a nice trace file; actually, it's just a standard-out dump from Valgrind, from this Valgrind skin. And then we have a little GUI. Simultaneously, we grab your disk and dump your file system, so we have a layout of your file system. And then we can simulate them all together in here with a disk simulator. And this gives us pretty pictures. We apply our brain and, hopefully, shortly we can then improve our software. Of course, the problem is this tool is really very beta, which is why I'm showing it to you. It shows some interesting things.
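The noise problem described a moment ago, a 5% win drowning in 10% jitter, is easy to see on the back of an envelope. This is a toy model, not a measurement of any real kernel:

```python
def jitter_band(mean_s, jitter=0.10):
    """The range a single timing can land in, given +/- `jitter` noise."""
    return (mean_s * (1 - jitter), mean_s * (1 + jitter))

def win_visible(baseline_s, win_fraction, jitter=0.10):
    """Does the improved time fall clearly below the baseline's noise band?"""
    low, _high = jitter_band(baseline_s, jitter)
    return baseline_s * (1 - win_fraction) < low

# A 30-second cold start with 10% jitter can land anywhere in 27..33 s,
# so a 5% win (28.5 s) is indistinguishable from a noisy baseline run:
print(win_visible(30.0, 0.05))   # False: lost in the noise
print(win_visible(30.0, 0.20))   # True: a 20% win sticks out
```

This is why a deterministic simulator is worth having: it turns "probably a bit faster" into a number nobody can dispute.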
But of course, yes, it could go very badly wrong with it. Oh, nice. Don't lean forwards, just a tip. So, here is... oh dear, oh dear. I'm too loud, I think; that's probably the problem. It's congenital, unfortunately. Ah, that's better. Hey, good. If I shout, I don't get feedback. So, cool.

So, here is a view of the entire aggregate stack time. The axes are meaningless, but the area is meaningful: the area is simulated time. It's much like KCachegrind. And what we can see is that this guy here, called check icon, is taking a whole lot of time. So we can dive down into that. Or we could go up a bit and see the immediate stack frame there. And you see on the left here there's a whole lot of stack frames. And this is interesting, because there was a bug in... in which it was doing a whole load of icon theme validation stuff, which was doing a whole lot of things. So it's pretty easy to find that.

You can break it down by files, so you can see what type of file is consuming the most time and where it is on the file system. You can look by address. This is looking at virtual memory pages and when they're first touched, so this is starting to show you the working set of your application. And you can see all sorts of interesting things. Let me explain the axes first, since it's a bit more complex. There are huge holes in virtual memory, so we throw them away and just draw compressed VM on one axis; simulated IO time is on the other axis. So you see, as you map your libraries and come down, you start doing lots and lots of IO, reading bits of them, and then there are sort of nice bits of backwards IO; Lord knows what's going on there. And hopefully, as time progresses, you start to actually execute code. And you can see some of the stack frames there, from where these pages are being used. So this gives you a bit of a view inside this kind of thing.
The interesting thing is that the icon cache I was showing you earlier is here, and you can clearly see that these people are touching every single page in this rather large memory map for apparently no reason, which is not necessary. So, hopefully, you can see silly things as they happen.

Now, there's another tool here. I hope I'm using the right file system model to show it to you. This allows you to browse your file system. So, on the far right here, we have your entire file system; I guess the white bits aren't used, the black bits are. And here we can also see it. If we scroll down, we can start to see some of the things in the file system. That's a particularly interesting cache file here, if I've got the right version. Which maybe I haven't. Huh. Well, anyway. You can see fragmentation in some of these files. So, for example, this 555CFS here, you can see, is split across many pieces. The tone of these things shows that it is the same file: the intensity of the colour is used to delineate order within the file. Hopefully, a perfectly defragmented file will go from dark to light, just as a continuous thing. So, when you start seeing different blocks of colour in the middle, you know that's bad news. And some files, particularly SQLite databases, get horribly fragmented during use.

Finally, there's a third thing, which is scribble. It's much like the file system view, but it lets you see multiple IOs. So, as I select different IO transactions, as they happen in order here, you can see that it's starting to draw scribble on the page. And this is starting up gedit, which is just a fairly simple text editor. And as you see, as we start to select it all, it ends up with really quite a lot of IO happening here. Lots of seeking about, all over the place. And of course, if you read all this data linearly, you can read it really very quickly.
But if you start doing this sort of stuff, clearly your IO performance goes way, way, way down. So you don't want to do that; you want to read it nice and linearly. Hopefully, you also see that there are sort of hot zones here. Clearly there's something interesting down here to gedit; I don't know what it is. We can perhaps select it and see what it's doing. Okay. So, it's doing a whole lot of GStreamer stuff, apparently, loading various things. God knows why it needs that.

Oh, no, I lie. I'm going to show you how you can actually use the tool to improve your application, not just get depressed about it. So, one of the big performance problems in OpenOffice we solved between 1.1 and 2.0 was that startup time performance was incredibly dire. And we found out much of this was in the linking stage: as you linked, it would touch lots of pages, which would be forced in, and they were all scattered all over the place. This was very slow. So we wrote a very clever application that would page all of these files in and touch each of their pages one by one by one by one. And then it would all be there, if you had enough memory, and then it would start up a lot quicker. And this was a major, noticeable user performance improvement.

However, we have this little application here that goes around touching libraries. So, first of all, it touches all of the Star Writer library, and then it moves on to SVX, and then it moves on to something and something else. And as it goes, you see the pattern building up. And hopefully, if we zoom out a bit, you can see maybe a bit more of it. You know, it's doing some nice linear reads of each library, but between them it's still dotting all over the place.
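That page-touching preloader can be sketched in a few lines. This is a minimal illustration of the technique, not the actual OpenOffice pagein tool:

```python
import mmap
import os

PAGE = mmap.PAGESIZE

def prefault(path):
    """Map a file and read one byte per page, pulling every page into the
    page cache so that later faults are cheap.  Returns the pages touched."""
    size = os.path.getsize(path)
    if size == 0:
        return 0
    touched = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, size, PAGE):
                _ = m[offset]          # this read faults the page in
                touched += 1
    return touched
```

The real tool would walk every shared library the application links against and prefault each in turn, so that by the time the linker wants the pages, they are already in memory.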
So, with a very simple sorting scheme, using the information you can get from the directory listing, which is the inode number, and so without actually having to stat the files, we can get a hint as to what order the files are laid out in, hopefully. The inode number is approximately good as an ordering hint. And hopefully, if we come here and zoom out, and I scroll down a bit, because now we're starting at the top instead of the bottom, you can see that we're starting to read fairly linearly here. Some bits we don't read, but at least we mostly don't dot about. You can then prove that you save a certain amount of time. And guess what? Each time you run it, you save exactly the same amount of time. 100% deterministic. And you can point other people to it and say, look, I saved this much, and they can't argue with you, because they can't reproduce it. So, you know, IOgrind never lies, because there's nothing to contradict it. Which is great. So, that's fun.

But having done all these things, you then come up with a whole lot of other crazy ideas. So, for example, it would be really nice to install whole systems and see how file system layout is affected by this, to tweak the kernel layout algorithms and reinstall in a few seconds. Of course, not using real data, just using bogus data, and see what happens. Just rerun the entire IO pattern of an install, and then rerun all of the applications, from Oracle databases through random desktop startups through booting the kernel, and say: ha, everything got faster. Or 99% of apps got faster and 1% got really slow, and they were broken anyway. This kind of thing. So, to do that, we really need to be using SystemTap; running an entire install through Valgrind would not be feasible, I think, much as I love Valgrind, and it is really cool. Incidentally, writing a plugin for it is relatively easy; I encourage you to do that. And then we'd use a cut-down kernel to simulate it, so that we can actually try these things in line.
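A minimal version of that inode-ordering trick, using nothing beyond the standard library; os.scandir hands back each entry's inode number straight from the directory entry, so no per-file stat is needed:

```python
import os

def paths_by_inode(dirpath):
    """Return regular files in `dirpath` ordered by inode number, which is
    a cheap, approximate hint for their on-disk layout order."""
    entries = [e for e in os.scandir(dirpath)
               if e.is_file(follow_symlinks=False)]
    entries.sort(key=lambda e: e.inode())   # comes from d_ino, no stat()
    return [e.path for e in entries]
```

Prefaulting libraries in this order is what turns the dotting-about into the mostly linear read pattern on screen.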
There are other interesting things that come out. On my 4GB openSUSE install, I have 16MB that describes all of the file system layout: all of the names, where they are on disk, and where their inodes are. If I compress that, it goes down to 2.6MB, which is 0.065% of the file system. Easy to fit in memory. And the problem with directories is that it's basically pointer chasing. You can't tell where the third directory level down is until you've read the level above it, and then you find it's just before where you were reading. It's an information problem that is only solved by seeking, if your directories are laid out like they are at the moment. I think there are lots of things that need to be done to improve file system allocation and information, and our file systems are not really good enough at the moment without band-aids. The problem is, you talk to the server guys about this, and they say: we could have a billion files in a directory. That's true. My laptop doesn't have that many; it has relatively few. There is a server/desktop imbalance here, and someone needs to take this task on from a desktop perspective. If anyone has time, see me afterwards. Why should you have to read a directory at all? You should just read that 2.6MB in a fraction of a second when you boot the machine, and know where everything is.

There are various hacks you can do to improve layout as you install packages, which is quite fun. For example, you can install into a single directory, hard-linking the files to where they're supposed to go and deleting the originals. You can control the layout in these rather hackish, but perhaps quite effective, ways. I can't claim credit for that. Open-with-size, for example: saying how large you want the file to be as you open it, so that you don't get fragmentation caused by multiple people opening files and appending to them slowly. Late block allocation, as some file systems do: it's foolish to allocate the blocks for your file as you open it or as you append to it.
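The open-with-size idea can be approximated today with posix_fallocate, which tells the file system the final size up front so it can pick contiguous blocks. A small sketch, with a plain truncate fallback for systems that lack the call:

```python
import os

def create_with_size(path, size):
    """Create `path` and declare its final `size` immediately, so the file
    system can allocate blocks contiguously instead of append-by-append."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):   # Linux and most Unixes
            os.posix_fallocate(fd, 0, size)
        else:                                # fallback: merely sets the length,
            os.ftruncate(fd, size)           # no contiguity guarantee
    finally:
        os.close(fd)
```

This is exactly the kind of hint that saves slowly growing files, like those SQLite databases, from ending up in dozens of fragments.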
You should only do it idly, as you write to the disk, because then you can delay and reorder much better, and hopefully the file is closed by then and you know how big it is. And here's the crazy idea: why not serialize your whole inode and directory cache across mounts? As you unmount your file system, you just shove all that information into a nice linear block, and then rehydrate it and save a lot of seeks as you bootstrap. Those are my crazy ideas.

Here's where the code is. It's the only deterministic solution: go and get those 5% wins. If you have ten 5% wins, that's pretty good. But if you missed every one of them because you couldn't measure it in the noise, that's pretty sad. Particularly because many of these optimizations can slightly confuse the code flow and make things more complicated: hard to argue for, but really, really important. So yeah, IOgrind is useful for applications and for file system authors, and there's lots of potential for expansion if you want to play with it and have a hack. It's easy to hack; it's sort of in C#, it's really simple. And yeah, there are URLs. Thanks to AMD for funding it, Julian Seward for producing Valgrind, Federico Mena for doing pretty graphical things, and my Indian friends, thank you for helping out with it. So thank you. Very good.