Hello. Can you hear me? Wonderful. Let me look around the room. Ah, good. Can I do a quick poll to make this interactive before we start? Who in the room would call themselves a developer — you know, you actually write code? Translations and other contributions are great too, of course, but code. And who's a pure user — a translator, someone who doesn't code? Ah, excellent. Then I have an easy talk; I'll go away now, then.

So this is what I'm going to say, and I'll be teaching you to suck eggs to some degree — I'm sorry. So, regressions. It's pretty obvious what a regression is: it's broken, and it used to work, right? That's the perception of a regression — and it would have been much better if it hadn't been there. Escaped regressions are even worse: these are the ones the user actually sees. They download the application and it doesn't behave as they expect. These people expected the suite to be stable, and it somehow failed. Non-escaped regressions are still bad, but much less bad — we catch them before they get out into the wild. So that's the normal way of talking about these things.

Well, let's talk about what kinds of regressions we can get. For example: it still works, but it got twice as slow. Is that a performance regression? Probably. But the problem is, there are speed and memory trade-offs here — how quickly you can develop it, versus how quickly it runs, versus how much memory it uses. So it's quite easy to make it 100 times faster for one user and simultaneously two times slower for someone else, right? Or we could use far less memory, so one user can actually get something done, and unfortunately something gets slower for a different set of users. So the obvious fix here is to revert the patch, right? Just revert the patch.
But then we get another regression, which is that it's now 100 times slower, right? So having done this, you basically can't win. Sometimes my fix is your regression; some of these things are, to some degree, mutually exclusive.

There are two views I've heard about how we should structure development. I get extraordinarily angry people who come and say: first, we must stop adding any new features at all, and only fix bugs until they're all fixed. And the other view I hear is: hey man, this is not a regression, whatever — I'll do what I like, just get used to it. My goal here is really to help people understand that perhaps we need some regressions — regressions are not necessarily a sign that we're doing everything wrong — and also to get people to think: maybe we should actually invest more time in avoiding regressions, fixing these things, writing tests and so on. So: moving towards the middle ground.

So here are regressions versus time — these are the closed ones and these are the open ones. As you can see, there is a slow but steady trend over the years to increase the number of regressions. But I'd like to persuade you that the slope of this curve is not some kind of hockey stick. It doesn't sound good to have regressions increasing, but on the other hand we've fixed three and a half thousand plus regressions, many of which never escaped. We don't actually track — well, I should probably track escaped regressions and draw you some nice graphs, but I didn't have time. And we track them and break them down in the ESC: every week we look at the numbers and publish them on the mailing list so that people can see. The reason is that numbers often have a positive effect in themselves.
So I like to think that by publishing the numbers we have an effect on reducing our regressions, because people don't like them. Unfortunately, when numbers get big enough, people ignore them — they're just too big. So we then try to shrink the numbers again by saying "high-priority regressions" and so on. But you can see that the bulk of these numbers tells you something interesting about the suite: to a large degree, a lot of the work and a lot of the use goes into Writer and Calc, and Impress and Base and so on are either much better pieces of software, or there are fewer people using them and finding problems.

Anyhow — another question I often get from users is: how can a regression happen? They come up to me and go: it's just unbelievable — it's my right to never see a regression, and I can't believe I saw one. You have failed me. You must be a terrible developer, because you have created a regression. This attitude is not terribly constructive for the developer, because the developer is already feeling bad about having created a regression anyway, right? So when you come and say this, they either get angry or they curl up inside and die. Not good. So I just want to help explain how regressions get caused, and that they're not unusual.

We have lots of developers in LibreOffice, with widely varying skill. We have people who have just started, translating German comments; we have people doing whitespace changes; and we have people with 20 years of experience and a PhD and, you know, C++. All of these people create regressions — all of them. I don't know a single developer who hasn't, okay? So the suggested solution of stopping anyone who creates regressions from being able to type —
taking that all the way will result in no code changes at all, which will, interestingly, stop there being any regressions, but, yeah. There is no developer who hasn't. And normally this happens in proportion to the degree of change: the more you change, the more regressions you create.

And what causes them? Well, there are lots of causes, but imperfect knowledge and understanding is it, basically. Programming is a hard job. You have to think like a computer, and the computer thinks quicker than you do. And unfortunately the person who came before you was an idiot, you know — he tangled the code up like this. So it's just not possible to know all the side effects of everything you do. Even for a small bug fix you have to assume many things. You read the code and think: it's structured like this, it's called that, almost certainly it does this — so let's fix it this way. You can't read all of the code and understand all the side effects. It's like chaos theory: the butterfly over there is flapping its wings, and there's a hurricane over here — and if only we could find the butterfly, we could kill it, and then there wouldn't be a hurricane. Anyhow.

So, some quick examples. These are just the tip of — not even the tip of the iceberg; they're like the bird poo on top of the iceberg. Three and a half thousand regressions: ask any developer you like for some very silly examples. So here's a great one, which I happened to fix, caused by a developer who is awesome. This is not an idiot causing this regression; this is someone very competent. Extremely competent. You can read the bug. The get-export methods actually also create and register things. So when you call a method named getTextParagraphExport, you don't expect it to go and instantiate a whole massive heavyweight object and then just return it.
It should be called getOrCreate, or it should have a name that explains what it does. And so, in a cleanup of the code, someone came and removed these calls because, well, no one was actually using their output — the code looked unnecessary, and it really shouldn't have had massive side effects. Unfortunately, the result was that we lost great chunks of data when you saved — ODF files, I think. You would save, and when you loaded it again, your data wouldn't come back — but only in some cases. You could test it and it would seem to work fine, but depending on the ordering of what other code was called, this thing would be created at either the right time or the wrong time, and you'd lose some stuff — some shapes might lose attributes in some documents. And the fix, which I think is still there, is particularly lame: call the get method and do nothing with the result — two lines. So it's a silly regression caused simply by bad naming, and obviously the testing was fine, I'm sure. But yeah. Unfortunately.

Here's another particularly nice one. VCL — I'll talk about this — has a hierarchy of windows, a tree of these things. And when you walk over a hierarchy, you tend to assume things in many cases, such as: it won't change underneath you; the hierarchy won't be morphed as you walk it; nodes won't disappear or throw you out of the tree mid-walk because they suddenly have no children. Similarly, you tend to assume it won't gain a whole load of children and then suddenly lose them again while you're walking. And so this method called CalcMinimumWindowSizePixel — which is even a const method on ToolBox — in fact creates a whole new ToolBox and shoves it into the tree, not as a child but as a peer, a sibling. And then it copies everything into it.
It stuffs it full of widgets and adds it to the docking manager — so it's fiddling with another hierarchy too. It then asks this child to calculate itself, and then deletes it all again. This is not what you expect when you call CalcMinimumWindowSizePixel from the top level; you expect something sensible. Unfortunately, this has been there since 2004, so this is not a new booby trap — it's been waiting to catch somebody for a long time. It's also horribly inefficient and extremely dumb. But these things hide in the code if you're not careful. So all the things you think are safe probably aren't. The problem is that you can't protect yourself against every eventuality: if you write code that copes with this sort of thing everywhere, you end up with unreadable, unusable, awful code. So it's usually better to assume, take the regression, and fix it cleanly, than to try to defend against every available possibility.

So here's a performance regression. LibreLogo: a nice little Python thing, good for education, draws pretty diagrams, good for kids at school. Who would think that adding this toolbar — which is not even visible by default — would have a significant impact on startup time? It turns out that the framework code, without putting too fine a point on it, loaded all of these icons and interpolated them down, with high-quality interpolation, from 26x26 pixels to 24x24 pixels, on every start. It used to, anyway, before I nailed it. Before even showing the rest of the UI, and certainly without the toolbar ever being visible. And you could say this is really lame code in the framework, right? Well, there is a lot of lame code in the framework, so that could be the answer. But in general it's good for a developer to be lazy: it's good to write minimal code that's easy to read and simple, and doesn't optimize for cases that don't happen yet.
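Stepping back for a moment: the first two traps above boil down to the same anti-pattern — a method whose name promises a cheap read but whose body mutates state that other code silently depends on. Here is a toy sketch of why the well-intentioned "cleanup" broke saving. This is Python for brevity, with entirely hypothetical names; the real bug lived in LibreOffice's C++ export code and looked nothing like this:

```python
class Exporter:
    """Toy model of a getter with hidden side effects.
    All names here are illustrative, not LibreOffice's real API."""

    def __init__(self):
        self._registered = {}  # helpers that will actually write data

    def get_text_paragraph_export(self):
        # Named like a cheap accessor, but the FIRST call silently
        # constructs a heavyweight helper and registers it. Later code
        # depends on that registration without ever saying so.
        if "paragraph" not in self._registered:
            self._registered["paragraph"] = "heavy paragraph exporter"
        return self._registered["paragraph"]

    def save(self):
        # Only registered helpers get their data written out.
        return sorted(self._registered)


# Before the cleanup: some code path called the getter, so saving worked.
before = Exporter()
before.get_text_paragraph_export()   # "unused" result -- looks removable!
assert before.save() == ["paragraph"]

# After the cleanup removed the apparently useless call:
after = Exporter()
assert after.save() == []            # paragraph data silently lost
```

The honest fix is renaming (`get_or_create_...`) so the side effect is visible at every call site; the lame-but-quick fix, as in the talk, is calling the getter purely for its side effect.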
So all you can see here is that we triggered a path we weren't expecting. Hopefully we've now got much better performance tests and things, and we can see this sort of thing immediately, as soon as it happens, rather than profiling just before the release and going: whoa, what happened there? This one got caught back in the day. And it's also irritating because it only happens on certain themes: some, I think, have 26x26 icons which then don't get scaled; others are 22x22 or 24x24, whatever.

So how about bug fixes causing regressions? This is the last silly example I'm going to give you — and remember, this is again just a very small cross-section of silly things. So, this code reads chunks of PNGs; it reads all of our images, inside VCL. You read the next chunk's length from the stream: a PNG is made up of a whole load of chunks of data of different lengths and different types. And clearly, someone here fixed a bug — a sanity check on the chunk length. If the chunk length is negative, something really bad is about to happen, right? So we'll just return false; we'll give up on this PNG. Unfortunately, the main result of this was that a whole load of images suddenly disappeared from people's presentations. Not very many. Not the ones you tested — the hundred you tested were absolutely fine. It's the 0.5% that are generated by badly behaved programs. And so this, instead, is my fix, I guess: truncate over-long trailing chunks. In some cases you can get an infinitely long chunk at the end — the last chunk just happens to declare a massive length that runs over the end of the file, and looks negative if you're not careful as well. So this version says: if it's less than zero — negative — or actually beyond the end of the file, just clamp it to the bytes that are left. Then we can read the last chunk and actually get your image data out without dying.
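To make that concrete, here is a simplified sketch of the lenient reading just described — in Python rather than VCL's actual C++ PNG reader, with the chunk handling cut down to its essentials (no CRC verification, no chunk-type dispatch):

```python
import struct

def read_chunks(data: bytes):
    """Parse PNG chunks leniently: clamp a bogus declared length to the
    bytes actually remaining, instead of rejecting the whole file."""
    pos = 8  # skip the 8-byte PNG signature
    chunks = []
    while pos + 8 <= len(data):
        # Chunk header: 4-byte big-endian length, then 4-byte type.
        # Read the length as *signed* on purpose: a huge bogus length
        # is exactly what "looks negative if you're not careful" means.
        (length,) = struct.unpack(">i", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        payload_start = pos + 8
        remaining = len(data) - payload_start - 4  # reserve 4 bytes for CRC
        if length < 0 or length > remaining:
            # The fix from the talk: negative, or beyond the end of the
            # file -- truncate to what is really there, keep the data.
            length = max(0, len(data) - payload_start)
            chunks.append((ctype, data[payload_start:payload_start + length]))
            break  # nothing sensible can follow an over-long chunk
        chunks.append((ctype, data[payload_start:payload_start + length]))
        pos = payload_start + length + 4  # skip payload + CRC
    return chunks
```

The earlier, regression-causing version is the same loop with `return []` (give up entirely) in place of the clamp — correct for the hundred well-formed files you test, data loss for the 0.5% written by badly behaved generators.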
So this is a great example of a bug fix — presumably for security — that broke a whole load of stuff, then had to be fixed again, and hopefully is right this time. But, you know, just fixing bugs causes regressions — that's my thesis. So the idea that we all sit down and fix every bug — 6,000 or 7,000 open bugs, each taking approximately one man-week to fix — even if we could do that, and that's like a hundred man-years, many millions, tens of millions of dollars, we would create a whole load more regressions. Hopefully many fewer, but still some.

So instead we have a load of strategies to avoid regressions. Let me show you them. The first one is quite popular in the user community: quality through obsolescence. If change causes regressions, don't change anything, right? It works — it's a great way to avoid regressions. All you do is move the datum: you say that what we have now is perfect, and so any change would cause it to regress. By defining the datum as "now", you can wipe out all your regressions — it's where you measure from. Anything that was a bug is now a feature. It's a much-loved feature that when you click that button, it crashes. That's just how it is, right?

The problem is that people's perception of quality is not conditioned by what we would call a bug. For us, a bug is: it doesn't behave as we designed it. But the world moves on — there are new document file formats, there are new features people want, there are new expectations of how the UI should look and behave. And saying "those are all features — the fact that your document doesn't open, and when it opens there's nothing in it, that's a feature, a smart-arse layout feature, not a bug" — well, the user doesn't see it that way.
So we have to be fixing these things, we have to be moving on, and we also need to build a community — if you can't change the code, it's really difficult to encourage people to contribute.

Stop the world and just fix bugs: this is another request we sometimes get. Don't allow any commit unless it fixes a bug — only bug fixes, right? And if we do this for six months, we'll have just wonderful quality and everything will be fine. Well, if we truly stopped the world and didn't let anyone contribute any kind of feature, we'd have a massive impact on our community and we'd drive people away. But the good news is that we actually already do this: we have stable branches, and anyone who likes can go and fix only bugs and merge those to the stable branch. If that's your mission in life — awesome, we love it, that's cool. We need more people fixing bugs; we need more volunteers who actually care about bugs. And if you really want it, there are loads of years of pure bug fixing there just for you. The problem is that many people who say this are really just trying to twist your arm into fixing their particular pet bug: first fix my bug, and then you can carry on adding the features I want — all for free, usually. So, who knows. And the other point is that even if we did just stop and fix bugs, it would still cause the regressions to move around to some degree.

So those are the negative approaches. There's a lot happening to reduce regressions — there have been many talks here, which I hope you've attended, about some of these tools; I'm just going to whizz through them. The first, the very baseline, is to not have any compiler warnings. The compiler can tell you your code is stupid, and we should listen to that — and we can then turn on many more warnings and find many more kinds of stupid things.
We can use Cppcheck, which, crazily, can see more things than the compiler in many cases — Julien Nabet has done some fantastic work fixing lots of these. Coverity — getting the score down to zero; all of these have a significant impact on quality. We've seen security bugs that don't affect us because of all the Coverity work we've done — it's a great prophylactic.

Clang plug-ins: there are some things the compiler can't know about our code base — the way we use our APIs and the way we structure things. You can't expect a compiler to know that, but we can teach it, and there's been some great work creating plug-ins to catch particular kinds of badness we can see in how we use our APIs. So as you compile, you can't get bad code in: you can't screw up your reference counting, you can't... there's just loads of them — a great long list of things we now check on every build on our tinderboxes.

Human testing: a lot of our human testing happens in triage, unfortunately — after the bugs have escaped, we reproduce them and prioritize them. But I've actually been extremely impressed by the amount of work happening on master: people downloading, running, using and testing master before release. There's really quite a big effort there, and it's something I don't think we celebrate enough, because these people are doing amazing work that has a really big impact. I'd encourage you: if you want to get involved, just using and running the latest master build and finding bugs is extremely valuable. And the more frequently you update, the better, really.

So — every regression is really a compound failure. Some developer caused it, probably inadvertently; they're not trying to ruin your life, they're trying to improve things.
And all the users failed to test it and report it before it got out there, right? So I think there's a dual responsibility here. And there are relatively few developers and something like 100 million users, so it's clear how to apportion the blame. The most ideal testing happens alongside the developer, during feature development — and as I said, it's really encouraging to see that volume of bug reports and bibisect results coming in from human beings testing it. Bisecting is of course really important, and there's a relational aspect to it: finding the person whose patch caused this — it's actually yours, your change created this thing — it's just fantastic.

And it's important to close the cycle quickly. If we are creating a lot of bugs, it's important that we fix them while we have the time and attention, while we have the thing in our heads, while the customer is paying the consultancy bill — we need to find the problems we create and fix them quickly. If they come up six months or a year later, the whole problem changes: people have moved on, and it's just much, much more expensive and difficult. So if you can work alongside a developer, follow what they're doing and test their stuff — that's really cool.

What else? Testing. Michael Stahl's talk was just awesome — I'd recommend it to you; watch it on the internet when it's up. If you commit just a bug fix, you'll get to fix it again later when it regresses: first you'll get to fix it again soon, and then you'll get to fix it again in another few years when you've forgotten all the details. And it's wonderful to have that deja vu feeling when you see a bug, but it's actually much better to write a unit test in the first place.
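As a tiny illustration of what "write the test in the first place" buys you — this is a generic toy in Python, not one of LibreOffice's real CppUnit tests — the idea is that every fixed bug leaves behind an assertion pinning the fixed behaviour, so a later change that reintroduces the bug fails in `make check` instead of escaping:

```python
def word_count(text: str) -> int:
    """Count words, treating any run of whitespace (spaces, tabs,
    newlines) as a single separator. Imagine each rule below was once
    a bug that somebody fixed."""
    return len(text.split())


def test_word_count_regressions():
    # Each assertion pins a previously fixed bug. Delete the test and
    # you get the deja vu fix-it-again cycle from the talk.
    assert word_count("one\ttwo") == 2    # tabs once weren't separators
    assert word_count("  leading") == 1   # leading space once miscounted
    assert word_count("") == 0            # empty text once misbehaved


test_word_count_regressions()
```

The function is trivial on purpose; the point is the shape of the test, not the code under it.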
And then you move your problem to someone else: someone else's bug fix becomes their regression, they get to improve their fix, and your fix stays fixed. So this is just a huge investment, and incrementally, the more of these we have, the less stupid stuff gets through. And it's interesting — with the OpenGL work we just did, there were lots of people working on the GL backend, and one of them caused a regression in something that I had implemented and tested. A weird drag-and-drop test in Draw found a threading bug — I guess I fixed it in this case — and make check found another. Some really unexpected good hits from the unit tests. So CppUnit has got lots of good stuff in it; I really recommend Michael's talk to you.

Then test automation: running these tests very regularly, getting them to run quickly and reliably. I can't spell "continuous", and as you can see my spell checker is not working. Great work from Norbert getting the continuous integration testing going: ideally, every commit that goes in has run not just the quick unit tests, but the whole battery of tests we have. That happens very quickly — it takes an hour — and your patch can be tested on three platforms with some reasonable quality. And of course the tinderboxes are building with extra checks and sanitizers as well. Crash testing is a great one — I've talked about this before, and it's pretty awesome what we're doing there. Performance testing, also important, under a simulated CPU. And testing old security CVE documents: without anyone intending it, when we first created LibreOffice and built the unit test framework, we tested all the old CVE documents to see if they had regressed — half of them had. So we fixed them all again; having unit testing really helps.

Then there are more expensive strategies than running unit tests. Code review is obviously a very good one.
So why don't we mandate code review for every patch — make every developer's life awful and have someone sitting on their shoulder watching every line they type? That would probably improve quality. It's also probably extremely expensive in developer and reviewer time, and demotivating, and it slows things down if you're not careful. So we reserve this technique for new contributors — people who haven't committed code before, whose code we don't know — and for the stable branches, where we require a double review for anything that's a bug fix, and a triple review later in the cycle. So four eyes have read the code by the time the fix goes in — that's the hope — and tested it and made sure it works. And this is pretty effective: of all our tools for stopping regressions, this one is really pretty good. It's relatively unlikely to get a regression onto the stable branch — there are some numbers on this; it's of the order of less than a handful across the whole lifetime of 4.4, for example.

The other thing we use review for is, of course, volunteer input: "I'm changing this code and I'm worried — can you check my code, please advise me." Unfortunately we have very little review bandwidth — everybody is very busy — so finding people to do this is a real pain; it knocks chunks out of people's days. And we really need to be focusing on encouraging new contributors and getting their code in. So TDF is hoping to help us advertise for mentoring people, whose job will be to encourage new contributors and take on this reviewing load, so that the experienced developers can focus elsewhere — which hopefully will grow the community and improve quality.

I gave this example in my VclPtr talk: a huge fix, intended to be small but which became huge — 61 regression bugs tracked and fixed from this change. And I think that's just a fantastic comment on the QA team, really: they found all of those bugs — 90% of them plus — before they actually got out to users.
And here are the fixes going on in the pipeline: 5.0.1 had three of them fixed, 5.0.2 has two fixed, and so far we don't know of any more — there certainly are some. This shows a big hole, which Matthew is filling with work to actually just open and close all the dialogs, exercise all of the UI and type things into it — coverage we're basically missing. I mean, it's humbling to see: make check passed just fine with the VclPtr work. For each problem we hit in that work, we identified the problem pattern and fixed every instance of it — we found we'd screwed up some specific thing, and we grepped for it, and we even reviewed the code. Reviewing 10,000-, 20,000-line patches is pretty tedious, but it was being screened. And so we thought it was pretty good — and even then 60 regressions got out, which shows you the holes in your tooling, the ways you're not testing.

So what's the future? Well, it's hard to predict the future, right? Easier to predict the past. One of the main reasons people don't write unit tests is that it's hard to write the first one. For years and years StarOffice didn't have a good systematic unit testing framework that could do these quick, reliable tests, and it was a pain in the backside. Caolán and myself helped bootstrap that thing, because there was such an obvious need for systematic testing. And once that's done, it's very easy to add a new test — but there was a big chunk of work for the first one. And so the Engineering Steering Committee and the developer list were asked to come up with ideas for building infrastructure, to make it easy to add the next test. So here are the things, ranked by what we thought was important. I'm guessing that Matthew has done one of these already — is that correct? Then we can apply our money elsewhere, or we can talk to Matthew as we like.
I'm sure there's more work to be done finishing it up, and hopefully you'll be able to bid to make this just beautiful. So, some of these things. (a) is the one I particularly like. Currently there's a whole area of LibreOffice we don't systematically test in detail, which is bad: text layout — making sure that the exact layout is identical. And the reason is that on whatever platform you run, with whatever set of fonts, you get a different answer — the fonts and the shaping of each language produce a different answer, which is very bad in itself. So we need to create infrastructure so we can be sure that, given everything is equal — the same shaping, the same fonts — we can actually have a predictable layout test. I have high hopes for that.

Format validity checks — I forget what these mean; maybe someone can remind me. Automated help and documentation screenshots: this was an idea to make it possible to automatically create screenshots for help and documentation, so that you can keep the documentation up to date very easily, in an automated way. Instead of having a static screenshot in the help — "it is here in the application" — you'd be able to build those images, crop them and highlight the right bits based on what's actually there. So making it very easy to update and check the quality of all of those, and particularly to translate our documentation — screenshots are very hard to maintain and translate.

Anyway, it's funny: back in the day, Linux Magazine was a German publication, and they had this terrible problem that all these English screenshots somehow had to be translated for their magazine. Some of those screenshots were only ever created once — I was very pleased, I'd done some work on a component model, and I happen to know the story of one particular screenshot that got published.
It went into a dozen magazines, and the thing it showed never existed again — that was its high point; it only ever worked on my laptop. So you can see it's a real problem, and we want to avoid that for the help documentation.

Another approach is just making it quicker for QA volunteers to test what developers are doing: people are running bisections on slow, old hardware, and TDF has money, so maybe we can help speed that up. And Norbert has some "always-green master" ideas, to try to make sure that everything goes through CI and it's all beautiful — something like that. This one was about looking for shared library dependencies, and Android unit testing. It's extremely hard to develop on Android, as you may have heard — it takes, you know, a minute to build the thing and push it and run it. And it's an extremely unreliable platform too: you just don't know if it updated your APK, so you can't be sure that what you installed is even the right version unless you manually removed it first, by clicking the buttons, really. The debugging is awful. So having no unit testing there as well is slightly nerve-wracking — it's all Cloph doing human testing before we ship that thing. So yes, these are areas where I think we need to continue to build tooling and automation to help improve quality.

So again: my notion is to make LibreOffice rock, and quality is obviously an important part of that. Regressions are bad — everyone knows this. They are, to some extent, inevitable. Finding them in advance is of course possible, and the automated tooling makes it much easier to find regressions — and part of making it easier to find regressions is that it allows us to change more. So my personal feeling is that in order to make the office suite beautiful and reliable and clean, we need to do more work more quickly, and we need to do that without breaking things too badly.
So my hope for increased testing and regression infrastructure is to be able to fix things more quickly and sleep better at night afterwards. But of course there's a balance here between how quickly we move and the quality, and we can trade one against the other. So, that is my talk. Have I run over horribly? That would be a first. I don't know what's after me — tell me, Mr. Session Manager person. Do we have more questions? Two minutes ahead — but who's next? Okay. Any questions? Matthew?

More a comment than a question, but, putting on my QA hat: I just wanted to point out that while it's confirmed that there have been 3,500 regressions, we have 5,000-odd open bugs which have never been adequately classified as being a regression or not.

Aha! Look at that. So you can get us classifying bugs and massively increase our regression count. Great — volunteer as a QA member! It's an interesting one; I think that's a good point. There are lots of bugs that probably are regressions but haven't been categorized — and then again, there are the ones that haven't been found yet. I think there would be more regressions than that, because 3,500 is just the closed number from the query. What else? If you look at other projects — I did a query on Mozilla's Bugzilla recently to try to find how many regressions they had, and it topped out at 10,000: the number of closed regressions was, suspiciously, the round number of 10,000, and the number of open ones was in the thousands. So I think this kind of steady stream of regressions is to be expected. Thank you.