Okay, this is the Coverity and OSS-Fuzz issue-solving talk. What I've done here is leave out all of the simple stuff, so if any simple bugs are reported, that's not really covered here. It's just the more interesting things: techniques to get rid of more complicated Coverity warnings, and ways to solve some of the OSS-Fuzz issues that arise as well.

There will be lots of pretty slide transitions to keep you entertained, and also to make sure that they work. I did it all in my development version, where the slide transitions work perfectly fine, and then they didn't actually work in the Fedora version, which could be tracked down to a change in how GLM works. There has been a Fedora updates-testing build since yesterday that has slide transitions working for Fedora again. So I guess, as is traditional, going to conferences makes you use your own software and realize what bugs need to be fixed.

So, the configuration we have for Coverity. The way Coverity works is that you build it locally on your side, and that outputs a great big blob which is uploaded to their server, and their server does the analysis on that. So at least in the open source version that we use, you don't check the results locally; you have to use their website and their output to see what the results are. The way we've configured it nowadays is that it's fully public, although it defaults to private: you get a chance to look at the bugs first yourself and decide whether to make them public or not. Once we'd fixed all of the bugs, we decided to be public, because we had no legacy issues that we needed to be concerned about anymore. So if you have ever applied to be a member of any of these projects, you don't need to be a member of the LibreOffice one; all the issues are public. In the older versions of Coverity, your project was either a C++ one or a Java one; nowadays it scans both Java and C++ in LibreOffice and reports issues on both languages.
Coverity does not support C++2a, but does relatively recently support C++17, which means that we patch our configure to go back down to C++17 to get it past the Coverity tooling. And we only scan LibreOffice itself; we don't worry about any of the third-party projects that we use. Some of those are scanned separately, and some of them are other people's issues.

This is how the website where you see the results looks. The results are emailed to the list, but the UI on the website is superior when you have a non-trivial issue. This is an example case of an uninitialized member that's just reported like that. A lot of the warnings that Coverity gives are heuristic-based as opposed to guaranteed; heuristic-based in the sense that if no members are initialized in your struct, it won't warn about that, it assumes you know what you're doing. If you initialize most of them, but not one or two, it will warn about that. The same is true for a lot of the cases: it looks statistically at whether or not things are out of the ordinary. So your code can be unchanged, somebody can delete a couple of lines of code elsewhere, and that means the statistics for this particular pattern change, so you get new warnings even though no change has been made to the code that's being warned about; the overall state has changed. So warnings can appear and disappear with nothing to do with the actual lines of code being reported. You'll see, if 55 call sites check a return value and then a case is introduced that doesn't, then that's something that's worth flagging. But if it's 30-30, and then somebody changes one of them so that it's 31 checked versus 29 unchecked, then you might get 29 warnings saying that return value is unchecked. That's why things sometimes appear and disappear.

If you want to waive a warning, you can just do it directly in the web interface.
You can say this is not a warning, this is a false positive, this is intentional. But if you do that, there are two issues. The first is that if the code changes sufficiently that Coverity can no longer track that it's the same code, the warning will reappear. The second is that it only affects that Coverity instance: inside Red Hat we run another instance of Coverity and we put LibreOffice through the paces on that one as well, so if I get rid of false positives in one, I don't get rid of them in the other, unless I do something that can be detected by both instances.

So we have the annotation stuff here. There are two styles of annotation, which I know work and have seen documented in our own internal documentation in Red Hat, but which seem to be hard to find on the public web. If you use the warning name that I highlighted in the earlier slide, say uninitialized_member, put it between the brackets after "coverity", and add an optional comment, it becomes marked as an intentional issue in the web interface. But if you add a space, colon, FALSE, it's marked as a false positive automatically, which is convenient. So the top one here is inside Calc, where there's a struct that's deliberately not initialized. It has an OUString member, which has a default constructor, so some of its members are initialized and others are not; you can say that they're all deliberately not initialized with that annotation, and it's recorded as an intentional issue. The other one is where there's a copy-and-paste warning which is just completely spurious, and that can be marked as a false positive.

The other annotation is that you can mark a function as a fatal function: if this function is called, your program will terminate at that point. That's automatically in place for things like assert and abort, obviously. And then you can add annotations, in our case, to say that the CppUnit call reports an error.
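The two annotation styles described above can be sketched like this. This is a minimal sketch: the struct, member names, and surrounding code are invented for illustration, and the exact placement of the comment relative to the flagged line is an assumption; only the bracketed-warning-name and `: FALSE` conventions come from the talk.

```cpp
#include <cassert>
#include <string>

// Intentional-issue style: the warning name in brackets after "coverity",
// plus an optional free-text comment. Shows as "intentional" in the web UI.
struct RawEntry
{
    std::string maName; // has a default constructor, so it counts as initialized
    int mnValue;        // deliberately left uninitialized for speed
    // coverity[uninitialized_member] - deliberately not initialized here
    RawEntry() {}
};

// False-positive style: appending " : FALSE" after the warning name marks
// the issue as a false positive automatically.
// coverity[copy_paste_error : FALSE] - the operands really are meant to match
bool isSame(int a, int b) { return a == b; }
```

The annotations are plain comments, so they cost nothing in normal builds; only Coverity's analysis reads them.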
How it actually reports an error is that it throws an exception that's unhandled. So if you leave things in their default state, you get 5,000 warnings saying that you have unhandled exceptions in LibreOffice, but they're all from places that are meant to deliberately fail in our CppUnit tests. So that's annotated as being a fatal function: once that's called, Coverity knows that it's the end of execution, and any follow-on issues can just be dismissed automatically.

We run Coverity with --enable-assert-always-abort, so any asserts that you put in will also, if triggered, terminate flow. If you were to run without that flag, you would find that there are warnings inside LibreOffice on paths that the asserts rule out. But because we say assert is fatal, we really mean what we say when we have an assert: we say this will not happen. So asserts are not for use on trivial issues; use SAL_WARN instead, because with an assert you are removing those paths from the source code analysis. If you use an assert that's not a true assert, that's a real problem.

Sometimes it's tainted data. You want to know if something is tainted data: if you're reading data from a file format, say a Microsoft Office binary file format, the values are untrusted. If it says that there are minus-one values to follow and then you try to allocate minus one values, then you have an issue there. So you want to know that this data is untrusted. But there are cases where Coverity will say this data is untrusted, and you happen to know that the data is shipped with LibreOffice and actually is trusted. So in this example, we don't want to say that all values that come back from that function are trusted, just that in this specific case we trust it, and the coverity tainted-data-sanitize markup here will say that this data is trusted, and that removes that issue.
This all comes back to the Heartbleed issue in OpenSSL: at that point, Coverity added this support by detecting common byte-swapping techniques as a way to know that certain data is tainted. So the other solution, whether it still works or not I haven't double-checked, is that if you use the then-documented non-standard swapping pattern, Coverity will not consider that data as tainted. So if you wanted to say that all data that comes back from a function can be trusted, despite the byte-swapping, you can byte-swap in that style and apparently it will be considered untainted. And if you just want to actually test your data properly, in the case where you are reading untrusted data, then any kind of check at all on the data will cause it to be considered validated. So here I've checked to make sure that the length of the resources is possible within the length of the file. It's unsigned, so there's no need to check for less than zero; if it was signed, you'd have to check both ends of it, otherwise it'll be considered unchecked. You get a lot of that, especially in the image filters.

Yeah, sliding stuff. The other thing that's difficult to deal with in Coverity is its tracking of exceptions. It's very good at tracking where an exception could be thrown from, which gives us a lot of exception warnings in areas where in practice it's not going to happen. Typically you get something from the configuration, where reading the configuration could throw, but the configuration is going to throw the first time it's read, and typically you read it maybe at startup in a constructor, and then you read the same configuration data in the destructor. When it comes to the destructor, for instance in a std::unique_ptr, which cannot let any exceptions escape its destruction, you're going to be told that it would abort if it ever happened, but it's not going to happen.
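The validation idea above can be sketched like this. The function and variable names are invented; the real code checks a resource length from a file header against the bytes actually remaining in the stream, and the point is that any explicit upper-bound check makes Coverity treat the value as validated.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// nLen comes from an untrusted file header; nStreamRemaining is how many
// bytes are actually left in the stream. Names are illustrative.
bool readColors(uint32_t nLen, uint64_t nStreamRemaining,
                std::vector<uint8_t>& rOut)
{
    // nLen is unsigned, so only the upper bound needs checking; a signed
    // length would need a < 0 check too, or it stays "unvalidated"
    if (nLen > nStreamRemaining)
        return false; // corrupt document: claims more data than exists
    rOut.resize(nLen); // now safe: cannot allocate more than the file holds
    return true;
}
```

Bounding the allocation by the real stream size also heads off the huge-allocation problems that the fuzzing section below runs into.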
It could happen if somebody deleted the configuration while your program is running, but in practice it's not going to happen. So how are you going to get rid of these? You can either come up with a really convoluted scheme of catching the exceptions, which is all just going to clutter your code, or you just accept that it's not going to happen. You do want to hear about other cases like that, just not in this specific case. So when you get that warning, the most recent one was just yesterday or the day before, and you have your std::unique_ptr, you can pass in the deleter parameter, and the deleter parameter is that o3tl::default_delete, which, when it's not compiled under Coverity, just does the standard delete. You assume that that exception isn't going to happen, and just to tell Coverity not to worry about it, you have effectively that. It's not exactly the same code, I shrank it to fit on the slide, but you can see that if __COVERITY__ is defined, you know that you're running under Coverity, and then that silences that whole set of warnings so that you don't have to worry about them.

All right. So those were the tips and techniques we've been using to get the Coverity numbers low, to keep them manageable, and what we really do want to hear about is changes in warnings from Coverity. These are yesterday's stats from Coverity. We just fixed the last warning there again, so we're back down to zero warnings. For the amount of code we have, you can see that we're analyzing 6.1 million lines of code. We're down from about 6.5 million lines of code since 2018, so we've actually shrunk our code quite a bit, which is nice; it hadn't happened for a while. And those are the figures we have there.
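A sketch of the idea behind o3tl::default_delete, as described above. This is not the real LibreOffice implementation, just the shape of it: normal builds get a plain delete, while under a Coverity build (where the Coverity compiler predefines __COVERITY__) the destruction is wrapped in a catch-all, so Coverity stops warning that a theoretically-thrown exception would terminate the program.

```cpp
#include <cassert>
#include <memory>

// Minimal sketch of an exception-swallowing deleter (names invented).
template <typename T>
struct default_delete_sketch
{
    void operator()(T* p) const
    {
#if defined __COVERITY__
        // Coverity build: we assert the exception cannot really happen,
        // so swallow it to silence the unhandled-exception warnings.
        try { delete p; } catch (...) {}
#else
        // normal build: identical to std::default_delete
        delete p;
#endif
    }
};

struct Config { /* pretend this destructor could theoretically throw */ };
using ConfigPtr = std::unique_ptr<Config, default_delete_sketch<Config>>;
```

The nice property is that production builds pay nothing: the try/catch only exists in the analysis build.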
The other stat I have is a timeline of how many bugs we started off with back in 2014, and we shrank it all the way down, with various gaps. You see there's a gap there where we didn't have the ability to run Coverity for a couple of months, because we required C++17 and there wasn't C++17 support in Coverity until July last year or whenever, and the numbers rise when they're not being constantly kept under control. Yeah, so that's the end of the Coverity section.

So, OSS-Fuzz. That's where there's a huge giant set of cores set aside on Google's cloud where it fuzzes documents for us. In this case, they build our project on their side, on their compilers and their hardware. They call our build script, which you can find in our source tree, and which builds the various fuzzers in our configuration. So we've got forty-five different fuzzer targets. We've got all of the graphics file formats, BMP, GIF, PNG, JPEG, etc., etc. Then the file formats such as DOC, XLS and all those ones, and our own file formats in the flat versions, so flat ODT, flat ODF and all of them. So there are 45 individual targets, and they build them on their side in three different configurations: two libFuzzer ones, with the address sanitizer and the undefined behavior sanitizer, and the AFL one, which is basically the address sanitizer version of that as well. The three of them are run continuously, giving us 135 fuzzers running away constantly.

What's a real problem for us is that the configuration on their side is with no dynamic libraries at all. You can find the configuration in that directory there if you're interested. What we do is reuse that DISABLE_DYNLOADING thing, which is intended for iOS, because Tor manages it effectively and he manages it for iOS. So when the iOS side changes, there can be unexpected changes for the fuzzing as well.
So new components added for iOS will end up getting compiled in the different configuration for OSS-Fuzz, and it doesn't always succeed. The individual fuzzers are still, unfortunately, very large, about 300 megs per fuzzer, when they really should be a tenth of that. So they're individually quite large. We don't run with the configuration layer inside the fuzzers, so we just hard-code various defaults, and you can check at runtime whether or not you are fuzzing, whether the configuration is being avoided, and put in some other kind of default.

If anybody else is interested in running fuzzers against us, that's the URL; you can find seed corpora there for the formats: the 45 file formats mentioned, and an additional 15 from the Document Liberation project with the extra filters that are fuzzed separately. So you have corpora there which are not our full collection of documents for each file format, but a cut-down, minimized set that exercises the most code while not exceeding, I don't know, about half a meg in size or something per individual case.

That's what the OSS-Fuzz website looks like. Their bugs are again private, and they remain private until 60 or 90 days after they're fixed, something like that.
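The fuzzer targets described above follow the standard libFuzzer entry-point shape; here is a self-contained sketch of what one target looks like. parseDocument is a stand-in for whichever filter a target exercises, not a real LibreOffice function, and the magic-byte check is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

bool parseDocument(const uint8_t* data, size_t size);

// libFuzzer calls this once per mutated input; it must not crash, hang,
// or trip a sanitizer, or OSS-Fuzz files a bug with the offending input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    (void)parseDocument(data, size); // exercise the filter under test
    return 0; // return value is reserved; always 0
}

// Stand-in "filter" so the sketch is self-contained: pretend the format
// starts with a 'B' magic byte (e.g. BMP's "BM").
bool parseDocument(const uint8_t* data, size_t size)
{
    return size >= 2 && data[0] == 'B' && data[1] == 'M';
}
```

Each of the 45 LibreOffice targets is a binary built around an entry point of exactly this shape, which is why the seed corpora are organized per format.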
A lot of these issues are now publicly available; you can examine them if you wish, but by default they're private at first. What's nice, I find, is the minimizer: you can start off with a large test case, in this case it's not too big, 3-4K, but it'll minimize it down to the smallest case itself, so you don't have to work with a large document when you do get something reported.

So let's see a more recent actual bug that came in. The top line is the original code, and there's kind of an easy hack underway at the moment to find better integer types. So it went from sal_uLong, which is effectively a size_t, and it was changed into a sal_uInt16, because nColors is a sal_uInt16. But that means that during the shift operation, with the wonderful C++ promotion rules, it becomes an int at that point; then you have two signed ints, you try to add them together, the numbers are too large to fit in signed ints, and you get this undefined behavior warning like that. At the bottom is a simple fix, casting it back to an unsigned type, to get rid of the issue again. So that's the kind of bug you get out of OSS-Fuzz.

What complicates things, in what it reports first, is mostly the timeouts. The timeout is about 25 seconds, at which point they log a bug saying your target is slow. They'll only report one at a time, so you get one timeout bug per target; if you fix the timeout bug for a target, you typically get another timeout bug a couple of days later for a different issue, so it's a real treadmill. Some of the file formats seem to have gotten rid of all of the timeouts, because there haven't been any reported for ages; no one seems to have seen the ODF timeouts, for example. What's outstanding at the moment is various timeouts in the DOC filter, the LWP filter, the Lotus WordPro filter, and a few other ones like that. Some of them come and go: we're just barely on the threshold, and we slip below it for a couple of weeks, it
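Reconstructed with standard fixed-width types, the promotion bug and its fix look like this. This is a sketch: the real code uses sal_uInt16 and different variable names, and the exact expression from the slide is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// nColors is a 16-bit count read from the file. Writing `nColors << 16`
// directly would promote the uint16_t to a *signed* int first, so a large
// value shifted by 16 overflows int: undefined behavior, which is exactly
// what the UBSan fuzzing build reports.
uint32_t makeOffset(uint16_t nColors)
{
    // the fix: widen to an unsigned type *before* shifting
    return static_cast<uint32_t>(nColors) << 16;
}
```

The general lesson is that shrinking integer types is safe only if every arithmetic expression they feed is re-checked against C++'s promotion rules.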
disappears, and then we rise above it again and it reappears. So how can we help with the timeouts? Obviously, if we're processing infinite data it's going to time out at some stage. So there's an options file inside that vcl/workben directory; normally it's 64K, but some file formats take longer than that, so we slip in a 32K or some number that gets it all in, so you can adjust that. Some of the file formats, though, no matter how small you limit the input, have basically near-infinite decompression abilities: you give them 10 bytes and they can generate hundreds and hundreds of megs from that, just by putting in an integer saying that there are going to be this many zeros following it. We have to have some way to limit that. So when we're running under OSS-Fuzz, and if that fuzz max input length, which comes from the max length above, is set, then you can say that decompression is limited to some arbitrary multiple of that value, depending on the filter: some filters are 256 times that value, sometimes it's 1000 times, but there is some ceiling limit on the amount of decompression that's allowed given the input.

And the timeouts are often just reporting that there's an infinite loop. So here's an example of a true infinite loop inside one of the file formats, where there's a chain of properties, and each property tells where the offset of the next link in the chain is. A simple technique that's commonly used in a lot of the filters is just to track what offsets have already been visited: if you've already visited one particular link in the chain and then you return to the same link again, then obviously your document is just corrupt, it's bust, and you just break and move on. That's the most common solution for the real cases of infinite loops.

There's also memory. Another issue is that you're limited to 2 gigs, and you can trigger that with JPEG itself: there are a couple of examples of simple files that will break right
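The visited-offsets technique for breaking those chain loops can be sketched like this. walkChain and its callback are illustrative, not LibreOffice code; the real filters track offsets inline while parsing.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Follow a chain of records where each record stores the offset of the
// next one, with 0 meaning end-of-chain. Returns false if the chain loops
// back on itself, i.e. the document is corrupt.
bool walkChain(uint64_t nStart, uint64_t (*next)(uint64_t))
{
    std::set<uint64_t> aSeen;
    for (uint64_t nOff = nStart; nOff != 0; nOff = next(nOff))
    {
        // insert() returns false in .second if the offset was already seen
        if (!aSeen.insert(nOff).second)
            return false; // been here before: loop detected, bail out
    }
    return true;
}
```

Any terminating walk does bounded work, and a corrupt file that points back into itself is rejected on the first repeat instead of spinning forever.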
through that limit that are perfectly legitimate JPEG files. For libjpeg-turbo, we got some patches in to convince them that we could set the old max-memory flag, which already existed, but to actually honour it, so that you can set some ceiling on how much JPEG will allocate. Similarly, inside Calc you can set max matrix elements for some of the Calc file formats to help limit that, and similar limits on calculation bring things under some kind of control as well. There are various tips and tricks like that in the code.

In practical terms, the most useful thing of course is just double-checking how much data is available and comparing that to what the headers, especially of the image filters, claim. If there's a huge graphic coming up, you can check to see if that's possible. Some of the file formats, of course, are compressed, so you have to look at whether or not the decompression would be able to fill in that data. Some of the filters have known max compression levels: you know that JPEG cannot produce more than a certain ratio against its input data. So we do that for a number of the binary file formats: we compare what their max possible decompression of the input is, and if the claim exceeds that, we bail out again. So these have practical real-world implications.

Yeah, so those are the results for the amount of reports we've been getting. In the last three years we've gotten basically a thousand of them, about one a day, but they've all been front-loaded: the first two years had a huge amount, with a tail-off last year. We've been adding the fuzzers incrementally over those two years: every time a particular filter settled down and stopped giving constant results, we could add in the next filter, so it's kind of kept to that limit. We've stopped adding fuzzers now, I think we've got most things covered, and we're still getting this constant trickle, but it's nowhere near what it was before. And of course, any time anybody makes any changes to a given filter, they tend to reintroduce some problems as well. So
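The max-compression-ratio sanity check can be sketched like this. The names and the 256x default are illustrative (the talk says the multiple varies per filter); dividing the claimed size rather than multiplying the available size avoids integer overflow on hostile headers.

```cpp
#include <cassert>
#include <cstdint>

// A header claims nClaimed bytes of uncompressed output, but only
// nCompressedAvail bytes of input actually exist. If the claim exceeds
// what the format's best possible compression ratio could produce,
// treat the file as corrupt and bail out early.
bool plausibleDecompressedSize(uint64_t nClaimed, uint64_t nCompressedAvail,
                               uint64_t nMaxRatio = 256)
{
    // equivalent to nClaimed <= nCompressedAvail * nMaxRatio,
    // but immune to overflow in the multiplication
    return nClaimed / nMaxRatio <= nCompressedAvail;
}
```

A check like this turns a would-be 2-gig allocation or a minutes-long decompression loop into an immediate, clean rejection.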
there were some changes yesterday to the Lotus WordPro filter, and I can see that I have five or six new issues reported against it already. So yeah, it's very dangerous and fraught to change any of those kinds of hacky binary file filters. Right, and that's what I've got; that's the end. Are there any questions?

[Audience question: is the fuzzing done on other platforms?] Not for us, but I think they have their own. They reuse all of this for Chromium, so I think for Chromium they have different platforms that they test separately on, but in our case it's just Linux that they work on. So yeah, all of the stuff you see here, they reuse it for their own software and they just let us have a little bite at it. Okay, thank you.