This talk is about testing. I hope you joined the round table on Wednesday where we already discussed parts of this. This is a presentation about the different things we do to test automatically; in my half of the talk I will present what automated tests we are using, and the second half will be about the new tinderbox work. So I will just start.

Why should we use automated testing? Automated testing is fast and gives you a short turnaround: it tells you whether you introduced a bug or not. Best would be before you push, but even after you push it's quite good; the tinderboxes will normally report it to you in less than an hour. And you know the cycle: "we don't write tests because we don't have time, we need to fix bugs", and you have those bugs in the first place because you don't write tests. If you add a test for each fixed bug, that bug won't appear again, and over time the number of bugs that get introduced decreases. So you only need to break the cycle once, and suddenly you have time to add tests because you have fewer bugs.

So how do developers actually write tests in the office at the moment? We already have a quite good testing framework. It's built around CppUnit and it works on all platforms: Mac, Linux and Windows. The shared test code is in the unotest and test modules, and there are already quite a few tools there that help you set up the office for tests, plus HTML and XML helpers. There are XPath asserts, which Miklos introduced, so you can assert that something in an exported document is there or has a particular value. You can also validate exported files; that's quite new. At least in Calc we already use it: every file that is exported in our tests is run through a validator, so we find validation errors already in the tests on the tinderboxes. There are existing frameworks for writing tests in Calc, Impress and Writer at least. I think Base has some existing code that's in quite rough shape, and Math does not have many tests yet. There are also some real unit tests in other modules, but it's more difficult to write tests there and you need to read more code, because there are not that many tests yet.

Now the different types of tests we have. The most complex ones are ucalc, uwriter and friends. They link statically against the core libraries of those modules and therefore have access to the private symbols, so you can use more or less everything in Calc or everything in Writer when you write a test there. What's not possible is importing files, because you run into a dependency problem loading the filter libraries. These tests are quite fast, precisely because they don't load documents. On the other hand it's quite hard to implement something there, because you need to know the internals of the library, so most of the time they are used to test something special, or a dynamic change that you can't assert on after an import.

Then there are our import and export tests. They have grown quite a lot, mostly thanks to Miklos and the Writer guys, who add a test for more or less each fixed import and export bug. They come in two kinds. In an import test we import a document and then assert, most likely through some UNO calls, that a condition is true. For the export tests we have two different types: export to a file, import it again and assert on the content, similar to the import case; or, quite new, Miklos' concept of asserting with XPath on the exported file itself, to check that we wrote the correct value into the file.
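To make that concrete, here is a minimal sketch of both kinds of test, in the style of the Writer filter tests in sw/qa/extras. I am writing the macro and helper names (DECLARE_OOXMLIMPORT_TEST, getParagraph, getRun, getProperty, parseExport, assertXPath) from memory, and "fdo12345.docx" is a made-up bug document, so treat the details as approximate rather than as the exact API:

    // Import test: load a small document, assert through UNO.
    // The surrounding test framework supplies the usual includes
    // and namespace aliases.
    DECLARE_OOXMLIMPORT_TEST(testFdo12345, "fdo12345.docx")
    {
        // Check that the first run of the first paragraph kept its
        // bold formatting on import.
        uno::Reference<text::XTextRange> xRun = getRun(getParagraph(1), 1);
        CPPUNIT_ASSERT_EQUAL(awt::FontWeight::BOLD,
                             getProperty<float>(xRun, "CharWeight"));
    }

    // Export test, newer style: assert with XPath directly on the
    // XML that was written into the exported file.
    DECLARE_OOXMLEXPORT_TEST(testFdo12345Export, "fdo12345.docx")
    {
        xmlDocPtr pXmlDoc = parseExport("word/document.xml");
        if (!pXmlDoc) // the body also runs on the import-only pass
            return;
        // Expect exactly one <w:b/> in the first run's properties.
        assertXPath(pXmlDoc, "/w:document/w:body/w:p[1]/w:r[1]/w:rPr/w:b", 1);
    }

The nice part is visible here: the import test only talks to the UNO API, so you need to know very little about Writer's internals to write one.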
These tests are slow, because we have to load a document, and over time, if you add many test cases, you accumulate the load time of something like 15, 20 or 30 documents; as you know, loading a document is quite slow, and if you also need to export it, it's even slower. They also bloat our repository, so if you add such a test case, make sure the document you use for the import is small. We had cases where people added documents of a megabyte or so, which does not scale. But they are quite easy to write: you just create a document that contains what you want to test, or what you fixed, and then find a way, mostly through UNO, to assert that what you fixed is correct. You don't need to know that much about the internals of, say, Writer or Calc, because you can just use the UNO API.

Then we have the API tests, in two flavors. They are all Java tests. They are slow, and they run in the subsequentcheck target. They are a bit unreliable; Stephan fixed some parts, but they still fail from time to time for no apparent reason. Internally they use sleeps to get everything right, on the assumption that if you sleep for a second you can call the next thing. That does not always work, so you get a test failure, and when you run it the next time it passes. They are also quite hard to debug: you can have some fun setting breakpoints in the Java code and hoping that it works, or setting a breakpoint in the C++ code. But they are still the only way to test each part of our UNO API, because writing new test code for all of that would take years.

For new code we have our C++ based tests. They are more reliable, they run in-process (compared to the out-of-process Java tests), and they are a lot easier to debug: like our other tests they have direct support in the build system, so you get your GDB session, put your breakpoint there, and it will stop there for sure. You should use the C++ way for new tests, and if a Java test reliably fails, it's most of the time easier to port it to C++, debug it there, and then fix your issue. An easy way to help would be to rewrite the disabled Java tests in C++, which would let us enable more API tests again.
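If you port such a test, the C++ skeleton you need is small. Here is a self-contained sketch of the general shape; countCells() is just a hypothetical stand-in for whatever API call the original Java test exercised:

    #include <cppunit/TestFixture.h>
    #include <cppunit/extensions/HelperMacros.h>

    // Stand-in for the real code under test.
    static int countCells() { return 42; }

    class PortedApiTest : public CppUnit::TestFixture
    {
        CPPUNIT_TEST_SUITE(PortedApiTest);
        CPPUNIT_TEST(testCellCount);
        CPPUNIT_TEST_SUITE_END();

    public:
        void testCellCount()
        {
            // Runs in-process: a debugger breakpoint here just works,
            // unlike in the out-of-process Java tests.
            CPPUNIT_ASSERT_EQUAL(42, countCells());
        }
    };

    CPPUNIT_TEST_SUITE_REGISTRATION(PortedApiTest);

No sleeps and no second process: the assert either holds or the test fails deterministically.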
We also have some external testing, mostly automated. The most famous example at the moment is the crash testing, which runs on 55,000 documents: it imports them, exports them to a few formats (I think at the moment about 150,000 exported documents), tracks whether we crash during import or export, and then validates the generated ODF and OOXML files. It publishes the results at a URL and also mails the developer list with the number of files that crash. Fixing things there is appreciated: we still have, I think, a few ten thousand validation errors, and for the last run we suddenly had about a hundred more import crashes and a few hundred more export crashes, so there is room to fix some new stuff.

We also have MozTrap. That is manual testing, but it should be used for the parts that can't be tested automatically at the moment; for example, none of our UI code is testable. If you have a test case or a bug where you have no idea how to test it, and after talking to us or someone else you decide it does not make sense to write an automated test, it would be good to add a MozTrap test, so that it's at least covered by the manual test runs before a release. And my question back to you: if you have more ideas about what we can or should add, just talk to me. I'm always open to new ideas and happy to add new testing ideas if they improve the project.

Do you want to take questions now? Yeah, we can do that. Any questions on this part of the talk?

Question: what's holding us back from doing UI testing? UI testing is very fragile: if you change the dialog or change the layout, it breaks, and there is no good library support, no testing libraries that support UI testing for us. One idea is to go through the dispatch mechanism; I think the Apache guys are doing something in this direction, but in Java, and I want to avoid Java as a testing dependency, and it just requires some work to explore what's possible. Another idea was the accessibility API, but that one is slow and also not in the best shape; quite a few modules, for example the Calc one, need some love. So it's an idea.

Comment from the audience: what we do in Java is use interface elements as identifiers and go through the UI toolkit, so instead of saying "press the button at this position" we say "press the button with this identifier". Yeah, but the trouble starts earlier: how do you open the dialog, how do you get to the dialog? Getting through the menu to the dialog: you rename the option, you move the option, and suddenly it's not possible anymore. That's where you need to find a way to make it reliable, because unreliable tests don't help. You need something that's really reliable and low-maintenance for developers, and I haven't found a good solution yet, but apparently Michael has.

Michael: I have this vision of an intelligent monkey bashing the keyboard, and of course you can seed that semantically with what accelerators are available for the menus, so basically it just does this a lot. We'd love some help to make that more intelligent; it has already found some nice performance problems with Asian characters, so anything you can drive at plausible speed can be covered as well. Yeah, something like that, but it takes some work and some exploring.

Another question was: do we have minimized test cases? No, that's all manual at the moment; patches there are appreciated.
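Picking up the dispatch-mechanism idea from the UI-testing answer above, and Michael's monkey: the point is to drive the application with stable UNO command names instead of pixel positions. Below is a rough sketch of such a monkey. I am assuming LibreOffice's comphelper::dispatchCommand() helper from memory, the command list and the loop are invented for illustration, and this would have to run inside a soffice process with a document open:

    #include <comphelper/dispatchcommand.hxx>
    #include <com/sun/star/beans/PropertyValue.hpp>
    #include <com/sun/star/uno/Sequence.hxx>
    #include <rtl/ustring.hxx>
    #include <cstdlib>

    void runMonkey(int nSteps)
    {
        // ".uno:" command names are stable identifiers; unlike pixel
        // positions they survive dialog and layout changes.
        static const char* const pCommands[] = {
            ".uno:Bold", ".uno:Italic", ".uno:Undo", ".uno:Redo" };
        const int nCommands = sizeof(pCommands) / sizeof(pCommands[0]);
        for (int i = 0; i < nSteps; ++i)
            comphelper::dispatchCommand(
                OUString::createFromAscii(pCommands[std::rand() % nCommands]),
                css::uno::Sequence<css::beans::PropertyValue>());
    }

Seeding the command list from the accelerators and menus that are actually available, as Michael described, is the part that would make this intelligent rather than purely random.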
I think Miklos is doing most of the test case adding at the moment; I think he had about 280 test commits in the last two releases. What's the source? Yeah, I saw the numbers, I think it was Miklos. And before I forget: one external test I forgot to mention is the performance testing by Matúš, which helps quite a lot. The top five unit test contributors are [inaudible]. Okay, any more questions? Okay, Michael again.

Question: can we have fuzzing in your import and export crash testing? Yes, sure, if you give me more power I'm happy to add fuzzing. It has been running for four and a half days now and it is exposing bugs below us: we've already found a GCC bug, an LZ bug and a kernel bug with it. At the moment it's not possible to add more, but if you add more hardware, sure. On the more-hardware account: I saw this wonderful presentation by the infrastructure guys, and I still have the idea that we should just put a note on the LibreOffice home page saying "let us build on your machine for a week", and then we'd have enough power for everything we want to do with testing. I don't know if our users would appreciate that too much.

So, yeah. So I want to... can you hear me? I want to talk a bit about the new tinderbox stuff that I did, and the reason is that all the testing we do, tinderboxing is also testing: it tests the most basic thing, that something builds at all. We have one web page for all of this, which shows the 23 different platforms and types of builders we have, but in an ideal world all these results, all the stuff that Markus showed, would be accessible in one place, so that you can see what is breaking. You could maybe even do that with the current tinderbox page, for example by showing the crash test report in red if there are more crashes than last time, but it's not really built for that.

So let me quickly go through what TB3 does. It does building, like a tinderbox: it builds the newest thing. Although "building" can also mean just running a test, or a crash test, or whatever; the actual implementation of what you test is abstract. If something breaks, it can go back in time and bisect to the first commit that broke something, and it can do that on multiple repositories. That means you can run it on the source code, with real builds and whatever you do in the build, but you can also run it, for example, on a repository of documents: take a set of documents and find the first time a document started to crash. It puts this on a web page like this one, where you see: okay, this was the last good commit, and the next time it built, it looked bad, so it went back and tried to find the first commit that actually broke something. You can also see "not yet known", which means there are new commits on this branch adding new stuff.
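Once one run is known good and a later one known bad, finding the first bad commit is just a binary search over the commit range. Here is a self-contained sketch of that logic; buildAndTest() stands in for whatever TB3 actually runs (a build, a crash test, and so on), and the function name is mine, not TB3's:

    #include <functional>
    #include <string>
    #include <vector>

    // Returns the index of the first commit whose test fails,
    // given that commits[lo] is known good and commits[hi] known bad.
    size_t firstBadCommit(const std::vector<std::string>& commits,
                          size_t lo, size_t hi,
                          const std::function<bool(const std::string&)>& buildAndTest)
    {
        while (hi - lo > 1)
        {
            const size_t mid = lo + (hi - lo) / 2;
            if (buildAndTest(commits[mid]))
                lo = mid; // still good: the first bad commit is later
            else
                hi = mid; // bad: the first bad commit is here or earlier
        }
        return hi;
    }

Because each probe is a full build and test run, the logarithmic number of probes is what makes going "back in time" over hundreds of commits affordable.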
This stuff has been roughly ready for a while, I was working on it a bit on the side, but we never really deployed it, because it was not easy to deploy. So what I did now is put the whole thing into Docker plus Salt, so you have Docker containers that do all of this. Everything is auto-configured, which makes it easy to set up and gives you a reproducible environment. You can use it as a simple tinderbox just running on your local machine, but you can also have multiple tinderboxes running on different hardware doing the same test, some of them in some cloudy thing and others on actual physical hardware.

And the main point, you have heard this one before: with Docker you get a reproducible environment for the tests you are doing. For example you can have our baseline in an image, and then test on your local machine whether something breaks on the baseline even when it does not break on your machine.

So let's look at how this is set up; it's actually quite simple. If you look at the project layout (I'm going to upload it later this week), it's just: a salt master that distributes the configuration to all the machines; the image of the minion, which is a Docker image that takes its configuration from the salt master; a Makefile that creates all these images; some Salt configuration; and an example script, which is the perfect script for every developer, because it always says "this built, and it was fine, there was no problem with it".

If you set this up locally, you first run the make target that generates the SSH key pair used to log in to and configure the salt master. You never do that manually, but it allows the Makefile to remote-control it. If you then type make, it creates the images, however many you want to have, configures them, sets them up and installs all the software on them. That takes about five minutes, and after that you have the whole thing set up locally. If you type "make start", all the containers start up and work with each other. There are more commands, but I won't go into that right now.

Then you essentially have a tinderbox: it watches for new commits, checks whether something is broken, and if something is broken it goes back in time to bisect. So you can simply bisect, with a nice UI, or you can use it as a tinderbox that runs against a moving target.

To show you what the actual tinderbox script looks like, the thing that does the test, whatever you're testing, a crash test or anything else: this is essentially the whole thing you need. This is the example that never fails. The first thing it does is ask which commit it should build; that's the first request, one URL. Then it says "okay, I'm going to start building this", and finally the minion says "okay, I'm finished, and this is my result". As you can see, the result here is always good, so it never fails. So actually setting up a tinderbox that runs against this is quite easy, and you don't have to take care of all the testing infrastructure and result collection, because that is all in the other parts. You just add a test, and whatever you want to test, you can test with these three URL requests.

Okay, that's my part of the talk. If you want to know more, there's the URL at the beginning, on the first slide, and my other talk also covered quite a bit of TB3 and how the details work. Any questions?
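For reference, here is a sketch of what those three URL requests could look like as a minimal client, using libcurl. The host and endpoint names (tb3-master, /next-commit, /build-started, /build-result) are invented for illustration; the real URLs are whatever a TB3 deployment exposes, and a real script would also parse the response of the first request:

    #include <curl/curl.h>
    #include <string>

    // Perform one HTTP request; the response goes to stdout by default.
    static bool request(const std::string& url)
    {
        CURL* curl = curl_easy_init();
        if (!curl)
            return false;
        curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
        const CURLcode rc = curl_easy_perform(curl);
        curl_easy_cleanup(curl);
        return rc == CURLE_OK;
    }

    int main()
    {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        // 1. Ask the master which commit to build.
        request("http://tb3-master/next-commit");
        // 2. Announce that the build has started.
        request("http://tb3-master/build-started");
        // ... build and run the test here; this example always succeeds ...
        // 3. Report the result ("good", as in the script that never fails).
        request("http://tb3-master/build-result?state=good");
        curl_global_cleanup();
        return 0;
    }

Everything else, the scheduling, the bisecting and the result display, lives on the master side, which is why adding a new kind of test is cheap.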