My name is Antonio Terceiro, and I'm going to talk to you about patterns for testing Debian packages.

Let's start with a brief introduction to the Debian CI. The Debian CI (debci) is a service whose goal is to provide automated testing for the entire Debian archive. Back in 2006, autopkgtest was created: a tool that allows you to have tests inside a Debian source package and have those tests executed against the binaries produced from that source package. It supports running on your regular system, against KVM virtual machines, against LXC containers, and several other types of virtualization backends. There was also an idea circulating in the community that we should probably run autopkgtest for everything, but nobody did it. So in 2014 I started the Debian CI, with a very simple system that ran everything sequentially in a loop, and since then I have been improving it. We had a few Summer of Code students helping improve the system as well. Today we have a system that works pretty well and scales with large changes to the archive.

We have plans for having the Debian CI gate migration from unstable to testing. The idea is that if your package has a new version, and all its reverse dependencies still work, and the package itself still works in unstable, then it can migrate to testing faster, instead of waiting the usual number of days. And the contrary as well: if your new upload breaks something, then it's going to be blocked from going to testing.

The Debian CI records the test history for your package. The history goes back to the beginning of the service itself, and on top of that we keep six months of detailed logs, so you have quite a lot of data there. If you have a failing test, you can look at the logs to see exactly what failed, which version of each dependency was installed, the state of the testbed, that kind of thing.

Since 2014 we have been really improving our coverage of the archive. Right now we have more or less 8,000 source packages with tests; when we began, we had only a hundred and something packages with tests. So since then we have been doing a pretty good job as a community of adding coverage for our archive. Today we cover more or less 28% of the archive; that's counted in source packages, since the tests are declared at the source package level. And on average we have been adding tests to 20 packages every day since then, which is pretty nice.

In this process of proposing that we increase our usage of automated testing, I wrote test suites for several packages, I helped people with questions on how to test things, and I looked at how a lot of packages were being tested. I started noticing similar solutions, and I started proposing the same solutions to different packages, and I always had the idea that those solutions should be documented. Then I started thinking in terms of patterns, which is the main topic here.

A pattern is a reusable, documented solution to a recurring problem. Patterns are commonly used in design disciplines like architecture (building architecture) and also in software engineering. If you have been around our field long enough, you have probably seen the "Design Patterns" book, which is the classical introduction to the patterns way of looking at things in our discipline of software engineering. It was released in 1995; today I would say it's one of the classics of software engineering and computer science, and it was the first of a series of books on the theme.
So you have patterns for enterprise application software, patterns for software architecture, all kinds of patterns. Last year I noticed that there was going to be a patterns conference, SugarLoafPLoP, more or less close to where I live: Buenos Aires is more or less close to home. I decided I would try to document the patterns I had seen in the Debian CI context, go there, and get help improving this documentation, so that it could be useful to the Debian community. They accepted my paper. This is the full reference: if you want the PDF, you can download it from that address, and the DebConf schedule page also links directly to the PDF.

When you are documenting patterns, you usually have a few common elements in what we call a pattern catalog. That original book on design patterns for object-oriented systems is a pattern catalog: it has twenty-something patterns, and each of them has well-defined sections. The template we are using here is: the pattern has a title; a context, which is the situations in which you can use that pattern; a problem, which is what you want to solve, the problem that repeats in several situations and for which you are documenting a solution; forces, which are the things you have to take into consideration before applying that pattern, like compromises you have to make, or drawbacks of doing that in that context; a solution; consequences, which is usually a discussion of what happens as a result of applying that pattern; and then examples. There are several different styles of doing this: you can be very explicit and have explicit sections for each pattern, or you can write those sections in a more narrative way, with a more fluid text covering the several sections.

Now a little bit about DEP-8, which is the specification that defines how you add tests to a Debian package. The main idea is to test the package in a context where it is installed on a user system. You don't want to test the contents of the source tree, or the binaries you just built: you want to test what gets installed on the user's system. That's why this is called as-installed testing: testing packages as installed.

So let's see how that works, a very simple introduction. You need to have a debian/tests/control file, which uses the same format as debian/control itself, with different paragraphs for different tests. If a paragraph has a Tests field, that's a list of test programs that live inside the debian/tests directory. In the first paragraph there at the top you have test1 and test2: those are assumed to be two programs inside debian/tests, and they have to execute and return zero. If they return anything different from zero, we assume the test failed. You can also specify the dependencies specific to each test. In the second paragraph you see an at sign (@), which means all the binaries produced from this source package, plus an extra package that is only needed to run the tests. Then if you run the test3 program and it exits zero, the test passes, and otherwise it fails. And a more or less recent addition: instead of specifying the name of a test program, you can also use the Test-Command field and specify the command directly.
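Since the slide itself isn't reproduced here, this is a hedged sketch of what such a debian/tests/control could look like; the test names match the description above, and the extra dependency and the command are made-up placeholders:

    # First paragraph: two test programs in debian/tests/;
    # Depends defaults to @ (all binaries built from this source).
    Tests: test1 test2

    # A test-only dependency has to be declared explicitly.
    Tests: test3
    Depends: @, some-test-only-dependency

    # No test program at all; run a command directly.
    Test-Command: foo --version
    Depends: @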
We noticed that for several test suites all you need is a one-line script that calls some command to reuse existing test infrastructure, so it was added to the specification that you can set the command directly.

You also need to have a Testsuite field in the source stanza of debian/control, but if you are using dpkg-source from stretch or later, that's done automatically for you when you have a debian/tests/control file.

The main implementation of the specification is autopkgtest. There is also a program in devscripts, sadt, that implements the specification, but it doesn't implement everything, so today autopkgtest is the only complete implementation. You can run it against a source package together with a .changes file containing binaries; in that case the tests will use the binaries you just built, the ones referenced in the .changes file. Or you can run it against the current directory, which is the third line in the example, where the -B option says: don't build anything from the source package, just assume its binaries are already installed. And at the bottom you see some options for virtualization, so you can use QEMU or LXC to have an isolated testbed where the corresponding binaries are installed and tested.
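As the slide with the exact commands isn't reproduced here, this is a hedged sketch of what such invocations look like with a recent autopkgtest (older releases shipped the same functionality under the adt-run name, with slightly different syntax); the file and image names are placeholders:

    # test a built package: use the binaries referenced by the .changes file
    autopkgtest hello_1.0-1_amd64.changes -- null

    # test the current directory, without building anything (-B)
    autopkgtest -B . -- null

    # the same, but against an isolated testbed (QEMU image or LXC container)
    autopkgtest -B . -- qemu autopkgtest-unstable.img
    autopkgtest -B . -- lxc autopkgtest-unstable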
Now we are going to look at the patterns themselves. The first one is called "reuse upstream test suites", and this first one is annotated with each part of the template, so you can follow along. The context: upstream has tests written, and they are usually intended to run against the source tree, like unit tests; they usually assume you are running them after a build, and they are going to use the binaries in the source directory. So there are no tests for the installed package, and this is the problem we are going to solve. Then we have to take the forces into consideration: the maintainer might lack the time, or the skill, to write proper as-installed tests, so it's not always easy to do this testing; but on the other hand, you already have the tests that upstream wrote. One way of solving this problem, in this context, is to implement the as-installed tests as a simple wrapper program that calls the existing tests provided by upstream, so you just reuse whatever upstream intended to be used during the build. In that context, reusing unit tests is very useful for library packages: assuming you have your library installed, its unit tests should pass. And if you have acceptance tests, those are also useful, for applications. One example of that is the LXC package. Upstream provides acceptance tests, which is actually the best case: they don't assume anything about the source tree or about having built anything, they already assume that everything is installed. So the wrapper can just call everything, handle errors, and exit with nonzero if anything fails; you don't have to care too much about the specifics of this script.

The second pattern is "test the installed package". The context is that you want to test the installed package, and the problem we have is that if the test exercises the source tree, that doesn't exactly reproduce a user system: you want the test to exercise what's installed, not what's in the source tree. Sometimes you have absolute source code references inside the code: programming languages provide constants that you can use to reference files relative to the current test file, and tests use those to load code from the source tree, instead of just assuming everything is installed. Other tests are better behaved in that sense: they just assume everything is in the right place, and use the regular constructs of the language to load libraries or call programs. So you want to remove usage of programs and library code from the source tree in favor of the installed counterparts. If you have a unit test that's loading a Ruby module or a Python module from the source tree, you want to change that to make it load code from the installed system. When you do that, you can actually simplify the test: if you need to call a program, you just call it by name and it will be in $PATH, with no relative locations to handle; if you need a library, you just use the import, require, or load instruction of the language you are working with, and you don't care where it is coming from, because it's going to come from the system. And usually you don't have to build anything to run the tests, except when the test programs themselves are compiled; then you need to build those.

One example: you can have this type of thing in your test suite, where ADTTMP is an environment variable that you can use to detect whether you are running the test suite under autopkgtest or not (see the sketch below). You can set up the test suite so that if you are not testing the installed package, you add your source tree's bin directory to the system $PATH, and your source tree's library directory to the dynamic linker's load path. Then your tests don't need to care where any programs or libraries are coming from: if you are running them during the build, they work against the built version, and if you are running them against the installed version, they also work, because they are going to find the installed version.

Another example, unfortunately very common in Ruby programs: they have code that manipulates the load path based on the location of the test file, to load something explicitly from the source directory. The fix is to just rip that out and assume the test framework will add the relevant directories to the load path. This is very common: I have seen packages where you think you are testing the installed package, but it's actually testing what's in the source tree. And if you have different versions in the source tree and in the system, then all kinds of mayhem can happen.
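Here is a minimal shell sketch of that ADTTMP trick; the bin/ and lib/ source-tree paths are hypothetical, and note that newer autopkgtest versions also set AUTOPKGTEST_TMP:

    if [ -z "$ADTTMP" ]; then
        # Not running under autopkgtest: test the source tree instead,
        # so the same suite also works during the package build.
        export PATH="$PWD/bin:$PATH"
        export LD_LIBRARY_PATH="$PWD/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    fi
    # From here on, tests call programs and load libraries by name,
    # without caring whether they come from the source tree or the system.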
Another pattern is "clean and disposable testbed". We want tests to be repeatable: if a test fails in the CI, you want to reproduce the same failure locally before you try to fix it. So the problem we want to solve is making sure we can always reproduce the environment that a user would get when installing the package on a clean system. If you want it to be reproducible, you want to automate it; on the other hand, automation has an upfront cost, and you are going to spend some time automating things, but in the long run, if you are going to run the tests a thousand times, a hundred thousand times, a million times, the upfront cost is worth it. One way of solving this is to use virtualization and container technology to provide a fresh testbed for every run. This is already implemented in autopkgtest, so it's something you can use locally on your developer machine, and it's also what the Debian CI uses in its infrastructure.

This has a few consequences. First, you need to make sure your dependency declarations are correct, both the binary package dependencies and the test dependencies: everything needs to be there, and if something is missing, your test is going to fail. If you need some package to run the tests, but not for regular usage of the package, then that needs to be explicitly listed in the test control file. You can automate other things too: if, say, you are testing a web application and the package doesn't start the web server automatically, you can automate that in the test script itself, and then that work is reused forever when you run those tests. On the other hand, sometimes when you are writing the tests it's useful to be able to run them without having to spawn a fresh virtual machine and install everything from scratch, so it's useful to be able to run the tests on your local system; but you should always run against a clean system before uploading.

A few examples: as I said, autopkgtest supports different virtualization backends, including null, so you can use autopkgtest itself to run tests against your local system. Currently in the Debian CI we use LXC containers, and I am working on being able to switch to QEMU in the future, to test things that are kernel-related, like filesystems, kernel modules, and the kernel itself. Ubuntu uses both QEMU and LXC, depending on the architecture.

One interesting pattern is "acknowledge known failures". Sometimes we have a package with a very large test suite, with lots of tests, which is nice, but while most tests pass, some of them fail. There are several reasons why a test may fail. Of course, ideally you want everything to pass, that's what you usually want, but that's not always possible. You need to investigate the failures, and sometimes figuring out exactly why one out of a thousand tests fails is not trivial, depending on how well you know the internals of the package. You need to consider how severe each failure is: are all of the failures really important? If there are a thousand tests, there will be a few failures; maybe those are non-issues, or the test might be broken, or it has different expectations about being executed from the source tree or not. And you also have to take into account how much effort you need to fix the test. A solution for that is to mark the failures you know about as non-failures: those tests fail, OK, we know, and we acknowledge that, but we are not going to fail the whole test run because of them. When you do that, the tests that pass, which we expect to be the majority of them, act as a regression test suite: if they continue passing, everything is good, and if the known-broken tests fail, that's OK, you tolerate that. You are probably going to maintain something like a blacklist of tests that you know fail but can't really deal with right now, and you can use that as a to-do list. Those tests might be broken for a host of different reasons, and you have to take into account that you can't let those tests fail forever, so you need to keep an eye on that list.

This is an example from the Ruby source package. There are a few tests there that assume being run from the source tree, and they don't really work when you run them against the installed package, so we maintain a blacklist in a file called nonfailures.txt inside the tests directory. What this code does is run each test and check whether a failed test is in the blacklist; if it is in the blacklist, it doesn't fail the test run. The logic is roughly as sketched below.
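This is not the actual code from the Ruby package, just a hedged shell sketch of the same logic, with a hypothetical test layout:

    #!/bin/sh
    # debian/tests/nonfailures.txt lists tests that are known to fail.
    failed=""
    for t in tests/*.sh; do          # hypothetical test layout
        if ! sh "$t"; then
            if grep -qxF "$(basename "$t")" debian/tests/nonfailures.txt; then
                echo "IGNORED (known failure): $t"
            else
                echo "FAILED: $t"
                failed="$failed $t"
            fi
        fi
    done
    # Only unexpected failures make the whole test run fail.
    test -z "$failed"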
At the end you get a report: so many tests passed, and so many tests failed but were already known to fail. If any test that is not in that blacklist fails, the test run actually fails; otherwise you just tolerate the ones that you know are broken for now, and move on.

Now another pattern, called "automatically generate test metadata". This is very important, and it's one of the reasons why we were able to add tests to so many packages so far. Usually, when you are working in a team, you have several similar packages, and the code to run the tests for those packages is very similar, because upstream communities have conventions on how to run tests: if you have a Python package, it's always "python setup.py test"; if you have a Perl package, it's always "make test"; this kind of thing. So you would end up having several duplicated test definitions if you had to write them explicitly every time, and duplication, of course, is bad. On the other hand, some packages need to be tested a little differently from the others, so you have to take into consideration that maybe you are not going to deduplicate everything. A solution for that is to replace those duplicated test definitions with ones that you generate at runtime. We do this with a tool called autodep8, which is able to detect which type of package you have and generate the appropriate test definition, roughly as in the sketch below. When you do that, if you need to change how the tests are run, you can change it in one place only, and not have to change lots of packages by hand. You can also automate managing the test environment: for instance, if you want a workaround that makes sure a given type of package always loads code from the system and not from the source tree, you can do that in a single place. For Ruby packages, autodep8 will generate a test definition that calls gem2deb-test-runner, which is the test runner we have in the Ruby team; it already handles all that business of making sure the tests do not load code from the source tree, and it does that when you pass it the --autopkgtest flag. Then, when we need to change anything about that, we can just change it in a single place, in the gem2deb-test-runner package, and not need to change more than a thousand packages. autodep8 also supports Perl, Python, Node.js, DKMS, ELPA (that's the Emacs Lisp packages), and Go. If you are in a team whose package type is not supported yet, you can also send a patch to add support for it. It's very easy to do: most of those supported types were added by people other than myself, and they really didn't have a problem with that.
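As an illustration, running autodep8 inside a Ruby package's source tree prints a generated test definition to standard output, roughly like this (hedged: the exact output varies between autodep8 versions, and the package name is hypothetical):

    $ cd ruby-foo && autodep8
    Test-Command: gem2deb-test-runner --autopkgtest 2>&1
    Depends: @, gem2deb-test-runner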
"Smoke tests" is another pattern. You have to realize that sometimes a package won't provide tests, and sometimes it has behavior that is not really provided by upstream, like functionality in the maintainer scripts, or features added by the Debian maintainer in the packaging, that kind of stuff. In that context, the maintainer wants to add tests to make sure that this functionality works. You have to take into consideration that it's not always easy to test the internals of a package; again, it depends on how much you understand the details of that package and how familiar you are with the technologies used there. And you can always have tests that are specific to the Debian package, that test some integration with Debian, and that don't really make sense to be sent upstream.

So you can write smoke tests that exercise features of the package and check for the expected results. The idea is that a smoke test exercises either the main functionality or the very basic functionality of the package, and the analogy is with smoke and fire: a smoke test is something you can run easily and quickly, and if it fails, something is wrong with the most basic functionality, so there is probably a larger problem that you need to investigate. If you think about it, even a very simple test case, like just calling your program with --version, can already catch breakage: a library maintainer made a mistake, or didn't realize there was an API change upstream, and your program just doesn't load properly anymore; you can catch that type of stuff. You can catch issues with dependencies: if there is anything invalid with the libraries, your program is not going to work, even for just printing its version number. You can catch things like illegal instructions: we saw that on the ppc64el port, where you had POWER7 machines in the beginning of the port and now have POWER8 machines, and very old binaries didn't really work anymore on the new machines; just running the program and doing very few things is going to catch that. And you can catch packaging issues: there is a bug in the packaging and the main program is not being installed anymore, so the test is going to fail and you are going to catch that.

This is an example from the Chef package: a very basic call, just running Chef with a trivial recipe that says "please install this package on the system". If that succeeds, it means Chef is installed properly and all its dependencies are there; at least as far as this test case needs, they are OK. And if anything like that is broken, you have a failing test.

The last pattern is "record interactive sessions". We have to realize that some packages are older than the habit of having automated tests, and really don't have tests. Sometimes it's not easy to write automated tests up front: when you are experimenting, you don't really know what the interface is going to be, and it's too hard to foresee which type of tests you will want. So you are going to provide tests for a package like that, and you realize that some programs have a clear interface with the rest of the system. That interface can be a command line interface, it can be a graphical interface, but it can also be a server socket that listens to some protocol, like a web server or something like that. One way of writing tests in that context is to record an interaction with the program, so that you can play it back later as an automated test. The way to do that is: you install the package on a clean testbed, like a clean VM or a clean container; you exercise that interface, calling the command line or making a request to an HTTP port; you record what happens, and you compare that with your expectations or with the documentation; and then you keep that interaction in an executable format. There are several ways of doing that, depending on the tool you are using, but here is an example with a tool that I really like, clitest, which is very interesting. You can easily imagine having a shell session in which you ran a few commands, and you just copy and paste that into a file; and actually there is a tool that does the rest for you.
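A hedged example of what such a pasted session file (say, tests.txt; the commands here are made up) could look like:

    $ echo "hello world"
    hello world
    $ seq 1 3
    1
    2
    3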
If you take a file like that and feed it to clitest, it will detect that each line starting with a dollar sign is a command that needs to be run, and everything below it is the output that you expect. So it will run each command for you, check the output, and use each command as a test. Just like that you have six tests, and you can easily do that for other command line programs; it's very useful.

Then, to finalize: we have a set of patterns that document design decisions you face when working with autopkgtest. I hope they are useful to you. Again, the full paper is available, you can read it, and it's going to be linked from the Debian CI documentation at some point. You might have noticed that some patterns solve the same problem; this is normal: you have different solutions to the same problem, depending on the context and on the restrictions you have, be they time restrictions, effort restrictions, or platform restrictions. And if you can identify other patterns, I would be delighted to discuss them; you can talk to me here, so we can expand this catalog and grow this documentation to help people add tests to their packages. I went a little fast over the details of exactly how you do the testing, so I would like to plug the BoF session we have on Friday at 15:30, where we are going to discuss everything related to the Debian CI; if you have questions or want to exchange ideas about the topic, we can meet there. There are a few references: there is the paper PDF; the Debian CI documentation has pointers to everything you need to know; and at DebConf15 I gave a talk with a lot more detail about the specifics of how you write tests, the different options, and the different tools, so you can also check that out. That's all I have. I'm not sure if we have time for any technical questions.

Q: Is there a way of testing package installation both with systemd and with sysvinit, so that you first install the right init system and then run the tests?

A: Good question. There is nothing specific for that, but I think if you specify the test dependencies right, you can probably make it work; we can look at that later, but I imagine it's possible. You could also have different testbeds, a VM with systemd and a VM with sysvinit, and then run the same tests against both, for instance. That's not there now, but we could do that; if people really want to keep sysvinit forever, why not.

Q: Is there a way to have the automatically generated tests for, say, Ruby packages, plus an extra test that I also want to run?

A: Yes, it's documented. You can create debian/tests/control.autodep8, or something like that, I don't remember the exact name, but if you have that file, it will be appended to the automatically generated tests.