So, hello, welcome to my talk about maintaining 8,000 packages. That was the original title; since then some additions were made, so it's now actually 9,000. It's about how we build the Debian packages for the PostgreSQL software universe and how we do quality assurance there. The problem looks simple at first: we have Debian, we have PostgreSQL, we just make a package of it and are done. But of course reality is more complex. We have several Debian releases to target, and there are always at least five PostgreSQL major versions supported by upstream. The most recent one is 9.4; 9.0 is going out of support at the moment, I think it will get one more update from upstream and then it's gone. The major packaging problem is that the different major releases have an incompatible on-disk format, which means that on upgrading the package from 9.1 (which was in wheezy) to 9.4 (which is in jessie), you need an upgrade procedure for your data as well. So you either need both versions installed in parallel, or you need lots of disk space to dump the whole database cluster, plus a complicated plan for what should happen on the upgrade. The preferred solution is of course to make the packages for the different PostgreSQL versions co-installable; that's why the source package is not called postgresql but postgresql-9.4, and we have an infrastructure on top of the PostgreSQL packages called postgresql-common, which also takes care of letting data for several versions coexist on the system. postgresql-common is a Debian addition, not part of the upstream PostgreSQL software world. It was originally written about ten years ago (I forget by whom), then mostly maintained by Martin Pitt, and now we are working on it together. It has some Debian-specific quirks, but it can also be used on other operating systems.
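The upgrade workflow this enables can be sketched with the postgresql-common tools; pg_lsclusters, pg_upgradecluster, and pg_dropcluster are the real commands, but the session and output shown here are abbreviated and purely illustrative:

```shell
# Illustrative session, assuming both postgresql-9.1 and postgresql-9.4
# are installed in parallel; output columns abbreviated.
pg_lsclusters
# Ver Cluster Port Status Owner    Data directory
# 9.1 main    5432 online postgres /var/lib/postgresql/9.1/main
# 9.4 main    5433 online postgres /var/lib/postgresql/9.4/main

# Migrate the 9.1/main cluster to the newest installed version; the old
# cluster is kept around (stopped) until you remove it with pg_dropcluster.
pg_upgradecluster 9.1 main
```

Because every cluster carries its own version, port, and data directory, both servers can run side by side while you test the migrated data.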
I've recently done a port to Red Hat, which basically makes the whole cluster management machinery work on Red Hat as well, but that's not the topic of this talk. Still, we only have one PostgreSQL version in Debian at any time: as said, in jessie it's 9.4, wheezy has 9.1, squeeze had something else. The problem with that is that users will always want to try a newer PostgreSQL version, to do better testing of their applications, and developers always want the latest thing. And of course sometimes people want to upgrade the operating system without having to upgrade the database at the same time, or to upgrade the database first and the operating system later. So we want a plan that makes the two independent. The solution is that about three years ago we created the apt.postgresql.org repository, a standard repository driven by reprepro, hosting packages for all supported PostgreSQL major releases, built for all supported Debian and Ubuntu releases. It's somewhat a superset of what backports would do, but backports didn't work for us because backports only takes packages for stable that are currently in testing, and of course there's no 9.3 in testing, and no 9.5 in testing. So we had to set up a separate repository. Apart from that the packages are just the same; they simply get rebuilt in a standard cowbuilder environment using the very same source. At the moment there are six supported PostgreSQL releases in the repository, from 8.4 up to 9.4. 8.4 is officially out of support in the PostgreSQL world, but as the talk by Michael Bunk explained, I think two days ago, we are still supporting 8.4 via the squeeze LTS project, and I haven't disabled it on apt.postgresql.org yet either, so your extensions are still available for it as well.
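For reference, using the repository looks roughly like this; the suite naming and key URL follow the apt.postgresql.org setup instructions of the time, but check the current documentation before copying, since key handling details change over the years:

```shell
# Add the repository for the jessie suite; "jessie-pgdg" follows the
# <codename>-pgdg naming used by apt.postgresql.org.
echo 'deb http://apt.postgresql.org/pub/repos/apt/ jessie-pgdg main' \
    > /etc/apt/sources.list.d/pgdg.list

# Import the repository signing key, then install a newer major version
# than the one shipped in the Debian release itself.
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
apt-get update
apt-get install postgresql-9.4
```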
I've put 9.5, the current alpha branch, in parentheses there because it's not fully supported yet, but we also have packages for it if you want to try the alpha release, and there are already packages for 9.6, the current development branch. So that's six PostgreSQL releases. We cover seven Debian and Ubuntu releases: squeeze, wheezy, jessie, unstable, and three Ubuntu releases, the two LTS ones, precise and trusty, plus utopic. Utopic is a nowadays outdated non-LTS release; I should be upgrading that at some point, but so far there's been more interesting stuff to do and users haven't been complaining. The whole thing is done for two architectures, the obvious candidates amd64 and i386. These axes combined make 84 targets for which every package needs to get built. At the moment we have about 130 source packages. Some of them are simple because they only need to get built once per Debian distribution and per architecture, which makes 14 targets, but some packages need to get built for each PostgreSQL version because they build server modules: a module really only works with 9.1 if it was compiled for 9.1, so you need the version of the extension module that fits each PostgreSQL server. That's where the combinatorial explosion happens. Here are some statistics: the archive size has been growing a lot since we started at the end of 2012. At the moment we are a bit below 7 gigabytes of packages, that's the red line, and the black line is the number of packages, which is around 1,000 now. When I was submitting the talk the number was around here; what happened here is that packages for pgloader got added, which is written in Common Lisp, so we had to add about 55 Common Lisp source packages, which are also in unstable, and that makes yet another bunch of packages that add up to the numbers here.
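The arithmetic behind those target counts is simple enough to spell out; this is a toy calculation using the numbers given above:

```shell
# Build-target arithmetic: distributions times architectures for simple
# packages, times PostgreSQL major versions for server-module packages.
distributions=7   # squeeze wheezy jessie sid + precise trusty utopic
architectures=2   # amd64 i386
pg_versions=6     # 8.4 9.0 9.1 9.2 9.3 9.4

simple_targets=$((distributions * architectures))
module_targets=$((simple_targets * pg_versions))

echo "simple packages: $simple_targets build targets"   # 14
echo "server modules:  $module_targets build targets"   # 84
```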
So, about packages and tests. I don't need to tell you much about how to build Debian packages in general. What we are doing specially here is building the same source for several distributions, which means we need separate version numbers. This is where a shell script comes in that appends our pgdg suffix to the version numbers (PGDG means PostgreSQL Global Development Group, the acronym we use to tag the packages); it works much like the ~bpo suffix backports uses, just a different string. So we get several source packages, and then build them for several architectures. That's the simple case. For packages that need to get built for different PostgreSQL versions, the PostgreSQL version is included in the binary package name as well, and in that case even more debs are produced, which have all the information encoded in the deb name, so we can actually tell where things belong, and which we also need so things can be installed in parallel from the archive. The problem with that is of course that no one is going to test 9,000 packages, so we need test suites. The nice thing is that PostgreSQL itself has an extensive regression test suite, most PostgreSQL extension modules have regression tests, and the aforementioned postgresql-common cluster management and server version management framework also has a test suite that runs on top of the PostgreSQL packages. There's still another problem: no one is going to run all of that by hand, so we need automation, and this is where Jenkins enters the picture. Those who attended Mika's jenkins-debian-glue talk will know: Jenkins is a continuous integration server, which is basically something that takes your code and compiles it all the time. More generally speaking, it's a framework for running scripts, triggered either manually, by cron-like timers, or by VCS commits. We're using so-called matrix jobs there, where one job is responsible for building the source and
the binaries for a given package. The configurations attached to each job then make sure it gets built per distribution and per architecture, so every job essentially runs 14 times, once per distribution and architecture combination. There are actually several jobs per source package: in the first step, a source job (this was the first arrow we saw on that slide) takes the Debian source, the standard source we're also using in Debian (we strive to build from the unmodified packages that are actually in unstable, except for this version number tweak), and builds source packages from it; in the next step, the binary packages get built. And of course thanks to Mika for providing the framework there; I should finally take the time to merge my changes back into his system, but we're in close cooperation, so the features we have should be the same, just implemented slightly differently. How it looks: there's a huge list of packages to look at. Unfortunately the build server is not public yet; I'm thinking about opening it for at least read-only access to everyone, but I haven't made up my mind yet about how to actually secure it. If someone wants to have a look, just notify me, I can give you an account.
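The version number tweak mentioned above can be sketched as a small function. The .pgdg suffix scheme is what the repository uses, but the function name and the exact distribution-to-suffix mapping here are illustrative assumptions, not the actual script from the build server:

```shell
# Hypothetical sketch of the per-distribution version mangling; the real
# generate script differs in detail.
pgdg_version () {
    version=$1 distro=$2
    case $distro in
        wheezy) suffix=70 ;;
        jessie) suffix=80 ;;
        *)      echo "unknown distro: $distro" >&2; return 1 ;;
    esac
    # Append a repository-specific suffix, analogous to ~bpo for
    # backports, so each target release gets its own rebuilt version.
    echo "${version}.pgdg${suffix}+1"
}

pgdg_version 1.3.1-1 jessie   # 1.3.1-1.pgdg80+1
pgdg_version 1.3.1-1 wheezy   # 1.3.1-1.pgdg70+1
```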
Then, looking closer, I've picked one PostgreSQL extension module, orafce, which stands for Oracle functions and compatibility extension. It's useful if you're porting some Oracle application to PostgreSQL: you can add functions to the PostgreSQL server that make the database look more Oracle-like. This gets built for the distributions mentioned there; you can see that at some point the build was actually failing, no idea what the problem was back then, but one minute later it succeeded, so apparently I fixed something. This then triggers the binaries jobs, which at that point were even more than 14; I think when I took the screenshot Ubuntu lucid was still alive. It was nice to see that one go, because we had to apply lots of hacks to support modern packages on that very ancient suite; squeeze is surprisingly modern compared to it, so there's actually no technical cost at the moment to dragging squeeze along, and I think I will leave it there until things start to break and it gets too annoying. Then the build artifacts are produced and in the next step uploaded to the repository. There's a Debian-testing-like step involved there as well: uploaded packages do not get pushed directly to the distribution the users are seeing, there's one manual step where we promote the packages from the testing repository to the live repository. We can look at the console output of the build; here you can see the typical output of PostgreSQL regression tests, it always says "ok", that's fine. And how does it actually work? There's a program inside the postgresql-common framework called pg_buildext, which makes the problem of building PostgreSQL packages for multiple versions at once easier. The problem is that the list of binary packages we're building from one source package depends on which PostgreSQL server versions are considered to be supported at that moment, in Debian most often only one, but we want to have
packages for several PostgreSQL versions in parallel, which means there needs to be a loop somewhere in debian/rules that calls make more than once, and there needs to be a mechanism that updates debian/control to mention the binary package names we're producing. So what pg_buildext does is basically loop over the list of PostgreSQL versions that are declared to be supported by this particular package; in the ideal case you would just say "all" in there, or you can say "9.2+" if it requires a minimum of PostgreSQL 9.2. It then loops over that list of versions, takes the intersection with the versions supported by the system at the moment, writes out the result to debian/control, and makes sure the package actually gets built. There are examples in the man page, which look about like this: if you have a standard debhelper rules file, you need to do some overrides to actually hook pg_buildext in there. Of course you could just manually say "make for PostgreSQL 9.1, make for PostgreSQL 9.4", but pg_buildext takes care of building in the subdirectory mentioned here; this magic %v gets replaced by the version number. There's usually nothing to test at build time, I will come back to that later for PostgreSQL extension modules. For installing it's just the same: you tell it which package name you're installing for. dh_installdocs usually needs a tweak so the documentation goes into all module packages; the content is the same for all modules built, and the README shouldn't only end up in the first module built. The rest is just standard debhelper glue. Of course, if your package is more complicated there will be more code in there; sometimes packages build server extensions plus a binary and then it gets more complicated, but it's basically just a simple shell extension here. So build-time testing is nice, but testing installed packages is sometimes better; we need those tests to make sure files actually get put into the right
place. Once the module is actually installed, you can load it into the server and do something with it. That's where autopkgtest enters: we have lots of autopkgtests which make sure that, once installed, the module is actually usable, so we don't just see it compile and then notice later that it's unusable nonetheless. This is also where the integration tests from the postgresql-common package get run; a full test run for PostgreSQL 9.4 includes something like 1300 tests, and it's even more if you have more than one server version installed in parallel, because then it will automatically test upgrades. Here's an example of what it looks like: it just goes to /usr/share/postgresql-common and runs the test suite from there. You can actually do that on your live system, just go to that directory and run that command as root, but make sure your database is not currently being used, because while it will not delete it, it will start all sorts of temporary servers and shut them down again. The problem with PostgreSQL extension modules is that they usually don't support build-time testing; they only have a "make installcheck" target, which comes from the PGXS extension building infrastructure (no idea what the S stands for, actually). pg_buildext also helps you automate that: in the simple case you just say "pg_buildext installcheck" and it will loop over all versions being targeted and fire off the tests. You can also do more complicated things there, like filtering the list of versions: let's say the test suite is broken on 9.0 anyway, but I know the package still works, so I'm ignoring the test result there and then call pg_buildext installcheck; or if you want you can just call make check or whatever. Maybe what I didn't mention so far: the code shown here is all part of the standard Debian packages, it's not specific to the apt repository. For Debian
it just gets run once, or rather the loop is the trivial loop that doesn't loop, while when packages get built for apt.postgresql.org it will target more versions. If we were building backports for backports.debian.org, it would automatically adjust the package to target the proper PostgreSQL versions there, so it's all automatic; it's just a matter of running the build again. In theory even binNMUs would work to change the PostgreSQL versions supported by the package, but the problem is that the set of packages built by one source then changes, and binNMUs don't like that, so a sourceful no-change upload is required anyway. Now, all the autopkgtests are also visible on ci.debian.net; the screenshot here is a little bit older, you've probably all seen it, there's a CI column, not sure if you can read it there. We are slowly working on putting autopkgtests into all PostgreSQL packages; I think I've done something about check_postgres lately, but I'm not really sure. One omission to mention: postgresql-common itself doesn't declare a test suite, which is kind of funny because it actually contains the test suite, but that's because the tests really get run for the server packages themselves. Okay, here are a few examples of bugs we've actually found using this machinery. My favorite one is this: it was March 2014, well past the feature freeze for PostgreSQL 9.4, which had already been branched off, and peer authentication was totally broken. It wasn't discovered during development because it still appeared to work: if you were the postgres system user, you could still log into the database as user postgres. But the bug was that instead of checking your system user name when you connect to the database, it would check the database user name, so anyone claiming to be postgres could log into a database server running as system user postgres. The logic was just the wrong way around, and nobody tested it. This is a super ugly first-class
security bug, wide open, and nobody noticed for months until the PostgreSQL test suite run on the packages discovered it. And this one is actually a segmentation fault: in PostgreSQL you can write functions inside the database, and you're allowed to write recursive functions, and if you have recursive functions you need to make sure you don't overflow the stack. So there's actually a regression test which makes the stack overflow and then expects the error message "stack depth limit exceeded". On a side note, these are all context diffs; the PostgreSQL community likes those, I got used to reading them, but they are still strange. What happened here is that PostgreSQL exceeded the stack limit and the test expected a proper error message trapped by the server itself; what happened in reality was "connection to server lost", which is the client side of a segmentation fault. On looking into this, it was a problem with the hardening flags, which had been applied to the PostgreSQL packages like forever and are now default in the Debian compiler flags. The problem is that address space layout randomization on 32-bit architectures on Linux leaves way too little space for the stack, so it's really easy to make the heap run into the stack even if you are still well below your configured stack limit. I think this is a kernel security bug, but I also think I don't have the time to go through the Linux kernel mailing list to fight it through. I was talking to Bastian Blank about it and he also said it looks fishy; I don't know why the address space randomization insists on putting heap and stack so close together. What I did in the end is just disable -pie on all 32-bit architectures to make them safe again; it works on 64 bits because the address
space is much larger there. We also found some regressions in the then-development version of PostgreSQL, when suddenly some query plans didn't produce reproducible output anymore. You can ask for the query plan of any query in the server just by prefixing it with EXPLAIN, and then you get told "I'm going to use a nested loop here, and a foreign scan", and whatever. During development, the developers decided they wanted to add more information there; this is actually the fixed output again, but it would tell you here that the planning time was 0.10-something milliseconds for this query, and of course that is not reproducible, so the test for this module would fail all the time. It was luckily fixed by not showing the planning time in the plain plan output again. Then there's been lots of time zone mess. The bug was first reported to the PostgreSQL lists on a Solaris system, but I had exactly the same problem as well: PostgreSQL ships its own time zone database, but for Debian we are not using it, we are using the system time zone database, and suddenly the regression tests didn't go through anymore. On closer inspection it was actually the Russians to blame, because some time zone definition used in the regression tests had been changed for October 2014. It was then fixed upstream by using a similar test case with an older time zone change. The problem was actually that I was running an old version of tzdata on that system, and that test really depended on having current time zone data installed; it was easy to work around, but they also fixed it upstream. I think it happened again later, but then we decided to just go for the new tzdata as well. There are a few problems that are architecture-specific. This one is still not really resolved: when building PostgreSQL on mips and mipsel, sometimes one of the
regression test modules simply hangs, and then the build gets killed after five hours. I think it depends on some sub-architecture there, but I haven't yet got access to a build daemon which actually shows this behaviour; just rebuilding on a different mips host fixes it. I've seen a similar issue being fixed last month, but I haven't had a chance yet to check whether it fixed this one. psqlODBC, the ODBC driver for PostgreSQL, has always had lots of interesting problems on interesting architectures; these really get fixed quickly upstream, but someone needs to find them first. They have an internal type system which is interesting, and sometimes just wrong on non-mainstream architectures. Then of course there are lots of cases where extensions are not yet ported to new PostgreSQL versions. The bit of a problem there is that the upstream authors of the extensions only start supporting new PostgreSQL versions once the new version is out; they don't really test it while it's still alpha, so Debian is doing that. Here are a few examples of bugs we have not found. There were surprisingly many problems with SSL in the new version: in this case the libpq version was upgraded, I think from 9.2 to 9.4, and suddenly SSL didn't work anymore. The problem was that the supported TLS version was upgraded, but the guy running that server only had a 512-bit RSA key installed, and such a key is sometimes too small to hold the packets necessary for TLS 1.2. I think that was not really fixable on the code side, so we just had to tell users to use larger keys; still, it was something that was missing in the release notes. This one was interesting as well, because the error message was always "out of memory" while the actual problem was really something else. This is a problem in the PostgreSQL backend: if you're running in a chroot, PostgreSQL 9.4 suddenly failed for no obvious reason, and upon closer inspection it was actually
looking for /etc/passwd to check your user name, even though it's not used, and if you have a minimal chroot like the one Postfix constructs, it suddenly failed. No one thought about testing that before; I guess it's hard to test without being root on a system, but still it's something that some test should have found, but didn't. Okay, so what's left to do for us? Some packages still don't have tests; I'm increasingly uncomfortable with actually uploading these, so I should be doing something about it, but with some packages it's just not possible, for example with pgadmin, the graphical frontend, and I wouldn't know how to test it anyway. A nasty problem is that some packages have interesting test suites which probably run on the developer's machine but not on mine, because my home directory is called differently, or they have interesting ideas about where the binaries should be; that's mostly a problem with pgpool and the more complicated frontends. In the end I'm just not running those tests, but I should. What we're seriously lacking is some way to really trigger rebuilds of packages when something they depend on is updated, or when a new architecture or new distribution gets added; we should have some wanna-build-like system, we need to look into that, maybe I should discuss that with Mika as well. We don't have a nice packages.debian.org-like frontend yet; it's been written, but it hasn't made its way into the Django installation on the PostgreSQL web server yet. We could be looking into making building PostgreSQL extension modules even easier, with some dh sequencer command, or even something like dh-make-pgxs which would just automate everything related to the initial package creation. And of course, speaking to the choir here, there should be more people working on the PostgreSQL ecosystem; there are about 50 source packages in the group, of which, let's say, 35 have my name on them, and that's too much. Maybe someone pops up; I won't give
up the hope for it. Okay, thank you for your attention. Is there anything you would want from the repository? User feedback? Which architectures are we missing? I do have a question myself: you showed one of the examples of failing test cases on architectures, but I think you explained that you only had amd64 and i386, the Intel ones, so I guess those failures were from somewhere else? Yes, that was from the Debian builds; I was mixing that up a bit there, but still it's the same test suite being run. Okay, so thanks again, and thank you.