…talk about how we can detect ABI breakages in our libraries in Debian earlier, and run those detections across a wider range of libraries. Individual packages, individual libraries and individual upstreams are using the ACC tool already, but I don't think enough people are using it, and I want to know what I can do to make sure more people use it.

There are a few ways the ABI of a library can be broken. The most common one: you change or remove symbols. For most packages you should be using dpkg-gensymbols and checking .symbols files, so that you catch the removal of, for example, C symbols from a library. But if, say, the arguments of a particular function change, you will not notice that change in the symbols file — the ACC tool will. It will tell you that the ABI has been broken: how many parameters a function has, what their sizes are, whether the sizes of structures changed. C++ symbols are more verbose, because they encode the types you pass to functions (and, for templates, the return types as well). However, if you make changes to templates, your symbols file will not tell you that the ABI has been broken, while ACC can detect changes in C++ templates. And if your symbols files are not that good, or you don't specify tight enough version numbers — you break your ABI but your version dependencies are too wide — then they didn't help much, because you can still install broken packages which then fail at runtime.

Here are a few tools you can use. I'm mostly going to talk about abi-compliance-checker (ACC), but there is abi-dumper, there are other similar checkers, and there are other ways you can do this. For example, if you have an autopkgtest which checks that your binary starts and prints text output, that means it can open and find all its shared libraries and there was, presumably, no ABI break that stops it executing at runtime, right?
And typically how we find ABI breaks today is that somebody uploads a new version of the library, somebody does routine rebuilds, and the rebuilds fail to build from source because the API changed — and presumably the ABI changed at the same time — and then people complain. Which I think is a bit too late; we should be able to do this much earlier. And once we detect that there was a library break, we need to upload the library again, bump the ABI number (sometimes a Debian-specific one, if the break is Debian-only), and then rebuild everything, so that we get the correct dependencies.

The ACC tool is quite useful because it can operate in multiple modes. Typically you pass it a shared library and it scans all of its symbols, trying to extract as much information from the shared library as it can. But you can also pass it the headers, and then it scans all the structures, their sizes, what kind of arguments you are supposed to pass to each function and what the return types are, and it generates an extensive, basically structured, flat text file of everything it can find out about that particular combination of files. It generates a dump, and you can then compare your next library revision against the dump. It also has a few other modes, where you can generate a dump for a whole system — for example a squeeze default installation, or a wheezy default installation — and then take your binary and test whether it will run against those chroots without actually executing it: it analyses whether all the shared libraries are present, whether all the symbols are present, and whether they are compatible. That is useful if you are a third-party vendor and need to check that your compiled binary package works against multiple distributions.
A while back I added the dh_acc helper add-on for the abi-compliance-checker tool in Debian. It helps you generate the dump at build time, and it also compares the ABI of your new, updated library at build time, so that your build fails if your library suddenly becomes ABI-incompatible. That is very good, because if your build fails, the break never even hits unstable — even in unstable you would not see an ABI break. This is all kind of cool. You bootstrap it much the same way you bootstrap a symbols file: you first add a description of your headers and your library, and then you upload that. Once it has built once across all of the architectures, you can fetch the debs, extract the ABI dump tarball from inside each architecture's build, and if you commit those into the sources of your source package, your next upload will verify that you are still compatible with that old ABI dump. Does that make sense? Yeah. However, it is still quite manual, right? I'll show you slightly later: in the manual description you still end up having to list the libraries you are testing and specify the header files you are testing. And what the ACC tool uncovers quite often is that not all headers are actually compilable — a lot of headers you cannot compile by just including them.
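Enabling the add-on is roughly this shape — a sketch from memory, assuming the dh-acc package registers a debhelper sequence named `acc`; check the package's own documentation for the exact name:

```make
# debian/rules — hypothetical sketch enabling the dh_acc add-on
%:
	dh $@ --with acc
```

After the first upload has built on every architecture, the per-architecture ABI dump tarballs get committed back into the source package so the next build has a baseline to compare against.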
They sometimes need extra optional includes, or they depend on the include order, or you have to define some extra variable for them to work. Hence, for most libraries you will hit the case that you have to exclude certain headers from being checked for ABI compliance. The default, simple way to configure abi-compliance-checker is to just pass it a list of headers (or directories with headers) and libraries, which should work most of the time; but once you need an exception, you have to convert that into an XML description of what you need done, and jumping from a flat list of files to XML has proven to be not user-friendly, even though it is quite simple. Hence there has been very little uptake of dh_acc so far that I've seen in the archive.

I can give you a real example of somebody who is actually using it, as is, in the archive today. It is quite cryptic, because they are using multi-arch and I didn't take multi-arch into account, so they pre-process their definition file first and then run the actual dh_acc tool. The description file they specify is an XML which is quite verbose. They specify the original version number that they took the dump of and compare against all the time; where their headers are; which headers to skip, presumably because those headers do not compile by themselves; and which libraries to check, substituting the multi-arch variable in there. Maybe this is too verbose, but it's not that bad — I think most people would be able to write something like that for their own library. The other bit I didn't consider when I wrote dh_acc: I was thinking people would take the dump from their build directory, so the path would have been ./debian/tmp/usr/lib, blah blah blah. In this case this person is actually running the tool as an autopkgtest, so it runs against the installed libraries in the installed environment
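A descriptor of the kind being described looks roughly like this — all paths and names here are invented for illustration, and the exact section names should be checked against the abi-compliance-checker documentation:

```xml
<version>
    1.2.3
</version>

<headers>
    /usr/include/libfoo
</headers>

<skip_headers>
    foo_internal.h
</skip_headers>

<libs>
    /usr/lib/libfoo.so.1
</libs>
```

This is the jump the talk complains about: a flat list of files works until one header fails to compile standalone, and then the whole configuration has to move into this sectioned format.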
instead of at build time. I'm not sure why they did that, but that's how they did it.

Now I'm coming to a few more open questions. How would you want the library dumps to be maintained? Who should be comparing the dumps, and where? For example, should you commit all of those config files into your Debian source package, or should it be managed externally somewhere — say, on a hosted service which scans the whole Debian archive and compares the dumps of everything it can find? Should it be run as an autopkgtest — would you like it to be run as an autopkgtest? I've started talking to a few people, and they use it quite differently. For example, upstream developers run abi-compliance-checker before releasing a new upstream version: they install the previous version on the system and then run ACC from the build directory to compare the system library against the one just built. But that's kind of a recursive build dependency if you try to encode it in the packaging, hence they don't upload it or do it on every build. And there is a related project for upstream developers: upstream-tracker.org, where upstream developers are encouraged to add their own libraries and library descriptions, so that all versions are scanned and compared against each other to check for ABI breakages. Hopefully, if the internet works, I can show you what those results look like… that didn't work well… now, I don't want it to shut down… that's supposed to work, right? This is the most exciting part of the presentation — retrieving files via HTTP is very complicated, so it should have a difficult user interface. Right, so I've opened the GnuTLS upstream tracker, where it tracks various version numbers, which are cut off on the screen… whoa!
A more practical resolution… right. So here are the results that ACC generates in HTML, if you provide it a lot of version numbers, as they do. You can see that 3.3.0 did break ABI, and so did 3.2.12 — hence 3.2.12.1 was released straight after, to resolve the ABI break which happened there. Here's the full report for the ABI break: they removed a bunch of symbols, and if we jump in we can see that the xssl.h header was removed, all of those symbols were gone, and the shared library was gone as well — presumably it moved somewhere else, and maybe you can still get it, or maybe it's gone for good. This doesn't tell us why it changed, but the tool generates a lot of other information as well. For example, it gives you an explanation of things that changed: here, something changed to a constant pointer instead of one you can modify. In practice that's a low severity: the type size didn't change, the position didn't change, but the fact that it's now a constant pointer means somebody's GCC -Wall -Werror builds will start failing because of this change, if they're not passing things right.

So that's the overview of the abi-compliance-checker tool, and now the question is: why are you not using it? Or, if you are using it, how are you using it, and how would you like me to improve it?

I'd like to ask a really stupid question — I don't understand: is it using the headers or the library, or can it operate on either?

It can operate on one or the other. It does a dump of the headers, where it finds the structures and the functions and what you're supposed to pass to them; and separately it does a dump of the library, where it extracts the symbols from the library itself. So there are two separate dumps: one is called binary mode, the other is headers-only mode, and the binary
mode is almost as comprehensive as the headers mode for most things. For C it's almost as comprehensive, but it doesn't know the struct sizes, for example, because those are not included in the shared library — I'm not sure if it checks the debug info for that yet. I've read the whole source code of the thing; it's not very complicated, it's a lot of Perl. The advantage of using only the binaries is that you don't have to worry about uncompilable headers, strange headers, private headers and whatnot — but then I don't see how much the binary-only mode is an improvement over just regular dpkg-gensymbols.

This needs to be automated, with a limited amount of effort. Can we automatically detect which headers are used — can we, I don't know, grab them out of the debugging data? I was thinking of doing a multi-pass run of the ACC tool: if it finds uncompilable headers, just remove them (or have an option flag to automatically remove them) and rerun, until it runs with no errors. Usually you only need to remove a handful of header files, so that you are still ABI-checking most of them. And then, if I do automate it, do I send a mass bug filing with patches to all the Debian maintainers, or do I run it separately, as a kind of external autopkgtest that doesn't require manual intervention?

One of the issues I hit when I started using it is the bootstrapping. Well, A, I don't want to ever look at XML; and B, I'd leave it in the package and then you're comparing it across two versions. One of the things that I noticed up there — do you still have that web page handy? So one of the things I noticed in the… do I want to go back?
Yeah — the previous page, showing the compatibility between versions: it's not clear. It says 3.2.12 is incompatible, and it says 3.2.12.1 is compatible, but it's not actually clear whether that means compatible with 3.2.12 or with 3.2.11, right? And are these actually all the same SONAME? I guess they could improve that. It checks multiple libraries, so it's a report for the collection of those five or six SONAMEs, but it can be different for each — which is annoying. The other bit is that the ABI dumps are arch-specific. At the moment my tool does account for that in the helper: you actually need to supply a dump for each arch on which you want to check compliance. Yeah, it is very annoying.

The other question: do we want to check only against one baseline, or against each version that was ever in the archive? For example, if libfoo had three versions in the archive, when I upload the fourth, shall I check against the last upload, or against all three previous ones which claim to be that SONAME? Good question. I think a responsible library maintainer will maintain it in such a way that the manifest for the most recent version is sufficient: given that you checked on each upload, and each time you didn't break it, then it's fine, and if you did break it once, you would still have noticed. Library maintainers usually know what they're doing more than the average maintainer — but it's still tricky.

When you say library maintainer, do you mean upstream, or the Debian developer? I mean Debian maintainers. Oh, okay — because the Debian maintainers are going to have to make the manifest. True.

One other side point: that was an example of somebody in the wild actually using this tool; normally I would have done the dh_acc type of thing myself. I mean, presumably I could write something like that to run against the Debian archive — but would that be useful, or would you rather do it in the source package, such that your source package
breaks? I think you should think about the data flow. Where should the manifest be kept? Are we going to implicitly regenerate the manifest from squeeze and compare it with what's in wheezy, or are we going to manually maintain the manifest? And you need to think about how this will play out with derivatives that add symbols to their shared libraries. If we do this right, derivatives get the benefit of the checking; but it also means that derivatives which care much less about this kind of stability than we do will unnecessarily trip over build breakages when they change things. Well, in Ubuntu, for example, there are a few core libraries that add additional symbols which are not upstream and not in Debian, and ideally you would want to verify that those symbols are never dropped — especially since, being Ubuntu-specific and neither upstream nor in Debian, they are the easiest ones to lose. I'm not very worried about whether what we do is, by default, the best thing for Ubuntu — it's probably not much work for Ubuntu to update the dump and make it build. But it needs to be thought about: we've got lots and lots of derivatives now, and we should be making things work well for the people who don't have much effort to spare. Right — and XML, even seven lines of it, is too high a bar.

I wanted to make another comment about autopkgtest. It is possible to write autopkgtests that test packages other than the one that contains the tests, and you could do such a thing. You would then need slightly different invocation rules, because the package to be tested must be specified on the autopkgtest command line in a slightly different way, with the right test definition. You could write tests that do that kind of thing; I don't know whether that's useful for you or not, but it's a thing I should mention. I really like to do it at build
time, because packages are built for all architectures — Debian has a lot of architectures — and autopkgtests are currently only run on amd64 and i386. Right, that's a good reason for doing it at build time, and doing it as much as possible based on information in the source package. But that means patches, and it can grow your source package by quite a bit: a dump for a single architecture can be a megabyte compressed, all of which ends up in the Debian source package. Yeah, that's quite exciting. And then people tell me: well, you gave me a tarball, which is compressed — and then people say: well, I'm committing it into git, so I don't want the ABI dump compressed, I want the flat text file, which is obviously larger. Well, all Debian packages are compressed during transport, so compressing the file again when putting it into the debian/ directory is probably not helpful.

[break while the projector and IRC are sorted out — mostly inaudible]

Hello Matthias — would you like to track the GCC ABI? I'm not trolling. Do you mean have the package fail to build when the library is missing from it? Yeah, that bit — because this tool would detect it. I mean, if you're not joking: was it generating an empty libgcc1 package, or did it not generate the package at all? I think the package was there; the library was not. It was a special case of special. So this tool, very sophisticated, can detect missing shared libraries, if you forget to ship one — which totally did not happen, right? Ever.

Right, so — templates. It has support for checking whether a change inside your template will affect the ABI of a library which happens to use that template after it's rebuilt. However,
to actually test that, you need both the library that uses the template and the templates themselves, which are often two separate projects — like Boost and something built on top of it. Then it can check the ABI, and it will tell you: the template API has not changed, but when you rebuild, your library's ABI does change. So it has support to detect that, but I haven't managed to construct a config file that detects it and gives me a correct answer — I could work on it if there is a good example.

There is libabigail, which tries exactly to track templates. Templates, yes — I was not aware of that one; the ones I've listed are the tools that I've tested, abi-dumper and the others. Oh, it's new. Ideally you would integrate as many of them as possible to detect things.

Okay, so everybody will be happy when I send them a ten-megabyte patch against their package to commit into the source package? How are you tracking ABIs? Right — you do it locally, you don't make it public, and it's not done on every build. What would you want?
Okay. I mean, if I set up a public server, it will most likely track amd64 and i386, and maybe ARM, if somebody donates me loads of ARM hardware. Can you do this checking in a cross way — so I install a weird version of this tool on amd64 and I can check all the architectures by throwing CPU at it? I think yes, that should work. Oh, you did test it — qdouble to qreal? I ran all those checks on amd64; it was horrible, but that had nothing to do with it being cross. Do you need cross-binutils available or anything? You probably had them, so I can't tell you; I probably just pointed it at my cross toolchain and it did the thing, and the problems I ran into had nothing to do with whether it was cross or not — it was just a problem with all of the Qt headers. Yeah, and that was a problem unrelated to whether it was cross or not. And that brings us back to the fact that in Ubuntu I have cross toolchains to most arches that we care about, and in Debian there are no cross toolchains to all arches from, say, amd64 — and it's been quite a long wait. People are working on that — I've heard that before, probably at the last DebConf. Are they available? Oh, it's your machine… you have cross toolchains to all arches? I want that.

[crosstalk about cross toolchains, sponsorship and lightly patched sources — mostly inaudible]

…the templates, with multiple architectures, and you get all these mangled strings… yeah, it's the same set of problems. This thing finds more problems than the symbols files do. For
example, if you drop or change private symbols, this thing will tell you that a symbol has disappeared or a symbol was changed — you're ABI-incompatible — because your symbol is public and it is in the shared library. It has no idea that the symbol is marked private somewhere else, like in the API documentation. People say these are private symbols, but then they stick them into the shared library so that they're accessible to anybody who links against it. Well, that's not very private — that's a normal symbol. So this tool detects it, and then you need to actually check whether anybody else is using those symbols or relies on them, and that's the harder part, because you need to check all the reverse dependencies. It finds more interesting things in C++ than the compiler-generated symbol lists do: it actually finds C++-specific ABI breakages, whereas dpkg-gensymbols is mostly, you know, ELF-based — mostly for C-style changes of ABI.

I have ten minutes left in this session — that's been quicker than I thought it would be. So: everybody would be happy to see an external website, and maybe not act upon it; and not many people would be happy with ten megs committed to their Debian source package — not really, unless it's a very important library, like something in the core. Would you want an ABI dump?
See, we effectively do this upstream, and trust that upstream has not screwed up. We don't actually ship a symbols file in the source package at all, but the symbol versioning in glibc upstream is correct — I just repeat that over and over. Well, all the glibc maintainers are also upstream developers, so we trust that we're doing these checks correctly upstream, and most of the time we're right. In our case, because we use symbol versioning very, very heavily in glibc, that allows us to auto-generate our symbols files, and they generally should be correct. Now, it's possible that a Debian patch could drop a symbol, and maybe we would like to know that. On the other hand, think about that dump — however large it would be, it would be very, very big; I'm not sure it's something we would care to have or maintain. Right. For some of the smaller libraries I maintain — like, say, libart, which I sort of fire and forget — it would actually be kind of handy, I think, to have that extra level of security.

Well, for me: I would ideally want this tool to be reliable enough to detect that the Boost ABI stays stable, such that when they release a new upstream version I don't need to repackage and rename the ABI of every single library — just some of them. Although that sounds a little scary. The other thing: for example, the recent ABI break which wasn't an ABI break was a symbols conflict. We had GnuTLS 26 and we had GnuTLS 28, and both libraries had conflicting symbols, such that when enough people started using the new one while there were still users of the old one, a binary that ended up with both libraries linked in transitively would explode at runtime, because the symbols conflicted. How can we detect that? Okay — and I knew about it, and it was different from the last version — but really, I think a way of supplying maintainers the information — "do you realise the ABI changed?" — means that this needs to be adequately
reliable. Yeah — and you need to work out, as you say, whether to check just against the last one, or the last N, or against stable, or against testing. Exactly. But if you can supply a useful data feed that a maintainer would actually use, rather than asking them to hold several megabytes in their own package, they might be able to use it. Okay.

I mean, one of the libraries that I maintain upstream has yet to release a micro point-release update which does not break ABI. So for me it was easy — I was bumping the ABI by default. But then I thought maybe I should stop doing that; and then I would run the check and — it's broken, it's broken: they changed class inheritance for no reason, which breaks the C++ ABI straight off the bat.

Alright. Do people care about checking whether a binary is still executable in wheezy and jessie and sid, for example, without recompilation? Because you can make a squeeze chroot and take an ABI dump of that chroot, then make a dump of an unstable chroot, and then take the binary you've compiled and ask: will my binary run against that system? You check that you're still compatible across multiple systems. For example, if I'm a third-party vendor of games, I check that, and then I can verify — without actually executing the game — that yes, everything is compatible; or no, some library changed its ABI, or package version numbers changed, such that I need to do something to keep it running, like providing a compat library with my distribution method. Either the people who do this are not in this room, or maybe the question is too confusing. I'll write a blog post about it, and about the session, and notes, and see how that goes. Okay, I can look into it and fix them; debci doesn't have retry yet, but yeah, okay — you need to trigger it. Anything
else? I'll write up a blog post, and it will be on Planet Debian, and the URL to the slides will be there. I'll try to submit them to summit as well, so that if you browse summit you should be able to find the slides too. And if you have any questions, ping me on IRC — xnox — or email xnox@debian.org. Thanks a lot.
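A closing technical note: the glibc-style symbol versioning mentioned above looks roughly like this GNU ld version script (library and symbol names invented; it would be passed at link time via -Wl,--version-script). Tying every symbol to a version node is what makes symbols files auto-generatable, and giving a new, incompatible release distinct node names is one way to avoid the kind of GnuTLS 26/28 symbol clash described earlier.

```
/* Hypothetical GNU ld version script for a libfoo */
LIBFOO_1.0 {
    global:
        foo_init;
        foo_process;
    local:
        *;        /* everything else stays out of the ABI */
};

LIBFOO_2.0 {
    global:
        foo_process_ex;
} LIBFOO_1.0;     /* the new node inherits from the old one */
```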