Thank you for coming to our talk. Today we're going to talk about everything ABI-related in Fedora, as far as ELF binaries are concerned. For those of you who don't know us: my name is Dodji, I work for Red Hat in the tools team, mostly on static analysis of binaries — basically applying compiler-technology tricks to analyze binaries and find out what we're doing with our ABIs these days. Hi, I'm Sinny, I work at Red Hat in the alternative-architectures team, and when I get time I work on ABI stuff too.

In today's presentation we're going to focus on, I would say, five points. First I'll try to explain what we mean by an application binary interface — it's kind of a fuzzy area. Then we'll talk about what we mean by ABI compatibility in general, and then about the tooling we have in Fedora today to try to tame ABI compatibility changes. If time permits — and I hope it will — we'll look at some real examples of ABI change reports that we have in production in Fedora today, and at the end we'll talk about the future, as we usually do.

So, first of all, what do we mean by ABI? During this talk I'm usually going to talk about two binaries: one binary that uses another one. The first binary I'll name E — that's the context — and the second one L; E uses code from L. The first binary is usually an executable (that's where the E comes from), but it can also be a shared library; the second binary is usually a shared library, but it can also be a module that is dynamically loaded. So whenever I say "executable" it can be a shared library, and whenever I say "library" it can be a dynamically loaded module. Confusing? Okay, let's go straight ahead.

Once this context is set up, we need to understand that at execution time E, the executable, requires certain properties from the library. I'm not going to talk about all the possible properties; I'm going to focus on a subset, and that subset is what we call the ABI. For instance, one property E expects from L is a binary format: on Linux the format is ELF, on macOS (god forbid) it's Mach-O, on Windows it's PE, the Portable Executable format, and so on. Once we agree on a format, the architecture counts: if you're running an executable built for, I don't know, an Intel architecture, then the library needs to be Intel as well. Then there is more interesting stuff: the presence of certain symbols. Who here doesn't know what a symbol is? It's okay, we have time to explain — maybe I should have asked who knows. So, the presence of certain symbols, and those symbols can come from functions or global variables, even files — weird things, from the ELF binaries in our case. Even more interesting is the layout of the data expected by the code that starts at those symbols. If I say something that is not clear, please stop me right away. All those properties are expected between the two binaries, and there are even more: things like function calling conventions — how you pass parameters — things like that. One thing I am not talking about is behavior: what a function actually does, we don't care about here. We only care about the properties that define the structure of the entry points — of what starts at the symbols' addresses. These are what I call structural properties, not how the function behaves. Those properties constitute what I would call a very loose contract between the executable and the library.
That loose contract is what we call the ABI here. I'm staying general for now, but with time we'll see more concrete examples. We're talking specifically about the ABI of a binary, as opposed to, say, the ABI of an operating system — we're not talking about that, and we're not talking about the ABI of the kernel either; just user-space binaries in this case. So I'm narrowing the scope down. In that context, the ABI is made of: the set of symbols — symbols of functions and global variables — that are defined in and exported from the binary; the layout of the data used by those symbols; and the things I mentioned at the beginning, like the format of the binary, the architecture, and so on. These artifacts are the ABI, basically.

Usually people say "oh, you changed the ABI — bad!" Well, actually, we're doing free software, and we want free software to thrive. For that to happen we need change: we need things to evolve, we need bugs to be fixed, we need new features to be added. And, well, we're not that rich, so we need to share functionality — unless we're writing everything in Go, we're going to have shared libraries around for some time to come. So yes, we'll add new functions, we'll add new global variables, functions will get new parameters; all those things will change. ABI changes are going to be with us for the foreseeable future — change is inevitable. What we need, then, is first a way to see an ABI change, as opposed to seeing changes in source code, and second a way to categorize that change: this one is good, this one is bad, and this one I need to think about, I need to review it. This is what I mean by managing ABI changes.

To be able to detect problems — oh, can you read that? the font is weird here — we need to be able to spot the changes that make the ABI of, for instance, a library incompatible with applications linked against previous versions of that library. We need to see those changes and flag them, at least when they are ABI-incompatible. For instance, if a function is removed, that is probably an ABI-incompatible change: if an application out there in the wild was relying on that function and you ship a new version of the library without it, that application will probably stop working reliably, because it expects something that is not there anymore. That change needs to be spotted in a meaningful way. Even more interesting: suppose a function of the library provides data of a certain length, and then a new version of the library changes that expectation — it provides data that is shorter. The old code of the application expects a buffer that is wider than what the newer version of the library actually provides. What do we get in that case? Buffer overflows — security issues that are hard to debug and fix. That kind of change in particular needs to be spotted early on. I won't get too much into the details of when these changes happen, but you have examples here. All these small changes I'm talking about, we need to be able to spot just by looking at the binary — again, we're not looking at source code here. Then, once we're capable of spotting the bad changes, we can see the other changes too and say: these ones are okay, but I still want to see them, just in case. We'll see examples of where that can be interesting.
Of course, we want to see those changes as soon as possible. Users will see them in the end, but if we can see them upstream — even before we, as a distributor, handle the packages, before we push new packages — even better. One interesting way of approaching this is to reuse concepts we already have. Today we do this kind of review for source code: when there is a new patch, usually upstream, it gets reviewed. And what is the cornerstone of that process? The diff. We review changes, we review diffs — that is how we spot changes that are possibly harmful. So we thought that having the same concept — reviewing just the changes — would be interesting, provided those changes are narrow enough to represent only ABI changes, so that we could have the same process as for source code, but just for the ABI. One of the reasons we were not capable of spotting ABI changes just by looking at source code is the signal-to-noise ratio: a lot of source code changes don't impact the ABI at all. So something tailored just for the ABI is really needed. Which brings us to the tooling we have in Fedora today.

So far we saw how an ABI impacts other applications; now let's see how we handle this in Fedora. All packages are built in Fedora, and many packages depend on a given package. If the ABI changes in a package on which other packages rely, then even though nothing changed in those dependent packages, it is possible that they won't even rebuild, because the ABI — the set of functions they were relying on — has changed. That needs to be taken care of in Fedora, so that we can avoid such breakage and have a better distribution, without many ABI breakages. What we do: every new update is shipped through Koji, where a new rebuild happens, and for each new rebuild we try to run these ABI checks against the previous stable release of the package. Suppose, for example, we are on Fedora 24 and there is a new update for a package "foo". All the packages will now use that new foo version, so it needs to be tested for ABI stability: if any package relies on foo, it shouldn't break because of the change in foo. So the tool checks the ABI and, if there are changes, sends a report of the ABI changes to the maintainer, so they know what the changes are and can review whether each is a valid change or not, and then maybe do a new rebuild — or, if it's an intended ABI change, flag it as such. We categorize ABI check results into three kinds: if there are no changes, we mark the result as passed; if there are incompatible ABI changes, we say it has failed, so the package maintainer can look into it, decide whether it's a valid change, and maybe do a new rebuild; and there is a gray area — changes that may or may not impact other packages — which has to be manually reviewed, so we mark those as "needs inspection".

What tooling are we using in Fedora for this ABI checking? We're using Taskotron. Who is aware of Taskotron? Okay — if you don't know it, Taskotron is a framework that lets you write automated tasks. For example, there is an rpmlint task which does checks on spec files, and we have added an abicheck task which does the ABI checking. Right now we're doing ABI checks for the subset of packages that are on the critical path. What it does, basically: the ABI check gets triggered when there is a new update in Koji; it takes the latest stable package and the package that has just been uploaded, computes the ABI diff between them, and the result is shown with a status — passed, failed, or needs inspection. The abicheck task in Taskotron is automated, but a package maintainer can also run it offline using the fedabipkgdiff tool: on the command line you run fedabipkgdiff with the NVR of the package, it pulls the update from Koji, and it does the ABI diff locally. So before pushing an update to Fedora's Koji, it's better to do the checks locally. For package maintainers there are also tools like abipkgdiff and abidiff, which are independent of Fedora — they can be run on Debian, or on plain tarballs, and so on. Everyone should do ABI checking, so that the packages depending on a given set of packages don't break. This ABI check work is live — it runs on the Taskotron production instance, so if you want, you can check it out; we have links further on. Right now we have two limitations: it runs only on C and C++ applications, and it currently runs only on critical-path packages. We're planning to improve it to run on all packages.

Here is a real example, taken from a Taskotron run on the package called GPGME; you can see the log is for Fedora 23. The output is split across different pages: it runs for different architectures, so we cannot show all the reports together — this is a small subset of the results. You can see the two versions of the package that were compared, and the changes are shown for each binary available in the package: the ABI comparison is done on each binary, and if there is any change, it is shown. We can see there is a library — libgpgme-pthread.so — available in the GPGME package, with some ABI changes, so we show them. We see that no functions were removed, which is okay; there is a change in one existing function, which we'll look at in a moment; and there are seven added functions. The rest is fine.

Let's see in detail what these changes are. We saw there were seven added functions; here are a few of them. For each one you can see a detailed prototype: the function name and the return type as they appear in the source code, and the symbol name as you see it in the library — so it's a kind of mapping showing you in detail what the function prototype looks like, along with the version of the symbol as seen in the binary. Then there was one change in a function, a sub-type change. Let's look: the changed function is gpgme_cancel, its return type is gpgme_error_t, and this is its parameter. In a sub-type change we show whether there is any change in a parameter or in the return type. Let's see where exactly the change is — and you can see that this function is defined at line 194 of this file. (Question from the audience: maybe it's because I'm not a C developer, but you said you're only looking at the binary, and now you know exactly which line of the source code it was?) Yes — you can deduce that from the binary: we have debug information for exactly that. We're trying to do the same kind of analysis a debugger does today, but for this analysis instead. That's the point: to show the kind of detail we can extract just by looking at the binary, because it needs to make sense to the programmer — and not only to the programmer, but to the person doing the package maintenance. We need to talk about things that make sense: types and function names, not symbol addresses and offsets.

Okay, so the change is in parameter 1, which is of this type. Let's see what that type is: it's a pointer of type gpgme_ctx_t — a context — so we know it must be pointing to something. It's pointing to this structure, so the change is actually in this structure, and the structure is defined in this file at line 76. Next: what exactly changed? You can see the type size changed from 1664 to 1792 bits — the size of the structure is no longer the same. And what changed inside? Three data member insertions. The first change: a new data member was introduced at offset 416. The old size was 1664 bits and the insertion is at offset 416, so the insertion is somewhere in the middle of the structure — and if a change lands in the middle, it's most likely that it can lead to ABI breakage. Similarly, members were inserted at offsets 1216 and 1280, so all three data members were added in the middle of the structure. In the binary format, code reads members from fixed offsets: old code will expect a given data member at a particular offset, but because of the insertion of these members there will now be a different data member at, for example, offset 416 — which may lead to ABI breakage for anything using the new version of the library.
So that's the example. Now let's talk about what needs to be done — this is what we have so far. Changes are needed at multiple levels, because in Fedora we use the tools from Libabigail, like abipkgdiff and abidiff, and on top of them we run the abicheck task, so improvements are needed everywhere. Right now we run on the critical path, and there are some packages that need a lot of memory — for example, Firefox needs something like 30 to 40 GB — and right now we don't have that much memory available for running the task. That needs to be worked out somehow; I don't know how yet, maybe we'll talk to the infra folks. (From the audience: the KDE libraries need more memory too!) He's right — no, seriously: we say Firefox, but it's not just Firefox, it's Firefox's dependencies, like Gecko — libxul, say. Okay, maybe libxul is not the best example... no, let's take libxul: it is reused elsewhere, it is used by Thunderbird, so an issue in it shows up in Thunderbird. It's a library used by others, so it would be nice to be able to analyze it. And there is other work going on in the same kind of area. I think it would be nice to be able to analyze everything, eventually. Does that answer your question?

Also, right now we show all the changes, and chances are that developers and package maintainers don't want every change flagged as an ABI change. So we may need suppression specifications — which are already supported by the Libabigail tools. If we can have somewhere to keep the suppression specification, maybe in dist-git, then later on Taskotron could pull in those suppression specifications and apply them to the packages whenever an update happens: while running the ABI diffs, it would suppress whatever is mentioned in the suppression specification and not flag those changes. That would avoid flagging changes that are, in a way, redundant. And at the abicheck level — the task that runs on Taskotron — right now we run on the packages in the critical-path list; I haven't counted, but it's maybe 300 or 400 packages. We are not running on all packages, so we'd like to see what the blockers are — and if there are none, how we can start running on everything built in Koji. (I can answer that: we use the critical-path list, minus some packages that we know are really difficult to test because they require too much memory. I think we'd be able to add some more packages, but it wouldn't scale to everything.) So there is no real strong reason — if you're interested, come talk to us; if we get too many requests we'll start saying no, but I suspect there won't be people stampeding at the gates to get their packages checked. (From the audience: there are two or three packages I know of that pretty much every release break something; I'd love the option to add those.) Then let's talk after this.

Then there's support for taking the devel packages into account. It's a bit tricky; I'll explain. Right now, the ABI changes you see cover all the exported symbols in all the shared libraries available in a package. But you cannot legitimately use all the symbols available in a library: only what we ship in the public headers — in Fedora, whatever headers are shipped through the packages — is supposed to be used. So we need to look at the devel packages, or wherever the header files are shipped among the sub-packages, and only changes to those should be flagged; the rest of the symbols and functions should be considered private, because they are not supposed to be used by other people — their headers are simply not available to users. For instance, in the previous example, the type gpgme_context is defined in a file named context.h, and if you go look at the source code of that package, context.h is a private header. So this change is actually not a problem, because it happened to a type that is private to the package. It's still cool for folks to review it, but it's not a problem, because it's not happening to a type that is supposed to be used. The idea is: to know whether a type is supposed to be used, we need to know whether it is defined in a header that lands under /usr/include or somewhere like that. Is that clear? This is what I mean by taking the API of the package into account when showing the ABI changes: if a change is to a type that is not part of the API, we're not going to show it by default — but if you still want to see it, there is an option to show everything anyway.

Then there is abipkgdiff itself, which is used as the main tooling inside abicheck to produce the results. Improvements are needed there too. For example, memory usage: as I said, for Firefox something like 30 to 40 GB of memory is needed to run the ABI checks successfully, and it's hard to get infrastructure with 40 GB, so we are working on it. Also, support for more C++ language constructs — right now, for instance, you will not see the changes if there is a change in a union type. Then, better categorization of ABI changes. For example, if a new parameter is added to a function, we show it as an ABI change; but if a parameter has been removed, someone using that function will of course pass three parameters where only two remain — that is definitely an incompatible ABI change, so maybe we should flag it as incompatible, rather than as a generic ABI change that merely needs inspection. Also, these are all command-line tools right now; we'd like to have a good, friendly web UI, so that normal users can look at the results too, and a tracking facility, in web form, for all the changes happening across the different packages. That's it — time for questions; we have 15 minutes... 13, so we can have a few.

(Question: what are suppression files?) Okay. So today, the interesting changes are on types, basically. Saying a function has been added or removed is cool, but it's quite easy; the interesting thing is to see how a type changed, and what the changes are. But then, depending on your project, a change that is a problem for one project is not a problem for another. Let me give you an example. Suppose you are writing networking software where your structures are streamed out to the network. If you add a new member at the end of a structure, to you that's a problem, because the structure dumped to the network is encapsulated in frames, and a new member is bad because it changes the size of the frame. But for someone who is not doing that, a new member at the end of the structure is not a problem, because nothing was looking past the old end of the structure before. So maybe you wouldn't want to see those changes — they would be noise for you, and signal for someone else. At our level, we cannot decide how to classify the change: is it an incompatible change, or just a change that needs review?
So what we did is provide a way for users to specify how the tool should suppress certain change reports. You can say, for instance — let's take C++ — "I have a namespace named hidden; no change to any type whose name is hidden::-anything should be shown to me, because the namespace hidden is something that should stay hidden." Is that clear? Or you can even say things like: "if a data member was added at the end of the structure named such-and-such, don't show it to me." It's a declarative file, in an INI-like format, where you specify this; then you provide it to abidiff or abipkgdiff and it applies the suppressions to the changes — a bit like what exists with Valgrind. Valgrind has suppressions, and that's why these are called suppressions too. But today, when you're using the automated pipeline, the problem is: how do you provide a suppression specification for your package, when some change is not meaningful? What we did until now is this: if inside the package there is a file whose name ends with the .abignore extension, it is considered a suppression specification and taken into account by the tool. But that's not really ideal; maybe people would like to have it in dist-git instead, so that it is versioned and they can see the changes to their suppression specification. And then, as was just asked, it would be cool to fetch those suppression specifications from dist-git and, if they're there, apply them to the ABI comparison — things like that. It needs to be discussed with the people running the process, to see how we can do it. Does that answer your question?
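As a sketch, a suppression file for the two cases just described could look like this, using Libabigail's INI-like suppression syntax. The `hidden` namespace and the `my_struct` name are made up for the example; check the Libabigail documentation for the exact set of supported properties:

```ini
# Ignore changes to any type living in the 'hidden' namespace.
[suppress_type]
name_regexp = ^hidden::

# Ignore data members appended at the end of this one structure.
[suppress_type]
name = my_struct
has_data_member_inserted_at = end
```

Dropped into a file whose name ends in .abignore inside the package, it is picked up automatically, as just described.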
(Question: you mentioned the concept of symbol versioning — how common is it, and where can I find documentation about it?) Okay, two questions — I need to repeat them: first, how common is ELF symbol versioning, and second, where can you find the documentation. I'll start with the second question. There is a famous paper — I don't remember the exact title right now — written by Ulrich Drepper, called "How To Write Shared Libraries", about what developers should know about shared libraries. If you type "how to write shared libraries" into whatever search engine you use, you should find it. It is well detailed, and it explains why symbol versioning should be used as opposed to just bumping sonames, which is what we usually do — Ulrich was, and I think still is, against just bumping sonames, and he argued for using ELF symbol versioning instead. So, how common is it? It is pretty common in core system libraries, for instance in the GNU toolchain: glibc and all the core libraries of the GNU toolchain use symbol versioning — glibc, libstdc++, all the standard libraries of the languages supported by the toolchain — and we're seeing more and more people using it. So it is really important to be able to support it, and we do support it. Does that answer your question?

(Question about how long the analysis takes.) Seriously, it depends on the package — and on my machine I only run an unoptimized build of Libabigail, because I want to see the worst case. For example, you can see here that one comparison took only 2.1 seconds, though that's not a great benchmark. Actually, this is an interesting case, because abipkgdiff is heavily multi-threaded: if you have five shared libraries in your package and you want to compare it against another package with five libraries, the five libraries are compared in parallel, if you have at least five cores. So usually the total time is the time of the single comparison that takes the longest — two or three seconds usually; if one takes more than ten seconds, I go look at it and try to optimize things. And again, that's for an unoptimized build. Does that answer the question?

(Question: can you find out whether the changes are somehow related to paths — I mean, if the body of a function changed, and the change is related to some location like something under /proc?) Today we don't do analysis of the instructions inside functions. If what you're asking is whether we see instructions in a function that change something — we don't do that analysis yet. What we do today is purely static analysis of the types that are exposed by a function, the types reachable from the declaration of the function. And if a library exports a global function or a global variable — yes, that you can detect.

(Question: GLib-based libraries often have introspection-based bindings; are you comparing the generated introspection as well? Because it is often generated from comments, so it can change too.) Just to rephrase the question: GLib-based libraries often use GObject introspection to generate bindings — library stubs that get used — so can we detect changes in that generated code, even though it was generated from comments? So, first: if the generated code is compiled and ends up in an ELF library, a binary, we see it. (But the introspection data itself is not a binary.) No, it's not a binary. (I was just wondering if you're planning on adding GIR support.) Okay — so the idea would be to run the introspection tooling as part of the abicheck task and compare its results too. That's interesting — we're not forced to do it in the tool itself. I like the question. (Sorry, one minute!) Yeah, one minute, okay.

So, on other input formats: today, of course, we read type information and name information from ELF, but we don't only do that. Once we have the ABI representation in memory — our internal representation — we can also dump it to XML. And once you've dumped it to XML, you can take that XML and run abidiff with the XML as the first parameter and a binary as the second parameter, so you're comparing the ABI of a binary against an ABI specified in XML. What I'm trying to tell you is that we have a reader that reads ABI information — type, function, and variable information — from XML. So it is very conceivable, possible, and not unusually complicated to write a third reader that reads types, functions, global variables — that kind of artifact — from something else, and that something else could be type information coming from, say, GObject introspection data. I think that is very well possible. (From the audience: we have this problem sometimes — the way this is used is that, for example, there's a Perl module that knows how to read those bytes and do runtime introspection, so it's possible to use libraries that way, and sometimes the libraries just change without anybody noticing.) Yeah — so it's very possible to write a reader for something else. Just last week, I think, someone came and asked "would you support the Apple binary format?" and I was like: yes — write a reader! So, we still have five minutes.

(Question: one trend I see is that continuous integration is becoming more common, which is great, and I wish this was also part of the development process: the moment someone is writing a patch, let them know — "hey, you're changing the ABI; is that intended?" If they wish to, they proceed; but if they weren't aware the change would affect the ABI, they could rewrite it to be compatible.) Sure, yes — and my answer is: we're already doing that in GCC. In GCC we depend on some libraries that are developed elsewhere; for instance, GCC supports Address Sanitizer, which is developed in LLVM's repository. So from time to time someone takes the code from LLVM and imports it into GCC's repository, and before doing that, we maintainers ask that a comparison be run between the previous version we had in GCC and the new thing being imported. We review the changes, and if there are ABI changes that are not compatible, we shout — or we try to do something about it. So yes, this is happening. The glibc folks are planning — well, they actually already do it by hand today — but they plan to try to automate it somehow in their upstream project. That's the direction.

(Question about the kernel.) So, in theory it should just work — the thing is, I haven't worked on that yet, but the kernel is an ELF thing. Sincerely, yes: I'm interested in having that — either doing it myself, or helping someone else do it, or just accepting patches, whatever — and I think it's coming. The kernel is special, but it's something we're interested in; you're not the first one to talk about it, there's some chatter around, so yeah, this is definitely something we need to add. Basically, for the kernel stuff, I guess we need to sit down and define what we want, and then come up with the tooling. I think that's all we have time for today — thank you very much.