Thank you for coming to the very last session of the conference. My name is Anderson Saki, I'm a software engineer on the Red Hat Crypto Team, and I will talk a bit about ABI maintenance and library maintenance. Sorry, I don't have much time, so I will talk for 20 minutes non-stop. If you have questions, please take notes and ask me later. Thanks.

This is my agenda. I will talk a bit about ABI breakage, ABI versioning, map files, and finally about the tool abimap, which automates the update of map files.

So, ABI breakage. What is ABI breakage? It's an incompatible change made to the API or ABI, so that software depending on it will be broken. Imagine it in the context of an operating system: it would be something like a broken package, and all the packages depending on that broken package would have to be changed, or at least recompiled, so that they incorporate the changes and are able to run again.

Some examples of what would break the ABI. If you remove one exported symbol, it breaks the ABI, because any software depending on that API would just break. If you change the arguments of a function, the exported symbols don't change, but the behavior does, so applications could break at runtime. The same goes for changes to exported symbols: they could break things at runtime.

So what we want is one stable ABI, meaning that everything that worked before has to continue to work. Old applications compiled against the old version of the library should be able to run against the new version without problems. And we should consider only the documented behavior. If users rely on some internal API, or use the library in some undocumented manner, then they are not covered by the contract. They are on their own. This is to be fair to the developer.

So we want to make incompatible changes without breaking the ABI. How can you do that?
The idea is that you have to keep all the symbols, all the APIs you had before, in the new version of the library. Here comes ABI versioning, which lets you keep more than one version of one implementation, of one API.

The first idea for versioning the ABI would be to use different file names. For every new version of the library, you would create a new DSO file. This is really not smart, because you would have to keep all the files for each version of the library. Otherwise, you would break some software depending on old versions of the library. Imagine this situation: you have an application depending on two libraries, and each of them depends on a different version of the same third library. When you load this in memory, you don't really know what the behavior will be, because that depends on what the dynamic linker finds during lookup. So it's a problem.

What we do instead is use symbol versioning. We add version information to the symbols in the binaries, so you can keep more than one version of one implementation or one API. The idea is to create version nodes, where you put the list of the symbols that were introduced in that version. In the binary, the symbols are associated with the version information. And you create a hierarchy: if one version node has a predecessor node, it means that it supports all the symbols that were available in the version before. So it keeps the compatibility.

How do I add version information to my symbols? You have to create a version script, also known as a map file, declare the version nodes with the list of all symbols that each version introduced, and pass the map file to the linker using `--version-script`. This is all valid only for the GNU linker.

So let's talk a bit about the map files. This is a map file. This is the name of the version node. It's also the version information that will be added to the symbols introduced in this version.
It's an arbitrary string, but it's a good idea to include some reference to your library and its version, so that you can keep track of the versions.

Global and local define the scope of visibility of the symbols. So using map files, you can also control the visibility of the symbols, and define exactly the set of symbols you want to export. If you want to export a symbol, you have to put it in the global scope. There is this special symbol: the asterisk is a catch-all wildcard. It will basically catch all symbols that are not explicitly added to any version node. So all the symbols that you don't add to the map file will be hidden from the application. They will not be visible.

This is just the predecessor. It's not really necessary in the map file, but it's a good idea to keep this information so that you can see what the original version was, on which you based the new version.

So, yeah, this is how the linker adds the version to the symbol. It matches the name you put in the map file with your implementation, and it adds the version information to the symbol. When you compile, you get your shared library with the version information added to the symbols.

So how do I make a compatible change? If you are making changes to the code that are really compatible, then you can just make them. They will not affect the behavior, or shouldn't. The new version will work just like the old version, and the applications linked against the old version of the library will continue to work, because they will find the symbols with the old version there. You can optionally add one empty version node, just to say explicitly that you are not adding any other APIs in this version.

Adding APIs is probably the most common change you want to make to your library. So I added some other API, and I have to add one new version node containing that symbol to be exported.
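As a rough illustration of the map file described above, here is a minimal sketch. The node names `LIBEXAMPLE_1_0` and `LIBEXAMPLE_1_1`, the file name `libexample.map`, and the functions `example_init`, `example_add`, and `internal_helper` are all hypothetical and not taken from the slides. The map file is shown inside the comment; the C code below it is what that map file would export or hide.

```c
/*
 * libexample.map (hypothetical map file for the GNU linker):
 *
 *   LIBEXAMPLE_1_0
 *   {
 *       global:
 *           example_init;
 *       local:
 *           *;
 *   };
 *
 *   LIBEXAMPLE_1_1
 *   {
 *       global:
 *           example_add;
 *   } LIBEXAMPLE_1_0;
 */

/* Not listed in any version node: caught by the "*" wildcard and hidden. */
int internal_helper(int x)
{
    return x * 2;
}

/* Listed in the first node: exported with version LIBEXAMPLE_1_0. */
int example_init(void)
{
    return internal_helper(0);
}

/* API added later: exported with version LIBEXAMPLE_1_1. */
int example_add(int a, int b)
{
    return a + b;
}
```

Assuming GCC and GNU ld, something like `gcc -shared -fPIC -Wl,--version-script=libexample.map -o libexample.so example.c` would build the library, and `objdump -T libexample.so` should then list each exported symbol next to its version node.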
And the new version of the library will contain both the old symbols and the new symbols that you added there.

So how do I keep multiple versions of the same API? This is the most complicated case, let's say. You can use the assembly `.symver` directive to create aliases for your implementation. What I'm saying here is that this new func is actually a new implementation of the API that was available before. So I add that symbol to the new version node, and you have to pay attention to the at marks. If you put a single at mark, it means that the symbol will be visible only at runtime. New applications being compiled or linked against the new library will not see the symbols marked with a single at mark, only those with a double at mark. We usually say that the symbols with the double at mark are the default implementation, because that is what the linker sees when linking the application.

So what happens if I omit the first one, the first `.symver` directive? Both the new and the old implementation would get the double at mark, and the linker would be a bit confused. Actually, it would use the first symbol it found for the first version found. So it could be the old or the new one. And if I omit the second one, the new implementation would be caught by the catch-all wildcard and wouldn't be exported at all. It would have only local scope. So you have to keep both directives to have two versions of the same API.

Now, how do I make incompatible changes and keep the ABI stable? You basically have to keep all the old symbols, all the symbols that were available before, at runtime. The important part is that they should be available at runtime. So to make incompatible changes, the techniques are similar to keeping two versions of the same API. You change the implementation. In this case, I changed the argument. So this is an API break.
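The aliasing technique with single and double at marks can be sketched roughly as follows. All names here are hypothetical: `example_func` is the public API name, `example_func_v1` and `example_func_v2` are the internal implementations, and `LIBEXAMPLE_1_0` and `LIBEXAMPLE_2_0` are version nodes assumed to be declared in the map file. The guard macro `BUILDING_SHARED_LIBRARY` is also an assumption of this sketch, so that the same file still compiles as plain C when no version script is in use.

```c
/* Old implementation, kept so that already-linked applications still run. */
int example_func_v1(int a)
{
    return a + 1;
}

/* New implementation with a changed argument list (an API break). */
int example_func_v2(int a, int b)
{
    return a + b;
}

#if defined(BUILDING_SHARED_LIBRARY)
/* Single "@": the old version stays available at runtime only. */
__asm__(".symver example_func_v1, example_func@LIBEXAMPLE_1_0");

/* Double "@@": the default version, seen when linking new applications. */
__asm__(".symver example_func_v2, example_func@@LIBEXAMPLE_2_0");
#endif
```

With both directives in place, an application linked against the old library would keep resolving `example_func@LIBEXAMPLE_1_0`, while newly linked applications would bind to the `LIBEXAMPLE_2_0` version.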
So I add a new version node containing that symbol, and I use the assembly `.symver` directives, with the single at mark on the old version and the double at mark on the new version, so that the linker will find the new version when linking.

To deprecate symbols safely, what you have to do is keep them available at runtime. You can use the same technique: you mark the deprecated function with a single at mark, and you add a new version node without that symbol. Obviously, if you added some API, those added APIs would be present here in this new version node. But the idea is that the deprecated one will not be here. What happens is that the symbol gets the single at mark, so it will be available only at runtime. When new applications are linked against the new version, they will not see this symbol. But applications linked against the old version will still find this symbol available at runtime. So you will not break them.

If you really want to break the ABI, meaning that you are releasing a new version that is completely incompatible, then what you should do is take all the symbols ever exported and merge them together in a new version node. So you will have a single new version node. This makes it clear that all the symbols in the new version are not compatible with the old ones.

Once you start versioning the symbols, you have to update your map file for each added API. If you forget to add some new API to the map file, it won't be exported and won't be available at all. Even worse, if you accidentally remove some exported symbol from some version, then you are basically breaking your ABI. So you have to update the map files with care.

That's why we wrote the abimap tool. What is it? It's a tool to automate the maintenance of the map files. You provide all the symbols to be exported to the tool, together with the old map file, and it will update the map file automatically.
And it can also detect if you have accidentally removed some symbol, to avoid the ABI breakage.

So what's the workflow? You provide the list of all the symbols to be exported to the tool, plus the old map file, and you get the map file updated with the symbols added to the new version node.

These are the commands available. With new, you create a new map file. Update changes an existing one. Check just checks the syntax of the map file to see if anything is wrong. Version just prints the tool version information.

I'm afraid of live demos, so I put in a static one, and the colors are artificially added, so you won't see these pretty colors in the real output. What I'm doing here is providing one symbol to abimap with the new command and specifying the version node that I want to create. So it will just create the version node with the added symbol.

I can update one existing map file. Remember, you have to provide all the symbols to be exported, including the old ones. So what it did here... Ah, ignore the warnings. This is something that maybe I should remove. But yeah. What you get is a new version node with the new symbol added there. The tool tries to guess the name of the new version you want to create. In this case, it understood the change as incremental, so it just incremented the minor number in the version.

In this case, I forgot to provide some of the symbols. The tool detects that you accidentally removed some APIs and aborts the operation.

In this case, I provided all the symbols that I had before, plus one, and I specified one existing version node to be updated, so the tool can change existing version nodes. What it did there was add that symbol to that existing version node.
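The removed-symbol check described above boils down to a set comparison between the previously exported symbols and the newly provided list. The following is only an illustrative sketch of that idea, not the tool's actual code; the function names `contains` and `count_removed_symbols` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` is present in `list` (of length `n`), else 0. */
static int contains(const char *const *list, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Counts the symbols that were exported before but are missing from
 * the new list. In this sketch, a non-zero result is the condition
 * under which an update should be refused.
 */
size_t count_removed_symbols(const char *const *old_syms, size_t n_old,
                             const char *const *new_syms, size_t n_new)
{
    size_t removed = 0;

    for (size_t i = 0; i < n_old; i++) {
        if (!contains(new_syms, n_new, old_syms[i]))
            removed++;
    }
    return removed;
}
```

The quadratic loop is fine for the few hundred symbols a typical library exports; a larger list would call for sorting or hashing first.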
You can also explicitly say that you want to break the ABI with `--allow-abi-break`, which allows you to remove things. What it will do is merge all exported symbols in a single new release, and again, it tries to guess the version that you want to add there and increments the major version.

There is this final option, `--final`, that marks the version node with one special comment. It's just a trick for the tool to understand that this version node shouldn't be changed anymore. So if you mark a node in the map file with this release comment and then try to change it, the tool gives you this error message saying that the node was already released and you shouldn't be changing it.

This is just to show how the check works. I removed one semicolon and ran the check against the library, and it gives you an error message.

The tool has some limitations. It doesn't support adding existing symbols to a new version. This is required if you want to keep two versions of the same API, so this is a limitation and an improvement I have to make. Extraction of symbols is out of scope, so you need your own way to parse your source code and provide the symbols to be exported. The detection of ABI breakages is also out of scope. What I do is just check that you are not removing symbols. If you really want to check whether the ABI is being broken, I strongly recommend libabigail or the abidiff tool. The integration of the two could be easier, so please consider contributing.

So, the summary. Breaking an ABI can be catastrophic for an OS, obviously. What is a stable ABI? Everything that worked before has to continue to work, so basically you have to keep all the symbols ever exported, forever. Symbol versioning associates version information with each API. Map files are how you define the version nodes, and they can also be used to limit the visibility of the symbols in your library.
And abimap is a tool that you can use to automate the maintenance of map files. If you need further information, I recommend reading this paper. You can get the code at this address, and that's all. Thank you. Do you have questions?

Yes? The question is whether this only works for C and not for C++. Theoretically it would work with C++, but you have to provide the whole symbol to be exported. So yeah, I would say that the tool is more suitable for C. Maybe I should make some improvements in this area.

Yes? Okay, the question is whether I recommend changing or bumping the SONAME with incompatible changes. Yes, I think the best way to maintain a library is to always bump the SONAME when you make incompatible changes. If you are breaking the ABI, then you should definitely change the SONAME. That is related to the option for allowing the ABI breakage: you create a new version node with all the symbols exported there, and you bump the SONAME. All the tricks with the assembly `.symver` directive are there to allow you to keep the SONAME and still make the incompatible change, so that the new version remains compatible with the old one. So yes, for me it's better: if you are making incompatible changes, just bump the SONAME and recreate the map file. But yeah, those are tricks for how you can keep the SONAME, not break the ABI, and still make the change.

Yes? So the question is that the linker already has some better way to deal with different SONAMEs. Sorry? In Solaris. Yeah, so the symbol versioning scheme of the GNU linker was based on the Solaris one, and all these tricks, I think, are only available on Linux, in the GNU linker. So the question is whether I considered the Solaris way of dealing with SONAMEs. No, I haven't considered that. Everything that I wrote was for the GNU linker, which was the thing that I wanted to support. So sorry, I don't know about the differences with the Solaris linker and how it deals with the symbols.
Yeah, so the question is that in C++, probably the only thing you have to do is mangle the symbol version into the symbols. And I think, yes, that's the way to do it. As far as I know, when you compile C++ code, all the symbols get the classes and so on mangled into them, so you get really long symbols. And you could definitely mangle the symbol version in there. So yes, in C++ you could simply mangle the version information into the symbols, and that would solve the problem in basically the same way as symbol versioning.

Oh, okay. So it was not really a question, but an answer to the other question. Yeah, sorry, I haven't... Sorry, I'm out of time, so thank you for attending.