Hello, I'm Pierre-Marie de Rodat and this is Raphaël Amiard. We are both software engineers at AdaCore, and today we'll talk about Libadalang. We have very little time to explain it to you, so let's go.
First of all, what is Libadalang exactly? In just three bullet points: Libadalang is a library, and we want it to allow people to get insight into Ada code and also to modify it. For this, we want to offer both high-level and low-level APIs. By low-level, we mean small details like: what is the location of this token? What is the token under this location? Things like this. And we also want high-level APIs such as: what is the type of this expression? Or: can you please rename this type and all occurrences of this type?
Libadalang is also meant to be very versatile: we want it to be usable from any language or technology. For this, Libadalang is an Ada library that also offers a C API, and on top of the C API it can talk to basically everything. We ship a Python wrapper with Libadalang so that you can use the whole library from Python. This offers an interesting feature: you have a really short path between having an idea and having a working tool, because you can do a quick prototype in Python. And this is great. One important point: Libadalang is an Ada library. We are going to present examples written in Python because they fit on slides, but everything you will see in Python can be done in Ada, of course. Let's go.
First of all, let's see why we need Libadalang in the first place. This is a screenshot from GPS, the world-famous editor for Ada. GPS needs to know where a block starts and ends. Here you can see that it knows it is in a type declaration that starts here and ends here. You have to do some parsing to understand that, so Libadalang will provide that. It will also provide...
Well, in intelligent code editors, sometimes you want to click on an identifier and you expect the editor to take you to the definition corresponding to this identifier. This is often called name resolution or cross-references. We want to provide that. We also want to make it easy for IDEs to rename a function, for instance, or to do source transformations like this.
And here you have some program. We want to make it easy for everyone to write custom tools that will, for instance, act as linters and detect variables that are poorly cased. This doesn't match, but it doesn't matter. So, variable names: if you have a rule like "all variable names should be capitalized", you can easily write a checker for that. We want that to be easy with Libadalang.
At this point, if you know the Ada ecosystem well enough, you might ask: why not do this using ASIS? It does precisely that. Well, because we want Libadalang to serve as the building block for tooling, including editors, there are several mismatches. The first one is that we want to be incremental: you open your project, Libadalang analyzes the project, and you perform a very minor modification. You don't want GPS, for instance, to freeze for seconds or minutes because it recomputes everything that depends on the modification you made. So when something changes, we want to perform minimal recomputation. Second, most of the time when you're writing code, your code is incorrect, because you're still writing it. So we want Libadalang to be as helpful as possible when your code is incorrect. And also, something very important: we want Libadalang to be bounded in the resources it uses. We don't want Libadalang to crash your program after three days of running because it exhausted all virtual memory.
So ASIS, and in particular GNAT's implementation of ASIS, was implemented with some objectives in mind. Here we have needs that kind of contradict them, so GNAT and ASIS are poorly suited for what we need. So we decided to do yet another library.
Yes, I just want to make one thing clear here. There is actually no problem with the specification of ASIS for those needs. The problem is more with the implementation, which is based on GNAT. GNAT is a compiler; it was designed to do all its work in one pass. So basically it is not adapted to be integrated in IDEs, and hence the ASIS implementation that we provide is not adapted either. But we found other problems with ASIS at the API level, and we wanted to take a shot at doing something that is more user-friendly anyway. So this is why we created Libadalang. Thank you for those clarifications.
Okay, so as of today, what does using Libadalang look like? First, let's start with the basic level of languages: tokens. You can ask Libadalang to parse a file, and then ask for the list of tokens that came from this file. Here we have an Ada program, a really simple one, and this is a simple usage of the API, in Python. You create a context to host your computations. You ask it to load a source file, which gives you an analysis unit. You take the root node of that unit, you ask for the list of tokens corresponding to this root node, and you print all of them. This is the result. So asking for the token stream is quite easy.
Next level: let's go to the syntactic level. This is a more complex Ada program. Here too, you take the root node of your analysis unit, and you ask: find all nodes that comply with this predicate. Here the predicate is a node type, so: find all nodes that are object declarations, and print their sloc ranges (source location ranges) and their text. And this is the result. So again, something useful.
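The two queries just described (dump the token stream, then find every object declaration) could look roughly like the sketch below. It is not the code from the slides. The names `AnalysisContext`, `get_from_file`, `root`, `tokens`, `findall`, `ObjectDecl`, `sloc_range` and `text` are taken from Libadalang's current Python binding and may differ from the version shown in the talk; the helper names and the file name `example.adb` are made up for illustration.

```python
def token_texts(unit):
    """Return the text of every token under the unit's root node."""
    return [token.text for token in unit.root.tokens]


def object_decl_report(unit, object_decl_type):
    """Return one 'sloc range: source text' line per object declaration."""
    return [
        "{}: {}".format(decl.sloc_range, decl.text)
        for decl in unit.root.findall(object_decl_type)
    ]


def main(filename="example.adb"):  # hypothetical file name
    # Imported here so the helpers above can be tried without Libadalang.
    import libadalang as lal

    ctx = lal.AnalysisContext()         # context that hosts the computations
    unit = ctx.get_from_file(filename)  # parse the file into an analysis unit
    for text in token_texts(unit):
        print(text)
    for line in object_decl_report(unit, lal.ObjectDecl):
        print(line)
```

Calling `main()` on a real Ada source file requires the Libadalang Python binding to be installed. The two helpers only rely on attribute names, so they can be exercised with simple stand-in objects.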
So, yeah, performing this kind of query is useful, for instance, for linters.
Next level, and this is getting more and more interesting. This is yet another Ada program, where we define two Double functions that are overloads: they have the same name and differ only by their signature. One of them takes an integer and returns an integer; the second one takes a float and returns a float. And there is a call to one of these Double functions. I didn't repeat it here, but we have asked Libadalang to parse this analysis unit. Then here we're trying to get the call to Double: we find all call expressions whose called function is named Double. The call is present here. Then all we have to do in Libadalang to find out which Double function is called is to get the name. The call expression also includes the arguments; if you take the name, you only have this. You ask for the referenced declaration, and you print it. Since we are calling Double with an integer, the first overload is chosen. So Libadalang finds which one is called. This is cool. Okay.
On the previous slide. Yes, sorry. On the object declarations. Previous. Previous, yes. In the second line of the result, there's only one line with two object declarations in it. Syntactically, no: in the Ada grammar, this is a single object declaration that declares two objects. That's an interesting point, actually. No, no. Just give it to me. Okay. So that's an interesting point in the Ada grammar. Syntactically, you have one object declaration node, but indeed, semantically, you have two. Since Libadalang's primary goal is to support analyzers and tools that act on syntax, we want to stay as close as possible to the syntactic representation, which is why you only have one node.
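As a rough illustration of the overload query described above (find the calls named Double, take the name part of each call, ask for the declaration it refers to), here is a sketch. It is an approximation, not the slide's code: `CallExpr`, `f_name` and `p_referenced_decl` are names from Libadalang's current Python binding and should be treated as assumptions about the version used in the talk; the helper names and `overloads.adb` are hypothetical.

```python
def calls_named(root, call_expr_type, name):
    """Return the call expressions whose called name matches `name`.

    Ada identifiers are case-insensitive, hence the lower-casing.
    """
    wanted = name.lower()
    return [
        call for call in root.findall(call_expr_type)
        if call.f_name.text.lower() == wanted
    ]


def resolve_calls(root, call_expr_type, name):
    """Pair each matching call's text with the declaration its name refers to."""
    return [
        (call.text, call.f_name.p_referenced_decl())
        for call in calls_named(root, call_expr_type, name)
    ]


def main(filename="overloads.adb"):  # hypothetical file name
    import libadalang as lal  # assumes the Python binding is installed

    ctx = lal.AnalysisContext()
    unit = ctx.get_from_file(filename)
    for call_text, decl in resolve_calls(unit.root, lal.CallExpr, "Double"):
        print(call_text, "->", decl)
```

Note that the query runs on the name node of the call, not on the whole call expression, which matches the distinction made in the talk between the call (name plus arguments) and the name alone.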
We don't modify the tree after parsing or anything like that, which is another thing that is difficult with GNAT and ASIS: because they are built around a compiler that wants to emit code, it might get rid of that representation very early on, and then you don't have access to it anymore.
What if I have a reference to C, for example, and I want to jump to the declaration? For the moment, the referenced declaration will just give you the whole object declaration. That would be enough, but from C, I have to find the declaration. Yes, absolutely. What we plan to do is to have an API that will also give you the precise identifier that you are looking for. That is not done yet, but it's not too difficult to do. Okay. Okay.
Now, this is being worked on as we speak, almost. We also want to provide a feature that enables users to actually modify the source code. Here, at the top of the slide, we have an Ada program, and this is the use of the API we intend to enable with Libadalang. Imagine we want to turn this into this. All we have to do is take the call to Put_Line and modify its argument. First, we find the node corresponding to the call. Then we start a rewriting session, because while rewriting things we want to keep the old tree available to help you do the refactoring. Then you take a kind of rewriting handle to the parameter here, and you say: let's rewrite this parameter using a string literal, this one. Then you apply, which replaces the old source code with the new one, and you're supposed to get this. So we want to provide that; it is a work in progress.
A question about that. Yes. We're trying to bootstrap Ada from a very small assembly seed. And that is part of the bootstrappable talk. And we have problems with Ada because Ada is implemented in Ada.
And we have a self-hosting problem. And I wanted to ask: is it possible to use this to do transpiling to C or something, for example? Yes. Yes. A little bit off topic, but yeah. Let's discuss that after.
What would happen if the string had a word that appeared twice in the program? For example, if another Put_Line had that word. Actually, this example is incorrect, because findall returns a list. So here we were supposed to extract which found element we want to work on. In this example, if there were multiple calls to Put_Line, we would have several results and we would have to pick which one we want to rewrite. If I can be a bit more precise: the way you find the node is not by searching for the text; you have the option to search for a precise context. For example, you can say "I want the first call", or "I want the call to this function", even if you have another function call with the same string literal. So you have a lot of granularity, because you are doing a query on the tree and not on the text itself. Here we say we want the first call expression, but you could say something else and get the node that you want to rewrite very precisely.
Okay, I'm afraid we'll run out of time. So this is an example. If you remember, in the previous slides I talked about a linter that checks the casing of your variable names. This is one possible implementation of it. It's the whole script. We just iterate through each given file name, we parse it, and then we check for parsing errors. If everything is okay, we look for all object declarations and all identifiers inside each object declaration, because there can be several of them. We check each identifier, and if it's not capitalized, we warn about it. So it's really simple. We want it to be really simple to write this kind of tool. And now I will let Raphaël talk about more usage examples of the library. Okay, I guess it's good.
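The linter script described above is not reproduced in the transcript. A minimal sketch of the same steps (parse each file, stop on parsing errors, visit every identifier of every object declaration, warn on bad casing) might look like this. The Libadalang names (`AnalysisContext`, `get_from_file`, `diagnostics`, `findall`, `ObjectDecl`, `f_ids`, `sloc_range`) come from the current Python binding; the casing rule in `is_capitalized` is one possible reading of "capitalized" and is an assumption.

```python
def is_capitalized(name):
    """True if no underscore-separated part starts with a lowercase letter.

    'My_Var' and 'Item_2' pass; 'my_var' and 'My_var' do not.
    """
    return all(not part[:1].islower() for part in name.split("_"))


def check_file(filename):
    """Print a warning for each badly cased identifier in object declarations."""
    import libadalang as lal  # assumes the Python binding is installed

    ctx = lal.AnalysisContext()
    unit = ctx.get_from_file(filename)
    if unit.diagnostics:
        # Parsing errors: report them and skip the check for this file.
        for diag in unit.diagnostics:
            print("{}: {}".format(filename, diag))
        return
    for decl in unit.root.findall(lal.ObjectDecl):
        # One object declaration can declare several identifiers.
        for ident in decl.f_ids:
            if not is_capitalized(ident.text):
                print("{}:{}: variable name {} should be capitalized".format(
                    filename, ident.sloc_range.start.line, ident.text))


def main(filenames):
    """Check every file name given, one after the other."""
    for filename in filenames:
        check_file(filename)
```

A command-line script would typically end with `main(sys.argv[1:])` after importing `sys`.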
So Pierre-Marie showed you a bit of how it's supposed to work and how you use it. I'm going to show you what we did with it so far and what we will be able to do with it in the future. I didn't realize that I was going to start with the demo. So, a little demo to start with. Yeah, I'm going to find it. Don't worry.
So far, we showed only Python code, so you might think: okay, those AdaCore guys, they do only Python. So the example I'm going to show is done in Ada. We don't do only Python; we also do Ada. What it is, is a syntax highlighter slash code browser, so basically a very small subset of the functionality that you want in an IDE. It's a command-line tool that you launch on your project; here it's the GNATcov project. It generates a hierarchy of HTML pages, and then if you click on one of the links... all right, so small... then you get highlighted code. Okay. So basically this is done with the Libadalang API. We highlight tokens in a certain fashion, but we have the tree, so we can do a bit more syntactic highlighting. For example, you can see that types are highlighted correctly, et cetera. And then you have links for the cross-references, right here. If you click on one, it will bring you, even if it resets the size, which is a bug, to the correct source and to the correct line, with the line highlighted. So this is very simple, but it can still be practical if you want to browse your sources offline. And it is shipped with Libadalang today, so you can already try it if you want. It's in the contrib directory of Libadalang. Okay, so my demo went well. I'm so happy.
Another thing we did, in Python again, is very small syntax-based analyzers. This was a fun project done by Yannick, who is not here now, but who did a presentation on SPARK. He was like: oh, we do all this really complicated static analysis based on SPARK and CodePeer, but let's do something really simple.
And this checker is doing something very fun. It looks for binary operators, and for cases where the left side and the right side are the same. Most of the time, that's an error, okay? And this is the way you express it with Libadalang. We look for every binary operator, and if it's in the list of interesting operators (so we have multiplication, addition, the concatenation operator, et cetera), then we check whether, syntactically, the left side and the right side have the same tokens. If they do, we print a warning. What is really fun is the number of problems we found with that in our code bases. Basically, we would assume that, since we run static analyzers and we have big test suites and everything, this cannot happen. It's Ada, right? It's a very safe language and everything. But, well, we had a lot of bugs in our code linked to that. So it's really interesting. It's also an example of the power you have at your fingertips when you have access to the syntactic structure of the code. You are not in the text anymore; you can browse the tree and find interesting stuff.
What we are working on right now, based on Libadalang too, is a static analyzer based on semantics. It's not a full interprocedural analyzer like, for example, CodePeer, that we have; some of you might know about it. It's less powerful, less ambitious in scope. It allows you to do intraprocedural stuff, a little bit like the Clang static analyzer. Here, for example, we have a simple example where we have a file and we open it. And I'm going to take that, too. Here, we get a line and we close the file every time in the loop, which is obviously an error. But when you write the code, you might make this kind of error. So what we want to do is to be able to warn you very early, when you write this kind of API code, and say: be careful, the file might be closed at this point. And when you close it, it might already be closed, too.
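The same-operands checker described earlier in this part of the talk can be sketched in a few lines. This is an illustration under assumptions, not the checker that was shown: `BinOp`, `f_left`, `f_op`, `f_right`, `tokens` and `text` are names from Libadalang's current Python binding, and the operator set below only contains the three operators the talk names explicitly.

```python
# Operators worth checking. The talk names multiplication, addition and
# concatenation "et cetera"; this exact set is an assumption.
INTERESTING_OPERATORS = {"*", "+", "&"}


def operand_tokens(node):
    """Return the texts of the tokens spanned by a node."""
    return [token.text for token in node.tokens]


def has_same_operands(binop):
    """True if an interesting operator has token-for-token equal operands."""
    if binop.f_op.text not in INTERESTING_OPERATORS:
        return False
    return operand_tokens(binop.f_left) == operand_tokens(binop.f_right)


def check_file(filename):
    """Print a warning for every suspicious binary operation in a file."""
    import libadalang as lal  # assumes the Python binding is installed

    ctx = lal.AnalysisContext()
    unit = ctx.get_from_file(filename)
    for binop in unit.root.findall(lal.BinOp):
        if has_same_operands(binop):
            print("{}:{}: left and right operands of '{}' are the same".format(
                filename, binop.sloc_range.start.line, binop.f_op.text))
```

Comparing token texts rather than raw source text means that differences in whitespace or line breaks between the two operands do not hide a match.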
So we are using a simple form of abstract interpretation to do that. What's interesting is that users will be able to specify their own checks for their own APIs. If you have an API that has some simple invariants like that that you want to enforce, you can add a simple checker for it. It's a work in progress done by one of our interns at AdaCore, and you can check the progress on this repository.
We also did a copy-paste detector, because we thought it was fun. Given the number of bugs we found with the static analyzers, maybe we could find a whole project duplicated at AdaCore or something like that. It didn't happen, but we found some copy-paste. It's also an example of the API of Libadalang. It's very lightweight, a few hundred lines of code, and it's pretty efficient. If you want to try to run it on your Ada code base, you can find it on our blog here. And it's in the contrib directory of Libadalang.
Inside AdaCore, we also use Libadalang for serious stuff, not only prototypes. We are in the process of changing the semantic engine of GPS, the main IDE, to use Libadalang. It's a work in progress; it should happen in the coming year. And there are also the new versions of GNATmetric, GNATstub, and GNATpp. GNATpp is a pretty-printer: it goes through your code and pretty-prints it. GNATstub generates stubs for your subprogram bodies and specs. And GNATmetric gives you some metrics about your code. All of those tools are based on ASIS for the moment and are being adapted to run on top of Libadalang.
Outside of AdaCore, we already have some people using it. Some are doing instrumentation with it for coverage. Some are doing automated refactoring to make code smaller. Some are making serializers and deserializers to JSON on top of it. So these are examples of the kind of stuff that you can do on top of Libadalang. In conclusion, if you want to check out Libadalang, literally or not, you can go to this URL.
You can try it and open issues if you find problems. The API is still a moving target until we release it as a real product, but it's very stable for some parts; some others are moving. So it depends on what you do with it, I guess. Thank you for listening. And if you have any more questions...
I have one more thing to add. If you want to know how Libadalang is implemented behind the scenes, we are doing a presentation at, what, 1 p.m. tomorrow? I think it's 2 p.m. It's 2 p.m. Anyway, look out for the Langkit presentation in the source code analysis devroom. Yes. Thank you. Thank you very much.
Questions? Yes? Does it mean that you will give up support for ASIS? So the question is: does it mean that we will give up support for ASIS? We are not going to release new versions of ASIS, so it's basically baselined. We will continue providing support for the current version of ASIS for an undetermined time, for the moment. So that's AdaControl? Oh, yeah, yeah. But don't worry, we won't leave Jean-Pierre hanging. It's not part of our plans. You also might piss off some customers. Yes, and we don't want to do that. So obviously, as long as we have requests for ASIS support, we will support ASIS. Internally, there have even been some discussions like: oh, we could rewrite the current ASIS on top of Libadalang. That is just to give you an impression of the kind of discussions that happen; we would prefer not to do that, honestly. But if we have to... It depends on customer pressure.
Back to the question about C. Oh, yeah. So you had the question, which was: could we use Libadalang to transpile to C? Anything else? The background is this: you know Ken Thompson's famous paper about backdooring compilers. Yes. And it could be that all compilers are backdoored. That's why we started the project a while back, which basically starts with us manually toggling switches on the computer and writing a small program, which is an assembler.
And then building a tower of languages until we are at GCC, which now works. But Ada doesn't work, because the Ada front end in GCC is written in Ada. So it fails. OK. We'll try to find a way to fix that.
I think there is one really big thing that is missing in Libadalang: the implementation of execution semantics. I mean, the knowledge is not there, for now at least. So there is still a huge amount of work to do, starting from Libadalang, in order to create an interpreter or a compiler on top of it. It's not really its job to translate to another language, for now. Yeah. So basically, you have a small part of the front end: you have the cross-references. If you really need the legality checks, you can use GNAT in addition to Libadalang. But then you still have a lot of stuff to do if you want to compile your code.
There exists one Ada compiler, not open source, that actually emits C and uses a full Ada runtime in standard ANSI C. But I think the charge for using it is $1 per line of source code. That is the quote I heard at some point. You might be able to get them to say it's an interesting project and not pay. I think it concerns all of us by now, because everyone has untrustworthy computers and everyone is listening to everything else. It's probably not good. Maybe they're interested.
What about the completeness of the front end? Can it parse the Ada compiler source code? Can it parse... was that the question? Can it parse the Ada compiler source code? The answer is yes. It can parse any source code that we could find; the parser doesn't fail on anything that we could find. The semantic analyzer, name resolution, still fails on some stuff, but the set of failures is getting really small. That's all we have for the moment anyway.