OK, I think I can begin now. Why am I here today? I did a master's project, a graduation project related to LLVM. The whole idea is that if someone ever takes a shot at doing the same thing, they already have a head start: they have my mistakes and my work. So the topic of my graduation project is connecting LLVM with a worst-case execution time tool, and I will explain what that means in a minute.

The talk will go as follows. First: what is this worst-case execution time, and what is worst-case execution time analysis? It is not really related to LLVM, but it is related to my project. Then: why would you connect LLVM to a worst-case execution time tool instead of building the analysis into LLVM itself? And which tool should you pick? I already made that decision; it is the SWEET tool, and I have my reasons for it. Then the question of which data and data structures from LLVM to use, which I will talk about. I also did some work on a specific architecture, the ARM Cortex-M3, which I will share, along with my experience with this approach and the whole concept in general, what I did. And then I will conclude.

So, first the worst-case execution time. It is a bit of a philosophical concept, but imagine this: I ask you to develop a software routine for the brakes of a car. You should, by design, have a guarantee that it finishes in time, because when you brake and the routine does not complete, your car crashes into a tree or something. So the worst-case execution time is the very longest time a piece of code can take, and by design you need an upper bound on it.

How would you determine this worst-case execution time? You could say: run your program with all possible inputs. Well, that is not feasible, or at least not easy, because in non-trivial code there are far too many paths to cover. For example, if you have a function taking two 32-bit integers, trying all possible values means 2^32 times 2^32, so 2^64, roughly 1.8 times 10^19 runs of the program. So in practice you test less, but then you are never really sure, because your program can be arbitrarily complex.

So there is a thing called static analysis, where you reason over the code without running it: you take the semantics of the code and try a kind of mathematical derivation to obtain a worst-case execution time bound. The theory behind this is called abstract interpretation.

I want to discuss this again in the form of a graph. Say that on the y-axis here you have the probability that a given execution time occurs for some input, and the execution time is on the x-axis. The part here with the red arrows means that you can take many, many measurements, but you are never sure: you cannot test all paths, all possible inputs. So you can miss cases where it takes longer, and you have a big problem if that happens. (A small code sketch of this measurement approach follows below.) The theory, on the other hand, can give you a bound, a worst-case execution time bound: static analysis finds a bound that is a bit higher than the real worst case, which you cannot find, but at least the bound is safe. That is what the theory gives you.

So, why connect LLVM to a tool that can compute a worst-case execution time bound, and why not just build it into LLVM? Well, first of all, that is really hard, because the theory is not easy. You basically need a PhD in computer science, or in a really specific mathematical field, to understand it.
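Before the second reason, a quick aside on the measurement picture from the graph. This is a minimal sketch of measurement-based timing; the routine and all the numbers are made up for illustration. The point is that whatever maximum you observe is only a lower bound on the true worst case, which is exactly the gap the red arrows point at:

    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <random>

    // Made-up stand-in for the routine under analysis (e.g. a brake-controller step).
    uint32_t routine(uint32_t a, uint32_t b) {
      return (a % 7 == 0) ? a * b : a + b;
    }

    int main() {
      std::mt19937 rng(42);
      std::uniform_int_distribution<uint32_t> dist;
      double observed_max_ns = 0;
      // Sample a few million of the 2^64 possible input pairs. The maximum we
      // observe is only a LOWER bound on the true WCET; exhaustive coverage is
      // infeasible, which is why static analysis derives a bound instead.
      for (long i = 0; i < 10'000'000; ++i) {
        uint32_t a = dist(rng), b = dist(rng);
        auto t0 = std::chrono::steady_clock::now();
        volatile uint32_t r = routine(a, b); // volatile: keep the call from being optimized out
        (void)r;
        auto t1 = std::chrono::steady_clock::now();
        observed_max_ns = std::max(observed_max_ns,
            std::chrono::duration<double, std::nano>(t1 - t0).count());
      }
      std::cout << "observed max (not a safe bound): " << observed_max_ns << " ns\n";
    }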
The second reason: part of the idea is that you can compute the worst-case execution time bound alongside the code you are compiling, because you have the code in data structures in LLVM. And you can use the information about the architecture that the vendors have put into LLVM for scheduling and other things; that is basically TableGen. And, yes, there exist open-source tools that can compute a worst-case execution time bound, so why not connect one to LLVM? That was the whole idea.

So the SWEET tool was my choice for this project. The name stands for SWEdish Execution Time tool. It is open source, but a somewhat awkward kind of open source, meaning you have to e-mail them to get the code. So that is a thing. It has an interface through ALF instead of through a binary. Most tools read in the binary and do control-flow-graph reconstruction, so disassembly and so on, and then they derive the semantics of the binary and do the static analysis. This tool is a bit different because you have this ALF layer, so they do not have to support all the different binary formats themselves. There are various other tools in this area besides SWEET; mostly they are research tools, often abandoned ones. And there is one proprietary tool, aiT by the AbsInt company, which dominates the market completely. So there is no widely used open-source alternative in this space.

Now, the SWEET tool has a somewhat specific way of being used. Basically, they ask you to translate your code, preserving the semantics, into this ALF language, and the analysis runs on that ALF. It can then find out which basic blocks will or will not be executed before you ever run the code, which I find really interesting.

So that is the SWEET tool. It needs two inputs. It needs ALF code for your program which is semantically the same, so the same calculations and so on, and you can also model your registers and memory in this ALF code. Next to that, it needs, for each basic block, the number of cycles it will take on your machine, because SWEET does not know that; it does its analysis on ALF. So you are expected to take caching and pipelining and such things into account yourself, and just provide a list of basic block names with cycle counts. All of this stems from the control-flow graph, which you need to have, and we have this in LLVM, because it is in a data structure.

Here is a small example of ALF code which you probably cannot read, but I suppose you can look up these slides later. It would be an addition of two registers. The first thing you notice is that it is really verbose and annoying to read. And there are lots of 32s in it, which are the bit sizes: ALF is really explicit about the bit sizes you use, because that is where the possible ranges of values in your calculation come from. The other thing you probably cannot see is that the add takes two loads from frames called r0 and r1. These frames are something you can define yourself; a frame is basically a name with a bit size. And ALF has a whole set of operators with which, they claim, you can model the instructions of your instruction set: add and so on. (A schematic sketch of what such output looks like follows below.)

So next up is my approach in the project to connect the SWEET tool with LLVM. The whole thing is to output ALF from LLVM, because SWEET consumes ALF. You might say that is easy. The thing is that this ALF code needs to be exactly semantically the same as the instructions on your processor.
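Since the slide is not reproduced here, a rough impression of the shape of that output. This is a C++ sketch of a tiny emitter for the two-register addition; the emitted text imitates ALF's style (brace-structured, explicit bit widths, loads and stores on named frames) but is schematic, not guaranteed to be valid ALF. The real grammar is in the ALF specification linked at the end of the talk:

    #include <iostream>
    #include <string>

    // A frame is basically a name plus a bit size; here everything is 32-bit.
    static std::string frameRef(const std::string &reg) {
      return "{ fref 32 " + reg + " }";
    }

    // Emit pseudo-ALF for "rd := rn + rm", the two-register addition on the slide.
    // Schematic only: do not treat this as the official ALF syntax.
    static void emitAdd(const std::string &rd, const std::string &rn,
                        const std::string &rm) {
      std::cout << "{ store " << frameRef(rd) << " with\n"
                << "  { add 32 { load 32 " << frameRef(rn) << " }\n"
                << "           { load 32 " << frameRef(rm) << " } } }\n";
    }

    int main() { emitAdd("r0", "r1", "r2"); }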
If you make a mistake there, your analysis is wrong, the worst-case execution time is wrong, and that is a really big problem.

The ALF code can be obtained from TableGen. In TableGen, for instruction selection, when you go from IR code to the backend, they use the DAG patterns, and a DAG pattern is essentially the semantic meaning of what the instruction does. So this is what I used to output ALF code for the instructions from LLVM.

Now we should probably discuss the picture below, a simplified overview of LLVM. To the left you have Clang, or the frontend, and it goes to the intermediate representation, where the architecture-independent optimizations are done, and then you go to the backend. There you have two key data structures, the machine instructions and the machine code (MC) layer. The difference is that when you disassemble assembly code or a binary, you end up at the MC layer, and there you do not have the control-flow graph, the decisions in the code, because that is really difficult to retrace when decisions are made on register values that you do not know ahead of time. The machine instructions, on the other hand, are where you do have the control-flow graph. So the idea here was to go from the machine instructions to ALF code at the very last moment, when all the optimizations are done and you can pretty much say it will be the same as what is output to the binary. This would hook in at the addPreEmitPass function of the TargetPassConfig class in your backend.

So, for the ALF code per instruction, the idea is that this semantic information about your architecture is already present in LLVM and that we reuse it to output ALF. The plan was to construct a TableGen backend, as they call it: when you interpret TableGen you get a set of records, and with these records you can do anything you want, so you have every instruction and information about each instruction. What I did in the project is generate a function that, given the machine instruction object, outputs ALF code, and this ALF is determined from the DAG pattern field of each instruction. (A rough sketch of such a backend follows at the end of this part.) The thing is that not every instruction has a DAG pattern; selection can be customized with a ComplexPattern or something. So there is also the option, and it is quite a bit of work, to write C++ code by hand that generates the ALF when no pattern is available. That is a bit annoying.

Further on, the compiler does not even care about the condition flags of a processor. On a processor you execute an instruction and then flags are set, for example when the result is negative or zero, or there is a carry or something, and processors usually make their branch decisions based on these flags. In LLVM, for the ARM instructions, this is not really modeled as individual flags, even though flags are set by each instruction. This was a bit annoying, and I basically had to assume that these four flags exist, which is not target-independent at all.

So what I tried to do is, for the ARM Cortex-M3, get the worst-case execution time analysis going and generate all this ALF code from TableGen, as far as I could get. This turned out to be a lot of work, because there are a lot of instructions, and lots of them do not have DAG patterns defined, so I had to write C++ code by hand to output ALF, which was annoying. And I mentioned earlier that SWEET also needs the worst-case cycle count for each basic block, which I produced as well.
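To make the TableGen backend idea a bit more concrete, here is a rough sketch of what such a backend could look like. RecordKeeper, getAllDerivedDefinitions and getValueAsListInit are the real TableGen APIs as I know them; the EmitALFWriter name and the placeholder translation are mine:

    #include "llvm/Support/raw_ostream.h"
    #include "llvm/TableGen/Record.h"
    using namespace llvm;

    // Sketch of a TableGen backend: walk all instruction records and, where a
    // DAG pattern exists, hand it to a translator (not shown) that turns the
    // pattern into ALF-emitting C++. Instructions without a pattern are the
    // hand-written fallback cases mentioned above.
    static void EmitALFWriter(RecordKeeper &Records, raw_ostream &OS) {
      OS << "// Generated: ALF output per MachineInstr opcode.\n";
      for (Record *Inst : Records.getAllDerivedDefinitions("Instruction")) {
        ListInit *Pattern = Inst->getValueAsListInit("Pattern");
        if (Pattern->empty()) {
          OS << "// " << Inst->getName() << ": no DAG pattern, needs hand-written C++\n";
          continue;
        }
        OS << "// " << Inst->getName() << ": translate " << Pattern->getAsString() << "\n";
      }
    }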
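And for those per-block cycle counts, a minimal sketch of how the machine-level control-flow graph gives you exactly the list SWEET wants. The LLVM types are real; worstCaseCycles is a hypothetical lookup that you would back with data-sheet timings:

    #include "llvm/CodeGen/MachineBasicBlock.h"
    #include "llvm/CodeGen/MachineFunction.h"
    #include "llvm/CodeGen/MachineInstr.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    // Hypothetical per-instruction worst-case latency, e.g. taken from the
    // Cortex-M3 data sheet (branch prediction forces the pessimistic number).
    unsigned worstCaseCycles(const MachineInstr &MI);

    // Emit the second SWEET input: one (basic block name, cycle count) pair
    // per machine basic block. The CFG is already in LLVM's data structures,
    // so this is a plain walk over the function.
    void emitBlockCycleCounts(const MachineFunction &MF, raw_ostream &OS) {
      for (const MachineBasicBlock &MBB : MF) {
        unsigned Cycles = 0;
        for (const MachineInstr &MI : MBB)
          Cycles += worstCaseCycles(MI);
        OS << MF.getName() << ":" << MBB.getNumber() << " " << Cycles << "\n";
      }
    }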
The processor I selected has some branch prediction, so I was forced to take the worst-case timing listed in the data sheet. But it works, and it is probably safe.

The thing is that you can get an analysis out of the SWEET tool, but you are not sure whether the ALF you obtained from LLVM is correct, whether it really is the same. So I had to test against a simulator, where I basically checked the register values. It did work, but it is really annoying, because you compile a whole C program, you get some arbitrary code that is selected and scheduled, and at the end you have to run the ALF code through the SWEET tool and the binary through the simulator and compare. It is not easy to select one specific instruction and generate ALF for just that. So those were two major issues.

Furthermore, there are two things I discovered that are good reasons not to do this project this way again, which is another reason why I am here. I made the assumption at the very beginning that all the code you compile is available in LLVM, and that for each instruction you know the semantic meaning and can obtain it from LLVM. This is not true. What happens if you have a processor that does not have, for example, floating-point instructions? LLVM does not have those instructions defined in the code; instead it emits a call to a function from libgcc or the compiler-rt library, for the multiply or the division for example, I believe. This code is not available inside LLVM, and I cannot make ALF code for it. This is really annoying, a big showstopper. Furthermore, when you define globals in LLVM, the position of these globals is only determined later, in the linker, so the compiler does not know it either. That is another big problem.

So let me conclude with a small summary. I tried to add worst-case execution time analysis to LLVM with the SWEET tool. For this I needed to output ALF code which is semantically the same as the instructions you are compiling, and I used TableGen for this, to generate code that you can then use in your backend. I managed to get it working on the Cortex-M3 for some programs that do not have, for example, floating-point calculations. I found out that I was required to know the condition flags, negative, zero and so on, and they are not in TableGen, because the compiler apparently does not care about them. Bottom line: LLVM does not have all the information required to do this for general, non-trivial code, because parts of the functions live in libgcc. That is really annoying.

This could probably be improved in several ways. The TableGen backend I made only looks at fixed patterns, so a set of an add and so on, not arbitrarily nested DAG patterns. I do not consider delay slots of instructions, where the effect happens a few cycles later; the ARM did not have them, so that was not in my project. I could not finish the whole M3 instruction set. And you would need to specifically select one libgcc version and then hand-write all the ALF code for it: for the operations your processor does not have, you would have to supply additional ALF code that is semantically the same.

Some links, if you want to give this project a go or continue it. First is the code; I put it on GitHub. I also made a syntax-highlighting file for Vim, because there was not one available.
And I do not think there are that many users anyway: there were only 25 users on the SWEET mailing list, so the tool does not seem widely known. Then there is the SWEET homepage; if you just google "sweet" you will not get to this tool, because of all the other meanings. And there is the specification document of the ALF language. OK, that is it, thanks for listening, and now it is time for questions. Anybody got any questions?

Question: If I understood correctly, because you do not have the condition flags available, you assume a very specific set and always use that. Does that not mean you completely eliminated certain code paths from the code, which makes your WCET analysis entirely wrong?

OK, so what he asked is this: I assumed there to be only four condition flags, negative, zero, carry and overflow, and you mentioned that this would make the analysis wrong. Well, it would not necessarily make it wrong, but it would make it valid only for processors that have exactly these four condition flags, so the specific processor I used. I suppose it could be adjusted, with some kind of mechanism in TableGen that would make it possible to define more status flags there, but I did not look into this much. Did I answer your question?

Question: I think there is some misunderstanding here. When you say you assume the flag, do you assume a specific value for the flag, or did you consider both possible values?

I assume the existence of these flags. For example, these are all the patterns I considered for each instruction in TableGen, patterns like a set of a register with an add or something, and from such an instruction I set a specific flag, and I do not consider any other flags. If you have a really weird processor that, based on the bit size, say for a 32-bit operation, keeps separate flags for the lower 16 bits and the higher 16 bits, that is not modeled, because this is really specific to only these four flags. It is a bit hard to explain. Any other questions?
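To spell out the assumption discussed in that answer: the four flags, as they behave for a plain 32-bit addition. This is the standard NZCV behaviour (ARM-style); the struct and function here are made-up illustration, not code from the project:

    #include <cstdint>

    // The four condition flags assumed in the project: Negative, Zero, Carry, oVerflow.
    struct Flags { bool N, Z, C, V; };

    Flags flagsAfterAdd(uint32_t a, uint32_t b) {
      uint32_t r = a + b;
      Flags f;
      f.N = (int32_t)r < 0;              // negative: top bit of the result
      f.Z = r == 0;                      // zero: result is all zeros
      f.C = r < a;                       // carry: unsigned overflow occurred
      f.V = (~(a ^ b) & (a ^ r)) >> 31;  // overflow: signed overflow occurred
      return f;
    }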
Question: Did you consider compiling compiler-rt to bitcode, so you could basically use that as a quick way of getting the floating-point routines?

Oh no, not at all. You mentioned compiler-rt and compiling it to bitcode; is this IR bitcode, something like that?

Question: Yes, then you could link that with your program, and you would have all of the compiler-rt routines that are needed for floating point available to you, to be translated.

Well, I need the semantic information of each function that is defined in compiler-rt; I need to transform that to ALF. That is what I would need to get it working. I might have a look at it.

Question: You have some general C code, and the compiler outputs intermediate representation that you then translate to ALF. Why can you not compile compiler-rt and then translate it to ALF?

Oh yeah, I did not think of that. Good one. I could either compile it and use it at the IR level, or compile it to ALF and kind of link it in SWEET. OK, any more questions?

Question: The other issue you mentioned, the linking issue: if you did an LTO build, would that fix the linking issue? Or are there symbols the linker is actually adding that LTO would not solve? Is it just the actual addresses you need, or is it something else?

No; for all the code that goes through the compiler you need to have equivalent ALF code, and the thing was that for some instructions that you cannot do on your processor, LLVM just inserts a call to a function that is linked in later, after the compiler is done, and you do not have this code available in LLVM. For example, a function that does the floating-point addition, but in software, with different instructions; this was not available in LLVM at all. That was the issue.

Question: Oh wait, then I think I was misunderstood. What I was asking about is an LTO build, a link-time optimization build; does that solve the problem?

Well, I am not familiar with link-time optimization.

Question: With link-time optimization you basically delay the machine code generation: you create LLVM IR but do not generate machine code yet. You have all these object files that actually contain LLVM IR; you then ask LLVM to link all of this together, so you get one huge LLVM module, and then you generate machine code.

Oh yeah, that would solve the problem nicely; then LLVM would know the location of all variables, instead of the linker.

Question: The code generation still happens before the linker lays out any of the addresses, so you would not know the final addresses, but you would know all the globals that exist; you just would not necessarily know where they are.

Yeah, that is exactly the problem.

Question: Basically, the compile time may double.

Oh, OK, but in this case I would not care, I suppose.

Question: What is your opinion about using this with caches?

So, my opinion about using this with caches: caches are a big problem for worst-case execution time. Usually, when I encounter these tools, they either have something really complex, for example aiT has, I believe, some cache analysis, but cache is a really big problem: you basically do not know what is going to happen, so they always assume a miss. The SWEET tool does not do any cache analysis, so I would have to do it myself, and I did not.

I think we have to stop there. Yeah, OK.
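As a closing illustration of the libgcc/compiler-rt showstopper that came up twice above. The function is made up, but __mulsf3 and __aeabi_fmul are the standard names of the runtime routines involved:

    // A float multiply in the source...
    float scale(float x, float y) { return x * y; }

    // ...on a target without an FPU does not become a multiply instruction.
    // The compiler instead emits a call to a runtime routine from libgcc or
    // compiler-rt, e.g. __mulsf3 (generic libgcc) or __aeabi_fmul (ARM EABI).
    // The body of that routine never passes through this compilation, so no
    // ALF can be generated for it without extra work: hand-written ALF, or
    // the compile-compiler-rt-to-bitcode / LTO ideas from the discussion.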