All right. Hello everybody. My name is Nicolai Hähnle. I work at AMD on the GPU side of the company, where we develop the AMDGPU target in LLVM, which is upstream and which we use to compile compute kernels and graphics shaders directly into binaries that run on the GPU. But this talk is not about that. This talk is about TableGen, the domain-specific language that is used in part to build LLVM itself.
My own personal history with TableGen is probably fairly standard. I started working on the back end, and at some point you just have to use it. I copied and pasted stuff. Then last year there was something I worked on in our back end that really ran into the limitations of the TableGen front end: weird errors, crashes, and so on. So I dug in more deeply, and in part this talk is my attempt to spread some of the knowledge that I gained while looking deeply into TableGen and fixing some of the problems that I encountered.
So the agenda is a brief overview of what TableGen is and, very roughly, where it is used, and then a real look at the features of the TableGen language. What can you do with it? Maybe you'll learn about some interesting patterns that you can use in your own TableGen sources. And, time permitting, at the end I want to take a deeper dive into one of the more advanced uses of TableGen language features in our back end. The nature of this talk is such that you may have questions that it makes sense to ask right then and there, so don't hesitate to do that.
All right, so what is TableGen? TableGen is a tool, llvm-tblgen, into which you feed TableGen source, like the example that you see on the left-hand side, and it spits out automatically generated C++ code that you then include in your handwritten C++ code.
It is used for all sorts of things in back ends: machine instruction descriptions, register descriptions, scheduling properties of the microarchitectures that you're targeting. It automatically generates bytecode for instruction selection, assembly parsers and emitters, disassemblers, etc. And it does all of this based on the kind of source that you see in this example, which is just the definition of four machine instructions that we have in our back end: a bitwise AND and a bitwise OR, in 32 and 64 bits.
I said it's one tool, llvm-tblgen. That's a bit of a lie. I actually know of two TableGen tools, one in LLVM and one in Clang. They use the same front end, which is just a library that is a reusable part of LLVM, and they have different back ends for different purposes. In general, the flow is that TableGen reads your sources, the front end parses them, does some evaluation, and produces a big list of record definitions, which I'll show you later. The back ends don't care about the original source anymore. They just look at this set of records, maybe filter them down to only the records they are interested in, which depends on the back end, interpret what is in those records, and use that to generate the purpose-specific C++.
You specify the back end that you want to use on the TableGen command line, and if you don't specify any, it'll just dump all the record definitions. That's extremely verbose, but it's also extremely helpful when you want to debug some gnarly problem in your TableGen source. So if there is only one lesson to take away from this talk, it should be this: if you're running into problems with TableGen, don't be afraid to just dump all the records and use regex searches on the output to figure out in detail what's going on.
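To make the advice about dumping records concrete, here is a minimal sketch of a standalone source file. The file name `demo.td`, the `Insn` class, and its fields are made up for illustration; they are not taken from the AMDGPU back end. Running `llvm-tblgen` on such a file without selecting a back end should print the fully expanded records.

```tablegen
// demo.td (hypothetical file name)
// Dump the expanded records with:  llvm-tblgen demo.td
// Printing records is the default action when no back end is selected.

// Hypothetical class: a template for records with two fields.
class Insn<string mnemonic, int width> {
  string Mnemonic = mnemonic;
  int Width = width;
}

// Four records, mirroring the AND/OR in 32 and 64 bits from the talk.
def AND32 : Insn<"and", 32>;
def AND64 : Insn<"and", 64>;
def OR32  : Insn<"or", 32>;
def OR64  : Insn<"or", 64>;
```

On a real target the dump is very large, so piping it into a pager or a regex search for the record name is usually the practical way to read it.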
But usually you don't invoke TableGen yourself. You let CMake do it; it's all integrated, and usually you don't have to worry about it too much. There's a snippet on the slide of how it is integrated, as an example. The one lesson you should take away from this is that there is a setting, LLVM_OPTIMIZED_TABLEGEN, which you should absolutely use in debug builds, because even if you do a debug build of LLVM, you usually don't want a debug build of TableGen, right? So optimize that; it helps your build times.
Okay, so records. What are records? They're basically just key-value dictionaries, usually with a name. On the right-hand side, at the top, you see the little snippet that I showed two slides earlier, and this snippet actually gets expanded into this big record definition, of which this is only a small excerpt. The way this expansion happens is that TableGen has a notion of classes, which are basically templates for records. At the top you see S_AND_B32 : SOP2_32. SOP2_32 is a TableGen class defined in our back end's TableGen sources, which in turn derives from other classes, and once all the transitive inheritance is done you get this big record.
At the top, in the double-slash comment, the record dump prints a list of all the classes that were transitively inherited. Most of these are target specific, but in particular there is a class called Instruction. Instruction is a target-independent class that all the back ends interested in machine instructions use to filter the record definitions: they look at all the records that directly or indirectly inherit from Instruction, and then they work with those. And like I said, records are usually named, so in this case we have a name that actually appears as an enum in C++ to refer to that machine instruction.
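The way a short definition expands through transitive inheritance can be sketched as follows. All class names ending in `Sketch` and all fields here are simplified stand-ins invented for illustration; the real `Instruction` and `SOP2_32` classes have many more fields and different template arguments.

```tablegen
// Hypothetical, heavily simplified stand-in for the target-independent
// Instruction class that back ends filter on.
class InstructionSketch {
  string AsmString = "";
  bit isCommutable = 0;
}

// Target-specific layer: fills in a field inherited from the base class.
class SOP2Sketch<string opName> : InstructionSketch {
  let AsmString = opName # " $sdst, $src0, $src1";
}

// One more layer, so the final record inherits transitively.
class SOP2_32Sketch<string opName> : SOP2Sketch<opName> {
  int OperandBits = 32;
}

// The short definition that expands into a record with all fields above.
def S_AND_B32 : SOP2_32Sketch<"s_and_b32"> {
  let isCommutable = 1;
}
```

Dumping this file should show a single `S_AND_B32` record that carries `AsmString`, `isCommutable`, and `OperandBits`, with the inherited classes listed in the comment next to the record name.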
Not all of them have to be named. Standalone instruction selection patterns, for example, don't need a name, although giving them one can be helpful for debugging.
So TableGen is not just a tool, it's also this domain-specific language. It has a core language that is used to write down record definitions, and a larger set of features around it that let you generate regular sets of records fairly easily. As you saw in the example, we have a scalar AND and a scalar OR, and obviously these instructions are going to be very similar, so we have tools to generate these big records that are mostly the same in an easy way.
This is a sketch of the kind of TableGen source files that live in LLVM and how they include each other. The main bulk is typically in the targets' back-end definitions. You usually have one top-level file that includes everything else, because you have textual inclusion but not really include guards (there were some additions fairly recently, but that's the way things work). You include all the various files in your back end that define instructions, scheduling, etc., and you include the target-independent stuff. The other big chunk is the intrinsic definitions, and there are some random other things as well.
All right, so much for the very brief overview. Now I want to go into the TableGen language features, and there is a list of them that I want to cover. First, a very brief look at the type system. It's fairly standard, although a bit quirky. You have bit and integer types; the integers are 64-bit only. You have bit sequences. You can cast between them. You can slice bit sequences. There is a string type.
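The types mentioned so far can be sketched in a single record. The record name and field names are hypothetical; the slice is written as an explicit list of bit positions, which avoids depending on which range spelling a given TableGen version prefers.

```tablegen
// Hypothetical record exercising the basic types.
def TypeSketch {
  bit Flag = 1;                          // a single bit
  int Count = 42;                        // integers are 64-bit
  bits<8> Byte = 0b10100101;             // a fixed-width bit sequence
  bits<4> LowNibble = Byte{3, 2, 1, 0};  // slice: the low four bits of Byte
  string Name = "type sketch";           // a string
}
```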
There is also a code type, which is a bit strange, but there is this notation with square and curly brackets as a convenient way to put C++ fragments in your TableGen source, which then get pasted into some larger auto-generated function. Really, you don't need a separate type for that, but whatever. There is a list type, which is just a homogeneous list. TableGen does some type inference, but the front end is basically one single pass through the source, so there is no type inference pass or anything. So there are some cases where you need to help out and put the type in angle brackets, as the second example shows. You don't need that often, but sometimes you do. You can index into lists, but only with literal constants at the moment. And then there are some others, unset, DAG, and class or record types, that I will explain in the next few slides.
Okay, so unset values are an interesting thing, because of course usually all the values in your records should be defined, but sometimes they're not. At the top right you see an example of how values can end up unset. Both of them are unset in the same way, one explicitly with the question mark, the other implicitly. There is one nice application, having to do with instruction encoding, for leaving values unset on purpose. What you see on the left-hand side is a short extract of the TableGen source of our back end, which defines one encoding type. It's a 32-bit instruction, hence Enc32, which has this Inst variable, a placeholder for the 32 bits of the instruction, and then there is an encoding called VINTRP which defines the fields of the encoding.
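The idea of leaving encoding fields unset on purpose can be sketched like this. It is only loosely modeled on the slide: the class names ending in `Sketch`, the field layout, the filler bits, and the constant `0x32` are assumptions for illustration, not a copy of the AMDGPU sources. The ranges use the `hi-lo` spelling; recent TableGen releases also document a `...` form.

```tablegen
// Hypothetical base class: a placeholder for the 32 instruction bits.
class Enc32Sketch {
  bits<32> Inst;
}

// Hypothetical encoding class. The opcode arrives as a template argument;
// the operand fields are left unset so that later stages can fill them in.
class VINTRPSketch<bits<2> op> : Enc32Sketch {
  bits<8> vdst;             // implicitly unset
  bits<8> vsrc = ?;         // explicitly unset, same effect

  let Inst{7-0}   = vsrc;   // bits that refer to an unset field
  let Inst{15-8}  = 0;      // assumed filler bits for this sketch
  let Inst{17-16} = op;     // bits filled in from the template argument
  let Inst{25-18} = vdst;
  let Inst{31-26} = 0x32;   // constant bits that identify the encoding
}

def V_INTERP_SKETCH : VINTRPSketch<0>;
```

In the record dump, the `Inst` field of `V_INTERP_SKETCH` should contain concrete bits for the constant and the opcode, while the remaining entries still name individual bits of `vdst` and `vsrc`.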
You see that bits 31 to 26 are assigned a constant value; that's what defines this encoding class. The other bits out of these 32 are parceled out to variables that are defined at the top, like vdst, which is just undefined, or op, the opcode, which is actually passed in as a kind of template argument to this VINTRP class. On the right-hand side you see the record for one of the machine instructions that uses this encoding, and you see that the Inst field with its 32 bits has now been expanded, so there should be eight entries in each row. You see, for example, in blue, that the two bits that correspond to the opcode have been filled in with 0, 0, but a lot of the other bits still refer to fields like vdst, vsrc, etc., which are just unset bit sequences. The whole machinery for instruction encoding and disassembly is built on these relations: which fields define registers and so on, and where they sit, is read out of this. It's also tied to the operands: you see there the dag OutOperandList, for example, which mentions the name vdst, and that matches the variable name vdst to tie the representation of operands in your machine instruction to the encoding.
Then there is the DAG type. It's called DAG, as in directed acyclic graph, because it's used in instruction selection, but I think the best way to think about it is as a kind of S-expression, where you have an operator and then a list of arguments, except that each argument can optionally also be assigned a name. It's a convenient way of having heterogeneous nested structures, and like I said, it's mostly used for instruction selection patterns. There is an example down here, also from our back end. The first row describes a pattern that is looked for in the selection DAG, and the row below is the machine instruction that should be generated for that pattern. In the top row you see the inner thing is a
bitwise XOR of something that is named src0 with a constant. This is then interpreted as a 16-bit float and converted to a 32-bit float. It so happens that if you XOR that constant into a 16-bit float, it flips the sign bit. Many instructions have the ability, aside from whatever else they're doing, to flip the sign bit of a source, so we just replace the pattern by this conversion instruction with the modifier that says negate the source. Even though the name is DAG, it really represents a tree more than a DAG; that's why I said the name is a bit misleading. You can't really express a DAG in the source language with this syntax.
And then of course there are classes. As I already said, classes are basically templates for records. They have inheritance, although I don't actually have an example of that here on the slide, but the syntax is basically the same as in C++. On the right-hand side, in the example above, you see some source that's not taken from anything real, just a random example, and at the bottom you have what it expands to. There are two records that are explicitly defined. My record derives from both of the classes that are defined. I think the main thing to point out here is an interesting feature in the other record, which has this B<3>: that is actually an anonymous instantiation of the class. So you see here this anonymous_0 record, which is generated automatically, which is quite a useful feature. The other thing that's interesting to notice is that every class has an implicit template argument called NAME, which is replaced by the name of the record that is being instantiated.
All right, so when you know about classes and records, it's very tempting to try to define all the variables in your records as functions of class template arguments. This works, but it tends to lead to a design
of your TableGen classes where they have lots and lots of template arguments, and it becomes a bit of a mess. For these things it's better to use let statements. Let statements are a way to override values that are defined in the classes that you inherit from, for example. One very interesting thing about this is that expressions are actually evaluated late. On the left is an example source again: the class A has a template argument P, which is assigned to a variable X, and then X is assigned to Y. This assignment doesn't happen immediately; it's delayed as late as possible. That means that below, when you instantiate A<2>, for example, and then say let X = 17, in the fully evaluated instantiation on the right-hand side you see that both X and Y get the value 17, rather than having the value 2 that would be implied by the template parameter that you passed in. It's important to remember that the let statement is not the one that you know from, I don't know, Rust or functional languages. It doesn't define a new variable; it just allows you to override an existing one.
So classes, I said, are basically templates for records. Multiclasses are templates for sets of records. Maybe it's best to look at an example. There on the bottom left is part of a definition of intrinsics. We have a defm, which is used to instantiate multiclasses. It is given a name, which is a kind of base name for the intrinsics, and it instantiates the multiclass that is defined above. This multiclass defines three records with names _x, _y, _z, which are concatenated with the base name, and each of those inherits from some helper class that we defined, which in turn derives from the target-independent Intrinsic class, which means that we're defining intrinsics here. So we're defining three records at once. Like classes, multiclasses have this implicit NAME argument, so by default the names that are instantiated are just like
the base name that comes from the defm concatenated with the name that the record has inside the multiclass. But you can play around with this, as the example on the right-hand side shows. If you look at the REC3 there, the REC3 is actually prefixed to the base name. The rule is that if you explicitly mention the NAME template argument, that overrides the default concatenation. That can be useful in some ISAs where you have instruction families in which some of the instructions are prefixed with something.
There are some interesting corner cases. Interestingly, multiclasses also support inheritance from other multiclasses, which is really basically the same as just putting a defm of the base multiclass inside, as I've shown there. While preparing the slides I noticed that that's not entirely true, and maybe that should be changed. Also, the defm instantiates a multiclass, right, but it can actually also inherit from classes, which can be useful for tagging the records that you're instantiating with something like instruction things and stuff like that. But it cannot have a body, so if you want to override any variable that is defined in the records that you instantiate, you need to use global let statements.
Okay, so multiclasses are one way of generating many regular records. There is another way of doing that, which is foreach. Foreach is a for-each loop: you loop over a list or a range of integers, so there is nothing too special about it. In case I haven't mentioned it: here in the example at the top, the hash is a concatenation operator, string concatenation, which is just good to know, and there is also this !add, which is a built-in function to add integers. This is just part of the encoding of the register in the ISA. One interesting thing about foreach is that you can abuse it as an if statement. This is an example from our back end as well, so we have a family of
instructions that we just call VOP1 instructions, and they have multiple incarnations. They have the basic e32 VOP1, there is an extended 64-bit encoding, and there's something that has a feature called SDWA, for all of them. But then for some of them there is also an incarnation that uses a feature called DPP. Now, if you have a class of instructions that is regular in that way except for this one thing, you have multiple options for realizing that in TableGen. One way would be to just have different multiclasses: a multiclass for VOP1 with and a multiclass for VOP1 without. The problem with this kind of approach is that you can easily get a combinatorial explosion of multiclasses. So what we do here instead is basically this if statement. We have this notion of a VOP profile that is passed as a template argument to the multiclass, and it has a bit that tells us whether the instruction should have DPP or not. And then there is an interesting pattern here, which is basically using TableGen classes as functions. This BoolToList class takes the bit value as an argument, and it has a variable called ret, which is the return value of the function that we're defining. If the bit is zero, we return an empty list, which means that the foreach will do nothing; otherwise we return a one-element list and the foreach will do something. That's a pattern that we use quite extensively, and it's quite useful.
Foreach versus multiclass: both of them serve similar purposes. I would say that multiclass is more idiomatic for TableGen and often easier to reuse, but foreach has some programmability advantages. So both have their place; use them as it makes sense.
Another kind of niche feature is the defset, which, as the example on the top right shows, allows you to capture all the records that are defined inside the outer curly braces into a list that you can then later reference and use in a foreach
for example, to iterate over them. We use this in AMDGPU to define a generic table for the intrinsics that are up there. Generic searchable tables are one back end that is a fairly versatile way of exporting data out of TableGen without writing your own custom back end. What's happening here is that we just define a generic table, and you really should think of it as a database table, which is called resource intrinsics. The fields are listed there, and the rows of the table come from records that are derived from the given filter class. The searchable tables back end will give you a function to which you pass a key of your choice, and it will search the table and produce the corresponding struct if it exists. You can actually also define multiple keys over generic tables, which is nice for mapping between two things in both directions. So here we use the foreach, based on lists that we previously captured, to define the rows of this table.
Now, again, defset is really a niche feature, and even though I added it myself, I have to say that it's not very idiomatic for TableGen. In most cases you'll probably want to use multiclasses instead, or possibly heterogeneous multiclasses, where you define maybe some machine instructions and then some other class, maybe for those tables, at the same time. But in this particular case something interesting happens, because the intrinsics are defined in target-independent TableGen sources, which are included by all targets, and they have to be defined there because it's a global enum that is the same for all targets; it's not target specific. If we were to define the table in the same place, then every target would get this table, which is really specific to our back end. That doesn't make sense. So instead we do the separation using the defset feature.
We already saw one built-in function, this !add. There are a whole bunch of
others. The most interesting ones are probably the casts. You are actually able to cast between strings and records: if you cast a record to a string, it gives you the name, and if you cast a string to a record, it looks for a record of that name and gives it back to you, if it has been defined previously in the source file. Then there is !foreach, which is basically what would be called a map function in Haskell or something, and there is a left fold, which is also nice to have.
All right, so this is what we went over; we went over most of the stuff. Now, I said we could maybe do a deep dive into some more advanced application of TableGen if we have the time. We don't have that much time anymore, so I wonder which part might be best. We have this very complex family of intrinsics for image operations in our back end. There is a lot of orthogonality here. Part of these image instructions is the dimensionality of the image, whether it's 1D, 2D array, 2D, and the arguments of the intrinsic depend on that dimensionality, of course: it's either only an s coordinate, or an s and a t and a slice in the array, or just an s and a t, etc. You may have this .d, which stands for partial derivatives, in which case you have all these green ds/dh, dt/dh, which are partial derivatives of the coordinates with respect to screen coordinates. We want to define this really large number of intrinsics in some way, and we do all of this in TableGen.
So you get code like this. At the bottom left you see that it defines the notion of a 2D array dimensionality of an image. It gives it a name, 2darray, and says there are two coordinates that are relevant for derivatives, namely s and t, and one coordinate which isn't. Then, in the class that we define above, we define a variable, the gradient args, which is the list of gradient arguments for intrinsics that need them, and which is defined basically using function calls. We first iterate
over the given coordinate names and use string concatenation to define these names, dsdh etc., and then we pass that to the make argument list class defined above, which really has the role of a function, and it produces what is there on the right in the box. So what is it that we have on the right, and why do we want it? Well, when we define intrinsics, we have to define the types of the arguments of the intrinsics. We can give a type explicitly, such as a 32-bit float, but we actually want to be able to pass not just 32-bit floats but also half-precision floats. So we need an any-float type for the dsdh argument, and the arguments that come after it should be constrained by the IR verifier to be of the same type. That's what LLVMMatchType is for. What this make argument list function does is precisely generate an array of this form, by taking the first name, combining it with the base type, and then doing a mapping which maps all the names to this AMDGPUArg pair of match type and argument name.
Any questions about this? This is a lot. Those are things that we don't use for much internally. They're convenient for debugging, and I have plans for other things that we can do with them. Other questions about this? I will upload the slides, of course, so you can look at it later.
We do more stuff like this. For example, if you have two of these arrays of arguments and all of them use LLVMMatchType, you need to adjust the index that the match type refers to. That's what the stuff on that slide is about, but we don't really have the time to look at it in detail.
All right, so I just want to close with some final thoughts. First a brief look at what we covered: a brief overview of TableGen, a quick run through pretty much all the features of the front-end language, with the built-in functions not in detail, and a brief example of what you can use it for. Some things to
keep in mind: multiclass versus foreach, and calling TableGen directly to dump all the records if you have problems. These are the kind of things that I think you should take home.
Some possible things to improve in TableGen in the future. I'm not going to work on them, because priorities just aren't like that, but there are still some cleanups that could be done in the type system to smooth out some of the corner cases. It would be convenient to have the hash operator work not just for strings but also for lists and DAGs. There's the thing with multiclass inheritance that I mentioned. And then, back ends. Last year, when I started diving deep into TableGen, my problem was that I always ran into crashes and errors in the front end. At least for me, this doesn't happen anymore; if it happens for you, talk to me. But now the main problem is that I do something with selection DAG, the ISel patterns, and I just get weird error messages that are super difficult to understand, at least for me. I think there are also some things in the feature set there that could maybe be more orthogonal, but that would probably be another big project and not for this talk. So with that, thank you very much.
You mentioned dumping, like invoking TableGen itself. Given a .td file, you can just invoke TableGen on it to dump the preprocessed record definitions. Is there then a way to go from that to the C++ that gets emitted?
Okay, so the question is: I said you can run TableGen itself to dump the records; can you then go from there to the generated include files? Of course you can get the generated include files directly by just invoking the back end. But if you want to edit the stuff that's in between, that's currently not possible. You could just cut out the record definitions and feed them
back into TableGen as input, but I think there are two problems that you'd run into. One problem is that you have to define a variable before you refer to it, and I don't think the dump makes sure to preserve that order; I think it just dumps alphabetically. The other is that the back ends filter the records based on the classes that they inherit from, and the way the records are currently dumped, you get the inheritance information in a comment and not as actual inheritance in the TableGen language. So I think those would be the two major hurdles to enabling that.
Thank you. And for the back end?
Yeah, so obviously you can generate other stuff than C++ from a back end, and in fact somebody not so long ago added a back end that dumps the records in JSON format, so you could then use some other tool. You could generate other things than C++ if you wanted to; in C++ you can do whatever.
You also explained how to use classes to actually implement something like function calls. Wouldn't it be better to just have a function call syntax rather than this more obfuscated way?
Right, so the question was: I showed how you can kind of abuse classes as functions, and wouldn't it be better to have a dedicated syntax for that? I thought about this myself back when I did all this work. I didn't find a solution that I was completely convinced of, but it's certainly something to think about. Maybe keep them as classes but allow invoking them with the exclamation mark syntax, to make it at least a little bit more approachable.
I'm wondering, because every so often I just have to have a brief look at TableGen files, and it always takes me a very long time to understand what this actually means. I'm just wondering if that's one of the very small tweaks that helps.
If you just read the stuff, it might help the understandability. But to have a dedicated function definition syntax, it's a thought
that I also had, but because this pattern isn't used that often today, it would have to be something that's very intuitive but also fits well with the other stuff in TableGen.
Do you think there might be scope for documentation? I know there is a TableGen manual, but it's pretty much a reference manual. Do you think you are going to contribute some of this to TableGen?
Yeah, so the question is whether there is scope to add better documentation or a tutorial to LLVM. I wish I had the time. I started a series of blog posts on my blog, where the idea was that maybe that could one day lead to something like that, but it's just a problem of time.
Is TableGen Turing complete yet?
Not intentionally. I don't know. There are some places, like with these function calls and everything, where the question naturally arises, but at least last year, whenever I was at a point where I thought, well, maybe it would be nice to have arbitrary recursion or something in there, I found a way of not adding it. So I don't think it is, but maybe you will prove me wrong. There is a very limited scope for self-references: when you define a record, you can define a variable to which you assign a cast from the name of the record, and then the record will contain a reference to itself. That is the only way, and stuff like that was actually being used in existing TableGen when I started looking this deep.
Yeah, I mean, go ahead. Next post, then, you present: TableGen is Turing complete.
Yeah, I don't think so, but I would not be super surprised if it were. Other questions?
Because I was lazy, and at the time that I first thought about it there was like one place where I wanted to use it. By now I think there are a few more places where it is used, so maybe. Oh, sorry, the question was: I showed how to abuse foreach as an if statement, and maybe there should be an actual if statement in the TableGen front-end language. For textual conditional
parsing? I don't think that would... No, that's very far away from the kind of thing that I showed.
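The foreach-as-if pattern that this last exchange refers to can be sketched as follows. `BoolToList` and its `ret` field follow the description in the talk; the `Profile`, `Inst`, and `VOP1Sketch` names, their fields, and the mnemonics are hypothetical simplifications, not the actual AMDGPU definitions.

```tablegen
// A class used as a function: ret is the "return value".
// A set bit yields a one-element list, a clear bit yields an empty list.
class BoolToList<bit Value> {
  list<int> ret = !if(Value, [1]<int>, []<int>);
}

// Hypothetical stand-in for a VOP profile with a single feature bit.
class Profile<bit hasDPP> {
  bit HasExtDPP = hasDPP;
}

// Hypothetical stand-in for an instruction class.
class Inst<string mnemonic> {
  string Mnemonic = mnemonic;
}

multiclass VOP1Sketch<string mnemonic, Profile P> {
  def _e32 : Inst<mnemonic # "_e32">;
  def _e64 : Inst<mnemonic # "_e64">;
  // The loop body runs once or not at all, so it acts as an if statement.
  foreach _ = BoolToList<P.HasExtDPP>.ret in
    def _dpp : Inst<mnemonic # "_dpp">;
}

def ProfileWithDPP : Profile<1>;
def ProfileNoDPP   : Profile<0>;

// Defines V_MOV_e32, V_MOV_e64 and V_MOV_dpp.
defm V_MOV : VOP1Sketch<"v_mov", ProfileWithDPP>;
// Defines only V_NOP_e32 and V_NOP_e64.
defm V_NOP : VOP1Sketch<"v_nop", ProfileNoDPP>;
```

Keeping the condition in the profile means a single multiclass covers both variants, which is how the combinatorial explosion of multiclasses mentioned in the talk is avoided.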