I want to talk to you today about Ruby Macros, a system that I created for enabling deep metaprogramming with Ruby. The idea is to be able to manipulate your syntax trees at parse time and change them pretty much any way you want. If you grew up as a C programmer like me, you may be thinking of macros as C preprocessor macros. This is not really the same thing. It's sort of the same general idea, but the C preprocessor is very limited in what it can do. It's a fairly simple textual substitution scheme. The language in which you write macros in C is not Turing complete, and you quite quickly run up against the limit of what you can do.

What I'm really talking about here are Lisp macros. Lisp has a very powerful macro system, and that's what I'm trying to emulate. It's difficult to get to the level of complete integration with macros that Lisp has, because of the nature of Lisp. In Lisp, code is data and data is code, and there's a really natural flow between the two. In any language that's not Lisp, that's basically impossible to do, but I tried to get as close as I can. I think I got maybe 90% of the way there.

Here's an example of a really useless macro, which adds two things together. I had to invent several new syntactical constructions to achieve the effect I was going for, and this shows all three of them. First of all, we have the macro keyword. The keyword macro introduces a macro, and macro definitions look syntactically just like method definitions. In fact, in many ways a macro is a method. It just runs at parse time instead of at run time. So it looks like a method definition, but instead of def, you have macro. Inside the macro, typically, you will have a form. That's this parenthesis thing, but with a colon on the front. A form is a way of quoting your code.
I'll explain it in a little more detail in a couple of minutes. For now, let's just say that every macro should have a form inside of it, and that form should mention the parameters of the macro up here. Those parameters have to be escaped using the third of the new syntactical constructions, the form escape. That's what this caret is. It's a unary operator, and what it does is allow you to break out of the form.

Anyway, this macro isn't a whole lot of use. What it does is just inline the addition operator, basically. An expression like this, which is calling the macro, would be turned at parse time into an expression like this.

Now, how does this work? There are supposed to be some arrows there, but I think they ended up pretty faint. Anyway, a macro runs at parse time, and so it does not have access to all of your regular Ruby objects. It can't manipulate your arguments as objects per se. It manipulates them symbolically instead, as what are called S-expressions. I think the S is supposed to stand for symbolic. An S-expression is basically a parse tree. Same thing. What a macro actually does is return a parse tree, which is then inlined directly into the code at the point where the macro invocation occurred. In this example, we'd have a macro call right here, and that would get substituted with these two things that were returned in an S-expression by the macro over on the right side.

Now, macros can also take arguments. Again, the arguments are S-expressions. They're not regular Ruby objects with regular values. For a normal method call, Ruby would evaluate a + b, come up with some object, and pass that to the method. But in a macro situation, you cannot do that. You don't have access to a and b yet. They haven't been defined at parse time. So instead, you get a parse tree.
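To make the expansion step concrete, here is a minimal sketch in plain Ruby. It is a model of the idea, not the Ruby Macros implementation or its API: trees are nested arrays, and the names `MACROS`, `expand`, and `unparse` are hypothetical. The `simple` entry stands in for the adding macro described above, which receives two argument trees and returns a new tree.

```ruby
# A plain-Ruby model of parse-time macro expansion (not the Ruby Macros API).
# Trees are nested arrays:
#   [:var, name], [:lit, value], [:op, operator, left, right], [:call, name, *args]

# Hypothetical macro table: each macro receives argument TREES and returns a tree.
MACROS = {
  # Models a macro whose form adds its two escaped parameters together.
  simple: ->(a, b) { [:op, :+, a, b] }
}.freeze

# Walk a tree and replace every macro call with the tree the macro returns.
# (For simplicity, arguments are expanded before the macro itself runs.)
def expand(tree)
  return tree unless tree.is_a?(Array)
  tree = tree.map { |child| expand(child) }
  if tree[0] == :call && MACROS.key?(tree[1])
    MACROS[tree[1]].call(*tree.drop(2))
  else
    tree
  end
end

# Turn a tree back into source text.
def unparse(tree)
  tag, *rest = tree
  case tag
  when :var  then rest[0].to_s
  when :lit  then rest[0].inspect
  when :op   then "(#{unparse(rest[1])} #{rest[0]} #{unparse(rest[2])})"
  when :call then "#{rest[0]}(#{rest.drop(1).map { |t| unparse(t) }.join(', ')})"
  end
end

# The call simple(x * 2, y), as a tree. The arguments are never evaluated.
call_site = [:call, :simple, [:op, :*, [:var, :x], [:lit, 2]], [:var, :y]]
puts unparse(expand(call_site))   # prints ((x * 2) + y)
```

Note that `x` and `y` never need values here. The macro only rearranges the trees it is handed, which is why it can run before the program does.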
You get an S-expression for a + b, and it gets passed as the argument there, and then the argument ultimately gets used somewhere in the macro, typically in a form escape.

Now, let's talk a little bit more about forms, because forms are pretty important, even though they seem kind of boring in comparison to macros. I've got an example of a form here in the middle, and I've put it between a string and a proc to illustrate the concept that a form is kind of midway between the two. All three are ways of quoting code. A form is like a proc in that its contents are actually parsed, so if there's a syntax error in it, you'll find out about that at parse time. However, it's like a string in that its contents are not turned into instructions. They're not code yet. They'll probably become code eventually, when you use the form for something, but for the time being, what's inside the form is just data. It's just a tree.

So here's that same form again, and here's a representation of the tree that you get out of that form. You should look at this as kind of like a YAML data structure. It took me a long time to come up with this method of representing my trees. In fact, I had to get help from Roger Pack. He's sitting way up in the back there. This is actually a pretty clean way to inspect your trees. What I had before was a lot uglier. Basically, this form turns into a tree that represents the code inside the form. At the top level, we've got a call node. The actual class is CallNode, but I've left out all the extra stuff that you don't need to know about, like the Node at the end of the class name. It has a name. That should be print instead of puts. I guess I didn't update that. It also has a list of parameters, and that list contains a single string node, and the data in the string is hello.

Now, here's an example of a form escape.
This ultimately ends up creating the same form as before, the same S-expression, but it illustrates how to do a form escape. I said that forms are kind of like strings, so form escapes are kind of like the string interpolation syntax, that crunch-curly-brace thing, #{...}. Basically, it allows you to escape out of your form back into regular Ruby mode temporarily. What's controlled by the form escape operator is actually interpreted at the time the form is evaluated, and its result is placed into the form at that point, kind of like the way macro expansion works.

Now, here's an example of a more realistic macro. I think this was the first macro I wrote. You're all probably familiar with assertions from Test::Unit. This is an implementation of assert as a macro, but this assert does a few more things than normal assertions do. For instance, we're getting the condition that's being asserted as an S-expression here. Then this part is checking to see if the condition is one of the known comparison operators. If it is, it's picked apart into the left and right sides of the comparison and the operator. Then we construct this nice error message here, which is a little complicated to explain, so I'm just going to show you how it works.

Is that big enough? Can everybody see that? Now, before we do anything with assertions, we've got to set this debug variable. Then, say I have a couple of variables. I can do an assertion like this. We expect that to be true, so nothing happens when you run the assertion. But what happens when an assertion fails? We get an exception, and the exception has an error message. Just like in Test::Unit, the error message tells you what was on the left and right sides of the equals sign. We had a on the left side and b on the right side. On the left side, there was a 1. On the right side, there was a 2.
But the other thing this is showing you is the symbolic form of the left and right sides. It's telling you that what you wrote on the left side was a, and that value was 1, and on the right side you wrote b, and that value was 2. That's something you can't do with a method-based assertion. You can't get it to inspect the source image of the condition that's passed to it.

The other interesting thing that's going on with this assert: notice I got this effect using regular assert. I didn't have to use assert_equal. Using a macro assert, you don't need assert_equal, assert_not_equal, assert greater-than, all of those dozens of crazy assertions that Test::Unit has. You just need one assertion, assert, and that's it. And you can use regular syntax inside of assert. You don't have to use some kind of special operator for equality.

Now, the other thing I'm doing here is that the whole thing is only enabled if this debug variable is set. If debug is not set, the assert returns nil, which does nothing. If a macro returns nil, then the macro call at that point in the parse tree is turned into nothing at all. There's not even a call to a method that does nothing. It just vanishes. So basically, you can disable your assertions.

There's not much cause for doing that in unit tests. I don't think that would be very useful. But assertions are useful for a lot of things besides unit tests. Some people take assertions to a great extreme, and there's a whole paradigm called Design by Contract, where assertions are used extensively. I don't recommend going that far, but it is useful to be able to place assertions in your regular code and have expectations in your mainline code that can get checked. It's also nice to be able to turn those off in production mode, so you're not paying the cost of those extra checks.
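The behavior of that assert can be sketched in plain Ruby. This is a model under stated assumptions, not the macro from the talk: the condition arrives as a tiny hand-built tree whose leaves are source strings, the expansion is returned as a string of Ruby source rather than a tree, and the names `expand_assert`, `COMPARISONS`, and `check` are hypothetical.

```ruby
# A plain-Ruby model of the macro-style assert (not the Ruby Macros code).
# A condition is [:op, operator, left_source, right_source] or [:expr, source].

COMPARISONS = %i[== != < > <= >=].freeze

# Returns the Ruby source the assertion expands into,
# or nil when debugging is off, so the assertion vanishes entirely.
def expand_assert(cond, debug:)
  return nil unless debug
  kind, op, left, right = cond
  unless kind == :op && COMPARISONS.include?(op)
    src = cond[1]
    return "(#{src}) or raise(#{"assertion failed: #{src}".inspect})"
  end
  # The message mentions both the source text and the run-time values.
  head = "#{left} #{op} #{right} failed: #{left} was ".inspect
  mid  = ", #{right} was ".inspect
  # A lambda evaluates each side exactly once and leaks no local variables.
  "->(l, r) { l #{op} r or raise(#{head} + l.inspect + #{mid} + r.inspect) }" \
    ".call((#{left}), (#{right}))"
end

# Run the expanded assertion where a and b are ordinary local variables.
def check(a, b)
  eval(expand_assert([:op, :==, "a", "b"], debug: true), binding)
  "ok"
rescue RuntimeError => e
  e.message
end

puts check(1, 1)   # prints ok
puts check(1, 2)   # prints a == b failed: a was 1, b was 2
```

With `debug: false`, `expand_assert` returns nil and there is nothing to evaluate at all, which mirrors the point above that a disabled assertion costs nothing at run time.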
Now, let's talk a little bit about the syntax trees and their format. Basically, we've got a tree structure. Here's an example expression, and this is the tree that it would turn into. At the top, we've got a plus node. Its left side is an a, and its right side is the star node over here, which has its own left and right sides. This is just like an XML tree or any other type of tree data you might have to deal with.

Most of you may be familiar with parse trees, but from a different library for creating Ruby parse trees. RubyParser and ParseTree are the most well-known parse tree implementations. I wrote my own, RedParse. I consider its output to be greatly superior to RubyParser's, because RubyParser is trying to emulate the output of ParseTree, and ParseTree is basically a hack that reached into the interpreter and pulled out some data structures that were never intended to be publicly accessible. Those are the internal parse trees used by the interpreter. They're just fine, I'm sure, for the interpreter, but they've been munged a little bit. They're a little more distant from your original source code than you'd like, and they're kind of funky.

Unlike RubyParser trees, RedParse trees are actually object-oriented. The nodes are objects. There's a Node class, it has a bunch of subclasses, and nodes contain other nodes. That has a number of useful properties. For instance, the subnodes of a node have names instead of just numbers. In ParseTree, if you want to go from some node that you've got to a subnode of it, you have to say node[0] or node[3] or something, and you have to know that 3 means the rescue clause, or whatever, in the kind of node you're looking at. In RedParse, you can say node.rescue or node.params or node.name, instead of having to deal with all these numbers that aren't so meaningful.
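The difference between numbered and named subnodes can be illustrated with a small sketch. Neither half reproduces the real ParseTree or RedParse formats: the array layout, the `Node.with` helper, and the `VarNode` and `OpNode` classes are all made up for this illustration, using the a + b * c shape of tree described above.

```ruby
# Illustrative only: neither layout is the real ParseTree or RedParse format.

# Positional style: a node is an array, and you must remember what each index means.
positional = [:+, [:var, :a], [:*, [:var, :b], [:var, :c]]]
puts positional[2][1][1]        # prints b, but only if you know the layout

# Object style: a Node base class whose subclasses expose children by name.
class Node
  # Hypothetical helper: build a Node subclass with the given named children.
  def self.with(*fields)
    Class.new(self) do
      attr_reader(*fields)
      define_method(:initialize) do |*values|
        fields.zip(values) { |field, value| instance_variable_set("@#{field}", value) }
      end
    end
  end
end

VarNode = Node.with(:name)
OpNode  = Node.with(:left, :op, :right)

# The same expression, a + b * c, as objects.
tree = OpNode.new(VarNode.new(:a), :+,
                  OpNode.new(VarNode.new(:b), :*, VarNode.new(:c)))
puts tree.right.left.name       # prints b, and the path explains itself
```

Both lookups reach the same leaf, but `tree.right.left.name` can be read without a table of index meanings.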
Finally, RedParse trees are very close to the original source form of the code that created them, whereas RubyParser trees have been manipulated a little bit. For instance, with RubyParser, rescues are handled in a strange way, where there are multiple nodes nested together. In RedParse, the rescue is one clause that's attached to the various types of nodes that can take a rescue. In RubyParser trees, parentheses, if they're present, have been eliminated completely from the resulting tree. In RedParse, the parentheses are a separate node, and you can use that if you need it for anything. In RubyParser, operators are all turned into calls for you, so you cannot distinguish a call from an operator if it's one of the overridable operators. In RedParse, there's a separate operator node type.

I used to have a big description of all the node types at this point in my talk, but I decided it's actually better to show you some examples of parse trees. If you have RedParse installed, you can use this redparse command to parse something and show you the result. Let's just look at a method call, for instance. We've got this call node, it has a name, and it has some parameters. Now some flow control. There's an if node, and it has subnodes that are called condition and consequent, elsifs, and so forth. How about a method definition? That will turn into a... Uh-oh, what did I do wrong? That's not how you define a method, is it? It turns into a method node, which has a body that has things in it.

Are there any other syntax examples that anybody wants to see? There is this long section... Did you have one? Oh, parens around baz, okay. Like this. Okay, so that one looks just like the other one, because it's still a method call. If you add another set, you will end up with a paren node in the output. Notice that it's taking a little while to run all these commands.
A big disadvantage of RedParse is how slow it is. That's something I hope to be able to fix someday. So yeah, we got the parens here, which turned into a paren node down here. Now, there is actually more information in these parse trees if you really want it. You can turn on this verbose mode, and it tells you the positions of things and all kinds of funky attributes, like whether there were parentheses in the function call and stuff like that. Mostly you don't need to know about it, and it's just extra useless detail. So, that's enough of that.

So, I've talked about how cool Ruby macros are. There are some problems. As I said, RedParse is slow, and because it has to preprocess everything that's got a macro in it, the startup of a system that's using Ruby macros is going to be slow. It has to parse all those things using my slow parser.

If you've got a file that has macro definitions or uses macros, even if it doesn't define any, that file has to be imported into the interpreter using a special version of require, Macro.require. You can't use the normal require. That's something I hope to have addressed fairly soon.

It would be nice to be able to scope macros, to be able to say, I'm defining this macro within this class, so only search for expansions of that macro within the class. Right now, you can't do that. All macros have to be declared in the global scope and are visible everywhere.

Finally, if you're familiar with Lisp, Lisp has this thing called macro hygiene. Basically, if there are any local variables defined in the macro expansion, you want those not to interfere with local variables in the caller's context. The macro system has to rename the variables for you in this weird way. I haven't implemented that yet. It's a little bit tricky. All macros at the moment are unhygienic, which is slightly dangerous.

Now, there are some other, sort of advanced, features of macros. A macro can take a block. You can pass a block to it.
Again, it's an S-expression, and you access that block using the yield keyword, which does not have the semantics you might expect within a macro. Incidentally, this macro is one that allows you to make changes to local variables within the block, and those changes will be hidden from external callers. Macros can also have receivers. You can access those with this receiver pseudo-keyword. That's not the way I really want to do it. I think I'm going to change it so that you can use self instead. That will probably make a little more sense. Oh, and this is the sort of RSpec-y version of the assert macro I showed earlier.

Now, here's a macro that I'd like to write but haven't written yet. It would be nice to be able to take a big long pipeline like this, where you've got a bunch of these functional operators on enumerations, like select and map, stacked together this way, stick the pipeline attribute in front of it, and have it turn into something like this: a single loop, which does the same thing but doesn't have all these blocks and stuff in it. A single loop body should execute faster than this would.

I have another couple of interesting macros to show you. Do I have time? I think I've got time here. I don't want to show that one first. I want to show this one. This is a loop unroller. It takes the body of the loop and multiplies it. I think currently I'm using four as the loop multiplier. If you've got fewer than four iterations through your loop, it'll just be expanded into copies of your loop body. If you've got more than that, you get a loop with four copies of the body in it. Because Ruby has a bunch of different types of loops, there are a number of special cases in this. There's probably more I can do here, but right now all I'm trying to implement are the while and until loop types and the times loop.
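The unrolling idea for the times loop can be sketched in plain Ruby. This is a simplified model rather than the macro from the talk: it works on source strings instead of trees, it only handles a times loop whose count is a known literal and whose body does not use the loop index, and it assumes that short loops become exactly as many copies as there are iterations. The names `UNROLL`, `unroll_times`, and `run_unrolled` are hypothetical.

```ruby
# A string-based model of unrolling `count.times { body }` (not the real macro).
UNROLL = 4   # the multiplier mentioned in the talk

# Returns Ruby source equivalent to running `body` exactly `count` times.
def unroll_times(count, body)
  # Short loops: no loop at all, just copies of the body.
  return ([body] * count).join("\n") if count <= UNROLL

  # Longer loops: a loop whose body is UNROLL copies, plus any leftover copies.
  full, leftover = count.divmod(UNROLL)
  source = "#{full}.times { #{([body] * UNROLL).join('; ')} }"
  source += "\n" + ([body] * leftover).join("; ") unless leftover.zero?
  source
end

# Evaluate the unrolled source against a local counter to check the iteration count.
def run_unrolled(count)
  n = 0
  eval(unroll_times(count, "n += 1"), binding)
  n
end

puts unroll_times(10, "n += 1")
# prints:
# 2.times { n += 1; n += 1; n += 1; n += 1 }
# n += 1; n += 1
puts run_unrolled(10)   # prints 10
```

A tree-based version would have to do more work than this, for example handling while and until loops, whose iteration counts are not known at parse time.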
The way you use this is you stick an unroll in front of your loop. I'm not going to try to explain all this code, but notice this macro is about two pages' worth of code. It's kind of complicated, but it's not too bad. You can actually do a lot of stuff in not a whole lot of space with this. It continually surprises me about Ruby, how much really cool stuff you can do with not very much code. Something you thought would be really complicated ends up being something that will fit on one screen.

My other macro that I really like, and I just got this one working recently, is an inliner. If you stick the inline keyword at the beginning of a method definition, it'll turn that into an inline method, which is sort of a specialized version of a macro. How this macro actually works is it turns your method definition into the equivalent macro definition, and then that is expanded by a subsequent stage of the macro processor.

I think I've got enough time that I can actually demonstrate how some of this stuff works. So this is how you would declare an inline method. Let's give it a parameter. Let's let it do something simple. That's actually created a macro for you in the background called foo, and then you can invoke it like this. As you can see, it has the same semantics as the equivalent method would have.

Now, maybe I cheated. Maybe I didn't actually implement an inline macro and I'm just using a regular method call, so let me show you how that works. One of the features of macros is this Macro.expand method, which lets you see what your macro definition or expansion or form or whatever looks like when it's turned into regular code. So foo of four, let's see how that expands. It's turned into... I forgot, I don't have history.
It's turned into this parse tree. We could try to interpret that, but what I like to do is just unparse it, which takes a parse tree and turns it back into a string. As you can see, what I've ended up with here is that this foo got inlined: the whole method body got inlined into its caller. There's some extra stuff going on here. It's a little bit clunky, but basically it's an inline method. As far as I'm aware, nobody has done this before in Ruby.

Maybe I have time, real quickly, to show you the unroll working as well. Sometimes the parser is kind of slow. While we're waiting, does anybody have any questions they want to ask? Yes?

It mutates it twice. What you would probably want in most cases is to have that side effect happen only once. But there may be cases where you do want the side effects to happen more than once. That's something you need to be aware of when you're doing this. Incidentally, that's currently a problem with my inline macro. It doesn't handle that case properly. Inline should preserve the original semantics of the method call, so it should be evaluating the argument once, storing it in a variable, and then using that.

Any other questions? Yes?

It should be a complete parser for Ruby. There may be a little corner or two that I haven't discovered yet, but as far as I'm aware, it should parse the complete Ruby 1.8 language. Some of the new 1.9 constructions I can't do.

At the moment, it's not possible to use macros to create new types of tokens. You have to use the existing token definitions. I wrote a tokenizer as well, and it's fairly easy for me to extend that to new types of tokens, but for other people it's probably not so easy. That is a feature of Lisp. It has these things called lexical macros. It'd be nice to be able to do that.

I think I'm out of time. Mike, is that right? Okay, thank you very much, everybody.
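The double-evaluation issue from the first question can be shown with a small plain-Ruby sketch. It models inlining as substitution on source strings, which is an assumption made for illustration and not how the inline macro is implemented. The names `inline_naive`, `inline_once`, and `count_evaluations` are hypothetical.

```ruby
# A string-based model of the double-evaluation problem (not the real inliner).

# Naive inlining: paste the argument's source in for every use of the parameter.
def inline_naive(body, param, arg)
  body.gsub(/\b#{Regexp.escape(param)}\b/, "(#{arg})")
end

# Safer inlining: evaluate the argument once and reuse the stored value.
# A lambda parameter holds the value, so no local variable leaks into the caller.
def inline_once(body, param, arg)
  "->(#{param}) { #{body} }.call(#{arg})"
end

# Count how many times the argument expression runs when the source is evaluated.
def count_evaluations(source)
  calls = 0
  bump = -> { calls += 1 }   # an argument expression with a side effect
  eval(source, binding)
  calls
end

naive = inline_naive("x + x", "x", "bump.call")
once  = inline_once("x + x", "x", "bump.call")

puts naive                      # prints (bump.call) + (bump.call)
puts count_evaluations(naive)   # prints 2
puts once                       # prints ->(x) { x + x }.call(bump.call)
puts count_evaluations(once)    # prints 1
```

The second form matches what an ordinary method call does: the argument is evaluated once, before the body runs, however many times the body mentions the parameter.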