I'd like to thank you all for coming today to listen to my project. This is Ocelot, a Ruby compiler I've been trying to write. Ocelot is a preliminary name, and it's also a preliminary implementation, which means that in spite of how much work I've put into it, it's still not very far along. But I've gotten through the difficult parts, I think, and so some exciting stuff should start happening soon. Now, my goal with this project is to have an efficient implementation of Ruby while supporting the full semantics of the language. Those goals aren't entirely compatible, because Ruby just isn't that way, but it's surprising how little actually has to be given up in order to have efficiency. And I should say that, properly speaking, Ocelot is not a compiler but a Ruby-to-C translator, which is not, in my opinion, a true compiler. But it should still be pretty fast regardless. So, before we talk about Ruby, let's talk a little bit about an efficient language. Here's some C code, a Fibonacci calculator. Now, C is efficient because the compiler knows what the types of things are. For instance, when it comes time for the compiler to emit the instruction for this minus operator here, it knows that the types of its operands are integers, and so it can emit the integer version of the minus instruction instead of some polymorphic instruction sequence that switches off the runtime type of the operands, choosing a float or an integer or a char minus. Likewise, with this plus instruction, the compiler can see that the return value of the fib method is an int, so it emits the integer version of the plus instruction.
And even more than that, when it comes time to emit the call sites for these fib calls here and here, the compiler can see what the target of those calls is. This is recursive, so the target is the same method, but being able to see the targets of calls means that the compiler can do things like inlining or tail recursion elimination, that kind of thing. And most of this would be true even if C were a polymorphic language: even if you were allowed to have multiple versions of fib, you could still have all these nice optimizations. So now here's a Ruby version of the same algorithm. Ruby is several orders of magnitude slower than C when running this code. And the major difference between the two, aside from syntactic differences, is that Ruby has no type declarations for anything. So it's very difficult for a compiler to get a grip on what the types of things are going to be at runtime. In this code we'd normally expect n to be an int, but according to the semantics of the language, n could be an array, it could be a string, it could be anything. Because of that, it's very difficult for the compiler to generate efficient call sites, and the calls are everywhere in Ruby. It's not just these fibs here; all of these operators, minus, plus, less-than-or-equal, are calls, and they're all pretty slow. So writing a compiler for Ruby is something I wanted to do for a very long time, and it's something I more or less gave up on a number of years ago. It just seemed too hard to solve that problem; you just can't get a grip on what the types of things are in Ruby. But then there was a breakthrough. Now, in Ruby we have this tradition of writing tests. Ruby is the most test-heavy language that I've ever encountered. So here's an example test for the fib method. Notice something very interesting about this test: the arguments to fib in the test are all integers.
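The slides aren't reproduced in the transcript; a minimal Ruby version of the algorithm and a test in the spirit described, where every argument and return value is an Integer, might look like this (my reconstruction, not the actual slide):

```ruby
# Reconstruction of the slide's Ruby Fibonacci. Note that '<=', '-', and '+'
# are all method calls in Ruby, which is part of why this runs slowly.
def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

# A test of the kind described: arguments and returns are all Integers,
# exactly the information type induction wants to harvest.
raise "fib(0) failed" unless fib(0) == 0
raise "fib(1) failed" unless fib(1) == 1
raise "fib(10) failed" unless fib(10) == 55
```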
Likewise, the returns of fib are all integers. So what if you had the compiler run the unit tests for the code it's compiling, extract the types of the expressions as they go flying past, and use those pragmatically as a kind of type declaration? I call this type induction. Type induction is to be contrasted with Caml-style type inference, where the compiler sees what methods are being used on a specific variable and infers a type declaration for that variable that has all of those methods in it. I should also mention I didn't invent this idea. Two other people seem to have independently come up with the same idea and told me about it: Rich Moran and Josh Sesser. Both of them are at the conference, but I don't see them in the room today. I did come up with the name "type induction," however. Okay, so now I've been using this word "type," which is kind of a controversial, religious word in Ruby. It's been the cause of flame wars in the past. Even though it's such a dangerous term, I think it's actually the appropriate one, but I should spend a few moments defining what a type is in Ruby. I think I can come up with a definition that's both one we can all agree on and one that's useful for writing compilers. So what is a type? Here are some wrong answers. Some people, for instance, say Ruby has no types. While there are no type declarations, it is not true that Ruby values have no types. You might say that classes are the types. That's a closer approach to the definition, at any rate, and it's true enough in a static language, but it's leaving out an important aspect of the way Ruby works, which is singleton classes. In Ruby, you can change the behavior of an object at runtime, and that changes its singleton class. So maybe these singleton classes are the types. That's closer yet, but it's still not quite right. Actually, I believe that is a correct definition of types.
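The idea of type induction can be sketched in plain Ruby. This is a toy for illustration, not Ocelot's implementation, and the names (`induce`, `OBSERVED`, `Doubler`) are made up: it wraps a method so that, while the tests run, the classes of its arguments and return values are recorded, which is exactly the information a compiler could then treat as type declarations.

```ruby
# Toy type inducer: records argument and return classes as tests exercise
# a method. A real compiler would persist these observations and use them
# as pragmatic type declarations.
OBSERVED = Hash.new { |h, k| h[k] = { args: [], ret: [] } }

def induce(mod, name)
  original = mod.instance_method(name)
  mod.define_method(name) do |*args|
    OBSERVED[name][:args] |= args.map(&:class)   # note each argument's class
    result = original.bind(self).call(*args)
    OBSERVED[name][:ret] |= [result.class]       # note the return class
    result
  end
end

class Doubler
  def double(n)
    n * 2
  end
end

induce(Doubler, :double)
Doubler.new.double(21)   # running the "test" feeds the inducer
# OBSERVED[:double] now holds { args: [Integer], ret: [Integer] }
```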
However, the problem is that it's really not very useful. Singleton classes are too numerous: every object which has a singleton class has a unique one. So a program that's using singleton classes can potentially have very, very many types, even an infinite number, and that's too many to deal with statically. So here's a good definition of type: a type is a class plus decorators. When an object is born, its type is its class; they're the same. But over its lifetime, it may get decorated with these modifications to its behavior, like here. The class it's born with, plus the list of decorators applied to it over its lifetime, together define the type of a value. Here's another definition of type, probably even better than the last one because it results in fewer types overall: a type is the object's set of name-to-method-body mappings, which is to say it's a hash of method names to method implementations. This is the most exact definition; I think it really captures the essence of what a type is. And it is equivalent to the previous definition, although it's not a one-to-one mapping: there may be multiple paths through the graph of decorators to a single unique set of method-name-to-method-body mappings. Now, type induction is very powerful. It's the key insight for this project. But it does have some problems, and I want to cover how we're going to address those problems. One is the issue of coverage. Here's the Fibonacci code I showed before. But I could have written the test like this; that is, I could have made a weaker test. In this version, I'm only passing 0 and 1 to fib, which means that the first line of fib is the only one that gets executed; the second line is never even hit. As a result, when the type inducer examines fib, it will never get a chance to see what the types of the expressions in the second line are. It won't have any information about them.
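The weaker test he describes would look something like this; every assertion passes, but the recursive line never runs, so the inducer learns nothing about its types (again a reconstruction, not the actual slide):

```ruby
def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)   # never executed by the weak test below
end

# Weak test: only 0 and 1 are passed, so only fib's first line ever runs.
# Coverage of the second line is zero, and so is its type information.
raise "fib(0) failed" unless fib(0) == 0
raise "fib(1) failed" unless fib(1) == 1
```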
So clearly, in order to be able to type-induce your program, you first have to have pretty much complete code coverage. But code coverage isn't the only kind of coverage. Now, let's imagine we have this situation. We have these three animal classes, and they each have a call method, which corresponds to the sound the animal makes. And then there's this Zoo class over here, which is a collection of animals. The zoo also has a sound that it makes: when all the animals in the zoo make their call at once, we have a cacophony. And then there's a test over here for the cacophony method. But there's a problem with this test, right? I forgot to put a bird into the zoo when I created it. As a result, the type inducer will never see, in this line, that the variable animal could have the type Bird. There isn't a code coverage issue here; all the expressions in cacophony are being covered. But there is a lack of what I call type coverage. Yes? The question is whether, since a zoo's cacophony should accept any sort of animal that can receive a call, the compiler could understand that. That wouldn't really help by itself; the ancestor is just sort of a way of initializing the class when it's first created. So another problem with type induction is mocks. Mocks are basically fake objects with fake types, and so at runtime they pollute the information that the type inducer is able to obtain. They cause the type inducer to believe that some of your expressions will have types that aren't actually possible at runtime. Even worse, it could be a sign that your tests aren't exercising all of the types that your expressions could have, and so again you'd have a lack of proper type coverage. So I don't use mocks. In general, I personally dislike mocks, and for the same reason the compiler dislikes them: they divorce your tests from reality.
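A reconstruction of the zoo example (class and method names are guesses from the description): the test executes every line of `cacophony`, yet a `Bird` never flows through the `animal` variable, so type coverage is incomplete even though code coverage is complete.

```ruby
class Cat;  def call; "meow";  end; end
class Dog;  def call; "woof";  end; end
class Bird; def call; "tweet"; end; end

class Zoo
  def initialize(*animals)
    @animals = animals
  end

  def cacophony
    # The inducer watches the type of `animal` here; it only ever sees
    # what the tests actually put in the zoo.
    @animals.map { |animal| animal.call }.join(" ")
  end
end

# The flawed test: no bird in the zoo, so the inducer never learns that
# `animal` can be a Bird, even though every line above is executed.
zoo = Zoo.new(Cat.new, Dog.new)
raise unless zoo.cacophony == "meow woof"
```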
Use of a mock means you're exercising your code in an artificial environment which doesn't correspond to the actual runtime. In a few circumstances, mocks may be justified in order to interact with something external, like a server or a piece of hardware; you kind of have to have a mock then. But even if you do that, you should also have another test which exercises that code using the actual, real type. Question? Yeah. The question is about the fact that, with no mocks, any failure can come from anywhere in the whole code base, because the test goes all the way through, whereas often you want to isolate your code for certain kinds of tests, say unit tests, so that only that area of the code is actually responsible. Do I see no benefit from mocking in that case? Well, mocks can be nice for narrowing down the range of code that's being executed; I can see the advantage of that. What I'm trying to say is, if you do use a mock, make sure that you also have some kind of integration-style test which is not using the mocks, which is actually testing the whole stack with the real types that are going to be present at runtime. I don't want to forbid people from using mocks. It's a very popular thing to do, and it's not going to be a disaster, but you do need to make sure you get complete type coverage. And I'll be talking about some of the other ways of dealing with missing type coverage. So, I've talked about type induction and some of its problems. I'll be talking about this a little more, but first let's explore some of the C code that the compiler is actually going to generate. Keep in mind that these examples are pseudocode; reality, as usual, is more complicated, but they'll give you the idea of what's going to go on. So what happens, for instance, when the compiler sees a call site? Here's a call site that I snipped out of the example I gave earlier. What should that look like in C? This is one possibility; this is what C++ does.
C++ adds a hidden field to every object, which it calls the vtable pointer. Here I'm calling it the class; that's the name used by MRI internally. It's a kind of deceptive name, because we know class and type aren't necessarily the same thing, but that's the standard terminology for Ruby. Hanging off of this vtable, or class, there's a table of pointers to actual method implementations. At runtime, the generated code dereferences the object reference to find the class, then dereferences a pointer in the class's table to find the actual method implementation, and then jumps to that. This is an okay way to do it; C++ does this a lot, and it's not very slow. But there's another way. Instead of jumping through a pointer, you could switch off of the class field and, when it's a type known to the compiler, jump straight to the implementation of the method for that type. Now, probably some of you are looking at this a little bit sideways and saying: okay, you're going to take my nice little Ruby calls and turn them into this big ugly switch statement. Why do you want to do that? There are some pretty good reasons for doing it this way. One is that avoiding the indirect call means you avoid the pipeline stall that those calls usually cause. A direct call, even with several branches preceding it, is going to be predicted by the processor better, and as a result it's going to be a little faster. Another advantage of this technique is that it enables inlining. When you know exactly which method you're going to call, it's relatively straightforward for the compiler to inline the implementations of those methods directly in the appropriate switch branch, whereas with the vtable version it's hard to see how the compiler could inline at all without converting to the switch form first. Again, this isn't my idea; I stole it out of a paper on SmallEiffel, the open-source Eiffel compiler. Let's see.
Both compilers and processors benefit from having explicit knowledge of what the targets of your call sites are going to be. There's something interesting going on in the default case down here. The default case corresponds to the situation where a type is encountered at runtime that the compiler didn't expect. That indicates a gap in type coverage. One thing you could do at that point is just raise an exception: something happened I didn't expect, sorry. But there would be a problem with that, because then your compiler would be introducing bugs into your program that weren't present in the interpreted version. It's much, much better not to do that if at all possible. What I've got here instead is an rb_funcall. For those of you who aren't familiar with writing C extensions or with Ruby internals, rb_funcall is the function Ruby itself uses to handle method calls internally. It's every bit as slow as a normal method call in Ruby, in fact probably a little slower, but it works, and it preserves the full semantics of the original code. The other thing going on in the switch statement is this warning. Remember, going into the default case means that you have a lack of type coverage, and as good testers, that's something you ought to want to know about. You ought to be looking at these warnings and using them as clues for writing better tests, which means the next time the compiler runs, it will have more information and will be better able to compile your program. Another thing you can do with these warnings is have the compiler use them directly as input. On a subsequent recompile of the same code, it can actually see where there was a type that wasn't expected the first time and generate the correct code the second time, and you get, again, a better compile.
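The generated C isn't shown in the transcript, but the shape of the compiled call site can be sketched in Ruby pseudocode: an exact-class test for every induced type with the method body inlined into its branch, and a default branch that warns about the type coverage gap and falls back to fully dynamic dispatch, the Ruby-level analogue of rb_funcall. This is illustrative, not Ocelot's actual output.

```ruby
class Cat;  def call; "meow";  end; end
class Dog;  def call; "woof";  end; end
class Bird; def call; "tweet"; end; end

# Sketch of the compiled call site `animal.call`, assuming induction saw
# only Cat and Dog during the test run.
def compiled_call(animal)
  if animal.instance_of?(Cat)
    "meow"                     # Cat#call, inlined into its branch
  elsif animal.instance_of?(Dog)
    "woof"                     # Dog#call, inlined into its branch
  else
    # Type coverage gap: warn, then take the slow generic path,
    # the analogue of falling back to rb_funcall.
    warn "unexpected type #{animal.class} at call site animal.call"
    animal.__send__(:call)
  end
end

compiled_call(Cat.new)    # fast path, returns "meow"
compiled_call(Bird.new)   # warns, then returns "tweet" via the fallback
```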
That's a somewhat weaker technique than actually fixing your tests, because that information quickly becomes stale, whereas your tests presumably don't. But either way, it's a good idea. Now let me talk as briefly as possible about object representation. Here's an example Ruby class; it has three instance variables, @foo, @bar and @baz. How is this going to be represented in C? This is what the interpreter does: every Ruby object is represented by one of these RObject things. RObject has three fields in it: a flags field, the class that we talked about before, and this ivar table, which is a hash where the instance variables are kept. This works, but there's a better way. Since we can tell fairly clearly what the instance variables of the class are, why not just inline them directly into the appropriate structure? That way, access to those instance variables is done basically as an array dereference instead of a hash lookup. That's both faster and uses less memory. I think Ruby 1.9 already does something like this; I'm not sure of the details, and it may be limited in some way. And when it comes to bindings, we have much the same issue. It's relatively easy to analyze a method for the set of local variables used in it, and you can use that to create a custom stack frame for each method which contains the local variables used in that method, as well as a hash table. In both of these cases, you keep the hash table as a sort of backup for the case where a variable is used dynamically but not statically, which you couldn't otherwise detect. Now, beyond polymorphic dynamic method calls, let's talk about some of the other issues that a compiler is going to run into, because there are a number of them.
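A rough Ruby model of the two layouts he contrasts; the field names are illustrative. The first is the interpreter's scheme, with every ivar access going through a hash; the second gives each statically known ivar a fixed slot (a plain field offset in the real C struct) and keeps a hash only as the backup for ivars that appear dynamically.

```ruby
# Interpreter-style layout: flags, class pointer, and a hash of ivars.
RObjectModel = Struct.new(:flags, :klass, :ivar_table)

# Compiled layout for a class known to use @foo, @bar, @baz: the known
# ivars become fixed slots (array-index access in C), plus a backup hash
# for instance variables that are only ever set dynamically.
CompiledFoo = Struct.new(:flags, :klass, :foo, :bar, :baz, :dynamic_ivars)

obj = CompiledFoo.new(0, "Foo", 1, 2, 3, {})
obj.foo                          # slot access: no hash lookup needed
obj.dynamic_ivars[:@quux] = 4    # fallback path for dynamic ivars
```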
Ruby has singleton methods and singleton classes, and it has modules you can extend your objects with, and all of these present more or less the same problem, which is that you're changing the type of a value at runtime. How is that going to be done in C? C doesn't really allow you to change the types of things; C types don't even exist at runtime the way they do in Ruby. But remember, the type of a Ruby object is stored in this class field here, and there's no reason you couldn't just change that as necessary in order to make it the value that it needs to be. The object's type is really just an aspect of its state, and the boundary between type and state, if you really think about it, is not that firm a thing. There's no reason the type can't be mutable just like the state is, even though most static languages don't really support that. Okay, so what about method_missing? Here's another feature of a dynamic language that's hard to support statically. The problem is that it makes the flow of control pretty hard to follow: a compiler can never know when an innocent-looking method call is actually going to end up calling method_missing instead. I should add that if it were not for this one feature, we probably would have had a compiler a number of years ago based on Caml-style type inference; method_missing inhibits that. But it's a useful feature, and it's used for all kinds of neat things, so we want to be able to support it. Now let's go back and look at this example method call I showed before. What if we allow the type of the receiver at this point to be a delegate instead, one which responds to most methods with its method_missing? Once the compiler figures out that a call to method_missing is possible for a specific type at a particular call site, it can just generate the call to method_missing at that point. You do have to pass an extra parameter for the method name.
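A sketch of the delegate case. `StringDelegate` and the call-site shape are my invention; the point is that once induction shows the receiver can be this delegate type, the compiler can emit a direct call to `method_missing`, passing the method name as the extra argument he mentions.

```ruby
# A delegate that answers most methods via method_missing.
class StringDelegate
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args)
    @target.public_send(name, *args)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private)
  end
end

# Compiled call site for `receiver.upcase`, where induction saw both
# String and StringDelegate flow through `receiver`.
def compiled_upcase(receiver)
  if receiver.instance_of?(StringDelegate)
    receiver.method_missing(:upcase)   # direct call, name passed explicitly
  elsif receiver.instance_of?(String)
    receiver.upcase                    # ordinary direct dispatch
  else
    receiver.__send__(:upcase)         # generic slow fallback
  end
end

compiled_upcase(StringDelegate.new("hi"))  # returns "HI"
compiled_upcase("hi")                      # returns "HI"
```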
Type induction really handles this problem quite beautifully. You also have to make sure that you've got complete type coverage over the call sites where method_missing can be called. But in contrast to type inference, which just falls apart completely in this case, we can handle this with really no problems. And then there's eval. Some people call this evil, and there's a good reason for that: it really makes writing a compiler hell. You just can't tell what's going to happen when there's an eval. There could be arbitrary side effects inside it, and in the general case you can't tell what the argument to eval is going to be. Eval is sort of the essence of what an interpreter does: it takes a string at runtime and interprets it. How are you going to compile that? But it turns out that most calls to eval are actually static, which is to say that there's only ever one argument that's really possible, or maybe a fairly narrow range of arguments. So if you can figure out what that one argument is, you can deal with it. Discovering that statically is difficult, in fact impossible in a lot of cases. But we can use the same trick as we did with type induction: run the unit tests, see what argument was passed to eval in the tests, and just assume that at runtime it's going to be the same argument. In other words, calls to eval will turn into something like this. You check whether the argument was the expected one, and if it was, you just inline that code directly at that point. Notice there's no eval call left in this code now. There are also no quotes around this code that had been quoted before, in a string at any rate. The other thing going on here is this else clause. The correct thing for me to do at this point, if the argument wasn't the expected one, would be to fall back to the interpreter's version of eval and use that.
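Eval prescience as described can be sketched like this. Assume the test run only ever saw eval called with the string "x * 2"; the compiler inlines that expression and keeps the interpreter's eval, plus a warning, as the fallback. All names here are hypothetical.

```ruby
# Sketch of a compiled eval site. During the test run, eval here was only
# ever called with "x * 2", so that code is inlined; anything else warns
# and falls back to the interpreter.
def compiled_eval_site(code, x)
  if code == "x * 2"
    x * 2                      # inlined: no eval, no quoted string left
  else
    warn "eval coverage gap: #{code.inspect}"
    eval(code)                 # slow interpreter fallback (or raise instead)
  end
end

compiled_eval_site("x * 2", 21)   # fast path, returns 42
compiled_eval_site("x + 1", 41)   # warns, then interprets, returns 42
```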
But I actually think that failing at this point is the better thing to do. If the call to eval has an unexpected argument, that indicates one of two things: either you didn't have proper eval coverage in your tests, which is something you want to know about, or user input is able to influence the argument of an eval. The latter is usually considered a bad thing, although it's useful in some programs. Every eval call is a potential code injection attack, so having a tool which can eliminate those attacks entirely is probably a good idea. And expecting your tests to have complete eval coverage is not that extreme a requirement; that's something your tests really should do. Okay, so this magic way of predicting what the calls to eval are going to be is something I call eval prescience. So, overall, the process of using the compiler works something like this. You use type induction and eval prescience to narrow down the actual range of dynamic behavior in your program. The compiler then uses the information gathered in those stages to produce a static version of your program. But the information gathered is only as good as your tests, and most tests, even tests for Ruby projects, really aren't complete enough. So the runtime also emits log statements telling you where the gaps in your test coverage are, and you should use those log statements to make your tests more complete, which makes more information available to the compiler in the next cycle. Probably three or four times through a cycle like this and you'll have pretty complete tests which describe the behavior of your program pretty well. So we can deal with the lack of type coverage, and coverage in general, in four different ways. One, there's the fallback to the interpreter when something unexpected happens. Two, we have this virtuous circle which is hopefully causing you to continuously improve your test coverage, and thus the information available to the compiler.
Three, we have the warnings emitted by the runtime, which can be used directly by the compiler as input and type hints to optimize subsequent recompiles. And then there's a fourth, secret technique which I believe to be the most powerful, but I'm not going to tell you about that yet. Okay, now let's talk about the really hard stuff. I said that most calls to eval are static. Well, there are dynamic calls to eval, for instance in this program. Does anyone recognize this program? Who said REPL? Right. This is what Lisp calls a REPL; that stands for read-eval-print loop. Basically, this is a very simple version of IRB. It's taking input from the user and evaling it. So no matter how much you test this code, you're never going to cover all the possible inputs to eval. It's pretty well hopeless. So this is a case that's basically incompilable; it's impossible to compile it completely. However, it is possible to compile it to something that ends up calling eval in this inner loop here. And for that matter, who really needs IRB to be faster than it is right now? Now, there are other interesting cases for this kind of thing. For instance, if you were writing a spreadsheet in Ruby, the natural thing to do would be to use Ruby as the cell formula language. In that case, again, you've got user input being passed to eval, so things will be happening at runtime that weren't predictable. So probably, I think there'll be some kind of hint you can give the compiler to tell it that in those few places there really is a genuinely dynamic call to eval, and it'll know to handle it a little bit differently. I don't know a lot about just-in-time compiling. It's not the approach that I prefer. It could be done.
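The slide's program is presumably something like the loop below. I've written it against explicit IO objects so it can be exercised, but the essence is the same: arbitrary user input reaches eval, so no amount of testing enumerates its arguments.

```ruby
require "stringio"

# A minimal read-eval-print loop: the incompilable case, since the
# argument to eval is whatever the user types.
def repl(input, output)
  while (line = input.gets)
    output.puts eval(line).inspect
  end
end

# Exercising it with canned input in place of a terminal:
out = StringIO.new
repl(StringIO.new("1 + 2\n[1, 2].length\n"), out)
out.string   # => "3\n2\n"
```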
If the output of the compiler is C, then a just-in-time compiler means you're going to be running the C compiler in the background, and C compilers aren't optimized for compile speed the way a Java JIT compiler is. So it may not be a real good idea. Yes, here. Now, one thing that you might want to do, for instance, is have a sort of background compiler that runs in a separate process. It gathers the warnings emitted by the default cases and then uses those to, say, recompile your program every ten minutes as long as there continue to be warnings. Something like that, for a very long-running process like most Rails apps, would probably be a pretty good thing. It means the Rails app would start off kind of slow and then get faster and faster. That's not really a JIT, but overall this technique is more or less the same thing as what goes on in a JIT compiler; it's just that the feedback cycle is slower. You first run the tests in one process and gather information about your program there, and then you use that information to do the compile, and you quickly get to a basically static version of your program which has pretty static behavior. Unlike in Java, where you're constantly running inside the virtual machine and basically don't have any control over when the compiler is running or when there might be extra work to be done, although Java does a pretty good job of hiding that. Okay, so there's another important case to consider: the dynamic typing version of the dynamic eval problem. See, in this program I've got, let's say, 20 modules, and they're being mixed in to an object in random order. That adds up to 20 factorial different possible types, about 2.4 quintillion.
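The combinatorial blow-up is easy to demonstrate. Here 20 anonymous modules are extended onto an object in a random order; each order yields a different singleton-class ancestry, and there are 20! possible orders.

```ruby
# 20 modules mixed into an object in random order: one of 20! possible
# extension orders, hence up to 20! distinct types to enumerate.
MODULES = (1..20).map do |i|
  Module.new { define_method(:"m#{i}") { i } }
end

obj = Object.new
MODULES.shuffle.each { |m| obj.extend(m) }   # a random one of 20! orders

FACTORIAL_20 = (1..20).reduce(:*)            # 2432902008176640000 orders
```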
So there's no way you're ever going to be able to exercise this program enough to enumerate all of the types, nor will there ever be enough memory to store all of those types. So this program, again, is incompilable. On the other hand, this is really rare. I've never seen or heard of code that actually does this. Has anybody ever written code that does something like this? Paul raises his hand. Something like that, okay. But in any given program, you're usually loading in the same order, right? It might be specified in a configuration file or something like that. Yes, that's true. Good question. That depends on whether the method actually has a call to super in it or not. If there's no call to super, then you're not chaining the methods up, the order is much less important, and the actual number of types is approachable again. To go back to Paul's point: to handle that kind of case, you would need to have the config file basically be part of the inputs to the compiler, and that could get a little tricky. All right, so that's the end. I want to thank you all for coming and listening to me. Here's some information about me, my email address and my blog. I hesitate to put the blog up there because I haven't updated it in a long time, and there's not much on there anyway. There's also this mailing list, which I started for discussing this and similar topics. And I want to add that I'm actively looking for collaborators, people who want to help me with this, or maybe there's somebody out there who'd like to sponsor me for doing this work. And in the about five minutes remaining, I'll open the floor up to questions. How does code which is dependent on other libraries work? Would the libraries have to be compiled as well?
Because I'm trying to think: if the libraries weren't compiled, would you need the tests for the libraries to be run in order to determine all the types for the libraries as well? Yeah, basically the libraries have to be compiled as well, and the libraries have to have tests, which, again, are part of the compile process. If you have a library, would your client program load a separately compiled version of the library, or would you compile it all as one big unit? You probably could do it separately, but for a dynamic language it's better to have all of the source code available to the compiler so that it can get a complete picture of what's going on. It's called whole-program compilation. If you did compile the libraries beforehand with this compiler, couldn't you ship some sort of summary of the types in addition to the actual compiled code, so the compiler would know what sorts of types the libraries can produce? Yes, you could. But the compiler really wants to be able to see the things that are going on inside the external source code as well. If you want to be able to inline, for instance, calls to things in a library, you have to have that source code available. So what's the state of the compiler, and how are you testing its performance? At this point, I have type induction almost finished, so it's really not very far along. The next thing to do is to start actually generating C code from parse trees; that's, I think, going to be fairly easy. As far as testing performance, well, you run some code through it and see how fast it is compared to other implementations. There is a project called Ruby Benchmark Suite out there for benchmarking Ruby implementations and the operations they do. Or you could benchmark RubySpec, or you could benchmark Rails.
There are lots of possibilities. Yes? So you're linking in the Ruby interpreter as well, such that if there's code you depend on that the compiler can't handle, you can still deal with it? Yes, it's just slow. I'm linking in the Ruby interpreter, the C code for the Ruby interpreter, with your executable. If you can guarantee that there aren't going to be any calls to eval and that you have good type coverage, you could potentially leave the interpreter out. Probably that's not a real good idea, but it might be an attractive option for some people. Was there another question over here? Yeah: why restrict your type induction to only come from tests? Why not have it come from other parts of the program, especially if you have a couple of programs you can do some sort of data analysis on? Well, that's a good question, and you don't need to. In fact, what you really need is just to exercise the code in some way, and you want that exercise to leave no persisting side effects at the end of the day. Tests fulfill all those conditions, but depending on your application there may be a variety of other things that can also be used. How about handling when dynamic code calls inward to the compiled code, say it sets an instance variable that does or doesn't exist in your class? Well, as far as calling code from the dynamic side to the static side, that would go through an rb_funcall, which would be slow. For the instance variables, basically each class will have to have a different implementation of instance_variable_get and instance_variable_set, which knows how to get to the statically allocated slots in the class structure. Yes? You mentioned that instead of having ivar tables, you're putting the variables directly in the structure, right? Yes. So what happens when you dynamically call instance_variable_set for a variable that isn't in that structure?
Well, there's going to be a custom version of instance_variable_set for each class. So when you call instance_variable_set, it will call that specific version rather than the generic version the interpreter has. I see people coming in, so I think that's the end of my time. Thank you all very much.