This is an hour talk, and we've got 25 minutes. OK, I'm going to talk about method handles. My name is Charles Nutter. Charlie, to my friends, which includes all of you. I've been at Red Hat for six years now, and I've been working on JRuby and other JVM language stuff for 10 or 11 years. If you want to see the full version of this talk, the first time I gave it we were out in Hawaii with Chris Thalinger and a bunch of other cool folks. If you want to come next year, we're going to do it, right? Absolutely. LavaOne in Hawaii next year. January, great time to go there. But the whole talk is available. Yep, come on out.
OK, so what are method handles? Here's the agenda for what we're going to talk about. We're going to talk about what method handles are, why we need them, and what's new in Java 9, with a little preview of some of that stuff. I'm going to talk about a library called InvokeBinder, which I've covered in a longer form here in the past, but I'll give a little primer just so you understand some of the code. And then maybe we'll try and do something crazy with method handles.
OK, so what's the deal with method handles? Well, when the JRuby team first came on at Sun in 2006, there was this JSR that had been languishing for a while: JSR 292 for invokedynamic, a new way of doing invocation on the JVM. Around this time we had Ruby and Rails. JRuby started to be able to do some cool stuff. We could run Rails on top of the JVM. So there was a lot of interest developing in having more languages, and especially dynamic languages, on top of the JVM. But we needed something better to make it easier to optimize and compile these things and get the JVM involved in running these languages.
So what is it that we needed, at least on JRuby, Groovy, the dynamic language side of things? Well, we need to be able to call methods dynamically, obviously.
We need to be able to look them up when they're called, based on name or on name and types, and potentially across very different class structures. Both Groovy and JRuby have a metaclass structure that's essentially a big hash, and the way that you define new methods is you stick methods into that hash under some name. It's not like a typical Java class, so we needed a different way of doing lookup and dispatch.
We need to be able to dynamically assign fields and constants. In Ruby, you don't declare a whole bunch of fields ahead of time in your object or in your types. When you assign a name, that just becomes a new slot in the object. So we need to be able to grow object shapes and access those fields similar to how we would in Java, but going through this other indirection mechanism.
And we needed this all to be fast, of course. We could emulate all of these things with reflection and other tricks, but that wasn't really tying well into the JVM. It didn't give us the performance that we wanted out of a JVM language. And so we revisited JSR 292 and came up with the invokedynamic bytecode, which is what allows us to hook into the JVM machinery. And the larger part of what came out of the invokedynamic JSR is the method handle API.
So the method handle API is sort of like reflection. You get a Lookup object. The Lookup object allows you to go and get a handle, or a pointer, to a function or a field or an array element and those sorts of things. Rather than just passing in a set of parameters like we do with reflection, we can pass around these little boxes called MethodType, which encapsulate a return type and all of the argument types that go with it. And then all of these methods, fields, and arrays are basically just direct handles. So we get a pointer to a function and we can call that directly without a lot of the plumbing that reflection normally has.
That plumbing is type checking, type conversions, argument boxing and unboxing, and so on. We get much closer to a real function pointer, which allows us to do a better job of optimizing a dynamic language. And the MethodHandles class then provides all sorts of additional adaptations. So if we want to add back some of that type checking or argument modification, like reflection does, we can do it on a piecemeal basis: insert arguments, move them around, change their types, and so on.
And all of this basically forms a tree of method handles. At the top you've got the invoke coming in, which will maybe move some arguments around, maybe convert ints to longs or doubles to floats, and then continue down until it gets to a direct handle: a call to a function, or a write to or read from a field. That's what we have as a method handle tree, and we can pass these around as a callable function object.
So now, why don't we want to use reflection? Like I said, the use cases for reflection are very similar, but it does a lot more than we need in many cases. If we have all of the correct types and the right number of arguments, and we know we just want to call a function through this handle, we don't need it to be doing all these extra checks and all of these different argument manipulations. Method handles let us have a direct pointer to a function and then only add back in the ceremony that we really need.
So let's walk through what this actually looks like in some code. It's conceptually hard to just talk it through. I mentioned the Lookup object, and there are two different ways you can get one. The first lookup here is a special lookup. It can see anything and have access to anything that you'd be able to see at that point in the code.
So it can see private fields that may be in scope for this piece of code, rather than with reflection, where you have to go and get a method or a private field, then try setAccessible, and then pass a bunch of Java 9 flags to make it let you do all of this stuff. You can actually hand out this access, or keep this access for your own program. More often, though, if you're just using this as a sort of reflective thing to call into Java methods, you'll use a public lookup, which is what you'd expect: it sees only what's public to someone completely outside of that class and package.
So we have our Lookup objects. Now we can use our method types to look up methods. We have three different lookups here. We're going to get getProperty off of the System class. That's a findStatic, and you can see the method type: it returns a String and takes a String argument. We look it up with that signature. Here is add on a List. This is a virtual lookup, so we look up the virtual method and now we have a handle to add on a List. And then we can also do constructors. We can look up constructors and construct objects, and they all basically just end up as method handles that we can invoke with various parameters and signatures.
Looking up fields is very similar, so we can grab the System.out field off of the System class. Then we have a method handle that, every time we invoke it, will go and get System.out, just like a java.lang.reflect.Field. And then we can take all of these and combine them together as well. So here we have our getProperty, our add, our list. Down at the bottom we also have our findStaticGetter, and we can pull these together into a tree of handles, essentially.
Then finally, once we've got all of our handles together, we can invoke them just like we would with a reflected object. Here we're using the getProperty handle, so we get java.home, and now we can access it this way.
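The slides are not part of this transcript. As a minimal sketch of the lookups just described (the class name LookupDemo and its helper methods are illustrative assumptions, not taken from the slides), the same four kinds of handle can be obtained and invoked with the standard java.lang.invoke API:

```java
import static java.lang.invoke.MethodType.methodType;

import java.io.PrintStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: LookupDemo and its method names are assumptions.
final class LookupDemo {
    // Static lookup: System.getProperty(String) returning String.
    static String property(String name) {
        try {
            MethodHandle getProperty = MethodHandles.publicLookup().findStatic(
                    System.class, "getProperty",
                    methodType(String.class, String.class)); // return type first, then arguments
            return (String) getProperty.invokeExact(name);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Constructor lookup (new ArrayList()) plus virtual lookup (List.add(Object)).
    static List<?> listOf(Object element) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle newList = lookup.findConstructor(
                    ArrayList.class, methodType(void.class));
            MethodHandle add = lookup.findVirtual(
                    List.class, "add", methodType(boolean.class, Object.class));
            Object list = newList.invoke();
            // The receiver is passed as the first argument of a virtual handle.
            boolean added = (boolean) add.invoke(list, element);
            return (List<?>) list;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Field lookup: a getter handle that reads System.out each time it is invoked.
    static PrintStream systemOut() {
        try {
            MethodHandle getOut = MethodHandles.publicLookup()
                    .findStaticGetter(System.class, "out", PrintStream.class);
            return (PrintStream) getOut.invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

Calling LookupDemo.property("java.home") goes through the handle rather than through java.lang.reflect, and invokeExact requires the call site's static types to match the handle's MethodType exactly, which is why the results are cast.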
Okay, so I mentioned combining method handles, and I didn't talk about that in those slides. Let's take a look at some of the adaptations we can do. When we have two different handles, like getting System.out and calling println on it, we'd like to be able to combine those. And there are many different handle adaptations that let us combine them into a tree of operations. We call these method handle combinators. These combinators allow us to take many different methods, many different fields, some conditional logic, loops, and so on, and still create just a single callable object that has all of this additional logic wrapped around it.
And the cool thing about this is that the JIT still sees through it. It's as if we wrote these loops or these conditions or these calls in regular Java, but we're building it up programmatically using method handles. The JVM then takes that, optimizes it, and turns it into the same assembly code that you'd have if you just wrote Java code directly.
So, to give you an example of how this looks, one of these combinators would be an if-then-else. Here is our pseudo-Java code. We're just caching results from a database in a little hash. So we've got our hash. We check to see if we've already hit that particular key. If we have, we can return it. Otherwise we go to the database, load some data out of it, and stick it into the cache. That's the simple thing that we'd like to represent as a set of method handles.
The magic call that we do here in the method handle API is guardWithTest, which is essentially an if-then-else. So let's take that. We've got our condition, our then, and our else. For our condition in this case, we findVirtual on Map for the containsKey method, which returns a boolean and accepts an Object. And then we're going to bind it to the cache. That means the first argument, the argument where we actually have the physical cache object, is now taken care of for us.
We don't have to pass that in like we would with reflection, doing an invoke on an instance. So we bind that to the cache. Then we have our then case, when we find the key. Now we will actually just go get the value out of the map. Again, we'll bind that to the cache, so this will pull that element out of the cache and return it to us. And then we have our else. Our else is to pull it from the database and cache it in this hash object. Again, we can bind it and pull all of these together with our guardWithTest. Now we have a single method handle, a single callable object, that does all of this conditional logic and then optimizes itself internally. Pretty cool.
So there are lots of other adaptations. Obviously, I don't have time to get to all of them here, but we can insert arguments. We can drop arguments, or reorder them, depending on how we're adapting different APIs together. We can filter the arguments or the return value: for each argument, pass it through some function, take the result, and make that the new argument at that point. We can also fold all of the arguments together: pass them to a function, calculate something, and then the result becomes a new argument. So all along the way we're moving arguments around, making calls, accessing fields, and essentially building a tree of logic out of all of our method handles.
So what's new with Java 9? When we first came up with the method handle API, we had a fairly complete set of operations, but a few things turned out to be missing that we needed programmatically as part of the API. One of the biggest ones that I was worried about was try/finally. I'll show you in a minute why that was such a problem. We also had no way to do loops. If you're going to be representing essentially an entire expression tree, it'd be nice if you could actually do some controlled looping over elements of an array, or over input arguments, and whatnot.
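The cache example walked through above might be sketched as follows. This is a reconstruction under stated assumptions, not the code from the slides: the class CacheDemo, the loadAndCache helper standing in for the database call, and the loads counter are all hypothetical. Only findVirtual, findStatic, bindTo, and guardWithTest are the real API.

```java
import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: names here are assumptions made for the sketch.
final class CacheDemo {
    static final Map<Object, Object> CACHE = new HashMap<>();
    static int loads = 0; // counts trips to the pretend database

    // Hypothetical stand-in for "go to the database, then cache the row".
    static Object loadAndCache(Map<Object, Object> cache, Object key) {
        loads++;
        Object value = "row-for-" + key;
        cache.put(key, value);
        return value;
    }

    // Builds a single (Object)Object handle: if cached, return it, else load and cache.
    static MethodHandle cachedLookup() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Condition: CACHE.containsKey(key); bindTo fixes the receiver.
        MethodHandle test = lookup.findVirtual(
                Map.class, "containsKey", methodType(boolean.class, Object.class))
                .bindTo(CACHE);
        // Then: CACHE.get(key).
        MethodHandle then = lookup.findVirtual(
                Map.class, "get", methodType(Object.class, Object.class))
                .bindTo(CACHE);
        // Else: loadAndCache(CACHE, key).
        MethodHandle otherwise = lookup.findStatic(
                CacheDemo.class, "loadAndCache",
                methodType(Object.class, Map.class, Object.class))
                .bindTo(CACHE);
        return MethodHandles.guardWithTest(test, then, otherwise);
    }

    static Object fetch(Object key) {
        try {
            return (Object) cachedLookup().invokeExact(key);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

All three handles end up with the same (Object)Object or (Object)boolean shape after binding, which is what guardWithTest requires. In real code the combined handle would be built once and reused rather than rebuilt on every fetch.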
We didn't have any really good way to do volatile or atomic accesses to fields. I showed you that you can get a method handle to a field, but it's just going to access it the way that you would if you were accessing it from Java. There's no way to say, oh, I want to do a weak volatile write or a non-volatile write to a field. It's always just one particular kind of access, so we needed better ways to do that. And there was no way to just tell it to construct an array. You would have to have a utility method somewhere that you would call into to construct arrays for you. Obviously, this is Java; we need to be able to create arrays as part of this whole expression tree.
Okay, so I mentioned try/finally. A lot of folks may not know this: when you have a try/finally in Java code, javac will actually duplicate that finally block along every exit path from that block of code. So there will be the finally logic that gets stuck on the end of the non-exceptional path, and the finally logic that's handled on the exceptional path. And if we wanted to do this with method handles, we essentially had to do the same thing. We had to take that handle and duplicate it, making the tree that much more complicated, for the normal path and for the exceptional path.
And so something that looks really simple, like this, where we essentially wanted to wrap a target method handle with some exception handling and then pass it off to a handler if there was an exception, turned into code like this just to do a try/finally. And this is actually greatly simplified from what it really is, because I'm using my InvokeBinder library, which simplifies and wraps a lot of the method handle API. But this is obviously not what we want. This is way too complicated. It worked, but we've got so many handles and so many adaptations that the JVM had trouble seeing through it. It didn't turn it into the try/finally logic we really wanted.
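For comparison, Java 9 added MethodHandles.tryFinally, which takes the target and a single cleanup handle. The following is a minimal sketch of that call; the class TryFinallyDemo, its shout and cleanup methods, and the LOG list are hypothetical names chosen for illustration. The cleanup handle receives the thrown exception (or null), the target's result, and the target's arguments.

```java
import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names here are assumptions made for the sketch.
final class TryFinallyDemo {
    static final List<String> LOG = new ArrayList<>();

    // The "try" body: fails on empty input.
    static String shout(String s) {
        if (s.isEmpty()) throw new IllegalArgumentException("empty");
        return s.toUpperCase();
    }

    // The "finally" body. thrown is null on the normal path; result is null
    // on the exceptional path. Its return value is used only on the normal path.
    static String cleanup(Throwable thrown, String result, String arg) {
        LOG.add(thrown == null ? "ok:" + arg : "failed:" + arg);
        return result;
    }

    static MethodHandle guarded() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle target = lookup.findStatic(
                TryFinallyDemo.class, "shout", methodType(String.class, String.class));
        MethodHandle cleanup = lookup.findStatic(
                TryFinallyDemo.class, "cleanup",
                methodType(String.class, Throwable.class, String.class, String.class));
        // One combinator call; no manual duplication of the cleanup on each exit path.
        return MethodHandles.tryFinally(target, cleanup);
    }

    static String call(String s) {
        try {
            return (String) guarded().invokeExact(s);
        } catch (RuntimeException e) {
            throw e; // rethrown by tryFinally after cleanup has run
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The cleanup runs on both paths, and on the exceptional path the original exception is rethrown once cleanup returns.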
And so in Java 9 now we can just do this: say tryFinally and put it in the right places. We also didn't have loops, which were added. So here we have simple while loops, or a do-while loop, a counted loop, or an iterated loop that has an iterator and just walks through a collection, for example. And then the general case, MethodHandles.loop, has many different ways that you can configure it to handle other types of loops. These are all just sort of sub-versions of that.
All right, VarHandles were also added in Java 9. VarHandles gave us the ability to have that handle to a field on an object, or to a static field, but with much more powerful mechanisms for doing volatile access and atomic access. We don't have to cheat by going all the way to Unsafe to do a compare-and-swap of a field, or deal with indirecting through an AtomicReference object. We can just do it directly with a VarHandle now. And so one of the many ways that we cheat with Unsafe can now go away.
It also gave us something else that was pretty cool. We can have views on a ByteBuffer or a byte array. For example, the last talk had an IntBuffer that it wrapped around a ByteBuffer, right? If you look at the code in IntBuffer, it actually just reads bytes and then concatenates them together and turns them into an int. It's not actually pulling int after int off of that line, which means you're doing four or eight times as much work reading individual bytes and sticking them together. Now we can actually say, treat this byte array as if it were an int array or a long array, and it will read it in int or long chunks. Saves a lot of time.
What does this look like in code? Here we're going to use our System.out again. So we get our static VarHandle, and then we can do our accesses against that.
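A minimal sketch of the VarHandle features described here, assuming a hypothetical class VarHandleDemo with an int field named counter (neither is from the slides). It shows a static VarHandle on System.out, volatile and compare-and-set access on an instance field, a long view over a byte array, and conversion of one access mode into a method handle.

```java
import java.io.PrintStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Illustrative only: names here are assumptions made for the sketch.
final class VarHandleDemo {
    int counter; // plain field; the access mode is chosen per call

    static final VarHandle OUT;
    static final VarHandle COUNTER;
    // Treats a byte[] as if it held big-endian longs; the index is a byte offset.
    static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.BIG_ENDIAN);

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            OUT = lookup.findStaticVarHandle(System.class, "out", PrintStream.class);
            COUNTER = lookup.findVarHandle(VarHandleDemo.class, "counter", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static PrintStream out() {
        return (PrintStream) OUT.get();
    }

    // Volatile write, then an atomic compare-and-set, without Unsafe or AtomicInteger.
    static int bump() {
        VarHandleDemo demo = new VarHandleDemo();
        COUNTER.setVolatile(demo, 41);
        boolean swapped = COUNTER.compareAndSet(demo, 41, 42);
        return (int) COUNTER.getVolatile(demo);
    }

    // Reads eight bytes as one long instead of combining them byte by byte.
    static long readLong(byte[] bytes, int byteOffset) {
        return (long) LONG_VIEW.get(bytes, byteOffset);
    }

    // A VarHandle access mode can be turned into an ordinary method handle.
    static int readThroughMethodHandle(VarHandleDemo demo) {
        try {
            MethodHandle getter = COUNTER.toMethodHandle(VarHandle.AccessMode.GET_VOLATILE);
            return (int) getter.invokeExact(demo);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

Because System.out is a final field, only the read access modes are usable on OUT; the write and atomic modes apply to the non-final counter field.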
Here's our byteArrayViewVarHandle (long method name), but we can basically take a byte array and make it look like it's an int or a long array. So we can walk it with a 32-bit or 64-bit stride rather than walking it in 8-bit chunks. And being able to access the array elements there is, again, something from the VarHandle stuff.
Now, here's where we have the atomic access. This is just doing a setVolatile. All the different types of access in the memory model are available in the VarHandle API, including a few that normally are not even part of the Java language itself. They can be used in other ways. Here is our byte array view. So we read in a bunch of data off the line. We know that the data we're reading from this stream is all longs. Previously you would have to combine those things into longs byte by byte. Now you can say, okay, I'm going to turn this into a view that actually pulls longs off of that byte array all at once. And that's helpful. And of course this all fits into the method handle API. VarHandles can be converted into method handles and then mixed into the same tree and pulled in with all the other expression logic that you have.
Okay. So now a primer, a little short example of InvokeBinder. InvokeBinder is a library that I wrote to make it easier to work with the method handle API. There are some challenges in using the method handle API. For example, say I wanted to write this code, which is basically our running example of System.out. We're going to take an argument, concatenate it, and print it out. If I wanted to represent this whole expression, this whole function, as method handles, it ends up looking kind of like this. And it's difficult to read through, because we're sort of starting at the end points: there's the field, there's the concat method, there's the println.
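As a rough reconstruction of that running example with the plain java.lang.invoke API (the class HelloHandles and the "Hello, " prefix are assumptions for this sketch, not the slide's code), note how the endpoints are looked up first and the adaptations are wrapped around them afterward:

```java
import static java.lang.invoke.MethodType.methodType;

import java.io.PrintStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Illustrative only: roughly System.out.println("Hello, " + args[0]) as handles.
final class HelloHandles {
    // (String[])String: "Hello, " + args[0]
    static MethodHandle message() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Endpoint: String.concat, with the receiver bound to the greeting.
        MethodHandle hello = lookup.findVirtual(
                String.class, "concat", methodType(String.class, String.class))
                .bindTo("Hello, ");
        // (String[], int)String with the index fixed to 0 gives (String[])String.
        MethodHandle firstArg = MethodHandles.insertArguments(
                MethodHandles.arrayElementGetter(String[].class), 1, 0);
        return MethodHandles.filterReturnValue(firstArg, hello);
    }

    // (String[])void: prints the message to whatever System.out currently is.
    static MethodHandle printer() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Endpoints: the println method and the System.out field.
        MethodHandle println = lookup.findVirtual(
                PrintStream.class, "println", methodType(void.class, String.class));
        MethodHandle getOut = lookup.findStaticGetter(
                System.class, "out", PrintStream.class);
        // (PrintStream, String)void becomes (PrintStream, String[])void.
        MethodHandle printArgs = MethodHandles.filterArguments(println, 1, message());
        // Fold in the PrintStream so callers pass only the String[].
        return MethodHandles.foldArguments(printArgs, getOut);
    }

    static String greeting(String... args) {
        try {
            return (String) message().invokeExact(args);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    static void greet(String... args) {
        try {
            printer().invokeExact(args);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

Reading this top to bottom does not follow the order in which the operations run at call time, which is the readability problem being described.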
Okay, now down at the bottom we're going to filter and change some of these arguments around. And then eventually we can combine this together and invoke it. But it's very cumbersome to read through. So here's what I did in InvokeBinder. Rather than composing these method handles in reverse, where you have to grab a target method and then wrap adaptations around it step by step by step, you can start at the top and say, I want to make a method handle that takes some arguments in and returns this result. First thing I want to do is convert an argument. Second thing I want to do is call a method. Third thing I want to do is assign to an array. So we write it going forward rather than backward, which is the way the method handle API usually makes you do it.
Also, the method handle API is very repetitive. At each step of the way you need to be telling it exactly the types that you want to work with, even if they could be inferred from the previous handles. In InvokeBinder you carry those types forward, so it's always aware of what the current method type is and what the current argument types are. And there are actually some more advanced features that let you use named arguments and work with them directly that way.
So let's look at how that actually looks in an example. Here is our example again, just using the plain method handle API. That turns into this with InvokeBinder. Significantly better to read, and we can actually just walk through it. So, up here we're going to start. We know that this is going to take a String array, that's our arguments, and we have our void return. We don't care about that. Continuing on, we're going to filter that array by getting element zero and pulling it out. Now we're going to filter that string again, prepending hello onto it and concatenating. We're going to fold in the PrintStream, which we get from a static access of System.out. And then we're going to call println.
And the whole thing becomes one handle, one tree, one expression that we can call and pass around, and it does all of this logic in one little box. So that's much more advanced than what we can do with reflection, and InvokeBinder makes it considerably easier to read through.
Okay, so I said we're going to try and do something crazy with this. Now that we have loops as part of the method handle API, we've essentially created a Turing-complete IR, right? We can modify state. We can call functions. We can have conditions. We can do loops. We can represent a language. We can actually compile a language into method handles if we like. And so that's what I decided to try to do. I wrote a little compiler that basically takes JRuby's abstract syntax tree and then just walks down it like a standard visitor. For each syntactic element, I compose the right set of handles to represent that particular part of the code, and then it all just composes back out and you get a method handle that basically represents the script.
The top-level part is not terribly interesting to look at. A few things to note: I added this little loop thing because I wanted to test that it was compiling and JITing and optimizing correctly. There's not much more that's interesting here, so let's continue on. Like I said, it's a standard visitor. So we have our node, and then we walk down the node tree. Each node calls accept on our compiler, and then we get back into our logic.
Here are a couple of simple cases, two constant types, false and true. In this case, I'm just using the objects FALSE and TRUE from Boolean. So here's the logic. We have our state for this script that's being passed in. We have some return value from the script. We don't care about the state, since this is a constant, so we drop it and stick in our Boolean false, and the same thing with true here. Let's look at something else. We've got Fixnum here. So here is our Fixnum, which in Ruby is just a long, essentially.
And I'm using boxed Longs for this example just to make it simpler. But again, we don't care about the state for the script, so we drop that and return a constant.
What about local variables? Well, I said that we've got our state array that's basically holding all of our local variables. So to get a local variable out of our script state, we just provide the index and do an array get. Similarly, down here, we have our state for the script. We're going to do slightly more complicated logic, because the set into an array doesn't return a value. So we're going to capture the value we want to set, put it into the array, and also return it. I won't walk through all the logic here, but it's basically the same thing: doing an array set and assigning to one of the elements in our script state.
Here's an if. Not much to say, pretty easy. The only thing that's a little bit different is that the condition needs to return a boolean. So whatever the result of that subexpression is, we run some sort of truthy check on it. In the case of Ruby, anything that's not nil or false is considered true. In this script, I did something similar. And then we can put in our condition, our then, and our else. So we've represented an if expression from Ruby in method handles.
And then, of course, we finally have while loops. We compile our condition again. We compile our body. We check to see whether this is a while loop or a do-while loop. And then we take our script state. We invoke our filtered predicate, our filtered condition. We have our body, which just runs the code that's in the body. And then we can create our two different while loops.
Okay. Of course, there are obviously a lot of other types of expressions in Ruby. But now we're at the root. This is the top of the script. So of course, we just compile the whole body. This is where we get how many local variables we want to have. And we throw an array constructor in here.
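The node-by-node compilation described above can be sketched in miniature. This is not JRuby's compiler or the code from the slides: the class HandleCompilerSketch, its compile* methods, and the runtime helpers (truthy, store, lessThan, plus) are hypothetical, and the local-variable store uses a small helper method rather than a pure array-setter composition. Every compiled node is a handle of type (Object[])Object, taking the local-variable array and returning a value.

```java
import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustrative only: a toy "AST to method handles" compiler, not JRuby's.
final class HandleCompilerSketch {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    private static final MethodType NODE = methodType(Object.class, Object[].class);

    // Hypothetical runtime helpers called from the handle tree.
    static boolean truthy(Object o) { return o != null && !Boolean.FALSE.equals(o); }
    static Object store(Object value, Object[] state, int index) { state[index] = value; return value; }
    static Object lessThan(Object a, Object b) { return (Long) a < (Long) b; }
    static Object plus(Object a, Object b) { return (Long) a + (Long) b; }

    private static MethodHandle helper(String name, Class<?> ret, Class<?>... args) {
        try {
            return LOOKUP.findStatic(HandleCompilerSketch.class, name, methodType(ret, args));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static MethodHandle discardResult(MethodHandle node) {
        return node.asType(node.type().changeReturnType(void.class));
    }

    // Constant: ignore the state array and return the value.
    static MethodHandle compileConstant(Object value) {
        return MethodHandles.dropArguments(
                MethodHandles.constant(Object.class, value), 0, Object[].class);
    }

    // Local read: state[index].
    static MethodHandle compileGetLocal(int index) {
        return MethodHandles.insertArguments(
                MethodHandles.arrayElementGetter(Object[].class), 1, index);
    }

    // Local write: evaluate the value node, store it, and return it.
    static MethodHandle compileSetLocal(int index, MethodHandle value) {
        MethodHandle store = MethodHandles.insertArguments(
                helper("store", Object.class, Object.class, Object[].class, int.class), 2, index);
        return MethodHandles.foldArguments(store, value);
    }

    // Binary call: evaluate both operand nodes against the same state array.
    static MethodHandle compileCall(String op, MethodHandle left, MethodHandle right) {
        MethodHandle target = MethodHandles.filterArguments(
                helper(op, Object.class, Object.class, Object.class), 0, left, right);
        return MethodHandles.permuteArguments(target, NODE, 0, 0);
    }

    private static MethodHandle condition(MethodHandle node) {
        return MethodHandles.filterReturnValue(node, helper("truthy", boolean.class, Object.class));
    }

    static MethodHandle compileIf(MethodHandle cond, MethodHandle then, MethodHandle otherwise) {
        return MethodHandles.guardWithTest(condition(cond), then, otherwise);
    }

    // While loop (Java 9 whileLoop); the loop expression itself evaluates to null.
    static MethodHandle compileWhile(MethodHandle cond, MethodHandle body) {
        MethodHandle loop = MethodHandles.whileLoop(null, condition(cond), discardResult(body));
        return MethodHandles.filterReturnValue(loop, MethodHandles.constant(Object.class, null));
    }

    // Sequence: run first for its side effects, then return the value of second.
    static MethodHandle compileSequence(MethodHandle first, MethodHandle second) {
        return MethodHandles.foldArguments(second, discardResult(first));
    }

    // Root: allocate the local-variable array (Java 9 arrayConstructor), then run the body.
    static MethodHandle compileRoot(int localCount, MethodHandle body) {
        MethodHandle newState = MethodHandles.insertArguments(
                MethodHandles.arrayConstructor(Object[].class), 0, localCount);
        return MethodHandles.collectArguments(body, 0, newState);
    }

    // Equivalent of: i = 0; while (i < limit) i = i + 1; i
    static Object runCountingScript(long limit) {
        MethodHandle script = compileRoot(1, compileSequence(
                compileSetLocal(0, compileConstant(0L)),
                compileSequence(
                        compileWhile(
                                compileCall("lessThan", compileGetLocal(0), compileConstant(limit)),
                                compileSetLocal(0, compileCall("plus", compileGetLocal(0), compileConstant(1L)))),
                        compileGetLocal(0))));
        try {
            return script.invoke();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The resulting script handle takes no arguments, generates no bytecode of its own, and is built entirely from combinators, which is the property the talk goes on to measure.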
So we construct our array of local variables. And then that gets passed down through the method handles, all the way into the rest of the script.
So does all this work? Well, I wouldn't be here if it didn't work. Of course it works. Here is our AST. We've run a simple loop that just iterates a value. And I've set it up so that whatever the last value of the script is will be printed out. And there we go. There's our result.
But does it work well? Of course, we could have written an interpreter in just regular Java code. That would not be interesting for this. Does it actually optimize? Does it actually compile like we want it to? And I'm thrilled to say that it actually does. This is assembly code that has now been generated from our compiler. But we have not generated any bytecode. We have not loaded any Java classes. All we've done is build up a method handle tree and execute it. That's pretty cool.
All right, so now a little bit more about results. The compiled code, this compiled method handle tree, actually does end up turning into one piece of native code. And that's great. That's what we were looking for. It takes a while to get there. Normally, the C2 threshold for doing the optimized compile is 10,000 iterations. I had to run this for at least 100,000 to get the stuff to come out. There's a lot of other plumbing in the method handle API that gets in the way of the JIT. But it does eventually get there.
It may be a toy, but it's kind of an interesting toy. Couldn't we do something more practical with this? And now this is a to-be-determined sort of thing. What about the Streams API? In the Streams API in Java, you've got collect, map, utility functions that you call. You pass in a function, and map will call your little lambda for each element and replace the results in the list. Unfortunately, everybody in the world is calling into that same map function.
The JVM has no way to see through and optimize your map versus somebody else's map. So none of it inlines, and none of it optimizes the right way. Well, what if we implemented Streams as handles? I basically just did that here. It's not difficult to do. We can take every single operation that is too generic, too difficult to specialize in the Streams API, and turn it into a tree of handles. Then at each place we call it, it will inline and specialize into our local piece of code, and have the target function, that lambda that we want, inline all the way back. And that's what I'm going to start playing with after I'm done with this talk.
So, anybody want to play with this? InvokeBinder is out there. It's headius/invokebinder on my account. You'll be able to see all this online on the slides. And if anybody is interested in playing around with the Streams thing, send me an email. Let me know. I think there's a lot more we could be doing with method handles than we're doing today. Thank you.