So I'm Tom. This is Charlie. We've been working on JRuby for a long time now. Before we start, how many people have had exposure to JRuby in some way? Alright, most of you, good, like eighty-three point five percent. For those people who didn't raise their hand, I'm just going to go over a quick overview.

JRuby is just another Ruby implementation. We try to be as compatible as we can with CRuby, and we actually support these three versions. Of course JRuby is built on top of the Java platform, so we get all the benefits Java has: we don't have to write our own garbage collectors, and HotSpot makes our code run very quickly, which we'll see in the next slide. The most important thing to notice here is that Java has native threads, and so does JRuby, so there's no global interpreter lock. There are a couple of good talks on that later today: Jerry D'Antonio is going to talk about how the GIL isn't your savior and how you can do a lot more with real concurrent threads; that's at 1:15. And then Petr Chalupa is going to talk about the concurrent-ruby library, a good set of concurrency primitives and tools that work across all the different Ruby implementations; that's at 4:20 in this room. So if you're interested in concurrency at all, those are two great talks to check out.

I had to include this; it's one of my favorite performance graphs for JRuby. We've got a benchmark of a red-black tree library. The top bar is CRuby (MRI) running a pure-Ruby red-black tree implementation, taking about two and a half seconds to run the benchmark; it creates a bunch of nodes, traverses them, deletes them, and does that over and over again. You can see why we often have to turn to C extensions on CRuby. The second bar down is CRuby with a C extension, which certainly gets a lot of performance improvement, now taking only about half a second. At the bottom, though, which is pretty cool, is JRuby running that same nicely written pure-Ruby red-black tree library. JRuby is able to optimize it; the JVM can do a lot. JRuby running the pure-Ruby red-black tree actually performs faster than CRuby with the C extension here, and that's all because of the magic of the JVM: awesome garbage collectors, awesome optimizations.

There are also a lot of Java libraries out there. If there's a Ruby gem that isn't cutting it for you, say you're doing something with Prawn and you want to do something Prawn can't do, you can just go over to the Java world and use iText. Compare: there are about 7,000 libraries like that on RubyGems versus 47,000 libraries in Maven. There's a lot of stuff out there; for just about anything you need, there's a JVM library. 47 is greater than 7.

It's really easy to call into other languages. Java's highlighted here, and it's very easy to call Java with Ruby syntax, but you can call into any language that's on the Java platform, like Clojure or COBOL.
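Here's a minimal sketch of what that interop looks like from JRuby; the class and strings are arbitrary examples:

```ruby
# Calling into the JDK from JRuby; run with `jruby interop.rb`.
require 'java'

list = java.util.ArrayList.new    # reach any class on the classpath by package path
list.add('hello from the JVM')
puts list.get(0)                  # => hello from the JVM

# JRuby also aliases camelCase Java methods to snake_case:
puts java.lang.System.current_time_millis
```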
So here are the two supported branches we have. On master is JRuby 9000, which is what we're going to be talking about, and then we still have a maintenance branch for JRuby 1.7. We'll probably continue maintaining 1.7 for another six months to a year; as long as people actually need 1.9 support is going to be the answer, I think.

JRuby 1.7 was a very interesting release for us because you can pick which compatibility level you want: you can run in either 1.8 or 1.9 mode with a flag. This ended up being a horrible idea for us, because we had to maintain two runtimes in the same code base, and it just didn't work out so well. So for JRuby 9000 we're only going to support the latest version of Ruby, and we're going to track the latest version of CRuby. That's 2.2 right now; it will become 2.3. Now that 2.3 preview 1 is out, we're going to start putting those features in, and hopefully within a month or two after MRI 2.3 is out we'll have JRuby with 2.3 support. So last Friday, before getting on the plane, 9.0.4.0 came out, and next week when we get back, 1.7.23 will be out. We're very conference-driven here.

So JRuby 9000; these are the super-high-level bullet points. We already said we're tracking CRuby. We have a brand-new runtime; we've been working on it for years, and most of this talk will be about it. We're now bypassing Java for IO; it's mostly just native calls. We can still fall back to Java, but this gives us better performance and, more importantly, it lets us do compatibility things we couldn't do with the pure-Java solution. We're probably the most POSIX-friendly JVM language at this point. And Oniguruma's transcoding facilities have been completely ported, and we have no more encoding bugs, I promise.

A few people might be wondering why we would pick 9000 as a version number, and it's totally because of Dragon Ball. That's all. No, it started as a joke: we were going to say JRuby 2, and that was about the same time Ruby 2.0 was coming out, which would have been confusing as hell. We couldn't come up with a better number, and it just stuck. Charlie's even wearing the shirt today. That's right: it's over 9,000. The funny thing is, 9000 started out as just a code name for the release, but then we went back and looked, and we had eight previous major releases, 1.0 through 1.7. So it turns out this is the ninth major release of JRuby, and 9.0 is our version number. You could say we're doing the Java numbering scheme, since they went from 1.4 to Java 5.

So now what? Well, that's the title of the talk. We do tons of compatibility work; we probably spend more time on compatibility than anything, but no one wants to hear about how we fixed compatibility bugs, so we're going to talk about performance. There are a few recent things we've done to improve performance, stuff we've wanted to do for years but that was really hard with the old runtime and is made a lot easier by the new runtime work. We'll go over these quickly.
So the first one: up through JRuby 1.7, when we would compile code to JVM bytecode at runtime, we only did it on method boundaries. If a method got called 50 times or more, we would turn it into JVM bytecode and you'd get good performance out of it. The problem is that there's a lot of code out there that just has freestanding procs or lambdas. If you have a table of procs that you're using for a bunch of calls, or if you're using define_method, for example, those would never JIT, so they'd stay in our interpreter and run slow, generally slower than MRI. That was something we needed to fix, and I think it was actually in 9.0.3 that block jitting came out.

I've got 9.0.4 on this graph. Here we show MRI; the blue bars are the performance of a normal method, a regular def method. The other bar is define_method, which uses a block and has some block overhead. You can see that on JRuby 9.0.1, both cases were actually considerably slower than MRI, because the benchmark had a bunch of blocks in it and those didn't JIT; not only were we not jitting the define_method methods, which are really slow, we weren't even jitting the benchmark itself. Now that we can JIT on block boundaries as well as method boundaries, the performance of both is much more where we'd want to see it: definitely faster than MRI for regular method definitions, and a little bit faster for define_method.

The next thing we wanted to tackle was getting define_method methods to perform a lot better than they do; they should be closer to a regular method. So here's an example of two different define_method calls. The first one we would consider simple, because it's a non-capturing define_method: it doesn't use any state from the surrounding scope, it's just simple metaprogramming to create a method out of a block. In the second one, we're iterating over some names, defining a method for each one, so the name actually gets captured. It uses a bit of state from the surrounding scope, which is a little more complicated. Those are the two basic cases we see for define_method, and we wanted to be able to optimize them better.

Here's a comparison of the performance before the optimizations. You can see that in CRuby, define_method methods perform about half as well as a regular method definition, due to that extra closure overhead, the extra state that needs to be managed, and various other reasons. In JRuby we were only slightly better than CRuby, because we had the same sort of overhead to deal with for define_method: a little better, but certainly not close to a full-on method definition.

So the strategy for optimizing these: in the case of a non-capturing define_method, we can just treat it as a plain method in our compiler. We do some inspection to make sure it doesn't access any surrounding state, and then rather than compiling it as a block we compile it as a method, and it should be close to, or exactly, the same performance as a normal method. As future work, if we see that we are capturing some state from the outer scope but can tell it's only ever read within the define_method, we'll lift those values out as constants and be able to optimize those cases as if they're regular methods too. That's future work; we don't have it yet.
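To recap the two shapes in Ruby, here's a sketch; the class and method names are made up for illustration:

```ruby
class Widget
  # Non-capturing ("simple"): the block touches nothing in the surrounding
  # scope, so it can be compiled as if it were a plain `def`.
  define_method(:default_size) do
    42
  end

  # Capturing: each block closes over `name` from the enclosing loop, so
  # closure state has to be carried along with the method.
  [:width, :height].each do |name|
    define_method(name) { instance_variable_get(:"@#{name}") }
  end
end
```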
So here are the early results; this is what we had in the 9.0.3 and 9.0.4 releases. Over there on the far right is 9.0.4. You can see the performance of a regular method definition. The performance of a simple, non-capturing define_method is now about twice what it was before, and it really should be almost where regular methods are; there's probably some tuning and tweaking we still need to do. So yes, for all the define_method methods out there, within a few releases we should be able to get them up to method-level performance.

The other big one: half the time when someone comes to us and says JRuby is incredibly slow at some benchmark and we don't know why, it's because exceptions are being raised all over the place. Backtrace cost and exception cost are high on CRuby, not free for sure, but on the JVM they're way, way more expensive. Building a backtrace requires piecing together all sorts of inlined frames and other method structures, going back and forth between the Java interpreter and the compiled code, so it takes a lot longer than it does on CRuby, and as a result this is a major pain for us. It's especially frustrating because exceptions are frequently ignored: they're just caught and then you continue on with some other piece of code, or they're used as flow control, to unwind the stack back to some previous point. So we figured out how we could optimize this a little: if the exception is ignored, if you never look at it and never use the backtrace, why don't we just not generate one?

So here's the common pattern, or the common anti-pattern, that everyone loves to use: if foo raises an exception here, we can do our little postfix rescue and just return some simple value or make some other simple call. If it's a StandardError, it falls in there, and in this case the exception is obviously never going to be seen by anyone; there's no reason we need access to it. So we can try to optimize it away. The strategy: we inspect the rescue block that goes with a chunk of code, and if the contents of the rescue block is just a simple expression, a local variable, a constant value, a nil, something like that, we set a thread-local flag that says we don't need a backtrace for any exception that might be raised down-stack. Then, when we get to the point of raising the exception and building the stack trace, we check the flag, and we can omit all of the work of generating the backtrace.

Cases like this are contrived; we hate when people use `rescue nil`, but we do understand how helpful it is. There are also simpler cases that are actually practical uses of this. These are some of the value converters in csv.rb: if you turn on converters, it basically runs each of them in turn, and if one raises an exception, it just returns the old value and proceeds to the next converter. Psych had some code like this for a while too, where it would just let a conversion fail so it knew to try the next conversion in the chain. So MRI could make this same change, and it would be great for them too.
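Here's a sketch of the shapes in question; the converter table is paraphrased, not the actual csv.rb source, and `field` stands in for one CSV cell:

```ruby
# "Ignored exception" shapes that 9.0.4 can optimize: the rescue body is a
# trivial expression, so no backtrace ever needs to be built.
field = "42"
num = Integer(field) rescue field      # the beloved postfix-rescue anti-pattern

# Roughly the shape of csv.rb's value converters: each lambda attempts a
# conversion and hands back the unconverted field on failure.
CONVERTERS = {
  integer: ->(f) { Integer(f) rescue f },
  float:   ->(f) { Float(f)   rescue f },
}
```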
This is obviously horrible for us: if you had a value that didn't convert until the last converter, or maybe didn't convert with any of the converters at all, we'd be throwing maybe four or five exceptions, building entire stack traces for them, and then throwing them away. Our performance on CSV was terrible as a result. But the good news is, this is the improvement we got. This is based on just doing a simple rescue, measuring the rescue itself; csv.rb itself got something like five times faster from the work that we did. Raising a simple exception, and, oh, you want this one, there you go: now on a logarithmic scale so you can actually see it. Nearly two orders of magnitude faster for this pattern on JRuby, and this is in 9.0.4, so it works great now.

I do need to mention that there are cases where we can't make this optimization. The typical non-blocking read case raises an EAGAIN, and because we're doing a select call here, we can't necessarily see through it; we don't know whether the exception is going to be used. We do have some strategies we want to put in place, like if we can see that it's an internal method that doesn't use the exception, we might still be able to set the flag. But for cases like this, currently we just don't generate a backtrace for EAGAIN at all, because it's not really an exceptional case; it's an expected result of this call. And in Ruby 2.1 and higher you can actually pass a flag here that says don't raise the exception at all, just return a special value, and if you get that value back it means the non-blocking read didn't work. So both JRuby and MRI are trying to work around the cost of exceptions and make it easier to work with them without taking the performance hit.
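A sketch of that pattern; `sock` stands in for any IO being used non-blockingly:

```ruby
# The non-blocking read whose raise can't be elided by the rescue analysis.
begin
  data = sock.read_nonblock(4096)
rescue IO::WaitReadable          # EAGAIN surfaces here as an expected result
  IO.select([sock])
  retry
end

# The Ruby 2.1+ exception-free variant: returns :wait_readable instead of
# raising (and nil at end-of-file), so there's no raise to pay for at all.
data = sock.read_nonblock(4096, exception: false)
IO.select([sock]) if data == :wait_readable
```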
All right, on to Tom.

So I said we had a new runtime. It's called IR, for internal representation: the most boring name for a runtime ever. In JRuby 1.7 and earlier we did everything with the abstract syntax tree: we'd parse your Ruby into a tree, the interpreter would just bounce around that syntax tree, and the JIT would bounce around the syntax tree and generate bytecode. We wanted to get away from that. We wanted to work with Ruby's semantics directly, and we wanted a traditional compiler design: anyone who's taken a compilers course or read the dragon book should be able to look at our code and contribute to it. And none of us ever wants to write another runtime again, so this is our last one. Yes, fingers crossed.

In the top dashed line, this is what 1.7 looks like today: we parse and create that syntax tree. In 9000 we have these extra phases. In semantic analysis we translate the tree into a set of instructions and create supplementary data structures like a control flow graph. After that we go into an optimization phase and run a series of compiler passes, which modify those data structures. Then we interpret those virtual machine instructions, and after they've been running for a while we decide to generate Java bytecode, and HotSpot goes crazy on it and makes it fast. We don't actually support Dalvik generation, but we could; I refuse to remove it from the diagram until we support it.

So here's our first look at instructions. I cleaned this up, because the actual output of IR is not for the faint of heart. We'll just look at a couple of the instructions. At the top we have check_arity: we have to make sure there are two required arguments. On lines one and two we're binding the parameters a and b to the zeroth and first values passed in. On line three we have special variables for receiving things like blocks, if a block is passed in. Look at line five: we have a line-number instruction; if we happen to raise, that's how we know what line the error occurred on in the Ruby source. And on line eight we're calling the + method on the receiver a with the argument c, so a + c. It's pretty simple to read.

Also in semantic analysis we do a lot of transformations. This one I really like. Everyone knows what zsuper is, right? If you don't put the parentheses, then super passes along all the parameters you received: the magic super that just forwards everything. I always thought this was crazy, but now in IR we just do a simple transformation and convert it to a regular super, whereas in 1.7 and earlier we had to maintain extra state and look it up whenever we ran into a zsuper in the syntax tree. We ended up dramatically simplifying how we handle this; in fact, there's no special code now other than this one small change when we first build our instructions.

We have pluggable compiler passes; dead code elimination and constant propagation are two of the popular ones, so I'll go through a quick example of those. We'll go back to the same snippet of code. The first thing we can see is that b isn't used, and neither is that special block variable, since we're not doing anything with blocks: boom, gone. The next thing we notice is that c is only used once, so we can propagate it as a constant value. (That's another thing I refuse to fix in the slides, from when I was first learning how to use the animations.) At that point c is no longer needed either, and finally, nothing can happen between these two line-number instructions, so we can get rid of one of them as well. So we were able to get rid of half the instructions, and this is just static analysis. We'll get into some cooler stuff later, but we can do a lot of cleanup in the code just by having some intelligent compiler passes. That's right, and everything we do today is done with static analysis, and it's conservative: no optimization we do can fail or has to deoptimize.

So the big thing we're working on right now is getting method inlining working. All optimizing runtimes get most of their performance from inlining methods. It's great because we get to eliminate a whole bunch of data we'd otherwise have to store: we don't have to create a stack frame, we don't have to pass parameters in and get the return value back. In JRuby it's extra important, because we're building on top of Java, so we have to pass extra stuff that Java doesn't have to pass; we have to allocate a whole other structure and pass those values along on every call. Once we inline the method, none of that is there anymore. I'm using Ruby code here and not showing you IR again, because it wouldn't fit on the screen. We have a Ruby snippet on the left, which is my sample program, and on the right is what it would look like as equivalent Ruby code after the inline happened. We just count down from a million, calling this decrement_one method, so we're calling it a lot. Let's inline it.
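Here's a reconstruction of that example; the slide itself isn't in the transcript, and guard_same? is a Ruby-flavored stand-in for the runtime's guard, not a real method:

```ruby
# The original program: a hot call site hit a million times.
def decrement_one(i)
  i - 1
end

def guard_same?(name)
  true  # pseudo-op stub: "is decrement_one still the method we inlined?"
end

def countdown
  i = 1_000_000
  i = decrement_one(i) while i > 0
end

# Equivalent Ruby for what the loop looks like after inlining:
def countdown_inlined
  i = 1_000_000
  while i > 0
    if guard_same?(:decrement_one)
      i = i - 1                       # the inlined body of decrement_one
    else
      i = decrement_one(i)            # guard failed: fall back to dispatch
    end
  end
end
```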
So the first thing we do is add a guard. We need to make sure the type hasn't changed, because if the type changed, perhaps there's a new version of decrement_one. As long as that guard passes, we just run the inlined body, i - 1. If the guard ever fails, we fall back to doing normal method dispatch. This is incredibly simple, and that's why we started with it. It never deopts, which is really good, because we can't deopt yet; but it's also bad, because we can't deopt yet, and we want to be able to.

Now, I don't know if this is really a thing, but I put it in here anyway: I believe that when you run compiler passes, each time you finish a pass you've hopefully produced more information, so you have more options for the passes that follow, using that new information to continue a virtuous cycle. So if we go back and look at this post-inline method and decide we want to do another optimization, the first thing we'd ask is: what's the type of i? If we can figure out that i is a Fixnum, we can do some math specialization. Unfortunately, with this inliner we still have that fail case, the one for when decrement_one gets changed to a different method, and in that branch we have no idea what i is. So this sucks.

What we want to move to next, for our next inliner, looks more like this. You'll notice the while body is a lot smaller now. We still have the inlined body, but instead of guard_same with a question mark we have guard_same with an exclamation point, and what that means is that we'll raise an exception that says: it's time to deoptimize. We'll record where we are in this version of the method and the current state of any variables, then go back to a safe version of the method, enter it at the same semantic point, and continue executing there. Right, so if something changes out from under our expectations in optimized code, we can back off; probably the most important property of an optimization is being able to back off from an over-optimized version. The nice thing with this version is that now we know with 100% certainty that i is a Fixnum. And I did cheat here, because minus can be overridden, but let's pretend that it can't.
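As Ruby-flavored pseudocode, the deoptimizing variant might look something like this; guard_same!, DeoptimizationSignal, and the rescue body are all made-up illustrations of the mechanism, not real runtime APIs:

```ruby
class DeoptimizationSignal < StandardError; end  # hypothetical

def guard_same!(name)
  # pseudo-op stub: the real runtime raises if `name` was redefined
end

def countdown_inlined
  i = 1_000_000
  while i > 0
    guard_same!(:decrement_one)  # raises DeoptimizationSignal on redefinition
    i = i - 1                    # past the guard, i is known to be a Fixnum
  end
rescue DeoptimizationSignal
  # the runtime would record the semantic point plus variable state, then
  # resume in the safe, unoptimized version of the method at that same point
end
```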
And so from this virtuous-cycle perspective, deoptimization is a good idea. But let's ask ourselves a question: if at any point I can deoptimize and go back to a safe version of a method, why not just do super crazy optimizations and hope they work out? The worst thing that can happen is that you have to deoptimize back to the slower version, so the only real cost of trying aggressive optimizations is that it takes longer to get to steady-state performance. And how do you make aggressive decisions? It's not that exciting: we just collect information. There are two things we collect in our profiler. One is how hot something is: if you spin through a loop a whole bunch of times, then anything in that loop is probably a good candidate for some sort of optimization. The other is whether the objects in that loop change types: if they don't, maybe that's a candidate for inlining a method, or, if something is a Fixnum, maybe it's a candidate for numeric specialization.

Numeric specialization is the next big thing we want to work on once inlining is really solid. In Ruby, everything's an object. Sort of. Numbers in CRuby are generally represented as what's called a tagged pointer: no object is allocated, it's just a 64-bit-wide value with some bits set so the runtime knows it's not a pointer. You can pass it around as though it were a pointer to an object, and every once in a while you check whether those bits are set; if so, you know it's a number. On the JVM we don't have tagged pointers: we have references, which are object types, and we have primitives, and they're not interchangeable in the bytecode. You can't pass a primitive where an object is expected, or vice versa, without boxing or unboxing. What we really want is to optimize numerics and numeric algorithms using primitive values, so we're not creating objects all the time.

Fixnum, for example, as I mentioned, is a 64-bit value in MRI with a couple of bits used for tagging, so it can really only represent 62 bits' worth of Fixnum values before going to a Bignum object. In JRuby we have to use an object all the time, so to reduce the cost of the most commonly used Fixnums we cache the values from -256 to 255: essentially the signed and unsigned byte ranges will generally not allocate a new Fixnum in JRuby, and that helps pretty well. We've tried bumping this up to a larger range of cached values, but there are massively diminishing returns outside this range. Now, the JVM does have its own primitive suitable for a Fixnum-sized type, the 64-bit signed long. That's what we'd like to use, and we want to use a long wherever there's a Fixnum, as much as possible, for performance.

So let's look at a simple example that's just running a numeric loop. We've got this number n coming in; it's probably going to be a Fixnum, and it looks like it's being used as one. That's something profiling can tell us, or, if this looper code were inlined somewhere else and we could see that n was actually a Fixnum value, we'd have that type available. The counter i is obviously a Fixnum because we've got a constant literal for it, and like I mentioned, that will be a cached object, so at least for small literal Fixnum values we won't be creating an object every time.
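Here's roughly that looper example as Ruby (reconstructed; do_something is a stand-in stub for whatever the loop calls):

```ruby
def do_something(i); i; end   # stub callee

def looper(n)           # n: probably a Fixnum, but only profiling can prove it
  i = 0                 # small literal: comes from JRuby's -256..255 cache
  while i < n
    do_something(i)     # callee may not be specialized to take a raw long
    i += 1              # today this allocates a fresh Fixnum on most passes
  end
end
```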
But then it gets a little tricky. do_something is doing a call, passing i along. We've determined that i is probably going to be a Fixnum all the time, but we don't know whether that target method is ready for a raw Fixnum or whether it was made for an arbitrary type, so we have to figure out what that type needs to be. And then we're incrementing i every time, which will generally create a new Fixnum object for every trip through the loop, and that's something we want to get rid of. So the idea is that with some profiling magic, and with our deoptimization tricks so we can back off, we should be able to turn our looper method into the long-based version here, the primitive version, so it optimizes as well as possible. Ideally that passes through to do_something too; we'd be able to specialize it for longs and get better performance. And because we have deoptimization, if it ever turns out we made the wrong decision about this, we can just back off, jump back into the interpreter, use regular Fixnum objects, and do a little more profiling until we figure things out.

This is going to be great. We did some early experiments with this about a year and a half to two years ago: our compiler guy did a prototype of an unboxing pass for the compiler, and the performance of, for example, a Mandelbrot benchmark or other numeric algorithms was 10 to 20 times faster than regular JRuby. That was without some guards in place, but once we get our deoptimization, we should be able to do pretty well. So this will be coming along, probably in the next few months.

Today we do actually have a profiler, and it's a little rusty; a lot of this stuff was written years ago as experiments. It's just such an exciting time to be a JRuby engineer. In the next couple of weeks we're going to make that work again. The inliner is now working again; it got dusted off, and it can inline a method and a simple closure at the same time, with some really impressive results. Unfortunately, I only got it working in the interpreter, so the numbers are pretty meaningless: something running 30% faster in the interpreter will have totally different performance characteristics once it JITs, and we're not going to inline just for the interpreter. And our deoptimization strategy only exists in emails so far; it's a lot more complicated than we thought it would be, but we're getting close to something we think we're ready to code up. We pretty much know how to do it; it's just a matter of getting IR to be able to back off properly. Yeah, and there are lots and lots of bugs, even in the inliner, even though it works. When I inlined a method and a closure together, I noticed there were, like, three return instructions all in a row. The first one wins, so the program works, but where did those extra return instructions come from? So we have some cleanup to do. But as I said, this is super exciting; we've been waiting years for this.
Yeah, and I forgot to mention: for JRuby 9000 we spent more time on compatibility than probably any release. Everybody we've heard from who made the transition from 1.7 to 9000 has had a great experience with it. We really are pretty much compatible with 2.2; everything pretty much works, which has freed us up to start looking at some of these crazier optimizations. So it is really exciting. Refinements work. Sort of work. Who's using refinements here? Yeah, just get out. No, they work for the simple cases; we're waiting to see what bugs people report. We're kind of pragmatic about that.

So we've talked a little bit about the status of JRuby today and the stuff we're working on for tomorrow. We really should also talk about what the JVM folks are doing in concert with us. We used to work at Sun Microsystems; we're good buddies with the OpenJDK folks, the HotSpot team, the GC guys, and they use JRuby as one of their primary test cases for making JVM improvements, improvements to invokedynamic, improvements to garbage collection. They're talking to us all the time. So there are a couple of projects I wanted to mention that are upcoming for Java 9 and beyond.

The first one: we all love FFI. It's let us have real native POSIX IO and real POSIX process control, but we do it all in a separate library, and we want something at the JVM level. So I proposed a JEP, a JDK Enhancement Proposal, JEP 191, and there's now Project Panama, which is, you know, bridging north and south: Java and native. This will be native support at the JVM level for doing FFI. What's cool is that they already have code generators: you can just feed in a header file and it'll generate all of the JVM code you want for that binding. But the JIT also knows about this, so when your Java code making a native call gets compiled, the jitted code will just make the C call directly. There are no extra layers to bounce through, no function-pointer indirection; it's going to be basically as fast as if you'd written it in C to begin with.

Here's the old way we always had to do it if we wanted to call into some C library from the JVM: you'd write a little JNI stub on both sides, a little bit of Java code and a little bit of C code. And that sucks, because then you have to compile it on every platform, or ship binaries for every platform. So we ended up using a library called JNR, the Java Native Runtime. It's basically our JVM FFI, built up over years for the JRuby project, and all we have to write now is the JNR stub: it programmatically figures out where the library is, finds the right function pointers, and binds them to our callable.
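For a sense of what that enables at the Ruby level, here's a minimal sketch using the ffi gem, which JRuby implements on top of JNR; under the covers there's still the Java-side binding layer being discussed next:

```ruby
# Bind a libc function with no hand-written C stub; run with `jruby`.
require 'ffi'

module LibC
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  attach_function :getpid, [], :int   # name, argument types, return type
end

puts LibC.getpid
```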
Yeah, but we traded writing C every time for writing Java every time, and there are enough layers here that the JVM can't really see through it. It's certainly not any faster than a JNI call, which isn't fast to begin with, and we've added some extra layers of indirection. The goal with Panama, and as I understand it they have this working now in early builds of Java 9, is that we'd just tell Panama the C library we want to call, or generate the binding using some of their tooling, and the JIT will then know about both sides. It'll be able to optimize all the way through, so when Ruby code JITs at the JVM level and you're doing a getpid call through Panama, it'll actually be a getpid call straight into libSystem or libc. That's awesome, and it's actually working in some of those early builds.

So the other thing: what's the biggest problem with JRuby? Startup time, by far. If we could solve the startup time problem, I think we'd be done; everybody would just use JRuby all the time for everything. This is our greatest challenge for sure, and there are a lot of reasons for it. Everything starts out cold: our parser starts out interpreted by the JVM, and so do our IR compiler and all of our core classes. So in the early stages of executing Ruby code, we're running an interpreter, our IR interpreter, on top of another interpreter, the JVM interpreter. Eventually that stuff warms up and we end up performing significantly better than MRI, but for those first few seconds everything is cold, cold, cold all the way through. We're also starting to use more and more Ruby code inside JRuby to implement Ruby functionality, and that just aggravates the problem: even to get JRuby started, we now have to parse and interpret a lot of Ruby code. It gets worse over time.

So here's how bad it actually is. These are three different actions you might want to do in Ruby: -e 1, just a simple startup of the Ruby runtime; gem list, with probably about a hundred gems installed; and rake -T in a Rails app, which boots Rails and lists all the available tasks. And we're obviously at least an order of magnitude slower than CRuby for all of these.

The one thing you can do right now is set a flag to optimize this a bit. Starting a little over a year ago, in JRuby 1.7.10 or something like that, we added a --dev flag you can pass to JRuby. It turns off our JIT, which carries some overhead and slows things down at the beginning, and it sets the JVM to a lower optimization mode, so it's not spending as much time thinking about the code and gets it up and running in a simpler form right away. This actually helps a great deal: for most cases it's at least a 50% reduction in startup time. If you're not using --dev on JRuby, give it a try; it really helps.
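Usage looks like this:

```
# Faster-startup mode for any command:
jruby --dev -e "puts 'hello'"
jruby --dev -S rake -T

# Or make it the default for your shell session:
export JRUBY_OPTS="--dev"
```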
It's definitely the first line of defense for making startup look a little better. Beyond that, the JVM folks are working with us on their own ahead-of-time compiler for Java code: AOT compilation of Java to native code. What they'll be able to do is take the hot code that runs during JRuby startup, our parser, our compiler, our interpreter and so on, and give us a pre-compiled native version of it. That should get us significantly closer to a real native runtime as far as booting goes. And the really cool thing is that they don't just stop there: they give us native code, but it still carries all the information the JVM needs to keep optimizing it. So unlike our --dev flag, we'd get fast startup and still continue to optimize at the JVM level to full peak performance.

So here's just the rake -T case: there's JRuby with no flags, then JRuby --dev below it, and then two versions of the AOT work these Oracle folks are doing. The bottom one is the optimizing AOT, which gets the fast startup and then still continues on to peak performance, running as fast as it would with no flags at all. This is really early work. It's supposed to be part of Java 9, and even if it doesn't make it, we've already heard from the Oracle folks that they're going to let us use it, so eventually you will be able to use the AOT stuff with a JRuby runtime. There are many, many tweaks we can make so the AOT works better; they've told us about little bits and pieces that don't optimize well and changes we can make to our code, so we're going to keep iterating on this and drop that bar even further. The ideal is that all the code that runs at boot for JRuby runs native, which should get us significantly closer to MRI startup performance, and we won't necessarily lose all the optimizations that are cool in the JVM.

So, closing it out. Last year we asked companies to let us know if they're using JRuby so we could put their logos on a slide, and this is just a day's worth of gathering tweets from people: a really amazing response. Most of these we had no idea were running JRuby. This stuff is really important for the Ruby world; there are a ton of people out there running it for very important things. One of my favorites here is BBC News: all of the election results they report come from a JRuby application using a bunch of cool concurrency tricks. They've got some little Sinatra apps, and they've had no problems with load; it works great. They definitely couldn't have done it with MRI alone. So it's exciting stuff, and we're really happy to have folks here who are interested in JRuby, and folks like these who are actually putting it into production.

Last slide on this: we're having hacking and office hours from two to four today, so if anybody's interested in sitting down with us, with an application, or a bug you have, or performance that doesn't look like it's where it should be, come find us.
We're going to be in the Buff room, I think in section A; you'll see us there from two to four today. So stop on by and we can chat a little bit. And before I forget, I've got a whole bunch of JRuby stickers. This is JRuby, like J-A-Y Ruby: the Ruby jay, our little mascot. If you want a JRuby sticker, come up and grab one, or come to the hacking session from two to four. And that's all we have. Thank you.

Right, so the question is about the status of invokedynamic. Invokedynamic is a JVM feature added in Java 7 that lets us bind dynamic calls, like Ruby's, much more directly, so the JVM sees them as though they were regular Java calls and can optimize, inline, do all that cool stuff. It works great performance-wise: if you run with it in a production application, performance will be better in almost all cases; let us know if it's not. And it's solid as far as bugginess goes; they've worked out the issues. Definitely give it a shot for straight-line performance. I will say that although they've improved things, at the moment it does have a startup impact, so it's not something I'd recommend as part of your standard command line for everyday work. But if you want peak performance, it's definitely the way to go, and there's a flag for it: -Xcompile.invokedynamic=true, something like that. Turn that on and you get the performance you want.

Yeah, so the question was about RVM still being stuck on an old JRuby version. The problem is that there's, like, no new RVM release right now, so if you try to install JRuby 9000 with the released RVM, it installs a preview release from many months ago. My recommendation, until somebody in the RVM world can get a release out: just use the head version of RVM, rvm get head, to update. That one has all the fixes in place, so it actually knows the right, most current version of JRuby to install.

Why 50 iterations before we JIT? Good question. It's kind of an arbitrary number. The JVM has its own simple optimizer, and its count is about a hundred, I think, for C1, the lighter-weight compiler. We just empirically figured out that 50 is typically a good number for telling whether a piece of code is hot. With blocks jitting now, that actually helps considerably: if you call a method once but it does a heavy loop with a block in it, we'll now compile the block too. So 50 is not a bad number; it keeps too much from compiling, but the hot stuff usually gets picked up. This is something that might change once we get the profiler, because then we'll be able to make a much smarter decision about what's hot. The profiler would be able to see: okay, this method has only been called twice, but it does a lot of work; maybe we should go ahead and compile it, because they're probably going to keep using it. It's really funny to see the initial request spikes on a Rails app: there are spikes at 50 requests and spikes at 100 requests as more and more stuff slowly gets compiled. It doesn't look quite as bad now, because we do compilation off-thread, so it kind of trickles in a little bit, but it used to just be horrible.

No, it's probably going to be a lot lighter-weight in many cases. So the question, yes, thank you, the question was: what is the likely memory impact of inlining, as far as the overall size of the runtime?
So the IR is going to be a lot bigger. But if we can compile one inlined version of a method with all of its calls, that should be less JVM bytecode, and less native code generated, than having all of the separate methods compiled on their own. The other aspect is that if we're able to inline a method with a block, and all of that folds back into the caller, we're not going to have to create any block structures for it or push all that state around, so the live runtime state might actually get reduced. But memory is, I think, going to be a big issue: we're using something like 30% more memory now than we were in 1.7. We're going to get that down, and it will level out, but if we have to keep multiple copies of methods, then we'll have N times as much memory for them. And again, this is something you may not necessarily run in a local development environment, but you throw it into testing and production, and that's where you get the performance you want. I guess the other thing, too, is that once the profiler is going, I think we're going to be jitting a lot less stuff, because we'll be able to make smarter decisions about what to compile and when to compile it, so that'll help balance it out a little bit too.

That's the weird thing when you start working on these projects and try to figure out how to make things fast: you just assume you can keep inlining forever and make it super fast, but there are all these trade-offs you have to make so it doesn't use tons of memory or take ten hours to warm up.

All right, I think that's it for us. Thanks for coming out. Hopefully we'll see you from two to four.