Okay, sorry for the technical hitch there. Right, I was supposed to be giving this talk with a colleague who has been working on JFR support. Unfortunately he wasn't able to come, so I'm going to cover the material he provided. This is a talk about our attempts to add some new functionality to GraalVM. I'll start with some background on what Graal is, to give an overview of how it works and what we were actually trying to achieve; talk about the things we wanted to do; explain how that went, some of the problems we had and what we actually succeeded in doing; and cover some of the things we learned from the process. I suspect we probably won't have time for questions, but I'd be happy to take them outside.

So Graal is not just one thing. It's a whole series of components, all written in Java. The core is a compiler that is used for various purposes: it can function as a JIT compiler or as an ahead-of-time compiler. There's an interpreter framework called Truffle, which can be used to provide support for a whole family of interpreted languages that the compiler can then optimize by compiling them. There's a native image generator that takes a Java program, finds the closed world of all the code reachable from that program, and generates a small executable from it. That executable runs independently of OpenJDK; it doesn't need a JVM, because it has its own little JVM inside, called Substrate VM, which replicates the functionality you need from OpenJDK to run a small native Java image as a self-contained binary. You can also generate shared libraries this way; a shared library is essentially just a binary with multiple entry points.

So at the core of it all is this compiler, and you can use it in OpenJDK as a JIT compiler. If you plug it in via the JVMCI compiler plug-in interface, OpenJDK creates some compiler threads, starts handing methods to Graal to compile, gets compiled code back, and installs that as the compiled code for OpenJDK. To do that, Graal has a front end that parses bytecode and generates a graph structure; the usual high, middle, and low tiers that massage that graph into a shape from which you can emit code; and a back end that fills code buffers, and there you are: you've got compiled methods.

A slightly different configuration of the compiler, most notably with a very different back end that links code in a different way, is used for ahead-of-time compilation of Java on OpenJDK. That can be used to populate the class data sharing segment, application class data sharing, with precompiled methods, so that when you bootstrap, you already have compiled code for Java runtime methods or application methods. So the same compiler has been repurposed just by reconfiguring its components.
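As a concrete illustration of that JIT hookup, here is a minimal sketch you could run on a JVMCI-enabled JDK. The flag names in the comment are the HotSpot ones I'm aware of from the JDK 11 era, so treat them as indicative rather than definitive; the class itself is just a hypothetical workload.

```java
// Running HotSpot with Graal as its JIT via the JVMCI plug-in interface.
// On a JVMCI-enabled JDK this is (indicatively) switched on with:
//
//   java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI \
//        -XX:+UseJVMCICompiler HotLoop
//
// HotSpot's compiler threads then hand hot methods such as sum() to Graal
// and install the returned machine code, exactly as described above.
public class HotLoop {
    static long sum(long n) {
        long s = 0;
        for (long i = 0; i < n; i++) s += i;  // gets hot, triggers JIT compilation
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000_000L));
    }
}
```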
Another use comes from Truffle. Truffle allows you to build interpreters that parse a language, and there are various Truffle language implementations, which execute by walking the abstract syntax tree of the parsed program. The Truffle framework knows how to take a node in that syntax tree, and maybe some of its subnodes, and push a graph straight into the compiler, bypassing any bytecode-parsing front end, because you already have a different representation of the program. It pushes the graph through the compiler stages, code pops out of the back end, and you can install that as an execute method for the interpreted language to call directly, rather than interpreting the tree, running on OpenJDK. You can also use this in native images, where the Graal compiler runs inside a self-contained binary as part of a program executing some Truffle language.

The native image generator itself makes two uses of the compiler. You start from a method in a particular class, or a set of entry methods for a library. It takes that method and its associated classes, with JVMCI descriptions of them, and consumes the bytecode, but instead of compiling it down to machine code, in the first pass it analyzes the method: out of the transformation stages in the graph processing come a transformed method and a list of the methods and types that that method refers to. Those are put into a pool, a universe, which is built up and recursively processed, finding all the references from classes to other classes and from methods to other methods, until you have a closed world of all the code that can be reached from your initial entry points. You can then take all the type information you've derived and all the methods you've found and compile them into a completely self-contained executable image with a closed-world model: you've found all the code reachable from the entry points of that library or application.

It's not quite as simple as that, because at some point this closed-world executable is going to call into JDK runtime methods, which under normal operation expect to call out to a JVM underneath, to OpenJDK. Of course, this program has to run on its own without OpenJDK, and it can't reuse the JVM functionality from OpenJDK, because that expects to do class loading and keep track of classes, whereas here everything is meant to be compiled into a self-contained image. So what the first stage of this translation process also does is substitute invocations of JDK runtime code at certain cut points, side-stepping into Substrate VM, which provides an equivalent that does the same sort of job as OpenJDK inside the delivered image or delivered shared library.

It used to be that all the functionality provided to emulate the OpenJDK VM was implemented as Java code. That became a bit of a burden when they realized they would have to keep porting it for JDK 11 and all the later releases. So a lot of the native code that's implemented as C in the JDK's native libraries, such as the zip library and the libnet code, has been pulled in and linked underneath the Substrate image and is used directly, to avoid maintaining so much Java code. Substrate still obviously has to emulate some of the things that normally go on inside the VM and provide an alternative, because you've got a different model of execution.
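To give a feel for the substitution mechanism I just described, here is a minimal sketch using the @TargetClass/@Substitute annotations from Substrate VM (the package location is the one from the releases we worked with). The choice of target method and the replacement body are purely illustrative.

```java
import com.oracle.svm.core.annotate.Substitute;
import com.oracle.svm.core.annotate.TargetClass;

// At image build time, calls that the analysis resolves to
// java.lang.Runtime.availableProcessors() are cut over to this body,
// which gets compiled into the image instead of the JDK implementation.
@TargetClass(java.lang.Runtime.class)
final class Target_java_lang_Runtime {

    @Substitute
    int availableProcessors() {
        // Illustrative replacement only: a real substitution would call
        // into Substrate's own runtime support rather than hard-coding.
        return 1;
    }
}
```

Substrate uses exactly this kind of cut point to redirect JDK runtime calls into its own implementations inside the image.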
So, this set of things started arriving a few years back. We were looking at it about two and a half years ago and thought, well, the ARM port isn't really properly supported; it's a sort of second-class citizen. So our first idea was to do some work on the ARM back end to bring it on a par with x86; the goal there was architecture parity. More recently, we've been interested in native image generation because of our middleware product, Quarkus. So we looked at adding JFR support to Substrate VM, into the native application.

The monitoring capabilities you get with the JDK are not currently present in these natively generated images, and for usability, users really need to understand what's going on in the application and have some way of identifying things when they go wrong. So JFR was what we decided we wanted to port. Oracle already had plans to have that in their enterprise edition of Graal. We wanted to add it as a community feature, and that's really critical, we think, for the usability of the community edition, the open source edition.

Debug info for debuggers is a different story. If a native-image-compiled version of a program diverges in any way from its behavior on OpenJDK, that will be because the compiler has compiled code into this heavily optimized executable that is doing the wrong thing. So you've got to be able to work out what happened if anything like that should arise. We wanted to add debug info that would let you take the executable image and, in a debugger like GDB or Visual Studio, relate execution of its instructions back to the original Java source code. So you could actually deal with that problem should it ever arise, understand what's gone wrong, and then go and fix Graal, probably. So this is more of a supportability goal.

For the AArch64 code, what we wanted to do was just add very basic stuff. Two and a half years ago the port was very minimal. In addressing, for example, there were no displacements embedded in address operands; they were loaded as independent values into registers. There was no prefetching. There were optimizations like merging a shift and a mask operation into one of the ARM bitfield instructions that we wanted to add. And most important was using load-acquire/store-release instructions for volatile operations, rather than full memory barriers, for efficiency (the code sketch after this section makes that concrete). So there was a lot of stuff that really just wasn't there that we needed to put in.

With JFR, we really just wanted to get JFR working. We wanted this low-overhead profiling mechanism because it's really critical. We wanted as many of the VM events as made sense, the equivalent sorts of things, to come out of Substrate VM, so you can see what the virtual machine and the native runtime code are doing. We also wanted support for user events, and where something about the way Substrate works has no equivalent in OpenJDK, we were thinking of adding new events for that. Mostly we just wanted to reuse the existing code as much as possible, so we didn't have to reinvent the wheel.

For debug support, we had a fairly simple set of capabilities we thought would be enough to debug problems: breakpoints by method name, breakpoints by file and line number, and the ability, when you hit a breakpoint, to step line by line, stepping into or over functions, and to get backtraces. We'd like to print objects field by field if we can, and we'd like to use path expressions to dereference through objects and pick values out using field access and so on. That's the sort of capability we really wanted to add.
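Here is that volatile sketch. It's plain Java; the instruction choices in the comments reflect my understanding of how an AArch64 back end can lower volatile accesses, so take the exact barrier sequences and register names as indicative.

```java
// How volatile accesses can be lowered on AArch64. A naive back end
// surrounds plain ldr/str with full barriers (dmb ish); a better one
// uses the dedicated load-acquire/store-release instructions.
class Flag {
    volatile boolean done;

    void publish() {
        done = true;   // optimized: stlr w1, [x0]
                       // naive:     dmb ish; str w1, [x0]; dmb ish
    }

    boolean poll() {
        return done;   // optimized: ldar w0, [x1]
                       // naive:     ldr w0, [x1]; dmb ish
    }
}
```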
So, what happened? What did we encounter when we tried? Well, with the ARM code, we started this two and a half years ago and it's actually still ongoing; we've been slowly adding things. We ran into real problems because of the complexity of the compiler: it was very difficult to get things into the code base. But there were also problems with the way the development process was organized, and problems with our understanding of where Graal was going, what the roadmap for Graal was.

I'll talk quite a bit about the compiler complexity, because this is really quite important. Any compiler is going to be complex. You usually have high, middle, and low tiers of transformation stages operating on a graph model, so a whole load of different operations happen in the various stages; some are rigidly ordered, some are nested, and following what the workflow controllers are doing is really quite difficult. You're massaging a graph and transforming its shape all the time; nodes come and go, so aligning the life cycle of the graph with the life cycles of the phases is always a problem. When you have a very deep node hierarchy, you may have a phase that operates on a generic node type, but which actual types of node does it really operate on? The same is true of a lot of the interfaces. These are generic problems that any compiler faces, but I think they are particularly difficult with Graal because of the scale.

There's also the configurability issue. Graal is used by many clients for many different purposes, so the phases that apply in each suite and the back ends are different for different uses, which means there are a lot of possible combinations of what might be going on, particularly in the graph transformation process. Graal also uses a plugin model, which allows you to take a particular method operation, or a family of method operations, or a particular code operation, and give it special handling via a plugin, and each client plugs in different phases and plugins to do these different operations. That's another indirection mechanism deciding where things happen. Annotations on the code are used to guide these operations, to guide how the phases operate and how the plugins operate, so you need to understand how the annotations side-effect all of this. And what makes it really complicated is that some of the annotations are defined in Substrate classes yet affect the way that JDK classes are modified and compiled: an indirect mechanism. So you can see there are lots of hooks and lots of indirections that make the complexity really, really difficult to follow. And a lot of the configuration is done with a layered, hierarchical overriding model: you have a basic suite provider that provides the phases, then a HotSpot suite provider, then an AOT HotSpot suite provider. So none of this sits in one place; it's distributed. It's really very complex.

Some of the numbers involved: there are around 150 subclasses of the Phase class, the thing that performs a single graph transformation (a minimal sketch of a phase follows below). There are around 200 different types of graph node in the code base. Access, the interface that encapsulates all the memory nodes, has 21 different types of read or write node, in one case a read-write node. There are around 200 graph builder plugins, the plugins that perform special-case transformations, across 9 different types of plugin doing different kinds of transformation. And there are around 200 annotations directing these transformation operations down different paths.
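Here is the promised sketch of a phase. The Phase base class and StructuredGraph are real Graal classes from the org.graalvm.compiler packages of that era, but this particular phase is hypothetical and, unlike a real phase, does nothing beyond inspecting the graph.

```java
import org.graalvm.compiler.nodes.StructuredGraph;
import org.graalvm.compiler.phases.Phase;

// A (hypothetical) minimal phase. Real phases are handed the whole graph
// and mutate it in place: adding, removing, and replacing nodes. This one
// only reports how big the graph is at the point in the suite where it runs.
public class CountNodesPhase extends Phase {
    @Override
    protected void run(StructuredGraph graph) {
        // graph.method() can be null for synthetic graphs; fine for a sketch.
        System.out.println(graph.method().format("%H.%n") + ": "
                + graph.getNodeCount() + " nodes");
    }
}
```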
And of course, when we started, the ARM port was relatively new, so there were problems with the code and a shifting code base. The AArch64 code was the least stable part of the compiler. But there were also problems with some of the generic code: when we tried to fix what we thought were back-end problems in the AArch64 port and improve code quality, we ran up against things that were problems in generic code that didn't quite do what was needed. Generic fixes were needed, and we weren't really in a position to make them; that generic code was also still in flux a bit at that point. It's got better since, and we've managed to make some of the generic changes, but at the time it really meant we had to either rework things or withdraw some of them because they were just too fragile.

But the biggest problem was the development process. The Graal engineers at that stage were working on their own, on their own repo, with their own management system. They put up a GitHub repo with PRs so we could try to push things in, but they were really focused somewhere else and weren't ready to receive things from us. They weren't even really looking at what we were doing, because they were looking inward at their own group. So getting them prepared to build a community, and to understand the implications of working with one, was actually quite a long, slow process. It didn't help that the final testing stage (this is an old story from the JDK) was done behind the scenes at Oracle, so when we put changes in and did get them through, we would sometimes get a cryptic message saying you've gone wrong, but we couldn't see what had gone wrong, because we couldn't see the tests, and we couldn't see what changes Oracle was putting into the code base that might have broken our tests until they surfaced. It was all done at a distance, and that was quite hard.

The other problem was that what this product was and where it was going was quite difficult for us to see. Graal has a very long history; it relates back to Maxine, and I think before that, Jikes. This idea of Java-in-Java, and all the generalizations Graal was making, all the capabilities, was something we only gradually began to understand. It wasn't just a JIT compiler and an AOT compiler, and not just an interpreter framework; it had all these other uses as well. Part of that was us not really knowing enough about it and not reading the right things; we just went in to do a job and fix some code. Part of it, I think, was that the roadmap Oracle Labs had for the product was not clear anyway: they had different ideas about what they would do with Graal, what the key value proposition would be, and so on. So it wasn't all our fault; to a great degree there was simply no public roadmap, because this was a private project trying to become an open source project, and none of those things were in place yet.

With JFR support, Josh had real problems too. He had to go through much the same learning process with Substrate VM that I had with the compiler, understanding some of the limits of how it operates. He was trying to link in the JFR code; there's native code involved, and there are real issues with linking things in.
The substitution mechanism replaces things, so he would assume something was going to happen because that's the way it works in the JDK runtime, when actually it's different in Substrate. Just putting the whole thing together required a lot of understanding, and since Substrate was newer than the compiler, even though we did this work more recently, it was still a bit of a moving target.

One of the things you really have to deal with is the fact that you can't use reflection and you can't load classes dynamically. You've got to make sure that any class you want in the image is notified to Graal's code analysis, the points-to analysis, so that it gets linked in and compiled ahead of time. So you've got to make code that relies on these features work in a different way, and you have to understand how to do that (a sketch of what that registration looks like appears at the end of this passage). Similarly with JNI: you can't just load a library and have native methods; there's a registration mechanism for registering foreign code. That was actually quite tricky to do, and it changed part way through. As I said, the Java code was replaced with call-outs to the native libraries, libnet, libzip, and so on. The one library that wasn't included was libjvm, and of course the JFR code we wanted to use is in libjvm. So Josh is still working on getting that factored out and linked in as a separate static library; the other alternative would have been rewriting it all in Java, and that's really not an attractive proposition. That's been the biggest stumbling block yet. And as an example of how the JDK runtime assumptions bite: Josh was looking at System.loadLibrary, trying to get a library loaded to pull some foreign code in, and System.loadLibrary never got called, because the substitution had cut in above that point. He spent a long time trying to work out why his code wasn't being run before he eventually found that out.

There's another thing Substrate does as an optimization when generating the native image: you can pre-populate a heap with static class field data, including objects that are computed at build time, or else you can run class initializers at run time. Now you've got to keep those two things consistent across all the classes that might be side-effected, because computing and evaluating something in the build-time context can differ from what you would get computing it in the run-time context. Get that wrong and you get really subtle bugs. So that was another thing that was really hard: deciding what should be initialized at build time and what at run time, and whether any existing decisions that had been made about that might snooker you. Josh spent a lot of time working through that. So he's had a really difficult time of it, and again this code was in flux; things were changing, and in particular the way the JNI and library support changed really meant he had to redo a whole lot of work.
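To show what notifying the analysis can look like in practice, here is a minimal sketch of a build-time Feature that registers a reflectively used class. The Feature and RuntimeReflection APIs are the native-image hosted ones as I know them (their package locations have moved between releases), and the class being registered is hypothetical.

```java
import org.graalvm.nativeimage.hosted.Feature;
import org.graalvm.nativeimage.hosted.RuntimeReflection;

// Runs at image build time. The closed-world analysis only keeps what it
// can see statically, so anything reached via reflection must be declared
// up front or it simply won't exist in the image.
public class ReflectionRegistrationFeature implements Feature {
    @Override
    public void beforeAnalysis(BeforeAnalysisAccess access) {
        // com.example.LoadedByName is a hypothetical class that the
        // application looks up with Class.forName() at run time.
        Class<?> cls = access.findClassByName("com.example.LoadedByName");
        if (cls != null) {
            RuntimeReflection.register(cls);
            RuntimeReflection.register(cls.getDeclaredConstructors());
            RuntimeReflection.register(cls.getDeclaredMethods());
        }
    }
}
```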
The debugging work is something I've done, much more recently than Josh's work, and it's probably been the easiest part, though I think by this stage we understood a lot more. It's also a relatively self-contained problem: given the code and the types in this analysis universe, how do you tell a debugger about them?

One of the problems was doing this in a way that doesn't upset the upstream implementation that already exists; that's one of the constraints we'll have to work through in the review process. The biggest technical challenge was making sure that the component that writes the object file doesn't have to know about all the other types in the compiler, the heap management, and so on. So there was really only one constraint of that sort, and it just meant defining several interfaces to allow these things to happen; that decoupled fairly easily (a sketch of the shape of those interfaces appears at the end of this section). I haven't yet looked at making this an optional feature using the mechanisms Graal has for enabling features; I'm assuming that will come out of the review process. But it's been pretty successful.

So where are we? With the JIT, we did get some very basic things in about two years ago, and had another little round about 18 months ago that got a bit more in. In the last year Roland Westrelin has managed to make some generic changes with the Graal compiler team, and ARM have also contributed some of the instruction-reduction optimizations, quite a lot of stuff. So we have had quite a lot of success eventually; it just took several goes to get there. Josh is still really struggling to get a working prototype in a native image. He's got the event side worked out; he understands what he needs to do, and we can run it in fallback mode, where it runs on OpenJDK, and we know that works. But he still has to deal with the library linkage problem, so it will probably be a few more weeks before we have a working prototype that we can then polish up and submit. So that's still a work in progress, unfortunately.

On the debug side, two weeks ago I actually submitted a pull request with the first three features, so for Linux and GDB we now have working breakpoints, working line stepping, and working backtraces; the type information is the next thing I'm going to add. That's just gone in for review, and just two days ago I got a nice big set of review comments from Paul Virgo, which is really good. Oracle have been really helpful with this: when we raised it as an issue they gave us a whole lot of hints, some of which I'd already found and some of which I hadn't, and they've been very cooperative about getting this into the product, which has been really quite heartening. I think the core debug functionality will probably go in within the next two or three weeks, and then I'll work on round two.

The process has got better. The Oracle engineers in Labs are much more aware of our existence. They've been bombarded, in a way, with lots of pull requests, not just from us but from the Quarkus team and many other users of Graal, and a lot of them have risen to that and started working with the community more. They're still working on their own repo, that's still the problem, but they are more aware and more responsive. They've always been helpful; it's just that response time was the biggest problem, and they're now much quicker. We know who people are now, and that really helps, and they know who we are as well. We had a committers' workshop where we met them and made some good contacts, and that's really helped in terms of actually getting things to happen; it just makes a difference when you've met face to face and know people. It's just a fact of the world. As for the planning process, we understand a lot more about where Graal has come from, and that was particularly helped by talking to the Graal developers.
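Here is the promised sketch of those decoupling interfaces. The names and shapes are illustrative, my own simplification rather than the actual upstream API; the point is only that the object-file writer consumes narrow, read-only views like these instead of depending on compiler and heap internals.

```java
// Illustrative interfaces (not the real upstream API) showing how the
// object-file/debug-info writer can be kept ignorant of compiler internals:
// it only ever sees these narrow views of the analysis universe.

interface DebugTypeInfo {
    String typeName();          // e.g. "java.lang.String"
    int size();                 // instance size in bytes
}

interface DebugLineInfo {
    String fileName();          // source file the code came from
    int line();                 // source line number
    int loAddress();            // first code offset for this line
    int hiAddress();            // first code offset past this line
}

interface DebugInfoProvider {
    Iterable<DebugTypeInfo> types();
    Iterable<DebugLineInfo> lines();
}
```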
The Graal developers have also given us more idea of where the product is going and how we can now fit our work in, and the committers' workshop was very helpful in getting all of that to happen. We've got an agreement that our OpenJDK team and our middleware team who are working on Quarkus, along with some of the other people contributing from outside Red Hat, will regularly review things with the Graal team, to improve the process and make sure we plan things. We've actually been given ownership of some of the compiler issues and of the debug info work, and I'm assuming that in the longer term we'll get that with the JFR work as well. So we're now responsible for negotiating all of that with the community and seeing it through review. There's been a certain devolution of ownership and responsibility, which I see as one of the most positive things to have happened, and it happened just in the last week.

Just to wrap up: we had a lot of problems, and there are still some structural problems. We've learned a lot by doing all this, and I think we've also helped educate Oracle Labs about how to work with a community, how to build one up and benefit from community input. The project is now, I think, stabilizing a lot more. Oracle Labs are clear where they're going with this, and I think we know where we need to go with it, the functionality we need in the community. So I'm hoping this is on the up and up, rather than something that gets stymied or stagnates, and I'm quite optimistic about it at the moment.

So, how are we doing, time for questions? No, not really. All right, I'll take them offline then; I'm sure Andrew would be happy to answer any and all of your questions on the corridor track.