OK, great. Right, OK, we'll get started. This talk is actually based on a series of blog articles I've written but haven't yet published. The articles will have a lot more detail than this talk, and when I post the slides I'll include links to them; they include full references and all the background to all of this. So this is going to be a fairly brief, quick run-through because of the shortness of time. I hope we'll have time for questions at the end, but probably outside. Just to give an overview of what I'm going to be talking about: I'll explain why I'm talking about Java class metadata. I'm going to skip over the material I had in the slides about how you measure overall JVM native memory use, but I've left it in there for reference. I'm going to concentrate on the tool that lets you see class metadata statistics and identify exactly how much memory is being used to model an individual class in the runtime. I'll show you what that looks like under the hood, what's actually going on inside the JVM, and give you some numbers about sizes. The idea of all this is that you can look at the cost the JVM pays to model the class base, relate that to what shows up in the stats for your application, and maybe start identifying opportunities for redesigning the code to have a slightly lower metadata overhead. So, what is Java class metadata? It's basically the JVM's internal model of everything that's in the bytecode. The JVM unpacks the bytecode, creates an object network, and then throws almost all of the bytecode away. It also annotates that model with some extra state.
It has resolution state, describing how classes are linked to other classes: the link between a class and the methods that are going to be invoked at some point in the code, or the fields that are going to be accessed, and so on. It's also updated with interpretation state: for every class there's a cache that keeps track of information allowing the interpreter to identify whether methods or fields have been resolved, so it can quickly invoke or access them. And it's also updated with extra compilation state: entry addresses for JIT-compiled methods, linking stubs that handle transitions between interpreted and compiled code, and so on. There's also a load of profile data kept by both the interpreter and the compiler. All of that is modeled internally as an object structure. And why do you need that? Well, if you're going to run a managed runtime with dynamic class loading and linking, you've got to know what classes are in there, you've got to link classes to other classes as you load them, and you've got to respect their visibility and access rules. So you need a model of the class base and the methods. The interpreter and the JIT need to know about the class model because you can't run an instruction like checkcast without taking some object, working out which class it belongs to, and then working out where that sits in the class hierarchy. So it's needed for execution, and obviously it's needed for compilation and optimization, to make sure that's correct and sound; you need to reference the class model to be able to do that. Reflection actually requires you to reify the class model in memory: you've got to create an instance of java.lang.Class, and potentially, when you start doing reflection operations, create the proxies for methods, fields, method handles. So you need knowledge of what's in the class base in order to be able to do that.
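To make that concrete, here's a small, hedged illustration in plain Java: a cast compiles to a checkcast bytecode that the JVM can only answer by consulting its class model, and reflection exposes that same model as java.lang.Class objects. The classes involved are just standard JDK classes.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassModelDemo {
    // Walk the superclass chain, i.e. the hierarchy the JVM has to consult
    // when it executes a checkcast or instanceof instruction.
    static List<String> superChain(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            names.add(k.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Object o = new StringBuilder();
        // This cast compiles to a checkcast bytecode; the JVM answers it by
        // checking where StringBuilder sits in the class hierarchy.
        CharSequence cs = (CharSequence) o;
        // Reflection reifies the internal model as java.lang.Class instances.
        System.out.println(superChain(cs.getClass()));
    }
}
```

Nothing here touches JVM internals directly; it just shows why the runtime has to keep the hierarchy around.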
And finally, JVMTI agents need to be able to scan and query the class base and maybe update the bytecode, and you've got to ripple the effects of that change through into the rest of the JVM. So you really need a model of the class structure. Why not just stick with bytecode? Well, bytecode has a whole lot of properties that make it inappropriate. To get at data that's embedded in the bytecode, you've got to traverse your way through a byte array to find things, so nothing is easy to access; whereas with a separate object network you can index things and access them directly. A lot of data in bytecode is implicit: you have to convert from byte representations to, say, a number or a string. So it's not really a good store in the way it stores things. You can't really annotate the bytecode either; it's a slab of bytes, so updating it in place isn't really an option. In an object network that models the class base, some of the objects can be read-only, storing stuff that's constant, and some can be read-write, where you put runtime-derived information into exactly the place that's needed to do things with that class or that method. And bytecode is very verbose. If you look at a symbol like java/lang/Object, or a method name like add, or a method signature, these occur all over the place in bytecode. Constants, string constants and class constants, occur all over the place, and keeping multiple copies of those in lots of different class files is a waste of space. So the JVM has a symbol table where it puts each symbol in once: it creates unique strings and just puts pointers to them everywhere they're needed. So you can reduce a lot of the verbosity. And constant pool data is an enormous slab of what's actually in class files, so there's a lot of opportunity to win there. So why am I talking about this? There are lots of motives.
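The symbol-table sharing just described isn't directly visible from Java, but string interning exposes the same deduplication idea at the Java level, so it makes a reasonable analogy; a quick sketch:

```java
public class SymbolSharingDemo {
    public static void main(String[] args) {
        // Two distinct heap copies of the same character sequence.
        String a = new String("java/lang/Object");
        String b = new String("java/lang/Object");
        System.out.println(a == b);                   // false: separate objects
        // intern() returns the single canonical copy, much as the JVM's
        // symbol table keeps one copy of each name and signature.
        System.out.println(a.intern() == b.intern()); // true: one shared copy
    }
}
```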
It's good for people to understand how this works; we always want to encourage new people to hack on the JVM. But metadata can actually be a large proportion of the data in your memory image. I'm going to show you an example based on WildFly, the project we derive our app server from. If you just boot up an app server instance with no deployments in it, there's about 22 meg of instance data and about 55 meg of class metadata, just for a bare app server. So it can actually dominate the memory image quite substantially. What I'm hoping is that once you understand how this works, and you can get stats on what the actual costs are for your code, there might be opportunities for optimization: redesigning your code to use a smaller class base. Obviously one way is just to use fewer classes, but the way you code your classes also makes a difference, and I'll show you an example of that. I'm going to skip the material about the native memory stats, except to say that there is a whole memory management system that replaces malloc and the C++ new operator for creating all the JVM metadata, and you can get stats on that. There's one particular stats tool in the system, a jcmd option, that allows you to see information about the class model, and you need to run with -XX:+UnlockDiagnosticVMOptions enabled for it to work. So this is the command that starts up the WildFly server: you run a standalone script to start a simple app server, and I put that flag in as an argument when starting it. This is the example we're going to be using. You can use jcmd to find the process ID of that process, and then you can use another jcmd option, GC.class_stats, to get a formatted list of statistics on all of the classes; there are about 10,000 of them loaded in WildFly. It comes out as a vast table, which I've summarized.
You could actually load it into a spreadsheet; that's exactly what I did to look at this stuff, and you can summarize it. So these are the default statistics that are given. They represent aggregate statistics for the class as a whole and for certain categories of structures that are used to model the class and its methods; I'll go through these in a bit. You basically get a load of columns, and there are two rows here, so I've colored them differently. Down the left-hand side you've got the index of the class, which goes all the way up to 10,228. On the right-hand side, the very far right column in red, you've got the class name. They're sorted by default on the instance bytes statistic, which is actually an interloper here: that's a heap statistic, Java memory, not metadata memory. The most popular class in WildFly is the character array, and there's about five megabytes of character arrays once it's booted. The next most popular is the object array; there's two and a half megabytes. And the first user-defined class is HashMap$Node, with about 2.3 megabytes. So quite a lot of instance data in there. The class bytes column represents the amount of storage used to model the class itself, without taking account of the constant pool data and all the methods; it's actually one main struct plus a few little auxiliaries. You can see the two array classes use about 480 bytes. An actual instantiable class is a slightly different structure in memory. They're all instances of a C++ type called Klass, with a K, but there are array klasses and the instance klass (two different types of array klass, actually), and the instance klass is a bit bigger, with some extra auxiliary data because of the structure of an instance class. So that's about 560 bytes. There are no annotations in these classes; annotations are stored as a thing hung off the class, just a packed byte array, so there are none there.
The constant pool only exists for user-defined classes and interfaces, and it represents all the stuff that's in the constant pool area of the bytecode. Now, it doesn't include the overhead for the actual symbols; they're stored in the symbol table, where there's one copy each, shared. What this is is basically an array of pointers to symbols, or constant numeric values, or indirect references to strings stored in the heap. So there's quite a lot of constant pool data: there's 1.3 kilobytes of constant pool for that class, quite a lot more than the class object itself. Constant pool data really is quite a big overhead. There's also a byte array of tags to label each entry in the pool, so that's nine bytes per entry; quite a lot of storage. There are seven methods on this class; that's the method count column. As for the actual method bytecodes, the only thing that's salvaged from the input class file is the executable part of the bytecode; the stuff describing each method gets attached to the method objects. The bytecode is actually very small, 149 bytes, so about 20-something bytes per method. But there's actually about 1.7K of object data to represent those seven methods in memory; as we'll see, that's a cluster of objects for each of the seven methods, so about 240 or so bytes per method. And then the last three stats are summaries. So we know this class is using just about 4K of storage to store the details of the class in memory, split into about 1K of structures that are read-only and about 3K of structures that are read-write, because you need to be able to update them with runtime-derived state. If you go to the bottom of that table, after class 10,228, you get summaries. There's the summary I told you about before: you've got 22 megabytes of instance data, objects in the Java heap.
And overall, in the total column, you've got 51 megabytes of actual class metadata. You can see there's about six meg of class bytes, a lot more constant pool data (18 meg), and a lot more method data. In fact, if you look at the percentages, which are also shown there, about 45% of the metadata is methods, about 31% is constant pool, and about 12% is class bytes. Only 7% of the original bytecode is actually left, as the method bytecode. That gives you about nine methods per class on average (these are not untypical values) and about 250 bytes per method. The classes are on average about 620 bytes. And for the constant pools: not every class actually has a constant pool, array classes don't, but on average you've got 1,850 bytes. If you think of a pool full of eight-byte entries, pointers or constant numeric values, plus one-byte tags, divide that by nine and it gives you about 200 entries on average in the constant pools for these class models. So constant pools are really quite big. So let's move on to what that actually looks like in memory. For every class loader in the JDK runtime, there's a corresponding class loader data object in the JVM that manages all the memory for all the structures for that loader, and it has its own region of virtual memory for that, which can be wiped once the loader is deleted. And there are different types of classes. Coming down from the class loader: the class loader data points to the first class, and there's a daisy-chain field in the classes, the loader-next link, which links them all. So all of the classes belonging to the loader are daisy-chained together.
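As an aside, the averages just quoted hang together arithmetically; this little sketch simply re-derives them from the WildFly totals in the talk (all figures approximate):

```java
public class MetadataAverages {
    // ~45% of the 51 MB of metadata is method data, spread over roughly
    // 9 methods in each of the 10,228 classes.
    static long bytesPerMethod() {
        long methodBytes = (long) (51_000_000L * 0.45);
        long totalMethods = 10_228L * 9;
        return methodBytes / totalMethods;
    }

    // An average constant pool of 1,850 bytes, at 9 bytes per entry
    // (an 8-byte pointer or value plus a 1-byte tag).
    static int poolEntries() {
        return 1_850 / 9;
    }

    public static void main(String[] args) {
        System.out.println("bytes per method ~ " + bytesPerMethod()); // ~250
        System.out.println("pool entries ~ " + poolEntries());        // ~200
    }
}
```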
They've also got a link back to the class loader. The three different types of class are each a subclass of that top-level C++ Klass type: the instance klass, the type array klass and the object array klass. Those correspond to the three classes I showed you before. A user-defined class or an interface is an instance klass, a primitive array like char array is a type array klass, and an object array, an array of Foo, is an object array klass. You create one of these each time you load something. The sizes are 232 bytes for the primitive array klass, 240 for the object array klass, and 420 plus an extra bit for an instance klass, because it has some data that varies according to the layout of the class, the definition of the class, and you get a tail with some extra stuff packed in so it's all in one block in memory. A few of the fields are common across all classes. Every class points to its superclass; in most cases that's Object. There's a link down from a class to its first subclass, and then a sibling link along, so you can go down and along to get all the subclasses of a given class; you can use that to do a breadth-first search of the class tree if you want. And then there's the Java mirror. That's in red because it's not a reference to other metadata; all these other pointers point to other classes, but this is a pointer into the heap, to an oop. That's the java.lang.Class instance that represents this class. It's created when the class is created in the JVM, and it gets populated with all the other data when you start doing reflection operations; it's just a bare, empty object when you first start. But you have to have that proxy in the heap. An instance klass has a few extra fields, and there's a bit of variation in the tail. For example, take an object array klass.
If it's an array of arrays of Foo, the element klass would be a single array of Foo, and the bottom klass would be just Foo; the latter strips away all the array dereferences. In an instance klass you've got a couple of other structures hung off it: one is a pointer to the model of the constant pool data, and another is a pointer to an array (which counts as part of the class stats) holding pointers to all the method objects, which we accounted for in the method stats. But that variant tail is interesting: what goes in there? If you've got a user-defined class, it can have object-valued fields, and in that case the garbage collector needs to know where they are. So there's usually a small amount of data keeping track of the offsets of those object fields, to allow the garbage collector to traverse objects. It might hold 8, 24: the offsets of two object fields, for example. That's usually tiny. What's often bigger is the vtable and the itables. A vtable is used when you need to do a virtual method invocation. For any given class there are certain methods that can be called with invokevirtual, and you need to know which implementation to use. So in that tail section you have a load of pointers to all the implementations that are appropriate for this class, either a local method or an inherited one. When you want to do a virtual method invocation, you go to the class object, find the vtable, index into it, and that's the code entry you need to call. An itable does a similar sort of thing for an interface: you've got a table of methods which are this class's implementations of all the interface's methods, either local or inherited. The size of the vtable is determined by how many methods are not private, because you can do a virtual invocation so long as the method is not private and not a locally defined final method.
If it's locally defined and final, well, it's always going to be called directly; similarly, if it's private it's going to be called directly. But all the other methods determine the size of the vtable. As for the itable, you get one for every interface you implement, and its size is however many methods are defined in that interface. So that defines the complexity of this bit, which is the bulk of what's in that tail. As for the other structures hung off the class: the constant pool has a fixed overhead of about 80 bytes. All the object constants referred to by the constant pool are actually created in an object array on the heap, and the constant pool entry is an index saying where the value is. So if you've got any constant strings, they're not stored in the constant pool itself, because that would require the garbage collector to go scanning through it; they're stored in the heap, and there's one pointer for the garbage collector to start from and scan those objects. Other than that, the actual pool data is the bulk of what's in this object, and it's basically a load of pointers, which either point to symbols that directly or indirectly identify class and method names and so on, or hold constant values like numerics, or, as I said, hold indexes of objects in the heap object array, for strings and class references. And there's also the tag byte array. So your overhead is determined by the number of things in the constant pool: for each entry in the bytecode you have the same entry in your in-memory representation, and that's nine bytes each, eight for the pointer and one for the tag. Then there's a thing called the CP cache; it's actually a bit of a cheat. It's a tiny header of 16 bytes plus a load of 32-byte entries. What the constant pool cache actually is, is a cache for the interpreter.
Anywhere in the method code for that class where there's a field that gets accessed or a method that gets called, there'll be a unique entry for that in the constant pool cache, which the interpreter uses for quick access, to access the field or invoke the method, and it gets set up when the thing is resolved the first time. And then finally you've got the methods array, which points to all the method objects. Every method has an 88-byte method object; these are all JDK 8 sizes, by the way. There's a const method, which has a fixed overhead of 48 bytes plus some extra tail. And there's a method data object, shown in gray because it's only created when you actually need it; it's generated for the JIT compiler, and most methods don't have one. So although they're quite expensive (you're talking 264 bytes plus a whole load of counters for profiling), they're not actually a sizable part of the metadata. There's also a pointer to an nmethod; that's a pointer to another part of the JVM's data, in the code cache, which I've shown in blue, so it's not counted in the stats you'll get. And there's a method counters object, 32 bytes, for basic profiling. Now, the tails of those two objects: in the const method, if you have local variables, they get compressed into the tail there, to keep track of them so you can identify local variables if you ever need them. If you have line number information, that also gets put in there in a compressed form. The exception table records exception flow; if you have any exception flow, that needs to be recorded in there. Annotations on the method will be a packed byte array in there. And the actual original method bytecode goes in there. So the size of that really depends on the complexity of the method and how it was compiled.
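Pulling those per-method numbers together, here's a rough back-of-the-envelope reconciliation (my arithmetic, using the JDK 8 sizes quoted in the talk) of how the fixed pieces of the method cluster relate to the roughly 240-byte per-method average from the stats:

```java
public class MethodCost {
    // Fixed parts of the per-method cluster (JDK 8 sizes from the talk).
    // The method data object is excluded: most methods never get one.
    static int fixedBytes() {
        int method = 88;        // the method object itself
        int constMethod = 48;   // const method fixed overhead, before the tail
        int counters = 32;      // method counters object
        return method + constMethod + counters;
    }

    public static void main(String[] args) {
        int average = 240;      // observed metadata bytes per method
        System.out.println("fixed bytes: " + fixedBytes());
        // Whatever is left over is the variable const-method tail: local
        // variable and line number tables, exception table, annotations.
        System.out.println("tail budget: " + (average - fixedBytes()));
    }
}
```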
The method data object has a whole load of different counters, all of different sizes according to what they're counting, used by the JIT compiler: counters for calls, for branches, counters that track method parameters against actual method arguments to do type profiling, and so on. So it's difficult to describe the complexity of that, but as I say, it's not usually a very large part of the count. So let's put all that to use and look at those stats again. Now, I sorted the EAP stats by class bytes, and one of the really surprising things that came out was that there were some very, very big classes. Compared with the basic allocation sizes you saw, these are enormous; they're all instance classes, and they're all logger classes generated by JBoss Logging. The reason they've got such a big value in there is because they've got massive vtables and itables, and that came out of the way they were designed; we'll see why in a second. You can see they've actually got loads of methods, and those methods both implement interfaces and are virtual. The way the code was designed, there's an interface called BasicLogger, which has about 12 methods, and it's implemented by a default logger class. It prints messages: warning messages, error messages. And the EjbLogger interface that the code is generated from declares a load of abstract methods whose implementations get generated. There are about 475 of those, because there are lots of different errors to report, and each says what type of message it is; so for an error log message, you call the error method. There's a message ID and a string that gets arguments formatted into it, plus optionally a throwable on the end. So this method here, getManagerTxStatusFailed:
It prints an error string without any format arguments, and it prints details of the throwable. Now, in the generated code, as well as the generated method, which is a virtual method and therefore has both an itable entry and a vtable entry, you also have an auxiliary method that retrieves the constant string. The idea was that you'd override this to build different locales: for each locale you'd have a different class implementation. So using overridable interfaces this way is incredibly expensive. You get 475 methods in the interface and in the generated class; you get 475 entries in each itable. You've got a whole load of unnecessary code there, and the vtables have not just 475 methods but 950 methods in the actual implementation. So the solution we're now implementing is to get rid of the interface and make the original class just a class with dummy methods, then generate an alternative replacement which you put in a jar earlier in the classpath. You compile against the dummy version but replace it with the real version. There's no interface, so there's no itable: that's 475 methods and 475 pointers in the itable removed. Similarly, for the implementation methods, you don't need the auxiliary string methods, which is another 475 methods saved, and you don't need a vtable, because these can actually be final methods: you never need to call them virtually, and they're never overridden, so you can make them final. So you can save 950 eight-byte words across the itable and vtable, and 950 method objects with all their ancillaries. It's an enormous saving. Obviously that doesn't apply everywhere across the code base; the biggest way to save space is just to use fewer classes. But it's interesting that you can look at your stats and maybe improve things. Right, I think I'm OK for time. Do we have time for one question? Yeah, that's a really interesting detail.
What's the cost for static methods? You've got all the method details, obviously, so it's going to be another method object attached to that class. What's interesting is the static data: when you allocate the java.lang.Class instance in the heap, you allocate some extra storage for all the static fields, and that's where they go. In the detailed stats, if you look at the instance count, the instance size and the instance bytes, they multiply up, except for java.lang.Class. The instance bytes for java.lang.Class are a whole load more, because as well as the basic standard fields of a java.lang.Class instance you've got lots of extra data for all the extra static fields. So the static fields get stored in the heap, where they can be easily garbage collected, but java.lang.Class has a variant size in the actual heap layout. Right, great. [Partly inaudible question about the generated logger classes.] There are loads, basically, and you really need to work it out. So, to rephrase: you basically have a footprint quadratic in the number of interface implementations, right? Well, for every interface you count the number of methods in that interface, and that's the number of slots you have in the itable; and you've got that for every interface you implement. I have a short, brief comment. The thing you proposed, in order to improve the space your objects are using, was to reverse the dependency, so you turn an interface into a class. Yeah, basically just to get rid of the interfaces, so you have a class that inherits the default logger behavior without having to go through interfaces.
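To close, here is a minimal sketch of the before-and-after logger design discussed above, plus the savings arithmetic. The class and method names are illustrative only, not the actual JBoss Logging or WildFly sources, and the per-slot and per-method costs are the approximate figures quoted in the talk.

```java
public class LoggerRedesign {

    // Before: an interface full of generated message methods, plus an
    // auxiliary string getter per message as a locale override point.
    // Every message costs an itable slot and two vtable-reachable methods.
    interface EjbMessages {
        void txStatusFailed(Throwable cause);
        // ...roughly 474 more message methods in the real generated code...
    }

    static class EjbMessagesImpl implements EjbMessages {
        public void txStatusFailed(Throwable cause) {
            System.err.println(txStatusFailed$str() + ": " + cause);
        }
        protected String txStatusFailed$str() { // overridden per locale
            return "failed to get tx manager status";
        }
    }

    // After: no interface, constants inlined, methods final so they are
    // dispatched directly. No itable, no vtable growth, half the methods.
    static class EjbMessagesFinal {
        public final void txStatusFailed(Throwable cause) {
            System.err.println("failed to get tx manager status: " + cause);
        }
        // ...the other message methods, also final...
    }

    // Savings for one such class, using the talk's figures.
    static long tableBytesSaved() {
        int slots = 950;        // vtable plus itable entries removed
        return slots * 8L;      // one pointer per slot
    }

    static long methodBytesSaved() {
        int methods = 950;      // interface methods plus auxiliary getters
        return methods * 240L;  // average metadata bytes per method
    }

    public static void main(String[] args) {
        System.out.println("table bytes saved: " + tableBytesSaved());
        System.out.println("method metadata saved: " + methodBytesSaved());
    }
}
```

The design point is the one from the talk: compile against a dummy version of the final-method class, then put the real implementation earlier on the classpath at runtime.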