Hi everybody. My name is Volker Simonis. I'm working for a small company called SAP. We are doing virtual machines, Java virtual machines: a commercial one called SAP JVM, and the PowerPC/AIX and S390 ports in the OpenJDK, which I'm leading. Recently we also started doing a binary distribution of the OpenJDK called SapMachine. You can grab some stickers after the talk if you want.
Now, let's concentrate on class data sharing in the HotSpot VM. The other speakers have laid a good foundation for this topic, so I will concentrate on how class data sharing actually works in the HotSpot VM and how the new application class data sharing, which will come with Java 10, will work, and I will share some details about the implementation and the current restrictions. My slides are all on GitHub, so you can look at them later if you want.
Class data sharing is supposed to cache pre-processed class metadata on disk to improve startup performance and reduce memory footprint. We've heard that several times today.
So, a short history of class data sharing in the HotSpot VM. It was introduced quite some time ago, in JDK 1.5 in 2004 already, but at that time it was quite restricted: it only supported the client VM and the serial GC, and it could only be used for caching system classes. At that time it was in the Sun JDK, and it was included in the initial contribution for OpenJDK 6 and 7. Then for some time nothing happened on this topic, but in JDK 9 there were several improvements. Support for the server VM was added, for the G1, Parallel and ParallelOld GCs. Support for shared strings was added as well, so not only classes but also strings could be shared. And with Java 9, support for application class data sharing was added, but in JDK 9 it's only in the commercial Oracle JDK, as a commercial feature. Coming with OpenJDK 10, with JEP 310, application class data sharing has also been added to the OpenJDK.
So this is actually a short overview of Andrew's talk, just for people who didn't see it. We have on the right side a Java class, and for every Java class HotSpot has to maintain a certain kind of metadata, which is stored in the metaspace. Previously this was the permanent generation; since Java 8 it is the metaspace, and this is all C++ objects. So there's an InstanceKlass, and the InstanceKlass links back to the Class object which represents the type of the object. Then we have the ClassLoaderData, which also links to the class loader in the Java heap, and we have a ClassLoaderDataGraph which links together all the ClassLoaderData, so effectively all the class loaders, so the VM has a chance to iterate over all the classes which it has loaded so far. We have the constant pool, which was mentioned before, and the constant pool cache, which is used mainly to speed up interpreter operation by caching resolved entries of the constant pool. Then we have the Method and the ConstMethod, which contains the bytecode and other things which aren't supposed to change. And finally we have the nmethods, the compiled methods, which reside in the code cache, which is again a different memory area and which we won't touch in this talk.
Okay, so how does CDS work? Before we can start to use class data sharing in the HotSpot VM, we first have to create the shared archive. This is an offline step, and this is also one of the problems of the class data sharing implementation in HotSpot; I will talk about this at the end of my talk when it comes to the limitations. By just calling java -Xshare:dump, a lot of stuff happens, and I will briefly go through these steps. First, the VM allocates space. You see the address; that's actually the address where the shared archive will be mapped into memory. So every instance of HotSpot will map the shared archive at the same memory address.
You can configure this with a command line option if you're not happy with this address, but by default the archive is mapped at 32 gigabytes.
Then the VM loads the classes to share. You may be surprised, but your JDK or JRE, if you download it, already contains a class list, which is a plain text file with a list of classes. This is generated at build time. When you build the JDK, as was shown in one of the previous talks, there is a small Java application which tries to mimic a typical small Java application. It uses some of the containers, some of the util classes and so on, and from this the class list is generated, which is part of the JDK. When you run -Xshare:dump on your host, this class list is used to preload all these classes and create the shared archive.
Then a lot of work is done on the preloaded classes, like class verification for example, so this doesn't have to be repeated later on when the shared archive is mapped into your Java process. You see the number of classes: there are about 1,200 classes in this class list file. Some unshareable information is removed. For example, if you remember the image I showed you before, there are links from the metaspace to the class loader, and obviously these are different in every VM instance, so they cannot be stored in the shared archive and such information is removed. Pointers which point from one class to another are relocated.
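As a rough illustration of the class list mentioned above, here is a minimal Python sketch that reads such a plain text list. It assumes one class name per line in slash-separated form and treats lines starting with '#' as comments; both the sample content and the helper name read_class_list are hypothetical and not taken from the talk or from the JDK sources.

```python
def read_class_list(text):
    """Return the class names found in a class-list text.

    Assumption: one class name per line; blank lines and lines
    starting with '#' are ignored.
    """
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line)
    return names


# Hypothetical excerpt, only for illustration.
sample = """# hypothetical excerpt of a class list
java/lang/Object
java/lang/String

java/util/HashMap
"""

classes = read_class_list(sample)
print(len(classes), "classes to preload")
```

The real dump step does much more than read names (verification, relocation, removal of unshareable data); the sketch only shows how simple the input file is.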
Then, as I told you, with Java 9 we can also store some string objects and symbol tables in the shared archive, so these are dumped into the shared archive as well. Finally, the link to the Java mirror, which is actually the pointer back into the Java heap, is also removed, because every time the shared archive is mapped into a Java process this link has to be updated; it's obviously different in every running instance. And then the whole archive is dumped into a file.
You see the archive contains read-write and read-only spaces, because some parts are really constant, but for example the InstanceKlass has a pointer to the Class object, to the Java mirror, and this has to be patched for every running instance. It is like every shared library: the libc library, for example, also contains constant parts and read-write parts which can be patched. Pages which are changed by a process become private to that Java process, so they are actually no longer shared when more than one Java process is using this archive. At the end you see the parts which are mapped into the heap. You see this is another address, and this is why this only works with the G1 GC, because G1 has the possibility to map some parts into its memory regions, and these contain the shared strings.
In the end we generate this shared archive, which is stored by default in your JDK directory under lib/server in the file classes.jsa, for Java shared archive. Again, you can change this name and this location; we will see that later on in the other examples. The size of the shared archive is about 18 megabytes.
So, just a trivial Hello World, a HelloCDS sample Java program: we just print out "Hello CDS" and then we call readLine, so the process doesn't stop and we can analyze it once it's running. Now, how do we use CDS? We just put the -Xshare:on option on the command line and run our program, and nothing special happens. So how can we verify that our
classes really have been loaded from the shared archive? Well, we can misuse the new logging API and instruct it to give us a log of the class loading. When we run it this way we get a line for every class we loaded, and every one of these lines contains a reference to the location from where the class has been loaded. If the class has been loaded from the shared archive, you see the source "shared objects file". Not all classes have been loaded from there. For example, the Java internal class path file loader class (I've abbreviated the name) was obviously not in the initial class list, so it has not been dumped into the shared archive file and is loaded from the base module, right from the file system. Finally, my example application has also been loaded from the file system, because it was obviously not in the class list and therefore not in the shared archive.
We can just check how many classes have been loaded from the shared archive. If we grep for "shared objects file", we see that 477 classes have been loaded from the shared archive, and if we do the opposite, so grep for all the classes which have not been loaded from the shared archive, we see just about five. So for a toy application like Hello World, where your estimation of about 500 classes was right, we see that most of them get loaded from the shared archive. So that's fine.
So, performance. Christian told us that we shouldn't use time for performance measurements, but this is not really a serious performance benchmark, so I just used time to do some measurements. We see that when we run with the shared archive we get about a 9% performance improvement for the whole application, and if we measure the time until our application class gets loaded, which is usually the last class for a Hello World program, we see that it's about 30% faster. I will show some numbers later for Tomcat, but unfortunately I couldn't reproduce the 20% or 30% which Michael mentioned. I'm not sure if I did something wrong, but I
think in my measurements it was not more than 10% to 13% improvement.
So let's see how much memory we can share. Again we run the HelloCDS program in the background and then we use the pmap tool. Christian introduced the pmap tool. I think it's really a great tool and probably the only tool that gives the real, true information about memory usage, because it takes the information right from the kernel. It has a lot of options; you could even use -XX to get a lot more information. I just pull out some of the information which I think is interesting for us.
If I run pmap on our HelloCDS process, you see the binary which is mapped into memory, and you see the Java heap. By default on my machine the Java heap has about 2 GB, but of these 2 GB just about 129 MB are really mapped, so read-write; the other part of the heap is reserved but not mapped. That's actually the difference between the RSS and the virtual size, which can confuse people. And then finally you see our shared archive file, which gets mapped to several locations. The first two lines, with FF, are the part which contains the shared strings and which is mapped into the Java heap by G1. Then you see the different parts: the read-write parts, which can potentially be patched by the instance which loads the shared archive file, and the read-only part, which is truly shared. The read-only part will always be shared between all the instances, while the read-write part may at some point in time be privately mapped to your own instance of Java, and then you will not have sharing.
The bytecodes are a good example. People think of bytecodes as constant, but what do you do when you do debugging? Well, you patch the bytecodes, so you cannot really put them into read-only memory. They are in the read-write section, and most of the time they are shared, but when you start debugging your application and patch your bytecode, it's copy-on-write, so you just make this page private for your
process, and only the other processes which don't touch this page will share the memory.
Okay, so this was a toy application, just to show you how everything basically works. Now I've tried to use CDS, and the new AppCDS which comes with Java 10, with Tomcat and nGrinder. It's some kind of application; I really don't know what it does, it was just a big WAR file which I wanted to deploy in order to get a lot of classes loaded. CATALINA_OPTS is an environment variable which influences how Tomcat starts. We can now launch Tomcat with class data sharing on; it will use the shared archive which we created in the first step inside our JDK directory. We also use -Xlog to log all the loaded classes into a file, just to get a baseline of how many classes have been loaded. When we run wc on that, we see that Tomcat with this nGrinder application loads about 12,000 classes, and when we now grep for the number of classes which are loaded from the shared archive, we see that it's just about 1,100. If you remember, the shared archive which we created initially contained about 1,200 classes, so most of them have been used by Tomcat, but Tomcat uses a lot more classes.
So how can we improve this situation? Well, we have a command line option called -XX:DumpLoadedClassList. When we run Tomcat with that, the VM itself will print out a list of all the classes it has loaded and thinks it can share in a shared archive. So let's just use that and dump the list into a file. When we look at that file, we see that it's just about 3,000 classes out of the 12,000, because by default CDS only allows sharing of system classes which get loaded by the boot class loader. So that's about 3,200, but all the other application classes cannot be shared by classic class data sharing. So here application CDS enters the scene, which promises to allow class data sharing also for application classes and even for custom class
loaders. Again we start Tomcat and add the -XX:+UseAppCDS option. You can try this yourself with the early access OpenJDK 10 builds. When we look at the results, that's actually pretty disappointing: we see that about 300 more classes can be shared now, but that's not a lot. So what's the problem here? -XX:DumpLoadedClassList still only dumps the classes loaded by the boot class loader, the platform class loader and the application class loader, also known as the system class loader. Tomcat itself is a dynamic application; it doesn't load many classes through the application class loader from the class path, but it loads a lot of classes with its own custom class loaders, and unfortunately the -XX:DumpLoadedClassList option doesn't handle these classes.
So what can we do here? Well, let's first use the file that was generated with the 3,500 classes and see how we can regenerate the shared archive with at least these classes. We use the -Xshare:dump option like on the first slide, and we turn on application class data sharing. Now we can tell the VM the location of our class list, so this time we don't take the default class list which comes with the JDK but the one we have just created. We can also tell the JDK where to write the created shared archive, because we don't want to store this archive in the JDK directory but in a separate place. For some unknown reason this is a diagnostic VM option. Afterwards I will file a bug to change this, because I don't see why SharedClassListFile is a normal product option while SharedArchiveFile is a diagnostic option which has to be unlocked first. Anyway, that's how it is today.
When you look at the created shared archive, you see it's much bigger now. The shared archive which was created from the default class list was about 18 megabytes in size, and this one is now nearly 50 megabytes. It obviously contains more than 3,000 classes, while the first one
only contained about 1,200. Now we can use this shared archive by turning class data sharing on, using AppCDS, and giving the location of the shared archive, because we don't want to use the default shared archive but the one we've just generated.
But AppCDS, as I told you, promises to support class data sharing not only for the classes loaded by the application class loader but also for classes loaded by custom class loaders. So how can we achieve this? We saw that -XX:DumpLoadedClassList doesn't help here. Sorry, somehow my slides got out of order. So how can we create from scratch a class list that enables application class data sharing also for custom class loaders? Unfortunately there is no documentation on how to use application class data sharing for custom class loaders, so we have to look into the HotSpot code, at the classListParser.cpp file. There we see that the class list file can contain not only class names but also another format: class names with an ID, with references to the super class and the implemented interfaces, and with the source of the class. And when we look at the output of the class load logging at debug level, we see that this output contains exactly this information. It tells us not only the loaded classes but also the class loader which loaded them and the source files they were loaded from.
From this log output we can easily assemble a class list file which contains the required information. I have written a small tool called cl4cds, class list for class data sharing, which is on GitHub. You can easily google for it; there are not many programs with this name. So we can now run Tomcat with class load logging in debug mode and dump that to a file. We can then just call the cl4cds tool to create a class list from the class loading log, and then we can use
that list. When we do that, we see that the archive is now about 100 megabytes big. When we run Tomcat with this shared archive and count the loaded classes, we see about the same total number of classes, and when we grep for the shared classes we see that it's about 9,000 now. So that's better. It's still not optimal, but it's better.
What's the problem? Well, AppCDS still has some limitations. For example, it doesn't support old Java 1.5 classes, and obviously this nGrinder application still contains some old classes. Dynamically generated classes also can't be shared, because they are generated at runtime. A class which is loaded several times by different class loaders can only be shared once, for one class loader; that's another limitation. Classes which get modified by the class loader itself can't be shared either. I don't mean modification by an instrumentation agent: there are class loaders, for example in NetBeans, which load a class, make some bytecode changes to it and then feed it to the VM, and that's why it's not easy to use AppCDS with NetBeans. Obviously that cannot be handled by the CDS implementation in HotSpot, because the dumping of the shared archive is an offline step which has no idea how the class loaders modify the classes. Another problem is that bytecode rewriting is switched off, for the reasons I've told you before, so interpreter speed might be slower.
So that's actually it. Here's another link to the slides, and if you have any questions, please come to me. I will be here all day. Thanks a lot.
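As a note on the extended class list format described in the talk (class name, ID, super class, interfaces, source), here is a small Python sketch that formats such entries. The exact keywords "id:", "super:", "interfaces:" and "source:" are an assumption based on the description of classListParser.cpp above; the helper name class_list_line, the sample class names and the JAR path are hypothetical. This is not the cl4cds implementation.

```python
def class_list_line(name, class_id, super_id=None, interface_ids=(), source=None):
    """Format one entry of an extended class list.

    Assumption: fields are space separated and introduced by the
    keywords 'id:', 'super:', 'interfaces:' and 'source:'.
    """
    parts = [name, "id:", str(class_id)]
    if super_id is not None:
        parts += ["super:", str(super_id)]
    if interface_ids:
        parts += ["interfaces:"] + [str(i) for i in interface_ids]
    if source is not None:
        parts += ["source:", source]
    return " ".join(parts)


# Hypothetical entries: two boot classes and one class from a custom loader.
lines = [
    class_list_line("java/lang/Object", 0),
    class_list_line("java/lang/Runnable", 1, super_id=0),
    class_list_line("org/example/Task", 2, super_id=0,
                    interface_ids=[1], source="/tmp/example.jar"),
]
print("\n".join(lines))
```

The IDs let a later entry refer to its super class and interfaces by number, which is the information that the plain list of class names does not carry.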