Everything we're about to say is a lie, so don't base any decisions on it. We're here to tell you a few well-chosen words about Java, and particularly Java in a world of containers. So containers and clouds and microservices and all those buzzwords managed to fill up the whole room just based on those buzzwords, I guess. We'll go through a bunch of things. We have 25 minutes, so we'll try to sprint through it and hope for the best at the end.

Let's start at a high level with Java in a world of containers. In general, in a world of containers, we expect a few different things. The first is that we expect safety and security to become increasingly important over time. We've seen that obviously on the software side, but more recently also on the hardware side: security, well, it can be a bit of a thing, let's say. Over time we expect that to just increase and become even more important.

We also expect there to be a lot of sprawl, and sprawl comes in a few different flavors. The first is that we expect there to be a very large number of instances. You're basically encouraged to create instances and processes on the fly, and to create a lot of them: to run many small instances rather than a few large ones. That's one of the major shifts we're seeing. We also expect there to be a lot of different applications, where in the past maybe you ran the same thing over and over again. With containers, you're almost encouraged to create custom containers and custom images on the fly when you need them instead, so there's a mix of applications. We expect these containers to run on a mix of different machines, heterogeneous systems where not all machines look the same, and also with different configurations. Even if you can, and in some cases it may help to, create containers that all look the same, in many cases you will probably tailor-make them with different settings and so on and so forth. And we expect people to use container-specific tooling for allocating resources.

So, bringing that over to what it means for Java: we like to think that Java is ideally suited for running applications in this type of environment. There are many different reasons; this slide lists a few of them. It's everything from the fact that, after all, we have a managed language and runtime that gives you all the benefits of not having to care about malloc and free and pointers, and we do bounds checking for you, and all those nice things that we've come to know and love about Java. With the JVM, we abstract away the exact environment you're running in, both when it comes to operating systems and CPUs and all that. As John Rose mentioned earlier today, the whole platform is a reliable one, both in terms of stability, obviously, but also in that we value compatibility very highly. It's a key design goal of ours to make sure that new functionality is backwards compatible, and even forwards compatible, and to really make sure that when you take your app and deploy it on another version of the JVM or the JDK, it will just continue to work and you'll get all the nice benefits of moving up. The JVM also adapts to changes in the environment and makes sure you take advantage of whatever is there, compiling code as needed and all those nice things. And of course we have a very rich ecosystem built around all of this.
Many different frameworks and libraries all make this a good environment to run Java in. So with all that said, we are obviously committed to continuing the work we've been doing, but also committed to making sure that Java is the first choice when it comes to deployments in container environments, or cloud environments in general.

How many people in here have played around with Docker? Excellent. How many people have created a Docker image with Java in it? Okay, most of you. Good. So I'll go through these next few slides really quickly; they're just background in case you've never touched Docker before. The point is that it's super simple to create a Docker image with Java inside of it. You create a directory structure not unlike this one: you create your Dockerfile, you put your JDK tar.gz in there, and your application, in this case Hello World. Inside of that Dockerfile, you set up your base image, that is, which operating system you effectively want to base this image on. You say, put the JDK inside of that, set up some environment stuff, and tell the Docker image what to run when it starts up. Then you go through two different steps, one of which you do ahead of time, months ahead of time if you will: building the actual image. In this case I'm building that image and giving it a tag, my/jdk9; it could be anything. And then from that image you stamp out individual containers when you start up the applications, so to speak. Obviously you can reuse the image across machines, and across time as well.

Now, when you saw the Dockerfile a moment ago, you had to really look at it, because every line in that Dockerfile is going to result in a new layer in the composed image. Those layers can be shared and cached. They have to be composed incrementally, but the layers can be shipped around independently of each other. There is also a cost to having lots of layers, though, in that the overlay file system will impose some performance and startup cost for each layer. Generally a three-to-five-line Dockerfile like that one is pretty minimal, and there are tools as well to squash your images into a single layer if you really need that performance.
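To make that concrete, here's a rough sketch of the layout and the two steps; the tarball name, paths, and base image tag are my own illustrative assumptions, not necessarily the exact ones from the slides:

```sh
# Hypothetical layout: a Dockerfile, the JDK tarball, and HelloWorld.class
# together in one directory.
cat > Dockerfile <<'EOF'
# Base image: whatever OS you want to build on.
FROM oraclelinux:7-slim
# ADD unpacks a local tarball into the destination directory.
ADD jdk-9_linux-x64_bin.tar.gz /opt/jdk/
# Environment setup (the unpacked directory name is illustrative).
ENV PATH=/opt/jdk/jdk-9/bin:$PATH
COPY HelloWorld.class /app/
# What the container runs when it starts up.
CMD ["java", "-cp", "/app", "HelloWorld"]
EOF

docker build -t my/jdk9 .   # step 1: build and tag the image, ahead of time
docker run --rm my/jdk9     # step 2: stamp out a container from the image
```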
So now that we know how to create Docker images, one of the things I'd like to talk about a bit is creating custom JREs. It turns out that if you take the JDK, the good old JDK we've been used to dealing with for a long time, and just bake that into all your Docker images, or into your deployments in general, it can be a bit on the large, unnecessarily large, side, let's say. The whole JDK weighs in at around half a gig; 568 megs, I think, for the JDK on Linux x64. And as much as it contains a lot of great things, the stuff you absolutely want, java.lang, java.io, NIO, things like that, it also contains a lot of things that you probably don't want. We're trying to help you not want them over time; CORBA, for example, we're trying to deprecate and remove. But that said, there's a lot of stuff in there that you either don't need, or that at least some applications are unlikely to need. With JDK 9, and with Jigsaw in JDK 9, we now have the support, both in terms of modules and tooling, to create JREs on the fly as you need them, tailor-made for your application.

In the past, we only provided you with basically one JRE. It was the JRE, and we decided what went into it. But now you have the option of saying, well, I want these and these pieces. The one thing you cannot exclude is java.base; that module is needed to start up the JVM at all. But beyond that, you can freely select what you want to include for your application. And we also have tooling to help you select what should go into the JRE. Specifically, there's the jdeps tool that now lives in the JDK itself. You can point it at a class file or a jar file, and it will spit out the modules that it depends on. For example, here are just the first few lines of running that on a Tomcat jar I found somewhere, and it's saying that you need java.base, java.instrument, java.naming; there were a few other things, but basically that's it. Very helpful.

Creating the custom JRE itself is very straightforward. You use the jlink tool that Dan mentioned earlier, and you give it a few options. The example here shows how I'm creating a JRE with only the java.base module in it. I'm basically saying: spit out a JRE in this directory, the directory in question being my-jre; pick up the modules from here, and since the modules live inside the JDK, you point it at your JDK; and I only want to include java.base. A few seconds later, you'll have a JRE that is tailor-made to include only java.base, and you can run Hello World with it. It should be noted that the application, Hello World or Tomcat or whatever you're running, does not need to be modularized to use this. You can create a custom JRE for your good old Java applications even before modularizing them.

So here's a Dockerfile using a multi-stage build, which I think is a new feature since the 17.05 release of Docker, using their release numbering system. This is good because it lets you do the build-stage stuff at the start. We're using FROM openjdk:9 AS java-build, doing some jlink stuff in there, which is what we in the Fn Project would do, and then in the second stanza you can copy things from the java-build stage, and you only end up with the data and layers from the second half in the actual executable image. So if you're using Docker in production, for sure look at the multi-stage build; it makes our lives easier.
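As a sketch of how those pieces fit together, here's a hypothetical multi-stage Dockerfile that runs jlink in the first stage; the image tags and file names are assumptions, and it presumes the build image ships the jmods directory and sets JAVA_HOME:

```sh
# Optional first step: ask jdeps which modules the application needs.
jdeps -s app.jar    # prints lines like "app.jar -> java.base"

cat > Dockerfile <<'EOF'
# Stage 1: use a full JDK image only to produce a java.base-only runtime.
FROM openjdk:9 AS java-build
RUN jlink --module-path $JAVA_HOME/jmods \
          --add-modules java.base \
          --output /my-jre

# Stage 2: only the layers from this stanza end up in the final image.
FROM oraclelinux:7-slim
COPY --from=java-build /my-jre /opt/jre
COPY HelloWorld.class /app/
CMD ["/opt/jre/bin/java", "-cp", "/app", "HelloWorld"]
EOF

docker build -t my/hello-jlink .
```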
So what kind of benefits can you get by doing all of this? Why would you create your own JRE? It turns out that one of the key things is size, although as Dan mentioned, there's more to it than that as well. I'm going to talk a bit about size specifically over the next few slides, and the first one here shows a chart with three bars on it. The largest one is the JDK; as I mentioned, it's 560 or so megs, and again, it contains a lot of great stuff, much of which you may not need. The other two bars show what a corresponding java.base-only JRE looks like. This is where I've run jlink and said: only include java.base. I end up with something ballpark 50 megs, which is obviously a pretty sweet saving, almost 10 times, or even more than 10 times, smaller.

It may well be the case that a lot of applications need more than java.base, though. So as a somewhat more realistic number, we've created something we call, quote, "netty": a set of modules that is likely to cover your Java application. It's likely that it will work if you include those modules, but obviously it depends on exactly which application you have. As you can see, we're still down to almost 10 times smaller, 60 megs or so, which is pretty sweet. It's starting to get bearable, let's say; still a bit large if you compare it to, I guess, your average C++ application or something, but then again, you get all the benefits of Java and the safety and all of that. I should point out that the netty bar, and all of these bars, don't contain any application. It's just the JRE itself, the core libraries and the JVM. So obviously, depending on what you add on top of this, it will become bigger.

The size can be optimized further. One of the key ways of doing that is by specifying a command-line option to jlink called --compress (there's a short form of it as well). Basically, what that does is take some of the resources we have in the JRE and compress them. So it's not entirely unlikely that you'll see something ballpark 25% savings on top of all of this, so you can squeeze it down even further. There is a runtime cost to decompressing, though; it's the classic trade-off in software engineering between the image size and the time it takes to start your application. That's the point. OK.
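For example, a minimal sketch reusing the earlier jlink invocation, now with compression turned on (level 2 means zip compression of resources; the exact savings will vary by application):

```sh
# Same jlink invocation as before, plus resource compression.
jlink --module-path $JAVA_HOME/jmods \
      --add-modules java.base \
      --compress=2 \
      --output my-jre-small

# Compare the two runtime images; something ballpark 25% smaller is typical.
du -sh my-jre my-jre-small
```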
So now we've optimized the JDK, or the JRE if you will. It turns out that if we now go back and look at what a full Docker image looks like, it will be something along the lines of this. On the left-hand side there I have Oracle Linux, because I work at Oracle, but think any Linux distribution. And obviously we can keep optimizing that top 46 megs, but it's not going to make a big difference if the bottom part is still the big one, right? So it's time to start looking at what we can do on that side.

One of the things we can look at, very much in the same vein as using jlink on the JDK side, is stripping down the Linux distribution itself. It turns out that there's a version of the Oracle Linux Docker image called the -slim version. It's basically the same thing, but with everything stripped out that an application is unlikely to use. And that's obviously a good thing: if the application doesn't need it in the end, why include it at all? In general it's true that in a world like the one we have with Docker and containers, you're unlikely to log into your container and start managing it and running things on the command line. As long as it runs the process, the application you have inside of your image, it doesn't really matter what the base image is. As long as, in our case, it runs Java, why would we care about the exact distribution running underneath it? And it's quite possible to start another container on the same host that enters the same namespaces. So if you do need those debugging tools, you can start another container containing just the debugging tools, attach it to the same namespaces, and use a shell from there if you haven't got one in the image.

Right. So with that in mind, surely there must be something we can do about those, let's say, 100 or so megs of base image. And that's where we cue up Alpine Linux and the musl C library. Alpine Linux is a distribution whose tagline is that it's small, simple, and secure, and it's basically targeting environments much like the ones we're describing here with Docker and containers.

The key interest for us on the Java side is that it uses the musl C library. Normally on Linux, you'd see the GNU C library. We don't even think about it all that much; we almost associate glibc with Linux. But obviously, there's nothing preventing you from running another C library. And the musl C library's tagline is that it's lightweight, fast, simple, lots of different things. In the end, it should just be another C library; all C libraries should be the same, it's a standard API, and all that. And that turns out to be almost true.

Hence, we created OpenJDK Project Portola. Thank you, John Rose, for the name; it comes from the fact that Alpine Road in the Bay Area leads to Portola Valley. Yeah, nerds, exactly, we're all nerds here. The goal of the project is to provide a port of the JDK to the Alpine Linux distribution, or more accurately, because it turns out that this is what's relevant for us, to the musl C library. The status of the project right now is that it's sort of working. We have the code; all the changes that we know are needed are in the repo, and it's working. Only, we haven't integrated it into mainline yet, and I'll get back to that in a second. Basically, in a nutshell, the changes needed were relatively straightforward. If you have code and you've only worked with a single library for a long time, you tend to rely on things that are, in the end, library-specific, in our case, C-library-specific. A lot of it was just shaping up the code so that it actually compiles. I happen to be the lead of this project, and I happen to be the person who made those changes; again, they're fairly straightforward. We found one latent bug in Hotspot while doing this, and that fix has been integrated into mainline, so it's ready to go.

Okay, the good part about Alpine Linux is that it weighs in at four megabytes. That's pretty impressive. Basically, it is the C library and nothing else. It does have BusyBox inside of it as well, so you do get some tools, for obvious reasons. But it's very, very small. And what that means is that we can take our, whatever it was, 150 or so megabytes, and turn that into more like 50 megabytes. Now we're starting to get to the point where, yeah, we could optimize it further, but it's starting to look pretty good. If someone points out that 50 megabytes is still enormous, then, as Mark mentioned earlier, Substrate VM is what you want to look at for stripping it down even smaller.

I had a section in here that I tore out which talked about the minimal VM. There's a version of the Hotspot JVM that only contains a subset of the components, which brings the size of the VM, the libjvm.so, down from approximately 20 megs to five. So that will save some additional space, but then you don't get all the functionality; there's a trade-off.

I have two questions coming up on this. The first one is: is there interest in a port to Alpine in here? Hands up. Okay, a bunch of hands. Second question: anybody interested in helping maintain it? Hands up? I tried. Okay, yeah. So again, there are early access binaries that we publish; they're next to the normal EA binaries of JDK 10 and all that. So please go ahead and download them.
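If you want to experiment, here's a purely hypothetical sketch of an Alpine-based image; the tarball name is made up, so substitute whatever the actual Portola early-access download is called:

```sh
cat > Dockerfile <<'EOF'
# Tiny base image: Alpine, with musl as its C library.
FROM alpine:3.7
# Assumes a JDK built against musl (e.g. a Portola EA build);
# the file name below is illustrative, not a real download name.
ADD portola-ea_linux-x64-musl.tar.gz /opt/jdk/
ENV PATH=/opt/jdk/bin:$PATH
CMD ["java", "-version"]
EOF

docker build -t my/alpine-jdk .
docker run --rm my/alpine-jdk
```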
If you have feedback, if you have questions, if you want this, do speak up, because we need the feedback to understand whether it's worth moving forward with this.

So we've spoken a lot about image size. It's important because cold start is a thing. If you're running on Kubernetes or some other management engine, you don't know exactly where your container will start, and the image has to get from the Docker registry, or wherever it lives, to there. And even if it's already there, there's a startup cost proportional to the size of the image, to do with how Docker assembles its copy-on-write caches. So... size matters. That... I didn't mean it to come out that way.

Okay, let's touch on a few of the features we are working on that are specifically well suited to Docker, let's say. The first one is something a few people have already touched on, so I'm going to go through it extremely quickly: sharing across instances. Now that we've optimized the size of a single instance by bringing down the size of the image and all that, what can we do to leverage the fact that we're running multiple instances next to each other on the same machine? On the operating system side, you already get that through shared libraries; that's the standard mechanism for doing exactly that. So for the JDK, we already get that type of sharing for the shared library that makes up the JVM itself and all that. But what about Java class data, and in general the stuff that is specific to Java?

Class data sharing was already touched on, but in essence, it's doing exactly that. It takes data that we have inside of the JVM, data that was read from the class files and massaged into something that the JVM likes better, and what this functionality does is dump that data down into a file that can then be used later as you start up additional instances. That file is around 18 megs, so you've got an image size versus performance trade-off that is something you need to try; it depends on so many different things whether it helps you or not. It's good to point out that when we created this many years ago, it was meant more for multiple processes on the same machine, but since Docker is, deep down, sharing the same file system, if you cleverly put the archive in a shared image, it can be shared across different Docker containers as well. And we have seen good startup time improvements. I know Volker has not seen exactly the same numbers, let's say, but we've seen good startup time reductions and also footprint savings.
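As a sketch of that flow, using the JDK 9-era flags (archive location and flag spellings can differ between versions):

```sh
# 1) Dump the class data archive once; by default it lands as classes.jsa
#    under the JDK's lib directory (which may require write access there).
java -Xshare:dump

# 2) Later instances start with the archive mapped in, instead of parsing
#    and massaging the same class files all over again.
java -Xshare:on -cp /app HelloWorld
```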
I'll mention quickly that we're also working on AOT, which does the same kind of thing but for JIT-compiled code; experimental so far, but being worked on.

The last part we're going to talk about is how we are adding support in the JDK for honoring the limits that you can set when you create and configure Docker containers. The JVM, as you all know, has a bunch of different ergonomics inside of it that look at the environment where it's running and try to size things correctly: the heap size, the number of threads it will set up, and so on and so forth. And with Docker, it turns out that this is not transparent, as in, we need to do something explicitly; it doesn't just work to look at the same old numbers. So a lot of work has gone in lately into making sure that the JVM is aware of these things. Some patches landed recently and will be in JDK 10, which is released next month, am I right?

So Docker allows you to restrict compute resources, mainly in terms of CPU and memory. For CPU, you can use time-sharing, where the scheduler gives your container a relative share or quota of CPU time, which can be oversubscribed or not, or you can pin the container to a specific set of CPUs. And with the patches that Bob Vandette has added to JDK 10, the JVM will try its best to reflect that, both in the choices made by the heap and thread ergonomics, and to your code and your libraries' code through Runtime.availableProcessors(), which is looked at by Elasticsearch, Netty, lots of libraries that use it to size their thread pools. There's a lot more detail in a write-up at this link, if you want to find out everything I know about it.

What we found is that on a very large machine, starting lots of JVMs in containers was slower than starting lots of JVMs in virtual machines, even with the hypervisor overhead. This was on JDK 8. The reason was that thread pools were being sized to the host and not to the container limit we were trying to specify: everything wanted to create 144 threads, and the containers were just too slow. But we've done some experiments with JDK 10, and we'll be moving to JDK 10 when it's released, and this is not a problem for us anymore. So, that's good news.

There's another slide as well. Ah, yes. Sure. It's been mentioned a few times already that the heap isn't the only memory a JVM uses, so there are also a couple of flags for allocating a percentage of the memory that's available to the container as heap. You obviously don't want to give 100% of it to your heap, as you'd run out of memory for everything else. So, that's it for that.
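Putting that together, a rough sketch of running under container limits; the image tag is hypothetical, and the percentage flag is the JDK 10 spelling of the heap-sizing option:

```sh
# Limit the container to 2 CPUs and 512 MB; the JDK 10 JVM's ergonomics
# honor those limits, and here we also cap the heap at 75% of the
# container's memory rather than 100%.
docker run --rm -m 512m --cpus 2 my/jdk10 \
    java -XX:MaxRAMPercentage=75 -XshowSettings:vm -version
```

Yeah. Last slide. That's on honoring resource limits. We've done a lot of other work as well, making sure that our serviceability tools, jcmd, jstack, and all those, work across container boundaries too. More work is coming. As I said, we're committed to making sure that Java runs really well in containers, so keep your eyes and ears open for more coming up. Thank you very much.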