Good stuff. All right. Hello everyone, and welcome to my presentation. My name is Adam Farley, and I'm going to be talking to you about Java and containers. I imagine many of you have had the pleasure of trying to run big old Java in an itty-bitty living space, and hopefully we'll be able to undo some of those headaches for you now, or at least enumerate some of them for you.
First of all, who am I? Can I speak a little bit louder? Okay. Does this thing have a volume control? Okay. How's that? Any better? Can you hear that at the back? Well, how about I just call Tim, and you can just repeat what I'm saying. Anyway, my name's Adam Farley. I'm an open-source developer. I work at IBM. It seems a bit of a contradiction in terms, but we do a lot of work in the open-source communities. My focus areas are in the OpenJDK community, where I've been checking in a lot of fixes, as some of you may be aware. AdoptOpenJDK is another open-source community that I contribute to, and that's the community that generates OpenJDK binaries that people can download and use across a lot of different platforms and a lot of different Java versions. We do a lot of good work there.
One thing you should know about me, which always comes up and yet is not on my slide, is that they have a habit of booking my presentations in the middle of lunch, and anticipating that that would be the case, I brought jelly babies. If you could crack that open and pass it along and back. There's also some Bassett's Allsorts, if you're more of a licorice person. If you could, thank you. That should keep you going until you find time to sneak out and pinch your sandwich.
So, the plan. The plan today is to define the software that we're going to be discussing, so I'm going to be detailing the bits of Java and the bits of containers that we're going to need to understand in order to follow the rest of the presentation.
We're going to be talking about what happens when you bring these two technologies together: the clash, the problems that occur, the resolutions you can use to resolve these problems, and then a breakdown of each of the different solutions step by step. And we've also got some commands you can use to make things easier on you, which I can detail later on if anyone's got any questions.
So, first of all, Java. You probably know this by now. I honestly didn't know the kind of crowd that we'd be angling at, so I just made sure to start from the basics. It's both a programming language and an executable. The executable is used to run compiled Java code. How is it structured? The executable is often packaged with many other tools inside a package called the JDK. So you've got the Java Runtime Environment, which is used to execute code. You've got the compiler, which you use to build code into bytecode. You've got JShell, which can be used to run individual bits of Java code piece by piece. You've got jlink. We're going to be going into a few of those later. The point is there are a lot of different tools inside the JDK. Java also includes a large class library, as you may be aware. Functionality has been grouped into a structure. I was going to have a little ditty on modules, but I think you've heard enough about how modules work and I think we can move on from that.
And where can the code be found? The code for the entire JDK can be found at the open-source OpenJDK project, which you then build to produce OpenJDK binaries, which then get JCK certified if you have an agreement with Oracle. And then the resultant Javas can be found anywhere that has that kind of agreement. So java.com, and other groups as well.
Okay, now on to containers. What are containers? Well, first of all, hands up. Who doesn't know what a container is? Yes.
Excellent, because I'm starting from the ground up here and I wanted to make sure my efforts were not in vain. What is a container? A container is not a lightweight VM, but if I call it a lightweight VM then I'm only wrong in ways that don't matter. What it is, is the illusion of a lightweight VM. It's a group of processes on your computer that are corralled and surrounded by little bits and pieces that give you the impression that you're working in your own little machine. There are several different container services. The most common one, I believe, is Docker, which is the one we're going to be focusing on today. You may have noticed the blue whale in some of the earlier presentations, but I wasn't quite sure about using their logo. So instead we've got a nice little picture of a dock, a docking port.
What do you do with these lightweight VMs? The point is that you're supposed to be able to spin up lots of them on the fly, in parallel. So maybe you want to create several different concurrent VMs to test lots of different variations on an operating system, or maybe you want to create a different virtual machine for each stage of your pipeline, or maybe you just want to have a fresh environment rapidly to do your development in. So maybe you require a specific set of dependencies, you require a specific folder setup, and you want to be able to just trash everything you've done so far, the mess you've made, and then have something brand new, virgin territory you can step into and make a brand new mess.
The structure. Each Docker container is its own little world, as we've just discussed, and these things are generated from things called Dockerfiles. It's a simple Docker command. You say, Docker, go look at this Dockerfile, make me a container, and it spins one up for you. You've got coordinating agents, so you don't need to worry about putting together these massive structures of containers yourselves.
You can just rely on a pre-existing agent such as Kubernetes, or you can use Helm for this purpose, and you basically just push the right buttons and it spins up an entire array of containers that do what you want, or close enough. And where to get them? Well, I know I usually get them via a simple apt-get; I imagine there's a website somewhere.
Okay, Java and containers. When you put the two together, why would you want to do this? First of all, isolation: the ability to spin up lots of different Java processes on the same machine without worrying that they're going to interfere with each other. Speed: you've got the ability to spin up and destroy, as I mentioned, a lot of predefined environments. You spin this thing up and it's brand new, you've got all the dependencies, you've got all the folder structures you need, and everything's there and ready for you to start doing what you're doing. And it's supposed to be easy once you've created the container. And the pipeline: you can combine these two concepts, isolation and speed, to create a string of different VMs. You can have one environment for development, one environment for building, one environment for testing, one environment for prototyping deployment, that sort of thing.
Now, this all sounds great, if a little complicated to set up sometimes. So what are the downsides when you combine Java and containers? Well, the first downside is size. You may not be aware of this, but the JDK is several hundred megabytes in size. Actually, it's 292, because I counted. If you've ever had to download this thing, especially over a slow internet connection, you'll definitely know that the JDK is not small. So sticking it into something that's supposed to be tiny and lightweight is a bit of a mismatch. Then you've got awareness. In fact, awareness is something we're going to be covering in detail as we go along. The JDK, at least up until a year or so ago for OpenJ9, and I think it is still the case with HotSpot.
The JDK doesn't know that it's running in a container. So if you say to the container, I want you to create an environment that has restricted resources, memory, CPU, that sort of thing, the Java you're running inside there doesn't necessarily know that that's the case. So it goes on and tries to do its business believing that it has many more resources at its disposal than it actually has. And as a result, it can run into a number of walls. And startup: the JDK isn't what I would call slow to start up, but relatively speaking, there are ways of making it faster. And if you're using containers as a sort of small, disposable, single-purpose process, where you just want to spin this thing up, have it do its thing, and then shut down, many times in rapid succession or concurrently, then startup time becomes a serious concern. So you want to minimize that as much as you can.
The response to some of these problems: I recommend you change the Java process rather than trying to change the container. Why? Because if you change the container to fit Java, then it can negate some of the strengths of the container itself. So if you create a container with unlimited memory, then that's great in that Java will like it, but it's bad in that, how many of those can you safely run on the same computer without running up against the constraints of the machine in question? You don't know. You're rolling the dice every time you spin up more than one of these things. So ideally, you want to clamp down on that resource usage and restrict the container to a fixed size, so you know exactly how many you can run on your machine. Changing Java also enables flexibility. You have no idea what the environment of tomorrow is going to look like. I especially don't know what the environment of tomorrow is going to look like.
And so if you make sure you change your Java to fit inside a nice, small, compact container, then maybe tomorrow your company is going to be embracing a mobile platform. Maybe you're going to be running it on a Raspberry Pi. You don't know where your code is going to be running in the future, and having strict resource control from the get-go, and having a Java process that can fit inside of that and not fail, means that you're more likely to be able to port into these environments with less of a headache. And that covers future-proofing as well.
Okay, now, one of the problems we mentioned was the size of the JDK. It's a big thing to fit inside a small container. Those of you who know about containers may also know that if multiple containers have the same set of images, they can share. That is, and I'm getting a little off topic, containers can share the files that are inside of them if the files aren't different. So if you've got the same Java running in two separate containers, maybe they can share a lot of the space on the hard disk. That being said, it still expands the size of the container. So maybe you want to make it smaller. In this case, we've got the JDK, which is your big package. It contains the Java Runtime Environment, which is everything you need to run your Java code. And it also contains everything else you might possibly need. It contains JShell and jlink. It contains the compiler. It contains a debugging tool, and so on and so forth. It also contains the Javadoc tool, if anyone's tried using that. So when you're developing code, the JDK is a good idea. But when you're executing code, it can be a good idea to just download the JRE on its own and run that independently of the JDK. The good news is that a lot of places that provide binaries, whether it's OpenJDK binaries or fully JCK-certified Java binaries, often provide JRE downloads right next to the JDK downloads. So if you ever wondered what the difference between the two was, now you know.
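As a rough illustration of the JDK versus JRE difference described above, the following minimal sketch asks the running Java whether a compiler is available to it. It rests on an assumption: that a missing system compiler is a reasonable stand-in for "this is a JRE-style runtime", which is a heuristic and not an official definition. It also assumes the runtime includes the java.compiler module. The class name RuntimeKind and the helper describe are made up for the example.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class RuntimeKind {
    // Hypothetical helper: turns the "is a compiler present?" answer into a label.
    // Treating "no compiler" as "JRE-style" is an assumption made for this sketch.
    static String describe(boolean hasCompiler) {
        return hasCompiler
                ? "JDK-style runtime (compiler present)"
                : "JRE-style runtime (no compiler)";
    }

    public static void main(String[] args) {
        // getSystemJavaCompiler() returns null when no compiler ships with this runtime.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        System.out.println(describe(compiler != null));
        // java.home shows which installation is actually being used.
        System.out.println("java.home = " + System.getProperty("java.home"));
    }
}
```

Running this with the Java inside a container image is a quick way to see whether the image carries the full development kit or only what is needed to execute code.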
jlink. jlink is a very handy tool. It was introduced in Java 9, and I've been running a bunch of tests with it in Java 11 to produce some quite interesting numbers. The concept is, as the slide says, you keep the modules you need and you discard the rest. In our example here, I wrote some code early on yesterday and put it through the latest JDK 11 build with HotSpot. We took a JDK that was 292 megabytes in size, and by jlinking it so that we only have the java.base module, which was the only module I needed at the time, we stripped that 292 megabytes right the way down past the size of the JRE, which was 108 megabytes, all the way down to 38 megabytes. That's slightly over a third of the JRE size and a hair over one eighth of the JDK size. So if you're trying to save megabytes, jlink is definitely a very useful tool.
Memory footprint. If you're going to fit Java and the Java program you're running into a container, you're going to want to use as little memory as possible, so you can have a nice small container and run lots of them on a single machine. Now in this case, you can get a much smaller memory footprint by using an alternate VM known as OpenJ9, which was open sourced a while back. And as you can see here on these charts, here are the footprint sizes. I don't know if you can read this at the back, but this says footprint size after startup with -Xmx1G, which is the setting for the heap. Going into what the heap is would take a bit more detail, but it's a strong indicator of how much memory the JVM is likely to use. There are other forms of memory that Java uses, native memory and the like, but the heap is one of the biggest uses of memory. So if you can strip that down, you can be fairly confident that the rest of the process is going to fit inside a nice small box. As seen on the left, you've got OpenJDK with HotSpot. This is how much memory it uses.
And then as you go along, you can see that OpenJDK with OpenJ9 and various options uses that much less. So it's a nice, efficient little system. I was also contacted yesterday with some updated numbers for JDK 11 when running DayTrader3. Who's familiar with DayTrader3? Hey, one person. The short version is that it's a tool you can use to measure how much memory you're using, in this case. So you run DayTrader3 and you can analyze the footprint, and it's supposed to remain consistent whether you run it once or a dozen times. So the numbers you're getting should be fairly definitive. The output we got from HotSpot was 423 kilobytes. That was the footprint for HotSpot, and for OpenJ9 it was 57% less, at 182, which is good, a very efficient little system.
Now, the benefit of a memory-limited container is that you know exactly how many you can fit on your machine without making the machine's operating system start to starve for resources. The problem is that if your JRE is not aware of the container's restrictions on resources, and we'll be going into why that is a bit later, it can be killed. There's a process called the out-of-memory killer. It's part of the Docker process. If you take up too much memory in a memory-restricted Docker container, then you hit the limits, you try to go past them, and Docker's out-of-memory killer basically takes out your container. It takes out the environment that you're using to develop, to test, whatever. What you can do to make sure you're staying inside that limit is either set -Xmx, the maximum heap size, to much lower than the memory limit for the container, or pick a VM that has container awareness built in, such as OpenJ9, so that the automatically specified heap size is smaller than the limits imposed on the container. Some people also have problems when the heap is set too small.
This is more of an efficiency concern. First of all, why would you want to set the heap too small? Ideally, you want to set the heap as big as you can, so that no garbage collection needs to happen and your process can run from beginning to end without having to worry about stealing some runtime to tidy up after itself. The reason you'd want to set it smaller than necessary is that when you're running Java on a given machine, you have to make sure there's enough space for all the other processes sharing that machine to operate. On Linux especially, if you take up too much memory, then the operating system itself can come along and clobber you, to make sure that you're not threatening existing services and that the essential stuff can continue to run. Containers don't have that problem so much. There's nothing sharing your container other than that which you put in there. So you know that there's less competition, and you can be less timid in setting the size of your heap. If you know that Java's the only thing running inside your container, you can quite confidently make the heap much bigger than you would on a computer with equivalent resource constraints. There we go.
Startup speed. Yes, it's another OpenJ9 plug. Please stay with me. First of all, on the left, you've got OpenJDK with HotSpot. This is how fast it starts up. The bar doesn't really tell you anything until you compare it to something else. On the right you have OpenJDK with OpenJ9, and as you can see, the startup time is a little slower. Ah, so that's interesting. Which means that if you're just looking to pick a Java and run it, then you're probably better off with HotSpot.
But suppose you're planning on running the same process multiple times in the same container, or you're planning on running a container that has access to the outside world. Anyone who's used Docker will be familiar with -v, which you can use to give your Docker container a little window into the world outside. If you do that, then you can create a persistent ahead-of-time compilation cache. When you run your Java program, the more you execute any particular part of the code, the more Java notices that you're running that code over and over again, and it thinks, ah, if I recompile this thing using the JIT to make it more efficient, then it will run faster. But then you turn your program off and it forgets all that, which is not optimal. The AOT, ahead-of-time, cache allows you to store the results of that compilation. So the next time you run that same Java program, that same process, it's going to pull that directly out of memory. So instead of stealing some runtime to compile your code to make it faster, it can just use that and get right up to 100 miles an hour from the get-go. So if you're prepared to take a little time out and set up a persistent cache between uses, you can see the startup time improves dramatically. So if you've got different containers doing the same thing over and over again, this is definitely worth looking into.
Okay, I think we've got, what, three minutes left? Okay, then I will speed up. Either talk faster or talk less.
Good out-of-memories and bad out-of-memories. We've mentioned that when you run out of memory in a container, the out-of-memory killer from Docker will kill the container and take down your entire environment. Now that's not necessarily something your Java process can handle.
However, if your Java process has the heap set up properly in advance, then it can throw an out-of-memory in the form of an out-of-memory exception that doesn't kill the container, which is less destructive. Now, okay, it doesn't always kill the container. Sometimes you just have a lot of page rage, which is what we call it when there's a lot of paging going on, trading memory pages between the memory and the hard disk. So the short version of all this is that it's a good idea to set the maximum heap size lower than the restrictions of the container. And this is basically a reiteration of what I mentioned earlier.
I was also going to mention CPUs and threads. I understand we're down to two minutes. Will I dare? I think I will. Again, environment awareness. Java often doesn't have it. OpenJ9 does, but it's only automatically enabled in JDK 11, so before that, you're still in a bit of a tricky bind. Basically, when you run Java, because it's not aware of things called cgroups, which we won't go into, it can think it's got many more resources than it actually has, because it's looking at the resources the machine has rather than the resources your specific container has. So it'll say, oh, look, I've got 100 CPUs, I'm going to run this many GC threads and this many JIT compilation threads. Unfortunately, it doesn't necessarily have all of that CPU runtime to play with. It maybe only has one or two CPUs, if you've specified that when starting up the container to restrict its resource usage. And that can be a problem, because then Java's just spun up a dozen GC threads and a dozen JIT threads and it's got to make time for them on a single die, which means that now all your runtime is being chewed up by JIT and GC threads, and your program isn't actually getting enough runtime to do anything. How can you fix that?
Use OpenJ9, or Google the options you need to manually restrict the number of CPUs your Java process thinks it has access to. I've got a list here of the commands, if anybody wants to go through them with me. They're fascinating.
Okay, here's a summary of the stuff that we've covered, and I realize that we're out of time. We've defined a list of problems. We've defined a list of common solutions to those problems. And I hope everybody managed to get some of the sweeties in the end. And here are some links. Anyone still awake? Thank you very much for listening. Thank you.
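For the memory and CPU awareness points made in the talk, a minimal sketch like the one below can show what a given Java believes about its environment when run inside a container. It reads nothing from the container itself. The container's memory limit and the headroom to leave for non-heap memory are passed in by hand, in MiB, and both are assumptions of the example. The class name ContainerCheck and the helper heapFits are hypothetical.

```java
public class ContainerCheck {
    // Hypothetical helper: true when the heap ceiling, plus the headroom reserved for
    // native memory and the like, still fits under the container's memory limit.
    // Written as a subtraction so a very large maxHeapBytes cannot overflow.
    static boolean heapFits(long maxHeapBytes, long containerLimitBytes, long headroomBytes) {
        return containerLimitBytes - headroomBytes >= maxHeapBytes;
    }

    public static void main(String[] args) {
        final long MIB = 1024L * 1024L;
        Runtime rt = Runtime.getRuntime();

        // The largest heap this JVM will try to use (set by -Xmx or chosen by default).
        long maxHeap = rt.maxMemory();
        // The number of processors this JVM thinks it can use.
        int cpus = rt.availableProcessors();

        System.out.println("Max heap the JVM will use: " + (maxHeap / MIB) + " MiB");
        System.out.println("Processors the JVM sees:   " + cpus);

        // Optional arguments (assumed, supplied by the user): limit and headroom in MiB.
        if (args.length == 2) {
            long limit = Long.parseLong(args[0]) * MIB;
            long headroom = Long.parseLong(args[1]) * MIB;
            System.out.println("Heap fits under the limit: " + heapFits(maxHeap, limit, headroom));
        }
    }
}
```

If the processor count printed inside a CPU-restricted container matches the host machine rather than the restriction, or the heap does not fit under the limit, that is the lack of awareness described above.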