Hi, I'm up here again. This talk is really not about my work, but about other work that's going on at Red Hat, so I'm going to do my best. This is the gratuitous picture of my dog and my husband playing fetch, right? This talk is all about making it easier for Linux and Java to play well together.

I joined Red Hat a few years ago, and a lot of the people I talk to look at Java as just this black box, right? They've got their Java source code. They run javac. They get something, they don't know what, and that's bytecode. And then they pass it to the java command, and stuff happens in there, and they don't know what that is. They really want to be able to use the kind of tools they're used to, like GDB and perf and all of those things, to see inside the JVM and see what's going on. So there are a couple of efforts that have been going on in the community to make that better.

So what I'm going to do is show some motivating examples of why you might want to use the GDB unwinder or the perf Java tool. That's going to take me about five minutes. And then I'll show you the demos of how they work, and you can see what's going on. But this should be a pretty short talk, so feel free to interrupt me and ask questions. I'm happy to take time.

All right, so I'm going to motivate the GDB unwinder with prime numbers. I like math. Everybody knows what a prime number is. And if you use the sieve of Eratosthenes, you can cross off all the multiples of two. Three is now prime, so you cross off all the multiples of three. And now you can see that five is prime, and you cross off all the multiples of five. So I wrote a Java Streams program to do this. All you do is generate a range of candidates, and then for each one you generate the range from two to the square root of that number and see whether there's a remainder when you divide. And so this will generate some nice, pretty prime numbers.
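The streams program described above can be sketched roughly as follows. This is a minimal reconstruction, not the speaker's actual code: the class name `StreamPrimes`, the method names, and the `parallel` flag are assumptions made for illustration. It generates a range of candidates, tests each one by trial division up to its square root, and reduces the result to a count so nothing large is printed.

```java
import java.util.stream.IntStream;

class StreamPrimes {
    // Trial division: n is prime if no d in [2, sqrt(n)] divides it evenly.
    // For n = 2 and n = 3 the inner range is empty, so allMatch returns true.
    static boolean isPrime(int n) {
        return n >= 2
            && IntStream.rangeClosed(2, (int) Math.sqrt(n))
                        .allMatch(d -> n % d != 0);
    }

    // Counts the primes in [2, limit]; count() is the reduction step.
    static long countPrimes(int limit, boolean parallel) {
        IntStream candidates = IntStream.rangeClosed(2, limit);
        if (parallel) {
            // Parallel streams run on the common ForkJoinPool by default.
            candidates = candidates.parallel();
        }
        return candidates.filter(StreamPrimes::isPrime).count();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1_000_000, true));
    }
}
```

Passing `true` adds `.parallel()` to the pipeline, which is the one-line change the talk goes on to examine.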
And I'm going to do a reduction on them at the end, just because I don't want to be printing out streams and streams of stuff when I show the demo. The thing that's interesting here is, if you've played with Java Streams at all, you can just add in a .parallel() there, and all of a sudden your program that's generating prime numbers will use as many cores as you have on the box. So that's awesome, right? That's great. This is the promise of Fortress, but put back into Java.

But as a low-level programmer, I want to know what's going on, right? What is that .parallel() really doing under the covers? So if you go into GDB now and you try to do a backtrace, you get something that looks like this, right? I don't know how many people have ever looked at that. When we were doing the read barriers and trying to debug them, I ended up looking at addresses like that and trying to translate them into methods by hand. The Linux programmers don't really want to do that. They really want to see Java symbols there.

So Andrew Haley wrote something called the GDB unwinder. Andrew Dinn, sorry, Andrew Dinn. And when you run that, you can see that this is just interpreted. I halted it really quickly and looked at it. And you can see up here that it all calls down into java.util.concurrent fork/join, which is what you'd kind of expect if you knew about this stuff at a low level. But if you're coming at this and you're just starting and you want to know what's going on, you can sort of see how this high-level streams program decomposes into actual Java method calls in a pleasant way.

All right, you can't read this. But basically, I showed you the interpreted frames. Once they get compiled, you can see that in this case the very low levels on the stack, which I'll show you in a minute, are still interpreted. The stuff that's executed more often gets compiled, like here. And some of the stuff up here gets inlined.
And the GDB unwinder can handle all of that and show you what your Java program's doing. In here, you can see all the interpreted methods, because these are only getting executed once. And you can comment out the parallel, and you can go look at the GDB output and see that, in fact, you're up here in evaluateSequential, and you never go to the fork/join stuff. So yay, that's cool.

Where can you get it? Andrew did put it up there. It's actually just a cute little script. It's a Python script that knows about GDB, and it knows about how Java stores its symbols, and it goes and matches them up. You can read more about how it actually works in detail there, and there are plans to try to get this upstreamed in some way. Again, Andrew Dinn wrote this.

So, perf. I don't know if you guys are Linux programmers or if you're Solaris or whatever, but perf is a tool for measuring hardware counters and figuring out things like cache misses or TLB misses in your code. But for a while it's been kind of tough to map that back to: where in my Java method am I doing something stupid with a global variable?

So I just wanted to take a fun program on random numbers. If you want to debug your random number generator, or figure out whether it's a good random number generator, what you can do is take the remainder. If I have three bins, I can take the remainder by three, see which bin each number falls in, and see if the bins are equally distributed. So if I got 91 as a random number, it would be in bin 1, and 17, because it's 15 plus 2, would be in bin 2. And so I wrote a program that does a really parallel random number test like this, but bangs it all into one array, because I wanted to be able to show you perf. So here's just a sample. If I ran with four bins and 10 threads for one second, you can see that Java's Random really does do a pretty good job of getting numbers equally distributed in the last bits. OK, get to the point. I've gone off on a tangent.
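A minimal sketch of that kind of parallel bin test, under some assumptions: the class name `BinTest` is hypothetical, the sketch runs a fixed number of samples per thread rather than for one second so the totals are predictable, and it uses `AtomicLongArray` for the shared bins so no increments are lost. The speaker's program may well have used a plain array.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLongArray;

class BinTest {
    // Each thread draws samplesPerThread random ints and increments the shared
    // counter for (value mod bins). All threads write to the same small array.
    static long[] run(int bins, int threads, long samplesPerThread) {
        AtomicLongArray counts = new AtomicLongArray(bins);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                Random rnd = new Random(); // one generator per thread
                for (long i = 0; i < samplesPerThread; i++) {
                    // floorMod keeps the index non-negative for negative ints.
                    counts.incrementAndGet(Math.floorMod(rnd.nextInt(), bins));
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        long[] result = new long[bins];
        for (int b = 0; b < bins; b++) {
            result[b] = counts.get(b);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] counts = run(4, 10, 1_000_000L);
        for (int b = 0; b < counts.length; b++) {
            System.out.println("bin " + b + ": " + counts[b]);
        }
    }
}
```

With a good generator the printed counts should be close to each other. Because every thread updates the same few adjacent counters, this layout is the sort of thing that tends to show up as cache misses under a profiler.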
But the idea is that by using many threads to write into this array of bins, I can cause some cache problems, and we should be able to see those with perf. So without JITted symbols, if you just ran perf and looked at L1 data cache misses, you'd get something that looks like this. And this is bad for several reasons, one of which is, if you look over here, these are actually in the same method, but you don't know that. And you've got addresses that don't really help. If you use the perf tool that I'm talking about, you can tell that 98% of your L1 cache misses actually happened in that code I was talking about.

So where can you get Java perf? It's supposed to be available in RHEL 7.4. And this was actually contributed by Google. But the perf stuff, the libjvmti part, is going into the Linux kernel, so you should be able to do this from Linux.

And the last one I'm showing is just Linux perfbar. This has nothing to do with making Linux and Java play better, but it actually makes the next demo section go better. I was an old-time Sun person, and I ran Solaris, and my favorite tool was perfbar, because you can watch your program running and you can see if there are any problems: if you're not getting the scalability, if for some reason you're blocking. And so Doug Lea, and then others who have improved on it, have put a version of this out that you can play with. This is sort of an idea of what it looks like. You can see that in this particular example I've got an idle thread, some partially idle threads, and some fully engaged threads. It gives you a visual example that, for me, is helpful to see how your program is running.

All right, so let me actually show some stuff. Here's my perfbar. I'm not doing anything right now. There's the GDB without the symbols, and it's not useful. If you run with the unwinder...
And the reason I have this is so I can tell when it's actually fully running the stuff. You can see that in fact it does work, and you do have your Java symbols all available to you.

Yes, it's only for Linux. I don't know. Is perf available for AArch64? This is the unwinder. This is not the perf tool. I'm sorry. All right, the question is, will this run on AArch64, the GDB unwinder? It is a Python script that goes and walks into the JVM and knows about GDB. So I don't know for sure, but I would put my money down that it would work on AArch64. Andrew Dinn, who wrote this, also worked on the AArch64 port. So I have to believe that if it's not there, it could be there if people ask for it. So if you need it and you don't have it, chf at redhat.com, tell me, and I will see what needs to be done to make sure that it's there.

Go ahead. Yes. Right, it shows the native frames and the compiled frames and the interpreted frames. So you can deploy these tools in HotSpot, like to inspect, or in the debug version. Okay. And also, it's worth pointing out that if you have disassembly loaded, you'd also be able to look at the disassembly and all the source lines and all the rest of the stuff. That's true as well.

So what I'm trying to do is make this, well, I'm trying to demo these two tools that I was told to demo, but what I'm also trying to say is that, from a Linux perspective, these are giving you insight into what's going on in the JVM. There are other tools that experts would use. I'm sure there are other tools available, but these are simple and easy to use. Oh yeah, what is that? Well, now you know what's there. Yes? Are the what? Yes.

We're gonna run perf. All right, this is gonna take a while. So I should have started this running while you guys were talking to me. But this is basically doing the random number testing into bins on my laptop. And it's using 32 threads. So my eight-threaded laptop is all out working.
If there are any other questions on GDB, now would be a great time. Or I could sing. What's your question? You say? The libjvmti is gonna be part of the Linux kernel. And perf is available in Linux. It is, it's available. You can do it in Fedora. Everything is in Fedora before it's in RHEL. I built the libjvmti myself. And if you go to the website that I showed for more information, they'll tell you how you can build your own libjvmti and install it. Yeah, it is actually in the mainline already. The JVMTI is there in the main tree. Okay, it's in the mainline, but it's not in RHEL yet. Yes. Okay, that's, okay.

Do we have any other questions? Anybody wanna know, you know, what the weather's like in Massachusetts? Give us another second.

I'm sorry, what does JVMTI do? This is just a way of giving Linux tools a way to find the Java symbols.

Yeah, so perf has JIT support. In order to do this, it needs to know how to map the addresses to the symbols. You can actually point perf to a particular place, saying this is the perf map for this particular process. It turns out that in Java there is a standard, well, yes, standard way to figure this out. JVMTI actually has the events that say: we are about to install the compiled code, these are the addresses of this code, and this is the symbol name. So this JVMTI agent is like 100 lines of code that actually just intercepts these events and constructs the perf map, and then perf takes over. That's it.

No, I don't think so. I think perf actually does not support that. And actually the JVMTI is just for a Linux app.

All right, so you have to do a separate step to actually inject the JIT. You have your perf data file that perf got, where there's nothing about Java, and then you inject the JIT data to get a perf data file with the JIT symbols. And then you can actually just show the output. All right, and this is no surprise.
This is what I showed you before, just that you can see that all of those L1 data cache misses happened in that one method that we expected them to. And that's really it. So this was a really short talk, but I wanted to let you guys know that, you know, there are efforts going on to make Linux more aware of Java. That's it.
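As an illustration of the address-to-symbol mapping discussed in the questions, here is a small sketch of a perf-map-style table. It assumes the plain-text convention of one `START SIZE symbol` line per compiled method, with hex values and no `0x` prefix. The class `PerfMapSketch` and its methods are hypothetical, and a real JVMTI agent is native code and may write a different format. The sketch only shows how a sampled address resolves to a method name once the table exists.

```java
import java.util.Map;
import java.util.TreeMap;

class PerfMapSketch {
    // One compiled method: start address, code size in bytes, and symbol name.
    private static final class Region {
        final long start;
        final long size;
        final String symbol;

        Region(long start, long size, String symbol) {
            this.start = start;
            this.size = size;
            this.symbol = symbol;
        }

        boolean contains(long addr) {
            return addr >= start && addr - start < size;
        }
    }

    // Regions keyed by start address so lookups can use floorEntry.
    private final TreeMap<Long, Region> byStart = new TreeMap<>();

    // Formats one map line: hex start, hex size, then the symbol name.
    static String formatLine(long start, long size, String symbol) {
        return Long.toHexString(start) + " " + Long.toHexString(size) + " " + symbol;
    }

    void add(long start, long size, String symbol) {
        byStart.put(start, new Region(start, size, symbol));
    }

    // Resolves a sampled address to a symbol, or falls back to the raw address.
    String resolve(long addr) {
        Map.Entry<Long, Region> e = byStart.floorEntry(addr);
        if (e != null && e.getValue().contains(addr)) {
            return e.getValue().symbol;
        }
        return "0x" + Long.toHexString(addr);
    }

    public static void main(String[] args) {
        PerfMapSketch map = new PerfMapSketch();
        map.add(0x7f0000001000L, 0x40L, "BinTest.run"); // made-up address and name
        System.out.println(formatLine(0x7f0000001000L, 0x40L, "BinTest.run"));
        System.out.println(map.resolve(0x7f0000001010L));
    }
}
```

An address that falls outside every known region comes back as raw hex, which corresponds to the unhelpful addresses shown in the profile taken without JIT symbols.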