Greetings, everyone. Thanks for coming in. Hopefully everyone's in the right place. On the invitation and the menu screen, you saw "Java" or "Cassandra on steroids." That didn't give a whole lot of other information, but I hope it was enough of a fishing line to bring you in and find out what exactly I meant by it. I work for Azul. Azul is a purveyor of Java, a platform used ubiquitously across the enterprise for all sorts of use cases. Today we're here to learn how Java applies to the Cassandra use case. Specifically, we'll cover the typical challenges we see when talking with our many customers across the industry who are tackling these use cases; we'll look at some specific use cases; we'll learn about our solution to a lot of these problems; and we'll see what kind of benefits you can realize with Cassandra running on the Azul platform. So without further ado: the customers we see and interact with aren't just using Cassandra. Cassandra is one of many tools and frameworks that enable companies to run their business. Kafka is another: Kafka handles the streaming side, but you have to put the data somewhere, and that's where Cassandra comes in. All of this becomes a nice pipeline that feeds into, or enables, a company's business use cases, the revenue-generating activity. Stream the data, store it somewhere, and then make it searchable through frameworks like Solr and Elasticsearch. And all of these have a common denominator: all of them use Java, if you didn't realize that. I hope that's a little bit of an epiphany.
But yes, Java powers all of these. There's a lot of debate today about how relevant Java still is, with Python on the rise, and Node.js, and all these other frameworks out there. But Java is very much relevant, and it's used under the covers of all of these frameworks, whether you realize it or not. So what are we to do with the rise of data? We have to harness it, because data lost is opportunity lost. Since 2020, when COVID hit, there was a spike in the amount of data that companies were storing and trying to make heads or tails of. Everyone was living at home placing orders, whether through Netflix or Uber Eats, so companies were building profiles, and their first-party data was being stored everywhere. Everyone was trying to harness that data. Not surprisingly, 96% of companies have a data strategy, but a data strategy is only as good as how well you actually make use of the data. And we're only going to see more of a spike: experts in the field are projecting 23% annual data growth, reaching 175 zettabytes of data by 2025. Again, if you're not harnessing that data and making use of it, you're losing out. So how do we do that? You can throw more infrastructure at it, but then you have to tune to be able to process that data in real time. And if it's Java-based, there are issues inherent to Java that our platform solves for. That's what we're going to cover next: challenges you may be very familiar with because you're dealing with them yourself in your organization. And that leads us into the use cases.
So as I mentioned earlier, people are trying to make use of the data they're getting from their e-commerce and entertainment websites: the Netflixes of the world, the Krogers, Target, Walmart. All of these companies want user data, because that first-party data feeds into retail and the other revenue-generating activities these companies run, and they have to store it somewhere. And with the advent of IoT and edge computing, data is transferred from our pockets without any of us giving consent in the moment, because we gave consent earlier, like with Google Maps. How do we get traffic conditions as you're driving down the freeway? You've consented to that, and that data is being transferred and stored somewhere to be analyzed, et cetera. So how are we going to help with all of this, especially where Java is powering the processing of this data? It all boils down to the JVM, the Java Virtual Machine. The JVM is the thing that makes Java one of the most loved languages out there: you can write code once and run it anywhere. But again, there are inherent problems that Java brings along with that, and that's where Azul Prime comes in with the solution. I mentioned some of the problems, and with Cassandra, with the storage of the data, it boils down to garbage collection. Garbage collection pressure here is primarily driven by the high amount of memory that Cassandra demands of an application. In general, you have the memtable. The memtable is the middleman: data is staged in memory before it's written to disk.
By default, the memtable is allotted about a third of the overall heap before things get flushed out to disk. So as things get added to the heap, you can run into a situation where you hit out-of-memory errors. How do you prevent that from happening? You have to clean up your heap through garbage collection. To reduce those out-of-memory errors, traditional OpenJDK does what's known as a stop-the-world event, where it pauses everything and cleans up the heap to make room for the new objects being allocated, the ones Cassandra is using in the memtable. What Prime does instead is clean up continuously. It does not use stop-the-world events to clean the heap; it collects while your node is taking traffic from the endpoints sending data to be written to Cassandra. Effectively, concurrent garbage collection with Azul Platform Prime resolves this problem. Now, a lot of Cassandra users know about the garbage collection problem and try to tune it away by shrinking their heap. But there's a trap in doing so that they may not realize: if your heap is small, garbage collection kicks off more often. In trying to solve the problem, they can actually make it worse, because if the heap is small enough, you're garbage collecting continuously. So you have to find the middle ground, and watch out for tuning to the point where it's actually hurting you. And there's another gotcha in that regard.
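To make that tuning trade-off concrete, here is a sketch of the kind of jvm.options entries involved. These are standard OpenJDK flags, but the values are illustrative assumptions, not recommendations for any particular workload:

```shell
# Illustrative jvm.options entries for a Cassandra node on OpenJDK
# (example values only -- tune for your own workload).

-Xms8G    # fixed heap: equal -Xms/-Xmx avoids resize churn...
-Xmx8G    # ...but remember: a smaller heap means MORE frequent GC cycles

-XX:+UseG1GC                # G1 collector
-XX:MaxGCPauseMillis=200    # a pause-time GOAL -- G1 treats it as a target, not a guarantee
```

Note that the memtable budget itself is set on the Cassandra side (for example, `memtable_heap_space_in_mb` in cassandra.yaml; the option name varies by Cassandra version), so heap size and memtable size have to be considered together.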
If you're garbage collecting all the time, or tuning toward that by shrinking your heap, you can also create what are known as memory fragmentation problems: a new object comes in, but the contiguous free spot is too small to hold it, so it takes the next available spot, which creates fragmentation. And if the garbage collector can't keep up with this process, you could eventually hit the OOM killer. So that's another gotcha you want to avoid. Again, Platform Prime does all its garbage collection without stopping the world; it runs continuously, and it avoids all the tuning exercises you'd otherwise have to do with OpenJDK. There's also the row cache to support. The row cache is very important in Cassandra because of how important performance is: it's stored in memory, and you want as much in-memory data available as possible to improve the performance of your Cassandra clusters. So as things get written to your heap, you have another GC problem if collection isn't keeping up with the row cache and the other important data your Cassandra clusters need, if you're not keeping things neat and tidy there. One last thing I'll mention: there's also an issue with tombstones. Tombstones relate to the deletion of data from disk. Data sitting on disk also has a pointer in memory, and the tombstone gets written into the heap; the row actually lives on the heap for a while until things get deleted. This, again, can cause garbage collection issues, because your heap stacks up, and if collection doesn't keep up, you can have another out-of-memory issue.
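One way to see whether fragmentation, row-cache growth, or tombstone build-up is actually driving your pauses is to turn on GC logging before reaching for tuning flags. A minimal sketch on OpenJDK, using JDK 9+ unified logging (`app.jar` is a placeholder):

```shell
# Log every GC event with timestamps and heap occupancy before/after each
# collection; long pauses and a heap that never shrinks are the tell-tales.
java -Xlog:gc*:file=gc.log:time,uptime,tags -Xms8G -Xmx8G -jar app.jar
```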
But again, the garbage collector in Azul Prime takes care of all these issues, and you don't have to do any tuning; it works out of the box with Platform Prime. Here's a nice illustration of one customer's workload running on Oracle JDK, which is effectively the same as OpenJDK. You have seconds along the bottom and latency on the left, and you can see this customer, running on OpenJDK (Oracle, in this case), was having a lot of latency issues caused by GC. On the right is the same workload after they switched to Prime. The latency on the right is very hard to see, but it's in milliseconds: it went from hundreds of milliseconds down to single digits with the same workload. Now, beyond garbage collection, here's a nice illustration of another inherent problem with Java. When you start up a fresh JVM, the thing everyone loves Java for, write once, run anywhere, the first time the JVM sees code, it doesn't know about it and doesn't know how to optimize it. So it inherently goes through a learning curve: the process of progressing bytecode from interpreted mode through tier one up to tier two, tier two being the most optimized. As the JVM learns about the code, it optimizes based on a threshold. You hit a certain point in profiling, say 10,000 iterations on a particular method, and the JVM says: okay, I've seen you enough times, I'm not going to interpret you anymore, I'm going to optimize you through what's known as a tier one compilation. It's a very cheap optimization in terms of CPU, but it's necessary; you can't go to tier two without first doing tier one. So from the yellow, which is interpreted, you move up to the green, tier one, and you run there for a little bit.
Then there's another threshold, and once you've crossed it, you move into tier two, where your profiled code gets upgraded to full optimization. That's very expensive, because more CPU is required to take tier one code and turn it into tier two. But eventually your tier ones turn into tier twos and you flatline, meaning you've hit steady state: all the optimizations your application needs to run at full speed have happened, and you've effectively reached the top of the hill. With Azul Platform Prime, we have a warm-up technology called ReadyNow, where all of that accumulated JVM experience can be profiled in a previous run and fed to other JVMs when they start up. The new JVM takes that input profile, which was recorded earlier, maybe in production, maybe in a lab environment through a canary, so you can run all sorts of use cases to exercise your code and then feed the result into a production instance. When you start up, you go straight to the top of the hill: instead of working slowly through interpreted, tier one, tier two, you're doing tier two optimizations right at the very beginning. And if you're using Kubernetes or another orchestration manager, you can set your readiness probe, the flag that says "I'm ready to take traffic," to fire only after your optimizations are finished. NASDAQ traders love us for this, because obviously they want to start running their trades on the most efficient code. But the use case isn't limited to traders: the retail companies of the world, the Walmarts and all the others, want the most optimized code running when their customers come in and start placing orders.
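You can watch this warm-up curve yourself on stock HotSpot, and on Prime you can record and replay it with ReadyNow. The HotSpot flag below is standard; the ReadyNow flag names are taken from Azul's documentation, so check them against your Prime version, and `app.jar` is a placeholder:

```shell
# Stock HotSpot: print each method as it is promoted between compilation
# levels (0 = interpreted ... 4 = C2, the "tier two" of the slide).
java -XX:+PrintCompilation -jar app.jar

# Azul Platform Prime with ReadyNow: record a profile from a warmed-up run,
# then replay it so a fresh JVM starts near the top of the hill.
java -XX:ProfileLogOut=app.profile -jar app.jar   # training run
java -XX:ProfileLogIn=app.profile  -jar app.jar   # warmed-up start
```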
That way they don't suffer latency or any of the other bad things that happen when traffic arrives before the code is ready to process it. And this is all done by our Falcon JIT compiler; that's the name you see at number two. Our garbage collector, at number three, is called C4, the Continuously Concurrent Compacting Collector. Thank you. And our warm-up technology is called ReadyNow. Those are the three names you see up there. Now, when it comes to maintaining SLAs in Apache Cassandra, Cassandra is very scalable, and getting the best performance you can out of it depends on the hardware you're running on, but also on your choice of JVM; Cassandra cannot run without Java. Here's a benchmark we did, on the right. The blue line at the bottom is Prime, running below the dotted line, which is our SLA at P99.9 of less than 100 milliseconds. The other flavors you see there include CMS, G1GC, Shenandoah, and ZGC. ZGC is something we're seeing quite a bit of in the industry with the newer versions of Java, especially Java 21, which includes a generational ZGC garbage collector. I'll just point out that Azul solved the garbage collection problem over a decade ago, and we're now iterating on over a decade's worth of algorithms; ZGC is just coming to market with this, so we like to think we have a leg up. Nevertheless, in some cases it can work well with your Java use cases. But in the case of Cassandra, we have a proven track record that goes beyond just garbage collection. As I mentioned earlier, with the ReadyNow warm-up and the Falcon JIT compilation, all three of those together, it's not just about solving the latency issue; it's also about solving throughput.
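To make the SLA discussion concrete, here is a minimal sketch (my own illustration, not Azul's benchmark harness) of checking recorded request latencies against a tail-latency target like "P99.9 under 100 ms," using the nearest-rank percentile:

```java
import java.util.Arrays;

// Minimal sketch: nearest-rank percentile over recorded latencies,
// used to check a tail-latency SLA such as "P99.9 < 100 ms".
public class SlaCheck {
    // Nearest-rank percentile: the smallest sample that is >= p percent
    // of all samples (p in (0, 100]).
    static double percentile(double[] samplesMs, double p) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    static boolean meetsSla(double[] samplesMs, double p, double limitMs) {
        return percentile(samplesMs, p) < limitMs;
    }

    public static void main(String[] args) {
        // 990 fast requests plus 10 slow outliers: the median and P99 look
        // fine, but the P99.9 tail blows the 100 ms budget.
        double[] samples = new double[1000];
        Arrays.fill(samples, 5.0);
        Arrays.fill(samples, 990, 1000, 250.0);
        System.out.println("P99   = " + percentile(samples, 99.0) + " ms");
        System.out.println("P99.9 = " + percentile(samples, 99.9) + " ms");
        System.out.println("meets P99.9 < 100 ms SLA: " + meetsSla(samples, 99.9, 100.0));
    }
}
```

The point of the example is the same one the benchmark makes: averages hide GC pauses, and only the high percentiles reveal whether the SLA actually holds.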
If you have more efficient code, you need less infrastructure, because our JIT compilation engine produces that more efficient code. And of course there's a drop-off; I'm not going to claim our garbage collector is infallible. You can see it does finally break, but it breaks at 90,000 QPS here, versus the others you see. The goal is to find that edge, the cliff, every GC has a cliff, and back off from it. Just a couple of use cases. I like to bring up Taboola because it's a great example of Java and what they were able to do. For those of you who aren't familiar, Taboola is an ad-serving company: 3.2 billion daily web pages touched by their engine, 30 billion daily recommendations, and all of this running on 8,500 servers spread across the globe. One of the challenges they were having was that their ad-serving engine was actually stopping for about 15 minutes to do garbage collection, one of those bad things inherent with OpenJDK. Obviously, you can't have your nodes stopping for 15 minutes, so they did what everyone inherently does: over-provision, so you don't have to rely on the thing doing the cleanup. Unfortunately, over-provisioning means more money. After they put in Prime, they were able to reduce their server infrastructure by 30%, which is huge, and their database footprint by almost 50%. So it's a great use case that we like to talk about. Feedzai also had a similar issue with a legacy JVM. They make a framework banks use to catch fraudulent and bad activity, so they need to run a check on each transaction that's about to take place. And again, garbage collection was the problem.
Some of the transactions that should have been flagged as fraudulent were slipping through the cracks, and we helped solve that problem for them. So overall, as I mentioned earlier, there are a lot of retail use cases for this, and the banking industry loves us for the ReadyNow technology. But it doesn't stop at one retail company or one travel company; all kinds of companies use us. That's our NASCAR slide for the day. With that, I have about five minutes for questions, if there are any. You have a question? [Audience asks whether this is a JVM product that sits on top of Cassandra.] Yes, good question. Azul Platform Prime is a Java platform with optimized versions of the garbage collector and the JIT compiler. OpenJDK, or the Oracle JDK that you all know and probably use today, has two components in the JVM: the garbage collector and the JIT compiler. We've also added a third component that doesn't exist in that version, called ReadyNow; you saw that profile mechanism, where you can teach a JVM from previous experience. The garbage collector in Azul Platform Prime doesn't do stop-the-world events; it works concurrently and continuously in the background while your JVM is executing instructions, whether for Cassandra, Kafka, or any Java workload. The problem inherent with Java is stop-the-world events to clean up the heap: objects being allocated, promoted to old gen, new objects coming in, and the JVM having to make room, otherwise you might have an out-of-memory problem. Azul Platform Prime solves for that through the way it manages memory in the heap. No more stop-the-world events, so garbage collection shouldn't be an issue. But we can't solve memory leaks; I'm not going to claim a silver bullet for that, that's on the developer. Cleaning up the heap and keeping it tidy, though, is something we solved over a decade ago.
So this is an optimized version of Java that's specific to Azul. Any other questions? There's a microphone for you so that we can record the question. I think you can just talk. [Audience asks about shipping a ReadyNow profile with a product.] Yeah, that's a good question, and I would say yes, you can. But be careful: the profile is specific to how someone uses your product. If you can predict the usage and it's the same for all your customers, that works; if each customer uses your product differently, the right profile might be different. In that case, obviously you wouldn't ship anything without testing it first, but yes, that's a legitimate thing you could do. [Audience asks about the size of the profile.] The footprint of a written-out profile can be anywhere between 100 and 200 megs. We have a solution that handles this for you automatically. You can do it by file: generate the profile first, build it into your images, and ship it that way. But we also have a cloud-based solution that runs in Kubernetes and does this automatically for you. You give it a profile name, and it's actually better to have generational profiles that go through iterations: Gen 1, Gen 2, Gen 3. Typically we cut it off around Gen 3; beyond that it's extraneous, you're not really learning anything more. After the third generation, any other instances of that same application automatically get that profile. And then as your application changes, you create new classes and new methods, or the flow of traffic changes, you'd want the profile to change too, because the speculative optimizations would no longer be relevant to the changed code; you'd want to rebuild it. The cloud-based solution takes care of all of that.
Otherwise, like you said, you can ship it with the product and do it file-based. 100 to 200 megs is about the size. Good question. We solved all the world's problems here today. I'm so happy. Right? Thank you.