So, some of you have used NoSQLBench before, and this won't be a deep technical session, so it'll be a recap of the basics plus some demonstrations. This is primarily geared toward people who need to know what it's about and why they might use it. I was also considering something: we have a new version in the works based on the Java 21 runtime. If there's enough interest, I may do an online session to get people bootstrapped on that version, so before the end of this I'll try to remember to put my email address up; send me an email if you're interested.

So, of the five of you, how many have used NoSQLBench before? All right. I know you have too. So you probably know what it's about. It's essentially a tool that speaks multiple protocols, which lets us do testing across different, disparate systems using the same common data set, access patterns, and so on. We use it every day at DataStax for multiple purposes. It's an open source project, a community project: DataStax sponsors it by paying for its development, but DataStax also benefits from it, because we get to use it every day for light testing or serious testing, depending on our needs. I'll wait a minute and then recap. All right, just to recap: it's a multi-protocol tool. You know it already; I don't have to recap it for you. We'll just keep going; people will keep coming in, it's early.

It was based on a few different projects. One of them set out to verify that we could do procedural data generation in a way that was configurable by users who didn't want to do development, the recipe-oriented approach. That was proven out by a project originally called Metagener, presented at the Next Generation Cassandra Conference. The first version worked, but it wasn't that accessible. The next version, called Virtual Data Set, used Java paradigms a little more directly to let Java developers write the functions that get used, and that was much more accessible and much more usable. EngineBlock is just a runtime testing harness that handles aspects like metrics collection and abstracting the per-cycle operations. Paired with those, we have driver adapters. Driver adapters are like cartridges you plug into the runtime that let the engine speak multiple protocols: if you have an adapter for your protocol, it can speak your protocol.

You might ask why we built it, and originally the answer was very simple: we needed to do a kind of test that no other tool would let us do. It was a really simple, monotonically increasing time-series workload, and none of the options out there let us do that simply. As simple as it may sound, it was too difficult to achieve and too difficult for users to manage. The tools that purported to do it didn't do it very well, and they weren't accurate; the tools we had gave us mixed results. I tried to verify, cycle for cycle, that a tool was telling us the truth about what was happening, and that was futile. So we built a simple tool to solve a simple problem, and over the years we've evolved it to do more and more, adapting to the flexibility we need for multi-protocol testing.

Another benefit is that when you have a workload, it's a single file.
The runtime to run it, if you're on Linux, is a single binary. And if you're using one of the built-in workloads, it just comes with the binary: you reference it by name as if it were on your file system, and it works as if you had it yourself. So we'll get through the non-technical aspects of this, get to some demos, and then take questions. If you have any questions along the way, please remember them.

One of the things it can do for you is bridge the gap between different testing needs. You have engineers, you have customers, you have users. If they have a common tool, the concepts become more tangible; people talk about how to do testing using the same language, and that streamlines things quite a bit. Otherwise, every time you're trying to figure out how to do a one-off test, you're reinventing the wheel all over again. That's a waste of time, and those efforts often give suboptimal results, because it takes about three or four iterations on a testing tool before it starts working the way it should.

Along the way, we figured out how to do some pretty cool things. Procedural data generation was the foundation on which every other part was built. You see it says O(1), or near O(1), over here. That's because we want the testing clients to behave efficiently; we don't want the testing clients to impose on the behavior of the system under test. So we try to make them behave operationally consistently, meaning they don't have wild variance in their behavior when you change a distribution in your data set. We employ multiple techniques, like the alias method, which some of you might know, and something called inverse cumulative distribution sampling. They let us make all of our bindings work in fixed time, regardless of the size of the data set you're simulating.

We also figured out how to support multiple protocols with a relatively consistent descriptor language. It's a YAML format: you specify your workload template, which is composed of op templates, the order (basically the access patterns), and the data that goes into them. And if you know the format, you know the format; you can use it with any of the supported drivers.

We also made it so you can script what happens in a testing session. Initially this ran on Nashorn, which eventually got replaced by GraalVM. This is an example of a design principle: we have advanced capabilities, but we try to keep them below the surface for first-time users. They're there and they're working, but if you're doing basic testing you don't have to see them, and if you need to do advanced testing, you can. If you need something sophisticated, like telling the system to vary the load by hour over a period of time, it's about four lines of JavaScript added to the testing scenario. And it's optional: it's there if you need it, and you don't have to use it if you don't. I already talked about the uniform workload template, so we'll skip over that.
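Since everything later builds on that workload template format, here's a minimal sketch of roughly what a workload file and its invocation look like. The YAML shape is approximate and the statements are invented; Identity() and NumberNameToString() are real binding functions, but verify the schema against the docs for your version.

```bash
# A minimal, hypothetical workload template rendered through the
# stdout driver, which just prints each bound op as text.
cat > sketch.yaml <<'EOF'
bindings:
  seq: Identity()
  name: NumberNameToString()
blocks:
  main:
    ops:
      write-op: "write cycle={seq} name={name}"
      read-op: "read cycle={seq}"
EOF

nb5 run driver=stdout workload=sketch.yaml cycles=10
```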
One of the more recent capabilities we've been using is the ability to stand up a testing flywheel: essentially an autonomous agent in the runtime, looking at how the system is behaving, changing the parameters of the test, and doing what we call multivariate optimization. It's a live, dynamic feedback loop that lets us do the kind of testing a person might otherwise do over the course of days or weeks. For example, in a POC somebody says, "Well, I think it needs to be 46 threads instead of 42." That takes forever by hand. We can automate all of that inside the closed-loop system now.

Some of the core concepts: the workload is a YAML file that describes all the operations and access patterns you might want in your session, your scenario. The scenario is a sequence of commands that orchestrates a set of activities together, serially or concurrently. And the core active element of a NoSQLBench test is the activity. Activities operate on a range of numbers on the number line, which are essentially the seed values that determine the data and which operation is selected for every single iteration of that activity. It works like a flywheel: you start it running and it uses a deterministic sequence of operations. Say you have five different operations that do related things, emulating a client-side or application-side access pattern. Each cycle plugs different data into those operations, and after it does the five operations, it goes back to the top and does them again with another set of cycles, and therefore another set of data. Within the cycle, the adapter is the thing that says: here's the op template, how do I want to interpret it? Do I make a CQL operation out of it? Is it a Kafka message? The adapter is a selector over all the drivers you have.

In the development branch we're working on right now, and actively using, there's something called the container. The container is a key element of our advanced analysis improvements. It lets activities share an execution context that can be operated on, manipulated, and observed by the autonomous agents, or essentially by active commands. It lets that state be carried through different stages of analysis, so you can have different kinds of analysis occurring on an activity that continues to run, and they stay cohesive. I can talk about that a little more later.

If you haven't used NoSQLBench before, right now 5.17, the stable release, is what I would recommend. You can instrument it for all kinds of metrics flows, but Graphite is the easiest, and we've used it quite a bit with the Graphite exporter, Prometheus, and Grafana. If you want the latest and greatest, you'll want NoSQLBench 5.21, which is in development. One of the big benefits there is that it gives you dimensional metrics, which is much more in the style of Prometheus or VictoriaMetrics: it lets you add metadata to all your metrics so that every bit of your test data is uniquely and canonically identified. The linearized naming form in Graphite breaks a lot of that for us, so from 5.21 forward we'll focus primarily on dimensional metric labels.

As you'll see in some of the examples, if you want to try the built-in workloads, there are command-line options to find the ones that ship with the binary, and you can use the copy command to copy them out to a local file, modify them, and run them.
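A hypothetical session for that discover-copy-modify loop might look like this. The option names are as described in the talk, but verify them against nb5 --help, and treat the workload name as a placeholder.

```bash
nb5 --list-workloads      # show the workload templates bundled in the binary
nb5 --copy cql-keyvalue   # copy one out as a local ./cql-keyvalue.yaml
# ...edit cql-keyvalue.yaml to taste...
nb5 run driver=cql workload=cql-keyvalue.yaml   # plus whatever driver params you need
```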
One of the first questions we get from new users is: what are the metrics I care about? Cycles is measured as the outside time of execution, from the very beginning of knowing what cycle you're running to actually closing out, or retiring, that phase of the test. Inside that, bind is the operation of binding a cycle to a native operation: it's the thing that does the procedural data generation, looks at the template the way the adapter wants to interpret it, and makes something like an actual CQL operation in the native driver format. After that, the time it takes to submit the operation and get a result back is called execute. So bind and execute occur mutually exclusively, one after the other. Within execute, we measure all results, and we also measure non-exception results separately; that's what we call result-success. A common strategy for verifying your metrics is to look at both of those numbers: if they're the same, you had no exceptions. It's always good to have a cross-check when you're looking at your metrics.

We also have a retry mechanism built in. There's a very configurable, modular error handler that lets you tell the runtime what to do given certain kinds of errors, whether to retry or not, and so on. And tries is a histogram that records how many tries each op took before it was successful or gave up. If you set max tries to 10 and you see something at 10 in your histogram, you know you have operations that are probably not completing.

So why would you use this with Apache Cassandra? Some of the reasons might be obvious, but I'll go through a few. When you're learning data modeling, whether for optimization or just to understand the trade-offs of a particular approach, it's a very easy tool to use for that. You can build a simple workload in CQL, run it, get some results, change your data model, and do it again. You can even keep the details of all your different data models fully contained within one workload and call them out individually, so you have an inventory of things to try; then you can run them all and see the results together if you want.

Scale planning is one of the biggest reasons we've used it in the past, particularly to help customers who want to do large deployments. We do a proportional study on a smaller cluster, do some scaling math to determine what the scalability characteristics look like, and maybe validate on a slightly different cluster size just to establish the linearity of our math. That gives you a pretty reliable measure of what you need to scale up to an arbitrary size, as long as you're careful with the scaling math: proportionality, apples to apples, that kind of thing.

Also, if you think of the protocol at the driver level as a demarcation point, you can do quite effective end-to-end system testing. Used that way, NoSQLBench takes the role of an app simulator. As long as the access patterns and the data you're plugging into it are representative of what you'd do in production, it gives you a very portable, direct, and realistic way to emulate that, without waiting for an app team to build a test harness that might never be used again.

Multi-protocol testing scenarios are useful for a couple of reasons. One is that you may want to do an empirical study between different kinds of systems using the same access patterns and the same core data. You can do that.
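In sketch form, that kind of study is just two runs of the same workload file with different driver adapters. The file name is hypothetical, and the tag-filter syntax should be checked against your version.

```bash
# Same data recipes and access patterns, two protocols. study.yaml
# would carry one block of CQL op templates and one block of HTTP op
# templates, sharing the same bindings.
nb5 run driver=cql  workload=study.yaml tags=block:cql  cycles=1000000
nb5 run driver=http workload=study.yaml tags=block:http cycles=1000000
```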
In some cases, you might have mixed-mode or mixed-model application designs: you need CQL, you need to hit some web API, maybe you even need to do some graph. You can mix all of that in the same workload. And the last one, which is emergent, is closing the loop on performance analysis, and I'll show a little more about that here.

For the next section: I recorded all of these demos in asciinema so we can see them as if I'm running them up here, but they'll be quicker this way. So I'll do that, then get into some discussion or Q&A. Does anybody have any questions so far? I've breezed over a lot of things, so is there anything we want to pause on for a moment? All right, let's go ahead.

So this is showing something called an ad hoc command, and I'll talk through these. Can everybody see that okay? Is the zoom level all right? Okay. We're using a driver called stdout, which is not an actual low-level driver; it's a diagnostic tool. It lets us take the op template as we specified it in our workload template and render it out in string form, basically the equivalent of a Java toString treatment. It lets us ask: given the cycles we're feeding it and the bindings, the data generation functions, what does that mean in terms of data? It's very handy. Some people even use it to generate data for other testing systems. And just an aside: if you're using Faker for testing, please don't. It's fake. It's a vignette; it's not data you should be using for scale testing. That's why Virtual Data Set exists.

What this shows is what we call an inline binding function, an inline generation recipe. It's just a function in Java. The tool chain for the project finds everything annotated as a binding function, bundles them up, creates a manifest, puts them on the doc site, and sticks them in the runtime, so they're all available. If you're a Java developer, you can make your own: put the annotation on it, rebuild, and now you have a new binding function to use as a procedural data generation tool. The double-curly-brace form here shows that we don't have to name it; we can use it anonymously, in place. And what we're doing is specifying a template of an operation. When you see this template evaluated per cycle, it's not surprising what it looks like: it does what it says and says what it does.

You also see that when we specify the same cycles, five, which is short for the range 0..5, we get the same result. All of the data generation is deterministic. If you want different data, there are ways of doing that: you can choose a different range of cycles, or you can change your bindings. But if you find a problem in a test at cycle 32,406,000 or wherever, and it's reproducible because it's data-specific, you can reproduce it. You can tell the test to run only that cycle if you want, hand it to a developer, and they can reproduce it. That's really handy.

Here's an example showing the next range, five to ten. This form here is what we call a closed-open interval, just as you might see in programming languages.
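If you want to replay that demo by hand, it's roughly the following. The op= flag for inline op templates is as I've described it here, so double-check the exact spelling for your version; NumberNameToString() is a real binding function.

```bash
# Five deterministic cycles: 0 through 4, rendered through the binding.
nb5 run driver=stdout op='cycle {{NumberNameToString()}}' cycles=5

# The closed-open range [5,10), which gives the next five values.
nb5 run driver=stdout op='cycle {{NumberNameToString()}}' cycles=5..10

# Re-running either command prints identical output, which is what
# makes data-specific failures reproducible by cycle number.
```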
We also see the progress reporter. At any time it shows the pending operations, the ones that have not been dispatched for processing, the ones currently being processed, and the ones that have been completed. So you have an asynchronous view, and you can customize the update interval for it. And down here we're showing a contrived example with a very large number: does our data generation function still work? Yes, it still works. Any questions on this one? Anything you see up there that you want to know more about? Let's go to the next one.

This is another example showing deterministic operations, but it also shows a bit more of how we can tap into all the distribution functions available in the Apache Commons statistics packages. Originally we had to do a little bit of hackery to make this work, because the previous version didn't expose the density functions needed to derive the lookup tables, but the current, almost-released version exposes them in a more uniform way. This uses something called inverse cumulative distribution sampling, which is what gives us the O(1) type of performance we need by default. It hashes the input over the long ordinal space, then uses that as an index into the inverse cumulative distribution function via a lookup table of 1,000 data points per function. That gives us close to, not an exact facsimile, but almost a facsimile of the data you would get if you computed it from the function directly every time. What it really does is let us do all of these things in the same performance envelope: different data, same operational envelope for the client. And we're showing that we have continuous and discrete functions here as well.

Here's an example of taking a different approach to defining a workload. Instead of an inline, implicit, single op template, we're now actually creating a file. Now you see our procedural data generation functions, or binding functions, which is much easier to say. We have a map: we name them, and we can reference them by name later. So in the exact same statement form, instead of using a double curly brace, we reference a named binding defined in our workload template. And we have a set of operations, each with a name, and each operation has details that tell NoSQLBench how to interpret it. In this particular case we still run it with driver=stdout, so it knows how to interpret that op template. There will be an easier way to specify that coming up.

This is showing template variables, and all of these are available at the link; I'll make sure you can see that link so you can go replay them as you like. We're clipping some of the text at the top, but I'll explain. Template variables let you parameterize a workload using a simple macro facility. The macro facility looks at the named variables you provide on your command line and plugs them in wherever you have template variables defined. In this particular case you see a template variable called maxvalue. Here we run it with nothing specified, meaning it takes the default of 100. Separately, and different from the template-variable form you see here, there's a binding function called Template. It's just a handy string compositor that lets you put curly braces in as anchoring points for injection.
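Pulling those last few ideas together, a hypothetical file might look like this. HashRange, Template, and the TEMPLATE(name,default) macro are real NoSQLBench features, but the statements themselves are invented.

```bash
# maxvalue defaults to 100 unless it's overridden on the command line.
cat > tmpl-demo.yaml <<'EOF'
bindings:
  id: HashRange(0,TEMPLATE(maxvalue,100)); ToString()
  label: Template('user-{}', Identity())
blocks:
  main:
    ops:
      show: "id={id} label={label}"
EOF

nb5 run driver=stdout workload=tmpl-demo.yaml cycles=5
nb5 run driver=stdout workload=tmpl-demo.yaml cycles=5 maxvalue=1000000
```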
Those anchors pair up with the functions just like a printf. So we're using the HashRange binding function, from zero to some maximum, and when we don't specify an override for that, we get hash values between zero and 100. But say we want to change the cardinality of a data set under test: we can pass that on the command line, just like this, and now you see the values change. We didn't have to load any data, run ETL, or shadow-copy anything from production. The Virtual Data Set facility makes the carrying cost of having completely different test data almost free. You change the definition, and you have effectively changed the virtual data set; you run it again, and it behaves as if you had all of that on disk already.

IO is expensive. One of the nice things we learned about procedural data generation, compared to bulk methods of loading test data, is that modern processors are extremely efficient at generating data, orders of magnitude faster than piping IO from disk through memory to the network socket. You would be surprised at how much, but you'll see an example of that.

If you have a more sophisticated scenario to run, one that has multiple phases of testing that do different things, maybe a schema phase and a load-some-background-data phase, which we call schema and rampup in our conventions, you can script it effectively using just a command template. This lets you encapsulate a set of parameters, a testing sequence, whatever, in a file. Say you're working with a customer or a developer, or vice versa, and you need to speak the same language and have the exact same workload template: it's just this file now. And when you call this file as a command, the named scenario is resolved by the names defined within it. Here, no named scenario was specified, just a workload template, so it found the default one and ran both of those steps. Let me let it run through again; I paused it so you can see the top. It's unsurprising, right? It looks like what it does and does what it looks like.

How are we doing on time? So, if you want to find the workloads that are built in (the font size on this one is a little small, sorry about that), the --list-workloads option lets you find everything inside that has a certain file pattern and named scenarios, and these are all the ones you can use out of the box. There are more; these are just the CQL ones. If you're looking at these for the first time, I would highly recommend the ones called baselines v2, because that's the second generation of the canonical test set we've used for a lot of different benchmarks. You can see it's easy to get them out, modify them, and use them as if you had written them; it's much easier to start from examples. There are a lot of options, and maybe you don't want to learn how to use all the things, but you do want to be able to use these workloads. So it's good to just take one of them, copy it out, modify it, run it, and experiment with it that way.

This is just showing one of those built-in workloads running, and it's basically doing what you would expect. This is not real time, by the way; the screen recording lets you take out long pauses, and I'm doing that for the time we have, so in real time this would run over more than a couple of seconds. But it gives you an idea of what it would look like, and you see there's no detail shown except that it completed and how far it got.
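Going back to that phased, named-scenario idea for a second, here's a rough sketch of what such a file might look like. The scenarios and blocks layout follows the format described in this talk, but treat the details as approximate.

```bash
cat > phased.yaml <<'EOF'
scenarios:
  default:
    schema: run driver=stdout tags=block:schema cycles=1
    rampup: run driver=stdout tags=block:rampup cycles=100
bindings:
  id: Identity()
blocks:
  schema:
    ops:
      create: "pretend this creates the schema"
  rampup:
    ops:
      load: "load row {id}"
EOF

# With no scenario named, the 'default' one runs: schema, then rampup.
nb5 phased.yaml
```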
You can turn up the verbosity: throw in a -v and it turns the log level up from warn to info, another -v gets you debug, and another gets you trace. Usually you don't want all that, because no news is good news; it's designed to be that way. But it's there if you need it.

This is just showing an interactive customization of a workload. I do the same thing I did before: I get one of them out using the cat command, which is similar to the copy command. When you do this, it's good to name it differently, too; you don't want to wonder which one is the built-in and which one is yours. But here I'm copying one of the named scenarios, and instead of using the default name, I'm calling it my-settings and changing the settings. What this means is that when I hand this template over to someone else to reproduce what I've done, or run the same test, all I have to do is give them the workload and tell them the scenario name. The parameters come along for the ride, because we put a customization in there just for that.

Here's an example of using an inline command, but also using a thing I recently added: there's a dryrun mode that you can use to wrap some of the internal logic, so you can see what the test harness itself is capable of without invoking any external systems. In this particular mode, dryrun=emit, it takes the result of an operation, uses the driver adapter's string rendering, and gives you the result. So it's not quite cqlsh, but it's a one-liner and it just works, real quick, and if you know how to read it, you get the data you need. It's nice to be able to do things like that to validate testing.

Talking about multiple protocols: web APIs are supported. The web API driver, the HTTP driver, is quite comprehensive in the diagnostics you can get from it. It's useful for interactively figuring out how to do web API testing; you can inject your own headers, and so on. But more importantly, the payload format is a structural template. We talked about the string templates; well, the Virtual Data Set facility actually lets us do the same thing with lists of maps of strings, of whatever. When the template is resolved during initialization, the structure is resolved directly. Static values are memoized for efficiency, and whenever the total value of that payload is created, it's stitched together from all the memoized elements. So it's actually quite efficient; it's not just a big text block. Let's go back to the beginning here and see if we can see that: we have full_name here, and it's taking this name binding and plugging it into the payload. But it knows that this is a map, it knows that this is a string template, and this is just a string.

Something else: this is one of the built-in bindings we have, full names. It takes all of the names that occurred 100 or more times in the last census that published that data; a lot of seed data like that is bundled with NoSQLBench. It produces a realistic simulation of the names in that census using something called the alias method, a quite efficient O(1) sampling method for large data sets with many different weights. And we support using any CSV data file to pull your own data in and do whatever you need.
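In binding terms, that looks roughly like this. FullNames() is a real bundled binding; CSVFrequencySampler is the virtdata function I know of for weighted sampling from a CSV file, but check the bindings reference for the exact name and arguments.

```bash
cat > names.yaml <<'EOF'
bindings:
  person: FullNames()
  # Weighted sampling from your own seed data would look something like:
  # city: CSVFrequencySampler('cities.csv','city')
blocks:
  main:
    ops:
      greet: "hello, {person}"
EOF

nb5 run driver=stdout workload=names.yaml cycles=5
```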
This is showing the listing of the drivers, since some of you might be curious about which drivers we support. Let's go back to the top and just look at that. There are quite a few in here. If there's a driver you would like to use that isn't there, let us know; but even better, join our developer community and help us build one. The APIs for doing that are pretty well abstracted now. It's not difficult once you've done the first one, once you understand how the abstraction layer works for mapping an op template to a native operation. And it's something we're doing all the time: we need to test something we don't have an adapter for, we just go build it. And these are all built in.

And then here's one of my favorite slides. This is showing the actual O(1) effect, in detail, for the bindings. We're testing the uniform distribution, running 12 threads and a million operations each. Then we do the same thing with the binomial distribution, and the same thing with the normal distribution. And you'll find that they all have almost exactly the same performance envelope.

All right, looks like we're at the end. Are we completely out of time? All right. So if anybody has further questions, find me and I'll be happy to answer anything you have.