So since they're going to embarrass me, this is my team from Red Hat over here. Shaker just walked in as well, Shaq, et cetera. The team, Anisha, who's been a big help on this talk. So if they start to heckle, I'm going to come back and point them out again.

So this is all about Pbench, which is a benchmark and performance analysis framework. It provides the ability to capture configuration and tool data, helping you package it and make it easier to display, analyze, et cetera. It consists of an agent, a server, and a dashboard. With the agent we wrap some scripts around your benchmark, so that with all the hosts you have to run in a distributed system, and all the tools you might want to run, it makes it a little bit easier to go do.

So what problem does this solve? We found that there's a certain level of inconsistency in the data that gets collected, and incomplete sets of data surrounding a benchmark. We've had a team working on distributed systems for a while, and in running benchmarks and collecting data, everybody finds their own way to do it, so you get some inconsistency. So our goal with this tool is to work towards making it easier to capture all the configuration data and tool data you need surrounding your benchmarks — I'll give you some examples of that — and to help you make the collection a little more consistent, so you can compare things run to run, or with someone else. We want to provide a location to archive data, to visualize that data, and to help analyze it. We want to reuse what we can for visualizations, so that no one's repeating the same work. And we want to make it easier to compare between runs.

So our team works mostly on distributed systems these days. Red Hat products — think of the storage products, Gluster, networking with Kubernetes and OpenShift and OpenStack — they're all distributed systems. Rarely do we work on one system. So we had similar needs across the team for exercising the network, or a storage subsystem, or whatever the particular thing is, and we found that different team members were solving these problems their own way. Which is great, because they needed to get their job done, but what ends up happening is that with each unique way of doing things, it sometimes becomes hard to know that the same data was collected, that you're comparing from the same environment, whether you can reuse network results from one run for another, et cetera. So we built this tool with some best practices that we came up with. We broke it up into an agent — the collector for your benchmark; it's not an agent like a background daemon, it's more of a wrapper that you invoke directly — a server side for helping us archive and manage all the data, and a dashboard to view it. Since about 2015 our team has collected 240,000-plus tarball data sets, about 12.2 terabytes of data stored on our servers. And we don't have a delete button, so that's why it's so high right now.

So let me give you a sense of what we're trying to solve here. Let's say you've got a simple program or a script, and you want to find out how long it runs. What do you do? You type time and the name of the script, and you get an output. This script happened to be calculating, I think, 1,000 digits of pi using Python's decimal module — 26 seconds to go do that on my little laptop, isn't that great?
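Concretely, it's nothing more than this — a minimal sketch, where the script name and the exact timings are made up for illustration:

```bash
# bash's built-in time: all you get is wall-clock, user, and sys time.
$ time python3 calc_pi.py    # calc_pi.py is a placeholder name for the pi script
real    0m26.013s
user    0m25.871s
sys     0m0.094s
```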
Okay, you can get a little more fancy if you don't use the built-in bash time. You can use /usr/bin/time, the GNU-provided time, and that gets you a little bit more information about your script. It tells you not only the time, but how much CPU and how much resident memory it took to go do that. Did you do any I/O? In this case it didn't. Did you have any major or minor page faults? We don't use a lot of memory, so we have a few minor page faults. And we didn't swap anything. So this is nice, but it only helps you on a simple system with one script. It's not going to help you understand multiple systems, and it's not going to help you understand how your system was configured. The time command tells you what happened, but not relative to what. For instance, it says I used 9976 — pages of memory, I think, or maybe kilobytes — to go do this. But out of how much? Was that out of 100 gigabytes, or out of one gigabyte, or whatever? We have no reference point, we have no idea what the configuration is, and it doesn't help us with multiple systems.

So let me ratchet up the complexity here as we go, because that was a little simple. Let's say you want to test networking between two systems. There's uperf out there, uperf.org. If you want to run uperf, what do you do? You start a uperf server, uperf -s, on one system, and you run the uperf client on the other system. In this case I'm picking the iperf.xml profile that's provided in the uperf sources, which describes a certain way of running a network test, and I've modified the XML to talk to the other server. I time the server just so that I get its resource usage as well — I'm not really timing how long the server took, because the client reports that — and I time the client as well, point my XML at that other server, run it, and capture the output of the report. So now I get the output: I've got 942 megabits per second.
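That bare two-host run is roughly this — a minimal sketch, assuming two hosts and the iperf.xml profile shipped with the uperf sources (the host names are placeholders, and how you point the profile at the server depends on your setup):

```bash
# Server side: uperf in slave mode, wrapped in GNU time so we also capture its
# resource usage (CPU, max resident memory, page faults, swaps, ...).
server$ /usr/bin/time -v uperf -s

# Client side: run the iperf.xml profile from the uperf sources against the server
# (the profile references the target host; edit it, or export the variable it
# expects, to name the server), and keep the report for later.
client$ /usr/bin/time -v uperf -m iperf.xml | tee uperf-report.txt
```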
Okay, so what do I do with that? Is that a good number? Usually what happens is we "know", quote unquote, our system. Maybe it's my laptop, so I know this laptop has a one-gig NIC on it — but maybe not. If it's a 10-gig NIC, then that's not really a great number. 942 is good for a one-gig NIC; maybe you could eke out a little bit more, but for a 10-gig NIC, eh, it's not so good. And if I'm running benchmarks and comparing results and doing this over time, am I always going to know where I ran something? So we found that getting that configuration data is really, really important. It's not that the configuration data is front and center, the kind of thing you look at all the time — it's that when you need it, you need to have it. If a result doesn't make sense sometime later, you want to go back and see that it was configured right. So the goal of our tools is to get a complete set of configuration data, captured consistently, as well as the tool data.

So, excuse me, I'm skipping ahead here. We were doing time, and that's really nice, but there's a lot more information about how a system is running than what you get with time. So maybe with the networking test you might want to run some kind of eBPF script to figure out some interesting thing that's happening in the networking stack. Maybe you want to run pidstat to grab how all the other processes on the system are running alongside you, either on the server or on the client. You might want to do a perf record so that you capture what the actual program is doing, maybe what the whole kernel is doing, where it's spending its time. And again you might want to time the whole thing — and should I time the perf record, or perf record the time? If I want the timing anyway, I've got to figure that out.

So you have all that, and now I have to capture the PIDs of those background processes, because bpftrace and pidstat don't take a command as an argument, so I can't just chain all these commands together. So I've got to save all the PIDs in a file and do that right, and then when the benchmark is done I've got to kill them so they don't run forever and trash my data. But are those four tools all I want? Is that all that I need? Maybe uperf gets changed to something else and I need some disk information from iostat, and then I've got to add that. So the complexity of the tooling ramps up.

On top of that, that was a simple client and server — I just go between here and there. What if I want to test the networking fabric? I've got a big switch, I want to saturate the switch, and I want all my clients pushing traffic through it. Well, I need to set up a number of server nodes and a number of client nodes. So I've got to run a for loop, SSH to all the different nodes, start the uperf server in the background, and make sure that works — and I don't know offhand whether that syntax will actually background the SSH command or background uperf on the server; I'm not really sure about that. And I've got to start all the clients, however I want to run uperf. And on top of that, all the PIDs I want to collect for the tools — I now have to do that for all the hosts that I'm dealing with. So you start ramping up the number of systems, and your complexity just grows with it.

So we are trying to solve the inconsistent and the incomplete, while also making it easier to run on distributed systems. Pbench offers a simple command, register tool set, and another simple command called user benchmark — the same thing we were doing by hand, but in just two simple steps. Here's what I can do. I have a set of tools — we'll hand-wave on the tool set for now, but it's defined and named somewhere — and I have a list of hosts that I want to collect my tool data on, so I just add that on the command line for register tool set. And then, whatever my script is going to be — we'll get to the uperf one in a second — I just run user benchmark, give it a configuration name to help me attach some metadata for later, and boom: I'm able to run my benchmark as it is, collecting data on all the hosts that I want, for all the tools that I've registered.
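As a rough sketch of those two steps — the host names, tool-set name, and config label are placeholders, and the exact option spellings may differ between pbench-agent versions:

```bash
# Register a named set of tools (defined in the agent config file) on every host
# involved in the run, local and remote.
pbench-register-tool-set --remotes=client1,client2,server1,server2 light

# Wrap the existing script: pbench starts the registered tools on all those hosts,
# runs the script, stops the tools, and gathers the configuration data.
pbench-user-benchmark --config=switch-saturation -- ./run-my-uperf-test.sh
```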
So let's get into one more thing. I mentioned configuration data — we want to capture the configuration of the system as well. What we did was, in the user benchmark command, and in all the benchmark commands we offer in Pbench, we collect a series of configuration data. Is that too small? You don't need glasses or something like that, Sally?

Okay, so we have six different configuration data sets that we collect by default. The block device tree: we go find all the disks in the tree and grab a bunch of files underneath it. We grab libvirt data if it's available — how libvirt is configured and some of its logs. We grab the kernel config; it's kind of interesting to know how the kernel is configured in certain cases. All the security mitigations. We grab a sosreport, a very minimal one — we have a targeted set of modules that doesn't include the big data collections of the SAR data, the logs, et cetera. And we get a topology output from lstopo. All of these are collected if they're available. Now, sosreport collects a lot of that same data; the reason we grab the other things separately as well is that it makes them a little easier to find. The sosreport is kind of buried in the tarball, so it's more of a fallback — you want it when you need it, but you don't want to dig through it every single time.

We also have three other optional data collection tools. There's a tool we wrote called stockpile, which is basically an Ansible playbook that uses the Ansible facts feature to go grab data about the system. This is really cool — I was going to point this out to you guys and I forgot earlier. It's a really neat thing, because all of the collections above could eventually be collapsed down into stockpile, and you'd have all that data in there. It's a really powerful tool: we just run an Ansible playbook, we get a big, huge JSON document, and the whole thing can be indexed into Elasticsearch for searching. It's really cool, but I'm not going to talk about stockpile today. Insights: the Insights client is a neat little feature that Red Hat has. We just grab the client, run it, grab the configuration, and we can store it with the data as well. And ARA is neat because it's a tool we use to analyze the performance of Ansible — in a lot of the performance testing we do, we have to capture how fast Ansible actually runs when deploying different systems, and ARA grabs the configuration of the Ansible playbooks when they ran.

So we've got configuration data, we've got the tool data, and we've got a way to do all that. Real quick, there's a simple agent config file in Pbench. In that config file you can tell Pbench — actually, that slide's wrong, I missed it; it should say /var/lib/pbench-agent — the default place where we store data, and you can change that in the config file. I'll show you a little bit about that in just a second.

So I want to dive into register tool. I'm kind of going quick, so if you have any questions, feel free to ask — but I figured it's the last talk of the day and you want to plow through this and get on to dinner. So, all the tool data: earlier we had time, and we had perf record and bpftrace, et cetera. The register tool function records the data about the tools that you want to run in a local directory — the run directory, /var/lib/pbench-agent — under the group name. Every set of tools that you want to run is always in a group, and that way you can have different groups for different purposes: you may want one group of tools for doing network analysis and another group for doing disk analysis. And you just register your tool under the group with whatever arguments that tool takes.
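Roughly like this — the group names are just examples, the tools are ones Pbench supports, and the exact option spellings (and which tool-specific arguments are accepted) may vary by pbench-agent version:

```bash
# A light group for network runs: vmstat and mpstat on two remote hosts.
for h in client1 server1; do
  pbench-register-tool --name=vmstat --group=network-tools --remote=$h
  pbench-register-tool --name=mpstat --group=network-tools --remote=$h -- --interval=10
done

# A heavier group for disk analysis on the local host.
pbench-register-tool --name=iostat --group=disk-tools -- --interval=3
pbench-register-tool --name=pidstat --group=disk-tools

# The registrations land under /var/lib/pbench-agent; list what's there.
pbench-list-tools
```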
We have about 38 tools that we've added to Pbench, or that it supports — there's a non-exhaustive list down there — including the ability to register a "user tool", which basically says: here's an arbitrary tool to go run, and we give you a fairly verbose way to describe it. There's also an external data source: if you've got a running Prometheus or a running PCP environment where you're collecting all this stuff centrally, you can just say, hey, go look at the data over there, or at least record that I had data over there.

At the very beginning I showed you — and we'll flip back real quick — this register tool set. Well, register tool set is basically just a call to a bunch of register tool invocations, and the sets themselves you can define in the Pbench config file. So you can say that for certain tests I want a very light set of tools, like just a vmstat, nothing heavy, so I don't interfere with anything. Or you might want a heavy set of tools where you really dig in deep: you've got lockstat, you're doing a perf record, you're looking at /proc/interrupts, which is a ton of data, pidstat, which gets tons of data. So you can pick and set up the kinds of tool sets you want in the configuration there, and again you give the list of hosts that you want to use going forward.

So, what happens? You've got this register tool set, you've got your user benchmark — what's really happening? It's actually simple. Before user benchmark touches and runs your script, it goes off, figures out all the tools you asked to run, and starts them all. It handles all the SSHing to remote nodes — how to start them, what to do, tracking them, the whole PID-file thing, et cetera. It then runs your script, and we time it — we don't use time, we just use SECONDS in bash; we probably should use time. And then, because we started all the tools, when your script finishes we go and stop all the tools. So we've just wrapped up everything you would normally do yourself and put it into the script — and if you go look at the source (the link is in the slides), you can see it's got a lot of cruft for support and management, but it's basically just those steps. Then, after the tools are stopped, we give each tool an opportunity to be post-processed. pidstat generates tons and tons of data — do you need to cull that down to a smaller set somehow? Or, for instance, if you did a perf record, you want to do a perf archive so you can grab an archive of the perf data and use it off-host. All those kinds of things happen in the post-processing steps. And then finally, for user benchmark, we collect all the configuration data that's needed.

So user benchmark is simple because it's designed to run around one command. But with benchmarks like uperf — or FIO, if people are familiar with that for doing file and disk I/O testing — you often want to do multiple different types of networking tests. Am I doing a round-robin test where I'm doing an echo? Am I doing some kind of stream test? And I might want to test different message sizes, and I might want a combination of all those things. If you used user benchmark for every one of those, you'd get separate results for each one. So we've offered pbench-uperf and pbench-fio as a way to gather all that data in one result tarball, with individual iterations of data.
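A rough sketch of what that looks like for the uperf case — the host names and config label are placeholders, and flag names may differ slightly between pbench-agent versions:

```bash
# Two test types times three message sizes = six iterations; five samples each,
# so thirty runs in total, driven across three client/server pairs, with the
# tool data and configuration data collected along the way.
pbench-uperf \
    --test-types=rr,stream \
    --message-sizes=64,1024,16384 \
    --samples=5 \
    --clients=clienta,clientb,clientc \
    --servers=servera,serverb,serverc \
    --config=fabric-saturation
```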
And so you tell it for uperf: hey, I want RR and stream, and I want these three message sizes. What it'll do is run those six combinations, and because I've asked for five samples, for each combination it'll run five times, calculate a mean, and give you the closest sample as well. So really it'll run 30 times, giving you really in-depth coverage of those different parameters across the six systems you're using — three clients matched to three servers. And then we'll also collect the configuration data for you. That's the beauty of this: you can stop thinking about when to collect the configuration data — did I do it here? — because it's all wrapped up in the run of the Pbench commands.

We have other workloads that we also support, trafficgen and SPECjbb. They actually use a different method of starting and stopping tools, which I don't want to get into right now — it's a whole other can of worms. And we have an up-and-coming command called pbench-run-benchmark, which is sort of the big brother to user benchmark, that will let you describe more complex workloads without us having to code them directly.

So what happens on the local system when the data is collected? Well, first of all, in the run area — that's /var/lib/pbench-agent, the Pbench run directory — there's a directory created for that run. If you ran pbench-fio or pbench-uperf or pbench-user-benchmark, we have a prefix there, then whatever config parameter you gave on the command line, then the timestamp of when you issued the command. So what you get on the bottom there — the slide says var/lib/agent because the full pbench-agent name didn't fit — is the run directory.

Under that run directory — here I have an example where I've got two test types and two message sizes — pbench-uperf gives you one iteration for every combination of the parameters that you asked it to run. So there are four iterations there: one, two, three, four. Actually, I have a pointer, I think — yeah, there you go. One, two, three, four. And underneath the fourth directory I'm showing that there are sample directories for each sample that was run, because we had five samples up there. All five samples are listed there, the data for every sample is collected underneath those directories, and then we put in a reference-result link, because we calculate which sample is closest to the mean of the five runs and link to it so you can easily find which sample was actually closest to your mean. (It's not really named "RefResult"; that just didn't fit on the screen without shortening it.)

Underneath each sample, all the tool data is put, because for each sample we run and collect the tools that you asked for. So let's say I had configured mpstat and vmstat in there. For each host that I asked those tools to be run on — host na is one host, host nx is the other, echoing back to the previous examples — I'd have an mpstat and a vmstat directory for each host, with the standard out and standard error of how those tools ran.
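Roughly, the hierarchy looks like this — a sketch only; the real iteration and sample directory names are derived from your parameters and the pbench-agent version, and the host names are placeholders:

```
/var/lib/pbench-agent/
└── uperf_myconfig_2019.10.25T14.00.00/      # <benchmark>_<config>_<timestamp>
    ├── 1-tcp_rr-64B/                        # one iteration per parameter combination
    ├── 2-tcp_rr-1024B/
    ├── 3-tcp_stream-64B/
    └── 4-tcp_stream-1024B/
        ├── sample1/ ... sample5/            # one directory per sample
        │   └── tools-default/
        │       ├── host-na/
        │       │   ├── mpstat/              # stdout/stderr, plus mpstat.html graphs
        │       │   └── vmstat/
        │       └── host-nx/
        └── reference-result -> sample3      # link to the sample closest to the mean
```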
And for mpstat and vmstat — though I'm only showing it for mpstat — we generate a set of graphs and cull the data a little bit for you, building an HTML file. We use a tool called jschart, written by one of our team members, Karl Rister, then at IBM and now here at Red Hat. It's designed to give a graph view of your data along with a table next to it, so you can really introspect, zoom in, grab parts of the screen and make them bigger, and do all kinds of funky D3 things to get at the different data sets.

So when you're all done with your data collection and you're ready to go, we offer the pbench-move-results command, or copy-results, to get your data off your local system onto the main server for archiving. Now, you don't have to do that, but if you'd like to archive it and you have a Pbench server, that's what we offer; the agent config file has all the configuration for talking to that server. When you do a move results, we add metadata about the environment you ran your results on, we compress it, and we give it an ID. I think this is one of the cool parts of Pbench: that directory hierarchy you had for your results at that point in time, when it gets tarred up and compressed, we take the MD5 sum of it, and that gives it a unique ID — as unique as MD5s can be — so that it becomes an object that can be reasoned about. We have metadata about it, we have its ID, and now we can look at it in the system, and I'll show you that in a second. We copy the data over to the Pbench server — the move will remove it locally once it's verified on the remote side — and we give you a URL to where it is.

Our Pbench server is designed to handle the archiving of the data. We make sure that it's properly MD5-checksummed, checked for bit rot, all that kind of stuff. We give you a way to visualize the tarballs — those mpstat.html files I showed you with jschart, you can see all of those on the server. And we index that data into Elasticsearch for a dashboard feature that we have. We spend a lot of time indexing data — a lot of data.

So that's not too visible, right? I always thought that slide was going to be a little small. You can't see it, right? Yeah, no, all right, so I am prepared. It won't let me move it. Hold on, I'm not totally prepared. All right, I'm not totally prepared. Hold on, there we go, come back here. And it doesn't work, don't ask me why. I cannot move the window over there. Trust me, it's visible. I need a little song and dance here.

So we have a dashboard. Unfortunately, in our team inside Red Hat, we all kind of own machines, if you will, and we run a lot of results off of one host. So we've organized the data by the name of the host you ran the Pbench command on, and we give you a way to list and see all the data from your particular host. So over here, this host is dhcp31-122, and if you select that host, the dashboard shows you all the different result tarballs that you had. It gives you the config name and the start and end time for each tarball. You can introspect a particular tarball — in this case a uperf one — and see all the different rows of data that you collected for the different runs in a nice table form, with the standard deviation and mean, et cetera, of the result. You can then go through and select different results to compare — say you want to compare your streams at a certain message size — and once you hit the comparison, we'll give you the metadata you compared on the right-hand side and the comparison of your uperf results in the center. Same thing for FIO and trafficgen as it goes.
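That hand-off to the server, by the way, is just one more command on the agent side — a minimal sketch, assuming the agent config file already points at a Pbench server:

```bash
# Ship everything under the local run directory to the configured Pbench server;
# once it's verified on the remote side, the local copy is removed.
pbench-move-results

# Or keep the local copy and just send a copy up.
pbench-copy-results
```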
So, future directions. We need the notion of a user in Pbench. Right now, as I said, it's all organized by controller. We want to get to the point where each user has control over their own data and can publish and delete data. We don't have a delete button right now because nobody owns data, and we don't really trust our team over here not to delete someone else's data — so we just don't allow deletes, which has its ramifications of another sort. We're refactoring the agent side: the tooling right now is based on constant SSHs all over the place. We are containerizing the tools, making it so that a tool is described by its container, and then we're going to have a component that just runs the containerized tools that you build. That will make it easier for someone who wants to add a certain tool — they don't have to interface with Pbench somehow; they can build a container that has the behavior they want in it. We're going to be changing the uperf and FIO workloads to use Redis to help coordinate the way that they execute. And we're desperately moving to Python 3 for the whole code base. The server side also needs the notion of a user. We need to go to S3 for our archiving. We want to finish all of our Python 3 work and containerize the server so we can deploy it a lot more easily. The dashboard, like the other two pieces, needs the notion of a user too. Right now people have access to different data sets in Grafana or Kibana, and we want the result dashboard to be able to leap off to more introspective data sets in other dashboards that people construct, so we'd like to add support for that. We also want to be able to display data from existing data sources, like a Prometheus or PCP, from the dashboard. We're working on comparing and aggregating the tool data across nodes — so if you have those 20 nodes you're doing uperf on, we want to be able to take some of that tool data and display the aggregate of it across all those nodes, and in your comparisons between runs. And we have this feature we call Table of Contents — Anisha's working on this. With Pbench you own this data: the data that you collected, in your hierarchy, is dear to you, because you spent a lot of time making runs to get it. And when there's a problem, when you need to go figure out why something's not working, you need a way to get back and look at that data. So one of the things we're offering in the dashboard is what we call a Table of Contents, so you can see the complete hierarchy of what was collected in your tarball and then go pick a piece out and look at it — without having to unpack the whole tarball yourself, or have us unpack the whole tarball, which is a big space savings.

And that's it — any questions? Rich, that's a good question. You do not have to have a Pbench server to look at that. We have a little package that loads all the CSS and JavaScript packages for you, so you can just look at it on your local machine. Which is the top priority? I don't know which one is higher, but the notion of a user is really critical. It is a killer not to have a delete button. We did it that way because adding a user is a lot of work and we had other features we needed to do, and we figured we'd throw hardware at the problem and just keep adding space — but you can't keep doing that, really, and we've kind of hit the end of that.
And so we need a notion of curation, and you can't do curation without a user, because you have to have accountability for who curates what. So it all gets back to that feature, and it's pervasive — it has to go all the way through the code base to make it happen. The second thing that has to happen is on the tool side. We've got all the support for those tools, but they're somewhat idiosyncratic to Pbench, and that's really not a good thing. We want to make the notion of the tool that you write independent of Pbench, so we want to get to where you can specify what you want to run — and we think the easiest way to do this is via a container — and then we'll just collect the data off of it.

Great question. The question was, are there any plans to integrate with OpenShift 4? Today the team uses it with OpenShift 4, but it's not a pure integration. We have a tool being used in the team called Ripsaw, which is a workload driver, and we're working on this Tool Meister effort to containerize the tools so that that driver can leverage them and still give you a Pbench result. Even though it's not a pbench-uperf or pbench-fio run, that driver can still end up producing a Pbench tarball when it's done. So we'll integrate with Ripsaw, which is a Kubernetes operator. Sorry — question? Yeah, today we do use config maps to run all this and get it all set up.

Yes, historically, a lot of this started back in 2013-2014, so we were adapting existing workflow processes to this. We do have PCP in this today. I didn't show it on the slide because it's not a great integration: it basically runs a pmcd process and a pmlogger on the same node, and so it replicates all the same behavior on every node, which is not ideal. What we want to do with the Tool Meister effort, containerizing that, is make it so that a pmcd container is just started on all the nodes that you want, and then use one local pmlogger to go grab all the data from those pmcds, because that lessens the overhead of what you're doing on all those nodes and gives you a central place. There are problems: if you're running a test and you lose a node, you lose the data that you collected. One advantage there is that if you lose your controlling node, you're toast anyway, but if you lose just one of the other nodes, you've only lost some of that data for a little while, until that container comes back — pmlogger will keep adding to its local log. So you'll have small gaps rather than whole losses of things. We have a team member right now, Robbie, who goes through a ton of effort to avoid losing data, and it's a pain. So yeah, we want to go do that. The same thing is true with Prometheus: we want to be able to put exporters in the containers — specialized exporters if you want — and then have a local Prometheus that Pbench runs with the test to go scrape all that data, keeping it locally. If the exporters disappear and come back, you only have small gaps, but you have all your data locally, and then you can go save it and visualize it going forward — and only for the duration of the test, rather than pointing at some monstrous Prometheus database that's been there for the last three days and has tons of data.

So, any other questions? Hey, Dan, you're late, Dan. All right, if there's nothing else, thank you very much.