Okay. Hello, and a very good afternoon to everyone present here. I am Avinash Kumar Dussandhi, currently working with the performance and scale engineering team at Red Hat as an associate software engineer. Today I'm here to talk about distributed system analysis using Pbench, a tool we use very frequently in our team to identify bottlenecks and do performance analysis. Being part of a performance team, I think performance analysis is very important, and a great deal of effort is spent testing the performance of any software before it gets released. The interaction between hardware and software is really difficult to understand, and that's why performance analysis is something of an art. So let's start with the presentation.

The first thing is: what is a distributed system? I know most of you are familiar with this, but let me give a brief overview so that those who don't know can get an idea. In simple language, a distributed system is a software system whose components are located on networked computers and coordinate and communicate their actions by passing messages. They interact with each other in order to achieve a common goal. So distributed systems are applications or systems that run on multiple machines within a network at the same time, and they can be stored on a server as well.

If you want to do a performance analysis of a distributed system, you need to know how each component in that system is behaving, and I think it's cumbersome to run and log each tool available on a Linux system and then analyze the output. But I also agree that it's really important to analyze the output of those tools to identify bottlenecks across the system. Pbench can help you with those scenarios; we will see how in a while. So let's get to know Pbench.
I'm sure you must be curious about how it got its name. Pbench is nothing but a combination of two words, performance and benchmarking. That's how it got its name. It's a performance and benchmarking analysis framework which provides easy access to benchmarking and performance tools on Linux systems.

We keep using this term benchmarking, so let me explain it a bit. Benchmarking is measuring the performance of a system with a set of tools and comparing the results with those of another system, or with a widely accepted standard, through a unified procedure. This unified method of evaluating system performance can help us answer many questions, like: Is our system performing as it is supposed to? Is our system capable of doing task A? Which drivers should we use to get optimal performance out of it? So benchmarking and stress testing are often essential to optimize the system and eliminate performance bottlenecks.

Getting back to Pbench. It also acts as a harness that allows data collection from a variety of tools while the system is running a benchmark. Pbench has some built-in scripts for running common benchmarks, but the data collection can be run separately as well as together with a benchmark.

One way to understand system performance is to run a workload on it. What workload means here is putting your application or software under load at large scale and making it do more and more of what it's supposed to do, which means pushing it to its limits and doing the performance analysis during that phase. Sometimes the actual workload is available for testing, but in most cases one has only canned benchmarks that simulate the actual workload. Pbench can help you with this: it provides automation to run the benchmark and collect system information.
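
The comparison step in that definition of benchmarking can be sketched in a few lines of Python. This is only an illustration, not part of Pbench: the function names, the 5% tolerance, and the throughput samples are all hypothetical.

```python
from statistics import mean


def percent_change(baseline, candidate):
    """Relative change of the candidate mean versus the baseline mean, in percent."""
    base = mean(baseline)
    return (mean(candidate) - base) / base * 100.0


def is_regression(baseline, candidate, tolerance_pct=5.0, higher_is_better=True):
    """Flag a regression when the change is worse than the (assumed) tolerance."""
    change = percent_change(baseline, candidate)
    if higher_is_better:              # e.g. throughput
        return change < -tolerance_pct
    return change > tolerance_pct     # e.g. latency


# Hypothetical throughput samples (MB/s) from repeated runs of the same
# benchmark on a reference system and on the system under test.
baseline_mbps = [512.0, 498.0, 505.0]
candidate_mbps = [455.0, 460.0, 449.0]

change = percent_change(baseline_mbps, candidate_mbps)
verdict = "regression" if is_regression(baseline_mbps, candidate_mbps) else "ok"
print(f"throughput change: {change:.1f}% ({verdict})")
```

Running the same procedure on both systems is what makes the two sets of numbers comparable; the tolerance is there so that ordinary run-to-run noise is not reported as a regression.
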
Storing the results and cataloging them lets you do regression analysis, which helps improve the system, and above all it helps us understand the characteristics of a system. While Pbench is running a benchmark and doing the data collection, it also standardizes the collection of telemetry and configuration information.

Now the main question is: why Pbench? You must be wondering, since there are many tools on the market that can do this, what's so special about it. I know there are many tools available to look at system stats, but we wanted a framework that lets us collect and visualize the data from these stats, including system information, while the system is running a workload, to find performance bottlenecks at scale. It also helps you archive whatever data was collected for later consumption, and it helps you analyze the data by providing visualizations and data comparison. It also allows indexing the data into Elasticsearch to get time-series graphs, and you can visualize the data through Kibana and Grafana. And if you want to see the graphs in a web interface, it provides that too.

Moving on to the next slide: the components of Pbench. It's basically divided into three parts. The first is the Pbench agent. It's a package consisting of scripts that handle the execution of tools and benchmarks. It is the package responsible for providing the commands you use when running a benchmark across one or more systems, while properly collecting the configuration of those systems, their logs, and the defined telemetry from various tools.
The second one is the Pbench server. It is responsible for archiving the result tarballs, indexing them, and unpacking them for display. The data we get from the agent lands on our server and goes through several post-processing steps: when the data arrives, the server automatically indexes it and unpacks it for the user to visualize.

The third one is the web server. It is for displaying the various graphs and results in a web interface. Although we index the data into Elasticsearch to get time-series graphs and to visualize it through Kibana or Grafana, we wanted a way for users to see the results instantly, as soon as the data gets to the server. So this is for that.

Let's discuss the architecture. You can see there are different machines which have the agent installed, and they are collecting the data. They send the results to a server with a simple command, pbench-move-results. On the server, the data we got from the agent goes through post-processing steps like archiving and indexing. From there it goes to two places. The first is Elasticsearch, as I told you, for indexing and to get time-series data that you can visualize through Kibana or Grafana. The second is the web server, for instantly seeing the results in the web interface.

So what are the available benchmark scripts and tools? There are many benchmark scripts in Pbench, but these are the ones we use most often. First, the user benchmark. Let's suppose you want to run something which is not already packaged as a benchmark script. What will you do? You can use pbench-user-benchmark. It takes your benchmark as an argument, starts the collection tools, invokes the benchmark, stops the collection tools, and then post-processes the results.
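
The start, invoke, stop, post-process sequence just described for the user benchmark can be sketched in Python. This is a minimal illustration of the harness pattern, not how Pbench itself is implemented: `Sampler`, `run_with_tools`, and the placeholder metric are hypothetical names, and a real collection tool would record something like CPU or I/O statistics instead.

```python
import subprocess
import sys
import threading
import time


class Sampler:
    """Hypothetical stand-in for a registered tool: polls one metric on a timer."""

    def __init__(self, name, read_metric, interval=0.1):
        self.name = name
        self.read_metric = read_metric
        self.interval = interval
        self.samples = []                      # (timestamp, value) pairs
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            self.samples.append((time.time(), self.read_metric()))
            self._stop.wait(self.interval)     # sleep, but wake early on stop()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def run_with_tools(command, samplers):
    """Start the samplers, run the benchmark command, stop them, summarize."""
    for sampler in samplers:
        sampler.start()
    started = time.monotonic()
    try:
        completed = subprocess.run(command)    # the "benchmark" being wrapped
    finally:
        for sampler in samplers:               # always stop collection
            sampler.stop()
    elapsed = time.monotonic() - started
    # Post-processing step: reduce the raw samples to a small summary.
    return {
        "returncode": completed.returncode,
        "elapsed_s": elapsed,
        "tools": {s.name: {"samples": len(s.samples)} for s in samplers},
    }


# Placeholder metric (thread count of this process) and a half-second sleep
# standing in for a real benchmark command.
tools = [Sampler("threads", threading.active_count)]
result = run_with_tools([sys.executable, "-c", "import time; time.sleep(0.5)"], tools)
print(result)
```

Stopping the samplers in a `finally` block means collection ends even if the benchmark command cannot be launched, so no background threads are left polling.
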
Next, pbench-fio. pbench-fio was created to automate sets of fio runs, including the calculation of throughput and latency statistics. With this benchmark script you can run sequential reads and writes as well as random reads and writes with different block sizes.

The third one is pbench-uperf. The main purpose of pbench-uperf is to characterize the entire capacity of your network infrastructure to support the amount of traffic induced by distributed storage, by using multiple network connections in parallel. There are many other benchmarks too, but these are the ones we use most often.

Now the available tools. Pbench uses these tools to get different metrics from the system, to do the performance analysis, and to find bottlenecks. It collects all the data, accumulates it, packages it as a result tarball, and sends it to the server for processing. Some of the tools are: sar, that is, system activity report, which gives you an overview of the system; iostat, which monitors the input/output devices; mpstat, which gives you the CPU utilization of the system; pidstat, which monitors individual processes managed by the kernel and reports on them; vmstat, that is, virtual memory statistics; and perf, which is yet another tool for performance analysis and benchmarking.

So where do I get Pbench? Pbench is currently available via RPMs, so if you want to give it a try, you can. And if you want to look at the source or contribute to Pbench, here is the GitHub link; you can check it out. These are some sample graphs that we get from the tools visualization and the benchmark visualization.

I will show you a short demo of how we do things in Pbench. Sorry. So, first thing: let's suppose this system has the Pbench agent installed on it. You have to register the tools from which you want metrics. You can do something like pbench-register-tool and it will set up the tool for you. If you want to register a bunch of tools at once, you can also do this, and it will set up those tools for you. If you want to clear those tools, you can use pbench-clear-tools and it will remove them. So let's register all the tools first. Now your system is ready to collect data.

Let's run a simple benchmark. As I told you, if you want to run something that is not already packaged as a benchmark script, you can use pbench-user-benchmark, like this. You can run your own custom benchmark this way. I don't have one right now, so let's run a simple one. What it will do is sleep for 10 seconds as the benchmark, observe your system with those tools, collect the data from them, and give it to you. So it has collected all the data. If you want to send the data to the server, you just need to run pbench-move-results; it will package the data as a tarball and send it to the server.

Let me show you how it looks on the web server. I've already run a test with more data, so you can see it here. Let's take this one. These are the tools we registered to get the data, and they are shown here. If you want to see iostat, it will show something like this on the web. Let's see another one, like mpstat. So this is how the results look. Again here... sorry. Yeah, so that's all. Any questions?

Sorry, I can't hear you. Just wait.

Excuse me, the basic question is: you are showing the demo for a single system, so... I'll speak a bit louder, sorry. Is it audible now? Hello?

Yeah.

Yeah, so I would like to understand: is there any way to see it for multiple systems? Because, region-wise, I have two distributed systems and I
want to see a collective log. What your demo gives me is the sense that I'm getting a single log, converted into some GUI page, and I'm just viewing that.

Let's suppose you have 10 nodes and you want to collect the data from all 10 nodes. What you need to do is install the agent on all those nodes and register the tools on those nodes. Then when you run pbench-user-benchmark, or fio, or uperf, you can give the location of the inventory file. It will take the IPs of those nodes from there, collect the data from them, package it as a single result, and show it to you.

Exactly, that is the intention. So I would like to understand: is there any dashboard kind of thing, in a single place, on a single page?

Yeah, the data is ingested into Elasticsearch, so you can use Grafana to see those nodes. You just need to select the node whose data you want to see, and it will show the data for that node.

May I ask one question: why did you not include those kinds of things in your PPT?

Sorry, I can't hear you.

May I make one request of you? Given the topic itself, those are the things we are interested in seeing, so we were disappointed to see only a single node. Do you have any pictures or images to show that kind of dashboard?

Right now I don't. Let me check my system; if I have any, I will share them with you.

Okay, fine. So this is your Red Hat ID, right? Can I contact you there? I can see a Red Hat email ID there.

Yeah, you can contact me at that email ID.

Okay, thank you.

Hello? Hello. Yeah, hi. So if I have to run Pbench across a set of multiple nodes, do I have to run Pbench individually on each of these nodes, or is there any way I can easily orchestrate it?

You don't need to run it on all those nodes. You have a master node; you can run the user benchmark or any benchmark on that node, and it will collect
the data from all those nodes and get it back to you.

Okay, so the master will orchestrate across all nodes.

Yeah.

So as I see it, it's mostly agent-based... Sorry, hello, is it audible now? Yeah. So if I look at it, it's mostly an agent-based way of tracking your system metrics, right? I think you said your master agent collects the data from all the slave nodes, so basically you are pulling all these system metrics from all your slaves.

Yes.

So is there a mechanism by which you can push all the system metrics from your slave nodes to the master instead of pulling them?

You can do that; it's up to you.

Then you need to build your slave agents in such a manner that...

No, you can go directly to the slave machine and push the data directly to a server. You just need to change the configuration file to say where to send the data.

But then if I have a thousand nodes, I cannot go to every one of a thousand nodes and set that, right?

Through the inventory file, or through Ansible, you can do that; you can change the configuration file on all of them at once.

Okay, okay. So my next question: will this agent provide the system metrics broken down by the different processes or applications running on the node, or does it send information for the whole node? Just as an example, let's say I have a system, and when I execute all these tools it might just give me a CPU usage of, say, 90%. Will it also break that 90% down across the different processes or applications that are running?

Can you repeat that? I didn't get it.

Let's say I have five applications running on a node, right? When I execute these tools, it'll give me, let's say, 60% CPU usage. Will it also segregate which application is using how much of the CPU?

It will tell you which process is actually taking that CPU. Whether it shows the application name or not depends.

No, but I am interested in seeing that, let's say, X
application is using X amount of CPU.

Okay, so this is based on the data that is collected; it's not like you are monitoring the live system. Let's suppose at this time your CPU usage was this, your memory usage was this, and this process was doing this: you will get that in the benchmark result, and you can visualize it. It produces two or three types of data: the first is JSON data, it creates a CSV file as well, and it also creates a summary file, which will have the process name or application name you are talking about.

Okay. My last question: if you look at the market now, you have many such tools. I can name a few: you have site line, you have Nagios, there are many tools. How is this different from the other tools? I didn't see much difference between this tool and, say, site line or Nagios.

Okay, so the first thing is, I don't know much about those tools, like whether they provide automation of the workload or not. Pbench can produce or simulate workloads, and it's a property of Pbench that it can collect the data while the system is running a benchmark. So maybe these are the advantages of Pbench over the other tools out there.

Okay, thank you.

Thanks for the talk. One of the things about distributed systems is the network throughput, right? For example, let's say I'm keeping one server in one rack, another server in another rack, and so on. I would like to measure the network throughput between them: I want to transfer large files from one node to another and see how it performs. Is it possible to do that through Pbench?

You mean you want to measure the throughput when you are sending data from one server to another, right?

Right.

If Pbench is running during that time, yeah, it can identify that.

No, so, does Pbench have a built-in test for
that? I mean, can it simulate that workload?

No, I guess not.

Right. How old is this tool? When was it started?

I guess it's been two years, maybe.

Okay, it's a relatively new tool then.

Yeah. It's mainly an internal tool, which different teams use for different purposes. So for users outside Red Hat, if they tell us what more they want from this tool, we are happy to help.

Okay, so we can provide that on the GitHub page?

Yeah, it's there on GitHub.

Yeah, so I went there. Okay, so there I can create an issue or make a request?

Yeah, you can open an issue, no problem.

Okay. Thank you.

Yeah, a couple of questions. The first thing is... hello? Assume a situation: I have a data center with a virtualized environment. I may have a host machine, and along with that I have four or five VMs on top of it...

So I'm not sure right now; maybe I need to take a look and get back to you.

So has this kind of testing been done so far, or do you not have any idea?

Actually I'm on the tools team, so I develop this thing; I am not sure whether some team has already tested it or not. There are some other people; I will ask them and let you know.

So is there anybody in the...

Yeah, yeah, there are other people.

Second question. Basically, when you talk about these agents and servers, you are showing some graphical interpretation. The client agents are just collecting the information, right, and you have a tool command to send this information to the server. What is the point of keeping this information only on the agent? We need to send it to the server anyway. With a single command we could send the information directly to the server. Why do we have the agent there?
It depends from user to user. There are many users who want to send the data to a particular server; for most, it is easiest to install the agent and the server on the same machine.

I'm asking about a situation where I'm going to send it to the server anyway, and I'm not going to do anything with the information the agent is collecting. Instead of having two commands, can you have a single command, so that the information is given directly to the server?

There is only one command: you use pbench-move-results and you get the result.

We are collecting, and then giving one command to move the data, right?

Yeah, but it depends on the user. Let's suppose you want only the sar data, or someone else wants both iostat and sar. For those cases we have provided those options.

Okay, let me ask in a different manner. Instead of sending these results to a server, suppose I'm keeping this data locally. What is the use, then? The objective is to send this data to the server, right?

You can visualize it locally as well. You don't need to send it; it's not mandatory.

Okay, so I can have a server locally and see the results there.

Yeah, that's actually how I configured it on my local system. Any more questions? Okay, thank you.