Good morning. Welcome to the last day of the conference. My name is Lukas and I'll be talking about Performance Co-Pilot, and about my journey of integrating a Ruby on Rails application with it. I'll explain a little of what Performance Co-Pilot is, and then I'll dive into the integration. So this is more of a lessons-learned talk, about the integration and about PCP itself.

The subject of the talk is Foreman, which is, I like to say, software for server management. It has a lot of features, it's an open source project, and as I've said, it's written in Ruby on Rails. We have a booth in the main hall, so stop by if you're interested; I won't go into the details here. My motivation was the Red Hat Satellite 6 product, which is based on Foreman, except it's in red color and comes with all the documentation and so on. My project was to improve performance monitoring of Satellite, so I was looking for a solution, something simple that customers can install, something that is not complex to deploy. PCP looked like a great tool for the job.

So what is Performance Co-Pilot, or PCP? It is, I like to say, a different kind of monitoring software. It is different in that standard monitoring solutions have managed or monitored nodes, then the software itself, and then some kind of database, which is a completely separate entity. Most of that software uses non-relational databases, although some do use an RDBMS, but that is the design. The problem is that the database is tightly coupled to the software and you can't get away from it; you need to deploy and manage both, which can be challenging. If we want to go after a performance issue, I don't want to tell the customer to install Nagios or Prometheus or something very complex. I just want some tool that is already available on the server and can be used. And that is exactly the PCP case. PCP does not have any kind of database. It has a daemon that does the monitoring, and there is another daemon which, if you opt in, stores the metrics, the readings of the monitoring, in what are called archive files. These are similar to log files, but binary. You can easily take an archive, send it over email or whatever, and we can take a look, which is great; I'll show a minimal setup sketch in a moment.

In short, PCP is an open-source framework, or toolkit I should say, for monitoring and analyzing performance data. It has a lot of features; I'll just point out a few. It's very lightweight, which I really like: the RPM is just about four megabytes, and the other packages are similarly small. And it's distributed: single node or multiple nodes, you can deploy various strategies, opting in to collecting archives on all hosts or just several, depending on your setup. It's available in a whole range of distributions, and you don't need any kind of external database, which is great. It provides the basic metric types that all monitoring solutions provide, but it also includes units, which is pretty nice, because I've seen many times that you have some graph showing millions of something and you don't know whether those are IOs per second, kilobytes, or megabytes; you're not sure. PCP has this in the metadata, so it shows you correctly whether that's bytes per second or kilobytes per second or megabytes, and it will recalculate that for you, which is great.
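A minimal setup sketch for that archive workflow, assuming a Fedora/RHEL-style system; the package, the service names and the archive path are the PCP defaults, but check your distribution:

```sh
# Install PCP, then enable the collector daemon and the archive logger
sudo dnf install -y pcp
sudo systemctl enable --now pmcd pmlogger

# pmlogger writes binary archive files under /var/log/pcp/pmlogger/<hostname>/ ;
# these are the files you can mail off the box and analyze elsewhere
ls /var/log/pcp/pmlogger/$(hostname)/
```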
You can look at live or historical data, obviously, and at very high resolution. There's one well-known upstream user of PCP, Netflix, and they do a lot of high-resolution live monitoring, I guess, so it scales pretty well. One of the nice features is hotproc monitoring: you can opt in the processes you are most interested in, for example a process that utilizes IO or CPU a lot, and PCP will gather more data from those processes. You don't need to gather it for all of them; that would be too much data. You can export to and integrate with third parties. There are a couple of command line tools, or I should say many command line tools, for analysis and so on, which I really like, because I don't need to install anything on my PC or laptop to analyze the data, no database or anything; you just run the tool and see the results. And a couple of graphical tools; I'll show some screenshots. Agents, of course: over a hundred packages in Fedora. Good documentation. And it's actually pretty old, I mean stable: over twenty years of existence. It has its roots in SGI, which is still a company that does a lot of HPC computing.

The overall design is pretty simple. You have the main daemon, which is called pmcd, the Performance Metrics Collector Daemon. Then you have a couple of agents, usually running as sub-processes; pmcd manages them all, and they do the actual performance collecting. And then you have consumers, which are usually utilities: pmchart, pmstat, pmlogger. pmlogger especially is actually a daemon, and if you enable it, it starts writing the metrics you configure into archive files. You don't need to gather metrics into archive files if you don't want to; that's completely optional. And all the same tools, which is great, say pmstat, one of the command line tools, can work live, connecting to pmcd, local or remote, or can load an archive file.

So here's a trivial example of such a PMDA agent. As you can see, it's pretty simple, very similar to any other monitoring software: you do some initialization and then you have a callback. The callback is called at whatever interval is configured, every second or every ten minutes, and that's the place where you collect the data. Note the metric metadata here, the units: in this case the semantics is counter, so I'm telling PCP this is a counter with these units, and the tools can rate-convert it to per-second values.

Here's an example of a tool. PCP comes with a variety of command line tools, which I prefer, but if you're into graphical tools, pmchart is, I think, the tool. It's a Qt application. It runs on Linux, it should run on Mac I guess, and it doesn't run on Windows, I guess. You can easily correlate stuff; it works pretty well. There's a play, pause, rewind set of buttons you can use, so if you have an archive file gathered from a running instance, you can load it into pmchart and just go back in time, rewind and so on. It's very, very nice.
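As a sketch of that replay workflow; the archive path is hypothetical, and Disk is one of the stock views pmchart ships with, so pick whichever view fits:

```sh
# Open an archive in pmchart and use the play/pause/rewind buttons to seek;
# -c selects a predefined view such as Disk, CPU or Memory
pmchart -a /var/log/pcp/pmlogger/myhost/20190203 -c Disk
```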
And then PCP itself comes with a couple of graphical dashboards, I should say. Those come in separate packages. The base package is called pcp on both Fedora and Debian systems, and if you want one of the dashboards, you need to install the web API daemon, pmwebd, and then a web app, Grafana or Vector. Those are the two; I think there's another one, but I'm not sure, I've only used these two.

Just enable the daemon, it runs on a weird port of course, open the firewall, and what you get, if you go to /grafana, is Grafana. If you don't know it, Grafana is a JavaScript framework for doing graphs. The way it works is that the pmwebd daemon emulates the Grafana backend API. So basically the Grafana JavaScript code thinks it's talking to the Grafana API, but it's actually talking to PCP, and PCP provides the data. So you can easily get at the data. For Grafana, it's actually reading historical data from the archive files; you can go back, you know, one year or so, and just look around. The disadvantage of this approach is that you're not getting live data, the past second or the past ten seconds. It appears there after a minute or two, after pmlogger actually writes the data into the archive files, which is fine.

If you actually want live data, there's this other tool called Vector. It's provided by Netflix, which is actually one of the major users of PCP, and it's the exact opposite. It doesn't read any data from archive files; it connects to pmcd and reads the values from there. So it won't show any historical data. Once you connect, it starts plotting the numbers, and you need to wait fifteen minutes until you get something like this. So that's Vector.

All right. Okay, go ahead. "Is Vector only used for live data, or can it somehow get data from some store, so you can use it as an interface for historical data?" No, you actually can't. They are very opinionated about this; they want to see live data only. So you need to use Grafana for that, or of course the command line tools, which I'm going to show you.

Can you read it? So pmcd is the daemon you run. Again, PCP is very lightweight; that's why I like it. It's also included in RHEL 7 and 6 by default, which is great because it's in the base repositories, with a couple of packages in optional. If you use CentOS or Fedora, it's there, and it's there if you use Debian or any Debian-based distribution. So it's in every major distribution. You just need to start pmcd and that's it. As you can see, it also spawned several agents, which you can configure; by default it spawns a couple of them. And then one of the nice commands is pcp itself: if you run pcp, it shows you an overview of what's going on, the host name, how many disks you have and so on, whether pmcd is running or not, and which agents are enabled.

PCP organizes the metrics in a tree hierarchy, separated by dots. pminfo is one of the main commands; you use it to list all the metrics you can actually read and plot. Here I'm grepping for everything that contains "partitions". As you can see, it's pretty obvious what these mean, I guess. And you can see a bit more information: as I've said, PCP maintains the data type and the units in the metadata, so it can show you nicely formatted, human-readable output, in kilobytes, megabytes and so on.
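Roughly what those two commands look like; the output below is approximate and abridged, and the exact metric list and formatting vary by PCP version:

```console
$ pminfo | grep partitions
disk.partitions.read
disk.partitions.write
disk.partitions.read_bytes
disk.partitions.write_bytes

$ pminfo -dt disk.partitions.write_bytes
disk.partitions.write_bytes [count of bytes written for storage partition]
    Data Type: 32-bit unsigned int  InDom: 60.10
    Semantics: counter  Units: Kbyte
```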
And to give a little more context, PCP uses a technique called instances. It's nothing new; much other monitoring software does the same. Take partitions: we have multiple partitions, and you don't want to create a metric for each individual partition, because those can change a lot. So every metric can have multiple instances, which you refer to by number or by string. Here we have two partitions on this server. The same goes for, for example, kernel.all.load, which is obviously the load average, where we have the one, five and fifteen minute instances.

And here is an example of how you actually read some data. I'm running the pmval command and telling it which metric I want to see, and optionally I can specify instances. It runs until I press Ctrl-C; here I specified I want five samples, and it gathers one each second, which is PCP's default when you connect live. pmval is nice, but if you know vmstat, the Unix command, pmstat has a very similar interface. So if you're familiar with vmstat and just need to quickly see what's going on, pmstat is the tool of choice, and it works the same way. As I've said, all the tools you see here can work with archive files too. So if your customer sent you an archive file from yesterday, you can specify -a and the archive file, and you can go back in time and seek, which is great; this is pretty unique, and I like that. If you know iostat, obviously, there's another tool for that, and I like this approach of PCP: kind of don't reinvent the wheel. If you're familiar with the Unix tools, it provides you the PCP versions of them. vmstat, iostat, they're pretty similar. Not the same, they do not aim for total compatibility, but it works.

So I have an archive file recorded a couple of days ago; this is what an archive file looks like, just a binary blob with some metadata. And I'll feed it to another tool, pmlogsummary, which is able to calculate summaries for you, average values and so on. I'm telling it: from 9:00 to 9:15 in the morning, calculate the general average values for these partition write-bytes metrics. This is actually my workstation, so I have a couple of partitions here. From here I can tell that my main partition was, is that writing or reading, I'm not sure, doing 14 or 15 megabytes per second on average within that time frame.

Now, if you're familiar with atop, look at this. This is basically atop running and asking PCP; actually it's not atop, it's a rewritten atop, pcp-atop, and you can again go back in time and seek. So if you have an archive file, this is great: you can see which processes were actually doing something. As you can see, I don't see the details; collecting that for all processes would generate a lot of data. But you can opt in: there is a way to tell PCP to gather either all of the processes or just some of them. This is the same view, but, sorry, the previous example was live and this one is from the archive file, so here I miss the list of processes. I can specify that I want to see all the processes which consume more than one gigabyte of memory, and PCP will gather the data for me. And I can specify that for all the processes whose name contains "java" I want to include all the details, so I would see those details in the atop view. I don't have root here.
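That per-process opt-in is the hotproc facility of the proc agent. A rough sketch of both variants, with a hypothetical threshold; residentsize is counted in Kbytes here, and the file path and syntax are from the stock proc PMDA, so double-check them against your version's man page:

```console
# Dynamic: store a predicate into the running proc agent; this one selects
# processes using more than about 1 GB of resident memory
$ pmstore hotproc.control.config 'residentsize > 1048576'

# Persistent: put the same predicate into the agent's configuration file
$ cat /var/lib/pcp/pmdas/proc/hotproc.conf
#pmdahotproc
residentsize > 1048576
```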
And that is dynamic configuration, of course; you want to put it into the configuration file to have it persistent. Then, if you know sar, obviously there's another tool, pcp-atopsar, which gives you the data in sar fashion. And there are a couple of other tools; there's a newer one as well that shows you a lot of data in a nice format, but I don't want to go into details. pcp.io is the main website; go there and see for yourself.

So, quickly, where was I? All right, I said Ruby. We have this nice open source project called Foreman, which is a Ruby on Rails application. Now, with PCP you can write agents in pretty much every single language, except Ruby, of course. So I was thinking: okay, how do I approach this? Because I wanted to see the data in PCP. There are a couple of options: writing an agent, which is obviously the hardest way; instrumenting; and then tracing.

So there is a small agent called trace, and it provides a very simple API. It's meant for things like cron jobs; you can send some metrics from cron jobs and so on. So obviously I made a pcp-trace Ruby gem which fulfills this API. You just need to install a couple of dependencies, because it's native, and this is how you use it. Obviously, this didn't work out well. Not a single day after I announced the new open source project, the PCP developers told me this API is actually being deprecated. So, not a good try.

Instrumenting is another way of gathering data from applications. There's a memory-mapped value API, MMV: you create a memory-mapped file, and PCP has an MMV agent which reads those values. It's pretty fast, and I was thinking it's a good fit. Again, Ruby doesn't have any library for it. So I was considering writing a native MMV Ruby library, but then I found that there is a Go library called speed, which was a Google Summer of Code project two years ago, and it provides everything I need. And since with Ruby you obviously need to spawn multiple processes, because Ruby doesn't work well multi-threaded (actually, the libraries are not thread-safe), I was thinking of pushing the data out of the Ruby application as soon as possible. And I found this UDP protocol, StatsD. It's just a text-based UDP packet: you send a metric name, a number and a type. "ms" stands for milliseconds, a timing or duration; then there are gauges and counters.

So that was the idea: to write a daemon that listens for StatsD UDP packets and uses the MMV API to push them into PCP. It has a terrible name, I know, but it works. And this is how Foreman is integrated today: you run this extra daemon next to PCP and Foreman, and you can get the internal telemetry numbers out of the application, what is performing well and what is not. So again, hey, new project. But the PCP devs said: okay, MMV is probably not the best way; it was meant for instrumenting actual applications in C, C++, Java and Go. So I thought of this as a temporary solution.
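The wire format really is that simple. To make it concrete, here's a minimal, hypothetical Ruby sketch of the application side, the kind of thing a Rails process can do to fire metrics at such a listener; the metric names are invented, and 8125 is just the conventional StatsD port, not necessarily what the daemon binds to:

```ruby
require 'socket'

# Minimal best-effort StatsD emitter (hypothetical names, conventional port).
class Telemetry
  TYPES = %w[c g ms].freeze # counter, gauge, timing in milliseconds

  def initialize(host: 'localhost', port: 8125)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # Sends one "name:value|type" datagram. Telemetry must never break a
  # request, so network errors are swallowed.
  def emit(metric, value, type)
    raise ArgumentError, "unknown type #{type}" unless TYPES.include?(type)
    @socket.send("#{metric}:#{value}|#{type}", 0, @host, @port)
  rescue SystemCallError
    nil
  end
end

telemetry = Telemetry.new
telemetry.emit('foreman.http.request_duration', 42, 'ms') # a timing
telemetry.emit('foreman.http.requests_total', 1, 'c')     # a counter
```

The design point is exactly the one from the talk: each Rails worker process just fires UDP datagrams and forgets about them, so there is no shared state that would need to be thread- or process-safe on the Ruby side.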
So finally, I'm able to announce that I'm mentoring a diploma thesis at Pascal University where a student is implementing a fully featured PMDA agent, probably named pmdastatsd, which will be high-performance C code, multi-threaded and pluggable, with HDR histogram support. Because if you're working with timing data, you obviously need to aggregate it somehow, and histograms are usually the answer in monitoring. It will replace both the trace API, which is deprecated now, and the MMV StatsD daemon which I wrote. And hopefully this year or the next we'll have a nice agent to do the monitoring of Foreman. Second, PCP will have a StatsD agent, which is not available today, which is great. And everybody can profit; it's not just for the Ruby world. Basically, you can send StatsD packets from anything, like using netcat from a shell script, right? You can easily report how a cron job is performing, or how long it takes, which is great.

So yeah, that's it. They told me to put in some numbers, so I was thinking one is a good number. This was the one lesson for me: evolution is still better than revolution. I learned a lot along the way, and finally I got there; hopefully we'll get there. Questions? How much time do we have? Five minutes? No, one minute. So this talk is already available online; I've pre-recorded the demo, my dry run is there, and the slides should be available as well, I'll put them on my blog. Thanks for coming and enjoy the last day of the conference. Thank you.