Awesome. Thanks for coming. I'm really excited to have you all here. I hope that you'll find this to be a really awesome session. So I'm presenting Linux Server Deep Dive. My name is Amin Astani. I pressed the big red button so we can keep going. So who am I? I'm the Senior Manager of Site Reliability Engineering at Acquia. I'm one of the old-timers; I've been at the company since December of 2010. I was in operations and now I lead the SRE group. My job is basically to champion DevOps, SRE, and operational and agile best practices at the company. So, big warning: this is not your usual Linux talk. We are not going to be talking about top or ps, or uptime, or even the really cool tools like sar and mpstat and iostat, or even the esoteric tools like strace or lsof. But okay, maybe a little bit of strace. I did a talk around a year ago, at that link, that goes over all this stuff. So today we're going to do an introduction to some advanced tooling, particularly perf events and eBPF. We're going to talk about where they came from, what their capabilities are, how to install them, and do a live demo of some examples that you can use today. So the game plan is: I want to provide you some inspiration on simple yet really powerful ways to troubleshoot Drupal from the infrastructure and performance side. All the tools that I was describing before answer the question of what resources you're using on the system, but the tools I'm going to introduce show how resources are being used, in greater detail. So before we get started, we've got some caveats. First off, these tools do introduce a performance overhead. They're better than the older tools like strace, but still keep that in mind when analyzing production workloads; test them in non-prod if you can. Secondly, some tools require you to rebuild your services in order to use them, such as mysqld and PHP.
Some tools require you to install debug packages to be useful, particularly the debuginfo and debug symbols packages that are usually built alongside them. Then finally, they require root access. So the environment that we're using to demo on is a really small Ubuntu 18.04 VM: latest Drupal, one gig of RAM, one core. There's no fancy caching or anything like that; it's just a real basic setup, just to get started. You can see it over here. So that's Umami, and I have the VM in front of us so you can actually see what's going on. All right. Also, before we get started, let's do a little bit of operating systems basics. Let's talk about something called system calls. Syscalls are how programs, under the hood, interact with the kernel. So anytime you read or write to a file, anytime you talk to the database over the network, anytime you're doing a read or write from Memcache, anytime you're talking to your client over HTTP, or executing other programs, things like that, those are done via syscalls. If you want the full list on a Linux system, you can run "man 2 syscalls". It'll give you the full list, and then if you want to read about each individual one, you can run "man 2" followed by the name of the syscall. It's a lot of fun, trust me. All right. You guys ready to take the dive? Awesome. Cool. So there are some new tools. Let's talk about them. They're really exciting. The first one is perf events. It's been a feature of the Linux kernel since 2.6.31, introduced in 2009, originally called Performance Counters for Linux. What it does is it enables capture and analysis of a broad range of performance-related kernel events. One problem: it's not very well documented; you have to do a bit of googling. But it's really easy to install: you just install the linux-tools package. The second thing we're going to go over is something called the extended Berkeley Packet Filter. That's kind of weird. Why are we talking about this?
Well, it was originally just a packet filter, but the project evolved in 2014 in ways that expanded on its original usage. There are a few things about BPF that are actually pretty cool. The first is that the packet filters are actually little programs that run inside the kernel, in a virtual machine. Very, very odd. There are internal guarantees: the programs you load can't crash and can't run forever, and they can also access kernel debugging features. So what does that mean? You can use it for in-depth performance analysis of a running server, not just the network. There's a toolkit that was authored called BCC, the BPF Compiler Collection. It provides us an accessible wealth of observability tools we can run today. You can also write your own if you're interested in doing that, in Python actually. To install BCC, it's pretty straightforward: you just run that command on your Linux of choice and you get access to it. So let's get right into it. Examples and demos, the moment you've been waiting for. All right. So first, let's just talk about perf. Perf allows you to monitor for specific operating system events, and there are a few things you can do with it. First, you can access counters: basically, the number of times that something happens on your system. You can also trace: real-time tracking of a process, or the system at large. And we're talking about syscalls again, which is why I did the introduction earlier. We can do probing: you can actually set up a little tracepoint or a probe for certain events and then capture that information later. And then finally, there's reporting. You can go and do your information gathering, then run a report and see what happened after the fact. It's pretty sweet. So, an example of counters. There's this command called perf stat. What this is going to do is list the number of system calls for a given command. So we're going to demo this.
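For reference, the install commands mentioned above look roughly like this on Ubuntu (package names vary by distribution, and perf wants the package matching your running kernel; the drush invocation is the demo command from this talk, not a requirement):

```shell
# perf ships in the linux-tools packages, matched to the running kernel.
sudo apt-get install linux-tools-common linux-tools-$(uname -r)

# The BCC tool collection; on Ubuntu each tool gets a -bpfcc suffix.
sudo apt-get install bpfcc-tools

# Once installed, a first thing to try: count syscalls made by a command.
sudo perf stat -e 'syscalls:sys_enter_*' -- drush status
```

On RHEL-family systems the equivalents are the perf and bcc-tools packages.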
I'm going to run that command right there, which is basically saying: what happens when you run drush status? So let's find out. All right, let's see. Just to show you what I'm doing, there are no shenanigans; that's what we're running. All right. So this is what happened behind the scenes on the server when we ran drush status. All of these are system calls for getting information about files. Just running drush status here accessed the file system 7,000 times. We made three network connections here, sent 21 messages, and received 72 times. So, pretty interesting stuff. You can see that it executed 12 programs. All of this information I was able to gather by just running it under perf stat. And you can see, again, this is the output from drush status, and that is what happened under the hood. Pretty exciting. All right, so why does this stuff matter? Well, a certain module or feature may be performing badly, but you might not know why. Well, if you wrap it in this tool, you get some better clues. So, perf trace. This is really cool. At the last talk that I did about Linux, I was talking about strace; this is a more performant replacement. It's a system call tracer: you can attach it to a process, or in this case the whole system, and it will print everything that's happening in real time. Tracing PHP processes is a big thing that I used to do when I was in operations at Acquia. I did it to troubleshoot performance problems on customer sites when you don't have an APM installed. So if you don't have an APM, you can always do this. And perf trace has less overhead than strace, by a lot. And when I say a lot, I mean a lot. So Brendan Gregg, who's a big performance guru at Netflix, did this little test. He ran the dd command, writing 512-byte blocks to the bit bucket over and over. So if you just did that, it was pretty fast.
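Gregg's overhead comparison looks roughly like this (the exact block count is reconstructed from the description, so treat it as a sketch; run it in a throwaway environment):

```shell
# Baseline: many small writes of zeroes to the bit bucket, no tracing.
dd if=/dev/zero of=/dev/null bs=512 count=10000k

# The same workload under perf trace (some overhead):
sudo perf trace dd if=/dev/zero of=/dev/null bs=512 count=10000k

# The same workload under strace (dramatically more overhead):
sudo strace -c dd if=/dev/zero of=/dev/null bs=512 count=10000k
```

The -c flag to strace prints a summary table of syscall counts and times rather than every call.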
You can see it takes only three and a half seconds, and it does about a gig and a half per second. If you run it with perf trace, it goes up to nine seconds. But if you run strace, it takes 218 seconds. So the overhead of perf trace is noticeable, but definitely not as bad as strace. So, perf trace. We're gonna demo this now. We can see everything on the system all at once. It's really cool. So let's do it. No flags, so literally everything, everything. You can see there's a lot of MySQL activity. And then if I go and, let's say, run a little crawler that's gonna crawl this Umami site, now you're seeing everything happening all at once on the system. Pretty magical. Of course, I'm tracing everything, so there is a little bit of a slowdown, but you can find all the files that are being opened. And yeah, this is pretty sweet stuff. Then for a single process, you can attach it to a process ID like you can with strace, or you can say "perf trace" followed by a command, and we're gonna do that right now. So I have this one. This is just catting /dev/null, real straightforward stuff. And actually, let me full-screen that so you can see everything. And you can see: hey, can I access this file? Can I open it? Can I read from it? Can I allocate memory? All of that, all available here. And it even tells you how long each call took, which is pretty nice. So if you saw a system call that took a really long time, that might be indicative of a performance problem. Perf record. This is really nice. You can basically run this and sample all the CPU activity on the system. This example here will record the CPU activity on all processors a thousand times a second for 10 seconds, and then you can run a report on it. Now, the thing is, you do have to install debug packages in order to drill down into specific library calls. Otherwise you're just gonna get hexadecimal gibberish instead of the actual names of the functions being called. So we can just go ahead and try that now.
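The record-and-report pair just described is roughly (sample rate and duration as stated in the talk; flags reconstructed):

```shell
# Sample stacks on all CPUs (-a) with call graphs (-g) at 1000 Hz
# for 10 seconds, writing samples to perf.data.
sudo perf record -F 1000 -a -g -- sleep 10

# Then browse the collected samples interactively.
sudo perf report
```

Without the debuginfo packages mentioned above, perf report shows raw addresses for user-space frames instead of function names.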
All right, so I'm gonna run my little crawl here, and we're gonna run our perf record and report. So let's run our crawler, let's run that. So it's going, just hanging out and generating data for 10 seconds. And now look what we have. The top thing that's happening on the system is in the kernel. But look what's underneath it in user space: it's PHP. There's some Zend function, something with "find" in the name. You folks can probably tell me more about that than I can. But we can actually zoom in and see everything that's happening in the stack, all the way down. Perf top: similar to what we were doing before with perf record, it'll just do it in real time. So this is really nice. It'll tell you, again, what user-space and kernel functions are using the most resources, so we'll try that now. And then again, we'll run our crawler. So, very similar to what we saw before: you saw that PHP is taking most of the time, which is reasonable. And we even have a good example here: what is generating network traffic? So what we're going to do is analyze all invocations of the net_dev_xmit function, which is happening in the kernel, and we'll see what's happening now. That is perf top so far. So we'll go ahead and run our crawl again. And you can see that sshd is generating a lot of transmits, because it's talking to this console here. And then wget, which is the local crawler, and then the Apache threads. So, pretty much what you would expect. There's also dynamic tracing. This is pretty interesting stuff. You can basically tell the kernel to monitor for specific functions and create a probe, then you can record just that probe, and then delete it when you're done. I'm not going to go into it in great detail because it's pretty technical. But know that this exists. It might be a really interesting tool if you're really getting into tough Linux performance problems. So let's start moving into eBPF here.
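Since dynamic tracing only gets a mention, here is roughly what the probe lifecycle looks like; the traced function, tcp_sendmsg, is just an illustrative choice, not the one used in the talk:

```shell
# Create a dynamic probe on a kernel function.
sudo perf probe --add tcp_sendmsg

# Record hits on that probe, system-wide, for 10 seconds, then report.
sudo perf record -e probe:tcp_sendmsg -a -- sleep 10
sudo perf report

# Clean up: delete the probe when you're done.
sudo perf probe --del tcp_sendmsg
```

The network-traffic example above works the same way but with a built-in tracepoint: sudo perf top -e net:net_dev_xmit.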
So there are some really interesting tools from the BCC suite, the compiler collection. The first one we'll play with is tcpconnect. So we're going to detect external calls performed by the server, which could be Drupal, or it could be cron jobs. Maybe something is doing a curl, or a git clone. And it could also help in detecting intruders. So we're going to go ahead and run tcpconnect. And in Ubuntu, the tools have the suffix "-bpfcc"; that's just how they name everything. So we're going to go ahead and do that. So I'm just going to run that. Now, if I go and try to, let's say, curl my website, you'll see that it catches that. If I run the crawler again, you can see all the connections from wget. It's making the connect system call, so they're client connections. So this is a way that you can figure out what is happening outbound, which is pretty nice. We can also trace TCP connections in real time. There's a tool called tcptracer for this. And this is really cool, because you can find abusive or high-throughput clients. So let's try tcptracer now. Okay, so these are established connections here. Now, again, I do my crawl again, because it generates a whole lot of activity, so it's a really nice generator of load. You can see the source port, the destination port, the addresses, whether it's IPv4 or IPv6. So, pretty interesting stuff. If I did a huge file transfer, like an scp or something, we could probably look at that too. Now, tcplife: how long do your TCP client connections last? This prints out the duration and data transferred for each connection, which again can be useful for analyzing what your clients are doing. So if I'm generating load, now you can see some pretty interesting metrics: the duration in milliseconds of the TCP connection here, also the transmit and the receive in kilobytes. So again, you can see how big the requests are.
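The three BCC commands in this section, as packaged on Ubuntu (the -bpfcc suffix is Ubuntu-specific; on most other distros the tools are just tcpconnect, tcptracer, and tcplife):

```shell
# Who is this server connecting out to? Traces connect() calls.
sudo tcpconnect-bpfcc

# Trace TCP connection open and close events in real time.
sudo tcptracer-bpfcc

# Lifespan plus bytes sent and received for each TCP session.
sudo tcplife-bpfcc
```

Each one runs until interrupted with Ctrl-C, printing one line per event.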
So if you're running this on a load balancer, for example, and maybe people are pulling down a really big file that you didn't intentionally publish, you could see that. Okay, next: trace file accesses on a web server. Now we're gonna get into more interesting stuff. So we can use statsnoop. stat is a system call that basically says: give me the information about this file, like the permissions and where it's stored; basically the inode information. And stat is actually executed quite a bit by Drupal. So we're gonna go ahead and run this command here. And if you read between the lines, I'm basically just looking for static asset accesses, like JPEGs and PNGs and things like that. So we're gonna give that a shot. Yeah, statsnoop. Okay, now if I run my crawler again. Now, some of these tools are a little buggy, which is why you get these stack traces, but you do get the idea. You can see, as I'm crawling the contents of the site, it's going and opening the files for the static assets on the local file system. And then you can also see the crawler writing the assets to its local cache, because that's what the crawler is doing: it's mirroring the site. So you can see all of the files, at least the static assets, being accessed as it does the work. Okay. Now, similarly, we can do filetop. Just like regular top can find processes that are using CPU, we can figure out which specific files in the whole system are getting the most activity, which can be really, really helpful. And it is just filetop. Now, for this example, we're gonna go ahead and use a little script I have, which is this: we're just gonna make a one-gigabyte file with random characters in it and see what that does. Now look at the top. You can see the foo file; you can see the big fat writes that are happening right there. So now you're able to easily identify what file and what process is generating your load on the server.
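Roughly what was run here; the grep pattern and the foo-file generator are my reconstruction of what's on screen, not verbatim:

```shell
# Watch stat() calls, filtered down to common static asset types.
sudo statsnoop-bpfcc | grep -E '\.(png|jpe?g|gif|css|js)'

# In another terminal: which files are hottest right now?
sudo filetop-bpfcc

# The load generator: a 1 GB file of random bytes named foo.
dd if=/dev/urandom of=foo bs=1M count=1024
```

filetop refreshes its display on an interval, like top, sorted by reads and writes per file.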
Again, that might be really useful for the example of a static asset that's very large being pulled down by many clients at once. So, how large are your per-process I/O operations? This is really interesting. There's a tool called bitesize that prints a histogram of the storage I/O operations on a per-process basis. This is really interesting because you're able to find programs that might be doing massive reads and writes. So we'll try running that now. And then again, we're gonna go ahead and run that dd, because that generates a massive write to the system. And with this tool, you just let it hang out until you've captured whatever event you're looking at, and then you press Ctrl-C and you're able to see the results. So you can see we wrapped up here. All right, now let's full-screen this. So these are all of the things that were happening in the meantime. This was the dd file program, and other programs running in the meantime. This one shows as removed because I deleted the file. And then the kernel worker, and then the file system, and then the dd operation, and so on. So let's talk a little bit more about file systems. How long does it take the file system to do something? There's a whole class of tools: ext4dist, xfsdist, and so on and so forth. They will generate histograms, similar to what you saw just now, of how long it actually takes to do the reads and writes on the file system. So when you run, let's say, iostat, and we run that dd again, it's generating a basic summary of what's happening on the disk. But as you can see, it's just telling you the reads and writes to the disk; it isn't telling you how long the requests take. So we can break it down by running this tool. We're gonna run ext4dist now, because ext4 is the local file system for this VM. So if I go and generate that big write again from /dev/urandom, now you can see how long it takes in general for the read operations.
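The two histogram tools from this section, again with Ubuntu's package naming (interval argument is my addition for illustration):

```shell
# Per-process histogram of block I/O sizes; press Ctrl-C to print results.
sudo bitesize-bpfcc

# Latency histograms for ext4 operations (reads, writes, opens, fsyncs),
# printed every 10 seconds here.
sudo ext4dist-bpfcc 10
```

For XFS or ZFS file systems the matching tools are xfsdist-bpfcc and zfsdist-bpfcc.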
So the reads are pretty fast; we're talking about from zero to seven microseconds. And then you do have some, but not many, that are outliers, the long tail, right? And then your write operations: most of the requests are completing in this range of microseconds. And then it also shows you the number of microseconds it takes to open a file, which is pretty nice too. So then you're able to figure out, performance-wise, where your read and write operations are landing. So let's talk about whether or not you need more memory. Let's get back to our operating system theory. There's something called a page fault. When a process needs to read from a file, if the contents of that file aren't already in the page cache, which is stored in RAM, well, it creates a page fault, and then it goes and makes a request to the disk to get the data back. So this is really important on servers expected to serve a lot of file data, for example a file server. If you don't have a lot of memory on your file server, it's going to affect performance, because there isn't enough room for the page cache, you're going to get a lot of page faults, and you're going to have to keep asking the disk for the contents rather than serving them from the cache. A really good analogy for this: think about Varnish, where you tell Varnish, I have this many gigs for my cache. For memcached, very similar. Redis, the InnoDB buffer pool for MySQL: those are caches. Linux does the file caching for you, for free, but you do have to watch it. Now, there are two tools that are really nice for figuring out your page fault rates. There's cachestat, which is for your file system cache. And then there's the dcstat tool, which we'll also run; that one is useful for the directory cache. So when you go and run, say, ls, it's populating the directory cache. So what I'm going to do for this demo is actually cheat.
I'm going to blow out the Linux cache for us, so you can see what I mean here. So with this little command right here, we're going to tell the Linux OS to drop all the caches. So we're going to guarantee some page faults here. So I'm going to go ahead and run this program. All right, so that's happening. And now I'm going to go ahead and run that crawl. So it's going to hit every single node and asset on the Drupal site. So watch this: look at all those cache misses that are happening right now. And then, similarly, we can do the same thing with dcstat. And then let's also blow out the caches again, if I can get that command in there. All right, there you go. Okay, so I've blown out the caches, and you can see the misses are already happening. So if I go ahead and run that crawler again, it should... yeah. So there are all your references to the directory cache, and you can see all of the misses that took place there. And then, of course, over time that's going to get rebuilt, and subsequent runs shouldn't generate that many faults. So yeah, that's not as dramatic. So there you go, it rebuilt the directory cache. Next, there's pidpersec. We can trace the rate of creation of processes. So let's say you have a cron job that you wrote that might be doing an infinite loop. You might not know about that. Or maybe you just have scripts on the machine that might be poorly authored, and they're generating a lot of load on the machine, but you're not quite sure where it's coming from. This helps to figure that out, because processes can be short-lived. So we're going to go ahead and generate load and see what it does. So if I run this pidpersec tool, and then, let's say, I do a "while true; do echo drupal.com; sleep 1; done"... All right, so I got one there. That's pretty cool. Now, if I take out that sleep, let's find out how many I can do. We're just gonna let it rip. Why is it zero? That should not be... you know what it is? It's because echo is a builtin function.
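Roughly the commands for this part of the demo (my reconstruction; the loop uses /bin/echo by path because, as explained next, the shell builtin echo never creates a process, and drupal.com is just the string being echoed):

```shell
# Blow out the page cache and directory/inode caches on purpose.
echo 3 | sudo tee /proc/sys/vm/drop_caches

# Watch cache hit/miss rates rebuild as the crawler runs.
sudo cachestat-bpfcc

# Count new processes created per second.
sudo pidpersec-bpfcc

# In another shell: generate short-lived processes as fast as possible.
while true; do /bin/echo drupal.com; done
```

The directory-cache equivalent of cachestat here is dcstat-bpfcc.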
So I have to actually create a process. That's why. There you go. So now you've got all your processes per second. This system is generating 1,300 processes per second right now. So if you saw that on a running system, you should be getting really concerned. So now we can figure out: okay, we have a high rate of processes per second, so let's find out what's doing it. There's a tool called execsnoop. So we can run it after our use of pidpersec, and we can go and generate all that load again and see all of it taking place. Now let's go back and run this a little bit more gently, so we can actually see what it's doing. So the output is pretty interesting: you have the process ID of that sleep, but then you also have the parent process. So this will tell you where it's coming from, right? And then if I went and looked at this: oh, cool, there's a bash session that's generating these. Maybe I should kill that off, or investigate where it is in the tree. So that is what that is for. Okay, now we're getting closer and closer to the really, really fun stuff. So we're gonna go and spy on someone. It's gonna be me, but we're gonna pretend we're spying on somebody else. So we're gonna snoop on another session on the Linux machine. Let's say this is a jump host, and you wanna figure out what people are doing in real time. So first off, let me clear this and this. I'm gonna look at the process tree, everything that's happening on the system, and then we're gonna go and filter for everything on a PTS. Now, the reason why I'm doing this is so I can see it in color. So if I look up here in red, I can see: okay, I have a /dev/pts/0 and a /dev/pts/1. So what's this one? Okay, so this is /dev/pts/0. So I wanna trace that one, the bottom one, and I'm actually going to perform the tracing from the top one. So let's see: ttysnoop /dev/pts/0.
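The two steps here, roughly sketched (the pts numbers are whatever your own sessions happen to get):

```shell
# Find which pseudo-terminals are in use, and by which processes.
ps aux | grep pts/

# Mirror everything displayed on another session's terminal, live.
sudo ttysnoop-bpfcc /dev/pts/0
```

ttysnoop captures the terminal's output stream, which is why, as comes up in the Q&A, it shows typed commands (they are echoed back) but not passwords entered at a no-echo prompt.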
Now, when you try it, when you do it in a text editor, you know, it kinda works. But yeah, you can trace people's activities, which is kinda scary to think about. But I thought it was cool, so. And yeah, I mean, if you have root, you should be able to do these things. Why couldn't you? It's just like the movies now; we live in the future. And now we can spy on all the user sessions, because if we can spy on one, why can't we spy on everybody? So there's bashreadline, and what it's doing behind the scenes is attaching a probe to bash's readline function, so it captures every command line as it's entered. So we're gonna run that, and then I'm gonna start typing in some stuff, and then we'll see what it does. So, bashreadline. Okay, so now I'm tracing. All right. Why are we not seeing that? Oh, there we go. All right. So what if I wanted it? Okay, great. Yeah. So if I had a multi-user system here, if I ran this on a jump host and people were actively working, you could see all the commands being executed by people in real time, which is scary, but fun. Okay. So let's go even further: wouldn't it be interesting to spy on encrypted traffic? Because you can. Now, to be fair, we're not actually intercepting a certificate and doing a man in the middle. This is much simpler. What's really happening is that when you're doing SSL, there are calls named SSL_write and SSL_read. When you make a call to SSL_write, you're providing an argument, which is the data you're trying to encrypt. And from SSL_read, what's returned is the data that's been decrypted. So what this tool does is intercept those two and print out what you have. So we're gonna go ahead and run sslsniff. And then I'm gonna go ahead and run wget against my website. Actually, we need to make sure it's HTTPS. So there's the GET request. Apparently this was what was encrypted and sent over the wire, and this is what we got back.
Yes, I know, I'm not hosting my website on Drupal; it's static assets on S3. And then the actual response. So yeah, you can kind of snoop on encrypted traffic on a server. Okay, so how are we doing on time? Excellent, nice. Fantastic. So, in summary: perf events and eBPF are pretty awesome. I hope you think so now. You can see more detail on Linux server activity than ever before, and you can install and run these tools today. One big piece of advice that I'll give anybody, because I am an operator by trade: test in non-production first. And finally, have fun. Further reading: Brendan Gregg, the performance engineer at Netflix who inspired a lot of this content. He's a very, very smart fellow, much smarter than me. He wrote a book on this, and it's like 700 pages. And I mean, it's probably a really good blunt weapon, but definitely order it online if you're interested in this stuff, because he'll tell you everything about these tools and practically every permutation under the sun for understanding your systems. There's also, on kernel.org, the perf events reference, and the GitHub project for BCC, which has references for all the tools. And then Julia Evans, I'll definitely give her a shout-out, because she wrote this really cool perf cheat sheet with a bunch of commands that you can just print out and put on your desk. And then again, learn some syscalls; they're really fun. You can run "man 2 syscalls" and it'll give you the whole table, and then you can run "man 2" with the actual syscall name and read up about all that stuff. Anyway, I definitely appreciate your time. This has been awesome. Thank you so much. With the remaining time, do we have any questions? Sure. Yeah, oh yeah, the microphone, the microphone picks up from everywhere, but I'll go over here too. Yeah. So, when you ran drush status, I was surprised to see that it was 12 executions.
I was curious if you could turn on execsnoop and then run drush status, to see what those 12 executions were. You know what? We should. That's fun. Let's do it. All right. So we have execsnoop. Yup. And then, yeah, let me go back and get the command; it was really early on. Oh yeah, we can just run regular drush status. Okay. Yeah, that's what we're asking for. You know what it might be? I wonder if execsnoop doesn't monitor for threading, where it's executing things under a process tree, because otherwise you'd get a whole bunch of stuff. There might be a reason why we're getting so few. There were also, on the other one, the "grep -v" and those others, so there are probably grep processes and things too. Yeah, totally. That's so cool. Yeah, absolutely. Excellent question. Any others? Hi, when you did the ttysnoop, what happens when someone types in a password? Let's try it. Oh. All right, cool. So, yeah, let's try this. So we're gonna, all right, so we're gonna ttysnoop. Okay, so now, if someone wrote a program that didn't properly handle passwords and I typed anything, it's gonna capture anything, obviously. But I wonder what would happen if I typed in... well, actually, no, the problem is if I run sudo su, it's just gonna give it to me, so hold on a minute. Login? Well, yeah, I'll just do... we're just gonna take away my, yeah, all right, we're just gonna remove this. Use htpasswd to create a password file? Oh, true, yeah, I could have done that as well. All right, but I disabled my ability to sudo without a password. So if I do this... oh wow, it doesn't capture it. So that's good. That's really, really good. But there we go. Any other questions? I'm relieved that that doesn't work. The more I know, the more scared I become. Any other questions? Don't be shy.
This is a fun topic. All right, well, again, thank you so much.