Good morning, good afternoon, good evening. Wherever you're hailing from, welcome to another Red Hat Enterprise Linux Presents. I am Chris Short, host, showrunner and producer extraordinaire for this thing we call Red Hat Live Streaming. I'm joined by the one and only Scott McBrien who, if you've been watching today, you saw earlier on the Level Up Hour. So Scott, we're talking about troubleshooting RHEL today. Yeah, I thought it'd be an interesting topic to talk about some of the tools that we've got or can use at our disposal. And then maybe also some common things that we've seen in our history, collective history. Right. And how we can figure things out. Yeah, like, so our intention is to semi-break stuff and fix it. Hopefully that'll actually work. Yeah, well, I mean, the trick is to break it bad enough that you can't fix it. Right. And demonstrate it. But not so bad that the box... It's just down, exactly. Yeah, so yeah, we thought up a few scenarios here and we're just gonna run through them. And if you have any questions or specific scenarios you find yourself in commonly, ask us and we'll be happy to answer any questions you may have.
So, I thought that maybe we should get started with a DNS problem, as we had DNS problems this week. Yes, I think that's a perfect place. Like, I think we both came to the same conclusion at the same time that that was the right place to start, given all things DNS, because it's always DNS's fault, poor DNS. It's not always DNS's fault. It's really not. But it is my favorite haiku. Right, yeah. All right, so I just made a quick change to the resolv.conf file. So, everyone now knows where the problem is, but having a messed up resolver is a fairly common problem to have. And... Like really common. Yeah. Oops, so I can spell. That wasn't gonna work. And it might. Maybe if it did work, we'd know another problem. But yeah, so this is kind of how it manifests itself, right? So when you use... 
You're like, I can't really ping or connect to anything. And it doesn't seem obvious what the problem is. Like, it seems like your internet's out, but you're fine. Like you have connectivity, but you don't, because nothing resolves. Yeah, so one of the things that I think is true for any type of troubleshooting: you kind of have to know the basics of how the thing works in order to figure out where it's broken, to then investigate what specifically about that part is broken. And so for DNS, we kind of rely on host names all the time for everything. And so my favorite is when something like this happens and somebody calls me or emails me or hollers up the stairs at me, the internet is down. The whole internet, the entirety of it. It's gone. This distributed network, which is using BGP to route itself around problems, it's just gone. Well, things gone. Global network gone. But I mean, to be fair, from their perspective, if they're really a person, things no work-y.
So the first thing is like, well, how bad is it? Right? So the first thing I would check is probably like, does this machine have an IP address? Right. All right. So maybe we just look at something like an ip a. There you go. Right. And so. I have an interface address of that. Yeah, so I've got an address. All right, so. Okay, cool. So if I have an address, then maybe it's not able to connect upstream. Right. So. So maybe ping your gateway. Whoops. It helps if I have that installed. Yeah. Oh, I guess we get. Is there an ip command for that? I feel like there is, like extended something. Yeah. There it is. ip route. There we go. So there we go. So let's try and ping our internet gateway. Oh, that works. All right. It should be fast like it is. And cool. Cool. So if we can. Connected. Yeah. And we can ping like the next thing up. Right. Well, then it's like, well, can we ping beyond the next thing up? Right. All right. So. Ping. Like that. Yeah. Okay. 
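The layered check walked through above (do we have an address, what is the default route, does the gateway answer, does something beyond the gateway answer) can be sketched as a small shell helper. This is only a sketch under stated assumptions, not what was typed on stream: the function name `default_gateway` is made up for illustration, it assumes the iproute2 output format in which the default route reads `default via <gateway> dev <interface>`, and the `8.8.8.8` address in the comments is just an example of a well-known IP outside the local network.

```shell
# Hypothetical helper: read `ip route` output on stdin and print the
# gateway address of the first default route it finds.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Typical use on a live box (commented out, nothing here touches the network):
#   ip -brief addr                      # 1. does the machine have an address?
#   gw=$(ip route | default_gateway)    # 2. what is the default gateway?
#   ping -c 3 "$gw"                     # 3. does the gateway answer?
#   ping -c 3 8.8.8.8                   # 4. does an IP beyond it answer?
```

If steps 1 through 4 all succeed by IP address while host names still fail, the problem has most likely moved from routing to name resolution.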
So at this point, we've determined that contrary to popular belief, the internet is not down. Right. But clearly there's something wrong because when we were trying to use machines, they were reporting back as networks transit. Right. Can't even resolve Google. So there's a problem. Right. And so now we start to like pivot away from networking diagnostics like IP routing and into DNS diagnostics. Right. So there are some tools that we can use for that. Quizzing for short. Cause. Well, obviously host is my favorite. And then dig is my second favorite. All right. So, well, that doesn't work. Oh, well, that doesn't work. Yeah. We deprecated it. And it now comes in like a different package. In fact, when I used route earlier, same thing. It wasn't that other package that we need for. Wait, what is it? I mean, it's not nslookup. You could do nslookup, but that also is probably in that. Deprecated. Past deprecated. Yeah. What package is available? Yeah. No. Okay, good. You named it already. Oh, dig. Okay, cool. Yeah. Yeah. Oh. Oh no. You don't even have that installed. Oh. Yeah, no tools here. This is a complete fail. Hmm. I don't think I can do this cause I'm having DNS problems. Right. You have DNS problems. So you have, unless your repository's an IP address. Man. Well, that leads us to an interesting problem. How could we install software packages without DNS? But let's not go there. No. I mean, cause at that point you have to pull out your phone and look up IP addresses. All right. So let me try something else. Hold on. One second. I don't want anybody to see it. Super secret. Cause it looks like I don't have any of them that would need to install this box. Wow. We're gonna fix that. You could do that, but you still have to get the packages there. Yeah. You have to get the ISO. So I assume you would have an ISO lying around. So normally I think dig should be there. Right. But we stripped down this image. I mean host is normally there too, I thought. Yeah. 
I mean, it isn't the door. I mean, that's way upstream, right? Numb local buds. Now I'm curious. So this box doesn't have bind-utils installed. So they're just gonna stick it on there. Okay. And then host should be there too. All right. Okay. I think I have installed the right packages and then rebroken the machine. Okay. All right. So let's try this again. Host. All right. So here's where we're like falling off the rails. Mm-hmm. There should be a response here pretty quickly since it's Google with their massive global IP network. Yeah. Yeah. And if we do the same with dig, we should get a similar. Oops. Dig clear. All right. This will be fun. It's gonna have a DNS error. Yep. I betcha. All right. But like one of the hallmarks is that it's taking a really long time. Yeah. And we would expect that DNS queries are pretty quick. Rapid time. Yeah. All right. So we could ping by IP. We can't even do DNS lookups. And that probably explains why we can't ping by host name. Mm-hmm. Oh, and I don't have that either. Dang it. What are you trying to do now? Well, that's kind of strace it. Oh, strace is not standard at all. Ha ha ha ha ha. Maybe not. Well, I think, I mean, I think, I think we've determined, okay, there's a DNS problem here. Where should we go? All right. I don't think we need another tool, especially strace. Fine. strace is not installed by default.
Well, strace, one of the nice things about strace is that as you're profiling the app, it's showing you all the craziness of the trace output, which nobody reads because it's too much. What'll happen is like it's going, it's going, it's going, and then it just stops. And waits. Because it's waiting for that timeout from the DNS lookup and then it finishes. And so if you read back from that pause, you can see the library call that it's making, which is gethostbyname or something like that. It's like, oh, okay. So I've confirmed this is really, like, probably doing DNS lookups. All right. 
So where do we start going from here? Well, one place we could go is we could look in here. So if we had an IP, we could stick it in here and do a manual DNS record, which like 40 years ago was the way we did things. But in reality, we rely on the resolver settings. So. Yeah, we need the resolver. Yeah. And where did this resolver setting come from? Question. It tells us in the file. NetworkManager. Yeah, because this box is a DHCP client. So it received its DNS settings as part of its DHCP lease. So if we hadn't done anything in this box to mess it up, right? If this was its natural state. Not only would I fix it here, but I'd also wanna go to the DHCP server to make sure that it had the right setting, that it wasn't handing out the wrong IP for a resolver in the DNS settings for everything that's out there. Yeah. So we'll set it to something else. So it's actually not a non-routable IP. So this is a top level DNS server out in the world. In reality, you'd probably wanna choose one that is within your network. And then that way you can rely on things like DNS caching and not doing a whole bunch of DNS lookups at the root servers on the network, but we are what we are. All right. So now that we have it fixed, it works pretty darn quick. It's not DNS. It can't be DNS. It was DNS. Google, that works. Okay, DNS. That one's solved. DNS.
You had mentioned DHCP in our pre-show. Yeah. I mean, so like dhclient is a thing, right? 
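The resolver check described above comes down to two questions: which name servers does `/etc/resolv.conf` list, and does each of them actually answer? A minimal sketch follows, assuming the standard `nameserver <address>` line format of resolv.conf. The helper name `list_nameservers` is invented for illustration, and `redhat.com` in the comments is only an example name to look up.

```shell
# Hypothetical helper: print the nameserver addresses listed in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# On a live box (commented out, needs network and bind-utils), ask each
# configured resolver directly with a short timeout; a wrong or unreachable
# resolver tends to show up as a timeout rather than a quick answer:
#   for ns in $(list_nameservers); do
#       dig +time=2 +tries=1 @"$ns" redhat.com
#   done
```

Querying a specific server with `dig @server` sidesteps the system resolver settings, so it helps separate "the configured resolver is bad" from "DNS is unreachable entirely".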
And I know a lot of people, like their security teams for whatever reason, don't embrace DHCP, but for those of us that do, sometimes changes don't propagate in time. You know, like if there's a new route on the network, not new route, but if there's new information available, cause there's a lot of stuff you can configure with DHCP, like DNS servers, for example. If there was a change on your network made via DHCP, but somehow your box did not pick it up, it's sometimes good to just hit that refresh button on your DHCP lease to make sure you've got everything you need from the, you know, DHCP server that's been configured by, you know, your expert crack IT team. So if you want to do that, you have to check and see if DHCP is on first. Yeah, that's where I was going to go. Do that. Good question. Where's that setting stored? I believe it is in network-scripts. Let's see, I was in systemd, or sysconfig, there we go. So each interface has this configuration file, and stored within it is that, right? So BOOTPROTO equals dhcp will cause it to do a DHCP client request. This variable also supports the term bootp, although no one really uses BOOTP clients anymore. If it is literally anything else besides bootp and dhcp, it means static. So a lot of places will put in none or static, but it could literally be like fluffy McButterpants and it means static, because any term that's not dhcp or bootp means static.
In the newer world, we have NetworkManager. So nmcli is the NetworkManager command line interface. So let's look at our connection. Maybe export or the, oh, show. Let's go show, there we go. And so this is like a lot of stuff about the connection. That's a lot of information. It is. And somewhere down in here. Could you just do a slash DHCP? Would it come up? I don't know what the term is. It's not DHCP or has DHCP in it. Well, we get some stuff like a DHCP timeout, but I wanted to ask you about that. Oh, DHCP options right down there. 
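The BOOTPROTO rule described above (dhcp or bootp means a dynamic address, anything else means static) can be sketched as a small check against an ifcfg file. This is a hedged illustration, not a tool shown on stream: the function name `bootproto_mode` is hypothetical, the file is assumed to live under `/etc/sysconfig/network-scripts/` with a name like `ifcfg-<interface>`, and folding the value to lowercase is a convenience assumed by this sketch rather than something stated in the discussion.

```shell
# Hypothetical helper: classify an ifcfg-style file by its BOOTPROTO value.
# dhcp or bootp -> "dynamic"; anything else, or no BOOTPROTO line at all,
# -> "static", following the rule described in the stream.
bootproto_mode() {
    # Take the last BOOTPROTO= assignment, strip quotes, fold case
    # (case folding is an assumption of this sketch).
    value=$(sed -n 's/^BOOTPROTO=//p' "$1" | tail -n 1 | tr -d "\"'" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        dhcp|bootp) echo dynamic ;;
        *)          echo static ;;
    esac
}

# Example use (commented out; the interface name is a placeholder):
#   bootproto_mode /etc/sysconfig/network-scripts/ifcfg-eth0
# With NetworkManager, the lease details can be pulled out of the long
# `show` output with a case-insensitive filter:
#   nmcli connection show "<connection name>" | grep -i dhcp
```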
Yeah. See how everything's configured there. You got a lease time and everything, there you go. And so that's actually showing us all the stuff that we got from the DHCP server as part of our lease. Right. So let's say our domain name server changed, but we didn't pick that up in time. And now we're having DNS issues and our DNS is actually configured correctly. Or so we think. So now we need to run the command to refresh the DHCP lease so that you can get whatever change came down the pipe. Yeah. Somewhere else in here are the timeouts also associated with that lease. So leases have a refresh time and a renewal time. So you don't want to kill your DHCP server with random traffic all the time. So a lot of places will do a check-in time of once a day or maybe once every couple of days and a renewal of a day or every couple of days. If you're in a network that has a lot of transient guests, like an office with a lot of guest workers, for example, there you would turn those values down because you're not expecting someone to be there for multiple days on end. So you might have them refresh every couple of hours so that you could return that used lease back into the pool of available leases. But I think the defaults are something like one day and one hour or something like that. Yeah. Sorry, one day, yeah. So that might explain why this box didn't get that refresh, right? Because it's only looking every day and you made the change four hours ago, but that's not another day. So it's going to wait until tomorrow to pick up that change.
All right, so we decided we want to refresh this. And you actually told me the command earlier, Short. It's just dhclient -r. So that is a refresh that basically checks in and... It's a two-step process, actually. But yes. In line with us. So remove stale PID file. All that actually means is that the PID file that had all the information that DHCP gave it, that was created by dhclient, has been erased. 
So now it has no recollection of what is going on in the world. So now you do a dhclient to get a new lease or at least check in with the DHCP server to see if a new lease is available. And we can see that also here in all our recipes. All right, so here we see a whole bunch of NetworkManager output, talking about changing the status of the device, getting that new information from the DHCP server. Yeah, so it's always good to trust but verify, but I always assume that for some reason it didn't refresh correctly and run a dhclient anyway. Yeah, and when you're on the phone with your internet service provider and they tell you to reboot your machine, it's like dhclient -r. There, I've rebooted, it's really fast. And yes, the information I pulled is still mangled, so.
So we have been asked where our shirts are from. I know where your shirt's from. I know where my shirt's from, but yeah, where's your shirt from? So this was a Red Hat booth shirt. Yes. For AWS re:Invent in 2019. Wow. It's one of my favorite Red Hat shirts though. It is pretty cool. Yeah, your shirt is Cool Stuff Store. My shirt is Cool Stuff Store, which you can get for yourself. Go to coolstuff.redhat.com and you can get a Containers or Linux shirt all of your very own, as well as many other shirts that I wear throughout the week. As I've told my boss many times, I don't think you understand, I can't have enough Red Hat shirts. Like that upper limit needs to be really, really high. I will say, since we stopped traveling and getting booth shirts, my wardrobe is very repetitive. Yeah, I mean, yeah. I would, you know, I threw on a collared shirt one time and people thought I was interviewing, and it was actually a Red Hat collared shirt at that. So it's like, okay, t-shirts for the live stream. Got it. I think there was one time where I was like going to a nephew's birthday party right after a stream and I was in a polo and it looked weird. But anyways, what do we want to break next? 
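Going back to the lease refresh from before the shirt question: the two-step process can be sketched as a wrapper that runs `dhclient -r` and then `dhclient`. Per the dhclient man page, `-r` releases the current lease and stops the running client, which is why a second, plain `dhclient` run is needed to request a lease again. The function name `renew_lease`, the `DRY_RUN` switch, and the `eth0` interface name are all assumptions made for this sketch. On a box where NetworkManager owns the interface, running dhclient by hand may conflict with it, and reactivating the connection with `nmcli connection up <name>` is a possible alternative.

```shell
# Hypothetical wrapper for the two-step lease refresh. With DRY_RUN=1 it
# only prints the commands; otherwise it runs them (root is required).
renew_lease() {
    iface=$1
    for cmd in "dhclient -r $iface" "dhclient $iface"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Intentionally unquoted so the command string splits into words.
            $cmd || return 1
        fi
    done
}

# Example (commented out):
#   DRY_RUN=1 renew_lease eth0    # show what would be run
#   renew_lease eth0              # step 1: release, step 2: request again
```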
All right, so let me clear this real quick. One of the other things that you had suggested before the show was: I installed too much stuff. Yeah. And I need to uninstall it. So there are a couple of ways. A lot of people don't realize that there's potentially cruft on your boxes because, you know, you installed a group that you didn't intend to. My favorite is I used to work at this place, a very well-known place, but manufacturing, right? So it was an old BSD admin that was in charge, he was actually a contractor that was in charge of all the Linux boxes. So a lot of Tcl, but also he didn't, you know, he was not RHEL certified in any way, shape, or form and didn't completely understand that small images are good images, right? So by default, they did a full install of everything on the RHEL DVD, everything, full install. So the boxes were huge, they were taking up enormous amounts of space, tons of updates, tons of vulnerabilities, you know, tons of ports open that don't need to be, just all kinds of things that made these systems very, very crufty. So I had to go in and clean that up. Nowadays it's a lot easier than it used to be. So it's kind of nice to show this off. Yeah, well, and I've actually heard it said, like, we do an everything install because we don't know what's gonna happen on this box in a year, three years. Yeah, and to be clear, like some of these boxes were like, they were at his desk one day and then they were on, you know, the manufacturing floor, never to be touched again unless something needed to be touched, kind of deal. So yeah, that was their thinking exactly. Just install everything because we don't know what's gonna happen here. Yeah, but like to your point, the more stuff you have, the more stuff you have to maintain. Right, exactly. So lean is good and you can always, well, not always, but in many configurations and operations, you can install more stuff later, right? 
So all of a sudden you realize that you need such and such PHP module. You can install that later. You don't have to install everything. You don't need to install all of PHP to get the one mod you need. Right, right. So this box, I think, is kind of representative of some things that we see. This one is now able to do graphical desktop. And you can see that's really helping me from my SSH session. Yeah, I noticed you were on a terminal. Not necessarily logging in on like a console. Yeah, man, I use that graphical desktop on this box all the time from my terminal. From your terminal. I mean, you could, yes, you could do some X forwarding and that would get interesting, or, you know, I'm sure Wayland has a remote server or something like that, but why? Yeah. So. Oh, that's a lot of stuff. Yeah. That's a lot of stuff. So this is just like the packages that have gnome in their name. Like that's not even all the stuff. Right. But this is the stuff that I don't need anymore. But yeah, you would get an email client, calendar client, all that other crap with it, right? Like it's not just GNOME and all the components of GNOME, it's all the stuff that comes with the desktop environment. It's a lot. You don't use an email client on your servers in the data center? Oh. Wow, follow up with your bad ideas for the day. I mean, if you're, you know, if this was a logging system of some sort maybe, but you know, no. Well, even then you don't need like Thunderbird or Evolution. Right. Yeah, whatever comes in the desktop. Yeah. No, you don't need that. You need like mailx, done. Right. Exactly.
All right. So there are a couple of ways that we can solve this. The first one, I think, is more generally applicable. And we were talking about it pre-show. So Chris, I'll let you tell me what you think I should do. So it's actually from a recent Enable Sysadmin article. So I'm going to drop it into the stream chat, but I'll walk you through it. 
So do a dnf group list. Yeah. But we're on RHEL. I'm going to use yum. Yeah. I mean, in RHEL 8, yum is literally symlinked to dnf, but maybe some people are on RHEL 7 and they need yum. Yeah. Oh. Yeah. Okay. Cool. Okay. So what groups do we have installed here? Oh, look. Graphical Administration Tools. You got Server with GUI. Anything else in here? Graphical? No, it doesn't look like it. So we need to get rid of those two, pretty sure. All right. So these are the ones that we could install as groups if we're interested. These are the ones that we currently have installed. Right. Sorry, I misread this. As are these guys. Yeah. So that Minimal Install is what we want, right? Normally, if you can do a minimal install and then add your packages through Ansible or any kind of administration tool or automation tool, that's kind of the goal. So Minimal Install and Server with GUI are like the polar opposite of each other. And it's funny that they're installed together here. So yeah, let's get rid of Server with GUI. And that's going to take one command. So yum group remove, and in quotes, Server with GUI. And you can put a -y after that if you want. Oh, we're just going to answer yes to all the questions, aren't we? Hell yes. Why do we have a Server with GUI installed and a Minimal Install together? That seems, I mean, like yes, that's the most, you know, secure install of a server with a GUI, but you're using the terminal all the time. You're not touching the GUI. Let's get rid of that. So I know that in our RHEL lab environments on lab.redhat.com, we always add in the -y option. Cause I want to make it so you can click the command and it just runs it and you are doing it all right. But generally, I think it's a good practice to review the transaction before you say yes to it. 
Cause maybe this is a Server with GUI because there actually is, maybe it's running an enterprise database client from a third party company that requires GUI tools in order to do its installation of said enterprise database from third party company. And so by removing the Server with GUI, you may end up removing that database software too, because it depends on one of the packages that you're deleting. For our purposes today. Fair enough. Yeah, yeah. I see your point, but yes, today, I'm comfortable hitting -y. Fair enough. Boom. All right, so it found all of the packages that are part of that group. Six hundred and sixty seven of them. Wow, that's a lot. There's a lot. Yeah. And it's now cleaning them off of the system. You know, that begs the question. Six hundred sixty seven. How do we get to that much? So I actually don't think it's six hundred and sixty seven packages. I think that's six hundred and sixty seven transactions that have to be done by yum. So it's probably more like three hundred and thirty. Yeah, but still, I mean, that's a good number of packages. That's a lot. It is. But notice that, like, there's Perl and Python libraries that were pulled in because of it. And oh, samba-common, because of course, we want the file manager to be able to access Windows shares. So let's pull in all the Samba stuff too. Oh, there's one for Wacom tablets. We need to support those because somebody might plug that into the machine. And then when you use their wide swath of vulnerabilities appearing in front of me. Yeah. All right. So this is doing its thing. It is.
And while it's doing that, I'm going to show you the one other thing that we can do that's unique to this box, because if we installed it from Kickstart with Server with GUI or we installed it from the installer with Server with GUI, this would be the way that we need to clean that back off. But I didn't install it that way. I actually applied that after the fact. So we could also do a yum history. 
And right here, transaction number seven, that's where I installed all the stuff. So I could do an undo of that transaction to remove all the packages that were installed, or restore the ones erased, by this package transaction. So if I wanted to, you know, it's three, one. Group info. I remember how to do it. Group up. No. You get something like this. It's going to bark at me and say that I'm wrong. Oh my god, I pulled that out of the, like, recesses of my brain. Insanity. That's awesome. You should leave our sync though. But like when you install something and it pulls in all these dependencies, right? If you just remove the thing that you installed, the one package you installed, typically it leaves all those dependencies there on the box, right? So doing a history undo will actually take the entire transaction that was executed to get that package, including all the dependencies, and undo that transaction. You can also do this for things like removals. So you can undo a removal. Potentially you could do this with updates, but I don't know that I really recommend that. I mean, unless there is a certain, like, package you're creating, you know, on a regular basis that gets updated, yeah. I was thinking more like, there's that time where you're doing your maintenance window and applied your updates, but now it's borked and you can't figure it out. So you just want to undo it. Like, in theory, you could do that, eventually. That's the problem always. Yeah, and I've had some weird things where, like, it messes up. And granted, it's been many years since I've done this, but when I've undone things to a package snapshot from earlier, it may do things like janky things with GRUB and just weird, weird stuff. Oh, that's right, yeah. But if you, like, install a whole bunch of stuff for, I don't know, some people package you wanted. And now you need to undo that. You can undo it pretty easily with yum history undo and your transaction number. 
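The two cleanup routes discussed above (removing a package group, or undoing the transaction that installed it) can be sketched as follows. This is a hedged illustration: the wrapper name `undo_transaction` and its `DRY_RUN` switch are invented here, and the transaction number 7 is simply the one mentioned on stream. The wrapper deliberately passes no `-y`, in line with the advice to review the transaction before agreeing to it.

```shell
# Hypothetical wrapper around `yum history undo`. It rejects anything that
# is not a plain transaction number and, with DRY_RUN=1, only prints the
# command. No -y is passed, so yum still shows the transaction for review.
undo_transaction() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: undo_transaction <transaction-id>" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "yum history undo $1"
    else
        yum history undo "$1"
    fi
}

# Related commands from the discussion (commented out, they need root):
#   yum group list                        # which groups are installed?
#   yum group remove "Server with GUI"    # remove a group, then review
#   yum history                           # find the transaction number
#   yum history info 7                    # inspect it before undoing
#   undo_transaction 7
```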
And yum also has the, like, autoremove thing, doesn't it? Can't you do, like, a local cleanup job? It does. It says, like, removes orphaned packages or something along those lines. And it's been long enough that I don't remember exactly what that does, like how it determines what an orphan package is. Yeah. So I'm just going to say, I don't know. We'll have to figure it out some other time. Yeah, cool. All right. What other fun stuff are we going to break? Well, let me make another problem real quick. OK. Want some music? Oh, you're going to create this problem. There we go. There you go. So, I mean, wow, you're not even doing random. You're just filling it with zeros. Good job. Well, why do we have to be random? Well, that just seems like extra work, extra computation. Oh, yeah. So what we're doing is we're making a large file called bigfile. That is taking its input content from /dev/zero, which is the null character in an endless stream of zeros. Yeah. So what I'd like to do is have the entire disk be full. We'll see what actually happens. Here. Oh, oh, it's already happened. Oh, there we go. But it's very slow. Right. You know, it's a user. I mean, it's done 13 gigs of data so far. Wow, that was fast. Yeah. Is it done? It's not done. Let's check it again. I know I am killing it. So kill, if you don't send any other arguments, will send SIGTERM. Yeah, which will terminate the process. But I did tell it what to do, not send SIGTERM. I told it to send SIGUSR1. Right. And SIGUSR1, when sent to dd, will cause dd to send out a status to standard error. So. Right, this is what's happening when I send SIGUSR1 to the dd process, because normally it just runs silently and we have no idea what's happening. That's true. OK. DEMI or DMI3, MIS, I don't know how to say it. If you do the dd with bs=1M, it would have been faster. Yeah, that's true. 
It would just take, like, hunks of one megabyte of /dev/zero at a time, instead of character by character. But now you're stress testing the disk and filling the bell. What could possibly go wrong? Not a great, it's not a great. Maybe we just have too much disk space. Here, let me go to this other box. Let me just see. You got L tiny box. Sorry. Yeah. All right. So it's got 30 gig. So it should be close. OK. Be close. Yeah, try to execute a command. Oh, yes. OK, good. Boom, done. All right. So at this point, my disk should be full. And that'll lead to all kinds of weird things happening, depending on what file system is full. That determines the weirdness. For this one, my root file system is full. So things like logs are probably not very happy with me at the moment. Anything that writes to or reads from disk isn't happy with you at the moment. Yeah. So just to kind of see what's going on, I like using df. So yeah, we've got 20K available on this 35 gig file system. It's not bad at all. No. But we need to like figure out what's making our file system full. Right. And now we already know that because we watched it happen, right? Right. But like, why is the disk full? Is there anything we can do to keep it from becoming full routinely? Because normally what happens is it's like some batch job or some process that constantly runs, that ends up filling up your disk. A lot of times it's done by mistake, right? Like if you're testing a batch job or trying to figure out all the files on a file system and you use up all the inodes accidentally, right? Like, yeah. And actually, that's a really good one. If I used up all the inodes, I actually wouldn't see it with df -h because that's your data block space. Right. Like that's actual disk. So now we need something different. Yeah. So df -i. Yeah. I've actually seen that happen where somebody touched files, and touch makes zero-length files that consume one inode each, and they sucked up all the inodes. 
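The two df checks described above, block usage and inode usage, can share one small filter. The sketch below assumes the POSIX output format of `df -P`, where the fifth column is the percentage in use and the sixth is the mount point; the function name `full_filesystems` and the 90 percent threshold are illustrative choices, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: read `df -P` (or `df -Pi`) output on stdin and
# print the mount point and usage of every file system at or above the
# given percentage.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)          # "100%" -> "100"
        if (pct + 0 >= limit) print $6, $5
    }'
}

# Example use (commented out):
#   df -P  | full_filesystems 90   # running out of data blocks?
#   df -Pi | full_filesystems 90   # running out of inodes?
```

Checking both matters because, as noted above, a file system can run out of inodes while `df -h` still shows free space.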
So yeah, I used to work at a newspaper company and a friend of mine, Perl expert, right? Like this is the, you know, wait, well, 2010, 2011. He needed to index all the images in the storage array. So he wrote a Perl script. To save them locally, but he actually flubbed it and didn't do that. It actually wrote it to the disk on the existing, you know, the server, and ran out of inodes while we were out to lunch, no less. Yeah. So like he started this job and left. None of us knew he had done it. And then all of a sudden the pager starts going off and we're like, why is the pager going off? It's the middle of the day. Like, who the heck is running this? It's like, oh, OK. Oh, it was me. Guy at the table is like, oh, oh. Yeah, pro tip. Don't do that when you're about to leave your desk or on a Friday. Well, I mean, yeah. Oh, I love people that are like, oh, thanks for this new update. I'm just going to put it. Oh, it's the weekend. Let me just nail it right now before I leave the office. It's like, let me just hit the point on this. Let's do that Monday when you get back to the office. Yeah. Anywho. So you are, you know, low on inodes here, too. So it's kind of a good example. Yeah. I mean, 98 is not a lot. That's not a lot. Oh, but you know why that is? Because it is an XFS file system. And XFS file systems actually make more inodes as they need them. So unlike ext4 or ext3 file systems, where you've got a finite amount of them from when you created the file system, XFS will actually figure this out and make new ones live before you run out. Nice. It will probably be all right. So it makes new ones live as, yeah. So the process is dead now. Did you kill it or not? It filled up the disk. So it killed itself because, right here. Oh yeah, that's right. No space left on device. So now we get to find this. So yeah, like, OK, we know the disk is full, right? Like we figured out the disk is 100% used. Where, what command can help us see disk usage? Yeah. Yeah. 
du maybe? All right. So we could do that. Is there a better one though? That's a good question. So because this is a relatively simplistic box arrangement, we could do something like du on slash and figure it out. Right. But if you've got a whole bunch of different file systems and you have to run du on all of them? You kind of want to find just like the biggest files on the system. OK. So if we just run du, and I'm going to run it for a second and then kill it. All right. So it just goes through and shows you all the files and their disk space usage. What I end up doing a lot is something like this. Yeah, just do top five. Oh, and I don't like looking at errors in there. So, you know, ah. No, it's OK. Oh, that's because I put that in the wrong spot. I redirected the head errors, not the du errors. Right, not the errors. There we go. Much better. All right. So I know that there's a lot of stuff in this directory. Yeah. Root is doing something wrong here. Yeah, so du, I think, gets you pretty close. Right. But it's not going to give you the exact thing. But we can now do that and figure it out pretty quick. Yeah. Who put this big file here? Well, it's called bigfile. I don't understand. I don't get it. Was it supposed to crash the house this time? Yeah, we can also do something like this. Yeah, you could do size, I don't know, 100 megabytes. Yeah. Do a type. That's right. All right, so those are all the files in the entire system that are bigger than 100 megabytes. Oh, fancy. So yeah, yeah. I will warn you that find literally goes out and looks at every single file. Yeah. So like if you have a whole bunch of time, right. And we already pre-seeded our file cache from our du. So it'll happen real quick. But if you're like out there touching a storage array, out on the network, that's probably a bad look. Yeah. Yeah. So constrain things to make it more simple and easier. 
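The "top five from du" pipeline and the "files over 100 megabytes" search described above can be sketched as two small functions. The exact commands typed on stream are not visible in the transcript, so this is a reconstruction under assumptions: the function names are invented, GNU `find` is assumed for the `M` size suffix, and the `-x` and `-xdev` flags are added here to keep both tools on a single file system, in the spirit of the advice to constrain the search.

```shell
# Hypothetical helper: the N largest directories under a path (sizes in
# KiB), staying on one file system. Errors from du are discarded at the
# du stage, which is the redirect placement issue mentioned above.
largest_entries() {
    du -x -k "$1" 2>/dev/null | sort -rn | head -n "${2:-5}"
}

# Hypothetical helper: regular files under a path larger than a size such
# as 100M, again without crossing into other mounted file systems.
big_files() {
    find "$1" -xdev -type f -size +"$2" 2>/dev/null
}

# Example use (commented out):
#   largest_entries / 5
#   big_files / 100M
```

Because `find` visits every file, pointing it at a specific mount point or application directory rather than `/` is usually kinder to network storage.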
And / is going to include all the mounted file systems, including ones that might live elsewhere, like your SAN array. So, you know, being a little bit more choosy about where you run your find is probably good. Yeah. Like if you know your mount points, or, you know, for your application, /opt/app, for example, then yeah, you can get an idea of where to start looking first. It's probably wherever your application is writing files to. Or, you know, if you have, God forbid, everything installed and you have PHP running and somebody's uploading stuff to your server. Speaking of which, here we go. Ooh. OK. So. Why'd you do that? I don't like that. I like it. OK, fine. You deleted it. I did delete it. But it's still showing up. It's still full. What? Look, it's not even there anymore. It's not even there. Wait. Could it be in a cache somewhere? Oh, it's worse than that. Oh, I know. So this is something else, especially with log files: if you delete a file. Oh, yeah. It's not actually deleted, because somewhere, somebody has less running on that thing. Has it open. And in fact, that's what's happening over here on terminal two, right? I used vi to open that file. And because of that, the system is going to keep it around. Yeah. In the data storage, because we don't want to mangle the file data for whatever process is using it. Right. Do an ls -la here. I don't think that's going to get you what you want. Yeah, no. That is definitely not what I want. Yeah, but you know what is what you want? What is what I want? lsof, list open files. I always, always forget that one. And then I don't know. So look, you don't even have it installed. I don't even have it installed, because I'm terrible at my job. Try and install it. But the disk is just full. I know it won't work. Trick's on me. All right. So let me just kill this off here. I might have, like, destroyed the machine. Terrible. Really?
No, you didn't get it all. Yeah, OK, OK, still full. Still no. There we go. Oh, yeah, that's right, because you did. One, five, five, seven, six. Get out of here. There we go. OK, now let's do. OK, now we're much happier. Yeah, all right. Now the file is gone. So let me do this guy. Just plug it in. OK, good. All right. So lsof will get you what you want. These are all of the open files on the system. And so we could have done something like look for ones that are really large, or we could just scroll through this enormous list. But that would be why it wasn't actually removed and the file data was not marked as free space. And this happens a lot for things like log files, where you'll delete it. Yeah. But some process still has it open for writing, and it's still being saved and manipulated on the disk. So in that instance, what you can do is figure out which process has it open, because the second field right here is the process ID of the thing that has it open. So you might not want to. No, true. All right. So let's grep here. Actually, no, that's not it. I thought it was. Let's grep for this one, 15548. That's a little better. So that's the systemd process. An su login shell process. So if this was, like, Apache or nginx or some other type of service, sshd, right? If we did a service reload, that would usually fix it. Here, maybe I do a kill -HUP, sorry, SIGHUP, on 15548. So the hangup signal will generally cause it to refresh all of its files. So it'll close all of its files and then reopen them all. And when it closes all of its open files, that would release the file that we had deleted. And there you go. A lot of times the reason that's happened is we've done something like use logrotate to rotate a log file to an older file for archiving, but we didn't kill the process that was writing to that older file. It's still got it open and is still writing to it, even though it's no longer the active file, right?
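As an editorial aside, when lsof is not installed and the disk is too full to install it, the same "deleted but still open" information can be read straight from /proc on Linux, where the symlink for each open file descriptor of a removed file ends in " (deleted)". The sketch below is an illustration under that assumption, not what was run on the stream; the function name `deleted_open_files` and the `proc` parameter (there so the scan can be pointed at a different directory) are hypothetical. Seeing other users' processes normally requires root.

```python
import os


def deleted_open_files(proc="/proc"):
    """Return (pid, fd, target) for open descriptors whose file was deleted."""
    results = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # only numeric directory names are process IDs
        fd_dir = os.path.join(proc, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            if target.endswith(" (deleted)"):
                results.append((int(entry), int(fd), target))
    return results


if __name__ == "__main__":
    for pid, fd, target in deleted_open_files():
        print(f"pid={pid} fd={fd} {target}")
```

Once the process ID is known, the options are the ones described above: reload or restart the service, or send it a hangup signal (in Python, `os.kill(pid, signal.SIGHUP)`). Whether a program reopens its files on SIGHUP depends on the program; an interactive shell, for example, will usually just exit.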
That's why logrotate has things like postrotate scripts, where you can tell it to do things like restart services after it rotates the log away. And believe it or not, getting logrotate set up for certain things I've had to build myself, I have filled up disks just because I've forgotten to do stuff. I have definitely run into the disk-full problem because of something I didn't do right with logrotate. That's a great example. Fair point. All right. So I think the last thing that we're going to talk about is how I could avoid getting into this situation in the first place. Is there something that I could do to help avoid getting into the situation in the first place? Yeah, like, help me out here. Yeah, so here, let me spark up a new box, one second. Because I've actually got a lab for it if you want to try it out on your own. I know, it's crazy. If I can find it in my list, there we go. We've been adding new content to the lab.redhat.com page. And of course, that moves tiles around, which means that I can't find the things I once had. Let's try one more time to share the screen. OK, so let me share the whole screen this time. Don't make fun of my tabs, Short. You're going to do it. I almost did earlier. I stopped myself short. Get it? Oh, name jokes. All right, so Red Hat has a service that's offered. Don't tab shame me, Chris. I'll tab shame you. All right, Red Hat has a service that comes with every Red Hat Enterprise Linux subscription called Red Hat Insights. And so once this box finally provisions itself, we'll go through and look at that. It is running an older version of RHEL to make sure that there's an insight that gets triggered for it. And I'm going to do one more. I'm going to break it one more way. All right, so. OK, I'm going to make it real broken. That's good. Yeah, actually, that's right. Yes, OK, so what I did was I changed the policy that we were using for SELinux to a string that is not a recognized policy that's installed on the box.
So we were running in targeted policy mode, meaning we're using the targeted policy. And now we're using a type called I'm So Broken, which is not actually a policy name. The consequence would be that if I were to reboot this box, it would not boot back up again. Oh, yeah, don't. Yeah, yeah. Yeah, because when it tries to load the policy called I'm So Broken, it can't find it. And then it just stops; it's done booting. It's not able to load the policy, so it's got to stop. And this is something where maybe somebody botched a search and replace, or they updated an automation module and have now applied this across your whole population of boxes, which you won't find out about until it reboots, whenever that next happens. So the problems that you experience upon boot are one of my favorites, because it might be months before someone actually gets to experience it. And then they have to remember that months ago they did a thing that could have broken this. Right. All right. So what we're going to do is install insights-client. Most RHEL 8 installs come with insights-client out of the box now, but because we did a minimal install, it wasn't included by default. But most of your other stuff, oh, and I can't do that until I'm like this. See, we're already having problems. We just got alerts that another terrible storm is coming. And all I can do is think about what it's going to be like tomorrow with no power again. So we'll try a little time. I don't know. Cool. So, got insights-client installed. I'm going to go ahead and register this box with Insights. And it uses the system ID and host name information that is already tied to the subscription for the system. So there shouldn't be any login information or anything else that I need to provide, because I'm going to use the already registered system information. All right. And at this point, it's collecting Insights data and uploading it to the Insights service.
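As an editorial aside, the SELinux breakage described at the start of this passage is the kind of thing a small pre-reboot check can catch. The sketch below is an illustration, not how Insights detects the problem: it assumes the usual layout in which the policy type is set by a `SELINUXTYPE=` line in /etc/selinux/config and each installed policy has a directory of the same name under /etc/selinux (for example /etc/selinux/targeted). The function names are hypothetical, and `imsobroken` in the comments is a stand-in for the bogus value used on the stream.

```python
import os


def selinux_type(config_text):
    """Extract the SELINUXTYPE value from the text of an SELinux config file."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and anything that is not key=value
        key, _, value = line.partition("=")
        if key.strip() == "SELINUXTYPE":
            return value.strip()
    return None


def policy_installed(policy_type, selinux_dir="/etc/selinux"):
    """True when a directory for `policy_type` exists under `selinux_dir`."""
    return bool(policy_type) and os.path.isdir(
        os.path.join(selinux_dir, policy_type)
    )


if __name__ == "__main__":
    with open("/etc/selinux/config") as fh:
        configured = selinux_type(fh.read())
    if policy_installed(configured):
        print(f"SELINUXTYPE={configured} looks installed")
    else:
        # A value such as "imsobroken" would land here.
        print(f"SELINUXTYPE={configured!r} has no matching policy directory")
```

A check like this could run from whatever automation pushes the config file, so that a botched search and replace is noticed before the next reboot rather than months later.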
All right. So now we're going to go over. Let's just verify that everything's cool. And now we're going to go to cloud.redhat.com. And notice in my lab here, I give you some credentials to use for working with this box. I don't know. Well, 87.20, I don't have an exclamation point. Well, did you just say your password on air? I did, for that account that I provide the password for right here in the lab. I was just thinking. I'm not sure if you saw the news. There was an Italian Olympic broadcaster that didn't realize they were on air and asked for the password to the system they were about to log into, was given the password, repeated the password, and commented on it as a very good password and everything else, not realizing they were on air. Yeah. I'll have to find that story real quick. Now that you've mentioned it, I did see that this week. All right, so we're going to go to the Red Hat Enterprise Linux Advisor, or Insights. And then we can go through and just kind of look at the dashboard of what's out there and available, and things that are broken and whatnot. But I want to go to the inventory. Can I look at this one box that I'm working on? And that's this box that registered two minutes ago. All right, so at this box level, I can actually see individual stuff for it. Like what CVEs are outstanding for this box that I should download and apply fixes for, which, well, the box has never been updated. We probably need to do that. But also under the Advisor advice. So right there, it's an important piece of Advisor advice: system will fail to boot when SELinux type is set improperly. And then it actually includes the actual value from the box. That's kind of nice. I could find it. And then the steps to resolve it tell me what we think it should be set to, which may not be true for your environment. And so let me just go ahead and do that. And then there is one other piece of advice, too, that NetworkManager uses an internal DHCP client.
And it says we just need to install a new version of NetworkManager. So I'll go ahead and do that one, too, while I'm here. Resolvable things. Yeah. All right. So at this point, I could wait and check in again later or tomorrow and see that the problems have been resolved, or maybe I should just have it do a check-in now. I'm getting tab-shamed from the audience, Chris. It's not, hey, I didn't repeat it. I didn't repeat it. All right. So now that it's done, let's check in. We do have to wait a couple of seconds for the Insights service to actually go through all this data that it just collected and update the UI appropriately. Right. Let's see if we're actually doing that now. [inaudible] Stand by. All right. And we've resolved all the Advisor advice. Good job. The same thing is true for the vulnerability service. As you apply the updates and the CVEs are closed, you can verify here that it's done. And then there's more stuff, too, like compliance. If you're actually trying to comply with some of the security standards that Red Hat ships, like DISA STIG or CIS or another one, HIPAA, you can gauge your compliance against those security standards using this tool. That's in addition to the OpenSCAP client-side utilities, but this will give you a nice dashboard of your population against those compliance standards. All right. Cool. So I noticed that Kellogen asked if he could also use those credentials. I suppose you could. In fact, they're part of the lab exercise. So if you want to put that in the chat as well, Short. There we go. So that was the actual lab I was doing. But if you're looking for something more than just trying the lab I've already written and you want to have your own access to Red Hat Enterprise Linux stuff, you might try the Red Hat Developer Subscription for Individuals. And that'll give you up to 16 Red Hat Enterprise Linux entitlements, plus Insights, because it comes with every Red Hat Enterprise Linux entitlement.
And you can have your own account with your own credentials. Yeah. So I'm going to find the link to that real quick. developers.redhat.com. I think it's just down by register. Register. OK, it's someplace else. Featured download. Download now. It's /register. Well, you can download. If you already have an account, you can download it here. Yes. And if you don't have one, you can go to the registration page and register. Actually, it took me to this page. I'm not even logged in on this browser. So, interesting. All right. Cool. Go get your free RHEL, folks. It's free. Can't beat that. It is. All right. Thanks, Scott. This was fun. And I always enjoy breaking things on air. What could possibly go wrong? What could possibly go wrong? Yeah. All right. So last show of the day, folks. But tomorrow, come back for some awesome Data Services Office Hours, as well as In the Clouds, DevNation, DevSecOps Is the Way, and GitOps Guide to the Galaxy. So another full day on tap for us on a Thursday. So stay safe out there and we'll see you soon.