Welcome to the Homelab Show, episode 63: Troubleshooting Tips and Tricks. I'm joined here by Jay. How are you doing, Jay? I'm doing really, really well. How are you? Probably a little tired; you've been editing a lot. I know that I am. Yeah, I've been editing a lot of content, but I'm getting so caught up now that it just feels great. I always complain that I'm behind, but that's not the case now, so it feels great to have a content flow. I think by next week I'll totally be back to my normal content schedule.
Awesome. Yeah, this is our 63rd episode, which means with all the projects and all the things we've talked about, we probably should talk about how to troubleshoot some of those things. We were talking about doing it on the earlier podcast episodes, and we said, let's push it back a little further, because now we can discuss some of the troubleshooting things, which is going to be kind of fun. But before we jump into this, let's quickly thank our sponsor of this show, and that is Linode. I should probably count, but I think there were only one or two episodes they didn't sponsor, so pretty much all of the shows, probably 60 episodes or so. We'd like to thank them for being a sponsor, and the troubleshooting tips we cover today will pretty much apply even to things you host in Linode. So if you need one of these projects that we talked about hosted somewhere on a public IP address external from you (and there are going to be more videos coming where that'll become applicable as well), Linode is the place to host it. Not everything is easy to host yourself in your homelab, especially if you're behind CGNAT. Even so, yes, we may consider Linode an extension of your homelab: that public-facing thing. Maybe you want people to poke at it.
Let them poke at Linode's IP address instead of yours. We have an offer code to get you started with Linode, and we thank them for being a sponsor of the show. Oh, and it's Linode slash the homelab show; I should probably make sure I say that. They could be your ultimate DMZ, if you think about it. Yeah, put it out there and let their cloud deal with it.
All right, what's the first tip on our list? Where do we want to start? Is it the "before you buy"? Yeah, I would think so. It makes sense to start here. Now, I do concede that this probably only benefits, I don't know, 25% or less; I'm not really sure what the percentages are. A lot of times when people buy equipment for a homelab, it's mail order, eBay, things like that, and you can't touch it until you actually get it. But with things like Facebook Marketplace (I'm not advocating for Facebook, but it exists, and we have to acknowledge that it exists), a lot of people are buying servers there. So sometimes you actually can touch what you're about to buy before you buy it. What I've seen is that people just assume everyone else is running Windows, even people who are selling secondhand equipment. They might load Windows, I don't know what version, maybe Windows Server or something. Anyway, if they will allow you to see it running, which, if it's an in-person buy, why not? You really want to see that it works.
The first thing you should do is go to the Windows Event Viewer immediately, because if there's actually a serious hardware problem, Windows is going to be complaining. Keep in mind, though, that the Event Viewer likes to complain, so there's rarely going to be a situation where you go to the Event Viewer and don't see some critical errors. The mere presence of critical errors doesn't mean much; Windows just calls everything that. You might get lucky and not see any errors, but you probably will. There's always something in the event log unless the event log is completely broken, which is even worse, or suspicious if it was completely cleared before you arrived. But if you look at it and you see hardware errors about a device failing, don't buy it. You're going to have some trouble with that device, so just say no, thank you.
I've done this myself. There were two laptops that I bought for a Linux-on-old-laptops series (I really need to get back to that, because it was a lot of fun). I bought at least one of them, if I remember correctly, off of Facebook Marketplace. He brought it out, put it on the back of his trunk, and powered it on, and I just went right to the Event Viewer and said, "Yeah, I need to see if this thing works." The only suspicious thing I found is that he had put an Apple hard drive in it, because that's what it actually said in the hardware devices. But it's a legit SSD and it's made by a name brand, so I didn't really care. Anyway, the point is you could see some hardware errors, and if you do, then don't buy the device. Ignore the driver errors; those don't really affect you at all. That just means the person didn't install the driver for something. You'll probably see errors about a non-clean shutdown.
It happens. But again, hardware errors: you definitely want to check those out. That's only if you can get physical access to it before you buy it, though. There's a server store in Ohio (I wish I could remember the name of it), and I've actually driven all the way down there because the guy will just let me power on whatever I want before I buy it. He didn't even know I had a YouTube channel, so it wasn't special treatment. You can also ask people to send some pictures of it powered on. Put something in the message so you know the pictures weren't doctored; I doubt they'll take the time to really Photoshop it. For example, write today's date on a Post-it note, stick it to the monitor, and take a picture.
Yeah, those are little things you can do to try to verify. It is an aggravation, because there are scammers out there, and that's a lot of why we're mentioning this. Don't blindly buy things. There are reputable places you can buy from. You can of course look on eBay and things like that, and I'm not saying eBay is reputable, but you can at least try to make some determinations about the reputation of the particular person. Of note, when you look on eBay, look at the person's selling history. If they made a fortune and a great rating selling socks, and the next thing you know they're selling servers, someone may have taken over that account and be using that reputation to convince you that they're a great place to buy from. So just do a little bit of due diligence on the buying part when you're building out your homelab. It's one of those little things. You get excited.
You're like, "That's a great deal, and it has 128 gigs of RAM!" Then maybe it's not. Another thing to check, if what you're buying is a physical server (and if you can get pictures of this, even better; I kind of feel like they need to take pictures of this): look at the capacitors. If it's a server, you can just easily open up the case and look. You want to make sure none of the capacitors are bulging or leaking. Especially not leaking, but if they're bulging, they're about to leak.
Yeah. I will admit that's a little bit less of a problem now, because most things have the better capacitors now, but years ago that was a huge problem. The problem hasn't gone away completely; the good news is it's just less of an issue here in 2022.
I think that even is more to my point, though, because a lot of the equipment that circulates when it comes to homelab is the older stuff from before they fixed it. There was a certain Dell PowerEdge (I wish I could remember the model number) where this was so common that I remember going online and there was somebody literally making money by saying, "Hey, send it to me and I'll redo the capacitors for you." He made a business out of it at the time because of how many there were, and those models are going to circulate. But yeah, you're right, it's less of a problem now. And especially if you're buying something like a laptop: if you start trying to open up a laptop in front of somebody, they're not going to let you do that. That's going way too far. So of course everything I'm saying is situation specific.
The next thing, though, is what you should do before you put it into production. So you have this device, everything checked out for you, and it's time to get it going. This is the hardest thing for me, because when I buy any device I want to get it going right now. I don't want to wait.
I mean, it's like a kid in a candy store: I want to just dive in and start munching on it. Munching on your servers, right? I just want to get going on it. But it is better to spend some time and do some more troubleshooting, because it's possible you could have memory errors. You can run Memtest on it; you should run Memtest on anything before you put it into production. Unfortunately that does take a long time, but it's a lot less time than building the server and then finding out you can't even trust the data at this point, because you don't know what data went through bad RAM and what's corrupted. That's a way worse situation. Just spend the time, be patient, run Memtest. And also run SpinRite on level two. I'll talk more about SpinRite later, but I would even say if you have a brand-new hard drive, run SpinRite on it. The problem is you're pretty much looking at overnight at this point, at least. But if you can spare that time, even new drives are going to have some issues that the drive's firmware may not find, and that's what SpinRite does. I still recommend you do that regardless, because it just makes sense to make sure the hardware is actually good before you start relying on it. That's very important. Absolutely.
The other thing I want to mention (Tom has something he wants to get out of the way, so I'll mention this next) is extremely important, and I'm the biggest hypocrite in the world for even mentioning it, because I'm an offender here. I constantly don't do this, no matter how many times Tom tells me to. I'm trying to remember: check the physical layer first. Now, we're getting back to the capacitors and all that, but I won't mention that again. You'd be surprised how many times it's a bad cable, or something to that effect.
I once had a whole roll of network cable that I bought, and I was making my own cables. It's not hard to make cables, and I thought I was just really bad at it, but it turns out that entire roll of cable was bad. It was bad no matter who made the cable. So check the physical layer first, especially if you're dealing with Wi-Fi issues. What are your walls made of? That's a very big thing there. Check the physical layer: again, capacitors, but also cables.
And the other thing, too, is just checking for loose plugs and things like that. We've had clients who make the assumption that a switch is bad. A whole leg of a building went down, there are six buildings and they're all fibered together, and they're in a panic: "Someone cut our fiber!" and everything else. I'm like, "Did you check the SFP modules?" "Well, no." They're a client that we know, and I said, "There was a box of them last time I was in your office, because you had spares just in case." There's a pause, and he goes, "I'm going to try that." He emailed me back like 20 minutes later.
"Hey, that fixed it." Oddly, one of their SFPs had gone bad, but they thought someone had cut the fiber, and started looking there. Then they went to "Is a switch bad? Do we need to start reprogramming it, because we can get to these switches?" People started jumping all over the place on it, and sometimes it is something really simple. It's rare that a cable goes bad or a port goes bad, but the number of times it's happened over the years is not zero. So you methodically go through, especially, the easy things to check. It's a lot faster to go through each easy section as you go and work your way up from there. Checking the physical layer, loose plugs, loose fittings, loose connections: those are all more common than the more advanced problems you think you have. I've even lived it myself. I went into a little panic when something happened at the office, and then it took me a second: "Oh, that's right, check the physical layer." It was something so absolutely simple that I should have just looked. Somehow a plug had come out, and I thought a server had just died, but somehow, I don't know how, a plug came out. It was just a demo server, but I was like, "My demo!"
I mean, sometimes it's obvious.
I remember my first IT job ever. I was doing help desk in a factory, so I worked in the office part and I would just walk out into the factory, which meant I got a lot of steps in and I ruined a lot of shoes. So I got a lot of exercise. I'd go out to the factory floor and address an issue, one of which was a person complaining that his network connection was constantly bad and intermittent. When I go in there, I see the network cable lying across the floor, and he has a roller chair. So of course he's backing up over the darn cable repeatedly. And I'm like, "Did you ever maybe think that you shouldn't have your chair running over the cable, and maybe we should consider not having the cable across the floor?" Now, that's obvious, but often it's really not that obvious. You just have to look at it. I've troubleshot things for four hours, sometimes months, like my Wi-Fi issue from a couple of years ago. Tom, I was complaining about it for a couple of months, and I tried every setting in UniFi known to man, and come to find out it was just a bad cable going to the access point. After all that! So yeah, check the physical layer first.
Absolutely. All right, with the physical layer out of the way, there's one I'll get out of the way that I just wanted to throw out there. A lot of the tools we're going to talk about next are going to be Linux command-line troubleshooting tools (I'll have one pfSense one in there), but Windows has the Sysinternals utilities. It's a long list. I don't know if there's going to be an episode on Sysinternals, but there's enough documentation out there from Microsoft on it. It's kind of the de facto set of tools you're going to use inside of Windows to troubleshoot a lot of detailed Windows problems. So we're not going to spend a whole lot of time on that, or any time on that, other than to mention that generally you're going to use the Sysinternals tools, and there we go.
Yeah, I would go as far as to say that if you do use Windows servers in your homelab... I mean, I know enough about Windows to know: get Sysinternals. Every Windows admin I've ever talked to says how great those tools are and sings their praises. So if you run Windows, make sure you also have those tools downloaded, so they're ready for you if you ever need them.
Before we jump to the next one, Jay, let's talk about one real quick, because I saw someone mention it in here and it should have been on our list, because it's always DNS. It always is. Dig is your friend when it comes to DNS. I have a whole video on using dig to help troubleshoot DNS issues, like finding mail server problems, or whether or not you have MX records, A records, and all those types. I'm not going to spend a lot of time on it right now, because maybe that's an episode on how DNS works. I don't know if we've ever covered that, but it might be its own episode because of all the different trickiness to DNS. But I have a video on DNS, and dig is your friend. Is nslookup deprecated in Linux? I don't know, because "deprecated in Linux" generally means it'll really be supported for the next 10 years, and everybody uses it for like 10 to 15 years after everyone tells you not to, and then finally it goes away, if it even does. It's weird. I've been using dig for so long, I never think about nslookup.
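To make the dig tip concrete: running dig with @ and a server address sends one question to that one DNS server, which is what lets you compare how each server responds. The Python sketch below does a stripped-down version of the same thing using only the standard library. It is an illustration under stated assumptions, not a dig replacement: the names build_query and ask are made up for this example, it assumes plain UDP on port 53, and it only reports the response code and the number of answers instead of decoding the records.

```python
import random
import socket
import struct

# Record type numbers from the DNS specification (RFC 1035).
QTYPES = {"A": 1, "NS": 2, "MX": 15, "TXT": 16}


def build_query(name, qtype="A", txid=None):
    """Build a minimal DNS query packet for one name and one record type."""
    if txid is None:
        txid = random.getrandbits(16)  # random 16-bit transaction ID
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # The name travels as length-prefixed labels terminated by a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    qname += b"\x00"
    # Question trailer: record type, then class 1 (IN, "internet").
    return header + qname + struct.pack("!HH", QTYPES[qtype], 1)


def ask(server, name, qtype="A", timeout=3.0):
    """Send one query to one server; return its response code and answer count.

    Simplification: the reply's transaction ID is not checked against the query.
    """
    query = build_query(name, qtype)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, 53))
        reply, _ = sock.recvfrom(4096)
    _txid, flags, _qd, answers, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    # rcode 0 means no error; 3 means the name does not exist (NXDOMAIN).
    return {"rcode": flags & 0x000F, "answers": answers}
```

Calling ask once per DNS server with the same name and record type, then comparing the results, is the scripted equivalent of running dig against each server in turn. For real troubleshooting dig itself shows far more detail.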
I think it's still installed by default, I don't know. Dig is my go-to tool for all of my things. It's easy to use: dig, then @ and the IP address of the DNS server, and then the commands after. It's just invaluable, because you want to know how each server responds. And bug us enough, send us some messages and tag us on Twitter, and we'll do a whole episode on DNS, because we can. I'll dive into my zone transfer story that got goofed up one time.
So here's the thing. I'll make a deal with you, Tom. I will absolutely be on board with an episode on DNS, just as long as we call the episode "It's Always DNS." That has to be the title. Then I'm on board.
Yeah, we'll mention DNS. It's definitely on the list. All right, now, you have a good one in here. That's, uh, Rescuezilla.
Yeah, there are a few tools in here and I'm going to mention a couple of them. And by the way, I know we're going to get comments like, "Why didn't you mention X? Why didn't you mention Y?" Trust me, I know there's a bunch out there. But Rescuezilla I like because it's compatible with Clonezilla if you take images, which is great. We had an episode about that. Rescuezilla supports those images, plus it has a bunch of other tools: your general troubleshooting tools and apps and things. It's kind of like the Swiss Army knife of bootable tools. So it's one of the things I recommend you have flashed to a USB drive. Or, Tom turned me on to Ventoy recently, which is a lot easier. I haven't tried it with Ventoy, but I assume, unless the knowledge base says otherwise, you can use it with that. Either way, at the very least have a flash drive with Rescuezilla on it, because you might need to do data recovery, anything like that.
It's a good one to have. But I also listed Xubuntu, which is going to seem strange at first, but there's a reason I didn't list Ubuntu or Pop!_OS. First of all, I'm talking about live bootable Linux distributions, where you run it off of USB, which covers the majority of distros nowadays. What I like about Xubuntu is that it's built on Ubuntu, so it has the same compatibility, but GNOME doesn't really work well on GPU-starved devices. And let's face it, unless you have a very specific kind of server, you probably don't have a GPU, because a lot of server hardware has enough GPU to show something on the display and that's it. It's not built for games. Unless you put a GPU in one, it's probably going to have the weakest video card you can get, so you probably won't be running GNOME on that. Xubuntu is lightweight; it has the Xfce desktop environment. I know there are lighter-weight ones, but I feel like Xubuntu has some of the best functionality while being lightweight, so it'll run well on whatever server. And you can use it for file recovery. For example, say you want to get all the files off a server: you could just connect it to the network, boot from Xubuntu, rsync everything to somewhere else, verify the data, and then you can sell it or whatever it is you want to do. Another thing is that it has Boot-Repair, which you can download from the repositories. Even though it's live, you can still install software, until RAM gets full, obviously. You can use Boot-Repair with that, and GParted.
Oh yeah, GParted Live is another one.
But yeah, Xubuntu for file recovery and boot repair, for sure. It's just a good thing to have. I mean, you could even use it to Google the problem you're having with the server, on the server. Let that noodle around in your mind for a minute.
But yeah, GParted Live is great. Well, you can actually load GParted in Xubuntu as well. That's what I was saying, right? You can. For some reason I've always had GParted Live separate, I don't even know why, and that works too. GParted separate is nice if you want to use it. It's just a live CD you can download that boots right up into GParted, and it's nice because it's really lightweight and dedicated to one task. But if you're troubleshooting things and you find that troubleshooting leads you to partition problems, then you can apt-get install gparted, along with Googling those problems, right on the rescue one. Which is kind of fun; like Jay said, let that noodle around a little bit. You can boot up, do all your diagnostics, have internet, do everything on there, and solve the problem. The other advantage of using something like Xubuntu is, like he said, rsyncing files. But I'll go a step further: you have all those facilities in there for SMB mounts and everything else, to be able to mount other things without having to load a lot. That kind of just comes loaded by default in Ubuntu and Xubuntu to get things moving. So it's a great way to rescue servers and things like that, when you don't know why that last command caused the problems that it did, or what happened. It's a good way to help unravel that.
Another thing I'll mention about this, that's probably even more important, is that live media helps us determine: is it a hardware issue
or a software issue? Because if the problem follows whatever distro you boot, then it's probably hardware. Obviously it varies, but if you try to reproduce your problem in live mode, you'll learn a lot about the situation you're dealing with, especially if everything works fine in live mode. Another trick you can do with live media, that I don't think a lot of people know: you can, for example, boot from, let's say, Xubuntu, then get another flash drive and put that in another slot. Then you go to install Xubuntu, but be careful: don't install it on your actual server's disk. Point it to the other flash drive, and it'll treat that flash drive as a legit hard disk. What that flash drive then becomes is an actual install of Linux that is not in live mode. It's actually booting, it's writable, it's every bit a real Linux installation on the flash media. You don't have to deal with the "try or install" prompt or any of that other stuff; it's already installed on the flash drive. So at that point you have a legitimate hard drive. Yeah, it could be a little bit slower, but you can install or build whatever software you had running on the internal disk and see if that works. Then you can take that flash drive out, with your software and settings on it, put it in another server, and see if it works better there, because if it does, then there might be something specific to the original server.
Yes, absolutely. And what's next on our list here?
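The file-recovery workflow described earlier (copy everything off the old server from a live session, then verify the data before you wipe or sell it) depends on the verification step actually happening. One hedged way to sketch that step in Python is to hash every file under the source and compare it with the copy. The function names are illustrative, and the sketch assumes both directory trees are mounted and readable from wherever you run it.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_dir, dest_dir):
    """Return relative paths of files that are missing or different in dest_dir.

    An empty list means every regular file under source_dir has a
    byte-identical counterpart under dest_dir.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue  # skip directories and other non-file entries
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems
```

This only checks in one direction (source against copy), which is the direction that matters before you wipe the source. Extra files that exist only in the destination are not reported.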
So there are going to be a lot of smaller things to mention, one of which (and I have a video coming out that goes over logging) is: tail the syslog. It's like the ultimate thing when you are troubleshooting anything. Just tail -f /var/log/syslog or /var/log/messages, depending on what distro you have, and you can watch the messages scroll as you're trying to reproduce the problem. It's the number one way that I troubleshoot SSH. People complain about this: "I can't connect to the server, and SSH is not even giving me an error message or anything that really tells me why." It could tell you why, but SSH will tell you that your password is wrong when your password was not the issue and it was the key. There are all kinds of oddities there. I think the issue is that SSH doesn't want to give you too much information about why you can't connect, because if it's a bad guy, then they're getting information about why their attempt isn't working. So it's a good thing that it's not giving that info. But on the other end, you could, in a web console window, just tail the log and try to SSH from your terminal. The logs, especially /var/log/secure, or the auth log in Debian, will give you information about SSH. For everything that's not SSH or security related, just tailing the syslog or /var/log/messages will give you some information that'll help you understand what kind of problem you might be running into. You could check dmesg as well. That's the kernel ring buffer, and there might be some information in there too.
Another thing that's not as well known, because this is a problem that doesn't happen often, but I think homelab people would be more likely to run into it than the average user: I'm talking about inodes.
This is not something that people really think about first, because... Yeah, that's an interesting one too, because this was a topic I just brought up when I was talking about the new way XCP-ng backs up. I'm glad they added the warning on there, because they're doing data chunking into lots of small files to make merging deltas faster. With that chunking, putting it all in a bunch of small files, they warn you: if you're not using ZFS or Btrfs, don't do this. If you're on ext4, it only takes X amount of data to completely overrun the inodes, so you're just going to have a real problem.
For sure. I remember very early in my career how frustrated I was when I ran into this. Here's what the symptoms look like. You have an error that the disk is full: you go to write a file and it says insufficient free space or something like that. Then you do what everybody does, df -h, and you look at it, and I'm like, "It's 20% full. It's literally got 80% of its space free, so it's not full. What's going on here?" Then, after a ridiculous amount of researching, I stumbled upon this: if I type df -i instead of -h, I'll see how many inodes are in use and the maximum number of inodes that can be used. And like you're saying, it takes a ginormous number of files to fill up the inodes. This is not something you'll run into unless something is really wrong. For example, a mail server is probably the big one. You shouldn't be running your own mail server, but if you are, and you have a bunch of error messages, like Nagios would be sending out, that are constantly queuing up and duplicating over and over again, then that could cause it.
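The symptom described here (writes fail with "disk full" while df -h shows plenty of space) comes down to comparing two separate counters: blocks and inodes. As a rough sketch of the same check that df -h and df -i let you do by eye, the Python below reads both counters with os.statvfs, which is available on Unix-like systems only. The function names and the 95% threshold are assumptions made for the example, and the percentages are computed slightly differently from df, so expect small differences.

```python
import os


def usage(path="/"):
    """Return space and inode usage percentages for the filesystem holding path."""
    st = os.statvfs(path)
    blocks_used = st.f_blocks - st.f_bfree
    inodes_used = st.f_files - st.f_ffree
    return {
        # Some filesystems report zero totals (for example no fixed inode
        # count), so guard against division by zero.
        "space_pct": 100.0 * blocks_used / st.f_blocks if st.f_blocks else 0.0,
        "inode_pct": 100.0 * inodes_used / st.f_files if st.f_files else 0.0,
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }


def diagnose(space_pct, inode_pct, threshold=95.0):
    """Classify a 'disk full' error from the two usage percentages."""
    if inode_pct >= threshold and space_pct < threshold:
        return "out of inodes"  # plenty of bytes left, but no room for new files
    if space_pct >= threshold:
        return "out of space"
    return "ok"
```

For the situation in the story (about 20% of the space used, but every inode taken), diagnose(20.0, 100.0) returns "out of inodes".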
It's probably more likely to happen in a company, though, honestly. Anyway: df -i, if you have a situation where the disk is reporting that it's full, but you look at it and it's not actually full, at least not in the sense that you think of it as being full.
Yeah, and I'll also mention that, almost out of muscle memory, I'm going to type df -ih. The reason is I just want human-readable sizes. It makes the space side of it a little bit easier to read at a glance, because when you deal with larger storage it's hard to see just how many digits that is. It's easier to see, "Oh, that's a G there; we have that many gigs free."
And just to touch back on the log files: there's a great utility that's free and in most repositories called lnav, l-n-a-v. lnav is great. It gives you color-coded, real-time viewing, plus regex matching, of any of the logs you put into it, and you can even put multiple logs together into one lnav view. So if you wanted to watch two log files simultaneously in real time, you can type lnav and the two different log files, and it will consolidate the view. I've done a whole video on it. It's a really impressive tool for being able to not just see the logs or search the logs, but also watch them in real time as something's happening. It's an invaluable troubleshooting tool when you have log problems to go through.
Another thing I'll mention about logging (this is thanks to Michael in the chat room, because I can't believe I forgot; actually, I can believe it, because I started using Linux way before systemd was even thought of, so I'm still not conditioned to think of doing everything the systemd way, but it is the way it's going). And no, I'm not trying to start a flame war.
Believe me. But journalctl -f: -f is follow, so you can follow the log, which is great. You can also use the -u option to follow a specific unit. So if you wanted to follow SSH, you could use -u and then ssh. And it's easy to remember, because, come on, we've probably all done this when nobody's around: we're trying to get something to work and we get agitated. "F you!" We get upset. So, excuse me, journalctl -fu. It's so easy to remember, and then the unit name after. Obviously you could give -u and the unit name first and -f afterwards, but it's not as fun. If you do -fu and then the unit name, like sshd or whatever it is, you'll follow that specific unit's messages and you won't see messages that are meant for a different service, so that can help you narrow things down. On Ubuntu it could be journalctl -fu apache2, or httpd if you're on CentOS, and so on. So that's the systemd equivalent of tail -f.
Yeah, and I'll give a shout-out to Cat Daddy for duf, d-u-f. That is prettier than that. I'd actually not used it before. I knew because it wasn't installed on my system; I just installed it really quick, and it was in the repository. duf is a more enhanced, prettier version. I know I've seen it before at some point; I didn't know what people were using when they showed that. It's a really nice, kind of ncurses-drawn version of the df command, so that's clever. We're going to come to a few more utilities...
Oh, I'm sorry. I was going to mention, when it comes to all these disk utilities for free-space-related shenanigans, the interesting thing is you have to install them before it becomes a problem. If your hard disk is completely full, then you're not going to be able to install duf, because it's full. So obviously your inability to install duf answers your question: yes, the hard drive is full.
Yeah. Problem solved, right?
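Stepping back to the log-following tips: tail -f and journalctl -f both do the same basic thing, which is keep the log open and print new lines as they arrive while you reproduce the problem. The Python sketch below shows that mechanism for a plain log file, plus a keyword filter loosely comparable to narrowing down to one service. It is a simplified illustration: the names follow and matching are made up here, it does not handle log rotation, and it reads a file rather than the systemd journal.

```python
import time


def follow(handle, poll_interval=0.5, max_idle_polls=None):
    """Yield lines as they are appended to an already-open text file.

    The caller opens the file and seeks to the end first, so only new lines
    are reported. max_idle_polls=None means follow forever, like tail -f;
    a number makes the generator stop after that many empty polls in a row.
    """
    idle = 0
    while True:
        line = handle.readline()
        if line:
            idle = 0
            # Simplification: a line caught mid-write is yielded as-is.
            yield line.rstrip("\n")
        else:
            idle += 1
            if max_idle_polls is not None and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)


def matching(lines, needle):
    """Keep only the lines that mention needle, e.g. 'sshd'."""
    return (line for line in lines if needle in line)
```

A typical use would be to open a log such as /var/log/auth.log, call handle.seek(0, 2) to jump to the end, and then loop over matching(follow(handle), "sshd") in one terminal while attempting the SSH login from another. Reading those files usually requires root.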
But, you know, give me the error message you'd expect, though. Considering what you, Tom, have added to the list, which I'm going to let you own, that's another one you'll want to install before it becomes a problem, because you can't install it if you have no space to install it in. But it'll help you with your free space.
Yes: ncdu. There are utilities that are graphical, with a GUI; this one actually has some graphics to it. I was just using it today because I had to solve some Elasticsearch problems where one particular index was too big, and I wanted to delete stuff from the command line. Using this on the command line, it can analyze all the directories and tell you how much data is in each one of them. When you have, I don't know, 600 of them on a server that's storing a bunch of Elastic data, and you just want to purge one particular index because something broke and that's the fastest way to fix it, it can help you quickly find the offending folder that had 400 extra gigs of data for no reason when most of the other ones have a gig of data. So it's a really nice utility for hunting down and figuring out what's taking up all the space on a per-folder basis, per-directory basis, however you want to call it, and drilling down into it. It also trees down as you go into each of these. It's a really simple utility, but boy is it handy. I also had a weird Docker problem where Docker didn't delete a bunch of stuff, and it didn't show up in Docker, but I used ncdu and found all the extra files: the database, when something had failed, had an extra replicated copy on it.
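The core of what ncdu does in the story above (total up each directory and put the biggest offender at the top) can be sketched in a few lines of Python. This is only an illustration of the idea, with made-up function names. It adds up apparent file sizes rather than actual disk blocks, and it silently skips anything it cannot read, so its numbers will not exactly match ncdu or du.

```python
import os


def tree_size(path):
    """Total the sizes of all files under path, skipping unreadable entries."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                # lstat: count a symlink itself rather than whatever it points to.
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or permission denied
    return total


def largest_entries(root):
    """Return (name, bytes) for each direct child of root, largest first."""
    totals = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                totals[entry.name] = tree_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                totals[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Running largest_entries on the parent directory of a few hundred index folders would put a folder with hundreds of extra gigabytes at the top of the list, and calling it again on that folder is the equivalent of drilling down in ncdu.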
It wasn't in use by anything, but it was definitely using up space. That could happen. Yeah, so it's some of the simplest utilities, but it's just really handy for sorting out space problems on Linux. We've spent a lot of time on that, but in real life you also spend some time on sorting out storage issues. Yeah, it's storage and DNS as the top contenders, with DNS being number one, obviously, because even the entire internet can break sometimes. It happens. Another thing to address here is this kind of weird problem to work through: you have an issue on your server and you want to troubleshoot it and try a few things, but you're also kind of scared of making the problem worse, like breaking it more, and then it becoming harder and harder to figure out how to fix it because you just made it worse. One thing to keep in mind is that if you have your server running in a VM, clone it, or take a snapshot. Depending on the data, if you had 100 gigs of data or way more, it's going to take a lot longer to clone, obviously. But if you clone a VM, and I'll do this on Linode and Proxmox both, I can just use all of my troubleshooting against that clone, and I don't care if I break it. I'll delete that clone and create another clone until I get it down to a science about what exactly fixed the problem, and I know exactly what to do on the production instance. Now, of course, you could take a snapshot all the same; that'll serve the same purpose. It doesn't have to be a full clone. But some companies out there will even do full clones just to put in some backup testing as well. So it kind of gives you an opportunity to test your backups: if you grab a backup that was taken after the problem started and you restore the VM, you know your backups are at least working, because you're able to restore it and the data is there. But if it has a problem, then you can just
troubleshoot it. And I think it is kind of liberating to have this feeling like, I could do whatever I want to the server. It's not my main thing; I can break it and that's fine. And then you just have no hesitation, and you might actually be able to fix the problem sooner because you have a contingency plan, just as long as you're not a bonehead like I've been sometimes, where you're actually on the production server because the PS1 prompt is exactly the same on both. Okay, you've got to be careful, right? But with a clone of your VM, if you're troubleshooting, you can absolutely get it down to a science as far as what caused your issue and what you need to do to the real one to fix it. Yeah, and I have a lot of snapshots that are frequently named "before Tom did a thing." Me too. Yeah, I just hurry up and snapshot something, run the command, and see if the outcome is what I had hoped or not what I had hoped, and if it is, great, I'll eventually delete my point in time that was before Tom did the thing. Yeah, that's really handy. It's one of the reasons virtualization is so popular. It isn't just because it makes a lot of things easier or makes more efficient use of your hardware; it also allows you to grab those points in time where you can say, this point in time was exactly before I did this command that I think will work. I think it will fix my Elastic indexes. Ooh, that didn't fix them at all; that seems to have caused a completely new problem. Funny you mention that, because I'll do a snapshot that's base install, one with updates installed, you know, kind of down different layers. I have so many snapshots that if a VMware admin looked at it, they would just get red-hot mad and there'd be steam coming out of their ears.
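The "before Tom did a thing" habit is easy to script. Below is a small sketch, assuming a Proxmox host where the qm snapshot command is available; the helper names, the VM ID, and the naming scheme are all invented for illustration, and the name is reduced to letters, digits, and underscores on the assumption that hypervisors tend to be picky about snapshot names.

```python
import re
from datetime import datetime


def snapshot_name(label: str, when: datetime) -> str:
    """Turn a free-form label into a conservative, timestamped snapshot name."""
    # Collapse anything that is not a letter or digit into a single underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_")
    return f"{slug}_{when:%Y%m%d_%H%M}"


def qm_snapshot_cmd(vmid: int, label: str, when: datetime) -> list[str]:
    """Build (but do not run) a Proxmox 'qm snapshot <vmid> <name>' command."""
    return ["qm", "snapshot", str(vmid), snapshot_name(label, when)]


# Hypothetical VM ID 100; nothing is executed here, the command is only printed.
print(" ".join(qm_snapshot_cmd(100, "before Tom did a thing", datetime.now())))
```

Putting the timestamp in the name makes it easier to clean up old points in time later, once the change has proven itself.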
"You need to get rid of those snapshots." I've heard that over and over again from VMware people. But snapshots are a very useful tool and I use them without hesitation. Yeah, and also the majority of my YouTube videos are actually the result of a VM. I record from real hardware, but before I hit the record button I rehearse on a virtual machine, and I have snapshots to test what's required and what's not, because I see a lot of how-tos out there that'll say you need to do these 10 things, and I do like six of those things, or I check the documentation from the project and it doesn't even have those things in there. So then I just keep trying over and over again until I get it down to a science, and then I record it on real hardware because I know what's going to work. So yeah, I mean, I take snapshots way too much, I think, but it works. Yeah. Next couple of utilities, and we're going to get into the networking side of things now. One I really like is just simple, but it gives you real-time stats from the command line. And yes, you can do this with a couple of utilities we'll get to, but bmon is really simple: b-m-o-n. It monitors your network and gives you a cool little graph to show you how much data is going across. It can also give you the bytes, collisions, errors, broadcasts, deliveries, fragmentation problems, a lot of little details about your TCP/IP stack. And when you're doing a bunch of goofy little troubleshooting with networking, it's kind of helpful to be able to see that. Next one on that list is going to be, and I've seen netstat, and someone said the replacement is ss.
Yes, use both. ss or netstat, both are solid when you want to see what connections are going somewhere and what connections are going back and forth. Those two are great for taking snapshots of that. There are a lot of scripts that probably have a lot of that integrated in there, especially netstat, because you need to know what is or is not connected. But let's go a step further and watch that in real time, filtering for connections: iptraf-ng is awesome. It lets you do that in real time, and you can SSH into a server and do this, by the way. When you do that, you can also create inverted filters, like "please ignore the SSH connection I have going into this," and then start looking for the connections you have in there. iptraf-ng is just really helpful. Let's say you want to see if a certain server has a TCP connection to a certain IP address, or maybe even just certain ports: what's connecting on this particular port that I have a service running on? And I want it to update on the screen in real time. It's all just driven through keyboard commands, or up and down arrows to go through menus, kind of an ncurses look, but it makes it really nice to go through those menus and go, all right, I can see this, or I can see what's hitting the ports. An easy example: maybe you're setting up a UniFi server and you want to understand what services are connecting to one particular port on there. Then you can start doing that and make a list of those IP addresses. That handful of utilities is just a combination of things I use a lot to burn through and understand the TCP connections going on, and what's going on with them.
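To make the "what is or is not connected" point concrete, here is a rough Python sketch of the kind of information ss and netstat report, read from the text format of /proc/net/tcp on Linux. It is not how either tool is implemented. It assumes IPv4 and a little-endian host (typical x86 and ARM systems), the state table is deliberately partial, and the SAMPLE text is a made-up illustration rather than captured output.

```python
import socket
import struct

# Partial map of the hexadecimal state codes used in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "08": "CLOSE_WAIT", "0A": "LISTEN"}


def decode_addr(hex_addr: str) -> tuple[str, int]:
    """Decode 'HEXIP:HEXPORT' (IPv4, little-endian host assumed)."""
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text: str) -> list[dict]:
    """Parse the contents of /proc/net/tcp into a list of connection dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        rows.append({
            "local": decode_addr(fields[1]),
            "remote": decode_addr(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
        })
    return rows


# Hypothetical sample: one socket listening on 127.0.0.1:8080.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue\n"
    "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000\n"
)
for row in parse_proc_net_tcp(SAMPLE):
    print(row)
```

On a real Linux machine you could pass in the text of /proc/net/tcp instead of SAMPLE and filter the rows by port, which is roughly the question ("what's connecting on this particular port?") being asked above.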
I'll also give a shout out, because bmon does a nice job, but if you just want to watch speed there's iftop. That's "interface top," kind of like top but for the interface. It'll show you how much bandwidth per IP. And then there's also speedometer, which will give you a speed reading for that network interface and has a kind of cool graph to it. And all this is on the command line. So these are all just really helpful when you want to watch some of the data going across. And netstat is a great idea. It's one of those things that is technically being deprecated, but in Linux that takes a very, very long time, so I think your advice will probably be good for another 10 years. I'll never understand why getting rid of legacy things is so hard in companies. I get it, because I know what the challenges are, but I still reach for netstat, and I haven't made the ss command muscle memory yet, no matter how many times I see articles out there telling me I shouldn't use netstat. You just install the net-tools package and it's there. So, I mean, as long as they make the package available to me and it's installable... I know, technically, as an educator I should be telling people not to use deprecated things, but honestly, in my opinion, the warnings are out there, and they usually have the warnings out there a very long time in advance. So if someone all of a sudden can't install it anymore and they're not ready, you know what, that's on them, because they should have paid attention to that warning. But while it's in the repository, it's muscle memory. It's really hard to break. Yeah, that it is. Now, there are plenty of other things to do, and I see people talking about more of them, but we'll move on to using iperf. I suggest this all the time, and it's going to be kind of related to another question someone asked about MTU. When you're doing some of these troubleshooting things, iperf is a great way to load up interfaces, and this works in Windows and works in Linux and works in FreeBSD. There are compiled builds.
I think there's, there's an Android one. I don't know if there's an iPhone one, but there's even an Android app for it. Wow. iperf is great. It's even built into pfSense. The overall idea with iperf is you set up one machine to listen as the server and the other to send to it as the client, and then you can test the line speed. Because before people start diving into "I can't figure out why these file transfers between my NAS and my computer aren't going faster," I always ask, and I rarely see anyone start with "I tried iperf and I'm getting a full connection between these two devices, and I'm not getting the speed I want on my NAS." It always starts with "my NAS is only transferring at X. I have a 10 gig connection, but it's not even getting to one gig." I'm like, what does iperf say? And there's always like a 24-hour response in my forum post before: "Oh, I did test iperf, and I had it plugged into the wrong port, because I couldn't get above a gig." I'm like, okay, now we know the problem. iperf is a handy utility for understanding whether or not you have solid connectivity.
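The listen-on-one-side, send-from-the-other idea can be illustrated with plain sockets. This Python sketch is not iperf and should not replace it; it runs both ends on the loopback interface of one machine purely to show the mechanism (memory to socket to memory, no files involved). The function names and the default transfer size are arbitrary choices for the example.

```python
import socket
import threading
import time


def _sink(server: socket.socket, result: dict) -> None:
    """Accept one connection and count the bytes received until it closes."""
    conn, _addr = server.accept()
    received = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            received += len(chunk)
    result["bytes"] = received


def measure_throughput(total_bytes: int = 50 * 1024 * 1024) -> tuple[int, float]:
    """Send total_bytes over a loopback TCP socket; return (bytes received, Mbit/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result: dict = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * 65536  # data comes from memory, never from disk
    sent = 0
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        while sent < total_bytes:
            n = min(len(payload), total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()  # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    server.close()

    return result["bytes"], result["bytes"] * 8 / elapsed / 1e6


if __name__ == "__main__":
    received, mbit = measure_throughput()
    print(f"received {received} bytes at roughly {mbit:.0f} Mbit/s over loopback")
```

To test a real link, the receiving half would run on one machine and the sending half on the other, which is exactly the job iperf already does, with far better measurement and reporting.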
It does not write any files, so you are not limited by the file system. It is doing just a raw network socket, basically, to go from point A to point B and see how fast we can get there. And if you can't get there at full speed, you have a problem. That's kind of related to the MTU problem. When you think about the way MTU works and the way you're doing the chunks, if the switch has a misalignment, so to speak, you set a 9000 MTU, but then your switch actually requires you to set 9000 plus a few extra bytes for the header, and that can cause a problem because you've got some extra VLANs in there and so it's offsetting it. You'll run into an iperf problem right away when you start connecting devices and realize something's dropping some of the packets. You now have fewer packets getting through, so there are lots of retransmissions going across this line, and retransmissions take up bandwidth, therefore you do not get the full bandwidth. iperf just makes a lot of that troubleshooting and tuning really easy to do. It's like the common question I ask all the time: did you run iperf? Because I do a lot of NAS troubleshooting and a lot of VM troubleshooting. iperf is even built into XCP-ng. I believe it's probably there by default or easy enough to install on Proxmox. It's also in TrueNAS. So it's already on the platforms you use. And then, to go a step further, you can load it up on your phone, because the next question is, how fast is the Wi-Fi? Well, it depends where you're standing, and iperf can make that pretty easy because you can have a fixed number: you know how fast iperf is, and you keep moving between fixed points and seeing how fast the data traverses at full speed across the access points.
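The "9000 plus a few extra bytes" remark comes down to simple arithmetic. The sketch below uses the standard Ethernet numbers (14-byte header, 4-byte frame check sequence, 4 bytes per 802.1Q VLAN tag); the function names are invented, and whether a particular switch counts its limit as MTU or as full frame size is something to check in its documentation rather than assume.

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
FCS = 4          # frame check sequence (CRC) at the end of the frame
VLAN_TAG = 4     # each 802.1Q tag adds four bytes to the frame


def frame_size(mtu: int, vlan_tags: int = 0) -> int:
    """Bytes on the wire for a full-size frame carrying an MTU-sized payload."""
    return mtu + ETH_HEADER + FCS + VLAN_TAG * vlan_tags


def fits(mtu: int, switch_max_frame: int, vlan_tags: int = 0) -> bool:
    """True if a full-size frame fits within the switch's maximum frame size."""
    return frame_size(mtu, vlan_tags) <= switch_max_frame


print(frame_size(9000))               # untagged jumbo frame
print(frame_size(9000, vlan_tags=1))  # same payload with one VLAN tag
print(fits(9000, 9018, vlan_tags=1))  # limit sized for untagged frames only
```

In that last case the tagged frame is four bytes too large, which is the kind of mismatch that shows up as drops and retransmissions in an iperf run.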
So iperf is an easy, free utility, it's been around forever, it works on all the platforms, and I just find it really, really handy for doing all the network speed troubleshooting. You know, it's funny. Before I discovered iperf, which I wish I'd known about before I did, my go-to was having the larger version of the CentOS ISO downloaded, the DVD, and I would just send it from one computer on the LAN to the other, in verbose mode, and just watch how fast it goes, which is not an accurate way to do it. But before iperf that was totally my go-to: just send a big CentOS image across the network and see how fast it gets there. But don't do that. Use iperf. Don't do that. Yeah, iperf just makes life simple. Now, to go a step further, this is a little bit more in depth, but definitely worth it if you want to play with lots of tuning things, and I've used this plenty of times in my videos: that's Phoronix. The Phoronix Test Suite is amazing if you want a test suite where you can have it all graphed and mapped out for you, so you don't have to open up your own spreadsheets to do this. Phoronix just has an amazing set of utilities. It's used by the industry as a whole. You'll see not just YouTube channels but lots of people, ServeTheHome and all those people, using it as one of their tests.
They'll run the Phoronix lab test. It's just great because you have a repeatable, absolutely consistent test. You can even script it. You can go further, and it has a web interface if you want to get complicated with it. But just using the command line with Phoronix: download the suite, load it up on a server, run the commands, and document, make notes I should say, of just which commands you ran, so you run it consistently. They have a test suite that'll test everything from processor to memory. They have SQL simulators, Apache simulators, so you can simulate the entire type of workload. And this is used commercially, I know, by a lot of companies before they build servers, because they kind of determine what this server is going to be set up for. Oh, this is going to be a SQL server? All right, let's go ahead and simulate a SQL server on this. Then you run the command set on there and it creates an output. Each time you run it, it'll ask you, would you like to upload these results? You just keep saying yes, and for each run you do, you can make notes of what you changed: changed this configuration.
So you're running the same test. The important part is you're creating a consistent baseline of tests, and then for each run you make a note of your change. For example, I did my iSCSI versus NFS comparison. Nothing was different; the command set was the same. The only thing different was the VM was stored on an iSCSI target for one and an NFS target for another, and I ran each one of those sequences of tests. This is how I get those different performance numbers. Having that consistency you get from something like Phoronix allows you to tune, because you can't really tune without a consistent baseline test that you run perfectly the same. You can't just say "it feels faster now," because that's the kind of fuzzy way of doing it. You want something that's extremely objective. And I thought about maybe doing a video on how Phoronix works, because I don't think anyone has. But it's free, it's open source, it's easy to get access to, and it is a wrapper in some ways for a lot of other utilities, because it uses fio under the hood and a few other things. It even has a kernel compile benchmark, and if I'm not mistaken it actually loads GCC and compiles to make it work. So, just to address a comment in the chat room, because I feel like a lot of people are going to have this same question: it's P-H-O-R-O-N-I-X. Yes, because when you hear someone saying it, it doesn't sound like it, but that's what it is. Absolutely, and I've used it too. It's really good. Now, it does share the name with the news site Phoronix, and yes, they're the same people who produce it, because they do a lot of Linux testing and they built this utility to have a consistent baseline for their reviews. So the Phoronix news site is not coincidentally named; the test suite is named after and maintained by them. They do a lot of testing there, which is also really cool.
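The "consistent baseline, then change one thing" approach can be sketched as a small comparison in Python. This is not part of the Phoronix Test Suite; the function names are made up, and the numbers are hypothetical placeholders rather than results from the iSCSI versus NFS testing mentioned above. It assumes a higher-is-better metric such as throughput.

```python
from statistics import mean, stdev


def summarize(samples: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) for repeated runs of one test."""
    return mean(samples), stdev(samples)


def percent_change(baseline: list[float], changed: list[float]) -> float:
    """Percent difference of the changed configuration relative to the baseline."""
    base = mean(baseline)
    return (mean(changed) - base) / base * 100


# Hypothetical MB/s figures from three identical runs per configuration.
baseline_runs = [410.0, 405.0, 415.0]
changed_runs = [452.0, 448.0, 450.0]

base_avg, base_sd = summarize(baseline_runs)
new_avg, new_sd = summarize(changed_runs)
print(f"baseline {base_avg:.1f} +/- {base_sd:.1f}, changed {new_avg:.1f} +/- {new_sd:.1f}")
print(f"change: {percent_change(baseline_runs, changed_runs):+.1f}%")
```

If the difference between configurations is smaller than the run-to-run spread, that is a hint the change did not really matter, which is the objective check that "it feels faster now" cannot give you.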
The results database is public. By default people make it public. So you can look at something the Phoronix news site tested and compare your benchmarks to theirs, and that's one of the things it'll give you stats on, so you can see how you're doing. Getting baseline averages, especially for drive performance, is very tricky to do with a lot of setups, and with Phoronix, when you're trying to say, all right, is it as fast as this array or that array when you're testing things, it gives you some really good numbers. It's an involved utility. If enough people message me or DM me on Twitter or something and say, "Tom, can you make a video about it, because it seems kind of hard, because the documentation is sparse," ... but it's not hard. If you play around you just kind of figure it out, if you're familiar with running command-line stuff, once you get the nuances of it. And they do have forums and support as well. So, I wanted to take a quick moment and mention more about SpinRite, because I meant to mention, at the same time I was talking about Xubuntu and Rescuezilla, that it was another one to keep on hand. Now, what SpinRite is, is a way of triggering the error correction on the drive to do the job that it's supposed to do anyway. Because, let's be honest, the fact that SpinRite exists is great because it's useful, but it shouldn't have to exist, because this is the job of the hard drive, to be doing the error correction. But the error correction of the disk firmware is just not that great. So what SpinRite does is this: all hard drives have bad sectors, and in SSDs we have bad cells. If you have a situation where you can't read data from a drive, what SpinRite tries to do is it tries
brute force, almost, to read the sector over and over and over again. It's like a three-year-old that's asking the same question over and over again. It's asking the hard drive for that data over and over, and the hard drive is answering, "I can't read this, I can't read this." The hard drive is supposed to have said, a long time ago, "There's an error with this sector, let's map it to a good sector," but it doesn't do that as well as it should. So SpinRite is just a way to force the drive's firmware to do what it's supposed to do. The situation that it helps in is of course file recovery, but it also kind of helps with, I think they call it disk rot, where things are slower to load, but they do load. It's just that over time there's some wear. It actually forces the data to get mapped to a better area, and it can make a slow install actually feel fast again, and all you did was run SpinRite on it. Just make sure it's on physical hardware. You don't need to run this on VMs. That'd be crazy. Don't do that. Yeah, but you can run it on the hypervisor, on the actual physical hardware. You can run it on SSDs at level two. I'm pretty sure, if I remember correctly, level two is what you want, because some of the modes will take weeks. I mean, that's for when you have data that you really need and you didn't have good backups, and you don't mind having this thing run for weeks to have like a five percent chance that it might read the data and you get that data back. But just running it every year on your physical hardware makes sense. And if you run it on brand new hardware, that might help as well. It's not free, by the way; I want to throw that out there. It's not free. It is slow. There's a faster version coming out. It's been in development for like 10 years. It'll come out one day, promise.
Well, actually, it's not for me to promise. It's made by Steve Gibson of GRC. He's one of the two hosts of Security Now, one of my favorite podcasts. But I've used SpinRite a lot and I like it, and it's saved some... it's actually done a lot of good. So it's just one of those things I like to have around. I'm not saying to go buy it, because what if you don't need it? But if you run it every year, I think it will actually add value. So even though it's not free, it's something to consider. Yeah, Gibson's put a lot of work into it. His new version is a lot faster. Yeah, his new version isn't out now, unless it is. I think you get the beta if you buy it, if I'm not mistaken. Oh, okay. Well, that's interesting. I'm still waiting for the final release. Yeah, I'm waiting for the final release. But it's one of those things that really shouldn't help, right? Because, again, hard drive firmware should be doing this stuff and it shouldn't be an issue. But unfortunately SpinRite exists because the hard drives themselves just don't do a good job with this kind of thing. So that's why I wanted to recommend that. Yeah, it's worth mentioning, and I guess I'll mention this too because it's kind of related to SpinRite: TestDisk. I don't know if you've ever used it. It's a command-line file recovery utility. It's one of those challenges sometimes where people lose things, and TestDisk has some ability to try to recover it; SpinRite can try to recover it. Just back everything up. I know we're saying that, and everyone's like, "Well, cool, I wish I would have heard that before I lost my data." But recovering data is always really, really tricky, so I'll throw that out there. And SpinRite can sometimes save a drive, provided it hasn't gone too far. Exactly. If your BIOS can't read the disk, then SpinRite cannot help you in that situation.
Yeah, it does not recover from catastrophic failures. Right, if it catches on fire, it's not going to help. Now, TestDisk, was that the one that people use for when SD cards start getting corrupted? It's been a minute since I've used it, but over the years we used to have a lot of photography companies, and, I mean, I would suggest they don't do this, but they would constantly reuse cards, and they'd buy the cheapest flash cards they could, whatever they could find on sale. And then they would lose a wedding, and it was just a shame. My opinion is, if you're spending that kind of money on a wedding, just buy a dedicated $30 SanDisk. You can buy one for 30 or 40 dollars; they're just not that expensive. But nonetheless, if you buy the cheapest one and you reuse it many times, there is a limit, and with nice high-res, high-quality cameras you're writing a lot to these. So you occasionally would lose them, and sometimes it would lose the file table. The good news is tools like TestDisk do a simple thing where they walk the disk, identify file types, and reassemble based on the file information that can be reassembled. It does pretty much consistently lose the metadata about the file, such as its name, but when it comes to photos, for example, that's not as relevant. You just want to find all the photos, and then you can organize and rename them as you see fit, like so-and-so's wedding. Exactly. Yeah, and I feel like it's an okay utility. It's not the most amazing, but hey, it's something, and sometimes you can get some files back with it. Considering how many people within our audience are using Raspberry Pis, I think that recommendation might go a lot further than you might think, and I know we have some photography people in our audience. I'm one myself. But yeah, I mean, Raspberry Pis.
I mean, it's probably more of a question of who doesn't have a Raspberry Pi in our audience, and the people that don't are probably the ones that are unable to find them because of COVID prices right now, and would have one otherwise. So that recommendation could actually go pretty far. All right, I think we've reached the end of our utilities list. Mm-hmm. Sure have. Plenty of stuff for people to try, plenty of stuff for people to troubleshoot. It's best to be familiar with these tools before you need them, so play around with them. iptraf is just kind of fun because you may discover something on your network you didn't know was communicating. The recovery utilities, maybe not, but Phoronix, boy, it's fun to benchmark things. I've played with that a lot. Some of these are just kind of fun because of the tuning and troubleshooting and figuring out the changes you made and whether something was better. So get out there and play with all these utilities we mentioned. They're a lot of fun. I'm actually going to, this time, put them into the show notes. They'll be in the description, which will follow over to the show notes, so we will have those in there for those of you that didn't hear some of the commands or just want to copy and paste the commands and have fun with them. So thanks, everyone, for listening. Looking forward to seeing everyone next time. Thank you.