Today we're going to be looking at file recovery. When you have a file on your hard drive, flash drive, or SD card, there are really two pieces to it. One is an entry in the file system that tells your operating system that this is a file and where its data lives. The other is the file's own contents, which usually start with a header that says what type of file it is, followed by a bunch of data containing the information of the file itself. That file system entry is important because, as I said, it tells your operating system this is a file. When you delete a file, basically it just removes that entry, and the rest of the information is still there: what type of file it is and the actual data of the file. So if you ever accidentally delete a file, you can often recover it as long as you act quickly.
Now the first thing you have to do when recovering a file is unmount whatever drive it's on, whether that's your internal drive, an SD card, a flash drive, or an external USB drive. Disconnect it from the computer until you're ready to do the recovery, because any time a drive is connected and mounted, the operating system can and most likely will be writing little bits of information to it, and at any point your data could be overwritten. The operating system doesn't see that data as being there; it sees it as empty space even though it's really not, all because that entry saying this is a file has been removed.
Now there are programs out there that will allow you to easily recover many file types. The one that I normally use is called PhotoRec, and in most repositories it's in the testdisk package, so you can look through aptitude or whatever package manager you use and look for testdisk. Once you install it you can run PhotoRec and tell it what drive to look at. Again, do not mount it.
So if it's a flash drive, plug it in, and hopefully you're running an operating system that isn't going to mount it automatically, the way Windows, for example, will mount it and start searching through whatever files are on it. You definitely want to avoid that. And if it's your internal drive, you might want to boot a live CD, because you don't want to be running the operating system off the hard drive that you're trying to recover from. Another thing is you need some place to put the files you're going to recover. You don't want to write them to the same disk, because again, you might overwrite the file you're looking for.
Now you'll also need a lot of space. In today's example I'm going to use a one gig flash drive, and the reason is that, since your operating system doesn't know where the file is or that it's even a file, PhotoRec is going to go through the entire drive looking for the particular file type that you tell it to look for. So if you're looking for, say, a PDF on a hard drive that's 100 gigs or a terabyte or something large like that, I'm sure at some point you've had a lot of PDFs on there, and it's going to find all of them and need some place to put them, because it doesn't know which one you're looking for.
Programs like PhotoRec are great. They make it really easy to recover lost or deleted files, especially if you have an SD card or a USB drive from a camera that you accidentally deleted some photos off of, because you can just find a place on your internal hard drive to store what it recovers. But again, if it's your internal drive and it's a very large drive, it's going to find a lot of files and it's going to need a place to put them all. So today I'm going to show you a slightly more advanced technique using basic tools. Basically, we're going to be using grep.
Luckily, in Linux and basically all Unix and Unix-like operating systems, hardware is always accessible as if it were a file itself, and that includes hard drives and the partitions on those hard drives. So you can grep through a hard drive, the raw data on it, whether it has partitions or not, look for certain strings, and pull out information at that point. So although using a program like PhotoRec to recover files is a lot easier, especially with larger files or larger file systems, drives, and partitions, I'm going to show you a way to narrow things down.
You're going to need to know some information about the file you're looking for so you can search for it, but that's what you use grep for. You find something about that file that is unique to that file, so you can find where it is on the hard drive, and then you can pull that information off. Now, you may not know where the file begins and where it ends, so you'll pull off a larger chunk than need be, but it's going to be a lot smaller than what you would get from PhotoRec, which would be grabbing a whole lot of stuff. Then I'm going to show you how to pull the file out of the data that you pulled off the hard drive. In this example we're going to be looking at a PDF file; text files are even easier.
Now, we're also assuming, one, as I said, that you know something unique about that file you can search for, and two, that the file isn't fragmented. We need it to be in one place on the hard drive. So if you're trying to recover a large file like a video, there's a higher chance of it being fragmented, which will make this process a little bit harder. But something small like a simple PDF, which is what we're going to recover today, will hopefully all be in one section, and you can pull it right off and then find where the beginning and end of the file are. So that's what we're going to be looking at today.
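To make the "hardware is a file" idea concrete, here is a minimal sketch that reads raw bytes from a path with dd. It uses a temporary file as a stand-in for a device node, so the name FAKEDISK and the 1024-byte size are made up for the demonstration. On a real system the path would be something like /dev/sdc1 and the command would normally need sudo.

```shell
# Stand-in for a device node such as /dev/sdc1: an ordinary file holding
# a short made-up label followed by filler bytes.
img=$(mktemp)
{
  printf 'FAKEDISK'
  head -c 1016 /dev/zero
} > "$img"

# Read the first 8 bytes the same way you would from a real device.
first=$(dd if="$img" bs=1 count=8 2>/dev/null)

# Total size in bytes (tr strips padding some wc versions add).
size=$(wc -c < "$img" | tr -d ' ')

echo "first bytes: $first"
echo "total size: $size bytes"
rm -f "$img"
```

The same dd invocation works unchanged when the path points at a block device, which is why ordinary text tools such as grep can be aimed at a drive later on.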
Again, it's a little more advanced than just using something like PhotoRec. If you're just trying to recover some photos or music or documents off a smaller drive, and you have a larger drive to store them all on, PhotoRec, which is in the testdisk package, is a great way to go. But this is a slightly more advanced option using built-in tools. grep is on pretty much every system already, even small ARM Linux distributions, and you can use these basic tools to narrow down your search. It will still take a long time because it still has to search the entire drive, or at least until you find the file, which may be at the end of it. And today we're actually going to be using a 2 GB flash drive. I think I said 1 GB earlier. It's a 2 GB flash drive, so still a lot of data on there, but it'll only take a couple of minutes to search through the entire drive and find what we're looking for. So let's dive into the tutorial now. If you enjoy my tutorials and would like to see more, please think about contributing to my Patreon account at patreon.com/metalx1000.
Okay, here we are. We're in a directory where I have a 2 GB flash drive mounted, and right now there's only one file on that drive. If I list it out, you can see a PDF file here. I will display it so you can see it. It's just notes from last month's tutorial. So let's say I'm working on this drive and I accidentally remove that file. It's telling me it's write protected, which is good, but let's sudo remove that. Okay, so I deleted that file. Let's say, oh no, I deleted a file I need. So the first thing we need to do is unmount that drive. That's extremely important. Unmount it and don't mount it again until after you've recovered the file. So it should be umount. There we go, it's unmounted. We can double check by saying mount. It's not there anymore. Ctrl+L to clear the screen.
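The "double check by saying mount" step can also be scripted. The sketch below is only an illustration: is_mounted is a hypothetical helper, not a standard command, and it runs against a made-up mount table in a temporary file instead of the live system. It assumes a Linux-style table such as /proc/mounts, where the device is the first column.

```shell
# Hypothetical helper: succeeds if the device appears in the first column
# of a mount table. The table defaults to /proc/mounts (Linux); a second
# argument points it at another file, which is what the demo below does.
is_mounted() {
  awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Made-up mount table standing in for the live system.
table=$(mktemp)
printf '%s\n' \
  '/dev/sda1 / ext4 rw 0 0' \
  '/dev/sdc1 /mnt/flash vfat rw 0 0' > "$table"

if is_mounted /dev/sdc1 "$table"; then state_before=mounted; else state_before=unmounted; fi

# Simulate the effect of "sudo umount /dev/sdc1" by dropping that entry.
grep -v '^/dev/sdc1 ' "$table" > "$table.new" && mv "$table.new" "$table"

if is_mounted /dev/sdc1 "$table"; then state_after=mounted; else state_after=unmounted; fi

echo "before: $state_before, after: $state_after"
rm -f "$table"
```

On a real machine you would call is_mounted /dev/sdc1 with no second argument after running umount, and only carry on with the recovery if it reports the device as not mounted.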
And okay, so as I said earlier, in Linux and other Unix-like and Unix-based operating systems, hardware is accessible as if it's a file. So as you probably already know, if we look inside our /dev folder, we're going to have a couple of files, usually /dev/sd-something, but it might be hd depending on what operating system you're running, or it might be something else. Here we go. The one I just had mounted was this one, sdc, and we also have a partition, sdc1. Since there's only one partition, it doesn't matter which of those we search. If you had multiple partitions, you might be able to narrow down your search by searching just the one partition. In this case, what we're going to do is start looking at the raw data, and we can read it while it's unmounted and not worry about anything being written to it.
So the first thing I want to point out is that it's raw data. It's binary data, even though there are ordinary strings in it. So if I were to cat out /dev/sdc1, and I probably need sudo to do that, we'll hit enter and you can see raw data starting to be printed out. The first part is boot information on that drive. So we want to clean that up a little bit. I think I've shown in previous tutorials the strings command, which will find all the printable strings, the ASCII characters, in a binary file. So we can use that: we can say strings /dev/sdc1 and hit enter, and again use sudo to do that. We're always going to need to be sudo or root for this because we're working with the raw data on the device. You can see it starts looking through the drive, and there's a lot of information on there even though we only had one file on it. I've had files, operating systems, and live CDs on there, so it's going to see all of that information that hasn't been overwritten. I'm going to Ctrl+C to stop that and Ctrl+L to clear the screen.
If you ever do accidentally cat out binary data and it messes up your terminal, you can always type reset and it will usually reset your screen for you. So now we need to search for something that we know about the file I'm looking for. If you remember, when we looked at it, it was my notes for Xorg on Android. So the first thing we can do is use sudo grep, and since it's binary data we're going to say -a; otherwise grep will just tell you it's a binary file that matches. -a tells grep to treat the binary data as if it were text. And I'm going to say Android. So we're going to start searching for Android, because I know that word is important to that document, in the title and all that stuff. And we will start searching.
Again, this is a two gig drive. I don't know exactly where on this drive the file was written, so it might take a couple of minutes to search through and hopefully it will eventually find the word Android. Now, I can tell you right now that in this particular case I know I had other files on there with Android in them, so it's going to find more than one instance. So we're going to want to narrow down the search. Just Android would be my first search, but since I already know I'm going to find more things, just to save time I'm going to Ctrl+C to stop that. That first search would hopefully give me something I can narrow it down by, and in one of the lines it finds I would see Xorg on Android. So I'm going to search for that instead and hit enter, and it's going to start searching again. Now it's looking for a longer string, so it's narrowing down the search. It's not just looking for something that says Android. It's looking for Xorg on Android.
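Here is a small sketch of that narrowing step, run against a temporary image file instead of a real device so it is safe to try. The leftover text lines in the image are invented for the demonstration. On a real drive the target would be something like /dev/sdc1 and grep would need sudo.

```shell
# Stand-in image: filler bytes plus two made-up leftover text lines.
img=$(mktemp)
{
  head -c 2048 /dev/zero
  printf '\nAndroid SDK install notes\n'
  head -c 2048 /dev/zero
  printf '\nXorg on Android\n'
  head -c 2048 /dev/zero
} > "$img"

# -a: treat binary input as text; -c: count matching lines.
# LC_ALL=C makes grep work on raw bytes regardless of locale.
broad=$(LC_ALL=C grep -a -c 'Android' "$img")
narrow=$(LC_ALL=C grep -a -c 'Xorg on Android' "$img")

echo "lines matching 'Android': $broad"
echo "lines matching 'Xorg on Android': $narrow"
rm -f "$img"
```

The longer phrase matches fewer lines, which is the whole point: the more specific the string, the less unrelated leftover data you have to sort through.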
And again, even though the file is deleted, the information will hopefully still be there. Because I unmounted right away, the chances of it having been overwritten are low. So now we're just waiting, and I'm going to pause the video until it finds it. That would probably be a good idea. Okay, it took probably another 30 seconds after I stopped recording to find it, and then I killed it. I could have let it keep going. I'm going to assume that this is the only file on the drive that says Xorg on Android and stop it there. If I thought there might be other ones, I'd want to go through them to work out which one is the one I'm looking for.
So now that we know we have found at least part of that file, we want to take that section and put it into another file. So let me go to a different drive. I'm going to go to my temp directory and make a folder. I'll call it arec, for Android recovery, whatever. Okay, so different drive. This is my internal drive. I'm searching the flash drive, so I'm not writing to the same drive. We're going to run almost the same command as before. Before, we grabbed one line from that file. What we want to do now is grab a bunch of lines before and a bunch of lines after, because we don't know exactly where the file begins and ends. It's a relatively small file, so I'm just going to guess: I'm going to say -B 10000 and -A 10000, and we're still going to look for the same string on the same partition, and I'm just going to redirect that into a file called recovery. You can give it an extension of some sort if you want. It doesn't really matter. I'm going to hit enter here. Now I'm going to wait until that has completely run. I can also open up another pane of the window here. I didn't realize I had one open already. So that's running, and I'm in the same folder here in the other pane.
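The before-and-after context trick looks like this on a tiny scale. This is a sketch with invented data and a context of 2 lines instead of 10,000, run against a temporary file standing in for the partition.

```shell
# Stand-in for the partition: a fake PDF-like record surrounded by
# unrelated leftover lines (all contents are made up).
img=$(mktemp)
out=$(mktemp)
{
  printf 'older data 1\nolder data 2\nolder data 3\n'
  printf '%%PDF-1.4\n'
  printf 'Title: Xorg on Android\n'
  printf '%%%%EOF\n'
  printf 'newer data 1\nnewer data 2\nnewer data 3\n'
} > "$img"

# -B 2 and -A 2 keep two lines before and two lines after the match.
LC_ALL=C grep -a -B 2 -A 2 'Xorg on Android' "$img" > "$out"

lines=$(wc -l < "$out" | tr -d ' ')
first=$(head -n 1 "$out")
last=$(tail -n 1 "$out")

echo "captured $lines lines, from '$first' to '$last'"
rm -f "$img" "$out"
```

The captured chunk deliberately overshoots the file on both sides. Keep in mind that a "line" in binary data is just whatever sits between two newline bytes, so the amount of data 10,000 lines covers on a real drive is unpredictable, which is why the result has to be trimmed afterwards.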
I'm going to list out my files, which is only one, and you can see right now it's zero bytes, so it's found nothing. If it's a larger hard drive, once I see this file size get big and stop growing, I can kill the grep command because I pretty much have my file. So let me do this again. Still zero. It hasn't found it yet. So let me explain this command though. sudo because you have to be root. grep to search. -a because we're telling it to treat the binary data as text. -B means grab 10,000 lines before, and -A means 10,000 lines after this string. So it's going to find that string and take 10,000 lines before and 10,000 lines after and put them in this file, because we don't know where that file begins and ends.
So let's go ahead and look down here again. Oh, you can see we have 2.4 megabytes, and if I check it again it's not going up in size. So at this point I can assume that it's recovered the file. So what I'm going to do is Ctrl+C and kill this command. If it's really important, you might want to just let it run. But again, it's only a 2 GB drive, so it'll take a couple of minutes. If you have a terabyte drive, you're probably going to want to stop beforehand. I could also start working with this file at this point and let that keep running, but it doesn't really matter here. So now, let's run strings on my recovery file, and you can see blah blah blah, and oh, I got other information here too. You can see there's some Python code in here, because I didn't grab just that file. I grabbed that file and the things before and after it as well. So at this point I'm like, okay, I need to find where the PDF file begins and ends. Now, I already know how to do this, but let me create another file as an example.
So I'm just going to say vim test, type blah blah blah, it doesn't matter what, and then let's open that up with Chrome, I guess. That should be google-chrome. I should have done this before the tutorial. So here we go. What I'm going to do is Ctrl+P and print to a PDF file. Basically I'm just creating a PDF file that I can look at, so I can learn about PDF files if I didn't already know. I've done this before, so I already know what I need to look for, but let's list this out. There we go, we have test.pdf. Let me display that just so you can see that it's a PDF file. There it is with the text I just created.
So all I did was create a PDF file. What I'm going to do now is say head -n1 test.pdf, and you can see it starts off with %PDF-1.4, so we can assume that that's how PDF files begin. This -1.4 might be a version number, so we might just want to search for %PDF. But we also need to know where the file ends, so let's do tail -n1 test.pdf, and it will show me the last line of the file, which is %%EOF, for end of file. Wow, PDF files are very simple like that. So this is great, we know what we're looking for now.
Let's real quick just run strings on recovery (I'm just walking through my thought process here) and grep for anything that says PDF. Oh look at that, we have a couple of lines that say PDF, and one of them has %PDF. There's something before it on that line; we're not going to worry about that just now. Let's see if we have an EOF in here. We've got a couple of EOFs, but we can see there is one that says %%EOF, so we can assume those are the two lines that we want to start and end on.
A while ago I did a tutorial on using sed to find two strings and pull out everything in between, including those two strings. So the way we do that, if you remember that tutorial, let's put it on the screen here: sed -n, and then inside single quotes, forward slash, %PDF (it's case sensitive, so we've got that in capitals), another forward slash, then a comma, then forward slash, %%EOF, forward slash, p, and then the closing single quote. So this is what we're doing: we're looking for this string and this string and everything in between. We tell it to look at our recovery file, and then let's just put that into new.pdf or whatever you want to call it. It shouldn't take long; it's only a two and a half megabyte file. Now if we say display new.pdf, there we are. We recovered our file, and we didn't have to pull every PDF off the drive. Again, PDFs are fairly common, so there might be a lot on there.
Now, you'll also notice that when we ran that, it warned about some garbage before %PDF. So the file works, but we got a warning. Let's go ahead and run head again, -n1 for the first line of the file, on new.pdf. And remember, there's this portion here that's not part of the PDF file. That's part of whatever was before the PDF on the drive. So although the PDF worked, we can remove that using sed. What I'm going to do is say sed -i, which means in place, so it's going to modify the file itself. You might want to write to a new file instead, because you've already recovered it and you don't want to screw up and have to go through this again, but you still have the recovery file to pull from anyway. So single quote, s, forward slash, what I just copied from up there, then forward slash, forward slash, and a single quote again, followed by new.pdf. What we're doing here is saying: sed, in place, meaning just overwrite the file we're looking at; s for substitute; take this text here and replace it with, well, nothing, because there's nothing between those last two slashes. We'll hit enter, and now we can say display new.pdf and we get the file that we were viewing earlier. If we go back to the shell, there are no warnings. So again, for a lot of simple stuff you can use PhotoRec, and that will allow you to recover files, but for advanced users you can save
some time and storage space and not have to recover a bazillion files just to get one, using techniques like this. Of course, you have to have a little more knowledge of what you're looking for. In this particular case we were looking at PDFs, which was simple. You'd have to look at the header and footer of other file types and see if you can pull them out the same way. So knowing what the files look like, and having some similar files as examples to study, is good, and so is having an idea of what's in the file you're after. So as always, I thank you for watching. Please visit my website, FilmsByKris.com, that's Kris with a K. There should be a link in the description. This is part of a series, and hopefully there's an annotation on the screen to the full playlist on working with binary files. And as always, I hope that you have a great day.
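For reference, the two sed steps from the walkthrough can be sketched on a small made-up chunk. The file contents below are invented, and the cleanup step is a variation on what was shown: instead of pasting the exact leftover text and using sed -i, it strips everything before %PDF on the first line and writes to a new file, so the extracted copy stays untouched.

```shell
# Made-up stand-in for the recovery chunk: leftover data, then a
# PDF-like block whose first line has junk in front of %PDF, then more
# leftover data.
chunk=$(mktemp)
{
  printf 'print("leftover python")\n'
  printf 'junk%%PDF-1.4\n'
  printf '1 0 obj\n(Xorg on Android)\nendobj\n'
  printf '%%%%EOF\n'
  printf 'more leftover data\n'
} > "$chunk"

# Step 1: print only the range from the first line containing %PDF
# through the next line containing %%EOF.
pdf=$(mktemp)
LC_ALL=C sed -n '/%PDF/,/%%EOF/p' "$chunk" > "$pdf"

# Step 2: on line 1 only, drop whatever precedes %PDF.
clean=$(mktemp)
LC_ALL=C sed '1s/^.*%PDF/%PDF/' "$pdf" > "$clean"

first=$(head -n 1 "$clean")
last=$(tail -n 1 "$clean")
count=$(wc -l < "$clean" | tr -d ' ')

echo "first line: $first"
echo "last line:  $last"
echo "line count: $count"
rm -f "$chunk" "$pdf" "$clean"
```

One caveat to keep in mind: the range stops at the first %%EOF it sees, and some PDFs contain more than one, so if a recovered file comes out truncated it may be worth checking for a later %%EOF in the chunk.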