Welcome back, everyone. Today we're going to use Autopsy, a GUI forensic tool on Windows, to analyze a disk and think about what data we're looking at, where that data is actually located on the disk, and what it means. I'll give a link to the Autopsy tool. Once you install it, the icon looks like a dog biting a spyglass. When Autopsy loads, you get this start menu. You probably don't have any cases created yet, especially if this is your first time running it, so we want to create a new case. Click on Create New Case, and it asks you for the case name. You want to make this something very specific. Depending on the types of cases you work on, you would normally have some sort of code for the type of case, and maybe for the location the case was in. Imagine that we're working on a hacking case, and hacking cases have a category of 01. We also know that the case is in Seoul, so we'll say 01 for Seoul. Maybe Chuncheon would be 02, maybe Busan would be 03, something like that. What else could we have? Maybe we add -jij, because those are my initials and I'm the one starting this case, and possibly even the date, so we'll just say 2017. The reason I'm talking about this so much, and the reason we don't just say "test" or something like that, is that you really need to think about your case structure, because this is going to create a case file, and you don't want to mix up the case files.
If you create a case today and another case five years from now, you don't want them to have the same name. So think about a structure that is relatively generic but still gives you unique names. The date helps here: if we put, for example, 20170101, we started the case on that day, and together with the other fields that makes the title unique. Then we can look at the name and say: this is a hacking case, it took place in Seoul, the person who started it is jij, and it was started on this date. Now we have a rule about how to name cases. If we work in a group, no one will end up with the same case name, but everyone will know what type of case it is, when it was started, and who to talk to if they have questions about it. So the case name is actually more important than you think. Most police officers that I've worked with have a case management system, and whenever they are assigned a case they get a case number, so they would probably just use the case number. But it's still worth thinking about the case name as well. A case number alone doesn't tell us anything about the case immediately, so you might use the case number followed by a code, for example the case number followed by -0101-jij-2017. That shows us the case number, which you can look up if you want to, but also that it's a hacking case that took place in Seoul and was started by jij. So think about the case name, and think about the structure and the rules you want to use. The most important thing is that the naming is consistent and unique. Everyone on your team should be consistent, and everyone should know what the rules are.
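As a rough illustration of a naming rule like the one described above, here is a minimal Python sketch. The code tables, the hyphen separators, the field order, and the function name `build_case_name` are all assumptions made for this example; your organization's rule may look quite different.

```python
from datetime import date

# Hypothetical code tables. The numbers follow the example in the talk,
# but a real team would maintain its own agreed list.
CATEGORY_CODES = {"hacking": "01"}
LOCATION_CODES = {"seoul": "01", "chuncheon": "02", "busan": "03"}


def build_case_name(category, location, initials, started, case_number=None):
    """Build a case name such as '0101-jij-20170101'.

    The layout (category + location, initials, start date, joined by
    hyphens) is an assumed convention, not something Autopsy enforces.
    """
    parts = [
        CATEGORY_CODES[category] + LOCATION_CODES[location],
        initials.lower(),
        started.strftime("%Y%m%d"),
    ]
    if case_number is not None:
        # Optionally put the case management system's number in front.
        parts.insert(0, str(case_number))
    return "-".join(parts)


print(build_case_name("hacking", "seoul", "jij", date(2017, 1, 1)))
```

Generating names from one shared function (or at least one shared, written-down rule) is what keeps everyone on the team consistent.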
In a lot of organizations, everyone just names cases their own way. The problem is that, as technology evolves, most organizations eventually want to centralize data storage or backups or something like that. Then everyone has different case names or naming structures, so when you combine everything there is sometimes overlap, and sometimes things just don't make sense in the new structure. So make sure that everyone is aware of the rule and everyone follows the same rule for naming. Okay, that's my naming rant.
Next is the base directory, where we want to save the case data. This should not be the C: drive. You should be using a forensic workstation, a dedicated computer for doing digital forensics, and it should have either an external hard drive or a separate internal hard drive that is used, hopefully exclusively, for case data and nothing else. We don't want to put case data on the C: drive because, first, we don't want to mix case data with our own system, and second, we don't want to mix our own system's information, or potentially viruses, with our case data. We don't want the two data sets touching each other at all. We want to keep them as separate as possible. So use a separate hard drive to store case data. Here you can see that I have this cases folder on the E: drive. This E: drive is actually just a network share, but we can pretend that it is a separate hard drive that I'm going to store all my case data on. If you're doing this in real life, the best situation is that you save all of your case data onto a separate, dedicated hard drive. It could be an external hard drive. External hard drives tend to be slow unless they're FireWire.
So don't use a USB external hard drive, because it will be very, very slow. But yes, try to make it a separate physical disk.
Next is case type. We have single-user, and Autopsy also has the ability to do a multi-user setup. Single-user basically just means that everything is saved locally, to whatever base directory you've chosen. Multi-user allows multiple people to access the data, which is really handy in big organizations, especially if multiple people are going to do the investigation. With a lot of tools, all the data is stored locally, so only one person can access it. Otherwise people have to copy data around, or multiple people each have to get access to the disk image and process it themselves if they want to access the data. So if one person is going to analyze the case, obviously choose single-user. If multiple people need access, you have to set up centralized infrastructure to allow multiple people to connect at the same time. It is possible with Autopsy, and I will talk about it more later.
Then it says the case data will be stored in the following directory, and it gives the full path. This is exactly what we would expect: the cases folder on the E: drive, and then our case name. So we can click Next.
Next is the case number. Again, if you're working on a real case, it most likely already has a case number from some case management system. We'll just call the case number, I don't know, 001. You want to have some sort of pattern for case numbers; 001 is just an example. Usually there's a reason why a case has a particular number: it's associated with documentation and everything else. So make sure that this case number is the same as in any documentation that you are producing. For examiner, put your own name in there, because you are the one doing the examination. Autopsy doesn't have the ability to do disk imaging.
It is strictly for analysis and examination. So we have the case number and examiner name, and then we click Finish.
Once the case is created, it asks you to import a data source. You can just close this dialog, and if you do, you can still go in and configure Autopsy first, but without any data sources imported you obviously can't analyze anything. So we want to import a data source, and we can select the data source type. If we click the drop-down box, we have image or VM file, local disk, and local files. Image or VM file is what we've done before: we took a disk image of a hard disk. It could be a flash drive, or it could be the hard drive in a computer. Once you have created either the DD or the E01 disk image, you select image or VM file. A VM file is a virtual machine hard disk. I'm not sure of all the different types that are supported, but you can import them directly. We mostly work with disk images.
With local disk, I could add the local hard drive of this computer directly if I wanted to, but I don't want to. You might use that if you were trying to use Autopsy in a live analysis situation, but I'm not really sure when you would do that. You might also use it if you had a disk attached through a write blocker and your computer had already mounted the suspect hard drive. Say you don't want to image the hard drive, but you do want to analyze it. Instead of going through the process of creating an image, you connect the suspect's hard drive directly to your computer with a write blocker, choose local disk, and analyze that disk directly.
With local files, you can just add files directly to your case. There's no unpacking or anything like that. It will just go through and process the files directly. We want to add an image today.
Hopefully you've already taken an image before. I'm going to use an image downloaded from Digital Corpora, which I'll give a link to. If I click Browse, it is here: the NPS 2011 scenario image, which is an E01, or Expert Witness Format, image. I'm going to open it. This image is quite big, I think 13 gigabytes or something like that, and it took a while to download.
Next it says, please select the input time zone. I know this disk image was created in the US, not in Korea where I'm currently located, but I'm not sure where exactly. Since I don't know the time zone, I'm going to default to GMT+0, which just makes everything GMT. Usually you do know what time zone your image is from, because images are usually from your local time zone or from a country you're working with, so you have a pretty good idea. But if you don't know, just set it to GMT+0. That is basically the default setting, and it means that Autopsy will not calculate local offsets. Then we can just click Next.
Now we get to the ingest modules. These are what do all of the actual extraction and analysis of the disk. We could talk about each of these in a lot of depth, but I'm just going to run through them quickly.
First, recent activity. This extracts recent user activity such as web browsing, recently accessed files, things like that. I won't call it a script, but it's a program that goes through and looks for data sources inside the disk image that we can link to a user doing something on the system. For example, internet history, the last files that were opened, and so on. There are a lot of those data sources in the system, and this module basically extracts all of them. There's nothing to set with recent activity.
It just runs with its defaults.
Hash lookup. I've already added a hash database, and there's another video about how to add hash databases. First we select the known hash databases to use. The NSRL is basically a database containing information about files that we already know, and that we know are probably not relevant to our case. Since these files aren't really relevant, we might want to filter them out so we don't see them. That way we can focus on files that we don't know anything about. I'll give a link to where you can download the NSRL. It's also very large. So known hash databases describe files that we know are not very interesting to our case.
Then we select the known bad hash databases to use. A known bad database contains information about files that we know are not good, or that we know are very interesting for our case. Viruses are one example: if you make a hash of a virus, you can put it in a database, and if you find a file with that hash value, that file is probably quite interesting to your case. When working on child exploitation cases, we can make a hash database of images and videos that we know are illegal and then scan a suspect's hard drive very quickly with it. There are some global settings, but I talk about those in the other video.
File type identification. Just like it says, this tries to identify types of files. I'll talk about the extension mismatch detector in a second, but file type identification tries to find specific types of files. By default, there's nothing configured. There are a lot of different file types, and each country tends to use data sources that have their own unique file types. For example, Korea has a very specific file type called HWP.
So we might add HWP to file type identification to figure out whether HWP files exist in the image. It's just there to identify different types of files. We'll talk about it more when we get to the extension mismatch detector.
Embedded file extractor. This extracts embedded files from, for example, doc and docx. Files like docx, pptx, and xlsx can be thought of almost like zip files: they contain other files that together make up the document. With a doc, you could put a picture or a video inside and save it. The file itself looks like a Word document, but there are actually images inside. So it's very useful for investigators to be able to pull out the images saved inside a doc or a docx and look at those images separately. One trick for hiding things is to embed images in a PowerPoint. Then you have a PowerPoint that looks like a professional presentation, but it actually holds a bunch of illegal content or whatever it is. So the embedded file extractor basically opens up files and extracts any other files that might be inside.
EXIF parser. Some JPEG files have what we call EXIF data, which is just metadata inside the image. You can't see it when you open the image, but the data is inside the file. This module goes inside the image and extracts the EXIF information. EXIF information contains a lot of interesting things: sometimes GPS information, and the date and time that the image was taken. Even if you move the image around, the EXIF information is not changed. Most new phones automatically put EXIF data inside the JPEG images they take. So EXIF information can potentially tell us a lot about what was going on: what cameras were used, when pictures were taken, where pictures were taken, things like that. Although EXIF information can be changed, it is a very good place to get some information about images.
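To make "metadata inside the image" a little more concrete, here is a minimal Python sketch that only checks whether a JPEG carries an EXIF block at all. It is not how Autopsy's EXIF parser works and it does not decode any tags. It relies on the JPEG layout in which EXIF data sits in an APP1 segment (marker `FF E1`) whose payload begins with `Exif` followed by two zero bytes. The function name `has_exif_segment` is made up for this example, and the sketch ignores edge cases such as padding bytes between segments.

```python
import struct


def has_exif_segment(jpeg_bytes):
    """Return True if the JPEG data contains an APP1 segment holding EXIF data."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return False
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return False  # not positioned on a marker; give up in this sketch
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            return False  # start of scan: image data follows, no EXIF found
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

You could call it as `has_exif_segment(open(path, "rb").read())`, where `path` is whatever image you want to check. Decoding the actual GPS or timestamp tags takes considerably more work, which is why a dedicated parser module is useful.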
Okay, next is keyword search. You can preset a list of keywords or a list of patterns that you want to search for. By default we have phone numbers, IP addresses, email addresses, URLs, and credit card numbers. Credit card numbers, URLs, email addresses, and IP addresses are pretty much universal. For email addresses you might want to search specifically for something like go.kr or .kr, so we could modify that pattern to make it more specific to our region, but these are pretty much global. Phone numbers tend to be local. I believe this phone number pattern is set by default to US-style phone numbers. If you want to search for all of them, you can just select them, and Autopsy will search for all of these patterns.
If we look at global settings, we don't have any keyword lists made yet, but we do have string extraction. Because we're in Korea, we probably want to select Korean under string extraction, which lets us do keyword searches for Korean text. Now we can create keyword lists. If we create a new list, I'm just going to call it test. Now that we have this test list, we can click on New Keyword. What's a new keyword? Let's just search for the word test, and click OK. Now test is listed there, and Autopsy will automatically search the entire disk image for any files containing, or any references to, test.
When we added the new keyword, you saw this regular expression option. A regular expression basically lets us search for patterns. For example, t.st is a regular expression. A period means any character, so if there's a "tpst" it will find it, and if there's a "test" it will find it.
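The `t.st` pattern above can be tried outside Autopsy. The following is a small sketch using Python's standard `re` module, which is not the engine Autopsy uses for keyword search, so details such as case handling may differ. The helper name `find_hits` and the sample text are made up for this example, and making the match case-insensitive is an assumption.

```python
import re


def find_hits(pattern, text):
    """Return every non-overlapping match of a regular expression in text."""
    # re.IGNORECASE is a choice made for this sketch, not an Autopsy default.
    return re.findall(pattern, text, flags=re.IGNORECASE)


sample = "this is a test, a tpst, and a toast"
print(find_hits(r"t.st", sample))
```

Here the period stands for exactly one character, so "test" and "tpst" are hits, while "toast" is not because it has two characters between the "t" and the "st".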
This is really handy for names, for example, or for patterns where you might not know all of the content. For example, j..h might find "Josh", something like that. Regular expressions are obviously much, much more powerful than just a couple of periods, but they're a little bit complicated and too much to get into right now. Learning regular expressions is extremely valuable if you want to be a digital forensic investigator. They are useful in a lot of different situations, not just keyword searching. For example, the IP address and email address searches are all based on patterns. I'm going to deselect that. We're not looking for a specific IP address. We could look for a specific one, but these patterns are there to find all IP addresses. Keyword searching is one of the main tools that digital investigators use. We base a lot of our searches on patterns, but we can also search for literal strings: a name, a word, a phrase, whatever it is we're looking for. You can set that in the global settings.
Email parser does exactly what it sounds like. If you're saving your email locally, most of the time the email is stored inside something like a zip file as well. PST and OST files, for example, are containers that hold all of the email. The email parser opens up those files, extracts everything, and then goes through and analyzes the emails directly.
Extension mismatch detector. This is related to file type identification. Changing a file's extension is a very simple way to do anti-forensics. Let's say you had an illegal picture. It's a JPEG image, but you want to hide it. You want to make sure that nobody can open it up.
If you change the extension from .jpg to .doc, then when you double-click on it in Windows it will not open. This is a relatively easy way to make people think that the file is just bad, or that there's nothing there. If you change the extension back from .doc to .jpg, you can open the image again. So it's a very simple way to try to hide things, but it's very easy to detect as well.
If we go into the global settings for the extension mismatch detector, we can see all of these different file types. Let's click on application/pdf. This is a file type, and the extension that is acceptable for this file type is pdf.
Let me show you real quick. I'm on the desktop, in Linux right now, and I have this HWP file called Halim Info. HWP files are basically Word documents. As far as I know they're specific to Korea; only Koreans use them, because they support Hangul. If I run the file command on Halim Info, it shows us a little bit of information: Hangul (Korean) Word Processor File 5.x. If I run file -i, we can see the file type, which is application/x-hwp. So I'll copy this. Right now, under the application types in Autopsy, I have, for example, the MS Office types or the bzip type, but I don't have this HWP type. So I click New Type, enter the MIME type application/x-hwp, and click OK. Then I select application/x-hwp and click New Extension. What extension do we accept? The extension is hwp. Okay.
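The idea behind this check can be sketched in a few lines of Python: identify the type from the first bytes of the file, then compare the result with the extensions allowed for that type. This is only an illustration of the principle, not Autopsy's implementation. The two signatures below (JPEG and PDF) are well-known file headers, but the tables, the function names `detect_type` and `is_mismatch`, and the choice to ignore unknown types are assumptions for this example.

```python
# Deliberately tiny, hypothetical tables; a real tool knows many more types.
SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),   # JPEG files start with FF D8 FF
    (b"%PDF-", "application/pdf"),     # PDF files start with %PDF-
]
ALLOWED_EXTENSIONS = {
    "image/jpeg": {"jpg", "jpeg"},
    "application/pdf": {"pdf"},
}


def detect_type(header):
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    for magic, mime in SIGNATURES:
        if header.startswith(magic):
            return mime
    return None


def is_mismatch(filename, header):
    """Return True if the extension is not acceptable for the detected type."""
    mime = detect_type(header)
    if mime is None:
        return False  # unknown type: nothing to compare against in this sketch
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return extension not in ALLOWED_EXTENSIONS[mime]


# A JPEG header hiding behind a .doc extension is flagged.
print(is_mismatch("holiday.doc", b"\xff\xd8\xff\xe0\x00\x10JFIF"))
```

Adding a new MIME type and extension in the settings dialog corresponds to adding an entry to a table like `ALLOWED_EXTENSIONS` here.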
So now, if we find a file whose header says the type is image/jpeg but whose extension is hwp, we have a file that doesn't open normally, and we would detect that someone is trying to use extension changes to hide their files. The extension mismatch detector is a very simple, very powerful check. Autopsy comes with quite a few of the most common file types, but you really need to add your own.
Next is the E01 verifier. We took images, I believe, in a raw, or DD, image format. The image that I'm opening right now is an E01, or Expert Witness, file, which means that the hash value for the image is actually stored at the end, in the footer, of the file itself. If you have a DD image, you don't need this module because it won't do anything, so you can deselect it. I am currently using an E01 file, so I'm going to select it.
Next is the Android analyzer. You can, and we will, extract data from an Android phone and then use Sleuth Kit to analyze this data, but right now I'm not analyzing Android data, so we're going to keep this module deselected.
Interesting files identifier is a very general module. You can configure it to say what types or kinds of files you think are interesting. It will then filter for these and list them for you, for example under this interesting items drop-down menu. You can just select that and see all the interesting items. So it's just a filter to quickly identify what you think is interesting. If you go into global settings, there are lots of different ways you can configure it: MIME types, file name patterns, all sorts of rules.
PhotoRec carver. We've already used PhotoRec a little bit. PhotoRec basically just carves data out.
The PhotoRec module is built into Autopsy and does file carving, especially to extract data from unallocated space.
And then the virtual machine extractor. If we're dealing with a real disk image, that disk image might have a virtual machine inside it, which means we need to extract that virtual machine and treat it as a different computer. I know my image does not have a virtual machine inside it, but if we were doing a real case, we would probably want to keep this checked just in case.
Those are all the default modules, which are called ingest modules. There are a lot of other modules and scripts available online that you can add for more functionality. Notice that if I don't want to run a module, I can just deselect it. If we click Next, Autopsy will start processing. My image is relatively big, so I'm going to start processing, and we'll come back later and do the analysis. Thank you very much.