Welcome. This is physical memory forensics for files and cache. I'm Justin Murdoch, and to my right is Jamie Butler. A bit of background: Jamie is the director of research and development at Mandiant, focused mainly on host analysis and operating systems research. I'm a computer science major at the Rochester Institute of Technology, currently on co-op at Mandiant working on their enterprise product as a software developer. Here's the layout of the talk. First we'll go over traditional forensic methods as background, then move on to memory forensics specifically. With that background, we'll talk about the issues in existing tools: a lot of them are missing important information and often misattribute data to executables. Most memory forensics also deals with files, so we'll cover memory-mapped files, reconstituting binaries and data files, and specifically the role the cache plays in this process. Then we'll talk about possible applications of our new techniques, show you a couple of demos, discuss the new tool we'll be releasing soon that uses these techniques, and wrap up with further work that needs to be done in the area. So, traditional forensics, as a broad overview: a host has two large sources of forensic information, the disk and memory. Lately memory has become a great way to triage a host in an investigation. One reason is that the average disk size is growing rapidly; most hard drives now come with at least 250 gigabytes, and people in this room probably have far more storage than that.
Searching through a whole disk image, or even just dumping a full copy of the hard drive, is becoming a much longer process, and memory can really speed up what you're looking for. It's comparatively small, so you can scan the whole space quickly. Also, for intruders to get their code running on a system, they have to load it into memory, and in almost all cases they aren't covering their tracks there. They aren't cloaking their memory footprint, because in most cases that's just too much work. Many of the artifacts the kernel needs to load a program into memory can also be used to gain much more information about the executable. For memory forensics specifically, memory divides into two basic sections: userland and kernel memory. This talk is going to focus on userland memory, because most attacks focus on userland: it's easier to get execution there, and it's more resilient to coding errors. If you're developing an attack and you have a couple of bugs in your kernel-mode code, you crash the whole system, so those attacks become very costly to develop. Memory forensics traditionally focuses on recovering all the binaries out of memory, the executables and DLLs; this is one of the main focuses of any investigation. Most tools rely on virtual address descriptors, or VADs. These describe a process's address space in memory. As you can see, each VAD has pointers to a left and right child, so they form a tree structure, and each VAD also contains the starting address and size of the memory region, along with a pointer to a control area. So here's a representation of a typical VAD tree.
You can see it starts with a VAD root, and each of those VADs contains information about the virtual addresses of the process. Traditionally, you would scan physical memory for an EPROCESS block, which tells you there's a process at that location. From there you get the directory table base, or DTB, in the EPROCESS, which lets you translate from virtual to physical addresses. Then you locate the root of the VAD tree and step through the tree translating virtual addresses to physical: start at the starting address, take the size, and grab all the data in between. Some tools also use information from the PE headers to reconstruct the executable with knowledge of the different sections inside. An alternate approach is to use the DTB to brute-force translate the whole address space: start at a beginning address and go all the way through to the end. This has real limitations. On a 32-bit system it pretty much works, because you have an upper bound of about four gigabytes, but on 64-bit the space can be enormous. It also leads to misattribution of data, because some virtual addresses translate globally, not just for that process. And that leads us to the problems in the existing tools being used right now. With that, I'm going to hand it off to Jamie Butler. So as Justin mentioned, the traditional approach in current memory forensic tools is to take a virtual base address and a size (even if the start is zero and the size is four gigabytes, you brute-force across the whole thing) and do the translation in the context of the process you're trying to analyze.
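The VAD-tree walk described a moment ago can be sketched in a few lines of Python. This is a toy model, not the real kernel layout: the `Vad` class and its fields are stand-ins for the actual Windows VAD structure, and the addresses are made up for illustration.

```python
# Minimal sketch of walking a VAD tree to collect a process's
# committed address ranges. Field names are illustrative only,
# not the real _MMVAD layout.

class Vad:
    def __init__(self, start, end, left=None, right=None):
        self.start = start      # starting virtual address
        self.end = end          # ending virtual address (inclusive)
        self.left = left        # left child
        self.right = right      # right child

def walk_vad_tree(node):
    """In-order walk yielding (start, size) for every VAD."""
    if node is None:
        return
    yield from walk_vad_tree(node.left)
    yield (node.start, node.end - node.start + 1)
    yield from walk_vad_tree(node.right)

# Toy tree: three regions of a hypothetical process.
root = Vad(0x400000, 0x41ffff,
           left=Vad(0x10000, 0x1ffff),
           right=Vad(0x7c900000, 0x7c9affff))

for start, size in walk_vad_tree(root):
    print(hex(start), hex(size))
```

A real tool would then translate each `(start, size)` range through the process's DTB and copy the pages out; this sketch only shows the traversal itself.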
Every process, as the previous slide showed, has a directory table base that is used for virtual-to-physical translation, and that should tell you what's in the process context. Well, what we found in our research is that a lot of data is actually missing if you do this. The first thing we encountered is that when you're trying to reconstruct a process, the attacker is usually injecting code, say an injected DLL, into the process address space, so in order to analyze and detect it, you need to be able to translate the code for that DLL. Well, the Windows loader is going to load the DLL as a memory-mapped file. Memory-mapped files are stored in a special way, because the OS doesn't want to waste space. What do we mean by that? On a Windows host, a user-mode process almost certainly has NTDLL.DLL mapped into its address space. If the OS did not share these memory-mapped files across the address spaces of all processes, that DLL would have to be replicated for every single process that loads. That would waste a lot of physical memory, and this design dates back to probably the 16-bit Windows versions, when there wasn't a lot of memory to waste in the first place. It's also just more efficient today; we're greener now, so let's not waste memory. These memory-mapped files are shared across all processes even if they're only used once. Because of this, they may be in your process address space, but the addresses they occupy may not translate through your page table entries. I won't go into the depths of how you do virtual-to-physical translation; if you want to learn more, there are slides on the internet you can Google.
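Still, the shape of that translation matters for what follows, so here is a minimal sketch of the classic 32-bit, non-PAE scheme (10-bit directory index, 10-bit table index, 12-bit offset). The dictionaries stand in for physical memory, and a zero PTE yields `None`, which is exactly the dead end described next.

```python
# Toy 32-bit, non-PAE virtual-to-physical translation.
# page_dir / page_tables are dicts standing in for physical memory.

PAGE_PRESENT = 0x1

def translate(vaddr, page_dir, page_tables):
    pde_index = (vaddr >> 22) & 0x3ff   # top 10 bits -> page directory slot
    pte_index = (vaddr >> 12) & 0x3ff   # next 10 bits -> page table slot
    offset    = vaddr & 0xfff           # low 12 bits  -> byte within page
    pde = page_dir.get(pde_index, 0)
    if not pde & PAGE_PRESENT:
        return None                     # no page table mapped here
    pte = page_tables[pde & ~0xfff].get(pte_index, 0)
    if not pte & PAGE_PRESENT:
        return None                     # the all-zero PTE case from the talk
    return (pte & ~0xfff) | offset

# One mapped page: virtual 0x00556000 -> physical frame 0x00345000.
page_tables = {0x2000: {0x156: 0x00345000 | PAGE_PRESENT}}
page_dir = {1: 0x2000 | PAGE_PRESENT}
print(hex(translate(0x00556abc, page_dir, page_tables)))  # -> 0x345abc
```

In the traditional acquisition approach, hitting a `None` here simply meant skipping that region, which is why so much file-backed data goes missing.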
Basically the page table entry is the very last table structure you'll find when doing a virtual-to-physical address translation. You go to read it to find out where the physical page is, and lo and behold, it's all zeros. That doesn't tell you anything, so typically we just had to ignore that region because we couldn't get access to it. Here's an example. I've taken the Honeynet Project Challenge 3; if you're familiar with that, it was a memory image that came out about a year and a half ago. I acquired these files out of the memory image, choosing files at random. You see the file size in the first column, then the bytes acquired with the traditional approach: using the VADs, a starting and ending virtual address, and the DTB of the process. That's how much of the file we could acquire. Then I used a different technique, which we'll cover in the last half of this presentation, using what are called file objects, which represent the memory-mapped files. For instance, with ace.dll we got 70% of it using the traditional method, but we were able to increase that to 93% using this more accurate file-object method. We'll also talk in a moment about how this number may go up to 100% if you're running on a live system rather than a memory image, because you have access to the disk. The second problem we ran into: we come from a background where we build products and tools to do incident response. In that context, and even in your more traditional forensic investigations, determining exactly which process is infected is important. Knowing the whole host is infected is perhaps interesting, but the next question your boss is going to ask is, which artifacts within that host are infected?
Because they may lead to things like compromised user accounts, and processes have a creation time that can tell you roughly when the infection happened, whether new processes were being created, and so forth. So attribution is important to us. One of the issues with the traditional approaches, especially brute forcing, is that if you brute-force over the global address space, as Justin alluded to, there are areas of the address space that are global to every process. When you cross from virtual addresses in userland into virtual addresses in kernel land, most kernel addresses will translate appropriately in every process context. For the reasons this works, you can read Mark Russinovich's Windows Internals books and take those to bed at night; they'll keep you warm. If you do that for a couple of years, you'll figure out that kernel addresses appear global because Microsoft wanted to save entries within the CPU cache, your L2 cache for instance. They wanted to save cache lines and speed up context switching so they didn't have to flush the cache every time. So kernel addresses are basically global. Here is a graphic from the Windows Internals books that we borrowed off the internet, because I don't like to draw. This is the virtual-memory layout of a 32-bit system, and all I want to show here is that at C1000000 the system cache begins. That is a cache the operating system keeps, different from the L1 and L2 caches the CPU keeps. This cache is used for things like file I/O: if you request to read a page of a file, the operating system assumes you'll probably want to read more than one page, so it's going to read ahead.
That read-ahead is cached for you, assuming locality of reference: statistically, your future reads in a typical program should be relatively close to where you're currently reading. So those pages go into the cache at those virtual addresses. Well, if we acquire a process by brute forcing, basically anything in the cache will appear to be in every single process. That's really bad for attribution. So let's talk about ways to make this better. We're going to utilize file objects, and file objects can represent a number of different things. They can be memory-mapped files, which we touched on, including DLLs and EXEs. They also include data files which may not be mapped into memory but are in the cache: a Word or PDF document, a registry hive, web history. We've seen Windows XP restore points; after an attacker hit a system, we could actually see what was happening and what they installed, thanks to those restore points. The VADs are still interesting to us, but we're going to use a bit more of the data they make available. VADs describe a range of memory that a file occupies, and if it's a memory-mapped file, that is, a file object represents that region of memory, then the VAD will have what's called a control area. If you're very familiar with WinDbg, you'll be used to these structures. The control area has a pointer back to the file object. So we find VADs in memory because we found the EPROCESS block; once we find a VAD, we parse it to get its control area; once we have the control area, we parse it to find its file object. Now, file objects contain some useful data, including the device name, which would be something like HarddiskVolume1, as you see in this example.
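That pointer chase, VAD to control area to file object, can be sketched with toy objects. These classes only mimic the shape of the real kernel structures; the field names and the sample path are illustrative, not the actual Windows layouts.

```python
# Toy sketch of the chase: VAD -> control area -> file object.
# Classes mimic only the shape of the real kernel structures.

class FileObject:
    def __init__(self, device_name, file_name):
        self.device_name = device_name  # e.g. \Device\HarddiskVolume1
        self.file_name = file_name      # path relative to the volume

class ControlArea:
    def __init__(self, file_object):
        self.file_object = file_object  # pointer back to the file object

class VadEntry:
    def __init__(self, control_area=None):
        self.control_area = control_area

def file_backing(vad):
    """Return the full device path if the VAD maps a file, else None."""
    if vad.control_area is None or vad.control_area.file_object is None:
        return None                     # private, non-file-backed memory
    fo = vad.control_area.file_object
    return fo.device_name + fo.file_name

ntdll = VadEntry(ControlArea(FileObject(r"\Device\HarddiskVolume1",
                                        r"\Windows\System32\ntdll.dll")))
heap = VadEntry()                       # anonymous memory, no control area
print(file_backing(ntdll))
print(file_backing(heap))
```

The `None` branch matters in practice: heaps, stacks, and injected private allocations have no control area, which is itself a useful signal when you're hunting injected code.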
You can then translate that for the host in question; that's probably C: or whatnot. File objects also contain the file name itself, and then the next thing we're going to talk about today is a table of three pointers, depending on where the file data is actually backed. The three things we're going to look at are image section objects, data section objects, and the shared cache map. Here's a graphical representation of the file object. It contains a pointer called section object pointers, and section object pointers has only three members; WinDbg will tell you all this, or Russinovich might as well. Image section objects are very interesting to us in forensics and IR because they represent binaries loaded in memory: when a binary loads, the Windows loader creates a section object for it. But this image section object pointer is not actually a pointer to a structure called an image section object; no such structure exists. It's actually a pointer to a control area. So, yet another control area. That control area will have a pointer to what's called a segment object. The segment object can be used for sanity checking: if you were just scanning through memory, you might find artifacts that are no longer in use and aren't actually usable, and if you tried to parse them you'd probably crash. So we use it for sanity checking. The segment object contains a segment size and the total number of PTEs represented by that segment; we'll talk about PTEs in a moment. Basically, for our sanity check, the segment size should equal the total number of PTEs times the page size. There is a small variation on page size in Windows, but for the most part you can assume it's 4K, or 0x1000.
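That sanity check is simple enough to write down. A minimal sketch, assuming the two fields have already been read out of the candidate segment object:

```python
# Sanity check from the talk: a candidate segment object is plausible
# only if its size equals its PTE count times the page size.

PAGE_SIZE = 0x1000  # 4K assumed, ignoring the large-page corner cases

def segment_looks_valid(segment_size, total_ptes):
    return segment_size == total_ptes * PAGE_SIZE

# A segment covering 0xAF pages of a mapped binary: consistent.
print(segment_looks_valid(0xAF000, 0xAF))   # -> True
# A stale artifact whose fields disagree: reject it.
print(segment_looks_valid(0xAF000, 0x10))   # -> False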
The thing you're going to want to parse, if you're trying to dig the binary data out of memory, is the subsections. Subsections represent the individual pieces of a file. How many here have ever looked at a PE, a portable executable, in PEview or LordPE or some other viewer? What you'll see is that the PE file is broken into a bunch of sections: a code section, usually called .text, a data section, a resource section, a relocation section, and so on. All these sections have a relative virtual address within the PE, and they also have permissions: once this thing loads into memory, what should the permissions on that section be? Since each section within the PE file probably has different permissions, there has to be a subsection object to represent every single section within the PE. We couldn't find a pointer that would show us where the subsection object was, but if you stare at the hex long enough, the math starts to come out at you. The subsections all seem to line up at the very end of the control area we just found. So although the segment object was far away in virtual memory, the subsection objects immediately follow the control area, which was nice for us. For every version of the OS, you can determine how large the control area is, add that to the base of the control area, and cast the result as a subsection object. These subsections contain an array of prototype PTEs. This is the crux of the data we have to parse in order to get the binary out of memory: the prototype PTEs contain the physical address of each memory page in physical memory. We'll have a graphic in a moment that may make this clearer.
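The subsection math just described can be sketched as follows. `CONTROL_AREA_SIZE` is an assumed value for illustration only (the real size varies by OS version; you'd check it in WinDbg), and the dict of subsections is a toy stand-in for memory.

```python
# Sketch: locating subsections immediately after the control area
# and chasing the next-subsection chain. All sizes and addresses
# are illustrative, not real Windows values.

CONTROL_AREA_SIZE = 0x30        # assumed for this sketch; OS-dependent

def first_subsection(control_area_base):
    # There is no direct pointer to the subsections; they simply
    # follow the control area in memory.
    return control_area_base + CONTROL_AREA_SIZE

def walk_subsections(subsections, base):
    """Follow next-subsection links through a toy dict keyed by address."""
    addr = base
    while addr:
        sub = subsections[addr]
        yield sub
        addr = sub["next"]

# Toy layout: three subsections, one per PE section.
subsections = {
    0x81230030: {"name": ".text", "ptes": 0x40, "next": 0x81230060},
    0x81230060: {"name": ".data", "ptes": 0x08, "next": 0x81230090},
    0x81230090: {"name": ".rsrc", "ptes": 0x04, "next": 0},
}
base = first_subsection(0x81230000)
print([s["name"] for s in walk_subsections(subsections, base)])
```

A real parser would then read each subsection's prototype PTE array, which is where the actual page addresses live.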
If a prototype PTE instead contains the virtual address of the subsection object it belongs to, then all bets are off: that page of memory is on disk. And when I say on disk, I don't mean the page file. You've heard arguments back and forth about whether you can acquire the page file and use it for memory forensics too; I personally believe you can't, you can only use the page file at runtime. But even if you could marry offline page files up with offline memory images, you wouldn't be able to access the data represented here, because memory-mapped files mean exactly that: they're memory-mapped, so when they're paged out they don't go to the page file, they are backed by their own file on disk. If you want that data, you have to go to the location of NTDLL.DLL on disk, at sector 15 or whatnot, and read it; the page file is not useful in this case. Something else within the subsection object is the number of full sectors and the number of prototype PTEs for the subsection. Going back to disk: when we're looking at the PE on disk, disk sectors are 512 bytes, while page alignment in the Windows OS is 4K, so there's a little bit of a fix-up you have to do. You have to take into account the total number of PTEs this subsection represents, but also the total number of full sectors it represents, and the structure contains both numbers. You use them to parse it correctly, because if you just read PTEs blindly, you would get more data than is actually within that subsection, and the file in memory wouldn't line up as it does on disk. So we use those full sectors to our advantage. By keeping track of where we are as we walk these sectors in memory, 512 bytes at a time, we know the offset we need to read if the file is paged to disk, so using that we can get
all the data if we're running on a live system. Another thing of interest: there's almost certainly more than one subsection, especially for a PE file, so each subsection has a pointer to the next, and you just chase the chain. Here's a graphic of what that looks like conceptually in memory, with the prototype PTE array for each subsection. Now, data section objects: they represent data files in memory. I'm not sure which types of files load themselves as a data section object; I know that PDFs don't, but Word documents do. So if you load a Microsoft Office document, it will appear in memory as a data section object. That structure is actually exactly like an image section object, except a few of the sanity checks don't work in this case. Since a data section object and an image section object point to the same structures, we can surmise that data access for a Word document is just as fast as code access through an image section object; Word documents are going to perform as well as EXEs, because the structures are the same. It's a shortcut to where the data is, not reliant on the cache, which is somewhat unpredictable: the OS decides how and when to load the cache and when to flush it, based on system utilization and available resources. With a data section object, that's not the case; all your structures are right there, and you have quick, immediate access to the data. I thought that was kind of cool; maybe Adobe should change how they load their files. So we've covered image section objects and data section objects. The last thing we'll touch on is the shared cache map. The shared cache map is used to represent file data in the cache: if all other bets are off, if you don't have an image section object or a data section object, then you should go parse the cache structures in order to get as
much data as is available. Again, that may be nothing, or it may be more or less the whole file; it really depends. The shared cache map structure is actually defined by Microsoft, and WinDbg knows it. It contains the file size and the amount of valid data within the cache; the valid data should never be larger than the file size, because if that happens you're actually looking at uninitialized data. It also contains an array of pointers to VACBs. VACBs are virtual address control blocks, the structures that define where the data is in virtual memory in the cache. If the file is one megabyte or less, Microsoft put in a performance improvement by embedding an array of four VACBs right in the shared cache map itself, so there's no pointer chasing; they just embedded it right there. That works out nicely, because a lot of what's in the cache, web images and so forth, is less than a meg in size, so let's keep it simple. Each VACB, I should mention, represents 256 kilobytes, so four of them can obviously represent a one-meg file. If the file is larger than a meg, though, we have to go to an array of pointers. What that looks like is a nested structure: one level deep, there's a single array pointing to the VACBs themselves, and it can represent 32 megabytes; I believe each array has 128 entries. If you read Mark Russinovich's and David Solomon's Windows Internals book, Chapter 13 in the newest edition is all about the cache, and it tells you how to calculate the depth of this tree. Arrays of arrays of arrays; it gets really fun, I recommend recursion. Anyway, the largest file you can have on a Windows OS is 2 to the 63rd power bytes, and because of the number of entries in each array and the
number of blocks they can represent, your tree can be no deeper than seven levels. Here's what the shared cache map looks like conceptually; this one at the bottom is just two levels deep, and if we parse it we'll get all the file data in the cache. So let's talk about some applications of this technology. In the past, most of the tools out there, I believe all of the freely available ones, parse the VAD tree to do process reconstitution. But since we care about data in the file system or in the file cache, we should probably also parse the handle table. So we'll parse the handle table and the VAD tree, and then we'll parse the file objects that we find. The Windows registry hives are a nice example of how you can use this. If you analyze the System process, it has a handle to every registry hive that's in memory, so you can get all that data. Say you're going to acquire this to your local hard drive or to your analyst station so you can do parsing with your traditional registry forensic tools: you would acquire the System process out of memory, and it will literally write the individual hive files, with their content, to the local hard drive. Also, PDFs, as I mentioned, are found in the cache. We haven't had a ton of time to research everything we can get out of the cache. On a Windows XP system you can get the restore points, because there's a handle to a file that is basically keeping the restore point data, so you can get that out of memory in some cases. The caveat for the cache is that all bets are off; it's hit or miss. The way we're most likely going to use this, in some of the free tools we're releasing, is to do data reduction. We have hashing; we beat up on the AV industry all the time about hashing and how it sucks, and I've done it myself.
However, hashing can also be useful. We have over two decades of data, so let's use it to our advantage to make the problem simpler. Memory forensics is all about data reduction. We went from a 250 gigabyte hard drive down to a 4 gigabyte memory image. In that memory image, on the Windows 7 machine we looked at, we're going to find over 4,000 files in memory if you don't eliminate duplicates: everything has a handle or a VAD to NTDLL, everything has a VAD to kernel32.dll, and so on. So we went from 250 gigabytes down to 4,000 files, and after a data reduction to remove redundancy, we're down to about 1,500 files. Now the problem is starting to get manageable in finite time. If we could use whitelisting and hashing to eliminate the other operating system components we don't care about, that number would get a lot more manageable. There have been efforts in the past, not my research but others', using fuzzy hashing and similar techniques to try to make the hash found in memory match the hash on disk. The problem with fuzzy hashing is that we don't have that data going back years, because those techniques only take subsections of files, while most whitelisting technologies use the full hash of the file, so the comparison is difficult to make. It may have benefit in your organization: if you had a gold master image you were deploying across the enterprise, you might be able to do data reduction with fuzzy hashing. But we found that most people don't have this, so we went to the image section object and parsed it on a live system, which means we can also get access to the file system if a page isn't present in memory. By utilizing this data, we can make the hash that we find
in memory match the hash that's on disk. We call this MemD5; it's a cute name one of our co-workers came up with. It will be released in the free tool I'll talk about in a moment. Another application for this better process reconstitution, or binary acquisition, is that a lot of tools out there are starting to use byte patterns of malware. There was a tool developed by Zynamics, which was unfortunately acquired by Google, and they killed the project or took it internal; you can't buy it anymore. It was called VxClass, and it was kind of cool; I liked it enough to convince my company to buy it. VxClass would generate byte-pattern signatures for classes of malware, families of malware. What VxClass leveraged was that malware samples reuse code. If you're doing IR, you have a limited amount of resources and probably don't have a large malware team, so you don't want to focus all your effort on the last 50 variations of something; you just want to throw it into an automated system and have it spit out, hey, this is a Zeus bot. So that's what they did with VxClass. Someone joined the company whose PhD ideas, I think, were around generating the commonality between all these samples as a byte pattern: you have 100 samples that all say they're Zeus, so generate one pattern that matches all 100. We could use this to search memory, and it was fun, but we got some false negatives. By utilizing the image section objects and so forth, we no longer get those false negatives, so it's more useful to us. Also, how many people have ever used ClamAV? A large number here. ClamAV also has the concept of a byte-signature match for malware, and I think the last time I downloaded ClamAV they had released 40 signatures
that they classify as byte patterns. I didn't look at what those signatures represent, but there are 40 signatures in there that we will now be able to use in memory analysis. So we're trying to take the tools we already have and apply them in a triage scenario. We have a few minutes left, so I'm just going to cover some demos quickly. I apologize: my laptop died right before I came to Black Hat, so I had a fast machine and now I have a really slow one. I won't run the acquisitions live; they take about six minutes per sample on this hard drive, but if you have Memoryze you can do this at home as soon as we release the new version. We run a process acquisition, pull the binaries out of memory, and then look at what we got. The first example is the registry hives. If I acquire the System process, I get a bunch of data here along with the sizes. We're pulling out individual files and giving each a name, encoded so that we can write it all to the local filesystem; you'll see the encoding of C:\ and things like that. If we couldn't determine a name, if it didn't have a control area associated with it, we just give you the address range we found it in; if it did have a name, we write that out. There are some files where we could find a file object and its name, but there just wasn't any data we could carve out of memory, so those will be zero in size. What I want to show you is loading this into a registry viewer; again, this is the System process, so we'll load it up into the registry viewer. Since this is coming from the cache, there may be some pages that are not there, and if we encounter one of those, we just write a padding page, because we need to keep the linear order of things in the file so that it can be parsed by tools like
AccessData's Registry Viewer. As you see here, I can drill down into things and see when a key was created, so I can do all my traditional forensics on it. There's no guesswork, no carving hives out of memory and guessing where they are; it just works. Another thing I'll show you: from the Honeynet Challenge 3 there was a memory image where the infection came through the web, I think it was Zeus. When I acquired the Firefox process, I was able to get the web cache and URL history information. I load this up into one of our free tools, called Web Historian. There wasn't much data here, because they were just demonstrating an attack and seeing what you could find, but this last entry contains the exploit code, pdf.php. I can see the URL and the access time when the user clicked to go there, because it's all in the web history log that Firefox keeps, and I just acquired Firefox out of memory, so I can use my traditional tools. I also acquired the Microsoft Word process out of memory; this was a memory image where I asked my wife to load up a Word document, and then I took an image of her laptop. Here you see we were able to acquire the Word template and an actual Word document. If we try to open that Word document, it's going to complain: the document was written in Word 2010 and this machine has Office 2007, and the content is a little corrupted, probably also because we're coming from the cache. We'll go ahead through the "do you really trust this" prompt, say yes, and here's the Word document for this presentation: a 13-page white paper, and it's all there. I'm not really sure if this has any applicability to IR and forensics, which is my day job, but I thought it was kind of cool, so
what the hell; I guess for all you e-discovery types, you might like it. The last thing I'll show you before I go is the UI for our memory analysis tool, which is also free. The UI is open source; it's called Audit Viewer. We're going to be releasing a new version where you can filter out known MD5s: it will do the MD5 comparison against a known set. If you had something like Bit9, you could whitelist; you could utilize their service, and this would just fire a bunch of hashes at Bit9. I don't know if that's legal, but if you have a password and a user account, I'm sure they'll probably let you do it. We don't have a Bit9 account for today's demonstration. It's still running... okay, finished. You could also use the NSRL from NIST to build your known whitelist. Then we say filter on trust, and there we've reduced it. By the way, this is every single process on the system, because I double-clicked the root node, so these are the things I now have to care about as an incident responder triaging this host. We went from 1,500 down to, I don't know, maybe 30 or so; it's a big time saver. And that's about the end of my time. There are a few caveats, places where we need to continue the work; details will be available in the coming weeks, you can check the blog, these slides will be on the web, and the white paper will be on the Black Hat site. One is dealing with ASLR, because it changes the address space a little when programs load, but we have all the data we need to actually reverse that, so we can fix it up, run the fixed-up data through our hash, and get the same hash values. Also, there's something called the security directory on some files, which holds the certificate info, and that won't be present in memory; there are no artifacts representing it on Windows 7 and Windows 2008, though previous versions did have the artifacts. Again, we have the PE header in memory, so we can
detect that it exists, go to disk, and run it through our hash again. So that's really the end. I'll be available in the Q&A room after this. Thank you.