All right guys, the few, the proud, the guys who made it all the way to the end of the day here. Pretty amazing. Can everybody hear me in the back? You can hear me? All right, good. Well, we are here to talk about advanced file system hiding and detection. You'll have to forgive me: we've got nothing on the laptop here, but we've got both screens, so if I fade in and out, just let me know and I'll stick to this one. Anyway, I'm Irby Thompson. This is Matthew Monroe, my colleague. We work at Lockheed Martin, but don't hold that against us.

Anyway, all right, what we're going to talk about today. First thing is a little history and analysis of traditional data hiding methods, and then we're going to do a case study on finding new places to hide. We chose NTFS because there had been some projects on Linux about data hiding in the file system, but we wanted to look at something different. NTFS being closed source made it more fun to reverse engineer, you know. FragFS is our implementation. We'll do a demo of that, we'll talk about detection a little bit, and maybe some future considerations.

So let's start off with why data hiding is important. A lot of people will say that modern rootkits alleviate the need to hide data persistently. I disagree with them; I think they're wrong. Rootkits are really good at hiding data on a live system, but what happens if you need to store data on that system until you can get it off? How would you hide from offline forensic analysis? Say an investigator pulls the hard drive and runs EnCase against it. You want to be able to hide against that as well. Of course, you can say: what if the rootkit is entirely memory-based? It never touches the disk. You never have any traces on the hard drive itself.
Well, I mean, that certainly embodies something, but you don't have reboot persistence. If you reboot, you're going to lose access to that machine. If you were trying to save something, say you had a keystroke logger on the machine, you're obviously going to lose those keystrokes. So I think there are situations where you want to be able to hide on a machine covertly. I would call this the patient hacker. There might be some foreign intelligence types that would be interested in being able to maintain access on a machine through reboots while also staying pretty low on the radar screen of forensic analysis.

So, just moving on a little bit: information hiding is nothing new. We're not really pulling any new tricks out of the hat; we're taking existing ideas and applying them to modern technology. I'm sure many of you as kids wrote with invisible ink. Use the lemon juice, take the little candle, and you've got your hidden message there. We're trying to do similar stuff to that. Hiding data on computers is just the modern application of old principles, basically, and we categorize data hiding into three major areas: the first is out-of-band, and we'll get into that a little more, then in-band, and application layer.

So, just diving right in: out-of-band data hiding is what I would call the portion of a medium that's outside the format specification for that medium, which means nothing. So what does that mean exactly? Well, say you had a radio channel. It was, you know, 210 megahertz.
Well, what about the area that's like 210.9 or 209-point-whatever? You're trying to hide outside the normal realm of operation. Going to a hard drive, in our specific case, that's hiding beyond the end of a partition. A hard drive might be a hundred gigs, the partition might only be 99 gigs, and that last gigabyte of space is just kind of out there. It's not really being used, but it's still part of the drive.

There's a program called Slacker that came out last Black Hat, maybe two Black Hats ago, that uses slack space within the file. If you know how files are stored on a hard drive, they're stored basically on a page boundary. So if the file is only 3K, it'll still take up 4K on the disk, and that last K is just unused. Slacker would stuff data in those little slack spaces at the end of files. You could also take a hard drive and mark certain sectors as bad. They would no longer be within the realm of normal use, but they might still be able to store data. And finally, the host protected area. I'm not going to go into that too much, but basically modern hard drives present one amount of data to a system while underneath they can actually store more. They can have system management information stored there and whatnot.

In-band is kind of the opposite of out-of-band. You're hiding within the format specification. You're not breaking the format in any way, you're not doing anything illegal, so to speak; you're kind of using the file system in ways
it wasn't necessarily intended to be used. Alternate file streams is the big one here. NTFS supports multiple data streams for a single file, and most people just see the regular data stream. If you open Windows Explorer, it only shows you the regular data stream. But if you go to the command line, you can copy data into another data stream of the same file name, and it won't show up for most tools.

Similarly, you could use a file system journal log. You could say, I need a hundred megabytes for my journal log, but it only uses the first 200K. All the space after that we would consider to be in-band. You're hiding in space that's been reserved, but you're not supposed to be there, really.

And then reserved but unallocated sectors. Going back to our original hard drive example: what if we had a partition that was 99 gigs, but we're only using the first 20 gigs? Well, we could put data out later on that partition, within the partition but not showing up as a file per se.

Finally, application layer.
This is kind of hiding in a higher-level format specification. It's like in-band data hiding except at a different level of granularity. Most people would recognize this as steganography, where you try to hide data within, say, a picture. You can take a picture and manipulate the bits in such a way that you hide data in there, but the picture still looks the same. Similarly, you could take a Word document and add extra spaces, maybe tabs and newlines, in a way that doesn't make it look any different to the naked eye but actually stores information that you want to hide. There was a tool called Hydan that came out a while ago. I haven't played with it in a while, but it used redundancies in i386 opcodes to restructure the opcodes of a program in such a way that you could hide data, but it would still execute exactly the same. A really quite novel tool, I would say. So that's application layer hiding.

We're going to do a little more analysis on each of these, because it's important to think about: if you're going to hide something, how should you do it? Specifically, forensic tools know about many of these methods. Alternate file streams, they know about that. Slack space at the end of files.
They know about that too, and they will flag the data as such. They'll say, hey, this data is hiding, it's an alternate file stream, and just show it as if it were a separate file. Similarly, if you run a strings or signature search across a whole hard drive, you're going to find the data if you know what to look for, no matter where it is. Say you're a forensic analyst and you think somebody's hiding data on this hard drive. If you know they're hiding some special character sequence, you can just search from beginning to end and you're going to find it. There are ways to get around that, obviously, with encryption and obfuscation, but even our methods of hiding are not going to be able to hide if they know exactly what they're looking for.

Going a little further, experienced analysts are going to find stuff beyond what the tools give them. As long as they have the time and money available and they know there's something there they need to find, they're going to be able to find it. Fortunately for us, they usually don't have the time available to them.

Okay, so let's do pros and cons of each of these. Out-of-band analysis: the strength is that being outside the boundaries of the medium is often overlooked. I mean, how many of you guys have actually looked at the space beyond the end of your partition? Probably... well, a few of you have. That's good. You're on top of things. But I'll be honest.
I've never brought out the hex editor to look at the end of the drive. There's often a lot of space available there, it's hard to discover without special tools, and it's also resilient. The weakness is that it's also hard to access without special tools. If you just do a file open, you can't do it on the space beyond the end of the drive. I mean, you could... well, I won't get into all that, but it's harder to access, basically. And it's hard to hide from plain-sight analysis. As the gentleman back there was saying, he's actually looked at the end of the drive. If you open a hex editor and look at it, you're kind of a sitting duck, to say the least.

In-band analysis: the strengths are that it's usually easy to access with existing tools. You can use your regular file system calls to do things. You're not breaking the specification, so you're using the system the way it was intended to be used, even if it's maybe an obscure way of using it. You're less devious in some senses, because you're not doing anything illegal. The weakness is that the storage space is often small and scattered. You'll have a few bytes here, a few bytes there, but you don't have a big contiguous space to hide data in. That's part of what FragFS addresses, and we'll get into that a little bit more. And you're relying on security through obscurity.
So as soon as the method becomes known, such as alternate file streams, it's not really worth anything anymore. And the specifications may change: that great hiding location you used to have is now either being used for something else or it's no longer useful, basically.

Application layer analysis: the biggest strength is you're hiding in plain sight. I can show you a picture and you would never know that there was data hidden within it. If you had the right tools you might be able to detect it, but nine times out of ten it just looks like a picture. So you're hard to detect. Obviously your weakness is that the amount of storage space you have depends on the size of the data you're trying to hide within. The bigger the picture, the more you can hide there, but that's a trade-off. It's difficult to access without special tools. You need special algorithms to do this kind of hiding in a way that's not obvious. I think it was Johnny Long that presented where he just opened up an MP3 and typed some text in there. You can make it a little more obscure than that, but that was one way to do it. So there are complex algorithms, and it's not very resilient. If somebody opens the picture in Photoshop and changes it, then you've just lost all your hidden data, which may be okay, but we're trying to be resilient here, trying to be persistent in some ways.

So let's look at a screenshot here of EnCase detecting alternate file streams. The text is a little bit small, but the first one that's checked there, 11452, is test file, and I created an alternate file stream. I just opened the command prompt and said, you know, cat this data into test file hidden, and it created an alternate file stream of the same file. In Windows Explorer you only see one file, but EnCase will obviously show you the second stream as if it were a separate file. At the bottom you can see a hex dump
of what that second data stream says. It says, "This is our hidden text in an alternate stream." Not very exciting. Slacker, similarly, is hiding at the end of files, in the slack space, and EnCase will mark in red any space that's beyond the end of a file, space that still belongs to the file but isn't within the file size. At the bottom there I've highlighted that with the cursor; that data in red is obviously not supposed to be there. It should just be nulls. So EnCase is pretty good about detecting those kinds of things. Now Matt's going to talk to us a little bit about finding new places to hide.

Yeah, my section of the talk is basically a case study. Nothing we're showing is necessarily super revolutionary; the ideas and how we do things are somewhat similar to what people have done in the past. But a lot of people don't know how this works, so the goal of this part of the talk is to step you through how you would find a new place to hide or, if you're an analyst, how to think like the people who are trying to hide their data. You want to look where you currently aren't looking, where your tools aren't going to show you. What are the things you need to look out for?

The basic start is: you want to hide some data, and to do so you have to figure out the domain of your problem. How much space do you need? Do you need a few K, do you want several megabytes, maybe gigabytes? You want to hide movie files or something? Those are very different problems. You also want to figure out what type of access you need. Do you just want to be able to read and write it? Do you want to be able to execute programs? Things like that. Also, what are your performance requirements? Does this need to be high-speed access? Can you write it really slowly?
Do you need to read it quickly? That can make a difference in how you hide things. For example, stego tends to be very slow because the algorithms are complex, so that might not work if you have high-speed data requirements. Also, how sensitive is your data? Is it okay if a forensic analyst finds it? How bad is that for you? And how well do you want to hide it, beyond just making it less obvious where it's at: if they see it, can they tell what it is? Finally, how long does it need to stay around? Do you need to store it on disk for an hour, or are you talking about months on disk? Those things will affect what place you want to hide in and how you want to hide it.

Once you've figured out a basic idea of what you want to do, start looking at how people have hidden before, because any time someone's presented how to hide data, you've got tools out there that will detect it. If you're serious about hiding stuff, you don't want to use something people already know about. You want to come up with something new. Then start looking at some file system specifications and try to find new places to hide within them. What you're looking for within the file system is something that's unused, reserved, or otherwise not being used by something else.

We chose NTFS in this case because it's used on Windows. It's been the default file system since NT 3.5. XP has it, 2003 has it, Vista will use it. They talked about putting in a new file system, which really was NTFS with a database added to it, but even that they're not doing. It's going to be NTFS just by itself.

The core part of NTFS is what they call the master file table. It's an actual file on the drive that holds all the metadata; all the information about all the other files and directories on the entire system is stored in one place. It's a big table.
It's got entries: files, directories, they all have entries within this big table that store information about them. It's this big file that grows as you add more files to your system, but it doesn't actually shrink. If you delete a whole bunch of files, it doesn't suddenly get smaller. Unfortunately, this is something that's not very well documented or understood. It's Microsoft proprietary; they don't release it, they don't talk about it. Fortunately, some people in the Linux community have worked very hard for several years to reverse engineer it, and there's a lot known about it. Not everything we'd like, but enough to serve our purposes at least.

Looking more closely at what we know from reverse engineering this file system: you have these entries in the MFT, and they all have a fixed size. When you format a drive, it sets the size; by default this is 1K on Windows for each entry. Every file and directory on the system normally takes up one entry. Technically they can take up more, but generally it's one entry per file. The information about each entry is stored as attributes. There's a whole bunch of different attributes for each file, and they can be stored in any order.

To help explain what those attributes are: basically an attribute is a block of data. They've got types associated with them.
There are lots of different types: there are 13 different types of metadata currently in NTFS. Some of these attributes can be repeated. For example, we talked about alternate data streams. There is an attribute of type data for files, and you can have more than one attribute of type data for a file. That's how you keep as many streams of data as you want for a single file name: just by adding more attributes to it. With directories, each directory entry is stored as an attribute. Each of the attributes can have different characteristics. Some of them can have names, some of them can be compressed or encrypted, and each one can have those characteristics independently.

Important for us is this feature they call resident or non-resident. A resident attribute is one that's stored in the MFT itself. Compare that to non-resident, which means the MFT has a pointer to where else on disk the attribute is. For example, if you have a hundred-byte file on NTFS, it will store the entire file in the data attribute in the MFT itself. But if it's a 200K file, it's way too big to fit in the MFT, so they have a pointer to where else on disk that data is. That's the difference between resident and non-resident attributes.

A few more real-world examples of what you'll see during forensic analysis. Two things that show up immediately, that everyone looks at, are the standard information attribute and the file name attribute. These have your timestamps, your file names, all that sort of information stored in those attributes. Basically every file has them, every directory has them.
They're just there. For all files you've got a data attribute, of course. Even if you're zero length, you have a data attribute that says, hey, I have no data. Directories are interesting, because every directory entry is stored as a separate attribute. Actually, more importantly, it's currently stored as two attributes: the long file name is stored as one attribute and the DOS file name is stored as a separate attribute. So it ends up taking a lot of attributes for every directory. If you've got 2,000 files in a directory, you've got 4,000 attributes for that directory, which is quite a few. And after all the attributes, because they're stored in any order, sort of mismatched, they can go wherever, there's this little marker that says, hey, there are no more attributes for this MFT entry, so it knows to stop trying to parse the entry.

As I said a little bit earlier, you've got 13 attribute types, but currently only about five of those 13 are used, because a whole bunch of them are kept over from the NT days. They did security differently then, and all those attribute types still exist, but they're not currently used by Windows XP, for example. So if you're thinking about where to hide data: well, I could use an old attribute, maybe put data there or something.

To help explain this, hopefully better, I have a picture here: a little example MFT entry. You've got a little MFT header that's got all sorts of different information, different flags about what it is: is this a deleted MFT entry or not, and such things. Then over here you've got a resident attribute. It's got a little bit of header information, then at the bottom it's got a whole bunch of attribute data for whatever goes with that attribute. If it was, say, data, that might be the data for your file. It might be a directory entry, and so on and so forth. Below that you might have a non-resident attribute with very similar information, except at the bottom
it's got what they call a data run that says, hey, this is where else on disk to look for this attribute. Then you might have a variable number of different attributes and finally an end-of-attribute marker. After that you've got a whole bunch of slack space up to your 1K boundary.

So that's basically how it works. Okay, so this is the MFT; let's hide some information there and see what we can do. Let's look for places we can hide. Looking at in-band, you've got a whole bunch of reserved space: a whole bunch of bytes in every attribute are reserved. Doing a little bit of research and analysis of live systems, we found that for every file on the system you've got about 32 bytes of reserved space within its attributes, and for every directory about 64 bytes. That's not a whole lot. Maybe it'll work if you want to hide just a small amount of data, but that's not much.

What's more interesting is that you've got a whole bunch of slack space after every MFT entry. An average file or directory on a system has about 450 bytes of attributes stored in the MFT, yet the entry is formatted to be 1K in size. So you have almost 600 bytes for every single file or directory on the system that's there and unused. Seems like a lot of space.

So we want to look at how we can use that. What are the things you have to worry about? Well, first: what happens when people delete files? Fortunately for us, when you delete a file, NTFS marks a bit in the MFT entry that says it's deleted. It doesn't actually delete anything; it just says the entry isn't valid anymore. The entry will actually get wiped if that MFT entry is ever reallocated for a new file, which zeroes everything out, but otherwise it just stays there. So if somebody deletes an MFT entry which you're using, it's still there.
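As a rough illustration of the after-attribute slack space described here, the sketch below walks the attributes of one MFT entry until it reaches the end marker and reports how many bytes remain before the 1K boundary. This is a minimal Python sketch, not FragFS code. The field offsets (the "FILE" signature, the first-attribute offset at byte 0x14, the type and length fields at the start of each attribute, and the 0xFFFFFFFF end marker) follow the publicly reverse-engineered NTFS layout; the record is synthetic and the function names are hypothetical. It also ignores the per-sector fixup bytes a real record carries, so the number should be read as an upper bound.

```python
import struct

END_MARKER = 0xFFFFFFFF  # attribute type value that terminates the attribute list


def attribute_slack(record: bytes) -> int:
    """Count the bytes between the end-of-attributes marker and the end of the record."""
    if record[0:4] != b"FILE":
        raise ValueError("not an MFT record")
    (offset,) = struct.unpack_from("<H", record, 0x14)  # offset of the first attribute
    while offset + 4 <= len(record):
        (attr_type,) = struct.unpack_from("<I", record, offset)
        if attr_type == END_MARKER:
            # Assumption for this sketch: slack starts right after the 4-byte marker.
            return len(record) - (offset + 4)
        if offset + 8 > len(record):
            break
        (attr_len,) = struct.unpack_from("<I", record, offset + 4)
        if attr_len < 8 or offset + attr_len > len(record):
            raise ValueError("corrupt attribute length")
        offset += attr_len  # jump to the next attribute header
    raise ValueError("no end marker found")


def build_demo_record(size: int = 1024) -> bytes:
    """Build a synthetic 1K record with three attributes (hypothetical lengths)."""
    record = bytearray(size)
    record[0:4] = b"FILE"
    struct.pack_into("<H", record, 0x14, 0x38)
    offset = 0x38
    # 0x10 standard information, 0x30 file name, 0x80 data
    for attr_type, attr_len in ((0x10, 0x60), (0x30, 0x68), (0x80, 0x48)):
        struct.pack_into("<II", record, offset, attr_type, attr_len)
        offset += attr_len
    struct.pack_into("<I", record, offset, END_MARKER)
    return bytes(record)


print(attribute_slack(build_demo_record()))
```

A real tool would read records from the $MFT file rather than build them, and would be more conservative about which trailing bytes it treats as free.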
They changed one bit.

The reserved space, though, has a problem: NTFS has changed. There have been five or six different versions if you count the minor versions, so it might change in the future. Some of the currently reserved space might go away, some new stuff might become reserved, and so on. So you have the problem that technology may change and what we're doing now stops working. You also have the problem that all that reserved space is normally zeroed. So if you're a forensic analyst and there's some non-zero reserved space in the MFT, you have a good hint that something's going on there. It's pretty obvious that something's wrong.

Looking at the after-attribute slack space, you have the problem that these are attributes and they might expand and grow. Someone adds more data to a file, someone adds an alternate data stream to a file, and you've suddenly got another attribute taking up space, and all that slack space you thought you had may go away dynamically on a system.
So that's a pretty big concern. On the other hand, we actually have an advantage here: because of how the MFT works dynamically, that space isn't always zeroed. NTFS only changes what it has to on the drive. If it gets rid of an attribute, it doesn't zero out that space; it just marks it as no longer used. If you delete an attribute, the end-of-attribute marker just moves up higher. It won't actually remove the old data; it keeps it and writes it right back to disk, which means you end up with a whole bunch of garbage in the supposed slack space. This is really common in directories. For example, for the first five or six files you add to a directory, all of that information is stored in the MFT. After that point it gets bigger than 1K, so it copies it all out to disk somewhere else and puts in a little non-resident attribute that says, here's where all the directory information is really stored. But the old directory information is still there. So if you're just looking at a raw hex dump of it, you're asking where the attributes end, because it's not clear unless you really understand what's going on.

Well, we've talked a little about these problems. Now let's go into how we can avoid them: what techniques can we use so that these don't become a real problem for us? Basically: let's find some safe MFT entries. Some are safer than others. The idea is that some files rarely get modified or deleted. Operating system files: you install your operating system rarely. You get patches, but a lot of operating system files never change. Font files are a great example. You might add new fonts to your system, but who deletes fonts off their system or modifies the font files?
It's very rare. Old files also tend to be pretty safe. People install the applications they use as soon as they get a machine, and they're not likely to delete them. They might upgrade them, but they don't delete them, so those files tend to stay around.

Another interesting characteristic of NTFS, talking about attributes possibly growing: once an attribute goes non-resident, it never becomes resident again. If you have a big file and you truncate it down to zero size, it doesn't make the data attribute suddenly resident again. It stays non-resident and keeps blocks allocated elsewhere on disk for it. So we know that non-resident attributes themselves won't tend to grow a lot.

Directories also tend not to get deleted, for various reasons. You have to delete all the files first, as a basic one, and even then, a lot of uninstaller programs, if you notice, don't even delete the directories for the programs they uninstall. You have to go back and manually delete them, and a lot of people don't do that. So directory entries tend to stay around, which is quite useful to know.

Basic summary: if you're looking for safe MFT entries to put data in, you want something that has non-resident attributes, that has never been modified, because files that haven't been modified don't tend to be, and that has been around for a long time.

Okay, we've talked a little about NTFS. Now, how much can we actually store? What are we actually talking about? A base Windows XP Professional install has 12,000 MFT entries by default. That's how many files and directories it has. Most systems have at least a hundred thousand; I know Irby's laptop has 230,000 MFT entries in it. Now, which of these are safe? Using our basic metrics, we found about 60% of MFT entries
we'd consider safe. In the case we're looking at, the slack space after the attributes, we're talking about 600 bytes per entry and 100,000 entries, of which 60% we'd consider safe enough to store data in. That gives us 36 megabytes of hidden storage on an average NTFS drive. That's a lot of storage for something that most people can't see.

But there are still a few issues with actually implementing this and getting it to work. The first issue is that you've got 600-byte chunks all over the disk, and that tends not to be very useful. You want a technique where you can map these different chunks across the disk into a contiguous address space, something that you can access like it's flat, like it's a drive itself even, so that all your current ideas of how you access files, write files, and all that sort of stuff still work. You don't have to change your paradigm of thinking about how you access your data in order to use this. You also want that mapping system to be dynamic. We try to be safe and not lose an MFT entry, but what happens if we do? We want a way to replace it, and perhaps to grow if we need more space. As people add more files to their file system, we're able to dynamically change how we're doing things to keep up, and to maintain both our persistence and our ability to react to what goes on. So you need some sort of dynamic mapping.

You also have to consider that no matter where you hide, forensic analysts will be able to find you if they really want to and they have enough time. They can do a string search across the drive and find some data. So you have to think about more than just hiding out of their vision: what happens when they do find you? The big thing there is encryption.
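To make the capacity estimate and the flat address space idea from this part of the talk concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the FragFS implementation: the `FlatMap` class, its method names, and the example entry numbers are all hypothetical. The first function reproduces the arithmetic (600 bytes times 100,000 entries at 60%), and the class translates a logical offset into an (MFT entry, offset within that entry's slack) pair, with a `replace` method standing in for dynamic remapping when an entry is lost.

```python
from bisect import bisect_right
from itertools import accumulate


def estimated_capacity(entries: int, safe_percent: int, slack_per_entry: int) -> int:
    """Bytes of hidden storage: safe entries times usable slack per entry."""
    return entries * safe_percent // 100 * slack_per_entry


class FlatMap:
    """Present scattered slack regions as one contiguous address space."""

    def __init__(self, chunks):
        # chunks: list of (mft_entry_number, usable_bytes), in logical order
        self.chunks = list(chunks)
        self._rebuild()

    def _rebuild(self):
        # ends[i] is the logical offset just past chunk i
        self.ends = list(accumulate(size for _, size in self.chunks))

    @property
    def size(self) -> int:
        return self.ends[-1] if self.ends else 0

    def locate(self, logical_offset: int):
        """Map a logical offset to (mft_entry_number, offset inside its slack)."""
        if not 0 <= logical_offset < self.size:
            raise IndexError("offset outside hidden address space")
        i = bisect_right(self.ends, logical_offset)
        start = self.ends[i - 1] if i else 0
        return self.chunks[i][0], logical_offset - start

    def replace(self, index: int, new_entry: int, new_size: int):
        """Remap one chunk to a different MFT entry, e.g. after losing the old one."""
        self.chunks[index] = (new_entry, new_size)
        self._rebuild()


print(estimated_capacity(100_000, 60, 600))  # 36000000 bytes, i.e. 36 MB
hidden = FlatMap([(5000, 600), (7312, 580), (9120, 610)])
print(hidden.size, hidden.locate(600))
```

Keeping the map as a sorted list of cumulative end offsets makes each lookup a binary search; a caller would still have to copy the data itself when it remaps a chunk.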
Maybe you just want to XOR, and that's good enough for you. We gave a talk earlier this year, and at that time our proof-of-concept code used Blowfish, which was okay. Now we've gone to GCM-AES, which is a data encryption scheme specifically designed for data storage. The advantage here is that it's authenticated encryption: you can tell when an encrypted data block gets changed. That's quite useful. If somebody overwrites your hidden data, you know it got overwritten. You can say, I might not have the data anymore, but at least I know something changed, and you can do that with your encryption if you use the right algorithms.

Unfortunately, as with all encryption, key management is pretty hard. If someone only has a raw hard drive that they pulled out, they might have a hard time getting your key back. If you use some key dispersal algorithms, you can make it really hard for forensic analysts to say, here's the encryption key to decode all the rest of the data. But if they have it on a live running system, they can do runtime analysis and watch you as you encrypt and decrypt, and you're pretty much gone. They can find your encryption key and they'll get your encrypted data if they want. That's the unfortunate thing, and I wish there were better ways to do it, but nobody's really figured out a way that I know of. That doesn't mean we should throw it out; encryption is still quite useful.

You also have the problem of change tracking: what happens when Windows decides to change your files? Fortunately it doesn't change what doesn't need to change, but it does change some things, and we might lose our data. As we talked about, you want to be able to dynamically remap it and so on. Checksums work. We've gone to authenticated encryption, and that works quite well. But for this technique, we found one of the best things to do is just keep extra copies of your data around. If you are
smart about it and you use different MFT entries, it's unlikely that two files on different parts of the system will change at the exact same time and you'll lose both copies of your data. So if you keep two, three, four copies, you can say, hey, I lost one copy, let's make a new copy somewhere else, and keep up with any file system changes.

You also have the problem that NTFS might notice some changes you make to the file system. Fortunately, if you use the slack space at the end of the attributes, it tends not to notice, which is very good. Ultimately you'd really like to watch the file system changing the disk and be able to respond: hey, it's going to overwrite my data, I want to move my data before it does. That's possible, but it's complicated. I won't go into it in this talk, but as a note, if you're really interested, it can be done.

Finally, you want to think a little bit about usability. You've got your space. In this case we're going to hide in the slack space after the attributes in the MFT, but how are we going to use it? The big thing is that none of us want to rewrite all our applications to suddenly use this hidden data area, so you want some way of accessing it that uses standard techniques. You also have to worry about how it's presented to the operating system. Does the operating system recognize it?
Is it something that it sees or doesn't see? The big thing is standard interfaces: your file read, write, open, close, those kinds of things are very useful. Unfortunately, file execution on Windows is very hard. Because of how it does file execution, Windows will only run files out of file systems that it recognizes, so getting it to run an executable out of the hidden data area is actually quite difficult. But if you can convince it that it knows what file system you're running, it works great.

I'll now move on to the final part, which is what FragFS actually does. FragFS is proof-of-concept code showing that this technique actually works. When it starts up and you tell it to format, it will scan the entire MFT on your system and look for what it considers safe entries. Then it figures out how much space is in each entry and puts a little chunk of data there; in this case they are 160-byte chunks. If an entry has 320 bytes, it'll put two chunks in that one entry, and so on. The advantage of this technique is that we keep no index of the blocks we use on disk. I know, for example, that with Slacker and some of the other tools that have come out, the forensic tools detect them by their indexes: they don't actually look for the data, they just look for the index pointing to where the hidden data is. We totally avoid that, because each block has all the information it needs to be identified by itself by our tool, and that's quite useful.

How it works is that we have unlimited redundancy: you can keep as many copies of each block of data as you want, and when you format you tell it how many. It detects when a block has been modified and can respond to that. If you lose one chunk, you don't lose them all, because each one is independent. There's no central database.
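The self-describing chunk idea can be sketched in a few lines of Python. This is a minimal illustration, not FragFS's actual on-disk format: the header layout (file id, block index, copy number, payload length), the HMAC-SHA256 tag standing in for the AES-GCM authentication mentioned earlier, and the names `pack_chunk`, `unpack_chunk`, `store`, and `recover` are all assumptions made for the example. The slack regions are plain in-memory byte arrays rather than real MFT entries, and the payload is not encrypted here.

```python
import hashlib
import hmac
import struct

CHUNK_SIZE = 160                   # chunk size quoted in the talk
HEADER = struct.Struct("<HHBB")    # hypothetical: file id, block index, copy, payload length
TAG_LEN = 16                       # truncated HMAC-SHA256 tag (stand-in for a GCM tag)
PAYLOAD_MAX = CHUNK_SIZE - HEADER.size - TAG_LEN


def pack_chunk(key, file_id, block, copy, payload):
    """Build one 160-byte chunk that can be recognised without any index."""
    if len(payload) > PAYLOAD_MAX:
        raise ValueError("payload too large for one chunk")
    body = HEADER.pack(file_id, block, copy, len(payload)) + payload.ljust(PAYLOAD_MAX, b"\0")
    tag = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    return tag + body


def unpack_chunk(key, raw):
    """Return (file_id, block, copy, payload), or None if raw is not an intact chunk."""
    if len(raw) != CHUNK_SIZE:
        return None
    tag, body = raw[:TAG_LEN], raw[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        return None                # foreign data, or one of our chunks was overwritten
    file_id, block, copy, length = HEADER.unpack_from(body)
    return file_id, block, copy, body[HEADER.size:HEADER.size + length]


def store(key, slack_regions, file_id, data, copies):
    """Write `copies` copies of every block, each copy in a different region.

    Assumes the regions are empty, as they would be right after a format.
    """
    slots = [(r, off) for r, region in enumerate(slack_regions)
             for off in range(0, len(region) - CHUNK_SIZE + 1, CHUNK_SIZE)]
    for block, start in enumerate(range(0, len(data), PAYLOAD_MAX)):
        payload = data[start:start + PAYLOAD_MAX]
        used = set()
        for copy in range(copies):
            slot = next((s for s in slots if s[0] not in used), None)
            if slot is None:
                raise RuntimeError("not enough slack space for the requested redundancy")
            slots.remove(slot)
            used.add(slot[0])
            r, off = slot
            slack_regions[r][off:off + CHUNK_SIZE] = pack_chunk(key, file_id, block, copy, payload)


def recover(key, slack_regions, file_id):
    """Scan every region and rebuild the file from any surviving copy of each block."""
    blocks = {}
    for region in slack_regions:
        for off in range(0, len(region) - CHUNK_SIZE + 1, CHUNK_SIZE):
            parsed = unpack_chunk(key, bytes(region[off:off + CHUNK_SIZE]))
            if parsed and parsed[0] == file_id:
                blocks.setdefault(parsed[1], parsed[3])
    if sorted(blocks) != list(range(len(blocks))):
        raise RuntimeError("every copy of some block was lost")
    return b"".join(blocks[i] for i in sorted(blocks))
```

Because every chunk carries its own tag and header, `recover` needs no index: it scans all the regions, and one surviving copy of each block is enough. The cost is the full scan, which is the trade-off described in the talk.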
With no central index to be overwritten, each chunk can easily be relocated: if one gets overwritten, you can take a surviving copy and put it somewhere else on disk. Unfortunately, doing this does have a disadvantage: you have to scan the entire MFT to find all the chunks of your hidden data before you can actually start modifying stuff, which can be slow. But if you cache that information and keep it in memory, you don't have to keep rescanning.

We have two different versions of the code. One is a user-space library, which you can link programs against. It provides standard interfaces of open, read, and write, and it has a built-in mini file system, so it basically stores files as if they were just normal files, and you can read and write from it. You can link it with any program. If people are interested, the Black Hat website has the demo code from Black Hat Federal; you can download executables of it. We also have demo code for a kernel driver, which is actually much more sophisticated. It works in the kernel and creates a virtual drive on the system that Windows sees as a DOS drive, and you can actually run programs out of it and do everything as if it were a normal drive. Interestingly enough, because of how you can set up those virtual drives, it won't show up in Windows Explorer, but it will show up from the command prompt and in programs. And finally, Irby will give a demo of the user-space code.

All right, guys. Well, now for the more fun part, no offense. What is this code actually doing? The proof of concept is called hammer.exe. I've gone ahead and started it here, and obviously you have a bunch of options. The most important is: what drive do you want to do this on?
Obviously, if there's more than one hard drive, you can do it separately for each one. There's a format command, which is what we run first, and what the format will do is set up our hidden space in the MFT so that we can use it. I've just run the format command here, and it says that with four times redundancy, which means we're keeping four copies of each byte, we have a little over 15 megabytes available. So on my system, a standard 40-gig laptop, that's 60 megabytes of hidden space in the MFT, just slack space that Microsoft decided wasn't important.

So the next thing I'm going to do is store a file. If I do a dir, we have usage.txt. Let's put that into our hidden file system, then we'll take a look at it, and then we'll pull it back out so you guys can see it. So I'm going to run hammer. I'm going to say I want to do it on drive C, and I'm going to store usage.txt as file number one. As Matt said, the user-space library has a mini file system in it, and basically it just stores files as numbers, so you can have file number one, file number two, whatever. It's going to take a little bit of time, hopefully not too long, because what it's doing right now is reading the entire MFT. That's the trade-off of our system: by not having a central index, we have to read the entire MFT, which makes it a little bit slow. But we've stored a file in there now. We can actually list the files that are in the hidden area, and it'll go through the MFT again.
It'll find all the hidden files. It's going to show you file number zero, which is the actual bookkeeping information, and then file number one, which is the file we just hid. So there we go: file number zero and file number one. File number zero is not really useful; well, it is useful for us, but not really otherwise. So now let's pull it back out: retrieve number one and store it under the file name usage2.txt. What it's going to do is pull the file that we just stored back out and put it right back on the regular drive, and we'll do a comparison to make sure those two files are the same. Unless something fails, they will be. So let's compare usage.txt and usage2.txt, and they are the same file. If I do a dir, you see they're the exact same size.

So that's basically how the tool works. There are a few other options: if you want to get rid of yourself off the drive, you can run the cleanse command, and it's like you were never there, which can be useful, obviously.

So I'm going to go a little bit further, and we're going to look at detection. How do you detect this?
Well, current forensic tools don't. They treat the MFT as a black box, basically. They know how to look at all the files on the drive, but they don't know what the MFT is supposed to look like, and they don't really look at the MFT for its own sake, even though it's a pretty big chunk of the drive. So there's a need for the forensic tools to understand the file system structures better. The analysts don't have time to comb through hex dumps, and even if they do, I'll show you what it looks like: it's not really obvious that you're there.

We did develop a detection tool specifically for detecting ourselves. Since we knew how we were hiding, we thought we'd be good guys and also develop a way of detecting it. We consider any data beyond the end-of-attribute marker to be suspicious. So I'll flip back over. Our detection tool is called looker, and I'll just run looker to see what the options are, because I can never remember. We want to do a verbose run, which will print out dumps of interesting information, and I'm going to skip directory entries, because if you remember, Matt said earlier that when directory entries become non-resident, Microsoft doesn't ever clean them out, so you can get false positives there. It's just easier for me to skip them. It would be nice if I actually put the drive letter in there. So it's scanning through, and it's seeing some junk data, and there, if you can see it, I just stopped: there's our usage.txt, actually hiding beyond the end-of-attribute marker of different MFT entries.
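The detection rule stated above, that any data beyond the end-of-attribute marker is suspicious, can be sketched as follows. This is a simplified illustration and not looker's actual code. The record offsets used (the "FILE" signature at offset 0, a 2-byte offset to the first attribute at 0x14, a 4-byte type and 4-byte length at the start of each attribute, and 0xFFFFFFFF as the end marker) are the commonly documented NTFS ones and should be treated as assumptions; `find_hidden_data` and `make_record` are hypothetical names, and `make_record` builds a synthetic record purely for the demo.

```python
import struct

END_MARKER = 0xFFFFFFFF   # end-of-attribute marker
SECTOR = 512              # the last 2 bytes of each sector hold fixup values, not data


def find_hidden_data(record):
    """Return (offset, data) for non-zero bytes past the end marker, else None."""
    if record[:4] != b"FILE":
        return None
    (pos,) = struct.unpack_from("<H", record, 0x14)     # offset of the first attribute
    while pos + 8 <= len(record):
        attr_type, attr_len = struct.unpack_from("<II", record, pos)
        if attr_type == END_MARKER:
            break
        if attr_len == 0:
            return None                                 # malformed record; give up
        pos += attr_len
    else:
        return None                                     # never found the end marker
    # Everything after the 4-byte marker should be empty, apart from fixup bytes.
    hits = [i for i in range(pos + 4, len(record))
            if record[i] != 0 and i % SECTOR < SECTOR - 2]
    if not hits:
        return None
    return hits[0], bytes(record[hits[0]:hits[-1] + 1])


def make_record(hidden=b"", hidden_offset=760, size=1024):
    """Build a simplified synthetic file record for the demo (not a full NTFS record)."""
    rec = bytearray(size)
    rec[0:4] = b"FILE"
    struct.pack_into("<H", rec, 0x14, 0x38)             # first attribute at 0x38
    struct.pack_into("<II", rec, 0x38, 0x10, 0x60)      # one attribute, 0x60 bytes long
    struct.pack_into("<I", rec, 0x38 + 0x60, END_MARKER)
    for end in range(SECTOR, size + 1, SECTOR):
        rec[end - 2:end] = b"\x07\x00"                  # fake fixup values
    rec[hidden_offset:hidden_offset + len(hidden)] = hidden
    return bytes(rec)


print(find_hidden_data(make_record()))                  # clean record
print(find_hidden_data(make_record(b"usage.txt")))      # record with data in the slack
```

On a real volume the slack can also hold stale bytes left behind by earlier, larger attributes, which is the false-positive problem mentioned for directory entries, so a hit from a check like this is a lead for an analyst rather than proof.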
So in entry number 190 we found data at offset 760. That's how the detection tool works. Both of those tools are on the Black Hat website, from Black Hat Federal this past year; feel free to pull them down, take a look, and play with them. No, we didn't get the source code out, because we had some other interested parties.

All right, finally, I just wanted to show you what EnCase finds. EnCase is kind of the de facto forensic tool. I pulled up the MFT, opened up the hex view window, and just started scanning through it. It took about 10 minutes, and I finally got to an MFT entry that had some hidden data in it. At the very beginning you can see "FILE0", which is just the magic number for an MFT entry. Then you'll see the different attributes there at the top, and then there are the four FFs, which are the end-of-attribute marker. There's a little bit of slack space, and then our hidden data. Now, obviously that's encrypted data, and it's somewhat more dense than the data above it, but you could mitigate that if you really wanted to with some kind of statistical analysis, so that you looked the same. We just chose to dump ourselves in there. EnCase doesn't do any kind of flagging or marking; it doesn't show that this is something suspect. It doesn't know what it is; it doesn't know anything about it. So it's pretty well hidden there, unless analysts start doing a bit-by-bit search of the raw drive.

So I've already shown you the detection demonstration. Future considerations: obviously, hiding through obscurity only buys you time. Now that we've discussed this, it's not nearly as useful as it would have been a few months ago. Well, there's still no tool that detects it, but we have other ways to hide. There are many other unexplored data storage areas.
So this was just one place. We saw it and said, "Hey, this is kind of cool; look at all this space we can hide in." But there are plenty of other places to do similar things, and the way we built our tools, we can just retarget them toward whatever space we want to hide in. You still have the problem of hiding the access tools: the hammer and looker executables were residing in the normal file system. We've been looking at ways to bootstrap ourselves out of the hidden space, so that we never have anything that's not in the hidden space. That's a pretty hard problem. We've got some leads on how to do it, but we haven't explored it fully yet.

And then I just want to close with kind of a side thought: should file system standards be open, or should at least the specs be published? If they were, the forensic tools could do a better job of understanding what good data looks like. There's also the counterargument: would it be easier to exploit file systems if somebody could look at the source code and find a little flaw? My general feeling is that if it's open, it's just better for everybody. The forensic tools know exactly what good data looks like, and anything that doesn't fit that is automatically suspect. A hacker only has to find one way to hide, whereas somebody trying to play defense has to think of every possible way a hacker would try to hide or do whatever they want to do. So I'm all for opening the spec, or at least publishing what it's supposed to look like.

So I hope you guys have some questions. Feel free to stick around or give us questions.
Go ahead. Guys, if you do have questions, either go to the mic or we can talk offline. But your question was...

Well, actually it is part of our algorithm; that's how we come up with the 60-megabyte number. The question was about the figure I gave, the roughly 40% for storing stuff in, and whether that can be algorithmically determined. And actually that number came from an algorithm. I didn't say it, but basically: the file has never been modified, and it's more than a year old. Those are the two key characteristics, and if those things are true, it's unlikely to be modified anytime soon.

Does it work in VMware? Yes. In fact, that's where I did the initial testing.

How does EFS affect it? EFS, the Encrypting File System for NT, actually works by encrypting individual attributes, such as the data attribute. So in fact it makes no difference whatsoever, because we ignore the normal attributes and store ourselves after them. EFS doesn't do whole-drive encryption; it only does per-data-stream encryption. So it does, like, the ACL instead of the MFT? Yeah. Okay, so therefore we don't interfere with it. So the MFT is outside of EFS? Correct.

Are you aware of anything that is actually using some of these techniques that you're talking about? None that I can talk about publicly.

At Techno Security there was quite a bit of discussion about how everything's going to EFS, and how the current standard of imaging an entire drive is probably going to go away, which means there are just going to be specialized tools to find certain types of data in certain ways. Is that going to make it easier, then, to find stuff that is hidden in slack space, etc.?
Well, yes and no. In some senses yes and in some no, because ultimately you have to have a file system. Even if you have whole-drive encryption underneath the file system, there's still the file system on top, and even if your data is getting encrypted by the encryption layer, you can still store yourself in the file system where people aren't looking now. So it will change somewhat, but the fundamental concepts stay the same.
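The answer to the first question above names two criteria for a safe MFT entry: the file has never been modified, and it is more than a year old. A minimal sketch of that test follows. The function names, the use of created and modified timestamps, and the way "never modified" is checked are assumptions for illustration, not FragFS's actual code.

```python
from datetime import datetime, timedelta


def is_safe_entry(created, modified, now, min_age=timedelta(days=365)):
    """Apply the two criteria from the talk: never modified, and more than a year old."""
    never_modified = modified <= created     # assumption: how "never modified" is tested
    old_enough = (now - created) > min_age
    return never_modified and old_enough


def stable_fraction(entries, now):
    """Fraction of (created, modified) timestamp pairs that pass the heuristic."""
    if not entries:
        return 0.0
    return sum(is_safe_entry(c, m, now) for c, m in entries) / len(entries)
```

Running `stable_fraction` over the timestamps of every entry on a volume would give the kind of percentage discussed in the question, under these assumptions.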