All right everybody, we're kicking off. Thanks for being here at Crypto and Privacy Village. We've got some amazing speakers here, and I saw a video at Black Hat of how they were introducing their keynote and stuff. Is anybody interested in a Black Hat style intro for our next speaker? Oh yeah, okay, we're on this. All right. Un-sa, un-sa, un-sa, un-sa. People over! Ladies and gentlemen, welcome to the Crypto and Privacy Village. Thank you for being here. Our next amazing topic, un-sa, un-sa. Lasers, lasers, lasers, lasers. Is Cryptanalysis in the Time of Ransomware. Un-sa, un-sa, un-sa. Lasers, lasers, lasers. Laser fingers, everybody! And now it is my great pleasure to introduce the amazing Mark Mager! Un-sa, un-sa, un-sa, un-sa, un-sa.
Thank you very much. That was great. I will never have another intro quite like that. That was really, really good. All right, so a little bit about me. My name is Mark Mager. I'm a senior malware researcher at Endgame. I do reverse engineering and software development. Please do note I am not a cryptographer, so if I get up here and start babbling about something that's very, very wrong about crypto, please point it out, or take me off to the side and embarrass me in private. You can follow me on Twitter. My Twitter handle is magerbomb, just like Jägerbomb.
So let's go over the agenda really quick. We're going to discuss, very briefly, typical ransomware execution flow. Then I'm going to discuss a high-level methodology for working through cryptanalysis of encryption schemes within ransomware. Then I'll do walkthroughs of four unique pieces of ransomware from four different families. I'll briefly discuss current research that's out in the field. Then I'll wrap things up with a conclusion and hopefully have time for questions.
So in a typical ransomware variant, you're going to see a few different things. Typically a payload is going to be written to disk and executed.
There's going to be some sort of key generation or retrieval routine. Optionally there's a key exchange with the C2, but that's more in asymmetric encryption implementations, so we don't always see that. Sometimes, with symmetric encryption, we're going to have hard-coded keys or some sort of key generation that's done on the fly. Then, in terms of actually encrypting the files, there has to be enumeration and directory traversal of the contents on the disk. As the ransomware goes through the file system it individually encrypts the files, and sometimes it leaves very nicely written ransom notes in each directory just to remind you that you're completely hosed and all your files are gone.
So in order to do cryptanalysis on ransomware, you have to attack it like a typical malware reverse engineering or malware analysis scenario. Typically in that sort of scenario you want to start off with a dynamic analysis approach. You're basically just detonating the ransomware in a virtualized environment, something that's sandboxed and isn't going to reach out and touch any sensitive files that you have or compromise your actual host. And the goal here is to just passively analyze what the ransomware is doing to your environment. You want to observe any network communications that are inbound or outbound. You want to look for any forensic artifacts that are going to be left on disk: registry keys being modified, files being dropped, any sort of effect on event logs. And then finally you'll want to actually look at the results of the encryption, so you want to analyze the encrypted files. Now, when I say we want to analyze the files, you might say, okay, if these are encrypted files, you're going to look in there and it's crypto, it doesn't make any sense. What we're looking for are commonalities across the board between encrypted files of multiple different file types.
So you want to look for magic byte sequences or watermarks that are either in the header or the footer of the files. And you also want to try to determine whether the ransomware is encrypting the entire contents of the file or just doing some sort of partial encryption. Because if their goal is to encrypt all files on your system that meet certain file extension restrictions, that's definitely going to be a lot of throughput, and in order to make it easier for themselves, sometimes the ransomware authors will only encrypt maybe the first 2 kilobytes, 2 megabytes, something along those lines.
And when you're doing dynamic analysis it's definitely okay to repeat your test multiple times. You have your virtualized environment; you can do reverts as much as you need. So you're going to want to adjust your environment and control variables as needed in order to gain a better understanding of the ransomware. When we've initially deployed within this virtualized environment and we detonate the ransomware, we're essentially doing a known-plaintext attack, because just by launching the ransomware we'll see the ciphertexts that are produced. When we revert back to the clean state from before we launched the ransomware, we have our plaintext back. So knowing that, we can compare the original plaintext to the ciphertext that we've generated. Now, in more of a chosen-plaintext scenario, we're going to feed specifically crafted files to the ransomware in order to try to get it to leak a little bit more of maybe its keystream or key data, something that will give us a better idea of what's going on under the hood without having to fully reverse engineer the ransomware. So after initial dynamic analysis, once we've learned a little bit about the ransomware, we'll want to do more static reverse engineering.
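The known-plaintext comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the talk: the function names are made up for this example, and it assumes the ransomware encrypts in place without prepending a header, so that plaintext and ciphertext offsets line up.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypted_extent(plaintext: bytes, ciphertext: bytes) -> int:
    """Return the offset one past the last differing byte (0 if identical).

    This is only an estimate of how much of the file was touched: a
    ciphertext byte can coincide with its plaintext byte by chance.
    """
    last_diff = -1
    for i in range(min(len(plaintext), len(ciphertext))):
        if plaintext[i] != ciphertext[i]:
            last_diff = i
    return last_diff + 1


# Stand-in for a sample that only transforms the first 2048 bytes
# (hypothetical single-byte XOR, purely for demonstration).
original = bytes(range(256)) * 16            # 4096-byte clean file from the snapshot
encrypted = xor_bytes(original[:2048], b"\x5a" * 2048) + original[2048:]

print(encrypted_extent(original, encrypted))  # how far the encryption reaches
```

Running the same comparison over several file types and sizes gives a quick read on whether the sample encrypts whole files or only a fixed-size prefix.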
And so the goal of this is, one, to identify any crypto algorithms that are going to be used, whether it's something standard like AES, DES, whatever, or something more custom like a simple XOR. Once we find the code section that contains what appears to be the crypto algorithm, we can then look into identifying implementation mistakes that are going to weaken the overall crypto and potentially provide us an avenue that we can attack, to either leak the key or give us a decryption capability. And then the second part of the reverse engineering that we'll want to look into is how the keys are generated, stored, and transmitted. In that sort of scenario we're just trying to see if the key is going to be lying either in memory, hard-coded within the binary, or transmitted in the clear where we can obtain it and replay it. So going through dynamic analysis and reverse engineering, as we've noted everything about the ransomware, we're going to apply those lessons and eventually move on to development of our decryptor.
So for these walkthroughs, what we're going to focus on is ransomware encryption schemes that have already been defeated and have been publicized. I didn't want to spend time working on any brand new ransomware that's out there and publicizing exactly what's wrong with it. Actually, when I originally submitted this talk to the Crypto Village, I had the idea that I was going to get a lot more background on what some of these ransomware variants were doing, how their encryption was implemented, and how it was defeated. But in a lot of those descriptions it was really described in an abstract manner, and I think there's a couple of reasons for that. I think that certain AV companies or security companies are just trying to get you to download their software in order to do the decryption. It's free, whatever.
But on the other side it's more of an OPSEC concern, where they don't want to point out exactly what the ransomware authors screwed up in their crypto implementations. So they're just trying to say: we have that capability, we can decrypt it for you, but we're not going to point out exactly what's going on. So going through that research phase for this talk, I then turned around and said, okay, I know that these encryption schemes are defeatable, so how can I actually build more of a how-to or walkthrough guide for doing this? So that's more of the focus of the talk now. As I mentioned, these are older variants no longer in circulation. These are all mostly from 2015 and 2016. We're going to be doing some high-level reverse engineering, and we're going to focus on detailing the crypto implementations and noting any differences between the ransom note and reality, because as we'll see, the ransom notes are definitely a little detached from reality sometimes, and they're really just trying to scare people into paying the ransom by using a lot of technical-sounding jargon. And we'll focus on devising proofs of concept for carrying out our decryption.
Okay, so we'll start off with PowerWare. PowerWare is a PowerShell-based ransomware, one of the first if not the first ransomware family that's based in PowerShell. Here's a ransom note that we can see here, and this says that it's not only using RSA-2048 but it's also using AES-128. So it's doubly encrypting your files, which sounds pretty bad, and they even provide links to Wikipedia because they're so nice. And of course they provide instructions for accessing Tor in order to get to their site so you can pay the ransom. So they're very nice guys. Now, the advantage of PowerShell is that, as a scripting language, we get access directly to the source code that's deployed with PowerWare.
But the flip side of that is that of course they can run it through an obfuscator. This might be pretty hard to see from this mile-high view, but this is all of the source code for this variant of PowerWare. The variable names are kind of screwed up. They're not very easy to read. It's a mix of alphanumeric characters, upper and lowercase alphabet characters. So one thing we'll end up doing is de-obfuscating it a little bit just to help us make sense of it. But as you can follow with the arrows and the red text here, it can be chopped up into a few different pieces. There's the initial preamble that includes the crypto setup. There's the actual file and directory enumeration where we're looking for specific file types, as you might be able to see, like docx, xls. It's essentially walking through the file system and trying to match the file path with one of the extensions in that list, and if it matches, it runs the file through the encryption routine and writes out the file. And at the bottom there's a file cleanup routine as well that's going to delete the script afterwards.
So looking more at the crypto setup portion, as I said before, the variable names are a mix of lowercase and numeric characters. Going through this we can actually provide a little more context in terms of what those variables mean. If we just walk through and relabel things it'll help provide a clearer picture, and things look a lot better after a little bit of de-obfuscation. We can see how certain variables are being used here. So what can we find out here? Well, we're utilizing symmetric encryption, using the RijndaelManaged class, which is basically AES, with a 256-bit or 32-byte key. We have an initialization vector that we're specifying. We're padding with zeros and we're using cipher block chaining.
Now if we move on to the portion of the code that's contained within the file and directory enumeration, one thing that sticks out right away is the use of this integer value 2048, or 2 kilobytes. What that's basically saying is that only the first 2 kilobytes of the file are going to be read in and encrypted. And any files that are less than 2048 bytes it's just going to completely skip. And looking at the code section directly below that, we're not making any further modifications to the crypto object that we'd previously initialized and configured. So just something to keep in mind.
Okay, so now going back to the crypto setup, do we see anything in here that's bad with their implementation? There's a few things here. They're utilizing symmetric encryption. They're using basically AES. In itself that's not bad, as long as they're doing key management right and handling the IVs right. But as we can see, yes, it's utilizing a 256-bit key, but it's completely hard-coded. They're not doing any sort of unique key generation. It's going to use the same key for every single file encryption that this script runs, no matter what system it's on, no matter what time of day, anything. Same key everywhere. And the same thing for the initialization vector. They utilize the same value everywhere, all the time. So with a hard-coded key and a hard-coded IV, they're utilizing CBC mode. And those four things combined, that's not good. That's basically a very, very huge vulnerability. And the CWE, if you're familiar with the CVE database, actually provides a little bit of background to prove that I'm not just making things up here: non-random IVs and hard-coded crypto keys are bad news.
Okay. So taking what we know, let's try to build our own decryptor. I simplified things a little bit here in this version of the PowerShell script. I just cleaned up the file enumeration to target a specific text file that's going to be on my desktop.
And so I'll click over here. This is a whole bunch of times so nothing too sensitive I suppose. And I'll run my modified version of the PowerShell script. And it looks like that data's been encrypted. I have no idea what it says now. So we're looking through the code here. Knowing all the characteristics we saw earlier, how can we decrypt this? Wait, could it really be that easy, though? We change two characters in the PowerShell script and we're going to decrypt everything? Yes, it is. I wish it were that easy every single time, but you know.
Okay, so moving on to Nemucod. Here's our ransom note. And here they're touting that they're using the strong RSA-1024 algorithm with a unique key. Okay, all right, we'll see. We jumped ahead a little bit in this case. What you're seeing here is a pseudocode representation of what is essentially the enumeration and the encryption function. And for this specific binary this wasn't going to take a big level of reverse engineering. This was literally the first function that's called by the binary. So I popped it into IDA Pro, went to the first function that's called, and then used Hex-Rays to decompile it, so I don't have to show a bunch of ugly-looking disassembly up there, just to give you a high-level view of what's going on here. As we can see, it's only going to encrypt the first 2048 bytes again. So a two-kilobyte file size limit, or at least buffer limitation, for the encryption. Then it's XORing by a static key, which I've highlighted there. So there's no sign of using anything even close to RSA-1024, AES, whatever. So yeah, that is their encryption scheme.
So let's go over what we know. We're not using asymmetric crypto. There is no RSA code in here. It is utilizing XOR. And there is no unique key generation. It's utilizing a hard-coded 255-byte key. It's the same for every file. And it's only going to cover the first 2048 bytes.
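The properties just listed (XOR, one static 255-byte key, only the first 2048 bytes) can be sketched in Python as follows. This is an illustration under assumptions, not the Nemucod code: the key below is a placeholder rather than the real key, and the sketch assumes the key simply repeats across the 2048-byte buffer, which the talk does not spell out.

```python
HEAD_SIZE = 2048                  # only the first 2 KB is transformed
KEY = bytes(range(1, 256))        # placeholder 255-byte key, NOT the real one


def xor_head(data: bytes, key: bytes = KEY, head_size: int = HEAD_SIZE) -> bytes:
    """XOR the first head_size bytes with a repeating key; leave the rest alone."""
    head = bytes(b ^ key[i % len(key)] for i, b in enumerate(data[:head_size]))
    return head + data[head_size:]


plaintext = b"very sensitive data " * 200     # 4000 bytes of test content
ciphertext = xor_head(plaintext)              # first pass "encrypts"
recovered = xor_head(ciphertext)              # second pass undoes the first

print(recovered == plaintext)
```

Because XOR with a fixed value is its own inverse, the same routine serves as both encryptor and decryptor, so a scheme like this offers no protection once the key or the binary is in hand.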
One more thing we know: this is a simple encryptor binary. Okay, so knowing all that, what can we do to decrypt our files? Here's my very sensitive data again. And so we run through, hold on, I'll pause it real quick. The way the encryptor binary works is you just have to pass it, on the command line, the path to the specific file that you want to encrypt. So knowing that, we pass in the path to our test.txt. Looks like it goes through and encrypts the first two kilobytes. And then, hey, well, maybe I'll just try running this again. Let's see what happens. And our file's back. The plaintext is back. That was pretty miraculous. So hopefully I'm not blowing any of your guys' minds, but if you XOR plaintext with the same value twice, you're going to end up with the plaintext.
Okay, so moving on to TorrentLocker. While the first two cases were pretty easy, the last two are a little more complicated, and there's a little more work that goes into developing decryptors, so bear with me here. Here's another nicely worded ransom note. This one actually takes advantage of the Windows API to draw a nice window for us instead of just a plain text file. It claims to be using AES-256. And then they're also saying the encryption key is encrypted on top of that with RSA-2048. Now, jumping ahead a little bit with this example: if we go in through a debugger, we're using OllyDbg in this instance, and we're just stepping through and looking around the encryption function, well, not even the encryption function itself. We literally just went through and found calls to WriteFile, the Windows API, and then we backtracked a little bit to see if there was any sort of crypto setup before that.
And what we see are hard-coded strings that seem to reference a library named LibTomCrypt, and specific file paths for C files that seem to be named ctr_encrypt.c and aes.c. So looking at that, okay: AES counter mode. To do a little bit of a sanity check, we can decompile the function that contains the reference to ctr_encrypt, so we can see that all nicely written out there, and since LibTomCrypt is an open source library, we can go through and compare that to the exact source code that's available out there. You can see some similarities there, but mostly we're just checking whether we're within the same ballpark, generally, as what we're seeing. We don't really need to get too deep into that; we don't really need to do source code analysis here. So from very limited reverse engineering we know that it appears to be using AES in counter mode and that their implementation is based off of LibTomCrypt. We don't know if they've actually modified anything, but ransomware authors tend to be pretty lazy, so they probably just copied it.
So could this potentially be vulnerable? As with a lot of other symmetric encryption algorithms, yes, it can be subject to implementation flaws. I provided a little bit of an excerpt from a very nicely written paper about counter mode security, basically ways to properly implement AES counter mode along with ways not to, and this gives us a little bit of a hint: if they're reusing the key, then that'll give us a window into what's going on. So we'll keep that in mind and we'll try to prove that. Okay, so how can we test for a flawed AES counter mode implementation? Well, all right, here's a little bit of math, so hopefully this goes over okay. We're going to want two plaintext files.
We're going to want one large plaintext file, A, that's going to consist solely of null bytes. Our goal here is that if there's a file size limitation, we want to exhaust the keystream, and we want to find that out. As we saw with the two previous examples, there were file size limitations, so we're going to take a stab at that. B, our second plaintext, is going to be a non-null plaintext, just something of arbitrary size that's smaller than A, so that we can hopefully use the keystream that we derive to decrypt its ciphertext and generate the plaintext. So what we're going to do is have the ransomware encrypt file A, resulting in A prime, the ciphertext. If we take that result, A prime, and XOR it with A, that should yield the keystream. And since A in this case is all null bytes, and XORing anything by null or zero leaves it unchanged, what we're supposing is that A prime itself will be our keystream. Then we go through and encrypt the second file B, resulting in B prime, and we take that and XOR it by the keystream that we got from A, and that should result in our plaintext for B. So that's what we're proposing. Let's see if this actually holds true.
So as you can see, A.txt is just null bytes; not going to go through there. Let me pause this real quick. I wrote up this quick and dirty little Python script to basically just XOR A prime with B prime. All we're trying to do is see if that's going to yield the plaintext for B. And B is actually the decryptor itself. Then in a separate tab you'll see that I saved a separate copy of the decryptor. It's called decryptor.exe. It's really just a Python script, but I changed the file extension so that when I launch the ransomware it's not going to overwrite that file too, so I can preserve that copy and go from there. So going back a little bit, there's our B.py again.
Nothing looks too crazy, just XOR, and there's our fake executable. So we'll go through and launch locker.exe, and we'll see pretty quickly that the files get encrypted and it dropped the ransom note that we already saw earlier. So we'll pop these two files into a hex editor, and what do we see? This is the ciphertext that was just generated from A. We scroll down, and at a certain point all we see are null bytes, so either they just start XORing by null, or we actually have a limitation to the encryption. Now if we go over to the ciphertext for B, we can see that it looks like everything in there has been encrypted, so nothing too crazy. So now we go back and we take our decryptor.exe and we rename it to decryptor.py. So we've successfully evaded the ransomware. Takes a while for the rename for some reason. Then we launch our decryptor, and everything looks fine, right? We have pretty much every line, but what we can see at the bottom is that instead of saying print output, it says print outp and then there's two characters that look garbled. So what do we think happened? We revisit the ciphertext and we actually see that the last two characters, the U and T of output, are still there in the clear within the ciphertext.
So what do we think happened? Well, what they probably did in this case, and this is without reverse engineering the encryption algorithm or anything, actually, that view helps a little better. What we're supposing is that the encryption operates on specific block sizes, and if they're using a block size that's like four kilobytes or eight bytes, then there's going to be an extraneous chunk left over that's not going to be part of the last block that's encrypted. So it's just going to be left unencrypted within the ciphertext that's generated. So that's an edge case that you'd have to take into account with a lot of the decryptors that you're writing.
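The keystream recovery and the trailing-bytes edge case described above can be sketched together in Python. This is a minimal model under stated assumptions, not the speaker's script and not TorrentLocker's code: `toy_encrypt` is a hypothetical stand-in that reuses one keystream for every file and only encrypts whole blocks, and the 16-byte block size is a guess chosen for illustration.

```python
BLOCK = 16  # assumed block size; the real value would come from testing


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(data: bytes, block_size: int = BLOCK) -> bytes:
    """Stand-in for the flawed sample: same keystream for every file,
    whole blocks only, trailing partial block left in the clear."""
    whole = len(data) - len(data) % block_size
    keystream = bytes((i * 7 + 13) % 256 for i in range(whole))  # fake keystream
    return xor_bytes(data[:whole], keystream) + data[whole:]


def recover_keystream(a_plain: bytes, a_cipher: bytes) -> bytes:
    """A' XOR A; when A is all null bytes this is simply A'."""
    return xor_bytes(a_cipher, a_plain)


def decrypt_with_keystream(cipher: bytes, keystream: bytes,
                           block_size: int = BLOCK) -> bytes:
    """XOR only the whole blocks; pass the unencrypted tail through as-is."""
    whole = len(cipher) - len(cipher) % block_size
    if whole > len(keystream):
        raise ValueError("recovered keystream is shorter than the ciphertext")
    return xor_bytes(cipher[:whole], keystream) + cipher[whole:]


a_plain = bytes(64)                           # file A: all null bytes
keystream = recover_keystream(a_plain, toy_encrypt(a_plain))

b_plain = b"x = 40 + 2\nprint(x)"             # file B: 19 bytes, 3 past a block
b_cipher = toy_encrypt(b_plain)

naive = xor_bytes(b_cipher, keystream)        # garbles the last 3 bytes
careful = decrypt_with_keystream(b_cipher, keystream)
print(naive == b_plain, careful == b_plain)
```

The naive XOR reproduces the symptom from the demo, a correct file with a garbled tail, while the block-aware version leaves the trailing bytes untouched.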
You want to be able to test a lot of different file sizes. If you write one decryptor and it works for this one test case, that's great, but you want to test it across multiple file sizes and multiple file types in order to cover all your edge cases, so that if you write something and release it to the public, you can make sure it's actually going to reach the audience that you want it to. Okay.
All right, so moving on, our last example is Apocalypse. This one sounds pretty cryptic. They're saying all your files are encrypted. Everything. The world's coming to an end. And just by doing some dynamic analysis, we launched the ransomware and pulled multiple ciphertexts in order to compare them a little bit. What we can see right here is that we have what appears to be a watermark, a magic byte sequence that consists of the first four bytes there. So that's something to go off of for us. Now if we use the same plaintext consisting solely of null bytes that we used for our previous example, let's see if this actually generates a little bit more for us. And actually, looking through the file, just through these first few hundred bytes, we can see that there does appear to be some sort of repetition in the ciphertext. It still doesn't exactly reveal everything to us, but it's something to go off of. It's something to note.
So what do we know so far? We have what appears to be a magic byte sequence. Repetition in the ciphertext when we provide a null plaintext. The ransom note, if we actually go back to that, doesn't seem to mention an encryption type. That's kind of odd, because all the other ones we've seen so far mention some sort of industrial-grade RSA-4096 and AES-256, whatever. So it's kind of curious that they don't mention that they're using some sort of high-grade encryption. So let's proceed with reverse engineering from here. For this case we're actually going to utilize IDA Pro.
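The two dynamic-analysis observations noted above, a shared magic byte sequence and repetition in the ciphertext of a null plaintext, can be checked mechanically. The Python sketch below is illustrative only: the helper names are invented for this example, the sample bytes are made up rather than taken from Apocalypse, and real ciphertext may repeat only approximately.

```python
def common_prefix(blobs: list) -> bytes:
    """Longest byte prefix shared by every ciphertext (candidate watermark)."""
    if not blobs:
        return b""
    shortest = min(blobs, key=len)
    for i in range(len(shortest)):
        if any(blob[i] != shortest[i] for blob in blobs):
            return shortest[:i]
    return shortest


def shortest_period(data: bytes) -> int:
    """Smallest p such that data[i] == data[i - p] for all i >= p.

    Returns len(data) when no shorter repetition exists. Quadratic time,
    which is fine for the first few hundred bytes of a file.
    """
    for p in range(1, len(data)):
        if all(data[i] == data[i - p] for i in range(p, len(data))):
            return p
    return len(data)


# Made-up ciphertexts sharing a hypothetical 4-byte marker.
samples = [b"MAGC\x10\x20\x30", b"MAGC\x99\x20\x30", b"MAGC\x42"]
print(common_prefix(samples))

# Made-up ciphertext body for an all-null plaintext.
null_body = b"\x01\x02\x03\x04\x05" * 20
print(shortest_period(null_body))
```

A short period in the null-plaintext ciphertext hints at a short repeating key or keystream, which is a useful thing to know before opening the disassembler.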
And one thing that you can do right away, and this is definitely a little bit more of a cheat and it's not always going to work, is a simple string search against the binary, literally for instructions in the x86 instruction set. So you can do a text string search for XOR, and that'll find the XOR operations. If we look through the table of results on the right, we can see that multiple XORs are just clearing out certain registers, so nothing big there, but the two results at the top are the ones that we're going to be interested in, and they both seem to be included in the same function. So it's a pretty easy way to look into a binary. It's not always going to work, but it's a good place to start and it's worth a shot.
So if we actually get into the disassembly of the function that contains those XORs, we can see that those two XORs are included within that block towards the top, and as denoted by the green line that's going back and forth, those two XORs are going to be looped over. So that sounds like it's probably going to be our crypto algorithm. Moving on from there, there's two WriteFile calls. The first WriteFile includes the ASCII representation of the previously identified magic byte sequence. The second WriteFile appears to write out the transformed buffer that's affected by the XORs that we saw earlier. And here's a decompiled view of that same code section. We extrapolated out to cover more code within that function, and this gives us a little bit more of a sanity check to determine, okay, this really does look like we're reading in file contents and just transforming them, and there's one specific line within the decompilation that we're going to focus on.
So as we can see on the right, I've narrowed down to just the nuts and bolts of the encryption. In this case it's using a hard-coded key, which we've identified there, and all it really looks like it's doing is utilizing those two XORs. There's a little bit more math involved here, but it's just looping over that until it's exhausted all the contents within the buffer. Now if we look on the left, there's a Python script that I wrote to essentially mimic everything that we see here. There's nothing too special about it. I first read in the first four bytes and save that off as the header, and we could do a little bit more of a sanity check to make sure it matches up with the previously identified magic byte sequence, in order to confirm that we're working with an encrypted file, but as I said earlier, this is a proof of concept, so it doesn't have to work for every single file. Then from there we're just individually reading in bytes and transforming them according to the set of operations that we specified there, and we're writing the result out to a separate file.
Okay, so let's test out this script. We start out with our script on the left. On the right I just have the output file already created, and it's just set to null data, to show you that there's nothing in there already. The readme.txt, the original plaintext that we're using, is just a readme file for OllyDbg. That was our ciphertext to the left, and we run it through our script, and it looks like everything is good, so we've generated our plaintext again.
So, research in the field. Despite the overall proliferation of ransomware within the last few years, researchers have managed to keep pace, and it's pretty impressive considering how much money is going around and is involved in ransomware. I think there have been estimates that in 2016 over a billion dollars was successfully solicited from victims of ransomware.
So new variants and new families are popping up day by day, but there are researchers out there who are really working hard on analyzing all these samples and, more importantly, developing and releasing decryptors to the public for free so that victims can get access back to their data. The BleepingComputer forums are a really good source to start with if you're trying to research decryptors. BloodDolly is one of the very active members there. MalwareHunterTeam, MalwareTech, Demonslay335, those guys are all very intricately involved in developing decryptors and reverse engineering ransomware. MalwareTech is pretty famous for his WannaCry kill switch that saved people quite a bit of time and pain. There's definitely several others, and there's a lot of companies also involved in developing and releasing these decryptors to the public. It's something that definitely should be noted because, as I said, last year over a billion dollars was collected from people, but people are releasing these decryptors for free to the public, and it really is a nice service that they're providing, not only to the community but to users of the internet in general.
So to wrap things up: crypto implementations are very prevalent in ransomware. Crypto is hard. Ransom notes are definitely not the most trustworthy source for crypto implementation technical specs, so don't believe the hype. If you see a really bad ransom note, it's not always going to be truthful. And as with normal malware analysis, the reverse engineering and cryptanalysis of ransomware and decryptor development are not linear processes. It's going to be a lot of trial and error. You'll want to utilize chosen-plaintext attacks when you can in order to try to leak out more information about the encryption algorithm without having to fully reverse engineer it. And you'll want to focus on sections within the code, whether it's the disassembly or running through a debugger, whatever, where modifications are going to occur. Then from there you can use that as a base to dig deeper for more clues in order to get a better idea of what's going on with the encryption. And initially focus on building out a proof of concept. Do a lot of stress testing so you can cover all the edge cases, like we saw with TorrentLocker. So you stress test, you harden, you cover all the edge cases, and hopefully you have yourself a fully working decryptor that's going to work across the board. And that's all I've got to say. Thank you. Time for questions.
Wait, the mic isn't working. Thanks for taking my question. So after analyzing all this crypto ransomware and sort of clustering together, I guess, the mistakes that all these things make, can you associate them with a group, like skiddies, or professional criminals, or software engineers? Like, who is writing this shit?
So that kind of varies from case to case, because there's definitely a lot of cases where you can tell by reverse engineering the code that it really doesn't reach the level of complexity that's typically seen in more of an APT sort of scenario. What we're seeing a lot lately is the rise of ransomware as a service and ransomware kits that are developed by more seasoned developers, and they offer them up to budding cyber criminals who want to launch their own ransomware attacks and get some of that sweet bitcoin. So I see that as more of a cottage industry that's going to develop a little bit more. But mostly, for the cyber criminals that are developing their own ransomware, I would say right now the complexity isn't where you would expect it to be given the profits that have been generated so far.
I do have plenty of examples, but then I wouldn't be able to decrypt them, so I wouldn't think it's good. But a lot of the ransomware that does the encryption right is utilizing asymmetric encryption, so they're doing some sort of key exchange with a C2, and they have a little bit more network infrastructure involved. As long as they're doing the key exchange correctly, that seems to cover most of their bases, and in that case you're kind of left with just hoping for some sort of side channel attack, or some artifacts being left on disk, for finding and recovering a key.
Any other questions? All right, thanks for your time.