Hello, I am Bitweasel, the author of the CryptoHaze password cracking suite, and I'm here to talk about cloud cracking today, which is something we've heard a lot about in the past and haven't actually seen much of. Just a little background on myself: I've been working with video cards and parallel computing since the NVIDIA 8800 GTX came out back in the CUDA 1.0 days. Things have advanced a lot since then, but I've been working on this as a project for really the past four and a half years. Back when I started, the Pentium 4 was still the hot processor: two threads in a system, and occasionally, if you were lucky, you could get access to a Xeon box with a few more threads. John the Ripper was the password cracking tool of choice, and it worked incredibly well on Pentium 4s and other CPU-based systems. Sony had just come out with the PlayStation 3 running the Cell processor, which was also a processor of interest for high-performance compute, and back when I started, you could actually still install Linux on a PlayStation 3. We've not necessarily improved since then. However, while the PlayStation 3 has remained pretty much the same since I started, we are three full generations of video cards beyond the 8800. I'm going to use some icons in this presentation, and I want to go through them because they're not labeled elsewhere. The first icon you'll see frequently is target hashes: the hashes that you've acquired from some source and wish to find the password for. The next is the cloud, which I am vaguely describing as somebody else's computers; they're machines that you can't necessarily touch. And I'm also going to be using the cluster, which is sort of the opposite of the cloud, in that they're machines you can touch. All of these are very vague terms because, well, they're not really well defined, and in most cases the cloud means whatever will get you funding. So let's take a look at the state of cloud cracking.
It's been in the press for the last two or three years, so let's see where things are. We have Amazon EC2 GPU instances, which are, depending on who you believe, either the end of passwords as we know it, because anybody can afford to crack lots of passwords, or $2.10 an hour for two relatively ancient NVIDIA GPUs that don't perform well. Last year at Black Hat we heard Thomas Roth talk about his cloud cracking suite, and I believe he's talked about it again since then, but I've not seen it; let me know if you have. Moxie, who talked earlier today, has his WPA cracker cloud service, which is quite successful, but it involves you sending your target hashes off to his systems, which may or may not be acceptable. And finally, a lot of people end up just rolling their own: if you're doing distributed password cracking on more than one machine, most people end up doing it themselves. One of the big concerns with any sort of cloud-based cracking is disclosure of hashes. Some people feel this matters and some people feel it doesn't, but I define non-disclosure of hashes as the hashes not leaving systems under your control, for whatever variety of "your control" it needs to be. That can be systems entirely on your premises, or it can be systems you have root on. Again, it's very vague, but the alternative is throwing your hashes out into the cloud, or really, more realistically, onto the internet. Some people have argued that, yes, if you're a pen tester you have to worry about this because of NDAs and professionalism, but if you've just popped a website, well, you can spray the hashes wherever you want, because who cares? I'd like to tell a little story called LinkedIn. Some of you may have heard of them recently, but I'm guessing a lot of you have not heard the back story on how that breach was discovered.
Based on people who do very good things with passwords, like using unique passwords for each site, either by putting the site name in their password or by using something like LastPass or KeePass or an encrypted text file, a number of people determined that this really big list of hashes that had been posted on InsidePro, which is a Russian password cracking forum, could only have come from LinkedIn. This was verified by people who said "I have only used this password on LinkedIn," and also by the number of cracked passwords containing "link" or "linkedin" or, interestingly, "blue." Based on people who also change their passwords on a frequent basis (passwords are like underwear: change yours regularly), it was determined that the actual breach happened six to eight months before LinkedIn hit the news. Somebody had successfully breached LinkedIn and gotten away with it, and then six months later, when somebody went, "hey, I need some help cracking these hashes," we discovered that LinkedIn had been breached. That is a good argument for non-disclosure of hashes if you're not necessarily supposed to have them. The same is true of eHarmony, and, that's cut off at the bottom, but Last.fm too. Most of the big password breaches have not been discovered by companies going, "yeah, we lost our hashes." Most of them have been discovered by the hashes showing up somewhere else, and by users doing exactly what we told them, which is using unique per-site passwords. So, on to the tools that I've written. The first one I'd like to talk about is the CryptoHaze Multiforcer. I'm terrible at naming things, by the way; it brute forces multiple hashes. It is a GPLv2 tool, so it is entirely open source, it is compatible with a variety of other tool licensing, and if you want to play with it, great, go download it; it's on SourceForge. It is designed to run on Windows, Linux, and OS X. Some other popular tools only support one or two of those platforms.
Everybody seems to use their own operating system of choice, so I choose to support them all. I support CUDA, OpenCL, and the CPU, depending on what devices you're targeting. In some cases CUDA works better on NVIDIA cards than OpenCL; in some cases OpenCL works better on a CPU than SSE2 code. It just depends, but I support all of them. It is written as a C++ framework with the goal of easy extensibility. To add a new hash type of an existing implemented variety, say just a plain unsalted hash, once you've written your device kernel there's very, very little glue code you have to write on the host side to add it to the framework and begin cracking. And it has integrated network support. Unlike a lot of the other tools, where there are third-party wrappers, my network support is built in, fully integrated with the tools, and it lets you use a whole lot of systems at once. Now, why does this matter? Primarily because we are brute-forcing passwords, and with brute force, if it doesn't work, you're not using enough. There's a very real limit on how many video cards you can put in a modern system: currently either 8 AMD GPUs or 16 NVIDIA GPUs. This is a driver limit, and approaching these limits does tend to lead to problems like PCIe extenders hanging all over the place and the occasional system fire. I don't recommend system fires. So my framework with the distributed network environment works as follows. You have your main host system; this is typically a system that you have physical access to. You dump all of your target hashes into it and launch the server instance. You then have a lot of compute nodes, which can be dedicated systems loaded with video cards, or desktop systems; a thousand Core 2 Duo CPUs is still a force to be reckoned with. They can either have video cards in them or just desktop CPUs. For those of you who use laptops, I actually support the use of both simultaneously.
And I've found that on most laptops out there, the performance of the video card and the performance of the host processor are roughly the same, so instead of using one or the other, you can use both and double your performance, assuming you don't melt your laptop down. The remote hosts can either be local or they can be on the internet; it doesn't matter as long as they can talk to the server. So they go about talking to the server, get their work units, perform the work, and return results back. If they disconnect, because either the user has shown back up at their machine or they crash, the work units are retried. So it's really a very robust system designed for large-scale distribution. This is not exactly a new concept; other people have done setups like this before, and usually they're fairly complicated to get set up: you have to set up a web server and put stuff in a database, and it's really not terribly easy. For my Multiforcer, to enable the server, you simply pass the enable-server flag. You're probably not going to be able to see these because they're below the edge of the screen, but I also support a server-only flag, which runs the server without any compute devices attached to it. The reason I added this is that, unfortunately, video card processing is still in its infancy. Cards glitch out and cause crashes; drivers glitch out and cause crashes. By not having these running in the same process as the server, it keeps the video card crashes from crashing the server. It also lets you use a separate dedicated system very easily; there's no requirement on CUDA or OpenCL to run the server. The clients work in a similarly easy manner: you pass them the remote-IP command line parameter (also cut off below the edge of the screen) with the host name of your server. Now, a lot of people will run some sort of attack for a while, then look at the results and go, "you know, I want to try something different," and so they'll turn the server off and restart it.
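The work-unit bookkeeping just described is easy to picture in code. Here is a minimal, hypothetical sketch in Python — not the actual CryptoHaze network protocol, and the class and method names are made up for illustration: the server splits the keyspace into units, hands them out, and re-queues any unit whose client disappears before returning results.

```python
import queue

# Hypothetical sketch of distributed work-unit tracking.  The server hands
# out ranges of the password keyspace; units checked out to a client that
# disconnects go back into the queue to be retried later.

class WorkServer:
    def __init__(self, keyspace_size, unit_size):
        self.pending = queue.Queue()
        self.in_flight = {}          # client_id -> (start, end)
        for start in range(0, keyspace_size, unit_size):
            self.pending.put((start, min(start + unit_size, keyspace_size)))

    def checkout(self, client_id):
        """Give a client its next range of password candidates."""
        unit = self.pending.get()
        self.in_flight[client_id] = unit
        return unit

    def complete(self, client_id):
        """Client returned results; forget the unit."""
        self.in_flight.pop(client_id, None)

    def client_lost(self, client_id):
        """Client crashed, or its user came back: re-queue the unit."""
        unit = self.in_flight.pop(client_id, None)
        if unit is not None:
            self.pending.put(unit)

server = WorkServer(keyspace_size=1000, unit_size=100)
a = server.checkout("laptop-1")      # first unit
b = server.checkout("gpu-rig")       # second unit
server.client_lost("laptop-1")       # its unit goes back in the queue
server.complete("gpu-rig")           # this one is done for good
```

The key design point is that the server never trusts a checked-out unit to come back: a unit is only retired on an explicit result, so a crashed client costs retry time, never coverage.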
In earlier iterations, restarting the server caused the clients to terminate, because they went, "aha, the server's done." This is a problem if you have a lot of clients scattered about. So the behavior of the clients at this point is to simply sit and wait. If the server's not there, they'll just wait for it in a sleep loop, and when the server shows up, they'll connect and do the work. If the server shuts down, they shut down. What this means is that you can leave the clients running on systems and they'll show up and help out as needed. I don't yet have support for this, but I am working very actively on getting Windows service support enabled with idle detection. So for those of you with a lot of corporate desktops, or just a lot of desktops in your house, you can have a client running that will determine if a user's in front of the machine. If they are, it goes to sleep and does nothing, and when the user's away, it will join in and continue with the current job. Still in development, coming very, very soon; I didn't have time to finish it before DEF CON. So let's look at some numbers. I've said I can do massive-scale internet hash cracking; let's see what that looks like. Some of you who've read my blog probably recognize this. This is from about two weeks ago. It involves six systems and roughly 30 video cards, I think; I've actually managed to lose count. And in case you missed it in the lower right-hand corner, we're running at 154 billion NTLM per second on a list of 10 hashes at length seven. This does not actually slow down much as you go to larger hash lists; this is just the biggest number we got that day, so it's what I'm using. I also support doing things like SHA-1 hashes. This is an attack using the same set of systems on 1,000 length-8 SHA-1 hashes. If this doesn't scare you, you've probably been writing websites. So please salt your hashes, because otherwise I can do this to them.
And SHA-256, SHA-512, et cetera are a bit slower than MD5, but basically unsalted hashes can be attacked at absolutely insane speeds, because I hash the password once and I check it against all of the hashes, and there are very, very efficient ways to do that. If you're interested, either talk to me or check out the source code; there's lots of fun with bitmaps and caching layers. If you're curious what some of the systems we use look like, these were some of the machines used for that 154 billion NTLM per second attack. There were also several other systems scattered about the internet that I don't have pictures of, and we ran the server on an EC2 instance. So, EC2: we've heard it's really expensive and doesn't perform very well, and we've heard it's the end of password cracking as we know it. I did something really, really radical called benchmarking, so I have these weird things called numbers. For those of you not familiar with EC2, you can rent instances in a variety of different sizes, and I've benchmarked a couple of them. These numbers use OpenCL for the host CPU, and CUDA for the GPUs in the last instance. I was also getting, I believe, around 400 million NTLM per second using the host CPU of the CG1 instance, because it does have a quite powerful processor on it. So these are the numbers I got: roughly 3.3 to 3.4 billion NTLM per second on the GPU cluster instance. Well, this is all well and good, but how much does it cost? Because most of us aren't made of money. So I looked at the costs. If we look at the on-demand costs, which are what everybody thinks about when they see EC2, things are pretty expensive. I completely apologize, I didn't realize the screen would be cutting stuff off: the CG1 instance is roughly $2.10 an hour as an on-demand instance. An on-demand instance means it will run until you tell it to stop or the system crashes, which happens a little more often than I'd like to see, but anyway.
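The "hash the password once, check it against all of the hashes" trick is worth a quick sketch. This is a toy Python version — the real code uses GPU-side bitmaps and caching layers, not a Python set — but a hash set gives the same constant-time-per-candidate behavior, which is why a list of 1,000 targets is barely slower than a list of one.

```python
import hashlib

# Each candidate password is hashed exactly once; the digest is then checked
# against every target hash with a single constant-time set lookup.

targets = {
    hashlib.md5(p.encode()).hexdigest()
    for p in ["letmein", "dragon", "trustno1"]     # pretend these leaked
}

def crack(candidates, target_set):
    found = {}
    for pw in candidates:
        digest = hashlib.md5(pw.encode()).hexdigest()   # one hash per candidate
        if digest in target_set:                        # O(1) probe, any list size
            found[digest] = pw
    return found

hits = crack(["password", "dragon", "letmein", "qwerty"], targets)
print(sorted(hits.values()))
```

The cost per candidate is one hash plus one lookup regardless of how many target hashes you load, which is exactly why unsalted hash lists are so dangerous in bulk.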
The second instance type, though, is called a spot instance. What this means is you put in a bid, and as long as the going price for that system is below your bid, your instances run. The downside is they may get randomly cut out from under you, but if we look at the numbers for them, the green lines are a whole lot shorter. For the CG1 GPU instance, the spot price has been hovering right around $0.35 an hour. Now, I don't know about you, but $0.35 is a lot less than $2, and this makes a difference. So we've got the benchmark speeds, we've got the prices; let's turn them into something that makes a little more sense for us. Let's turn it into dollars to crack NTLM length eight. And because the numbers are not showing, I'll just add them: you can spend nearly $10,000 if you use small on-demand instances, or, if you look at the bottom, you can spend under $200 to do the same thing. So I would go ahead and say that being able to crack NTLM length eight for under $200, with absolutely no hardware purchasing, means EC2 is probably still quite viable for this. All you need is a credit card. And again, as I said about other cloud cracking talks that have not actually delivered anything: you can go to my website and you can do this. I have on the project wiki a user data script for EC2, and directions for the setup, that will let you launch a GPU instance and run a script that automatically installs the needed dependencies, installs the Multiforcer, and points it at a server of your choosing. So this is out there; you can do it. We've also heard that AMD is a whole lot better for password cracking, so let's take a look at some numbers. The CG1, or GPU, instance for EC2 can do about 3.4 billion NTLM a second. On the same workload, an AMD 6990 does 10 billion per second and change. So if you're building a system, use AMD cards; they're faster. It's not necessarily that NVIDIA cards are bad.
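The under-$200 figure above checks out with back-of-the-envelope arithmetic. The inputs below are the talk's approximate 2012 numbers (95 printable ASCII characters, about 3.4 billion NTLM per second per GPU instance, about $0.35 per spot-instance hour), not current prices.

```python
# Back-of-the-envelope check of "under $200" for exhausting NTLM length 8
# on EC2 cg1.4xlarge spot instances, using the talk's approximate figures.

keyspace   = 95 ** 8          # 8 characters drawn from 95 printable ASCII
rate       = 3.4e9            # NTLM hashes per second on one GPU instance
spot_price = 0.35             # dollars per instance-hour (2012 spot price)

hours = keyspace / rate / 3600
cost  = hours * spot_price
print(f"{hours:,.0f} instance-hours, about ${cost:,.0f}")
```

That works out to roughly 540 instance-hours, just under $190 — and since the keyspace splits cleanly across instances, you can trade instance count against wall-clock time at the same total cost.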
It's just that the AMD cards have a hardware barrel rotator and a bit-select operator, which can replace a lot of very complicated logical operations with one instruction. And if you've ever looked at our current hash algorithms, MD5, SHA-1, et cetera, they're basically doing barrel rotates, logical operations, and additions. So that's why AMD cards are faster. Now I'd like to talk about something completely different that you probably have not heard about in a long time: rainbow tables. I would like to announce that they're not dead yet. They do have some limitations. We've been told, reasonably accurately, that rainbow tables are only good against unsalted hashes, so in 2012, when everybody salts their hashes, rainbow tables are useless. Not true. The other interesting thing is that rainbow tables are also usable with static salts. So if somebody were to do something silly like go, "well, I need to cache something that we'll call domain credentials on a local machine," and they went, "you know, we don't want these to be unsalted, let's salt them," they could do something silly like use the username as the salt. Which sounds like a good idea: it's unique per user, it's not going to be duplicated. And it's not like there's a common user on pretty much every system on a domain called Administrator. So static salts are not good, and Domain Cached Credentials can also be attacked with rainbow tables. There are tables out for Domain Cached Credentials version one, or MS-Cache hashes. There are not any yet for Domain Cached Credentials version two, which are stretched significantly, but I am working on it. The next problem we've had with rainbow tables is their size. Back when rainbow tables first started out, they were pretty small. You can't see the numbers there, but let's say length-six tables for your favorite hash algorithm of choice: they're about two gigabytes, quick and easy to download.
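To make the barrel-rotate and bit-select point concrete: MD5's round functions are built almost entirely from the two helpers below. In Python, or on hardware without the matching instructions, each one takes several operations; on the AMD GPUs of the time, each is a single instruction, which is where the speed gap comes from.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """32-bit rotate left: two shifts, an OR, and a mask where AMD hardware
    uses a single barrel-rotate instruction."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def bsel(mask, x, y):
    """Bit select, (mask & x) | (~mask & y).  With mask=b, x=c, y=d this is
    MD5's F function; AMD GPUs do it in one BFI_INT instruction."""
    return ((mask & x) | (~mask & y)) & MASK32
```

Every MD5 step is a bit select (or similar boolean mix), a few 32-bit additions, and one rotate, so shaving each of those down to one instruction multiplies straight through the whole inner loop.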
Also, contrary to some stuff I've read, larger hash types do not necessarily require larger rainbow tables, because if you're storing them efficiently, you only store the minimum number of bits necessary to differentiate the hashes. So the fact that SHA-512 has a much longer output than MD5 does not actually affect the table size at all. Length-seven tables become a little more useful, but they're about 41 gigabytes, which is not too bad. Length-eight tables, which are by the way out there for both MD5 and NTLM (and I'm working on the SHA-1 tables), are about one and a half terabytes. This starts to be kind of difficult to transmit over the internet. So what about length nine? Length eight has been where things have stagnated, because we just can't ship the tables around. Length-nine tables are looking at roughly 60 terabytes for a full set. This gets really big, and just in case you hadn't noticed that this is a logarithmic graph, let me make my point a little more clearly: length nine is big. And for those of you who live in the United States, where we're America and damn it, we don't do broadband, let's look at some table download times. This is the time in days to download tables, assuming you're downloading at 20 megabit, which is about halfway between what some people wish they had and what some people are a little bit over. A week for length-eight tables, if anybody's actually seeding the torrents that fast (which, by the way, they're not), is maybe acceptable. But how many of you really want to spend a year downloading length-nine tables to your 60 or 70 terabyte home storage servers? I'm guessing some of you have home storage servers that can handle that, but it's probably not the majority of you. This, up until fairly recently, looked like a hard limit on rainbow tables: they're just getting too big.
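The download times above are easy to check. Using the talk's figures — about 1.5 TB for length-eight tables, about 60 TB for length nine, a 20 megabit per second connection, and decimal units throughout:

```python
# Days to download a rainbow table set at a given line speed, using the
# talk's approximate table sizes and a 20 Mbit/s connection.

def download_days(table_bytes, mbit_per_sec):
    seconds = table_bytes * 8 / (mbit_per_sec * 1e6)
    return seconds / 86400

len8 = download_days(1.5e12, 20)   # ~1.5 TB of length-8 tables
len9 = download_days(60e12, 20)    # ~60 TB of length-9 tables
print(f"length 8: {len8:.0f} days, length 9: {len9:.0f} days")
```

That comes out to about a week for length eight and the better part of a year for length nine, matching the numbers on the slide.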
Yes, we can bump the chain length up and get a linear reduction in space, but unfortunately we increase the precompute time with the square of the chain length. We also increase the table search time with the square of the chain length. So there's only so much you can do by modifying the chain length to make things fit, and unfortunately you can't come up with a reasonable chain length for length nine. So, end of the road, right? If it were, I wouldn't be standing here talking to you. I have invented something I'm calling WebTables, which are tables in the cloud, for a definition of cloud that works for me. Instead of having to download the tables, you can do your table search remotely. You don't send me your hashes, though, because I don't want to see them and you don't want to send them to me. So let me go through how this works. You start with your target hashes; these are unsalted hashes or domain cached credentials or something else that you've obtained from somewhere. We take those and run them through our processing device, and I say GPU because that's what's most typically used, but you can also do it on a CPU if you're on a laptop that doesn't have a good video card, or potentially using multiple laptops, and you generate your candidate hashes. These are derived from the target hash, but they're not the target hash. If some of you are really, really familiar with rainbow tables, you'll know that the target hash normally does make its way into the candidate hash list. I skip it; I just don't send that one. You suffer a fraction of a percent loss in hit rate in exchange for not sending anything sensitive over the network. They're also sorted, and basically it's easier to brute force a password than it is to brute force a candidate hash back into your original target hash. Once you have these, you send them out over the network to the cloud.
Well, the cloud has some nice features to it, potentially, one of which is that the cloud can have things like Fusion-io devices or other very large solid state drive arrays. For those of you who have used rainbow tables much, the local drive speed is actually a big limiting factor in table search. It's very heavy on IOs per second, and spinning hard drives are really bad at this. Solid state drives work great, if you have a couple terabytes of solid state drive hanging around, which, again, I'm sure some of you do, but I'm guessing the majority don't. However, with a server hanging out serving a whole bunch of users, that becomes cost effective and actually very, very efficient, because you can service more users on one server. And you don't have to download the tables. So once your candidate hashes get sent to the solid state drives in the cloud, it does the table search and gives you back the chains that you need to regenerate. Now you run these through your processing device again, which does require knowledge of the target hash; you have to know what you're looking for. And out pops one of those really good passwords that, again, we have trained users to use. Now, the neat thing about this, from the perspective of somebody who doesn't want to send the actual target hashes out to anybody else, is that all of the processing that requires the target hashes is done inside your environment: on your laptop, on your EC2 instance, on your massive cracking rig. It's all under your control, so that which you send out to the cloud is not that which you're looking for, and that which the cloud holds can be much larger than the storage space you have available. You can quite literally sign up for this service and five minutes later be cracking passwords. Speaking of that, again, as I promised I was not going to be talking about things that don't exist yet: this is up and online. I have NTLM and MD5 length-six and length-seven tables available.
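The privacy property of the candidate-hash step can be shown with a toy rainbow-chain implementation. Everything here — the charset, chain length, and reduction function — is made up for illustration and is nothing like the real CryptoHaze table parameters; the point is that the endpoints sent to the server are derived from the target hash, while the target hash itself never leaves the client.

```python
import hashlib

CHARSET = "abcdefghijklmnopqrstuvwxyz"   # toy password space
CHAIN_LEN = 100                          # toy chain length

def H(pw):
    return hashlib.md5(pw.encode()).digest()

def reduce_fn(digest, position, length=6):
    """Map a hash back into password space; varying by chain position keeps
    chains from collapsing into each other.  A toy reduction, not the real one."""
    n = int.from_bytes(digest[:8], "big") + position
    return "".join(CHARSET[(n >> (5 * i)) % len(CHARSET)] for i in range(length))

def candidate_endpoints(target_digest):
    """Client side: for each position the target hash could occupy in a
    chain, walk the chain forward to its endpoint.  This list of endpoints,
    not the target hash, is what gets sent to the table server."""
    endpoints = []
    for i in range(CHAIN_LEN):
        p = reduce_fn(target_digest, i)
        for j in range(i + 1, CHAIN_LEN):
            p = reduce_fn(H(p), j)
        endpoints.append(p)
    return endpoints
```

Note the cost: generating the candidates takes on the order of chain-length-squared hash operations, which is the quadratic search cost mentioned earlier — and it all happens on the client's own hardware.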
The length-six tables are completely free. I also have NTLM length eight, and as soon as I get around to finishing uploading them, I will have MD5 length eight. So, roughly four terabytes of rainbow tables, of a sort, for you to use without having to download them first. The client supports Windows, Linux, and OS X, although the latest build for OS X is not up yet because I ran out of time. So, in summary, WebTables gives you instant access to the rainbow tables with no need to download them. The table search can actually be much faster than you're capable of doing locally, because of the solid state drives that make sense when serving lots of users. And we can support very large tables: we can do those 60-terabyte tables for length nine, and you can have full access to them without having 60 terabytes of storage. You can't quite have your cake and eat it too, but you can use your rainbow tables and not have to download them. Now, what if you're concerned about defending against this type of stuff? If I've scared you, good. I'm not the only one doing this type of stuff; this is hardly revolutionary. So, I've said it before: salt your passwords. There is zero reason not to be doing this in 2012. There was zero reason not to be doing this in 2006. Salt your passwords. And, yes, I actually heard Bcrypt mentioned: if you are not a crypto guy or girl, that's fine, but don't invent your own password storage algorithm. Use Bcrypt. Use PBKDF2. These are algorithms designed to store passwords securely; they're not designed to check files for corruption. The John the Ripper website, openwall.com, has a great PHP library that does Bcrypt. It's bulletproof: you give it a password, it gives you back a nicely salted, iterated hash. Use something like this. And if you're going to use PBKDF2, which does support multiple iteration counts, don't be silly like BlackBerry and use an iteration count of one.
You're not making it much better. I've said it earlier, but I'll say it again: salts must be fully random. If an attacker can guess anything about your salts before dumping the hashes, they can build a precomputed table against them. They can build a table against that silly user Administrator that seems to show up on every single Windows machine connected to a domain, unless you've renamed your administrator account, which is maybe not a bad idea. The other thing is that your salts must be large. The performance penalty for salted hashes comes from the fact that when I go about attacking passwords, I generate a password, but if there are 100 salts, I have to run that hash algorithm 100 times, once with each salt, so I'm slower by a factor of 100. If I have, let's say, about 7 million salted hashes instead of 7 million unsalted SHA-1, I'm slower by a factor of 7 million, assuming people haven't done something silly and used a small salt space. Because in the past, people have salted passwords with very small salts, say two digits. That's great, except there are only 100 unique salts, which means I'm slowed down by a factor of only 100, even if there are 100,000 users. Use big salts. And finally, iterate your hashes. Use something that does a lot of iterations. It slows the attacker down, and it really doesn't hurt the defender very much: your users are not going to notice if you take a millisecond instead of a microsecond to check their password hash. The internet doesn't run that fast. The other thing with iterations is, as BlackBerry demonstrated, they have to be a reasonable number, and that number has to go up as processing power increases. As time goes on, things get faster; video cards have been tracking Moore's law very closely in performance. So if you're not doubling your iteration count roughly every 18 months, the attackers are outrunning you in compute power. I'd like to throw in some final thoughts here before I go to questions.
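All three pieces of advice — fully random salts, large salts, and a high iteration count — fit in a few lines using Python's standard library. This is a hedged sketch, and the 100,000 iterations is an illustrative floor for the era, not a number to freeze in place:

```python
import hashlib
import hmac
import os

# Random 16-byte salt per user, plus a large, stored iteration count, via
# PBKDF2-HMAC-SHA256 from the standard library.  The iteration count is
# illustrative; it should grow as attacker hardware does.

def hash_password(password, iterations=100_000):
    salt = os.urandom(16)                       # unique, unguessable salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest             # store all three

def verify_password(password, salt, iterations, expected):
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison

salt, iters, stored = hash_password("correct horse battery staple")
```

Storing the iteration count next to each hash is what lets you raise it over time: old hashes still verify with their old count, and users get re-hashed at the new count on their next successful login.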
For those of you in the audience who are a bit younger, who are kind of interested in the security realm or still in college, my suggestion would be: find something you're good at and get really, really good at it. People who are really good at what they do don't tend to have trouble finding work, and usually it's something you enjoy. I'd also like to point out, and this may be a little personally motivated, that open source projects are really good for résumés, because you get experience with a larger project: experience coming onto a project that already has a framework, a direction, and development done. And it's something you can talk about. A lot of people, I'm sure, have worked on projects that are really, really cool that you can't talk about to a future employer or anyone else. Open source works around that: if it's public on the internet, you can talk all you want about what you've done with it. And it takes a lot of time to get good at this stuff. I've been working on this for four and a half years. Video cards are not easy. FPGAs are not easy. C and C++ are not easy. So if it feels like it's taking forever to get good, that's okay; everybody's been there. Salt your password hashes. If you take one thing away from this talk, it's that password hashes are like eggs: they're better with salt on them. Now, I would like to discuss one of the worst password policies out there, as seen recently at companies like Yahoo. They're using an industry-standard 16 rounds of ROT13 for their password encryption, which is unfortunately indistinguishable from plain text. There's an easy way to tell if a website is storing your passwords in plain text: if you go to the website and you go, "huh, which password did I use?", and you click the forgot-password link, and they send you an email, and you go, "that's my password that you just sent to me over email," they're storing your password in plain text.
Please submit them to plaintextoffenders.com. This is a website for the purpose of publicly shaming websites that do that. Email is not encrypted in transit; with email, you just kind of assume that everybody reads it. If you're out there reading my email, please don't tell me. And don't store plain text passwords. This is also a great example of why you should be using a unique, long, per-site password, because if somebody finds your password, and your password is my Yahoo password, I wonder what your Google password is. I wonder what your PayPal password is. And we've actually seen this in the wild. Some passwords that have come out of plain text leaks that are really long, nasty, say 20-character random upper-and-lower, just absolutely amazing passwords, have been cracked in other dumps, because that password from the plain text dump made it into somebody's word list, and when another site with either salted or, usually, unsalted passwords had its database dumped, that person was using that same really, really good password somewhere else, and it got cracked. So a really good password will not save you unless you know the password policy of every place you have ever put that password. I'm also a big supporter of publishing your password encryption algorithms on your website somewhere. Trust me, the security community will not mock you if you say, "yeah, we're using 50,000 rounds of SHA-256-based PBKDF2 with a doubling of the iteration count every 16 months." If you say, "we're using unsalted SHA-1," you probably will get mocked, and you deserve it. If you want to find out more, that's my website. WebTables is the currently active and online WebTables system. That's my email address; it should be easy to find, it's all over the website. And I've got an IRC channel. So with that, that ends my presentation, and I'd like to open up the floor to questions.