Well, thanks for coming to this talk. I'm Matt Weir. I'm from Florida State University, and this talk, as you can see, is Password Cracking on a Budget. Before I begin, there's a couple of people I just want to thank. First of all, Bill Glodek, who is another research student, who unfortunately for me graduated, but he did quite a bit of the work here. Professor Sudhir Aggarwal, my major professor, Professor Breno de Medeiros, and then also the National Institute of Justice for funding this research. So I really do appreciate that, so I don't have to teach all those Intro to C++ classes.
Real quick about me, just so you know who the hell I am. As I said before, my name is Matt Weir. I'm currently a Ph.D. student at Florida State University. Before I decided to go back to college and take a significant pay cut, I was a network security engineer for Northrop Grumman TASC. And on the last project I was on, I helped support the JTF-GNO, Joint Task Force-Global Network Operations, with some of their forensics investigations. Also, just to let you guys know, I have had my password stolen in the past, so this hits a little close to home. In fact, I discovered that during the course of this research.
A lot of what we do is try to figure out how people actually create passwords. And in order to do that, we look at disclosed password lists. So, like, a hacker will break into a site, they'll steal all the passwords, they'll post them online, and then we take a look at them to try to figure out how they were created. And if they just post the hashes, then we actually have to go ahead and crack those passwords in order to figure out how people created them in the first place. So a long time ago, I played a video game called BatMUD, which is way better than World of Warcraft. But unfortunately, they didn't have a very secure website, apparently. So as I was going through these lists here, I was like, oh, hey, I know these people. That's my username.
Crap, that's my password. This is hitting a little close to home here for me. But I think that's actually fairly common. If you think that every single website that you've ever entered a username and password into is really secure and has never been broken into, you're probably at the wrong conference.
I like to have a disclaimer before I talk, just so you know and can walk out early if you want to. I'm a student and I crack passwords in a research-type setting. So we're not really getting hard drives and encrypted files and having to crack them. Instead, we're cracking these big old password lists. So some of the tools that we use may be a little bit different than what you would use in real life. But hopefully, the techniques are fairly similar. Also, I'll freely admit I've been wrong about many things before. I'll probably be wrong about a lot of stuff in the future. So I certainly don't claim to know everything there is about password cracking. And in fact, quite a bit of what we've been doing, I kind of feel, has been reinventing the wheel. Because password cracking is something that's been out there for a long time, but it's not really well documented. So we've had to kind of rediscover a lot of it ourselves.
And actually, that's kind of the goal of this talk. If you're just setting up your own forensics shop, or you're a system administrator and the last system admin left and never passed down all the passwords, and now your boss is telling you that you need to go ahead and crack these passwords, what do you do? Because a lot of the stuff online is pretty much just "run John the Ripper or Cain and Abel," and it leaves it at that, or it may give you, like, a CISSP overview of how to crack passwords. But trying to actually apply that in real life is a lot different, I've found. So really, all this talk is about is, if you're put in that position, how can you maximize your chances of cracking that password?
So since I do tend to kind of wander a bit, I figure I might at least give you an overview of what we're going to be talking about. At the very beginning, we're going to start real quick with some password cracking basics, just so everyone here who wandered in is on the same level, and also to tell you what we've been focusing on as far as our research goes. Then we're going to talk about input dictionaries, word lists: a little bit about where to get them, but more about how to generate them yourselves, or at least some of our work on trying to generate them. Because a lot of the word lists online leave a little bit to be desired. And then we're going to talk about word-mangling rules. How do you go ahead and pick them? A little bit of advanced John the Ripper. And then finally, some of the research that we've been doing on trying to automatically generate word-mangling rules, because doing that by hand is a real pain.
Also, because this is a 50-minute talk, and that's probably the worst way in the world to give information about technical stuff, please, my email address is here, and it's on the slides on your CD. If you have any questions, go ahead and let me know. Also, I'll be available in the Q&A room afterwards. Please show up, grill me, we can geek out about word lists and stuff like that. It's really important to me that this research actually has some application in the real world. I don't want to just write papers that don't really apply to anything. So at the end of the day, if no one's actually using this information, or it's not worthwhile, I'll go ahead and try to research something else. So please, get in touch with me. If you think what I'm doing is wrong, or you think these tools are crap, please let me know, as long as you can tell me maybe a better way, you know, some way to improve that crap.
And if you have some good ideas that you want me to look at, I mean, we're research students, we're always looking for stuff to work on. So please, you're not bugging me if you go ahead and contact me. Also, you might want to copy down that URL at the bottom there, because that's not on the slides. All the tools, all the scripts and stuff like that, we're providing online. Unfortunately, our main website, which is on the CD, is currently down because our system admin took it down to make sure it's all patched up, because he said that some people were banging away on it. So I mean, that's my fault, and now I've learned that before you give a talk at DEF CON, you might want to give more than a week's worth of notice to your system admin. So I apologize for that, but I threw it up on Google Pages, so it's just reusablesec.googlepages.com. And you can download all the tools and scripts there.
So real quick about password cracking, and don't worry, it's not going to be a CISSP prep course. There's really two types of password cracking that people think about: online and offline. Online, we really don't care about too much. This is what people were trying to do to our website a couple of days ago. The site's online, or the computer's online, and you're just going to try different username and password combinations against a site that's currently operating. And the research that we've been doing really doesn't apply to it that much, because online password cracking, first of all, is generally very slow, so you're very limited in the number of guesses you can make. And it's noisy, so if the system admin actually looks at the logs, which happens every once in a while, it really shows up. But more importantly, you're limited by the number of guesses you can make before the system locks you out. So you try four passwords, you get them wrong, and then you're locked out of the system. So what we're really focusing on is offline password cracking.
This is the computer forensics case. So you obtained the warrant, you broke down the door, you seized the hard drive, you get back to your shop, and all of a sudden you realize, well, this hard drive's encrypted, or there's encrypted files on here. How do we go about breaking these? And when you get into that situation, you're really only limited by the amount of time you can spend trying to crack these passwords and the amount of computing power you can throw at the problem.
So I'll be talking about password hashes a lot, and really it just comes down to this: hopefully your computer, your bank, the website doesn't store passwords in plain text, because then there's no need for password cracking. Someone breaks into the site, they see all the passwords, they're done, they can have a beer. So most sites store the password using a one-way cryptographic function that mangles the password so you can't figure out what it is very easily. So let's say a user creates the password defcon. The computer will not save the word defcon on the computer, hopefully, in theory anyway. There's a couple of talks about this later where it sometimes does. But instead the computer will hash the password. So the MD5 hash of defcon is that long string there. And it stores that hash instead of the actual password. So when you log in, you type defcon in again, the computer hashes that password and compares it to what it has stored. If those two hashes match, it goes ahead and logs you in.
Cracking passwords is very similar. You just make a whole lot more guesses. So you make a guess, you hash the guess, you compare it to the hash that's on the hard drive. And if those two hashes match, you've cracked the password. So really the question is, how do you go about making those guesses intelligently? Because hashing those guesses takes a lot of time sometimes. So there's really two main ways that you go about trying to crack a password: the dictionary attack and brute force.
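The guess, hash, compare loop described above can be sketched in a few lines of Python. This is only an illustration, assuming an unsalted MD5 hash as in the defcon example; the function names `md5_hex`, `check_login`, and `crack` are made up for this sketch and are not part of any tool mentioned in the talk.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 hash of a password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def check_login(stored_hash: str, attempt: str) -> bool:
    """What a login does: hash the attempt and compare it to the stored hash."""
    return md5_hex(attempt) == stored_hash


def crack(stored_hash: str, guesses):
    """What a cracker does: the same comparison, over many guesses."""
    for guess in guesses:
        if md5_hex(guess) == stored_hash:
            return guess
    return None  # none of the guesses matched


# The site stores only the hash, never the word "defcon" itself.
stored = md5_hex("defcon")
print(check_login(stored, "defcon"))
print(crack(stored, ["password", "vegas", "defcon"]))
```

Everything interesting in offline cracking is in how the `guesses` sequence is produced, since each guess costs one hash computation.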
The first one, which we're going to be spending most of our time on, is the dictionary attack. So you take words from an input dictionary that contains words that you think someone might use to create their password. And you mangle them in the way that people normally do when they create passwords. So your input dictionary might have password, and you try that. And if that doesn't work, you try password11, or capitalize the P and replace the a. And this is what people normally think about when you talk about password cracking.
Now when we crack these lists, we hit what we kind of call a brick wall after a certain point. Initially, when we start cracking passwords, we're doing really well, because we're cracking lists of, you know, a couple thousand passwords. So initially we'll crack a couple hundred, and then we'll maybe go down to cracking 10 an hour, and then maybe one an hour, and then one a day. And then pretty soon we're to the point where we're cracking maybe one a week, and we're like, why are we doing this? We've got to move on to the next list. And we call that the brick wall because you go a long, long period of time where you're not cracking any passwords. And it seems like you're just kind of spinning your wheels.
And when you're doing a dictionary-based attack, it's really frustrating, because you've got to figure out why you're not cracking those passwords, and you don't know why until you actually crack the password. So, I mean, you could just not be trying the right dictionary words. So your dictionary could be poor: it doesn't have the word in it, it doesn't have defcon in it, so you're not trying that. Or you might not be trying the right word-mangling rule. So the person might have capitalized the first letter, replaced the a with an at symbol, and added 11 to the end of it.
And you're just not trying that as one of your word-mangling rules. And it's really hard to figure out, when you're trying to crack these passwords, do I try more dictionaries? Do I throw more dictionary words at the problem? Or do I try more advanced word-mangling rules on the dictionaries I'm currently using? Because it's a real trade-off. You're always limited by the amount of time, the number of guesses that you can make. So the bigger your dictionary, the fewer word-mangling rules you can apply to that dictionary. And vice versa, the more word-mangling rules you use, the smaller the dictionary that you can apply those word-mangling rules to. So it's a real trade-off.
And then there's brute force, where you just kind of go through and try every single possible combination. And don't let anyone tell you anything differently: brute force is wonderful. I love brute force if you can do it. The problem is this is password cracking on a budget, and generally brute force is not an option for longer passwords. But if there's no password creation policy, it still helps you get a lot of passwords that you normally wouldn't get with a dictionary-based attack.
So here's some examples of some real passwords that cracked with brute force. Like vptp. That's not going to be in my dictionary. It's not a real dictionary word. I have no idea how they picked it. But since it's really short, it's easy to crack with brute force. Another one, wfX8zj. Once again, I'm not going to crack it with a dictionary-based attack. And luckily, they didn't make it one more character longer. Because if they did, I wouldn't have been able to crack it with brute force. But since it was only six characters long, I was able to crack it. 00K00. So that's kind of just something that they can remember. Once again, it's very resistant to dictionary attacks. And the final one, which kind of just annoys me, is wii. For Nintendo Wii, I think.
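To see why one more character matters so much for brute force, here is a small back-of-the-envelope sketch. The 62-character alphabet (upper case, lower case, digits) and the rate of one million guesses per second are assumptions chosen only for illustration; they are not figures from the talk, and real rates depend heavily on the hash and the hardware.

```python
def keyspace(alphabet_size: int, max_length: int) -> int:
    """Number of candidate passwords of length 1..max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


ALPHABET_SIZE = 62              # assumption: a-z, A-Z, 0-9
GUESSES_PER_SECOND = 1_000_000  # assumption: illustrative rate only

for length in (4, 6, 7, 8):
    total = keyspace(ALPHABET_SIZE, length)
    hours = total / GUESSES_PER_SECOND / 3600
    print(f"up to {length} chars: {total:,} guesses, about {hours:,.1f} hours")
```

Each extra character multiplies the work by roughly the alphabet size, which is why a six-character password like the one above can fall to brute force while a seven-character one may not on the same budget.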
And that last one, Wii, just kind of points out that dictionaries are only as good as when they were made. So most dictionaries are fairly old. If you're looking to crack someone whose favorite band is MC Hammer, you're golden. But if they like Linkin Park, even though they're old, too, you're probably not going to find it. So for anything recent, it's very hard to keep these dictionaries up to date.
So I want to run a few demos, just so I can have stuff breaking in front of a couple hundred people. But I guess the real question is, how well can I do if I just do the script kiddie type of thing: I go online, I download John the Ripper, I download a couple of these word lists, and just run them? Because that's an honest question. If you can do really well with that, then you can just walk out and go to a different talk. You don't have to listen to me talking here. So if you'll excuse me, I'll be kind of lame and just copy and paste these commands so I don't fat-finger them. And what I'm going to do is run John the Ripper with the default rule set.
And also, we're going to crack half the MySpace password list. So a long time ago, one of the most famous disclosed password lists came from someone who set up a phishing site for MySpace and managed to grab about 60,000 MySpace usernames and passwords. So we feel that these passwords are very representative of what normal people would do when they create passwords if there's no strong password creation policy. Also, just so you know, we split these password lists up into two different parts whenever we get a plain text password list, mainly because it's really easy to create password cracking rules if you know all the passwords beforehand. So this way we'll have a training set and a test set. We don't look at the test set at all, but we do look at the training set to try to figure out how people create passwords.
And then we go ahead and make our rules and run them against the test set. If the passwords are hashed when we grab them, the whole thing is the test set, because you have to break them in the first place. Also, just so you know, these passwords are not hashed. So I'm just outputting the guesses to standard out and seeing how many of the passwords we cracked with John the Ripper's rule set. The reason for that is that this is a 50-minute talk, and it'd be really boring just to watch it crunch numbers. If this was hashed with, let's say, MD5, this test would have taken about an hour. So at the very end of the talk, we could have looked at it, and it would have been pretty boring. If this had been hashed with a stronger hash, like Linux's or Unix's MD5 crypt, which does 1,000 MD5 hashes, we'd definitely have to extend this talk time a little bit. So just to give you an idea: an hour if this was hashed with a very basic hash, 1,000 hours to run this test if this was a stronger hash.
And we used just the basic words.english.txt dictionary, which is the one that you find on all the different password cracking websites. And it ran there, and we managed to crack 3.2% of all the passwords. So yeah, I mean, if you're only trying to crack one or two, that'd be great. But if you want to tell your boss, hey, you know, we're setting up a forensics shop here and we have a 3.2% chance of cracking these passwords, that's probably not going to cut it very well.
So the next question is, could we just be using a bad word list? You always hear about people using the same common passwords. So why don't we go ahead and use a word list of just common passwords. And this once again was just downloaded from online. It's, like, the 816 most common passwords. It ran really quick. Only 41,000 guesses versus the 10 million guesses from the previous one. And we cracked 1.71% of the passwords. So once again, I mean, well, that's actually pretty nice. Very few guesses.
But still, you're not even cracking double digits here. So let's go ahead and use a bigger dictionary. And this dictionary's description is "the big dictionary." This is dic-0294. And this is going to actually take a couple of seconds here. This is actually one of the few good big dictionaries that I've had experience with. And I'll complain about big dictionaries a little bit more later. But hopefully it'll just take a minute here. Oh, there we go. And finally, we're cracking double digits. But it took much longer. It took 37 million guesses. But we did manage to crack 19% of the passwords. So this is actually starting to get semi-respectable.
And finally, we're going to talk about custom dictionaries in a little bit. So I might as well go ahead and try one of these real quick to show you. So this is a custom dictionary. It's actually based off of Wiktionary. And Wiktionary is a sister project of Wikipedia. It's an open source dictionary. And they actually provide it in a whole bunch of different languages, which is really nice. So this is the English dictionary off of Wiktionary that we built. And this generated far fewer guesses than the big dictionary. This only generated about 3 million guesses compared to the 31 million guesses of the big dictionary. So about 10%. But we still managed to crack 12% of the passwords. So using custom dictionaries can definitely help you, especially if the hash is really strong.
Cool, nothing blew up. Okay. And just in case that didn't work, here you can see graphical representations. The top one's the number of guesses, and the bottom one's the number of passwords cracked. So you can see just how well they did. So the first thing I guess we should talk about, though, is word lists. You know, how important is a word list to your password cracking?
And as you can see from that previous demo, it actually is extremely important. Unfortunately, it's also boring as hell. I'll freely admit that. If you're not doing password cracking currently, or you don't enjoy organizing your sock drawer, this is probably not going to be the most interesting thing in the world. There's probably a big overlap between password crackers and people who like organizing their sock drawer, actually. But I'll just hit the high points, and once again, if you want to meet afterwards, we can geek out about word lists.
So there's a lot of places to find word lists online. And in fact, most of them have the exact same word lists. They all steal from each other and each claim to be the ultimate word list site. The first one I really want to point out is the openwall.com FTP site, which is for John the Ripper. If you go to their website normally, they don't really advertise all of their word lists, because they want you to spend money and buy their actual big word lists, which I don't blame them for. I love capitalism. But if you go to their FTP site, you can download a bunch of word lists from them. The other two sites below that are just kind of general word list sites.
One site that I kind of want to make fun of a little bit, and please don't hack me if you're in this audience, is theargon.com. They have the ultimate gigantic word list, they say. It's two gigabytes large. And everyone says, oh, it's got to be good, it's two gigabytes of words. And that actually has to be one of the worst word lists I've ever played with. First of all, it's so large, you can't do any word-mangling rules on it. It takes forever to run. And there's just so many duplicates, so much junk in it, that it really doesn't crack passwords very well.
About the only password I managed to crack with it that impressed me, when I just ran it one time, was from one guy who decided to use an HTML markup tag for his password, which I have to admit is a really good password, because there's all the symbols, uppercase, lowercase, and all that. And I wonder what he does for a living. But in one of the web pages they sucked down, there was the exact same HTML markup tag, and that was able to crack it.
The last site I just want to point out, if you go on BitTorrent and you can find it, is the Xploitz Master Password Collection. It has a ton of different word lists in it. But it also has what look like passwords in it too. I don't know if they actually are passwords or not, which is why I can talk about it. But it's really good if you're just starting to do your own password cracking research. You can try your tools against the things in it as well. And they don't have email addresses in it or anything like that, so I don't feel bad about this. It's just random passwords.
So with these word lists, there's a lot of stuff that you need to be really kind of anal retentive about when you're dealing with them in order to help you out later. And every single time I've tried to cut corners with this, I've gotten seriously burned. So, things you need to think about when you're managing your word lists: you want to avoid duplicate words. It seems pretty simple, but it's actually a pain in the butt sometimes, especially when you have multiple word lists that you're trying and you can't remember which ones you did previously. But duplicate words equal wasted work, because you already tried those guesses before. You've also got to ask yourself, how are those words terminated? Are they terminated by a tab, a space, a newline, or a carriage return? And that's really important.
Most password crackers are fairly smart about dealing with that, but if you're writing any custom scripts, that can really bite you in the butt, which I found out myself too. Standardize capitalization. How many artifacts are in the word list? How many HTML tags are there? How many timestamps and how much junk data shows up when you create the word list, because you're not going to copy and paste all these word lists by hand. And then finally, is the word length important? And this is not really about the hash that you're cracking, because most password crackers are fairly smart about that, but more along the lines of whether you want to do case-mangling. Because if you have a really long word and you want to try every single possible case-mangling of it, you're going to spend all day on that one word just trying to case-mangle it. So you might want to truncate it if you're doing serious case-mangling.
Now, as I said, the word lists you find online leave a lot to be desired. So the first thing you might want to do, if you're a forensics investigator, you have the hard drive in front of you, and you're just trying to crack individual files on the hard drive, is to see if you can find the password anywhere else on that hard drive. And for that type of research, I really want to point you to David Smith at Georgetown University. He's doing some really good work, and he's building some really good tools for parsing that hard drive and grabbing out potential passwords to try.
Next thing, creating a word list by hand. It's a pain, but in some cases it can be really effective. Probably the best success I've had is with swear words in different foreign languages, because people using swear words in their passwords is an international type of thing. So I've gone to a bunch of different sites, swear words Finnish, swear words German, and stuff like that, which led me to some interesting sites too.
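The word list hygiene points above (duplicates, terminators, capitalization, HTML artifacts, word length) can be sketched as a small cleanup pass. This is a minimal sketch, not one of the tools from the talk: the function name `clean_wordlist`, the default `max_length` of 16, and the choice to treat every whitespace-separated token as one word are all assumptions for illustration. Note that stripping tags would also throw away the rare case where an HTML tag really is the password.

```python
import re

# Assumption: anything that looks like <...> is an HTML artifact.
TAG_RE = re.compile(r"<[^>]+>")


def clean_wordlist(lines, max_length=16, lowercase=True):
    """Normalize raw word list lines into unique words, first occurrence first."""
    seen = set()
    cleaned = []
    for raw in lines:
        text = TAG_RE.sub(" ", raw)
        # str.split() with no arguments splits on tabs, spaces, newlines,
        # and carriage returns, so the terminator style no longer matters.
        for word in text.split():
            if lowercase:
                word = word.lower()   # standardize capitalization
            word = word[:max_length]  # cap length before heavy case-mangling
            if word and word not in seen:
                seen.add(word)        # skip duplicates: no wasted guesses
                cleaned.append(word)
    return cleaned


raw_lines = ["Password\r\n", "password\tDefcon", "<b>vegas</b> "]
print(clean_wordlist(raw_lines))
```

Keeping the first occurrence order, rather than sorting, preserves any frequency ordering the source list may have had.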
But just creating those custom word lists by hand really worked quite well. Now, if you want to go ahead and script this, on your BackTrack CD there's a wyd Perl script that you can use as well, which works a lot better than wget because it'll actually parse out some of the junk. It's not perfect, but it's better than doing it by hand sometimes.
Now, we created some custom word list generation tools as well. As I mentioned earlier, Wiktionary. A lot of the foreign language word lists were really, really, really poor. They didn't have much in them. Some of the characters were messed up in them, and so on. And I wanted to crack passwords in foreign languages, because, like, the Finnish and Swedish and all those Norwegians really seem to like disclosing passwords. So I looked at Wiktionary, and I was like, I don't want to do that by hand. So I created a program called WikiGrabber that will go online. You can specify what language you want, and it'll go ahead and download it. Since we're doing some passphrase research as well, you can actually specify that you want to grab only nouns, or only verbs or adverbs. It actually makes pretty good word lists.
The next step, of course, we were thinking, if we're already doing this for Wiktionary, why don't we create something for Wikipedia, so we can create customized word lists for topics like hacking, beer, vodka, and Vegas? And just grab the Wikipedia pages for those. That actually still needs quite a bit of work, because, and I didn't realize this before I started working on it, trying to find the right words on Wikipedia is actually fairly difficult. So if you do beer, for example, you'll find a lot of information about the fermenting of beer and the history of beer and so on, but you're not going to actually find beer names like Amstel Light or Boddingtons.
For that, you'd actually have to go to the pages for the individual countries, and those are not linked off the main page for beer. So trying to find that is still hard, and there's still a lot of work to be done to make that a little bit easier, but that's also on the CD and on our website.
Now, the next thing we saw was that we've got a whole bunch of cracked passwords already. If someone used a word in their password before, the chances of someone else using that word are very high. So we wanted to parse our cracked password lists and try to make customized dictionaries to crack more passwords. And that actually works extremely well. So the first thing we did was extract the alpha characters from a password and use that as a word. And you can make certain judgment calls there, so if you have an at symbol between two letters, you can probably change that to an a, and so on. You just strip out special characters and so on. And as I said, we've had extremely good success with that, but we were a little bit worried that we were just missing words. There's mangling rules out there that we don't know about, because we're dealing with, you know, tens of thousands of passwords in these lists. You're not going to look at each one individually. So we wanted to see if we could parse out the low-hanging fruit and then look at the remaining ones and try to figure out how they were created.
And to do that, we decided to use the edit distance of passwords. So you're all probably familiar with edit distance even if you haven't really dealt with that term, because it's used in all the different spell checkers out there. If you type teh instead of the, it realizes, okay, you switched the h and the e, because that's a very close word. Well, we thought, why can't we use that with passwords as well, for analyzing these? So in the normal edit distance, you have rules like delete, swap, transpose, and insert.
So to change apple to apply, you would have an edit distance of one, because you only need to change the e to a y. And you can match the words based upon whichever has the closest edit distance. We decided to add a few more rules to it to simulate how people create passwords. So, for example, adding numbers to the end of a password is one edit. So apple99 to apple would have an edit distance of one. And this actually worked really well. It had some minuses and pluses, but I'm pretty happy with it, actually. It does produce false positives. Sometimes it'll say, you know, this password was created from this word, when it actually wasn't, but the word-mangling rules make it look that way.
And also, initially, I thought this wasn't very good, because it's only as good as the input dictionary. Basically, what you do is you have your password list, you have an input dictionary, you give it both, and it tries to figure out which passwords in the password list match up with which words in your input dictionary. So if your input dictionary doesn't have the word, it's not going to match up. And initially, that was kind of a bad thing. So we were trying to do, like, best of breed: we feed in a bunch of different dictionaries and try to create one good dictionary that's really tailored towards the list.
But what we found out later was that it's actually extremely good for evaluating dictionaries. Because it's nice to be able to say, this dictionary sucks or blows. It's much nicer in an academic setting to be able to say, this dictionary sucks and blows because of this. So this actually helps us out quite a bit. And also, it goes back to, remember when I was talking about hitting the brick wall, whether the problem is that you're not trying the right words in your word list or whether you should try more word-mangling rules. This actually gives us an idea of the theoretical maximum number of passwords we could crack with a given word list.
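The idea of a password-aware edit distance can be sketched as follows. This is a simplified illustration, not the authors' actual rule set: it uses plain Levenshtein distance (insert, delete, substitute, with no transpose rule) and adds a single extra rule, assumed here, that stripping any run of trailing digits costs one edit. The names `levenshtein`, `password_distance`, and `closest_word` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def password_distance(password: str, word: str) -> int:
    """Edit distance where a run of trailing digits counts as one edit."""
    plain = levenshtein(password, word)
    stem = password.rstrip("0123456789")
    if stem != password:
        # Extra rule: "numbers added to the end" costs a single edit.
        return min(plain, 1 + levenshtein(stem, word))
    return plain


def closest_word(password: str, dictionary):
    """Pick the dictionary word the password was most plausibly built from."""
    return min(dictionary, key=lambda w: password_distance(password, w))


print(levenshtein("apple", "apply"))
print(password_distance("apple99", "apple"))
print(closest_word("apple99", ["apply", "apple", "maple"]))
```

Matching every cracked password to its closest dictionary word this way is also what makes the false positives mentioned above possible: the closest word is not always the one the user actually started from.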
And so if we try every single word-mangling rule we can think of to crack this password list with this dictionary, we're only going to crack this many. So, like, the really big dictionary will crack about 50% of the passwords if you try every single word-mangling rule you can think of. So that's kind of the top bar there. The words English dictionary can crack about 50% of them, even though it has, you know, 200,000 words in it. So that's why we can say this one is not very good. The common passwords one, you can crack about 5.3% of all passwords with just 816 words. And the Wiktionary one, you can crack 32% with 68,000 words. And really, the reason why this is good is that you're never going to hit that limit, because there's always these crazy word-mangling rules that you just don't have a chance to use. But when you start getting close to it, you can say, okay, I've probably cracked all the passwords I'm going to crack with this dictionary. Maybe I need to switch to another dictionary, or maybe go to brute force. So that really helps eliminate the guessing about whether you need to throw more words at the problem or whether you need to add more word-mangling rules.
Okay, I'm actually going to kick off the next demo here real quick, and then we'll come back to it later when I have a chance to explain it. So hopefully it doesn't crash in the meantime. But this is going to be our customized word-mangling rule creator.
So the next thing that we're going to talk about is word-mangling rules. This is what everyone really thinks about with password cracking: how do you mangle these words to recreate what a user is actually doing? And I have to say, the one thing that really surprised me when I started doing this is how limited most password crackers are when it comes to word-mangling rules.
And I think the reason for that is that LANMAN hashes, the old Windows hash, really spoiled us. Because old LANMAN hashes, they uppercase everything. So you don't have to worry about case-mangling at all. And also, they're seven characters maximum, so it became pretty easy to brute-force them after a while. So I think we kind of sat on our rear ends there for quite a while with a lot of these password crackers. It makes password crackers look really good when, in fact, they're not really doing that much advanced stuff. And one thing I found is that it's, not easy, but fairly straightforward to crack passwords that have only one simple word-mangling rule applied to them that everyone uses. So, like, 80-something percent of all people in these lists that we've been seeing use two numbers at the very end of their password when they create passwords. Or... I think that's right. You might want to double-check me on that one. That's why I love slides. I need to have this stuff. But a lot of them do, anyway. So that would be one great word-mangling rule, really easy. But when you start combining word-mangling rules, it gets actually very difficult. So this password here, "P@ssWord12", with a capital P, a capital W, and an at symbol: that is an extremely strong password. I mean, "password" is going to be in every single person's input dictionary, so that's not a problem. But with that word-mangling rule where you capitalize the first letter, you capitalize the fifth letter, you add two numbers to the end, and you change the a to an at symbol, putting all those word-mangling rules together, it's very unlikely that you're actually going to be able to crack that password. So that's an extremely strong password, and you can all see how easy it is to make. Also, if they use a non-standard rule, even one non-standard rule, the chances of you cracking it are very small.
So like "password" with just a seven between the p and the a. Since not many people use that as a word-mangling rule, I'll be really surprised and kind of impressed if someone cracked that without resorting to brute force. The chances of it are very low, even though it seems like a really simple password. So if you're creating your password as a defender, I highly recommend you just don't stick with the pack. Try something even a little bit different, and make your password fairly secure. Now, one question I get when I talk to people is, should I use Cain and Abel or John the Ripper? Because these are the two major free password crackers out there. And I love Cain and Abel. I have to say it because they put a lot of work into it. It integrates a lot of different things. You've got your ARP poisoning, your sniffing, and everything else. So it's a wonderful program, and I hate to trash it. But really, if you're cracking passwords seriously, you need to be using John the Ripper if it supports the hash that you're trying to crack. And the reason for that is pretty simple. Cain and Abel doesn't have very many word-mangling rules at all. In fact, what it has is extremely trivial. It does adding two numbers to the end of a password, and case-mangling. And it doesn't even combine those. So if you have a password like "Password1" with a capital P, Cain and Abel is not going to crack that, because it doesn't try capital letters and numbers at the same time. So I mean, Cain and Abel is fun to learn how to crack passwords on, but if you're cracking passwords seriously, you don't want to be using it. John the Ripper, on the other hand, is configured via a config file. So you can go ahead and get really crazy with all the different word-mangling rules you can think of.
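To make the point about combining rules concrete, here is a small Python sketch. It is a simplified model, not how any particular cracker is implemented: `case_variants`, `digit_suffixes`, `independent` and `combined` are hypothetical helpers that contrast applying two rules one at a time with applying them together.

```python
def case_variants(word):
    """Case-mangling rule: lowercase, uppercase and capitalized forms."""
    return {word.lower(), word.upper(), word.capitalize()}

def digit_suffixes(word):
    """Digit rule: append one digit (0-9) or two digits (00-99)."""
    one = {word + str(d) for d in range(10)}
    two = {word + f"{d:02d}" for d in range(100)}
    return one | two

def independent(word):
    """Each rule applied on its own, never together."""
    return case_variants(word) | digit_suffixes(word)

def combined(word):
    """Every case variant followed by every digit suffix."""
    return {guess for variant in case_variants(word)
            for guess in digit_suffixes(variant)}

if __name__ == "__main__":
    print("Password1" in independent("password"))  # rules tried separately
    print("Password1" in combined("password"))     # rules tried together
    print(len(independent("password")), len(combined("password")))
```

The combined set is larger because the rule counts multiply rather than add, which is why each extra rule stacked on a password makes it so much more expensive to reach.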
And in fact, even if John the Ripper doesn't support whatever word-mangling rule you want to use, you can always just create a custom script and pipe those guesses directly into John the Ripper since it's command-line, which is really nice as well. So you can do pretty much any type of word-mangling rule that you want to. Now, when you download John the Ripper, the first stumbling block people will mention is that it doesn't support the type of hash they need. And the answer to that is it probably does; someone's written a patch for it. So go ahead and find it, download it, and install it. And it actually will support more hashes than Cain and Abel does. But you need to go ahead and find those patches, and most of them are actually on the John the Ripper website. Also, I'd recommend against using the default John the Ripper config file. It's not actually bad, but you can do a lot better. Now, one thing that's kind of annoyed me, or surprised me, is that even though this is probably one of the most popular password crackers out there, I've yet to find somebody else's John the Ripper config file posted online. People just don't like sharing that information, for whatever reason. So it's been kind of hard to tell what other people are doing. So I tried to solve that a little bit. I included a couple of our sample John the Ripper config files on the CD. So that way, at least you can look at what we're doing and go, noobs, you know, you're doing horribly. But at least now you have some examples of what some other people are doing. Also, the John the Ripper config file is kind of intimidating. It's just a pain. And it surprises me, because I'll be talking to somebody who knows like three different scripting languages and a bunch of different programming languages, and they'll be like, oh, that John the Ripper config file, I can't figure that out. It's not that hard, but it just intimidates people a lot.
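The custom-script approach mentioned above can be sketched in a few lines of Python. This is an illustrative example rather than one of the scripts from the talk: it implements a single non-standard rule (one digit inserted inside the word, as in "p7assword"), and the names `insert_digit_everywhere` and `mangle`, plus the file names in the usage note, are made up for the example.

```python
import sys

def insert_digit_everywhere(word):
    """Yield the word with one digit inserted at every interior position."""
    for pos in range(1, len(word)):
        for digit in "0123456789":
            yield word[:pos] + digit + word[pos:]

def mangle(lines, out):
    """Write one guess per line so another program can read them."""
    for line in lines:
        word = line.strip()
        if word:
            for guess in insert_digit_everywhere(word):
                out.write(guess + "\n")

# Only runs when a wordlist path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as wordlist:
        mangle(wordlist, sys.stdout)
```

Assuming the script is saved as `mangle.py`, something like `python mangle.py words.txt | john --stdin hashes.txt` would feed the guesses to John the Ripper, which then only has to do the hashing and comparison.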
But I mean, there's a rules readme file there. I highly recommend reading it. I have it open all the time when I'm modifying the config file because I forget how they do it. If you're going to be doing some serious password cracking, you really need to take the time to learn that. But because I'm kind of tired of everyone whining about that every single time I mention John the Ripper, I went ahead and created a custom John the Ripper config file generator. It's menu-driven, as long as you don't mind text. Oh, cool, thanks. And it didn't make it to the CD. I'm sorry, it's very bond, the link to it. But once our main list upside, it will get up beyond that. And it's available for download right now at reusablesec.googlepages.com. I probably should have put that at the top there. I'm sorry. But not only does this allow you to create custom config files and stuff like that, I also threw in a couple of different things that I wanted as well, in a kind of feature creep. First of all, you can save all of your settings so you don't have to retype them every single time. You can add specialized password creation policies to this. So that way, you don't have to modify all of your config rules every single time you want to try a different site. So you can specify, like, I want the passwords to be at least eight characters long, have two numbers, two special characters, and so on. Also, I've made it very easy to change the character set that it uses. So that way, you can switch between different languages as well, since it seems like all the password crackers out there focus on English, and I kind of want to share the love a little bit there. So that makes things a little bit easier. Now, as I said before, brute force is a wonderful, wonderful thing to use. But usually, you don't have the option to brute force the entire key space.
So there is kind of a halfway thing called targeted brute force, where you try to brute force just the format of the passwords. So that way, if your dictionary is poor, you can still maybe crack the password. For example, you might want to brute force six characters followed by two numbers, and that speeds things up a lot. Even though you're brute forcing characters and numbers, you're only trying characters in one spot and numbers in another spot, so it actually narrows down the key space quite a bit. Now, you can do that directly in John the Ripper. It's a bit of a hack. On the CD, there are some additional slides on how to do that, and I included a sample brute force config file on the CD as well. But in reality, if you really want to get fancy with this, I highly recommend just writing a custom, you know, Perl script or something like that and piping it into John the Ripper. And that way, you can get really fancy, too, and start adding Markov rules and things like that to really try to help speed up brute force. Okay. Now, with word-mangling rules, though, we quickly found out that it really is a pain to specify them. The first couple of word-mangling rules are fairly easy. But as you get into less and less probable word-mangling rules, first of all, they take a long time to run, because if you add two numbers to the end of a word, you're trying 100 guesses. If you try four numbers on the end of a word, you're trying 10,000 guesses per word in your input dictionary. So you want to narrow it down quite a bit, and you end up creating a couple hundred different word-mangling rules, and then editing and managing them is a real pain. So this was actually Professor Sudhir's idea originally, and it really works out pretty well. And that is, this is why we create computers: why don't we have them go ahead and analyze password lists, try to figure out how passwords were created, and create custom word-mangling rules based upon that?
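The targeted brute force idea and the guess counts quoted above can be checked with a short Python sketch. The function names and the choice of lowercase ASCII letters are assumptions for illustration; the talk does not say which character set was used.

```python
import itertools
import string

def targeted_bruteforce(n_letters, n_digits, letters=string.ascii_lowercase):
    """Yield every guess made of n_letters letters followed by n_digits digits."""
    for head in itertools.product(letters, repeat=n_letters):
        for tail in itertools.product(string.digits, repeat=n_digits):
            yield "".join(head) + "".join(tail)

def targeted_keyspace(n_letters, n_digits, n_letter_chars=26):
    """Number of guesses when letters and digits are confined to fixed spots."""
    return n_letter_chars ** n_letters * 10 ** n_digits

def full_keyspace(length, alphabet_size=36):
    """Number of guesses when any letter or digit may appear in any position."""
    return alphabet_size ** length

if __name__ == "__main__":
    # Six letters then two digits versus eight unconstrained characters.
    print(targeted_keyspace(6, 2), full_keyspace(8))
    # Appending two digits costs 100 guesses per word, four digits 10,000.
    print(10 ** 2, 10 ** 4)
    print(list(itertools.islice(targeted_bruteforce(6, 2), 3)))
```

Because the generator is lazy, its output could be written to standard output and piped into a cracker in the same way as any other custom guess script.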
So the actual name of this is probabilistic context-free grammar word-mangling rules. So you can tell what we're writing a paper on right now. But really, in theory, all it does is try to figure out how passwords are created, and it assigns a probability to each word-mangling rule. And it also assigns probabilities to every single number, every single special character, capitalization, and everything else along those lines that it finds in the password list. So it'll say 1 is much more commonly used than 7, and 99 is much more commonly used than 11. And really what it does then is try to generate the most probable passwords first. Right now, the version we have, which is the one on the CD, has a fairly basic way of figuring out what the word-mangling rule is. So for "Password11", it would say, you know, uppercase the first letter, then seven lowercase letters, and then two numbers. And there's definitely stuff we're looking into adding. It's pretty easy to add onto this, you know, replace the a with an at symbol and stuff like that as well. What it does is, that structure there has a probability. The numbers have a probability. And you can actually even assign probabilities to input dictionaries. So you can say, this one is really common passwords, so I want to try a lot of word-mangling rules on it. But this other one is a much bigger input dictionary that I want to try eventually, with just some of the more common word-mangling rules. And it'll go ahead and spit out the guesses in order for you. So it might try, you know, "password12" and then "password!" before it tries "password13". So that way it really specifies all these different word-mangling rules really well for us. Now, our current implementation makes guesses and outputs them to standard out.
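The idea of learning structures and probabilities from a password list can be sketched in Python as follows. This is a heavily simplified illustration under stated assumptions, not the implementation from the talk: it ignores capitalization, treats every dictionary word of the right length as equally likely, and sorts all guesses in memory instead of generating them lazily. The names `segments`, `train` and `guesses` are hypothetical.

```python
import itertools
import re
from collections import Counter, defaultdict

TOKEN = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")

def segments(password):
    """Split a password into runs of letters (L), digits (D), symbols (S)."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(password)]

def train(passwords):
    """Count structures such as ('L8', 'D2') and the digit/symbol fills seen."""
    structures = Counter()
    fills = defaultdict(Counter)  # e.g. fills['D2']['12'] -> count
    for pw in passwords:
        segs = segments(pw)
        if not segs:
            continue
        structures[tuple(f"{kind}{len(s)}" for kind, s in segs)] += 1
        for kind, s in segs:
            if kind != "L":
                fills[f"{kind}{len(s)}"][s] += 1
    return structures, fills

def guesses(structures, fills, dictionary):
    """Return guesses ordered from most to least probable."""
    words = defaultdict(list)
    for w in dictionary:
        words[f"L{len(w)}"].append(w)
    total = sum(structures.values())
    scored = []
    for struct, count in structures.items():
        options = []
        for slot in struct:
            if slot[0] == "L":   # letter slots are filled from the dictionary
                ws = words[slot]
                options.append([(w, 1 / len(ws)) for w in ws])
            else:                # digit/symbol slots use learned frequencies
                n = sum(fills[slot].values())
                options.append([(v, c / n) for v, c in fills[slot].items()])
        for combo in itertools.product(*options):
            prob = count / total
            for _, p in combo:
                prob *= p
            scored.append((prob, "".join(v for v, _ in combo)))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [guess for _, guess in scored]

if __name__ == "__main__":
    training = ["password12", "monkey12", "dragon99", "secret!", "letmein12"]
    structures, fills = train(training)
    for guess in guesses(structures, fills, ["monkey", "password"]):
        print(guess)
```

With this toy training set, a symbol suffix is tried before the rarer digit suffix, which mirrors the ordering behavior described above. Materializing and sorting every guess does not scale, so a real generator would need to emit guesses in probability order incrementally.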
So that way we can go ahead and pipe this into any other, you know, password-cracking program, so we don't have to worry about the hashes and keeping track of which passwords have been cracked and so on, because I'm lazy and I don't want to have to code all that. So right now, in most of our tests, we actually output the guesses to standard out and feed them into John the Ripper, and use John the Ripper just to do all the password hashing. Now, I'll admit this structure does have some overhead. That's why I didn't just run it immediately. But the overhead is actually fairly low compared to a strong hash or even a semi-weak hash. So we have actually used this successfully in real-life cases. It's not like you run it for a couple of days and it takes, you know, ten times as long or so. We have a graph, and I'll show you the results in a little bit, and I can talk about this graph afterwards if you want. But this kind of shows the number of passwords we crack over time versus John the Ripper. So initially, during the very high probability rules, we do pretty much the same. In some cases, John the Ripper actually does slightly better. But after a certain point, ours starts to do much, much better. And unlike John the Ripper, where you have a set number of word-mangling rules and then it's done, ours has just millions of word-mangling rules. They're very small, but they're still, you know, millions upon millions of word-mangling rules. So if we let this run, it'll just keep on going and going and going. And we don't actually have to specify all these different word-mangling rules, which is really nice. So now we get to check to see if anything crashed or not. Oh, cool, it looks like it actually worked here. First time for everything, yep. So once again, we trained this one on the training set of the MySpace list, and we're trying to crack the test set.
If you want to see some graphs and stuff like this of us training on different password lists and then trying to crack other password lists, talk to me in the after-talk room, because we've definitely looked at this, too. But with John the Ripper, with the English word list, it cracked 3.2% of the passwords. With our password cracker, we cracked 5.6% of the passwords. With the common passwords list, John the Ripper cracked 1.7%; using the same number of guesses, we managed to crack 2.9%. So it's not perfect, but with the Wikipedia English one, John the Ripper cracked 12%. We cracked 21%. So in most cases that we found here, I mean, it does much better than John the Ripper. And that's pretty much it. Oh yeah, and here are just the graphs so you can see the number of passwords cracked there versus John the Ripper. I don't have much of an inside show. That would explain it. So that's it, though. If anyone has any questions, feel free to ask now or talk to me in the after-talk room. That's my email address. As I said, please email me if you have any ideas or anything like that. If you have any password lists you want us to crack, as long as they're legal, I'll definitely take a look at those as well. And that's a rough bit of it.