Okay, so your program promised you an open-source love story in three acts, and so, like every good three-act play, we open with the dramatis personae.

First, this is Peppercorn. Peppercorn is one of your users' dogs, and like any good dog owner, the first thing they did upon acquiring Peppercorn was go and change their password on all of their software services to "peppercorn".

This is Mallory. Mallory is an attacker. We will come back to Mallory later.

And this is me. My name is T.J. Schuck. I am on the internet everywhere as tjschuck, my name with the dots and spaces removed: GitHub, Twitter, also Instagram, and Yo, like tenderlove. So if you want to Yo me as well, you can. I work at Harvest; I'm a developer there. We are the makers of the world's best time-tracking software. If you do anything where you charge money for your time (agency, freelancer, consultant), you owe it to yourself to check it out.

Notably for this talk, I am not a security expert. There are real security experts who get paid lots of money to know about a lot more stuff than I do. But by virtue of having users, I have to be a security expert, and so do all of you. Ignorance is not an excuse here. If there is a security breach in your system, you can't just say "we didn't know any better"; that is insufficient. And you'll see later that there's a lot of interconnectedness here, so your breach can have further-reaching implications. So though I am not a security expert, I have to be, as do you.

So, back to Mallory. Let's talk about her attack. Good security is about layers. You should have all sorts of different kinds of security. You should have application-level security: you should protect against SQL injection, XSS, CSRF. Rails gives you a lot of that by default. You should also have infrastructure-level security.
You should have a secure data center. You should have physical firewalls between your devices. However, to truly investigate any particular layer of the layer cake that is security, we have to assume everything else has failed. So we have to assume that this works: Mallory can just run her script and get a raw dump of your database.

So let's consider what we are going to do with our users' passwords. What's the easiest option? We can store them as plaintext. It's easy, it makes sense: when someone logs in, they pass their password, we check it, and if it matches, great. This is obviously bad, and no one here is doing this, correct? Right? Does anyone want to admit to doing it? All right. Someone is doing it. Someone is not raising their hand, but someone's doing it, because "it doesn't matter": they have a site that's just, you know, GIF ranking, where you say "I like this animated GIF more than this one." It doesn't matter if their passwords leak; someone's just going to rank a bunch of GIFs on their behalf.

But this is bad, because users reuse passwords. If attackers break into your site and steal a bunch of passwords, they don't necessarily care about your site. But now they have a big, long list of usernames and passwords that they're going to try on Gmail, and on Facebook, and on all the banking sites, and they're going to get into some of them. And if they get into some of them, they now have additional vectors of attack to get into more of them. So it's important, again, to remember that you are part of a greater world, and you cannot be the weakest link in the chain, because ignorance is not an excuse for not knowing this.

So we know this is bad, and we need some way to obfuscate the dump so that when Mallory gets it, she cannot use it. The easiest thing is to just encrypt it. This is a very secure form of encryption known as ROT13. ROT13 is a Caesar cipher with a key of 13: you take all the letters and you shift them by 13, so an A becomes an N, a B
becomes an O, a C becomes a P. That's just for illustrative purposes. This could be 3DES, AES-256, any kind of real encryption. But the key to all of this is that there's a key: there's a way to reverse it. Here the key is 13. If you know 13, you can take all these passwords and reverse them back out again. So somewhere there has to be this secret. It's either in your app code, or it's on the server, or any number of places. But it's important to remember that since the attacker has already gotten through most of your defenses, they presumably also have access to that secret in some way.

It's also important to remember that an attacker could be a malicious employee who has easy access to your application code or your server. And they don't even have to be the only attacker. If there is a public leak and you work somewhere big, like LinkedIn, where there was a big password leak, all it takes is one malicious employee to leak out the encryption key, and then all of those passwords are broken.

So it's important to note that encryption is reversible: the data is obfuscated, but if you have the secret you can decrypt it and read it. Hashing is irreversible. If you have "peppercorn" and you apply a hash function to it, you get some output. And if you have "secret1234" and you apply some hash function to it, you get some output. But if you have some output, you don't know what the input is, because hashing is irreversible.
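To make the distinction concrete, here is a minimal Ruby sketch using only the standard library. The `rot13` helper is a name made up for this illustration; ROT13 stands in for real encryption exactly as it does on the slide, and MD5 appears only because the talk uses it, not because it is a good choice.

```ruby
require "digest"

# ROT13 as a stand-in for encryption: applying the key (13) a second time
# reverses it. `rot13` is a hypothetical helper, not part of any library.
def rot13(text)
  text.tr("A-Za-z", "N-ZA-Mn-za-m")
end

encrypted = rot13("peppercorn") # obfuscated, but only until the key is known
decrypted = rot13(encrypted)    # anyone holding the key gets the plaintext back

# Hashing has no key and no inverse: only the digest is stored.
digest = Digest::MD5.hexdigest("peppercorn")

puts encrypted
puts decrypted
puts digest
```

The encrypted value can always be turned back into the password by whoever holds the key, while the digest can only be compared against other digests.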
There is no inverse function to give you the input.

Additionally, hashing is deterministic. This becomes useful for authentication, because when you hash "peppercorn" you get an output, when you hash "peppercorn" a second time you get the same output, and when you hash it a third time you get the same output. That's how you can actually do authentication: when the password comes in, you hash it, and if the hashes match, you know that the input was the same. But you can't back out, because it's irreversible.

It's also important that it's deterministic but not obvious. So "peppercorn" hashed a second time has the same output, but if you trivially change the input (say you capitalize "peppercorn"), the output is completely different. And if you have output that's just trivially different (here the least significant bit is off by one), you have no way of telling what the input is.

So this is great. All of our problems are solved. We just hash all of our passwords, and now anyone with a dump can't reverse them to get plaintext passwords. Throughout all this I'm using MD5 just because it's shorter, so it fits on slides better, but SHA-1 is effectively the same. So we can't go backwards to the password. All of our passwords are safe.

However, we have a problem, in that hashing is deterministic. This is a double-edged sword, because the hash of "peppercorn" is always the same. And this leads to the concept of rainbow tables.

A couple of logistical points here. First of all, that's the best slide that's ever been made. Second, I have already turned it into a GIF for you; I will tweet it out later. Third, we're not actually going to be talking about rainbow tables. We are technically going to be talking about lookup tables, which are kind of a generalization.
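As a rough illustration of both points (determinism and what a lookup table is), here is a small Ruby sketch. The helper names `build_lookup_table` and `crack` and the four-word list are assumptions made up for this example; a real attacker's table would hold millions of entries.

```ruby
require "digest"

# Deterministic: the same input always yields the same digest.
first  = Digest::MD5.hexdigest("peppercorn")
second = Digest::MD5.hexdigest("peppercorn")

# Not obvious: a trivially different input yields an unrelated digest.
capitalized = Digest::MD5.hexdigest("Peppercorn")

# A lookup table is a precomputed map from digest back to the word that
# produced it. Determinism is exactly what makes this possible.
def build_lookup_table(words)
  words.each_with_object({}) do |word, table|
    table[Digest::MD5.hexdigest(word)] = word
  end
end

# Returns the original word if its digest is in the table, nil otherwise.
def crack(table, digest)
  table[digest]
end

table = build_lookup_table(%w[password letmein peppercorn universe])

puts first == second                   # same input, same digest
puts crack(table, first).inspect       # found in the table
puts crack(table, capitalized).inspect # not in the table
```

The capitalized variant is only safe here because the tiny word list does not contain it; a larger list would.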
Lookup tables are the first step toward rainbow tables; rainbow tables are a little bit more complicated. The understanding you're going to get is effectively the same: they have the same consequences and the same ways to mitigate them. And if I were talking about lookup tables, my slide would look like this, and that's boring. So instead you get this one. But remember, we're talking about lookup tables here.

Okay, so Mallory has this dump, and she wants to work backwards from a table to see what the original value was. As a proof of concept, we can use the world's best lookup table, which is Google. If you just drop in the hash and Google it, we don't even have to leave the results page: you can see that that MD5 is for "peppercorn". So Mallory, with a better tool than just Google, can figure this out.

So we need some way to render all of these precomputed tables obsolete, so that you can't just Google a hash and see what it is. The easiest way is to change all the inputs. We know that the hash of "peppercorn" is this, but if we append a string of nonsense to it, we get a different hash. In our app's password-hashing method we just say: here's our app-wide string of nonsense, append it every time. And you check our lookup table, and there's nothing there. So we did it. We totally defeated it.

Except we didn't. An attacker cannot just look up the password in a precomputed table now, but they can generate a new table trivially, because it's not that hard to generate these hashes. On this MacBook Air, which has never been considered the fastest computer on earth, I can calculate 13 million SHA-1s every second. 13 million a second, on a MacBook Air. So if you have that altering scheme, from the app code or from the server (again, it's a secret that's built into your thing), someone can just use it to attack by doing the same method.

This is where Harvest was. So, as a proof of concept,
I decided to white-hat attack our database. I pulled down a dump. I did my best to anonymize all this data, so I only worked on the raw hashes. You can use any freely available program; going through and Googling each individual hash would take a long time. There is one called hashcat that you can download. Google it; it's not hard. You can use John the Ripper, which you can install via Homebrew on a Mac. It is not that hard to do this. You also need a word list. I Googled for about ten minutes and found a 25-million-word dictionary, and I ran it through hashcat. As I was considering whether I should make a cup of coffee, 87 seconds later, there was "peppercorn", along with 80,000 other passwords. 80,000 passwords in under a minute and a half.

Now, this isn't even a majority of our users, but it's enough to do significant damage. I could now go and attack everything. Again, I can't: I know these passwords, but I don't know whose they are. I don't know your password.

Additionally, in this list of what you would think are insecure passwords were also these, which a lot of people at first blush might think are secure. The first one, the "universe" one, looks secure because of that sort of leetspeak character swapping. But these word lists are really good and have a lot of those, and the programs are really good and can do a lot of these automatically, so even if your word list only has the word "universe", it knows to swap things out. The second one looks really good until you look at a QWERTY keyboard and realize it's effectively a hardware hack where you're just tracing keys on a keyboard. Again, they're smart enough to know that. The last one, I have no idea. I don't even know what that is. I can't figure out why it's in there. It's probably just in there because of length.
It's not that long. But again, they're good. These passwords are easily cracked, and as a user you should be using some kind of randomly generated long password, but that's a different talk.

So our last attempt was close, but it was using a global salt, where we were just appending known nonsense. Instead we can do a per-password salt: rather than "peppercorn" plus the global nonsense, we give every individual user a different one. This now effectively gives all of our users strong passwords. Everyone has a very long, very randomly generated password. We just store the salt in the database next to the hash. Knowing the salt doesn't particularly help the attacker, but now the computational complexity is greater, because they have to compute a table for every individual user. (I got rid of the email column for space here, by the way. You would normally also have that.)

If you have a sufficiently random salt of sufficient length, you get enough entropy that people with repeated passwords also have different hashes: if you have a thousand users with the password "password", they all have different hashes. So this is pretty good.
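The difference between the two salting schemes can be sketched in a few lines of Ruby with the standard library. Everything here is illustrative: the helper names, the record layout, and the choice of SHA-1 and a 16-byte salt are assumptions for the example, not Harvest's actual code.

```ruby
require "digest"
require "securerandom"

# Global salt: one app-wide string of nonsense appended to every password.
# Two users with the same password still end up with the same digest.
def global_salt_digest(password, global_salt)
  Digest::SHA1.hexdigest(password + global_salt)
end

# Per-password salt: a random salt is generated for each user and stored
# next to the digest, so an attacker needs a separate table per user.
def create_record(password)
  salt = SecureRandom.hex(16) # 16 random bytes, hex encoded
  { salt: salt, digest: Digest::SHA1.hexdigest(password + salt) }
end

# Re-hash the candidate with the stored salt and compare.
# (A production system would also want a constant-time comparison.)
def authenticate(record, candidate)
  Digest::SHA1.hexdigest(candidate + record[:salt]) == record[:digest]
end

alice = create_record("password")
bob   = create_record("password")

puts alice[:digest] == bob[:digest]   # same password, different digests
puts authenticate(alice, "password")
puts authenticate(alice, "peppercorn")
```

The salt is stored in the clear on purpose: its job is to force per-user work, not to be a secret.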
This gets us pretty far, but this is pretty good for 1976. This is approximately what Unix's crypt(3) did in 1976, and at the time, modern hardware could calculate about four per second. So there was enough difficulty to keep someone from generating a million lookup tables for all these users. But today we have these. This is an AMD Radeon HD 7990. It costs about a thousand bucks, so anyone can reasonably buy one. It can calculate 1.5 billion hashes per second. So my MacBook Air, 13 million; this, 1.5 billion. It makes generating these one-lookup-table-per-user calculations feasible again.

And the problem is that all of these hashing algorithms we've been talking about, SHA-1 and MD5, are not made for password hashing. They're made for things like checking file validity on both ends of a network transfer, and because of that they're designed to be fast. So because they're designed to be fast, and hardware keeps getting faster, they keep getting computed faster. And now we have a problem.

So in 1999, Niels Provos and David Mazières published a paper, "A Future-Adaptable Password Scheme", where they were trying to solve this exact problem, and they came up with bcrypt. Now, bcrypt has all the goodies we've already discussed. It's a one-way hash, preimage resistant. It's deterministic. It has built-in salts, so you don't even have to worry about doing them anymore. But it has two additional goodies that are the actual notable ones, which we're going to talk about a little bit more.

One is that the underlying cipher is Eksblowfish. It's based on Blowfish, which is notably expensive; it's known for taking a long time to boot up. But it has a new setup algorithm to be even more expensive: the "Eks" stands for "expensive key schedule". So it requires more memory, which makes GPUs and other specialized hardware less feasible in an attack. But more interesting is the notion of an adaptive cost. That was right in the title of the paper. That's how important it is.
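The hardware numbers quoted above make the problem concrete. The sketch below is only back-of-the-envelope arithmetic: it assumes the 25-million-word list from the Harvest experiment as the size of one per-user table, and it treats the three quoted hash rates as directly comparable, which is a simplification. The helper name `seconds_per_table` is made up for this example.

```ruby
# Seconds needed to hash every word in a list once, i.e. to build one
# per-user lookup table, at a given hash rate.
def seconds_per_table(word_count, hashes_per_second)
  word_count.to_f / hashes_per_second
end

word_count = 25_000_000 # word-list size mentioned earlier in the talk

rates = {
  "1976 hardware (about 4/s)"  => 4,
  "MacBook Air (13 million/s)" => 13_000_000,
  "GPU (1.5 billion/s)"        => 1_500_000_000
}

rates.each do |label, rate|
  seconds = seconds_per_table(word_count, rate)
  puts format("%-28s %.4f seconds per user", label, seconds)
end
```

At four hashes per second one table takes on the order of months; at GPU speeds it takes a small fraction of a second, which is why a per-user salt alone stops being enough.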
So let's look at the dump again. We have a dump with all of these bcrypt digests in it. A bcrypt digest looks approximately like that. So let's investigate its anatomy.

First of all, ignore the dollar signs; they are just delimiters and don't really signify anything. At the end here we have the actual output of the hash. That is the checksum; that is the hash. It's 192 bits encoded into Base64. To the left of it is the salt. It's a 128-bit salt that, again, you don't have to worry about; it's just taken care of by the bcrypt algorithm. It's again Base64 encoded to be a little shorter. On the left here, that means this is bcrypt. A value of 2, 2a, 2b, 2x, or 2y all signify bcrypt, for assorted historical reasons. We can go over those later, but generally you'll see 2a, because that's what bcrypt-ruby uses. But this is the most interesting one: this is that notion of cost.

So let's see what that means. When you want to bcrypt a password, you additionally pass in this cost parameter, and that gives you an output. If you do it a second time, you get a different output, but that's because of salting; we already know that. And if you do it a third time with a higher cost, you get a different output yet again, but there you see that the cost is different. This isn't particularly notable until you look at the time it takes to do each of these. The first one took about 0.06 seconds, the second one the same, and the third one took just over one second. And that's what that adaptable cost is: as hardware gets faster over time, we can march that cost forward along with it to get more expensive calculations.

These are all approximate averages using my laptop right here. A cost of 9 takes about 0.03 seconds to calculate one of these hashes, and a cost of 16 takes about four seconds. Right now, today, you probably don't want to do that last one, because you don't want to add four seconds to your login flow, but somewhere in the 12-to-13 range is probably
something that wouldn't be noticed by your users but would definitely be noticed by an attacker. Using a cost of 12 in Harvest, that previous attack that took 87 seconds would now take about 84,000 years. So we've now gotten over that hump of the cost calculation, where it's no longer worth it.

So bcrypt is kind of that sweet spot. It has a Ruby library (we'll talk about that a little bit more later), which makes it very easy to use, and it gives us this future-proofing. Some people right now who skipped ahead in class are saying, "Well, why don't you just use PBKDF2, or why don't you use scrypt, or something else?" And that's fine. You're ahead of the game. Those are okay. Those are probably fine. I disagree with you, and we can talk about the finer points later, but if you are already using PBKDF2 or scrypt, you can stick with it. It's if you're using an older algorithm like SHA-1 or MD5 that you'll want to change.

So how can we fix this problem if we have it currently?
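Before getting to the fix, the digest anatomy and the cost figures described above can be sketched in plain Ruby. This does not use the bcrypt-ruby gem; `parse_bcrypt_digest` and `estimated_seconds` are hypothetical helpers written for this illustration, the digest string is a commonly circulated example rather than anything from Harvest, and the timing baseline (cost 9 at about 0.03 seconds) is simply the laptop figure quoted in the talk.

```ruby
# Split a bcrypt digest string into the parts described in the talk.
# Returns nil when the string does not look like a bcrypt digest.
def parse_bcrypt_digest(digest)
  pattern = /\A\$(2[abxy]?)\$(\d\d)\$([.\/A-Za-z0-9]{22})([.\/A-Za-z0-9]{31})\z/
  match = pattern.match(digest)
  return nil unless match

  { version: match[1], cost: match[2].to_i, salt: match[3], checksum: match[4] }
end

# Each +1 in cost doubles the work, so timings scale as powers of two
# from whatever baseline was measured on a given machine.
def estimated_seconds(cost, baseline_cost: 9, baseline_seconds: 0.03)
  baseline_seconds * (2.0**(cost - baseline_cost))
end

example = '$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy'
parts = parse_bcrypt_digest(example)

puts parts.inspect
puts estimated_seconds(12) # roughly a quarter of a second per hash
puts estimated_seconds(16) # close to the four seconds quoted above
```

Because the version, cost, and salt travel inside the digest string, raising the cost later needs no schema change: new digests simply carry the new number.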
We need the plaintext version of the password, because if we have hashes, we already learned they're irreversible. If you already have the plaintext, if you're in that step one, this is easy: just run some kind of one-shot to go through them all and convert them all. Otherwise we need the plaintext password, and the only place we get this is in our current authentication method.

So currently we have something that looks like this: authenticate with the plaintext password, which we hash and then match. We want to get to here, where we're using bcrypt to check them. The astute members of the audience will notice the double equals there and get immediately concerned, because I told them that bcrypt is irreversible, and we're comparing this plaintext string against what appears to be the reversed bcrypt digest. This is actually because bcrypt-ruby overloads the == operator. I think this is one of the worst design decisions of bcrypt-ruby, and I intend to reverse it eventually. I actually just got an issue on bcrypt-ruby like two weeks ago from someone who was confused about this. But just know that it's being overloaded, and it's not just doing a string comparison.

So the easiest way to do this is to hook into our auth filter with a pre-filter, so you get the plaintext password, you run it through this conversion, and then you check it, because it's been converted. And the conversion's easy too. bcrypt provides a way to check: is this already a valid bcrypt hash? If it is, just kick back, because we're already good. Otherwise, take it and convert it. Easy as pie.

This code works, because this is exactly what we used in Harvest. We dumped this in, and this is two and a half weeks of natural conversion of our users. This is the number of users who had bcrypt passwords. There's the immediate big spike up front as daily users and all the API hits came in, and then it slowly tapers off as we catch more and more. We needed
some way to get the rest, and because I had already white-hat attacked the database, I just did it a second time, but this time with conversion in mind, and we got a big spike that took us the rest of the way. There were a couple of remaining ones that didn't get caught in this: they both had strong passwords and hadn't logged in recently. Those we just reset, and we sent them an email to let them know what was up. It wasn't that many users, and it wasn't that difficult. So if you have this problem, it's not that hard to fix. You can do it.

There is one downside to bcrypt, and that is that it is an expensive algorithm. Because it's an expensive algorithm, you will see more load on your servers. This is the CPU across all of our servers. You can see it approximately doubled after we launched bcrypt. However, it's still within the realm of tolerability. We also probably have a higher level because we still support basic authentication through our API, which gets used a lot, and every single one of those requests carries a username and password, which triggers another bcrypt computation. If you are already whole hog on OAuth and only support OAuth, you won't see as big of a hit. But again, it's totally worth it.

So act one was the easy part; you can learn that on Wikipedia. Act two is where we have conflict in a three-act play. So this is about fat binary gems.

We talked about bcrypt-ruby. bcrypt-ruby is the Ruby gem for bcrypt. I wanted to add a feature to it; it was missing something that I thought would be convenient. But the tests didn't run, and the dependencies were out of date, and there were docs missing, so that one pull request turned into a dozen pull requests. And after they were all merged: you can get commit bit by persistence, so after enough annoyance I asked, "Can I get commit bit?", and then suddenly I was the maintainer of bcrypt-ruby. There have been a couple of de facto maintainers before me, most of whom no longer do it.

This is what bcrypt-ruby looks like, the
source of bcrypt-ruby. But more accurately, this is what it looks like: it's just a Ruby gem that's a wrapper around C and Java extensions. So when you distribute a gem, as a convenience to your users, so that they don't have to have a compiler on their system, you can provide compiled binaries. But you can see that every version has four different versions. One is the compiled Java version, one is the compiled *nix-like version, and the top two are Windows versions, one for x64 and one for 32-bit.

The problem is that for Windows you need to have what are called fat binaries, which provide support for Ruby 1.8, 1.9, and 2.0 on Windows in one wrapped-up thing. I'm not a Windows developer, so I had no good way to do this. Luckily, when Aman added this to bcrypt-ruby, he left these long commit notes about how to do it. But he left them three years ago, and like anything about computers written on the internet that's three years old, it doesn't work. Five years ago Aaron introduced the notion of fat binary gems, where I found out that he made the same Queen joke as me five years before me. But additionally, like anything written about computers on the internet that's five years old, it definitely doesn't work.

All of this is ultimately just wrapping up rake-compiler, which is this great gem that abstracts a lot of this away. So I followed through all the docs, of which there are many (it is very lengthy), and none of it worked.

So the Rails team has this rails-dev-box that they use to keep you from having to have, on your development machine, all the dependencies that you need to run the Rails tests. You can see what's in the box: all these dependencies. So I had a dream that was inspired by this, where I wanted to make a rake-compiler-dev-box that had all the Rubies you needed, GCC, the JDK, MinGW (which is the library that allows *nix-like machines to compile Windows binaries), and I thought, man, this will be great.
We can all have these fat binary gems forever. And Vagrant kind of offered that solution. It's right what it says on the tin: "create and configure lightweight, reproducible, and portable development environments." Exactly what I wanted. And it didn't work. So what do you do with anything that doesn't work? You put it on GitHub. And with that, I opened up rake-compiler issue number 79.

In our three-act play, this would be the climax. This is our turning point. You were promised a love story. This is Luis Lavena. Luis is the developer of the One-Click Ruby Installer for Windows. As part of that work, he became a member of the Ruby core team. As part of the work of both of those, he was voted a Ruby Hero in 2010. But, important for this talk, he is the developer of rake-compiler.

So when I opened that rake-compiler issue, I said, "Listen, I did everything I can. I followed all the docs. I just can't get this to work." And then Luis opened up rake-compiler-dev-box number two, which was his pull request to attempt to help me through this trying time. And on this epic thread, he dropped triple hearts on me not once, not twice, not three times, but four times.

So I need all of you to take out your assorted internet devices, phones, computers, and help me pay him back. Everyone tweet at Luis right now, three hearts. Thank him for being a wonderful maintainer and OSS collaborator. Pro tip: if you have a Mac that's running Mavericks or later, you can hit Control-Command-Space and get access to the real emoji.

So Luis helped me through this, and to half of you, I want to implore you to find your Luis and thank them for the work they do. It could be a maintainer of something you use, it could be a co-worker, or whatever. But more important than just thanking, because that's easy: collaborate with them. You'd be surprised how willing they are to accept this collaboration, and it can be very worthwhile. Additionally, Luis is from Argentina and lives in Paris, so it was a nice little fun global collaboration.

To the other
half of you, the half that are already maintaining assorted things, the Luises already out there: I want you to be the Luis you wish to see in the world. So next time someone comes to you with that issue you've seen a thousand times already, or has a question that you think they should have just posted on Stack Overflow, or, God forbid, it's a stupid question, before you say "RTFM", remember that the problem might be the FM, and do your best to help them through it. The world will be better for it.

So what have we learned?

Number one: just use bcrypt. Just do it. It's easy. It's not that hard. Talk to me afterwards if you want to go through this. We'll laugh, we'll cry, we'll hug, we'll convert our passwords.

Number two: distribute a dev box if you have anything that has complicated external dependencies. Other people will be much more likely to collaborate with you, submit pull requests, and help you go forward with it if you make it easy for them. Also, if you have something that has external dependencies, you should try rake-compiler-dev-box. It will help you cross-compile your gems without pulling your hair out.

But most importantly, I want to encourage you to release, collaborate, and iterate. Thank you.