So, John Downey works at Braintree. They do payments, so they care very deeply about security and cryptography-based security. John's from Chicago. As I said, he works at Braintree, and he's going to tell you about modern cryptography in Ruby. Thank you, John. Thank you. So, as Josh just mentioned, I'm going to talk a little bit about cryptography. If you're interested, I just tweeted a link to the slides; I'll also have this information back up at the end so you can grab it. You'll want to follow along with the slides, though. So as was mentioned, I work at Braintree. Braintree is a payment gateway, so what we're interested in is helping our customers find ways to get paid. In doing that, we have to deal a lot with financial compliance and a lot with security. So, no surprise, like most companies, we're hiring. If you're interested in working at Braintree, we have an office in Chicago and one out here in Menlo Park, so come talk to me. Also, if you're a Braintree customer and you're here, I would love to talk to you about your usage of Braintree. So the first part of this talk is going to be a general overview of cryptography in the modern sense, and the second part is going to be about how we as developers commonly tend to mess this up. I'm really interested in how we, as developers, can find ways to teach other developers to use cryptography properly and securely. So for a quick overview: cryptography means secret writing in Greek. In our modern setting, it typically serves three purposes. First, encryption. This is all about keeping something that I wrote secret, so that even if it's intercepted, it can't be read. When someone talks to you about cryptography, usually this is what you think of. The next thing is authentication. And by this I don't mean user login; I'm talking about making sure that the message hasn't been changed in transit.
This is also called integrity protection. And then the third thing is identification: who sent me this message? You often see this in the form of digital signatures. So modern cryptography is a very rigorous science. It's based on hard math problems, or problems that are considered hard to solve on classical computers. An example is that factoring large numbers into their base primes is considered extremely hard, and that's the basis for the algorithm RSA, which is a really popular public key cryptography algorithm. So what we're doing with modern cryptography is betting that there are going to be no major advances in math or in computing. Cryptography should be peer reviewed, just like all sciences. And one of the key things is that you don't want to invent your own, unless that's your field specifically. It's really easy for you to come up with an algorithm that you think is unbreakable; it's very hard for you to come up with one that other people can't break too. Another thing, though: even if you have these great algorithms that are really well understood, don't try to implement your own. There are a lot of edge cases and things that you just don't think about, where you could screw it up big time. So, depending on your trust level of the government, you'll want to stick to government standards. These have been reviewed by a lot of academics, they've been put under a lot of pressure, they've been reviewed by the National Security Agency, and they've gotten through this process. So I always recommend that people stick to the government standards. Another thing about modern cryptography: we have a guiding principle called Kerckhoffs's principle, which says I can tell you everything about my cryptosystem, except I'll keep the key private, and that should be strong enough. You shouldn't have to keep what algorithms you use secret. You should only have to keep your key secret.
The opposite of this is commonly referred to as security through obscurity. So cryptography itself is very strong. The individual algorithms themselves are very strong. I don't expect to wake up tomorrow morning to news that the RSA algorithm is broken, that they just completely broke it. But what I do expect to see is that the ways we compose these algorithms together commonly fail. And when they do, it's catastrophic. Oftentimes the primitives themselves are misused, and I'll have an example of that later. Another thing that tends to come up in development is that we spend a lot of time focusing on the algorithms themselves and not on how they fit into the larger system. There's a quote I really like from the book Cryptography Engineering by Bruce Schneier and Niels Ferguson. Bruce Schneier is a really big name in security. It says: you've probably seen the door to a bank vault, ten inch thick hardened steel, with bolts to lock it in place. It certainly looks impressive. Oftentimes we find the digital equivalent of this bank vault door installed into a tent, with people standing around arguing about how thick the bank door should be, but not paying any attention to the tent. So in general, you should approach any time you're going to have to deal with cryptography in your systems and applications with a really healthy dose of skepticism. Crypto is very difficult to verify and test, and unfortunately even the experts screw it up most of the time. Some really good general prescriptive advice, though: if you have a model where data is going in transit, so over a network, you can use TLS or SSL. TLS stands for Transport Layer Security; it's the successor to SSL. You could also use SSH or a VPN. If you have data at rest, and this is data that's sitting on disk or in Dropbox or S3 or something like that, use GPG. If it doesn't fit one of these two models, you might want to try to rework it until it does.
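In Ruby, the data-in-transit half of that advice usually just comes down to making sure TLS certificate verification is actually turned on. Here's a minimal sketch with the standard library's Net::HTTP; the endpoint is made up, and the point is only the configuration, not the request itself:

```ruby
require "net/http"
require "openssl"
require "uri"

# Hypothetical endpoint -- what matters is the TLS setup below.
uri = URI("https://api.example.com/widgets")

http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
# VERIFY_PEER makes Ruby actually check the server's certificate chain.
# VERIFY_NONE (which old examples sometimes use) silently accepts any
# certificate and invites man-in-the-middle attacks.
http.verify_mode = OpenSSL::SSL::VERIFY_PEER

# http.get(uri.request_uri) would now go over a verified TLS channel.
```

Nothing connects until you issue a request, so the expensive part here is just getting the two settings right; frameworks and HTTP client gems generally default to this, but it's worth checking rather than assuming.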
These two methods are really well understood ways to secure your data as it either goes over the wire or sits on disk. So this was a last minute addition slide. The internet is all abuzz about this new crypto vulnerability that's going to be talked about next week. It's actually going to be announced next Friday. It's called CRIME, and it's from the authors whose previous vulnerability was called BEAST. So at the very least, they're really good at naming. What it is, is an attack on SSL, and it attacks the compression. Or we believe it's going to attack compression in SSL. If you haven't heard about this, I imagine within the next week you definitely will. The best bet and the general advice right now is to disable SSL compression on your servers, and that will mitigate the vulnerability. Even though TLS and SSL have their warts, they're still really well understood and well pounded against by the academics and the government, so they're still the best thing we have for this situation. So now let's talk about some places where crypto goes wrong. There are four specific places we're going to talk about, and they're just things developers commonly screw up. The first is random number generation. Randomness is a really central part of any cryptosystem. We use it to generate our encryption keys, our API keys, our password reset tokens, our session tokens. Recently there was a talk at a security conference where some researchers were able to attack common PHP applications and the way they use randomness to generate password reset tokens. They were able to predict what password reset token was going to be sent to a user, so that they could actually reset the user's password. The title of the talk was I Forgot Your Password. So this is probably one of my favorite slides. It's actually really difficult to see from over here, but the one on the left has a very clear period.
You can see that it looks ridiculous and how it just forms a pattern. This would be similar to what you would get if you just called Kernel#rand. It's not a cryptographically strong random number generator. The one on the left is actually from PHP, again, on Windows and a really old version. So it doesn't look nearly as bad anymore, but this is the best one to get the point across. And the one on the right is just output from a really cryptographically strong random number generator. So this is a line of C code from OpenSSL. A little backstory: what this line did is mix random data from the system into OpenSSL's random number generator. Unfortunately, in 2006, this was commented out in OpenSSL on Debian, which then got into Ubuntu. And it was not rediscovered until 2008. So for two years, the random number generator in OpenSSL on Debian was broken. This kind of goes to show you that good cryptography looks very much the same as bad cryptography. Without this line, the only input to the random number generator was the process ID of the current process, which is a fairly small number on Linux systems. You could just exhaust that really quickly and guess the initial state of the random number generator. The reason this was removed was a commit to fix a warning from Valgrind. If you're not familiar with Valgrind, it's a program to help you analyze memory leaks and things like that in your C programs. The thing that really gets me upset is that the person who made this commit ran it by the OpenSSL mailing list and got very little feedback. And so it went in. The fallout of this was that every key generated on these systems for those two years had to be completely revoked, and for those two years the systems were not secure at all. So this is that line today. I went back to take a look at it, and they have appropriately documented what it does, so this doesn't happen again.
So some recommendations. When you're dealing with things that require a secure random number generator, use your cryptographic library's random number generator. In Ruby, we have a really great one in the standard library called SecureRandom. And we also have OpenSSL::Random. SecureRandom will actually wrap OpenSSL's random number generator, or it'll fall back to a system source of randomness, either on Windows or on Unix systems. So it gives you a consistent API over random number generation that is somewhat portable; you don't have to think about where it's coming from. On Linux and other Unix-like systems, if you just need some randomness on the command line, you can use /dev/random or /dev/urandom. The difference is that /dev/random will actually block and wait until the system has generated enough entropy before it will return. A real quick explanation: entropy is the measure of how unpredictable the next output of the system is. So if I flip a coin, we would say that has one bit of entropy, because it's one bit unpredictable. You would see this manifested as your output just stopping. If you've ever been generating a GPG key or an SSL key and all of a sudden your output stops, and you have to bang on the keyboard for a few minutes before it starts again, that's what's happening: your system is blocking for entropy. So the next thing I want to talk about is length extension attacks. But before we can talk about length extension attacks, we have to talk about the thing that they're attacking, which is a hash function. I like to think of hash functions as fingerprints, and they're often called fingerprints. The really important thing about a hash function is that it's one way; it's not reversible. Just like a fingerprint, without some sort of fingerprint database, you can't take someone's fingerprint and know whose it is. You would have to go look that up.
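Before we get to hash functions, here's what that SecureRandom recommendation looks like in practice. The token sizes below are just illustrative choices, not prescriptions:

```ruby
require "securerandom"

# Kernel#rand is fine for games and sampling, NOT for secrets:
# it's a seeded Mersenne Twister, and its output is predictable
# once an attacker can observe or guess enough of it.
weak = rand(10**20)

# SecureRandom pulls from the OS CSPRNG (via OpenSSL or the system):
session_token = SecureRandom.hex(32)            # 64 hex characters
reset_token   = SecureRandom.urlsafe_base64(32) # URL-safe, good for email links
api_key       = SecureRandom.base64(24)

puts session_token
puts reset_token
```

Using SecureRandom for session tokens, password reset tokens, and API keys is exactly the fix for the predictable-token attack mentioned above.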
And ideally, we would like to think that no two people have the same fingerprint. Just like, ideally, we wouldn't be able to find two inputs to a hash function that have the same hash output. So the text at the top, which I don't expect anyone to read, is the mission statement of US Cyber Command. And for whatever reason, they liked it so much that they ran it through the MD5 hash function, which comes out to this thing down at the bottom, called either a fingerprint or a digest. They liked that so much, they put it into their logo, in the gold ring on the inside. So I'm not really sure why they did this, and I especially don't know why they used MD5. For hash functions, a general recommendation: if you're building a new system today, use SHA-256. It's part of the SHA-2 family. MD5 is actually considered broken, and the federal government recommends not using SHA-1 after 2010. Despite those two things, I will have slides later that use SHA-1 as an example, because it has a smaller output, so it's much easier to fit on the screen. So back to our length extension attacks. Now that we know what a hash function is: this is somewhat of a typical request to a RESTful web service. We want to create a new thing called a widget. And we'll attach a signature to this. The signature lets us verify who's sending it to us, and maybe lets us control whether or not they should even be able to create this thing. So in a lot of systems, unfortunately, we see the signature computed somewhat like this: they'll have an API key of some kind, they'll prepend it to the thing that they want to send, and then they'll run it through a hash function like SHA-1. And that will be the thing that they send as the signature. So I'll tell you the dirty secret about the current generation of hash functions: the output that you get back when they're done is just their internal state at the end of a loop.
So they kind of reveal to you exactly what happened inside of them. The state begins in the top left. The variables h0 through h4 aren't special in any way; these are just some internal variables inside this function, and the values they're initialized to are from the spec. These are the starting values for these variables. What happens is, your text goes through this function, and it updates these internal variables. And at the very end, the values of these internal variables become the output of your hash function. So here are those variables again, and you can kind of see them; I've concatenated them together. So we'll say the SHA-1 of this string is this. That's how the current generation of hash functions works. So with that, I'll explain the length extension vulnerability. What if I wanted to add this to the end of the query string, where price equals 0? It would look something like this: we'd have our API key, and then, appended to it, our query string. In Rails and Sinatra and a lot of web frameworks, the last parameter wins. So price from this request would actually equal 0, which would be bad. To exploit this with length extension, I'll start with the state that came out at the end, and run just that price equals 0 part through. What I get is some new state. So I've run it through just that part of the function. This new state on the bottom right, these new values, is actually equal to the output of running the whole extended string through SHA-1. The unfortunate thing is we can append whatever we want to an existing string without having to know the API key. So effectively, we would create a request that the server thinks looks somewhat like this. And once again, we never had to know the API key. This isn't theoretical. This has been exploited in the wild. Flickr had this vulnerability for a number of years.
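To make the construction concrete, here's a minimal Ruby sketch of the vulnerable prefix scheme from the slides, next to an HMAC, which is the fix discussed below. The key and parameters are made up for illustration:

```ruby
require "digest"
require "openssl"

API_KEY = "hypothetical-secret-key"  # illustrative only

# Vulnerable: SHA1(key + message). Because the digest is just the
# hash function's final internal state, an attacker who sees one
# signature can keep hashing from that state and sign an extended
# message without ever knowing the key.
def sign_naive(params)
  Digest::SHA1.hexdigest(API_KEY + params)
end

# Resistant: HMAC wraps the key into the hash twice (inner and
# outer), so the output is not simply the state of hashing
# key + message, and length extension no longer applies.
def sign_hmac(params)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), API_KEY, params)
end

params = "type=widget&price=9999"
puts sign_naive(params)  # 40 hex chars, length-extendable
puts sign_hmac(params)   # 64 hex chars, safe against extension
```

The actual extension attack also involves reproducing SHA-1's internal padding bytes, which is omitted here; the point is only the difference between the two constructions.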
It was actually discovered by the guys who did the CRIME and BEAST vulnerabilities. So how do you prevent this? First off, don't do that. Use HMAC-SHA256. HMAC is a construction that uses a hash function to generate what's called a MAC, a message authentication code. You'll sometimes hear it called a keyed hash function. These are resistant to length extension attacks. Another thing real quick about length extension attacks: if anybody did the Stripe Capture the Flag recently, that was the answer to level seven. It was a length extension attack. So the next thing I want to talk about is password storage. Passwords are a really huge topic, and they get really complicated once people introduce complexity requirements and things like that. But the way we tend to store passwords, maybe a little more outside the Ruby and Rails community, is just terrible. And then we have things like this happen. The day after the last time I gave a talk on cryptography, LinkedIn leaked most of their password database. This was a big news story. And it's happened to Last.fm, Yahoo, Dropbox, Blizzard, and a number of other companies. In most cases, what they leaked was the hash of a password, and they thought that was what was going to protect them. The problem is that at some point, we deemed this the correct way to store a password: we would run the password through SHA-1, and that's what we would store. This does have one desirable property for password storage: it's one way. We store this one-way verifier, and when the password comes in again, we'll run it through SHA-1 again and see if they're equal. So it's very useful for verification. And if you just happen to SELECT * on the users table, you won't see the users' real passwords. So it keeps them away from prying eyes. Later on, people said, well, we need an additional property.
We need to add some randomness to our passwords, and that way we can defeat some things. So we'll append a salt to it. A salt is just some random data. This gives us another really useful property: it's randomized, so it'll defeat a precomputed table. It forces an attacker to brute force only one password at a time. And this is what LinkedIn was doing, and they thought they were safe. The problem is that SHA-1 and similar hash functions are extremely fast. They're designed to be fast because they're used in things like SSL, where you want your Amazon page to load as fast as possible. So they're designed to be quick to compute, and a modern graphics card can actually calculate 2 billion SHA-1 outputs a second. So we want a new property: we want this to be slow, in addition to the other things. The answer that we have, and the recommendation, is to use something called adaptive hashing. Bcrypt, scrypt, and PBKDF2 are all adaptive hashing functions. What adaptive means is that you can tune the algorithm: you can give it a parameter saying how slow you want it to be. So my recommendation is, first off, if you don't have to store passwords, just don't do it. If you're OK with delegating your authentication to someone else, you should do that. In my company's case, dealing with financial data, we're not comfortable with that, so we don't. You just have to determine whether or not you feel your application is OK with that. If you are going to store passwords, store a one-way verifier and use bcrypt, scrypt, or PBKDF2. The really easy way to do that is to use an existing framework. Since Rails 3.1, we've had has_secure_password, which is great: it uses bcrypt and kind of does everything for you. We also have Devise, which by default will also use bcrypt. That makes the choice really easy. The one thing I will say is, make sure that you calibrate the number of iterations, so you tune exactly how slow you want it to be.
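As a sketch of what a stored one-way verifier looks like with PBKDF2, using only Ruby's standard OpenSSL bindings. The iteration count and storage layout here are illustrative choices; in practice you'd usually let has_secure_password or Devise handle all of this:

```ruby
require "openssl"
require "securerandom"

ITERATIONS = 20_000  # calibrate: raise until one hash takes ~100ms on your hardware
KEY_LENGTH = 32      # bytes of derived key

def hash_password(password)
  salt = SecureRandom.hex(16)  # random per-password salt
  key  = OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, KEY_LENGTH,
                                    OpenSSL::Digest.new("SHA256"))
  # Store everything needed to re-derive the key at login time.
  [ITERATIONS, salt, key.unpack1("H*")].join("$")
end

def verify_password(password, stored)
  iterations, salt, hex = stored.split("$")
  key = OpenSSL::PKCS5.pbkdf2_hmac(password, salt, iterations.to_i, KEY_LENGTH,
                                   OpenSSL::Digest.new("SHA256"))
  key.unpack1("H*") == hex
end

stored = hash_password("correct horse battery staple")
puts verify_password("correct horse battery staple", stored)  # true
puts verify_password("wrong guess", stored)                   # false
```

A production version would also use a constant-time comparison rather than == for the final check; the tunable iteration count is what gives you the "slow" property discussed above.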
The other thing is that bcrypt and scrypt are native extensions, and if that's an issue on your platform, PBKDF2 has a pure Ruby gem. So the last thing I want to talk about is trust. All right. Hands up if you've ever seen this. All right. Now keep your hand up if you've verified the fingerprint before typing yes. All right. So that's really important, because what you're doing is saying, I unequivocally trust this system. I know who it is, and I've verified the fingerprint myself. Inevitably, something like this will happen: either you reinstalled the system or the IP address moved, and you'll get this really nasty warning saying, hey, this computer has changed since the last time you talked to it, I don't know what to do, you have to handle this. Now, I hope it's not the case that you were being man-in-the-middled when you typed in yes the first time, and now that attacker is gone and that's why you're getting this. The problem is that you have to handle this manually. So really think, when you're trusting SSH, about whether or not the system you're connecting to is what you mean to connect to. So these are, and I don't expect you to read this because it's supposed to look ridiculous, the root certificate authorities that shipped with Firefox as of July 2010. So my question: do you trust all these organizations? Do you trust their employment and auditing practices? Do you know how to adjust your trust of each of these organizations in every program you use? Because in most programs, it's different. Really, what this comes down to is that in some ways, trust on the internet is a very broken thing, and we rely a lot on the user to know what to do, which is just unacceptable in some cases. So the last thing I want to talk about is krypt. Krypt is a project that was started to hopefully replace the OpenSSL bindings in Ruby going forward.
If you've ever worked directly with the OpenSSL bindings in Ruby, you'll know that they're atrocious. I'm told that there is a preview gem around the corner. The goal of krypt is to be a cross-backend library, so that on the front end you have the same interface, but on the back end it could use OpenSSL, or the Java crypto environment, or Microsoft's or Apple's libraries, and you just get the same consistent interface. I'm told a lot of progress has been made on this through the Google Summer of Code. The person who started it is Martin Boßlet. He's talking at RubyConf this year about it, so if you're going to RubyConf, definitely make sure to check out his talk. And then the last thing: here are some resources. Grab the slides, check them out later if you're interested. And questions. So thank you. We have time for a couple questions. Who's got one? Hand up, hi. There's a way of authenticating users that we don't use very often, and I wonder if you have any thoughts on how we could use it more, or whether we should. And that's client-side certificates in browsers. So that's actually a really good question. If you don't know, the SSL/TLS standard defines that, just as you have an SSL cert on a server, a client can have what amounts to an SSL certificate, and they'll present that to the server when they connect. It never really caught on, except inside the financial industry, where a lot of internal applications use it. Integrations with banks will sometimes use it. I think the reason is that most users don't know; they just see the padlock icon and don't really know anything further than that. And it used to be the case, though it's not anymore, that configuring these in your browser was somewhat complicated. So that's why I think we don't use it more. But I'm a fan of the feature. I just know nobody really uses it.
I'm glad that you helped inform everybody here that it does exist, though. So we have a big usability problem in there. I'm sorry? Oh, yes. We have a big usability problem on websites where, in particular, if we have display ads or other third-party content, users are presented with that awful mixed content warning, which really stands in the way of our using more secure methods on our own sites. This has been a pretty heavily discussed problem, especially among folks who interact with the ad industry. Do you see any way of getting out of that trap? Not until they're willing to serve all their ad content over SSL, which some of them are and some of them aren't. In some cases, where Google does the JavaScript stuff, they can just piggyback on your page. I'm not as familiar with the ad space, but if they don't serve it over SSL, then you're constantly going to get that. Real quick, the reason why you get that warning, for those who may not realize it: if DoubleClick or whoever inserts an ad into your page and they do some JavaScript, and that's not over SSL, an attacker who intercepts it in the middle can inject JavaScript onto your page without you or your user ever even knowing it. So browsers tell the user, hey, there's something on this page that was loaded not over SSL. I wouldn't type my credit card into it, because that JavaScript might just pull it out and insert an image tag that sends it out to another server. Yeah, unfortunately, I don't really know the answer to that. I'd like to talk about it later, though, if you're interested. One more over here. So there's one other possible way of authenticating clients that doesn't get a lot of use, but seems very interesting to me. And that's not having usernames and passwords, but just emailing people links with one-time hashes that log them in, and then remembering that they've logged in. What do you think about that? So I'm actually not that familiar with that.
So I haven't really thought about that as much. Initially, it seems like an interesting idea. I would see a lot of issues around the fact that we're increasingly becoming a very mobile, multi-device world. And so you'd have to do that a lot. Exactly. Well, one of the big advantages is actually mobile and multi-device logins, because it means that you can say, log me in, it sends you an email, and you can click that link on all three of your devices, and then you're logged in without having to type your 27-character password into a little tiny web form on your phone. Right. Yeah, so that's actually an interesting idea. Is there a specific pattern that's called that? No, it's something that I've seen recently. OK. Yeah, I haven't seen that, but that's something interesting. I have a question about two-factor authentication. Two-factor the planet. I love two-factor. OK. Big fan. Cool. All right, so I think we're done with the talk. Thank you very much.