Thank you for coming. Both of you. It looks a lot emptier up here than it did a minute ago. So I'm just gonna jump right in. We've got a question to start us off with. Do you trust me? Thank you. If you don't, you're in good company. But if you do, I'd like you to take a second and just think about why you trust me. This talk is about why our apps can trust our users and vice versa, so that's the topic of the day. So me, my name is Michael Sweeten, which you know if you read the slide up there a second ago. I work for a company called Atomic Object. This was us, I'm pointing to a screen you can't see. This was us about a year ago, about 50 people, and we do custom software development. So web, mobile, lots of stuff across the board. And I have to do the obligatory pitch that I'm sure you'll hear at every single talk today: we are hiring experienced developers at our offices in Ann Arbor and Grand Rapids, so hit me up. So today's talk is not so much about algorithms or bits and bytes or any of that kind of stuff. I'm not going to start unpacking AES and RSA and padding algorithms and things. This is a talk about how those different pieces all fit together. Because it turns out that it's a really complicated, nasty dance to get any of this stuff working right together, and the stuff many of us might have experienced in a college discrete math course does not come anywhere near covering it. So that's what I want to talk about today. I apologize in advance if some of you did want more details. Now, in addition to some of how all that fits together, you'll see my little spy pop up on the side of the screen when I have a story to tell about some real world security failures, to put some context on some of these things. So a moment ago, I asked you about trust. Now, my name is Michael Sweeten and you can trust me because I have the ID to prove it. The ID was issued by the government.
So I won't go so far as asking you to trust the government, but maybe trust that they checked my birth certificate and other IDs before they issued it. And it's got my picture on it, and the name matches what you see on the conference program. And so you can at least trust that this is the talk that was meant to be given at this slot, even if you don't trust anything else. Almost everything we do in terms of security in our apps is built on a chain of trust like this: something that goes from program to name to government to ID. Almost nothing is made from some big overarching solution. It's all tiny little pieces fit together in just the right way, in a lot of ways that I'm terrified I'm going to screw up, because it's hard. So here's an example app. This is what you get if you do, like, rails new or whatever the command is nowadays, add Devise, and don't put any styling on it. So I made an example user account, railsconf@example.com, and let's log in. Okay, I'm logged in. It's really easy, though, to hand wave over what just happened when it's buried under a single click, because we actually did a lot of steps. We had to make a connection. We had to send data. We had to evaluate data and change things. So let me unpack that. We lost about a tenth of an inch of screen. Okay, so we had to do a DNS lookup to find the server. We had to make a connection. We had to somehow magically make it secure. Then we can send our credentials over, check them, and log the person in. So that's great, and it's better than what we had, but that's still pretty hand wavy. So let me rewind. We've connected. Now before we do anything else, we're gonna check the server certificate. So, fun story time. We use the certificate to prove authenticity. Authenticity is important. So here's an example. Back in '08, a man stole roughly $850,000 from two banks around the DC area. He walked in and said, hey, I'm here for the pickup. And he looked right. He had the uniform, the gun, he had everything.
Except he wasn't actually working for the security company. He grabbed a bag of half a million dollars and walked out. I don't think he actually had an armored car. He probably just got into a Ford Escort and drove away. Strapped it in. Did the same thing the next day and took another 350K from a different bank. So it doesn't matter whether he had the truck or not, because even if he did, and the truck is secure and it's locked, it doesn't matter if you're sending your data securely to the wrong place. So this is what our certificate's for. But let's dig into what it really is. Our certificate is really three key things. Firstly, it's a public key. It's a public key for the server that we're trying to have a conversation with. But the key itself is just a number. It doesn't tell us anything. And the number could belong to anybody. I could give you a number: 42. So what we add in the certificate is some metadata that says not only is this the number that we're talking about, it belongs to this name. It belongs to amazon.com, or railsconf.org. And we trust that the certificate authority has validated that, and they've issued the certificate. And what they do is they sign it. They apply a cryptographic signature which was created with the certificate authority's private key, and so we can verify it. This establishes a chain of trust for us. I trust my web browser. The browser ships with public keys for all of the certificate authorities, so it can check that the certificate is validly signed. And then the certificate was signed as a whole, so we know that key goes with that name. And because there's a public key in there, that's a tool we can use to help facilitate secure communications with whomever has that private key. And we haven't proven that part just yet. All we've proven is that this is the right public key for us to be using. So we trust the key.
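The vouching step above can be sketched with Ruby's stdlib OpenSSL bindings. Everything here is made up for illustration (the CA name, the railsconf.org subject, the key sizes); it just shows a CA's private key signing a certificate that binds a name to a public key, and a client checking that signature with only the CA's public key:

```ruby
require "openssl"

ca_key     = OpenSSL::PKey::RSA.new(2048)  # the certificate authority's key pair
server_key = OpenSSL::PKey::RSA.new(2048)  # the server's key pair

cert = OpenSSL::X509::Certificate.new
cert.version    = 2
cert.serial     = 1
cert.subject    = OpenSSL::X509::Name.parse("/CN=railsconf.org")  # the name
cert.issuer     = OpenSSL::X509::Name.parse("/CN=Example CA")     # who vouches
cert.public_key = server_key.public_key                           # the key
cert.not_before = Time.now
cert.not_after  = Time.now + 365 * 24 * 3600

# The CA signs (name + key + metadata) with its *private* key...
cert.sign(ca_key, OpenSSL::Digest.new("SHA256"))

# ...and a client holding only the CA's *public* key can verify the binding.
puts cert.verify(ca_key.public_key)      # => true
puts cert.verify(server_key.public_key)  # => false, signed by someone else
```

Real browsers walk a chain of such signatures up to a root CA key they ship with, but the primitive is the same check.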
This, more or less, this system of the certificate authority vouching for the key, is basically what the mafia does when they induct a new member, and it works about as well, actually, to be honest. Okay, so we've got that certificate stored in memory. We're gonna use it more in just a minute. The next thing we have to do is key exchange. We have a key in that certificate, but that's not actually what we use to communicate, generally speaking. Public key algorithms like RSA, and I'm not sure if this applies to DSA or other signature algorithms, but RSA at least, are firstly really computationally intensive, and secondly, there are limitations on the size of things you can encrypt. So it works well for exchanging a 256 bit key, a little less well for downloading season two of the Sopranos. Okay, two ways of doing key exchange. So the first is RSA key exchange. The client generates a secret and is gonna encrypt it using the public key in the certificate. Then we can send it over and be assured that no observer will be able to steal the key. There we are. The server has the key then, and now we have the same key on both sides and we can communicate securely. So that's pretty good. And that was '77, I think, when the RSA paper was published. It was a pretty big deal. There's one other way we do key exchange which is a little bit better, I think. So I don't know if you guys remember seeing this image in school; this was just taken from Wikipedia. This is Diffie-Hellman key exchange. I'm not gonna dig into exactly how it works, but here's the super quick demo. Each side generates a secret, and they're gonna keep that secret secret. We also generate something that we're going to exchange publicly. And this is where we do something special. On the server side, we sign it. This is an extension of regular Diffie-Hellman key exchange that we do with TLS.
And this lets us prove authenticity for the server. With RSA key exchange, only the person with the private key could decrypt the secret, so authenticity came along for free; that doesn't apply here, so we need to do it some other way. So the server signs its key. We swap the public halves, and then from those we can construct a session key we can use. But this is where things get cool. The secret halves that we generated never got transmitted. So that provides a facility we call forward secrecy. If you remember what I showed you for RSA, all the data that matters for the connection was sent over the connection. So if someone happened to record it, even if at the time they didn't have the private key, if a year or two later they acquire it, they can go back to their recording of that conversation and decrypt it. With Diffie-Hellman, as long as we delete those parts that we didn't transmit, we can be assured that it's relatively unlikely, at least, that anyone's ever gonna reproduce this particular conversation. Now, one thing I thought was really interesting: forward secrecy is actually not enabled in most of the different cipher suites SSL and TLS provide. Not really sure what the deal is there. It was kind of a big news item at some point in the past when Google enabled that for all of their services. But it's a good thing and you should consider it. Okay, so regardless, we somehow have established session keys. And so we can start encrypting. So that took a few hops to get there. And before we move on, let me just review what we've accomplished with those. We talked about authenticity: we wanna make sure we're talking to the correct person on the other end. The privacy one is pretty obvious: I wanna send something and not have that be observed. The last one is integrity, and I didn't talk much about that. And that's because there isn't a step for integrity. The way we do that with TLS is pretty much every time we exchange data, a hash is provided for the data by the side that sends it.
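Here's a minimal sketch of the exchange itself, using the elliptic-curve variant of Diffie-Hellman, which is what modern TLS usually negotiates; the curve choice and variable names are just assumptions for the demo. Each side keeps a private half that never crosses the wire, trades only the public half, and both derive the same session key:

```ruby
require "openssl"

# Each side generates an ephemeral key pair on an agreed curve.
client = OpenSSL::PKey::EC.generate("prime256v1")
server = OpenSSL::PKey::EC.generate("prime256v1")

# Only the public halves are transmitted (and, in TLS, the server's
# half is signed with the key from its certificate).
client_secret = client.dh_compute_key(server.public_key)
server_secret = server.dh_compute_key(client.public_key)

puts client_secret == server_secret  # => true: same key on both sides

# Forward secrecy: once the private halves are discarded after the
# handshake, a later key compromise can't decrypt a recording of this
# session, because the secret material was never on the wire.
```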
And so on the remote end, they can validate that that hash still works. So that's threaded through everything we've talked about. Okay, these bits are stuff that we typically don't have to deal with ourselves every day. But this part, the actual login stuff, those nuts and bolts, that's what Devise gives us if you use Devise. That's what your login controller will do for you. So all of those pieces are transport layer, and this is our app. You can think about it like this, too. On the top is stuff that someone smarter than me wrote for me, and put in Apache or Nginx, and in our browser, in Chrome and Safari. The other stuff, that's my JavaScript, my HTML, my Gemfile, my controllers. The key word being mine. This is all stuff that's my problem. And your problem. So these are kind of separate pieces. I'm now gonna kind of hand wave over those first bits for a little while and focus on how our apps handle stuff internally. So I started up a development server and I'm gonna make some requests. I'm gonna show you the login request that we demoed right at the start. And because this is development mode, I don't have TLS enabled. This means I can actually just telnet in and make a request. It gives you a really good handle on exactly what's happening on the wire when you do this, because there are no other layers in between. So there's the request. I just clicked the login button, and we see it's a POST to the users sign-in endpoint, and down at the bottom there's the data. I'm sending my email. I'm sending my password. And they're all URL encoded, and we're relying here on it being sent over an encrypted channel, because if it weren't, this is exactly what it would look like to an observer. So that's the step. We just sent the login credentials. That's all there is. Verifying them, though, is gonna take a little bit of work. So this is the server's log output, and the key bit is this here.
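For reference, a sketch of what that URL-encoded body looks like on the wire. The field names mirror what a Devise sign-in form posts; the values here are made up:

```ruby
require "uri"

# What the sign-in POST body looks like with no TLS in the way.
body = URI.encode_www_form(
  "user[email]"    => "railsconf@example.com",
  "user[password]" => "correct horse battery staple"
)

puts body
# => user%5Bemail%5D=railsconf%40example.com&user%5Bpassword%5D=correct+horse+battery+staple
```

Anyone on the network path can read that verbatim, which is why the encrypted channel underneath is doing all the work here.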
We sent a username of railsconf@example.com, and so I'm gonna dig through my users database and try to find that record. Now, that's actually the only piece of the login process that's really showing up in our logs in a useful way here. The rest of it actually happens right here, chronologically speaking. That's where we're gonna work on the in-memory stuff and figure out if the password I supplied is also valid. I am gonna dwell on passwords for a little bit, because they're something we have to deal with a lot, and something that's easy for us to screw up. Temporary session keys are all stuff that's under our control, but passwords come from our users, so we can't revoke them without really irritating our users. So let's say we have some users. Some of them have good passwords, some of them are duplicates, some of them are bad passwords. Users. So this is a simple way to do it. Now, we all know this is wrong. It's easy, but it's wrong, and it's wrong because if someone gets ahold of our database, they have all our users' passwords. Because users are so readily reusing passwords, it means that if we leak our database somehow, and this just keeps happening if you watch the news, we're leaking not only our users' passwords for us and whatever data we have, we're also probably leaking their bank passwords, because they're not gonna have a new password for every service. We all know better, of course, but they don't. Okay, so storing plain text passwords is simple, but just don't. Okay, let's give it another try. Suppose we hash the password. This is marginally less bad. Data ends up looking like this in the database, and so we just take the hash of whatever they provide, compare it, and see where we're at. But immediately it's really obvious that if users have duplicate passwords, that's exposed. I also wanna point out, SHA-1 is really fast. So that's kind of a problem. So, another story time. In 2013, at the Four Seasons Hotel in New York, they lost a lot of money.
This is one of my favorite stories about just brute applications of force. They had a jewelry case by the front desk. This is broad daylight, so the front door is unlocked, the desk is manned, and they have a case here, and it's probably bulletproof glass, and locked. But two men walk in, trench coats, hats pulled low, pull out a sledgehammer, and smash the case. They grab upwards of $2 million worth of watches, jewelry, cuff links, et cetera. And they walk back out, and it was so fast no one could do anything. So I like that as an example of: you didn't really expect that failure mode. And that's the way it is with the password hashes here. Calculating all the passwords, all the hashes, seems like it should be infeasible. But it's actually not. There's an attack type called rainbow tables, and all it is is an implementation of a time-space trade-off that lets us create a lookup table. So a lookup table for all zero to eight character passwords with the 95 printable characters on a US keyboard is less than half a terabyte. So today, I have that on my laptop. Not a big deal. But even in 2000, that much storage would cost you about $6,000. So you can bet the NSA had access to that kind of computing power, and today probably any Joe with a credit card can build a system that can do it. So even though it looks better than storing plain text passwords, it's almost identical. But we can mitigate that by applying a salt. So in the database, it looks like this. We just generate some random data. In practice, it should be longer than this, but I ran out of width on the slide. And when we hash the user's password, we include the salt along with whatever they provided, and then check that against whatever we've stored. This actually finally starts to get us in the realm of being kind of secure. It's still vulnerable to the fact that SHA-256 and SHA-1 are fast, and SHA-1 is definitely broken: Google published an attack recently. SHA-256 is, I think, the current state of the art.
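A sketch of the difference a salt makes, using stdlib SHA-256 (the slides used SHA-1; the principle is identical, and the passwords here are made up):

```ruby
require "digest"
require "securerandom"

# Two users who happen to pick the same password.
password = "hunter2"

# Unsalted: identical passwords hash identically, so the duplicate is
# visible in the database, and a precomputed rainbow table applies.
alice_plain = Digest::SHA256.hexdigest(password)
bob_plain   = Digest::SHA256.hexdigest(password)
puts alice_plain == bob_plain  # => true

# Salted: each row gets its own random salt, stored next to the hash.
alice_salt = SecureRandom.hex(16)
bob_salt   = SecureRandom.hex(16)
alice_salted = Digest::SHA256.hexdigest(alice_salt + password)
bob_salted   = Digest::SHA256.hexdigest(bob_salt + password)
puts alice_salted == bob_salted  # => false: same password, different rows

# Login check: recompute with the stored salt and compare.
puts Digest::SHA256.hexdigest(alice_salt + password) == alice_salted  # => true
```

The salt doesn't have to be secret; its job is to make every hash unique so precomputed tables are useless.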
If you're watching this in six months, I flatter myself. If you're watching this in six months, maybe things have changed, I don't know. But right now, this is good, but not perfect. We can do better. What you really, really wanna do is just apply a password-specific hashing algorithm. Bcrypt, scrypt, PBKDF2, there are a lot of them out there, just go and pick the right one. And this ties together everything we have. So it's, I don't know, hash-looking crap that you'd see in your database. But let me really quick show you a little bit about what's stored there. This is dollar-sign delimited. So that first field is the version. Bcrypt in this case is version 2a. The second field is 12, and that's what differentiates this from just a standard salted password hash: that 12 is a work factor. So I can make that 15 or 20. We can use that to scale how hard of a problem this is. It's a password hash. We don't want that to be fast. There's no advantage to it, because for login, our users don't really care if it takes 200 milliseconds instead of 100. And if we find out later that our computers are too fast, bump it up to 20, 25. The first 22 characters there, that is the salt, 128 bits, base64 encoded, and then there's the hash. So that wraps up everything. So that's verifying the login credentials. Finally, we can start to get into the point of logging the user in that we started trying to figure out 20 minutes ago. So I made my request. And here was the server's response. And the interesting bit amongst all these headers is this: the session cookie. I actually had to configure the Rails session store to use the database for this, so yours might look a little different, but the key thing is it's something that we gave to the user that they can bring back next time instead of a username or password. It's revocable, so if we need to change things, we can just log the user out. Not a big deal. It's a lot harder to do that with their actual password.
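The revocable-token idea can be sketched in a few lines. A hash stands in for the sessions table here, and all the names are hypothetical; in a real app Devise and the Rails session store do this for you:

```ruby
require "securerandom"

SESSIONS = {}  # stand-in for the database-backed sessions table

# Issue an unpredictable token at login; only this goes in the cookie.
def log_in(user_id)
  token = SecureRandom.urlsafe_base64(32)  # 256 bits of randomness
  SESSIONS[token] = user_id
  token
end

# Every later request trades the token for the user it belongs to.
def current_user(token)
  SESSIONS[token]  # nil once revoked
end

# Revocation is trivial, unlike making a user change their password.
def log_out(token)
  SESSIONS.delete(token)
end

token = log_in(42)
puts current_user(token)          # => 42
log_out(token)
puts current_user(token).inspect  # => nil
```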
Okay, so on the next request, they're gonna bring in that token, that cookie, and they'll be logged in, and we're done. It took us 22 minutes to get there. So this is what we've accomplished. We used the username and password to authenticate the user. We've issued an unpredictable, unique session token that we can revoke. And from that whole first bit, those six steps, we know no one else has that token, because we've exchanged it over a secure channel. That's actually kind of a big thing, how you treat your tokens. Here's an example. This woman, who I've anonymized, was at the 2015 Melbourne Cup in Australia, betting on horses, and she won $800. And very naturally, she was excited, and she posted on Instagram and Facebook about it. In the time it took her to get from the track over to the betting counter, someone pulled her secret token out of that barcode and claimed the prize. She was a bit miffed about that. And now we think we do better, because it's not printed, we're not exposing it. But it turns out, in 2010, developer Eric Butler released this Firefox plugin called Firesheep. And this is a screenshot from his website, which is still up; do a search, you can find it. A lot of major web apps at the time were using SSL to protect the login page but not to protect all of the other parts of the app. And so that session token got sent in the clear, and that last property, the secure exchange, was broken. And that meant that if you were connected in a coffee shop on an open Wi-Fi network, you were basically just shouting your credentials out into the world for anyone who cared to listen. So, not optimal. The lesson here is you should always be using SSL, or TLS, which has now replaced SSL, if you care about security at all. Okay, so here's my Twitter. I don't use it too much, as you can see, but I'm gonna try to perform the same kind of session hijacking attack against Twitter. So I go to Twitter and open up Chrome's DevTools. And here we have an auth_token cookie.
I just went over to cookies, and I had to experiment a little to find out that it's this one and not, like, the Twitter session one. But if I give them this token, they will act as if the request is coming from me. So I provide it as a header when I hit Twitter, and it gets sent over in the HTTP headers. And they give me back a successful response. Now, this whole thing is a mess to dig through. But if we actually dig into the HTML, okay, that's still a mess, I'm gonna scroll down about 7,500 pixels. And you see, hey, there's my name. I'll leave it as an exercise to the audience to verify that your Twitter doesn't have my name on it. And that's all it took. Now, that does not mean that Twitter's insecure, because they did everything right in terms of exchanging it over HTTPS and protecting those tokens. I was able to get at it because I control the endpoint. It's on my laptop and I control the browser. So if someone takes over your browser, you're just kind of screwed. I don't really have any good advice for you there. Hopefully it's a problem that will be solved eventually, but you have to end up trusting something. We trust certificate authorities, we trust our browser, we trust our laptops. That's a really hard problem. Okay, we talked about all this. So I wanna take a second here and shift gears. One of the guys at my office makes fun of me for really liking single sign-on. But I think it's actually a cool demonstration, because single sign-on between two separate, isolated apps, where we want what appears to be a shared session between them, ties together everything we've talked about so far. You have to use our cryptographic primitives to establish trust with each individual system and between the systems. So yeah, I've got time. We're gonna dig into this, and hopefully you'll see how this complicated dance works a little bit better. So let's say we have some boring system. Maybe it's an HR system.
It knows who I am and it can authenticate me, but it doesn't have any data that I'm actually interested in. Maybe there's some other system, a wiki. Maybe that one has data that I want. They're web apps, of course, so let's dot-comify them. And when we provision these servers, we're gonna exchange some keys. In this particular implementation it's just random data, and it's the same data on both sides. There are other ways to set up keys; we just need something that'll help us establish trust. And it knows about me, so let's just assume they have some creepy pictures of me in the database. Okay, so I'm a web browser. Which side am I on? This is backwards. So I go to my wiki and I say, hey, give me the data. But of course I'm not logged in. I don't have a session, I don't have a cookie. So the app says it doesn't know me. But it says go talk to the other one. It's gonna redirect me over here, because this is the app that actually knows who the hell I am. Just a standard 302 redirect. And we're gonna carry along the address for the page I'm trying to get to, so that, in about 11 minutes when I run out of time, we'll know which page to redirect me back to. Okay, so I go to the other app and I say, hey, will you vouch for me to the other guy? And I'm not logged in here either. So he says no. But this guy is capable of authenticating me, so he'll give me a login form. And I can fill in this login form. So I make a POST request. I include my username and password in it. And maybe that checks out. So I'm logged in, and he's gonna give me a session cookie. But that cookie is only good here. My browser's not gonna send employees.com session cookies to wiki.com. So he's also gonna give me a token that the other app can use to authenticate me. And that token has to do two things. One is to establish who I am. And the other is to prove that the token is actually authentic.
I'll share a fun story about forged tokens later if we have time. So, who I am, that part's trivial. Maybe it's JSON. If you're using SAML, maybe it's XML, but whatever, it says who I am. And it's gonna have an expiration date on it. This token is only there to get me from here over to the other app, so it doesn't have to last more than a minute or two. If it did, then if I get fired and my access is revoked, that token would leave me still logged in. So that would be bad. So we keep this short. We probably URL encode it to make it easier to send. And that's who I am. Now, for authenticating it, there are a lot of techniques. We could use public key cryptography to create a signature with a private key, which can be verified with the public key. Another method that I kinda like, for when it's just two systems and I'm doing something homegrown, is a hash-based message authentication code. This is actually the specific primitive that's used to do integrity checking on all of the SSL packets that go across. And all it is is we take some random data as a key, the data to be signed, and a hash function, and they're applied in some fairly specific way to create a hash. So if the other side has the key, and remember we pre-shared the key, they'll be able to validate it. Okay, so you redirect me. My URL's starting to get to be a bit of a mess as I accumulate things in it. So we're going over to the other app over here, and we're still carrying along my final destination. But now we have the token that says who I am, and we've got the HMAC that says you can trust this; you can trust that it came from someone who knows the key. So I follow that redirect, just make a GET request. And now that acts like login on this side. It issues me a session token. And so now I have one token for over there, one token for over here. And they're different, and they're not really connected.
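A homegrown sketch of that token scheme, assuming a pre-shared key as described. The wire format here (base64 JSON payload, a dot, a hex HMAC) and all the names are made up for illustration, not any particular SSO standard:

```ruby
require "openssl"
require "securerandom"
require "json"
require "base64"

# Provisioned on both apps ahead of time; never travels with the user.
SHARED_KEY = SecureRandom.bytes(32)

# "Who I am" plus a short expiry, authenticated with an HMAC.
def issue_token(email)
  payload = JSON.generate(email: email, expires: Time.now.to_i + 120)
  mac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SHARED_KEY, payload)
  Base64.urlsafe_encode64(payload) + "." + mac
end

# The receiving app recomputes the HMAC with its copy of the key.
def verify_token(token)
  encoded, mac = token.split(".", 2)
  payload  = Base64.urlsafe_decode64(encoded)
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SHARED_KEY, payload)
  return nil unless mac == expected  # real code: constant-time compare
  data = JSON.parse(payload)
  return nil if data["expires"] < Time.now.to_i  # only bridges the redirect
  data["email"]
end

token = issue_token("me@example.com")
puts verify_token(token)                # => me@example.com
puts verify_token(token + "x").inspect # => nil, MAC no longer matches
```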
These apps don't have to share a database, because having that key set up lets them send data, even over the insecure channel of the user, and we can trust it. Now it'll finally redirect me to the place I wanted to go. So I follow that redirect, I provide the cookie, and now I can have all the data. I wanted to illustrate that because it's such a mess. I feel like applying cookies and SSL and everything, all those layers, is a lot of different pieces. But that model is the same model for OpenID Connect, SAML, which I mentioned, pretty much any single sign-on solution you get off the shelf will work roughly this way. Now, I originally intended to include this graph in the rest of the talk, but it was a pain in the ass. But I wanted to throw it in at the end anyway. So here's just a little dependency graph of how some of the different math primitives and other business things, like trusted third parties, all come together to let us act securely online. I'm not gonna go through this; I just put it up there to show this is a bloody mess. It's a really careful stack of things, and I find it interesting, so I've read about it. I definitely do not trust myself to implement it. So I strongly recommend, if you can, get a third-party, trusted, well-audited library. For example, libsodium is one implementation, and one thing that's cool about it is they set it up so that they never have a branch in their code that depends on secret data. This is apparently a good thing. But I point it out because that's the kind of thing that I'm not going to think about, because I'm not an expert in this. And in all likelihood, 90-plus percent of you guys aren't either. There are very few people in the world trusted to do this well. I don't know who they are, but I hope they're doing it well for me. Okay, so that's the main stuff I wanted to cover.
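To make the "no branch on secret data" point concrete, here's a sketch of the idea applied to comparing two MACs. This is an illustration of the principle, not libsodium's actual code: a naive `==` can bail out at the first differing byte, which leaks timing information, while this version always does the same amount of work:

```ruby
# Constant-time string comparison: XOR every byte pair and accumulate
# the differences, never branching on the secret contents.
def constant_time_eql?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

puts constant_time_eql?("secret-mac", "secret-mac")  # => true
puts constant_time_eql?("secret-mac", "secret-mad")  # => false
```

In practice you'd reach for a vetted helper (Rails ships `ActiveSupport::SecurityUtils.secure_compare` for this) rather than rolling your own, which is exactly the talk's point.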
A few minutes ago I promised a story, so let me share one more fun story about authentication going wrong. This is one of my favorites, and I actually included it in my talk submission to RailsConf. In 1970 or '71 maybe, back around the Vietnam War protests, a group of protestors broke into a Philadelphia draft board office. They were going to steal the Selective Service papers for people that were being drafted. They discovered in this particular office that they couldn't break through the door. The padlock was not one they were able to pick or replace, which was something they'd done in other similar heists. So one of them had this great idea while they were casing the joint (it's a tactical term). He wrote a note, and the note said, please don't lock this door tonight. And he taped it to the door. They came back a few hours later, and sure enough, it was open. Absolutely my favorite heist, it was just so damn simple. And that's the problem with, say, SAML or anything like it. The other side gets a note that says you can trust the person who has this token, as long as it came from the right place, as long as that token is signed. So it's really critical that you validate your trust at every level, because if you break that chain, you're leaving your front door wide open. Okay, so we've got about four minutes if there's questions, discussions, heckling, high fives... no? Okay, well, thanks.