I'm Aaron Parecki, and I'm excited to be here today to talk to you about securing your APIs with OAuth. So let's just jump right in. This is the spec. Here we go. "The OAuth authorization framework enables third-party..." I'm just kidding. Specs are a terrible way to learn this stuff. It turns out the specs are written more like a legal contract: every word is very carefully chosen because of its meaning, and each word is defined in five other specs. That's not the best plan for how we're going to actually learn this stuff. The worst part about OAuth is that it can feel a lot like going through a maze, because it's not just one spec. It's like 20, and you have to figure out how they all fit together and how they relate to each other. We're going to take a totally different approach. We are not going to go through the specs step by step. Instead we're going to start by going back in time and talking about why we even have OAuth in the first place. Anybody remember this pattern, back when Yelp was brand new? This was a very common pattern at the time. A new app would launch, and it would say, okay, let's see if any of your friends are already using Yelp. Please enter your email address. Please enter the password to your email account. We now understand that this is a terrible pattern. You don't want to go around teaching users that it's okay to enter their email login in random applications. But this was the only way to do it at the time. It was extremely common; even Facebook was doing this. People would happily give their email password to these applications because they actually wanted the end result of this workflow, which was for the application to have access to their contacts. It's just that if you give it your password, you're giving it access to everything. We needed a way to avoid giving it access to your actual email messages while still letting the application access your contacts.
So OAuth was created to solve this particular problem. We want to find a way to let an application access data in an account, but only some data, and without giving it your password, right? It turns out that around this time in the late 2000s, a lot of these companies were building APIs, and everybody had the same problem at around the same time. Developers at these companies got together and formed what eventually became the OAuth spec and the group that now maintains the spec. So now when a new app like Yelp launches, instead of typing your email password, it's sign in with Google, sign in with Facebook. And this is the pattern that we see now all the time. The interesting thing though is that this wasn't actually the original goal of OAuth. OAuth was never actually designed to tell the app who logged in. OAuth was designed to give access to data. So you have Yelp wanting to access your Google contacts. Or Last.fm wanting to read data from your Spotify listening history. Or Buffer trying to post to your Twitter account. All these things are about accessing APIs, not about identifying the user. I like to think of another analogy: checking into a hotel. When you go to a hotel and you check in at the front desk, you give that person your ID card. They give you a hotel key card. So you take the hotel key card and you go to the door. You swipe the key on the door and the door lets you in. Now this is exactly analogous to an OAuth exchange. And what's interesting about this system is that in order for this to work, the door doesn't care who you are. It does not need to know your name or a unique user ID. All it cares about is whether this card has access to this door. And that decision is made by the person at the front desk. So in OAuth terms, the person at the front desk is the authorization server. The key card is the access token. And the door is the API, or the resource server.
So I bring up this example because I want to point out that it's entirely possible for an API and a system with OAuth to exist where user identity isn't even important to the system. But of course there are many situations where the application needs to know who logged in, and the API certainly needs to know who the user is. It turns out that OAuth doesn't say anything about how to do that. There is nothing in OAuth that will help you with that. Instead, we need something beyond OAuth, which is where OpenID Connect comes into the picture. The other thing to keep in mind about OAuth is that it was originally created for third-party app access, where Yelp is trying to access your contacts from Google. But it turns out that as the space evolved and things have matured over time, the OAuth framework actually provides a very good solution for first-party apps as well. So when you go and visit gmail.com and you click sign in, you don't see a password prompt. Gmail does not ask you for a password. Gmail actually redirects you to Google's OAuth server, accounts.google.com, and then you go log in there. And this has a couple of really, really important benefits, which is why we now use OAuth for a lot of things, including first-party access. Notice how there's no password field here? It's because some Google accounts don't have passwords, because there are many ways you can have a Google account, one of which is delegating to a different identity provider. So first, you enter your email address, and then that determines how you're going to log in and whether you're going to do two-factor auth. So I type in my email address, Google then asks me for a password for this account, and then it asks me for one of my two-factor auth prompts. Now, the other really important part here is that I'm logging in to Gmail in this example, right?
Gmail couldn't care less about whether or how I'm doing two-factor auth, or whether my account's delegated to another identity provider. And that's really powerful, because it means that you have the flexibility of consolidating all of your logic into this one OAuth server, where that's the thing responsible for authenticating users, doing two-factor auth and things like that. Then you end up back at Gmail, and Gmail's happy because it has an access token, and that's all it needed in order to work. So this is the background of why we have OAuth and how it started getting applied to first-party apps even though it was originally designed for third-party. So we're going to take a look at how OAuth works and talk about OAuth from the application's point of view first. In the second half of this, we're going to talk about OAuth from the API's point of view. From the application's point of view, the goal of the app is to get an access token. It's trying to get that key to someone's account. How it gets the access token will be based on what kind of app it is and where it's running. So there's a bunch of different OAuth flows which determine how you get the access token. The authorization code flow is probably the most common, and it's used for web apps and native apps. There's the device flow for applications running on a device that maybe doesn't have a browser or doesn't have a keyboard, so think of something like your Apple TV. There's also the password flow, which is part of the OAuth spec. I would argue it's not really in the spirit of OAuth, because it does ask the user to enter their password into the app. And there's also client credentials, which is for when there is no user involved, when you've got machine-to-machine communication. You just need an access token to access some sort of system-level data. But the important part is that the end result of all these flows is always the same. There is an access token.
And as far as the app is concerned, the access token is just like a hotel key card. It's just a thing that it carries around, and it's going to use it to make API requests. That access token has no meaning to the application. Just like with a hotel key, you don't care whether it's an RFID or NFC or a magstripe card or a physical key. And you don't care what's on that key. You just care that it works when you use it at the door. So in OAuth, there are essentially these five roles, and they have specific names in the spec because the spec is very picky about how it chooses to name things. Those names in the spec are not the things that we usually talk about in conversation. So the spec names are in the parentheses there. We usually talk about users, not resource owners, or devices, not user agents. But here the goal is that the application, or in OAuth terms the client, is trying to get an access token, which it can do by getting it from the OAuth server. Then it's going to go use the access token at the API. How it gets the access token again depends on the flow we're using, but it's going to have to get the user involved somehow. So it has to go and communicate somehow with the user, probably using the device they're using, right? So you can start to imagine there's a lot of arrows bouncing around when we walk through a couple of those flows. We'll start with the simplest flow, which is sort of the foundation of the OAuth flows: the authorization code flow. So in this example, we'll start at the top. The user is in their browser, and they visit the app's website and they click the login button: I'm trying to use this application, please log me in. The app says, great, I don't know how to do that, but you can go over there and log in there, and then they'll tell me when you're done. So it redirects the user over to the OAuth server.
There's a bunch of stuff that goes into that redirect, including the app's client ID identifying it and the scope of what it's trying to access. That's basically the user going over to the OAuth server saying, hey, I'm trying to log into this app. It wants to access my contacts. The OAuth server says, great, please log in here, because I have a password for you. Please do multi-factor auth, and then, do you approve this request? At that point, the OAuth server says, great, here is a temporary code you can take back to the application. That is an authorization code. That's a one-time use, time-limited code. That goes back into the browser. The browser then delivers it back to the application saying, here's the temporary code that the OAuth server gave me. You, the application, can use this to get an access token. So behind the scenes, the application's web server will go and talk to the OAuth server again, saying, hey, the user gave me this temporary code, please give me an access token. Here's also my client secret so that you know that I'm actually this real application. The OAuth server says, looks good, I just issued that code and this client secret checks out, so here's an access token. And now the app can go make API requests. These lines are colored differently because these are two different ways of sending data. You might notice the top half of these lines, the blue ones, always run through the user's browser. That is called sending data over the front channel. The pink lines on the bottom are sending data over the back channel. The idea with the back channel is that it is a secure communications channel from a client to a server. This is, again, kind of the normal way of doing things. We kind of take it for granted because it's the way that we normally make HTTP requests. But it has a couple of really important properties that the front channel does not have.
Using the front channel is literally using the address bar to send data from one thing to another. So benefits of using the back channel, again, these are things we usually take for granted because we're just so used to it. But when you make an HTTPS connection from a client to a server, that certificate's verified, then the connection is encrypted, which means that can't be tampered with and the response that comes back is part of the same connection, so the client can trust it. It's like hand delivering a message. So you walk up, you say, here is a message. You can see that the other person is taking it and accepting it, they can verify who you are, and everything is happy. Passing data over the front channel is like throwing it over a wall and hoping they catch it on the other side, where neither can see over the wall. So in this example, the OAuth server is trying to get this authorization code to the application. It's really just throwing it over the wall. It can't tell whether the application's actually received it, or it also can't tell if someone hasn't jumped in and stole it in midair. Also on the other side of this picture, the thing receiving this data, all it sees is this authorization code flying over the wall. It can't actually tell where it came from on the other side of the wall. This is a really important detail about using the front channel in OAuth, is that neither side can really trust that it was successful or that it hasn't been messed with in flight. So you might say, why do we use the front channel at all then if it's this insecure? It has a couple of really important benefits. One, it's how we interact with the user. It's how the OAuth server can interact with the user to do things like multi-factor authentication. It also ensures that the user was there and gave permission for this to happen, and is not just relying on the application's word for it. 
And another detail is that it means that the receiver doesn't actually need a publicly accessible IP address, which is not that big of a deal in web server land, but if you're talking about an app on a phone, that application running on the phone doesn't have any way for a remote server to push data into it. So we use the front channel to get data into the application. So let's walk through an example of this step by step. We start out with the application building the link to send the user to log in. So you go find the authorization server's URL. You add a bunch of stuff into the query string that describes the request you're making. response_type=code is saying we're doing the authorization code flow. The client ID identifies the application, and the redirect URL is where the app is waiting for that authorization code. The scope is what the app is trying to access, like your contacts. State is a random string the app makes up, and this is the first of the things that protects the front channel. This ends up just being a regular URL, and this is the link behind the login button the user clicks. The user lands on the OAuth server, logs in, approves the request and then is redirected back to the application with the authorization code. These are all the front channel steps. So at this point, the app has just received this authorization code flying over the fence and says, well, how do I know that this is from the real OAuth server? I'm gonna check the state parameter that I made up to make sure that it at least came back with the state value I sent. That's okay assurance; it's not a 100% guarantee yet, but this is at least one level of check. And then closing the loop here is using the back channel to exchange the authorization code for an access token. So the app makes a POST request over to the token endpoint.
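The front channel steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not any particular server's API: the endpoint, client ID, redirect URL, and function names are all made up for the example.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical values for illustration only; none of these come from the talk.
AUTHORIZATION_ENDPOINT = "https://authorization-server.example/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://app.example/callback"


def build_authorization_url(state: str, scope: str = "contacts") -> str:
    """Build the link behind the app's login button."""
    params = {
        "response_type": "code",  # we are doing the authorization code flow
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    }
    return AUTHORIZATION_ENDPOINT + "?" + urlencode(params)


def read_callback(callback_url: str, expected_state: str) -> str:
    """Return the authorization code only if the state value round-tripped."""
    query = parse_qs(urlsplit(callback_url).query)
    returned_state = query.get("state", [""])[0]
    # Constant-time comparison of the state we sent and the state we got back.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: do not use this authorization code")
    return query["code"][0]


# The app makes up a random state value and remembers it (e.g. in the session).
state = secrets.token_urlsafe(16)
login_url = build_authorization_url(state)

# Simulated redirect back from the OAuth server after the user approves.
callback = f"{REDIRECT_URI}?code=example-code&state={state}"
code = read_callback(callback, state)
```

A matching state only tells the app that the redirect corresponds to a request it started; as the talk says, it is one level of check, not a full guarantee.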
This is back to the OAuth server saying here is, I'm doing the auth code grant type, here's the code I got from the user, here is my redirect URL, client ID and because this app has a secret, includes a secret. This is the way that the OAuth server sort of closes the loop and says, okay, well, I know that the authorization code couldn't have been stolen because if it was stolen, the attacker who stole it wouldn't have the client secret and that way the server can only return the access token to the real application. Or if there's some error, there's like a way to transmit error codes. So this is the sort of baseline OAuth flow. This is useful for what are called confidential clients which basically means the app has the ability to keep a secret. If you notice, this flow relies on the presence of the client secret to be secure. Without the client secret, this flow would not be secure. Somebody could jump in and steal the authorization code. Public clients are applications that can't keep a secret. The easy example of that is a JavaScript app where no matter what you do, if you put any API keys into your JavaScript source code, that's gonna end up down in the user's browser and they can see it by viewing source. It's a little bit less obvious, but it's also true for mobile apps where there's plenty of tools available to decompile a binary and look at strings inside of it. Sure, it's a little bit harder, but it's still possible. So for those applications, for browser-based apps and like single-page apps and mobile apps, you just can't use the client secret which means you can't do the regular OAuth flow. So if we think back to the OAuth flow, the problem was that the authorization server returns the OAuth code in the front channel, which means the authorization server can't guarantee it actually was received by the right application. It needs a way to verify that it's not about to return an access token to an attacker, which we normally use the client secret for. 
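As a rough sketch, the back channel request for a confidential client might be assembled like this. The token endpoint and credentials are hypothetical placeholders, and the request is only built, not sent, since the server here does not exist.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint for illustration; a real one comes from your OAuth server.
TOKEN_ENDPOINT = "https://authorization-server.example/token"


def build_token_request(code: str, redirect_uri: str,
                        client_id: str, client_secret: str) -> Request:
    """Build the POST that exchanges an authorization code for an access token."""
    body = urlencode({
        "grant_type": "authorization_code",  # the auth code grant type
        "code": code,                        # the code that came via the browser
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,      # proves this is the real application
    }).encode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_token_request(
    code="example-code",
    redirect_uri="https://app.example/callback",
    client_id="example-client-id",
    client_secret="example-client-secret",
)
# Sending it would be urllib.request.urlopen(request), run from the app's
# server so the client secret never reaches the browser.
```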
We can't use a client secret, so what do we do instead? It turns out there is a solution, and it's called PKCE (pronounced "pixie"). PKCE is a spec, an extension to OAuth. It stands for Proof Key for Code Exchange, and this is the solution for public clients that can't use a client secret. So let's walk through how this one works. It starts off the same: the user visits the website and says, I'm trying to use this application, or launches the mobile app, depending on whether you're in a browser-based app or a mobile app. This app has no client secret, so it says, hang on, I'm gonna generate a new secret on the fly right now, and this is unique per request. It stores that secret, then hashes the secret. The idea with the hash is that it's a one-way operation, so even if someone stole the hashed value, there's no way to reverse engineer it and know what the original secret was. So then the app says, great, go over to the OAuth server and take this hash value with you. That causes the user's browser to land at the OAuth server saying, I'm trying to log into this app, it's trying to access my contacts, and here is this hash it gave me. The user logs in and approves the request, and then the OAuth server remembers the hashed value and returns the code back to the browser. The user takes that code back to the app, and now when the app goes to get an access token, it doesn't have a client secret, but it does have that plain text secret it generated at the beginning. So it includes that in the back channel request. The server then says, okay, well, I see that you're using this authorization code, and when I generated that code there was a hash value associated with it. So I'm gonna hash the secret that you're sending me right now and compare them, and if they match then I know that the code wasn't stolen. So that's the trick for closing that loop when you're unable to use a pre-provisioned client secret. It's sort of like an on-the-fly secret that's unique per request.
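Here is a minimal sketch of that per-request secret and its hash, assuming the S256 method defined in the PKCE spec (RFC 7636): the hash is SHA-256, base64url-encoded without padding. The function names are made up for the example, and the server side is reduced to a single comparison.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """The plain text secret the app generates per request and keeps to itself."""
    return secrets.token_urlsafe(32)  # 43 characters, the minimum length RFC 7636 allows


def make_code_challenge(verifier: str) -> str:
    """The hashed value the app sends along in the front channel redirect."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    """What the OAuth server does at the token endpoint: hash and compare."""
    recomputed = make_code_challenge(presented_verifier)
    return hmac.compare_digest(recomputed.encode(), stored_challenge.encode())


verifier = make_code_verifier()              # stays inside the app
challenge = make_code_challenge(verifier)    # travels through the browser

# Later, on the back channel, the app presents the verifier with the code.
real_app_ok = server_accepts(challenge, verifier)
attacker_ok = server_accepts(challenge, make_code_verifier())
```

Someone who steals the authorization code in the front channel sees only the challenge, so they have nothing valid to present in the back channel request.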
So that was an extremely quick overview of OAuth clients. All the stuff we've been talking about so far is how clients talk to servers and get an access token. I wanna shift gears now and talk about how the OAuth server and the API coordinate. Now in a small-scale application, like if you're using an OAuth server that's built into your API, these are gonna be part of the same code base, so you can just kind of do things however you want. But as soon as you start getting into larger-scale deployments or using an OAuth server as a service, then we need to talk about how these things coordinate. There is a really important part here about how we actually deal with scoping access tokens and limiting risk, and also communicating to the user what's happening. So does anybody see anything wrong with this picture? This was an actual example of a sort of OAuth worm that happened a couple of years ago. The scopes here are: this application's requesting to read, send, delete and manage your email, and manage your contacts. And it's saying that Google Docs is trying to do this, which is a little bit fishy. But look what happens if you open this little dropdown. Now this starts to look pretty suspicious, right? So what happened here is that somebody went into the Google Developer Console and registered an application called Google Docs, uploaded the Google Docs icon and essentially just impersonated Google Docs. As soon as one person clicks this, the attacker gets an access token with the scope of being able to send email from that user's account and read all their contacts. So they can use the contacts API, pull down the address book, and send an email to all the people in the contacts list saying, hey, I just shared a Google Doc with you, click here to view it. The next person's gonna click it. It doesn't look like a spam email, because it's coming from a real Gmail account to a real Gmail account.
That person's gonna click it, it opens in Google Docs, they're gonna see that prompt, and then the process repeats. This actually spiraled out of control so fast that Google tweeted this out. This was like 20 minutes into the incident. And at the same time there was also this Reddit thread unfolding, which was fascinating to just hit refresh on over and over again, because Google engineers started chiming in saying, okay, we're trying to figure this out. Oh, we figured it out and we blocked the client ID, now everything's fine. Right, so the solution was that they just locked down the client ID, so the next time anybody clicked that link they got an error page. Did that actually solve the problem? I don't think so, because the problem was that users were tricked into clicking okay when they shouldn't have been, not that there was any actual vulnerability here in Google's API or even in OAuth. So this idea of scope and getting users' consent to these operations becomes pretty important, because you can see how it can spiral out of control. Here are some other examples of these consent screens, showing how these servers present information to the user. So this is GitHub describing what this application is trying to do. I think people would not have fallen for this nearly as much because of two important points. One, it shows the developer name up front, and two, the URL that they're gonna get sent back to, at the bottom, is a lot more visible than when it's hiding behind a dropdown. Wunderlist does this thing where they show what the app can do and what the app can't do, which I think is pretty nice. Spotify is not the best example. Pretty much you're just gonna click the only button with color on the screen and not read anything, because the text is so small.
Facebook does this cool thing where they actually say this app will receive this data, but you can edit the info you provide to the app, and then you can go in and unselect scopes. And then Fitbit took that idea very literally and just has checkboxes for choosing what you actually share with the application. This mechanism's been built into the OAuth spec for a very long time, but not a lot of apps have actually taken advantage of it. So scope is this idea of limiting what an app can do, what an access token can do, on behalf of a user. It's not the way you're gonna build a permission system into your API, but it is how you're going to mitigate risk, especially around third-party access. The other half of this is dealing with actual access tokens in your API. So I wanna talk about access tokens for a few minutes. As far as the application's concerned, an access token is just a string. It doesn't mean anything; the app's just gonna put it in the header and make a request. On the API side, you all of a sudden care a lot about what access tokens mean and how to validate them, right? So access tokens basically fall into two different families. Your access tokens are either gonna be reference tokens or self-encoded tokens. Reference tokens are just a way of saying the token itself is a random string and it points to a record somewhere else. The simplest example is storing tokens in a database table where you have a column that's the token, and you have columns for the user ID and the expiration date and permissions, et cetera, et cetera. There's many different ways you can implement this, but that's a simple example. The idea of self-encoded tokens is that the data about the token lives inside the token itself, and then it's somehow signed or encrypted or both. Again, there's many different ways you can implement self-encoded tokens. The important thing though is that if you're building an app, you don't care about this difference.
You're gonna take that token, put it in a header and move along. If you're building an API, you're gonna be receiving these tokens and now you need to somehow verify them and extract data from them. And again, it is only the API that should ever try to understand access tokens. If you're building an application, pretend they are just a random string. So reference tokens and self-encoded tokens. These are two different ways of handling this and they have trade-offs. So there are benefits to both and there are drawbacks to both. Benefits of reference tokens is that they are sort of very simple. If you store them in a database, which means if you want to revoke one or deactivate a whole application or deactivate a user account, it's very easy, you go look in your storage and you delete the tokens that you don't want anymore. Next time your API checks whether a token is valid, it'll look in the database and the token won't be there. But the downside is that it means you have to store all active tokens. And if you only have like 10,000 users or so, whatever, a database of 10,000 rows is not that big of a deal. But if you have millions of users or millions of users and hundreds of thousands of applications, that starts to get to be a lot of data. Also importantly, the API has to actually go and look in the database or look over HTTP to check whether tokens are valid, which adds latency and sort of centralizes how you can build your architecture. So this ends up really only being the best for smaller scale APIs, especially if there's an integrated OAuth server. Because then it's just sort of simpler and there's fewer moving parts. But these drawbacks start to become a real problem if you're building a larger scale system and then also it's not even really feasible if you're using an external OAuth server as a service. So self-encoded tokens have a couple of really important properties as well. 
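To make the reference token idea concrete, here is a minimal sketch in which a dictionary stands in for the database table. The column names and helper functions are assumptions for illustration, not anything prescribed by OAuth.

```python
import secrets
import time
from typing import Optional

# In-memory stand-in for the token table; a real API would use a database.
token_table: dict = {}


def issue_token(user_id: str, scope: str, lifetime_seconds: float,
                now: Optional[float] = None) -> str:
    """Create a random string and store everything about it server-side."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # the token itself carries no data
    token_table[token] = {
        "user_id": user_id,
        "scope": scope,
        "expires_at": now + lifetime_seconds,
    }
    return token


def look_up(token: str, now: Optional[float] = None) -> Optional[dict]:
    """What the API does on every request: one lookup per token check."""
    now = time.time() if now is None else now
    record = token_table.get(token)
    if record is None or record["expires_at"] <= now:
        return None
    return record


def revoke_user(user_id: str) -> int:
    """Revocation is just deleting rows; the next lookup fails immediately."""
    doomed = [t for t, r in token_table.items() if r["user_id"] == user_id]
    for t in doomed:
        del token_table[t]
    return len(doomed)


token = issue_token("user-123", "contacts", lifetime_seconds=3600, now=1000.0)
before = look_up(token, now=1500.0)   # record found, not expired
revoked_count = revoke_user("user-123")
after = look_up(token, now=1500.0)    # None: the row is gone
```

Both trade-offs from the talk show up here: revocation is trivial, but every active token needs a row and every request needs a lookup.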
They don't need to be stored anywhere, because everything about the token that you care about lives inside the token string. It means that you have better separation of concerns, so your API no longer needs shared state with your authorization server. It also means your APIs can validate tokens without calling back to the OAuth server that generated them, because they're self-encoded and they say whether or not they're valid right there. The downside is there is no way to revoke a self-encoded token, because the token itself is the statement about whether it's valid. So if you want to revoke tokens, you have to sort of add the state back into your system somehow. But this does end up being the best for larger-scale distributed architectures and also when using external OAuth servers. So let's look at an example of this. One way to implement self-encoded tokens is using the JSON Web Token standard. It ends up being a very convenient mechanism because there's good library support. There is actually a spec working its way through the OAuth group right now to standardize this particular token format for access tokens. This is an example JSON Web Token. If you look, there's two dots in there, and if you split on the dots you find three parts. If you base64 decode the three parts, you get JSON data. It's important to note that this is not encrypted, which means anything you put into the token is visible to the end user and the application. It's just signed. It's signed by the authorization server, and that is so that you can verify that this token was not tampered with by the application. That kid property identifies the key that signed the token. So you go into the keys document of the server, which you can find in the metadata of the server. That jwks_uri says where the keys live; you click on that, you get the actual key data, and notice that there's a matching kid property there.
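The structure described above can be explored with a few lines of standard-library Python. This sketch only splits and decodes a token and finds the matching key by kid; it does not verify the signature, which should be left to a JWT library. The token, claims, and key IDs below are fabricated for the example.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop the base64 padding, so restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def split_jwt(token: str):
    """Split on the two dots; the first two parts decode to JSON."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)


def find_key(jwks: dict, kid: str):
    """Pick the key whose kid matches the one in the token header."""
    for key in jwks["keys"]:
        if key.get("kid") == kid:
            return key
    return None


# A made-up token with a placeholder signature, built here so the example is
# self-contained. It is NOT validly signed.
example_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "kid": "example-key-1"}).encode()),
    b64url_encode(json.dumps({"sub": "user-123", "scope": "contacts"}).encode()),
    b64url_encode(b"not-a-real-signature"),
])

# A made-up keys document, shaped like what a jwks_uri returns (key material omitted).
example_jwks = {"keys": [
    {"kid": "example-key-0", "kty": "RSA"},
    {"kid": "example-key-1", "kty": "RSA"},
]}

header, payload, signature = split_jwt(example_token)
signing_key = find_key(example_jwks, header["kid"])
```

Anyone holding the token can run the same decode, which is why nothing sensitive should go into the payload.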
So your access token comes in, you find the kid of that, you go look at the keys of the server, you find the matching key, you plug it into a library, and it tells you whether or not the token is valid. That's enough to tell whether the JWT is valid. You then have to make a decision. Is that enough? It's very fast. It turns out that math, once you have the key cached, can happen in like a millisecond, which is fantastic. It means it's super scalable, right? The downside is that you can't revoke these tokens, which means there is a window in which you might be getting the wrong answer. There's another option to validate tokens, which is to go back to the OAuth server and ask, hey, is this token really valid? So let's look at this example. Say you have access tokens that last for eight hours, and one is issued at time zero. You check the JWT signature, and it says it's valid. You go and ask the OAuth server remotely, is this valid, and you get the same answer. An hour goes by, same thing. Then the user goes into the security settings, deactivates this application and revokes it. Now another hour goes by and the app uses the token. Well, if you've done local validation, it still looks valid, because the token hasn't changed and you can't change that token once it's been issued. If you were to go back and ask the OAuth server if this token is valid, the OAuth server says, no, it's not valid, because the user revoked it. So now you've got different answers depending on how you're validating tokens, and you will continue to get different answers until the token expires, and then you get the same answer again. So this is a really tricky challenge, because you have to decide what your tolerance is for getting a different answer about whether tokens are valid, both in terms of what the API does and in terms of the time window.
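That timeline can be simulated with a toy model. The eight-hour lifetime comes from the example above; the exact revocation time of 1.5 hours is an assumption chosen to fall between the hour-one and hour-two checks, and both check functions are stand-ins rather than real validation code.

```python
TOKEN_LIFETIME_HOURS = 8   # from the example: tokens last eight hours
REVOKED_AT_HOUR = 1.5      # assumed: the user revokes between hour 1 and hour 2


def local_check(hour: float) -> bool:
    """Local validation sees only what is in the token: signature and expiry."""
    return hour < TOKEN_LIFETIME_HOURS


def remote_check(hour: float) -> bool:
    """Asking the OAuth server also reflects the revocation."""
    return hour < TOKEN_LIFETIME_HOURS and hour < REVOKED_AT_HOUR


for hour in (0, 1, 2, 8):
    print(f"hour {hour}: local={local_check(hour)} remote={remote_check(hour)}")

# Whole hours at which the two validation methods disagree.
disagreements = [h for h in range(10) if local_check(h) != remote_check(h)]
```

The two methods disagree from the revocation until the token expires, so shortening the token lifetime directly shrinks that window.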
So this pattern ends up being a pretty common way to solve this, where you have a gateway that sits out on the internet, and that API gateway is doing only the local validation, so it takes less than a millisecond to validate these tokens. The benefit is that it's able to throw out junk requests, expired tokens, invalid tokens, just junk thrown at it, and it'll only pass back to the backend APIs things that pass the JWT validation. That does mean that it's potentially passing back requests to these backend APIs for tokens that are no longer valid for reasons other than expiry, such as having been revoked. But what that means is that at least these APIs aren't getting slammed with junk requests from the public internet. So at this point, these APIs have to decide, do I care whether this token has potentially been revoked or not, and the answer is different depending on what the API is doing. If you have this customer API that's just returning the user's profile image, it's probably not the end of the world if that's also returned for a token that's been revoked. Probably no new information has been leaked. But if you have an API that's gonna actually go and charge that user's credit card, you really don't want that to run with a revoked access token. So for that particular operation, you can go and do token introspection. You add in the latency of what it takes to go back to the OAuth server and ask, but then you do get the right answer. That's a way to sort of split the difference: getting the speed benefits but also getting the security of the right answer where it matters. The other thing to keep in mind here is that you have a lot of flexibility in terms of how long access tokens should last. You can make different decisions based on different rules of your system. How long access tokens last affects how long you're gonna get a different answer from those two different ways of validating access tokens.
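A rough sketch of that split, with the gateway's local validation and the OAuth server's introspection endpoint replaced by stub functions. All names here, including the set of sensitive operations, are hypothetical.

```python
from typing import Callable, Optional

# Assumption for the example: which operations are worth the extra round trip.
SENSITIVE_OPERATIONS = {"charge_card"}


def handle_request(operation: str, token: str,
                   validate_locally: Callable[[str], Optional[dict]],
                   introspect: Callable[[str], bool]) -> str:
    """Local validation for everything, introspection only where it matters."""
    claims = validate_locally(token)   # fast: signature and expiry only
    if claims is None:
        return "rejected at gateway"
    if operation in SENSITIVE_OPERATIONS and not introspect(token):
        return "rejected after introspection"   # slower, but the right answer
    return "allowed"


# Stubs standing in for a JWT library and a call to the OAuth server.
revoked_tokens = {"revoked-token"}


def stub_validate_locally(token: str) -> Optional[dict]:
    return None if token == "junk" else {"sub": "user-123"}


def stub_introspect(token: str) -> bool:
    return token not in revoked_tokens


results = {
    "junk": handle_request("profile_image", "junk",
                           stub_validate_locally, stub_introspect),
    "profile": handle_request("profile_image", "revoked-token",
                              stub_validate_locally, stub_introspect),
    "charge": handle_request("charge_card", "revoked-token",
                             stub_validate_locally, stub_introspect),
}
```

In this sketch the revoked token still fetches a profile image but cannot charge a card, which matches the tolerance described in the talk.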
So again, you can split the difference and make decisions based on what your tolerance is. You can say admin users get one-hour access tokens and 24-hour refresh tokens, so they have to log in every day. Consumer users I don't wanna bother, so they get unlimited refresh tokens and they stay logged in for 24 hours, but then if someone's gonna do something sensitive in the system, they have only four-hour access tokens. So you can see there's a lot of flexibility here, and a lot of the OAuth tools and servers and products that you'll use give you this flexibility and give you these knobs to turn. So I just wanna leave you with a couple of links to additional resources and further reading. If you didn't already find the OAuth 2.0 Simplified book, we have some copies back at the booth. oauth.com is the ebook version of the book. oauth.com/playground is an interactive walkthrough of all the OAuth flows, so you can actually see the exchanges step by step. oauth.net is a community website for OAuth; it's a good resource to find related blog posts. If you haven't yet signed up for an Okta developer account, it's a great way to try out this stuff, follow along with the blog posts, and play around in your own little environment there. The book is also available at oauth2simplified.com, and I will leave it at that. Thank you very much.