Hi everybody. I'm Aaron Parecki. I'm a senior security architect at Okta. I'm also a member of the OAuth working group at the IETF, and I've been in the web standards space for quite a long time now. If you've ever googled anything about OAuth, you've probably landed on the website I maintain, which is OAuth.net. It has a lot of great resources for finding things about OAuth, and if you ever see anything wrong with it, please feel free to contribute. The link is at the bottom and you can add your own resources and things you find on there. So before we talk about how to hack OAuth, I want to talk a little bit about what OAuth is and set the scene for how some of these things take place. If you've ever tried to read anything online about OAuth, I apologize, because it can often feel like trying to find your way through a maze of content. It is not obvious how these things fit together. OAuth is not just one spec. It's made up of a whole bunch of specs. So it can be pretty overwhelming trying to figure out how to navigate this and where these things line up and how they fit together. So instead of going through the specs, we're actually going to take a step back and first talk about how we got OAuth in the first place. So do you remember seeing this kind of thing online, like, I don't know, ten years ago or so? This was a very common pattern on the Internet a long time ago. We look at this now and we say, oh, ha, I would never put my email address and password into some random application like Yelp, right? But it was very common at the time to do this, because this new app like Yelp would launch and it would ask you to enter the password for your email address in order for it to go and find your friends to see if they were already using Yelp, which is a very cool feature, but a terrible way to do it, and we understand that now. But it wasn't limited to just Yelp. It was Facebook doing this too. 
Can you imagine if Facebook were collecting people's email addresses and passwords now? That would not go over well. But users were happily giving their logins to these services because they actually did want what the application was promising, which was the ability to find their friends. So what we're looking for is a solution for how we can let applications access the contacts in the API, but not also access the person's emails. And that was the original problem that OAuth set out to solve: how can we let an application access data in an API without giving that application your password. And originally, it was always about third-party access to data. It was about Yelp wanting access to your Google account, or Last.fm wanting to access your listening data in Spotify, or Buffer wanting to post to your Twitter account. It was always about that third-party access, and the result of the OAuth flow is that the application then has an access token which it can use to go make API requests. After the OAuth flow is complete and the user has allowed the application access to their account, that application will have an access token, which is a string of characters that doesn't mean anything to the application. That string of characters is then used in an API request to access that API. What's interesting is that nothing in this process actually tells the application about the user. It doesn't actually tell the application who logged in. And that's totally fine. OAuth was not created as a single sign-on protocol or as a way to talk about user identity. It was created to access APIs. So I like to use this analogy of checking in at a hotel, which seems like a funny idea right now. You go to the front desk, you give that person your ID and your credit card. They give you back this hotel key. You take that hotel key and you swipe the key on the door and the door lets you in. 
Now, in order for this system to work, the key card doesn't actually need to represent you as a person. It just needs to represent that you have access to this door. So the door doesn't even care who you are. The door just cares about whether this key card has access. So in OAuth terms, the person at the front desk is the authorization server, the key card is the access token, and the door is the API. With that kind of background out of the way, let's now talk about how OAuth works, just enough so that we can see where some of the holes are that we can start poking at in order to actually break it. So all of OAuth is modeled around these five roles. These are roles defined in the spec. The terms at the bottom are the spec terms for them. Those are the words that you'll find in the spec, like resource owner and resource server. Those aren't typically things that we use in conversation when we talk about this. Instead, we talk about users and APIs and applications, but they're basically interchangeable, and we have to remember that these are roles, not actual components. So the application might be a mobile app, but it might also be a web server app. It doesn't matter where that application is running. We always model it the same way in OAuth and talk about that application getting an access token. Similarly, the OAuth server and the API are sometimes part of the same piece of software, like how GitHub has a built-in OAuth server that's part of its same software. But in some cases you might be using an external OAuth server, either as an open source project or as a service like Okta or Auth0. But again, it doesn't matter how these things are configured. We still talk about these roles the same way. So let's start by going over the most common OAuth flow, the authorization code flow. So the flow starts off with the user visiting the application's website or launching the mobile app, and then they click the button that says log in. 
And that is them expressing interest in using this application. So they are saying, I would like to use this application. The app says, great, don't give me your password. I can't use your password. So instead, go over to the OAuth server and log in there. And it does that by creating a URL to the OAuth server with a bunch of stuff in the query string that describes the request, things like its own identifier and what kind of data it's requesting. So that causes the user to land at the OAuth server, where they then log in, and then they will see this prompt asking them, do you allow this request? If they click yes, the OAuth server generates a one-time-use authorization code and sends it back in a redirect to the user's browser, to have the user's browser deliver it back to the application. Now, this is an important step, because what's actually going on here is the OAuth server is trying to get this authorization code to the application, but the only way it has to do that is by telling the user's browser to deliver it to the application itself. So once the application has the authorization code, it can now go and get an access token, and it does that by making a POST request back to the OAuth server to exchange that code. This is the step where it can also include its client secret, its own password, so that the OAuth server knows that the code was not stolen in that redirect step. So the OAuth server validates that request, checks the client secret, creates an access token and returns it in the response. Now the application can go use the access token to request data from the API. So the blue lines on top are what we call the front channel. That's the idea where we're actually using the user's browser to move data between parties. The pink lines are the back channel, and that's the sort of more normal one, where we're sending data from a client to a server over HTTPS. 
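The flow just described can be sketched end to end. Everything here is illustrative: the endpoint URLs, client ID, scope name, and placeholder values are made up for this sketch, and the actual network calls are omitted.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical endpoints and client identifiers, for illustration only.
AUTHORIZE_URL = "https://authorization-server.example/authorize"
CLIENT_ID = "example-client"
REDIRECT_URI = "https://app.example/callback"

# Step 1 (front channel): build the URL that sends the user over to the
# OAuth server, describing the request in the query string.
state = secrets.token_urlsafe(16)  # random value tying the response to this request
auth_url = AUTHORIZE_URL + "?" + urlencode({
    "response_type": "code",   # ask for a one-time-use authorization code
    "client_id": CLIENT_ID,    # the app's own identifier
    "redirect_uri": REDIRECT_URI,
    "scope": "contacts.read",  # what kind of data it's requesting
    "state": state,
})

# Step 2 (back channel): after the user approves, their browser delivers
# ?code=... to REDIRECT_URI, and the app POSTs it back to the OAuth server,
# including its client secret so the server knows the code wasn't stolen
# in the redirect step.
token_request = urlencode({
    "grant_type": "authorization_code",
    "code": "CODE_FROM_REDIRECT",      # placeholder for the real code
    "redirect_uri": REDIRECT_URI,
    "client_id": CLIENT_ID,
    "client_secret": "CLIENT_SECRET",  # only ever sent on the back channel
})

# Step 3: the response to that POST contains the access token, which the
# app then presents to the API as a Bearer token (per RFC 6750).
api_headers = {"Authorization": "Bearer ACCESS_TOKEN_FROM_RESPONSE"}

print(auth_url.startswith(AUTHORIZE_URL + "?response_type=code"))  # True
```

Note that the authorization code travels over the front channel (the redirect) while the client secret and access token only ever travel over the back channel.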
There's a lot of benefits to the back channel that we often take for granted. Because it's over HTTPS, we can validate the certificate, so the client knows it's talking to the right server. Once that connection is established, it's encrypted, so the data can't be tampered with in transit. And the response that comes back can be trusted immediately, because it's part of that same connection that was established. I like to think of this as hand delivering a message, where you can walk up to somebody, you can hand them a thing, you can see them, they can see you, you can see they took it. You can be sure that nobody else stole it in that process. Sending data over the front channel is more like throwing it over a wall and just kind of hoping they catch it, where you can't actually be sure that they caught it. You can't tell if somebody stole it mid-flight. You can't tell if it just landed on the floor. And on the other side, there's a similar problem, which is that on that side, you can't actually be sure where it's coming from. You don't actually know if it's from the real OAuth server or from a fake OAuth server. So you might be wondering, well, why do we use the front channel at all, then, if it's so risky or insecure? It does have a few really important properties that turn out to be very useful in OAuth. One, it's the way that we can actually be sure that the user was in front of their computer when they logged in, because it's the OAuth server that is collecting their password. But also, importantly, it's a way for applications to be able to receive data even when they don't have a public IP address, which for things like mobile phones or JavaScript apps turns out to be really important. OK, so that was the quick overview of what OAuth is. If you're curious about more of how that works and the different environments OAuth can run in, do check out my videos on YouTube; search for Okta for Developers or just search for my name, Aaron Parecki. 
And you'll find a bunch of videos about OAuth that talk about that part in more detail. Now let's move on to the hacks. So it turns out there's a lot of ways to hack OAuth. In fact, most of them are documented in the specs themselves. There's a handful of attacks documented in the core spec, the original document. There was a later security document put out. There was a document put out specifically about OAuth and mobile apps, which again talks about some of the particular aspects unique to the mobile app environment. There's also a new document called the OAuth 2.0 Security Best Current Practice, and that is a huge list of other ways that this can kind of go wrong. So I'm going to save you from that, because those are well documented and well understood. Instead, I want to focus on a couple of attacks that happened in the real world that actually resulted in some pretty big press headlines. So I'm going to start off with this one, which is well understood now. In fact, a lot of OAuth security has been designed around the fact that this happened a long time ago, but it's a good place to start. So Twitter: back in 2013, there were a bunch of headlines that came out about how Twitter's API keys were stolen, or leaked, or whatever you want to call it. What happened was someone went and downloaded all the Twitter apps on all the different platforms, like iPhone, Android and Windows Phone, then decompiled them, extracted the strings out of them, and posted all those API keys on GitHub. The result of this hack was that basically anybody could now impersonate the Twitter apps. What I mean by that is that any of the permissions that the Twitter apps themselves might have, that other third-party developers couldn't get, now anybody could get. And the lesson we learned here was: you can't put secrets in native apps. It just doesn't work. There's no way to do it, because anybody who downloads that app is downloading that secret and can just decompile it. 
So this was actually one of the major problems with OAuth 1, which was that it was entirely based around this idea of using a secret for the application to sign requests. And that secret was always provisioned by the developer and then put into the application, which works fine for web-based apps where users don't have access to the source code. But when we're talking about mobile apps, the users download those apps, download the secrets with them, and can extract them, like we saw. So that was a major problem with OAuth 1, and it was actually a huge motivator in deciding to start a new version of the spec. OAuth 2 avoids the need for a client secret in cases where we can't use one, thanks to an additional extension added to OAuth. PKCE, which stands for Proof Key for Code Exchange and is pronounced "pixie", is basically a solution for how to do OAuth with applications that can't use a secret. So if your application can't use a secret: you might remember from that first diagram we looked at of how OAuth works that one of the last steps included the application using its secret to authenticate to the OAuth server. We don't have a secret, so what are we going to do? Let's walk through the flow again, but look at what's changed. It starts off the same. The user clicks on the application and says, I would like to use this application. Please log in. Now, this time, the app first generates a new random secret for this request, stores it internally, and then calculates a hash of that secret. Then, when it builds the URL that it's going to launch the user into a browser with, it includes that hash value in the URL. So that causes the user's browser to deliver the hash to the OAuth server. The OAuth server again gets the user to log in and prompts them for permission, and then it issues that temporary code, but it also remembers the hashed value. 
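That secret-and-hash dance can be sketched from both sides. The helper names here are mine, not any library's API; PKCE itself is defined in RFC 7636, and in a real app a library would do this for you:

```python
import base64
import hashlib
import secrets

def b64url(data: bytes) -> str:
    # Base64url without padding, as PKCE (RFC 7636) specifies.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Client side: generate a fresh random secret (the "code verifier"), store
# it, and put only its SHA-256 hash (the "code challenge") in the
# authorization URL that launches the user's browser.
code_verifier = b64url(secrets.token_bytes(32))
code_challenge = b64url(hashlib.sha256(code_verifier.encode()).digest())

# Server side, later, at the token endpoint: the client presents the raw
# verifier alongside the authorization code, and the server hashes it and
# compares it to the challenge it remembered when it issued that code.
def server_accepts(remembered_challenge: str, presented_verifier: str) -> bool:
    computed = b64url(hashlib.sha256(presented_verifier.encode()).digest())
    return computed == remembered_challenge

print(server_accepts(code_challenge, code_verifier))  # the app that started the flow: True
print(server_accepts(code_challenge, "wrong-guess"))  # someone who only stole the code: False
```

Because only the hash crosses the front channel, an attacker who intercepts the redirect gets the code but not the verifier, and can't redeem the code.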
So now the user's browser delivers that temporary code back to the application, and the application is ready to go and exchange that code for an access token. This time, it doesn't have a preregistered client secret, but it does have that secret it's been holding on to from when it started the request. So it includes that raw secret value in this request. And what that does is it lets the OAuth server say, OK, well, when I issued that code, I remember seeing this hashed value. So that means whoever is presenting this code now has to prove that they were the ones that started the request. To do that, they include the secret they used to generate that hash; the OAuth server can hash that value itself and compare the two hashes to see if they match, and then it can respond with the access token. And then we're done and everything moves on. The good news is this is often just handled by a library for you. The AppAuth library for iOS, Android and JavaScript does this for you. Oftentimes, if you're using a company's product, its SDK will also be doing this under the hood. All right, let's talk about the next one. This is about hacking JSON Web Tokens, or JWTs. These headlines were going around in 2015, when a whole bunch of libraries were found to be vulnerable to the same problem: many libraries were actually not validating JSON Web Tokens properly. So let's dig into this a little bit and see what's going on. JSON Web Tokens are often used for API authentication, where your access token might be a JSON Web Token. Here's what a JSON Web Token looks like. It is a long string of characters. It always starts with eyJ. And if you look closely, there are two dots somewhere in there, and those separate it into a three-part token. There's a header, a payload and a signature. 
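You can see that three-part structure for yourself: the header and payload are just base64url-encoded JSON, readable by anyone. The token below is a toy one constructed for illustration; its signature segment is placeholder bytes, not a real signature.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore padding to decode.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

token = (
    "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9"  # header
    ".eyJzdWIiOiJhbGljZSJ9"                 # payload
    ".c2lnbmF0dXJl"                         # signature (placeholder bytes)
)

header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

print(header)   # {'alg': 'RS256', 'typ': 'JWT'}
print(payload)  # {'sub': 'alice'}
```

Notice there's no secret involved in reading it: a JWT is signed, not encrypted, which is exactly why the signature check matters so much.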
Without validating this properly, somebody could go in here, modify the payload, the middle part, change data, change a username, change a permission, things like that, then repackage it up, send it to the API and try to make API requests. And if the API is not validating it properly, it would just read whatever data is in the middle, and then the attacker is in. So the thing about JSON Web Tokens is that the header part describes the token. One of the things in the header is a description of how the token was signed, and there are a couple of different options available to you if you're creating a JSON Web Token. One of the signing mechanisms is RS256, and that's an asymmetric algorithm where you use a public key to verify it and a private key to sign it. Now, it just so happens that there's also a signing algorithm called none, and that means there's no signature. So the hack here is that you go get a real access token that's probably signed using RS256. Then you go inside the token, you change the data you want in the middle part, the payload, and then you change the signing algorithm to none, and then make an API request. If that server first looks at the header to decide how to validate the token, it'll see the signing algorithm none and it will skip checking the signature at all, and now it'll just believe whatever you put into the token, which is clearly wrong, and that was the problem. There's another, more subtle version of this attack where you actually put a symmetric algorithm into the header instead. And then what happens is you send the server your modified access token telling it that it's signed with a symmetric algorithm. 
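Before looking at that variant, here's what the "none" tampering looks like mechanically, plus the kind of up-front check a careful server applies. The starting token is a toy one with a placeholder signature, and `check_alg` is my own sketch of the allowlist idea, not any particular library's API:

```python
import base64
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(seg: str) -> bytes:
    return base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4))

# A toy RS256 token with a placeholder signature, for illustration.
original = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJhbGljZSJ9.c2ln"

# The attack: decode the payload, change a claim, relabel the token as
# unsigned ("alg": "none"), and send it with an empty signature segment.
payload = json.loads(b64url_decode(original.split(".")[1]))
payload["sub"] = "admin"
forged = (
    b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    + "." + b64url(json.dumps(payload).encode())
    + "."  # no signature at all
)

# The defense: the server decides up front which algorithms it accepts,
# and rejects everything else before ever touching the signature.
ALLOWED_ALGS = {"RS256"}

def check_alg(token: str) -> str:
    alg = json.loads(b64url_decode(token.split(".")[0])).get("alg")
    if alg not in ALLOWED_ALGS:
        raise ValueError(f"unexpected signing algorithm: {alg!r}")
    return alg

print(check_alg(original))  # RS256
try:
    check_alg(forged)
except ValueError as err:
    print("rejected:", err)
```

Real libraries now enforce exactly this shape; PyJWT, for instance, makes you pass an explicit `algorithms=` list to `jwt.decode`.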
It'll then go find the public key which it was going to use to verify the asymmetric signature, but it treats it as a shared secret instead, which basically means you can use that same public key to actually create a valid signature, because if it's treated as a shared secret, you can create the signature just like anybody else can with that key. That one's a little trickier, because your resource server will think it's actually validating your signature, because it is; it just happens that anybody could have created that signature, so it's meaningless. So what's the takeaway here? The takeaway is that the JSON Web Token header, which talks about how to validate the token, is untrusted information before you validate the signature, and you have to treat it as such. Basically, what that means is: never let the header determine what signing algorithm is used. Instead, when you go to validate a JSON Web Token, you should only use signing methods that you know are safe and that you know you're expecting, which is probably just the asymmetric ones. Thankfully, most JSON Web Token libraries fixed this around 2015 or 2016 by requiring you to tell them which signing algorithms you are expecting when you go to validate a token. That way you can't be tricked into accepting a signing algorithm from the header itself. All right, let's move on to one of the more subtle ones. This is what I like to call an OAuth phishing attack. In 2017, there was this attack on Google's OAuth, which resulted in headlines like this. It had various names, like an OAuth worm or a phishing attack, but let's take a look at how it worked. First of all, do you see anything wrong with this picture? I'll give you a second to look closely. You might notice that we are really on the real Google OAuth server. It's accounts.google.com. There's a secure lock icon in the address bar. So we know this is actually a page served by Google. 
This is not a fake page, but it's a little bit suspicious that Google Docs is requesting access to my contacts. For one, why does it need access to my contacts? But also, why does it need permission at all? Because it's Google, and theoretically Google already has access to everything inside, right? But how about if I click this little arrow next to Google Docs? Then I see the application developer who built it, and also the URL that I'm going to be sent to when I authorize this app. What is docscloud.info? Well, I can be pretty sure it's not actually Google. Whatever that is, it looks suspicious. Now, if anybody clicked that, it would be pretty obvious at this point that this thing is wrong. The problem is that it's too easy to skip past this and not actually click that. So in OAuth, in order to start a flow, you just need to create a URL and get someone to click on it. Typically, that's the link that says log in, but you could actually deliver that link any way you can think of. You could send it in a bit.ly link over SMS. You could put it in an email, for example, and get someone to click on it. And because anybody can make this link, you can send it to anybody and start an OAuth flow when they click it. So the way this spread was that the attacker just got one person to click on this link. And as soon as one person did, it granted this application access to that person's address book and the ability to send email from their account. And then this worm could go and use the Google APIs to actually send an email to everybody in their address book. And now that email is not actually a fake email. It's not spam. It's coming from a real Google account and going to a real Google account. And that was partly why this was so convincing, because Google's spam filters couldn't flag it; it looked like legitimate email at first. And what's the contents of that email? It's this message that says, so-and-so shared a file with you, click here to view it in Google Docs. 
Now, if you got this email, it's going to be from somebody in your address book. And it's actually a real email. It's not from a fake address. It's from them. So then you go and you click the Open in Docs button and you get taken to Google. You're not even taken to a phishing site. You're taken to the actual Google page, where you're then prompted with this. And because you're already thinking about Google Docs, you're more likely to just click through right here and not even think too much about it. And as soon as you do, now the attacker has access to your account and can repeat the process. And that's how it spiraled out of control. So within about 40 minutes of this starting, it had spread so far that Google actually had to tweet this out and be like, hey, we're investigating this, don't click on that link, we're looking into it. Meanwhile, on Reddit, where this had been reported, there were Google engineers chiming in being like, OK, we're looking at it. Oh, we found out what's going on and we've disabled the app. And at that point, the next time anybody clicked that link, they just saw this error page, because what they did was they just disabled the client ID. But if you think about it, that didn't actually solve the problem; that just stopped that particular instance, because the problem is actually a lot deeper. The problem is that it's a phishing attack, and phishing attacks often don't have a technical solution, because it's all about teaching users what to expect and not to blindly trust things. And a lot of the reason this worked was that the screen that asked for the user's permission was so convincing, because it was a Google screen, and they didn't do a good enough job of preventing people from impersonating Google apps. So how could they have stopped this? Well, for one, they can make sure that developers can't go and name applications with the word Google in them, because really only Google should be able to do that. 
But they went a few steps farther, and this is now having ripple effects throughout a whole bunch of related products. Shortly after this attack, they launched some pretty big changes to their API policies. One, they now require that applications request only the bare minimum permissions they need to function, instead of requesting a broad range of permissions. They also now recommend requesting permissions in context. So instead of clicking a link to log into an app and having it request everything all at once, you first just log in. And then, when it needs to go send an email for you, it'll request that permission at that point. We also call that incremental authorization. But one of the other things they did was they called out a few specific scopes as restricted scopes. Those are things like sending email from a user's account or accessing files in their Google Drive. And for those restricted scopes, if your application needs them, your application actually has to go through a security assessment, which is a manual review process. In their documentation, they actually say that this process might cost anywhere between $15,000 and $75,000. So basically what this did was that a whole bunch of smaller services that used Gmail for things like email marketing automation all just suddenly had to shut down, because they couldn't afford that kind of fee. So that's not my favorite solution to this problem, because it feels a bit heavy-handed. And I recognize that it's really a design problem, which is very hard to solve. But we can look at some other companies to see how they've handled a similar problem. This is GitHub's authorization screen. You'll notice it looks very different from Google's. But down at the bottom, next to that green authorize button, it actually shows the website that you'll be taken to when you click it. 
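In OAuth terms, incremental authorization just means making a second, narrower authorization request later, in context. A sketch using Google's documented authorization endpoint and `include_granted_scopes` parameter; the client ID and redirect URI here are made-up placeholders:

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
common = {
    "response_type": "code",
    "client_id": "example-client",                   # placeholder
    "redirect_uri": "https://app.example/callback",  # placeholder
}

# First request: only what's needed to sign the user in.
initial = AUTH_ENDPOINT + "?" + urlencode({**common, "scope": "openid email"})

# Later, when the app actually needs to send mail on the user's behalf, it
# asks for just that one scope; include_granted_scopes merges the new grant
# with the scopes the user already approved instead of re-requesting everything.
incremental = AUTH_ENDPOINT + "?" + urlencode({
    **common,
    "scope": "https://www.googleapis.com/auth/gmail.send",
    "include_granted_scopes": "true",
})

print("gmail.send" in initial, "gmail.send" in incremental)  # False True
```

The user sees one small, contextual consent prompt at the moment the permission makes sense, rather than a long list up front.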
And that's super helpful, because then as a user you might recognize, oh, well, this clearly isn't the real application that I thought I was going to, because I don't recognize this web address. So really, this problem is all about making sure users are well informed when they're granting permissions to other applications to access their account over OAuth. It's a hard problem, and there are several different kinds of solutions, as you can see, some more heavy-handed than others. All right. The last one I want to talk about is Facebook. This made major headlines in 2018: fifty million accounts were hacked. Now, normally when there's a large-scale hack like this, it's actually something pretty silly, like somebody left a database publicly exposed to the internet, or someone just stole a password dump and then tried it on some other service. But this one caught my eye, because normally companies will just sort of write a high-level apology email and send that out via the press releases. But this was actually so serious that Facebook put out a lot of information about exactly what went wrong here. On their security blog, they actually had a lot of details that we can look into to try to learn something from this. From their own post, this was a phrase that the VP of product used to describe the attack: the vulnerability was the result of the interaction of three distinct bugs. And this is also very interesting, because it wasn't just one problem. And when we look at these three bugs, you'll see that individually, each of them doesn't actually seem that bad and seems like it couldn't cause something at this scale. It's only once you stack them up that it becomes a problem. So let's dig in. Facebook has a feature called View As, which lets you see what your profile looks like to somebody else. That was really useful for being able to test out your privacy settings. You could make sure that this one post you made was hidden from these particular people, and confirm it by checking out what your page looked like to them. So that's where this all starts. 
So here's the text from their statement. First, View As is a privacy feature that lets you see what your page looks like to somebody else. This is supposed to be a view-only interface. However, for one type of composer, the little box that lets you post to your profile, the one that lets you wish friends a happy birthday, View As incorrectly offered the ability to post a video. But again, by itself, whatever, right? That's not a big deal. The worst you can do is upload a video to your own wall. OK, so hold that one in your head for a second. Now, second, they launched a new version of this video uploader, and that version had a bug where it generated an access token that had the permissions of the Facebook mobile app. Now, this is starting to sound a little bit suspicious, because what do you mean it generated an access token? That sounds to me like they're letting one part of their system just sort of assert things on behalf of some other part of the application without actually checking it. But again, by itself, this wouldn't be so bad, because all that could really happen was that you could use that access token to post to your own account or read your own data. OK, so hold that one in your head. Third, when the video uploader appeared as part of View As, it generated an access token not for you, the viewer, but instead for the profile you were viewing. Now, this is bad. This is clearly bad, because now I'm able to get an access token for somebody else's account. But again, if it were only this, it wouldn't have been that bad, because View As was supposed to be a read-only interface, so I would have ended up with a read-only access token to their account. But when you stack these up: if you used the View As feature to view somebody else's page, you would end up with an access token belonging to that user that had the permissions of the Facebook mobile app, which basically means it can do everything. And that's why it started cascading out of control. 
So what's the moral of the story here? This is again a pretty tricky one because again, each of these bugs did not seem that bad on their own. But I think the thing that kind of ties it all together is that it's really important to keep clean security boundaries between these different parts of your application. Don't let parts of your application pretend to be other parts or pretend to be users. We have the OAuth framework for a reason. And that's what lets us make sure that the user was involved when issuing an access token for that user. And that is all I have for you today. I hope you enjoyed this. Please feel free to get in touch with me on Twitter or via my website. If you have more questions or would like to chat more about this, you can get a copy of my book at OAuth2Simplified.com. The book goes into a lot more detail on all the different things you need to know about building OAuth applications and servers. I also have cat stickers available on that website as well. Thank you so much.