So, thanks everyone. This is going to be Attacking and Defending JWT Tokens: The Ultimate Guide. Before we start, I would like to briefly thank the Linux Foundation and all of you for attending such a great event. So, first of all, you don't know me, so who am I? I'm Leo Joskovich, I'll try to pronounce my name, and I'm originally from Mexico. I'm a security researcher at Palo Alto Networks. Usually we have a lot of fun hunting for web vulnerabilities, and obviously we also try to improve our products as well. I'm also a student in the MSc Cybersecurity program at the University of London. But I also have some hobbies, like football, soccer, Argentina, Mexico at that time. I also like baseball, and sushi of course, not because I'm in Japan, I promise you I like it everywhere. And of course, I like chili, I like a lot of spice. So if you see me around putting something spicy in my soup, don't worry, I'm from Mexico. So this is who I am. Today we have a very special agenda here. We're going to start with a brief history of how we used to do authentication and authorization. Just briefly, so that we can move forward to JWT, or JOT, fundamentals. We're going to see the advantages of JOT, as well as some terminology, because at first it looks very confusing. There are a lot of terms there, and they're very similar to each other. Then we're going to move to some statistics, because it's good to have some idea of what's going on in every industry. And we're going to move on to a real exploitation scenario. We're going to see more attack vectors, because there are many, many vectors out there. And of course, how we can protect ourselves from those attacks, how we can mitigate those as well. So let's start with a brief history here. We know that HTTP, the protocol that runs the web, is a stateless protocol.
It means every time you send a request to a server, you need to provide, again, a cookie or some other form of authentication, just like every time you go into any office, you need to take your badge and swipe it again. And it doesn't matter how many times you were there. Even after 15 years, the CEO, everyone, needs to put the badge on that entry system again and again, every time they go in. So the same applies for HTTP. Now, regarding session management and access control. With HTTP, this used to be done with cookies when we refer to authentication. So we used to use session cookies depending on the technology. We had PHP, we had Java, session ID, ASP, ColdFusion, all different kinds of technologies. And there, we used to send the session cookie to authenticate against a web server. The same goes for authorization. Authorization used to be done only on the server side. It means checking whether you're an administrator or whether you have the right permissions to access certain resources. So we used to do that only on the server side. Everything was there and only there. This also has some disadvantages, because it means that we need to have a special database, or some log file, with all these records, with all these sessions, and every time the user tries to authenticate against that server, it needs to go and look up where that record is, whether the session ID even exists, and then move on from there. And obviously that's a little bit slower than using other technologies. And of course, we know that cookies, unless you use the right flags, and there are a few of them, are vulnerable to XSS attacks, which is a very common attack that you are probably all aware of. So cookies nowadays, this is what we see on almost every server. You know, that's what they want.
They want to keep track of the user for advertising purposes. This is the main use of cookies nowadays. So let's move on to some JOT fundamentals. And you're probably saying, hey, why do you say JOT? So by design, JWT stands for JSON Web Token. That's RFC 7519, just in case somebody wants to know the number. And the RFC itself says that you should pronounce it as "jot". I have no idea how they came up with such a name, but it's by design. And it's a standard for exchanging data between applications, any kind of applications. And optionally, and we need to emphasize that, they are cryptographically signed. We will see later why I must reinforce this specific point about optionally cryptographically signed. And where do we see those JOTs being used? We can see them in containerized web applications, any sort of microservices, traditional APIs, you know, anywhere. Any e-commerce site, any online shop that you visit, they probably make some use of APIs. Mobile applications. Container registries: if you have some sort of containers and you want to push or pull any image to that repository, the authorization is also done by using JOTs. But something that is maybe not so commonly known is that vaccination certificates for COVID also use some sort of JOT, okay? And as you can see here, it comes in the format of a QR code. But in the end, if you scan it, depending on the country, you will usually see a kind of JSON Web Token there, and you will see almost the same format as with any other web application. And that's important because you will see later on how you can take advantage of vulnerabilities to fake your own vaccination certificate. So just to summarize here, and also to compare, we have the evolution from standard cookies to JOTs. So first of all, we mentioned that with cookies, session tokens are stored on the server.
With JOTs, they're not, and that's an advantage because you don't need to keep track of every single token, where it is stored, and you don't need to store it on that specific server. And that lets us easily scale up. When you want to grow, you have 10 instances, you have 10 containers, you want to make hundreds or thousands of replicas. So we don't need to save that specific token and move it across this huge amount of resources. You just save it on the client side. You keep your secret in one place, and then you pull the secret and you can authenticate and authorize the user in hundreds or even thousands of different services. Another advantage here is that we can include more data. It's not just your authentication token and that's all. By using JOTs, you can include the date when this token was generated. For statistics or logging purposes, you can add roles, you can add the email, the name, you can have a lot of data, because in the end, this is the purpose of JOTs, exchanging some data between applications. And all this, of course, you can encrypt. And the same goes for cookies. It's not exclusive to JOTs, you can also encrypt cookies by using some basic algorithm. But here in JOT, it comes in very, very handy when you want to exchange data, sometimes sensitive data. Let's say you're a bank and you want to encrypt that data because you don't want the user to see it. I mean, we mentioned that the token is stored on the client side. So if you encrypt that data, you can transfer it between servers more securely. And that was just a brief overview of JOT fundamentals. But now, as Mr. Miyamoto says, let's talk more about JOTs. So we have some main components that a JOT, or JWT token, consists of. So we have, first of all, the header. The header, just to simplify this, is where you can define the validation algorithm.
So just to keep it simple, I chose to use HMAC with SHA-256. Okay, we'll see later that there are a few differences between this and the rest. So this is what we use the header for. And then the payload is actually the main part. It's what we actually want to see and, as an attacker, what we want to manipulate. Or as the developer or the defender, this is what we really want to protect. So here, just as an example, I gave it a name of Haruki Tanaka. I should add, this is just a claim. It's a kind of attribute. Again, we mentioned that these are just a few claims, a few attributes, and you can add more to your token. There's a full list of reserved or even custom claims or attributes that you can add. So this is just a very simplified version here. EXP is for expiration. This is one of the ways of revoking tokens. Once the expiration time has passed, we can validate and make sure that the token is not valid anymore. And of course we can also add the role, for authorization purposes, whether this user is an operator, an administrator, support, an editor, depending on the type of application that we're talking about. So we can add the role as well. And why not? We can also add the email. Don't try to email that. I'm sure that it doesn't exist, I just made it up. So there's no Haruki Tanaka on this till now. It works for the links validation. And lastly, we have the last component. So the third component is the signature. The signature is the cryptographic string that we use to validate both the header and the payload. And it just looks like some random, long, long string that we usually don't really understand. But let's keep this on the side for a moment and we'll come back to it later.
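To make the header and payload described above concrete, here is a minimal sketch in Python. The claim names `name`, `exp`, `role` and the `alg` value come from the talk; the email address, the one-hour lifetime, and the helper names `build_payload` and `is_expired` are assumptions made up for illustration.

```python
import json
import time

# Header: names the validation algorithm (HMAC with SHA-256, as in the talk).
header = {"alg": "HS256", "typ": "JWT"}


def build_payload(now: int) -> dict:
    """Build a simplified payload like the one on the slide."""
    return {
        "name": "Haruki Tanaka",
        "exp": now + 3600,                     # assumed lifetime: one hour
        "role": "operator",                    # used for authorization
        "email": "haruki.tanaka@example.com",  # hypothetical address
    }


def is_expired(payload: dict, now: int) -> bool:
    """A token is no longer valid once the current time reaches exp."""
    return now >= payload["exp"]


if __name__ == "__main__":
    now = int(time.time())
    payload = build_payload(now)
    print(json.dumps(header))
    print(json.dumps(payload))
    print("expired:", is_expired(payload, now))
```

The expiration check is only one of the validations a server would need; it is shown here because `exp` is the claim the talk singles out as a way to stop accepting a token.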
So the format for JOTs, again excluding the signature, is the header that we have right here at the top. Then we have a dot. And then we have the payload. Again, I just simplified this to use only two attributes. And the way of transferring this to the client and back to the server is by encoding it into Base64. Now, we need to bear in mind here, this is not the common Base64 that we know. It's called Base64URL. There are some nuances there because we're talking about URLs. Characters such as the plus sign are replaced, with a minus sign in that case, so it's a little bit different from the common Base64. And this is how it looks. You can see here on the right, this is basically, again, just the encoded version of a very simple JOT token. Now, how does the generation flow work? Okay, we talked about the three main components and the basic format, but how is it actually generated? So first of all, we have a user that logs in. Of course, that's not a strong password. So don't try that. And after the user is successfully authenticated, the server generates the token. Okay? So this is just the version that we saw earlier. We have the algorithm here, which is HS256, HMAC with SHA-256. And the name and role and email, of course, are pulled from an internal database that is not exposed to any user. And then the server also encodes this to the Base64URL version. But then what happens here is the server signs this token. The server says, okay, hey, this is a stamp and this is what I'm going to use in order to validate this token, because if this token is manipulated later on, the server will know and it can either refuse it or accept it for further processing. So this is the resulting token here. We have the header at the top. We have a dot in the middle here. You can see that the dot is not encoded. And this is the payload.
And lastly, we have the signature. And these three parts are sent to the user. So as you can notice here, the user is already authenticated after these few steps. So this is how the generation flow usually works. Of course, we could have a much more complex scenario here with more endpoints and whatnot, but this is just to simplify the flow. So we talked about the signature. Now, what is the signature exactly? Why do we need a signature? So the signature, just to remind ourselves, is what actually securely validates the token. If there is no signature, it just means, hey, just come in. Like, I trust you. Like, why not? So it's calculated by taking the header and the payload, both converted into Base64URL, with a dot between the header and the payload. That resulting string is taken to the next step. And then there is the key. The key in this case, because we're using HMAC, is just a long password. So of course, the rule is that you need to use a strong password, you know, 8, 10, 12 characters and above, if you're going for the manual route, or you can use a UUID or whatever mechanism you use to generate a really strong and long key. But on the other hand, you can use a private key, of course, with another algorithm. This is a PKI system where you use private and public keys. But let's leave that for a minute. Now, what happens is that after you sign the token and you have this part, you send that to the user. So if even one bit, just one word, even just one dot, has changed in that token, then as soon as the token is sent back to the server, the server is going to say, hey, no, that's not my token, because it has been altered. This is not the same signature.
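The signature calculation just described can be sketched with Python's standard library alone. This is an illustrative sketch, not the speaker's slide code: the helper names `b64url` and `sign_hs256` and the demo key are assumptions. It follows the order used by JWS: each part is Base64URL-encoded, the two are joined with a dot, and the HMAC-SHA256 of that string becomes the third part.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64URL-encode and drop the '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(header: dict, payload: dict, key: bytes) -> str:
    """Return 'header.payload.signature' for an HS256 token."""
    encoded_header = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    encoded_payload = b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    signing_input = encoded_header + "." + encoded_payload
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


if __name__ == "__main__":
    demo_key = b"hypothetical-demo-key-0123456789abcdef"  # assumption, not a real key
    header = {"alg": "HS256", "typ": "JWT"}
    original = sign_hs256(header, {"name": "Haruki Tanaka", "role": "operator"}, demo_key)
    altered = sign_hs256(header, {"name": "Haruki Tanaka", "role": "admin"}, demo_key)
    print(original)
    # Changing a single claim yields a different signature.
    print(original.split(".")[2] != altered.split(".")[2])
```

Because the signature covers the encoded header and payload together, editing either part without knowing the key leaves the old signature attached to content it no longer matches.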
And as you can see, as reflected by this puzzle, there is going to be an inconsistency, because the server is going to say, hey, this is not the same. And not just by one bit. Just like with a hash function, it could be a completely different signature, a completely different string. So the server is going to say, hey, this has been manipulated, so I'm not going to accept that. And that's why this signature is very important. So the whole flow works like this. After the user is authenticated, let's say that the user wants to access his or her own bank account. Okay, let's say the ID, just for this example, is going to be one, and they want to view their account. They want to view their statement. They want to view whatever deposit they have in their account. So they send the previously provided token here. They send it to the resource provider. Okay, it could be any API endpoint. And then this server, obviously by sending it to an authentication server, is going to check the signature. The way of checking that is by regenerating the signature, using the same key and the same algorithm, which in this case is HS256. And then we compare that to the signature that the user has given us. Because we cannot just take the header and the payload and sign them again; that by itself doesn't tell us anything, right? We need to recompute it and compare it to what the user has provided. And if all goes well, okay, if no argument, no parameter, no value was tampered with, then the user will have access granted. Otherwise, nothing is going to be there. Okay? And we will see later why we need to validate not only the signature, but also other parts of the token as well, okay?
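The verification step described above, recomputing the signature with the same key and algorithm and comparing it with the one the user supplied, might look like the sketch below. The function names and the sample claim `account_id` are assumptions for illustration. Note that the verifier always uses HS256 itself and never reads the algorithm from the token.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the '=' padding that JWTs strip before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(payload: dict, key: bytes) -> str:
    """Helper for the demo: issue an HS256 token."""
    head = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    body = b64url(json.dumps(payload).encode("utf-8"))
    sig = hmac.new(key, (head + "." + body).encode("ascii"), hashlib.sha256).digest()
    return head + "." + body + "." + b64url(sig)


def verify_hs256(token: str, key: bytes) -> Optional[dict]:
    """Return the payload if the signature matches, otherwise None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    try:
        signing_input = (parts[0] + "." + parts[1]).encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        provided = b64url_decode(parts[2])
        # Constant-time comparison of recomputed vs. provided signature.
        if not hmac.compare_digest(expected, provided):
            return None
        return json.loads(b64url_decode(parts[1]))
    except ValueError:
        return None


if __name__ == "__main__":
    key = b"hypothetical-demo-key-0123456789abcdef"  # assumption
    token = make_token({"account_id": 1}, key)
    print(verify_hs256(token, key))
    head, body, sig = token.split(".")
    tampered = head + "." + b64url(json.dumps({"account_id": 2}).encode("utf-8")) + "." + sig
    print(verify_hs256(tampered, key))
```

`hmac.compare_digest` is used instead of `==` so the comparison time does not depend on how many leading bytes match.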
Because we should never assume that the user is really using HMAC-SHA256, or whatever else the token itself instructs. But that was a lot. Now, that was just some fundamentals, but what about the terminology? You have probably heard about some other terms, JWS, JWK, JWE, JWT, and it's just too much. Probably the first time that you learn about these terms, you're kind of overwhelmed. You don't even know where to move on from there. So for that, we have this simple Venn diagram that tries to simplify this. I know it can be a little bit overwhelming at first, but maybe after this explanation, it will be much easier. So first of all, we have JWT. JWT is just the essential format. It's basically saying that the way of transmitting data is by using JSON. That's mainly what the RFC, the specification, 7519 is talking about. Then we have JWS. JWS adds the signature, the way of validating these tokens, because otherwise, again, they're useless. We don't trust the user. We don't trust any information provided to us. And then we have JWE, which is just another extension of that. It could be both, but basically it adds the encryption layer. The payload is not just signed, it's also encrypted, because again, if we are a bank, we probably want to hide some data or some information about the user. For some other industries or organizations it's not required, but for some it is. And these two, the JWS signature and the encryption, are actually defined in something called JWA. JWA is just the way of saying, okay, you're going to encrypt, so which algorithms are you going to use? You're going to sign, you're going to validate, so which algorithms are you going to use? So these are defined in JWA. Then we have another specification called JWK. And JWK is for when you use public and private keys for signing and validating your tokens.
It basically defines the structure of how these keys are represented, as well as sets of public keys. So usually companies might have four, five, ten different public keys. In case some key has expired, or you need to revoke some key because it has been stolen for any reason, a company may have more than one, and this is where this is defined. Now because of that, you might think, okay, now what am I going to do? I'm going to sign, and we have PS, HS, just a lot of algorithms, a lot of choices, okay? You go to the store, you have 300 different potato chips. What are you going to choose? This one is secure, and this one is also secure, and this one too, and it just doesn't work. So for that, there's a group called JOSE (JSON Object Signing and Encryption), which tries to standardize everything here. They have a framework and they say, hey, please make use of this. If you want to use some other, stronger algorithms, you're free to do so, but this is the standard that we're trying to establish. So JOSE basically collects and compiles all these algorithms, and it tries to bring some order to all this. So let's dig a little deeper into this. We said that the algorithms are defined in JWA, and what do we have there? We have JWS, which is again for signing, and signing basically means integrity. We want to make sure that the token wasn't manipulated by anyone during transmission from the client to the server. So we have some examples here of what kind of algorithms we can use. We can use HS with HMAC, or RS with RSA. They all use SHA with different hash sizes, 256, 384, 512, and we also have ES for elliptic curve, or PS, which is RSA with PSS padding.
And basically the difference between them is, again, whether we use symmetric keys or asymmetric keys. If we use HMAC, so HS, it basically means that we are going to store a password, a long token, a UUID, whatever, on our server, and with the rest we're going to store our private key. Okay? But pay attention here. By design, there's also an algorithm called none. I have no idea why they did that, but yeah, it's there by design. You can specify none. You're not going to use any algorithm to actually verify that the token wasn't manipulated. It just basically means I trust everyone. Okay? Open house. And on the other hand, we have JWE. Now here, it's a little bit more complicated because we're getting into cryptography. We have a huge list of different algorithms that we can use, and that's for confidentiality. Okay? Integrity is for validating; confidentiality is for hiding data from the user or whoever is in the middle. And on the other hand, we have an example here of what a JWK, the key store, looks like. We have the key type here, EC for elliptic curve. So that's why we have X and Y. These are just two coordinates, two values that we need to provide. As well as the public key, there is the key ID that we use for that specific token. And basically this is all about these three, or actually four, different specifications. A little bit more about JWK is where it is located. Okay? It's a JSON file that in many cases is stored under these locations. It could be /.well-known/jwks.json, or under an OpenID or OAuth path, or simply under the root, /jwks.json, or in other maybe not so common places, like /api/keys. Or if we're using a version, one, two, three, it depends on the version that we use, or simply something custom such as JWK-set.json. Now this is important because we need to know where the token is leading us. In some cases we could see some more information that is stored over there.
And again, as an attacker, we want this kind of juicy information. But before we proceed, let's see some statistics. Let's see something interesting here. So APIs became crucial because they bring a lot of benefits, really a lot. Now by saying APIs, we also imply JOTs, because APIs usually work with JOTs, with JWT tokens. And we have here a classification by industry. So we see that in the financial sector, at least according to what stateofapis.com presents, we have at least 80%, 81% of the websites using APIs. It means almost everyone. And it's interesting that technology is a little bit lower, but technology and manufacturing, besides government, are at the top of that. For some reason, adoption takes a little bit more time with government agencies, but they still make heavy use of APIs. And that's why it's very important that APIs also imply JOTs. Now, let's talk about a real exploitation scenario. Okay, we talked about some attacks, some history, what we know, what we can do. So a real exploitation scenario: if not now, then when? Okay, now this is a real attack that happened not long ago. This is samokad.ru. It's a provider of electric scooters in Russia. And it is owned by Mail.ru, the biggest mail provider and ISP in Russia. So what happened here is that the signing key was predictable, and you can understand by yourself what that means. So the user was already authenticated against the server, because you can sign up freely. And then the token was sent to the server, but after being manipulated. We can say, hey, the role is set to user or operator, and then we can manipulate that and change it to admin. Okay, what's going to happen? Because the key was predictable, the token was actually valid. So when it reached the server here, the token was actually validated by the server.
And when the user requested any endpoint, the admin panel or any other privileged panel or privileged resource, access was granted. Basically that leads to a full account takeover. You could impersonate any user. You could do any action on the server, because why not? I mean, if the server trusts your token, and your token is not validated as it should be, and the signing key is not as strong as it should be, then that leads to a complete disaster. But let's see a little bit more about the technical data here. So we have the token here. Again, this is just a simplified version. There's not much data about exactly which fields were included there, but let's assume that they have this admin set to false. It just basically means you're not an admin. So by using this short script in Python, you can sign that yourself and change admin to true. Then the key here, that was the default key that they used, and it was predictable. They just used "secret". Nothing really fancy. So the token was signed and then the token was sent to some vulnerable endpoint. Just imagine the admin panel. So the token was validated and accepted by the server. Then the server here actually granted access to the user. Yeah, you are whoever you say you are. You are an administrator. Two minutes before, you didn't have access to that. You got 401 or whatever other HTTP error. You got access denied. But after that, you can actually see everything. You are an administrator. You don't even live in Russia. You don't even work for Mail.ru. But you could just do that because of this little role here and the problem with the key. Okay? So you might think that this is not realistic, it sounds maybe like a very stupid thing to do, but trust me, there are a few examples of others doing something similar to that. Now, besides that, we saw just one thing, the predictable password. Predictable signing key.
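The short Python script mentioned above is not reproduced in the transcript, so the following is only a sketch of the idea, run entirely against a locally generated token. The key `secret` and the `admin` claim come from the talk; the candidate wordlist and the function names `guess_key` and `forge` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Iterable, Optional


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(head: str, body: str, key: bytes) -> str:
    """HS256 signature over the already-encoded header and payload."""
    mac = hmac.new(key, (head + "." + body).encode("ascii"), hashlib.sha256)
    return b64url(mac.digest())


def guess_key(token: str, candidates: Iterable[bytes]) -> Optional[bytes]:
    """Return the first candidate key that reproduces the token's signature."""
    head, body, sig = token.split(".")
    for key in candidates:
        if hmac.compare_digest(sign(head, body, key), sig):
            return key
    return None


def forge(token: str, key: bytes, changes: dict) -> str:
    """Re-sign the token with modified claims, keeping the original header."""
    head, body, _ = token.split(".")
    payload = json.loads(b64url_decode(body))
    payload.update(changes)
    new_body = b64url(json.dumps(payload).encode("utf-8"))
    return head + "." + new_body + "." + sign(head, new_body, key)


if __name__ == "__main__":
    # Simulate a server that signs with the predictable key from the talk.
    head = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    body = b64url(json.dumps({"admin": False}).encode("utf-8"))
    issued = head + "." + body + "." + sign(head, body, b"secret")

    key = guess_key(issued, [b"password", b"changeme", b"secret"])  # hypothetical list
    print("recovered key:", key)
    print("forged token:", forge(issued, key, {"admin": True}))
```

The point of the sketch is defensive: once the key can be guessed offline from a single issued token, every signature check on the server passes for attacker-chosen claims, which is why a long random key is required.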
There are many other attack vectors here. So on one hand, we have the header, and we have something called algorithm mismatch. Here there is, again, just one example, one CVE of each, because there have been many of those. You swap the algorithm, for example between HMAC and RSA, and because the server accepts both, it gets confused and you can sign your own token. Or the use of none. Yeah? If you use none and the server accepts that, you don't need any key or any signature. It basically means, yeah, just bypass the validation there. Other cases are discrepancies between algorithms and signatures. Let's say you're using HMAC with SHA-256 and then you have a huge signature there; then it doesn't really make sense. I mean, it should be just 256 bits, but you send something like 4,000 bits, which doesn't really make sense. Or on the other hand, we also have injections. All kinds of injections. XSS injection, SQL injection, remote command injection, anything. Or even path traversal if we talk about the key ID parameter here. We have other vectors here. JKU, or X5U, or X5C, which are parameters that we can use to lead the server to verify the token by using a URL or by using a certificate located on an external endpoint. So that could lead to an SSRF and an open redirect. And again, that happened to Invicti. Again, there are many cases. We have cases here for many of these vulnerabilities, just like this one. This is a website, and it's really funny, it's called howmanydayssinceajwtalgnonevuln.com, and it basically keeps track of this kind of vulnerability. So it happened just 298 days ago. Maybe a little more. It was the Brazilian government's COVID certificate, and you can see here that this guy here, the hacker, was able to create a fake COVID certificate of the president himself. And the same thing happened in the UK and in more countries as well.
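A few of the header-level problems listed above (the `none` algorithm, an algorithm the server did not expect, and a signature whose size does not match the algorithm) can be rejected before any cryptography runs. The sketch below is one possible pre-check under the assumption that the server only ever issues HS256 tokens; `RejectedToken` and `check_header` are hypothetical names, and a real service would still verify the signature afterwards.

```python
import base64
import json

# HMAC-SHA256 always produces a 32-byte (256-bit) signature.
EXPECTED_SIG_BYTES = {"HS256": 32}


class RejectedToken(Exception):
    """Raised when a token fails the header pre-checks."""


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def check_header(token: str, allowed_alg: str = "HS256") -> dict:
    """Pin the algorithm on the server side instead of trusting the token."""
    parts = token.split(".")
    if len(parts) != 3:
        raise RejectedToken("token must have three parts")
    try:
        header = json.loads(b64url_decode(parts[0]))
        signature = b64url_decode(parts[2])
    except ValueError as exc:
        raise RejectedToken("malformed token") from exc
    if not isinstance(header, dict):
        raise RejectedToken("header is not a JSON object")
    # Rejects "none", "RS256", or anything else the server does not issue.
    if header.get("alg") != allowed_alg:
        raise RejectedToken("unexpected algorithm")
    expected_len = EXPECTED_SIG_BYTES.get(allowed_alg)
    if expected_len is not None and len(signature) != expected_len:
        raise RejectedToken("signature length does not match algorithm")
    return header


if __name__ == "__main__":
    body = b64url(json.dumps({"admin": True}).encode("utf-8"))
    unsigned = b64url(json.dumps({"alg": "none"}).encode("utf-8")) + "." + body + "."
    try:
        check_header(unsigned)
    except RejectedToken as err:
        print("rejected:", err)
```

Comparing against a single pinned value, rather than a list read from configuration or from the token, keeps the check hard to misconfigure; a service that really needs several algorithms would map each key to exactly one of them.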
So let's just go back to some more attack vectors here. On the payload, we also have other CVEs. We have deserialization attacks, if you use cty, the content type, and you manipulate that to use a serialization MIME type. You have JTI, or basically the absence of JTI, which is used for avoiding replay attacks. So that happens in commonlibrary.jot.net. Simply, claims were not validated. For any of these claims, any of these attributes, the value was not validated. They just trusted the user. Or in some other cases with the signature here, just what we saw now with this company here, the key was predictable. They didn't use a really strong key. Or other cases, which are more complicated, where, yeah, they were actually cryptographically signing the token, but there was an issue with the mathematics. So you could just input some random value and that could turn into a valid signature there. So how do we protect ourselves from that? How can we actually mitigate all these attack vectors? So obviously the first thing to do is to keep all these open source third-party libraries updated. Whatever you use, Python, jsonwebtoken, whatever you use, it should be updated. And never trust the user. You need to sanitize and validate everything. When I say everything, it means the algorithm, it means every single attribute, every single claim and value, okay? You never, ever trust your users. Especially with time-based claims, because they can also be manipulated: not before, issued at, expiration, there are three of those timestamp claims. Also, you need to keep a whitelist of URLs to verify against those key stores, okay? Because you don't want your user to provide 10.1.0.1, which is your internal admin endpoint, right? And also, it's arguable, but you should send your JOTs in your Authorization header. Otherwise, they will be in the local storage or the session storage, and that's not really recommended.
And of course, do not include any sensitive data. No passwords, nothing else, just whatever you need. And lastly, you need to properly revoke tokens. What happens if somebody stole the token, some man in the middle? You need to have ways of mitigating that as well. Of course, use strong keys, but not only that, use safe and secure places to store them, in a key management system, usually in the cloud. And of course, log and monitor, because you need to know when you keep receiving the same request, when there is a pattern of receiving the same token with just slightly different values. You need to know that you are under attack. And basically, that was all. And thank you for joining and listening. Any questions?
Cool, in Japanese as well. I'll repeat my question. So you mentioned the example of encrypting JWT tokens. So I'm curious, if we are talking about a web client in which the storage is accessible by the users, is there a scenario where they also use encrypted JWT tokens? And how does it benefit them?
Okay, that's a good question. Basically, when you send the encoded token to the user, you can put it into your terminal, pipe it to base64 -d, and just decode it. It's in another format, but it's readable by anyone. So if you apply cryptography to that, then the payload, the second part, is going to be just some random symbols there. It's going to look like binary. Even after decoding that, you're not going to be able to decrypt it unless you know the key, so usually you're not going to be able to see what's inside. So that's the encryption that JWE adds to JOTs.
Okay, so in that scenario, the encryption key would still be available to the client so that they will be able to decrypt this decoded JWT token. Is that a correct assumption?
That's a good question, but no. Actually, because everything happens on the server and it's just being sent to the client. Okay.
Actually, the key, you're right. The public key is there, but it's also encrypted.
Right.
So there's not really a way of decrypting that unless you have the key.
Got it. Thank you.
Welcome. Any more questions? Okay. Thank you so much.