Hi everyone. Thanks for spending time with me this afternoon, and my apologies for having you squeeze into this tiny room. If you can come even closer, you'll be able to see the code snippets I'm going to show later, so feel free to move forward. Let me start by introducing myself. I'm Atul Tulshibagwale, the CTO of SGNL (SGNL.ai), and we're in continuous authorization, or continuous access management. If you'd like to know what that is, you can go to our website after my talk. Some of you might know me from my work on a standard called the Continuous Access Evaluation Protocol, something I started when I was at Google and which has now grown into a pretty big movement. But this presentation is about another exciting standard I'm working on in the IETF called Transaction Tokens, which helps you secure identity and authorization in microservices. So let's get started. Why do we need this? As you know, in any modern architecture, external calls to an API or an application result in many internal calls to various microservices. In this diagram, the external API microservice encapsulates any network infrastructure you might have. The important thing is that these calls are short-lived; they don't exceed a few minutes of execution at any given point in time. Even if you have MapReduce or some large processes running, they ultimately break down into calls that individually last a few minutes or less. So you can think of the internal calls that propagate through your services as always being short-lived. Now, unfortunately, there are a lot of attacks possible where a VPC might be compromised. As you might know, some recent, pretty damaging attacks have resulted from the compromise of privileged users, which in turn compromised the VPCs of companies. And this can result in user impersonation or arbitrary code execution.
That is extremely damaging to any enterprise that experiences it. So obviously we need to do more about security than what we have. Today, most commonly, people use implicit trust: if you're in the VPC, you can just call any service, and it doesn't matter, because you're in the VPC, you're fine, you're trusted. Which is not great. Something that has come up recently is service-to-service trust, where you have some kind of trust infrastructure, and I'll get into how that works. You establish trust between one service and another and make sure that only services that are properly configured can call other services. What we want to get to is user trust. This means you have the service-to-service trust, but you can also be assured of the identity of the calling user. Let's say Joe is calling a service externally; an internal service cannot change that and say, oh, instead of Joe, it's actually Atul who is calling the other service. That's what I mean by user trust. And then finally you get to assured context, which means that if Joe is making an external call saying, I want to buy 100 shares of Microsoft, an internal service cannot say it's Joe, but he actually wants to sell 1,000 shares of Google. You cannot do that when you have assured context. So as you can see, the level of security you get increases as you go down this scale. Practically speaking, most of us are up here right now, save for a few companies that have implemented their own ways of doing this. But as an industry, we need to be down there in order to get secure outcomes and for people to be able to trust what we are doing. So how do you get there? Before we do, let's talk about how a microservice infrastructure can be attacked. The most common thing is privileged user compromise.
Like I was saying, this is something that has happened very frequently recently, in some highly publicized attacks. It can be done through credential compromise or through session hijacking, where somebody steals the token after you've done all your strong authentication. So it doesn't matter that you've implemented the best passkeys or whatever security mechanism you want; if the session gets hijacked after that, then you're lost. What the attacker can do is make spurious calls, or insert their own services into the VPC. It's effectively remote code execution in the cloud, a terrible, terrible compromise. The next one is malicious insiders, and they may have different motivations. You may have people trying to benefit financially on the side while doing their job, or disgruntled employees who want to compromise something about their employer. But there is also the category of curious insiders. If I have sensitive data about my customers and a customer service rep says, hey, let me see where Taylor Swift lives, because she's one of our customers, you shouldn't allow that kind of thing to happen, because it's a huge liability for the company. So you want to protect against malicious insider attacks as well. And then finally, you have the software bill of materials, or software supply chain, kind of compromise, where you think your service is secure, but somewhere in your CI/CD process some code gets in that calls home and then pulls down code or is able to modify the behavior of the running service. To protect against these, and other attacks that I maybe haven't included in this presentation, let's talk about how to do that. Before we get into the specifics, let's just say that trust is fundamental to all of this.
You get privacy, security, and integrity through PKI and ultimately certificates, but trust establishment is the fundamental thing underneath it all. And how is that done? It's done using two things: either transport-level security like TLS, or digital signatures. Those are the most common ways of doing it; maybe there are others, but for practical purposes, this is it. Now, in order to verify signatures and certificates, you need to know the public keys. For that you have the public key infrastructure, like public CAs, but you also have mechanisms for distributing public keys within your VPC. If you're using SPIFFE or SPIRE, you'll have a way of distributing these keys inside your enterprise. Just be aware that that is a key place to compromise, because if somebody is able to change the roots of trust in your trust domain, then all bets are off; that is the basis of the security you have in your VPC or your infrastructure. Now, I talked about the four models, so I'm going to go a little into the details of what those four models are. The first is implicit trust in microservices. You've deployed things into your infrastructure, into your VPC, and just by virtue of their existence in the VPC, you trust that anybody calling my microservice is fine, because they're inside the VPC. I'm not going to check anything about the caller, and I'm just going to respond with anything. This is catastrophic when the VPC gets compromised, and if you analyze some of the recent attacks, you'll see this has happened. Unfortunately, this is a problem today. The next step is to identify the service that is calling. If there are two services, or maybe five services, you've configured each service to say: this service may only be called by these other two services.
When those services call, using SPIFFE or something similar, you identify that this is the service calling me, and you only trust those calls; you don't let other services call you. As a result, you get better security, because an attacker can't just inject their own service into the VPC and start calling. Or if a relatively unused service is compromised, that's okay, because it's not configured to call anything, and the damage can be mitigated. But you still have a lot of unprotected vectors. Now, like I said, you see that little green logo over there? That's the SPIFFE logo, and if you haven't looked into SPIFFE, I would highly recommend it. It gives you a great way to do service-to-service trust. There's actually a little booth on the show floor about SPIFFE; you can go over there after the talk and talk to those folks. I think it's the most common way people are doing service-to-service trust right now. So let's talk about where we want to go: user trust. What does that mean? It means you can be assured not just of the service that is calling, but, if an external user, say Joe, is calling, then every service in the call chain will know that it is Joe making the call, and that cannot be changed. How to do that we'll get into, but that's the meaning of user trust. It mitigates a lot of attacks, because now you cannot do user impersonation in the call chain, even if you attack the VPC. But at the same time, you can still change the parameters. The best security you can get is assured context. What that means is that you're not just assured of which user is calling; you're also assured that this is what they are expressing that they want to do.
So if it's Joe calling and saying, I want to buy 100 shares of Microsoft, then down the call chain you will know that, yes, that was indeed how the initial call was made, and there's no way to change these things. So how do you get there? We are introducing this concept called TraTs, Transaction Tokens. This is currently an individual draft in the IETF OAuth working group. I couldn't go to that IETF meeting because I'm attending and speaking here, but it is actually being proposed for adoption as a working group draft soon, so hopefully in a few months' time we should see an RFC come out of all this. So what are they? They're basically just short-lived JWTs that assure the call context and the user identity. So I've told you what TraTs are, and I suppose we could end the presentation here, but let's get into a little bit of detail. Like I said, a TraT is a short-lived JWT, and it uniquely identifies a specific call chain. When an external call comes in, it sets that context and is able to identify down the chain what the user was trying to do, when they called, all those details, and I'll get into what those things are. It assures the user identity. It assigns a transaction identifier so that the whole call chain hangs off of that one external transaction. It has the originator information: which endpoint was called, what user IP address the call came from, and so on. It carries the purpose of the call, and I'll get into what the purpose is. And then the transaction context: what is the user actually trying to do, what are the parameters? Some things may not be in the external call, but you can have computed values, like the assurance level of the user. All of that together comprises a transaction token.
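To make that concrete, here is a minimal sketch of what a TraT claim set might look like. The claim names (`txn`, `sub_id`, `rctx`, `purp`, `azd`) follow the draft as described in this talk, but the issuer URL, trust domain, and every value below are illustrative assumptions, not taken from the spec.

```python
import time
import uuid

# Illustrative TraT claim set; claim names follow the draft, values are made up.
trat_claims = {
    "iss": "https://trat-server.internal.example",  # transaction token server (hypothetical)
    "iat": int(time.time()),
    "exp": int(time.time()) + 300,                  # short-lived: five minutes or less
    "aud": "trust-domain.example",                  # the internal trust domain
    "txn": str(uuid.uuid4()),                       # stays constant across the call chain
    "sub_id": {"format": "email", "email": "joe@example.com"},  # the external user
    "rctx": {"req_ip": "198.51.100.7", "endpoint": "/trade"},   # originator information
    "purp": "trade.execute",                        # what the caller is trying to do
    "azd": {"ticker": "MSFT", "quantity": 100, "side": "buy"},  # immutable call context
}
```

Every downstream service sees this same claim set, signed by the TraT server, for the lifetime of the call chain.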
The benefits here are basically that you can limit damage even if your VPC is compromised, and you do that by providing an immutable context throughout the call chain. You can configure each service to say: unless I can verify that transaction token, I'm not going to do anything. As a result, if any spurious calls are made or any parameters are changed, it doesn't matter, because the transaction token cannot be modified. So you now have a way of preventing damage. There is one limitation in the way transaction tokens work today, and there's some future work I'll get into later in the talk: it's still possible for a service in the middle to grab a transaction token and reuse it, bypassing some other checks in between, because there is no call-chain information inside the transaction token. I'll get into how to fix that. But for now, because transaction tokens are short-lived (less than five minutes, or however you configure it, but less than what it takes to execute the entire call chain, which typically won't be more than a few minutes), the chance of reusing a transaction token in a replay attack is very small. So it's okay to accept that possibility for the sake of efficiency. What you'd really like is: I know that Joe called to buy 100 shares of Microsoft, but has the fraud protection service actually processed that call? I need to know it was executed before I can execute the trade. If you can have that kind of assurance, that would be even better, but that's something we are not covering in the current spec; we are proposing it for later, and I'll get into that. So let's talk about what exactly goes into a transaction token. There's an issuer, which says who created the transaction token, and we'll get into how it gets created. There's an issued-at time and an expiration time.
Like I said, the expiration time is typically a very small window. The audience is the trust domain in which that transaction token can be used. Then there is the transaction identifier, which stays constant throughout the call chain, the subject identifier, the request context, the purpose of the call, and the authorization details. Let's get into all those things now. The subject identifiers in transaction tokens are defined by a related spec that is soon to be an RFC in the IETF, the Subject Identifiers for Security Event Tokens spec, and it defines subjects in different ways. You can have simple subjects that have only one component, like an email address, a user identifier, a phone number, or whatever way you choose to identify a user. That provides a lot of flexibility. Or it can be a complex subject. A complex subject is basically something that needs multiple entries to identify the same subject. In this case you're saying this is the user with this email address, but I also want to say for which tenant in my system that user is acting right now. The same user might be in two different tenants, and you want to specify the tenant. So you can use a complex subject to clarify which particular tenant that user belongs to, or whatever you need to specify a subject uniquely. There are also some proposed changes to this, where we could use the `sub` field of a JWT to specify a subject as a simple string, and that would be a shortcut for any of these things. The other important claim, like I said, is the request context, which is a claim that identifies the originating component: this came from this particular API gateway, or this call came into this particular endpoint.
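As a sketch, a simple subject needs a single component, while a complex subject combines several; the shapes below follow the Subject Identifiers conventions described in the talk, and the tenant component, the `opaque` format choice, and all values are illustrative assumptions.

```python
# Simple subject: a single component identifies the user.
simple_subject = {"format": "email", "email": "joe@example.com"}

# Complex subject: multiple components are needed to pin down the principal,
# e.g. the same user acting in a particular tenant (illustrative structure).
complex_subject = {
    "user": {"format": "email", "email": "joe@example.com"},
    "tenant": {"format": "opaque", "id": "tenant-42"},
}
```

The complex form lets the TraT server disambiguate which tenant context Joe is acting in without minting a second identifier for him.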
Or this call originated from this IP address, all of those things that may be important to services down the call chain. It can also include other environmental information, like maybe your time zone, or other things that can be useful for processing. The next one is the purpose. The transaction token server, which is going to issue that transaction token based on the external call, includes a representation of what the caller is trying to do; we capture that with this field called the purpose. The reason this is important is that you don't want the transaction token to be misused in a way it wasn't intended to be used in the first place. If Joe is calling to execute a trade, that token should not be usable to say, well, give me historical data for this stock, or something like that. You can think of this like an OAuth scope, but we are deliberately not using the word scope here, because you don't want to confuse transaction tokens with OAuth tokens; they are completely different things, and I'll get into that a little later. And finally, the most important thing in the transaction token is the authorization details. This could be parameters that were specified in the external request, or things that are inferred from those parameters. For example, the last item you see over here is the user level, and the user level is claimed to be VIP. That is something that was inserted by the issuer of the transaction token; it wasn't something that came in on the external call.
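A sketch of such an authorization-details claim, mixing values echoed from the external request with a value the TraT server computes itself; all field names and values here are hypothetical.

```python
# Authorization details: some fields echo the external request, others are
# inserted by the TraT server and become immutable for the rest of the chain.
authorization_details = {
    "ticker": "MSFT",     # from the external request
    "quantity": 100,      # from the external request
    "side": "buy",        # from the external request
    "user_level": "VIP",  # computed and inserted by the TraT server, not the caller
}
```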
So it could be a combination of things that come in on the external call and things that the transaction token service decides need to be specified, and it is immutable through the call chain. What this gives you is basically a complete picture of what the user is trying to do, where the call originated from, and what the purpose of the call is. So let's take a quick look at how this whole thing works. This is sort of a flow diagram. The box at the top here is the external microservice that is going to be called, and, like I said, it encapsulates any network infrastructure you might have, firewalls, API gateways and all that, and from there hang all these internal services that are called directly. And this is the new thing we're introducing, the transaction token server. So let's see how it works. What happens is the end user or the external application invokes the API microservice, and the API microservice presents the authentication information. It could be an OAuth token, or it could be something else. It presents that authentication information, the call context, everything that comes in, to the transaction token server and says, okay, now mint me a transaction token based on this. An important thing to note: let's say you get an OAuth access token or an OIDC ID token in the input. You're not going to put that into the TraT. The TraT is its own thing; the inbound token is just used to assure the transaction token service that this actually represents a legitimate request. Why this works is that if you can control the evolution of your API microservice, then you can be sure about issuing a transaction token securely. All the rest of the services in your call chain can have a CI/CD pipeline that is not as secure as what is needed to update your API microservice.
As a result, you get better security, because if one of these internal microservices gets compromised, the damage becomes extremely limited. So let's go into what happens next. The transaction token server now verifies that request, makes sure it understands what the user is trying to do, verifies the trust on everything, does its own computation about what to put into the transaction token, and then issues the TraT back to the API microservice. That transaction token is then propagated down the call chain, all the way to any service that requires it. So this is basically how transaction tokens work. Okay, so how does the actual communication between the API microservice and the transaction token server work? It uses a spec called OAuth Token Exchange. There is an existing RFC on how you can do a token exchange with an OAuth server, and we're using that same spec to issue transaction tokens, even though what we're issuing are not OAuth or OIDC tokens; they are transaction tokens. Now, there is a specific way in which you have to use the OAuth Token Exchange protocol in order to get transaction tokens. These are the things you need to do: the subject_token field has to be the external token that was used to authorize the external call, the subject_token_type is the type of the external token you got, and the rctx parameter contains the information needed to generate the TraT. I'm sure you have a few questions about it; I'm happy to explain after the talk.
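As a sketch, the token-exchange request body might be assembled like this. The grant type and `subject_token` parameters come from the OAuth Token Exchange RFC; the `rctx` parameter name and its encoding follow the talk's description but may differ from the current draft, and the audience, token values, and context fields are all illustrative.

```python
import json
from urllib.parse import urlencode

# Sketch of an OAuth Token Exchange (RFC 8693) request used to mint a TraT.
# The rctx name/encoding and all values here are illustrative assumptions.
params = {
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    "audience": "trust-domain.example",
    "subject_token": "eyJhbGciOiJSUzI1NiJ9...",  # the inbound external token
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "rctx": json.dumps({"req_ip": "198.51.100.7", "endpoint": "/trade"}),
}
body = urlencode(params)  # form-encoded body POSTed to the TraT server's token endpoint
```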
Now, in response to an OAuth Token Exchange request, the transaction token server does all its computation of whether this is a legitimate request and whether a transaction token should be issued, and then issues a particular transaction token; I've explained all the fields to you. The token type in the response is the transaction token type, txn_token. This is how the OAuth Token Exchange protocol fields are used to provide the response, and there's no refresh token or anything like that in the response; that isn't part of the TraT creation process. Now, there is an additional case where, somewhere down the call chain, you may want to specialize that transaction token even further, or you may want to modify something about it in a way that doesn't damage the security of the call chain. It's a very sensitive operation, but it is required, unfortunately, in many cases. So you can also request a replacement transaction token. In that case, what you send along in the OAuth Token Exchange request is the TraT itself, and maybe any additional things, indicating to the TraT server: I want this new transaction token to replace the existing one, and this is what I'm trying to do. It's very important for the TraT server to make sure that it's not negating the security of the call chain by issuing this replacement transaction token. But by making all those checks, it can still issue a new transaction token back to the requesting service. Like I said, there is a lot of caution to be exercised in this process, and this is what we recommend in the spec.
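A sketch of what the token-exchange response might look like. The `issued_token_type` URN follows the talk's txn_token naming, the `token_type` value follows RFC 8693's convention for tokens that are not usable as access tokens, and the token value itself is a placeholder.

```python
import json

# Illustrative token-exchange response carrying a TraT (values are placeholders).
response_body = """{
  "token_type": "N_A",
  "issued_token_type": "urn:ietf:params:oauth:token-type:txn_token",
  "access_token": "eyJhbGciOiJSUzI1NiJ9.PAYLOAD.SIG"
}"""
response = json.loads(response_body)

# Per the talk: no refresh token accompanies a TraT.
assert "refresh_token" not in response
```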
There's no hard requirement, you can do what you want, but at least these are the things you should be careful about. You should not change the purpose arbitrarily, and you should not suddenly make the token looser than it was. Try to specialize it more, try to assert more values rather than removing values from the transaction token, and so on. Now, I told you there is still a possibility that, say, there's a call chain with a compromised service that tries to bypass the fraud service and directly calls the settlement service, to take a stock trading operation as an example. The current transaction tokens spec does not have a way to identify the call chain. To address that, what we are proposing is that we can have nested transaction tokens. The slide may not be very clear, but the idea is that a service in between can sign the transaction token itself. Let's say the external caller, Joe, is trying to buy a hundred shares of Microsoft. There's a fraud prevention service in between that does all kinds of checks and says: okay, I've processed this transaction token, and I need the downstream services to know that I've processed it, so I'm going to sign this transaction token myself. That's what creates the nested transaction token, and then that gets propagated down. Like I said, this is not yet in the spec; it's something we are working on. The drawback is that transaction tokens tend to get bloated as a result, because now you have intermediate services appending their own signatures. As long as those services are very limited, not every service doing this, you should be okay, but there is some additional consideration. There's also the problem that those services now have to be trusted, because you cannot let them evolve the same way the other services otherwise would.
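The nesting idea can be sketched with a toy JWT signer: the intermediate service wraps the TraT it received as the payload of a new token that it signs itself. This uses symmetric HS256 via the standard library purely for illustration; real TraTs would use asymmetric keys distributed within the trust domain, and all keys, claims, and service names below are hypothetical.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, key: bytes) -> str:
    # Toy HS256 JWT signer, for illustration only.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = b64url(hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

# The TraT as minted by the TraT server (hypothetical claims and key).
received_trat = sign_jwt({"txn": "abc-123", "purp": "trade.execute"}, b"trat-server-key")

# The fraud-prevention service attests "I processed this" by signing a new
# token whose payload carries the original TraT: a nested transaction token.
nested = sign_jwt({"trat": received_trat, "svc": "fraud-prevention"}, b"fraud-service-key")

# Downstream services can unwrap the nesting and recover the original TraT intact.
seg = nested.split(".")[1]
inner = json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))
```

This also shows the bloat concern: each attesting intermediary adds a full header and signature around the token it received.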
You'd be vulnerable to the same risks you're trying to protect against. All right, that was it. We still have seven minutes for questions. If you want to know about SGNL, the QR code on the left will tell you, and if you want to know about TraTs, the QR code on the right will give you more information. Thank you, everyone. If you would like to ask questions, please come up to the mics, so that it's easier for remote attendees to listen. Yes. Hi, thank you for the talk. Do I understand correctly that if there's a call chain, then service one will repurpose a token so that the operation of service two can happen? But then, if you have to enforce some policies, the transaction token server has to be aware of the flow of the operations in order to constrain it. No, what the transaction token server is trying to do is basically just assure something about that call chain. It doesn't really need to know which services the token is going to get propagated to, because all it's trying to say is: this external caller is this person, or this application, and these are the things that are immutable about this transaction. Let's say it's a share settlement transaction: these are the shares we're talking about, this is the quantity, and any other context that needs to be preserved throughout the call chain. It doesn't need to know which service is going to get invoked. But then, in your buying-shares example, there's a token that allows you to buy the shares, but then you need to request the balance of the account. So how would service number two know that, okay, I can use a buy-shares token to give arbitrary account numbers? No, what the service is going to do is check the content of the TraT. It has the public key of the TraT server, and it's going to verify that this TraT came from that server.
Then it's going to say: okay, this allows me to do only these things, and so it can only do those things. Even if it tries to modify something about that TraT, the downstream service is going to reject it, because the signature won't verify on the TraT. Okay, thank you. Sure. Yeah, my question is regarding the transaction token server. How does it verify that the claims it's going to put in the JWT are correct? Does that make sense? Like, how does it know that the API service isn't lying about which user made this request? Right, that's a great question, because that is exactly the point: you're now limiting the trust to be just with the external endpoint. The internal services in your infrastructure can evolve much more rapidly without a lot of checks, but the external endpoint, which is your API microservice, or the user-facing microservice, needs to be secure, because the transaction token server is going to trust that microservice to give it information like the call context. It can give you the incoming OAuth token, so that microservice won't be able to impersonate someone else, because there's an OIDC token, let's say, which has the identity of the user in it. As a result, you won't be able to trick the TraT server into saying, oh, it's not Joe, it's actually Atul calling, because the OIDC token in the request will say who is actually calling. But yes, there is a high level of trust between the API, or initial, microservice and the TraT server. This might be a dumb question, but this reminds me a little bit of OpenTelemetry, and how you have a root trace and then spans, with telemetry being emitted by each individual service toward an outbound collector. I'm wondering if there's any inspiration or anything useful there.
I'm not familiar with OpenTelemetry, but just based on what you said, there's been some discussion about how these TraTs can be used to log the execution of a particular call, and that is definitely an offline use of TraTs, but it's not something that is core to the problem we are solving. I guess what I'm thinking is, say you have the propagation, you have the root context, so you're passing along this root span for the request. Each service has its own span ID underneath the root trace ID. If you wanted to do an authorization check, say that the fraud service ran against it (OTel isn't used this way right now, but I could imagine reaching out to this authorization system out of band, not carrying that information with the JWT, but to check that that happened). Yeah, I was just trying to find the slide that has that information. So the trace ID is like the txn here, right? But we don't have a span ID, because each service cannot modify the TraT; it's minted at the beginning of the call and just used afterwards, unless you want a replacement, in which case you have to go back to the TraT server, and you don't want to do that all the time, because otherwise you're going to make that TraT server work really hard. So that's the gap I see, because in OpenTelemetry, well, I don't know about OpenTelemetry, but in other cases you need a trace ID and a span ID, like in Dapper, I don't know if you know about Dapper. That's what I'm familiar with: you need both of those, and the TraT can only give you the txn, the trace ID, and not the span ID. Thank you. Thank you for the session. So, question: the transaction token never goes back to the user, right? It's basically... No, it's only inside the... Inside. If you look at the audience value, it's the trust domain that you have internally. It doesn't have any meaning outside that trust domain.
Okay, so second question: you said this can prevent session hijacking, right? It can prevent session hijacking of a privileged user, but if an end user gets hijacked, then there's no impersonation protection, because you're now treating the end user as making that specific call. But it's not a catastrophic compromise. Let's say Joe's session gets hijacked: you can only make calls as Joe, you cannot make calls as Atul or somebody else. But inside, also, let's say I have privileged access, I pass a session or some sort of a token, and with that you created a transaction token. So what if somebody inside compromises the initial token of an insider? Then they can act on his behalf, with all the privileges, even with the TraT, right? Right, so, I mean, you're using these transaction tokens to execute normal user requests. If you're using these transaction tokens to do privileged user requests, like modifying services and all that, I think that's a slightly different use case, and maybe we can discuss that, because I don't know how that would work, but I hope I answered your question. We are out of time, so maybe we can take the discussion offline. But thank you, everyone, for sitting through the talk.