Let's get started. Welcome, everyone, and thank you for joining. My name is Preeti Desai, and I am an OpenStack evangelist working on various OpenStack projects, one of which is Keystone. We are all here to find out how Keystone can be configured across multiple OpenStack clouds or multiple data centers. Before jumping into the solution, let's go over the Keystone core concepts and how it can be configured in a single OpenStack cloud, and then take it to the next level.

Keystone is the heart of OpenStack: every other service integrates with it, making it a key service in the entire deployment. Keystone supports user management. A user is either a human or a service, and carries information such as a name, a password, and the domain it belongs to. Next are projects and domains. A project is a tenant; in a private cloud it maps to a group or a team in your organization. A domain is a group of projects; it can be a company in a public cloud, or another organization in a private cloud. A domain defines the administrative boundary around users and projects, and users and projects generally belong to one particular domain. Then we have roles, which define the set of operations a user can perform. You can create as many roles as you want, but a role is not meaningful until you assign it to a user on a certain domain or project. Once a role is assigned, the user can request a session token, get authenticated, and perform those operations.

The token is the most sensitive piece of information in OpenStack and Keystone. Whoever possesses a token effectively holds your identity: until it expires, that person can request operations as you. A token contains information such as the user, the domain the user belongs to, and the expiry time.
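These concepts map directly onto Keystone's v3 authentication API. Here is a minimal sketch of the JSON body a user would POST to `/v3/auth/tokens` to get a project-scoped token; the names (`demo`, `Default`, `myproject`) are placeholders, not values from our deployment:

```python
import json

# Sketch of a Keystone v3 token request: a user authenticates with a
# password and scopes the token to a project within a domain.
auth_request = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "name": "demo",                 # human or service user
                    "domain": {"name": "Default"},  # domain the user belongs to
                    "password": "secret",           # placeholder credential
                }
            },
        },
        # Scoping to a project: the roles the user holds on this project
        # determine which operations the returned token authorizes.
        "scope": {
            "project": {
                "name": "myproject",
                "domain": {"name": "Default"},
            }
        },
    }
}

print(json.dumps(auth_request, indent=2))
```

The response carries the token plus the service catalog, which matters later when we talk about token size.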
There are also different types of tokens. With a PKI token, all the information is in the token itself. A UUID token is just a hex string, and the actual information is stored in the Keystone database. With Kilo, Keystone is introducing a new format, Fernet, which sits between UUID and PKI in terms of size: PKI tokens are quite large, and UUID tokens are a decent size but have performance issues. A Fernet token again carries information such as the user and the expiry time, but the payload is encrypted, and only Keystone holds the symmetric keys needed to validate it. The token therefore does not have to be stored in any backend, SQL or memcached; Keystone can validate and regenerate it purely from its contents. That's Fernet.

Keystone also supports service management. All of your OpenStack services are registered in Keystone, and they can belong to one region or multiple regions. A region is a geographic grouping: it can be within one OpenStack cloud, within one rack, or a separate data center. Endpoints are associated with a service and with the region it belongs to.

That was a very high-level, five-minute overview of what Keystone is about. Now let's look at how we have architected it in our cloud; we run a private cloud. We have three Keystone instances running, all behind a load balancer, so there is one VIP for each of these services. We have a MySQL database storing information such as domains, roles, assignments, and the catalog, and we have LDAP for user and account information.
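Before moving on: the non-persisted token idea behind Fernet, described above, can be illustrated with a small sketch. This is not the real Fernet format (which uses AES encryption plus an HMAC, with rotating keys managed by Keystone); it is a stdlib-only illustration of the stateless property, that a keyed server can validate a token without any database lookup:

```python
import base64
import hashlib
import hmac
import json
import time

# Illustration only: the token carries its own payload plus a MAC, so the
# server validates it with a shared key instead of querying SQL/memcached.
SECRET = b"keystone-signing-key"  # placeholder; real keys live in a key repository

def issue(user_id, expires_in=3600):
    payload = json.dumps({"user": user_id, "exp": time.time() + expires_in})
    mac = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(f"{payload}|{mac}".encode()).decode()

def validate(token):
    payload, mac = base64.urlsafe_b64decode(token).decode().rsplit("|", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return None  # tampered
    data = json.loads(payload)
    if data["exp"] < time.time():
        return None  # expired
    return data["user"]

token = issue("alice")
print(validate(token))  # recovers the user with no database lookup
```

Any Keystone node holding the same keys can validate the token, which is exactly why Fernet needs no token table at all.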
All of those users and groups are stored in LDAP. We are using PKI tokens, and we have memcached for them, but we learned at this summit that there is no need to store PKI tokens in memcached, so we are going to revisit that and stop caching the tokens.

Now, at a very high level, here is how the authentication and authorization workflow runs in our infrastructure. In any OpenStack cloud, the user requests a token by sending a username and password. Keystone generates a session token and returns it. The user takes that token to Nova and requests a new VM. Since we are using PKI, Nova's authentication middleware verifies the token locally and grants access to the user. Nova then calls Glance; Glance does the same thing, verifying the token with its own authentication middleware, so it does not have to go back to Keystone for verification. Glance provisions whatever image is needed and returns it to Nova, the VM is created, and it is handed back to the user.

So now we have this entire Keystone deployment running on the west coast of the US. We are building a new data center on the east coast, and we were debating whether we could leverage the same Keystone in the west instead of deploying a new Keystone instance in the east. We looked at a few options, including federated identity. Federated identity is the hottest topic of this summit; we all saw the keynote, and it works pretty well. It's an amazing new feature. We looked at how we could leverage it, and I'll tell you later why we didn't go with it; then we have something we call global identity. We pretty much all know from the keynote how Keystone-to-Keystone federation works. Here's the high-level workflow.
We have the west coast Keystone running, and all the users are in LDAP in the west. In the east coast, we don't have to replicate that LDAP or provision users at all, and that's the biggest benefit of Keystone-to-Keystone federation. The west coast Keystone acts as the identity provider, and the east one as the service provider. To access any resources in the east, the user goes to the west and requests a SAML assertion; the west Keystone returns the assertion, and the user takes it to the east to get a session token there. Compare that with the flow we looked at a few slides ago, without federation: the user simply requests a token from the Keystone instance and the token is granted.

Okay, for what I'm going to say next, I think Steve might throw a tomato at me, but please don't. What we realized is that Keystone-to-Keystone federation in our private cloud would create a single point of failure: if the Keystone in US West is down for some reason, you cannot access any resources in the east either. For a private cloud running production applications, that was the biggest bottleneck, and we decided not to go that route. The workflow also changes the way you request a token: instead of one step, there are multiple steps; you need to get the SAML assertion, come back, and then go to the Keystone in the east.

After deciding against that option, we debated whether it was a good idea to replicate the entire MySQL and LDAP instead, and that's what we did. We have MySQL replication and LDAP replication across both data centers. The Keystone in the west has two regions: all the services from the west are registered in the west Keystone, and so are all the services from the east, and vice versa.
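For reference, the SAML-assertion step in the federation flow we evaluated looks roughly like this. The endpoint path and field names follow the OS-FEDERATION extension as of the Kilo era, to the best of my knowledge; the token and service-provider IDs are placeholders:

```python
import json

# Sketch of the Keystone-to-Keystone exchange: the user already holds a
# token from the west (identity-provider) Keystone and now asks it for a
# SAML assertion scoped to the east service provider.
saml_request = {
    "auth": {
        "identity": {
            "methods": ["token"],
            "token": {"id": "<token-from-west-keystone>"},  # placeholder
        },
        # The service_provider scope tells the IdP which registered SP
        # (the east Keystone) the assertion is intended for.
        "scope": {"service_provider": {"id": "keystone-east"}},
    }
}

# POST this body to the IdP at /v3/auth/OS-FEDERATION/saml2, then present
# the returned assertion to the east Keystone to obtain a local token.
# That is the extra round trip mentioned above, on top of the SPOF concern.
print(json.dumps(saml_request, indent=2))
```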
So there are two regions in each data center. The user goes to the west Keystone, requests a token and resources for that particular data center, gets the token back, and uses it for further processing; the same workflow applies in the east data center. Thanks to our infrastructure engineers, the way we have set up the replication gives us very low latency: if you create a project in the west, it shows up in the east with less than one second of latency.

Now the key question: once you are authenticated in the west, can you use the same token in the east? The answer is yes. Since we are using PKI tokens, it is possible. But my boss says it's not done until it's in production: I have tested that this works in DevStack, but it is not yet working in our production environment. Theoretically it should work and there should not be any issues, but we need to look into why it doesn't work there. Notice, though, that there are still two different Keystone endpoints, US West and US East.

This setup is highly available. SQL latency is one of the cons, but we are using MySQL Galera for replication and it's pretty good, so the latency is not high. Another con, and the biggest argument for federated identity, is that the identity has to be replicated in the east as well; with federated identity you don't have to replicate your identities.

So we took one step further, and this is our vision. We haven't implemented it yet, but this is what we want: a single global Keystone endpoint. Based on where the requests are coming from, a global load balancer should route each request to the nearest data center.
A token generated in one data center should also be usable in the other, and this can be applied to any number of data centers. Using PKI is the biggest advantage here: we can reuse the same token precisely because it is PKI; with UUID tokens that would not be possible.

So what are the problems with this approach, and what did we learn? We have a highly available Keystone across two data centers, but since we are using PKI, the token size is huge: the entire service catalog is embedded in the token. We were also using domain-specific drivers, and the biggest problem for us was orchestrating those; I'll go into the details shortly.

For the token size, we looked at Keystone's endpoint filtering feature. It provides dynamic filtering on multiple attributes: the underlying idea is that you can group endpoints based on their attributes and associate that group with a project. When anybody requests a token on that project, only those endpoints are returned in the PKI token, not the entire catalog of the cloud. Endpoints can be grouped on attributes such as the interface (public, admin, or internal), the service ID, the region ID, and whether the endpoint is enabled.

Here are a few examples. We now have two regions registered in each Keystone, so when a user requests a token, endpoints from both regions go into it, doubling the token size. We could instead group the endpoints by region, US West and US East.
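Grouping by region, as just described, might look like this with Keystone's OS-EP-FILTER extension. This is a sketch based on the extension's API; the group name, region ID, and the placeholder IDs in the comments are illustrative:

```python
import json

# Sketch of an endpoint group that keeps only one region's endpoints in
# issued tokens. Filters can match interface, service_id, region_id, or
# enabled; here we filter on region alone.
endpoint_group = {
    "endpoint_group": {
        "name": "us-west-endpoints",
        "description": "Only US-West endpoints in issued tokens",
        "filters": {"region_id": "us-west"},  # placeholder region ID
    }
}

# POST /v3/OS-EP-FILTER/endpoint_groups             -> create the group
# PUT  /v3/OS-EP-FILTER/endpoint_groups/{group_id}/projects/{project_id}
#   -> associate it with a project, so tokens scoped to that project carry
#      only the filtered catalog instead of every endpoint in the cloud.
print(json.dumps(endpoint_group, indent=2))
```

The project association in the second call is exactly the provisioning-workflow hook discussed below.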
Or we could filter at the service level. Say we have a particular service, Swift, for example: for projects created only for Swift, we assign only the Swift endpoint to that project. And you can combine service-level and regional grouping: you can say that all the core OpenStack services in US West belong to one group, and assign that group to the project. This is a very good Keystone feature and we highly recommend using it. Its biggest drawback is that the groups are associated at the project level, so it impacts the project provisioning workflow: whenever a project is created, we have to make sure we associate the group with it. But the biggest advantage is that it significantly reduces the token size. Today we don't have this feature enabled; it is well tested in our environment, and we have plans to enable it, delayed only by the project provisioning workflow.

Next are the domain-specific drivers. What a domain-specific driver does is this: if your identities come from LDAP, you can create organizational units, or apply some filtering on your LDAP directory structure, and associate only those units with a particular domain, so that thousands of unrelated users are not visible in that domain. The way a domain-specific driver works is that you create a config file with all the information: the user, credentials, where the LDAP server is, and which tree the users should come from. That file has to reside on the Keystone service box, and whenever you create a new domain, you have to restart the Keystone service.
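A per-domain config file of the kind described above typically looks something like the following sketch. The domain name, DNs, and filter values are placeholders, not our production settings:

```ini
# Sketch of a domain-specific config file, e.g.
# /etc/keystone/domains/keystone.acme.conf for a domain named "acme".
# Prior to Kilo this file must exist on every Keystone node, and the
# Keystone service must be restarted to pick it up.
[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
url = ldap://ldap.example.com
user = cn=keystone,ou=service-accounts,dc=example,dc=com
password = <service-account-password>
user_tree_dn = ou=users,dc=example,dc=com
# Restrict which entries this domain sees; this is the isolation
# trick discussed later with user_filter:
user_filter = (memberOf=cn=acme-users,ou=groups,dc=example,dc=com)
```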
With one box that is still okay, but we have three Keystone instances running in two different data centers. When we create a new domain, we have three files to create here, three files in the other data center, and then we have to restart the identity service; while it restarts, authentication and authorization are simply unavailable. The Keystone team has done a very good job proposing a new feature in Kilo where all of this is handled by a single API. It's amazing: the domain-specific configuration is now stored in MySQL, and you can add a new config, or update it later if you have changes for that particular domain, through that API. One API does all the magic. The JSON payload for creating the configuration provides the URL, the credentials, and the tree. There is one more interesting feature we use: the user filter, user_filter. All of our users come from the same organizational unit, but we add a filter based on the domain, which gives us true isolation of the identities.

Just remember one thing: for any number of data centers and any number of OpenStack clouds, Keystone is capable of working as the identity service for your cloud. I'll take any questions at this point.

You rock. Thank you. I know when she sees me come up to talk, she wonders what I'm going to say. No, you're awesome. One really nasty hack I had to put in, back in the days when Horizon didn't have any sort of memcache, was for when the PKI tokens got too big to fit into the cookies. I'm probably worthy of stoning for suggesting this, but on my suggestion, Horizon actually hashes the token, which means validation falls back to UUID-style online validation.
What this means for you is that you probably want to think of Horizon and Keystone as a unit. If Horizon accepts a token that was not issued by the Keystone it is going to validate against, validation will fail, because the hash won't be in that backend; hashed tokens won't fall back the way real PKI tokens do. So you want people to hit the Horizon next to the Keystone they are going to be using, if that makes sense. Sorry, that's more of an answer than a question, but the rest of this is fantastic. Thank you.

I guess I have more answers to give. You said that you're using PKI tokens and that they're great here because you can verify them offline without talking to the correct Keystone. But the caveat is that you can't get revocation events? No, we do. Okay, then now I have a question. The authentication middleware has the revocation events: it listens for the revocation events, and then the tokens are revoked. And those events go to both regions? Yes. Okay, thanks.

I have a question. By the way, besides PKI there is also a compressed token format called PKIZ, which reduces the size a lot. But you mentioned it's possible to avoid persisting the PKI token? Yes, we don't store PKI tokens in memcached or MySQL. But as mentioned, with Horizon you cannot save the full token in the session; you have to hash it, which forces a service call to the Keystone server side, and if the token is not persisted, you cannot validate whether it is valid. Horizon is hashing the token, right? Yes. So my point is: is anyone here running a production environment that truly does not store the token at all, and does that work well? Because at least from the Keystone logs, we are also using PKIZ and we still see a lot of traffic validating tokens on the server side.
So yeah, just a general question. Thanks. Any other questions?

Can you talk a little more about what you were saying with revocation? Are you using the revocation events or the old revocation list? If you're using revocation events, do you have custom code in there to fix the stuff that we don't quite have upstream? No, there's no custom code. We have enabled the events in the middleware, and it generates the events when a token is deleted.

Okay, so, just to make sure we've completely confused everybody: there are two different ways of dealing with revocations, and this is an evolving thing. The first is the revocation list, which revokes a token by its ID; that's what PKIZ uses by default, and it works by keeping the token in the database and tagging it as deleted. To generate the revocation list, we query the database for the list of revoked tokens and send that across. Then there are revocation events, which can revoke a broad class of tokens, for example "this user is gone, revoke all tokens for that user"; that's more complex code. It is currently in Kilo, but only on the server side, because it's being used by the Fernet token work that is coming out now. So if you're running that in the middleware, it's not something we've got upstream yet; we want it upstream. No, it's not upstream. So are you using the commits that are out there on review, or are you using the revocation list? No, we are not using the revocation list; we're using revocation events. It would be really cool to see how that's done. That's awesome. That scares me.

Hi, I've become a pretty big fan of Keystone v3 domains, and this time I heard a talk about project hierarchies. So I was wondering, can the two coexist, and what's the direction going forward?
I mean, do domains and project hierarchies conflict in any way? Just curious. Not in the scope of this topic, but yes: domains are top-level projects, and the hierarchy is going to be there. Domains are essentially an abstraction for collecting users at this point. I don't want to hijack the session; it's not really in this topic. But it's a Keystone session, that's fine.

You mentioned you're using a filter to filter out users in your LDAP, presumably because you have many thousands or hundreds of thousands of users and a full list would take forever; same as us. How do you construct that filter? Is it a group, or a group of groups? What methodology do you use to manage that filter and make it work for you? We have one OU representing all of our users, and we use a common name to filter the users per domain: whenever a new domain is created, we create a new common name for that domain and add it into the domain-specific driver. So you're filtering on OU, not by group? By OU. Okay, thank you. But you can also filter by group; Keystone has that capability too, and we have validated it. Again, it's not in our production environment, but you can even create groups.

Do you know if a group of groups works? Sorry, I didn't get that. A group of groups. Say you create a group for every project, for every tenant, and you give people the ability to add and remove users from that group. But you also want a single filter, so you create a group of groups that includes every group you've ever created, and you use that in the filter, so the filter selects just the people who are part of OpenStack. When they add more people to their individual projects, those people are included in the group of groups by reference. I can answer that.
Yes, that can work in concept, but it depends on the LDAP server implementation. In Preeti's slide you saw a memberOf filter, so she was doing a group filter there, but the server implementation matters. Active Directory does do some nesting, and I know 389 Directory Server handles it well, because I wrote it: you can nest groups, and memberOf will traverse the hierarchy and be present. So really it's a normal LDAP filter, and it depends on the grouping. Is that part of the LDAP client configuration, or is that on the LDAP server side? The LDAP server. Okay, thank you.

I don't have any more questions. Okay, thank you, guys. Thank you.