My name is Ryan Lane. I'm from the Wikimedia Foundation, where I run a project called Wikimedia Labs, and today I'm going to talk to you about LDAP integration for OpenStack.

So I'm going to do a show of hands. Who in here is an LDAP administrator? I apologize to you for the rant that is about to come from me. There's a thing about LDAP, and it's that basically no one actually understands LDAP. LDAP administrators understand LDAP, but the people who actually want to use LDAP don't really get it. I've thought about why that is. People understand SQL, they understand NoSQL. There's a trend toward NoSQL for a lot of things because people don't want to deal with MySQL: they don't want to deal with the schema, they don't want to have to know how to write their queries properly and handle indexing and things like that. And they also don't want to deal with database administrators, because database administrators are kind of assholes. But this is much, much worse when it comes to LDAP. LDAP has a very strict schema, and even worse than that, LDAP administrators are inflexible assholes. When you actually want to do anything inside of LDAP, you can't. You can't do any writes. You can't change the schema, because the LDAP administrators won't let you.

And this doesn't really make any sense. I'm also an LDAP administrator, and I've been through this in the past, and I've really thought about it. In the way that OpenStack works, and the way the cloud works now, we're moving away from a model where operators are in control of everything. With LDAP, this is one of those places where we need to give up some control. So that's the question: why are we keeping control of these things? There's a mantra that I think every LDAP administrator should chant to themselves before they tell a user no: LDAP is just another data store. It's just a place to store information; specifically, information with a strict schema that a number of different client applications understand precisely because it has a strict schema. Part of it is also that LDAP is used for authentication and authorization, and LDAP administrators are seriously paranoid. They're worried you're going to write something into the LDAP database that other client applications will then accidentally use to grant authorization. But realistically, LDAP has very fine-grained access controls, so there are definitely ways to control this and make it work inside of a cloud environment. And that's the end of my rant. My talk is actually about integrating LDAP with OpenStack, but to integrate LDAP with OpenStack really well, you have to give up some control. So that's what we're doing inside of Wikimedia Labs.

Some background on Wikimedia Labs. Its mission is to enable the world to treat Wikimedia's infrastructure like any of its projects, in that anyone can edit it. On Wikipedia, you go to a page, you click Edit, you type some text in, you hit Save; as a regular user, you can do these things. So Wikimedia Labs is meant to be a cloud environment where anyone in the world can come, join a project, start doing some infrastructure work, and push changes into Puppet, and we can deploy them to a top-five website.
We're actually giving anyone in the world root-level access to our actual technical infrastructure. With this, our OpenStack cloud is somewhat different from public clouds: we have the Labs infrastructure itself, which we give a production level of support to, and inside of it we have projects. These projects reflect real-world projects. We have a tools project, which is partially maintained by the operations team but also partially maintained by a volunteer community. In it we have things like bots that edit our different projects, like Wikipedia and Commons and Wikidata, et cetera. We also have tools that people run that are too expensive to run on our production cluster but are fine to run here, because if they break, they break, and their maintainers have to deal with it. We also have some other projects like deployment-prep, which is a beta version of our environment used for QA, and we have community-made projects. In these projects we give a user a project; they can create virtual machines, they can get floating IP addresses. Just random people on the internet can have this access, as long as it's Wikimedia-related.

Inside of this we have a very large number of fairly non-technical users. These users have a basic understanding of how to write code and write bots that edit Wikipedia and things like that, but they're not operations people. They need a little bit of hand-holding, and they need some things done for them automatically. So there are a number of things that we do inside of our projects on the user's behalf. The project is just a regular OpenStack project, but we've extended this concept in a number of ways. For instance, inside of a project, users can create instances, and inside of those instances they have immediate access to shared storage that's just for their project. So it's per-project shared storage; hopefully we'll do more with that in the future. Then we also have something called service groups. These are per-project users and groups that users can create themselves and that span all of the instances. We also have MySQL replicas from our production environment: we take the data that's in the Wikipedia databases and replicate it into this environment in a sanitized way, so that people can run queries for research or tools or bots or counter-vandalism directly inside of this environment. We've also pulled the idea of SSH keys out of OpenStack entirely; I'll explain why soon. The thing about these MySQL replicas is that they're available to every single project inside of the entire environment, which is something that's not really supported by something like Trove, so we've had to do some interesting things to make that work. And all of these things, for the most part, we're doing with LDAP, and we're using LDAP to create a multi-tenant environment inside of our OpenStack environment.

So this is our directory information tree. In it, we have the Keystone information: the users and the projects. Then we have a set of global information that's available to every single instance across every single project: user information, group information, and hosts for DNS (which will hopefully move to Designate; the host entries are used by Puppet as well). Then we also have autofs information that's global, and sudoers information that is global.
This is so that we can set up global shares that every instance can access, like home directories and per-project shared data storage that's available automatically when people create instances or projects. And the global sudoers information is there so that we, as the operations team, have root on every single instance.

So here are our user objects. There's a set of information used for Keystone authentication, and a set used for shell authentication. The username, password, and email address are for Keystone; the uid (the login name), the UID number, the GID number, and the SSH keys are for shell authentication. Similarly, here are the instance objects and host objects; these live in the hosts OU. The instance object holds both the DNS information for each instance and the Puppet information for each instance. Then we have separate host objects, also in the hosts OU, used for public IP and public DNS information. We're using PowerDNS with an LDAP backend for all of this information; it points into the same places in LDAP.

Then we also have global groups in the groups OU, including a couple of sets of special groups. We have a shell group: adding someone to it allows them to get shell access in any of the projects, but it doesn't actually grant them access anywhere yet; it just controls whether they're allowed to have it at all. So if we remove shell from someone, they're removed from the entire environment. Then we have a set of project groups: every single project that we create in OpenStack gets a global POSIX group that can be used inside of all of the instances in all of the projects, so you can do access control inside of the projects using group information. You can set up cross-project NFS shares, PAM rules, and a number of other things like that for authorization.

Then we have project-specific information. Looking at the DIT, this sits under Keystone's information, and under each individual project we have a full set of OUs that are specific to that project: users and groups, sudoers, autofs, and anything else we want to add in there. It's basically a per-project root for each different project, and the instances can be configured to use those roots.

For autofs, the project-level autofs information is manageable by the project admins. We have the global autofs that makes things accessible to every single project, but we also allow users to add their own autofs entries, so they can set up shared file systems, maybe through Manila or something like that, and have those automatically created as autofs entries so that all of their instances get that share. These, of course, override the global policies. For instance, if a project doesn't want to share the global home directories, it can override them with a project-specific set of home directories; same thing for the project storage, et cetera. For the sudoers information, similarly, we have global policies, and these are merged with the project-specific policies. Project-specific policies can't override the global policies, so they can't take root away from the operations team.
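To make that concrete, here's a rough sketch of the shape of the tree and of a user entry. The suffix, DN names, and values here are illustrative, not our literal production entries:

```
# Rough shape of the tree (suffix and names are illustrative):
#
#   dc=example,dc=org
#     ou=people      global user accounts
#     ou=groups      global POSIX groups (shell, project-<name>, ...)
#     ou=hosts       instance and public DNS entries (PowerDNS, Puppet)
#     ou=sudoers     global sudo policies (ops root everywhere)
#     ou=autofs      global automount maps (home dirs, project storage)
#     ou=projects    Keystone projects, each with its own sub-OUs:
#       cn=tools
#         ou=people    service-group users for this project
#         ou=groups    service groups for this project
#         ou=sudoers   project-local sudo policies
#         ou=autofs    project-local automount maps

# A user entry carrying both the Keystone attributes (uid, userPassword, mail)
# and the shell attributes (uidNumber, gidNumber, sshPublicKey):
dn: uid=jdoe,ou=people,dc=example,dc=org
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: ldapPublicKey
uid: jdoe
uidNumber: 12345
gidNumber: 500
homeDirectory: /home/jdoe
loginShell: /bin/bash
mail: jdoe@example.org
userPassword: {SSHA}...
sshPublicKey: ssh-rsa AAAA...
```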
Back to the sudo policies: project admins can go in and make new policies saying that these sets of users or these sets of groups have root on these sets of instances inside of their project, so they have the ability to control access within their own project.

Similarly, we have a concept called service groups. These are the users and groups that we allow people to create inside of LDAP themselves. The idea is that users need to be able to create system accounts, but we manage everything through Puppet, and everything that goes in through Puppet has to be approved by the operations team, which puts a very large review burden on us. So instead of requiring people to push in system accounts for everything they do in their project, we give them the ability to create project-wide system accounts, and these are what we call service groups. When you create a service group, it creates both a user and a group with the same name, and you can add any project member to that group. One thing we also do is integrate the sudoers information with this: when a service group is created, a sudo policy is automatically created along with it. So you create the service group, which makes the user and group, plus a policy that says any member of this group is allowed to sudo to this user, thereby giving members access to the system account and letting them install software under it and things like that.

So here's the basic flow of how we do things inside of Labs. When a user creates an instance, our system automatically creates an LDAP entry for it, which holds the Puppet information and the host information for DNS. It also injects a small amount of user data, though we've mostly pulled that step out; we build images that have this information pre-populated. Then the instance joins Puppet. The puppetmaster sees that there is a signing request, checks LDAP to ensure this is actually an instance that exists in LDAP and was created by someone, signs the request, and then the instance builds. When the instance builds, Puppet automatically sets up a number of things on the system.

Looking at the instance configuration, this is the nslcd configuration. It sets up the base, which is the root of the entire directory information tree, and then it sets up global users and groups in nslcd.conf, pointing at the global OUs for groups and users (the people OU). After that, it sets up another set of bases, specific to the project itself: the same thing for groups and users, but using the project subtree. This exposes the service groups and users to every instance in the project.

We also use access.conf. The first thing we do there is allow root for everything, because the system breaks without that. The next thing we do is allow the service groups. We have a naming convention for the project groups, which I'll go over later when I discuss how we integrate Gerrit with all of this; it's a very specific naming convention, for compatibility reasons. So we allow the service groups next, and after that we ensure that the user is in the shell group; if the user is not in the shell group, they get denied access. After that, we allow every project member.
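Putting those two pieces together, here's a rough sketch of what the two files end up looking like; the DNs, project name, and group names are illustrative, not our exact production values:

```
# /etc/nslcd.conf (DNs illustrative): global bases plus the project subtree
uri ldap://ldap.example.org/
base dc=example,dc=org
# global users and groups, visible on every instance
base passwd ou=people,dc=example,dc=org
base group ou=groups,dc=example,dc=org
# this project's service-group users and groups
base passwd ou=people,cn=tools,ou=projects,dc=example,dc=org
base group ou=groups,cn=tools,ou=projects,dc=example,dc=org

# /etc/security/access.conf: evaluated top to bottom, first match wins
# root always allowed; things break otherwise
+ : root : ALL
# this project's service groups (names illustrative)
+ : (tools.admin) (tools.web) : ALL
# anyone not in the global shell group is denied
- : ALL EXCEPT (shell) : ALL
# members of this project are allowed, everyone else is denied
+ : (project-tools) : ALL
- : ALL : ALL
```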
So we're using the POSIX group that the user is a member of to allow or deny them access to that project: if someone is not a member of a project, they can't SSH into that project's instances.

And now I should go back to why we're pulling the SSH keys out of OpenStack. We don't allow OpenStack to inject SSH keys into our instances at all. Instead, we have a global NFS share that's read-only, and a script that pulls the SSH key information for each individual user from LDAP and writes it into per-user authorized keys files on that share. We set up SSH to... I think I have an SSH slide in here. I forgot the SSH slide, sorry. We set up the SSH daemon to read both from the local authorized keys file and from the NFS share holding each user's SSH keys. So when a user creates an account in Labs, their keys are automatically available in every single project, but they only actually get access if they're in the project group for that project.

Similarly for sudo. A lot of different tools have this concept of adding multiple search bases, so that's really easy. For the sudoers bases, the first one is the project subtree, which contains the policies users create themselves, and then there's the base one as well. Actually, I think this is flipped in production: this ordering would allow the global sudo policies to be overridden, so it's flipped in production. Similarly for autofs, you have the ability to set multiple bases, so putting the project subtree first lets it override the base configuration, and it lets autofs pull policy from multiple places. We also set things up in the options so that you can use variables inside of your map definitions: you can pass variables into autofs as options, and we use Puppet to set the project name there. That lets you reference your own project name from inside your autofs definitions.

So then we have Gerrit integration. Gerrit does not support the concept of multiple bases, so there are some configuration challenges that go along with that. Gerrit does have LDAP integration, and it can pull groups from LDAP, so the approach we went with is to point Gerrit's LDAP group configuration at the root of the tree. Because everything uses the posixGroup object class, it can pull both the global group information and each project's service groups, so in Gerrit you can grant access by project or by service group. You can create a repository and give access to it to an entire project, meaning the members of that project, or to individual service groups inside of that project. This lets each individual project or service group maintain its own code.

One of the things we had to do to make this work is make all of the group names globally unique. The naming scheme we went with for the project POSIX groups is project-<name>, and there was a reason we used the project- prefix. We originally did not, and then we had people creating projects with names like apache or root, which immediately gave the project members the root group or the apache group on the systems, which is really dangerous. So we prepend project- to make the group name unique on the systems themselves, per project.
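To make the SSH key, sudo, and autofs pieces above concrete, here's a rough sketch; the paths, option names, and DNs are illustrative and vary a bit by distribution and version:

```
# /etc/ssh/sshd_config: read keys from the local file and the read-only share
AuthorizedKeysFile .ssh/authorized_keys /public/keys/%u/.ssh/authorized_keys

# /etc/sudo-ldap.conf: multiple sudoers bases; the order decides which wins
sudoers_base ou=sudoers,cn=tools,ou=projects,dc=example,dc=org
sudoers_base ou=sudoers,dc=example,dc=org

# /etc/autofs.conf: project maps are searched first, then the global maps
search_base = ou=autofs,cn=tools,ou=projects,dc=example,dc=org
search_base = ou=autofs,dc=example,dc=org

# /etc/default/autofs: Puppet sets the project name as a map variable, so map
# entries can refer to $project
OPTIONS="-Dproject=tools"
```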
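And on the Gerrit side, pointing the group lookup at the root of the tree looks roughly like this in gerrit.config (values illustrative):

```
[ldap]
  server = ldap://ldap.example.org
  accountBase = ou=people,dc=example,dc=org
  accountPattern = (&(objectClass=posixAccount)(uid=${username}))
  # only one groupBase is possible, so point it at the root of the tree; it
  # then picks up the global project-<name> groups and every project's
  # service groups, since they all share the posixGroup object class
  groupBase = dc=example,dc=org
  groupMemberPattern = (&(objectClass=posixGroup)(memberUid=${username}))
```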
And then inside of the projects, we have the service groups, and the service group names we use are project-name.service-group-name. This makes each service group unique across all of the different projects in LDAP, so that you can use them in applications that don't have the ability to do a global base plus per-project subtree configuration.

SaltStack is an interesting one to integrate with LDAP; there are actually a lot of ways to make it multi-tenant. One is that you can use environments, so each project would be a different environment. That can be configured via LDAP by setting the minions' environment to the project name. Similarly, you can use Git for your pillar information, your state information, and some other things underneath Salt, so that it automatically switches between branches; you can set up a branch for each environment, and then people can push Salt code into their own branches, which they have access to through Gerrit via their projects. Additionally, you can set up a grain for each project: when Puppet runs, it adds grains into the Salt configuration so that, as an admin, you can target the instances inside of a project by grain. Unfortunately, grains are something users can just set themselves, so you can't really trust them. So one of the other things you can do is use pillars, which are global configuration managed by the server. Not only can you define pillars on the server itself, you can also pull them from an external system, which could be OpenStack itself or, in our situation, LDAP. With these, you can target the instances inside of a project by pillar, which is something users can't modify themselves. This gives operations engineers the ability to make calls across the entire environment or against individual projects inside of it. Past that, Salt also has a peer system that allows minions to call things on other minions, or sets of other minions, inside of their own project. By combining the pillar system and the peer system, you can grant one instance inside of a project the ability to make calls against all of the other instances in that project, thereby giving your users remote execution inside of their own project. There are a lot of other things you can do as well. One is salt-api, which provides a REST interface that you can write modules for, so you could give users the ability to pass calls into the Salt API, possibly with Keystone authentication (if someone writes the Keystone integration for this, please do, I want it). That would allow Keystone-authenticated remote execution calls from outside the entire environment, just like you would do with Nova. There are a ton of other possibilities with this as well.
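Here's a rough sketch of the pillar-plus-peer idea on the Salt side. The ext_pillar module name is hypothetical, the targets are illustrative, and older Salt releases call the tgt_type argument expr_form:

```
# /etc/salt/master
# Hypothetical external pillar that sets pillar['project'] for each minion
# from its LDAP entry (users cannot forge pillar data themselves).
ext_pillar:
  - project_from_ldap: {}

# Peer config: allow a designated minion to publish a limited set of
# functions to other minions.
peer:
  'tools-bastion.*':
    - test.ping
    - cmd.run

# As an admin on the master, target a whole project by pillar:
#   salt -I 'project:tools' cmd.run 'uptime'
# From the designated minion inside the project, call its project peers:
#   salt-call publish.publish 'project:tools' test.ping tgt_type='pillar'
```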
I would go over Puppet, but Puppet is actually fairly limited in how its LDAP integration works. We are using Puppet's LDAP integration, with LDAP as the node classifier. But unlike Salt, where you can use environments in the same kind of way by having branches and things like that, there's no real way in Puppet to control what users are allowed to run on the server, since manifests can run server-side functions. So if they push code in and do a Puppet run, they can own your master. There are definitely some limitations in what you can do with Puppet, which is why I'm not going to go into any detail there.

For MySQL itself, with the shared replicas, this is a disgusting hack and probably no one else should do this, but the idea behind it is kind of interesting. We have to give the users in all of the different projects, and the service users inside of those projects, access to the database information in the replicas. We also have to give them the ability to create their own databases, and to create tables alongside the replicas we're pushing out. For this, we create grants for the different service groups and users, but there's a limitation here: MySQL limits usernames to a small number of characters. So we had to come up with a scheme for mapping users to database accounts, and the scheme we came up with for service groups was p followed by the project's GID number and then u followed by the service user's UID number, so it's the project's group ID and the service user's ID mashed together. Similarly, for global users we just use u and the UID number, because those are unique across everything. We also allow users to create databases, and the grants we give allow them to do this. The only real reason we use grants for most of this is so that if a user in a project is abusing the system, making long queries, et cetera, we can just block them; realistically, the majority of the data is replicated from another source.

Then we also have NFS. Realistically, this is a terrible hack as well, and I would like to replace it with Manila as soon as possible. We have a script that pulls information from LDAP: since we inject the Puppet information and the host information into LDAP automatically when we create instances, we can use that information to populate the NFS shares. So we have scripts that say, give me back all of the instances in this project, we run them every once in a while, and we create shares with the IP information for the different projects. This is of course a problem, because if you don't run it often enough, then when instances are deleted and recreated, and new instances are created in other projects, there's a race condition in which new instances may end up with access to data in other projects. This is definitely something we want Manila for, because with Manila an instance being created or deleted could automatically trigger a change to the shares. We run this very often, once a minute, and we've made it fairly efficient so it doesn't kill us, but if we had a large enough DIT this would not work.

With NFS there's also a problem similar to MySQL, in that NFS has a 16-group limit: if a user is in more than that many groups, the server will not understand any of the extra groups. There are things you can do about this, and the one horrible thing that we've done is take a dump of all of the users and service groups that are in LDAP and stuff them into a local group file. This is not fun, but it's actually something that, even with Manila, you would have to do if you're going to do shares across multiple systems and have groups or users that are global at all.
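Here's a rough sketch (not our production script) of that exports-generation idea, using python-ldap; the base DN, filter, and attribute names are assumptions:

```python
import ldap

conn = ldap.initialize("ldap://ldap.example.org")
conn.simple_bind_s("cn=exports-reader,dc=example,dc=org", "secret")

def instance_ips(project):
    """IPs of every instance LDAP currently knows about for one project."""
    entries = conn.search_s(
        "ou=hosts,dc=example,dc=org", ldap.SCOPE_SUBTREE,
        # hypothetical filter: instance records carry an A record and a name
        # under the project's DNS domain
        "(&(aRecord=*)(associatedDomain=*.%s.example.org))" % project,
        ["aRecord"])
    return [ip.decode() for _dn, attrs in entries
            for ip in attrs.get("aRecord", [])]

def exports_line(project):
    """One /etc/exports line granting the project's instances access to its share."""
    clients = " ".join("%s(rw,no_subtree_check)" % ip for ip in instance_ips(project))
    return "/srv/project/%s %s" % (project, clients)

# Re-run frequently (we do it once a minute) to shrink the window in which a
# recycled IP could leave a new instance pointing at another project's share.
```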
This also puts another limitation on how we do users and groups: previously we had to change the system to make group names globally unique, and that was for Gerrit; with this system we also have to make the UID numbers and the GID numbers completely unique as well.

So there are some basic patterns that I'm using here. One is to use the project's OU for different things: if you want to limit access to a certain set of data, make an OU and point the system or service at that project's OU for a limited scope of information. Similarly, there's filtering global information by attributes: you can say, I only want instances back that have a project name of this value, and only get that information back. By doing that, you can have a flat, globally scoped OU that has all of the information, but filter it. And then there's the possibility of doing both, like what we're doing with Gerrit, where we have the per-project information and we also filter by POSIX group.

So are there any questions? So the question is: when a project is created, what actually creates the project information that goes along with it? Right, so in our system our web interface actually does this when people create projects or instances. We have some hooks in Nova that will also create some of this information. In the future we'll probably use Designate for DNS and we'll put hooks in there for that. But basically it should be set up on some kind of trigger, and whichever system is writing the information is going to have to have credentials to modify that OU. In those situations you would want to create user accounts for each of the different services, with limited access, that can only do things in the specific OUs. So you add ACIs to the OUs that allow those different services to do only their limited, scoped things.

So your question is: did I choose to use LDAP to provide global information rather than using Keystone? Realistically, I'm forced to use Keystone; I would probably use LDAP for everything if I could. Keystone is problematic in some ways, but realistically the answer is that I chose LDAP because I have the ability to extend LDAP in ways that the systems understand. LDAP has been a standard for a very long time and has a very strict schema. Different applications already have LDAP support, and that support is fairly configurable in how you actually use it. For instance, I can't extend Keystone to do a lot of the things that I'm doing, like the autofs information, et cetera, whereas with LDAP I can just extend an OU, put information in it, and scope things inside of those different OUs. So it's really more about flexibility than anything else.

So the question is whether I'm choosing one over the other, or if I'm really using them for different purposes: one for the system-level stuff and one for authentication and authorization. That's basically true. I'm using Keystone for OpenStack's authentication and authorization, and I'm using LDAP for basically everything else. We also have some other applications that are not integrated with Keystone at all, like Gerrit and a number of other web interfaces; Graphite, for example, uses this for authentication, and a number of other management tools use the same set of information where we can't use Keystone. So yeah, that's basically it: it's a way of extending something that other systems natively understand.
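Going back to those two patterns for a second, here's how they look side by side with python-ldap; the DNs and the filter attribute are illustrative:

```python
import ldap

conn = ldap.initialize("ldap://ldap.example.org")
conn.simple_bind_s("cn=reader,dc=example,dc=org", "secret")

# Pattern 1: scope the service to a per-project OU, so it can only ever
# see that project's data.
project_groups = conn.search_s(
    "ou=groups,cn=tools,ou=projects,dc=example,dc=org",
    ldap.SCOPE_SUBTREE, "(objectClass=posixGroup)")

# Pattern 2: keep one flat global OU and filter it by an attribute that
# names the project (the attribute here is hypothetical).
project_hosts = conn.search_s(
    "ou=hosts,dc=example,dc=org",
    ldap.SCOPE_SUBTREE, "(&(objectClass=device)(puppetVar=instanceproject=tools))")
```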
Okay, so the question is: how does Keystone integrate with LDAP in this situation? I should say that I started all of this when Nova was the only project, back around the Bexar release. There was some initial LDAP integration, and I rewrote portions of it, probably most of it, and that's what ended up being the Keystone schema. So Keystone was kind of built off Nova's LDAP integration, and I built this around Keystone's schema, or not the schema exactly, but the directory information tree. So I'm really just extending Keystone's directory information tree. The LDAP integration that goes along with this is standard Keystone integration. When you create a project, that project information is then available inside of Keystone. When you add users as members of that project, they're members of the project in Keystone; same thing with roles. We have role information underneath the different project OUs, and members are added inside of that as well, and all of that maps one to one.

The question is whether the password is still in the LDAP directory tree. Yes, and I don't use Keystone to do any user, group, or role management. Our web interface actually does all of the management: when a user creates an account in our web interface, it creates an LDAP account, and the password assigned is what the user enters, same thing with the email address and all of the other information. We also add SSH key information through that same web interface; we don't use Keystone for any of these things. Keystone uses LDAP in a read-only kind of way: it has the ability to pull that information back from LDAP, and when it does authentication it just does a simple bind against LDAP. I don't write to LDAP via Keystone now; that's not to say I wouldn't if Keystone had good write support. In this case the application that we use is writing to LDAP, and we're allowing the users to write to it, not Keystone.

Yes! Right. The comment was basically that Keystone in the Havana release has the ability to separate the authentication information from the authorization information, so you can authenticate against LDAP, where your user account information is stored, while your project and role information is stored in the Keystone database. In my specific use case it's actually much more beneficial to keep that information in LDAP, because then I can use it for other things. The Keystone integration is actually what's providing my multi-tenancy for other applications, which makes it really powerful, and which is again one of the reasons for the rant at the beginning: people should allow you to write into LDAP.

So the question is whether I think we should have a managed system that writes into both Keystone and LDAP. When you say Keystone and LDAP, do you mean Keystone? No, actually the way I would like things to go is for people to write into LDAP. The reason is that the point of LDAP is that it's a system with a strict schema that can be used by clients in a well-known way. If you're writing into MySQL, all of that information about projects and roles is gone for everyone else. You can only use it for OpenStack, and you can't use it for anything else at all, which kind of defeats the point of having an infrastructure that's multi-tenant.
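Since these questions are all about the Keystone side, here's a rough sketch of what a read-only LDAP identity backend looks like in keystone.conf; the option names come from Keystone's LDAP driver, and the values are illustrative:

```
[identity]
# the full class path on releases of that era; just "ldap" on newer ones
driver = keystone.identity.backends.ldap.Identity

[ldap]
url = ldap://ldap.example.org
user = uid=keystone,ou=people,dc=example,dc=org
password = secret
suffix = dc=example,dc=org
user_tree_dn = ou=people,dc=example,dc=org
user_objectclass = inetOrgPerson
user_id_attribute = uid
user_name_attribute = uid
# Keystone only reads; all account management happens through the web interface
user_allow_create = false
user_allow_update = false
user_allow_delete = false
# releases of that era could also keep projects and roles in LDAP via
# tenant_tree_dn and role_tree_dn
```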
The question is: does this affect the clients that actually talk to OpenStack? No, it doesn't, because they still talk Keystone-native. For anything OpenStack-native, they authenticate to Keystone, they get a token, then they get a scoped token, and they do what they're going to do. Any other questions?

So the question is: by giving users sudo access in one project, can they use that sudo access to hop into other projects, or at least read information from other projects? Yes and no. Giving people sudo access only gives them the ability to access things within their own project. They can make policies in their project, but the policies they can make only allow them to modify instance information, et cetera; we actually restrict that through a web interface. Realistically it wouldn't matter anyway: if we allowed them direct LDAP access to modify that and they put in instance names from other projects, it wouldn't affect anything, because instances only read from their own project's subtree, so that information is ignored by everyone else. Otherwise, project hopping is not actually a concern here; we put a lot of effort into ensuring project hopping can't occur. There are certain things you could probably do project hopping with. For instance, if you allow people to sudo and then write into the global home directories, it would be an issue if SSH keys were read from those global homes. But we actually split that apart and made it read-only, and things like that. So there is definitely care that has to be taken when doing global directory information trees, especially in regard to project hopping, because it is a concern. Any other questions? Thank you.