Good morning, everybody. Thanks for joining us, nice late start today. We're going to be talking about controlling access to OpenStack in the enterprise. Byline: this is not public cloud. So let's get cracking. Thank you. Finished? Okay.

So, who am I? My name is Sean O'Mara. I'm a senior systems architect at Mirantis. I'm also the architecture consulting practice lead for EMEA, Asia-Pacific and Russia, so all the Mirantis architects in this region work in my team. I've been in the IT industry for over 20 years, primarily focused on enterprise and various types of enterprise solutions. My first OpenStack public cloud was built on Essex, so I've been in the OpenStack community for a while. I've dealt with the pain. A lot of those clouds have been built in the enterprise, so this is some of my learning and some of the learning from Mirantis that we're going to be discussing today. And I'm going to be asking for a lot of feedback from you guys at the end of the session, and going on into the future, around these particular topics.

So, we'll discuss typical requirements seen in the enterprise. We'll talk about how OpenStack does access control today. Where is it falling short? What is being done? And what can we do as a community and as operators? I'm assuming most people in the room here are operators or are somehow involved in OpenStack. And then we'll talk about some of the RBAC solutions that are available.

So, just setting the scene: I'm talking about enterprise.
I'm not talking about public cloud. I'm talking about dealing with things like private clouds, hosted clouds, the typical snowflake clouds that we're all building. I'm not talking about any particular distro of OpenStack either. Public clouds are different: they have different immediate needs, the way that they're legislated is different, and the rules around public clouds are very different. We're talking about integrating OpenStack with that stuff from Big Vendor X. Every data center has somebody else's weird and wonderful auth system in place. We're colliding with the old world. We're colliding with a whole bunch of things that relate to governance and typical role hierarchies, and I'll talk more about that in a second. The one point I'd like to make is that a lot of the asks that we see from enterprises don't always make sense, but they need to be met anyway. It's not a case of "that doesn't make sense, we won't deal with it." Even if it doesn't make sense, sometimes we've got to deal with it, especially when it comes to authentication and auditors.

Okay. So what are the enterprise requirements that we're dealing with these days? Enterprises want different backends. There are typically three or four different types of authentication backend that we're seeing. It's caused by different org structures. It's caused by different trust levels. Go to any enterprise today and you're going to find everything from Microsoft Active Directory to Novell, I mean, the whole gamut, something from IBM, et cetera. We need to integrate with existing systems. A lot of these auth systems are in place. There are rules in place around them. The companies have spent many years and many man-hours setting these things up. We need to comply. Those of you who work in big organizations with big IT departments, you'll understand this. And then along come the auditors.

Single sign-on. Single sign-on is a critical component within a lot of organizations.
We're trying to cut down sprawl. We're trying to cut down on the number of username-password combinations and access points that users need to understand and deal with, both at an infrastructure level and at an application level.

Audit requirements. Show of hands: who's been through an IT general controls governance audit? I can't believe it's so few people in this room. An IT general controls governance audit is something that anybody who has worked in the financial services industry, anyone who's worked in a legislated industry, will understand, and we'll talk a bit more about separation of duties in a second.

And then governance. Governance is something that we often ignore as techies because we think we know better. I've been there. I've fallen into that trap. Governance comes in many forms. It can be corporate governance and rules that we have to apply, or it can be legislation. I live in Germany now. The German IT governance rules from the government are intense. I see some smiling faces, but they're intense, and I don't speak German, so I don't even read them.

Okay, so auth backends. One of the big things about auth backends in the enterprise is that we tend to have a lot of multi-tiered auth backends. We have different trust zones that we have to deal with. A typical example: you'll have an internal Active Directory, you'll have an LDAP solution for customers, and then you'll have third-party sign-in from vendors. All of those have different trust levels and different ways of gaining access to them within the enterprise. They have different rules and different roles, so we've got to handle things like role mapping. Those are enterprise requirements, and these are the things that we as OpenStack and the OpenStack community need to start learning to deal with. I mean, I've got some examples up there. We're talking about LDAP solutions.
We've got OpenLDAP. We've got vendor LDAP from IBM and various others. Active Directory uses Kerberos as an auth mechanism, which doesn't give us any of that extra information, so we've got to get role information elsewhere. And then one of the ones that's starting to get a lot of attention these days is federated solutions: Shibboleth IDP, SAML assertions, things like that, that we have to start dealing with in the community, and we need to be able to deal with those things well.

Now, a lot of the focus of what I'm talking about today will be on role-based access controls. We'll discuss that in a bit more depth, and how OpenStack is handling that. But what's important to understand about the authentication mechanisms being used is that, depending on where the data is coming from, we may have different levels of data available to us to achieve those goals.

A lot of our auth backends are also historical. You know, they've been around since the year dot. They've got user directories of hundreds of users who nobody really knows what they're for. And a lot of them are very slow. We're talking about OpenStack and how fast the APIs need to run; we've got to take that into account. So we're talking caching, cache handling of tokens and things like that. User separation I've talked about, and we'll talk more about that shortly, but it's about trusted users and role management.

So, other enterprise requirements. Granular access control in the enterprise is critical.
We're seeing a lot of customers talking to us about separating admin roles and reducing admin access. A lot of customers are saying: great, we want administrators to be able to create a tenant in OpenStack, but never get into that tenant. We want admins who can work at the API layer but have no access to the operating systems. A lot of those separations of duties are required within the enterprise. It is fairly nonsensical to those of us who are used to, you know, having root on a Linux box and being able to do whatever we please. We can't do that anymore. It comes back to the audit requirements around general controls and the separation of duties. Anyone who's done any work in the finance industry, as I said earlier: general controls auditing, separation of duties. That extends not just into the administration of OpenStack but into the consumption of OpenStack. And I'm focusing on OpenStack, but obviously this goes into the consumption of workloads and workload [unclear].

And the last one which I've got there is obviously corporate standards. This extends right from financial controls down to our technical controls.

Delegation of management is something which we want. It doesn't exist today in any meaningful form. We need a mechanism by which I can delegate portions of control of my environment to other users, but with very specific rights: to create users, say, but maybe not do anything else. Which means I'm starting to create very complex access hierarchies.

The other thing that's required is workflow. For all authentication, all access control to OpenStack and OpenStack resources, we need some sort of authorization, and I'm not talking about authorization here in the AAA sense, I'm talking about approval. If I want to increase my quota, someone needs to approve that increase. How are we handling that, and how do I control access to that? I'm not pretending to have answers to all of this, but I'm saying these are the requirements.

Okay, just a quick refresher on AAA.
What is AAA? Authentication, authorization and accounting. We need to handle each of these things within our environments. So it's: who are you, what can you do, and what did you do? I hope you like my little split log V. We're going to focus the rest of the discussion this morning mostly around authorization, how OpenStack is handling authorization today, and how we can move forward with that.

Okay, so RBAC: role-based access control. At some point in our lives we've all dealt with it. Those of you who have been around as long as I have have probably dealt with various group-level systems in Active Directory or LDAP, or OUs. Those are all role assignments.

So how's it done today? Within OpenStack today, every single project API does its own policy checking. The policy checking is done through the oslo.policy Enforcer class, and I'll talk about that a little bit further on. That policy checking is based on a flat file that is stored on the API service, on each instance of the API. So if you've got 10 Nova APIs running, you have 10 policy.json files. Currently there is no single store for that; it's currently available only as policy files. The policy file is essentially like a set of firewall rules: the user is compared to a rule, and it allows you to do certain things. The default policy set today has only got three users in it, four users, all of whom are basically some form of admin.
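As an illustration of the firewall-style matching just described, here is a minimal sketch of how a policy.json-like rule set could be evaluated. It is a toy evaluator written for this transcript, not oslo.policy itself: the rule names, the tiny syntax subset (atomic checks joined by " or "), and the deny-on-unknown-action behaviour are all assumptions made for the example.

```python
import json

# Hypothetical rules in the spirit of a policy.json file (not copied from any project).
POLICY_JSON = """
{
    "admin_required": "role:admin",
    "admin_or_owner": "rule:admin_required or project_id:%(project_id)s",
    "compute:start": "rule:admin_or_owner",
    "compute:delete_flavor": "rule:admin_required",
    "compute:list": ""
}
"""


def check(rules, rule_text, target, creds):
    """Evaluate a simplified rule: atomic checks joined by ' or '."""
    if rule_text == "":
        return True   # an empty rule always allows
    if rule_text == "!":
        return False  # '!' always denies
    for atom in rule_text.split(" or "):
        kind, _, match = atom.strip().partition(":")
        if kind == "rule":
            # Reference to another named rule.
            if check(rules, rules[match], target, creds):
                return True
        elif kind == "role":
            # The caller must hold the named role.
            if match in creds.get("roles", []):
                return True
        else:
            # Generic check: compare a credential field with a value
            # interpolated from the target of the API call.
            try:
                expected = match % target
            except KeyError:
                continue
            if kind in creds and str(creds[kind]) == expected:
                return True
    return False


def enforce(rules, action, target, creds):
    """Return True if creds may perform action on target."""
    if action not in rules:
        return False  # assumption for this sketch: unknown actions are denied
    return check(rules, rules[action], target, creds)
```

A member of project p1 would be allowed to call `compute:start` on a p1 resource through the owner branch, while `compute:delete_flavor` stays admin-only. Even in this toy form, rules referencing other rules make it easy to lose track of who can do what.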
It's very difficult to change this. It's not impossible, but it's complex, it can get very messy, and you have to have distribution systems to get that policy everywhere.

Keystone currently handles user CRUD and role and project assignment. Role creation all sits within Keystone, and the backends are typically LDAP for authentication, or SQL, or now federation through some sort of SAML 2 assertion. Assignment backends within OpenStack currently come out of the SQL database, whatever SQL database you're using in the backend. That's where your assignments are for Keystone. The challenge with that is that it's very easy to create extra roles within Keystone; it's a lot harder to assign specific rights to those roles and to get the hierarchy correct, so that those roles are actually assigned correctly.

So, the Enforcer class within oslo.policy is the class that handles the loading of the rules, the checking of the rules, and the pass or fail of the rules within OpenStack. The Enforcer class is slowly being worked on; there are a number of new enhancements coming within oslo.policy. For those of you who are interested in the code base, it's all available. There's a lot going on. I'll talk about a few of the blueprints in a second. But it's a very simple class.
Within that class there's an enforce call. Every single time you make an API call within OpenStack, enforce is called. Every single call you make, enforce is called. The rule sets are loaded every time you make a call, or rather, they're now cached as well, so that's not strictly true, but if there's a change to the policy file, it's reloaded. If you start thinking about this, it's adding an extra potential delay cycle to an API call. So if you've got a very busy cloud, you have to realize that with every call there's an extra delay coming along. So when you try to change that system out for something else, you have to be very aware of the performance.

So, policy.json. For those of you who haven't seen a policy.json file yet, the works: that is what a policy.json file looks like. It's a set of rules that map an API call to a role or a user. There are also a number of special checks available in policy.json. A newish one is an HTTP call: you can actually add a rule that passes certain information across using an HTTP call and gets an answer back. Now, I spoke earlier about speed. Can you imagine a thousand API calls a second that have to make a thousand HTTP calls a second to some arbitrary HTTP server that's got to process something? You're now starting to add not milliseconds but seconds into your call cycle. I mean, just think about the HTTP setup, never mind everything else that goes along with that.

So, the whole policy.json mechanism: it was great when it was first put in place. The teams are doing a lot of work to try and make it better. But we need a lot more user input, a lot more operations input to the community, to explain what's required from enterprise, from operators.

Okay, a couple of new features that have come around. There is the capability now to load policy files from an API. Keystone has an API backend for policy. Nobody's using it.
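Going back to the point that rules are cached and only reloaded when the policy file changes, the idea can be sketched roughly as below. This is a standalone illustration, not oslo.policy's implementation: the class name `CachedRules`, the `reloads` counter, and the use of the file's modification time as the change signal are assumptions made for the example.

```python
import json
import os


class CachedRules:
    """Keep parsed rules in memory; re-read the file only when it changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None   # modification time of the version we parsed
        self._rules = {}
        self.reloads = 0     # counts how often the file was actually parsed

    def get(self):
        # One stat() per call is cheap compared with parsing the file.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path) as handle:
                self._rules = json.load(handle)
            self._mtime = mtime
            self.reloads += 1
        return self._rules
```

Every API call would go through `get()`, so the per-call cost is a file stat plus a dictionary lookup. Anything that replaces the file with a network lookup has to compete with that.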
None of the projects are actually using it at the moment. But the ability is there, and the ability is there to now use authorize instead of enforce to make those calls.

So what is RBAC currently lacking within OpenStack? There are no native auditing capabilities. Yes, we've got the logs, but those don't really say whether things were applied or not. It's just a fall-through mechanism. The only time you get a log is if there's a failure; you don't get positive logging currently. For those of you who have to deal with general controls auditing, positive logging is seriously required. It's a critical component. It's one of those things that we have to be able to prove. We have to go back six months, a year sometimes, and show that yes, that person accessed that resource at that time. We have to prove that, and we have to have those logs in a way that we can place reliance on.

There is no method right now, built in, that allows us to programmatically modify the rules for RBAC. Those rules are currently, as I said, just this flat-file model. They are all created outside of the creation of the API call currently, and I'll come to what's happening in the future in a second. But what it means is I can't do, for example, time-based access controls. I can't say an admin can do something between 8 and 5 but not between 5 and 7. I can't do that, because there's no programmatic way to change it. Yes, before someone argues with me, there are external ways of doing it, but they all have different pitfalls and problems.

Lack of synchronization. I spoke earlier about the APIs. If I'm running a massively scaled cloud and I want to scale my APIs, so I want to put 20 Nova API instances out there to help me scale, then every time I create an instance I need to make sure that the policy file is copied identically across each one of them. That's a big challenge because there's no centralized store.
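One way an operator might at least detect the synchronization problem described here is to fingerprint each API node's policy file and compare it with a reference copy. The sketch below assumes the file contents have already been collected from each node by some other means; the node names and the helper names `fingerprint` and `find_drift` are hypothetical.

```python
import hashlib
import json


def fingerprint(policy_text):
    """Hash the parsed policy so whitespace and key order do not matter."""
    canonical = json.dumps(
        json.loads(policy_text), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(reference_text, node_policies):
    """Return the sorted names of nodes whose policy differs from the reference."""
    expected = fingerprint(reference_text)
    return sorted(
        node
        for node, text in node_policies.items()
        if fingerprint(text) != expected
    )
```

A check like this only reports drift after the fact. It does not remove the underlying issue that there is no central store.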
I'm also in a situation, from a control perspective, where if I get it wrong on one, if I leave the default policy on one of those APIs and somebody hits that API and gets to do something, then (a) I won't know about it, because I haven't logged it, and (b) they've got different access rights, all because of an admin mistake or a failure of automation. It happens. Stuff doesn't copy correctly.

It has a terrible format. I mean, I'll go back to this, but working out what's actually happening in policy.json is painful. It's truly painful. I've written some complex policy.json files. Making sure that the rules actually apply in the order you think they're applying requires so much testing and so much review that you're never quite sure, because there's always going to be that weird thing that somebody tries to do that you didn't think about, and oh wow, they're allowed to do that and they're not supposed to.

Multiple locations, as I've mentioned, create ambiguity. It makes us unable to place reliance on these things, and I keep using this phrase, "place reliance". A general controls audit is about placing reliance on data and systems. We have to be able to place reliance.

And then finally, it creates massive separation-of-duties issues. For me to create a new user who has control over just a small portion means I have to go and review everything else within the stack. The other thing is that an overly long policy file takes longer to load, uses more memory and takes longer to process. The complexity just grows and grows.

So what's happening in the community today? The Oslo team is doing a lot of work, along with the Keystone team, to programmatically create defaults and the default rule sets for the JSON. The way this is going to be handled, and I don't want to go into too much dev here, is that basically within the code, within each API call, there will be a default JSON or policy call which can be read out to create the policy file automatically. The advantage of that is, you
know, you're always going to have the same defaults. It's not somebody's bad copy and paste. It's not a fiddle. So every time you spin up a new copy of the API, you get a new copy of the policy file, which you can modify, but it makes it a lot simpler to manage. A big part of this is that with the policy.json files today, if I go into GitHub and pull the policy.json file for Nova from trunk, and if I pull it from somebody else, I have no guarantee that they're going to be the same thing, because people modify them all the time.

The other big change, and this is the one which I really like, is that instead of using the current language for assertions, we're going to start using YAML. What that means: hierarchy. We can use YAML hierarchies to define policy. We're going to be able to put comments in. You can kind of put comments in the policy files today, but more often than not they break the policy assertion. So we're going to put comments in, and we're going to be able to explain what we're trying to achieve.

All of this is great. It's a way forward. But it still doesn't solve the basic problem that the stuff is in files on the APIs, and I'm not linking it back to my organizational data coming out of my directory structures.

So what are we trying to achieve? This has been going on for almost a year now. We've been doing a lot of work on this. We spoke about this in Austin. We have a working demo. We're looking at creating a pluggable enforcer class for oslo.policy. Currently they are not accepting a pluggable enforcer class; there are a few blueprints out there for it. What that means is that I'll be able to come along, pull out the current enforcer class, plug in my own enforcer model, and then do those lookups and that RBAC verification against another backend. Typically, in an enterprise,
I'm going to do that against LDAP or some sort of LDAP-based tree system. If you have a massive SQL database that holds your enterprise roles, great, we can use that. But this is where, as a community, we need people to start talking about what's required from an enterprise perspective, openly, on the dev channels, in the community channels, in the community user group talks, so that the developers can start to understand the importance of this. I think they understand the importance, but we need, as a community, to be pushing for it.

We need to be able to centralize the assignment and policy backends. The current situation is that we're coding roles in one place that have no relation to policies assigned in another place. We need to fix that; we need to pull those closer together. What we need to be aware of when doing this, and I've mentioned it before, is performance. This mechanism needs to be fast. It cannot be a delayed, slow mechanism. And for the loading of rules, we need to make a decision about whether they're real-time rules or cached, and that comes back to affecting performance.

So where are we? There is a blueprint, "make enforce call pluggable", that was logged six or seven months ago now. Please go in, have a look and comment on that blueprint. The idea is that we're going to create a pluggable enforcer class, which means that, just like a lot of the other OpenStack projects, we have essentially created a driver. This takes away a lot of the argument; it takes away a lot of the challenges. Yes, we'll still have the OpenStack reference mechanism with policy.json, but let's move forward. Let's allow other people to move forward. And the driver for oslo.policy is related to the "make enforce call pluggable" blueprint. The code is there.
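To make the driver idea concrete, here is a rough sketch of what a pluggable enforcer could look like in principle. It does not reproduce the blueprint's design or any oslo.policy API: the class names, the `load_driver` helper, and the `lookup` callable standing in for a directory query are all hypothetical.

```python
from abc import ABC, abstractmethod


class EnforcerDriver(ABC):
    """Common interface: every backend answers the same yes/no question."""

    @abstractmethod
    def enforce(self, action, target, creds):
        """Return True if creds may perform action on target."""


class FileDriver(EnforcerDriver):
    """Stand-in for the reference mechanism: a static action -> roles table."""

    def __init__(self, allowed_roles):
        self.allowed_roles = allowed_roles

    def enforce(self, action, target, creds):
        allowed = self.allowed_roles.get(action, set())
        return bool(allowed & set(creds.get("roles", [])))


class DirectoryDriver(EnforcerDriver):
    """Delegates the decision to an external store such as an LDAP-based RBAC system."""

    def __init__(self, lookup):
        # lookup(user, action) -> bool is a placeholder for the real query.
        self.lookup = lookup

    def enforce(self, action, target, creds):
        return bool(self.lookup(creds["user"], action))


# The driver would be chosen by name, for example from a configuration option.
DRIVERS = {"file": FileDriver, "directory": DirectoryDriver}


def load_driver(name, **kwargs):
    return DRIVERS[name](**kwargs)
```

The calling API code only ever sees `enforce(action, target, creds)`, so swapping the file-based reference for a directory-backed lookup would not change the services themselves. The performance concern raised above applies to whatever sits behind `lookup`.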
It's available. There are options that we've built which will allow us to do this, but we need the first one to be accepted to get the second one working. For those of you who are interested, all the oslo.policy blueprints are also there. Go and have a look at them, get an idea of where they are with their statuses, and comment, please. Sounds like I'm selling something.

Okay, so alternative approaches. We've run into this challenge at a number of customers. We've run into this challenge in a number of scenarios. So we have started working on different models to be able to handle this at speed. A number of people within Mirantis have been working on this.

What are the goals? We want to centralize the assignment backend and policy backend. In this case we worked with Fortress, but it could be another LDAP. We leverage Fortress to introduce new features for OpenStack RBAC. We're able to do all of this: OU-based, role-based components; delegation of permissions; hierarchical RBAC, which is a critical component; native support for multi-site. Here's another classic one: multi-site replication has been solved for LDAP and for most authentication providers. Why do we need to solve that for Keystone? Someone's done it already. We don't need to solve that problem. Let's just consume what has been done already. We've also leveraged midPoint to provide a single point of management for RBAC and RBAC controls, authentication controls, user CRUD, all of that within the environment.

So how does this tie together? The oslo.policy call will go to Fortress. We have a driver that talks to Fortress. Fortress has a very good RBAC interface already. It has a solution available. There are other ways to do this, but this is one we've been experimenting with. It does the lookup against Fortress every single time the call is made. So obviously, I've talked about delays.
There is a minuscule delay, but because of the speed of Fortress and because of the way they handle it, that delay is fairly negligible. In small-scale testing we haven't seen any issues. In the interest of full disclosure, we haven't done large-scale testing on this, so we have to see how that works.

The Keystone assignment backend could come from midPoint, but will potentially still be stored in SQL. The longer-term goal here is that it will all come from Fortress, so everything will come from that LDAP assignments backend. Now, this changes a little bit if we want to start using federation. If we start using SAML assertions from federation, where we do that assignment and that role access also has to change: it has to come from whatever is behind the IDP. So that information has to be provided from somewhere. Keystone still needs to be aware of it, because it's part of the call, but it can be created on the fly. It doesn't need to be permanently stored within Keystone, especially if all the policy information is coming from somewhere else.

Okay. So how do we get there? We've got two assignment connectors that we're working on, two versions of the assignment connector. There's assignment and user creation right now, which is working. It's fairly effective. It's a little complex to set up. And once you throw SAML into the mix, you get the fun of getting an IDP to work with an SP and all of that. If you've dealt with that, it's a lot of fun. Then there's the OpenStack permissions API, which is being worked on. We need to influence that; we need to deal with the way that it's being moved forward. At the moment, essentially, what's happened within Keystone is that they've created a way to store policy. We need to handle Fortress support for groups and projects, because currently OpenStack doesn't really recognize some of that information, like OUs. And potentially we extend Fortress Commander to act as a UI.

So, in summary. I've spoken about it during the session: how can you help us?
You can help us by getting your stories out there, by talking in the community and telling us what you need. Let us know your business, your specific use case, your specific user stories. Typical user story generation: we need to understand what it is you're trying to achieve in the community, and we need to understand how you're trying to achieve it. We need people in the various regions around the world who understand the governance requirements, both from government and from industry, to share that information in a way that's consumable by developers and engineers. That's the input which we really need your help on.

So, I don't know how much time I've got left. I've got through pretty quickly. Any questions? Absolute silence. Is everybody awake? I'm not. So, any questions I can answer?

It is in place and it does work, but it needs, it needs help. If you're an admin... Okay, so my point is that if you're an admin anywhere, you're an admin everywhere. Yeah, right, there is the Keystone, the very famous Keystone bug, open since 2012, and all of this is great, but before you fix that bug, it actually doesn't...

Okay, so, right. I can answer that. I didn't go into detail here on that. But part of the work that they're doing with auto-creating the rules within Keystone, so within the API, so that every time you create the API you have that call, is to remove that particular call. I know the one you're referring to: it's the one that says, if admin, drop out, don't even run through the checks. Part of that work is to remove it. There is actually a bug logged for it. There is work happening on it. It's been addressed in one of the blueprints, off the top of my head.
I can't tell you which one it is, but part of that is to fix it. In the code patch which we've done, we've actually removed that. We've taken it out, because it's the obvious one. But it's a very valid question, and it is something that needs to be fixed. And again, right now, because of this general assumption that an admin is God within an environment, that's kind of not really getting the focus it should get. What we need to be doing, and I come back to this and I'll harp on this, and if you guys talk to me in the passageways I'll keep doing this, is that we need as operators to feed back our real-world requirements. It's great that we talk about it in the passages. It's great that we moan about these things. We all moan. I moan terribly, especially when I've got a nice glass of whiskey in my hand. It's important that we get our messaging across and that we do it in the channels, because some of these devs who are working on these projects are working on them either in their free time, or are so snowed under by the volume of stuff they're trying to achieve in the cycles, that they're only grabbing at the low-hanging fruit. We need to help them prioritize. That's really a critical portion of what we need to do as operators: we need to help them prioritize.

So, any other questions?

It's probably not correct for them to do this, but the one other place I know roles are read like that is Horizon, to do an iteration when you log into Horizon. It wants to know what to show the user, based on basically iterating through everything that's there. Is that still possible? I mean, that's a bulk "tell me everything" operation rather than a "what is this?"

So, our solution breaks Horizon right now, okay? But, okay, this is my personal opinion and someone's going to throw something at me: I believe that if you're building an enterprise-grade cloud, you shouldn't be installing Horizon. Why are you interacting through Horizon when you have APIs?
I mean, if you're doing proper automation, you don't need a pretty console. But please don't shoot me. Horizon has its place, so I'm not knocking the Horizon team, but it has its place, for admins only. Yeah, okay, I grew up in the age of Linux and, you know, I started with MS-DOS, so I never really got used to GUIs. Anything else?

I'm a VMware administrator. I really like Horizon. It reminds me of vSphere. Just kidding.

So what I love about that question, and I'm going to actually answer it quickly. What I love... That wasn't my question. Let me go ahead. I've done a lot of VMware implementations, and everything you can do through that you can do through the command line, the CLI, and an API call. And in fact, just like Horizon, where Horizon exposes, you know, 30 to 40 percent of what's possible with the API, vSphere exposes 30 to 40 percent of what's actually possible with VMware.

Well, I said I was a VMware administrator. I don't know this command line that you speak of.

Somebody shut him up, because he will carry on like this for the next 20 minutes.

My question really was, as far as best practice goes: let's say I have an IT service catalog, let's just pick a vendor like ServiceNow, that's driving my provisioning of new users through the API. Would you suggest that ServiceNow drive that through Keystone, and then let it be handled in Keystone once you make these changes to do it programmatically through the API? Or have it drive LDAP and then have the authentication come through that?

I'm going to give you a very opinionated answer to that question. To me, Keystone's job is to provide tokens. That's a big part of what Keystone's job is. Keystone shouldn't be the user backend. Keystone can... and I said this earlier.
There are a million things that handle user management far better. That problem has been solved. Use federation. Let Keystone do its job, which is token management and token authorization. That's what it does very, very well and very fast. Let it do that. Let the user creation and user ownership happen somewhere else. So if you're using ServiceNow, and ServiceNow uses some kind of backend, use a SAML 2 assertion. I mean, for those of you in the enterprise, this is where we're pushing. We're pushing most of our enterprises to start using SAML. So many things support SAML. Active Directory supports SAML; yes, you've got to install the federation component, but it supports it. A Shibboleth IDP, it works. There are a lot of them out there. Yes, it's finicky to set up and secure, but once it's done, it's done. You don't have to do it again. So yeah, that's my opinionated answer.

To get the German rule set down to one sentence, it's: don't trust anyone. So, yeah, I was thinking about this: wouldn't it be possible to start with a plain JSON file that just allows nothing, and then go with some kind of Ansible or other orchestrator and enter the things you want to use for a specific role through the orchestrator?

So that's essentially what they're trying to achieve with this idea of moving to the policy file being auto-created on first use. This is what they're trying to achieve, because we do need defaults. I mean, I think the Nova... I don't know if someone's got a machine in front of them and can quickly check how many lines are in the Nova policy.json. It's a couple of hundred. To go and hand-create that couple of hundred when you're just trying to do the basics is challenging at best, and impossible... I mean, it'll take you so long, and it's such a mess to achieve. The main point that I'm trying to bring across is that we need to move away from this file. We need to be able to place reliance on information.
We need single points of information to comply with various governance requirements. I mean, right now, as I mentioned earlier, there's this idea that if I've got five Nova APIs, I've potentially got five different JSON files. I can't audit that. I can't prove, if I have a breach, that I was in the right. And that's what we need to be able to do, and that's part of centralizing and bringing all of that enforcement into one place. If there's a message, that's what I need to get across.

Anything else? Silence. People way down the back there? All right. Well, I'm done then. Thank you very much.