Hello. Hi. Yes, the mic's working. Great. Thank you for coming, it's getting on in the afternoon today. My name's Rob Clark. I'm a distinguished engineer at IBM. Some of you might know me from the various talks I've done around OpenStack security over the last couple of years. Today I'm going to talk to you a little bit about some of the things we've really been focused on during the last release, and some of the things you should be able to reap the benefits of going forward. I was trying to decide what to talk about today, and part of the reason I was going through that process was because I was supposed to be presenting a small part of somebody else's deck. Unfortunately he can't be here because he's just had a baby, which I think is quite rude of him, but there you go. So I'm going to fumble through the whole lot.

I'm going to start with security-related projects. Now, what security isn't: security isn't Keystone. Security isn't Barbican. It isn't Castellan. And it's also not Ceilometer or Congress, and some other things as well. They're all security-related projects. We benefit from them being in the community, but they're either not directly focused on security or they're only focused on individual parts of it. Things like Keystone and Barbican are big enough to be self-sustaining projects on their own. But there are a lot of things that aren't, and a couple of those I'm going to talk to you about today.

To give them their due: Keystone, as almost everyone here should know, provides authentication, the AuthN side, and authorization, the AuthZ side. What that basically means is that Keystone gives you pretty robust authentication mechanisms. The authorization side, who is allowed to do what, the user-role stuff, is more diverse and a little bit weaker; there are ongoing discussions around dynamic policy and that sort of thing. A really interesting highlight: Keystone now has credential encryption at the back end, so when you're using the various Keystone back ends, you get credential encryption in there. A real driver for that was a whole bunch of work done during this cycle on PCI readiness. There's an awareness within the OpenStack community that things have to move to a position where deployers of OpenStack can go through PCI-DSS with relatively little pain. It's never painless, but that means bringing in features like credential encryption.

Barbican provides secret management for OpenStack, and it enables a lot of security features. It's a really pivotal security technology within OpenStack, because basically anything that wants to do encryption for its users, like Swift, like Nova, will often rely on Barbican for its secrets. It supports PKCS#11 and KMIP HSMs; Dogtag is also supported, and has been for a very long time. And I've put it down as a highlight here: the certificate management system is deprecated. This means Barbican isn't going to be responsible anymore for requesting certificates for your service, though you can still warehouse them there if you want to. That's really good, because it signifies a real narrowing of focus within Barbican to managing secrets.

Castellan is another security-supporting project. It's sometimes described as a middleware or an adapter between services and Barbican, which today is really what it is, but it exists so that you can integrate with other key managers if you want to.
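To make that concrete, here's roughly what the adapter looks like from a service's point of view. This is a minimal sketch, assuming Castellan is installed and configured to use its Barbican backend; the `context` argument stands in for the authenticated RequestContext an OpenStack service would already have:

```python
# Minimal sketch: storing and retrieving a secret through Castellan,
# so the calling service never has to care which key manager is behind it.
from castellan import key_manager
from castellan.common.objects import passphrase

def store_and_fetch(context):
    # Resolves to whichever backend is configured (Barbican by default).
    manager = key_manager.API()

    # Store a secret; Castellan hands it to the configured backend for us.
    secret = passphrase.Passphrase(b'correct horse battery staple')
    secret_ref = manager.store(context, secret)

    # Retrieve it later by reference. Swapping in another key manager
    # is a configuration change, not a code change.
    return manager.get(context, secret_ref)
```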
Often Castellan gets attributed to the security project, but really, if it's attributable to anyone, it's Barbican, which is where it came from.

So what is the security project? I'll go through a super brief history, because I've done talks on this before. Around about the Folsom timeframe, myself and Bryan Payne got together at the summit and decided that security should probably be something that was in OpenStack. We followed on from that by creating the vulnerability management team, along with other people, and starting the security notes process. Time goes on, Havana passes by, and we end up with a security guide. I can see a number of the authors of the security guide in the audience today. We all got together, in Annapolis I think, and spent a week writing a book, which was fun; you can buy it in tree form. Then Icehouse comes around, and we start building out security tooling like Anchor and Bandit. And now, with these newest releases, we've really had a focus on threat analysis and on Syntribos, which are the two main things I'm going to focus on today.

The security project has a number of member organizations, and we have contributions regularly from all of these groups. It's worth pointing out right now, I suppose, that upstream OpenStack security isn't necessarily anyone's full-time job. We don't have anyone paid by the foundation to take this stuff forward. Sometimes the contributions come in from security-interested developers, and sometimes they're from security people like myself who like to pretend they can be developers. Somewhere between the two, we end up with a good mix of people.

The security project really operates with two pillars. We have a development pillar, where we create tooling: Syntribos, which I'm going to discuss in a little while; Anchor, which is an ephemeral PKI system I spoke about at previous summits; and Bandit, which is a Python static analysis tool. There's been a lot of interest in Bandit recently, and there have been a number of talks on it; I suggest you go watch them. If you're writing anything in Python and you're not putting it through Bandit, then there's a good chance that if you're introducing stupid errors, logical failures, using bad libraries, just doing stuff that you don't know is necessarily wrong, you might not be catching it. That's the sort of stuff Bandit can catch. It's integrated into a number of OpenStack CI gates, and will be integrated into more going forward.

The vulnerability management team, OSSNs, threat analysis, and the security guide all stand on the second pillar, which is our guidance and governance. In many ways, we act as a group of consultants to the wider OpenStack organization. We're available as a resource to fact-check things. We work directly with the vulnerability management team, who are kind of part of security, but they're very autonomous in their own little box, where they can receive, triage, and deal with vulnerabilities. So this is really how the security project stacks up. This is an evolution of images you guys have seen before. And I managed to get a little Lego guy on there as well, which is always good.
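Since I've mentioned Bandit a couple of times: to make that pitch concrete, here's the flavor of thing it flags. A deliberately bad, minimal example, nothing OpenStack-specific:

```python
# bad_example.py -- two classic Bandit findings:
#   B602: subprocess call with shell=True (shell injection risk)
#   B301: pickle deserialization of untrusted data (arbitrary code execution)
import pickle
import subprocess

def run_user_command(user_input):
    # Building a shell command out of user input: injection waiting to happen.
    return subprocess.check_output("ls " + user_input, shell=True)

def load_session(blob):
    # Unpickling attacker-controlled bytes can execute arbitrary code.
    return pickle.loads(blob)
```

Run `bandit -r yourproject/` over a tree containing that file and both issues come back with severity and confidence ratings.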
So, I promised to talk to you a little bit about threat analysis. What it really is, is a security sniff test: we're looking for anything bad that's going on in a project. For a long time, this happened anyway when a project wanted to become vulnerability managed -- when it wanted that tag that said the project was a big enough part of OpenStack that people like the VMT were going to support you when vulnerabilities come in. They're going to give you an embargo process. They're going to go and get CVEs for you. They're going to help you deploy fixes. And that review really consisted of software engineers going and having a look through the code and seeing if anything stood out. These weren't necessarily security people; they were security-interested. That's really where it started.

A few releases ago, we started talking to the VMT about ways we could improve threat analysis. We even did a talk, a few summits ago, about one way we attempted to do it using functional decomposition. That didn't work; it didn't scale very well at all. Scale turns out to be the really difficult thing. Threat analyses are something that security people like myself end up doing a lot. My friend Travis McPeak over here coined the term "security architect magic", which is where we look at a diagram or something and go, oh, well, that bit's wrong, and then, if you give us enough time, we'll probably work out why it's wrong in a way we can articulate. But we needed to try and remove as much of that magic as we could, to develop a process that developers can drive more than we do.

Any threat analysis process should be able to identify entry points, assets, and persistence within a system. It should be able to document where data transits, where it goes through any format changes or transformations, where it's stored, its origin, and its destination. That's because this is where most vulnerabilities come from. A huge, huge number of vulnerabilities just come down to changing data from one format to another and not really thinking about what you're doing. That can be reading it off disk into memory, it can be parsing, it can be all sorts of stuff, but they really do generally fall into these categories. And one of the main things we have here is the impact of control failure. Our aspiration is to create a system that gives us a rich enough set of documents that when a project has a vulnerability, the VMT or that project can look at the threat analysis, understand the effects of the control that just failed because of the vulnerability, and very quickly document how to respond to it.

As with all good things, this started at a mid-cycle, and it started on a whiteboard. Generally, there are very few problems in the world you can't solve with a whiteboard, until you walk away and try to turn it into a real process. The real process ends up looking a little bit like this, and it's all documented and online now. You identify a common deployment and identify best practice. This is pivotal for how we do things in OpenStack. Projects like Nova, which we're going to build up to, can be deployed in many different ways and for many different use cases. They can be private clouds, they can be public clouds, they can be federated educational deployments; there's a whole bunch of stuff, and we can't provide a threat analysis for all of them. So we ask the development team: you probably know what your most common deployment models are, and you know what your best practices are. That's what we model. If people diverge from best practice, they're already going down a bad road anyway; but if we document that best practice through threat analysis, then hopefully there's more incentive to stay with it.
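Before getting into the process itself: there's no mandated format for capturing those data flows, but to give a feel for what "document where data transits and changes format" means in practice, here's the shape of one such record. The field names are ours, purely illustrative:

```python
# Illustrative only: one record documenting where a piece of data transits,
# what format changes it goes through, and where it ends up.
comment_flow = {
    "data": "user-submitted comment",
    "origin": "browser, via HTTPS POST",
    "transformations": [
        "form-encoded body -> unicode string (framework parsing)",
        "unicode string -> SQL parameter (ORM insert)",
        "stored string -> rendered HTML (template escaping on display)",
    ],
    "transport": "TLS from client to web tier; internal network to the database",
    "destination": "comments table in the database",
}
```

Each line in that transformations list is exactly the kind of place the vulnerabilities come from.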
So, our process starts with diagramming out the application. We do this collaboratively. We do it online, in a room, or through something like a Google Hangout, or insert your chat mechanism of choice here. We found this works quite well with Barbican, which is the first project we've taken through this end-to-end. Then we produce an asset catalog, where we list all the really interesting bits of a system and how they're persisted. We document the failure impacts, and then it gets submitted to a repository. Now, it might be that the security team doesn't even get involved until that submit-to-a-repository stage. When it comes in, we'll do a review, and when we've reviewed it, we decide whether it looks good. "Looks good" doesn't mean it doesn't have any vulnerabilities. "Looks good" means we've had a look at the documentation that was provided, and we're happy that it gives an accurate assessment of the project. If it says there are five vulnerabilities, and there are bugs filed for them, then as far as I'm concerned that's a good threat analysis. That means they've done well on that project. Then, if you comply with all the other things that the VMT want you to do, you should get your vulnerability management tag and be awesome.

We had to build a simplified diagramming method for this. It's really simple; it's based on draw.io. You have basic actors and basic symbols you'll all be familiar with, like databases and queues. We have dotted-line actors: any dotted-line actor is a third party. Early on, we recognized that even in best-practice deployments, it's extremely rare that a project like Barbican would have its own MySQL database. It probably has to share it with a dozen other services, because the deployer is deploying three control plane nodes or something like that. And we document the data exchanged between components, and the transport it travels over.

This is our example diagram of some notional thing: a web service providing content, a blogging-type service. We have security boundaries between the internet, the corporate DMZ, and the corporate network. We chose not to ring-fence these completely, and just use them as separators. That's because, as security people, we always want to spend extra time looking at anything that crosses a security boundary. Unfortunately, when you do these things for OpenStack, almost everything transits almost all of the security boundaries, because things do a lot of domain bridging. But it still helps us identify the things we need to look at most closely.

So this is what Barbican looks like. If you're used to these sorts of diagrams, this is actually quite a simple way of representing Barbican. But it is a best-practice deployment: it's their PKCS#11 deployment, which they see as being the most common HSM deployment. It's got third-party services in there, like Keystone and the event queue. They don't expect to have their own RabbitMQ, so that's third party. They don't expect to have their own Keystone, so that's third party. And we can see how data moves around, and we can see the places where things are persisted. One thing I haven't really mentioned in any detail is the asset catalog. We do this asset-oriented threat analysis because we found, mainly through trial and error, that it was the most scalable way to do these things.
The idea here is to understand what's at risk, to quantify that, and to describe the worst-case impact for those things. After a team's got a diagram, they can start considering what's actually in these different components. For Barbican, this is just a snippet of what we had, but there was secret data and secret metadata: those are the key things for Barbican, its bread and butter, the things it moves around and manipulates most. And there are things like RBAC rulesets and RabbitMQ credentials. All these different credentials could be in one file, but we don't say "the barbican.conf file is an asset". We look at the individual things. There'll be a lot of stuff in there, like debugging settings, that we don't care about, so we don't put it in. But individual credentials: they live in that file now, but they might be overridden by environment variables, they might be provided in some other way, or maybe you can override them through the API. So we identify each of them in turn.

I've pulled out the RabbitMQ credentials. Once we have this asset list, we basically apply the simple security triad. Yes, I know there are other things, like STRIDE, we could apply, but again, we need this to be small; we need it to be scalable. So for each asset we're looking at confidentiality, integrity, and availability, and for each of these, we're postulating the very worst case. If there's an integrity failure, so an attacker can write to these credentials, what can they do? Well, we know they can change the credentials; the worst thing we can think of them doing with that is causing a denial of service within Barbican. Confidentiality: they could get access to the queue. The worst thing we could find them doing there was exhausting the queue, because there wasn't anything too secret going across it. And availability: if there's an availability problem with access to this asset, then again, you're going to have a denial of service situation.
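There's no single mandated format for the asset catalog either, but an entry ends up carrying roughly this information. A sketch of the RabbitMQ credentials example, with field names that are purely illustrative:

```python
# Illustrative only: one asset catalog entry, following the
# confidentiality/integrity/availability worst-case pattern described above.
rabbitmq_credentials = {
    "asset": "RabbitMQ credentials",
    "lives_in": "barbican.conf (may be overridden by env vars or the API)",
    "worst_case": {
        "confidentiality": "Attacker reads the queue; worst case found was "
                           "queue exhaustion -- nothing very secret transits it.",
        "integrity": "Attacker rewrites the credentials; denial of service "
                     "within Barbican.",
        "availability": "Credentials unavailable; again, a denial of service.",
    },
}
```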
So with Barbican, we generated a number of findings: modifications of ACLs could end up compromising various secrets, misconfigured HSM credentials, a whole bunch of stuff. I'm going to quickly deep-dive through one of the more interesting problems we found. But before I do: I'm not picking on Barbican. It's just that this was the first project we took all the way through, and we will find similar problems in other projects. I'm going to explain the way a system was built and the way it evolved slightly, and then how, when we did our threat analysis, we were able to identify problems with the implicit security model that had been built. This is the point where the documentation and the diagram have been generated, and it's been punted to us as a security project to have a look.

The database model for using PKCS#11 in Barbican: PKCS#11 is one way of talking to an HSM. It's very scalable, and it uses a database for everything. The implied model is that it's fail-safe: everything that sits in the database is protected, because all the cryptographic operations have to happen on an HSM, which should provide you with a higher degree of assurance. Basically, confidentiality assured.

I'm going to quickly walk you through how this works. If Barb wants to store a secret, he tells Barbican: hey, I want to store a secret. Barbican talks to the HSM and says: hey, HSM, here's a secret. Please wrap it with a key that I don't know about. I know you have it, but I don't know what the key is. Then return me the encrypted version, the wrapped version, and I'll go store it in the database. I know there are Keystone interactions and other things, but I'm trying to keep this simple. Then Barb wants to get his secret. He says to Barbican: hey, can I have my secret? Barbican goes and gets the wrapped secret from the database, where it's encrypted and protected, pushes it to the HSM and says: hey, unwrap this. The HSM unwraps it and passes it back, and Barbican passes it back to Barb. That's great.

But that turns out to be not that useful in the cloud. What I really want to be able to do is create an object for some purpose, have it encrypted so I know my cloud provider can't necessarily access it whenever they want, or whatever your various security assertions are, but then let someone else access the object. It's rare that tenants do things that only they are concerned with; quite often they want to share things with other tenants or other services. So in this case, Barb wants to grant Alice access to a secret. He tells Barbican: I need you to remember to give Alice access to this secret if she requests it. That ACL change is stored in the database. Then, when Alice wants to get that secret for a legitimate reason, Alice says to Barbican: hey, can I have Barb's secret? Barbican looks in the database: is Alice entitled to this secret? The database ACL comes back: yes, she is. OK, great. I'll get that secret from Barb's account, push it to the HSM to be unwrapped, get it back, and give it to Alice.

Now, one of the things we found during threat analysis, when we looked at the integrity failure case for the database, was: isn't this ACL table in the database unprotected? What if I, as an attacker, add myself to the access list for Barb's secret? That's not protected by the HSM. Remember, the model is supposed to support the database being compromised in that way and never expose secrets: if you steal the database, you can't expose the secrets. But if you can just get into the database for a second or two and add yourself to the ACL tables, then as an attacker you can say to Barbican: hey, I want Barb's secret. Barbican checks the ACL table and goes: oh, attacker, yeah, I see you're in the ACL table for Barb's secret. And then it will go off to the very secure HSM, decrypt the key, and pass it back to you.

So if I go back to the database model for PKCS#11, that was the original assertion, and through threat analysis we actually found that the model doesn't work as it stands today, because of this problem in the database. We found a really big design problem, and we've got a bug in now. There was some discussion about whether the security team should file the bug or whether the development team should; I'm happy either way. This was discovered in the open, because we do our threat analysis on an Etherpad, so there was no point in filing an embargoed bug. But yeah, it's a good validation of what we're doing. It's quite an interesting implementation-oriented problem, and we're not entirely sure what the fix is. The direct fix is to make sure you can identify when there's been an integrity failure on the ACL. That could mean using some database property that provides signing or hashing, or doing some out-of-band verification, or something like that; I'm not sure. But as a security project, we're quite happy now that there's a bug to iterate on. The team will propose a change, we'll take that change and put it into how we know the system works from the threat analysis, and then we'll just run through the same steps again and see if we get a better result.
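To pin the finding down, here's the shape of the problem, and of one possible fix, in deliberately simplified Python. This is not Barbican's actual code, and the HMAC approach is just our sketch of one option, not the agreed design; `db` and `hsm` are stand-in objects:

```python
import hashlib
import hmac

# The shape of the problem (NOT Barbican's real code): the ACL row in the
# database is the only thing standing between an attacker and the HSM.
def get_secret(requester, secret_id, db, hsm):
    acl = db.read_acl(secret_id)          # plain database row, no integrity protection
    if requester not in acl.allowed_users:
        raise PermissionError("not on the ACL")
    wrapped = db.read_wrapped_secret(secret_id)
    return hsm.unwrap(wrapped)            # the HSM happily unwraps for anyone the ACL allows

# One possible shape of a fix (our sketch): authenticate the ACL row with a
# key the database never sees, so tampering becomes detectable.
def write_acl(secret_id, allowed_users, db, mac_key):
    payload = (secret_id + ":" + ",".join(sorted(allowed_users))).encode()
    tag = hmac.new(mac_key, payload, hashlib.sha256).hexdigest()
    db.write_acl(secret_id, allowed_users, tag)

def verify_acl(secret_id, acl, mac_key):
    payload = (secret_id + ":" + ",".join(sorted(acl.allowed_users))).encode()
    expected = hmac.new(mac_key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, acl.tag):
        raise RuntimeError("ACL integrity failure -- possible tampering")
```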
So, during this release we've created a threat analysis process that we hope has these qualities. We've got a clear diagramming methodology and basic security assertions, that's your confidentiality, integrity, and availability. It supports new projects going through the vulnerability management process. And for all of you here who have already got the tag: the expectation is that we will come around and do a threat analysis at some point, for you to be able to maintain your vulnerability-managed status. So Nova, Neutron, everyone else with big, scary projects where you know there are nasty corner cases: just start thinking about this, because we're going to come back around. That's not a threat; to be honest, for the threat analysis I don't really care if you have big vulnerabilities in your project. The point is to document and understand where they are, to be able to provide support to the VMT and others, fun as it is to find problems in projects.

OK, I'm going to talk to you for a minute or two about another new project we have: Syntribos. Syntribos gives us an API fuzzing framework built specifically for OpenStack. Fuzzing, for API fuzzing at least, is where you understand certain parameters and you feed them what is sometimes garbage input, sometimes input partially based on templates, which is how Syntribos works. Basically, you're injecting unexpected input into different parts of the system and seeing how it reacts. We use raw request templates within Syntribos. At some point in the future, if OpenStack ever has full Swagger definitions for all of the services, then we could possibly build something based on those; it's an idea I really like. But right now it uses raw templates, which actually means it's very easy for a developer to build tests specific to their service. No formal schema changes, no formal Swagger declarations, no insert-your-API-documentation-framework-here: none of those are required.

Syntribos at the moment is targeting these services, and it has found vulnerabilities and issues in all of them. It's still very much at a beta stage. These are the things it tests for in general, and they're fairly typical of the types of vulnerabilities we find in OpenStack services. Buffer overflows not so much, though it will find them in supporting libraries and things we're using, from time to time. We definitely get a lot of issues with cross-site scripting and string validation and those sorts of problems, which you'd expect, as we're dealing with mostly stuff written in Python. It's extremely easy to use, and it's available on PyPI; I think a new release dropped last night. It's also, obviously, in an OpenStack repo. I see lots of people taking pictures of that, so I'll wait a second. There we go. The commands are very simple, and I'm actually going to show you a quick video, because I was too scared to do a live demo. But it is very, very simple to run through. OK, now we're going to see if internet magic works.
We are, but I'm going to have to do a little bit of scrolling. That's OK. So, the Syntribos team, and I'll mention them in a minute, actually put this video together yesterday. They've moved into a virtualenv for Syntribos, and that's how the init works; I think we track from the bottom of the screen here. There we go. It'll ask you two or three questions as it does its initialization, and then it's all set up and ready to run. Next we just quickly edit a configuration, which is going to be up at the top of the screen, of course. We're basically just changing the port things are running on and pointing at a testing directory, where we've got our payloads, which are the nasty bits, and our templates, which express where you need to point the nasty things. Quit out of that, and then we run through our SQL tests. This is much easier than trying to do a live demo. It runs through its tests, and we get our results in a nice, parsable JSON format. We found there were a number of failures and no errors. And we find that in a few places the server is returning 500s, which it should only be doing if something broke on the server end. That means, at the very least, there's some HTTP compliance bug we need to go and fix, because it's returning the wrong errors. Or it could mean that something broke in very nasty ways.

Fuzzers generally consist of a testing side and a debugging side. That's the same whether you're testing something written in C or testing a web service: you need something that generates errors, and you need something that allows you to interpret and inspect those errors. Here, the two are separate. So now we've got a log of this, we know where it happened, and you can go back through the other logs. Then we can go and have a look at the resulting output from the service, and we can sync the two together. And obviously, it doesn't take much to use something like Kolla, or any other containerized deployment, to spin up a service, run an individual set of tests, see how the service reacted, trash it, and bring up an entirely new, clean service. We can step through these things relatively quickly.

So it's found a number of bugs. It found 500 errors in Cinder, Glance, Keystone, and Neutron, some that cover all of them. It found a stored cross-site scripting vulnerability in Horizon. Now, that did end up being a duplicate of something someone else had reported externally. But at this point OpenStack has been through many of the good-quality commercial static and dynamic analysis tools. It's had WebInspect. It's had AppScan. It's had people look at it with Coverity and all that sort of stuff. And they didn't find this, but Syntribos did. So that's good. Stored XSS is bad, especially in Horizon, because you can access a lot of stuff from there. And we're finding other things as well, which is always good.

The Syntribos team helped me a lot; they put together that video. Like I said before, I'm the PTL of the security project, so I provide a home for smaller projects to come in and work on security things. We provide them support, ways to get into the OpenStack CI, those sorts of things. Syntribos is being dominated right now, it's fair to say, by Rackspace and Intel: Intel through the OSIC, and a couple of Rackspace guys are driving it as well. I know they're looking for more support and more involvement. It's really not that hard to write payloads or templates, especially if you're writing templates for your own services, and I know they'll be interested in seeing more.
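To give a sense of how low that barrier is: a Syntribos request template is essentially a raw HTTP request, shaped something like the sketch below. This is illustrative only; the project's docs define the exact conventions for things like token substitution:

```
POST /v1/secrets HTTP/1.1
Host: 127.0.0.1:9311
Content-Type: application/json
X-Auth-Token: <token, normally filled in by the framework's helpers>

{"name": "test-secret", "payload": "not-so-secret", "payload_content_type": "text/plain"}
```

Syntribos then takes each fuzzable position in a template like that and substitutes its payloads, the nasty bits, one at a time, watching how the service responds.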
They have to prioritize the things that are high priority for them, but I know one or two other people I've spoken to over the course of this week are interested in writing templates as well.

So, we're coming towards the end of my talk now. I've spoken to you about threat analysis, and I've spoken to you about Syntribos. Those are the two really new things I wanted to highlight today, because they really are helping us kick the ball a bit further down the road on security. I just want to talk a little bit about some of the other things we've done. We helped create a security white paper; it's available, and the link is at the bottom of the slide there. It's a high-level, marketing-y type of paper, but it does go over a lot of the things that we do on a regular basis, as well as some of the things you might want to consider when you're deploying a service. We weren't the only people involved: we had people from across OpenStack, and obviously myself and Travis contributed to that paper as well.

The CII best practices badge is interesting. The Linux Foundation has been around for about 20 years, and Heartbleed dropped a couple of years ago. Everyone knows what Heartbleed is, right? If you don't, you're probably in the wrong room. But broadly speaking, it was the worst internet-facing vulnerability of the last ten years. The Linux Foundation went: oh, that's not good. The people working on OpenSSL were doing it on almost a volunteer basis, which is not necessarily untrue of a lot of the people who work on OpenStack as well. So the foundation realized this was bad and that they needed to get money together to pay these people to be able to develop software securely. They could only give money to a few critical projects, though, so for everyone else they created a badge, which means that a project meets a minimum security baseline, that it follows a number of good practices. There's a significantly larger number of criteria than are listed here; these are some of the highlights. My understanding is that around 200 projects have applied for the best practices badge, and 39 or so of them have been issued it. But of those, only one project got the badge without having to make any changes or revise the way they were doing things. And that was actually us, which is nice. Thank you. We were very pleased to get this, and we stood up on stage at the keynote yesterday.

As exciting as that was, it made me realize a couple of things. Firstly, OpenStack is very big, and it's very difficult to chase down security issues. We can only do that with the support of the community, with the coresec folks and the different teams and the various people that are involved. It also allowed me to take a minute to really think about what it is we've delivered, and I think there are a lot of gaps. I can't think of a single team that is leveraging everything we put out there as a community right now. But we continue to try and push out guidance, we try and push out tools, like I said before, like Anchor, like Bandit, like Syntribos, and we continue to try and support teams wherever we can through things like threat analysis. So this is what we've really been up to recently: we've got the threat analysis done, and we're finding bugs with Syntribos.
A few of them are in embargo, but we've got 80 draft or issued OpenStack Security Notes, and people have been really supportive in driving those recently. We've got a revamped security guide out. At the mid-cycle, we knew we wanted to spend a bit more time on Barbican, so we were actually quite flexible with our mid-cycle and tried to schedule it to overlap with theirs, which allowed us to actually be in the room with them and work with them. We can't do that with all projects, but if there are projects that are really concerned, then we can go and do that. Barbican's a natural fit for us, firstly because a number of us like to try and contribute to that project when we can, and secondly because, it being very security-oriented, it's important that they get their stuff right: everybody wants to build encryption technology on top of it, and key management is going to rely on them. We work closely with the vulnerability management team, so there are a number of us on the security project, the cores, if you will, who get pulled into vulnerabilities when the VMT isn't sure what the impact might be or how widely it might spread. And then there are the other two things I mentioned, the CII best practices badge and the security white paper. So that's what we've gone through this cycle.

Next cycle, we're looking for more of the same. We have this idea of the security incubator right now, which is basically where we can bring in smaller projects. They don't necessarily have to be 100% OpenStack-focused; any of our security tools, really, can be applied to other projects, but they're built primarily for OpenStack to consume. As long as they tick that box, we're happy to give them a home, and we're happy to provide guidance and support as they build up into more fully-fledged things. That's what we've found works quite well.

So with that, I want to say thank you for coming to this talk at the end of a Thursday. I hope the community as a whole is finding the things we're doing useful. There are some resources there if people want to reach out to us or want to understand more about what we're doing. The blog has entries on Syntribos, on threat analysis, on Newton in general, and on the different security things we're doing. You're always welcome to come find us in #openstack-security. The room might be a bit quiet this week, but generally speaking, we'll always be there and happy to help. So with that, that's my final slide. If there are any questions or queries, I'd be happy to take them. Cool, okay, well, thank you very much.