This presentation is about security on OpenStack. I have put up a QR code in case you want to download the presentation and take a look. I'll leave it up for a couple of seconds. Okay, let's get started.
What am I going to cover today? It's fairly brief: what Symantec is doing in the OpenStack space, some security concepts (specifically for the Grizzly release), and then some things I think we need to focus on for OpenStack in terms of security.
Let me tell you a little about Symantec, for those of you who don't know us. We're a company focused on security. Security also means availability, so we also extend into the storage space. We're also trying to tackle some of the largest problems in the cloud space. We have Norton antivirus updates. We have groups like our MessageLabs and Melproxy servers, as well as the Online Certificate Status Protocol service that serves hundreds of millions of websites across the planet. So Symantec is very interested in where cloud technology is going, as well as in securing it.
Who am I? I'm Brian Chunk. I'm a global technology strategist, but primarily I'm the infrastructure architect for our OpenStack efforts. My primary technology focuses are security and networking, and I have a very large interest in making sure OpenStack ends up being a secure enterprise platform.
So why is Symantec interested in OpenStack? Symantec is on a journey to build a cloud platform for some of our current and new SaaS offerings. We're looking to build an IaaS, PaaS and SaaS platform on top of OpenStack. We have excellent support from our leadership, and it's a greenfield opportunity for us. We're going to be building it from scratch. It's pretty exciting.
One of the things we decided to focus on was open source. Given the scope and size of Symantec, it's critical that we use tools that give us flexibility as well as the ability to control our own destiny in terms of capabilities and features. So we selected OpenStack, and one of our goals while we build on OpenStack is to contribute back security capabilities and features so that we can secure it. Symantec is obviously known for security, so we have to make sure that any service we build on top of this platform meets our customers' as well as our own internal security specifications. We're starting small. We had a presentation earlier, if you wanted to check it out, where we talked about a POC that we've done on OpenStack and some of the things we're building there. But we're really looking to scale this to thousands of nodes across multiple data centers.
Let me talk a little about what we actually did, to give you some context for what the security means. Because of the speed at which OpenStack moves, I need to set some ground-level details. First, the version is Grizzly. These are the components we actually installed, and these are the choices we made for the ancillary products around OpenStack: MySQL, RabbitMQ, KVM and Ubuntu. It's a pretty vanilla flavor, pretty much the reference architecture that most people here use.
The first thing I want to talk about is what people in security generally do. Most people, when they architect a cloud product, talk about defense in depth. What does that really mean? It means you build multiple layers of security to protect against people attacking and hacking you. And this is pretty easy in the traditional model.
When a packet comes into your system, it probably hits some type of application or host at layer seven. It also gets processed by firewalls. At some point it probably goes through some kind of load-balancing filter to make sure it reaches the right servers. And at some point it has to hit router ACLs. This is a very traditional enterprise model. The infrastructure is static, we have separate components to manage all of this, and we have multiple layers within the security infrastructure.
OpenStack changes this whole model. We're now in the software-defined data center world, where infrastructure is as fluid as everything else around us. You can no longer guarantee static capabilities in any of these tiers. So the model changes, and it really looks like a centralized service. In a software-defined data center anything can change. Your load balancer filters can change on the fly; you're going to add and subtract them. You're also going to change firewall rules as new networks come up, as new VMs come up, and as new workloads get placed on them. All of this is very elastic, and you need that. We talk about elasticity, we talk about capability. But you have to maintain multi-tenancy, and you still have to maintain some kind of defense in depth. What it really means is that your host system security becomes far more critical than it used to be, because what controls your infrastructure is now at every host.
So now the model looks like this. It goes from a vertical stack that packets flow through, where you understand what they are and you have separate control and change management for every layer of your defense in depth, to a single place: a software-defined API. All of your requests go through this API, and it actually manages your entire data center infrastructure.
This is the promise of SDN. This is the promise of load balancer as a service, firewall as a service, VPN as a service. You want this elasticity, but you still need to protect it. We've gone from something like a four-layer architecture to a two-layer architecture. That doesn't mean you can't protect it. You still need defense in depth. So what I want to talk about today is building security spheres around it. And it's not just one; you still have to have multiple. Some of these security spheres will move to the host level. Some may move to other capabilities around the network itself, in order to help protect the software-defined API that will control your data center. If there are any questions, feel free to ask in the middle.
One of the first things we need to talk about, then, is network segmentation. Within OpenStack there are five major networks, and you have the ability to control their access points: who has access to them and why. In our model, we're hoping the application itself will have the ability to control the actual service tiers. If an application needs to grow, it will do it on its own. It does not need to call back into a centralized pool or a centralized security model to make that change.
The first network you need to be aware of is the one for your actual physical host. There's a BMC or IPMI type function where you need to be able to make changes to the RAID controllers or anything else. That's a separate network interface. Make sure you have this very locked down, because it allows you to reboot the box, among many other things.
The second one you have to worry about is your host operating system management tier. Every single one of your host OSs will absolutely have SSH capability. You have to manage your operating system and your hypervisors.
Your virtual switching tier, too.
The next one you need to be worried about is your service tier. This is where the software-defined API, that green box, lives. This is the network that's going to control all of your virtual and potentially even your physical infrastructure, depending on how you've set up your data center.
The last two are things I think everyone knows about already. You need a private network: your VMs have to communicate with each other, which is east-west traffic. And you're going to need a north-south interface as well. This is where traditional networking, your traditional defense-in-depth model, will absolutely still come into play. You still need public IPs. You still need the physical external WAN. You still need the external VPN in order to get in and out of the data center. But all of the components above that are really the controls you normally have within the data center application space.
Now I want to change subjects a little and talk about some of the additional things I looked at specifically in OpenStack. These take a little longer to go through, so I'm going to slow down a bit.
The first thing I noticed within OpenStack is the token messaging system, which drives pretty much the whole security model for the software-defined API in that green box. The first thing you have to be really careful about is token expiration. Here, the token itself is passed around and carries multiple capabilities. Normally, when you take an action on a system, it's message-based: I want to take an action, I get authorization for that action, and I take that action.
In this case, you have to be more careful, because once a token is issued, it is usable for its entire lifetime. If you issue a token for a day, that token is capable of making changes within your infrastructure for that entire day. So be wary. You have to balance this against the ability to cache tokens. Keystone has performance implications, so caching matters. At the same time, you have to weigh that against how long you really want this authentication token living in your environment. These are some of the things you have to be really careful about with your tokens.
The next one I focused on was the PKI token management system itself. Within Keystone, you have the capability of signing these tokens. Every time you create a signing engine, you have to be very aware of how you want to do that. My background is in the SSL business within Symantec, so we're very focused on PKI. How are you going to manage the signing certificate? You also have to protect the thing that signs all of these tokens. Generally it lives on the Keystone server, but now you have yet another application and another security vector that you've placed on it. The second question is: do you want a hardware signing module, or is a software signing module okay? This is the thing that authenticates all of your tokens across your entire service plane, so you have to be aware of how important it is. And the last one is: do you want a root distribution across your PKI infrastructure when you're developing the signing model?
After that, we looked at the message bus, the database and the service tiers themselves, which have to be secured using SSL.
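To make the token lifetime trade-off described above concrete, here is a minimal sketch. It is not Keystone code: the function name, the timestamps and the two lifetimes are assumptions chosen for illustration. It only computes how long a leaked bearer token would remain usable under a long versus a short lifetime.

```python
from datetime import datetime, timedelta, timezone


def exposure_window(issued_at, lifetime, leaked_at):
    """Return how long a leaked token stays usable (hypothetical helper).

    A bearer token is valid until issued_at + lifetime, regardless of
    who presents it, so the exposure is whatever validity is left at
    the moment of the leak.
    """
    expires_at = issued_at + lifetime
    return max(expires_at - leaked_at, timedelta(0))


# Illustrative values only: a token issued at 09:00 and leaked 30 minutes later.
issued = datetime(2013, 11, 5, 9, 0, tzinfo=timezone.utc)
leaked = issued + timedelta(minutes=30)

for lifetime in (timedelta(hours=24), timedelta(hours=1)):
    print(lifetime, "->", exposure_window(issued, lifetime, leaked))
```

The shorter lifetime shrinks the window but means clients return to the identity service more often, which is the caching and performance side of the balance.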
I know there's a project called Barbican, but unfortunately, out of the box in what we installed, there are no SSL management tools. When you think about our model, where we're looking to scale to thousands of nodes, we're talking about thousands of certificates: SSL certs per box, per controller node, per client. They all have to communicate with each other. You're eventually going to run into PKI management and key management issues. You need to make sure you have the ability to expire those certs, manage them and revoke them. And, as with the signing module, are you going to have a CA distribution model internal to your company? These are all things you have to consider when you're building your security framework around that green box, because these tokens, these signing modules and these certificates are one security sphere you're using to protect that specific service API ring.
The next topic I want to cover, another layer of the security sphere you need to focus on, is the fact that this is a very distributed policy model. One thing I realized was that the policy files themselves are distributed per service. For every controller process, you have a different policy file to manage. At some point, you're going to need to figure out how to maintain that. For example, if you have four controllers and four Nova processes, you have four policy.json files, and you need to make sure all of them stay in sync as you add roles and change your policy, because this is yet another attack vector being added to that green software-defined API. I'll give you an example.
One thing I tried on our systems was to create a role called read-only. I defined read-only on Cinder as a true read-only role, but you can define it on Nova as a full admin role, because the name read-only doesn't actually mean anything to the system. Neither does member, nor admin. These are just names that Keystone knows need to be passed within the token. Let me go to the next slide.
Basically, this is how it works. Keystone has the list of all your roles. Every role within your system is defined within Keystone. But the definition of what those roles can do has been purposely distributed to every single service. Think about what this means for that green box. Your software-defined API, a complete control plane, is distributed across every one of your service nodes, and across every service within each service node. So now you potentially have many, many places where you have to define your security policy. This gives the system great flexibility, which is wonderful, but it also adds a threat vector: if you lose control of one service node, you've potentially lost control of your entire system. So Puppet, or any configuration management tool, is critical for managing your policy files within OpenStack. Again, it's just another sphere you need to add around that green box.
The other thing we realized was upgrades. As you upgrade the system, they're going to add capabilities and features, which means you're going to be touching this file. This is not going to be a static file within your infrastructure.
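One way to picture the synchronization problem with distributed policy files is a drift check: fingerprint each controller's copy of a policy and flag any copy that differs. The sketch below is only an illustration under stated assumptions. The controller names, the simplified policy contents and the helper functions are all hypothetical, and a real deployment would read the files from each node (for example through its configuration management tool) rather than from in-memory dictionaries.

```python
import hashlib
import json


def fingerprint(policy: dict) -> str:
    """Hash a policy in canonical form so key order does not matter."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_by_fingerprint(copies: dict) -> dict:
    """Map each distinct fingerprint to the nodes holding that policy."""
    groups = {}
    for node, policy in copies.items():
        groups.setdefault(fingerprint(policy), []).append(node)
    return groups


# Hypothetical, heavily simplified Nova policy (illustrative rules only).
nova_policy = {
    "compute:get_all": "role:read-only or role:admin",
    "compute:delete": "role:admin",
}
# One controller's copy has been changed so that read-only can delete.
changed = dict(nova_policy)
changed["compute:delete"] = "role:read-only or role:admin"

copies = {"ctl1": nova_policy, "ctl2": nova_policy,
          "ctl3": changed, "ctl4": nova_policy}

groups = group_by_fingerprint(copies)
if len(groups) > 1:
    for digest, nodes in groups.items():
        print(digest[:12], sorted(nodes))
```

More than one group means the controllers no longer enforce the same meaning for the same role name, which is exactly the situation the read-only example describes.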
So during the upgrade process, when you add capabilities or specific policies to the roles you've created, you have to be able to synchronize those changes across all of your controller nodes to make sure they have the same security profile.
Okay, and this is the one I have the most heartache about, being historically a traditional enterprise architect: audit and compliance. The OpenStack platform I examined made it very difficult to track an actual occurrence through the system. I'll give you an example. Where is the data? Let's say you want to audit the system. You're doing a PCI audit, and someone says: great, someone created a virtual machine on your infrastructure. Where is the data? I want to see who issued the command. What did the system do? What are all the parts it touched? And how are you going to prove that it did only that? That's what audit is about.
There are three major places to look. There's the message queue: data is placed on the message queue, and processes take action off the message queue. You need to make sure that the message queue logs, the message queue data, are exactly what they were supposed to be. The next place is the log files themselves. Each process has one, and each one has data that you need. And the last place is the database. So when you're trying to audit the system, there are many places to look.
But let's assume you can get all that data. Let's assume you've pulled it into Splunk or your tool of choice. Now you have to somehow figure out how to correlate it all, because this is a disparate data system. First, role-to-policy validation.
As I said earlier, if the role was named read-only, did the user take only read-only actions? Because the system is distributed, you have to examine that through the entire system. Code patching and upgrade versioning: where is that held? How do you know no one changed the code while the system was running? Then there's your user information and roles. And obviously you need to audit where the keys are and who used them, as well as standard IT management functions.
Now I want to go through an example of what it means to audit something. Let's be a little practical: say you want to boot a virtual machine. What do you have to look at? The very first thing the system does is authenticate against Keystone. So there are two things you need: the Keystone log, and the database, to check that the token that was issued is in the database and is still there. Great. Now your Nova service has its token. Next, Nova takes an action and runs through a series of commands. So you have to check the logs, the message queue it used to get the other parts of Nova to work, and the database, because it writes status to the database. After that, it needs to load the image, so it calls Glance. Now you have another log file to look at and another database to check, to make sure it booted the right image. Now it has to build the network for you, so it calls Neutron. So you have another log, another message queue and another database to validate. Then it makes changes on the host to the OVS controller, and it makes changes to KVM, depending on what you're doing.
And lastly, if you used Horizon to kick this whole thing off, you have the Horizon log file as well as its database. Now imagine the auditor asks you: prove to me that someone logged into Horizon, booted a VM, and did only what they said they were supposed to do. It's not so simple. The data is there, absolutely. But you have to figure out a way to take all this data, present it, and then show compliance for the action that was actually taken. OpenStack is wonderful: it's a set of distributed projects that are all independent, and that drives a lot of innovation. It doesn't necessarily make it easier for an auditor to audit the system, to validate that every action that was taken is actually recorded and was not changed.
So how do you fix it? You basically have to get a tool that has the ability to interact with all of these sources, and you have to make sure, when you're building your system, that the data correlates. You want to put hints into host names, user names or things like that, so they're searchable. You want to make sure your database schema is loaded. You want to make sure your message queue names are correct. This will help you tremendously when you're trying to produce something for your audit team, for PCI or any other audit.
Okay, I guess I'm going pretty quickly through this, since I didn't think I'd have enough time. What I want to chat about next are additional topics we need to look at in the infrastructure. The first is the message queue.
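The correlation step from the audit walk-through (logs, message queues and databases for Keystone, Nova, Glance and so on) can be sketched as grouping records from every source by a shared identifier and checking which expected sources are missing. This is a toy illustration, not a real log parser: the record fields, the `request_id` key, the source labels and the sample events are all assumptions, and whether your deployment carries a common identifier across services is something you would have to verify or arrange.

```python
from collections import defaultdict


def build_trails(records, key="request_id"):
    """Group audit records by a shared identifier, ordered by timestamp."""
    trails = defaultdict(list)
    for record in records:
        trails[record[key]].append(record)
    for trail in trails.values():
        trail.sort(key=lambda r: r["ts"])
    return dict(trails)


def missing_sources(trail, expected):
    """Return the expected sources that left no record in this trail."""
    seen = {record["source"] for record in trail}
    return sorted(set(expected) - seen)


# Hypothetical records already extracted from logs, queues and databases.
records = [
    {"source": "nova-log", "ts": 3, "request_id": "req-1", "event": "boot requested"},
    {"source": "keystone-log", "ts": 1, "request_id": "req-1", "event": "token issued"},
    {"source": "keystone-db", "ts": 2, "request_id": "req-1", "event": "token stored"},
    {"source": "glance-log", "ts": 5, "request_id": "req-1", "event": "image fetched"},
    {"source": "nova-queue", "ts": 4, "request_id": "req-1", "event": "run_instance cast"},
    {"source": "nova-log", "ts": 9, "request_id": "req-2", "event": "list servers"},
]

expected = ["keystone-log", "keystone-db", "nova-log", "nova-queue",
            "nova-db", "glance-log"]

trails = build_trails(records)
for request_id, trail in trails.items():
    print(request_id, [r["event"] for r in trail],
          "missing:", missing_sources(trail, expected))
```

A trail with missing sources is not proof of tampering, but it is the kind of gap an auditor will ask about, which is why searchable hints in host names, user names and queue names pay off.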
The message queue has a lot of information in it. In fact, it's a critical component that drives your entire infrastructure. How do you do signed messaging? One of the things I noticed was that if I own the message bus, I can inject anything onto it. There doesn't seem to be any authentication of the messages themselves when the clients or the servers read them. If I subscribe to a queue, I'm assuming things on that queue are mine. When I take items off the queue, I'm going to execute them. Signing also helps with audit and compliance. If you have 1,000 endpoint nodes subscribing to the Nova message queue, it's really useful to be able to say: this is who I am, and this is who I'm trying to talk to. I put the message down, and the receiver does some type of signature validation before pulling it off. This helps validate the authorization of the message queue itself. SSL is great, and you need to protect the protocol, but you also need to be able to validate the messages within the protocol.
The second one is the database itself. What I'm going to focus on next is probably how to encrypt critical database columns and yet still maintain performance. You have to have a balance. There's so much information in the database, again because it drives that green box, that software-defined API. All your authorization, all of your tokens, all of your usernames are in that database. You want to make sure that, on some level, you have some layer of protection against access to that database, or in case it does get accessed, or in case you have to make a backup, for example, and send it to tape.
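Returning to the signed-messaging point above, the idea of "this is who I am, this is who I'm talking to, and the receiver validates before acting" can be sketched with a keyed hash. Note the substitution: this uses a shared-secret HMAC from the Python standard library rather than the PKI signatures the talk favors, purely to keep the example self-contained. The envelope layout, the function names, the node names and the key are all hypothetical, and none of this is an existing OpenStack or RabbitMQ feature.

```python
import hashlib
import hmac
import json


def sign_message(key: bytes, sender: str, recipient: str, body: dict) -> dict:
    """Wrap a message in an envelope carrying an HMAC-SHA256 tag."""
    payload = json.dumps(
        {"sender": sender, "recipient": recipient, "body": body},
        sort_keys=True,
    )
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_message(key: bytes, envelope: dict, expected_recipient: str):
    """Return the decoded message if the tag and recipient check out, else None."""
    expected = hmac.new(
        key, envelope["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, envelope["signature"]):
        return None  # payload was altered or signed with a different key
    message = json.loads(envelope["payload"])
    if message["recipient"] != expected_recipient:
        return None  # valid signature, but not addressed to this consumer
    return message


key = b"demo-key-not-for-production"  # illustrative shared secret
envelope = sign_message(key, "nova-scheduler", "compute-0042",
                        {"method": "run_instance"})

accepted = verify_message(key, envelope, "compute-0042")

# Simulate someone on the bus rewriting the message body.
forged = dict(envelope)
forged["payload"] = envelope["payload"].replace("run_instance",
                                                "terminate_instance")
rejected = verify_message(key, forged, "compute-0042")

print(accepted is not None, rejected is None)
```

With a shared key, any holder of the key can forge messages, so this only shows the validation flow; asymmetric signatures with per-node keys would be needed to attribute a message to one specific sender.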
Database data moves: you need to do DR, you need to do cloning, you need to do backups.
Another critical one, as I explained earlier, is certificate management. When you're talking about scale, you want thousands and thousands of endpoint nodes, and then you want to implement SSL across all of them. Without a strong certificate management policy, you will struggle. Here's a specific example. Say you have an infrastructure of 1,000 nodes with six processes each, each process has a separate SSL cert, and each certificate expires in 12 months. Every 12 months, you're going to be reissuing thousands of certs. You have to manage that. Or you can make the certs good for five years, which makes it a little easier. But at any point, you still have thousands of certificates to manage. You want to find a tool that allows you to do that.
One more thing we need to focus on is policy distribution. When you load a distribution, say from Canonical or any other vendor, it comes with a default policy file. When you're laying this down, you probably want to check that policy file to make sure the distribution you got has the profile you're looking for. Because, remember, the actual security policy is loaded at boot time. So unless you're going to go through the effort of creating the policy and laying it down after you install the distribution, you want to know you got the right one.
The second thing is that none of the distributions I saw were code signed. Everyone pulls from GitHub, which is great.
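The certificate arithmetic in the example above (1,000 nodes, six processes per node, one certificate per process) is easy to work out explicitly. The sketch below uses only the figures from the talk; the function names are made up for illustration, and the steady-state renewal rate assumes expirations are spread evenly over the validity period.

```python
def total_certs(nodes: int, processes_per_node: int) -> int:
    """One certificate per process on every node."""
    return nodes * processes_per_node


def renewals_per_year(cert_count: int, validity_months: int) -> float:
    """Average renewals per year, assuming evenly spread expirations."""
    return cert_count * 12 / validity_months


certs = total_certs(1000, 6)
for months in (12, 60):
    per_year = renewals_per_year(certs, months)
    print(f"{months}-month validity: {certs} live certs, "
          f"{per_year:.0f} renewals/year, {per_year / 365:.1f} per day")
```

Either way there are 6,000 live certificates to track. Stretching validity from 12 to 60 months cuts the renewal rate by a factor of five, but it does not reduce the inventory you have to manage and be able to revoke.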
You're downloading all this code and laying it down in your cloud from GitHub, Canonical, Red Hat, or wherever you get your distribution. I didn't see any of them code signed. If I'm going to lay down my platform, I would be much more comfortable if there were some type of validation of the actual Python code being laid down. I think this is somewhat optional, but I would look into it. And then there's the one I talked about earlier: you have to synchronize all policy changes across your distributed security platform. It's very critical that you do that.
The last one is authentication, and this is really focused on Horizon. Horizon does not support two-factor authentication, at least for now. Remember that green box that drives your entire infrastructure. It makes changes to your compute, your network, your storage. It can change routes, ACLs and firewall policies. If you're going to use Horizon to do that, you may want to protect it a little better, to validate the users who come in and make changes. And that's pretty much it. Are there any questions?
The question was: out of all of these, which do I think is the most important? Personally, for me, it's the message queue. If you own the message queue within this infrastructure, you can inject messages onto that message bus. So whether you choose ZeroMQ, RabbitMQ or Qpid, you want to look for the capability to protect that message queue, because it's part of your audit trail and part of the control plane for your entire cloud. So I'd say that's probably the first one. After that, and this is kind of up to you and how you feel, I'm a PKI guy. I want to use PKI everywhere. I want to sign everything. I want to encrypt everything. That's me.
So I think it's important that you have an HSM or a PKI model that you go after. And after that, I think they're all about even. Yes?
Since we have a captive audience here: there is an OpenStack Security Group that is working hard to play in all these spaces and make these things better throughout OpenStack. We also recently published the OpenStack Security Guide, which is a couple hundred pages of very tactical information about how to make this stuff happen on clouds that you're deploying. So those are good resources to check out if you're interested.
Yeah, and to add to that, I read the security guide. It's a wonderful document. It focuses a lot on the host controls, and it talks about how to secure Linux and KVM. Again, these are security spheres you add around your software-defined API, because you need to expose this API to your users. That's what they want. They want elasticity. They want the API. They want to be able to control the infrastructure on their own. So you have to add additional security at the host level to make sure you can protect them from themselves. And you should actually join; it's a very good group. I've talked to a few people in it.
Well, okay, that's a trick question. The question was: in terms of two-factor auth, which ones was I thinking of? I'm from Symantec, so we have a Symantec two-factor auth solution. You don't have to use it. You can use OATH or any of them. But obviously, Symantec has a VIP product, and we're looking at integrating it relatively soon. To be fair, if you're looking for something open source, there are a lot of OATH-compatible products you can look at that will generate tokens for you.
If you want an enterprise-class supported product, you can look at RSA or at Symantec's VIP product. They are both very good.
Sure, we can talk about the Symantec solution. Symantec has a product called VIP. You actually get the token for free: you can download the VIP app on your iPhone or Android, and it generates tokens. But in order to make use of the token, you need the OTP engine, and that's where you have to subscribe to Symantec to get the OTP engine that your service will talk to. If you want, we can talk after. I can show you the architecture for VIP; I helped design it. But yes, there are many solutions here.
Again, what I really wanted to stress in the messaging here is that the world has changed. Traditionally, you have this vertical model. You have separate firewall managers, separate network switch controllers, separate managers for router ACLs, separate load balancer managers, and they're disparate. For someone to attack the system thoroughly, they have to go through all of these components, all of these device managers. They have to start owning your system slowly over time. OpenStack gives you wonderful elasticity, capabilities and features. I love it, by the way, and we're going to use it. So just to be clear, this is not a bash on OpenStack. We want to use it. But we need to bring it up to the point where it at least gets closer to the traditional enterprise security models, audit, compliance and other security capabilities that we, our customers and our auditors are used to. Anything else? Anyone else?
Great question. Our go-forward plan: if you think about it, we have to continue to review every release that goes out.
As a user, you should do this, whether you're a code contributor or not. You have to make sure that, for every installation that comes out, you're doing these kinds of reviews. You want to find out what capabilities and what kinds of roles are being added. So we need to create, internally, some type of security review board for every product and every release that comes out. We have to make sure that the capabilities and features we're turning on are there. And as I said, we have to start creating these additional security spheres, whether at the host level or at the application level, in order to protect these software APIs. Okay? Well, thank you very much.