All right. Good morning, OpenStack, and thanks for coming to our presentation on Barbican and Vault security models. My name is Dave McCowan, I work for Cisco on OpenStack security, and I'm the current PTL of the Barbican project.

My name is Douglas Mendizábal. I'm the former PTL of Barbican, and I recently spent about four years at Rackspace working on Barbican exclusively.

All right, so this morning I'm going to start with a brief introduction to secret management and why you need it. Then we'll kick off a Barbican and Vault comparison, draw some conclusions, and make some forward-looking recommendations about how you might use secret management in your clouds.

So to start off, what are secrets? They're things you don't want other people to know about. The most obvious ones are your passphrases or passwords. You don't want people to know your Gmail password or your login for work, because if people know your password, they can impersonate you. Some other secrets that you maybe don't think of every day are SSH keys, which allow you passwordless logins to servers. Encryption keys are important too. If your data is important enough to encrypt, you want to keep those encryption keys secret from prying eyes as well. Then there are private RSA keys and X.509 certificates. If people can steal these from you, they can impersonate you, so you need to keep them secret.

So, places to keep your secrets. The most obvious place is your own memory: when you log into your email or into Amazon, you remember your password. That doesn't scale very well when you get up to seven, eight, ten passwords, so you need some help. Maybe some of us use the browser cache to help us remember. If you're a security-minded person, perhaps you realize that keeping all your passwords unencrypted on your laptop is not a good idea. So then you look for a password manager.
Password Corral, KeePass, LastPass, or other password managers can remember all your passwords, encrypt them, and store them securely. The problem is that this works great for humans, but not so great for services. If you're an OpenStack service like Nova, perhaps, you need something a little more robust. That's where full secret management solutions come in, and that's what we're going to be talking about today: comparing Vault and Barbican, which are each full secret management solutions.

As you evaluate a key management or secret management solution, these are some things you want to keep in mind. You want to keep in mind how the solution implements access policy: who or what has access to read secrets, delete secrets, or modify secrets. You want to keep a log. If more than one user is able to access a secret, you want to know which particular user or service pulled those secrets out of the solution, and when. And you want to make sure those logs cannot be faked. Compliance may be something you're thinking about. Whether it's FIPS, HIPAA, or PCI, your key management solution needs to follow whatever compliance standards are set for your particular needs. And of course, you need to think about security. For the secrets themselves, you want to know where they are stored and how they are protected. Usually they're protected by encryption, which means you have some sort of root key or master key, and that also needs to be protected. So think about that when you're evaluating key management solutions.

Some other considerations: availability. You'll need your secrets to do whatever workload you're doing, so if your key management system goes down, maybe your whole system goes down. You want to make sure you have a highly available solution. You need to make sure that it scales to the number of secrets that you intend to use. You need to make sure it has integrity.
You don't want people to be able to modify or otherwise tamper with your secrets. And you want your management system to be durable. You don't want a hard drive crash to lose all your secrets. Then you have more practical considerations too: how easy is it to use, how much does it cost, and is it compatible with what I'm already doing?

With these considerations in mind, we'll begin our comparison of Barbican and Vault. They're both secret management solutions, and they both present a RESTful API that allows you to operate on your secrets. In either case, you'd send an HTTP GET to retrieve your secret, passing along the appropriate authentication token. They both have a pluggable backend architecture, so they're both flexible, with different secret store backends, different authentication backends, and different logging backends. We'll talk about that later in the presentation. They're both open source tools, and we're all big fans of that at this conference. Barbican was built specifically for OpenStack, whereas Vault was built to be more general purpose. Barbican's been around since the Havana release, so it's four years old now. Vault is fairly new. It's only been around for two years. They're both easy to install for evaluation. For Barbican, one enable_plugin line for Barbican in DevStack will get you up and running. Vault is pretty easy too. It's a git clone and a make install, and you're good to go with Vault. Both of those are evaluation modes and probably not secure enough for production, and both projects have options to go from there. So now we'll dig a little bit deeper and do some threat modeling for each of these solutions.

Thank you, Dave. So one of the things that we wanted to do for this is talk a little bit about the threat model that Vault presents in their documentation. We're both Barbican experts, but maybe not so much Vault experts. One of the first things that they mention is eavesdropping, right?
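
To make the eavesdropping concern concrete, here is roughly what a client puts on the wire in either system: an HTTP GET plus a token header. This is a minimal sketch that only builds the requests and never sends them. The host names, ports, secret ID, secret path, and tokens are placeholders, and the Vault path assumes a key/value store mounted at `secret/`.

```python
from urllib.request import Request


def barbican_get_payload(base_url: str, secret_id: str, keystone_token: str) -> Request:
    """Build (but do not send) a request for a Barbican secret payload."""
    return Request(
        f"{base_url}/v1/secrets/{secret_id}/payload",
        headers={"X-Auth-Token": keystone_token, "Accept": "text/plain"},
        method="GET",
    )


def vault_get_secret(base_url: str, path: str, vault_token: str) -> Request:
    """Build (but do not send) a request for a Vault secret.

    Assumes a key/value store mounted at secret/ (an assumption of this sketch).
    """
    return Request(
        f"{base_url}/v1/secret/{path}",
        headers={"X-Vault-Token": vault_token},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder endpoints and credentials, for illustration only.
    requests = (
        barbican_get_payload("https://barbican.example.test:9311", "SECRET-UUID", "KEYSTONE-TOKEN"),
        vault_get_secret("https://vault.example.test:8200", "myapp/db-password", "VAULT-TOKEN"),
    )
    for req in requests:
        print(req.get_method(), req.full_url)
```

In both cases the token in the header is itself a secret, and so is the response body, which is why the transport has to be protected.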
Both of these systems have to be secure from eavesdropping, especially in the client-to-server communications. The way that Vault achieves this is with TLS connections between the client and Vault itself. Then, depending on the backends you have configured, some of them will use TLS for the Vault server to talk to the backend. With some backends, like the file storage, you can't do that, because it's just system calls. Barbican has a slightly more complex architecture, but we still use TLS everywhere between all the components to secure basically all the communications between them. Another option here would be to put the Barbican API on the front end with a TLS connection and put everything else on a segregated network. That would be another good option. And with Barbican, just like with Vault, you're going to have to consider how to secure those communications depending on what backend you have configured. The example I have here on this diagram is using a PKCS #11 HSM. The ones that we were using at Rackspace use Network Trust Link, which is NTLS, to secure those communications.

Next we want to look at tampering. We want to be able to detect tampering in both data in transit and data at rest. Both systems should be concerned about being able to detect if somebody's trying to mess with your bits. Like I mentioned, they both use TLS, so that's typically how we take care of securing the data in transit. For data at rest, what Vault does is use AES in GCM mode for encryption on absolutely everything they store, and this includes configuration and policies. So if anything changes in whatever storage backend you have configured, then because of that GCM mode, decryption will let you detect whether the data has been modified since it was stored. We use the exact same encryption mode in Barbican. We use AES-GCM to store the payload of the secrets being stored.
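
The detect-on-read idea can be sketched in a few lines. Python's standard library has no AES-GCM, so this sketch swaps in an HMAC-SHA256 tag to play the role of GCM's authentication tag. It shows the integrity check only, does no encryption, and is not how either project stores data. The function names are made up for illustration.

```python
import hashlib
import hmac
import secrets

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(key: bytes, record: bytes) -> bytes:
    """Append an authentication tag to the record before it is stored."""
    tag = hmac.new(key, record, hashlib.sha256).digest()
    return record + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Return the record, or raise ValueError if the stored blob was changed."""
    if len(blob) < TAG_LEN:
        raise ValueError("stored blob is too short to contain a tag")
    record, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, record, hashlib.sha256).digest()
    # Constant-time comparison, so the check itself does not leak information.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("stored data was modified")
    return record


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    blob = seal(key, b"policy: read-only")
    # Flip one bit, as an attacker with write access to the storage might.
    tampered = bytes([blob[0] ^ 0x01]) + blob[1:]
    try:
        open_sealed(key, tampered)
    except ValueError as exc:
        print("rejected:", exc)
```

With a real AEAD mode such as GCM, the same check happens as part of decryption, so a modified ciphertext simply fails to decrypt.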
However, we do store a lot of data in plain text in the database. At the last summit, the security team actually helped us find this defect. We have filed this as bug 1637115, so if anybody wants to help us fix that, we'd love some help.

The next thing we want to be concerned about is being able to access this data and be sure that whoever's accessing it has the correct authorization to do that. Any requests that come into the system should be subject to any policies that you define. The way Vault does this is they have a pluggable authentication system, so there's a whole ton of different choices. If your organization is using LDAP, for example, they have an LDAP plugin that'll let you use that system to authenticate users to Vault. It's also got some built-in systems. If you don't have an external identity provider, you can generate tokens. That's sort of the default mode in which you evaluate Vault. You can also create usernames and passwords in Vault. And then the policy is configured entirely via the API. So once you start storing secrets, you start creating policies for them using the Vault API.

Barbican, on the other hand, depends on Keystone for authentication and authorization. From our point of view, every secret is owned by the Keystone project that stored it. This is important because any user that has the appropriate role on a project will be able to gain access to its secrets. We do have several different roles that allow different levels of access. With some roles, you can only read metadata about the secrets. With some roles, you can actually get the secrets, delete them, and all that. We did find some limitations as far as RBAC with Keystone. So an additional policy mechanism that we have is an ACL API, with which you can grant more fine-grained access to your secrets.

The next thing we want to worry about is being able to track who is accessing our data.
Every request should be logged. The way that we achieve this in Barbican is that we have a middleware that can emit CADF events for any and all API accesses. We also use Oslo logging throughout, so that everything that's happening in the API and the worker processes is being logged. And because it's Oslo logging, you can point those logs to wherever you want to keep them. Vault, on the other hand, has a configurable logging backend system as well, so you have a few choices there. The simplest one would be to just write everything to a file. They've also got a syslog backend that can write to a syslog server.

The next concern is the availability of your secret material, right? If you're storing everything in your system, you're depending on it to be able to retrieve passwords and such. You want those to be always available. Now, we designed Barbican to be a distributed system that's highly scalable. If you have a load balancer in front of your API processes, you will be able to scale those horizontally to handle any request load that you may have. Worker processes can also be scaled horizontally. If you start seeing your queue fill up, you can spin up more worker processes and start processing it faster. We use a SQL database through oslo.db, so you have several options there for high availability for your databases. Two examples here would be Galera if you're using something like MariaDB, or pgpool-II if you're using something like Postgres. We also have a queue in Barbican. Typically we use RabbitMQ, and there are some HA options there, like clustering and durable queues and such. And then depending on the backend that you choose, some backends will have high availability options. For example, if you run HSMs, there are some HSMs that are able to duplicate data between them in a secure manner.

Another concern is replication, right? What happens if there's a total disaster?
As far as Barbican is concerned, that's an out-of-band process, so you should be able to replicate your SQL database or replicate your HSM to an external store.

Vault is a little bit different in the way that they architect things. Vault does not scale horizontally. It's basically just a single process. What they do is defer scalability to whatever storage backend you choose. For example, they've got four listed, as of now, that do provide high availability: Consul, etcd, DynamoDB, or ZooKeeper. That's where your storage would be done, and that's where you would have to scale out to get Vault to be more scalable. They do have options for hot standby, so if your server fails, you can have another Vault waiting to take over as master. The documentation is not really clear on whether that happens automatically or not, so if somebody knows, come talk to me afterwards. They do offer replication, but it's only available in the enterprise edition. My budget for this talk was not the price of the enterprise edition, so unfortunately I didn't get to play with that.

The last thing we want to worry about is the confidentiality of stored secrets, right? In practice, this means everything's going to get encrypted. But like Dave mentioned, where that master key lives and how the encryption happens is pretty important to the security of the system. The first example I want to go through is Barbican with the simple crypto plugin and a database adapter. The way Barbican works when you configure it this way is that we have a master key that's sitting in the configuration file. That's not very secure, but there are ways to limit who has access to that key. Typically you would have a configuration management system use its key store to store that key, and then as you provision API nodes, you would write that key into the configuration file.
What the simple crypto plugin does is load that key from the configuration into memory and then use it to encrypt all the secrets that are coming through. The encrypted bits are going to end up in your database.

In the out-of-the-box configuration for Vault, they have a very, I'd say, complex way of doing this stuff. They use a technique called Shamir's Secret Sharing Scheme. This is an algorithm that is able to take secret bits and split them into a configurable number of shards. The way the algorithm works is you configure a threshold and a number of shards for it to give you. So you could say, make me five shards with a threshold of three. What that means is that any three of the five shards it creates are enough to recreate the key. One of the things they're trying to address is not having any single person be able to get a hold of that key. So when Vault spins up, it doesn't have a master key in memory. It's waiting for those shards to be submitted via the API. Once it has enough shards, it'll recreate the shared key, retrieve the encrypted master key from the storage, decrypt that, and hold it in memory. From that point forward, it can start encrypting and decrypting things for you. One of the other concerns is that when you provision these shards, the person doing the provisioning ceremony may have access to every single shard, which sort of defeats the purpose. To work around that, Vault is able to use GPG keys. You provide a public GPG key for every shard that's being created, so that only the owner of that GPG key is able to decrypt their shard and hang on to it.

Now, the scheme is really great. The only problem with this configuration is that the master key is still sitting in memory, and if somebody has root access to the box that's running this process, they'll be able to dump the memory and get that master key out anyway.
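
The five-shards, threshold-of-three idea can be sketched with textbook Shamir secret sharing over a prime field. This is an illustration under stated assumptions, not Vault's code: the prime, the function names, and the integer encoding of the key are all choices made for this sketch, and Vault's own implementation differs in detail.

```python
import secrets

# A Mersenne prime; the secret must be an integer smaller than this.
PRIME = 2**127 - 1


def split(secret: int, shares: int, threshold: int) -> list:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points: list) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbelow(PRIME)
    shards = split(key, shares=5, threshold=3)
    # Any three shards are enough; which three does not matter.
    print(combine([shards[0], shards[2], shards[4]]) == key)
```

Fewer than three shards give an interpolation result that is unrelated to the key, which is what keeps any single shard holder from reconstructing it alone.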
So in my view, both of these configurations are pretty equal, right? At the end of the day, you end up with a master key in memory.

A little bit better security that we can have is using an HSM. This is what Barbican looks like when you configure it with a PKCS #11 HSM. We use the HSM to generate and hold the master key in a manner that is not extractable. When secrets come into the PKCS #11 crypto plugin, we retrieve an encrypted project key. We provision one key per project, and it is encrypted with that master key. We load that into the HSM, decrypt it in memory there, and from that point use that project key to encrypt all the project's secrets. The project keys never leave the HSM in an unencrypted manner. That's how we're able to leverage the HSM for security. Vault also supports using an HSM, but unfortunately that also requires the enterprise edition. And like I mentioned, that's a little bit out of my budget. I think I heard somewhere it's about $150,000 for an enterprise edition of Vault.

So looking forward, we think that there are maybe ways that we can use both systems. One of the ways we are thinking about is that we have a patch in progress to create a driver for Barbican that can use Vault as a backend. The main obstacle for this is not having a whole lot of time to work on it, so if anybody wants to contribute to help make this a reality, then come talk to us afterwards. Another interesting possibility is being able to create an authentication plugin for Vault. Because Vault has a pluggable auth architecture, the idea is to create an auth plugin that understands Keystone tokens and is able to map Keystone roles to the policies that you define in Vault. There are some challenges there, and as far as I know, nobody's really signing up to develop this.
But if we had something like that, then maybe we could use Castellan, which is sort of like an Oslo key manager library, to have projects talk to Vault directly instead of going through Barbican.

So the question that I think most people came here for is: which one should I use if I'm not using one already? We believe that it really comes down to one question, which is: who is going to be using this key management system? If the only people using it are the deployers, your deployment team, and they're only using it to store secrets that are used in the control plane, then only having Vault may work for you. But if you want the users of your OpenStack cloud to have access to the KMS, and to be able to do that using the same Keystone credentials that they use for every other system, then you should probably consider deploying Barbican.

Oh, I don't have a question slide. So that's the end of our slide deck. Do you guys have any questions for us? We've got a couple of mics if you want to come up and ask us something. I think we nailed it. Oh, we got a question. Awesome.

So in your comparison, did you do anything in the way of looking at performance of the two systems? Would one perform better than the other? Are they similar?

No. In both systems, performance is really going to depend on what backend you have. Barbican is the one that I've done the most performance testing on. The simple crypto plugin, because it's all in memory, is pretty fast. When we had a large deployment of Barbican at Rackspace, our bottleneck was the HSM. We were able to basically saturate the processing pipeline on it. Because of the way we're doing encryption, I think we maxed out at about 50 requests per second, which is really slow. For Vault, my understanding is that whatever backend you have is going to be your bottleneck. So if you have something like Consul or etcd, it should be pretty fast.

I think we're done then.
I don't know how we're doing on time, but if there are no more questions, then thank you guys, and we'll be around here if you think of something later on.