Thanks for coming. This is going to be an introduction to Vault. I've already had a few questions, so to be clear: we're talking about the HashiCorp product called Vault. Before I start talking about Vault, I want to take a step back and talk a little about distributed systems, and specifically about configuration of distributed systems. In the last 20 years, the way the web works has changed from end to end. Hardware has changed, software has changed, networking has changed, the service providers we use have changed. And therefore the way that we create applications has changed, the way that we deploy those applications has changed, and the way that we need to configure those applications at runtime has also changed. It's all changed drastically. So if we look at the ways we still use today to configure applications at runtime, there are three big categories I want to quickly touch on. We'll start with configuration management. In configuration management, when you want to deploy a server, or deploy an application that runs on a server, you have a central system, or a central set of rules, that can work with either push or pull, and all the rules for the server and all the configuration of the server and the application get synced from that central storage place to the server. Examples of those kinds of tools are Ansible, Chef, and Puppet. There are a lot of others, but those are three of the big ones. It has advantages and disadvantages. The big disadvantage is that not every system today plays well with it, especially the lean, mean stuff, all the serverless and container stuff, because these systems were made to administer servers, or at least operating systems. And it doesn't work so well.
But the idea is that when this system runs, it will either write a local file or populate your environment or whatever you want it to do, and when your application starts up it reads whatever was written by the configuration management system, and it works. Another way of doing it, a very old-school way, is a shared file system. The classic example of that is NFS. It might be the one thing that hasn't changed in the last 20 years: there are still applications that do this. And I put S3 in between the KV data stores and the shared file systems, because it's really a KV data store but most people consider it a shared file system for some reason, so I left it in the middle. I'm going to come back to the shared file systems and what's wrong with them; first let's look at the KV data stores. With a KV data store you don't even have files. There's no idea of a static file, nothing that's actually being read from a disk per se; rather, your application, when it starts up, connects to some remote database, usually an in-memory database, so it's optimized to be really, really low latency. Examples of that are Redis, or etcd, or Consul, which is also a HashiCorp product. A KV (key-value) data store is exactly what it says: instead of reading a configuration file, you go over a set of keys that the application knows up front, or possibly just asks for as each one is needed for the first time, and it says: hey, what's the database host? What's the database port? What's the database user? What's the database password? And for each key it gets an answer back from the system. The big advantage of these KV data stores is that they're optimized to be low latency. They're very much built for massively distributed applications. They know how to do replication.
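To make that startup pattern concrete, here is a minimal sketch in Python. A plain dict stands in for the remote KV store (Redis, etcd, Consul); the key names are made up for illustration:

```python
# A minimal sketch of the startup pattern described above: instead of
# reading a config file, the application asks a remote KV store for each
# key it knows about. A plain dict stands in for Redis/etcd/Consul here,
# and the key names are hypothetical.

KV_STORE = {
    "db/host": "db.internal",
    "db/port": "5432",
    "db/user": "app",
    "db/password": "hunter2",  # storing this in plain text is the problem!
}

def get_config(key: str) -> str:
    """Fetch one key, as the app would over the network."""
    value = KV_STORE.get(key)
    if value is None:
        raise KeyError(f"missing config key: {key}")
    return value

# The application asks for each key it knows about, one by one.
db_config = {k.split("/")[1]: get_config(k)
             for k in ("db/host", "db/port", "db/user", "db/password")}
print(db_config["host"])  # -> db.internal
```

Note the last key in that dict: the moment credentials go into the same store as plain configuration, all the security problems discussed next apply.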
They know how to do high availability. One of the classic examples of problems, or anti-patterns, in doing remote configuration, and this is what I was talking about earlier with shared file systems, is having a single point of truth for all of your configuration, like an NFS server, that you need to read at startup. If your application starts up and can't access it, because in that second it was down or unresponsive or whatnot, you can get into a loop: you end up going nowhere, and applications can cycle and go out of control. It's a big problem. When these KV data stores were built, in the last five years or so for the most part, we were already at the cusp of virtualization. We already had these massive public clouds where you could have guest operating systems, virtualized systems massively scattered all over the world, and applications could go up in one and down in another and up in a third. These KV data stores are really tooled for that. And they continue to meet the needs today, even with serverless, even with containers, even as things get leaner and leaner and spread all over. But you have other problems. You have security problems. This is all fine when we're talking about storing plain-text things, configuration things like database hosts, but what happens when we're talking about credentials: usernames, passwords, secret keys? How do we store those? All three categories have their own sets of problems. With the KV data stores, you often have problems in the security model of the KV database itself, in that it's very hard to segment which users can access which keys. So very often everyone will have global read of all the secrets in the system as long as they can connect to it. You also have a problem of security at rest in the backing store. Even if the front end is cached in RAM, you're not going to want to use any system that's not backed by disk, in case this thing goes down.
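One common mitigation for that startup crash-loop is bounded retries with backoff instead of failing immediately. A toy sketch, with the flaky store simulated by a local function:

```python
import random
import time

# Sketch of guarding against the crash-loop anti-pattern described above:
# if the central config store is briefly unreachable at startup, retry a
# bounded number of times with backoff instead of dying and letting the
# supervisor restart-loop the whole application. The store is simulated.

def fetch_config(fetch, attempts: int = 5, base_delay: float = 0.5):
    """`fetch` is whatever call reads config from the remote store."""
    for attempt in range(attempts):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # still give up eventually, but not on one blip
            # exponential backoff with jitter before the next try
            time.sleep(base_delay * 2**attempt * random.uniform(0.5, 1.0))

calls = {"n": 0}
def flaky_store():
    calls["n"] += 1
    if calls["n"] < 3:  # the store is down for the first two reads
        raise ConnectionError("config store unreachable")
    return {"db/host": "db.internal"}

config = fetch_config(flaky_store, base_delay=0.01)
print(config)  # -> {'db/host': 'db.internal'}
```

This only softens the problem, of course; the single point of truth is still a single point of failure.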
And then, how is that encrypted? Well, in most cases it's not. The shared file systems have better support for both of those. With shared file systems you have users and groups on NFS; you have the same kind of thing with S3 or some other hosted file system. But it's really a pain in the neck to configure and set up, and you don't always know if you can encrypt the back end. Configuration management usually takes care of both of those. Configuration management systems know what secrets are: you have encrypted data bags in Chef, and you have Ansible Vault. You have tools for storing the secrets, but you have to be okay with using configuration management to manage everything about your application and to push changes, and not every application today is going to be friendly with that kind of setup. And in all these cases you have other limitations. You have static secrets. That means the secrets you store are accessible and identical for any user or device that wants to access them. So the application database user is usually going to be the same username and password for most of the components in your application. And if someone hacks into your database and you see that it was the application user, you might even see what IP it came from, but you still don't really know how the password or username was leaked. When you're trying to do a post-mortem of a problem, you're going to be in trouble. So with that small introduction, let's introduce HashiCorp Vault. Their one-liner is that Vault is, I quote, "a tool to securely access secrets." Vault has two editions. There's an open source edition, which is what I'm mainly going to focus on, and there's a commercial edition that has a lot of other fun stuff that big enterprises like, if they want to spend money on it. Before we go into how it works, I want to talk about some core concepts. So the first core concept is a secret.
We said that Vault is a tool for managing secrets. I'm getting a new microphone, I'm so excited. Okay, not yet. So what's a secret? A secret is anything you want to tightly control access to, such as API keys, passwords, certificates, and more. Vault, and I'm going to quote again from their website, "provides a unified interface to any secret, while providing tight access control and recording a detailed audit log." That was a long, complicated sentence; let's break it down. Vault provides a unified interface. Vault is itself a REST API. Once you know where your secrets are stored in the URI scheme, the way to access them is the same for any type of secret: it's always going to be a REST call. It's a unified interface to any secret; we're going to talk about the different kinds of secrets that can be stored in Vault a little later. While providing tight access control: access control is a first-class citizen in Vault. Every resource has one or more ACLs protecting it, defining what each type of user can do with it. Ah, the new mic is here, my new hero. So, where was I? Access control. Everything has its own ACL; all is good there. And recording a detailed audit log: for all the wonderful security people here, you hear "audit log" and you sigh with relief, because you can figure out what happened later. Another core concept I want to talk about is leases. Leases are a very central concept inside Vault. Every entity returned by a Vault call is attached to a lease. A lease has a TTL on it, and it can be extended or revoked. Leases are audited. Auditing, revocation, and lease rolling are first-class citizens in Vault. Leases can also be chained, which means you can have a lease for one kind of data that requests another kind of data, and if the first lease expires, everything underneath it also expires. So, we'll see later.
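The lease mechanics just described, TTLs, renewal, and revocation cascading down a chain, can be sketched in a few lines of Python. This is a toy model of the concept, not Vault's implementation:

```python
import time

class Lease:
    """Toy model of Vault's lease concept: a TTL, renewability, and
    child leases that are revoked when their parent is revoked."""
    def __init__(self, ttl: float, parent: "Lease | None" = None):
        self.expires_at = time.monotonic() + ttl
        self.revoked = False
        self.children = []
        if parent:
            parent.children.append(self)

    def renew(self, ttl: float) -> None:
        """Extend the lease: push the expiry time forward."""
        self.expires_at = time.monotonic() + ttl

    def revoke(self) -> None:
        self.revoked = True
        for child in self.children:  # revocation cascades down the chain
            child.revoke()

    @property
    def valid(self) -> bool:
        return not self.revoked and time.monotonic() < self.expires_at

login = Lease(ttl=3600)                  # e.g. the lease on a login token
db_creds = Lease(ttl=300, parent=login)  # creds requested with that token
login.revoke()
print(db_creds.valid)  # -> False: the chained lease died with its parent
```

The last two lines are exactly the "everything underneath it also expires" behavior from the talk.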
We're going to come back to these concepts and see how they make life easy. Does anyone know what this is? Oh, yeah! A one-time pad. One-time pads go back to, I guess, the late 1800s and early 1900s. Even today, one-time pads are like the holy grail of cryptography, because in theory they're completely random, completely unique, and completely trusted. If you intercept an encrypted message and you don't have the cipher for that specific message, there's nothing you can do with it. And even if you intercept a message and you know the plain text for it, the decryption of it, there's no such thing as a replay attack; you can't do it. The way it works is you have two sets of pads, one to encrypt the data, one to decrypt the data. The pads are physically distributed to the right people, and in practice there are never more than two copies of a cipher on a pad. Therefore, unless you actually get both a message and the pad it corresponds to, you can't break into it. Why am I bringing up one-time pads? The same idea really applies in security and access control, and Vault has this concept of using ephemeral keys rather than credentials. In the classic world we know, if we want to store the username and password of the application user to the database, we store it, even if we know there's only one server in one place that's ever going to need it. In Vault it doesn't work that way. You don't store a static secret that you don't have to store. Rather, you configure Vault: you tell Vault, okay, this is the database server you're going to be talking to, these are the roles on the database server, and this is the access control for each type of role. And when I, as a user, come and request access to the database server, Vault is going to go on my behalf, provision a one-time username and random password, and give me that.
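What "provision a one-time username and random password" means can be sketched like this. The naming scheme below is made up for illustration (Vault's real database backend has its own format), but the idea is the same: every request mints distinct, random credentials:

```python
import secrets
import string

def make_ephemeral_credentials(role: str):
    """Sketch of what a dynamic-credentials backend does on each request:
    mint a unique username and a random password, to be created on the
    database with the role's privileges and deleted when the lease expires.
    The "v-<role>-<suffix>" naming scheme here is hypothetical."""
    suffix = secrets.token_hex(4)
    username = f"v-{role}-{suffix}"
    alphabet = string.ascii_letters + string.digits
    password = "".join(secrets.choice(alphabet) for _ in range(24))
    return username, password

u1, p1 = make_ephemeral_credentials("app")
u2, p2 = make_ephemeral_credentials("app")
print(u1, u2)  # two distinct usernames, even for the same role
```

Because every caller gets its own credentials, an audit trail can tie a leaked credential back to exactly one requester, which fixes the post-mortem problem from earlier.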
And Vault will do as much as possible, based on the back end, to ensure that each of these things can be used once only. For people who are thinking, well, I have a lot of static stuff, like the private key for a certificate that's sitting on my server, and that's not going to help: no worries, Vault does do static stuff too. So we're going to do the first demo, if this works screen-resolution-wise. I set up a whole bunch of Docker containers, and what we're going to do is show how we can log into a PostgreSQL server with Vault. This is using a custom JDBC driver to connect to PostgreSQL; it's an open source fork of the main PostgreSQL driver. At the end you'll have a link to my LinkedIn, where you can find an article with the link to the source code for the driver we're using here. You can see, well, I don't know how much you can see, but it looks kind of like a normal JDBC string, for those who are familiar with it. You have a username and a password, and you have these extra variables here: a Vault host, a Vault username, a Vault auth path, and a Vault database path. We don't need to understand the ins and outs of it, but let's see what happens when we run this. I have two workbenches here. Both of these authenticated using user test, like we saw before, here: user test. If we look at the current user in each, can you guys see the users? We have two different users. I know it's not so clear, but it's clear enough to see that these are two different users, even though we used the exact same connection string. If we look a little closer at the pg_user table, pg_user being the users table in PostgreSQL, we can see that we have three users. There's the default user that I created when I brought up this database, which is postgres, and here are my two temporary users.
And if we look closer at these temporary users, we can see the valid-until time, which by now has expired. If I bump it up, you can see that these have changed: now it's 41:42 and 41:48. And if I wait another ten seconds and refresh, we'll see that the timestamps have moved up, to :58 and :02. Lastly, if I close one of these connections, the one that starts with IW: this takes a little longer, but within the next 20 seconds or so Vault will actually have cleaned up. Now there are only, one, two users left in the table; we've lost the temporary user. What about auditing? We talked about auditing, so, let's see if I can break out of the demo slide. I also brought up, as part of the Docker setup, an Elastic stack, don't ask me why. These are the standard Elastic containers for Filebeat, for Kibana, and for Elasticsearch. Now, let's say I know from some logging that this database user here did some damage to the database. How do I know who that user is? How can I audit that user? Vault has an audit log, and as we said, it's going to log everything. Most of the audit log is human readable, but sensitive things, like usernames and passwords that Vault knows up front are sensitive, aren't going to show up as clear text. They're going to be HMACed. If you only know the HMAC and don't know the clear text, you're out of luck, but if you know the clear text, Vault will help you generate the HMAC you need. So let's look up the HMAC for this username. There's my HMAC; let me close this so it doesn't keep adding entries to our audit log. And now I can find the entries that mention this HMAC. For example, if this demo works as well as all the practice runs, then this is going to be the one that shows the login. Here we can see that the request was to database/creds/postgres; that's the REST endpoint that generated this pair of credentials. I can see that in the response I have the lease ID. Remember we talked about the leases.
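As an aside, the HMAC trick from a moment ago is easy to sketch: the audit log stores a keyed digest of each sensitive value, so knowing the clear text (plus the key, held by Vault) lets you recompute the digest and search for it. The key below is a made-up stand-in, and the log line is a simplified mock:

```python
import hashlib
import hmac

# Sketch of the audit-log HMAC idea: the log stores HMAC-SHA256 digests
# of sensitive values instead of clear text. The key here is a stand-in
# for the audit backend's internal key, which Vault holds.

AUDIT_HMAC_KEY = b"stand-in-for-the-audit-backends-key"

def audit_hmac(cleartext: str) -> str:
    digest = hmac.new(AUDIT_HMAC_KEY, cleartext.encode(), hashlib.sha256)
    return "hmac-sha256:" + digest.hexdigest()

# A mock audit-log entry: the username only appears as its HMAC.
log_line = {"response": {"data": {"username": audit_hmac("v-app-3f2a")}}}

# Knowing the clear text, recompute the digest and grep the log for it.
needle = audit_hmac("v-app-3f2a")
print(needle == log_line["response"]["data"]["username"])  # -> True
```

Going the other way is exactly as hopeless as the talk says: without the clear text, the HMAC tells you nothing.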
This is the thing that can be extended; when it's extended, it tells Postgres to also extend the user, which is what we saw getting extended in the database. There's the username and the password, and we have the client token. Attached to the client token is a lot of metadata; this is the authentication that was used on this request. So we already have a lot of useful information here. But let's say the information we want isn't here. Say we want to know the IP of the identity that generated this client token, the one that logged in. I can look that up in Kibana. Here I can actually see the actual login that was done. I have the username, which is part of the path. I was using userpass, which is just username-and-password authentication, and I can see that the username was test. If I knew what the password was, I could look for all requests that logged in using that password. I have the IP address; I have all the information. I also have the policies, which we'll get back to in a bit, which show what the user can do. I have a lot of rich information here for the case where I need to do a post-mortem and do my homework. This is all very standard stuff in Vault. So, moving along, I want to talk a little about the encryption model. I've been going on and on about how secure Vault is, but we all know these secure systems are only as good as the encryption model that sits behind them. So how does it work? Vault has two key rings. Let's start from the data. Data is encrypted at rest: all data that goes into Vault gets encrypted. That is done using a standard encryption key ring, an internal key ring that Vault manages. There's no way from the outside, through proper channels, to get at the encryption key ring. The encryption key ring is itself stored encrypted under a different key, and that key is called the master key.
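Since the master key just came up: the way Vault protects it is by splitting it with Shamir's secret sharing, which I'll describe in a second. A toy sketch over a small prime field (not Vault's real implementation, which works over larger fields, byte by byte) looks like this:

```python
import secrets

# Toy Shamir secret sharing: split a secret integer into n shares so that
# any k of them reconstruct it, but k-1 reveal nothing about it.

PRIME = 2**127 - 1  # a Mersenne prime, big enough for this demo

def make_shares(secret: int, k: int, n: int):
    # Random polynomial of degree k-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    def f(x: int) -> int:
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    # Lagrange interpolation of the polynomial at x = 0.
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

master_key = secrets.randbelow(PRIME)
shares = make_shares(master_key, k=3, n=5)   # the 3-of-5 setup from the talk
print(recover(shares[:3]) == master_key)     # -> True: any 3 shards suffice
```

Any three of the five shares reconstruct the key; any two are useless, which is why the shards can be handed to five different people.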
The master key is the one piece of information that's never committed to persistent storage. When you set up a new Vault storage back end, you have to initialize the Vault, and part of initializing the Vault is creating this master key. The master key can be sharded. It's sharded using Shamir's algorithm, which is basically a way of saying: take a block of data that's X bytes, and shard it into N pieces such that any K pieces are enough to recreate the entire key. For example, I could set up the sharding so that I make five shards, but any three shards together are enough to give me the master key. What happens when you start the Vault process, the actual process in the operating system, is that the first thing it does is say: I need volunteers to give me the master key. All it knows is how the key was split; you need to start feeding the shards in. The assumption, at least in the open source version of Vault, is that you split it between several different individuals. How you manage this is really up to you. In general you're going to want more than one shard; three out of five is something I like in my Vault setups. It really depends on how many people you plan on needing every time you start up a Vault instance, because this is a very manual process. If you want to pay a lot of money and have a big fancy hardware crypto device, you can shell out the money and get an automated way of doing it, but most of us are not going to do that; most of us are going to use the key shards. And that's a process called unsealing. When Vault starts, it's sealed: it has no way of getting at the data, because it doesn't know its own master key. Once you've given it the shards, Vault becomes unsealed. It now has its master key, which it commits only to RAM, RAM that is not allowed to be paged to disk, and that's the only copy of the key. This is also a big red button: if you think something went wrong, any Vault
administrator can say "Vault, seal yourself," and then you have to go through the unsealing ceremony again to get at it. Another core concept I want to talk about is authentication. Earlier we saw the client token, this guy, the one we looked up before. The token is the basic authentication unit, or identity unit, inside Vault. Vault generally requires a token. When you log into Vault using some sort of challenge, a username and password or the other ways we'll get to later, what's basically happening is that you present credentials and, just like in any web application, you get a token. This is the tender underbelly: you need to protect the tokens. These tokens are static. They're renewable, and there are things you can do to make them more dynamic, but by their nature, aside from the lease time and the TTL, they're static. So if anyone does a man-in-the-middle and grabs the token, you're in trouble, and you want to make sure that doesn't happen. Vault is going to help make sure it doesn't happen; if we have time, I'll get into some features Vault gives you to make that easier. But consider that when everything is connected to Vault, there's a single world where, once a person or a thing has its token, that token can be thought of like a secret key, and secret keys are static: they never change, and they don't normally even have a TTL. So, as much as I'm warning you about this, it isn't an unmanageable thing; it's just something that, as a potential Vault operator, you need to be aware of. These tokens have leases attached to them, and, remember from earlier, a lease can have things chained underneath it. So if I log in as user Isaac to my Vault, and then I go and get a database credential, and an SSH key pair, and a whole bunch of other things, the moment I log out of Vault, all of those credentials in dynamic places will be revoked on the spot by
Vault. ACLs: I talked a little earlier about this, and we saw that attached to every user is a list of policies; for example, this user has the default policy and the postgres policy. The way policies work in Vault is that we have a list of ACLs. Everything is a REST endpoint, and Vault defines seven permissions: create, update, read, list, delete, sudo, and deny. A policy is a named list of ACLs that says: for this policy, these permissions should be set on this endpoint. By default, Vault is in deny mode, so unless you explicitly give permission to do something, you won't have permission to do it. When you log into Vault, when you do that authentication, Vault has a mapped list of policies for your username, which it attaches to the token you get, and that's how Vault does access control. Am I on time? Does anyone remember when I'm supposed to finish? Yeah? So I have time. All right, we'll try another demo, and hopefully this won't backfire on me. In this demo I want to quickly look at static secrets, and I want to look at the REST API. The first thing I'm going to do is log in with curl and get a different token; up until now on this Docker image I've basically been doing everything as root inside of Vault, and that's not how we usually do it. So here, as you can see, I'm running curl, I'm posting a file called auth.json, which we'll look at in a second, and I'm sending it over HTTP, because this is a local setup; normally you'd use HTTPS. It goes to /v1/auth/userpass/login/test, which means I'm logging in as user test. The auth file is a JSON file with the username and the password. Vault gets it and spits out the client token; we saw this HMACed earlier. This is the token I'm going to use to do anything else I want to do on Vault. I have the policies, and I have metadata: the username is test. There's not a lot of metadata, because I'm using a simple username and password store.
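The policy model from a minute ago, named lists of ACLs over REST paths with default deny, can be sketched like this. The policy names and paths below are illustrative stand-ins, not Vault's exact syntax:

```python
# Sketch of the policy model: a policy is a named map from REST paths to
# sets of the seven capabilities, and evaluation is default-deny. The
# paths and policy contents here are hypothetical stand-ins.

POLICIES = {
    "default":  {"auth/token/lookup-self": {"read"}},
    "postgres": {"database/creds/postgres": {"read"}},
}

def is_allowed(token_policies, path: str, capability: str) -> bool:
    """A token carries a list of policies; any one of them can grant."""
    for name in token_policies:
        acls = POLICIES.get(name, {})
        if capability in acls.get(path, set()):
            return True
    return False  # default deny: nothing granted means refused

token_policies = ["default", "postgres"]  # like the user in the demo
print(is_allowed(token_policies, "database/creds/postgres", "read"))    # -> True
print(is_allowed(token_policies, "database/creds/postgres", "delete"))  # -> False
print(is_allowed(token_policies, "sys/seal", "update"))                 # -> False
```

The last line is the default-deny behavior: no policy mentions that path, so the request is refused.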
There are other stores, which we'll get to soon, that can decorate this a lot better. And I have a lease duration. I'm not sure if you can see it, I can't see it now either, but trust me that under this lease duration there's a time-to-live: that's how long the login can live. I can also see that it's renewable. I could have set this up as a one-time token that can't renew itself, but usually you'll want to let a token renew itself so you can keep the login going. This is a really useful thing, especially when you have serverless things, or spot instances, or containers, all these modern computing terms that refer to systems you expect could disappear at any given moment. So how do you revoke them? It's very easy: give these guys short TTLs inside of Vault, and simply have a local process that periodically renews the token to keep it working. Then, if the process or machine or container or whatever disappears, within the TTL time it'll expire, all the leases will expire, and everything is nicely cleaned up, no dirty mess to worry about. So here I got a client token, and now I'm going to try to use it. I didn't put anything at this path yet, so first I'll do a vault write with value=bar. Now I can read it back; remember, I'm reading now as user root. From the command-line interface I can see that the value equals bar, and I can see there's a refresh interval. I don't really have a lease here, because this is a static secret and Vault knows it's static, but it gives me this advisory: this is the lease that's configured for this endpoint, I hope you'll respect it. And if I try to get that from curl, I get a permission denied. Why would I get a permission denied? Because in this batch file, which I set up on purpose, the Vault token being used is not the same as this Vault token. Why?
Because this is what I had last time I did the demo for myself, in front of the mirror, while I was shaving. If we set it to use the right one: okay, with the right token, now I can see that the data equals bar. This is the JSON, exactly as it appears, and that's how Vault works over REST. For the last example, let's see something that uses a dynamic lease: let's try to get the credentials for Postgres. Let me see if I remember where it lives; that's what happens when I close my terminal. All right, when I read from it now, I get a lease ID and a lease duration. This is a lease that's actually going to be backed and managed by Vault. I can see that the lease itself is renewable; remember, the lease is only as renewable as the token, unless I told Vault to do otherwise. And here is the username, like we saw earlier, and the kind of random password we get from Vault. Moving on ahead, and I will stop for questions, I promise, in just another slide or so: the meat and potatoes of Vault. Vault is a modular, pluggable system. These are the backends built into it, and there are three types of backends; modules, plugins, whatever you want to call them, though they're not plugins, these are built in. You have secret backends, you have authentication backends, and you have audit backends. We'll start with audit, because it's the shortest list: there are only three of them. You can write to a local file, you can write to a TCP or UDP socket, or you can write to the syslog facility, and those three basically cover most of the use cases you have today. With the Kibana example we saw here, I was just using a file, and it was being shipped with Filebeat to Elasticsearch, and everything worked really nicely. That's really all you need; very simple. Authentication backends: I like to break these into two types, types that are targeted at humans and types that are targeted at machines. On the humans side we have username-and-password, which is
what we've been doing here. For large organizations you have LDAP, you have RADIUS, and you have TLS client certificates, so each user can have a private key and authenticate with HTTPS client certificates to get a token. You can use OpenID-like things; I'm sure at some point there will be proper OpenID support, but for now there's GitHub. And there's MFA, which isn't really a backend on its own. Currently, if you go with the open source version of Vault, the only MFA you have is Duo Security, which gives you this really cool push notification over SMS or to your mobile device or whatever, and you can use that in conjunction with userpass, LDAP, and RADIUS. There's much cooler MFA support in the enterprise version of Vault, so if you need that, you should get it. And if you get it because I told you it's really cool, you should tell the people at HashiCorp that I recommended it, and they should give me a license so I can talk about all the cool stuff that I simply can't talk about because I can't afford it. And I'm dying to. For machines there's a separate set of backends. First and foremost there's the token itself; remember, tokens are static strings, and there can be setups where, as an operator, you want to request a token, put it in a closed environment, and just keep using that token without authenticating in a more meaningful way. You also have AppRole, which is very similar to username and password: you basically have two secrets, and the idea is that the two secrets should come from two different channels. For example, if I were provisioning a machine with Chef that lived on AWS, I would put one half of the secret in as part of the Chef recipe, so it would sit on the machine, and the other part of the secret I would put in the user data or a tag or something when I created the machine on AWS, the idea being that there should be two separate processes involved, because you don't want to give any one person access to both
halves of the secret. And then the machine can authenticate itself. Amazon Web Services and Google Cloud each have built-in support in Vault, so that a compute instance on either of those platforms can authenticate itself to Vault, and there's also been Kubernetes support, so a container running on Kubernetes can authenticate as that container to Vault. As for secret backends, there are really two, we'll make it three, types. We'll start with the static ones. Static is what we saw with foo equals bar: we have the secret backend, which is simple static storage. It goes in, gets committed to disk encrypted, and you can read it out later. You also have what's called cubbyhole, which I don't know if I'll have time to get to. You can do a lot of really interesting things with cubbyhole. Cubbyhole is the same as secret, except a cubbyhole is attached to a token, so as soon as the token expires, the entire cubbyhole disappears. That gives us cool stuff like response wrapping, which we may or may not have time to get to later: really interesting ways to transport data through untrusted relays. The third type of static backend is what's called the transit backend, which is basically pluggable encryption. Vault doesn't actually store the data: you send it data, it encrypts it as if it were going to write it to disk, and it sends you back the encrypted block, and you can decrypt it the same way. The endpoints are protected by the normal policies, and the key rings are managed the same way Vault's internal key rings are managed. Then the dynamic backends. The dynamic backends are the fun ones, like the database one you saw before. There's a lot of database support: you have support for MySQL, for Cassandra, for MongoDB, for Consul, which is their KV store, for PostgreSQL, and for Redshift through a plugin. Plugins came out, and I don't think I'll have time to talk about them later, so I'll just say
for a minute: plugins came out earlier this summer, and I jumped on them. I'd been waiting for a year and a half, and I said, okay, I'm going to make a plugin for Redshift, because Redshift is just different enough from PostgreSQL that the stock backend doesn't work. So you have support for that too; it's an open source plugin. SSH: SSH is really cool. It's the same idea as the databases, except that it's a physical machine. There's actually a vault ssh command. You say vault ssh into this machine, and it uses your token to authenticate to Vault and see if it can get a username and password for the machine. If so, Vault SSHes into the machine as a privileged user, adds a system user with the privileges defined for that remote machine, gives you the username and password, and you log into the machine, all from the comfort of your CLI. That's really powerful. AWS: AWS has this idea of short-lived tokens for doing a set of authenticated requests against AWS, and you can have Vault issue those too. Last on the list of dynamic backends is TOTP. If you want to set up your own MFA for your own app, Vault has its own virtual TOTP backend; time-based one-time passwords, for people who use MFA with those six-digit codes you see on most web pages, that's TOTP. Vault can manage the secret keys to help you do your own thing on your own site, and you can set up as many as you want. The last one I want to talk about, which is my favorite for some reason, I don't know why, is the PKI backend, which basically means you can run your own certificate authority. This is really useful even today, but it was a lot more useful years ago, before Kubernetes and these other Docker orchestration frameworks came out, because if you want to do authenticated Docker, that's all done not with usernames and passwords but rather with certificates, which are a pain in the rear end to manage. So with Vault you can actually manage these CAs: you can request a certificate, or you can create your own CSR and send it to Vault to be signed. You
could mount as many of these PKIs in a single Vault instance as you want, in parallel, and then you can have your top-level CA and your intermediate CAs. There's no way to get the private keys for the CAs out of Vault. You have one chance to get them out, which is when you generate them, and after that they're securely stored by Vault. There's another bunch of stuff I wanted to talk about, response wrapping and plugins, that I'm not going to have time for, so we'll go straight to Q&A. So, questions? Yeah? All right. The question was: for authorization, is there a plugin system for that, can you extend the list of the seven permissions I mentioned earlier on the ACLs? The answer is no, there isn't. If you have a use case for it, Vault is on GitHub; definitely post a question, and come talk to me, because I'd love to hear it. I haven't found a use case for it yet. Any other questions? I guess we'll end early and give you guys an extended coffee break, rather than me starting on a topic that I won't have time to finish. Thank you so very much. As a reminder, if you liked what you heard, there are going to be surveys given out later; please rate the session well if you love me and want to see me again. And come talk to me a little later.