From where? Yep. Oh, so, hello everyone. The next talk is "Base64 is not encryption", by Seth Vargo. Welcome.

Hi everyone, can everyone hear me? I'm a little sick, so... Awesome, I see a thumbs up in the back of the room. Welcome to "Base64 is not encryption, or a better story for Kubernetes and secrets". My name is Seth Vargo. I'm a developer relations engineer at Google. Small company, you probably haven't heard of them. I only have 30 minutes, so I'm going to run through this quickly, but if you have questions, or if I say a word and you're not sure what it means, please come up to me afterwards and ask me, or feel free to tweet me. That's my Twitter handle. My direct messages are also open, so if you don't feel comfortable saying something publicly you can always send me a private message, even if I don't follow you.

All right, so let's set some ground rules here. What is a secret? Everyone kind of has their own definition. For the purpose of this talk, I'm going to define a secret as credentials, configurations, API keys, or other stuff that an application or service needs to run, either at build time or at run time, so when you compile the application or when it's running. And we're specifically talking about secrets in the context of Kubernetes, right? There have been a lot of other security talks. We're really just narrowing in on Kubernetes here.

And then we have to ask ourselves a really obvious question: why should we protect them? Why not just use a ConfigMap for everything? Why do we even have secrets? How do we separate secrets from configuration? Well, first, secrets are a really attractive target for attackers. They are often leaked in public repositories or open buckets on S3 or Google Cloud Storage. They often include overly broad permissions, and they're often given to people who shouldn't have those permissions. Like, the CEO of the company should not have sudo access to all of the machines, yet
that's what happens. And these users tend to leak these credentials everywhere. So we need a really strong strategy for protecting secrets.

There are really four ways in which we can protect secrets. We can audit them. This is kind of retroactive: let's log every use of a secret, and let's make sure that we can trace who's using a secret and when. Encryption: encryption in transit and at rest. Rotation: we know that it's not enough just to have a secret, that secret also has to have a lifetime. It can only live for, you know, a couple of hours or a couple of years. And isolation, right? That's kind of the principle of least privilege, making sure that the place where secrets are accessed is not the same place that they're stored. This talk is really focusing on the encryption bit. That's really all we're going to talk about today. There are a lot of things that we could talk about, but we're limited by time.

So let's talk about encryption. There are four layers of encryption that are common. The first is what we call application layer encryption, which I'll talk about in more detail on the next slide. Then we have service level encryption, file system encryption, and machine level encryption.
So machine level encryption is where you have a hardware device like a TPM. Operating system encryption is something like BitLocker on Windows or FileVault on Mac. And then service level encryption is where you have some kind of operating system level encryption.

Application layer encryption is the highest level of encryption that we have today, because it's applied at the earliest possible step in the encryption process and it provides encryption at a very granular level. When you think about something like FileVault on OS X or BitLocker on Windows, that's a key that protects an entire file system. If an attacker is able to get that key, either through brute force or social engineering, they've decrypted the entire file system and all of the data is now available to them. Whereas if you imagine each file in that operating system is encrypted with a unique key, even if they're able to brute force the high-level operating system key, they still need the lower-level encryption keys in order to access individual files, right?

And this is really the point of application layer encryption: it doesn't just protect the data at rest, it also protects the data as it moves through the system. So again, taking that same example, if I have some file system level security like BitLocker or FileVault and I take that file and move it to an NFS share, it's now available for anyone.
It doesn't matter how secure my local file system is; I've moved it to an insecure store. However, if we were applying application layer encryption, I've instead moved a bunch of encrypted bytes from one file system onto another, and it still has the same level of security that it had before.

So application layer encryption is just one level of encryption, and generally we recommend using at least two layers of encryption. You want to use application layer encryption to protect things at the most granular level, but then you also want to use file system encryption, encrypted backups, or even hardware level encryption with something like TPMs and secure boot.

All of this background was to introduce you to Kubernetes defaults. So how many people here are familiar with Kubernetes? Cool, you're in the right room. How many people know... I have four frowny faces. Cool, you're about to learn.

So Kubernetes is insecure by default, and there's a star there, which I'll explain in a second. By default, when you spin up a new Kubernetes cluster, whether that's minikube or kubeadm or localkube, all of the secrets are stored in plain text in etcd. You can think of etcd as the database that backs Kubernetes. It's a storage engine where most of the data is stored in memory. But they're only base64 encoded.
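To make the "only base64 encoded" point concrete, here is a minimal Python sketch. The card number is a made-up placeholder, not data from the talk, and the snippet only illustrates the encoding step the talk describes, not how the API server is implemented.

```python
import base64

# Hypothetical secret value, standing in for the credit card on the slide.
plaintext = b"4111-1111-1111-1111"

# The encoding step the talk describes: base64 of the raw bytes.
encoded = base64.b64encode(plaintext)

# No key is involved, so anyone who can read the stored value can reverse it.
decoded = base64.b64decode(encoded)

print(encoded.decode())
print(decoded.decode())
```

Decoding needs nothing but the encoded bytes, which is why base64 offers no confidentiality at all.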
They're not encrypted, meaning anyone with access to etcd, or a backup of etcd, or the master node can actually retrieve every single Kubernetes secret and service account with a single request.

Now there is a star here, which is that a lot of providers, like cloud providers, will alter this default behavior. So if you're using Google Kubernetes Engine or AKS or EKS, they don't do this by default. But if you're running your own Kubernetes cluster on your own bare metal, or you're using some type of virtualization where you're managing this yourself, and you haven't configured it, anyone with access to your etcd cluster or your master nodes has access to all of your secrets.

So what does this actually look like? Here we have a piece of data, a credit card, and let's say I'm going to create this secret. I'm going to run kubectl create secret with the contents of this particular credit card. That's going to hit the kube-apiserver, and the kube-apiserver is going to encode, but not encrypt, that data. See, it's encoded. It's just upside down with two equals signs at the end. I'm glad some of you got that joke.

Right, so if an attacker, which is represented by this raccoon on all of the slides, has access to that etcd cluster, all they have to do is turn that upside down and remove some equals signs, and they have your credit card data. And it can be a credit card, it can be a service account, a passport, a social security number, any type of secret, an API key, whatever it might be. I like to call this encraption.

And I know what you're thinking, right? Number one:
I'm not here to talk badly about the Kubernetes developers. Secrets were an afterthought, and that's okay, and we're working on it. Number two, you're probably thinking, well, no one is going to leave their etcd cluster publicly exposed. I work for a company that doesn't let me show you this, but there's a URL, and I highly recommend that after this talk you go ahead and take a look at how many etcd clusters are publicly accessible with no authentication.

So instead of just talking about it, I figured I'd show you this. Let me jump out of the slides real quick and enter mirror mode. All right. What I have here is a minikube cluster, because I wasn't sure how the internet was going to work, but this could also work in a big Kubernetes cluster. I'm going to use the secrets default context here, and now I'm going to create a secret. So I'm going to say kubectl create secret generic... What should our secret be called? Demo, okay, cool. Y'all are real exciting here. And our secret will be, I don't know, password equals encraption. Okay. I type encryption so much, it's hard to type encraption. I can't type. kubectl create secret generic... Oh, I spelled literal wrong. L-I-T... wait, what? You're right. I can't type. There we go. Cool.

So I've now created the secret. How many people feel secure? Cool, I'm going to make you feel even more insecure. I'm going to cheat and exec right into the etcd cluster, but there are a number of different ways you could do this. So I'm now on the etcd node. This could be publicly exposed. This could be a backup that you have restored because someone accidentally uploaded it to an S3 bucket. Fun fact, that's what happened with Tesla. Just restore it, point it to it, and you're good to go.
So I'm going to run etcdctl get /registry/secrets/default/demo (it's in the default namespace and we called it demo). And look at that: "encraption" is just right there, in plain text. You don't even need to use etcdctl; you could just search the file system for this if you knew what you were looking for. So y'all should be real scared now.

All right, so what can we do to fix this? Well, the first thing we can do is explain envelope encryption, because the rest of this talk won't make sense without it. So what is envelope encryption? Normally we have two things: a secret, like a piece of data, and a key. With envelope encryption, we introduce another kind of key, so we have two keys. We have a data encryption key, the thing that encrypts our data, and a key encryption key, the thing that encrypts the key that encrypts our data. It's very meta, so I prepared some animations for you, and I hope you're impressed. Seriously, this took forever.

So we use our data encryption key, which is the red key on the slide, to encrypt our data, and that gives us some bytes. You can see that's denoted by the little red lock icon. Then we take our key encryption key, which is the green one over there on the right, and we encrypt our key with that key. So the actual bytes that encrypted our credit card are now encrypted using this other key. We then concatenate those two pieces of data together and store them side by side, using some type of separator, in our storage system, whether that's a database, a file system, a mobile phone, whatever it might be. This is envelope encryption at a very, very high level. When we want the plain text data back, we reverse this whole process.
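The encrypt path just described, and its reverse, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Kubernetes implements it: the cipher is a toy keystream built from HMAC-SHA256 so the example needs only the standard library (a real system would use an authenticated cipher such as AES-GCM), and the function names, the ":" separator, and the 32-byte key size are choices made for this sketch.

```python
import hashlib
import hmac
import secrets

SEPARATOR = b":"  # assumed separator; hex-encoded fields never contain ":"


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 keystream (NOT production crypto)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce per encryption
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    return _keystream_xor(key, blob[:16], blob[16:])


def envelope_encrypt(kek: bytes, plaintext: bytes) -> bytes:
    dek = secrets.token_bytes(32)                 # one-time data encryption key
    encrypted_data = toy_encrypt(dek, plaintext)  # DEK encrypts the data
    encrypted_dek = toy_encrypt(kek, dek)         # KEK encrypts the DEK
    # Store the wrapped key and the ciphertext side by side.
    return encrypted_dek.hex().encode() + SEPARATOR + encrypted_data.hex().encode()


def envelope_decrypt(kek: bytes, stored: bytes) -> bytes:
    dek_hex, data_hex = stored.split(SEPARATOR)            # split on the separator
    dek = toy_decrypt(kek, bytes.fromhex(dek_hex.decode()))  # KEK unwraps the DEK
    return toy_decrypt(dek, bytes.fromhex(data_hex.decode()))  # DEK decrypts the data


kek = secrets.token_bytes(32)  # long-lived key encryption key
stored = envelope_encrypt(kek, b"4111-1111-1111-1111")
recovered = envelope_decrypt(kek, stored)
print(recovered)
```

Only the stored blob and the key encryption key are needed to get the data back; the data encryption key never has to be kept anywhere in plain text.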
So we separate the pieces based on our separator. We then use the key encryption key to decrypt the data encryption key, which we then use to decrypt the encrypted data, to give us back the original plain text data. I hope you like my animations.

Every time we encrypt a new piece of data, we generate a data encryption key. That's usually the responsibility of the operating system or the software, to generate this one-time key. So you generate some entropy, you generate a 32-bit key, or rather a 32-byte key, and you encrypt the data. The key encryption keys, the things over here on the right, tend to live a little bit longer. So a key encryption key might encrypt five or six or a hundred different data encryption keys, or DEKs as we call them. We rotate them periodically, and we store the version number inside of here so we can easily decrypt them.

Envelope encryption: we generally generate a unique DEK for each new data entry. We can crypto-shred. If you don't know what that means, think GDPR: I have a bunch of data and I need it revoked immediately. Instead of zeroing out a bunch of data, we can just revoke the top-level key encryption key, and now all of that data is irrecoverable, except by brute force. And it provides easy versioning and rotation. We can rotate the key encryption keys and the data encryption keys separately, so we don't have to maintain all of these keys running around.

So Kubernetes 1.7 introduced envelope encryption to try to solve this problem. There's this top-level thing that you pass to the Kubernetes API server when you start it, called an encryption configuration. You give it the different providers you want, and there are a number of different providers: AES-CBC 256, AES-GCM 256, secretbox, etc. You put the keys in this file, and then you restart the Kubernetes API server with this --encryption-provider-config flag, and all is well.

So it goes like this. The data comes into the Kubernetes API server. The data then goes to
this encryption config first, before it goes to etcd. The encryption config encrypts it and then puts it in etcd, right? So we're secure. Yay. This talk is over, except I have 15 minutes left.

So what happens if an attacker has access to etcd? Well, they have some encrypted data. That's great. They can't really do anything. They can brute force decrypt it, maybe, but hopefully we've rotated our secrets by then. But what happens if an attacker has access to our master node? That encryption config lives in plain text on the master node, where the keys are. So any attacker worth their salt, no pun intended, can grab access to that encryption config file, which, if you remember from back here, has the keys in it. So all we've done is give a really, really big lol to a skilled attacker, because they're thinking, "Oh wow, you went to all of this effort and complicated your setup and added overhead, only for me to just decrypt it with an openssl command." Because the encryption keys and the encrypted data are stored in the same threat model, we haven't actually improved our security against anything other than a script kiddie.

So there are a number of drawbacks to this approach. Number one, you have to generate those keys yourself. Anytime you want to rotate or manage keys, you have to generate them yourselves, so you need some type of entropy source. Key management is your responsibility. If you've ever actually done this, it's really a pain in the butt. You have to restart the kube-apiserver every time you add a new encryption key, which causes downtime or potential loss of service. Rotation is a manual process.
So you have to decide when you rotate and how frequently you rotate, for both the DEKs and the KEKs. And there's no HSM integration. So if you work in a large enterprise where you process credit cards or personally identifiable information, you can't integrate with an HSM, a hardware security module, this way. But as we talked about, the biggest drawback is that it doesn't actually improve your security posture. The plain text encryption keys are sitting right alongside the encrypted data.

So Kubernetes 1.10 (we're on 1.13 now, for those playing along at home) introduced some happy faces. Let me tell you how Kubernetes 1.10 makes this really great. Kubernetes 1.10 introduces this concept of a plugin for encryption. In particular, it uses KMS plugins, or key management service, or key management provider plugins. They operate on a Unix socket, so you can leverage Unix file permissions to control access, and they delegate access to a key management service outside the management layer of Kubernetes.

So it looks like this. The data comes in and hits the kube-apiserver. It hits the encryption config, and then instead of hitting a local key, the encryption config uses that socket and some type of authentication to talk to an external KMS provider, then encrypts the data in etcd. To reverse the process, it has to go back to the KMS to get you the plain text data. So if an attacker has access to the master node or etcd, they still don't have access to the keys, because those are stored in a separate system. For example, that system might be something like Google Cloud KMS or Amazon KMS, or even something like HashiCorp Vault.

And there are existing KMS plugins for Kubernetes that do this for you. If you want to take a picture of a slide, this is a good one. There's one that will integrate with Google Cloud KMS. There's one that will integrate with Azure. I actually think Rita Zhang is here.
She's done a ton of work on that, and a bunch of other stuff with flex volumes. There's an AWS encryption provider, and there's one that I'll demo in a second, which is the Oracle Kubernetes Vault KMS plugin.

If you're on Google Cloud, so if you're running on GKE, Google Kubernetes Engine, we actually have an option that will go into beta on Tuesday. It's in alpha right now, but you can still use it, and it does this for you. When you spin up a cluster, or in the UI, you can specify a KMS key and we'll set up all of this for you. So this is automatically done. We already do a bunch of stuff to protect the cluster. This gives you full control over the key and the rotation, and we'll do auto-rotation of those keys for you.

But it introduces a new problem, which is: how do you authenticate your Kubernetes nodes to talk to the KMS provider? It's called the initial secret problem, or the first secret problem. Generally IAM, identity and access management, is the way that you do this if you're on a cloud provider. On GCP, that's service accounts. On AWS, that's either ARNs or access tokens. But generally you either give the master node a service account credential or an access token, or you grant the machine that it's running on permission to talk to the API, and you can revoke that at any time. So you delegate that to the cloud provider through IAM.

And then you separate concerns. The etcd nodes, the physical VMs that etcd is running on, do not get permission to talk to the KMS provider. You only give that to the API server nodes. This separates concerns, so that an attacker has to compromise multiple systems in order to decrypt these values, and you can revoke access to the KMS keys at any time.

So what if you're not on a cloud provider, or what if you already have a custom-managed Kubernetes setup and you're not ready to move to a cloud provider managed one where these plugins already exist? Well, this is where something like Vault can come in. Instead of the KMS
provider being a third-party KMS provider like Google Cloud KMS, we can delegate this to Vault, and our friends over at Oracle have built a really helpful plugin that runs in Kubernetes and helps us delegate this encryption and decryption to Vault. Again, this is envelope encryption. In Vault, there's this thing called the transit backend. It's basically key management as a service. Vault generates a key; Kubernetes then uses that key to encrypt another key, the data encryption key, which it uses to encrypt data at rest.

So what does that look like? Let me jump back over here. My cheat sheet. All right, kubectl config use-context. So we switch to a new Kubernetes cluster. This Kubernetes cluster has the Vault KMS encryption already set up and ready to go in that encryption config. I'm going to go ahead and create that same secret, and now when we exec into etcd, hopefully... etcdctl get /registry/secrets/default/demo. You'll notice that that's not plain text anymore. That's a bunch of gobbledygook, but that's encrypted. It's encrypted in transit with TLS, and it's encrypted at rest with this AES 256-bit CBC key that Vault is managing, and Vault is being run outside of Kubernetes right now. So even if you were to compromise this etcd cluster, you can't actually decrypt this data by anything other than a brute force attack.

And all of this is open source, and there's a ton of guides and documentation and blog posts out there, including these slides, which will be up on the internet shortly, that will help you get this stuff set up.

So to conclude: Kubernetes can take us from sad to happy with respect to secrets management. Use at least two layers of encryption: application layer encryption and hardware layer encryption. Rotate your keys regularly. Leverage envelope encryption.
It's the fastest and most scalable way of doing data encryption that we have today. And protect Kubernetes secrets using an external KMS provider, whether that's a cloud provider or something like HashiCorp Vault. Thank you. I'll go back, because there are people taking pictures of this slide. So, are there any questions?

Hello. So what if I run etcd on the master nodes? That means that I basically need to provide IAM permissions for the master nodes, and then if I compromise the master, basically I can decrypt anything by knowing the key ID, if it's KMS for example, right?

Right, so the question is, if I run etcd on the same nodes that I run the Kubernetes...

Yes, so this model suggests we should not run etcd on the same nodes, on the master nodes, right? So we need to decouple them to have maximum security.

Yes, that would increase your security much more than running them side by side.

Okay. But do you have any suggestions for what to do if I have etcd running on my master nodes? Basically, to use Amazon KMS or Google KMS, I need to provide a service account integrated into the machine's metadata, right, or an IAM role integrated into the machine's metadata. What can I do to increase my security?
So even if you use an external KMS provider when etcd is on the master node with the kube-apiserver, you've still increased your security compared to when the keys live in plain text, because you can revoke that service account and those credentials without taking down all of your production infrastructure. And an attacker also can't decrypt your data offline. So you've increased your security posture a little bit, because in the previous model, where the keys are just stored in plain text, an attacker can just do a full file system dump, and now they can run away with your data and decrypt it offline. Whereas with the KMS provider, if you have some audit logging and those things set up by default, if you detect some anomalies you can revoke access to that key, and now an attacker can't decrypt that data because you've broken the link to the KMS. So if you haven't separated etcd from the master nodes, that's the only thing you've improved: the ability to revoke access and prevent offline decryption.

Okay, thanks.

Other questions? There are no other questions? Okay, thank you for the talk.

Okay, so for example, I have a huge Kubernetes cluster with 1000 nodes and a lot of data on the etcd nodes. If I want to change my KMS key or whatever, should I re-encrypt all this data? How does this process work?
Yeah, so the question is, what should I do with my existing data that may or may not be encrypted in etcd? When you generate that encryption provider config (let me pull up a slide that has one on it), providers is an array. The first item in that array is what is used to encrypt all data. Everything else is tried when decrypting the data. So there's one of them called identity, which just takes an empty object. That will allow you to decrypt all of the plain text objects that you currently have in etcd. Then you can do a kubectl replace on all of your secrets. It'll take a little bit if you have 1000 nodes, but it'll go through and encrypt them all. Alternatively, you can just choose to encrypt things going forward, and, you know, if you rotate your pot