Hi, everyone. We're super excited to be here today. We came to talk with you about Kubernetes Secrets. We have a lot of experience using Kubernetes Secrets, and we want to share with you what we learned along the way by using them in real-world systems, and the best solutions for managing them. But before we dive deep, let's start with a little background about who we are. I'm Gal. I'm a backend engineer at Firefly. Firefly is a cloud asset management solution: we scan our customers' entire infrastructure and automatically transform their resources into code. Firefly helps DevOps, SRE, and platform engineers regain control of their environments.

Hi, my name is Leav. I'm an engineering team leader at Firefly as well. I recently had the privilege of speaking about migrating jobs to Kubernetes at Kubernetes Community Days Amsterdam, so I'm super excited to be here again. Today, we're going to share with you everything we learned during our journey with Kubernetes Secrets. So here we go. Let's get started.

Gal, it just hit me that yesterday we deployed a new service into the production cluster that authenticates with MongoDB, and we put the MongoDB credentials in a Kubernetes Secret. As you already know, Gal, that's not the most secure solution out there; it's actually not encrypted at all. Wow. Then I guess we have to think about our options for changing this. That's funny, because that's exactly what I'm planning to do here today. So let's get started. Have you ever questioned the security level of your organization's secrets? When I'm talking about secrets, I'm referring to database passwords, private keys, service tokens: all those secrets that protect our applications and all of our sensitive data. Well, we did question it, because of the importance of our own and our customers' sensitive data, and we think that everyone should do so.
Today, we're going to share with you everything you need to know about secret management in Kubernetes. We will go over the technologies and tools that will help you manage your secrets in a more secure way, and we will wrap up with a demo of the Secrets Store CSI driver. But Gal, first things first: what is a secret at all? A secret is an object that contains sensitive data, like an access key, a token, or a password. As developers, we use them all the time. And I'm not talking about your TikTok account password; more common usages are connecting to a database, accessing a third-party service, or decrypting a message, and much more. These are just a few examples. Secrets are essential to fulfill the goals of our applications: without them, our code simply won't function as it's supposed to. However, we cannot put them in our version control systems, because that would expose them to the world. Hence, we must find a way to retrieve them from an external source during runtime. When it comes to Kubernetes, secrets come in the form of a Kubernetes Secret, which is the Kubernetes representation of a resource that contains sensitive data. To consume the sensitive data in a Kubernetes application, there are two common ways. The first is using a volume: we mount the value of the secret, the sensitive data, into the container's file system at runtime. The second is using an environment variable, which is the most common one, where the value is injected into the runtime during execution. Both of them are easy to use and integrate effortlessly with other Kubernetes resources, such as Deployments, StatefulSets, Jobs, and eventually Pods. So now that we know what secrets and Kubernetes Secrets are and how to use them, let's see how we create and manage them. Like every other Kubernetes resource, we can create secrets manually using one kubectl command.
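The two consumption paths described here can be sketched like this. A minimal example; the Secret name, value, and image are ours for illustration, not from the talk:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-creds
type: Opaque
stringData:
  password: super-secret        # stored base64-encoded, NOT encrypted
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: nginx
      env:
        - name: MONGODB_PASSWORD          # path 2: injected as an env var
          valueFrom:
            secretKeyRef:
              name: mongodb-creds
              key: password
      volumeMounts:
        - name: creds                     # path 1: appears as /etc/creds/password
          mountPath: /etc/creds
          readOnly: true
  volumes:
    - name: creds
      secret:
        secretName: mongodb-creds
```

The Secret itself could equally be created with the one-liner mentioned above, e.g. `kubectl create secret generic mongodb-creds --from-literal=password=super-secret`.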
But that's not the most recommended way to do it. Actually, at Firefly, we have a term for such an operation, as you can see here: we call it ClickOps. ClickOps is when you create an asset manually, and the best way to manage your resources is with infrastructure as code. By adopting the infrastructure-as-code concept, you can manage and control your environment at scale. And Kubernetes resources, like every other infrastructure resource, can be created with any infrastructure-as-code tool out there, such as Helm, Terraform, or even Pulumi. So it sounds great, but it's not that perfect; there is always a problem. This is a caution taken from the actual Kubernetes docs, and it says that Kubernetes Secrets are, by default, stored unencrypted in the API server's underlying data store, etcd. Anyone with API access can retrieve or modify a Secret, and so can anyone with access to etcd. Additionally, anyone who is authorized to create a Pod in a namespace can use that access to read any Secret in that namespace. So we understand that Kubernetes Secrets are not perfect for most cases, for that and for some other reasons. The first one is that they're unencrypted: the secret values are stored in base64 format, which means that a simple decode will retrieve your precious sensitive values. Secondly, there is limited role-based access control. As least-privilege policies instruct, we wish to give a user the minimum permissions to the minimum scope of data, and as we know, the Kubernetes implementation has limited capabilities when it comes to complex access policies. Another thing is secret rotation. Secret rotation is the principle of regenerating your secrets over time in order to prevent them from being compromised or outdated. It's also important from a compliance and regulation point of view, and organizations should always rotate their secrets. Another thing is that secrets tend to be shared between multiple applications and multiple users.
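To make the "simple decode" point concrete, here is a minimal sketch; the secret name and key in the commented command are hypothetical:

```shell
# Base64 is an encoding, not encryption: anyone can reverse it.
echo 'c3VwZXItc2VjcmV0' | base64 -d   # prints: super-secret

# With API access to a cluster, one command reveals any secret value
# (hypothetical secret name and key; requires a running cluster):
# kubectl get secret mongodb-creds -o jsonpath='{.data.password}' | base64 -d
```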
They tend to repeat themselves, and at scale this can create an infrastructure mess. We must have one place, one source of truth, that will hold the secrets for all applications. And the last thing is that there is no audit: it's a challenge to check who retrieved, modified, or deleted a secret. Gal, those issues are very serious, very problematic. But you can be sure that the Kubernetes community won't let us down, and once again, they delivered a solution. If we go back to the official docs again, we can see a variety of actions a user can take in order to store, manage, and use secrets in a safer way. Today, we're going to focus on the fourth option: considering using an external secrets provider. But before we go on, let's understand what an external provider is and why we should even consider using it. An external secret provider is a software component that integrates with Kubernetes and enables Kubernetes to retrieve a secret from an external source, such as a cloud secret manager, a third-party key management service, or just a database. The market of secret stores is huge, and there are a lot of competitors, even open source ones, like HashiCorp Vault. There are, of course, the cloud providers' offerings: AWS Secrets Manager, Google Secret Manager, Azure Key Vault. Personally, I had the pleasure of contributing to another open source secret provider, CyberArk Conjur. So we understand that the secret store provider market is huge; let's see why we should use one. There are several reasons. First of all, it performs as a single source of truth: if we want to update, change, or just retrieve the value of a secret, there is one place we can get it from. Secondly, it's a secured solution. Unlike a normal Kubernetes Secret, the data is kept encrypted and secured by an authorized third party. In addition, it's easy to share secrets between applications and users.
And we can share them efficiently, because the secret is in one place and everybody can consume it; we don't have an infrastructure mess at scale. In addition, those providers have an SDK, a software development kit, in any programming language out there, which we can then use in our services. The last thing, and most importantly, is the least-privilege concept. Those solutions follow the concept that a user or an application should get the minimal scope of permissions needed to receive its data. That way, we can reduce the risks but still provide all the necessary data to the application or the user. So it sounds pretty good to me; I actually prefer Golang. So let's implement it in our new services: let's add the SDK of HashiCorp Vault, an open source, secure, and centralized solution, into your new service, Gal. That's probably the first thing that comes to mind after realizing all the benefits of using an external secret provider. But actually, consuming the data from the external secret provider directly is not that recommended. For example, we have this quote from the HashiCorp website: "Applications need only concern themselves with finding a secret at the filesystem, rather than managing tokens, connecting to an external API, or other mechanisms for direct interaction with Vault." So even HashiCorp themselves discourage communicating directly with the external secret provider. But let's dive deep into why we should not do so. There are mainly three reasons. First of all, it requires some code adjustments in order to consume this API or SDK, and we also have to take into consideration that this API might have its own limitations, such as API rate limits and much more. Secondly, you quickly become coupled to this service. You become vendor locked-in: if you wish to change your secret store provider or use another one, it will take a tremendous amount of work.
And lastly, in order to access such a provider, you will probably need programmatic access tokens, which are secrets by themselves. So where would you store those? And this is actually a super-secret, because it holds the key to all of our organization's secrets. So what are we left with? On the one hand, we want to use an external secret provider, as it provides a secured place to store our secrets. On the other hand, we don't want to invest so many hours of work, and we want to do it fast. Gal, do you happen to know any solution that can help us manage our secrets in a much safer way, but with minimal effort? Sure. CSI comes to the rescue. CSI is one of the most recent emerging solutions for managing Kubernetes secrets. But let's start with the basics: what is CSI? CSI stands for Container Storage Interface. It's a specification that defines a standard interface for container orchestration systems like Kubernetes to communicate with external storage systems. And when it comes to the Secrets Store CSI driver, we're talking about a solution for syncing secrets from an external secret store into the cluster. How does it work? CSI allows Kubernetes to mount multiple secrets that are stored in an external secrets provider into the cluster via CSI volumes. Once the volume is attached to the Pod, the data is mounted into the container's file system and is accessible to the application. Let's have a deep dive into the architecture of the Secrets Store CSI driver. The first thing in our architecture is the external secret store, HashiCorp Vault. We chose HashiCorp Vault since it's open source and it gives us all the benefits of an external secret store. Connected to our HashiCorp Vault, we can see the first DaemonSet, the Vault CSI provider DaemonSet. This DaemonSet is responsible for communicating with and pulling the data from HashiCorp Vault. Connected to this, we have another DaemonSet, the Secrets Store CSI driver. This DaemonSet is a generic component.
It supports reading secrets from multiple secrets providers: from AWS Secrets Manager, from Azure Key Vault, from HashiCorp Vault, of course, and from Google Cloud Secret Manager. And here we can see our applications. Connected to the Pod, we have our service account. Next to it, we have our volume; this volume holds the actual secret values. And we have the SecretProviderClass custom resource definition. This component is the magic component. Why? Because it actually holds the pointers to the locations of the secrets in the external secret store, and it holds the role with the permissions to the external secret store. So now we're going to see a demo of this architecture running live in our cluster. In the demo, we will examine the two DaemonSets that Gal mentioned, and we're going to create a deployment that will consume a Vault secret using the SecretProviderClass magic component. Then we will change the secret value and see it change in the deployment as well. So, just a second. Okay. Here we can see a Vault server with a predefined secret. The secret is in JSON format: the name of the secret is foo, the key is bar, and we have the secret value here. Now, if we move to the cluster, we will start bringing our architecture to life. The first thing we're going to do is examine the two DaemonSets, which we pre-installed on this particular cluster. We will use the csi namespace to deploy our application. Now we will see all the Helm charts that are installed. We can see here the Secrets Store CSI driver, which can be installed using two simple Helm commands; you can find them on the GitHub page of the secrets-store-csi-driver project. There they are. The Vault provider DaemonSet can be installed using a YAML file; they also have it on the vault-csi-provider GitHub page. So those are our two DaemonSets, and they are running. Now, let's start with the deployment of the application.
The first component will be the SecretProviderClass. Here, we can see that we have a SecretProviderClass named vault-foo, after the provider and the name of the secret. We are using the vault provider: we explicitly define which provider we want to use, what the address of the Vault server is, and we give the full path to the secret. We also define a role. This is a role that we created on our Vault server; it's pretty easy to create. That way, we can authenticate to the Vault server. So, after filling in all the details, we can create our SecretProviderClass. And that's the magic of this component, because I can create another SecretProviderClass pointing to AWS Secrets Manager, and after installing the AWS provider, the Secrets Store CSI driver will help me retrieve secrets from there as well, so I can use two secret providers in the same deployment if needed. Now, let's see the application. We have a simple service with a demo image that exposes a simple REST API. The main focus should be on the volume part. Here we can see that we create a volume for the deployment. The volume points to the CSI driver, and we are using the SecretProviderClass that we just created. We mount the volume into the Pod under the /mnt/secrets-store folder, with read-only permission, of course. That way, we will be able to receive the secret value as a file. So, outside the definition of the SecretProviderClass, there is no mention of Vault here, just the SecretProviderClass name; the Secrets Store CSI driver provides us with an interface to work with any secret store provider out there. So now we've deployed our application. We will check that everything is running and that the deployment and the service are up. Then we will create a port forwarding just to receive the secret value. So, the deployment is running, the service is up, and this is me making a typo. Let's see, there's the typo: I'm missing one character, so this is the "not found" problem.
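The two pieces walked through in this demo can be sketched roughly like this; the Vault address, paths, role name, and image are illustrative, not the exact values from the demo:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-foo
spec:
  provider: vault                      # which CSI provider to use
  parameters:
    vaultAddress: "http://vault.default:8200"
    roleName: "demo-role"              # Vault role with read permission
    objects: |
      - objectName: "bar"
        secretPath: "secret/data/foo"  # full path to the secret in Vault
        secretKey: "bar"
---
# The relevant part of the Deployment: a CSI volume that references the
# SecretProviderClass, mounted read-only into the container's file system.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels: { app: demo-app }
  template:
    metadata:
      labels: { app: demo-app }
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: secrets-store
              mountPath: /mnt/secrets-store
              readOnly: true           # the secret appears as a file here
      volumes:
        - name: secrets-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: vault-foo   # only this name, no Vault details
```

Note how the Deployment only names the SecretProviderClass; swapping providers means changing the SecretProviderClass, not the workload.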
But again, now we will create the port forwarding, and when we go to our browser and hit the API, we can see that the secret data has been passed successfully. So, first of all, we managed to achieve what we've been looking for: we didn't create any Kubernetes Secrets, our application has a generic interface to communicate with any external secret provider out there, and the data is transferred securely and doesn't exist in our environment variables. Now, let's change the value of the secret in Vault. By changing the value, we would like to receive the new version of the secret in our deployment. Our way to do it will be to scale the deployment down and back up, and Gal will talk about that pretty soon. So here we go: we're scaling the deployment down and bringing it back to life. The Pod is being terminated, and the new Pod is coming back online. Just checking that everything runs well. Now we create the port forwarding, without the typo this time, and if we refresh the page, we will see that the value has changed successfully. And pretty much that's it. Wow, we just saw that the Secrets Store CSI driver is such a powerful solution. We managed to retrieve the secrets from our Vault server into our applications without any code changes and without any major security risks. So the Secrets Store CSI driver is one of the best practices for managing Kubernetes secrets. Still, we must talk about its pain points. First of all is the complexity: adding the Secrets Store CSI driver can add complexity to the cluster, since it requires additional components and configuration. Secondly, accessing a CSI volume can add performance overhead, since it requires additional network calls and can potentially cause slower access times. And the last thing is the auto-reload: currently, when a secret value changes, the Pods are not automatically reloaded.
Secrets can expire and need to be changed during runtime, and in those cases, you will need to reload your Pods yourself, because right now the driver does not offer an out-of-the-box solution for that. That's a wrap. So, what did we talk about today? We talked about what secrets are, what Kubernetes Secrets are, how to deploy Kubernetes Secrets, and the problems with Kubernetes Secrets. Then we talked about the ways to overcome these problems. We focused on one solution, the Secrets Store CSI driver, and then we had the architecture and demo. Overall, native Kubernetes Secrets can be useful to store sensitive data within a cluster. However, they require additional management and security measures in order to be kept properly protected. There are many great solutions out there; we recommend you analyze the needs of your organization and find the best solution that will help it keep its secrets properly. Today, we gave you a quick overview of how Firefly keeps its secrets and of our architecture. But shh, don't tell anyone. Can you keep a secret? Thank you very much. So, I think we have some time for questions. If somebody has a question, yes. Who wanted to ask? Here's the QR code; if you want to scan it, we would love to hear your feedback. Thank you all.

Hello, thanks for the presentation. I'm probably not the only one to have noticed, and you went over it a little bit fast, but you mentioned the need to authenticate to the Vault, and you created a configuration to authenticate to it. So where is that secret? I didn't hear so well; I will repeat. It wasn't clear how we authenticate to the Vault: what was the mechanism to authenticate to the Vault server, and where is that secret itself? Okay, so we created a role on our HashiCorp Vault server. Leav just showed the Vault console; if you remember, there we created a role.
And this role gets permissions to pull the secrets, and this is the role that the SecretProviderClass, the CRD, uses. The role was predefined with all the needed permissions, and we had to create it ourselves. And by attaching the service account to the Pod, as Gal showed in the architecture, we could use this role to talk with Vault without handling credentials ourselves. Is that answering your question? Okay.

A question in regards to scalability. In your experience, where do you sit in the balance between least privilege and actually scaling your secrets providers? Because every time you have a brand new application, let's take the example of the AWS cloud, you have to create new IAM roles for your secret providers, and probably another service account that needs to be able to grab those secrets, and that's not exactly a great developer experience, because the developer should be able to say: I deploy, all of this is abstracted away from me, I don't need to think about service accounts. So where do you sit, in your experience, on this? So, each secret provider has its own different capabilities. You mentioned AWS; in order to consume secrets in AWS, there is a policy language, like Vault has, and in AWS the expansion process of adding more and more secrets can be a bit more challenging than in Vault, because AWS Secrets Manager is a regional resource. So if I have my application deployed in multiple regions, as it's a best practice to do, it will be hard to maintain two secret providers. That's why we chose HashiCorp Vault: it was the best fit for us. We're just adding another abstraction layer in our Vault for the environments that we are managing; that way, we can scale to multiple environments pretty easily. We use roles as specific as possible. This was just a demo; in our production, as we said, we use the CSI driver, and we create some general roles for some general applications, according to the minimal scope of data that they need.
Yes, it takes some effort, but we prefer to invest this effort rather than have security problems or risks. So, is that answering your question? Thank you. Another question? A question over here, on the right side.

Hi. Sorry. There was a previous limitation in the Secrets Store CSI driver where you couldn't select, like, star, select all; you had to know the name of every secret you wanted, and not just a path. Is that still a limitation? Do you mean referring to a secret with, like, a regex or something? Yes. This is something that, say, the External Secrets Operator supports, but I think it's a limitation of the Secrets Store CSI driver. We showed that, when using HashiCorp Vault, we gave the path in Vault, and the path can contain a few keys inside, if that answers your question. What if I don't know the names of all my keys? I just know, like, the starting part of the string. Like using a regex in order to pull all the secrets that match it. We'll have to get back to you on that, because we are using explicit definitions, but that's a very good question. Thank you. Thank you for that. Any other question? I think that's all, yes.

So, for that actual secret itself, is that namespace-scoped? Or, if we ask for the provider or the storage class itself, is it going to list all the secrets? Or is it specific to a namespace? So, for now, namespace isolation... I can't hear that well. Is there a chance to adjust the volume somehow? The sound? No? Okay. Did you hear him? Can you repeat it? Please, sorry. Okay, let me come closer and speak to you. With regards to that actual secret that you're creating, is that namespace-scoped or is that cluster-scoped? You mean the provider class? Okay, so it's namespaced. You're not seeing everybody's secrets, because we're looking at ten million secrets. Okay, perfect. And that is then also secured with RBAC in that namespace. All right, thank you so much. Thank you.
Oh, sorry, yes. He asked whether the SecretProviderClass is namespace-scoped. It's a namespaced resource that we manage per namespace, like a Pod or something like that. Yes, we said it is. And then he asked if role-based access control can be applied to this resource, and the answer is yes: like any other resource, we can create Roles and ClusterRoles with the exact API version, the resource, and the verbs we want to apply. Of course, we can also go down to the resource name, but this is it. Can I have a... Yes. How are we time-wise? Okay, three more minutes.

I was curious, how do you deal with potential downtime when you have to redeploy after you change the secret? You mean downtime of Vault, or... Let's say you change the secret in your Vault; then you have to change it in all your versions. Oh, the rolling updates. So it's pretty specific to Firefly, but we kind of have the privilege of using some more constant secrets. Regarding the short-term secrets, we have a Kubernetes operator that is responsible for running all our jobs. Before a job starts, we have a validation that validates, not against Vault, because, as I said, you don't want to be dependent on it. We have our own kind of rotating-secrets service, and we're using its logs and metrics, and we are using CRDs to update... We have a CRD per integration, and we update the last time we rotated the secret and what the expiration will be, and we will run the job only if the secret is valid. If not, we will put the job back in the queue, and the queue will delay it for several more minutes. So the workload validates if it's correct? Yes, we have a Kubernetes operator that executes all of our jobs. Thank you.

Hi. When it comes to storing secrets in etcd, you can enable encryption. I'm thinking about version control systems: you shouldn't store plain secrets in there, but there are systems like SOPS, Mozilla SOPS, that allow you to store encrypted secrets. How does this compare?
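The RBAC answer above can be sketched as a namespace-scoped Role over SecretProviderClass resources; the role name, namespace, and the commented resource name are illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spc-reader
  namespace: csi
rules:
  - apiGroups: ["secrets-store.csi.x-k8s.io"]  # the SecretProviderClass API group
    resources: ["secretproviderclasses"]
    verbs: ["get", "list", "watch"]            # read-only access
    # resourceNames: ["vault-foo"]             # optionally narrow to one resource
```

A RoleBinding in the same namespace would then grant this Role to a user or service account, keeping visibility scoped to that namespace only.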
Do you mean... SOPS is also a great solution for storing secrets in a VCS, but actually encrypted. Like we said, there are many great solutions: we have the CSI driver, and I'm sure many of you have also heard of the External Secrets Operator, and there's also SOPS. It just depends on the needs of the application. Thank you. So, are we done? Okay, here are some useful links. Firefly has two open source solutions. One of them is aiac, with more than 2,000 stars on GitHub; it's an artificial intelligence infrastructure-as-code generator. We also have valid-iac, which is another open source tool to enforce best practices on your infrastructure as code. As Gal said, go to Firefly, an application that helps DevOps, SRE, and platform teams manage cloud complexity; you can start for free at any time. This is our company's LinkedIn page, mine, and the most talented Gal Cohen's. Thank you very much again. Thank you for being here. We have one more last question, and you can all catch us later.

I just really want to chip in. We are doing the same thing in my team, and the CSI driver can be very troublesome when it comes to heavy load. You need to think about the fact that there are three containers in the CSI driver. If you are going to use it under heavy load, you need to think about the node driver registrar and the secrets store containers, and about increasing their capacity. Then you need to think about resource quotas, and then you need to think about making sure that there is priority on the DaemonSet Pod. What happens when your node is autoscaling is that the secrets store Pod can be evicted, and all your applications that need it will wait until it comes back. When your cluster is under a lot of load, you have to think about these other things to make sure that the DaemonSet Pod is not evicted. I just wanted to add that. Thank you very much. Thank you all.