I'd like to thank everyone who is joining us today. Welcome to today's CNCF webinar, Encrypting Data in Kubernetes Deployments: Protect Your Data, Not Just Your Secrets. I'm Kristy Chan, Marketing Communications Manager at CNCF, and I'll be moderating today's webinar. We would like to welcome our presenter today, Maxim Jankovsky, VP of Engineering at Zettaset. A few housekeeping items before we get started. During the webinar, you are not able to talk as an attendee. There is a Q&A box at the bottom of your screen. Please feel free to drop your questions in there and we'll get to as many as we can at the end. This is an official webinar of the CNCF and as such is subject to the CNCF Code of Conduct. Please do not add anything to the chat or questions that would be in violation of that Code of Conduct. Basically, please be respectful to all of your fellow participants and presenters. With that, I'll hand it over to Maxim to kick off today's presentation. Take it away.
Thank you very much, Kristy. Good morning, good day, and welcome everybody. As they say in the airline business when you board a plane, we realize you have a lot of choices when it comes to how you spend the next hour of your time. Actually, with the quarantine today, you don't have a lot of choices, but we're still going to try and make this hour very educational and very informative. So, with that, I'm Maxim Jankovsky, the vice president of engineering at Zettaset. A brief bio: I've been working with data security and encryption software for over 14 years in companies such as Ingrian Networks and SafeNet, and my team has been responsible for developing data security solutions, including the KeySecure data security appliances that are still being successfully sold and shipped to customers as part of the SafeNet, Gemalto, and now Thales security portfolio.
And at Zettaset, we're developing encryption and security software that is being used in today's ever-changing marketplace. A brief agenda of what we're going to do today. This is not linear; it's kind of a summary of what we're going to talk about. We're going to talk about global security challenges and the state of encryption: where we came from, how we ended up here, and what it looks like now. We're going to talk about data breaches and data protection in terms of DevOps and DevSecOps. We're going to talk about what it would take to engineer an application or an application system with security in mind. And, of course, true to the topic of today's webinar, we're going to talk about how we protect enterprise data, and then we'll have enough time to go into Q&A.
So, first item on the menu: data breaches. Cyber attacks are increasing in frequency. Everybody has been targeted. You'll see a number of companies mentioned in a small font down there at the bottom of the slide, and there are some big names there. Equifax, Verizon, just to name a few, Whole Foods, IRS, Blue Cross. You look across the companies and you realize every single sector has been hacked. Data breaches happen very often, and in terms of cost they're quite pricey to enterprises: about $3.62 million per breach. And if you look at it, around 42 percent of the cost of a data breach is actually the cost that enterprises incur in lost business. That's quite substantial. So, nobody's going to be surprised when the next data breach occurs. What remains to be seen is what enterprises are going to do to prevent these data breaches from happening. And again, through today's presentation, we're going to talk about containers. We're going to talk about containers in production. Container technology is relatively new.
It's been on a rapid adoption curve over the past several years. What the graphs you see on the screen show is that the use of containers has been increasing pretty dramatically, and the use of containers in production has also been increasing. On the second graph especially, if you look at the number of containers that people are running in production, you'll notice a drop in deployments of 50 or fewer containers. At the same time, you'll notice an increase in the number of containers in larger clusters. What that indicates is that a lot of people are moving from kicking the tires to running containers in production. With the number of clusters and the number of containers in production increasing, the number of attacks is obviously increasing. Consequently, the importance of protecting against those attacks increases as well. Some of the surveys indicate that 69% of those surveyed intend to store sensitive data in containers. Around 76% use containers for storing and manipulating data that falls under some sort of regulation. The staggering number is that 94% of those surveyed experienced one or more security incidents in the past 12 months. I think the other 6% are just not saying.
What are the challenges? Why is data protection such an important topic? In fact, data protection and data security is a super important topic when you talk to enterprises regarding their storage challenges. It's also in the top three security challenges with protecting data in containers, along with vulnerability management and runtime protection. The CNCF survey conducted in 2019 indicates quite a good uptick in respondents using Kubernetes in production from 2018 to 2019, about a 20% uptick in the number of containers that are being used in production. Users are certainly expecting more security with their deployments.
On compliance, 68% of those surveyed indicate that compliance is critical, a must-have. Those who indicate that it's nice to have are probably those moving into larger-scale deployments and also moving more containers to production. I fully expect some of that 28% to spill over into the 68%, and that number to increase as well. If there are a few takeaways from these slides, they are that containers are on the upswing and container environments are becoming more and more prevalent. Enterprises are moving data into containerized environments and they're increasing the sizes of those environments, and storage, data security, and compliance are becoming some of the major factors on the road to successful container deployments.
How do you protect your data in general, not just in container environments? Encryption is, I would say, the best form of data protection. By the way, encryption in and of itself is not the be-all and end-all of data protection for your enterprise environment. Data protection is usually a combination of tools, but we like to call encryption the last line of defense. After your firewall is compromised and your environments are broken into, the next thing you have is the safe in your living room that stores your most valuable information. So you have to encrypt throughout the process. You have to start at collection, you have to encrypt all data manipulation pipelines, and preferably you do that at the time when data is created. Any sensitive information that must be stored must be encrypted. And of course, you have to log and monitor all data activity, because oftentimes log mining is one of the tools that gives you early visibility into a data breach. Encryption is super important. Data protection is super important. So why aren't more people deploying it? Why is it not the norm? And this is where we talk a little bit about what encryption was and what it is now.
Encryption used to be quite complex to deploy and to manage. It used to impact performance to a substantial degree. A lot of crypto accelerators and a lot of specialized hardware have been built over the years to address performance concerns. And if you look at those surveys back in 2017, they clearly indicate that 78% are concerned about deploying encryption because it's going to impact system performance and latency, and enterprises just cannot slow their applications down. Also, encryption used to be, and some of the current encryption solutions still are, not simple to manage. You have to identify which data you want to encrypt and then you have to manage the encryption solution. You cannot just point and encrypt, and you should be able to. Enforcement of policy is another chief hurdle to adopting encryption. And of course, with increased cloud, on-premise, and hybrid deployments, people are concerned about how well encryption is supported in those hybrid deployments. Of course, with the rise of cloud and virtualized environments, scalability becomes a problem. We used to just deploy an encryption server in the data center and call it a day, but now we may not even have access to the data center. And the number of environments is increasing, the number of containers is increasing, and therefore the number of cryptographic keys that are used to protect those environments is increasing. One of the surveys we don't show here indicates that over 40% of enterprises are not using any key management tools, and are actually storing encryption keys in a combination of text files and spreadsheets, which does not really sound like a 21st century security practice, or a security practice in general.
And of course, one of the other concerns is integration with other security tools, because large enterprises have a combination of solutions and a combination of tools, and encryption needs to integrate with those. So it's not easy. Integrating encryption has not been easy, and it's still not. So when your enterprise is at the point where it either has to comply with a regulation or is just doing the good housekeeping of protecting its customers' data, and your task, or that of one of your colleagues, is to start by choosing an encryption solution, what is it that you look for? What does it mean to have a good encryption solution in your enterprise? As we already discussed, performance is super critical. So you want a solution whose performance penalty is a very small percentage. Encryption is not free, even with today's crypto accelerators and native encryption support in Intel chips. Encryption still costs performance cycles, but you want to keep that to a minimum. Businesses don't want to have any impact on their existing processes. If you talk to healthcare providers especially, they cannot be adding even a minute to a patient appointment, because their practice essentially runs on seconds. Then there is scalability across physical environments, virtual environments, and hybrid environments. The environments come and go. They get deployed several times a day. They get decommissioned several times a day. And of course, you cannot have a person run to the data center and install a security appliance every time a new environment comes on board.
So back in, I can't really say back in my day, but back in the late 20th century and even the early 21st century, enterprises had to have dedicated people on their staff who understood how encryption works, how encryption is deployed, what's an encryption key, what's a key wrapping key, what's a hash key, what types of algorithms are recommended and what types are not, and, after you deploy, how you manage and troubleshoot that system. And it's pretty difficult and pretty expensive. So anything encryption solutions can do to make it simpler to manage and simpler to deploy, and anything these solutions can do to make sure that you don't need specialized cryptographic expertise, makes for a better solution. And of course, there have been a number of compliance initiatives since the first data breaches back in the early 2000s that compromised the financial sector pretty severely. That's where MasterCard and Visa came up with PCI, the Payment Card Industry standards. So there are all of these compliance initiatives around PCI, around the financial sector, around the healthcare sector, which later resulted in the HIPAA regulations, and so on and so forth, with GDPR being one of the notable ones of the past few years. Encryption is a very good vehicle to address the chief concerns of those compliance initiatives and essentially pass the compliance audits.
So one thing I'd like to put to rest right at the beginning of the more technical part of this presentation: there are tools that have been created over the years that attempt to make it simpler to deploy encryption and to manage encryption. Some of them are self-encrypting drives, some of them are file encryption solutions, and so on and so forth. So why would we not just use them?
So on the left of the slide you can see what I would call the encryption stack, which is essentially a software and hardware stack where you can apply encryption. Starting from the hardware, you can go all the way up to application-level encryption. The interesting thing about this stack is that as you go higher up the stack, you're talking about more purpose-built solutions and greater performance degradation. As you go lower down the stack, you're talking about more generalized solutions and better performance, but oftentimes you have to sacrifice some granularity, especially in databases and applications, where you might not be able to encrypt a column. In the database, you might have to encrypt the entire table. Or you might not be able to encrypt just the table; you will have to encrypt the entire partition that the database stores its files on. So the goal of this slide is to show that there needs to be a compromise between performance and granularity. And self-encrypting drives are very low on the stack and very appealing, but they're not that compromise. We're going to talk about what it takes to make a good encryption solution that you can trust with protecting your data. To make it short on this slide, I'm just going to say that self-encrypting drives are certainly not going to make you a good encryption solution. Their key management is borderline non-existent, or if it does exist, it's very much substandard. Also, how do you manage a data center of 10,000 self-encrypting drives when you need to replace or manage those installations? And finally, if you're in a cloud environment, do you really have a choice as to whether or not self-encrypting drives are used, how often they're decommissioned, and how often they're replaced? So, not the answer.
And with that, we're going to talk about DevOps and DevSecOps, why we need to put security in place at design time, and why bolting security on afterwards is a very, very bad idea. DevOps is essentially the ability to deliver quality applications, quality software, fast. And DevSecOps is the ability to do this with security in mind. So not only are you delivering quality applications on a fast and predictable schedule, you're also quickly developing secure applications that enterprises can trust to store their sensitive data. So security should be baked in from the very beginning. You need to identify the primary drivers for your security initiatives. Is it compliance? Is it good housekeeping? Is it the desire to protect customers' data? Usually it's a combination, or at least hopefully it's a combination. And so how do you balance security with regulatory compliance? The reason I bring this up explicitly is that regulatory compliance is usually done on a timeline, and enterprises oftentimes attempt to achieve compliance with the least amount of investment in security initiatives. So there is a fine line, a balance that you have to figure out, between making the environment more secure and achieving compliance. You have to look at what security solutions are appropriate for your environments, not just today but going forward as well. And it may be tempting to just say that environments come with secrets and passwords and all different kinds of ways to store your sensitive data. Secrets and passwords are great, they're very important, they're critical to environment functionality, but they protect your processes. They do not protect your data. So a good security solution is a security solution that you can trust. Because if you can't trust your security solution, then why bother deploying it in the first place?
A security solution, as we discussed, is a combination of components, and all those components talk to each other, so they have to be able to trust each other. That is why pretty much every security solution needs to have one form or another of what's called a certificate authority. Think of it as the United States Department of State that issues passports, or the DMV that issues driver's licenses. It's essentially a service that authenticates security processes and other processes to one another, so when you talk to a process in the system you can trust it. My favorite analogy is this: you encrypt your environment, you have one or more encryption keys, so where do you put your encryption keys? You need to have a key manager, a system that is capable of securely and safely storing those keys. I mean, you wouldn't put your house key under the doormat. Some of us do, some of us put it in the wheel well, we hide it creatively, and we're surprised when we're broken into. So, no keys under the doormat. There are a number of key managers available in the market, and they differ. Depending on your security requirements you can opt for software-based key managers or hardware-based key managers; that's up to you. But when you look at a security solution and it doesn't specifically mention how it stores and protects your keys, I say look somewhere else. And of course there should be a root of trust, and when I say root of trust, I mean the root of trust for storing what's called the master key. Essentially, your key manager protects your most valuable assets, the encryption keys. So who protects your key manager? For that there's a special component in security solutions that's called the security module.
And there are a number of security modules which are hardware-based and a number which are software-based, but it's kind of a given that if you have a key manager, it needs to have a security module component, so that the encryption key database is itself protected and encrypted with the keys stored in the security module. And the security module knows how to store the small number of master keys in a secure and compliant way.
So, containerized environments are quite different. When you're within a container, you can't necessarily tell that you're within a container, because containerized and virtualized environments do a pretty good job of hiding the fact that you're within a small container from the developer or from the end user. But containerized environments are very, very different. Storage in containers is different, and therefore the data, especially the sensitive data, must be protected in different ways. The first key point is that encryption must follow storage, which means that in multi-tenant container environments the storage will be shared. But even if you share the storage, you should never share an encryption key. To put it slightly differently, you'd better trust the people you give your house keys to, right? That's kind of obvious. I'm a little bit surprised that not sharing an encryption key is not so obvious, but apparently it's not. Storage must be independent of the hosts running the containers, meaning today your container might run on one host and tomorrow it might run on another host. That actually means that the storage must be shared between containers. So we cannot use the legacy approach of hardware-defined storage, where the storage is directly and physically tied to the host. We cannot use that approach when managing storage for containers.
And last but not least, and also super critical, is the separation of duties. We have different roles and different actors now. We have developers, we have platform operators, we have administrators, we have a number of other roles, but there needs to be a clear separation of duties. It's not that you shouldn't trust your developers, but the developers are not in the best position; they're not the best people to ask to make security decisions. These decisions have to be made elsewhere in the enterprise, and you want to maintain that separation of duties.
So let's look at an example of a topology that you may see in a typical, maybe even simpler, Docker environment, where you have one or more Docker hosts and each host runs one or more containers. Containers obviously belong to different applications. The whole purpose of virtualization is sharing resources between containers and sharing resources between tenants. So you can have different Docker hosts running different containers for different applications belonging to different customers. The key point is the storage and the containers. Every storage unit associated with every container must be encrypted with its own unique key. Why is that so important? Because compromises happen. We said at the beginning of the presentation that environments will be compromised. When an environment is compromised, you want to limit the exposure. If somebody compromises my development environment, I don't want that exposure to spill out to my payroll environment or my finance environment. If I am a solution provider and I host more than one customer, I certainly don't want one compromised customer to compromise my entire multi-tenant environment. So that is why there is no sharing of keys. So how would we accomplish that, let's say with Docker? We would look at Docker's storage mechanisms and we would create what's called an encryption volume driver.
So at the time a container requests storage, it would be given a storage volume that is already encrypted. It would be encrypted with its own key that is not used for any other storage volume anywhere else. And the key will be securely stored in the key manager, with all the certificate authority and security module infrastructure we talked about earlier in the presentation. This is done by directly integrating with the volume driver mechanism that is part of Docker. And we are going to have some sort of volume provisioning and some sort of volume management. What we show here is a very simplified volume provisioning based on a traditional Linux storage model. But the volume provisioning might as well provision specific cloud-based storage in AWS or in Google Cloud. The most important part is that by the time the container gets the volume, the volume is already encrypted.
So let's look at a more fluid environment, where Kubernetes makes it easy to orchestrate a very large number of containers. Kubernetes has a number of worker nodes, similar to Docker hosts, but now containers can run pretty much anywhere. In the Docker world, at least there is some association between where a container runs today and where it will run when it's restarted tomorrow. In Kubernetes, a container can be scheduled on one node and rescheduled on another node at different times depending on the environment load. That's the Kubernetes way of running containers. The storage is most likely shared. The same storage principle holds: every storage unit associated with every container has to be encrypted with its own unique key. And I specifically emphasize that at the Kubernetes master level we have secrets, and the secrets, as we all know, are stored in etcd as key-value pairs. Up until the latest revisions of Kubernetes, etcd was not encrypted. Now there's an option to encrypt it.
But an important thing to realize is that secrets, just like I said at the beginning, are kind of like a password file. It is critical, absolutely. To Java developers, secrets are kind of like the JKS key store. So what do we store in a key store? We store keys, we store passwords, we store certificates, and so on and so forth. But we don't use the key store, and we don't use secrets, to store encrypted data. And so the same approach would be beneficial in this type of environment: you see a Kubernetes pod lay out a volume claim for a particular storage unit, or a storage class as we call it in Kubernetes. That storage class would hopefully refer to an encrypted volume driver. Just like specialized volume drivers know how to provision NFS storage or AWS storage or any other type of storage, an encrypted volume driver would know how to provision an encrypted volume on request. An important thing to note is that every pod will get its own separate and distinct volume. And every volume will get its own separate and distinct encryption key and its own separate and distinct backing storage. That backing storage will be transparently provisioned in the encrypted state by the encrypted CSI driver. This type of approach lends itself very, very well to enterprise use cases where, again, as we discussed earlier, encryption is not the one tool that solves every data protection problem. A big problem of data protection is: can you trust your environment?
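As an illustration of the volume claim flow just described, here is a minimal sketch that builds a storage class and one volume claim per pod as plain Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The provisioner name `encrypted.csi.example.com`, the class name `encrypted-sc`, and the claim names are hypothetical placeholders, not the name of any real driver; a real encrypted CSI driver documents its own provisioner string and parameters.

```python
import json

# Hypothetical CSI driver name (assumption); use the provisioner string
# documented by whichever encrypted volume driver you actually deploy.
ENCRYPTED_PROVISIONER = "encrypted.csi.example.com"

# A storage class that points at the (hypothetical) encrypted driver.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "encrypted-sc"},
    "provisioner": ENCRYPTED_PROVISIONER,
}


def volume_claim(name, size="1Gi"):
    """Build a PersistentVolumeClaim that requests storage from the encrypted class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class["metadata"]["name"],
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# One claim per pod, so each pod ends up with its own distinct volume.
claims = [volume_claim(f"data-{pod}") for pod in ("payroll", "dev")]

manifest = {"apiVersion": "v1", "kind": "List", "items": [storage_class, *claims]}
print(json.dumps(manifest, indent=2))
```

Nothing in the claims mentions encryption; the pods only name a storage class, and whether each volume gets its own key is entirely up to the driver behind that class.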
And so take the example of Red Hat OpenShift, where they provide an infrastructure and framework for essentially certifying containers and Kubernetes operators. That gives you an additional level of assurance and security: whenever someone develops and certifies a solution, it goes through a certification process that assures that container images are built on top of a trusted platform. You can trust the container image, you can trust the deployment mechanism, you can trust all of the components that run within this certified environment. And when you get all of those container images, all of the operator images, everything related to any software, including encryption software, you get it from a trusted source.
So we talked about different places and different levels at which we can apply encryption. What are some of the advantages of implementing encryption in a way that's native to containers? We already talked about a unique key per volume: each persistent volume is encrypted with its own cryptographic key, so one compromised container does not compromise the entire multi-tenant environment. We talked about how secrets are not protected by default, although they can now be protected in later Kubernetes releases, but the important notion is that they do not protect the data, so a separate data protection solution is required. As we said earlier, password files don't protect data, they protect your environments, and JKS key stores don't protect your data, they protect your environment. Every Unix system has a password file, but that doesn't mean a Unix system doesn't need a separate encryption and security solution.
One very large benefit of native container encryption is that, with the proper key management infrastructure, you can securely erase the data without actually having to erase the data. That is done by decommissioning the cryptographic key in the key manager, so when the container goes away, and if the container were to come up again, it won't be able to acquire the encryption key necessary to decrypt the data. As the number of nodes in the cluster grows, sometimes nodes get compromised, and sometimes nodes need to be decommissioned and replaced. If these nodes have sensitive data, you don't always have the ability to connect to a node and delete the sensitive data. With the secure node removal feature, you can actually do that by executing administrative commands, so that even if the node is compromised, or even if the node is later brought up and connected to its native network, the corporate network, it will not be able to access the data. That's again done by managing what's called the certificate revocation list in the key manager. And last but not least is container storage separation. We already talked about why it's important to encrypt each container volume with its own unique key. Container storage separation allows you to go even deeper on that: every container volume is mapped to a unique logical volume, and that logical volume is only available when it's in use by one or more containers. So these are the things that you want to look for in a software-based security solution, or in any security solution you choose to deploy. Hopefully, with proper deployment mechanisms and a proper choice of solution, you'll be able to deploy a solution in the enterprise that will protect your data, protect your enterprise from breaches, and help it stay in compliance.
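The two ideas recapped above, one key per volume and erasing data by retiring its key, can be sketched in a few lines. This is a toy, in-memory illustration only: `ToyKeyManager`, `KeyDecommissionedError`, and the volume names are hypothetical, and a real key manager would keep its keys encrypted under a master key held in a security module rather than in a Python dictionary.

```python
import secrets


class KeyDecommissionedError(Exception):
    """Raised when a volume's key has been retired in the key manager."""


class ToyKeyManager:
    """Hypothetical in-memory stand-in for a real key manager (illustration only)."""

    def __init__(self):
        self._keys = {}        # volume id -> 256-bit key
        self._retired = set()  # volume ids whose keys were decommissioned

    def create_key(self, volume_id):
        # Every volume gets a fresh random key; keys are never shared or reused.
        if volume_id in self._keys or volume_id in self._retired:
            raise ValueError(f"volume {volume_id!r} already has or had a key")
        key = secrets.token_bytes(32)
        self._keys[volume_id] = key
        return key

    def get_key(self, volume_id):
        if volume_id in self._retired:
            raise KeyDecommissionedError(volume_id)
        return self._keys[volume_id]

    def decommission(self, volume_id):
        # Destroy the key material. The ciphertext may still sit on disk,
        # but nothing can decrypt it any more.
        self._keys.pop(volume_id, None)
        self._retired.add(volume_id)


km = ToyKeyManager()
payroll_key = km.create_key("payroll-vol")
dev_key = km.create_key("dev-vol")

# The dev environment is compromised or retired: drop only its key.
km.decommission("dev-vol")

try:
    km.get_key("dev-vol")
    dev_readable = True
except KeyDecommissionedError:
    dev_readable = False

print("dev volume readable:", dev_readable)
print("payroll key intact:", km.get_key("payroll-vol") == payroll_key)
```

Because the payroll volume has its own key, retiring the dev key leaves it untouched, which is the blast-radius argument for never sharing keys between volumes.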
I think that covers the entire presentation for today, and let me see if there are any questions.
Yeah, awesome. Thanks, Maxim, for a great presentation. Just a reminder, folks, that we do have a Q&A box at the bottom of your screen, so if you do have a question for Maxim, feel free to drop it in. It looks like we do have one question already lined up for you, Maxim. It's from Miguel. It's about encrypted storage. It says: My first concern is reliability. If we lose or damage some bytes on storage, can we still recover the disks or most of the contents? From my experience, this is also a problem, even on my laptop with BitLocker. How is it solved on containers?
Okay, very good question. Yes, encryption is essentially an additional process that's done on the data, so it does put a heavier load on storage. So it's a very valid concern as to how we recover from lost or damaged or even worn-out bytes on storage, especially if you look at some of the more recent developments in storage technologies. Solid state drives, for example, are known for their wear. So how do we do this? You deploy proper storage mechanisms that provide you a certain level of redundancy. You back up your storage regularly, and you use specialized backup, not just RAID, not just highly available disk volumes. You're also backing up the data regularly. When you deploy encryption, one of the critical parts of your data pipeline becomes your encryption key. So one thing you look for in a security solution is what kind of key manager infrastructure it provides. If it provides a key manager, great; that already puts it ahead of many security solutions. Is that key manager highly available? Is it storing its data on highly available storage volumes? Because what you want is to make sure that not only your data is protected, but your keys are also protected.
Okay, awesome.
Looks like we have another question. It's from an anonymous attendee: can you elaborate on what you mean by transitioning my DevOps environment to DevSecOps?
Right, so this is a good question. It's essentially how you take an environment where the goal is to quickly deliver quality solutions and turn it into an environment where security is part of that quality differentiator. So you'd like to not just say I'm in DevOps, which is kind of like agile, quickly developing solutions that have a certain level of quality. Security also becomes an implicit part of quality. That's the transition from DevOps to DevSecOps.
Great, okay. I'll read the next one, and I apologize if I'm butchering the name, but it looks like Saruba is asking: could you please elaborate on secrets not being protected by default?
Right, so Kubernetes secrets are not protected by default. Secrets are essentially stored in etcd, which is part of the core Kubernetes deployment. etcd is a key-value store, and the storage that etcd uses to hold the key-value pairs is not encrypted by default. So if your etcd is compromised, then the storage is exposed, and the secrets are exposed. Later versions of Kubernetes provide a way for you to encrypt the stored secrets. That adds a certain level of protection: at least when you look at the stored secrets, you will not be able to gain access to the underlying secret values.
Great, all right. Well, I think that covers all the questions today. Thanks again, Maxim, for a great presentation. That's all the time that we have for today. Just a reminder that the webinar recording and slides will be online later today. Thanks again, and we look forward to seeing you all at a future CNCF webinar. Have a great day!