Hi, I'm Tiffany Jernigan. I'm a developer advocate at VMware. As a warning, this is my first time giving a security talk, because I didn't really do security beforehand. The whole goal of this was to learn security so I could talk about it for people who are not already security experts. So if you're already a security expert and focused on security, you'll know everything in here, and you can tell me later if I got something wrong. All right, so you're here for 40 minutes, or if you decide to walk out early, I won't cry too hard. This is my first time giving a conference talk, besides smaller events, since 2018. So I'm a little bit nervous, but thanks for coming. Also, that's an awesome mask. All right, hopefully my clicker is nice to me, since sometimes it's not.
All right, so less is more. That's basically the opposite of what is happening on this slide right now, because there's a bunch there. The less you have, the more secure you can be. So you want things like less code: if possible, use existing security solutions instead of creating your own. An off-the-shelf solution likely has more effort focused on that specific problem, since that's what its authors are specifically trying to build, and it may therefore have more eyeballs on it than your own would, unless you hire an entire team to do it. Also give fewer permissions: only give the permissions that someone actually needs. Don't go around giving admin permissions to every single person. If I did that for everyone here, someone might be like, hey, I don't like you, so I'm just going to delete everything. That sucks. Also avoid long-lived secrets. And have fewer dependencies, so you're minimizing the attack surface. For instance, you may want to use something like a distroless image instead of Ubuntu.
Also keep up with whatever the latest recommendations are. This keeps changing, things keep evolving, and people keep finding out new things. One place you can go is that link in the Kubernetes docs. You can periodically check what they suggest as a security checklist if you're specifically dealing with Kubernetes.
The Kubernetes documentation breaks cloud native security down into four Cs. I stole this graphic from there. There's cloud, cluster, container, and code. So as a practitioner (I can't even talk, I'm so nervous) you have to think about hardening your infrastructure at every single level. You need to worry about every single one of these. You can't just care about one thing and not care about the rest. Maybe as a specific person you can worry about one thing, because you have a big team and there are people dealing with each part. But you can't ignore the rest.
So first, I'm going to talk about platform and cluster components. Unless you need to run everything yourself or you need very specific access, for instance to do very specific things inside the control plane, I suggest using a managed Kubernetes offering. Managed services take care of a lot of the work for you. That means there are fewer places where you can make mistakes and end up with more security issues. Security and hardening is already one of the primary value adds of these cloud platforms.
Now specifically for securing control planes and nodes. You might hear the word hardening a lot when people talk about security. Make sure you harden the control plane. For instance, only allow people to interact with it through the API server, that is, the Kubernetes APIs. The Kubernetes docs, which have a lot of stuff, will tell you how to do things like restricting access to etcd.
So the less you have running on your control plane servers, the fewer places people can attack and exploit to gain access to your cluster. You also need to think about what your pods have access to: with the wrong access, a pod can do things like obtain node credentials. That's a problem, because then other people might be able to get access to your node, which you don't want to happen.
You also want to use a TLS cert. You can use something like Let's Encrypt, for instance. Don't be lazy and use that little flag there that skips TLS cert verification, because then it's as good as not using TLS at all. There is a talk by Tabby Sable called PKI the Wrong Way, from KubeCon a little while ago. It demonstrated how an end user's ability to create certificates for mTLS can be used to gain access to the entire cluster. That shows why you would want to restrict access to etcd, and obviously things like that happening is really bad. Basically, etcd assumes that if you can successfully authenticate with mTLS, you have proper access. Did that actually move on? OK, cool.
Also make sure that you have pod security standards. You want to make sure others don't get access that they shouldn't have. Up until 1.21, you may have been dealing with pod security policies for that. Those were deprecated in 1.21 in favor of pod security standards. Lots of acronyms. There's a lot there.
If you're running on, say, some cloud provider, there is a cloud metadata server. That's where credentials are handed to the virtual machine. You basically never want your pods to have access to that. Lots of pretty scary vulnerabilities open up if you allow attackers to have access to the metadata server.
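Going back to the point above about not skipping TLS certificate verification, here is a minimal Python sketch using only the standard library `ssl` module. The helper names (`make_verifying_context`, `make_insecure_context`, `is_verifying`) are made up for this illustration and are not from the talk. The insecure variant is shown only for contrast with what a skip-verify flag amounts to.

```python
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Client-side TLS context that verifies the server's certificate chain and hostname."""
    # create_default_context() turns on certificate and hostname verification.
    return ssl.create_default_context()


def make_insecure_context() -> ssl.SSLContext:
    """What a 'skip TLS verify' option amounts to. Shown for contrast only."""
    ctx = ssl.create_default_context()
    # Hostname checking has to be disabled before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def is_verifying(ctx: ssl.SSLContext) -> bool:
    """True only if the context checks both the certificate chain and the hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname


if __name__ == "__main__":
    print("default context verifies:", is_verifying(make_verifying_context()))
    print("skip-verify context verifies:", is_verifying(make_insecure_context()))
```

With verification off, the connection is still encrypted, but the client has no idea who is on the other end, which is why it is treated as roughly equivalent to no TLS.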
For instance, with access to the metadata server, attackers could get the credentials associated with your VM's service account. That would not be good at all. At the link there, Jérôme Petazzoni has a YAML manifest that basically gives an attacker access to your nodes if you don't have those protections in place. So you can play around with that.
So, upgrading Kubernetes. You want to make sure that you're upgrading Kubernetes. You don't want to just install and set up Kubernetes and then be like, OK, everything's great, and carry on. There are a few reasons why that could be bad. One is that big CVEs get found, and you want to move past them. Kubernetes is designed so that upgrades can be relatively seamless. You don't have to go, hey, here's my cluster, let me create a new cluster, move everything over, and kill off the old one. You don't have to do all of that. You may have had to back in the day. It's come a long way since OpenStack. Still, a lot of people are using very old versions of Kubernetes, which is not great. And the more often you upgrade, the more seamless it is. If you're using a managed offering, you might even have an option to enable auto upgrades. If you don't, it's usually a one-click button to update the version on your nodes, the version on your control plane, et cetera. Try to stay up to date with the latest patches. If you're scared about what could go wrong: most likely things won't break when you upgrade, especially if you do it regularly and don't wait an extreme amount of time, to the point where things are starting to get deprecated. You might see some of those deprecation warnings there. So for your enjoyment, this is the list of things you need to do if you are manually updating each part of your cluster.
So first, you'd have to update your control plane, then your nodes, then clients like kubectl, and then adjust your manifests and other resources based on whatever is changing in the API. If you are running your own entire Kubernetes cluster, there are a lot of little things you have to manage. Hence the suggestion to use a managed Kubernetes offering, because they deal with a lot of this for you and make sure the versions and all these things fit together.
So, isolating compute. Quotas and limits can be used so that noisy neighbors don't affect other workloads on the same host. This can also prevent an attacker from using your infrastructure for things like DDoS or crypto mining, which you probably don't want someone else doing. You can use taints and tolerations to schedule workloads away from each other. You can also add another layer of security by using sandboxed container runtimes. There's gVisor, there's Kata, there's Firecracker. These add another layer of compute isolation between the containerized workload and the underlying OS kernel by adding another layer of virtualization. A lot of these things just keep adding more separation, so it's harder for people to attack.
For storage, there's only one thing on here, because I really don't know much of anything about storage in the security realm, but there is the Container Storage Interface, or CSI. As you see, people love acronyms, but it's easier than saying the whole thing. It's a standard for exposing arbitrary block and file storage systems to containerized workloads on container orchestration systems like Kubernetes. And you can go... I should have put a link there, seeing how much empty slide I have. Sorry.
So, isolating network resources. By default with Kubernetes, every pod can talk to every other pod. If you're not cool with that, you want to use network policies.
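To make the network policy point a bit more concrete, here is a small Python sketch that builds two NetworkPolicy manifests as plain dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The namespace `demo`, the labels `app: api` and `app: frontend`, and port 8080 are hypothetical values chosen for this example. Note that policies only take effect if the cluster's network plugin enforces them.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Policy that selects every pod in the namespace and allows no ingress traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        # An empty podSelector matches all pods; listing Ingress with no rules denies it.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_ingress(namespace: str, target: dict, source: dict, port: int) -> dict:
    """Policy that lets pods labeled `source` reach pods labeled `target` on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-selected-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": source}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names: namespace "demo", frontend pods calling api pods on 8080.
    policies = [
        default_deny_ingress("demo"),
        allow_ingress("demo", {"app": "api"}, {"app": "frontend"}, 8080),
    ]
    for policy in policies:
        print(json.dumps(policy, indent=2))
```

The design follows the "less is more" idea from earlier: start from deny-all, then add back only the specific paths a workload needs.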
Kubernetes also assumes that you trust the underlying network, for example the network of your cloud provider. You may not. If you don't, one way to address that is to set up a service mesh, so you have full end-to-end encryption, for example with mTLS. Some people might say that's overkill. With security, well, not everything, but a lot of it depends on what you specifically need and how far you need to go. You could also go the extra mile and use tools like Cilium for things like advanced network policies. For instance, you could filter on specific API routes. Say you have a route that's something like user API, v1, users, or whatever, and you want your pod to be able to talk to just that, but you don't want that pod to be able to talk to, say, billing. You can do that using eBPF, which I am not getting into in this talk. But it's another keyword to go Google or Bing or whatever later.
So, secrets. Secrets can have different RBAC permissions. If you go a little bit hardcore, you can even encrypt them at rest. There is a link up there to the Kubernetes docs telling you how to do it. Again, yeah, Kubernetes docs. Kubernetes supports encryption at rest, which will encrypt resources like secrets in etcd, preventing parties that gain access to your etcd backups from viewing the content of those secrets, which is very important. Also, don't put the data that belongs in a secret directly into a config map. You want to put that into a secret and have your workload read it from that secret. Otherwise it's basically plain text passwords sitting somewhere. Even better, don't put stuff directly into your secrets yourself, but use something that will put it in there for you.
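One reason the points above about encryption at rest and not hand-writing secrets matter: the `data` field of a Kubernetes Secret manifest is only base64-encoded, not encrypted. The following Python sketch shows that anyone holding the manifest can read the values back. The function names and the example secret `db-credentials` in namespace `demo` are hypothetical and only for illustration.

```python
import base64
import json


def secret_manifest(name: str, namespace: str, values: dict) -> dict:
    """Build a v1 Secret manifest; values are base64-encoded as the API expects."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


def read_back(manifest: dict) -> dict:
    """Decode a Secret manifest's data. No key is needed, because base64 is not encryption."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in manifest["data"].items()
    }


if __name__ == "__main__":
    # Hypothetical secret; never commit a real manifest like this to a Git repo.
    manifest = secret_manifest("db-credentials", "demo", {"password": "not-a-real-password"})
    print(json.dumps(manifest, indent=2))
    print(read_back(manifest))
```

This is why a Secret manifest checked into a repository, or an unencrypted etcd backup, should be treated as exposing the plain values.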
The whole point there is to avoid committing secrets to, say, your Git repos, which you might find obvious, but it definitely ends up happening by accident. You don't want to manage secrets yourself unless you really have to. You want to use some sort of key management system or service, such as HashiCorp Vault. There's also your cloud provider's KMS, or key management service (again, more acronyms), if you're using one, for secret data encryption. So it's what I was mentioning a while ago: try leveraging the work that others have done to make your life easier, versus starting from scratch on every single thing you need. Yeah, there's a bunch of other things there. There's Sealed Secrets, there's KMS, and there's SOPS. You can Google them by name, because I am bad and did not upload the slides yet. When I upload the slides, which I will do after this talk, you'll be able to click on these things.
All right, so the next section is on user management and permissions. People, whether that's you, your devs, or whatnot, and robots, meaning automated services and code running on your clusters, need to be able to talk to the Kubernetes API so they can do things like deploying, scaling up, scaling down, rolling out new versions, monitoring, viewing logs, et cetera, basically all of the things. But we don't want our autoscaler to do something like mining Bitcoin, or developers running away with, like, the user credit card database. Or, maybe more realistically, someone steals their laptop and tries to get credit card information. And laptop stealing definitely happens. It definitely happened to me when I was on a work trip a couple of months ago. Whoops. So that's why, not because my laptop got stolen, but because of the rest of what I said, we're going to go over user management and permissions in this next section.
So there are two main things: AuthN, which is authentication, and AuthZ, which is authorization. There's a little separation there. AuthN is: who are you? And AuthZ is: what are you, that specific person or robot or thing or whatever, allowed to do?
With authentication, Kubernetes is very flexible. You can use things like TLS certs, maybe with your own CA or not. You can use OIDC tokens with any OIDC provider. There are in-house things like Dex and Keycloak, or you can use SaaS, like Okta, which can in turn plug in with whatever you're using. Say, if you're on a cloud provider, you can use it with IAM. For instance, you can map cloud provider users to Kubernetes users. If we go back to thinking about what the best practices are: use things that are short lived. For instance, you could use OAuth access tokens instead of a username and a password. For humans specifically, so everybody in here, you could use things like TLS, OIDC, service accounts, et cetera, and on the client side, client certs. In Kubernetes, you don't really create a user. You say, hey, I want to give create-pods permission to Bob. And as long as someone shows up with a valid cert or OIDC token for Bob, this person or thing or whatever can create pods. Robots end up using service accounts. There is also a project called SPIFFE, and you can use that for authenticating from one service to another. It's an advanced use case, but short-lived identities can help you move away from long-lived secrets.
So then AuthZ, or authorization. Again, that's basically: do you have permission to do what you are trying to do? Like, do you have permission to create a pod? Can you do a get on this? Can you delete things? What is it that you're allowed to do? There are a few API server authorization modes.
There's Node, which is a special-purpose authorization mode that grants permissions to kubelets based on the pods that are scheduled to run on them. There's ABAC, which isn't used as much, as most people end up using RBAC, which is what I'm going to talk about in the next part of the slides. There is also Webhook. Webhook authorization allows you to use a custom AuthZ system, such as plugging in whatever your existing system is.
I'm not going to give you a super accelerated course on RBAC, because in total we have 40 minutes, so I'm just going to give you a few little things. The high-level idea in Kubernetes is that you define something called a role. This is a specific collection of permissions, things that can be done: can I list these pods? Can I create a deployment? Can I scale up, scale down, or do something with a specific deployment or resource? The next thing is that you bind that role to a user, a group, or some sort of service account. You can do that either as a role binding or as a cluster role binding. And it's a bunch of YAML.
Also, once you have created these RBAC roles, don't just be like, hey, I created them, this is probably right. And don't check manually, like, create pod, did that work? Can I do this? That's a lot. So make sure to audit your RBAC. There are a bunch of tools you can use for that. kubectl has one built in, which is kubectl auth can-i, and you can list the things that you specifically can do. There are also those other ones here, which are plugins that you can add and then use via kubectl as well, which is pretty cool. Note though that you can't take away permissions. You can't say, here is admin access, and now I'm going to take away your access to delete pods.
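The role and binding idea above, and the fact that permissions can only be granted and never subtracted, can be sketched in Python as follows. This is a deliberately simplified model, not the real Kubernetes authorizer: `is_allowed` ignores API groups, resource names, wildcards, and cluster roles. The user `bob`, the namespace `workshop-bob`, and the role names are hypothetical.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def make_role(namespace: str, name: str, resources: list, verbs: list) -> dict:
    """Build a namespaced Role manifest granting `verbs` on core-group `resources`."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": name},
        "rules": [{"apiGroups": [""], "resources": list(resources), "verbs": list(verbs)}],
    }


def bind_role_to_user(role: dict, user: str) -> dict:
    """Build a RoleBinding manifest that attaches `role` to a named user."""
    role_name = role["metadata"]["name"]
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": role["metadata"]["namespace"], "name": f"{role_name}-{user}"},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": role_name, "apiGroup": RBAC_GROUP},
    }


def is_allowed(bound_roles: list, verb: str, resource: str) -> bool:
    """Simplified check: a request is allowed if ANY rule grants it. There are no deny rules."""
    return any(
        verb in rule["verbs"] and resource in rule["resources"]
        for role in bound_roles
        for rule in role["rules"]
    )


if __name__ == "__main__":
    reader = make_role("workshop-bob", "pod-reader", ["pods"], ["get", "list"])
    print(bind_role_to_user(reader, "bob"))
    print("list pods:", is_allowed([reader], "list", "pods"))
    print("delete pods:", is_allowed([reader], "delete", "pods"))
```

Because the check is a union over everything bound to the subject, the only way to get "everything except X" is to grant less in the first place.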
Basically, everything you do grants access to one specific thing that someone is allowed to do. So for instance, you can't say, you can update deployments, but you can't do it to this one specific deployment. If you wanted something like that, you could put things in separate namespaces, where someone only has access to do things in one namespace and not in another.
All right, so namespaces again. Permissions can be defined cluster wide, or in a specific namespace. If you're just playing around with Kubernetes, you might only have the default namespace. You could give someone permissions to do whatever they want there, but then you're like, hey, I have some other namespace with things that I don't want this person to see at all, or I don't want them to be able to delete something in it, because I need to make sure it keeps working. Maybe one is a playground, then dev, prod, whatever you want to do. So you can give specific access based on that. They could even be an admin in their own namespace if you want them to. For instance, you might have some sort of workshop where everything is running on the same cluster, and each person doing the workshop has their own namespace and can only do stuff in that namespace. They can't go, oh, I don't like the person next to me, I'm going to delete everything they just did in this workshop so far. Or there are more specific cases that are maybe a little more hardcore than playing around in a workshop.
So there are a few gotchas. There are a bunch of gotchas, but I'm only going to talk about a couple of them. For one, don't give admin permissions to just anyone. That might be obvious, but sometimes you may be in a hurry and not think about it. You're like, I just want this person to do this one thing, and it's easy, I'll just give them full access.
Whether it's intentional or an accident, things might happen and you might regret it. If you want to be lazier about it, maybe just give them access to a specific namespace, but again, the fewer permissions you give someone, the better.
Then there's this thing with listing secrets, or just using list in general. At least last I checked with Kubernetes, you would assume that list literally just shows you a list. Say you ran it on secrets: it would just show you, here's what the secrets are, and that's all you can do, and get is what you would use to see the specific details. However, that is not the case. With list, you can actually get the contents of the secrets as well. So for instance, if you were to deploy some sort of Ingress Controller that manages TLS, the TLS certs are usually stored in secrets. If the Ingress Controller has access to do list on those secrets, it actually has access to all of those secrets. And say it's deployed cluster wide, as opposed to in a specific namespace, which is generally what ends up being the default. Now this Ingress Controller has access to all of the secrets on your entire cluster. So not exactly great. If someone is able to get into your cluster via your Ingress Controller, they can now see all the secrets that you have, at least if you don't have any sort of restrictions in place.
So next up is software supply chain, which has been one of the big key terms people have been talking about, at least in the last year, especially if you have been going to KubeCon. Who here has heard of supply chain security? Okay, cool, a small number of people. All right, so to give you an idea of what it is in a way that is not specific to tech.
A while ago there was this thing that was termed the Chicago Tylenol murders. Basically, at some point along the path, something was being put into the Tylenol that was killing people. They weren't sure where that was happening. Was it like a man-in-the-middle attack? Was it when the Tylenol was being handed off to somewhere else that someone messed with it? Was it in the factory? The factory you can view as a CI/CD pipeline. In software, at least, you could check that: say I have two CI/CD pipelines doing the exact same thing, is something happening with one versus the other? You could check, hey, is this binary the same as the other one? Then there's threat modeling. There's identifying who the actors in the organization are. Is there an angry employee or something like that? But for the Tylenol, there were so many steps along the way that you had to ask, where in this path did someone put something into the Tylenol to cause this? And you had to figure that out.
So in software there are a bunch of things to ask. Who has access to your source repositories? What about the dependencies? How is your software built? Where are you storing things? How are they deployed? There's a whole trail of breadcrumbs to go through. Say a new vulnerability is disclosed: how do you determine, one, whether you're affected by it? And if you are, how do you deploy a fix or some sort of mitigation? How do you know that you aren't unknowingly running some sort of Bitcoin miner? That could be happening and you may have no idea. So basically you can't be confident in the software you're running unless you know where it is coming from and how it got there.
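The "is this binary the same as the other one?" check mentioned above can be sketched with a cryptographic digest. This is a minimal Python illustration using the standard library. The function names are hypothetical, and it assumes the two pipelines produce reproducible, byte-for-byte identical builds, which many real build setups do not without extra work.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def file_sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Digest a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(build_a: bytes, build_b: bytes) -> bool:
    """True if two build outputs have identical digests."""
    return hmac.compare_digest(sha256_hex(build_a), sha256_hex(build_b))


if __name__ == "__main__":
    # Stand-in byte strings for the outputs of two independent pipelines.
    pipeline_one = b"example build output"
    pipeline_two = b"example build output"
    tampered = b"example build output plus something extra"
    print(same_artifact(pipeline_one, pipeline_two))
    print(same_artifact(pipeline_one, tampered))
```

A mismatch does not say where along the path the change happened, only that the two artifacts differ, which is the cue to start walking back through the steps.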
So things you might want to do are: restrict the paths that code can take from source to production, and make sure that you can trace deployed code back along each step of the supply chain. There are a number of questions you need to ask yourself, or things you need to validate as you go, starting from writing your code all the way through to running it in production.
For instance, you need to check: hey, someone made a change. Does this change come from someone I trust, or is it just some random person? One thing you can use to validate this is Sigstore. There are a bunch of projects within Sigstore, and there's a tool called Gitsign that you can use to sign your commits, so that you are identifying who the person behind a commit is. It's keyless, so you don't have to worry about using GPG keys. Later down the path (I figured I wouldn't put Sigstore on the slide and then more things that are Sigstore) there's also Cosign, which you can use for signing container images.
After that: is the source coming from your source repo? If you're pulling down code, is it actually from where you think it is? And can only authorized people push to it? Again, you want to make sure that it's not just some random person who can push code up to wherever you have your code stored. When you're building your artifact, is it built from a trusted source? Can you actually trust the build system itself? Can you trust where that build system is running and what it's running on? Then once your artifact or your program is built, can you trust where it's being pushed to? Whether that's a registry, or something like npm: can only trusted people push to it?
Ideally, the only thing that can push to wherever you're pushing is your CI/CD pipeline. That way you know specifically what is pushing. If I gave access to every single person in this room, for instance, that's a lot of people to go through to figure out who pushed something and what's happening with it.
Also make sure that you are doing vulnerability scanning for your images, so you can look for known CVEs. There are tools out there such as Clair and Trivy. Some container registries, such as Harbor, have that built in, where you can just turn on image scanning and pick which of the tools you want to use. You should also consider using an OPA, or Open Policy Agent, implementation for policy enforcement. One of the options there is Gatekeeper. You could also use Sigstore, since they provide an admission controller to assert image signatures and attestations as well. And apparently I forgot a space there, but yeah, you get the idea.
All right, so some actual threat models, things that have actually happened. For instance, someone might make a change to code that you depend on. Maybe they broke the code, or maybe they just decided to delete everything there. Whether it's malicious or not, something happened. For instance, someone might delete a widely used package on npm, such as left-pad. Then you have your code, it's pulling down code from somewhere else, and oh no, everything broke. It's not working anymore. To avoid this problem, you want to have immutable dependencies. So, are you vendoring your dependencies? If you aren't, you're effectively giving someone else arbitrary write permissions to your code, in a way. Another one is the SolarWinds attack. It was an attack on build infrastructure.
Basically, code that was produced by the build server had malicious code injected into it that allowed external access. So the attackers were able to inject malicious code and have it signed with the SolarWinds signing key, which is not good. One way to avoid that is to have ephemeral builds. Don't have a dedicated machine that someone can just hack into. You want to have it so that a build spins up a new machine on demand and builds at that point. Basically, if it doesn't exist, it can't be hacked. You can also have hermetic builds, which are builds that don't have access to the outside world.
So you have to deal with all those things I've already mentioned, which is a lot. If you memorized it, please give me your memory skills, because I don't have them. But there are a few more things I wanted to mention that don't specifically fall into platform or RBAC or supply chain. When you create a cluster, say you're on some sort of web console and you click create cluster, or maybe you're using eksctl or some other CLI tool to spin up a cluster, at that point your cluster is far from being production ready. It has a lot there for you, but it does not have everything. There are things you might want, such as being able to back up your cluster. If something happens and you have backups, you can easily revert to what was there beforehand. You can use tools such as Velero for that. Another big thing is observability. Metrics can help you determine if things are going as expected or not. And if they aren't, they can also help you figure out what is happening, whether someone did something or something just broke in general. Good logging, that is super important.
It's basically indispensable for good security, because we want to be able to audit who is doing what. You want to be able to see, for instance, did someone just get access to do whatever they want in the cluster? Who did a specific thing, so you can track why something broke or why something happened and who did it? You want to be able to see user and privilege changes, things that are being created, security events, interruptions to logging, et cetera. Basically, log as much as you can in a way that is useful to you.
Also don't forget that at the end of the day, assuming you are writing code and not just running things that everyone else wrote, you are running code. If you have problems with your code, Kubernetes is not going to catch everything that is wrong with it by default. So you want to use source code analysis tools, such as OWASP, to help you analyze the source code or compiled versions of the code to find security flaws.
There are some more resources. I mentioned some of these on previous slides as well. If you go to the CNCF landscape, you can see there's a section there specifically related to security. The landscape keeps growing. If you know every single thing on the landscape, that's crazy and wow. I definitely don't. My coworker and friend Josh Long described it as kind of like a fractal. You just click it and it keeps getting bigger and bigger. The Kubernetes docs have a specific section on security. If you're specifically dealing with your own cluster and administration of it, there are links for that as well. Again, there's a bunch of stuff for Sigstore that you can look at too.
On Friday, the coworker I just mentioned, Josh, who is very much into Spring and Java, and I have a talk. It's prerecorded, since unfortunately he did not come to Dublin.
But if you want to learn about creating Kubernetes operators and controllers using Spring and Java, we have a talk on that. It's from 15:55 to 16:35 at Wicklow Hall, assuming things don't change, because I've noticed a few people saying the times for their talks have changed in the last day or so. So I guess check on that.
Yeah, so thank you all for coming here and sitting or standing. Special thanks to these lovely people who taught me a bunch of stuff about security that I did not know in advance. Again, I proposed this talk with the goal of learning security and being able to present it from the viewpoint of somebody who is just learning it. A lot of times, if you're new to something and you attend a talk by someone who's super senior, the point where they were just learning it was six years ago, 20 years ago, or whatever it is. So the talk might be way too advanced, or just at a different level than what you're familiar with. Please come and tell me your thoughts and the things I should change and whatnot, since I am also giving this talk, in 10 fewer minutes, at Cloud Native SecurityCon, which is the co-located event with KubeCon. So yeah, thanks everyone. And yeah, my Twitter has been on every slide. It's up there. I've been really bad at using it lately, but I will try to be better. So thank you. Have a good rest of your conference.