All right, this is going well so far, everyone. So hi, I'm Tiffany Jernigan. I'm a developer advocate at VMware, as my slide says up there. If you use Twitter and want to talk afterward, my Twitter handle is up there as well. That's how modern technology works, because sometimes this thing here fails too. All right, cool. So this is a security conference, but I'm hoping and assuming that not everyone here is a security expert who already knows a ton of stuff. Hopefully someone here is newer to it. If not, well, you probably know everything here already, or maybe you'll learn something anyway, and that would be cool as well. All right, so first off: less is more. The less that you have, the more secure you can be. And I'm really bad at holding a mic while doing this, so I'm sorry if I keep moving around and you can't hear me. So first, have less code. Wherever possible, use managed services and existing security solutions that are already out there instead of creating your own. If you're using something off the shelf, a lot more people have probably already been working on hardening it; unless you go and create your own dedicated security team, chances are it's better than what you'd build yourself. Also, give fewer permissions. Only grant the permissions that are necessary; don't go and give every single person admin access to everything. And avoid having long-lived secrets. You also want to have fewer dependencies, which minimizes the attack surface. For instance, you might want to use a distroless image instead of Ubuntu as a base. Also, keep up with the latest recommendations. If you go to the Kubernetes docs, for instance, there is a security checklist there. You might not be able to see it too well on the slide.
I'll be posting the slides later, but basically there's a link there that gets updated frequently. So the Kubernetes documentation breaks down cloud-native security into four Cs: cloud, cluster, container, and code. As a practitioner, you basically need to worry about hardening your infrastructure at every single one of these levels. You can't just say, I only care about one thing and not the rest. So first, I'm gonna talk about cluster components. Unless you really need to run everything yourself, I suggest that you don't. Instead, use a managed Kubernetes offering. Managed services take care of a lot of things for you, so you don't have to do it yourself, and there's a smaller chance of making a mistake, screwing something up, introducing security vulnerabilities, et cetera. Security and hardening is basically one of the primary value adds of cloud platforms. So, as I was mentioning, you need to worry about hardening at every single layer. Harden the control plane: only allow people to interact with the API server via the Kubernetes APIs. The less you have running on your control plane servers, the fewer potential avenues an attacker can exploit to gain access to your cluster. Make sure that you use real TLS certs, such as from Let's Encrypt. And please do not be lazy and skip TLS cert verification. There's this nice little flag called `--insecure-skip-tls-verify`, and it's basically as good as not having TLS at all. There's also a talk called "PKI the Wrong Way" that demonstrates how an end user's ability to create client certificates for mTLS can be leveraged to gain complete control of a cluster. It basically demonstrates the need to restrict access to etcd, and there is documentation in the Kubernetes docs that tells you how to go about doing that.
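To make the TLS-verification point concrete, here's a minimal sketch of a kubeconfig cluster entry that pins the cluster's CA instead of skipping verification. The cluster name, server URL, and CA path are all hypothetical:

```yaml
# Hypothetical kubeconfig fragment: verify the API server's TLS cert
# against a pinned CA bundle rather than setting
# insecure-skip-tls-verify: true (which disables verification entirely).
apiVersion: v1
kind: Config
clusters:
- name: prod-cluster                          # hypothetical cluster name
  cluster:
    server: https://api.prod.example.com:6443  # hypothetical API server
    certificate-authority: /etc/kubernetes/pki/ca.crt
```

If you ever see `insecure-skip-tls-verify: true` in a kubeconfig that talks to anything beyond a throwaway local cluster, treat it as a bug.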
And basically, etcd by default assumes that anyone who can successfully authenticate with mTLS has proper access, which isn't necessarily what you want. All right, so make sure that you use Pod Security Standards so others can't get access that they shouldn't have. Pod Security Policies used to be the mechanism here, so it's a bunch of acronyms, PSPs and PSS, but PSPs were deprecated in 1.21 in favor of Pod Security Standards (PSS). There's also the cloud metadata server, if you're running on a cloud rather than completely on your own servers locally. There are credentials and info that get handed to the virtual machine, and you basically never want your pods to have access to that. Lots of scary vulnerabilities can happen if you allow attackers to reach the metadata server. Someone could, for instance, get the credentials associated with the VM's service account. And if a pod can access those, it can get your node credentials, and it might be able to do things like registering with the cluster itself during bootstrap: the pod could pretend that it is a node and get workloads assigned to it, which is pretty bad. As for upgrading, if you're dealing with all of this yourself, there's a bunch of things you need to worry about. Kubernetes is designed so that upgrades can be relatively seamless, especially if you're using a managed offering where you can maybe just click a button and have it do some of those things for you. We've come a long way since OpenStack, when you had to create a new cluster, move everything over to it, and then go delete your old cluster. And I've seen way too many people still running super old versions of Kubernetes, because sometimes it's just easier to not do anything. But please don't do that. Some managed offerings even have the option to do auto upgrades, which just makes life a lot easier.
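As a sketch of the PSS side, the built-in Pod Security admission controller is driven by namespace labels. The namespace name here is made up; the label keys and the `restricted` level are the standard ones:

```yaml
# Enforce the "restricted" Pod Security Standard on a namespace,
# and also warn and audit against it, via namespace labels.
apiVersion: v1
kind: Namespace
metadata:
  name: team-sandbox                              # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

With `enforce` set, pods that violate the restricted profile (privileged containers, host namespaces, and so on) are rejected at admission time.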
You're making sure you have fewer CVEs to deal with, ones that were maybe found years ago and have long since been patched. Basically, try to stay up to date with the latest patches, or as close to the newest release as you possibly can. And if you're scared about what could go wrong with updating: it probably won't be things blowing up during the upgrade itself. It's more likely that you'll run into something like a deprecation cycle and then things no longer work. So how does that work? Say you have some API version that needs to be deprecated, for instance Ingress at v1beta1. A new API, v1, was introduced in 1.19. At that point, if you're using the old version, you're gonna get a bunch of warnings saying, hey, this is being deprecated, it'll be removed in 1.22. You don't have to panic like the world is ending and deal with it immediately; things won't blow up right away. However, do plan for it. Basically, once you start decommissioning your 1.18 clusters, just migrate over at that point. And remember, Kubernetes will support both v1 and v1beta1 during that entire period, which is about a year or so. So you basically have a year to upgrade your YAML; you don't really have an excuse there. For your enjoyment, this is the list of things you have to do if you're upgrading a cluster yourself, in the specific order you have to do them in: you have to upgrade your control plane; you need to upgrade your nodes; you need to upgrade clients such as kubectl; and then you have to deal with things like adjusting manifests and other resources based on API changes that accompany the new Kubernetes version. So as you can see, you really want to use managed offerings so that you don't have to deal with all of that yourself.
So, quotas and limits. They can be used so noisy neighbors don't affect other workloads running on the same host. They can also prevent an attacker from using your infrastructure for things like DDoS or crypto mining, which, unless you're trying to do the crypto mining, you probably don't want someone else doing. You can also use things like taints and tolerations to schedule workloads away from each other if you decide you want to. You can add another layer of security by using sandboxed container runtimes, such as gVisor, Kata Containers, or Firecracker. They basically add another layer of compute isolation between the containerized workload and the underlying operating system kernel by adding another layer of virtualization. By default in Kubernetes, every single pod can talk to every other pod. If you're not cool with that, then you need to use network policies. Kubernetes also assumes that you can trust the underlying network, for example the network of your cloud provider if you're using one. If you don't actually trust it, one way to address that is to set up a service mesh with full end-to-end encryption, for example with mTLS. Some people might say that's overkill, but it all depends on what you specifically want. You can also go the extra mile and use tools like Cilium for advanced network policies, such as applying filtering on specific API routes. For example, you can have some front-end client pod that you want to be able to talk to /v1/users, but not to, say, /v1/billing, using eBPF magic. So, secrets. Secrets can have different RBAC permissions; I'll go a bit into what the heck RBAC is later if you don't already know. If you go hardcore with that, you can even encrypt them at rest. Kubernetes supports that with encryption at rest, which will encrypt resources such as Secrets in etcd.
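To sketch the network-policy point: a common starting move is a default-deny policy in a namespace, which you then punch holes in with more specific allow policies. The namespace name is hypothetical:

```yaml
# Deny all ingress and egress traffic for every pod in the namespace
# by default; more specific NetworkPolicies then allow what's needed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-sandbox   # hypothetical namespace
spec:
  podSelector: {}           # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```

Note that NetworkPolicies only take effect if your CNI plugin (Calico, Cilium, and so on) actually enforces them.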
And basically, that prevents parties who gain access to your etcd backups from being able to view the content of those secrets. Also, please do not put your secret data inside of a ConfigMap. Use actual Secrets, and have your ConfigMap reference those. Even better, don't put stuff directly in Secrets yourself; use something else that will put them in there for you. The whole point is to avoid committing your secrets to, say, your Git repo and then, hey, look, the whole world sees them. That's pretty bad. You may find that kind of obvious, but just as a reminder. Basically, you don't want to manage secrets yourself unless you really have to. You want to use something like HashiCorp Vault, Sealed Secrets, Kamus, or SOPS, or if you're using a cloud provider, you can use its KMS for secret data encryption. Leverage the work that others have done to make your life easier and more secure. So, people, such as you all or your devs, or robots, say automated services and code running on your clusters: all of those need to be able to talk to the Kubernetes API so they can deploy things, scale up, scale down, roll out new versions, do monitoring, basically all of the things. But we don't want our autoscaler doing things such as mining Bitcoin. Or maybe your developers ran away with a credit card database, or, probably more realistically, their laptops got stolen, which may have happened to me on my last trip for KubeCon. So you want to make sure people can't access this data. That's why this whole next section is about user management and permissions. We have a good old separation between two things here. People might just say auth, but what kind of auth? There's authN, which is authentication, which is basically: who are you?
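A minimal sketch of the "secret data goes in Secrets, not ConfigMaps" point; all the names, the image, and the placeholder password here are hypothetical:

```yaml
# Keep the credential in a Secret...
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials        # hypothetical name
type: Opaque
stringData:
  DB_PASSWORD: change-me      # placeholder; in practice, inject via Vault/SOPS/etc.
---
# ...and have the workload reference the Secret, rather than
# baking the password into a ConfigMap or the pod spec itself.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app              # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/demo-app:1.0   # hypothetical image
    envFrom:
    - secretRef:
        name: db-credentials
```

The point of the split is that the Secret can get tighter RBAC and encryption at rest, while the ConfigMap and pod spec stay safe to commit and share.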
Then there is also authZ, which is authorization, and that is: okay, now we know who you are, but what are you allowed to do? For authentication, Kubernetes is very flexible. You can use things such as TLS certs, maybe with your own root CA or not, or OIDC tokens with any OIDC provider. Maybe it's something self-hosted like Dex or Keycloak, or SaaS like Okta. And that in turn can plug into your cloud's IAM, if you're again using a cloud, so it can map cloud provider users to your Kubernetes users. Make sure to recall what the best practices are. For instance, prefer short-lived credentials: use OAuth access tokens, for example, instead of a username and a password. So for humans, you might have to worry about TLS, OIDC, service accounts, et cetera. A fun note: in Kubernetes, you don't just create a user. You say, hey, I want this person or thing, Jane Doe or whatever, or your name, or some robot, to have the ability to create pods, and you grant permissions for that. Basically what happens is, as long as someone shows up with a valid cert or OIDC token for that user, they can go and create pods. And robots are handled with service accounts. There's also the SPIFFE project, which you can use for authenticating from one service to another. It's an advanced use case, but with it, you can work with short-lived identities that help you move away from long-lived secrets. If you want to look more into that, you can go to spiffe.io. All right, so authorization. Again, this is: do you have permission to do whatever you're trying to do, whether it's creating pods, deleting something, listing, et cetera? The Kubernetes API has several different authorization modes. There's ABAC, which is attribute-based access control.
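As one hedged example of wiring up OIDC authentication: the API server takes a set of `--oidc-*` flags, often set in its static pod manifest. The issuer URL, client ID, and claim names below are hypothetical; adjust them to match your provider:

```yaml
# Fragment of a kube-apiserver static pod spec showing OIDC flags.
# Issuer, client ID, and claims are hypothetical examples.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --oidc-issuer-url=https://dex.example.com   # hypothetical issuer
    - --oidc-client-id=kubernetes                 # hypothetical client ID
    - --oidc-username-claim=email
    - --oidc-groups-claim=groups
```

With this in place, a token issued by your OIDC provider maps to a Kubernetes user and groups, which you can then target with RBAC bindings; on managed offerings, the equivalent mapping is usually done through the cloud's IAM instead.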
There's Node, which is a special-purpose authorization mode that grants permissions to kubelets based on the pods they're scheduled to run. There's also Webhook authorization, which allows you to use a custom authZ system, such as plugging in your existing one. And the main one is RBAC, role-based access control, which is what we'll look into next. The high-level idea in Kubernetes is that you define a role, which is a collection of permissions, things that can be done: for instance, being able to list pods, create certain types of resources, scale some resource up and down, et cetera. Then you bind the role to a subject, whether it's a user, a group, or a service account, using a RoleBinding or a ClusterRoleBinding. Also, make sure to audit your RBAC. There are a bunch of different kubectl commands you can use to audit permissions. The first one listed there is available by default, to query the API authorization layer; everything else there is plugins you can add. And just as a note, you can't take away permissions. If you give someone full access, you can't then say, oh, well, I don't want them to have access to this specific namespace. If you want to go about it that way, you have to grant specific access to each thing individually. Permissions can be defined cluster-wide, or scoped to a specific namespace. So you can say, hey, I want someone to only have access to this random sandbox namespace, but I don't want them to have access to prod. That can be a way to get some separation. You could, for instance, even make someone an admin in their own specific namespace, and if they blow everything up there, it was their namespace, so it won't destroy anything else. So again, do not give blanket admin permissions to just everyone.
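The role and role-binding idea above, as a minimal namespaced sketch; the namespace, role name, and user name are hypothetical:

```yaml
# A Role granting read-only access to pods in one namespace...
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader            # hypothetical role name
  namespace: team-sandbox     # hypothetical namespace
rules:
- apiGroups: [""]             # "" is the core API group (pods live there)
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# ...bound to a single user in that namespace only, not cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: team-sandbox
subjects:
- kind: User
  name: jane-doe              # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

You can then audit the result with something like `kubectl auth can-i list pods --namespace team-sandbox --as jane-doe`, which queries the API authorization layer directly.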
That would basically be like giving out root access. Namespaces basically make it easier to be lazy: when in doubt, if someone needs access to a bunch of stuff, create a namespace specifically for them, give them access to that, and call it a day. So, software supply chain. In 1982, there were the Chicago Tylenol murders. Somewhere along the supply chain, something was being put into Tylenol that was killing people, and the solution was basically adding a tamper-evident seal. The analogy here: transport is about avoiding a person-in-the-middle attack. Your factory is your CI/CD pipeline, and you can check it by having the concept of a secure factory. For instance, you could have two CI/CD pipelines do the same build and check whether the resulting binaries are the same. And then there's threat modeling: identifying who the actors are in your organization, perhaps disgruntled employees, et cetera. So if a new vulnerability gets disclosed, how do you, A, determine if you're affected, and B, deploy a fix or mitigation? Are you even sure that you aren't accidentally, or someone isn't unwittingly, running a Bitcoin miner? That could be bad. Basically, you can't be confident in the software you're running unless you know where it came from and how it got there. There's a bunch of questions you need to be asking, or things you need to be validating, as you go from writing code to running it in production. Does a change come from a trusted person? One tool that can help here is Sigstore's Gitsign, which you can use to sign commits. You're identifying who you are, and it's keyless, so you don't have to worry about GPG keys, which is pretty cool. You can also use Cosign for signing container images. Next, is the source code coming from your source repo? Can only authorized people push to it? When you build your artifact, is it from a trusted source? Is the build system itself trusted?
Can you trust where it's running and what it's running on? Once your artifact or program is built, can you trust where it's being pushed to, and that only trusted people can push there as well? Ideally, only your build system and your CI/CD pipeline can do the pushing. Also, make sure to do vulnerability scanning on your images to look for known CVEs. There are tools out there such as Clair and Trivy, and some container registries, such as Harbor, have this built in. You should also consider using an OPA implementation for policy enforcement; Gatekeeper is one of the options there. Sigstore also provides an admission controller to assert image signatures and other attestations. There are some threat models, and a bunch of different ways to mitigate them; due to time I might skip through this really fast. But for instance, say someone manages to make a change to code that you depend on. It's not your code, but something you're pulling in. Or someone decides, hey, I'm gonna delete this, or it gets accidentally deleted. That could be a huge problem, so you may want immutable dependencies. Are you vendoring, so you don't have to worry about a dependency going down, for instance? There are also attacks on your build infra. This happened with SolarWinds: some code produced by the build server had malicious code injected into it that allowed external access. To avoid things like this, you may want ephemeral builds, so there's no dedicated machine that someone can hack into. Basically, you want a build system that spins up on demand and then goes away afterward. If it doesn't exist, it can't be hacked. A few more things. Long story short, if you spin up a cluster by clicking around in some sort of web console, or use something like `eksctl create cluster`, the result is far from production ready.
For instance, backing up your cluster can be really important; you can use a tool such as Velero for that. Observability is another big thing. Metrics can help you determine whether something is going as expected, and if it's not, help you figure out what is happening. Also, good logging is indispensable for security, because we want to be able to audit who does what. We want to be able to see things like user and privilege changes, creation of resources, security events, interruptions to logging, et cetera. And don't forget, at the end of the day, this is running your code. If you have problems with your code, Kubernetes is not going to catch them all. So you want to use source code analysis tools, such as the ones OWASP lists, to help you analyze source code and/or compiled versions of code to find security flaws. Here are some other resources, some of which I guess I have already mentioned. I'm doing this talk again, hopefully with fewer technical difficulties, at SpringOne in December, and there will be other talks there on both Kubernetes and Spring. If you've never heard of Spring, it's probably because you're not in the Java world, but there's that, if you want to come over. If you could fill out the feedback, that would be super helpful; that way I can work on improving this talk. I'll leave that up there for a second, since I apparently have two minutes. All right, well, thank you very much. Come up and talk to me afterward; I think I'm pretty much at the end here. Special thanks to a bunch of folks who helped me learn all this, because I went into this with: I don't know anything about security, I should learn it, therefore I'm gonna propose a talk about security and force myself to do it. So that's how this went. So yeah, again, there's my Twitter, and thanks again, everyone.