Hello, I'm excited to speak about containers and isolation levels at KubeCon today. Hi, my name is Jachi Liu, and today I'll be talking about making sense of the various isolation layers in the Kubernetes landscape. I came across this topic while working on an open source project at the University of Chicago, where we ran multi-tenant Kubernetes clusters that allowed different users to access different layers of the control plane. I found a lot of these topics really interesting, and I'm excited to share them with you today.

On the agenda is an introduction to multi-tenancy and isolation, then a deeper dive into the components and isolation layers that Kubernetes supports, and finally a bit about the project I worked on, which focused on container-level and pod-level isolation.

Let's go straight into multi-tenancy and isolation. So what is multi-tenancy? In a single-tenant architecture, each tenant or user has their own instance of their own cluster. In this diagram, user one has their own cluster with a control plane and worker nodes. User two is completely unaware of user one's cluster and has their own cluster to work in as well. In a multi-tenant architecture, some or all of the resources of a given cluster can be shared across multiple tenants or users. In the diagram on the right, user one and user two are both using the same cluster: they might be sharing a control plane, or there might be multiple control planes, and they're sharing the worker nodes and pooling those resources together.

So why do we care about multi-tenancy? Why do we want it? As modern clusters and platforms grow more and more complex, it can be really challenging for platform teams to operate multiple Kubernetes clusters. For example, running dedicated control planes for each cluster, and making sure that deployments, patches, and upgrades across a fleet of clusters stay up to date, standardized, and properly managed, can be really challenging from an operational perspective. Additionally, there's overhead in running a lot of foundational resources, such as policy controllers and observability platforms, that really should be consistent within an organization and might be better off as shared resources. So to reduce that operational complexity, it might be worth running multi-tenant systems.

The general analogy for multi-tenancy is an apartment building with multiple tenants: each of them has their own kitchen, bedrooms, and bathrooms. They might share some resources, like garbage collection or the front lobby, but for the most part they operate on their own, without access to other people's resources.

When it comes to multi-tenancy, you then have to think about isolation, because if everyone is on the same cluster, what does that mean in terms of who has access to what resources, and how much of a given resource a tenant can consume? Single-tenant clusters are very valuable if you have a small number of tenants, and they're definitely the best way to ensure isolation, because everyone is in their own domain without access to anybody else's resources.
And constraints and restrictions meant to protect one tenant from another generally don't apply to single-tenant clusters. On the other end of the spectrum, there's the completely shared cluster, where everyone has full access to everything, which is very much open. Regardless of where on the spectrum you fall, there needs to be some level of trust in multi-tenancy, because ultimately something underlying is shared. Multi-tenant clusters address some of the downsides of managing a separate cluster for each customer or tenant, but they come in different flavors with different restrictions. The goal with multi-tenant clusters is to enable tenants to coexist without impacting each other, which doesn't just mean isolation; it might mean restricting quotas, API access, or resource access for a particular tenant so that no single tenant can bring down the entire cluster.

So ultimately, the trade-off is between cluster management complexity and isolation. In the far-right example, where you have one shared cluster and everyone has the same level of access, there's the least cluster management complexity, but there's also no isolation. In the single-tenant case, where everyone has their own cluster, there's the most isolation, but a lot of overhead and complexity to maintain. Along that spectrum, you might decide that some resources are worth sharing while other constraints need to be in place. There's a constant trade-off between complexity and isolation, and there's no perfect isolation scenario.

Okay, so this talk primarily focuses on the tooling that Kubernetes provides for isolation, but I also want to take a high-level view of what's available for securing your system overall, because those things definitely come into play. Things like the hardware, the hypervisor level, the networking, and the virtual private cloud all factor into how secure your ecosystem ultimately is. For instance, if you have a really strict firewall, you might be able to have a more generous RBAC policy on your cluster. Things like putting web application firewalls in front of your external APIs and restricting access for internal APIs to a given network all provide different layers of protection. So being realistic and practical about your security policies, and making sure you're creating a sustainable environment that your team can understand and manage across layers, is really important. It's not always about locking everything down; it's about understanding the big picture as well. That's something I really wanted to highlight, but the majority of this talk is going to focus on the Kubernetes level and what you can do with your Kubernetes cluster.

Okay, so let's jump into just that. I'm going to introduce these topics by going over the basic atomic primitives within Kubernetes, because these are important building blocks for talking about isolation. Starting off, containers are the atomic unit of work. Containers run in pods, and a pod is a logical unit of the application. Pods are then deployed to nodes, which are the virtual machines serving as the workers, and a cluster contains multiple nodes. The important thing to know here is that you can run multiple containers per pod if your application is logically separated into multiple containers, as in the sketch below.
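To make that concrete, here's a minimal sketch of a pod running two containers that share a volume; the names and images are hypothetical placeholders, not anything from a real deployment.

```yaml
# A minimal pod with two containers. They share the pod's network
# namespace and a scratch volume; names and images are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}            # scratch volume shared by both containers
  containers:
    - name: web
      image: nginx:1.27       # main application container
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-shipper
      image: busybox:1.36     # sidecar reading the same volume
      command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
```

Both containers live and die together with the pod, which is what makes the pod, not the container, the unit Kubernetes schedules.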
So what isolation layers does Kubernetes provide? At the top layer, you have the workload. Within that, tenants share a pool of nodes. An example of isolation here might be that different tenants get priority on resources, a certain number of nodes allocated to them, or different scaling-out behavior, but from an architecture perspective it's really abstracted away: there's a shared pool of nodes that anyone can theoretically access.

The next layer is the control plane, which is the core of the Kubernetes cluster. It's able to segregate tenants and allow for authorization based on RBAC, API priority, and resource allocation, and it's where the quotas, restrictions, and controls actually start to get defined. The control plane includes the Kubernetes API server, the scheduler, and the controller manager, and it also includes etcd for storage and state. The important thing is that this is where the cluster manages and corrects the desired state, and it will make decisions in a way that enforces some of the isolation you might want to add to your cluster. We'll talk more about the tooling and mechanisms that involves.

At the bottom layer are the platform services, which include things like central logging, monitoring, ingress, egress, and internal DNS. A lot of this is going to be shared across tenants, and one of the benefits of a multi-tenant system is that you can share these platform services, such as logging and monitoring, upgrade them consistently, and have the same usage patterns across the different users on your platform. However, you might not want to share all the metadata or underlying information in those platform services. For example, you might want to provide the same DNS tooling to everyone on your cluster, but that doesn't mean you want all tenants to be able to discover each other's services. So there might be additional constraints built into the platform services to make sure everyone has the tools they need to run their applications, but users don't have access to other users' data, such as metrics or logs coming from an application owned by someone else. Again, these foundational services are one of the benefits of a multi-tenant system to begin with, but that doesn't necessarily mean all information at the platform level is shared with all tenants.

Okay, so the namespace primitive is the key way of enforcing and implementing isolation at each layer. Kubernetes doesn't define what a tenant is in a multi-tenant architecture, so it's really up to you to come up with a model that makes sense. For example, you could have namespaces divided by user, where each user on your platform has their own namespace; or maybe your tenant is a team within your organization; or maybe your tenants are different organizations altogether. Those are user-based ways of dividing up a cluster into namespaces and defining what a tenant means to you. Alternatively, you could have different namespaces for different environments, such as QA, staging, and production, or different namespaces for different applications, so that different microservices each live in their own namespace. A couple of these models are sketched below.
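As a minimal sketch of what those models look like in practice, here are namespaces labeled by tenant and by environment; all names and labels here are hypothetical and just illustrate the convention.

```yaml
# Hypothetical tenant-per-team model: one namespace per team,
# labeled so policies can select all of a tenant's namespaces.
apiVersion: v1
kind: Namespace
metadata:
  name: team-alpha
  labels:
    tenant: team-alpha
---
# Hypothetical environment-based model: one namespace per stage
# of the same application.
apiVersion: v1
kind: Namespace
metadata:
  name: payments-staging
  labels:
    app: payments
    environment: staging
```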
So there are a lot of different ways to divide up what a tenant means, and Kubernetes is not prescriptive about it, but the namespace is the foundational abstraction that allows you to define what a tenant is within your own system.

Okay. So the namespace allows you to organize your cluster, define what a tenant means, and decide who shares which policies. But a namespace by itself doesn't create those policies: when you create a namespace, there are no restrictions on whether a user in that namespace can traverse into another namespace. To actually implement different isolation levels, you have to layer the policies supported by Kubernetes tooling on top of the namespaces.

Here are some examples of controls that are relevant and potentially interesting; I'll show a few small sketches of them in a moment. First, use role-based access control to grant access to namespaces and authorize resources and API access. At a basic level, you define which users can access namespace A but cannot access namespace B, creating that foundational policy. Second, use resource quotas at the namespace level to constrain CPU, memory, and storage. You want to limit quotas on a given namespace because, as we talked about before, the worker nodes are all shared, so you could theoretically have a situation where one namespace consumes all of the resources and another namespace is unable to launch pods. You don't want that, so you want to constrain CPU, memory, and storage at the namespace layer. Third, one really interesting and frequent use case: you might need to tie certain nodes to a particular namespace, like I have done in this diagram. That doesn't come by default, but you can use the pod node selector to ensure that pods in a given namespace are only launched on certain nodes. Those are examples of controls that can be useful in creating isolation at the namespace level, and there are plenty of other things you can do and configure within Kubernetes.

Okay, a note on namespaces: in most Kubernetes deployments, tenants are authorized to list all namespaces on the cluster. So if your use case requires tenants not to be aware of who else is on the cluster, namespaces might not be the isolation level that works for you, and there are other open source tools you may need to look into.

The other key thing I want to talk about: I mentioned resource quotas, but often when we talk about isolation, we think only about security, role-based access control, and who has access to what APIs. It's just as important to isolate and restrain resource consumption, because you could have a scenario where one tenant, whether maliciously or unintentionally, abuses the resources on the cluster and overloads it, which will impact all the other tenants if it brings the cluster down. One of the goals of multi-tenancy is that tenants should be able to function without being impacted by other tenants. Kubernetes has great API Priority and Fairness documentation on this topic, and these are really important things to implement in your deployment so that no one tenant can take everything down.
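Here's a minimal sketch of the RBAC piece: a Role and RoleBinding that confine a group to one namespace. The group and namespace names are hypothetical placeholders.

```yaml
# Grants the (hypothetical) group "team-alpha" access to common
# workload resources, but only inside the "team-alpha" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-admin
  namespace: team-alpha
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["pods", "services", "configmaps", "deployments", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Binds the group to that Role. With no equivalent binding in other
# namespaces, the group cannot touch resources there.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-admin-binding
  namespace: team-alpha
subjects:
  - kind: Group
    name: team-alpha
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-admin
  apiGroup: rbac.authorization.k8s.io
```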
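Next, a sketch of the quota and node-pinning controls. The ResourceQuota is standard Kubernetes; the node-selector annotation only takes effect if the PodNodeSelector admission plugin is enabled on the API server, so treat that part as conditional on your cluster's configuration. All values are hypothetical.

```yaml
# Caps aggregate compute and storage requests for everything in the
# team-alpha namespace, so one tenant can't consume the whole pool.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 32Gi
    limits.cpu: "20"
    limits.memory: 64Gi
    requests.storage: 500Gi   # sum of PVC storage requests
    pods: "50"
---
# Pins the namespace's pods to nodes labeled tenant=team-alpha.
# Requires the PodNodeSelector admission plugin; label is hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: team-alpha
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: "tenant=team-alpha"
```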
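Finally, on the API Priority and Fairness point, here's a sketch of capping how much of the API server's capacity one tenant's requests can consume. This assumes a cluster recent enough to serve flowcontrol.apiserver.k8s.io/v1 (older clusters use the beta versions, with slightly different field names), and the group name is a placeholder.

```yaml
# A constrained priority level with a small share of API server
# concurrency; excess requests queue instead of starving other tenants.
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: tenant-limited
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 5
    limitResponse:
      type: Queue
      queuing:
        queues: 16
        handSize: 4
        queueLengthLimit: 50
---
# Routes all requests from the (hypothetical) tenant group into that
# priority level, interleaved fairly per user.
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: team-alpha-requests
spec:
  priorityLevelConfiguration:
    name: tenant-limited
  matchingPrecedence: 1000
  distinguisherMethod:
    type: ByUser
  rules:
    - subjects:
        - kind: Group
          group:
            name: team-alpha
      resourceRules:
        - verbs: ["*"]
          apiGroups: ["*"]
          resources: ["*"]
          namespaces: ["*"]
```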
Okay, now I'm going to talk a little more at the container level. Containers create separation between workloads, even when they're hosted on the same node. Within Kubernetes, containers run in pods; each pod is a logical unit of an application, can contain one or more containers, and is deployed onto a node.

As developers, we tend to think of containers as if they were isolated VMs, but you can very easily break container isolation. I'm going to talk very specifically about containers running as root. By default, containers run as root, and there's a reason for this: you might need root in order to install software, and certain applications like Nginx run as root by default on the server. But it's generally best practice to run as a non-root user, and if you do need to install software on the container, you can install the software first and then run the application as a non-root user. I'll show a small sketch of that in a moment.

The other way container isolation gets broken is through vulnerabilities in dependencies or code that you install onto your container. The way to avoid that is to scan your containers for vulnerabilities, but it can be really cumbersome to scan every single container on every single node constantly, and you want a scan that's still accurate at runtime. The idea is to treat your containers as immutable objects and not download code at runtime. That also means you don't have to run your containers as root at runtime, and you can build your container, push it to a container registry, and scan the containers there for vulnerabilities, instead of scanning every container on the cluster, before the container is ever deployed. You can even use a tool like Open Policy Agent to enforce admission control based on the results of your vulnerability scan, so that you're not deploying compromised containers to your cluster.

Okay, those are my two really important things to highlight about containers and container security. There's quite a lot more, and I've left some resources at the end in case you want to read up on container isolation and security. But now I'm going to dive into a use case from the team I worked with at the University of Chicago and the pod-level isolation we needed.

Our use case was enabling users to launch their own Jupyter notebooks in a browser so that they could run analysis on biomedical data. The flow on the back end looked like this: the user would go to a web application, request a Jupyter notebook, and be told what the CPU and memory constraints were. Behind the scenes, we had an application called Hatchery that would call the Kubernetes API and launch a distinct pod assigned to that user, with a persistent volume backing it for storing the Jupyter notebook. This pod launches our own image of the Jupyter notebook and is isolated from the other pods: the user can execute whatever code they want inside the notebook, but cannot access things like the Kubernetes API or internal APIs. So at a high level, this diagram shows what I just described, and the sketches below show the kinds of pod-level controls involved.
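First, going back to the non-root point from a moment ago, here's a minimal sketch of enforcing a non-root user at the pod level with a securityContext. The image name and UIDs are hypothetical placeholders.

```yaml
# Forces the container to run as an unprivileged user; the kubelet
# refuses to start it if the image would otherwise run as root.
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-app
spec:
  securityContext:
    runAsNonRoot: true        # reject containers that resolve to UID 0
    runAsUser: 1000           # hypothetical unprivileged UID
    runAsGroup: 1000
  containers:
    - name: app
      image: example.org/my-app:1.0       # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false   # block setuid-style escalation
```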
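And here's a sketch of the kind of per-user pod isolation I just described: resource limits on the pod, plus an egress NetworkPolicy that blocks access to in-cluster services, including the Kubernetes API, while still allowing DNS and outbound traffic. The labels, namespace, image, and CIDR ranges are all hypothetical; Hatchery's actual policies live in its repo.

```yaml
# Per-user notebook pod with hard CPU/memory limits, so one heavy
# computation can't starve the node. Image and limits are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: jupyter-user1
  namespace: notebooks
  labels:
    app: user-notebook
spec:
  containers:
    - name: notebook
      image: example.org/jupyter-notebook:1.0
      resources:
        requests:
          cpu: "500m"
          memory: 1Gi
        limits:
          cpu: "1"
          memory: 2Gi
---
# Egress policy: allow DNS and external HTTPS, deny everything else,
# which cuts off in-cluster APIs. The except CIDR is hypothetical.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: notebook-egress
  namespace: notebooks
spec:
  podSelector:
    matchLabels:
      app: user-notebook
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8    # hypothetical in-cluster CIDR
      ports:
        - protocol: TCP
          port: 443
    - ports:                  # allow DNS lookups anywhere
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```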
Hatchery is the open source tool that deploys a pod to Kubernetes so that the user can execute their own code. In that process, Hatchery denies the user, or that pod, access to the Kubernetes API and internal APIs through network policies, and it also imposes a constraint on CPU and memory usage per pod. That way, if there's a really heavy workload on a particular Jupyter notebook, say a user running a heavy computation on data, a single pod cannot overuse resources and take down the entire node, or even the entire cluster.

So that's it for my talk. Like I mentioned before, there are a lot of really great resources in case you want to learn more. I'll call out Running with Scissors, a talk by Liz Rice at GopherCon, where she talks a lot about containers and root access, and I also want to call out Hatchery, the open source project, which is on GitHub.

Okay, the thing I want to leave you with is that Kubernetes gives you a lot of tools and a lot of different ways to add constraints and isolation at the pod level, the namespace level, and the cluster level. But it's really up to you to come up with a model for multi-tenancy and isolation that makes sense for you and your team, and for the broader architecture your Kubernetes deployment sits in. It's really important to decide what those trade-offs are and to understand them, because you don't want to over-engineer your deployment or your application. I'm hoping that this talk at least gave you a sense of what basic Kubernetes and container-level tools you can leverage to implement what you need. Thanks everyone, bye.