Welcome to this talk on user management in Kubernetes, or, as the title says, more like the lack of user management capabilities in Kubernetes. The whole talk stems from different discussions I've had with people who are newer to the world of Kubernetes, and a lot of confusion along the lines of: now I have my cluster up and running, so how do I actually grant access to my cluster for another set of users? We're at KubeCon 2021, on the 101 track, so we're not going to dive too deep into the topics and the different configuration options. Instead we're going to look at the high-level concepts: how do you actually manage access for your users into your clusters? So, hi, I'm Jussi Nummelin, working at Mirantis Engineering. I've been working with Kubernetes and different cloud native technologies for quite a few years, actually even before they were really called cloud native. In the past year or so I've been mostly working on a new open source Kubernetes distro called k0s, and that's the reason you'll see some mentions of it in the examples. This is not really a talk on k0s itself, but you'll see some examples of how we do things in k0s. And of course it's open source, so you can dig into it if you want. Although not entirely correct, a good way to get your mindset pointed in the right direction when thinking about this whole users-in-Kubernetes topic is to realize that there really are no users in Kubernetes, no users managed by Kubernetes per se. Of course there are users using Kubernetes, but the Kubernetes control plane doesn't really manage user access for you. We'll dive into this soon.
So the topics we're going to look at in this session: first a quick look at how authentication, authorization and admission control, the triple-A functionality, works at a high level in the API server. We'll look at what this thing called a user actually is in Kubernetes. We'll look at the out-of-the-box options for managing user access, and then take a super quick look at how to tie user information and user identities into role-based access control rules. Authentication, authorization and admission control are the three fundamental stages the Kubernetes API server runs every request through. And as with many other things in Kubernetes, authentication is a pluggable, or at least highly configurable, stage in the API server. What authentication really does is look at the incoming request and figure out the user identity of the thing, either a human or some other service, that is actually making the call. Every single call is either rejected as unauthorized or tied to some identity. There's also a special `system:anonymous` identity that gets used when the user identity cannot be figured out at all. Once the authentication stage has figured out who the user is, it just passes that information as-is to the following stages, namely the authorization stage. And that's the key part to understand: authentication hands over the user information as-is, basically as a string. There's nothing more fine-grained than that; it's just a string in the process, more or less. There are basically two categories of user identities in Kubernetes. The first is service accounts, which probably everybody has stumbled over.
Service accounts are access identities intended to be used by processes, like your CoreDNS server running in a pod in the cluster. Of course it needs to talk to the API, so we have to have some sort of authentication and identity for CoreDNS. Service accounts are essentially JSON Web Tokens, and they are managed by Kubernetes. But service accounts are not really intended or designed to be used by human users. Technically you can use them that way, but there are a lot of drawbacks and a lot of potential security holes, or let's call them security concerns at least. Then of course we have the normal users, like myself, who want to access the Kubernetes cluster. And normal users and normal user access are not managed by Kubernetes itself. If we look at the Kubernetes documentation, there's actually a pretty good explanation, and one sentence in particular nails it really well: the API server expects that there's always something more or less external to the cluster itself that manages user access and user information. So essentially a user is a transient thing; the user data is not stored in etcd. Which of course means there are no user management capabilities, so I cannot grab my admin access to my cluster and say `kubectl create user`; that's just not possible. As I said, the API server has multiple configurable, pluggable ways to identify and authenticate the user. There are basically four options out of the box on the API server: client certificates, static tokens, an external webhook service to validate a token, and then the slightly more complex protocol called OpenID Connect, which is more or less like OAuth with some nuances.
So let's first take a quick look at client certificates. Those of you who have some experience setting up your own clusters have seen that there's pretty much always a certificate authority that has to be created for the control plane, mainly for the API server and then for the other control plane components. We can use this cluster-wide certificate authority to sign client certificates. Signing with the CA creates a mutual trust between the client certificate and the API server. So as long as you can create a certificate that is signed by the CA, the API server can use the information from the certificate itself to identify the user, because it trusts it: the Kubernetes CA, the Kubernetes control plane itself, has signed the certificate, so of course it will trust it. Basically it will pick up the user name from the Common Name field of the certificate and the possible group information from the Organization field. Your distro tooling might bring some helpers, as we've done for example in k0s, so you might get handy command line tools that create you a kubeconfig, readily parsed and spit out with the client certificates embedded in it. Of course, as with any solution, there are some glitches. For client certificate authentication, the main thing to understand is that once you create and hand out a certificate for a user, it's pretty much impossible to revoke that access. That's because the CA is more or less static for the lifetime of the cluster. A lot of different components rely on the existence of that CA, and the Kubernetes control plane components, like the controller manager, typically use client certificates too, to authenticate themselves and so on.
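As a rough sketch of the flow described above, creating a client certificate for a hypothetical user "jane" in group "developers" could look like this with plain openssl. Note that here we generate a throwaway CA just to make the example self-contained; in a real cluster you'd use the existing cluster CA key pair, and its location depends on your distro.

```shell
# Throwaway CA standing in for the cluster CA (normally you'd reuse the
# cluster's existing ca.crt/ca.key -- location depends on your distro).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout ca.key -out ca.crt -subj "/CN=demo-kubernetes-ca"

# Key and CSR for the user: CN becomes the username, O the group.
openssl genrsa -out jane.key 2048
openssl req -new -key jane.key -out jane.csr -subj "/CN=jane/O=developers"

# Sign with the CA; the API server trusts certificates signed by its CA.
openssl x509 -req -in jane.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out jane.crt -days 365

# Inspect the subject the API server would read the identity from.
openssl x509 -in jane.crt -noout -subject
```

The resulting `jane.key`/`jane.crt` pair would then be embedded in a kubeconfig as `client-key-data`/`client-certificate-data`.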
So if you go and change or recreate the CA, you have to reconfigure a lot of things on the API server and on the control plane in general. And then again, it's not difficult, but it's not trivial to create these client certificates either, and of course it requires a bit of effort. It doesn't really make sense to have super short-lived certificates for your users either, or at least you'd have to build some fancy automation around it, and if you want to do that, you probably want to look at some of the other options instead. One of the simplest ways is to use static token authentication. We can pass the API server a flag called `--token-auth-file`, which is basically a pointer to a CSV file that holds the user access tokens: the token as the first field, then the user name and group information. After that, I can use the token as a bearer token on my API requests, and that's it. And of course kubectl and the kubeconfig format support these static tokens too. It's a super simple way to manage your user access. But the main thing to understand with these tokens is that whenever you want to change something, revoke access from a coworker who left the company, change a token, create a token for a new user, whatever, you always have to restart the API server. The API server reads in the token file only when it starts. And if you're running multiple controllers in a highly available setup, you of course have to keep this file in sync across all of the different controllers. And as you saw, the tokens are in plain text, so it's maybe a bit sketchy. Next, webhook token authentication.
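To illustrate, the CSV file that `--token-auth-file` points at has the format: token, user name, user UID, and an optional quoted list of groups. The tokens, names and path here are made up for the example:

```csv
31ada4fd-adec-460c-809a-9e56ceb75269,jane,1001,"developers,qa"
8f2c0a10-4b1c-4e02-9d6f-0a1b2c3d4e5f,bob,1002,system:masters
```

The API server would then be started with something like `--token-auth-file=/etc/kubernetes/known_tokens.csv`, and a client authenticates by sending the header `Authorization: Bearer 31ada4fd-adec-460c-809a-9e56ceb75269`.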
This is essentially an external webhook service that the API server calls to both authenticate the token and return the user information for the user who owns the token. So let's say the API server sees a call with some token, 1014f blah blah blah. What it actually does, once it's been configured to do this, is call the external webhook service and pass on the token in a TokenReview JSON document. The webhook then basically responds: yes, it's a known token for a known user, so good, authenticated. And it also passes back the user information, based on whatever system it can dig that out from. For the Kubernetes API it doesn't really matter where this information comes from. So the external webhook service will say: the token belongs to Jane Doe, and Jane is part of the developers and QA groups. Of course the API server again has to be configured to use this external token webhook, so you have to pass in the address of the webhook service and so on. And most importantly, you have to configure mutual trust between the API server and the webhook service. In other words, it doesn't make any sense for the API server to just spit out the token to any random service, expect to get a valid answer back, and simply trust it. So we have to be able to build this mutual trust with certificates between the two entities. OpenID Connect is kind of similar to webhook tokens, but then again really not, or slightly similar at least. Also in the case of OpenID Connect, the whole identity management is fully external to the API server: there are no token files or token CSVs or anything stored on the API server.
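For the Jane Doe example above, the webhook's TokenReview response back to the API server would look roughly like this (the username, uid and groups are of course whatever the external system knows about the token's owner):

```json
{
  "apiVersion": "authentication.k8s.io/v1",
  "kind": "TokenReview",
  "status": {
    "authenticated": true,
    "user": {
      "username": "jane.doe",
      "uid": "42",
      "groups": ["developers", "qa"]
    }
  }
}
```

If the token is unknown, the webhook instead returns `"authenticated": false` and the request is rejected as unauthorized.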
So when the API server is configured to use OpenID Connect, we essentially, and I'm cutting some corners here, configure the API server to trust a set of JSON Web Tokens that are signed by some identity provider. Again, we have to configure mutual trust between the API server and the identity provider. What's cool about OpenID Connect, and about JSON Web Tokens in general, is that the JWT is cryptographically signed. So once you've configured this mutual trust, anybody can really easily validate and verify that a token is valid, and after validating it, dig out all the needed information from the token itself. At that point we don't have to call any other external service anymore; we can just verify the token and read the information out of it. So from a processing point of view it's much simpler. And clients like kubectl can be configured to automatically handle token renewal between the client and the identity provider, with access tokens, ID tokens, refresh tokens and a bunch of other configurable things. As a quick comparison between the options we've seen: X.509 client certificates and static tokens are both super simple to use and set up. For client certificates, setting up the CA is, as with anything involving certificates, a bit tedious, especially the first time, but in most cases your distro tooling handles that part, so it's pretty simple and straightforward to use and set up. But both of them are quite inflexible: it's hard, for example, to invalidate tokens, and almost impossible to invalidate certificates. Webhook tokens are a lot more flexible.
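To make the "all the needed information is in the token itself" point concrete, here's a small Python sketch that decodes the payload of a JWT. The token, issuer and claims are made up for illustration; note that this only decodes, whereas the API server additionally verifies the signature against the provider's published keys before trusting any of these claims.

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload of a JWT to inspect its claims.
    A JWT is three base64url parts joined by dots: header.payload.signature."""
    payload_b64 = token.split(".")[1]
    # JWTs use URL-safe base64 without padding; add the padding back.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a hypothetical token with OIDC-style claims (signature is fake,
# which is fine here since we only demonstrate decoding, not verification).
header = base64.urlsafe_b64encode(json.dumps({"alg": "RS256"}).encode()).rstrip(b"=")
claims = {
    "iss": "https://accounts.example.com",
    "sub": "jane",
    "email": "jane@example.com",
    "groups": ["developers", "qa"],
}
payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=")
token = b".".join([header, payload, b"fakesig"]).decode()

print(decode_jwt_payload(token)["groups"])  # → ['developers', 'qa']
```

The API server does exactly this kind of claim extraction (which claim maps to the username and which to groups is configurable via `--oidc-username-claim` and `--oidc-groups-claim`), just with the signature check in front.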
They're still quite easy to use. They require a bit more work in the setup phase, since we have to configure the mutual trust between the API server and the webhook service, but I don't think it's a nightmare or anything, just a bit more work and a bit more complexity. OIDC is super flexible, of course, because the identity is provided fully by an external system, like for example Google accounts. It's usually quite easy to use; in some cases you might need some helper tools on the client side to renew the tokens and so on. And I have seen problems in some Kubernetes client libraries that haven't had decent, proper support for OpenID Connect; hopefully those are fixed in most cases by now. As with webhook tokens, it requires a bit more setup to build the mutual trust between the API server and the OpenID identity provider. I typically push people towards OpenID Connect as the main source of access control and identity for the API server, mainly because it's the most flexible, and most organizations already have some identity provider in use that they can usually hook into their clusters quite easily. Okay, so now let's say we've configured OpenID Connect to talk to Google's account services and fetch signed JSON Web Tokens with the user credentials in them. What do we do next? The second stage in the API processing pipeline is authorization. In the authorization stage the API server starts to figure out: okay, now we know who the user is; they're trying to create a pod, but are they allowed to do it? Kubernetes provides a really super fine-grained role-based access control system.
We can control access to the API based on different APIs, API groups and API objects, down to the level of sub-resources: can a user read the logs of a pod? And as the authentication stage gives us the identities of the users, we then have to tie those users and groups into the role-based access control rules. To do that, at the RBAC level we create a set of roles, each of which defines the rules that apply to that role. In this example I've created a role called pod-reader. It says that whoever gets assigned this role can access pods, with the verbs get, watch and list, so basically all the verbs intended for reading information from the API. Then we use role bindings to bind a user, which for RBAC itself is just a string, not an object or anything, to the role we just defined. We can bind roles either directly to a single user or to a group of users. And remember that both the user and the group information is pretty much always provided by something external to the API server; there is no user or group object stored in the API server itself. Having said that, a user really is, and I'm not sure whether this is proper English, I'm not a native speaker, what I always call a transient thing. It's something we figure out on the fly; we keep the information in memory while processing the request, but that's it, because Kubernetes doesn't store it after that. User identification and, sort of, authentication is more or less externalized from the API server's point of view. I'm saying "sort of" because, on the other hand, the certificate authority, for example, is stored alongside the API server on the node.
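As manifests, the pod-reader role and a binding for it could look like this; the namespace and the user name "jane" are placeholders, and note that the subject is just a plain string matched against whatever identity authentication produced:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]            # "" means the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: read-pods
subjects:
- kind: User                 # could also be kind: Group
  name: jane                 # just a string, not an object in the cluster
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Binding a group instead of a single user only changes the subject to `kind: Group` with the group name, for example `developers`.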
Whether that's truly external or not is a different discussion, but at least from the API server's point of view it is an external thing. We learned that there are multiple ways to use external identities for users: static tokens, certificates, webhook services and OpenID Connect. And we use roles and role bindings to tie the user identities to a set of roles that tell the API server what a user can and cannot do. I hope I've given you good information to understand, and to get into the mindset, that user management in Kubernetes is somewhat complex and not as straightforward as one might expect, and at least a good segue into learning more and diving into the details of each of the different options. I'll be hanging out on the event platform for the Q&A session, and with that, I thank you.