Hello everyone, and welcome — this is almost the end of the day, almost the end of KubeCon, so thanks to all of you for attending our talk. We're going to talk about workload identity federation, for one particular use case: development environments on Kubernetes, and specifically the Ford use case. Hello all, welcome. My name is Satish Puranam. I am a technical leader and cloud manager for various cloud services; most of what I'm responsible and accountable for is around Kubernetes and a lot of cloud services in Azure and GCP. My name is Mario. I work for Red Hat as a software engineer, in the developer tools organization, and I've been working on cloud development environments for the last seven years. And we'll talk about that. Cool. So just to set the stage, here is what we hope to cover: the challenges we were facing, why we did this, what a cloud development environment is, and a few Kubernetes concepts — what a service account is, what service account token volume projection is, and how that can be used for things like workload identity federation, or federated identities. With that, I want to set some basic context about how big we are and what we are. We started around 2017. Currently we have two ginormous monolithic bare-metal clusters on-prem, plus 50-plus Kubernetes clusters running everywhere from Azure to on-prem to GCP to plants, spread across 2,500 to 3,000 application teams and around 8,500 namespaces — the last time we bothered to count, that is. As for the journey: as I said, we've been on it since 2016 or 2017, when we started doing Kubernetes. From there we jumped to OpenShift — we started with CoreOS Tectonic, for those who were around to know about it. Then started the whole journey of least privilege. Why?
Because, as the fleet size was growing, people were running around with quite a bit of elevated privilege. Then we started working on configuration management: as the fleet grows, how do you manage the configuration of all of it? Then, as we went into public clouds: what does IaC look like? What does CI/CD look like? Then a whole bunch of things kicked off around cloud adoption, primarily GCP and Azure. And in 2021 we started what we call fit-for-purpose clusters — a cluster does one thing, or one set of things, rather than being a kitchen sink with everything thrown into it. A few things we talked about earlier, on Monday I guess, were around our GitOps journey: we refactored all of our code so that everything is declarative. Nothing we do today is click-ops; everything has to be done through code, all declared ahead of time. And as part of that journey, one of the things we started encountering was: how do we create consistent development environments? As we onboarded hundreds and thousands of people, we kept hearing "it works on my laptop, it doesn't work on your laptop." How do you solve those problems? So, jumping a little further, this slide makes the point: most cloud CLIs are written in Python, and everybody has their own favorite operating system — could be Mac, could be Windows, could be Linux. But how do you even start a Python environment? That's just to set the stage; the same thing could apply to anything. The point is that it's complex, there's more than one way of doing the same thing, and there is no right way — because the right way is the way you like best and that works for you.
But it may not work for somebody else across teams, or for your partners, suppliers, or vendors — they may want to do it a different way. So how do you solve this? Building on the previous slide, some of the other challenges: you want a consistent user experience no matter where you are, no matter what your operating system is, no matter how new you are or how long you've been on the team and accustomed to its practices and rituals. But you also want it to be cheap, and you want it to be secure. The other thing we were looking for, at least, was to build these systems on open standards, so that when people come in they can extend it, and we can extend it. Another important thing is how fast we can spin them up — we could spend hours and days configuring laptops and development environments: IntelliJ, Eclipse, name your thing. Then the next important aspect, as I started alluding to: as we do cloud across all the different areas where we are physically present around the globe, how do you do credentials management? Do you provide tokens? Usernames and passwords? MFA? All these different things come in. And the other important aspect: as we go down the path of saying everything is GitOps, everything is code — wouldn't it be nice to have an environment defined right in your Git repository, describing the development environment for that particular repository? These were some of the ideas we were thinking about, and the question was how we could solve some of this. So based on that, we started working with Red Hat — Mario and team — and a lot of the Eclipse community. So Mario, how do you think we can solve some of these challenges? Do you have some suggestions for us? Yeah, that was the question.
And that's exactly when we started collaborating, a few months ago. There are a few options — most of you have probably tried different things in your career when you had to set up a development environment and wanted to share it with somebody else, or switch to another branch, or to another project with another version of Python, and so on. So what are the available solutions for this problem? One possibility is virtual machines, and a tool that works really well for that purpose is Vagrant, for example — it is still widely used and was very popular a few years ago. And today we have Linux containers, of course: you can specify your development tools in a Dockerfile, build the image, and you have your whole development environment; then you can use an orchestrator like Kubernetes to provision development environments on the fly, automatically, for developers. Another option is environment managers — something like virtualenv for Python, or SDKMAN! for Java and other languages — which let you rapidly switch the version of the SDK on your machine. That doesn't provide full isolation, but it can work; in some cases it may be a good fit. And the last option is configuration managers: more powerful tools that let you rapidly set up a machine with all the tools, environment variables, credentials — everything you need for software development. Those are the options. We've been iterating through those different solutions for a few years, and for the last three or four years we've actually been working on a Kubernetes operator to manage cloud development environments. So it's container-based, and it extends Kubernetes — it's an operator.
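As a sketch of the container-based option mentioned above — development tools baked into an image via a Dockerfile. The base image and the exact package list here are illustrative placeholders, not the image used in the talk:

```dockerfile
# Illustrative dev-tools image; base image and packages are placeholders
FROM registry.access.redhat.com/ubi9/ubi:latest

# Language SDKs and CLIs the team needs for day-to-day development
RUN dnf install -y python3 python3-pip git make gcc && dnf clean all

# Run as a non-root user, as most cloud development platforms expect
RUN useradd -m developer
USER developer
WORKDIR /home/developer
```

Once built and pushed to a registry, an image like this is what an orchestrator can pull to provision identical environments for every developer.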
It defines a custom resource definition that allows a developer to describe a cloud development environment in a declarative way and share it with the rest of the team. This is something we are using at Red Hat for OpenShift Dev Spaces, and also in other projects like Eclipse Che and the OpenShift Web Terminal. The DevWorkspace operator is part of the Devfile organization, which is a CNCF sandbox project. In practice, once you have installed the operator on the cluster, you can start a development environment with a kubectl apply. Here is an example, and I will actually run a quick demo. My terminal here is a local terminal, already connected to a Kubernetes cluster where the DevWorkspace operator is installed. If I do a kubectl explain of the DevWorkspace, it prints the description — so you can see the CRD is installed — and I can run the kubectl apply I was just showing. In this YAML, the important part is the container image: I'm using a container image I've built with the tooling, with a few language SDKs that will let me compile and test my application. And on the last line, you see we're specifying which editor will be used — in this case, VS Code. The definition of VS Code is actually in another DevWorkspace; I'm referencing the VS Code definition here so that VS Code will be included in my development environment. So I run that and then watch the DevWorkspace object — yeah, now the operator, the controller, is starting the container, starting VS Code inside it, and it will return an HTTP URL that I can open, where I will have VS Code up and running. There is no Git repository cloned here; we could have a more complicated DevWorkspace YAML file with a list of Git repositories, environment variables, or predefined tasks that would be available from VS Code.
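A minimal DevWorkspace along the lines of the one applied in the demo might look like the following. The image name and the editor reference URL are illustrative placeholders, not the actual values used on stage:

```yaml
apiVersion: workspace.devfile.io/v1alpha2
kind: DevWorkspace
metadata:
  name: my-devworkspace
spec:
  started: true
  template:
    components:
      - name: dev-tooling
        container:
          # image with the language SDKs used to compile and test (placeholder)
          image: quay.io/example/dev-tools:latest
          memoryLimit: 2Gi
  # editor definition pulled in from another DevWorkspaceTemplate (VS Code here)
  contributions:
    - name: editor
      uri: https://example.com/vscode-devworkspacetemplate.yaml
```

Applying this with kubectl apply is what triggers the operator to create the pod, attach the storage, and expose the editor URL described next.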
So it's much more powerful than this — we're just showing the most basic DevWorkspace example here. Now I'm going to switch and open it. This is fully VS Code — the open-source build — that we package in an image and start with the workspace. You have access to the terminal, and the terminal opens a shell inside the container — the container image that was specified in the DevWorkspace YAML file — and from there I can use my development tools. Something we'll talk about later is credentials — adding credentials here. How do we do that? Because I haven't specified anything; in fact, if I try gcloud projects list, for example — gcloud is available in my image, but I'm not able to connect, I'm not able to access the API, because the credentials are not here. Going back to the slides and summarizing what happened: we did a kubectl apply, which created a DevWorkspace object, and the operator created a pod with a development container and started an IDE — in this case, VS Code. It attached a persistent volume to the container so the source code is persistent: even if I stop my workspace and the pod is deleted, I can restart it later without losing changes to the source files. I've also added some runtime containers — you can add databases or other services that help test the application you're developing, as many containers as you want. The DevWorkspace also lets you define events, so you can execute commands just after the workspace starts or before it is closed. So it's powerful. And then there is an ingress that allows me to access VS Code from my browser. These are the basic components of my development environment.
Now, the problem is that I tried to use gcloud and couldn't. I mean, I could have logged in, but if I have to log in manually every time I start a workspace, that would be annoying. There are a few ways to avoid that. The classical way we've been using: you create a Secret on Kubernetes, annotate it, and we automatically include the content of the Secret in the workspace, as an environment variable or as a file. But this is not ideal, and we will see why. That's one of today's ways to include credentials; the better, more secure way is to use service account tokens. That's when, talking with Satish, Satish suggested: hey, why don't we try to use the new service account tokens? For those who have been using Kubernetes for some time, there was a big change around Kubernetes 1.20 and 1.21. The old service account tokens were long-lived: they never expired, and you cannot revoke them — God forbid one leaked and got out, there's no way to pull it back. There's also no way to say, hey, this service account token is only valid for these parties and nobody else. Based on these limitations, the Kubernetes team and the community decided there was a better way of doing it, and that's what the new ones are: bound service account tokens. These have some great properties. They are time-bound — short-lived. They have an audience attached, which tells you who can use them and in what context. The best part: they are compliant with OIDC standards, and pretty much every IdP out there speaks OIDC. And the most important part is the instrumentation the kubelet provides to rotate them: before the short-lived token expires, the kubelet rotates it.
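To make the time-bound and audience properties concrete: the payload of a bound service account token decodes to claims roughly like the following. All values here are illustrative, not taken from the demo:

```json
{
  "aud": ["https://kubernetes.default.svc"],
  "exp": 1719936000,
  "iat": 1719932400,
  "iss": "https://oidc.example.com/cluster-1",
  "kubernetes.io": {
    "namespace": "dev-team-a",
    "pod": { "name": "my-devworkspace-pod" },
    "serviceaccount": { "name": "default" }
  },
  "sub": "system:serviceaccount:dev-team-a:default"
}
```

The `aud` claim is the audience restriction, `exp` minus `iat` is the short lifetime the kubelet keeps refreshing, and the pod binding in `kubernetes.io` is what invalidates the token when the pod goes away.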
If the pod is terminated, the service account token that was minted for it is no longer valid. So those were some of the constraints and some of the ideas — we thought, hey, maybe it would be nice if we could use some of this. So Mario, how can we use these same concepts to further some of the ideas of federated identities, for example? Yes. What we did is allow an administrator to configure the kind of service account token that gets added to a workspace. Basically, the DevWorkspace operator adds a service account token volume projection; the important parts are the path, the expiration, and the audience. The audience by default is the Kubernetes API server, which means the token will let you access the Kubernetes API server — and based on the service account's privileges, you will or won't be able to perform certain actions against the API. But if you specify another audience, you get a service account token that lets you connect to other APIs — external services. We did that, in particular for Ford, to allow developers at Ford to access cloud service providers from their development environments. So the idea is: how can we take these properties of Kubernetes identities — short-lived tokens that can be projected by the kubelet — and federate with, in this example, Google's federation endpoints? You can do the same thing with Azure and AWS, and you can do it on-prem as well using things like SPIFFE and SPIRE. The idea of federated identities is that every pod is an identity: a token establishes that identity for you, within that pod, on that Kubernetes cluster, in that namespace.
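In pod-spec terms, the projection the operator configures looks roughly like this — the path, audience, and expiry values below are illustrative placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dev-environment
spec:
  containers:
    - name: dev-tooling
      image: quay.io/example/dev-tools:latest
      volumeMounts:
        - name: workload-identity-token
          mountPath: /var/run/secrets/tokens
          readOnly: true
  volumes:
    - name: workload-identity-token
      projected:
        sources:
          - serviceAccountToken:
              path: gcp-token
              # the kubelet rotates this token before it expires
              expirationSeconds: 3600
              # non-default audience: a GCP workload identity provider
              audience: //iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/my-pool/providers/my-provider
```

With a non-default audience like this, the token at /var/run/secrets/tokens/gcp-token is no longer for the Kubernetes API server — it is material that an external identity provider can be taught to accept.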
Wouldn't it be nice to somehow exchange that identity for a Google identity, and assume all the IAM privileges that a particular Google service account has — effectively letting you do whatever you're allowed to do in Google, or any cloud provider, or even on-prem — without ever exchanging credentials? There are no credentials: nothing that can walk away, and a whole class of ongoing issues disappears — social engineering, ransomware, malware, and all the interesting ways people try to attack and sabotage our well-intentioned clusters and environments. So workload identity federation is simply taking one identity and exchanging it for another identity in a cloud provider. That basically allows us to do everything we talked about. So — enough talk, it's all boring, show me the real thing, does it really work? Let's give it a try. A simple Git repository: I have nothing but the devfile Mario was talking about, and I'm going to just launch the environment. The goal is that this clones the Git repository without asking me for any credentials — because we're using GitHub Apps and all those interesting things — provisions a volume, attaches a container, and here I am. It's fully booted, my repositories are fully linked, my whole environment is set up, including all the extensions and settings I wanted. It's all rigged up. So no more "it works on my laptop, it doesn't work on yours" — all you need to bring to the table is an internet connection and a browser. Let's see what we can do with some of this. Hopefully — yeah, there we go. So here's the token: that's my token, minted by Kubernetes.
There's nothing of major significance in there; it just tells you that it's applicable to this token, this pod, in that namespace, on a given cluster, and that it expires in an hour or so. Now, the configuration that actually made it all happen is nothing special. All it says is: here's a given service account on a given GCP project; let this workload impersonate it, as long as this token is presented. That's the configuration that was injected. Now I can start doing simple things. I can write into a bucket — I'll write a short string into the bucket — read that bucket content back, and finally, if I want, delete the content as well. All without using any Google credentials. The point of all of this: number one, you get a repeatable environment. And as we said earlier — wouldn't it be nice to just drop a file in your Git repository, just like you drop your Makefile, your Gradle file, maybe your pom files or your package.json? The same thing sits right there in your Git repository, and nobody has to fumble around asking how to set it up. Just start it, and you should be good to go. So with that — do you want to bring it home, Mario? Yeah. So with the demo, we have seen that, starting from nothing, without having to do anything manually, Satish was able to start a workspace with credentials — his GCP credentials. At the beginning we started from the challenges at Ford: how can I rapidly set up an environment where a developer can be productive in a few seconds? We talked about how we at Red Hat decided to invest in a Kubernetes operator to provision cloud development environments.
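The "nothing special" configuration described here is, in GCP's case, a credential configuration file of type external_account: it tells gcloud and the client libraries where to read the projected token and which service account to impersonate. A sketch, with placeholder project number, pool, provider, and service account names:

```json
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/my-pool/providers/my-provider",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "file": "/var/run/secrets/tokens/gcp-token"
  },
  "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/dev-sa@my-project.iam.gserviceaccount.com:generateAccessToken"
}
```

Note there is no secret anywhere in this file — it only points at the kubelet-projected token and names the exchange endpoints, which is why it can safely be injected into every workspace.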
And that was actually solving a problem that Ford and Satish's team had. Then came our collaboration on using one of the newest features of Kubernetes — service account token projection — together with workload identity federation, which now allows us, without even having to create a secret, to bring tokens for GCP, AWS, or Azure into the developer's cloud development environment at Ford. So this was it. We have one final slide with a few links if you are interested; the QR code here links to the slides, and all the links are on the last slide. The slides are also available on Sched — we've uploaded them there if you'd like to download them. And with that, we have seven minutes, so if there is any question, we will be happy to discuss. Yeah. Thanks. Thanks for the presentation. I had a question regarding how the issuer endpoint in your service account token was pointing to a Google storage endpoint. Are you exporting the public keys to a bucket for the OIDC JWKS? Pretty much. If you look at the OpenID Connect spec, it's nothing but a public key and a well-known endpoint; it needs to be reachable over HTTPS, and that's pretty much it. So we chose to use GCS buckets as an OIDC endpoint, taught the Kubernetes API server to use that as its OIDC issuer, and then connected that endpoint with Google. So do you create a workload identity provider in GCP per Kubernetes cluster? Yes. Basically — every cloud environment will be slightly different — in GCP, they are called identity pools.
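For context on that answer: the "well-known endpoint" is a small JSON discovery document served at /.well-known/openid-configuration, pointing at the JWKS file that holds the cluster's public signing keys. A sketch with a placeholder bucket URL:

```json
{
  "issuer": "https://storage.googleapis.com/my-oidc-bucket/cluster-1",
  "jwks_uri": "https://storage.googleapis.com/my-oidc-bucket/cluster-1/openid/v1/jwks",
  "response_types_supported": ["id_token"],
  "subject_types_supported": ["public"],
  "id_token_signing_alg_values_supported": ["RS256"]
}
```

Hosting these two static files in a public, HTTPS-served GCS bucket is enough for Google's federation endpoint to verify tokens the cluster issues.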
You can create an identity pool per Kubernetes cluster, and then you can add principals to the pool and bind them to a service account — it's basically called a binding; that's what Google calls it. You can keep doing that pretty much across the board, across all the cloud providers. Okay, yeah, thank you. Thanks. Thank you, guys. I think you partly answered this with the first question, but when you're giving permission for a principal to assume a service account, it needs to be a user identity of some kind. Are you using the individual's GCP account, or are they using the cluster identity? Who are they acting as when they perform the service account impersonation? Okay. Typically what happens, when teams are provisioned, at least in Google in Ford environments, is that you have machine accounts — service accounts — that are created. You can impersonate that service account, and that is what was shown here as an example. You can also impersonate yourself if you want to; typically we don't do that. We just create service accounts that have some capabilities, and we control who can impersonate them — this person can impersonate it, or another machine can impersonate it. That's the idea. Gotcha. A follow-up question on that: the whole point of service account impersonation is a many-to-one relationship — a bunch of different entities going through one service account. How are you handling auditing, so you understand who is actually doing what while impersonating that service account? So what we typically recommend is a least-privilege model: when your app is talking to something, it should have only the minimal permissions it needs.
However, in the lowest environments — whichever environment you pick as your lowest — you can do what we call click-ops: clicking around, running commands, and that sort of thing. What I showed is meant for those lowest environments, which are fully segregated from all production workloads, and where you can do these kinds of things. In production, you would not be able to do what I just showed you. Okay, thank you. A question — and thanks for the presentation. When you launch this workspace on behalf of the user, I think you're breaking the link between the user's original identity and just creating an impersonated identity. Does this mean that your workspace infrastructure can impersonate any user? Is that a problem? And do you plan to link the original user's identity — so when you sign in to GitHub, cryptographically link the OIDC identity there with the identity associated with the workspace? I think the question is about authorization — how we isolate one development environment and make sure that somebody else cannot act as your user. A little — in the sense that your infrastructure can impersonate users, so your infrastructure becomes a source of trust. Do you plan to ensure the infrastructure can't impersonate users, maybe by linking to the original identity that launched the workspace? Yeah. There are a couple of things we use, at least to isolate workspaces and make sure others cannot access your credentials. The first is that the DevWorkspace operator itself doesn't have any notion of authorization or users — it's an operator that just reconciles. But on top of that, with OpenShift Dev Spaces, we add a layer of authorization and authentication, so you need to authenticate.
And when you authenticate, you can create those DevWorkspace objects only in your own namespace. On top of that, there are other things we do to prevent somebody else from getting in. First of all, we try to avoid having secrets at all, because otherwise somebody with more privileges than you could access your namespace and all your secrets. And second, we block exec access — kubectl exec into your container — with a webhook that doesn't allow anyone inside the container unless they are the user who originally created the DevWorkspace object. These are the mechanisms we set up to avoid problems like privilege escalation — somebody getting hold of your Kubernetes token and gaining more privilege than they had. All right, I think there are no other questions. Maybe the last slide: if you would like to provide feedback, this is the QR code for that. Feedback is always welcome. Thank you. Cool. Thank you.