Hello everyone, I'm Alex, and today's topic is how to secure both your build and your cloud by using OIDC tokens in your pipeline. So let's start with the agenda. Before diving into any solution, we'll start by understanding the problem: how modern SDLC pipelines work, how they're built, and why cloud authentication can be an issue. After understanding the problem, and that OIDC could be the solution to several of these issues, we'll cover what OIDC, OpenID Connect, is, its concepts, and how it can be used in CI/CD environments. Then we'll walk through a very simple real-world scenario in the slides, and finally we'll show you several demos of how you can build more secure authentication to your cloud environments using OIDC. Okay, who are we? I'm Alex Ilgayev, Head of Security Research at Cycode. Previously, I led the malware research team at Check Point, where we did a lot of reverse engineering of complex pieces of malware and discovered cybercrime and APT campaigns. Nowadays at Cycode, we research vulnerabilities in software supply chain security, and hopefully also mitigations for those vulnerabilities. Elad? I'm Elad, a security researcher at Cycode. I worked on API security and IoT before I joined Cycode. Okay, so in short, Cycode is a cybersecurity company based in Tel Aviv doing complete software supply chain security for organizations, across many kinds of security: finding hard-coded secrets in your code, securing your pipeline, misconfigurations, cloud security, and more. So let's start with the problem. After investigating how most modern SDLC pipelines work, we can divide them into several parts. The first is the developer, whether a contributor, maintainer, or collaborator, it doesn't matter; it's the actual person who pushes the code.
The second part is the SCM system, source control management, mostly Git-based. It could be GitHub, GitLab, Bitbucket, Azure DevOps, it doesn't matter. Most code is pushed in one of two ways: either a direct push, which is usually not recommended, or a pull request or merge request, depending on the system, where you propose a piece of code to be reviewed by other developers before it's merged into your repository. And whenever you push new code, which has been the most important part in recent years, there's usually some trigger event that kicks off an automated procedure in some CI system. There are plenty of CI systems nowadays: GitHub Actions, GitLab CI, CircleCI, Jenkins, and the list goes on. These pipelines run predefined procedures: taking your code, checking it out, compiling it, packaging it, pushing it to some artifact registry or container registry, it doesn't matter, and also doing the deployment phase, creating resources in your cloud and actually building your production environment. To create all these complex pipelines, you need a list of third-party integrations in your CI/CD pipeline. For example, artifact registries: PyPI, Maven, Gradle, npm, Docker, it doesn't matter. It could also be cloud providers: Azure, GCP, AWS. And to build these procedures, you need to authenticate securely with these providers. The popular way to do that nowadays is mainly through tokens, which are similar to a username and password, but tokens are the popular method. You generate credentials in the target system and save them in the secret manager of your CI system, so you can access them securely from within your CI. And how popular is this method? From a simple search on GitHub for code accessing these secrets, there are hundreds of thousands of occurrences.
It is suggested in almost every best-practice paper. It's better than hard-coding the secrets, of course. And we can safely say this is the most popular method nowadays for accessing different services from your CI pipeline. So you could ask yourself: what is the issue, if the best practices suggest this method, and the secrets are encrypted and not hard-coded? Well, it's better than hard-coding them, but it's still not enough. The first reason is the CircleCI breach. On January 4th, CircleCI, a really popular CI system, announced that it had suffered a security breach. They discovered it after a GitHub token saved in the CircleCI system, belonging to one of their customers, was misused maliciously. Then they understood they were going through a security breach, which meant that all the tokens saved in the system were potentially compromised. So immediately when they discovered that, they announced it and told all their customers to rotate all their tokens, since they could be compromised. Furthermore, when they finished investigating the breach, they published several best practices, and one of them was to use OIDC tokens wherever possible to avoid storing long-lived credentials in CircleCI. That's the first case. The second case, which is more well known, is the Codecov breach, which happened two years ago. Codecov is a really popular code coverage tool that usually runs inside your CI system. Two years ago, the Codecov installer was compromised, and every workflow that ran Codecov actually ran an additional malicious bash command that exfiltrated all the environment variables of your CI system to an attacker-controlled server. Why is this relevant to secrets? Because usually, when you're using secrets inside the CI, they're stored as environment variables.
So that's another reason why we need to improve the security of authentication inside the CI. So what is the solution, or maybe a solution? OpenID Connect, OIDC. Let's understand what OIDC is. It's built on top of the OAuth 2.0 framework, a widely popular framework for web authorization, and it extends its capabilities, mainly by allowing a third-party application to verify the identity of the user in the authentication scheme. Behind the scenes, it uses a JWT, a JSON Web Token, signed as a JWS, a JSON Web Signature. Both OIDC and OAuth use the concepts of tokens and scopes. The token, this JWT, is similar to a train ticket that you can show to do something; it allows you to do something, right? And scopes define what the user can do; for example, you can use this train ticket to board a specific train. So what does a JWT look like? One of the benefits of OIDC is that it standardizes some of the claims, some of the fields in this token. The most important one is the subject, which is the identity of the user, or maybe the machine; it can be used to grant you permissions. And the issuer, which is the authority that issued this token. It also contains the token issuance time, expiration time, and many more fields. Another very important fact: all these JWTs carry signatures created by the issuer. They are signed rather than encrypted. So let's see how OIDC is used in web authentication. Let's say I want to consume some resource on a web application, and I click the "Log in with Google" button on the application. It forwards me to Google's identity provider, which shows me an authentication consent, that small window asking whether I authorize this application to use my identity. When I accept, Google issues a new JWT, signed by them, containing my identity.
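To make the token structure concrete, here is a small sketch that builds and decodes a JWT with the standard claims mentioned above. All claim values are hypothetical, and the HMAC signature is a stand-in: real OIDC tokens are signed with the issuer's private key (e.g. RS256), not a shared secret.

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

# Hypothetical claims, modeled on what a CI identity provider might issue.
header = {"alg": "HS256", "typ": "JWT"}
payload = {
    "iss": "https://token.actions.githubusercontent.com",  # issuer
    "sub": "repo:my-org/my-repo:ref:refs/heads/main",      # subject: the identity
    "iat": int(time.time()),                               # issued-at time
    "exp": int(time.time()) + 300,                         # expiration
}

signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
# Demo-only HMAC signature; a real issuer signs with its private key.
signature = b64url(hmac.new(b"demo-key", signing_input.encode(), hashlib.sha256).digest())
token = signing_input + "." + signature

# A relying party splits the token and, after verifying the signature,
# reads the standardized claims.
claims = json.loads(b64url_decode(token.split(".")[1]))
print(claims["iss"], claims["sub"])
```

The three dot-separated parts (header, payload, signature) are exactly what you see when you paste a CI-issued token into a JWT debugger.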
That token is passed to the web application, which can use it to understand who I am and grant me the proper permissions. You might wonder why we're talking about web applications when the topic is CI/CD. So let's see how OIDC can be used in CI/CD. It's not a coincidence that recently, most of the popular CI/CD systems started to support OIDC natively. Furthermore, in the past weeks GitLab even extended its OIDC support by allowing you to define a custom audience in your pipeline. So how does it work? Let's say I'm a CI system and I'm creating a new job, a job that does something related to my development pipeline. In that job, I'm able to ask the identity provider of the system, let's say GitHub or GitLab, for a new token that contains the identity of the job and the pipeline. So the job requests it from the identity provider, which signs the token and gives it back. The job can then pass this token to, for example, a cloud provider's STS, the security token service; all the cloud providers have such a service. It can use this identity to issue short-lived credentials for the cloud provider. It could be an AWS token, a GCP credential, Azure, it doesn't matter. Then I can use these short-lived credentials to access the resource I want in the cloud. That's the high-level schema; we'll soon see how to implement it in detail. To do that, we created a really simple example to explain the concept. In this example we'll have GitHub and GitLab, we'll show how to implement it in both, and we want to authenticate with GCP. Then we want to access a really simple resource in GCP: GCS, the cloud storage of GCP, similar to S3. We'll show how to implement that with both credentials-based authentication and identity-based authentication. So I'll pass it to Elad. Okay.
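Under the hood, the job-to-cloud step described above is a standard OAuth token exchange. A sketch against GCP's STS endpoint might look like the following, where the project number, pool, and provider names are hypothetical placeholders and `$OIDC_TOKEN` is the token the CI job received from its identity provider; in practice the vendor's CI action or SDK performs this call for you.

```shell
# Exchange the CI job's OIDC token for a short-lived GCP federated access token.
curl -s https://sts.googleapis.com/v1/token \
  -d grant_type=urn:ietf:params:oauth:grant-type:token-exchange \
  -d audience=//iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/ci-pool/providers/github-provider \
  -d scope=https://www.googleapis.com/auth/cloud-platform \
  -d requested_token_type=urn:ietf:params:oauth:token-type:access_token \
  -d subject_token_type=urn:ietf:params:oauth:token-type:jwt \
  -d "subject_token=${OIDC_TOKEN}"
```

The response contains a short-lived access token that can then be used, directly or via service account impersonation, to call cloud APIs.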
So in the simple flow, we'll upload a configuration file from GitHub and from GitLab to a GCP bucket. Let's start. First, we'll create a service account. The service account will have permission to upload objects to the bucket, and we'll export the key of the service account. So now we have a long-lived credential that we'll have to put in our CI system in order to authenticate. Let's start with GitHub. A common workflow will authenticate to GCP using the JSON credentials file and upload the configuration file to GCP. Next, let's see the GitLab workflow. This time we'll use the gcloud client. But for GitLab, the best practice is to use HashiCorp Vault. If we don't have HashiCorp Vault, we'll have to save the GCP key as a protected or masked variable. When we navigate to this tab and add the GCP secret, we get a warning that says the regular expression does not match the JSON. There are some workarounds: we can base64-encode this JSON, but it's not the preferred method. And pretty much every site we visit says we should authenticate using a different method. And that different method, currently, is OIDC, meaning we won't have to rely on long-lived credentials, but will use short-lived credentials instead. So how do we do that? First, we'll create the identity pool. The identity pool will have two providers. We'll create the GitHub provider, which will contain the default issuer for GitHub Actions, and then the GitLab provider, which will contain the GitLab issuer. So what do we have so far? We have the identity providers of GitHub Actions and GitLab, and we've defined dedicated providers for them in GCP. We still have the same service account we defined before, but this time we won't export the service account's long-lived secret key; we'll use the OIDC method. Let's start with GitHub. The first thing we see is the permissions.
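The pool and provider setup just described can be sketched with the gcloud CLI. The project number, pool, provider, and service account names below are hypothetical placeholders.

```shell
# Create the workload identity pool.
gcloud iam workload-identity-pools create ci-pool \
    --location="global" --display-name="CI pool"

# GitHub Actions provider, trusting GitHub's default issuer.
gcloud iam workload-identity-pools providers create-oidc github-provider \
    --location="global" --workload-identity-pool="ci-pool" \
    --issuer-uri="https://token.actions.githubusercontent.com" \
    --attribute-mapping="google.subject=assertion.sub"

# GitLab provider, trusting the gitlab.com issuer.
gcloud iam workload-identity-pools providers create-oidc gitlab-provider \
    --location="global" --workload-identity-pool="ci-pool" \
    --issuer-uri="https://gitlab.com" \
    --attribute-mapping="google.subject=assertion.sub"

# Let identities from the pool impersonate the upload service account.
# Note: the trailing /* admits every identity in the pool; this is broad
# and should be restricted in production.
gcloud iam service-accounts add-iam-policy-binding \
    ci-uploader@my-project.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="principalSet://iam.googleapis.com/projects/123456/locations/global/workloadIdentityPools/ci-pool/*"
```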
The permissions have changed: we're telling the GitHub pipeline to request the OIDC ID token. Next, we'll use the identity provider's full path and the service account email to complete the authentication. And then, as before, we'll upload the configuration file to the cloud. As we can see, the main thing that changed is that we're not using the secret we used before. So where are the secrets? They're gone. Let's see a little demo. We have the same workflow we defined earlier, and we'll rerun it. Let's go to the GCP console, where we can see our identity pool and both providers; this time we'll look at GitHub with its default issuer. We'll go back to GitHub, and we can see we have uploaded the configuration file; the workflow has succeeded. And when navigating to our bucket, we can refresh and see that our configuration file is there. All that without the long-lived credentials. Next, GitLab with the same flow. But this time, in GitLab, we'll use the id_tokens keyword. The OIDC token will be saved under the OIDC_TOKEN variable and will be used in the same way, but this time with the gcloud client. So, is that it? We defined our identity pool, we defined our providers. But that's not it. It turns out that by default, without configuring any mechanism to filter requests, any signed OIDC token will be accepted. Meaning that if GitLab CI or GitHub Actions signs an OIDC token, it will be accepted and granted the service account's permissions. In effect, the identity providers and the service account are open to everyone. So we need a filtering mechanism, and that mechanism is attribute conditions. Let's dive into the OIDC token structure. Both GitHub and GitLab define custom claims in addition to the standard claims, and each defines its own particular environment attributes, like the repository path, the ref name (the branch name), emails, and many more.
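The secret-free GitHub workflow described above might look like this sketch. The pool and provider path, service account email, bucket, and file names are hypothetical placeholders.

```yaml
# Hypothetical GitHub Actions workflow authenticating to GCP via OIDC.
name: upload-config
on: push
permissions:
  contents: read
  id-token: write   # allow the job to request an OIDC ID token
jobs:
  upload:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: projects/123456/locations/global/workloadIdentityPools/ci-pool/providers/github-provider
          service_account: ci-uploader@my-project.iam.gserviceaccount.com
      - uses: google-github-actions/upload-cloud-storage@v2
        with:
          path: config.yaml
          destination: my-bucket
```

The GitLab analog replaces the `permissions:` block with the `id_tokens:` keyword and uses the gcloud client to log in with the resulting token.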
So we'll have to choose which of the standard or custom claims we want to trust. That will be the mechanism that filters requests to our cloud account. In GCP, we'll use CEL, the Common Expression Language, which allows us to deny or accept connections and to choose which environments we trust. Do we trust a certain repository? A certain email? Do we trust any of that? We'll have to choose. An example of strict conditions is conditions that use IDs, for example the repository ID or the subject; these are very strict conditions that cannot be forged. Examples of weak, easily bypassed conditions are those that use startsWith or contains, which can easily be abused by an attacker who creates a repository containing a certain word and bypasses the attribute condition. So let's move on. In this case, we have the same workflow, but we'll add one simple thing: an attribute condition on the actor ID that filters for a certain user. This time, the expected actor ID will be 100, and my ID is different from 100. So let's rerun the workflow. Again, it requests an OIDC token from the identity provider and gives it to the cloud provider. But this time it fails. It fails because the attribute condition denied the connection, and it will deny every connection whose OIDC token doesn't contain the ID we defined. So let's look at a high level at what happened in this video. We asked the CI system to give us an OIDC token, the pipeline sent the token, and the STS checked the attribute conditions, among other things, saw that the attribute condition evaluated to false, meaning the condition failed, and returned a deny message to the API saying the attribute condition did not match the signed OIDC token. Back to Alex. So it seems like this is a perfect technology.
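A strict attribute condition of the kind described above can be attached to the provider, for instance like this (the provider, pool, and ID values are hypothetical):

```shell
# Strict: pin the provider to immutable numeric IDs from the token's claims.
gcloud iam workload-identity-pools providers update-oidc github-provider \
    --location="global" --workload-identity-pool="ci-pool" \
    --attribute-condition="assertion.repository_id == '123456789' && assertion.actor_id == '100'"
```

By contrast, a weak condition such as `assertion.repository.startsWith('my-org')` could be satisfied by an attacker who registers an account or repository whose name begins with that string.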
And there shouldn't be a single reason why every CI wouldn't use it today, but that's far from the truth. We showed you examples for GitHub, GitLab, and cloud accounts, but many services still don't support OIDC-based authentication. One of them, a really popular one, is Docker Hub; there's even an open issue asking Docker Hub to support it. Until then, every login, every authentication to Docker Hub in the CI looks like this: it requires you to create tokens and access them in the CI. And this can lead to handling many tokens in the system. Imagine you have different tokens for each CI, each with different privileges; this can create a real mess. There is a pretty good solution to handle that as well: an external secret manager. There are several popular secret managers; one of them is HashiCorp Vault. We'll show how to implement this with the GCP Secret Manager. In this schema, it's similar to the previous setup, but we'll create the Docker Hub token and save it in GCP Secret Manager. Then we'll have a service account that we allow to access this Secret Manager, and the service account will be exposed through the identity pool and the providers we defined previously. How do we do that? In GCP, we just open the Secret Manager and add the token there, and we grant the service account permission to access these secrets. The code is pretty simple. This is the GitHub Actions case: we can use a really simple action developed by GCP called get-secretmanager-secrets. In the syntax, you just specify the path of the token in Secret Manager in GCP and the name of the variable that should hold it. Then, in the subsequent steps where you want to access Docker Hub, you just reference the variable you defined previously. It's pretty simple.
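The GitHub Actions steps just described might be sketched as follows, assuming the job has already authenticated to GCP via the workload identity pool in an earlier step; the project, secret, and user names are hypothetical.

```yaml
# Hypothetical job steps: fetch a Docker Hub token from GCP Secret Manager
# (after an OIDC-based auth step, omitted here) and use it to log in.
- id: secrets
  uses: google-github-actions/get-secretmanager-secrets@v2
  with:
    secrets: |-
      dockerhub-token:projects/my-project/secrets/dockerhub-token
- name: Log in to Docker Hub
  run: echo '${{ steps.secrets.outputs.dockerhub-token }}' | docker login -u my-user --password-stdin
```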
In GitLab, we recommend using Vault, because GitLab has native support for it. You just define the OIDC ID token; in this example, it's called the OIDC token. Then, in the secrets section, you define the path the secret has in Vault and the environment variable you want it exposed as, for example DOCKERHUB_USER or DOCKERHUB_PASSWORD. Then, in your script, you can use these environment variables fetched from Vault and log into any service you want. In this example, it's Docker Hub, but it could be npm, PyPI, or any other service. The great benefit of using a secret manager here is that instead of managing tens or even hundreds of different tokens, you can create a single token, for example for Docker Hub, save it in the secret manager, and rotate it periodically to protect against breaches. And this is only the tip of the iceberg of OIDC token usages; there are many more, some of which I'll mention here. For example, a really popular usage is authenticating to a Kubernetes cluster through OIDC. Let's say I want to access the Kubernetes API server; I need to authenticate and log into the API server. There are many methods to do that, the most popular probably being certificates, but I could also configure it to use my identity to log into the API server and receive the privileges appropriate for my identity. There are plenty of tutorials on how to do that; it's out of scope for this talk. An additional use of OIDC tokens is accessing cloud resources from within the cluster. Let's say I have some pod that needs to upload a configuration file to S3 or to a GCS bucket. I could create a similar procedure, but instead of trusting the GitHub or GitLab issuer, I would trust the Kubernetes service account issuer and use the identity of the service account to access the cloud. A slightly more complex way to implement this is through SPIFFE and SPIRE.
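The Vault-backed GitLab flow described earlier might be sketched like this, where the Vault URL, mount, secret path, and variable names are hypothetical placeholders.

```yaml
# Hypothetical GitLab CI job: trade the job's OIDC token for a Docker Hub
# password stored in HashiCorp Vault, then log in.
push-image:
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  secrets:
    DOCKERHUB_PASSWORD:
      vault: dockerhub/password@ci   # field "password" of secret "dockerhub" in KV mount "ci"
      token: $VAULT_ID_TOKEN
      file: false                    # expose as a variable rather than a file path
  script:
    - echo "$DOCKERHUB_PASSWORD" | docker login -u "$DOCKERHUB_USER" --password-stdin
```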
I'm sure you've heard about SPIFFE and SPIRE at this conference; there were several talks about them. The third use case, also very popular nowadays, is Sigstore and cosign. Sigstore is a great framework that eases container signing: you can sign containers using your identity, either a personal identity through Google authentication, for example, or the CI identity through steps similar to those we showed earlier. You can also sign commits with Gitsign, another great tool from the Sigstore ecosystem. Okay, so let's recap this talk. We started by explaining the problem: what modern pipelines look like, why CI/CD systems are important, and how they use third-party vendors in the workflow. The popular way for CI/CD systems to authenticate today is through tokens, because it's easy and it works; but we showed this can be a problem, and the latest breaches prove it. Then we explained OIDC, OpenID Connect, why it's so important, and how CI/CD systems use it today. We showed you the steps to configure GCP, GitHub, and GitLab to create a complete flow of OIDC authentication. And we also showed what to do with providers that don't support native OIDC authentication: go through a secret manager, like GCP's, HashiCorp Vault, AWS's, and many more. So thank you very much. You're welcome to check the full technical details of this talk on our blog; you can scan the QR code, it will take you there, or check any other technical information on our blog. There's plenty. Thank you very much for listening. And that's it.