We've seen how standardization can help you automate your security program, but it can also simplify your compliance efforts. Standardization is important for repeatability, process improvement, policy enforcement, and audits. The key to this consistency is repeatable pipelines. The CI pipeline is your software assembly line. It should be standardized for consistent outcomes and measured for ongoing improvement. When policies are applied automatically and exceptions are documented, your compliance efforts are simplified and you get end-to-end visibility into who changed what, where, when, and how. Nico Meisenzahl of white duck is a frequent speaker, and together with Philippe Lafoucrière from GitLab, he will cover standardization and compliance within the CI pipeline in even greater detail.
Welcome, everyone, to this GitLab Commit presentation. Happy to see you here today. Our topic today is how to enhance your compliance and governance with policy-based CI/CD. I'm Philippe Lafoucrière, and I'm here with Nico. Nico, if you want to briefly introduce yourself.
Sure. Hi, everyone. My name is Nico Meisenzahl. I'm a Senior Cloud and DevOps Consultant at white duck, and I'm basically doing stuff around cloud native, Kubernetes, and containers.
Thanks, Nico. Our agenda today is divided into three sections. We will very briefly introduce what compliance and governance are, especially in CI/CD. Then Nico will talk to you about OPA, the Open Policy Agent, and how it works. And hopefully we will have some time for a couple of demos. So before we jump in, let's try to define what compliance is, which will make this talk easier to understand. I found this very short description on Wikipedia, and I just love the way it defines compliance: it's adherence to standards, regulations, and other requirements.
Surely, you will find many more detailed definitions, but this one is short enough to understand what compliance is and what we're talking about. In other words, for me, compliance is: say what you do and do what you say.
There are many reasons why you would need or want to add compliance to your processes and pipelines. Of course, the first one that comes to mind is regulatory or statutory compliance. You can think here about frameworks, and for the statutory part, there are local and international laws. Especially in the US, you have state, federal, and international laws. So there are a lot of requirements that the state will require you to follow. As for standards, you can think of web standards. For example, if you are developing a browser like Firefox or Chrome, you might want to be compliant with the web standards. That's something that you will automate at some point, and you will want to maintain that level of compliance as you develop and ship new versions. You could also have some obligations to your vendors or customers, such as shipping the software without any known vulnerabilities, and this is not a unit test. It's not something that we're testing in the software itself. It's around the software, but we will get back to this notion a bit later. And last but not least, you could have some corporate policies that you want to enforce. They are more or less guidelines, but sometimes you really want to enforce them. For example, at GitLab, we have the Handbook, which defines a lot of processes and guidelines. We like to use these as policies so that we can enforce them later and make sure that everyone is doing the required things the same way.
Depending on your industry, you might need to comply with specific frameworks, for example, HIPAA for the health industry or PCI DSS for card payments.
We won't go into details on these frameworks. Just remember that they exist and that you might need, again depending on your industry, to comply with them.
Compliance is not the first thing that you will set up in your company. It's probably even the last. There are many steps of preparation before doing compliance. Of course, as usual, automation is key. As the company matures, the testing and quality workflows will mature as well and offer a foundation to get to the compliance level. We can do all of these things without compliance, but doing compliance without them will turn out to be extremely hard, so they are intimately tied together.
And now comes the part about compliance and governance in CI/CD. If you wonder what the difference between compliance and governance is, I like to define it this way: compliance is more like ticking the boxes, whereas governance is understanding and managing the risk. So obviously, they are also intimately tied together, but there's a slight difference between the two.
So what does it mean to have compliance in CI/CD? It means that you define everything around what's in the pipelines. What I mean by that is, it's not only what you are developing in your pipeline, it's also how you are doing it. For example, the base Docker images that you are using, or the whole process itself: how you are shipping, who is involved, and all these kinds of things. It could also be security or compliance gates. If something is failing, you can have some jobs that will make the whole pipeline fail because you are not meeting the requirements. That's a great way to do it in the CI/CD pipeline, because you ensure that these requirements are always met during the whole life cycle of the project. So it's not something that you do once a month, it's something that you do all the time while your project is living.
And of course, that's one of the GitLab values, and probably my personal favorite: iteration is key. Start small rather than trying to do everything at once. OPA will be a great helper with this process, and we will now see why with Nico. Nico, over to you.
Thanks, and thanks for the great intro on compliance and governance with CI/CD. Now I would like to shift to a more technical part and show you what OPA, the Open Policy Agent, is, how you can use it, and how you can integrate it into your CI/CD, and we will then follow up with some demos.
OK, so talking about OPA. When you go to the OPA website, you will find the sentence "policy-based control for cloud-native environments". Basically, OPA is a general-purpose policy agent which you can use across the whole ecosystem or across your own stack. So it's not limited to just one use case. If you have heard about Open Policy Agent in the Kubernetes space, it's a lot more than just doing policy in Kubernetes. You can basically use it everywhere, and I will give you some details in just a minute. OPA is also a graduated CNCF project. So it's part of the CNCF ecosystem, and it was introduced by Styra some years ago.
With OPA, you get a declarative policy language which you can use to define your policies and then validate and verify them with OPA. One very nice thing about OPA is that you can decouple your application logic from your policy decisions. So your application just hosts your business logic, and then you have a sidecar like Open Policy Agent which does the enforcing and validating of policies. The way it's done is that your application just talks to OPA and brings in some details, then OPA decides whether something is okay or not okay and provides your application with some kind of feedback.
To integrate OPA, you have multiple options. First of all, you can use a REST API and run OPA as a sidecar if you're in a containerized world, or just as a daemon if you're based on virtual machines. If you build your application or your tool in Go, you can also just use the Go library the OPA project provides, and for other use cases, you also have a WebAssembly module which you can use to build your policies. Besides this, OPA also provides you with some APIs which help you in the lifecycle: they help you to manage OPA, to get metrics and details, and stuff like this. There is also an option to provide OPA with further input data, but that's something I will talk about in just a minute. So this is just the basics of OPA.
As I said, OPA has a really, really big ecosystem. You can use it with Kubernetes, but you can also use it just with virtual machines, maybe to do SSH validation. You can use it in service meshes to decide which application is allowed to talk to another application. You can use it in your APIs and web apps for authorization. You can use it with CI/CD, as we will do in the demo later. So you have multiple options where you can use OPA and the features and functionality of a general-purpose policy agent.
Some more details on the ecosystem. As I said, you can do API and service authorization, for example with Envoy, Kong, Traefik, and others. You can do authorization for Kafka or SQL databases. You can integrate it into service meshes like Istio or Linkerd. You can use it for infrastructure-as-code changes, which we will do later in the demo with Terraform, or just for Linux machines, to decide which user is allowed to access the machine with SSH or which user is allowed to use sudo. And of course, you can also use it for policies and governance with Kubernetes, but this is one example. And this is just a small list. I included a link with further integrations from the ecosystem.
So it's a really, really long list of integration points, which is really great.
So talking about OPA and how it works. Let me bring up the laser here. Basically, we have some kind of request. It doesn't matter which kind. The request is sent to some kind of service, which also doesn't matter for now. So something is talking to our service. Then our service sends some data to OPA to validate. The kind of data doesn't matter. It's basically everything OPA will need to decide whether it's okay or not okay. The only thing you need to be aware of is that it needs to be JSON data, but the content doesn't matter at all. So your service sends some JSON to OPA, for example via the REST API or the Go library or anything else. Then OPA decides, based on the policies, which we'll talk about in a bit, whether the request is allowed or should be denied. And then OPA provides the decision back to my service, also as JSON. The decision can contain anything. It can be just a true or false, or OPA can provide some more data if needed, so it's any kind of data in JSON format. Furthermore, OPA also has a data store where you can provide OPA with further information upfront. So let's say your request provides only part of the information, but you need more to really decide whether this request is okay or not. Then you can provide OPA with that further data upfront. This is the data store down here. So this is basically how OPA works.
Let's now go a little bit into detail with an easy example based on API authorization. We have a POST request on /api with an authorization header for the user nico. So a pretty easy example. Then our service sends our input data to OPA, which in this case is the method POST, the path /api, and the user nico. Based on this information, OPA then decides whether it's okay or not.
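To make that exchange concrete, here is a minimal Python sketch of what the service side could look like when it talks to OPA over REST. It assumes OPA's Data API on its default port 8181, and a hypothetical policy path `app/rbac/allow`; neither the URL nor the helper names come from the talk. The request is only built, not sent, so the sketch runs without an OPA server.

```python
import json
import urllib.request

# Assumptions: OPA listens on its default port 8181, and the policy lives at
# the hypothetical package path app/rbac with a rule named allow.
OPA_URL = "http://localhost:8181/v1/data/app/rbac/allow"


def build_request(method: str, path: str, user: str) -> urllib.request.Request:
    """Wrap the request attributes in the {"input": ...} envelope for OPA."""
    body = json.dumps({"input": {"method": method, "path": path, "user": user}})
    return urllib.request.Request(
        OPA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(response_body: str) -> bool:
    """Read the decision; a missing "result" key means the rule was undefined."""
    return json.loads(response_body).get("result", False) is True


request = build_request("POST", "/api", "nico")

# Sending is left out on purpose. Against a running OPA it could look like:
#   with urllib.request.urlopen(request) as response:
#       allowed = is_allowed(response.read().decode("utf-8"))
```

The service only needs to know the shape of the input and of the decision. Everything about why a request is allowed stays inside the policy.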
In this case, we do not have any further input data, and OPA will just send back to our service the JSON with allow set to true. Then our service knows, okay, this request is allowed, everything is fine, and we can serve the data. So this is just a simple example. Now the only thing which is missing is the policy part over here.
So talking about policy: as I said, OPA brings its own policy language. It's called Rego, and it's a declarative language. Basically, you can ask questions such as, hey, is Nico allowed to POST the payload to /api? And then you can build your own queries. The result of those policies can be true or false, which is the common case, but you can also build anything further and return any kind of JSON data as feedback. So it's up to you. To support you a little bit, Rego has some built-in functions, 140 and more. There are time and date functions and string functions, you can use regex, you can validate JSON Web Tokens, and many, many more.
Here on the right, we have a pretty easy sample Rego policy. It starts with a name, the name of the package, app.rbac, and we're defining that the default is false. So if there's no allow, the result is false. Then we have a block which defines when the result is allow: when the action is POST and the user is the owner. Both of those are declared below. Here we're checking the input, which is the input data the service provides to Rego, and we're checking the method. If this is a POST, and the input user is nico, then both will be true. And because both are added to allow here, both need to be true. So if the method of the request is POST and the user is nico, it's fine, and OPA will say, okay, it's allowed. So back to the first sample here.
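The policy on the slide is written in Rego and is not reproduced in this transcript. As a rough Python mirror of the logic described above (default false, allowed only when both conditions hold), a sketch could look like this. The helper names are hypothetical and chosen for illustration.

```python
def action_is_post(request_input: dict) -> bool:
    """First condition: the HTTP method in the input is POST."""
    return request_input.get("method") == "POST"


def user_is_nico(request_input: dict) -> bool:
    """Second condition: the user in the input is nico."""
    return request_input.get("user") == "nico"


def allow(request_input: dict) -> bool:
    """Deny by default; allow only when every condition is true (logical AND)."""
    return action_is_post(request_input) and user_is_nico(request_input)


# The input from the API authorization example in the talk.
decision = {"allow": allow({"method": "POST", "path": "/api", "user": "nico"})}
```

In Rego the conditions inside one rule body are combined the same way: every line has to hold for the rule to be true, and the default applies otherwise.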
As I said, it's the same POST request, and we're getting the same input data. As we see, the input method is POST and the input user is nico, which means both are true, and OPA will return allow is true back to our service. So this is basically how OPA works.
If you would like to get started with OPA, you have multiple options. First of all, there's a pretty nice playground. It's a browser-based playground where you can play with input data and output data. You can build Rego queries, and you will find examples. So the OPA Playground is a pretty nice place to start and get ready to use OPA. Then you have pretty good documentation, where you'll find all the details on how to build Rego policies and stuff like this. And of course, we have the OPA CLI, which is basically just the OPA binary, where you can run an OPA instance and test and validate your policies. You also have opa eval, which is basically the Swiss Army knife you can use to test, build, and run policies. We will also use this one later in the demo to integrate OPA into our CI/CD.
So this is how everything works, but of course, we would like to show some stuff in action. We prepared two demos on CI/CD-integrated policies with OPA. I'll show you in a second how you can validate infrastructure-as-code changes with Terraform. So you're introducing new cloud resources or updating your cloud resources, and you then validate with OPA whether those changes are okay to deploy into your environments.
So let me switch to my demo here. I've prepared a small demo repo, pretty basic. We have an infrastructure folder where we have a Terraform project, which is just one file. In this case, we would like to deploy to Azure, so I added the Azure provider here. I defined some backend information for the state file and stuff like this.
So it's pretty easy here. We will use the Terraform integration with GitLab, which means all the details, all the security stuff, is injected from the GitLab side. So you don't need to do anything. Then we have the Azure Resource Manager provider here, which doesn't matter at all. And we're creating two resources of the kind azurerm_resource_group, so just resource groups in the Azure cloud. One has the name example-resource-group and one has the name example-deny-resource-group. The first one is deployed to the West Europe location and the second one to North Europe. So basically two resources, with different names and different locations.
If we now go back and have a quick look into our policies, we have a terraform.rego file which contains our Rego policy. As I said, first of all, we need a package name. We defined it here. Then I defined two variables, because I only want to check resource groups in this example, and I would like to check whether the location is West Europe. So I have two variables, one with the resource type and one with the location I would like to check. And then I have the policy itself. First off, I'm getting my input data and storing it in the change set here. I only store everything which is a resource change, to make it a little bit faster if it's a long list of changes, so I just get the changed resources. Then I'm checking whether it's a create or update action, because if it's a delete action, it doesn't matter to me what the location was, since the resource gets deleted. Once again, this makes a big change set a little bit faster to process. Then there are two checks. First of all, I'm checking the resource type, whether it's the Azure resource group type, and then I'm checking the location after the change, whether it matches West Europe. If not, I'm getting a deny message. And then I'm defining the message.
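The Rego file itself is not shown in the transcript. The checks it performs can be sketched in Python against the JSON plan format that `terraform show -json` produces, where each entry in `resource_changes` carries a `type`, a list of `change.actions`, and the planned `change.after` values. The sample plan below is trimmed to those fields, and the spelling `westeurope` and the message wording are assumptions for illustration.

```python
from typing import List

RESOURCE_TYPE = "azurerm_resource_group"
ALLOWED_LOCATION = "westeurope"  # assumption: how the plan spells West Europe


def normalize(location: str) -> str:
    """Compare locations without caring about case or spaces."""
    return location.replace(" ", "").lower()


def deny(plan: dict) -> List[str]:
    """Return one message per created or updated resource group in the wrong location."""
    messages = []
    for resource in plan.get("resource_changes", []):
        change = resource["change"]
        # Deletes are skipped: the location of a removed resource is irrelevant.
        if not {"create", "update"} & set(change["actions"]):
            continue
        if resource["type"] != RESOURCE_TYPE:
            continue
        after = change.get("after") or {}
        if normalize(after.get("location", "")) != ALLOWED_LOCATION:
            name = after.get("name", "<unknown>")
            messages.append(f"Resource group {name} does not match allowed locations")
    return messages


# Trimmed-down plan with the two resource groups from the demo plus one delete.
SAMPLE_PLAN = {
    "resource_changes": [
        {"type": RESOURCE_TYPE, "change": {"actions": ["create"], "after": {
            "name": "example-resource-group", "location": "westeurope"}}},
        {"type": RESOURCE_TYPE, "change": {"actions": ["create"], "after": {
            "name": "example-deny-resource-group", "location": "northeurope"}}},
        {"type": RESOURCE_TYPE, "change": {"actions": ["delete"], "after": None}},
    ]
}

violations = deny(SAMPLE_PLAN)
```

Only the North Europe resource group ends up in `violations`; the West Europe one passes and the delete is ignored.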
So basically here: the resource group with this name does not match the allowed locations. So this is the Rego policy I defined.
If we now check back in the root, we see the GitLab CI/CD file. Here we have some basics. We're including a template, which is the Terraform integration in GitLab. That's where all the GitLab magic or Terraform magic happens. You need to define some variables, and we have a list of stages. We have init and validate, which are used to verify our code and to set up everything Terraform needs. This is all part of the template. The same goes for deploy, which then actually deploys our changes and is also completely part of the template above. So the only things we customized are the build stage, and we added another stage which is called analyze. In the build stage, the important part is that we're doing a terraform show and converting the output into a JSON file. This JSON file contains all the changes in this Terraform run, and it will be our input for OPA. Then we're putting it into artifacts so that we have it available later.
And then we have the analyze stage. Here we're just saying, hey, we would like to run the step in a container image. We're using the Open Policy Agent container image. We need to use the debug image to have a shell in the container, so we're overriding the entry point and then calling OPA with eval. We're saying, hey, we'd like to have a pretty output format, and we would like to fail if we get any kind of output. So if everything is fine, we will not get any output. If there is an issue, we will get the output, and the exit code of OPA will be non-zero, which then stops the pipeline or turns the pipeline red. Then we're telling OPA where the policy is located and where the input JSON is located, which is the Terraform changes from above.
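The exact job script is not in the transcript, so the following Python sketch only illustrates how such an `opa eval` call could be assembled and how its exit code maps to the job result. The flags `--format`, `--fail-defined`, `--data`, and `--input` exist in the OPA CLI as far as I know, but the file paths and the query string are assumptions, and the command is built without being executed so the sketch runs without OPA installed.

```python
from typing import List


def opa_eval_command(policy_path: str, input_path: str, query: str) -> List[str]:
    """Assemble an opa eval call that fails when the query yields a result."""
    return [
        "opa", "eval",
        "--format", "pretty",   # human-readable output in the job log
        "--fail-defined",       # non-zero exit code when the query is defined
        "--data", policy_path,  # where the Rego policy lives
        "--input", input_path,  # the JSON plan from the build stage
        query,
    ]


def job_status(exit_code: int) -> str:
    """A CI job passes on exit code 0 and fails on anything else."""
    return "passed" if exit_code == 0 else "failed"


# Hypothetical paths and query, chosen for illustration only.
command = opa_eval_command(
    "policies/terraform.rego",
    "plan.json",
    "data.terraform.analysis.deny[x]",
)

# In a real job the command would be executed, for example:
#   import subprocess
#   print(job_status(subprocess.run(command).returncode))
```

The design point is that the pipeline itself needs no policy knowledge. It only reacts to the exit code, so tightening or relaxing the rules means editing the policy file, not the CI configuration.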
And then we're just calling the query data.terraform.analysis.deny. And basically this runs our pipeline.
If we now check back into our pipelines, we have some runs. You can see here that the init stage passed, the validate stage passed, and the build stage passed. Taking a quick look at the build stage, you can see Terraform mention that there will be two resources created: our resource group example as well as our resource group example-deny with the location North Europe. If we now jump over to the analyze stage, this one is red. What happened is that this stage ran the command I showed you some seconds before and then provided us with an output saying, hey, the resource group example-deny does not match the location. So the resource group in West Europe is fine, but the resource group in North Europe is not, because it's not in West Europe, and we get the output here. We are able to fix that and make sure that this resource group does not get deployed into our cloud environment. Okay, so with this, back to you.
Thanks, Nico, that was a great demo. Today we have another demo for you, about GitLab project validation. It's an initiative that I'm currently working on at GitLab. The goal is to track and monitor all the projects and components that are involved in the development of GitLab at some point. We recently added some policies to this inventory to make sure that the configuration of the projects is the one that we're expecting. That's what we have here in this merge request. First of all, we tag the project with product categories. The categories are a way to add some tags or to label the projects, and obviously we want to track the ones that are product related. This is not a real-world example, of course. It's just for the sake of this demo.
But you can see that we also have a projects.rego file with a violation rule that is true if the project is either a product project, or has red data, or is a library, and there's any of the product violations listed above it. As for the product violations, in this case we want to make sure that SAST is configured. So there's a violation if SAST is not configured or SAST is disabled, for example. If the job is configured correctly but we're not running it, that's a violation of our policy. The same goes for dependency scanning, secret detection, and so on. And we can ensure that SAST is configured if we have at least one report of type SAST in the pipeline.
The second rule set that we have here, the sites.rego file, is there to assess the websites that are listed in this inventory, because we also link the websites that we deploy to their projects in this inventory. We use this list to run some SSL checks. After this check, we are able to provide an overall grade from A to F. Of course we want this grade to be great. We want it to be at least an A, so it could be A minus, A, or A plus, but nothing else. If it's not at least an A, there's a violation here and we'll report that.
If we get back to this merge request pipeline, you can see that everything is green, in contrast to the previous demo. That's because this pipeline is used to process the data. We're syncing the data with GitLab.com. We're updating the files. We're doing an SSL check. We're doing a lot of different things. There's this new compliance stage where OPA is running, but OPA is green because there's nothing wrong with what the user is doing in this pipeline. We're just using the pipeline to process the data. But here we can see that we actually have some violations. This new project that we just categorized as a product project is violating our policies because SAST is not configured.
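The two rule sets can be pictured with a small Python sketch. The inventory schema is not shown in the talk, so every field name here (`categories`, `reports`, `grade`, `url`) and the category labels are hypothetical. Only the rules themselves follow the description above: in-scope projects need SAST, dependency scanning, and secret detection reports, and sites need a grade in the A range.

```python
from typing import Dict, List

# Hypothetical labels and field names; the real inventory schema is not shown.
IN_SCOPE_CATEGORIES = {"product", "red_data", "library"}
REQUIRED_REPORTS = ("sast", "dependency_scanning", "secret_detection")
ACCEPTED_GRADES = {"A-", "A", "A+"}


def project_violations(project: Dict) -> List[str]:
    """List missing scanners, but only for projects the policy applies to."""
    if not IN_SCOPE_CATEGORIES & set(project.get("categories", [])):
        return []
    reports = set(project.get("reports", []))
    return [
        f"{project['name']}: {report} is not configured"
        for report in REQUIRED_REPORTS
        if report not in reports
    ]


def site_violations(site: Dict) -> List[str]:
    """Flag any site whose SSL grade is outside the A range."""
    grade = site.get("grade")
    if grade in ACCEPTED_GRADES:
        return []
    return [f"{site['url']}: SSL grade {grade} is not in the A range"]


new_project = {"name": "demo-project", "categories": ["product"], "reports": []}
internal_tool = {"name": "scratchpad", "categories": [], "reports": []}
sites = [
    {"url": "https://good.example", "grade": "A+"},
    {"url": "https://weak.example", "grade": "B"},
]

project_findings = project_violations(new_project)
site_findings = [msg for site in sites for msg in site_violations(site)]
```

Because the checks return findings instead of failing, the pipeline can stay green and still hand the list to a later step, which matches the reporting style of this demo.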
Dependency scanning is not configured either, and neither is secret detection. As for the websites, we can see that this one is OK, but this one has a B grade, which is obviously something that we don't want. So we report that here. We obviously don't use this output just for debugging. We're providing some JSON files as artifacts here, and these JSON files are going to be used later on to generate alerts and issues so that we can fix the situation.
So as we've just seen, there are many ways to use OPA. There are a lot of different use cases. It could be for validating Kubernetes manifests. It could be for building an allow and deny list for your library dependencies, for example. That's something that can run directly in your CI/CD pipeline. Alternatively, you can run OPA as a daemon and do authorization for Kong, Envoy, or many other use cases. We really invite you to visit the OPA website, openpolicyagent.org, for all the integrations that are already available for this project. It's amazing to see all these integrations.
And with that, we hope that OPA is a bit clearer for you today, and that you now want to enhance your compliance and governance with policy-based CI/CD. If you have any questions, feel free to reach out to us directly, and we wish you a very good rest of your GitLab Commit conference. Bye-bye. Bye-bye.