We've talked about how infrastructure as code reinforces the need for automation and security as code. In the next two sessions, we'll hear how GitLab partners are solving different challenges of securing IaC. Both also happen to be sponsors of Commit, and we're excited to have them. Be sure to check out their demo presentations as well. First, we have Taylor Smith from Bridgecrew. Taylor will share how to embed IaC security into GitLab pipelines using Auto DevOps. He'll highlight a workflow for catching issues in frameworks such as Terraform, CloudFormation, and Kubernetes right from the developer's IDE and in the MR pipeline. Taylor, over to you.

Hi, everyone. I'm Taylor Smith, part of the Bridgecrew team, and I'm going to be talking about putting the Sec in DevSecOps: automating cloud security as an enabler. Part of my job is bringing security into the DevOps lifecycle for our customers, so I'm going to share some of the lessons I've learned. In this talk, I'll cover cloud infrastructure security, infrastructure as code, and what they mean for the modern cloud security environment. On the agenda: what infrastructure as code is, as a quick introduction; the problems facing DevOps and security teams in this new paradigm; some of the sources of misconfigurations and security risks in the cloud; tools that are helpful for securing, and learning, infrastructure as code security; how to address cloud security throughout the DevOps lifecycle; and strategies for implementing a DevSecOps program. It's a packed agenda, so I'm going to jump right in, starting with cloud-native technology and security and how it's evolving, beginning with infrastructure as code and how it makes DevSecOps possible for cloud infrastructure.
If you've never heard of infrastructure as code, start by thinking about how you would provision a virtual machine in a cloud environment. You could go through the dozens of steps it takes to provision it using the UI. That's pretty tedious, especially if you're standing up thousands of hosts or you have ephemeral hosts that need to be rotated. You could use the CLI instead, which is another option: it's in code, it's a little easier to follow, and there are far fewer steps. That seems repeatable, but the problem is it's not very human-readable, it's hard to debug, and it's hard to reuse. You could copy and paste those CLI commands, but what happens when you just need to update one piece of the environment? Do you have to go through the whole process again? It doesn't work very well, there's no way to collaborate without going offline, and there's no versioned history or anything around security.

So we've moved on to infrastructure as code. What's really exciting is that all the configurations of your entire infrastructure are now in code, so it's much easier to make changes. If I want to update an instance from a t2.micro to a t2.medium, I can just edit that one attribute, run terraform apply, and it will automatically tear down the old environment and spin up the new one for me. I don't need to go through all the CLI commands or UI clicks to make that happen. It also works in combination with a VCS like GitHub, GitLab, or Bitbucket: you check in the code, which makes it really easy to collaborate, and you can write comments directly on the code to propose changes. You now have versioned history, the ability to collaborate, and the ability to roll back and revert changes that break things or cause issues. The only problem is that this still doesn't involve security.
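To make the declarative idea concrete, here is a toy sketch in Python of the model behind tools like Terraform: you describe the desired state, and the tool computes the actions needed to get there. This is only an illustration of the concept; the function names and resource shapes are made up, and real Terraform's planning is far more involved.

```python
# Toy sketch of declarative infrastructure as code: compare the current
# state of resources with the desired state and emit an action plan.
# Names and shapes here are illustrative, not a real provider API.

def plan(current: dict, desired: dict) -> list[str]:
    """Compare current vs. desired resources and return an action list."""
    actions = []
    for name, cfg in desired.items():
        if name not in current:
            actions.append(f"create {name}")
        elif current[name] != cfg:
            # e.g. changing instance_type from t2.micro to t2.medium
            actions.append(f"replace {name}")
    for name in current:
        if name not in desired:
            actions.append(f"destroy {name}")
    return actions

current = {"web": {"instance_type": "t2.micro"}}
desired = {"web": {"instance_type": "t2.medium"}}
print(plan(current, desired))  # ['replace web']
```

The point of the sketch is the workflow: you edit one attribute in the desired state, and the tool (not you) figures out the tear-down and spin-up steps, which is exactly what makes the change repeatable.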
So if we want to involve security, what's the problem? When we scanned Artifact Hub and the Terraform Registry, we found that 47% of Helm charts had a misconfiguration and 44% of Terraform modules in the Terraform Registry had a misconfiguration. It's not safe to assume that just because you're going through Terraform or Helm or Kubernetes, your infrastructure will be secure by default. There are still a lot of configurations you need to be concerned about. One example is unencrypted databases: encryption is not turned on by default for a lot of databases in the cloud. In fact, our Unit 42 research team at Palo Alto Networks found 200,000 insecure templates in use, not just sitting in a registry or repo, but actually in use. They found that 43% of cloud databases are not encrypted, 60% of cloud storage services didn't have logging enabled, 76% of workloads exposed SSH to the world (definitely a big no-no), and up to 20% had RDP exposed to the world. These misconfigurations are extremely risky because those services provide remote access to cloud environments. If a hacker is just doing port scans, they'll easily find that open SSH port, making it that much easier to get in.

The problem we're running into is that DevOps and security are mismatched. Security isn't happy because security teams are badly outnumbered: at a typical company there are many developers for every security person. To make matters worse, a single insecure infrastructure as code template might define hundreds of resources that are now insecure, and once deployed, that causes thousands of alerts. All the misconfigurations from all those services are now sending alerts to a security team that's already understaffed. And you compound that by the number of different infrastructure as code frameworks out there.
There's Terraform, there's CloudFormation, there are ARM templates, there's Kubernetes YAML, and we're expecting our security team to be experts in all of them. Then there are hundreds of resource types offered by each cloud provider, and it feels like Amazon launches a new service almost every day. Compound that by the multiple different clouds out there, AWS, Azure, GCP, Alibaba Cloud, OCI; there are so many, and each has its own differences that security teams need to wrap their heads around. And that's just the real resources. A good example of this is AWS Infinidash; you should look it up. It's a very funny story about how somebody made up a fake AWS service and people actually posted job listings for it. It just shows how difficult it is for security teams to keep up.

The other side is also true: engineering isn't really happy with the traditional method. We did have a spot for security in the old world of waterfall, where there was a design phase, an implementation phase, and a clearly defined testing phase that included security, and enough time was given to do security properly. But that was when we had three-month-long cycles. Now that we've moved to three-week-long cycles, you have much shorter sprints, but security is still taking a long time to do reviews. By the time security finishes a review, they're probably giving feedback on code that was written multiple sprints ago. The developers have moved on to new code, and all the context they had in their minds about the misconfiguration is gone. So neither party is happy. On top of that, we can't expect developers to be security experts. As an example, the CIS Kubernetes Benchmark is 263 pages of text, just walls of text.
We can't exactly expect a developer, who already has to do all their programming and be an expert in their language of choice, their Terraform templates or CloudFormation or Kubernetes YAML, to also be a security expert. So basically, neither party is happy. The solution is a developer-first approach to security, where security is codified, automated, and integrated. Codified: it's developed in code, as policy as code. Automated: it runs automatically, and it's fast. Integrated: it's embedded in the software development lifecycle, in the tools developers are already using.

So let's talk about some of those tools. These are tools that can help you make secure configurations, but they can also help you learn. You can obviously grab templates from public registries and public repos, and we highly encourage you to do that; they're great jumping-off points. The thing is, as I showed before, all these public sources are great, but they're not secure by default, and a lot of them contain security misconfigurations. So even with that jumping-off point, you still need to check them for misconfigurations and vulnerabilities. One of the tools out there is our open source tool, Checkov. It's completely free, and we recommend you check it out. It was written by our CTO and is meant to be a tool for identifying misconfigurations in code, using code, so the feedback goes directly to developers. You can use tools like this, and there are others out there like tfsec and a whole bunch of other linters, to identify these misconfigurations. The best way to learn is to start by building your own Terraform or CloudFormation, your own infrastructure as code templates; identify the mistakes using some of these tools or even peer reviews; correct them; and improve.
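To show what "policy as code" means mechanically, here is a minimal Python sketch in the spirit of Checkov-style scanners: each policy is a plain function over a parsed resource config, and scanning is just running every applicable policy. The check IDs, policy names, and resource shapes below are invented for illustration; they are not Checkov's real checks or data model.

```python
# Minimal policy-as-code sketch: policies are plain functions that
# inspect a parsed resource config and return pass/fail. The IDs and
# resource shapes are made up for illustration.

def s3_encrypted(resource: dict) -> bool:
    # Fails when no server-side encryption block is configured.
    return resource.get("server_side_encryption") is not None

def no_public_ssh(resource: dict) -> bool:
    # Fails when any ingress rule opens port 22 to the world.
    rules = resource.get("ingress", [])
    return not any(r.get("port") == 22 and r.get("cidr") == "0.0.0.0/0"
                   for r in rules)

POLICIES = {
    "aws_s3_bucket": [("TOY_001 bucket encrypted at rest", s3_encrypted)],
    "aws_security_group": [("TOY_002 no SSH open to the world", no_public_ssh)],
}

def scan(resources: list[dict]) -> list[str]:
    """Return a finding string for every policy each resource fails."""
    failures = []
    for res in resources:
        for name, check in POLICIES.get(res["type"], []):
            if not check(res):
                failures.append(f"{res['name']}: {name}")
    return failures

resources = [
    {"type": "aws_s3_bucket", "name": "logs"},
    {"type": "aws_security_group", "name": "sg",
     "ingress": [{"port": 22, "cidr": "0.0.0.0/0"}]},
]
for finding in scan(resources):
    print("FAILED", finding)
```

Because policies are code, they can be versioned, reviewed, and run automatically anywhere code runs, which is what makes the "codified, automated, integrated" model possible.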
And you're not just improving the infrastructure as code template itself; you're also improving your own capabilities, learning what a misconfiguration is by actually working through it. We recommend starting with your own code and learning from that, but you can also grab purpose-built learning tools: we maintain TerraGoat and CfnGoat, and there's another provider who offers Kubernetes Goat. These are vulnerable-by-design, misconfigured-by-design templates, so you can see what a real, deployable infrastructure as code template looks like with all the different misconfigurations in it. They're very good learning tools. Even though you can deploy them, we highly recommend you don't; they are vulnerable, with things like port 22 open to the world and unencrypted databases. Don't actually apply them, but they're good tools for learning what a misconfiguration looks like in code and fixing it as an exercise.

Now, what does it mean to embed security in the tools we use? It means embedding it in every part of the software development lifecycle. Start in the integrated development environment, like VS Code or IntelliJ, giving feedback directly there. Then pre-commit, before you have a merge request, include security checks to block insecure code from ever reaching a merge request. In the merge request itself, you can provide code comments describing the misconfigurations identified, and in the CI/CD process provide more of that feedback: the same checks in the build and deployment pipeline, blocking builds that don't match your policies. For example, if you violate a critical-severity policy, we won't let the change get deployed, or we won't let it get added to your repository. And finally, in runtime, you can check again for misconfigurations.
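A pipeline gate like the one just described can be sketched in a few lines. This is a hypothetical CI step, not Bridgecrew's or GitLab's actual gating logic: findings at or above a severity threshold fail the job (a non-zero exit code blocks the build), while lower-severity findings are surfaced without blocking.

```python
# Sketch of a severity-gated CI step: block the build only when a
# finding meets or exceeds a configured threshold. Severity names and
# the findings themselves are illustrative.

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def gate(findings: list[dict], threshold: str = "critical") -> int:
    """Return a shell-style exit code: 1 blocks the build, 0 passes."""
    blocking = [f for f in findings
                if SEVERITY_RANK[f["severity"]] >= SEVERITY_RANK[threshold]]
    for f in blocking:
        print(f"BLOCKING [{f['severity']}] {f['message']}")
    return 1 if blocking else 0

findings = [
    {"severity": "low", "message": "S3 bucket has no access logging"},
    {"severity": "critical", "message": "S3 bucket allows public read"},
]
exit_code = gate(findings, threshold="critical")
print("exit code:", exit_code)  # 1 -> the merge request is blocked
```

Starting with a high threshold and ratcheting it down over time is one way to introduce hard fails without overwhelming developers on day one.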
Now it's in your cloud environment. It's important to have views of security across this entire landscape, because as you move from left to right, the amount of information and context you have increases. You're able to see more of what the full integrated environment looks like: which volumes are attached to what, which databases are talking to which virtual machines. In the IDE and pre-commit phases, we're starting to add capabilities, using things like our Checkov graph database, to bring some of that context into the code checks, but it's always a good idea to have checks across the entire environment. The opposite is true for complexity: the difficulty of making changes is lowest at the IDE and early development stages, because all the context is in the developer's mind. Developers are aware of everything they're doing; they're working on that code, so it's actively in memory. Once they move on to the next sprint, as we talked about before, their context has switched; they're thinking about other things, so they need to re-remember all of it before they can make a fix. So it's important to cover the full lifecycle and constantly give feedback at every stage. I'm going to dive into each of these stages to give a better understanding of what it means to have a true security check at each one.
If we jump over into VS Code, here is that TerraGoat repo. I have an AWS instance file that's just going to spin up an EC2 instance, and I'm going to start by scanning just that file with Checkov. What this is going to do is find all the misconfigurations in the file; you could scan a directory or a single file. In this case, it's gone through this EC2 instance and it reports things like "ensure all data stored in an S3 bucket is securely encrypted at rest," and it gives you the lines where that resource is defined, to say: okay, this is where you need to add encryption at rest. That's what Checkov does. So if you're learning how to write secure infrastructure as code, you can identify the misconfigurations using Checkov and other free and open source tools, go back into the code, and actually make the change. Checkov, if you incorporate Bridgecrew, takes this one step further: in the IDE, like VS Code, it identifies misconfigurations inline, kind of like spell check, so you can see the different misconfigurations right there in context. And we take it one step further still: with the Bridgecrew backend included, you can apply a quick fix where one is available, so I can automatically make that change. Now the instance is EBS-optimized, right then and there. That's the pre-commit and IDE phase.

Next, we're in the merge request and CI/CD phase. When you're doing a merge request, you can scan it for misconfigurations. As an example, say we've added an S3 bucket to one of our Terraform templates. Tools like Bridgecrew can automatically go through and identify misconfigurations in that pull request. Here it's identified a low-severity issue where the S3 bucket
does not have logging enabled, and the fact that I know it's low severity means I can prioritize it below, say, this critical one where I have a public read. Having those comments right there in the context of the pull request makes it that much easier to get things fixed. If I jump into the platform, there's an aggregated view of all the code scans we've done, from the IDE, from the CLI and pre-commit, all the way through to CI/CD and the merge request. We can filter by category, severity, or tags, and we can also see the context: what the issue is, whether there are related resources, and the versioning history. If you have multiple git commits, you can also see who made each change, so you can identify who needs to make the fix.

Moving on to runtime: using either us or other CSPM tools out there like Prisma Cloud, you can identify all the misconfigurations in runtime. When you have port 22 open in a security group, it will be identified at runtime. Now, if you're following GitOps practices, you should take that misconfiguration back to the infrastructure as code template and make the change there; otherwise your templates become much less useful, because changes made in runtime aren't synced back to build time. To make that slightly easier, we offer something called drift detection. If you look at our incidents tab, you have all the misconfigurations present in runtime, and if I search for a manual modification, I can see that somebody went into AWS and made a manual change. Scrolling down, I can see they removed port 443 and added port 22 open to quad zero (0.0.0.0/0). So somebody went in, maybe during an incident, and opened up SSH to the world. That's obviously not something we want to sync, so instead of syncing the infrastructure as code template up to the cloud, in this
case we'd want to take that misconfiguration out of the cloud configuration and delete that security group rule, so we're not opening port 22 to the world. And that's security across the DevOps lifecycle: we've secured from the development phase all the way through to the runtime phase using automated tools. The nice thing, as I alluded to before, is that as you identify these misconfigurations, even when the fix is automated, you're still seeing what the issue is, so you're learning; you're internalizing each misconfiguration, and it becomes muscle memory to do it right the next time.

Next, I want to talk about implementing a DevSecOps strategy using a crawl-walk-run methodology. I recommend starting by experimenting: scan a few files and folders ad hoc using free, containerized open source scanners like Checkov; identify that laundry list of misconfigurations just to get a feel for what you're looking at; then explore different output types and their effectiveness for your environment. Move on to testing: schedule scans for every repository, audit configurations, track errors, and start to define an SLA for addressing them, where critical misconfigurations might have a much shorter SLA than low-severity ones. Then start to scale out: orchestrate scans with every single build job as part of your CI/CD pipeline, and tweak and customize policies, adding skip annotations for policies you don't care about, adding custom policies for checks that are specific to you, or writing custom policies the wider community would be interested in (and please contribute those back). Then evaluate the results against the compliance frameworks you care about, like NIST or PCI. Finally, add a governance program: audit the results with all the stakeholders, bring them reports, and start
implementing a tagging strategy to make sure all resources are properly tagged, getting code-to-cloud visibility for misconfigurations across your environments, and regulating non-compliant usage of your VCS. This is when you start to add in blocks and hard fails: once you're satisfied with the current state, you can make sure that new misconfigurations are not allowed into your repository, and especially not into your runtime environment.

So thank you for joining me; that's my talk on putting the Sec in DevSecOps. If you want to learn more or want to join the discussion, we have our own Slack at slack.bridgecrew.io, and you can also message me here. Thanks.