I believe I'm between you and lunch, maybe? Or maybe there's another session after that. I feel bad for them. But hey, everybody, I'm Sara Calife. Today I'll be talking about going from build and CI to security testing, all in a single pull request. For some background, like I said, I'm Sara Calife. I'm a solutions engineer at GitHub, and I've been at GitHub for about three years now, so time, especially over the last three years, kind of flew by. Previously, I was a cloud platform engineer working on Kubernetes, microservices, and a lot of cloud applications. Now I'm more focused on talking about security, automation, and overall developer tooling. For fun, I enjoy volleyball, going to the beach, and travel in general, so I'm very happy to be in New York during Christmas time. In this talk, there are a couple of things we really want to cover. First, I'll be covering simple steps for automating a lot of the process and incorporating security earlier on. We'll be leveraging a lot of the open source tooling that's available today to run security scans across our code base, to run security checks against dependencies, and against cloud native components overall. At the end of the day, we want to optimize the containerization and automate the triggering of any security scans we want to run per pull request, to make it easier for our developers to execute all of this in one go. And we also want to make sure that we're validating everything before any new code is introduced into any production-level environment. So why even have this talk? Why are we even interested in this? Overall, there have just been a lot of recent vulnerability exploits. Even today, I'm pretty sure Log4J was talked about four times just in the keynote sessions. And it's a hassle, and it's so tiring to go back and fix these issues.
It's very hard to go back and fix something that you could have probably fixed earlier on, or to fix something in production when you've actually known about the issue since the build or development phases. I found a really interesting Splunk security study, the State of Security by Splunk in 2022. Let's see, I hope I'm not blocking the view here. They shared that 65% of organizations reported an increase in attempted cyberattacks over this last year. We're all a lot more aware of these vulnerabilities, but they're just constantly coming at us, compared to what it's been over the last maybe five, six years. 59% of security teams have devoted significant time and resources to remediating these vulnerabilities. That's up from 42% a year ago, which is more than a 10-point jump in just a year. It's pretty significant when we're talking about the time and resources of developers and security teams and the individual people that have to deal with this day in, day out. What we've actually seen is that even though we're more aware of security vulnerabilities, the lines of code are increasing at the same rate as the security vulnerabilities. So why aren't security vulnerabilities being fixed more often, if the lines of code are increasing and we know about these alerts? Well, that means we're just not fixing them in time, or we're not fixing them at all. There was a study done from 2016 to 2020 showing that 85% of applications still contained security vulnerabilities across the board, because maybe it was something small that wasn't detected earlier on, or if it was detected it wasn't fixed, because it was already in production and they didn't want to take it down or hot patch it and so forth. So what can we really do to make sure that we're at least catching most of the vulnerabilities we can before we even get to the production phase?
What can we really do to make it easy for us to do that job and fix those vulnerabilities earlier on? So here are the goals of what we're going to be presenting today. We'll be running integration testing and security testing on any new code changes that are coming through. We want to do that at the pull request level to reduce the amount of bugs, to scan for vulnerabilities, and to make sure that we're building and validating all of this at the same time. We want to make sure that we're detecting earlier and blocking the merge. So if we're detecting vulnerabilities, we should be blocking the merge, especially if they're high, maybe even medium severity. We should probably be validating those and blocking the merge to make sure that nothing reaches anywhere close to production until we fix those vulnerabilities. We want to make sure that we're configuring branch protection rules. This is just a common standard across the board in any type of development, not just when we're talking about security. But we want to maintain a consistent way of testing and validating, and in order to do that and maintain consistent results, we want to set those branch protection rules as standards across the board. When we set those as standards, the expectation is already there. As a new developer comes in, they already know what they have to do before they merge anything. Existing developers know that any code that has gone through this process has already passed the required tests. So this is setting expectations earlier on. And at the end of the day, who doesn't love automating? It makes things easier and standardized. And again, if it's easy, it's going to get done. If it's not easy, then it's going to take some context switching.
If you have to go to a different tool, if you have to go through three different types of validation somewhere else, you're just never enjoying that process, or never really wanting to do it. So what are some of the types of security that we really want to talk about? There are many different types, so when we talk about security, we have to scope it down to a couple of topics in order to cover them, especially within a 30-minute mark here. Today we'll be covering a few different things. Image scanning: building an image, containerizing it, and scanning it. Dependency checks: we want to validate any dependencies that are coming in and make sure there are no vulnerabilities associated with them. So if you have an out-of-date dependency, how do we make sure we're not introducing a new vulnerability? Static code analysis: this is where we're analyzing your code base across the board to detect patterns that might be vulnerable. Being able to understand what's happening in your code base and detect those patterns lets us say, hey, this is a vulnerable pattern, we should maybe validate this before going into production. And configuration checks: we're here talking about a lot of open source and a lot of large-scale deployments, and how have we been doing those lately? Through Terraform and infrastructure as code, Kubernetes YAMLs, and so forth. So we should be validating those configuration files, and if there's a vulnerability being introduced through a config file, make sure we're fixing it earlier on. So what's the typical workflow? A dev introduces some code, we build, run our tests, merge our PR, and then we go, hey, development is complete.
Let's pass it on to QA or maybe our release teams. And then there are a couple of validation checks, integration tests, and many other tests that happen on the QA team's side. The issue with that, though, is that sometimes those scans fail. I mean, raise your hand if you were already done with your development, you added a brand new feature, you were ready to deploy, and then you had to go fix your code because of an associated vulnerability that you didn't see until staging or pre-prod. I 100% have done that, and I was so mad and so frustrated because we were literally missing our deadline. And at the end of the day, as a developer, you try to hit those deadlines, you want to hit those sprints, you don't want extra work every other sprint just because something wasn't detected earlier on. And again, as developers, we don't really learn about a lot of these vulnerabilities at an early stage, not only because they're not being scanned for earlier on, but because we don't know about them. Like, I know what a SQL injection is, but what's cross-site scripting? Is that specific to an application that has a web interface? That's something we learn in the industry. You might learn it in college, but it's not the same type of learning as when you go into a production-level environment, you have to deploy somewhere on Kubernetes, and then you have a thousand microservices that are dependent on the one microservice you now need to fix. So what can we do to really improve that workflow? We want the developer to introduce their new code changes, build the job, and actually test it, but we also want to run some parallel vulnerability scanning at that time.
So based on the types of security checks we talked about earlier, we really want to scan containers for associated vulnerabilities, scan for dependency vulnerabilities, run static code analysis, and check for any infrastructure as code vulnerabilities. There are different tools that I've listed here, all from either the CNCF, the broader open source side of things, or things that GitHub also offers for free for open source developers. At that point, once those tests pass, we want to merge the PR, and then we deploy an image to Kubernetes or whatever environment you want. So let's actually go into a live demo here. The first thing I want to do in this demo is, let's see, do I want to zoom in a little bit? That might be a little more visible here. I want to create a pull request. As a developer, I'm adding a new database using some Terraform, and I want to add a new feature here and create a pull request for it. So I'm going to put my mic down and type that out. We're creating a pull request right now. Once that pull request is created, we see that a bunch of checks are automatically kicked off. I cannot merge this pull request until all those checks pass and I have a required reviewer. So somebody has to review this pull request; even if all checks pass, I cannot just automatically merge. It has to also be reviewed by a fellow teammate before we push anything to any production-level environment. While all this is getting kicked off, I'm going to go talk through all the different steps that I've automatically created. In this code base here, I'm actually using GitHub Actions.
You can use any CI tooling that follows the same type of workflow, if you want to use Jenkins or CircleCI and so forth. You can still use those tools; it's just following the same concepts. In this case, I have my build YAML. I'm automatically kicking off a build on every pull request and on every push to the master branch, or to any production branch. Here I'm running a make stork and a make container, so I'm building the project and building out the container for it. In the dependency review here, this workflow is specifically scanning for any new dependencies that are coming in during the pull request. I'm not scanning all dependencies and overwhelming the developer; I want to scan only what's coming in new at that point in time, to make sure we're providing the right information at the right time. And not only dependencies, but we're doing the rest of our security scanning too. In this case, I'm using CodeQL, which is a query language developed by the GitHub team that does an analysis of your code base, including a data flow analysis, to see if there are any vulnerable patterns. So I'm kicking off that job here. In parallel, I'm kicking off tfsec, which does our Terraform or Kubernetes YAML scanning. And then we're kicking off Trivy here, which will be doing our container scanning, so we see if there are any vulnerabilities associated with that specific container. I'm kicking all of these off at one time and making it easy for a developer to build out a really strong pull request process, to make sure that anything going into the main branch or production branches is ready to go. You probably don't want to run all these things at all times for all branches. We really want to focus on what we need to get to production level. You might want to run some of these earlier on in different branches.
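To make the shape of this concrete, here's a hedged sketch, not the exact workflows from the demo repo, of how the build plus the parallel scans described above can be wired up in a single GitHub Actions workflow triggered per pull request. The action versions, the `make stork` / `make container` targets, and the image name are assumptions.

```yaml
# Sketch: build + CodeQL + tfsec + Trivy, all triggered on every PR.
name: pr-security-checks
on:
  pull_request:
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # assumed make targets from the demo project
      - run: make stork && make container

  codeql:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: github/codeql-action/init@v2
        with:
          languages: go
      - uses: github/codeql-action/analyze@v2

  tfsec:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: aquasecurity/tfsec-sarif-action@v0.1.4
        with:
          sarif_file: tfsec.sarif
      - uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: tfsec.sarif

  trivy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: my-app:latest   # assumed image name
          severity: HIGH,CRITICAL
```

The four jobs have no dependencies between them, so Actions runs them in parallel, which is what makes the whole gate cheap enough to attach to every pull request.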
Maybe you have your smaller release branches and so forth, but maybe you don't want to require all of them there. So what I've actually done, if we go back to the pull request here, is create a set of required workflows. While this create insecureDB.tf, which is my new database that I'm adding, is still running, you can see there are three checks still running here. Let's go into one of the PRs already created. Here I'm adding support for an ID check in a webhook. This allows me to include some new code changes to incorporate an ID check associated with any webhook coming in. But we can see here, I'm co-developing with Grant Griffiths, and he also noticed that there's a SQL injection vulnerability here. So if I go to view the changes that I've introduced, I can see that within the new code, I'm actually introducing a vulnerability associated with that specific SQL addition: a database query built from user-controlled sources. All right, who really understood what that means? A database query built from user-controlled sources. I had to read it like three times to really understand it. You understood it? That's good, but I don't think everybody does. I understood it after I understood what I was actually reading, because the first time I actually did this, I was just kind of using some Stack Overflow code that I found, right? It's not always the case that you understand what's coming through. But what's really nice is that you get the short description, database query built from user-controlled sources, and from that I might understand it's a SQL injection. And if I click on Show More here, I can actually see a full-blown description, a recommendation, and examples of what a good approach looks like rather than doing it how I did it.
In this case, it's saying, hey, maybe use another library that already does this for you, so you don't have to reinvent the wheel. So as a developer, I can go through and understand what's going on. We can also see there are references associated with it, so we can cut down on the amount of Googling we have to do to find this information somewhere else. We can just look through these references. So this is a CodeQL analysis, and as I mentioned earlier, CodeQL does a data flow analysis. It's understanding everything from the source all the way down to the sink where the vulnerability is being introduced. The source here, we can see, is a request body coming in. And the sink is where the request body is actually being executed; therefore, this is exactly where it's being exploited, where the exploit is actually being introduced. If I see this, maybe I should do some sanitization at step one. Maybe I should do some sanitization at step two. But sometimes there are multiple paths to the same vulnerability. So what can we really do here? In this case, there are actually three different steps. As a developer, I now understand that this sink has multiple paths to being exploited. There's one source coming in, but there's also another source that could lead to the same exploit. So maybe I should do my sanitization at step three, or step one, to find the common denominator across both paths. With that in mind, I've done my code analysis, and if I go back into my pull request, I can see that my merge is blocked because the code security scans are failing. I'm making this a required workflow, so in order to merge our code, this check needs to pass. Until I fix the SQL injection, I won't be able to merge the code.
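This is not the code from the demo repo, but a minimal sketch of the class of issue CodeQL is flagging, "database query built from user-controlled sources", and the usual fix: keep the user input out of the SQL text entirely by using a parameterized query, so the sink never executes attacker-controlled SQL.

```go
package main

import (
	"database/sql"
	"fmt"
)

// Vulnerable: user input is concatenated straight into the query string,
// so the request body (the source) flows into the SQL text (the sink).
func buildQueryUnsafe(userID string) string {
	return "SELECT * FROM users WHERE id = '" + userID + "'"
}

// Safer: the query text is fixed, and the driver binds the value separately.
// Placeholder syntax varies by driver ($1 for Postgres, ? for MySQL).
func lookupUser(db *sql.DB, userID string) (*sql.Row, error) {
	if db == nil {
		return nil, fmt.Errorf("nil db handle")
	}
	return db.QueryRow("SELECT * FROM users WHERE id = $1", userID), nil
}

func main() {
	// A classic injection payload rewrites the unsafe query's meaning:
	fmt.Println(buildQueryUnsafe("1' OR '1'='1"))
}
```

Because the parameterized version fixes the query text before any input arrives, sanitizing at every individual source, which is hard when multiple paths reach the same sink, becomes unnecessary.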
So that's our code analysis, done specifically for more of a SAST scenario. If we go back to the pull request, we can see there are a couple of different checks here. This one is doing a validation where I'm switching the base image of my container to ubuntu:20.04. As a developer, I saw that this Ubuntu image is one of the later ones, so why not use it? So I updated it, went through, and created a pull request. When I go through my checks here and into my code scanning results, although I'm not failing everything, there are some alerts popping up. We can see there's one medium and 14 low. I only fail the check at the high level, because I don't want to block a developer from continuing their work over vulnerabilities that aren't very high. That can vary: if you want to block on medium, or on any type of vulnerability, you can, but I've noticed it's much easier to block on high or critical, because those are the ones that need to be fixed first. If it's medium or low, maybe we can fix those later on. In any case, we can see there are some more vulnerabilities associated with this image right off the bat. I don't have to go to another tool to find that result. And maybe I should update to a newer version of the Ubuntu image so I don't hit any of these vulnerabilities. So when I say I'm requiring certain things, let's go into branch protections here. In my branch protection rules, I've set a couple of different things. I'm requiring an approval on every pull request, so somebody has to come in and review my code changes rather than me introducing code changes that nobody reviews.
Pushing code without having somebody else validate it is not necessarily best practice. What I'm also doing is requiring some of those status checks. As I mentioned, we already talked about CodeQL, where we're doing an analysis of our code base. I'm requiring the Trivy scan, which is what we did for our container scanning. Then there's that run tfsec scan, which does the analysis of the Terraform code that I added. And we're also requiring the build stork container job, which is our build process. So every pull request to the main or master or production branches has to have these workflows running, so we at least have the right information as we're going into a production environment. Now that we've set these protection rules, let's continue on to one of our other requirements. In this example, I added a new dependency for Swagger API support. Swagger allows me to easily document all of my APIs, so why not add it and make it easier to provide the documentation that my teams or my customers need? So I added Swagger support. It's a great improvement, great for documentation, great. But one of my checks is still failing. What I'm seeing here is that dependency review is failing. If you're familiar with Dependabot, it does a scan across your dependencies to see if there are any vulnerabilities. But if I go into the files changed here, what dependency review does specifically is look through the new dependencies that you're adding into your code base and do the analysis against those. There's a rich diff display, which not many people know about because honestly it's very easy to miss, but it's such a helpful feature. With the rich diff, you can very easily understand what new dependencies you're adding into your code base.
Why am I even adding this one? Maybe I don't want to add it; maybe I should go with a better one. Or I'm knowingly adding Swagger, but this version of Swagger is actually a little out of date and vulnerable: it has a denial of service attack associated with it, and that's a high severity one. So I should probably update to patch version 1.2.6, and I make that fix before I get beyond anything else other than my pull request. Now I don't have to go back and validate this and see it later on in the development process. I'm able to fix it and just make that quick change. If I want to do it very quickly, I can open up a codespace, do it in my IDE, and just push a commit in. With all of those done, let's go back to our tfsec one. We can see here that a code scanning alert came up right in the comments. You can enable one of these capabilities as well. With code scanning, you're not only able to get the alerts right off the bat, you're also able to make comments, block the merge, and allow different types of integrations with your tooling, to make it easier for a developer to get these alerts all in one place. We can see the code scanning alert here actually threw several errors. So my insecure database .tf is probably not the best way to do this, and there's already information on why it's insecure. SSL should be enforced on this database connection: probably a good practice. Ensuring the database is not publicly accessible: probably a very good practice. These are all things we can fail the pull request on, or we can allow certain warnings to go through if you want and just fix them later on. But at the end of the day, as a developer, or if you have a new team member coming in, they will review this and say, hey, maybe fix this before we even go through the rest of the approval process.
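The dependency review check walked through above can be sketched as a small workflow of its own. This is a hedged sketch rather than the demo's exact file; the `fail-on-severity: high` setting mirrors the talk's policy of blocking only on high or critical findings.

```yaml
# Sketch: fail the PR only when a newly added dependency carries
# a high (or critical) severity vulnerability.
name: dependency-review
on: pull_request

jobs:
  dependency-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/dependency-review-action@v3
        with:
          fail-on-severity: high
```

Because the action only diffs the dependencies added in the pull request, the developer sees a short, relevant list instead of every known alert in the repository.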
All right, so now we've walked through not only the pull request process, but also how to incorporate a lot of the security items within the improved workflow. We did container scanning, dependency scanning, static code analysis, and infrastructure as code analysis. If I fix all of those and merge the PR, we're good to go. Now we can really deploy to our production or integration environment and we're happy campers. So what are some of the learnings from this? Providing transparency on the requirements and fixing the vulnerabilities in the PR makes it just so much easier to do the work we need to do anyway, earlier on in the process, with the right context. I can focus on continuous integration, so my builds always kick off automatically, and my integration tests and security scans all kick off in one PR. I'm able to get the results even before I merge any of my code. I don't have to wait to merge, run something else, go into another tool, come back to it, fix it later on, or create a GitHub issue or a Jira ticket or a Rally ticket and so forth to go back and fix the vulnerability. We can actually fix it in a couple of minutes. If we're talking about a PR, I mean, PRs hopefully take minutes, maybe two days; some PRs might take weeks, but a PR is still a time-bound process. Once you fix it in the PR, you're able to merge it. So it makes for a much faster experience to fix the vulnerability you're introducing within that PR, and it allows you to scope it down to your build process and avoid context switching. At the end of the day, we want to provide transparency. Rather than having to go to Confluence or a wiki page or somewhere else to see what we really require, we should just standardize those requirements.
We should make it easy for a developer to understand what's required and make it a set expectation. If something is required, we know it right off the bat with the automated process. At the end of the day, what we all really love doing is collaborating. It's easy to do things separately, it's easy to do things in a silo, but if it's also easy to collaborate, it's just a much better process overall. We also want to maintain that open conversation. When you create that pull request, you can see Grant in that previous message asking me to fix the SQL injection. If there's a security team member that knows way more than I do, which they probably do, about how to fix certain things, they can come in and comment and help, rather than just saying, hey, this is required, you need to fix it by yourself. We want to maintain that open conversation. We want to make fixing vulnerabilities happen in the right context at the right time, but also be easy. And at the end of the day, everything can be referred back to. You can go back to this issue, go back to this pull request, go back to the alerts, and see what has been done to make that fix, so you can share that knowledge easily with your team. If a new team member comes in and doesn't know how to fix a SQL injection, I can just point them to my old PR and say, hey, check this out, this is what I did. You can see all the commits associated with that branch. And that's all I had for today. Thank you. Open for questions. Any questions? That's a great question. You can enforce checks at commit time if you're running things on-prem and you have something like a pre-receive hook. You can do it with github.com too and enforce it at the commit. But at the same time, do you want to enforce it at the commit?
Because if we talk about commits versus PRs, a commit is when a developer is still thinking through the process and pushing things as part of their mindset of what the feature needs to be. The PR is when they're ready to listen to the updates, the changes, and the improvements they want to make. So enforcing at the commit is just going to be more of a hassle than an improvement, because it makes it harder to commit anything. If I'm still thinking through the process, I'm doing a bunch of commits until I finalize my code and my way of thinking, before I do my pull request, make that feature available, and am ready to receive feedback on all the changes. Yep, yeah, great question. Actually, I can just show you how I did it very easily. You can do this with other CI tooling, but if you are using GitHub Actions, if we go into the Security tab here, what I showed is all in the pull request, but you can also see a lot of the results in your Security tab. You can see I'm using three separate tools here, plus the dependency review scan that I'm including. But if I wanted to work with other tooling, I could just add more scanning tools. There's an API associated with it, so you can also do this through the API, but it's much easier if you're already using Actions. You can just look through some of the existing tooling that we already collaborate with, because we work with basically every vendor that does any type of security scan and wants to integrate into the PR. That's a great question. So for dependencies overall, Dependabot and dependency review understand all the vulnerabilities associated with anything that's listed as a dependency in your pom.xml or, if you're using Node.js, in your package-lock.json.
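On the "other tooling" point above: any scanner that can emit SARIF output can feed the same Security tab and PR checks. This is a hedged sketch of the step fragment, assuming a hypothetical scanner binary; the `upload-sarif` action is the piece doing the integration.

```yaml
# Sketch: two steps inside an existing job. The scanner name and
# flags are placeholders; only the upload step is the real mechanism.
- name: Run a third-party scanner
  run: ./my-scanner --format sarif --out results.sarif

- name: Publish results to code scanning
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: results.sarif
```

If you're not on Actions at all, the same SARIF payload can be sent through the code scanning REST API instead, which is the route other CI systems typically take.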
But if we go back here, to our Insights, we can go into our Dependabot tab. Dependabot will show you the vulnerabilities for all the dependencies it's detected. And under the Insights tab, it actually builds out a dependency graph. So for example, if you're using something in Kubernetes and there are thousands of other dependencies under that, you can drill down and see if there are any vulnerabilities associated with those too. It will inform you if we have access to that tree. Yeah, that's a good question. In this case, I'm focusing on dependencies that are first-tier dependencies, but if they're five or six layers down, we can see those dependencies within the dependency graph and associate alerts back to them. If we're able to build that dependency graph, then we'll be able to alert on those; if we're not able to build it, that's where it sometimes gets a little difficult. Yes, yes. That's actually a very, very great point. So, to repeat what you said: if we're using a dependency but we're not actually calling the specific function that's vulnerable, then even though the dependency is labeled as vulnerable, it doesn't mean we always need to patch. We probably should patch, but we don't strictly need to patch that specific application if the vulnerable function isn't the one we're using. So yes, that does throw a wrench into things.
So what we're doing, and it's something our roadmap and our product team are thinking through as part of the supply chain work, is trying to understand, since we can do that data flow analysis with CodeQL, how we can correlate the data flow information to figure out which functions you're actually using from a library. Not yet. That's actually what's being worked on currently. There's a lot of modeling being done internally to figure out the best way to handle that. But yeah, that is something we're working on, to make sure we're contributing back to supply chain security from GitHub's open source side of things. Yeah, so the question was, what is the experience of patching already vulnerable code? That's a tough one. We want to make that process a little bit easier than it is. One of the things we've done with Dependabot specifically, let's see if my screen doesn't fall asleep here, covers a couple of different things. That experience can be tough, and hopefully we can make it a little bit easier. It's not going to be the easiest process if you have a lot of code across multiple places. But what we're doing with Dependabot, if your code is sitting on GitHub, is informing you, hey, all of these have vulnerabilities in these specific locations. So for example, when Log4J came out, if you had Dependabot running across your code bases, we were able to tell you, at the top level of all the repositories associated with the organization or the enterprise, hey, this is a vulnerable dependency and here are all the files it sits in. So we're able to showcase that at the enterprise level.
Another feature you can use with Dependabot, and this is all also available for open source, not just for GitHub Enterprise users, is having it automatically create pull requests. Dependabot not only scans for vulnerabilities, it will also try to create a pull request to update the vulnerable version. In this example here, you can see that we're using the Azure go-autorest library on version 11.13, and we need to jump up to 11.28 to make sure we're not hitting that vulnerability. Dependabot actually created that pull request for us. It gives us the release notes associated with it, which in this case isn't that many, but it creates the pull request automatically and makes it easier for us to keep our jobs running. And we can see here, actually, a very good example: because I've automated everything, this pull request, created by Dependabot rather than me, already kicked off all the scans I wanted, and we can see that it failed at creating the Docker image. If it's failing at the Docker image, that means there's something wrong with the build process, so I need to go back and fix it rather than merging it right off the bat. And because that job is a required one, the merge is blocked. This goes back to making things a little bit easier. It's not the easiest process, but if you have an automated process that kicks off your build jobs and your security scans, then when a bot creates the pull request for that version update, it'll automatically also run all your tests. You have the answer right off the bat that we probably shouldn't merge, because it's already failing that Stork check.
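Turning on those automatic version-update PRs is a matter of adding a `dependabot.yml` to the repository. This is a hedged sketch, not the demo repo's actual config; the ecosystems and directories are assumptions about a Go project with a Dockerfile at the root.

```yaml
# Sketch: .github/dependabot.yml enabling version-update PRs
# for Go modules and the base image in the Dockerfile.
version: 2
updates:
  - package-ecosystem: gomod
    directory: "/"
    schedule:
      interval: weekly

  - package-ecosystem: docker
    directory: "/"
    schedule:
      interval: daily
```

Each update Dependabot opens lands as an ordinary pull request, so it flows through the same required checks and branch protections as a human-authored change.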
Awesome, I think we're out of time. We're way past time. Thanks everybody, great questions. Thank you so much.