So, Pathik Patel from Informatica is going to walk us through all their DevSecOps work. Thank you all for your patience.

Great, thank you for the introduction. Hi, everyone. Today we're here to talk about how Informatica has built DevSecOps practices into AWS with the help of Red Hat Advanced Cluster Security (ACS). A little bit about myself first. I'm currently with Informatica, where I've been for the past six years and a few months. Before that I worked for Netflix and Yahoo. These days my team is focused primarily on Kubernetes; we are a security engineering team. You can reach out to me on Twitter; Pathik Patel is my handle.

Now a little about Informatica. We are at 1.44 billion dollars in revenue as of today, and the company has been around for a long time, about 27 years, so many of you may have used it or heard about it already. Today Informatica offers the Intelligent Data Management Cloud, which our customers use for API integration, data integration, data quality, data governance, master data management, and many other things enterprises need as they build data-driven organizations. Our cloud processes 32 trillion transactions per month, and all of these services are hosted on Kubernetes. Our Kubernetes footprint is also multi-cloud: we use all three major clouds, AWS, Azure, and Google. But today we will focus on how we are using ACS within AWS.

To start, I want to give my opinion on what DevSecOps means to me. Everybody has their own opinion, shaped by how they run their organization and what their culture is like.
From my point of view, DevSecOps is when the development team, the security team, and the operations team come together and build their environment in a unified way, such that they are all moving at the same speed. That's the main goal, and it's the goal my security engineering team follows. As a security engineering team, we are there to help our development and operations teams ensure that we take care of security as one group rather than operating individually.

All of us operating in the cloud have heard that security is a shared responsibility, and DevSecOps enables that shared security mindset for everybody. The development team understands what the guardrails are and how those guardrails are interpreted in the development, build, and production deployment environments. The security team helps define those guardrails and then, together with the development and operations teams, helps implement them as part of shifting left. These are the gates we build so that when the development team writes code and the operations team pushes that code from build to production, they get continuous feedback on how their software is doing: which checks are passing, which are failing, and how to improve on them. That is my opinion of DevSecOps, and all of the next slides are focused on it: I want to enable my development organization to be successful from a security point of view.

Before we jump into the various practices, I also want to take a quick look at the lifecycle, how we see the DevSecOps lifecycle at Informatica. It is very similar to what we typically see in traditional DevOps. It starts with planning and development, but security adds a few checks or guardrails such as threat modeling, peer reviews, code reviews, and so on.
When the code gets committed, further down the road, this is where we do static application security testing, dependency management, secure pipelines, and so on. In security testing, we want to ensure that any code being checked in, whether it is Python, Java, or even Terraform or CloudFormation code, meets our security standards and complies with our definition of the guardrails.

Then we move on to build and test. This is the next phase, where security tools are integrated with your build systems, whether that's Jenkins, CircleCI, Harness, or any other build tool you are using. Here the security checks are extended to the next level: we also make sure that the artifact or binary being built meets our compliance and security requirements. We focus on configuration scans, meaning we verify how the binary is configured and what configuration it will carry with it into production.

Then comes ship and deploy. In this phase we run the same checks we defined in build and test, but as validation: is what we defined in the build environment the same in the production environment or not? And if not, how do we give our deployment team feedback on what's wrong, what the drift is, and what went wrong?

The last phase is monitoring. This is continuous monitoring: looking out for any drift in your running environment, running your vulnerability scans, and providing feedback to your development and operations teams so that the next build cycle, the next development cycle, is fixed and improved.

Before we go further, I want to take a break here to explain which tools we'll be talking about. The first one is Amazon EKS.
Informatica uses Amazon EKS very heavily as our managed Kubernetes engine. Then there is Amazon ECR, where we store our container images. We use Red Hat ACS, Advanced Cluster Security, to secure our Kubernetes environment and feed findings back into our development lifecycle. And JIRA is our ticketing and workflow management tool, which ties all of these systems together into one workflow and completes the feedback loop.

As we go into the details of DevSecOps, I want to start with some of the best practices we think about when deploying our Kubernetes environment; we'll cover each of these in the next slides. First, segregate your repositories into dev, staging, and production so that you have a clear delineation between what code is written in the development environment, what is built, and what is running in production. Second, use distroless container images. Traditionally, we have all taken publicly available operating systems such as Red Hat, CentOS, or Ubuntu, put our binaries or artifacts on top, and run those in production. But those systems come with a lot of burden, a lot of packages that are not necessary, so minimal images like Alpine Linux are very helpful here. We also use admission controllers heavily to secure our environment from the very first entry point. Enable your audit logs for Kubernetes; this helps from both a security and a debugging point of view. Get your security controls in place from both a build-time and a runtime perspective. And adopt a service mesh, for optimal routing and encryption.
A service mesh also aids your security visibility: which ports and applications are talking to which other applications and ports, and how the network traffic is encrypted and segmented. Lastly, implement CIS benchmark scans. There are three verticals to these: your containers (how your Docker containers are configured from the CIS perspective), your worker nodes, and your clusters.

We talked about segregating your repositories, so let me go a little deeper into what I mean by that and why. In the first bucket, we have the development environment. This is where a developer writes code and commits it into a container, building out a container image that can be used further along in the development cycle. When a developer does this on their own local desktop, we enable them with a local binary, a command-line utility provided by ACS, that our developers use to scan their local container images. Using this scan, they can understand their dependencies, which third-party libraries are used, and what vulnerabilities exist in those libraries. This is the early feedback system: looking at their own code and their own container, they can say, here is what I have written, here are my libraries, and this is what I need to fix.

Once they are happy and have patched all of their vulnerabilities, they put the image into our Dev repository. This is a labeling scheme we have implemented internally: developers check their containers into the Dev repository, and when they are ready to go into the QA cycle, those containers are promoted to the QA repository. Our CI/CD pipeline only accepts containers from this QA repository.
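The local, desktop-side scan described above can be sketched with roxctl, the ACS command-line utility. This is only an illustrative sketch: the endpoint, token, and image name below are placeholders, and it assumes roxctl is installed and can reach an ACS Central instance.

```shell
# Authenticate to ACS Central (placeholder endpoint and token).
export ROX_API_TOKEN="<api-token>"

# Scan a locally built image: lists components, third-party
# libraries, and the CVEs found in them.
roxctl image scan -e "central.example.com:443" --image "myapp:dev"

# Check the same image against the policies defined in Central;
# a non-zero exit code can be used to fail a local or CI build.
roxctl image check -e "central.example.com:443" --image "myapp:dev"
```

The `image check` form is what makes the feedback loop enforceable: wiring its exit code into the build means a policy violation stops promotion rather than just reporting it.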
So all of our build pipelines accept images only from the QA repository, and they build the environment from those. In the CI/CD pipeline, various YAML files and policy files are defined alongside the container images, and from these the final build is produced. That build is also verified using CIS scans and Red Hat ACS scans, and once it meets the requirements, it is checked into the staging environment. QA goes through multiple cycles, and once the final release candidate is published, it lands in staging. That final release candidate is then used to deploy into our production Kubernetes clusters.

To put the whole lifecycle into perspective: our containers go into Amazon ECR, which is where our dev, QA, and staging repositories live. Jenkins, the tool we use for our build pipeline, picks images up from ECR and builds the artifacts and binaries. Those binaries are scanned by Red Hat ACS for vulnerability and CIS compliance issues. Any vulnerabilities or compliance issues are collected into JIRA, where tickets are created in the QA project so the QA team can take a look, validate them, and request bug fixes. As I mentioned, multiple QA cycles run, and afterwards the release candidate is promoted to the staging environment.

Going back over build-time security: as I mentioned previously, distroless images are a very important factor here. At Informatica we heavily use Alpine Linux as our distroless base, and that applies even to some legacy tools and applications where we have a mandate for a Red Hat or CentOS type of operating system.
We have taken that operating system image and reduced it down to the minimum we need. When I say reduce, I mean removing any package managers, network utilities, filesystem modification utilities, all of what we call the bloatware that is not required to run your container. We remove it from our environment and have slimmed those images down to around 80 MB, so we are running a minimal operating system.

Then, as I mentioned, enable tooling so developers can scan their code and containers on their local desktops, and provide integrations that help them build out a pipeline, so they are encouraged to look at issues during their development cycle. We integrate ACS with Jenkins CI/CD for continuous feedback at build time: scanning at dev time, scanning on promotion to QA, and scanning in the CI pipeline as well, as in the pipeline design we discussed. All of this is only possible if you put a strong lock around where your build system accepts images from. We lock it down to the compliant repositories only, achieved by specifying which labels are accepted at build time.

Now, moving on to runtime security, this is a very similar setup using Red Hat ACS. Wherever container images get deployed into Amazon EKS, the Red Hat ACS sensor is deployed in each one of our clusters; this is part of our orchestration layer, which installs the ACS sensors on all clusters. Using those sensors, we implement continuous scans that do multiple things. The first is vulnerability scans. The second is CIS scans, both for the Docker images and for the Kubernetes environment.
The third thing it checks is the various policies we have defined in Red Hat ACS for measuring the risk of a particular deployment. On top of that, we also have custom policies defined in our admission controller that enforce various RBAC- and privilege-related rules, and that ensure no vulnerable container gets deployed into the production environment. The admission controller also makes sure our deployments come only from the prod repositories, and that nobody can deploy from open repositories outside of Informatica's purview. There are many open repositories available, from Docker, Red Hat, Quay, and so on, and many vendors maintain their own repositories as well, but at Informatica we block those to ensure that only approved code goes into production. This is achieved using admission controller policies.

Among the specific runtime controls we have implemented, the first is enforcing namespace usage. Namespaces allow us to build segregation between products and business entities, and that segregation ultimately helps enforce role-based access control between pods and across applications. The same namespace segregation is used to define network policies and service mesh policies: which namespaces and pods can talk to which other namespaces and pods. Along with that, the admission controller is our central enforcement point for all the guardrails we have defined: pod security policies, the requirement that deployments have network policies associated at deploy time, and how API servers can be used — whether they can be reached from external clients, how they are used from the internal environment, and who can query them. All of this is put into the admission controller.
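As a minimal sketch of the namespace-scoped segmentation described above, a Kubernetes NetworkPolicy could restrict ingress to pods from the same team's namespaces. All names and labels here are hypothetical, not Informatica's actual configuration:

```yaml
# Hypothetical policy: pods in namespace "product-a" accept ingress
# only from pods in namespaces labeled team=product-a.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-team-only
  namespace: product-a
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              team: product-a
```

An admission controller can then require that every deployment lands in a namespace covered by such a policy, which is the "deployments must have network policies associated" guardrail in enforceable form.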
Lastly, there is monitoring the configuration to detect CIS compliance failures. One might assume that once a Kubernetes environment is deployed, Kubernetes itself will keep monitoring it and self-heal. But many times we have seen this self-healing fail due to local changes made by a Kubernetes admin, and from a security point of view we want to detect those changes and report them back to our operations team. This configuration monitoring also allows us to watch for rogue containers. People are surprised that rogue containers can get into a Kubernetes environment at all when most things are managed by the orchestrator or your deployment system. But even with all of that, we have seen deployments that don't do a proper cleanup, and many times we have found rogue containers that were used for testing or temporary purposes and then never got cleaned up. Configuration monitoring allows us to detect that type of rogue container.

So we have build-time and runtime security, but at the end of the day we have to figure out what we are achieving: how do we get a single pane of glass? For that, we have defined a risk lens that tells us the risk profile of each deployment in our running Kubernetes environment. The risk lens has multiple verticals. The first is images: are these approved images coming from the approved repositories, or are there rogue images running in the environment? Then access control: who are the admins, and what access does each container have in the Kubernetes environment? Next is network policies: who can talk to whom?
Pod security policies are also part of this. And then cluster security itself: how are the clusters configured, what CIS parameters are set for them, and are they meeting those parameters or not?

Having defined those guardrails, we codified them in Red Hat ACS. Here I have brought up a sample risk identified by ACS. If you look closely, it lists all the policy violations. One of our policies says that if a container has any vulnerability with a CVSS score greater than seven, it should not get deployed; and if it is deployed and a vulnerability is detected afterwards, that is a policy violation and we need to flag it. Similarly, here is a process running with UID zero, which is an example of an RBAC policy failure. And here is a secret mounted as an environment variable, which is a big no-no from CIS, so it is a CIS policy failure as well. These are the kinds of policies we have defined, and ultimately the policy failures roll up into the risk, and that drives the priority. Based on that, we have defined priorities that allow our developers and operations teams to prioritize their workload and figure out which items to fix first. A good example: the priority goes higher if a fixable vulnerability with a CVSS score greater than seven is present, but a Docker CIS failure may not be prioritized as highly, because not all of the Docker CIS controls are applicable or expected to be met in our environment, so we put those at a lower priority. Based on these priority decisions, we feed the data back into our development teams' lifecycle using JIRA tickets.

Lastly, the DevSecOps takeaways I want to mention. The first is: define your guardrails and document them.
The guardrails ultimately help everybody be on the same page. When you define and document them, it is very likely your development and operations teams will go through them to understand how each guardrail will be implemented at the different stages of the DevSecOps lifecycle: whether it applies at the code commit level, the build level, or the ship level. Once documented, it is pretty easy to put them into ACS. Most of the guardrails we have defined go either into ACS or into admission controllers, and using those two systems we are able to ensure that all of our Kubernetes deployments meet our standards and requirements.

Once you have the guardrails, one of the main advantages is that you can use them to provide feedback to developers. This is very important: build your own system, using whatever ticketing system you have, so that issues identified by admission controllers or Red Hat ACS go back into the development lifecycle. This allows security to shift left and reduces security issues in the production environment. And when I say reduction, one major thing we have experienced is that Kubernetes amplifies your problems. In the past you might have run a thousand virtual machines for your products or applications; now those thousand virtual machines become thousands of containers, so all your vulnerabilities and issues are multiplied by thousands. That is important to keep in mind, and to control it you want a broader mindset at work: when you provide early feedback to developers, they share your load; they work with you to fix security issues early on. And at the end, build your automation to provide reporting.
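One piece of such automation could be the priority roll-up described earlier: fixable findings with a CVSS score of seven or higher outrank Docker CIS benchmark failures. The sketch below is purely illustrative; the field names ("kind", "cvss", "fixable") are hypothetical and not the ACS API.

```python
# Toy model of priority scoring for policy violations.
# Higher bucket = more urgent; a deployment inherits its worst violation.

def violation_priority(violation: dict) -> int:
    """Map one policy violation to a coarse priority bucket."""
    if violation.get("kind") == "vulnerability":
        if violation.get("fixable") and violation.get("cvss", 0) >= 7.0:
            return 3   # fixable high/critical CVE: fix first
        return 2       # other CVEs
    if violation.get("kind") == "docker-cis":
        return 1       # not all benchmark controls apply: lower priority
    return 2           # default for other policy failures (RBAC, secrets, ...)

def deployment_priority(violations: list) -> int:
    """A deployment's priority is the highest among its violations."""
    return max((violation_priority(v) for v in violations), default=0)

violations = [
    {"kind": "docker-cis", "control": "5.10"},
    {"kind": "vulnerability", "cvss": 9.8, "fixable": True},
]
print(deployment_priority(violations))  # the fixable CVSS 9.8 finding wins
```

The same score can drive both the JIRA ticket priority and the dashboards mentioned next, so developers, QA, and leadership all rank the work the same way.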
It is certainly a good idea to put your issues in JIRA, but you also want to build out your reporting system, your dashboards and charts, so that you can give routine feedback to leadership on how you are doing as a security-minded organization. So yeah, that's the final slide. Any questions?

Yes, awesome. We have one question from the virtual side of the house: are third-party OCI images being signed after scanning?

Yes, our third-party OCI images are signed. As I mentioned, we bring all of our images into our own ECR, and before check-in to ECR and during the storage process, we sign them. We have our own notary that is used to sign those images.

Thank you. All right, anyone else have a question? Here we go.

Hi, I have a question for you. Sometimes there are containers, like init containers, which come and go very fast. Will ACS also scan those?

Those init containers probably won't get scanned at runtime; one reason is that we have scheduled the continuous ACS scan every four hours. So during runtime they won't get scanned, but they certainly get scanned during build time, so we know their dependencies and their configuration from the Docker CIS point of view. At least we have validation during development time and build time.

Okay. So you're saying that even before the container runs, the scan happens at build time and catches it if something is wrong?

Correct. And if that check doesn't pass, it won't go further; it will fail there.

Okay, thank you.

All right, one more from the ether of the virtual world: is there a plan to migrate from pod security policies to the pod security admission controller or something similar?

Yeah, that's a great question.
We haven't yet; we have invested a lot into pod security policies as of today. Network policy is something we are looking at, but as of right now, no. It's mostly an investment issue: we have invested resources there, so we don't have time to move to the next thing right now.

Could you re-explain the pipeline you were talking about when you said you had to create separate repositories? Are those repositories created beforehand, or are they created during or after the workflow?

Yes, they are created beforehand. We also have a document that outlines the tags and labels that should be used, a tagging system, so repository names are specific, based on our naming standards. Does that answer it?

That was a good answer. All right, we're going to wrap it up here. Thank you very much for coming. That was awesome.