Hi everyone, thanks for joining this session. We will talk about a smart control plane for CI/CD pipelines. I'm Shripad, Shripad Nargora. I'm a software architect, currently leading multiple initiatives around software supply chain security, primarily around how we operationalize SBOMs and how we secure our CI/CD pipelines. I've given a few talks on this earlier, so you can look into those as well. And joining me is Ashish. Hi, thanks, Shripad. My name is Ashish Bhattwara. I'm a technical evangelist, experienced in leading multiple teams and working across cloud, software, storage, networking, and analytics. I've played several roles: product management, developer, manager, et cetera. So, let's get going. I'm going to cover the background of what we are doing, and then Shripad is going to walk through the actual demo, where you will see how this smart control plane works. So, let's talk about CI/CD. Of course, I'm not going to cover CI/CD itself — it is not a new topic. There are many tools available in the market, like Jenkins or Tecton, that can be used to set up a pipeline that automatically triggers the build, the testing, the analysis of the code, or even deployment to the target platform. Our approach extends these tools by providing a smart control plane: multiple pipelines and multiple users managed by a single control plane, giving you complete visibility into different projects depending on your role and your requirements. By smart, I mean automated recommendations that consider the projects, the user roles, and the various artifacts related to each project. Before I go into detail, let's talk about who the participants are. There are several user personas, or actors, for a CI/CD pipeline.
I'm going to cover just four of them, but developers, QA engineers, release managers, your InfoSec and security engineers, the CISO — all of these are actors, or personas, for the CI/CD pipeline. Today I'll focus on four major ones. First is the developer. The role of the developer is to build the code and fix the issues — that's pretty much it. Here, Jen is the developer, just for the name's sake. Rohit is a DevOps engineer, whose responsibility is to manage and maintain the pipeline. Then comes our CISO, our Chief Information Security Officer, whose responsibility is to make sure the required regulatory and compliance standards are met. So the developer mostly looks at the initial stages of the pipeline, while the DevOps engineer and the CISO look at the later stages. The fourth, and an important one, is Matt, the InfoSec engineer. The InfoSec engineer designs, implements, and manages the different security controls and security tools, to make sure the developers and the whole pipeline have the required tools and controls. Having covered the personas, let's talk about the different types of CI/CD pipelines we may have. One is the shared pipeline, where multiple users use the same pipeline. Then there is the modular pipeline, where you can have customized controls depending on the project — a variation of the shared pipeline. And then there is the monorepo pipeline, where you have a single repository but multiple projects get built out of the same pipeline. One thing is common across all of these: the goals.
Whatever the type of pipeline, they share the same goals: ensuring efficient, high-quality, error-free software delivery; ensuring security and compliance; and optimizing resource utilization. No matter what, you will have these common goals. Now, these actors or personas have to talk to each other, and coordinating and communicating with each other is a complex task. On top of that, you have different pipelines — shared pipelines — which adds complexity. And on top of that, tracking all this coordinated communication is a challenge. What we are trying to build is an approach that helps address these things. So far I have talked about the personas; now let's talk about the systems involved. There is the GitHub repository, which developers access to put in their source code and their fixes. Then, for the DevOps engineer, the pipeline-related configuration and the tool-related information also need to go into the source code. Then there is the security tools marketplace, where you need to make sure you have the right security tools for different project types — and that information also flows into your source code, because it is what allows you to build the actual pipeline. And at the end of the pipeline, you get all the compliance reports and the various other artifacts and reports that our compliance officer looks at to see what's happening. So you can think of it as: on one side the personas, on the other side the different systems — and they all have to talk to each other. That's the challenge we are trying to solve here. Connecting these actors and tools is a complex and error-prone task. It's not easy. And on top of that, having multiple pipelines, with multiple components and multiple user roles, makes it even more challenging to manage. That is what we are trying to show here.
Managing multiple pipelines is very complex and error-prone. So we've talked about the tools, or the systems, and we've talked about the personas. Now, if I look at it from the CISO's point of view, what do I need? I need insights: what types of applications is my organization running? Are they microservices or not, Go-based or Python-based? As a CISO, I need all of that information. I also need to know what security controls all of my organization's applications have. And on the other side, I need control: how do I make sure all of these things are running in compliance? As requirements change or new projects come in, how do I make sure I have deployed the right tools and everything stays compliant? For the DevOps engineer, the concerns are ease of maintenance and automation. Ease of maintenance means: how can I efficiently manage multiple pipelines? How do I keep track of requirement changes? And how do I automate the common tasks? Are these statically defined and managed pipelines really helping us? Because today, each project has its own pipeline; they pull in the tools, et cetera. Are they really helping us? And as a developer, I want to insulate myself from everything else. I just want to write code, fix bugs, and be done with it — nothing more, nothing less. How can I ensure a high velocity of code deployment? That's what I need as a developer. So now we'll talk about how we are solving this problem, and Shripad will take us through that journey. Thanks, Ashish. So this was really interesting, right, as Ashish illustrated.
One problem, as we mentioned, is statically defined pipelines. Today we have so many pipelines. Why? Because they are driven by different parameters. I have a different pipeline when I create a pull request, a different pipeline when I commit a change, a different pipeline for my production workload, a different pipeline for staging. And this is essentially creating the management problem. So when we started looking into the solution space, we went to the whiteboard and asked: what do we want? We thought, OK, we want some magic wand that can create a single control plane where multiple actors interact through a common platform. What do I mean? For instance, as a CISO, when I decide that, going forward, I want to implement a new security control for my organization, I communicate, or feed, that decision to this magic control plane. And the control plane needs to be able to create an actionable plan: who are the actors affected by this decision? I've decided to implement a new security control — so what tools are required to implement it? Someone has to discover those. And then the DevOps engineers need to go and implement them in the pipeline. Those are essentially the interactions we are looking to automate here. The second thing is, as we just discussed, we already have a plethora of systems. For storing source code, we have GitHub, GitLab, Bitbucket. For running our pipelines, we have Jenkins and Tecton. We are not trying to create yet another system; this platform is built on top of them. We want all these existing components to be plugged into the system, so we can effectively manage them.
So at a high level, this is what we want: a single control plane that can achieve this goal for us. Then we asked: what are the major principles? There are three. One: a common platform where multiple actors can share responsibility and manage actions. When I make a decision, I don't have to explicitly communicate it to all the respective or affected parties. I just feed the decision to the system, and it communicates and manages the after-effects of that decision. Two: automation with dynamic decisions for pipeline execution. As I mentioned, one problem is that we have too many pipelines. What if we had zero pipelines? A pipeline gets created only when it is needed. When I need a pipeline, I look at the context — what event is creating it, in what context is it being invoked — and I automatically compose the pipeline, and that composed pipeline gets used for execution. So we don't have to store, manage, and version-control pipelines. Pipeline-as-code — maybe we can just get rid of that. And finally, three: pluggable interfaces. As I said, our objective is not to invent yet another system, but to provide a platform into which the existing systems and technologies can be easily plugged. Those are the principles we set for ourselves. Then we said, OK, let's go build this system — what can it look like? At a very high level, it has two parts: pipeline execution and pipeline management. Pipeline execution covers all the components that get involved when you start executing a pipeline — when I create a pull request, a pipeline gets executed; all of that comes under pipeline execution.
Pipeline management is what happens in the background, or out of band, and affects or influences pipeline executions: I set a policy, the CISO makes a new decision, a new tool arrives, or some tool gets deprecated. All of that comes under pipeline management — it happens in the background and influences the pipeline executions. So what do those components look like? SmartFlow is what we call this platform — the magic wand I showed you. It is based on a set of APIs that we are still expanding. One component is the sec-tool crawler. What does it do? There are a bunch of marketplaces where security tools are available, and these tools are specific: when I want to do, say, a vulnerability scan, I have a set of tools in the marketplace — paid ones like Snyk, or open source ones like Trivy that I can just use. The sec-tool crawler goes out and crawls Artifact Hub, the GitHub Marketplace, and the other marketplaces where these tools are available, and it works out which tools are applicable to which kind of artifact in which context — containers, monolithic applications, microservices — and it builds a catalog internally. Then we have the policy manager. Our objective was to automate various actions, and when we have actions, we need some way to communicate our restrictions — that is the policy manager. And then the pipeline configs. Even though pipelines are created dynamically, there are some configurations: if I'm using a proprietary tool for some security scan, I need to feed it secrets, or API endpoints. These settings are common — not specific to a single execution, but the configuration of those particular tools — and they need to be fed in.
We have a centralized database where all these actions, events, and metadata are stored. And of course, the event stream. Whenever one component changes, SmartFlow works out the action plan and communicates the events: the developer needs to do this, or these are the things we need to actuate for that particular persona. This is all in the management plane — happening in the background, out of band. Then there's execution. Take GitHub: as a user, I make a change. We built a trigger handler, so that action doesn't directly hit your target CI system, like Jenkins or Tecton. Our trigger handler intercepts the request, parses the event, and determines the context: who is the user, what kind of event is it? Based on all the management work we have done — what kind of pipeline does this user need, what are the desired controls — we dynamically create a pipeline definition, and then we start executing that pipeline. This avoids a lot of overhead, in the sense that, even as a DevOps engineer, my responsibility is just to provide a concrete pipeline configuration. If a newer version of some security tool becomes available, or some tool is deprecated, I don't have to go and change the thousands of pipelines in use across my organization. We are trying to remove that bottleneck and drive efficiency. So let me quickly show you a demo. Right now I'm showing it through the command prompt — this is a very active project, and we are building a UI framework on top of it. Let's say I'm a developer and I register a repository: I say, OK, SmartFlow, register this particular repository for me. It tells us what kind of project it is and what its role is.
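The dynamic composition the trigger handler performs — intercept the event, derive a context, compose a pipeline definition on demand instead of loading a stored one — could be sketched roughly like this. All names, fields, and stage lists here are hypothetical illustrations, not the actual SmartFlow API:

```python
# Hedged sketch: a trigger handler that composes a pipeline from an
# event's context rather than reading a stored, versioned definition.

REQUIRED_STAGES = {
    # security controls required per artifact type (would come from
    # the policy manager in the real system)
    "go": ["static-scan", "vuln-scan"],
    "python": ["static-scan", "vuln-scan", "sbom"],
}

def derive_context(event):
    """Extract who/what from the raw webhook event."""
    return {
        "user": event["sender"],
        "kind": event["type"],  # e.g. pull_request, push
        "artifact": event["repo"]["language"].lower(),
        "target": event.get("target", "staging"),
    }

def compose_pipeline(ctx):
    """Build a pipeline definition on the fly; nothing is stored."""
    stages = ["checkout", "build"]
    stages += REQUIRED_STAGES.get(ctx["artifact"], [])
    if ctx["kind"] == "pull_request":
        stages.append("unit-test")
    if ctx["target"] == "production":
        stages.append("deploy-approval")
    return {"name": f"{ctx['user']}-{ctx['kind']}", "stages": stages}

event = {"sender": "jen", "type": "pull_request",
         "repo": {"language": "Go"}}
pipeline = compose_pipeline(derive_context(event))
# A pull request on a Go repo gets checkout, build, the mandated Go
# scans, and unit tests — but no production deploy-approval stage.
```

The point of the sketch is that the definition never exists until the event arrives, so a changed policy takes effect on the very next run without touching any stored pipeline.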
Is it a sandbox, or is it going to production? Once that's done, our SmartFlow engine goes and discovers the various artifacts for this repo — is it a microservices repo, what kinds of artifacts are used — and feeds that into our database. As a developer, that's my role, and that's essentially all that's required of me. Now, as a CISO, I can list all the features defined here. Or I can first query the repositories: show me all the repositories in use in my organization. It tells us: these are the repositories, these are the artifact types, these are the kinds of applications. In a typical organization, just seeing this list is not going to be helpful, because it easily runs into the thousands — in our organization it's tens of thousands, close to 100,000 repositories. But then I can ask: what are the security scan features I have defined for my organization? That's my responsibility — if someone is using Go, they need a static scan and a vulnerability scan. What kind of artifact does each apply to, and what is its importance: critical, important, or optional? These are the security policies I define as a CISO. Then I can ask: what security policies am I missing? My developers are using some artifact, there should be security policies for it, but I'm not covering that kind of artifact. And it says: some developers are using this DataAppTradent library, an artifact called Mostach, which is related to storage. It is used in production, but you don't have any security policies defined for it. As a CISO, I can decide this is fine — I can do a break-glass approval and say, OK, this is approved, I don't have any concerns here, and we can go ahead.
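The CISO's "what am I missing?" query boils down to comparing the artifact types in use against the artifact types the policies cover. A minimal sketch, with made-up repository and policy records standing in for the real database:

```python
# Hedged sketch of the policy-gap query: flag artifacts in use that
# no security policy covers. Data shapes are illustrative only.

repos = [
    {"name": "payments", "artifact": "go", "env": "production"},
    {"name": "data-lib", "artifact": "storage", "env": "production"},
]

policies = [
    {"scan": "static-scan", "artifact": "go", "importance": "critical"},
    {"scan": "vuln-scan", "artifact": "go", "importance": "critical"},
]

def missing_policies(repos, policies):
    """Return repos whose artifact type has no policy defined."""
    covered = {p["artifact"] for p in policies}
    return [r for r in repos if r["artifact"] not in covered]

gaps = missing_policies(repos, policies)
# The storage artifact is in production with no policy, so it gets
# surfaced to the CISO, who can define a control or break-glass it.
```

The same set-difference idea drives the periodic events mentioned later: each new registration can re-run the query and emit an event only when the gap set changes.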
Now, as an InfoSec engineer: my CISO has defined some policies, my developers are using certain things, and I need to know — are there any gaps in the tools available in my organization? Say there are no gaps right now. Then suddenly a developer in my organization decides Rust is the new shiny thing and needs to build his new application in Rust. He registers a new application — built in Rust, using NAICS. The CISO is unaware of this; the CISO doesn't know there is a new artifact, or that new security controls need to be defined. So, periodically, our system creates an event: there seem to be some applications written in Rust being used by your developers and deployed in production, but you don't have any security controls defined for them. Now, as a CISO, I need to add security controls. I add a control that all Rust code must be statically scanned, and it is critical — it has to be done. I add this. Now the smart control plane creates an event for the InfoSec persona: there is a CISO policy defined, but no tool defined for it here. And here is the smartness we are talking about: we are not only reporting the gaps — what is missing — we are also telling you which tools are available. Remember the sec-tool crawler, which crawls, finds the applicable tools, and maintains the catalog? It now recommends tools to the InfoSec engineer: there is a gap, you don't have a tool for Rust, but here are candidates like Cargo Audit — this particular version, is it free, what is its license?
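The recommendation step is essentially a lookup against the crawled catalog: match the uncovered policy's artifact type and scan type against what the crawler found. A sketch, with an invented catalog shape and tool entries used purely for illustration:

```python
# Hedged sketch: recommend candidate tools from the crawled catalog
# for a policy that has no implementing tool yet. Catalog entries and
# field names are illustrative, not the real SmartFlow schema.

catalog = [
    {"tool": "cargo-audit", "artifact": "rust", "scan": "vuln-scan",
     "license": "Apache-2.0", "free": True},
    {"tool": "clippy", "artifact": "rust", "scan": "static-scan",
     "license": "MIT", "free": True},
    {"tool": "trivy", "artifact": "container", "scan": "vuln-scan",
     "license": "Apache-2.0", "free": True},
]

def recommend(catalog, policy):
    """Match catalog entries against the uncovered policy."""
    return [t for t in catalog
            if t["artifact"] == policy["artifact"]
            and t["scan"] == policy["scan"]]

policy = {"artifact": "rust", "scan": "static-scan",
          "importance": "critical"}
suggestions = recommend(catalog, policy)
# The InfoSec engineer sees the matching Rust static-scan candidate,
# its license, and whether it is free, then approves or rejects it.
```

Keeping license and cost metadata in the catalog is what lets the recommendation answer the "is it free, what is the license?" questions without another marketplace round-trip.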
As InfoSec, I can say, OK, this looks good — I approve this particular tool. And that approval goes ahead and triggers the workflow for pipeline creation, which we are not showing here; we'll touch on it briefly in a minute. But this is essentially the crux. I showed you the management workflow: how we manage the information flowing between the different actors, how we automate it, and where the smartness we are building into the system comes from. Let me go back to the slides. Now, there's a question everyone is probably thinking about: we have been hearing about "one pipeline" and pipeline templates — how is this different? One-pipeline approaches and pipeline templates try to standardize pipelines inside an organization. If I have 100 developers and they each build their own pipeline, I end up with 100 pipelines of different shapes and sizes. With one pipeline template, my 100 developers all use the same template, but I still have 100 pipeline instances to manage — the only difference is that their shape and size are the same. If I want to change something — some configuration, or a tool needs to be added — I still have to go and update all 100 pipelines. That's the catch. We are not saying everyone should use the same pipeline; we are saying the pipeline will be created for you when you need it. You don't have to manage it. Another point — as we saw earlier, and as I think Christy mentioned in the keynote today — a lot of pipeline auto-generation is being baked into tools like ChatGPT and Bard: you can just say "build me a pipeline" and it starts generating pipeline definitions for you. But again, as we discussed in the keynote:
we cannot just take these definitions as is. Even the description says this is a starting point. You need to take it, validate it, understand it, and then put it to use; you cannot put it directly into use. Those are the things we are currently exploring for auto-generation of pipelines, and for putting it all together into our framework. So, call for action: this is an open source initiative. We are looking for feedback, and for technical advisors and contributors. We are currently running pilots for integrations into existing solutions: how do we integrate with Tecton, with Jenkins, with the marketplaces? How do we make this pluggable, get feedback from the respective vendors, and put these things into action? With that — if you have any thoughts or feedback, please feel free to connect with us on GitHub or Twitter. That's all we have for today. Thank you. Yeah — yeah, we started with one implementation. This is something we started before ChatGPT; we didn't know we would have a tool that could generate pipelines. What we started with was templates for GitHub workflows, GitHub Actions. We created templates and templatized everything: the name, the tool, where we pull the secrets. Once we get an event, we populate those templates — dropping the stages we don't need, and so on. Because this is a deterministic process: how do we build a pipeline? Configuration is automation — you call two APIs and the pipeline gets configured. That's how we started. And then we started seeing promising results — we were getting a good starting point.
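That first template-based implementation could be sketched roughly like this, using Python's standard `string.Template` as a stand-in; the template body, field names, and the chosen action are illustrative, not the actual templates:

```python
# Hedged sketch of the initial approach: a stored GitHub-workflow
# template populated with values once the triggering event arrives.
from string import Template

WORKFLOW_TEMPLATE = Template("""\
name: $pipeline_name
on: [$trigger]
jobs:
  scan:
    steps:
      - uses: $tool_action
        with:
          token: $${{ secrets.$secret_name }}
""")

def render(event):
    """Fill the template from the event; $$ escapes the literal $
    that GitHub's secrets expression syntax needs."""
    return WORKFLOW_TEMPLATE.substitute(
        pipeline_name=f"{event['repo']}-{event['type']}",
        trigger=event["type"],
        tool_action="example/vuln-scan-action@v1",  # picked from catalog
        secret_name="SCAN_TOKEN",
    )

workflow = render({"repo": "payments", "type": "push"})
```

Around 20 such templates, one per tool, selected and populated at event time, is the deterministic version of the composition step; the drawback, as noted next, is that the templates themselves become the new thing to maintain.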
We realized we don't need to keep creating and maintaining these templates ourselves — we don't want to go from managing pipelines to managing templates, just shifting the problem. That's what we are now exploring. The template-based solution was good: we had a bunch of templates, around 20, one for each of the different tools, and we could select among them. Whenever we identify a new security control, DevOps goes and creates a template for it, and at runtime we automatically select that template and populate it with values. That approach worked for us and was good for the initial pilot. But right now, as I said, we are exploring whether we can build on top of the starting points that these GenAI features provide. On CD — we've had a few thoughts on CD and deployment too. The challenge there is that the configurations are quite diverse. The approach would be the same: we need to abstract out the differences, if we can model them in some way. Deploying on Kubernetes is completely different from deploying on some server; it depends on the target. Yeah — exactly, CD is different. Yeah, absolutely. So right now we have auto-discovery; we don't ask the developer to specify anything. As a developer, I just register. Because if we put everything on the developer and they make a mistake, then whatever the developer defined becomes our baseline. Instead, with a lot of automation, we can discover those things ourselves. As I showed, when you register, SmartFlow goes in and discovers: is it a microservice? Does it have a Dockerfile? — Yes, exactly. And developers have other things to do anyway.
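The discovery step at registration — infer what the repository is from its contents rather than asking the developer — could be sketched like this; the marker files and heuristics are illustrative, not the actual SmartFlow logic:

```python
# Hedged sketch: classify a newly registered repository from its
# files so the developer never has to self-declare (and can't
# mis-declare) the artifact type.

def discover(files):
    """Infer artifact traits from a list of repository file paths."""
    traits = {"artifact": "unknown", "containerized": False}
    if "Dockerfile" in files:
        traits["containerized"] = True
    if "go.mod" in files:
        traits["artifact"] = "go"
    elif "Cargo.toml" in files:
        traits["artifact"] = "rust"
    elif "requirements.txt" in files or "pyproject.toml" in files:
        traits["artifact"] = "python"
    # A containerized repo with a recognized build is treated here as
    # a microservice hint -- a deliberately crude illustration.
    traits["microservice"] = (traits["containerized"]
                              and traits["artifact"] != "unknown")
    return traits

traits = discover(["Dockerfile", "Cargo.toml", "src/main.rs"])
# A Dockerfile plus Cargo.toml classifies this as a containerized
# Rust repo, which then flows into the policy-gap checks.
```

This is also why the Buildpacks comparison in the next answer fits: Buildpacks already performs exactly this kind of repository inspection, just with build as the goal rather than classification.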
Exactly, yeah. The idea, again, is to leverage as much as possible — we don't want to reinvent the wheel. So Buildpacks we use where it fits, and that was the motivation: Buildpacks already has that discovery engine — you give it a repository, and it determines how to build it. For us, we don't build it; we just want to extract that knowledge: what does this artifact mean? Buildpacks is specifically for microservices, though, and we want to go beyond that. We discover microservices as one artifact type — if there's a Dockerfile, if it builds that way — but there may be applications beyond that. Again, we want to build on top of the existing tools and technologies, for sure. Thank you. I think we are running out of time. Thanks, everyone. Thank you.