Hello everyone, day three of OpenGovCon, welcome back. I'm proud to introduce a really interesting topic here from the JFrog team: the experience of curating open-source software while working with GovCloud-style products. I'm excited to hear about the journey. Take it away, sir.

Can you guys hear me okay? All right, excellent. Good afternoon everyone, and thank you for stopping by. I want to quickly set the context for this presentation and how the problem showed up. I've been in this space for some time: I spent a lot of time working on open-source software like Cloud Foundry (I'm ex-IBM and ex-Pivotal), and I also spent some time with special operations forces. I've seen firsthand how long it takes, from a code commit, to get through the people and processes required to move a build, a binary, or an artifact from the low side, an unclassified environment, over to the classified side of the house. Unfortunately there are a lot of processes and people involved in between, and I've seen it take anywhere from one to three days, sometimes more, depending on people's availability. And I'm talking about national-security, mission-critical applications, applications that can save people's lives and bring people back home.
In this context: much of the software being built these days has a lot of dependencies on open-source components. From a developer's perspective, that involves what we call curation: making sure the software components you're bringing in, especially open-source components, are safe. You should have visibility into their vulnerabilities and any license-compliance issues, and, above all, you want to be sure they're free of the malicious packages everybody is talking about these days. So curating and securing open-source components has become a primary concern on everybody's mind, including the developers'.

This presentation is mostly about how we automate the curation of open-source bits, at least to an extent, while keeping visibility into which components you're bringing in, in a repeatable and auditable way, with traceability of who requested what and when it was curated. I want to be conscious about focusing on the problem we're trying to solve and the approaches we're taking, not on tools and products. For the sake of this demonstration I'll use some tools from the company I work for, JFrog, which has been at the forefront of putting out an open-source version of Artifactory, which everybody uses, plus our scanning software Xray, which looks not just at vulnerabilities but also at contextual information. The second piece of context is a pattern
I've been seeing while working with some of these agencies: development happens either on the low side or on the high side. When I say high side, I mean classified networks that are completely air-gapped, most of the time completely cordoned off. The developers building their software on the high side usually have some sort of CI that they use.

This whole effort took shape because one of these agencies reached out to us and said: look, we're developing on the high side, the developers run their CI, and they hit a 404 Not Found error on some library. They have a system of record, Jira or ServiceNow or some other ticketing system, where they open a service ticket. Then somebody spends the next day and a half collecting those tickets, going over to the low side, downloading the software bits, calculating the dependencies that the requested package (a Docker image or a Python library, for example) requires, downloading those bits too, and running everything through some sort of scanner. Only then can they bring the bits back onto the high side so the developers can run their pipeline. The turnaround was typically around two days, sometimes more depending on what was being requested. So they asked us how we could help out.
This is a pattern I've been hearing from everybody I talk to. The approach we took here is one of several; there are multiple ways to solve this. But start from the definition of the software supply chain: it's composed of components, libraries, and tools, and it's not just executables. It also includes the processes used to develop, build, and publish software artifacts, along with the metadata, configuration files, secrets, everything associated with that. Most of the time developers rely heavily on centralized public repositories such as Maven Central or Docker Hub. They bring in these open-source components, store them in a repository, and use CI to build, which again involves configuration, secrets, and so on. Once they build, they promote and release, and that release has to be distributed to a plethora of targets: IoT devices, public cloud, distributed edge, and everything in between.

The point I'm trying to make is that across the software development lifecycle, from left to right, somebody, whether a DevOps or a security persona, needs complete auditability and visibility into what is being brought in. And sitting here at an OpenSSF conference, everybody is talking about SBOMs, software bills of materials: whether they're complete, how to generate one. There are a bunch of tools, and they're getting fragmented over what "complete" even means, right?
But again, it's a step in the right direction. We still need tooling that can realistically list the components you're bringing in and their dependencies, and make sure those bits are securely transferred from the development phase to the build phase to the release phase, so you have visibility into what's deployed in production and what vulnerabilities or security issues it carries. That's the whole software supply chain. And as we all know, we're trying to blur the distinction between these disjointed teams: security, development, ops, SRE. Rewind ten years: suddenly the term DevOps was coined and people were scratching their heads about how to work with the operations team that deploys their code onto the production server. Then came this concept of site reliability engineering and the SRE team. And all of a sudden the security team, which had been completely isolated, working on its own island, was actively talking to developers about what needs to be done.
Again, the point is that the distinction between these disjointed teams is blurring. And the SBOM is here: it's been two years now since the Executive Order mandated visibility. Whether you're a consumer or a producer, you should know what components you're packaging, and if the government is consuming that software, any agency, civilian or DoD, they need complete visibility into what that bill of materials looks like. Then there's zero trust: which software components can you trust these days, across which package types? Suddenly binary lifecycle management becomes critically important, because the binary is what runs on your production system at runtime, not your source code. There's a philosophical debate about what really runs at runtime, but at the end of the day it's the binary you built, together with its metadata and configuration. Binary lifecycle management is super important.

Now, the premise of this presentation. A developer requests a piece of software from whatever pipeline they're running, Jenkins or TFS or anything else, and that request goes against what I'll call the "right" repository: the repository the developer has access to for their libraries. I've drawn a distinction between the DevOps and SecOps personas, and then there are the open repositories. We happen to show JFrog Enterprise+ here, but this could be any repository manager that you stand up in two different environments, right?
So in this presentation, the "right" repository sits on the high side, let's say, and the open repository is stood up in the DMZ with access to the internet; the "right" repo is completely cordoned off. When a developer requests a library that needs to be curated, the request flows into the open repo, which goes out to the internet and downloads the bits. Those libraries get vetted against your security policies: you run your scanner, and once the bits are blessed by the security team, the artifacts are transferred back to the "right" repositories, where the developer can access them. So the question is: how do you automate this curation process? Once the bits are available and you've built your piece of software, you have access to the SBOM, and you can promote the bits to read-only or archive repositories that can be safely distributed to your runtime, whether that's Kubernetes, VMs, or whatnot.

But the process of distribution from your read-only repository to your runtime also needs to be tamper-proof: you need to securely distribute the software from your read repository to the distribution cache. The distribution of the software artifacts is just as important as the curation of open-source, third-party, or COTS libraries. I've seen this firsthand: the ISO image or artifact gets put on a hard drive or a DVD, gets blessed by the cyber folks, and somebody walks into a secure facility and loads it onto the network. Many things can go wrong there, and with the leaks that have happened in the past
three months or six months, we want to make sure these artifacts are securely distributed. That means there should be some sort of signing on the artifact, so that when it arrives on the high side you have the ability to verify it hasn't been tampered with along the way. The mechanics of that are just as important as the curation and distribution themselves.

Now, getting into the crux of this presentation: I have two environments. The secure environment, think of it as the high side, is where development happens; your developers sit there, run their CI, and request software. There are four workflows we've worked out here. First is the preparation workflow; I'll go through what this means in a bit, but at a high level it's a process or pipeline that sits on the high side, watches the developer requests, catches the 404 Not Found request errors, and builds a JSON payload file that has calculated all the libraries and dependencies that need to be curated. Then, depending on your system of record, you can open a support ticket, or you carry this JSON payload file over to the external environment.
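A minimal sketch of what that preparation workflow might look like, assuming the CI proxy logs expose failed package requests as simple lines containing "404" (the log format and manifest fields here are illustrative assumptions, not the actual agency setup):

```python
import json
import re

# Hypothetical log format:
#   "<timestamp> 404 <package-type> <name>==<version> requested-by <user>"
LOG_LINE = re.compile(
    r"404 (?P<type>\w+) (?P<name>[\w.-]+)==(?P<version>[\w.-]+) requested-by (?P<user>\w+)"
)

def build_manifest(log_lines):
    """Collect every 404'd package request into a curation manifest."""
    packages = []
    seen = set()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        key = (m["type"], m["name"], m["version"])
        if key in seen:
            continue  # de-duplicate repeated requests for the same package
        seen.add(key)
        packages.append({
            "type": m["type"],
            "name": m["name"],
            "version": m["version"],
            "requested_by": m["user"],  # kept for traceability on the ticket
        })
    return {"packages": packages}

logs = [
    "2023-05-10T14:02:11 404 pypi requests==2.28.2 requested-by adev",
    "2023-05-10T14:05:40 404 pypi requests==2.28.2 requested-by adev",
    "2023-05-10T15:17:03 404 docker alpine==3.17 requested-by bdev",
]
manifest_json = json.dumps(build_manifest(logs), indent=2)
```

Running something like this once a day over the previous day's logs yields the de-duplicated manifest file that travels over to the external environment.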
That external environment is where the curation workflow runs: another pipeline, triggered by this JSON payload. The curation environment is something you could stand up in the DMZ; it goes out to the centralized repositories, downloads the bits, and runs them through your scanner. If the scanner finds no vulnerabilities, you can transfer the bits to a scanned repository. If there is a problem, you block the download and isolate the bits, and alert the end user that the library they're downloading has a vulnerability in it. Once that's done, there is a bundle workflow: it goes to each of those repositories in the external environment, picks up the bits and their dependencies based on the paths where they're stored, and builds what we call a release bundle. Think of a bundle as a compressed zip file. We have the ability to sign it, so that each artifact has a checksum, a SHA, associated with it. Then you can transfer it however you want, as an ISO image or on a DVD or a hard drive, get it scanned by your cyber team, and back in the secure environment there's a load workflow that loads these release bundles and populates the repositories inside your artifact manager, so the packages are available. Then you can alert the end user that what they requested has been made available. Any questions so far?
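In pseudocode terms, the gate in that curation workflow is just "download, scan, then promote or quarantine". Here is a minimal, self-contained sketch; the five callables are stand-ins of my own for the real tooling (remote-repo download, Xray scan, ticket notification), not the actual pipeline code:

```python
def curate(manifest, download, scan, promote, quarantine, notify):
    """Run each requested package through the download -> scan gate.

    download/scan/promote/quarantine/notify are injected callables so the
    real tooling (remote repo, scanner, ticketing system) can be swapped in.
    """
    results = {}
    for pkg in manifest["packages"]:
        artifact = download(pkg)
        findings = scan(artifact)
        if findings:  # any vulnerability blocks the transfer
            quarantine(artifact)
            notify(pkg, f"blocked: {len(findings)} finding(s)")
            results[pkg["name"]] = "blocked"
        else:
            promote(artifact)  # copy into the scanned/local repo
            results[pkg["name"]] = "curated"
    return results
```

Injecting the callables keeps the gate logic testable independently of any particular repository manager or scanner.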
All right. From the secure environment perspective, again: the developers run their CI pipeline and request a package. Packages such as Docker images or Python libraries usually have dependencies that need to be managed and brought in along with the package being requested. What we've done is write a simple Python script that generates a manifest with all the package requests. You could run this once a day, depending on how many user requests you get; if you're using Jira or ServiceNow or whatever your system of record is, you can go through the tickets and produce this manifest, which gets written to a JSON file that is carried over to the external environment.

In the external environment, the manifest you brought over from the secure environment has to be processed to figure out all the dependencies that need to be downloaded. The way we do this is by pulling the dependencies from an external source such as Maven Central. From a JFrog Artifactory perspective, you can configure a repository as a remote repo pointing to the external source, download the bits, cache them, and have Xray, our scanning software, look at them; the bits get scanned in real time as they're downloaded. If there are issues, the scans fail and you isolate those bits. If the scan succeeds, you copy the contents of the remote repository to the local repositories, the local repository being the construct your developers have access to.
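The "figure out all the dependencies" step is essentially a transitive closure over dependency metadata. A sketch, using a hard-coded dependency table purely for illustration; in reality this metadata comes from the package index that the remote repository proxies:

```python
from collections import deque

# Illustrative only: real dependency data comes from PyPI/Maven metadata.
DEPS = {
    "fastapi": ["starlette", "pydantic"],
    "starlette": ["anyio"],
    "pydantic": ["typing-extensions"],
    "anyio": [],
    "typing-extensions": [],
}

def expand(requested):
    """Return the requested packages plus every transitive dependency, breadth-first."""
    needed, queue = set(), deque(requested)
    while queue:
        pkg = queue.popleft()
        if pkg in needed:
            continue
        needed.add(pkg)
        queue.extend(DEPS.get(pkg, []))
    return sorted(needed)
```

A single requested package can fan out into many artifacts, which is exactly why the manual low-side process took a day and a half.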
That copy, in turn, triggers the bundle workflow. As I said in my disclaimer at the beginning, this whole presentation is conceptual, about what you could do to curate open-source bits; as part of this workflow we're using the components the JFrog platform comprises, Artifactory, Xray, and JFrog Pipelines, to demonstrate it.

In the external environment, with this bundle workflow: curating one library is easy, but when hundreds of libraries need to be downloaded and scanned by Xray, gathering the curated packages can become a problem. You might have to traverse multiple paths and versions, pick up the ones you need, and create a bundle out of them, the bundle being a compressed zip file or something similar. If you're moving this into an air-gapped environment, you'll want to copy the bundle to a distribution area for what we call offline distribution: it gets downloaded onto a USB drive, a DVD, or a CD, and somebody physically transfers it to the secure environment. The automation of creating the bundle is what saves you time, but we are not recommending that you bypass your cyber team or the processes you already have in place to get it scanned and authorized before you take it back to the secure environment. Once the release bundle makes it onto the high side, the secure environment, you need the ability to load those packages into the appropriate local repositories, into the appropriate paths, so they're available when your end users look up the libraries that
need to be part of their CI builds. So there is some automation work to be done to load this bundle. The second critically important thing is making sure, as part of your load step, that the bundle has not been tampered with along the way. From the tooling perspective: in the previous step we created what we call a release bundle, a construct that gathers all the packages, each of which has been signed with GPG keys. When you bring it into the secure environment, the load step checks the bundle's integrity, and the load fails if a package has been tampered with or a signature has changed along the way. Having said that, once the packages are loaded, you should generate some sort of notification to the requesting user, whether that's closing the support ticket or telling the end user that the requested packages are there, sitting in the appropriate directories.
So this is one approach; I'm not saying it's the only way to solve this problem. But it greatly increases developer velocity, and it means you're securely curating and transferring the bits, so the open-source packages being distributed have been blessed and signed, and you can have some trust in what gets loaded onto the high side.

With that, let me quickly show you. This is the Artifactory user interface, and we've built some pipelines that automate and demonstrate what we talked about over the last twenty minutes or so. We have this gated Docker pipeline, and if you look at one of its runs, look at the resources: it starts with the JSON payload I was talking about. The payload can be as simple as this; for demonstration purposes we're trying to bring down three or four packages here. The gated Docker pipeline goes off to the central repositories and executes a Python script we've written; it downloads the bits, has Xray scan them, and if Xray blesses them they get copied into the local repositories. Toward the end of the run you can see it successfully curated one library and failed to curate a couple of others, because a vulnerability was discovered and we blocked the download. I have the same thing with the gated PyPI pipeline, where I was trying to curate four Python libraries as well.
Once these two pipelines have run, I also have what we call a create-release-bundle pipeline, which could run every day, or once every 24 hours. The way I've triggered it, it looks at the last 24 hours, and if the PyPI and Docker pipelines ran successfully it automatically creates a release bundle for you. It executes another Python script that calls RESTful APIs on Artifactory to create the release bundle. Once it reports success, we can go back to the release bundle construct here. The release bundle, as I said, is the JFrog Artifactory way of packaging this up, and the release bundle is signed. If you look at this example release bundle, created a couple of hours ago, you can look at the version and its contents: there are four files in here, plus a manifest JSON.
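The "calls RESTful APIs on Artifactory" step might look roughly like this. The endpoint path and payload shape are modeled on JFrog's Distribution REST API as an assumption, so verify them against the documentation for your version; the sketch only builds the request rather than sending it:

```python
import json
import urllib.request

def build_release_bundle_request(base_url, token, name, version, repo_path_pattern):
    """Build (but do not send) a create-release-bundle request.

    Endpoint and payload shape are assumptions modeled on JFrog's
    Distribution REST API; check the docs for your deployment.
    """
    payload = {
        "name": name,
        "version": version,
        "sign_immediately": True,  # GPG-sign the bundle at creation time
        "spec": {
            "queries": [
                # AQL query selecting the curated artifacts by repo and path
                {"aql": f'items.find({{"repo": "curated-local", "path": {{"$match": "{repo_path_pattern}"}}}})'}
            ]
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v2/release_bundle",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

# Sending it would then be: urllib.request.urlopen(req)
```

Keeping the request-building separate from the sending makes the pipeline script easy to dry-run and audit before it ever touches the server.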
It has a SHA for each of the curated pieces, and you can also look at the release bundle info. This is what I was talking about: the artifacts it brings in, what Xray has blessed and the bundle has packaged. Notice that each one of them has a SHA associated with it, so you know exactly what you're packaging. You can version the release bundle as well: as you run more of these pipelines, the release bundle gets versioned, and then you can distribute a version. If there is connectivity between your low side and your high side, and sometimes you have one-way connectivity, you could distribute that version directly, but that requires an online Artifactory server on the other side, and most of the time you don't have one. Then what you do is what we call offline distribution: the bundle is downloaded to your laptop, or wherever you want to store it, and you physically transfer it back onto the high side. On the high side, I'm going to log in here, and you can
go to the distribution view here and look at the release bundles that have been received so far. If you look at this example bundle, there are a couple I did a few hours ago; you can look at the version, and as part of importing the release bundle file you can look at the files that were brought in. The load populates the artifacts into the repositories, along with the appropriate permissions and things like that. The point I'm trying to show you is the distribution: GPG signing on the low side, verification when it comes onto the high side, and all of these actions are auditable and logged. Notifying the user then gives you visibility into how your package traversed all the way from the bit being downloaded on the low side to its arrival on the secure side.

That's basically the crux of what I wanted to show you today. I have five more minutes, so if you have any thoughts on how this could be done differently, or any questions, I'd be happy to answer.

I've worked with a number of pipelines like this, you know: low side, isolated high side, or a secure area. This is a great example of one way to get stuff in there. Of course, inevitably, tomorrow the next Log4j, the next Log4Shell happens, and you have to figure out what you've got on the high side and what needs to be updated. Figuring that out while sitting on the high side seems particularly difficult. So with this sort of workflow, how do you handle that: figuring out what's out of date, what needs updating, things like that?

Absolutely.
So again, I want to be conscious about an open-source way of solving this versus vendor lock-in to a tool, but I'll tell you how we do this with JFrog Artifactory and Xray. We do something called runtime CVE extraction, and this goes back to zero-day vulnerabilities: tomorrow another vulnerability could be discovered in a piece of software you're running in your production system. There are a bunch of ways to extract and get visibility into SBOMs, which SBOM has been pushed and which version of it has been pushed to production, or, more generally, to your runtime. But there's no easy way to discover a newly found vulnerability unless you actively monitor what is running in your runtime. There is no easy way to do that today. Having said that, there are tools in the industry working toward this, and at JFrog we are a few months away from releasing what we call a runtime extraction capability that could give you visibility there. Still, the SBOM is the key: having traceability of which version of the SBOM you pushed onto the runtime will help. There's no one-push-button way to extract this, but on current Artifactory there are ways to write plugins to extract the CVEs from the runtime and act on them, and we're working on product features that would enable this.

Thank you for the talk. I was wondering if you could talk a bit more about the
There were some keynotes earlier in the week and some stuff at CD con about CD events and automatically kicking off CI CD processes based on an event-driven model and I mean and cautious about vendor lock-in and JFrog But I'm curious if there's if JFrog has something like that or if there's any other technologies like that that you know of that would be useful for Once an artifact arrives in the secure environment automating the Start or of a CI CD process. Yeah, absolutely and again I'll tell you what I have seen what JFrog can do as well That's What I have seen is very rudimentary right now the it's still manual process in terms of You know letting a developer know that there's a piece of software artifact And again, I've heard this from many agencies and and this is one of the problems, right? The the the folks who are actually running the CI build doesn't know when their When their library is has arrived or curated and made available to them There are multiple ways They were I've seen people using some system of record like what I talked about like Jira or Confluence and or even vendor locked in things like service now Again again, there are people involved in this right and there's no automated way of solving this Again, it's evolving With with going coming back to the JFrog way of doing it we have Obviously web hooks into You know Atlassian or you know Jira or service now and things like that where You could you could get alerted on But but I think there is no rocket science in terms of you know, how do you do it? But I think the most important thing is the discipline and the traceability of Looking at the request package who requested it, right? 
One of the critical things about curation is that it's not just what you're curating; it's who requested the package and why, tracing it all the way back to the developer who requested it in the first place, and opening the ticket in ServiceNow or Jira on their behalf, so that when the package gets curated and transferred back to the high side, you can let that developer know. At JFrog we've implemented that workflow using things like this, and we've open-sourced it as well; I'm happy to share that. But again, there are multiple ways to do this. I didn't sit through the OpenSSF session you specifically referenced, but I'm happy to learn if there are other ways to do it.

Perfect. Thank you.