Hi, I'm Rajiv. I'm the CTO of RapidFort. If you use MySQL, Redis, or any of the popular containers, this is the right presentation for you. What we do for the community is provide containers that are 80% smaller, faster, and more secure, and we are going to show you how to do that.

Hi, I'm Vinod Gupta. I'm Senior Director of Engineering at RapidFort. I'll talk about how we deleted 78% of our Redis container, and it still runs.

Let me walk you through the agenda. We'll talk about the Community Images project, which is an open source project: how this project came along, and an overview of the problem. We will show you the results we got from our experiments, do a quick demo of our open source project, and invite the community to see how you can contribute. We'll leave ample time for Q&A at the end.

So these are the images we took in the Community Images project and hardened with the RapidFort system. These are popular databases like MariaDB, MongoDB, and Postgres; load balancers and reverse proxies like NGINX and Envoy; Redis cache; as well as Fluentd. We took these images, which have more than a billion downloads combined, and wanted to create secure, hardened versions with on average 80% fewer CVEs, and provide them to everyone in the community, always in open source fashion, free of cost, available on Docker Hub, with all the source code available in our GitHub project. Over to you, Rajiv.

What we see here is a typical microservices-based application. There is an edge proxy; there is a business-logic application we have written; then we have a React application publishing your dashboard; and behind the scenes we use MySQL, MongoDB, and Redis. As a native Go developer, I picked up my image and created a multi-stage build: we built our Golang binary and then put it in a scratch image, which amazingly produced zero CVEs. I was really happy.
But when we deployed our application in a cloud-native fashion, with Envoy as the edge proxy, NGINX for our UI containers, and SQL and NoSQL databases, and we scanned the infrastructure, we saw 830-plus vulnerabilities. I'm assuming all of you have run into a similar problem: we can control our own application and get it to zero vulnerabilities, but what do I do with the code I don't own?

So when I saw these 800-plus vulnerabilities, the first thing I did was go to my developer and say: go ahead, do something about the MySQL CVEs we see here. His response was very simple: "The code is not written by me. There is nothing I can do about it." As a manager, what I did next was go to the website and ask: is there another distribution I can find that has fewer vulnerabilities? And good luck. There is a community project called Bitnami images, and what they do is always look for fixes done by the open source community: they pull in the new packages, re-spin, and release a new container. And the result you saw on the previous slide, all 800 vulnerabilities, was actually coming from Bitnami images. So at this point I was left with 800 vulnerabilities and no idea what to do with them. It's a very intractable problem. I go to the developer: hey, can you look at MySQL and the different packages it uses, is there a fix there? Or can InfoSec do an analysis of whether this vulnerability may actually impact us if we put it in production? Back and forth, but you are left with no choice if there is no fix available. Has anybody run into this problem, and have you solved it? We can talk more about it at the Q&A at the end.

What we did was sit down and start deleting all the packages in the Redis container to see whether it would continue to run. We ended up deleting 78% of the Redis container, and it still ran successfully.
And that was the "aha" moment: we started thinking about how we could build a solution around it. Let me first walk you through the results after deleting 78% of the packages from the Redis container. What we saw was that the attack surface shrank from a 102 MB image to just a 20 MB image after deleting all those packages and files. The number of vulnerabilities went down from 110 to just 30 in the Redis container, and the packages went down from 112 to 20. Of the 47 critical and high CVEs, we were left with just 20. Of course we could all do this if we had all the manpower in the world, but we can't do it manually for every release of Redis or of all the popular open source images we are covering. So we started thinking about how to automate this process so that we can build these images with each release and make them available to the community. Rajiv, could you please explain the process behind it?

Yeah. If you look at what Vinod mentioned, he looked through the SBOM, figured out all the packages, and started removing them. This whole thing looks very complicated, and when you multiply it by automating the whole process, the problem is really intractable. Redis releases a new version; this time they have added new packages or removed some. How am I going to take care of that? So we came up with a solution that is actually very simple, and that's what I'm going to explain here. Typically, software is built, then you test it, and if it satisfies your requirements, you release it. What we have done is create a utility that takes your application (it could be MySQL, Redis, or any Java application, it doesn't matter) and gives you back what we call a stub application. This stub is very transparent in nature.
The way you deployed your Redis, you deploy this stub Redis. The difference between the original and the stub is nothing but a tracing mechanism embedded into it. So anything that moves in the Redis container, we collect its behavior. Then you run your coverage script (I'm going to explain what a coverage script is) and signal the system: hey, I have exercised what I need from this application, give me my hardened image. "Hardened" means different things to different folks. What we mean by hardened is that what you need in the software stays there, and anything you don't need is removed. After that, you continue your process of retesting that it meets your requirements, and then you release the software.

So what is a coverage script? It is not a test. It is a script where you don't care about the output or the results. In a nutshell, it's just a payload that exercises your container workload, covering all its functionality. And I'll show you a sample of how to write a coverage script for Redis. Typically it takes about a day to write a coverage script for a very complex application; for Redis it takes maybe half an hour to an hour. The idea is that when you harden something and miss some functionality, that's part of your development life cycle: you just add an entry to your coverage script and it will be covered in the hardened image. After that, you do your regression and QA with the hardened images.

So this is the sample coverage script. As you can see, it's a basic script where we run some standard Redis commands like SET and GET. In a coverage script you never care about what results you are getting, because that's the job of the functional, integration, and unit tests built by the original team.
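To make this concrete, a Redis coverage script along the lines shown on the slide might look like the sketch below. The host, port, and exact command list are assumptions for illustration, not RapidFort's actual script; the point is only that each command touches the code paths you want to keep, while the results are deliberately thrown away.

```shell
#!/bin/sh
# Coverage script sketch for Redis: exercise functionality, ignore results.
# REDIS_HOST/REDIS_PORT defaults are illustrative assumptions.
REDIS_HOST="${REDIS_HOST:-localhost}"
REDIS_PORT="${REDIS_PORT:-6379}"
STATUS="skipped"

if command -v redis-cli >/dev/null 2>&1; then
    RCLI="redis-cli -h $REDIS_HOST -p $REDIS_PORT"
    # Basic commands: we only care that the server code paths execute;
    # output is discarded on purpose, since this is not a functional test.
    $RCLI SET coverage:key hello          >/dev/null 2>&1
    $RCLI GET coverage:key                >/dev/null 2>&1
    $RCLI INCR coverage:counter           >/dev/null 2>&1
    $RCLI LPUSH coverage:list a b c       >/dev/null 2>&1
    $RCLI EXPIRE coverage:key 60          >/dev/null 2>&1
    $RCLI DEL coverage:key coverage:counter coverage:list >/dev/null 2>&1
    $RCLI INFO                            >/dev/null 2>&1
    STATUS="done"
else
    # No redis-cli on this machine; a real run happens against the stub.
    echo "redis-cli not found; nothing exercised"
fi
echo "coverage run: $STATUS"
```

Missing a command here only means the corresponding code path might be stripped from the hardened image; adding one more line to the script brings it back on the next run.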
What we are asking is just to run coverage so that the main binaries get touched by the system, so that we don't end up deleting them. I will also give a demo where I show the same script in action.

So let's do a quick demo and come back to these slides. This is our GitHub project: rapidfort/community-images. If you go into this project, you will first see a list of all the repositories for which we have created hardened images: MariaDB, MongoDB, and the other images I talked about. Let me walk you through the structure of this repository in case you're interested in adding your own images. We have all these images which I showed you earlier, and we add a provider to each, so that in the future, if you don't want Bitnami and want to use official images instead, we can add those as well. Now let's dive into the Redis Bitnami coverage script and how we added it. All these folders require a run script, which is the runner for the coverage scripts. In this runner, you get a function which is passed the image repository, the tag, and the namespace, and you can do whatever you want to make sure your coverage scripts get exercised. In this case, we install a Helm chart for Redis and simply execute the coverage script, test.redis, which I'll show you shortly. Then we exercise the same test in TLS mode, in case the TLS code paths are not otherwise exercised, and we do a Docker Compose run. Let me quickly show our coverage script. You'll see this is the same coverage script we had on the slide; we just exercise it from the run script. We have built GitHub Actions, so you don't need to write these READMEs: as long as you have an image.yaml file with entries like the official name and official website, it will automatically generate the README files for you.
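The per-image runner described a moment ago might be sketched roughly as follows. The function name, chart reference, release name, and the test script invocations are all hypothetical stand-ins, not the repository's actual API; consult the real run scripts in rapidfort/community-images for the exact shape.

```shell
#!/bin/sh
# Sketch of a per-image coverage runner (all names here are hypothetical):
# it receives the image repository, tag, and namespace, deploys the stub
# image via Helm, then fires the coverage script at the running workload.
run_coverage() {
    IMAGE_REPO="$1"
    IMAGE_TAG="$2"
    NAMESPACE="$3"

    if ! command -v helm >/dev/null 2>&1 || ! command -v kubectl >/dev/null 2>&1; then
        echo "helm/kubectl not available; skipping $IMAGE_REPO:$IMAGE_TAG"
        return 0
    fi

    # Install the chart, pointing it at the stub image under test.
    helm install redis-coverage bitnami/redis \
        --namespace "$NAMESPACE" --create-namespace \
        --set image.repository="$IMAGE_REPO" \
        --set image.tag="$IMAGE_TAG" || return 1

    # Exercise the workload: plain mode first, then TLS, as in the talk.
    sh ./test.redis
    sh ./test.redis tls

    helm uninstall redis-coverage --namespace "$NAMESPACE"
}

run_coverage "rapidfort/redis" "latest" "rf-coverage" \
    || echo "coverage run failed (expected outside a cluster)"
```

The design point is that the runner knows nothing about Redis itself; everything workload-specific lives in the coverage script it calls.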
We also automatically create GitHub Actions so that for each of these images, we run a check every hour to see if a new image is available upstream. If one is, we run the coverage script and create a new image, followed by running the functional test. In the case of Redis, there's something called redis-benchmark, a benchmark test for Redis, which we run as the functional test. All these images are available on Docker Hub. If you click any of these images, for example Postgres, there's a Docker repository, and you can see already 5,600 pulls for this image. We also have rolling tags like latest. These get created automatically for your image, so if you're adding a new image to this project, you don't need to do that work; we already take care of it. And you can see detailed reports about what was removed from your application. I'll pass it to Rajiv to explain what gets removed from the image.

So let me make it a little bigger. What you see here is the latest and greatest Redis image released by the Redis Labs team. What we do with that container is run it through the process that Vinod was showing, and the outcome of that whole exercise is that the original 102-megabyte image was reduced to 22 megabytes. How this happened, I will show in detail. The side benefit is that your vulnerability count drops from 110 to 30, and your package count from 112 to 20, so you have fewer packages to manage for licensing and patching. What you see here is where all this magic happens. The original container ships with close to 6,000 files. To run a fully functional Redis, all you need is 85 files, plus some supporting files; you can exclude a few things, which we have done and noted here. If you look at the package level, you see clearly that this container image ships with 110 Debian packages and two Go packages, and the Go packages are not used at all in the exercise of running Redis.
So those are removed, which means any vulnerabilities associated with those packages are gone; they're not in your hardened image. The same goes for the Debian packages we remove. So here is your fully functional Redis with all the packages it has, and these are the packages we have thrown away. One obvious thing that comes out of this detail: why is somebody shipping adduser with a Redis container? If, say, tomorrow somebody finds a security issue with adduser that can make anybody a superuser, you should not be shipping it. That's not a good practice. And what we see here is that it's not being used, so it's been removed, and you don't need to worry about it. One last thing I want to mention is that all of this comes with SBOMs and vulnerability reports in CSV and JSON format. If the community wants them generated in a certain way, we are here to add that functionality. All of this is available without any logins or sign-ups, and you can get back to the GitHub repo from this page itself.

So let's jump back to the presentation and go through the rest of the details. How can the community help add more images? You can simply fork the repo. If you want to add a new image, just copy the template, write your coverage script, and add a functional test, like redis-benchmark or whatever functional tests are available for your image. We automatically create the pull request with all the checks; just submit it and we'll be happy to work with you. So what does the future entail here? Can you talk through that, Rajiv? We are planning to add Prometheus, Grafana, NATS, Harbor, Cloud Custodian, and other images next. But all of this requires some application understanding, so we are looking to the community to help us.
First of all, pick which container you want us to harden for you, and also write the small coverage script. Once we run through the script, you will have the hardened image. Let's say somebody wants a hardened Prometheus image: you can go through this process, but we are not experts in Prometheus; somebody who knows Prometheus better can quickly write the coverage script. If you want to reach us, we use the standard GitHub interface: you can create a bug report or a feature request, or request new images to be added, through our GitHub page. There is a QR code on the screen; if you scan it, you will land on the GitHub new-issues page, and feel free to add any new image request you need. Thank you for spending time with us. We are opening the time for Q&A, and if you want to see behind the scenes how everything works, we can do a demo for you as well. We can harden an NGINX or Redis image right in front of you if you want.

Are there any questions online? Yes. Can you pass the mic, please?

"What's your revenue model? What's your revenue model for this? Because I do see that you are producing this as community images and all those things, but what's the revenue model for RapidFort?"

This is a completely open source project; we don't want to make any money from this activity. We are building a completely new technology, and we need the world to adopt it. Any open source project you want to bring into the system is always free for you. In case you have your own enterprise applications you want to scan and harden, you can go and sign up on our page. We also have a Community Edition, which gives you a certain number of scans free, always. This is a complete shift in how we build software, and the Community Edition will always be free. We just want your contributions and your use of these images.
We are not looking for any revenue from this activity. Yes?

"How do you request another set of images?"

If you scan this QR code, you will go to the GitHub page; I can take you there right now. From there, you just submit a small form. Let's go there quickly. In GitHub, you go to create an issue and you'll see this image-request page. Just click "Get started" and fill in the name, the version, what problem this image solves, whether it is open source, and what runtimes you want supported, then submit the request, and we'll prioritize based on time. We also welcome you to add more images yourself, and we'll be happy to merge those requests as well. Any questions online? Okay. Since we have a few minutes left, we are going to quickly harden NGINX in front of you in the next five minutes. Sure, I'll pass it to Rajiv.

So I'm going to pull the latest and greatest NGINX; we are just doing a docker pull of nginx. This image was released by the F5 team three hours ago and is 142 megabytes in size. When you sign up with RapidFort, you get a bunch of utilities; pretty much you always use rfstub and rfharden. What I'm going to do here is take the NGINX image and pass it to RapidFort. The image is compressed and shipped to the cloud, where we open up the application and look at the various binaries, packages, and everything, and in return we give you a new NGINX image which is very transparent in nature. You can replace it in place in your Helm chart, Docker Compose file, or whatever deployment definition you use, and just deploy the application. Once the application is deployed, anything that moves in that application, we collect its behavior. So the next thing I'm going to do is test it. What you see here is that this Debian-based image has 142 packages and carries 142 vulnerabilities.
This is what the NGINX team released three hours ago, with this many vulnerabilities. So can I go to the UI here? We just processed this NGINX image. What you see is that it's about 142 megabytes; what we predict, from what we have seen, is that this kind of software carries 116 megabytes of extra software it does not need, and the same goes for the vulnerabilities and packages. At this point we have not tested the application, so we have no idea yet what is actually required. Typically, if I test this NGINX, I would run Docker with something like this: this is how you would normally run your NGINX, and you would do some curl operations, which is basically the coverage script we are talking about. But we don't ask you to run the original application; we ask you to run the returned application we have given you, which is called a stub image. So we are running the stub here. Everything you see, all the logs, is exactly the same as your regular NGINX would produce; none of your other applications that depend on NGINX will know they are running a stub. But behind the scenes, anything that happens in this application is collected. If I go back to the UI and go to the logs, you see a histogram showing that a bunch of files are being touched, system calls are being made, network activity has happened; you can see all this behavior. Now, let's say you run it for some time. One second, this is a demo machine. I'm generating some traffic here; this is my coverage test. Go back here, and you see the traffic has been generated. I'm done testing my NGINX. I'm going to comment this out because I'll need it in the future. All I have to do now is simply run the harden command. Now, we ran it one time, but you could run 100 instances, 100 times.
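Putting the demo steps together, the whole flow looks roughly like the sketch below. The rfstub and rfharden invocations, flags, and the stub/hardened tag names are assumptions reconstructed from the talk, not documented syntax, so check RapidFort's documentation before relying on them.

```shell
#!/bin/sh
# End-to-end hardening flow as shown in the demo. All rfstub/rfharden
# usage and tag names are illustrative assumptions. Guarded so the
# script is a harmless no-op on a machine without the tools.
if ! command -v docker >/dev/null 2>&1 || ! command -v rfstub >/dev/null 2>&1; then
    echo "docker/rfstub not available; showing the flow only"
else
    docker pull nginx:latest                    # 1. latest upstream image
    rfstub nginx:latest                         # 2. generate instrumented stub
    docker run -d --name nginx-stub -p 8080:80 nginx:latest-rfstub
    sleep 2
    curl -s http://localhost:8080/ >/dev/null   # 3. coverage: exercise it
    rfharden nginx:latest-rfstub                # 4. produce hardened image
    docker run -d --name nginx-hard -p 8081:80 nginx:latest-rfhardened
    sleep 2
    curl -s http://localhost:8081/ >/dev/null   # 5. verify it still serves
fi
FLOW_DONE=1
```

The curl in step 3 plays the role of the coverage script: it is the only signal telling the system which parts of NGINX are actually in use.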
We collect all this behavior, because in Kubernetes folks deploy multiple pods; it doesn't matter, we collect the behavior from all of them. Right now we ran only one time, so we have one report of you using this piece of software. We take the usage reports and the original application, merge them, and give you a final hardened image. If I do docker images here again and grep for nginx, here's what happened: the F5 team released this software at 142 megabytes; we gave you a stub which was a little bigger, because we injected the profiling mechanism into it; and then you process it and get the hardened image. But does it even work? That's the million-dollar question. So we go here and simply run the hardened image. You see exactly the same thing, the same logs, nothing has changed, and the hardened image is running now. But does it even work? Let's generate some traffic. Yeah, it's serving the same thing. So you have a fully functional NGINX serving a web page with a much smaller footprint. I'm going to go back to the UI; if I refresh this page and go to NGINX, that's what you see here: 130 megabytes of software thrown away from this container. And as a side effect, your vulnerability count reduces massively, and your package count also reduces massively. If you look at the details, all you need is 15 packages to run the whole NGINX; these are the unused packages we removed from your NGINX image.

Okay, and yeah, that's it. Any further questions? Yes. Sorry, I didn't... Let me repeat the question for the audience online. I think your question is: how are we determining which packages we remove and which we keep, correct? Yes, how do we know whether a package is needed or running? Very good. What happens is that when we give you the returned application, it has a tracing and profiling mechanism built in.
That means anything that moves in that container, we collect that information: if it touches a file, we collect it; if it makes a system call, we collect it; if it does any network activity, we collect it. We call that the isolation profile of the container. The isolation profile tells us that these are the only things the containerized application actually uses. Then we take your original application and correlate all those files: which package does this file belong to, and which packages are never touched at all? In this case, you saw adduser: when we exercised NGINX, none of the files from adduser were touched. It's not supposed to be there, so we removed it. So you need to exercise your application to know what you are actually going to use and what you are not.

Yes, absolutely: if you don't exercise something, it will be removed in the hardened image. That's what we expect. The developer, who is very knowledgeable about their application, should write the coverage script (not a full test suite, just a coverage script) that exercises that functionality. And we expect this to be done before it goes to your QA or regression testing. A developer basically says: I have written this wonderful Go application, I use this and this, and I'm going to simply exercise it. Once you exercise it, we capture the isolation profile and merge it with the application. But besides that, we also give you some control. One demand developers have is: I have written a beautiful React application with a bunch of PNG and HTML files; I don't have a test, and I don't even know how to exercise it. So we give you a .gitignore-like syntax you can apply during the hardening process. You can say: don't touch my /app folder, because whatever is there, I want it in my hardened image. So you can do that too.
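The correlation step just described can be sketched in a few lines of shell. The file paths and package names below are made-up sample data, and the ownership table stands in for what would really come from the image's package database or SBOM; the logic is simply "a package whose files were never touched is removable".

```shell
#!/bin/sh
# Sketch of the correlation step: given the files the trace saw the
# container touch, and a file->package ownership table, list packages
# whose files were never touched. All data below is illustrative.

# Ownership table: "<file> <package>" (in a real run, derived from the
# package database inside the image).
cat > /tmp/owners.txt <<'EOF'
/usr/bin/redis-server redis-server
/usr/lib/x86_64-linux-gnu/libssl.so.3 libssl3
/usr/sbin/adduser adduser
/usr/sbin/deluser adduser
EOF

# Files observed in the isolation profile (everything the stub recorded).
cat > /tmp/touched.txt <<'EOF'
/usr/bin/redis-server
/usr/lib/x86_64-linux-gnu/libssl.so.3
EOF

# A package is kept if any of its files was touched; otherwise removable.
KEEP=$(grep -F -f /tmp/touched.txt /tmp/owners.txt | awk '{print $2}' | sort -u)
ALL=$(awk '{print $2}' /tmp/owners.txt | sort -u)
REMOVABLE=$(printf '%s\n' "$ALL" | grep -v -x -F "$KEEP")

echo "removable packages:"
printf '%s\n' "$REMOVABLE"    # adduser: none of its files were touched
```

Running this prints adduser as removable, mirroring the adduser example from the demo: both of its files appear in the ownership table, but neither shows up in the trace.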
I call this a tailor-made solution, because your enterprise's usage of NGINX could be very different from another enterprise's, and we can create tailor-made containers based on your usage, so that you are not carrying the burden of other organizations' security issues. Any further questions or comments? Thanks for all your time. Yeah, thanks. We hope to have your contributions in our project. Thank you. Thank you.