Hi folks, so this is me. I'm Matt Jarvis, I'm Director of Developer Relations at Snyk, and Snyk are a cloud native application security company. So when you first start scanning your container images, it can be a bit disconcerting to discover that you might have large numbers of vulnerabilities in your images. This is a scan I did last week on a vulnerable Node image that I built. It's a fairly extreme example, but you can see that this image, out of the box, is showing as having over 800 vulnerabilities in it.

So faced with this, a lot of us will just freeze like a rabbit in headlights when we get presented with this big list of CVEs, Common Vulnerabilities and Exposures, particularly if our focus is on application development and not system administration. What are we supposed to do with this information? Where do we start? I just wanted an image to run my Node application in, and already I'm facing this gigantic task to make it secure.

Well, the most important thing we need to remember is that fixing these things in containers isn't like fixing them in virtual machines or on real servers. We don't want to get into upgrading individual packages and starting to manage the whole system. We need to understand where vulnerabilities have got into our images before we can start thinking about what strategies we might use to remediate them, to fix them. And what we don't want to do is to have to read through every CVE, understand its impact, understand how you might exploit it, or to become too versed in the kind of dark arts of system administration. We're going to look for solutions which align with the paradigms of containers.
So we want repeatability, we want efficiency, and as much as possible we want to stick with the ideas of immutability that come along with how we use containers.

The first thing that's worth understanding in this context is how the images we're using might be constructed. Our container images are constructed in layers, and those layers come from different places. Some of them we're creating in our own Dockerfiles, and some of them are being brought in as part of our build process. It's likely we started from a base image in our Dockerfile and then added some of our own things during the build process. Perhaps we made some configuration changes for our environment, and then we added some custom software. Depending on how we construct our Dockerfile, we'll end up with these things separated into filesystem layers in our container image, and this layering gives us a good analogy to work with in terms of how we think about vulnerabilities.

So let's start by thinking about base images, and the best way to think about vulnerabilities in the software in them. Although we refer to this as a base image, it's likely that the image we're using is also constructed from a parent image, which then had software installed into it during its build process. The parent image itself was then constructed in some way, perhaps even from a further parent image, or by some kind of root filesystem building tool. Understanding how the software that we're scanning got into our images in the first place is really the key to deciding on our strategy for minimizing vulnerabilities.
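As a sketch of how those layers might line up in a Dockerfile, here's a hypothetical example (the image tag, files and packages are all illustrative, not from the talk):

```dockerfile
# Base image layer: pulled from an upstream provider
FROM debian:buster-slim

# Environment configuration layer (hypothetical example)
COPY certs/ /usr/local/share/ca-certificates/
RUN update-ca-certificates

# Custom software layers: each RUN/COPY instruction adds a filesystem layer
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
COPY app/ /opt/app/
CMD ["/opt/app/run.sh"]
```

Each instruction here maps to a layer in the final image, so a scanner can in principle attribute vulnerabilities to the base image versus the things we added ourselves.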
So as an example of this, let's take a look at the official NGINX image on Docker Hub. If we look at the Dockerfile for this image, we can see that it's based on the Debian Buster slim image, which then gets software and configuration added to it when the NGINX image is built. In turn, the Debian Buster image is built from another Dockerfile, which takes an empty scratch image and adds a tarball to it. And if we then research how this tarball gets built, it's an output from the debootstrap tool, which is a series of scripts used by the Debian project to build root filesystems. This is obviously the way that Debian do it; there are different methods by which these things get constructed for all of the other operating systems which are typically used as base images. But the point of all of this is that even when we just look at our base images, the way that software gets into them can be a long and potentially convoluted process, and this can be difficult to follow unless you understand all these different paradigms.

Now, some people might say that you should just use scratch and build your own images, starting from scratch, basically an empty container. Well, this might work well for compiled language binaries where we don't have any dependencies, Go or C for example, but for most other things you will end up being the maintainer for everything that goes into that image, and that can be a very big overhead on an ongoing basis. So unless you want to become the maintainer of an entire base image, in most cases you're going to want to trust an upstream provider for those base images, and look to them for fixes to vulnerabilities in that base image. Don't try to fix upstream issues downstream: as soon as you do this, you become the maintainer. In the long run, the overhead from doing that is likely to be significant, and it's going to require that you track more and more security issues in order to fix them in your deviated version than if you just stuck with the upstream image. So as we've
just seen, to use upstream images we really need to trust the entire chain of build processes which went into the image that we're consuming, and this can be difficult to follow clearly. Of course, this is no different from how we consume the majority of open source software, and so many of the same quality factors that might influence our choices there also apply here. Is the software maintained and updated regularly? Is there a broad community of users? Perhaps there are commercial companies supporting it? This information is all available to you online, so take your time and investigate what it is you're actually using.

So by trusting our upstream image provider, we really need to rely on pulling in fixes from upstream, either by upgrading our base image or by using a different base image that might have fewer vulnerabilities in it. But picking a base image isn't always as easy as it looks. For example, the official Ruby base image on Docker Hub has lots of vulnerabilities in it, and it's very big. This is fairly typical of official runtime images, because by design they need to be generalized for every use case. So we could look at the slim version: that's smaller, and it has fewer vulnerabilities. Or perhaps we look for another one, but there are lots and lots of tags in that repository. So how do we choose?
Well, those generic runtime images are probably not what you want for production use cases. It's hard to tell which framework version they might be following, and that could change in the future. But slim isn't automatically the best choice either, even though you get fewer vulnerabilities, because if you use slim to build your images, you're going to need to start managing the build dependencies, which are unlikely to be in that slim image.

So the best practice for this, as I'm sure most folks will know, is to use multi-stage builds, where we use that bigger, more generic image to do our software builds, and then we copy our build artifacts over into our slim version for production deployment. In this way, we're not having to manage our build dependencies, because they're going to be in that generic image, and we still get to take advantage of the size and the reduced number of vulnerabilities of the slim version. And note in this example, we're also sticking to specific runtime versions, so we know exactly what runtime environment we're getting, and we know it's not going to change underneath us.

So in terms of choosing our base image, here are some general recommendations. Trust an upstream provider to do the heavy lifting and vulnerability fixing for you; they likely have bigger teams working on this stuff.
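A minimal sketch of that multi-stage pattern for a Node application might look like this (the tags, paths and npm scripts are illustrative, not from the talk's demo):

```dockerfile
# Build stage: the full, generic image carries the build toolchain
FROM node:14 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage: the slim image, pinned to the same major/minor runtime
FROM node:14-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]
```

The build stage keeps the compilers and dev dependencies out of the shipped image; only the artifacts it produces get copied into the pinned slim image that actually runs in production.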
And so they're likely to be fixing things more quickly. Pin your apps to versioned images, at least major, probably minor; that way the ground is not going to shift underneath you in the future. Learn to love multi-stage builds, so you can use slim images in deployment while still taking advantage of proven combinations that you know are going to work in build. Rebuild pretty often: lots of times this is going to get you security fixes as part of the build process. And then finally, consider moving your pins every once in a while, because upgrading to new versions is also going to bring in more security fixes in those base images.

Fixing these things isn't usually very hard. From some research we did at Snyk, over 40% of Docker image vulnerabilities can usually be fixed by upgrading the base image, and around 20% can be fixed just by rebuilding, because a lot of containers are going to run some kind of upgrade during the build process.

So for our base images, if we're trusting an upstream provider, then we're going to rely on them for fixes, either by upgrading our base image or by choosing another base image which has fewer vulnerabilities in it. So what about the things that we are adding to the containers ourselves? Anything we add to the base image, it's now our responsibility to fix issues in it. Well, if we've just added a package from an upstream distro repository, perhaps we've installed an RPM or a deb as part of our build process, then the same principle applies as with base images. We're not going to start building that package from source to fix vulnerabilities in it; we're going to get our friendly upstream maintainer to ship us an upgraded package, or we might remove that package if we don't really need it, or we might change to use something else. So what about code we're creating?
Many of us are building containers that contain custom applications, written in-house, that we're packaging for deployment into our production environments. For our own applications, typically most modern apps are based on a small amount of homegrown code and lots of third-party modules and libraries, which are usually open source. This is a pattern that you'll be familiar with if you're developing in Java, in Node, in Python, in Go, and in many other languages. And our application dependencies are typically expressed in a file in our source code. For JavaScript it might be a package.json; for Python it might be a requirements.txt; but the basic principle is the same. We're defining which dependencies and which versions we need for our particular application, to be packaged up with our source code into a deployable unit.

Now, this isn't a bad thing. Having reusable code means we write less code; we don't have to reinvent the wheel, and we can spend more time on the functionality we need. But we do need to be aware of what's going into our image. Each of the things that we define as a dependency can have a large dependency tree of its own, things that it needs, that it'll bring in automatically. So potentially we might bring in a ton of other modules which we might not even be aware of. These indirect dependencies, we have much less control over, and again, we might not be aware of them at all. And typically, over 70% of all security vulnerabilities are found in these indirect dependencies.

So there are a couple of different ways that you might deal with vulnerabilities introduced here, depending on the tooling you're using. Here we're looking at a Snyk scan of that same container image we started with, and Snyk has actually identified the package.json for our application inside the image, giving us a pretty clear picture of which vulnerabilities are coming from the base image and which are coming from the dependencies of our application. But we
can also scan our applications directly in GitHub, before they get included into our images. Here in this example, Snyk has identified vulnerabilities in the packages for the application, but because the Dockerfile is there too, it's also picking up vulnerabilities that would be in the base image built using that Dockerfile. Again, most tools are going to give you this kind of functionality to scan your third-party dependencies.

So whichever way you end up separating out the vulnerabilities, it's probably not realistic, except for very simple applications, to expect your container image to have no vulnerabilities at all. Vulnerabilities themselves are not a zero-sum game: a particular vulnerability may only be an issue under very specific circumstances, on a specific architecture or a specific platform. Without reading the details of every vulnerability, how can we possibly decide what is or isn't an issue in our environment? Well, security is almost always a series of trade-offs, particularly between effort and risk: how much effort is involved to fix something, versus the risk of it being an issue in my particular environment. So we have to make judgment calls on which ones to fix and which ones we might just accept. We don't have endless resources available to spend on fixing things, and so we need to prioritize. Unless we want to start digging into each vulnerability, understanding the specific circumstances under which we might be vulnerable, the best way forward really is to decide on a strategy.

Prioritization is really not an exact science, and it can be based on a number of different factors. Severity alone doesn't really give us very much information other than potential impact. The CVSS score, which takes into account things like exploitability, does give us more context.
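That kind of prioritization can be expressed as a simple sort over scan output. This is a sketch with invented issue data, not the output format of any particular scanner:

```python
# Invented scan results: field names and CVE IDs are illustrative only
issues = [
    {"id": "CVE-2021-1111", "severity": "high",   "fix_available": True,  "exploit_mature": True},
    {"id": "CVE-2021-2222", "severity": "medium", "fix_available": True,  "exploit_mature": False},
    {"id": "CVE-2021-3333", "severity": "high",   "fix_available": False, "exploit_mature": True},
]

def fix_first(issues):
    """Keep only issues with a fix available, ordered so that high severity
    comes before others, and mature exploits before immature ones."""
    fixable = [i for i in issues if i["fix_available"]]
    return sorted(fixable, key=lambda i: (i["severity"] != "high", not i["exploit_mature"]))

print([i["id"] for i in fix_first(issues)])  # ['CVE-2021-1111', 'CVE-2021-2222']
```

The unfixable high severity issue drops out of the worklist entirely: as the talk says, if there's no fix available, there's often nothing to do about it for the minute.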
But we can also use information like the maturity of available exploit code, and most importantly, whether a fix is available. High severity vulnerabilities which have an exploit and a fix are really a no-brainer to fix first. So as I said, the CVSS score, taking into account things like exploitability and impact, does give us more context, including details on how the attack can happen, and it helpfully provides this in a machine readable format in the vector string, which we'll come to in a minute.

But understanding the output of our tools is also critical here. We need to be able to filter based on criteria, so we can see just the things that we should care about, and most tools provide ways of filtering those discovered vulnerabilities. In the simplest case, if there's no fix available, for example, then it's likely there's nothing we can do about that issue for the minute. Again, high severity issues which have a readily available fix should also be a no-brainer: just apply them.

When we look at things in a slightly deeper way, there are obviously elements to this which are subjective. So for example, you might decide that a high severity vulnerability that requires shell access in order to exploit it is a lower risk in your particular environment, because you have other controls in place to protect against shell access. Perhaps you don't have a shell in your containers.
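As mentioned, the CVSS vector string encodes this attack context in machine-readable form. A minimal sketch of parsing it and flagging that shell-access case (a toy example, not any particular tool's API):

```python
def parse_cvss_vector(vector):
    """Turn a CVSS v3 vector string into a dict of metric -> value."""
    metrics = vector.split("/")[1:]  # drop the leading 'CVSS:3.1' version tag
    return dict(m.split(":") for m in metrics)

def requires_local_access(vector):
    """AV:L means the attacker already needs local access, e.g. a shell."""
    return parse_cvss_vector(vector).get("AV") == "L"

local_only = "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"
over_network = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
print(requires_local_access(local_only), requires_local_access(over_network))  # True False
```

A filter pipeline could use checks like this to down-rank issues whose attack vector is already mitigated by other controls in your environment.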
And so you make the assessment that those controls mitigate that risk. This is a more advanced kind of methodology, and it's probably higher risk unless you really understand what you're talking about, so only go down that road if you're confident that you can make those assessments. It's also obviously a higher effort strategy, so you need to again weigh the amount of effort involved against the amount of risk that you're comfortable with. And again, here your tools can help: you can usually build filter pipelines based on that CVSS vector string, because it provides a lot of information about vulnerabilities in a machine readable format. But a general, reasonable strategy might be something like this: no high severity CVEs in production, nothing with a mature exploit, and apply all the patches that exist. If you followed this, it would likely drastically reduce your overall vulnerability count in most cases.

So now we've considered how we might use a strategy, let's take a look at how that might work in practice. If we trust our upstream provider, we're going to start from a base image, leverage that upstream provider, and go to them for fixes in that base image. Then we might have some common configuration for our specific environment, which might be common to all of our images: this could be configuration for our specific networks, or for a specific auth setup. So this layer, plus our original base layer, could make up a common base image that all of our teams are going to consume. Then we might have another layer of common software, perhaps a set of middleware, specific versions of runtimes for our applications. And then finally, we've got a layer that adds in our custom code, the applications that we want to deploy, along with the metadata and any application-specific configuration. And so, based on those defined layers, when we think about vulnerability management within an organization, we want to be able to fix vulnerabilities
once, and have them effectively resolved everywhere, through inheriting those fixes downwards in our distribution tree. In order to do that, it's important that we understand exactly which layer caused a vulnerability, and then we fix it only in that layer.

So as we said, one way to achieve this might be to establish our own base image. This is going to contain an upstream base image, in this case Python 3.6-slim, plus any generic configuration for our environment, and then we're going to maintain this as our base image. All of our other teams will take this base image and use it to build their images. We can then establish a baseline for that base image, and anyone else consuming it only needs to worry about things that they've introduced through additional packages or software added on top. So in this example, we've got 178 issues in our base image right at the start of the process. But it's also important that we watch out for new vulnerabilities. Generally, images that have just been released will normally have no fixable vulnerabilities, but things do get broken, things get fixed, stuff changes.
So rebuild and set a new baseline pretty often. Then we might have a middleware image, which is going to take that base image we built for our organization and install a set of middleware that perhaps all of our applications need in order to function. And then what we can do is test the middleware layer and see how many issues there are in there. We can look at the difference between what was in the base image, our baseline, and what's in the middleware image, and we know exactly which ones came from the middleware layer. In this case, we can see that there are really only two additional vulnerabilities which got put into our image through the middleware layer, because those 178 were actually in our base image. And then finally, we can use our middleware image to produce our application image, by adding our custom code on top of the middleware layer. Again, the application image can then be tested, by the app team in this case.
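Comparing layer scans like this comes down to a set difference over vulnerability IDs. A minimal sketch, with a couple of invented IDs standing in for the 178 base image issues and the 2 middleware ones:

```python
def new_issues(baseline, current):
    """Vulnerability IDs present in `current` but not in `baseline`."""
    return sorted(set(current) - set(baseline))

# Invented CVE IDs, purely for illustration
base_scan = {"CVE-2021-0001", "CVE-2021-0002"}
middleware_scan = base_scan | {"CVE-2021-0003", "CVE-2021-0004"}
app_scan = set(middleware_scan)  # the application layer added nothing new

print(new_issues(base_scan, middleware_scan))  # ['CVE-2021-0003', 'CVE-2021-0004']
print(new_issues(middleware_scan, app_scan))   # []
```

Each team only needs to act on the diff against the baseline below them: the base image team owns the baseline, the middleware team owns the two new IDs, and the app team owns an empty list.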
There are exactly the same number of vulnerabilities as there were in the middleware image, so we haven't introduced any new vulnerabilities through our application layer. The middleware team are aware that they've got two issues they need to fix, and the 178 issues are in the base image, which is going to be dealt with by upgrading our base image.

So the final consideration here is that obviously our containers almost never exist in isolation. We're typically running them in orchestration systems, usually Kubernetes, and so our security isn't just about dealing with vulnerabilities in our applications. The blast radius of an exploit is almost always a combination of an application level vulnerability and infrastructure misconfiguration, so it's really important that we consider our security as multi-layered. A container image which causes a single pod to be exploited is clearly not a good thing, but a container image that allows your entire cluster to be owned is a much more significant issue. Security principles for Kubernetes are pretty well documented these days. You definitely don't want to be doing things like in this example: hostPath mounts, privileged pods, and not setting resource defaults can all allow an attacker to significantly increase their foothold in your cluster, to potentially devastating effect.

So, conclusions from this talk, if all you remember is this. Define your trust boundaries: understand who's responsible for which elements of the things that have got into your images. Decide on a strategy: choose a strategy and execute it in order to reduce your overall vulnerability count. And start with the low hanging fruit: high severity vulnerabilities that have a fix available are clearly a no-brainer, so start with those, and then worry about the other ones once you've dealt with the low hanging fruit. So thank you for listening. You can sign up for a free account at snyk.io/signup, and hopefully I am here to answer any
questions.