Okay, we're going to get started. Our next talk is an introduction to container security. Our speaker is Thomas Cameron. He's a Red Hat Solutions Architect. So without further ado, I give you Thomas. Thank you. Thank you very kindly. My name is Thomas Cameron, and I am the Global Solutions Architect Leader at Red Hat. So I'm responsible for all of our field technical sales staff, so all the pre-sales engineers and pre-sales consultants around the world are my internal customers. I try to keep them up to speed on technology and methodologies for helping customers and things like that. As we discussed before we got started, there's kind of an alphabet soup after my name. I am a technical resource inside of Red Hat. I've been with the company for about 10 years, and I have been in the IT industry since 1993. So I've been doing this for a while, but I have certainly learned that the longer I work at Red Hat, the less I feel like I know, because I work with some incredibly smart people. And every time I start to get a little too big for my britches, I just go and visit engineering in Westford and come back, and I'm like, yeah, definitely Linux. Definitely Linux admin. So let's talk about what we're going to talk about today. Today is going to be a repeat, actually, of a presentation that I did at All Things Open back in October. I found that when I was talking to folks about containers, there were invariably questions that came up about container security. And the stock response is, well, containers can be secure because they take advantage of a lot of functionality in the Linux kernel, and they take advantage of some of the features of Linux, you know, kernel namespaces and Security-Enhanced Linux and things like that. And people were like, yeah, that's cool, but what does a kernel namespace do? And I was like, it's a namespace, that the kernel spaces names.
And I realized, you know, I sort of understood at a fairly high level what kernel namespacing was, but I started really digging into it. And the more people I talked to, the more people I found that were like, yeah, I kind of get it, but, like, I don't really get it. So I decided to come up with a fairly simple overview. We're not going to dive super deep. This is just the fundamentals of container security. So I'll talk a little bit about who I am and about what Red Hat has been doing around containerization. We'll talk about what containers are, how they work, what they aren't, and talk about some of the myths that I have heard from folks who are exploring adopting container technologies. And then we'll talk about the nuts and bolts of what container security is, what makes up all the components of container security, including kernel namespaces, control groups, and a little bit about the Docker daemon and how it helps to keep you secure. And then we'll talk about Linux kernel capabilities and Security-Enhanced Linux, and then I'll go over some tips and tricks, what to do, what not to do, and talk about our conclusions. So by way of introduction, as I said, I've been with Red Hat for a long time. I've been in IT since 1993. I actually changed careers in 1993. I was a police officer for several years. I have a pretty strong academic background in security and law enforcement, and I have professional experience in law enforcement. I realized that I couldn't afford to pay my bills as a police officer and went off and became a computer geek. So now I'm a computer geek. I've been with Red Hat since 2005. I've got a whole slew of professional certifications. My first job when I changed careers was in a Novell shop, so yeah, I'm dating myself right there. But I was a CNE back in the day, back when that was, like, the thing, right? And I went to work for Microsoft for a while. I got better. So I was an MCSE and an MCT, a Microsoft Certified Trainer.
So I've been doing this for a long time. We spend a lot of time focusing on security in organizations like banks and manufacturing companies, e-commerce companies and things like that. I've helped Red Hat customers for over a decade now working on these issues, working on security issues. And as I mentioned earlier, man, I totally recognize I don't know everything. I know a lot. I've been exposed to a lot. But again, it's hard to know everything. So kind of in a nutshell, I'm just a big old nerd. So let's talk about where Red Hat fits into the container ecosystem. We have actually been working on containerization since before 2010. We made the Makara acquisition. We bought a company that was doing platform as a service back in 2010. And they were using Linux control groups and Security-Enhanced Linux for process isolation. But honestly, this was really before containers as we think of them today were really a big thing. But we've been doing this for a while. We're fairly familiar with... Well, I would say we're expert, because I talk to those guys sometimes and I, again, walk away feeling like an idiot. So we bought Makara. We rebranded that as OpenShift. And so now you may have seen that Red Hat has a platform as a service offering called OpenShift. There are three versions. There's OpenShift Origin, which is the upstream open source. Anyone can download it, compile it, play with it, do whatever. It's kind of the Fedora-land version of OpenShift. Then there is OpenShift Online, which Red Hat has been offering as a service for several years now. You can go to openshift.redhat.com and you go through a really simple web UI and say, I need this much capacity and I want this application framework and maybe this pre-canned application, and you can have your application up just like that. And then we have OpenShift Enterprise, which is for deployment on a customer's premises behind the firewall.
Now, originally OpenShift used what we called cartridges, which were units of compute that were basically segregated by Security-Enhanced Linux, by Linux control groups and by kernel namespaces. In about 2013 Docker really started to hit its stride, started to do really well, and we realized it really makes sense for us to change away from what we had been doing to what Docker was doing. So we adopted Docker as a technology in our PaaS offering back in 2013. We are a top contributor to Docker. Last time I checked, and this was back in October, we were the number two contributor to upstream Docker, behind only Docker itself. So we've got a fair amount of experience there. And as you know if you've been to any cloud track at the conference, at SCALE, the adoption of Docker is incredible. They have done a phenomenal job. It's good technology, good community. The company itself is doing real well. They've been through multiple successful VC rounds, and the laundry list of people who have thrown in behind Docker is amazing. Red Hat is grateful to be on that list as well. And even Microsoft. I was in the Microsoft session, the Microsoft Azure session, yesterday, and they were like, yeah, we're going to be doing Docker containerization on Windows Server 2012. First they ignore you, then they laugh at you, then they fight you, then you win. So... I'm sorry? I'd like to see them try. I'd like to see them try. The comment was, expand and make it proprietary. So let's talk a little bit about what containers are. Now, I'll be talking specifically about Docker, but these concepts should be fairly universal no matter what you're using, whether you're using LXD from Ubuntu or whatever. But basically, containerization is a technology that allows for applications, whatever those apps are, whether it's web or database or app server, to be abstracted from, and in some cases isolated from, the underlying operating system.
In the case of Docker, the service can launch containers regardless of the underlying Linux distro, and containers can absolutely enable some amazing application density. Since you don't have the overhead of virtualization, where you've got the full operating system with the same set of libraries getting loaded for every VM, it's really, really lightweight. And when you factor in the architecture, that it's just the bits that you need for the application, and you also take into account the capabilities of Linux control groups, where you can have very fine-grained control over how many resources each application or each container is going to get, you can get some phenomenal, phenomenal utilization. And I joke, the same container can run on different versions of Linux. Ubuntu can run on Fedora and CentOS can run on RHEL. Dogs and cats, human sacrifice, mass hysteria. But it's really cool. I'm a Red Hat guy. I've been with Red Hat for a long time. It's just what I'm most comfortable with. I can't tell you how many times I've spun up a container and it's like, oh yeah, the underlying technology on this one was obviously built on Ubuntu. That's cool. I run it on my Fedora box or on my RHEL box and it's like, eh, no big deal. So it's really, really an awesome technology for application developers. So how do they work? No, maybe not. Containers make it easy for developers to build and deploy apps. It's not really that mass hysteria that we were talking about earlier. So the question then becomes, well, what are containers not? Because one of the things that does concern me a little bit, and frankly frustrates me a little bit, is this, like, "We're going to containerize everything and everything's going to be awesome. It's going to be wonderful and it's going to solve all the problems in IT." Maybe not. Maybe not a panacea. They are not the cure for all that ails you. And containers are not fit for every application. Not yet. Maybe not ever. I don't know.
I never say never. Except when I say I never say never. Because you never know, right? I mean, somebody's going to come up with something that's like some huge, massive, awesome container. What we really hope for when we're kind of thinking about what this is, what we really hope for is for the big software vendors, the big enterprise software vendors, third party shops like SAP and Oracle and fill in the blank. Wouldn't it be awesome if instead of getting a DVD ISO image that's got a freaking install.sh that does all kinds of crazy, non-packaged, weird stuff all over your file system that you don't know what it is, wouldn't it be cool instead if they just went, here's a container. Go. That would make life so much easier. Now, will we ever get there? I don't know. I hope so. I hope so. Most importantly, containers are not virtualization. I think that in a very, very simplified way they are the next logical step in the whole physical, virtual and high density move, but they're not virtualization. You can run containers on bare metal, on an OS directly on bare metal, or you can run it in VMs. Virtualization is most often a component of an environment that has containers, but the two are actually separated. So let's talk about container security. There are multiple layers that are involved in securing your container environment. You know, it's not a simple environment. When you think about all the different things that you want to do to harden your system, the system that's running the containers, the containers themselves, the networking stack in front of the container, how we do network address translation from real public IPs to the internal private network that containers use, there are a whole lot of moving parts there and there's a lot of opportunity for friction between those moving parts. So let's talk about what all of those little components are and maybe get a good handle on how we can reduce that friction. So containers use several mechanisms for security. 
Linux kernel namespaces, which are a really fascinating set of technologies, and I'll show you some examples of that in a little while. Linux control groups, for that fine-grained resource control: how much memory, how much CPU, how much I/O, things like that. The Docker daemon itself, when you're using Docker as your containerization platform. A fairly infrequently discussed feature of container security is Linux capabilities, or libcap. I'll talk a little bit about libcap. It's a pretty cool way of saying, I'm going to limit the capabilities that a process has even if it's running in the context of root, and I'll show you some examples of that. And then Linux security mechanisms like SELinux or AppArmor. Now, I'm an SELinux guy, so for me, I will talk about SELinux, but any sort of pluggable security model like AppArmor or grsecurity or anything like that could be used. Now I'm going to break out of this for just a second. How many folks disable SELinux in their environments? Bad users, bad, bad. Go to YouTube, look up "SELinux for Mere Mortals." It's a presentation that I did at Red Hat Summit. It's 45 or 50 minutes long. If you watch that presentation and still shut off SELinux, I'm going to have to smack you. I'm sorry, I love you. I do it out of love, but go watch "SELinux for Mere Mortals." It's a one-hour overview of SELinux and... do you have a comment? Yeah, I've actually done it for, like, three years. It's been the top rated session at Summit for the last three years. So, I mean, at the risk of thumping on my chest too terribly much, it's a pretty good presentation. So SELinux is, to be clear, an integral and very important part of container security when you start doing massive scale. When you start doing platform as a service, where you're running this stuff across really large environments, don't turn off SELinux. It's really important.
So like I said, I was having conversations with folks about container security, and it was always just like, "Oh yeah, we do kernel namespacing and that helps with security," and then the conversation would go on. And then one day somebody was asking really specific questions, and I was like, how do I describe what kernel namespaces do? And so I came up with a couple of ideas. So let's talk about what they are first. Namespaces are just a way to make a global resource appear to be unique and isolated. The namespaces that the Linux kernel can manage include mount namespaces, process ID (PID) namespaces, UNIX Time-sharing System (UTS) namespaces, inter-process communication (IPC) namespaces, network namespaces and user namespaces. I've seen several folks taking pictures of slides. You're absolutely welcome to do that, but these slides are posted on the SCALE website. You can get them there, or you can go to my People page, people.redhat.com/tcameron, and you can download them from there. So those are the different namespaces which can be managed by the kernel, and let's talk about what that means. So the first one I want to talk about is mount namespaces. Mount namespaces really just allow a container to think that a directory which is actually mounted from the host OS, an existing directory on the host OS, is the exclusive domain of what's going on in the container. The container is not aware that that's actually a file system from outside of that container space, and you could potentially mount it within the container read-write, with root privileges and so on.
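The namespace types listed above can be seen on a Linux host under /proc/<pid>/ns, where each entry is a symlink whose target names the namespace type and an inode number. What follows is a minimal Python sketch, not something from the talk's slides; the helper names parse_ns_link, list_namespaces and shared_namespaces are made up for illustration. Two processes are in the same namespace when the inode numbers match, so comparing a containerized process with a host shell should show which of mnt, pid, uts, ipc, net and user differ.

```python
import os
import re

# Symlink targets under /proc/<pid>/ns look like "mnt:[4026531840]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("not a namespace link: %r" % target)
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(pid="self"):
    """Return {entry name: inode} for one process (Linux only)."""
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def shared_namespaces(a, b):
    """Names of the namespaces that two {name: inode} maps have in common."""
    return sorted(k for k in a if k in b and a[k] == b[k])


if __name__ == "__main__":
    # Print the namespaces of the current process.
    for name, inode in list_namespaces().items():
        print(name, inode)
```

Reading the entries of another user's process generally needs elevated privileges, so on the host this would typically be run as root against the PID of the containerized process.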
When you start a container with -v, then the host path (the actual directory), and then where you want it to be mounted inside of the container, so -v /var/www/html:/var/www/html for instance, you can spin the container up so that it thinks, "Oh yeah, that's mine, no one else has it, and I'm not even aware of anything else." So then it sees that directory in its own namespace. And the cool thing about it is that there's no exclusivity: you can do a mount namespace of one physical directory across multiple containers, and that works really, really well. So as an example of that... can y'all see that okay? Okay. So as an example of that, I'm logged on to my laptop, my T540, and I cat /var/www/html/index.html, and I see it's a silly web page. That's the actual file that exists on my file system. So I do docker run -v, and then I do /var/www/html, and I'm going to mount that within the container on /var/www/html, and I'm going to fire up a Fedora container and execute the command bash. And then you notice that my prompt changes from being root on my T540 to being root within my container, and if I cat that same file, I see that it's there. In the context of this container, the container is not aware that that's actually not a native file system. I can run that container... in fact, I did run that container from within root's home directory. Not necessarily a good idea, running stuff as root, but I did it just for the sake of example. But my point is, it abstracts and isolates what's really there and tells the container, "Nope, it's all you, it's all yours. Don't worry about it. Don't worry about anyone else on the system." So I have a pretty good understanding of this, but what would be a security advantage of doing that? This is the audience participation point. Why would that be a good thing from a security standpoint? It can access your file system, let's be real clear. If I had mounted that RW, it can access your file system,
but one of the beautiful things about it... I'm sorry, the comment was, "Well, it can't access your file system," and actually it can, but you have control. You can say, "Nope, read-only, you're not going to be able to modify it." Yes sir? Exactly. You have jailed that container to where you're going to give it access to one specific directory. It can't go up the file system; to the container, that's the root of the file system. So from a security standpoint, it's a heck of a lot better doing it this way than trying to do all kinds of crazy bind mounts or weird stuff like that, or chrooting. Chrooting is fine, don't get me wrong, I like chrooting, but this is really strictly locked down. Okay, so process ID namespaces. A PID namespace really just lets the container think that it is a completely new instance of the operating system. So when you start a container on a host, it's going to get a new process ID on that host. PID namespaces mean that the container thinks that it's in its own process tree, and whatever the command is that you started that thing up with is going to be process ID 1, which is a little bit weird, because we're all used to process ID 1 being, you know, init or something like that. But in this case, I'm going to launch a Fedora container running bash, and I'm going to run the ps ax command. So I do docker run -it fedora and execute the command bash. Process ID 1 inside of the container... it thinks, "I am alone, I am my own operating system, I know nothing about anything else." But my process ID 1 is bash, which we all know is actually... you know, you couldn't boot a system that way. But again, from a security standpoint, how awesome is this? That container doesn't see squat on the rest of the system. It's like, "I'm all by myself, and I'm safe and secure." Now, over on the host, though, when I run a ps ax (I actually did ps axf), you can see that my docker command is actually process ID 18557. That's an example of that namespace. That's an example of the Linux
kernel going, "No, no, really, that's process ID 1. You're all by yourself, you cute little thing. You're all by yourself and you're secure," whereas on the system you may have dozens or hundreds or even thousands of Docker instances. So does everyone understand why that's a really good thing from a security standpoint? The container is not even aware that there are other processes running on the system. So that's a beautiful thing from a security standpoint. It's totally isolated. You can't easily, anyway, do, like, a buffer overflow of another process, because you don't even know that the process is there. User namespaces. When you start a container, assuming you've added your user to the docker group, you start it as your user account. Now, I did an example earlier as root. Don't do that. Do it as your user. I blew my machine away, put a fresh install on to do the demos, and was stupid and forgot to put a user account on it. But so you start Docker as your user account, and in the following example I'm going to start the container as tcameron, because I realized halfway through building my slide deck, I was like, "Oh, I shouldn't be doing this as root," and I added the account. But once a container is started, the user inside the container... I'm sitting there logged into my console. My ID is tcameron, I'm user ID 1000, I'm a regular user, no root privileges or anything like that. I run docker run -it fedora with a command of bash. I'm still on the same machine, still on the same console, but when I run id inside of the container, I'm omnipotent, I am root. Now, the cool thing about this is that means that within your container you can do whatever you need to do, but that's only within your container. If you try to go, "Oh, I got root on the system," and try to do something else, you're not even aware inside of the container that there are other things going on outside of your container. So again, from a security standpoint, there's a bright line between what you can do even with root privileges inside of your container. You can't affect the underlying host.
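The PID mapping described in this part of the talk (process ID 1 inside the container, an ordinary process ID on the host) can be observed from the host in the NSpid line of /proc/<pid>/status, which kernels 4.1 and later report. The following is a hedged Python sketch rather than anything shown in the talk; the function names and the sample status text are illustrative assumptions, and the sample simply reuses the host PID mentioned above.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line of /proc/<pid>/status.

    The values run from the PID namespace of the procfs mount (normally
    the host) to the innermost namespace, so for a containerized process
    the first value is the host PID and the last is the PID the process
    sees for itself. Returns an empty list if there is no NSpid line.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def read_nspid(pid):
    """Read and parse NSpid for a live process (Linux only)."""
    with open("/proc/%s/status" % pid) as handle:
        return parse_nspid(handle.read())


# Hypothetical excerpt of a status file as seen from the host.
SAMPLE = "Name:\tbash\nPid:\t18557\nNSpid:\t18557\t1\n"

if __name__ == "__main__":
    host_pid, container_pid = parse_nspid(SAMPLE)
    print("host PID %d is PID %d inside the container" % (host_pid, container_pid))
```

On a real host, read_nspid would be called with the PID that ps axf shows for the process running inside the container.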
Yes sir? What happens if you... You shouldn't be able to, because... like, someone out here will go, "Well, actually," but remember that even though you have root privileges... Oh, I'm sorry, repeat the question. Thank you. The question was, what happens if you trigger a kernel crash within the container? And the reality is, because we're using namespaces, because we've isolated the container, this root account doesn't actually have root privileges on the underlying host. You're not going to have, for instance, access to the sysrq-trigger proc file system entry to trigger a crash. So at most you would be able to crash the application within the container, which is kind of a "restart the container" thing. Does that make sense? Yes sir? Yes. So the question is, if I'm root inside of the container, what would happen if I ran a kernel exploit that I found and tried to... Well, here's the beautiful thing: you're not really root in the context of the container. So in many cases, not all, a lot of the things that you can get, like "if you have local access you can crash the kernel, blah blah blah," that doesn't necessarily work for user space. In other words, a nonprivileged user wouldn't be able to do it. Now, if you did find an exploit that was user space or user privilege that can crash the kernel, if you can execute it and it does work from user space, it'll probably take it down, at which point you've got bigger problems. So the question is, will it take down the container process or will it affect the host? Depends on the exploit. If it's actually, truly, like, a kernel exploit that's going to cause a panic or something like that, you're going to take the host down. Here's the thing: if they have access to your system, whether it's physical access or shell access to the system, it's going to greatly increase your security vulnerability surface area, because then they can do fork bombs and they can do all kinds of stupid stuff. That's bad stuff. The best
answer there, unfortunately, is keep your systems up to date. Really pay attention to the security mailing lists. Keep your systems up to date. There's another question over here? I'll talk about that in a little while. The question was, can a container be constrained so that one bad actor can't overwhelm the system? And the answer is yes, absolutely, and I'll talk about that in a minute. So again, security implications of user namespacing: the main thing is just isolation. Even though you give somebody a virtual root environment where they have control in their container, it keeps you from giving those escalated privileges to somebody that's got system-wide access. Network namespaces, similar concept. Network namespaces allow a container to have its own IP address independent of the host, and these addresses are not available from outside of the host. So in this case it's private networking, very similar to what you get with libvirt or other network... I'm sorry, other virtualization. But in this case, the Docker service is smart enough to set up an iptables masquerading rule so the container can get to the rest of the internet. So in this example, what I'm going to do is spin up a container and run ifconfig, or in this case ip a show, and you'll see that my physical host... and I did this a little bit backwards. So if I use docker inspect and I query for the network settings IP address, then I get that it's a 172.17.0 address, but my machine, my laptop, wasn't even hooked up to Ethernet. Like, when I do ip address show on my Ethernet interface, there's no IP address at all. So this is a totally separate networking namespace. It's isolated from the external network. In this case, I don't even have an external network, but that's just a virtualized private network that's behind what will eventually be an iptables masquerade rule so that it can get out to the network. Yes sir? The question is, can you do 802.1Q trunking within the network namespace? I have read
that there is work being done on it. I don't know if it's complete. But what would the benefit of that be? Like, what are you looking to accomplish? Yeah, I mean, that's a fair point. If you do have that, then you can directly address your containers, and then you don't need Docker to do all that stuff. But my question to that would be... Docker has got an entire large community around developing that networking code and the IP masquerading code, and they're actually pretty good at what they do. I think I'm pretty good at what I do, but I guarantee you I'm not nearly as good as all of that community. So, yes sir? Okay, so the comment is, in a very large scale environment where you're using, you said, VPN concentrators in containers, OpenVPN endpoints, you start running into limitations with iptables. Okay, that's a fair point. That's a fair point. Yeah, Docker may not be the best solution for that. Yeah, that's fair. So the comment is, you don't have to use network namespacing; you can actually directly address the Docker containers. Yeah, that's fair. That's fair. Was there another question? I don't know. I don't know. Send me an email at thomas@redhat.com. Yeah, I've been there that long. Send me an email at thomas@redhat.com and I'll find out. I just... I haven't heard. The next big thing, yeah. So the question was, is any work being done to allow Docker to use nftables instead of iptables? And unfortunately, I just haven't heard one way or the other. So yeah, by the way, if you have questions, I'm thomas@redhat.com. I'm really easy to reach. All right, so security implications of network namespacing. It should be relatively obvious: we can segregate those containers, we can keep them off of the network, and we control ingress and egress rules through iptables, so that it's isolated and we know for sure what's going to get in and what's going to get out. So, IPC namespaces. IPC namespaces are really the same thing, but with inter-process communications. So my
container doesn't have any IPCs mapped, because, well, because I just spun it up, right? If I spin up my container and I do ipcs, there's nothing going on. There's no inter-process communication, because all this is doing is running bash. It doesn't have any applications that are talking to each other. But on my host, when I run ipcs, you can see I've got, you know, zillions of processes that are communicating with each other. But again, that container (I'll go back a page), that container thinks it's all by its lonesome, running on its own bare metal. It doesn't have any knowledge of all of those IPCs that are running on the host. So from a security standpoint, the implications there are that you don't even have inter-process communications exposed to that container. So if a bad guy does get control of that container, they don't have the ability to go and start trying to do attacks or anything like that. So again, isolation and segregation is not a bad thing. UTS namespaces, or UNIX Time-sharing System namespaces, again let the container think it's its own separate OS with its own hostname and its own domain name. So on my host, on my laptop, when I run the hostname command, I get my fully qualified domain name, t540p.tc.redhat.com. But then when I fire up my container and I run hostname on that same machine, it's got a randomly generated string for a hostname. So again, the container thinks that it's its own machine. It is not aware that it's actually a container running on a host. So again, from an isolation standpoint, if somebody does break into it, they don't get any identifying information about what the underlying host is. Like, you don't want to say the hostname is hostXYZ.mycorp.com, because then they're like, "Now I know where to go attack," right? So, isolation and security through keeping it segregated from the host operating system. So that is namespacing. Those are the capabilities that the kernel has to segregate out and to isolate what's going on inside of a
container process from the underlying operating system. So let's talk about a different capability, which is control groups. Control groups really just provide a mechanism for aggregating or partitioning sets of tasks, and all of their children, into hierarchical groups with specialized behavior. Really, this allows for various system resources to be bundled into a group, and I can apply limits to that group. So disk I/O, CPU usage, memory usage, network use, all that kind of stuff are contained in one address space, or one process space. And the cool thing about using control groups is, everybody thinks in terms of, "Oh, well, if I use Linux control groups, I can make sure that the one badly behaved neighbor doesn't take over the whole system," and that's absolutely true. That's certainly a very common use for control groups: to make sure that you don't get some bad person that's doing something silly like setting up fork bombs inside of your container and taking the whole machine down. But the flip side is also true, because we use control groups for containerization, we also use control groups for virtualization and all kinds of stuff. What that allows you to do is use your hardware to the absolute maximum. So if I know for sure that I've got a fixed amount of memory, let's say 16 gigs of memory or 32 gigs of memory, whatever, and I know that when I spin up whatever it is, VMs or containers or whatever, what I'm going to be allocating to the control groups around those, then I know the specific number of containers that I can run on my machine. So instead of this kind of, like, "We're going to throw some more VMs on, or we're going to throw some more containers on, and we're going to watch it and see if the machine is okay," no, in this case I can divide up my system into exactly the number of containers that I know it will support. It allows me to do forecasting, it allows me to do scheduling, it allows me to know when I'm going to run out
of capacity, when I need to buy new servers. So control groups are pretty awesome. And again, they ensure that if a container is compromised, or just has poorly written code, like somebody does something that gets into a race condition or something like that, there are limits in place which minimize the risk that that misbehaving container is going to hurt the rest of the host. Notice that when I run the command systemctl status docker.service, I get the control group and slice information. So you can see that when I've got a container running on my machine, it's actually got its own docker.service system slice. Now, by default, when you spin this stuff up, we do put it in its own control group, but we don't put any limits on it. Doing control groups, like actually manipulating control groups, I could spend, like, eight hours on that. It's a fairly complex topic, so I'm not going to get into all of the nuts and bolts. The nitty-gritty of it is that, at least on Red Hat based systems, we automatically assign each instance to its own control group, and if you do start needing to get really restrictive on how they're going to access the system, you can go in and put limits in place. You can go and look in the /sys/fs/cgroup pseudo directory to see what the resources are that are allocated to your containers. Now, there are, like, 8,500 entries in that directory, so again, unfortunately, it's kind of not practical for me to try to go into each of them, but you can get information about memory, CPU, block I/O, network I/O and so on in there. So if I go and I look in /sys/fs/cgroup and I do a find | wc -l, there's, like, 8,500, almost 8,600 of them. So the next component in the container strategy that I want to talk about is the Docker daemon itself. The Docker daemon is really responsible for managing control groups and orchestrating those namespaces and all of those other things that I've talked about, so that the Docker images can be run and secured.
Because of the need to manage kernel functions, Docker itself runs with root privileges, and that's fine. And by that I mean the Docker daemon; it runs with root privileges. Be aware of that. But it's a pretty secure environment. No code is perfect. There will be an exploit at some point; it's just the nature of the war between bad guys and good guys. But it's pretty safe. So there are a couple of considerations when you're running Docker. Obviously, you don't want to allow access to your system to someone that you don't trust. You want to make sure that you've got some sort of way to vet the folks who are going to be spinning up containers in your environment. The documentation recommends that you add users to the docker group so that they can run the docker commands, but with that flexibility does come some risk. Make sure that you only delegate this ability to trusted users, and remember that they can mount host file systems in their container, potentially with root privileges. Also, everything that I've talked about, and everything that Docker does, can also be done via the REST API. So I really recommend that you keep updated versions of your Docker environment, your management environment, the Docker daemon, and so on. I can't stress this enough: almost every single exploit that you see that makes the news, that everyone's like, oh my gosh, almost all of that is from two sources. Someone being stupid and plugging in a USB drive at Target that they found in the parking lot, or, more often, somebody taking advantage of outdated and insecure code that's network facing. The one about the guy picking that USB drive up and plugging it into the computer, we can't help that. I swear, who's the comedian who says you can't fix stupid? I kind of think that's just going to be with us forever. But if you're running an environment, especially an environment that is internet facing, you have got to keep up with your security updates. This is admin 101. If you are going to
expose the REST API over HTTP, please don't expose it except to secured networks or VPNs, unless you really, really know what you're doing and you're really staying on top of security. Now, I get that in some places, if you're doing like a public service offering where people can spin up containers and you want to give them API access, you might need to expose it to a public-facing network. Make sure you do it with SSL, make sure you have authentication mechanisms in place, et cetera, et cetera, et cetera. Don't just leave it wide open.
So Linux kernel capabilities are really pretty cool. And even I, and I've been working at Red Hat for over a decade, even I, when I started digging into this, was like, oh wow, I didn't realize that this was a thing, didn't realize this is how this worked. So historically, if I have root access to a machine, if I am logged in as root, I am omnipotent. I can do anything that I want to. And what are we granting within containers? Root privileges. So you feel a little uncomfortable about that. But hey, it's in the context of their container; they're not going to do anything stupid, right? But Linux capabilities are a set of very fine-grained controls that allow services or users with root equivalence to be limited in their scope. They also allow non-root users to be granted extra privileges. So you can do things like have a regular user, and you can grant them the CAP_NET_BIND_SERVICE capability, and they'd be able to bind a service in their container, for instance, to a privileged port. So Linux capabilities are pretty cool. They basically allow you to cheat, to grant more or fewer privileges than you would normally be able to grant. Now, in containers, many of the capabilities to manage network and other services are not really needed. SSH services, cron services, file system mounts and unmounts: not really needed, because those get handled by the Docker service. Network management is not needed, and so on. By default, Docker disallows a lot of those root privileges, which
is a good thing, including the ability to modify logs, change networking, modify kernel memory, and the catch-all capability of system administration, CAP_SYS_ADMIN. And if you go and read up on the documentation of Linux capabilities, and I'm sorry, this is an eye chart, I'm sure you can't really see it, but I went and looked under the Docker GitHub page. You go to docker, daemon, execdriver, I can't even read that, native, template, and look at the default template for Linux capabilities. You can see that really only a very small subset of the capabilities that root normally has are passed through to those Linux containers. And the result of that is, even though we're granting users root privileges inside of their Docker container, and we can absolutely go, here's the gun, there's your foot, knock yourself out, they're probably only going to shoot themselves in the foot. Because we are limiting the capabilities that that pseudo-root user has inside of their container, they're not as likely to be able to damage the system.
And then finally, or not finally, but in addition, one of my favorite topics. Like I said, I present on SELinux every year at Red Hat Summit; really, any conference that will have me, I will evangelize SELinux. I want to talk about what SELinux does. So Security-Enhanced Linux is an example of a mandatory access control system. There are other ones out there; it just so happens that this is the one with which I'm most familiar, and that's what we've adopted at Red Hat. But basically, processes, files, memory, network interfaces, memory addresses, ports on the network, and so on are labeled by the kernel, and there is a policy which is administratively set and fixed. If you look under the /etc/selinux directory, there's all kinds of cool information there about what the policy is composed of, any changes to the policy, and so on. But basically, that policy determines how processes can interact with files, with other processes, with network ports, the kernel,
and so on. So essentially what happens is a policy is built. We have a default policy on Red Hat Enterprise Linux and Fedora and CentOS that is kind of our best guess as to what makes the most sense in the most common use environments. But basically, what we care about from an average user perspective is that SELinux works with labeling and type enforcement, and let me explain how that works. If I've got this mythical service, the foo service, and I look at the executable file on disk using ls -Z, which shows me the SELinux context, I might see that it's labeled on the file system like foo_exec_t. That says to SELinux, this is an executable in the domain of foo, and the _t means it's a type. So the label is foo_exec_t. The startup scripts might have the label foo_config_t. The log files might be foo_log_t. The data may have foo_data_t. And then when you fire up the process, we're no longer looking on the file system, we're actually looking at processes in memory. I do ps -Z, -Z being the common argument for SELinux, and running in memory it might have the label foo_t, or foo type. So those are labels. Those labels are typically defined either within the default policy, or by an application developer who, if they understand SELinux, and I hope that they do, will do all of the labeling. So labeling is just how we identify processes in memory and files on the file system. In memory, they are managed by the kernel; on the file system, they are stored as extended attributes. So that's labeling. Type enforcement is, now that we know all these labels, these types, the rule that says when a process running in the foo_t context tries to access a file on the file system with the foo_config_t or foo_data_t label, well, that makes sense. You want the foo type process to be able to access its configs and its data. Also, I
should say, when it tries to access foo_log_t, that works as well. But any other access, unless it's explicitly granted by that policy that's stored under /etc/selinux, is going to be denied. So think about it. If I've got my foo process that's running with the foo_t label, and that process tries to access, let's say, a file under /etc that's got the label shadow_t, raise your hand if you think it's a good idea to grant that access. We will all laugh at you, right? It makes sense. It's actually fairly straightforward. SELinux is not nearly as complicated as people think it is. It's all about what process is running, in what context, with what labels, and what they have access to. So if the foo process tries to access, for instance, the directory /home/tcameron, with the label user_home_dir_t, even if the permissions are wide open, the policy will stop that access. Even if I chmod 777 /home/tcameron, that foo process running with the foo_t label will not be able to access that home directory. SELinux labels, like I said, are stored as extended attributes on the file system, or in memory. The labels are stored in the format of the SELinux user, the SELinux role, and then the SELinux type, and then multi-level and multi-category security. So for the mythical foo service, the full syntax for the label would be user_u:object_r:foo_t:s0:c0, and I'll show you where this comes in in just a second. The default policy for SELinux is the targeted policy. In this policy, we don't use the SELinux user or role; that's for multi-level security, like in government organizations, so we'll ignore those. We really only care about the type and the MCS label. Think of the MCS label as extra identifiers, right? It's kind of like port numbers. We know that the address for a host is never going to change, but the port number for incoming connections may change,
like 80 for web or 443 for web and 25 for mail and so on. MCS is just extra identifiers for SELinux. So in SELinux for containers, we can be very granular about what process can access which other process or which part of the file system. So to be real clear, these are two totally separate labels. Even though they're both user_u:object_r:foo_t, this one has s0:c0 and this one has s0:c1. As far as SELinux is concerned, they are completely different. So type enforcement says that a process with the first label is different from a process with the second label, so policy would prevent them from interacting. Also, there's no policy allowing a process running with those labels to access, for instance, a file system unless it's labeled foo_config_t or foo_content_t or another defined label. Neither of those processes, for instance, would be able to access /etc/shadow or anything like that. Now, on a standalone system running Docker, all of the containers run in the same context by default. But for instance, in our PaaS offering, OpenShift, we actually make each instance run with its own SELinux labels, so even if somebody were able to gain access to the process running a Docker container, SELinux would still prevent them from attacking another container on the machine. So it works really well.
So let me show you an example. I'm going to emulate an exploit where someone takes over a container. I'm going to use runcon, which says run in the context of, to change my context to that of an OpenShift container, and then I'm going to try to access /etc/shadow, I'm going to try to write to the file system, and so on. So what happens is I do an id, and I'm root, okay? When I do id -Z, you can see that I'm running unconfined, in a specific context. Then I'm going to take on the SELinux context of a Docker container. So I use runcon and I change my context: I change the type, and I also change my MLS and MCS labels. And what's funny is, as soon as I run the runcon command and I change SELinux context, it
even comes back, before I even get a full shell going, and I can't access .bashrc. Even though I am root, I've still got the root prompt, when I do cat /etc/shadow, or when I try to touch a file on the file system: not allowed. I try to just do a listing of the /home/tcameron directory, and I can't see that either. I'm totally blocked off from doing any of that. And I think to myself, well, I'm going to be really smart and I'm going to turn off SELinux. So I do setenforce 0, and: not allowed. So even though I just changed my SELinux context, I didn't log out, I didn't change user IDs, I didn't do anything like that, I just took on a different SELinux context, it blocked me, and I couldn't do anything even though I was still logged in with root privileges. So SELinux is incredibly powerful. I'm not going to say it's trivial to learn. It's not; I mean, you've got to put a little bit of brain sweat into it. But seriously, go watch the SELinux for Mere Mortals video. It's one hour. I think I encapsulate a lot of the basics of SELinux and talk you through how to get it set up and how to fix things and change things with SELinux. But it is incredibly powerful. I mean, to see somebody who is actively logged into the console as root not be able to even change a file on the file system, that's pretty impressive.
So let's talk about a couple of tips and tricks. Containers, really, at the end of the day, are just processes running on the host. That means we, as system administrators and systems engineers, have to use that exceedingly rare thing known as common sense. If you're running something on your host, just because it's containerized, remember, we don't deal in snake oil here. It's not a cure for every disease. You still have to use common sense. Do have a process in place to update your containers, and follow it. I cannot tell you how many times I've had conversations with folks that are like, yeah, our developer created this really cool PHP container and threw it over to us and we put it out
there in production. I'm like, really? When is the last time you upgraded it? You can't fire and forget. Do run the services in the containers with the lowest privilege possible. Drop root privileges as soon as you can. Don't allow root access if you can avoid it. Mount the file systems from the host read-only wherever possible. Sometimes that's not possible; you want to be able to write log files and things like that. I totally get that. Make sure you're smart in where you grant access to write those log files. Where possible, mount read-only. Treat root inside the container just like you would treat root on the host. Even though I've talked through how we do segregate and isolate that root account from the host OS, I'm a belt-and-suspenders kind of guy when it comes to security. I want to have as many barriers as I can to somebody doing something bad to me and making me stand in front of my boss. I don't like standing in front of my boss when something's gone wrong. Just call me a wimp; I don't want to do it. And seriously, use some sort of log monitoring capability. I don't care if that means you read the daily emails from the root cron job, or if you have something really sophisticated in place that's going to do data mining. Watch your logs. Don't just download Bill and Ted's Excellent Container that you found on the internet from some site in Romania. Yes, I've seen that happen. Don't run SSH inside of your container. I have seen people do this: well, I'm going to go ahead and build this web service, and I'm going to go ahead and put sshd in there as well, so I don't have to mess with the admin. Don't do that, because that's one more increase in the surface area for attack. It's one more opportunity for you to forget to update. Don't do it. Don't run with root privileges. Don't disable SELinux. Don't roll your own containers once and then never maintain them. It's easy to do, guys, I know as well as you do, right? Sometimes you've got a big fire going and you're doing your best and you're busting your butt and
you spin something up and you put it together and you throw it out there. It was an emergency, so you went through an exception and you didn't go through the normal dev and QA and prod type of promotion. And the fire is now out, and now you have $DAYJOB, which is taking up all your time, and you think, I'm going to get to that, I'm going to get to, there was something that I was supposed to do, but now I've got my job to do. Don't let that happen to you. And don't run production containers on unsupported platforms. If you run your business on something, my opinion is you should have that big red oh-no button. So make sure that you're doing it in a supportable and supported configuration.
So really, in conclusion, I hope that you will go forth and contain stuff. Containerization is awesome technology. I am really excited. It is an incredibly fast-moving, super disruptive, in a very positive way, technology for IT. Just like you, I've had to bend my brain around some concepts that I wasn't familiar with in the past. You heard me, I came from a Novell environment; that tells you what my background is. I'm a dinosaur up here, and I'm having to learn new stuff, but it's awesome stuff and it's exciting stuff. Containers make application deployment super easy. They leverage some incredible capabilities within the Linux kernel, and by design they are relatively secure. Obviously, just like any other technology out there, there are some gotchas. As with every other piece of software out there, Docker tech requires some feeding and maintenance. Well-maintained containers can make your business more agile, less complex, and, if you do it right, safe. Any questions?
So if I heard you correctly, the question was, what if we're running our own Docker registry, are there any security concerns there? Have a good CI/CD environment in place. Have notifications when the upstream projects that you're building the containers out of rev. Like, if there's a security update for whatever the application or
framework that you're using is, pay attention to that. Preferably, when upstream revs, have some process in place that's going to suck in that source and start your CI/CD environment, notify you when a container needs to be upgraded, and then you're going to push the upgrade out. If you've done it correctly, and you're doing the mounts so that the actual data that you're using, or web service content or whatever, lives outside the container, in theory you should be able to kick a new container out with practically zero interruption in application service. Yes, sir?
So, for the Docker uninitiated: as I understand it, because it's a layered file system, let's say I download a base Red Hat image, right, and then I build another container off of that, and Red Hat revs that image for whatever reason. So merely downloading that new image is insufficient? Fair point. What is the next step from that?
At that point, well, there's a number of things that you need to do. Again, you really need to pay attention, and preferably set up some sort of notification, or even more preferably an automated process that's going to go get feeds. There's a number of ways to do it, whether it's RSS or whatever, to get feeds of, oh hey, the underlying Red Hat or Ubuntu or whatever container has been updated. To be real honest with you, you just asked the $64,000 question. I mean, that is something that everyone out there is struggling with. There are a number of ways to address it. There are a lot of startups that are doing it. I can't talk about it yet, but look for some announcements coming soon from Red Hat. So there are folks who are working on it. But yes, you are right, it is layered. You often don't have just a container; there's going to be stuff that it's dependent upon. But that is a complex question, and I don't have a comprehensive answer, especially not one that I can give in the next three minutes. Yes, sir?
I often find myself using cap-add for Docker, and unfortunately
I often end up finding myself just telling it to run in an admin permissive mode, because the particular cap-add permissions I need are not addressable. Is somebody working on that?
I'm sure there is. I have not heard anything about that. So is what you're saying that when we do the capabilities filtering, it's too restrictive?
Well, I'd say, if I actually need to run something in the container on a VPN,
In what? I'm sorry?
On a VPN, and that requires access to not just the network but my routing tables, I can do cap-add NET_ADMIN, but I can't do a cap-add for routing, and so I end up having to do cap-add admin and give it full admin permissions. And I was hoping that somebody was working on adding additional discriminatory permissions to cap-add, because I haven't seen a lot of activity lately.
Is someone? I'm almost positive there is. As to who that is, I don't know off the top of my head. But I will say, and I'll say this to you and anyone else who runs into limitations like that, because that's valid, that's totally valid, I get why you're saying that: open a bug, either with us if you're using ours, or with upstream Docker, or whoever's version you're using. Open a Bugzilla report or a trouble ticket so that we know to do it.
So, was this helpful? Okay, good, good, because this is kind of a new presentation for me, so I want to make sure that it was. So that is fantastic. Guys, I've got to clear the room. I think we're out of time. Yes, we are indeed out of time. Thank you very much for coming and for giving me the opportunity to talk to you.
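
The capability filtering raised in that last question can be inspected from inside a container by reading the CapEff line of /proc/self/status, which holds the effective capability set as a hexadecimal bitmask. The following is a minimal Python sketch under stated assumptions, not code from the talk. The helper names read_effective_mask and decode_capabilities are hypothetical, and the name table is only a subset of the bit numbers defined in the kernel's linux/capability.h header.

```python
from typing import List

# Subset of capability bit numbers from <linux/capability.h>.
# Bits missing from this table are reported generically as "bit_N".
CAPABILITY_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
    25: "CAP_SYS_TIME",
}


def read_effective_mask(status_text: str) -> int:
    """Extract the effective capability mask from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def decode_capabilities(mask: int) -> List[str]:
    """List the names of the capability bits set in mask, lowest bit first."""
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAPABILITY_NAMES.get(bit, f"bit_{bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    # Illustrative status excerpt; on a real host, read /proc/self/status instead.
    sample_status = "Name:\tbash\nCapEff:\t00000000a80425fb\n"
    for name in decode_capabilities(read_effective_mask(sample_status)):
        print(name)
```

The sample mask is only illustrative. In it, bit 10 (CAP_NET_BIND_SERVICE) is set while bit 12 (CAP_NET_ADMIN) and bit 21 (CAP_SYS_ADMIN) are not, which is the kind of reduced root the talk describes. Comparing the decoded list before and after adding a capability shows exactly what was granted.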