where it is just as cold and snowy as it would be in Boston if we were all in the same place. Welcome to the SIG-HONK Ask Us Anything panel at KubeCon North America 2020 Virtual. SIG-HONK is not an official Kubernetes special interest group. We are a hacker crew and a group of friends who have been working on various aspects of Kubernetes security for a long time now, and we're super excited to be here. These questions have been brought to us by the community, and thank you all for submitting them. So thanks for being here with us. Let's introduce ourselves. My name is Ian Coldwater. I come from a DevOps turned pen testing background. I am now the co-chair of Kubernetes SIG Security and the new director of DevSecOps at Twilio. I've been hacking on Kubernetes things for about four years now and I'm super excited to be here.
Hey everyone, my name is Duffie Cooley and my pronouns are he/him. I've been playing with networks, distributed systems and people in this space for quite a while, and I really enjoy making communities like Kubernetes more approachable and certainly more inclusive. So if you have any questions or there's anything I can ever do to help, just reach out. I'm happy to do it. I've recently transitioned to Apple and now I'm doing some of that work there. So I'm really excited to be here.
Oh, my pronouns are they/them, too.
Hi everyone. So my name is Rory McCune. My pronouns are he/him. I'm a principal consultant at NCC Group, where I focus mainly on doing security tests and assessments for customers. And I've been looking at container security now for about four or five years.
And I'm Brad Geesaman. My pronouns are he/him. And I'm the co-founder of Darkbit, a consultancy focused on cloud and Kubernetes security posture assessments. And I've been building and hacking on Kubernetes for almost four years now as well. And yes, Geesaman is my real last name, and it's no coincidence that I was pretty much born to honk.
Speaking of that though, I suppose we should explain what all the honking is about. So the honk thing came from a video game called Untitled Goose Game, which was published by a company called House House about a year ago. In Untitled Goose Game, the protagonist is a mischievous goose who runs around a pastoral English village causing trouble on purpose. This goose kind of sneaks around and takes seemingly innocuous everyday objects and chains them together to exploit them for its own ends. Because that's kind of like the way hackers do things, some of us hackers adopted this goose as a mascot of sorts. And when we talk about honking, a lot of the time we're talking about hacking or just causing problems on purpose.
I have a question for you all. So we've all been working on Kubernetes for a minute. How do you think that the security posture and attack surface has changed with more modern cloud native tech stacks over the old way of doing things?
That's a great question. I think in my mind the old way was when we would rack up a thousand servers or start up a bunch of virtual machines, and then we would leverage tools to deploy some subset of our application to a subset of those nodes. And with that old way, we had all kinds of interesting challenges, like shared libraries and other problems that would be inherent when we were trying to couple different applications onto particular nodes. In the new way, we're dealing with container orchestration. We're dealing with a very similar model in which we're dropping applications onto some subset of those nodes and running them as long-running processes, just like we did before. But the difference, and where some of the benefits of this new security model come into play, is that now we're actually giving each of those running processes its own view of the network, its own view of the file system, its own view of the PID namespace.
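The "own view" idea above can be made concrete on Linux, where each process's namespace memberships appear as symlinks under `/proc/<pid>/ns`, with targets such as `pid:[4026531836]`. The Python sketch below is illustrative only and was not shown in the panel: the helper names are hypothetical, and it simply parses those link targets and reports which namespace types two processes do not share.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "net:[4026531992]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self"):
    """Return {entry_name: inode} for a live process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(ns_dir):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def isolated_kinds(ns_a, ns_b):
    """Namespace entries present in both maps whose inodes differ."""
    return sorted(k for k in ns_a.keys() & ns_b.keys() if ns_a[k] != ns_b[k])
```

On a Linux host, comparing `read_namespaces(1)` with `read_namespaces(<pid of a containerized process>)` (which usually needs root) would typically list entries such as `mnt`, `net` and `pid` as isolated, while a namespace the container shares with the host would not appear.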
And these things mean that we actually have better isolation between processes because we can actually containerize them. This is effectively what containerizing is. It's basically just giving that running process its own isolated view of the world. And so I do think that that has improved security, but it hasn't changed it too much as we look at the way that those processes interact with the kernel.
Yeah, I think for me, one of the big things about moving to Kubernetes and cloud native has been this move to automation. It's been this process of everything being automated and infrastructure as code. And that's kind of got trade-offs. It's got good things for security, and it's got some things that are challenges. So the good things: for me, one of the big ones is repeatability. This idea that I can take this code and I can create 10, 20, 30 environments. And for me as a security tester, it means that I can say, hey, spin me up a new environment just for my test, and I can be confident that my findings are going to be relevant and the same ones as you're going to see in production. So that's great. But the downside is that we have to think about things differently now. So we don't have pet servers anymore. We can't just load all our secrets up manually. We have to work out how we're going to store them and how we're going to manage them securely. And that's one of the newer problems that we have to think about. And also, with everything being in the cloud, we have to think, well, everything's exposed. Everything's one step away from being compromised. So we have to be really careful with how we manage our information and access to things, because it's all out there just sitting on the cloud.
Yeah. And I think there are a lot of benefits.
I mean, the move to API-driven infrastructure and configuration means that there are many more opportunities to bake security improvements into the code that is the source of truth, as opposed to, say, only being able to bolt them on later. And those same APIs make it much easier to query those security settings at scale. You query everything in one place and you can see what the actual answer is. So if a security policy or a misconfiguration is present, it's probably present everywhere. That makes it much easier to fix, because it's all in a central spot and you can have it reliably take effect on the whole fleet. And that brings a lot less overall toil. But I think the challenge is finding that balance of velocity and automation versus complexity and security.
So another question that we got asked is, what do we think developers of upstream and downstream cloud native technology should be thinking about from a security perspective?
So as you all said a minute ago, it's important to note in general for people who are building this kind of infrastructure that what's old is new again. A lot of the problems of the old tech that the shiny cloud native wrappers are wrapped around haven't really quite been solved yet. So keep an eye on those. But also, you can't just assume that everything is the same. So keep an eye on that too. If you're building your first party data center in the cloud, it's probably going to be very expensive and not work exactly the way you think it will. But also in general, when you're looking at the systems that you're building, I think it's really important to look at them from the perspective of an attacker, because as an attacker, I'm going to be looking at your systems, and I'm going to be looking at them in perhaps a different way than you might. Do you actually know what's running in your environment? Do you actually know exactly what your code does?
And if you do know what your code does, are you including the perspective of an attacker persona in your user personas, if you use those? Because when I'm looking at a system, I'm looking at it for things like, where are the holes in this? Where are the trust boundaries? Is there anything that is going to take unexpected input and do anything weird with it? So if you look at the systems that you're building, both upstream and downstream, and think about how an attacker might look at them and what that perspective might look like, that can be really valuable. And threat model accordingly, because having a threat model is really important.
Yeah, I like that. And I will paraphrase another friend who says that one of the ways to think about increasing the security of your application or your environment is to think about what tools you might be leaving behind for those attackers. And one of the ways that I conceptualize this is by thinking about what is packaged inside of a container image. That whole container image is going to ship and it's going to be running and it's going to be exposed to that long-running process on those nodes that we were talking about earlier. And that means that anything that's inside of that container image will be available if somebody were to get a reverse shell on that node. Now, I understand the incredible power of having a debugging tool or bash or scp or any of these amazing tools that you could use to get logs out of a running container, or understand what's happening with a particular process, or debug things. And many people actually will bundle these things into their container image because they might come in handy later. But at the same time, they may not come in handy only for you. They may come in handy for anybody, even people who are working for nefarious purposes. The other thing that I think is useful is thinking about this container orchestration model.
We have to think about what privilege has been granted to that process. Is it running as a privileged container? Can it see all of the PIDs in the host namespace? Is it able to manipulate network interfaces? Is it able to do things like this? There's a ton of sharp edges and configurability in this, especially with regard to things like Kubernetes. What tools have you left behind? What have you left behind for your attackers is a great way to think about this.
Another great question that came up for us was, how do you go about evaluating the attack surface of a cluster and actually attacking it? Where do you even begin?
Yeah, that's a really good one. For me, a lot of what I do focuses on things from a configuration standpoint. I'll first look at the configuration of the external network and how access to the control plane and the nodes is configured. I'll work my way in and I'll look at the configuration of the surrounding cloud environment. Say, how the cluster is situated and what access it has to other things. Then I'm thinking about the API server and just ruling out the obvious RBAC flaws, like anonymous access to secrets, or whether there are any applicable CVEs and things like that. Then, stepping in further through the layers of the onion, I look at the configuration of the Kubernetes components, the cloud credentials that are attached to the worker nodes, and then, as a unit, the combination of RBAC, admission control, network access control, and container image contents, all as one thing, to say, what's the overall security posture of that given workload? And then I'll step back and work through some what-if scenarios for the common types of attacks against the cluster from the outside, or what services are exposed, and then take what happens if this container inside the cluster is compromised and look at it from that perspective.
Yeah, I think that ties in a lot with what we end up doing.
I think when we end up doing security assessments for customers, what we typically do is take one of two approaches. We'll either take this white box approach, where we're given access: we're given access to configs, we're given access to pull stuff down. And there, a lot of what turns out to be happening is we end up doing data analysis, pulling large quantities of JSON down and parsing it and looking for patterns. You've got a lot of data to look at. Automation and parsing is really important, and you're looking for patterns that could indicate problems. So it could be cluster-admin. This process runs as cluster-admin. That's going to be dangerous. It's also going to be a target for me or other attackers. What about privileged containers? As Duffie said before, privileged containers can see all the processes. They can get a lot more access. So what's in there that might be running as privileged? But then when we're doing the black box, what we're doing is really trying to take that scenario and say, hey, I am a compromised container. I've got access to this container. What can I do? Can I break out of the sandbox? What rights do I have? Do I maybe have a Docker socket mounted inside my container? And maybe, where can I go on the network? So there are these two approaches, but you're looking for the same kind of things. You're looking for patterns of badness.
Yeah, totally. What they said. You want to look at what exactly is going on in the environment that you're assessing. Like, what's running in there? How is it running? What level of privilege does it have? Is there anything that is potentially exploitable? And then if you can get that, then you kind of see how far you can go. Can you break out of the container? Can you break out of the node? Can you break out of the cluster itself and go into the cloud? It's really going to vary depending on what you find, but it's really a matter of figuring out what's going on and seeing how far you can get.
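The white box pattern hunt described above, pulling JSON down and looking for patterns of badness, can be sketched in a few lines of Python. This is a minimal illustration rather than any panelist's actual tooling: it assumes a pod list shaped like the output of `kubectl get pods --all-namespaces -o json`, the function name `find_risky_pods` and the sample data are hypothetical, and the checks cover only a few of the signals mentioned (privileged containers, host PID or network namespaces, and a mounted Docker socket).

```python
import json


def find_risky_pods(pod_list):
    """Yield (namespace, pod, finding) tuples for a Kubernetes PodList dict."""
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        spec = pod.get("spec", {})
        ns = meta.get("namespace", "default")
        name = meta.get("name", "<unnamed>")

        # Pods that share host namespaces see far more than their own world.
        if spec.get("hostPID"):
            yield ns, name, "shares host PID namespace"
        if spec.get("hostNetwork"):
            yield ns, name, "shares host network namespace"

        containers = spec.get("containers", []) + spec.get("initContainers", [])
        for container in containers:
            security = container.get("securityContext") or {}
            if security.get("privileged"):
                yield ns, name, f"privileged container: {container.get('name')}"

        # A hostPath mount of the Docker socket hands over the node's runtime.
        for volume in spec.get("volumes", []):
            host_path = (volume.get("hostPath") or {}).get("path", "")
            if host_path.endswith("docker.sock"):
                yield ns, name, "mounts the Docker socket"


# Made-up sample input, trimmed to the fields the checks read.
SAMPLE = """
{"items": [
  {"metadata": {"namespace": "dev", "name": "builder"},
   "spec": {"hostPID": true,
            "containers": [{"name": "dind",
                            "securityContext": {"privileged": true}}],
            "volumes": [{"name": "sock",
                         "hostPath": {"path": "/var/run/docker.sock"}}]}},
  {"metadata": {"namespace": "prod", "name": "web"},
   "spec": {"containers": [{"name": "nginx"}]}}
]}
"""

findings = list(find_risky_pods(json.loads(SAMPLE)))
for finding in findings:
    print(*finding, sep="\t")
```

Against a real cluster, the same function could be fed the parsed output of the `kubectl` command named above. A fuller review would also look at RBAC bindings to cluster-admin, which this sketch leaves out.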
And a lot of it, as y'all said, is pattern recognition. But I want to note that if people are watching this who are new to this, I don't want the idea of us using a lot of pattern recognition to be discouraging to you, because this isn't magic. It isn't that we're wizards who have arcane knowledge that can't be gained by other people. It's just that we've had a lot of practice. And that's something that you can learn and practice, too. And for people who are excited about learning how to attack Kubernetes and this kind of thing, I want to encourage you to get your hands dirty, go in and start practicing how to do that yourself. A couple of great resources to do that: there's a workshop that happened at KubeCon NA last year by Brad and some other awesome folks. You can find it at securekubernetes.com. It holds up very well with time and you can go all the way through it. There's also a CTF that happens periodically called Honk CTL. At one point we played it as a team and it was pretty great. And I believe that we won it. And so that's something to look out for. And in general, playing capture the flag games like CTFs, not the kind of capture the flag that you played on the playground when you were a kid, but the kind where you get to learn how to hack and sharpen your skills and learn how to think like an attacker, is something I really recommend that people do, because it can be useful and it's also fun.
While we're on the subject of pattern recognition, another question that we got that I'm going to throw to you all is, if somebody is a defender or a cluster administrator, what are some indicators that a cluster or a Kubernetes environment has been compromised that people should look out for? If there's a breach, is there something that people can see that happened?
Yeah. Going back to the answer about how the Kubernetes security surface has changed, I think it actually highlights some of the complexity we've introduced in this process as well.
So let's think about the old way and the new way and glue these things together. So the old way, we would have a subset of our applications running on a subset of nodes, and we might configure something like auditd to watch for behavior that we thought was not normal and alert on that, and that would give us the ability to understand that things were happening in our systems that we were not expecting. Something you might not be expecting is something modifying or opening the /etc/passwd file or the shadow file, or other secure things that you would expect an application not to touch during normal operation. Well, now in this new system, as we explore containerization, we have n number of /etc/passwd files, because each process sees its own file system. And so that definitely adds a bit of complexity to the problem, because now we actually have to understand which passwd file was touched, which container it was in, and is that still not normal or is that okay? So understanding that normal behavior and understanding how to detect it has changed. We can't just watch all of the file systems. We have to change the way we're watching for those things. There's a great project under the CNCF umbrella called Falco, and Falco basically operates at the syscall interface. So it's watching for system calls against the Linux kernel, like a file open or grabbing a file handle on /etc/passwd, and maintaining the context so that you can understand which container this happened in, or which container opened that network socket, and it really provides a great next-generation auditd capability. So that's definitely how I would start.
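The shift described above, from watching one host file to watching file opens together with the container they happened in, can be illustrated with a small filter. This is only a sketch of the idea and is not Falco's rule language or event format: the `FileOpenEvent` record, its fields, and the allow-list of known-normal readers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set

SENSITIVE_PATHS = {"/etc/passwd", "/etc/shadow"}


@dataclass(frozen=True)
class FileOpenEvent:
    """Hypothetical record of one file-open syscall plus its container context."""
    container: str  # container name, or "host" for a non-containerized process
    process: str    # name of the process that made the call
    path: str       # path as seen inside that container's own file system
    write: bool     # True if the file was opened for writing


def suspicious_opens(events: Iterable[FileOpenEvent],
                     allowed_readers: Set[str]) -> Iterator[FileOpenEvent]:
    """Yield opens of sensitive files that are not known-normal behavior.

    Any write is flagged. A read is flagged unless the "container/process"
    pair is on the allow-list.
    """
    for event in events:
        if event.path not in SENSITIVE_PATHS:
            continue
        key = f"{event.container}/{event.process}"
        if event.write or key not in allowed_readers:
            yield event
```

The point of carrying `container` on every event is the one made above: the same path exists once per container, so "who opened /etc/passwd" only becomes answerable when the event keeps its context.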
Yeah, I'll echo what Duffie said about getting visibility into what's happening inside the containers. That malicious activity is the primary source: somebody running kubectl exec, or a shell being spawned, or a file being opened, or a connection, like Duffie said. Those are great primary indicators. I think the next best place to be looking is the audit logs. So the API server audit logs and the neighboring cloud API audit logs are really key in terms of getting that clear smoking gun evidence. And while it might not be the first thing that captures the compromise, it's probably going to capture a lot of the next steps that the attacker will take, because they will most likely find valid credentials somewhere and try those credentials against those APIs as they say, can I expand my access inside the cluster, or privilege escalate, or try to escape out onto the nodes or escape out into the neighboring cloud environment? So for me, I think the one-two punch is malicious activity detection logs and the audit logs in combination, so you can really understand what just happened.
So hopefully security incidents like that will continue to be relatively uncommon. But one of the questions that might be helpful for cluster administrators to know more about is, how do you go about evaluating a project for its overall security posture before you take the time to install it and integrate it into your own environment? What things are you looking for? And I'll throw that over to you guys.
Yeah, that's a really great question, and I think obviously there are a huge number of projects in this space, so it's one you're going to have to spend some time on.
The first place I would probably look is the CNCF security audit reports. Every graduated project has got an audit report done by a third party, and those are great resources, not just for looking at code bugs but also for looking at what the project's security posture is like. So that's a good starting point. I'll also tend to look at the website for each project and say, do you have security contact information? Are you encouraging security people to get in touch and to actually get involved with the project? And then the other place I tend to look is, what's the happy path install like? What are the defaults that this project uses? Are they going to give you a secure installation? Because that tends to be a good indicator as well.
For me, I want to know how exactly the thing works. Technically, what are the details of what exactly it is doing in there? What kind of privilege does it ask for? What kind of access does it need? Does it have read-write access to the host, to other machines? As an attacker, I know how I would want to potentially use something to break out of it. How viable is that? And in general, if I'm thinking about using it in my environment, how does it fit into my environment? What does it require to be connected to? Does it conflict with anything? What does it talk to? And how does it fit into my threat model? Because it's really important, as I said earlier, to threat model your environment, figure out what's in there, figure out what's important to you, and see if that project is going to fit into that larger goal.
So I have a question for you all. Speaking of larger goals, how do you find inspiration for where to honk next, and how do you develop intuition about the emergent behavior of complex systems? I really like this question. I think it's fun.
Yeah, it's a great question. For my part, I really maintain curiosity about everything around me all the time. I'm definitely one of those question-everything types of people, and I find that that's true not just in the way that I evaluate software or the way I evaluate systems, but generally in life. I would look at an elevator and realize I'm going to be on this elevator for 20 minutes of my day every day, so I wonder what else I can learn about this elevator, right? And I might do some research on the type of elevator it is and see if there are any secret button pushes or tricks to change the behavior of this elevator to my benefit. So for my part, it's definitely curiosity. Stay curious. Take the time to explore that curiosity. I know that we can all feel very rushed when we're doing our day-to-day job, but that curiosity in you should be celebrated, so take some time to maintain that.
Absolutely, I think curiosity is a big part of it. And for me, some of the best places I get inspiration are from trying to answer people's questions: seeing someone ask me a difficult question and going, that's a good point, how does that work? So one of the things I do is a training course on container security, and definitely some of my best inspiration has come from a question on that course saying, hey, how does that work? I go, oh, and that leads me down a path. And there are other good places as well. You can go to Stack Overflow or Security Stack Exchange and try to answer the questions, and doing that will lead you to interesting places. It could even lead to CVEs. The other one I like personally is GitHub issues. Go to the Kubernetes issues list, filter by security, and just start reading the issues and finding out what people are thinking about security. Again, that can lead to interesting places or even CVEs.
I'm also a big fan of reading GitHub issues, and not just in the Kubernetes repository. Various kinds of dependencies and different kinds of third-party components can have interesting questions posited there too, because not every bug necessarily looks like a security bug, but often other bugs can be security bugs. So sometimes, if you read between the lines in different kinds of issues of people being like, I found this weird problem, what's up with that, you can find interesting things there. I also really enjoy reading the docs. A lot of project documentation can have interesting phrases in it, like "this can provoke unexpected behavior," "do not use this in production," "please make sure to change these insecure defaults," and things like that. When I see that, I'm like, that's interesting, what does this mean? And then I want to go chase it and poke at it. Inspiration can be found in all kinds of places, and personally I'm excited about that unexpected behavior and those intersections and corner cases of what you don't think you're going to find but do. So those are a lot of the time the things that I look for.
Yeah, curiosity, definitely. I'm fortunate in that I always get to see a fair number of different setups, and there's always something I haven't seen before, so I like to use that opportunity to get my hands on it. I have to see it running, get it working, and then sort of take a survey of what it needs and the assumptions and design decisions that were made, especially around those defaults. But I don't like stopping there. That might be job done, but that's where I think the fun just starts. So I go right at poking and prodding. I'm trying to knock it over, mess with its inputs, and really see how it handles things when I'm giving it things that it wasn't expecting. So all the time I'm asking, what if I do this? What if I do that? What if I give it too much, too little? If I take that out, what behavior emerges? A few months ago I was poking at a container image manifest. I was just taking out the digest, or modifying the URL to fetch a layer from somewhere else, and that just ended up triggering a CVE in a container runtime. So things like that, that just kind of pop out of that emergent behavior, are really where I like to dive in and get my hands dirty.
So we all have unique takes on how we find new things, but what's an area that you all think is ripe for further hacking and exploration?
Yeah, I think going back to how things have changed under container orchestration, there is a ton of space that we haven't really spent a ton of time with yet, right? As we think about the way that we're building that isolation model, giving each process its own network namespace or file system, those sorts of things, this is actually implemented by container runtimes, and there are several popular container runtimes. That makes you wonder, are they all operating the same? Were they all designed with the same set of assumptions? I think that there's a pretty rich space in those sorts of assumptions. Also things like system calls, right? I was talking earlier about Falco watching at the syscall layer to understand what's happening. Well, the mapping of syscalls is different based on the architecture that you use, so it would be different for x86-64 than it would be for ARM. And I do think that in those models where we make the assumption that we should only be operating under the map for x86-64, but this is now running on ARM, are we seeing a difference in behavior? Those sorts of assumptions, I think, really highlight a whole field to play with.
I think for me it is all about these layers and all the different assumptions that are being built up as we stack on more and more layers, from the kernel up through Docker, through Kubernetes, into things like service mesh. And where those layers meet, you get these possibilities for weird sharp edges, corner cases, stuff that could be odd. And we're bringing in a lot of protocols that are historically complicated. Things like HTTP: there have been security bugs in HTTP servers for decades now. And also PKI. Kubernetes makes a lot of use of PKI, which is traditionally a complex thing that has had a lot of security issues. So those are the kind of areas I think are good. In particular, I think wherever we see a new layer: every new layer takes time to settle down and to work out what security looks like. So at the moment, something like operators. It's quite a new thing, there's a lot of activity there, there's a lot of development, so there are probably going to be opportunities to poke at it some more, and it's an area I think I'm probably most interested in over the next 12 months or so.
Yeah, for me, I've noticed a growing number of projects that are extending the Kubernetes API with custom resources, and that's to do things like create more clusters, and even to act as a new abstraction for managing other cloud resources. And I think anytime you give the data plane of one environment enough access and permissions to be the control plane of another environment, that's where things can really get interesting, when those layers intersect.
For sure. And yeah, continuing the ongoing theme of layers, container security is a holistic thing, right? Your container security is only as good as the security of the rest of your stack. And for me, thinking about the way that the different parts of the stack interact presents a lot of exciting, often unexplored attack surface. I'm really excited about hardware attacks. I think microarchitectural attacks are really interesting, like Meltdown and Spectre and their variants, and I think that there is a lot of unexplored territory about how that can interact with things like multi-tenancy in the cloud. I'm also really excited, one layer above, about eBPF as an attack surface. I think that that is a very interesting unexplored territory where there are a lot of fun things that we haven't quite done yet. So I feel excited about exploring that coming up, and just getting to explore all of the things, because I think the cloud native space in general is so new. We're all exploring and building this stuff together. There are parts of the attack surface we don't even know are there yet, perhaps because they haven't been invented. So I'm super excited to be here at KubeCon + CloudNativeCon with everybody, getting to build this future and these technologies together, and I want to say thank you to all of you for being here and building and exploring these futures with us. If you want to be further involved in the Kubernetes project or the security of it, there's lots of work in Kubernetes to do, and in the CNCF landscape in general. Kubernetes and a lot of these projects are open source. There's lots of work to be done and lots of interesting problems to solve. If you want to get involved, Kubernetes has lots of special interest groups, SIGs, that do lots of different kinds of work. I'm the co-chair of SIG Security. We have lots of interesting problems to solve and we would love for you to get further involved if you're interested. If you want to ask us questions or talk to us further, we're around. You can find us on Kubernetes Slack or discuss.k8s.io, and we would love to be able to talk to you all and learn with you together.
Indeed. And I also wanted to highlight again that a lot of the questions that we answered during this AMA panel came from the community, and I wanted to give a shout-out to everybody in the community who took the time to ask us a question for this panel. We could not do this without your help, so thank you so much for putting up questions.
And one more thing: Honk the Planet!