All right, good afternoon. Thanks for coming; we are going ahead with the next talk. My name is Jorge Salamero, I work for Sysdig, and this talk is "WTF, my container just spawned a shell!". A little bit about myself first. Let's see if this makes less noise. Yeah, much better. I've been working in open source for some time already, on monitoring for a few years, and for a year now at Sysdig. I work on the marketing team, but I consider myself a geek, so I'm one of those people lucky enough to play with different technologies, and today we are going to talk about containers, security and Mesos. So, how many of you know Sysdig? Can you raise your hands? All right, a little more than half of the audience. Good. For those of you who know Sysdig and those who don't: a few years ago we created this open source project, and last year we passed one million downloads, and counting. It's an open source tool for container troubleshooting, so it's great for looking at what's happening inside your containers. It was a great success, and people were asking us to do more. So we created a commercial product for monitoring, both SaaS and on-prem, that integrates with DC/OS and Mesos and gives you everything you need to monitor your clusters. From that experience we saw that with the level of visibility we can gain into what's happening inside the containers, we could also do some security stuff. We started experimenting with it, and we ended up with something we called Sysdig Falco. It's also fully open source, it has been picked up by a few big names out there, and it basically provides you with security rules on top of Sysdig's syscall visibility; I'll give you the details afterwards. That went very well, and we said, okay, let's do something more. That's how we came up with Sysdig Secure, a commercial product for runtime container security and forensics.
I'll show you more later on, or if you have questions you can come around the booth. When trying to implement security on containers, there are obviously multiple layers of things you can implement, but today I want to focus on scanning: what are the containers doing? We can look at this from two perspectives: something called static scanning, and something we call dynamic or runtime scanning. Static scanning basically means analyzing your image without running it. Why do you need this? Well, probably even if you are not aware of it, your developers are using containers, and because they didn't look at it from a security mindset, you might end up having them write Dockerfiles like this: I'm going to download this thing from the internet, not even over HTTPS, I'm not going to check any signature, I'm going to build it inside the container, and that's going to be my application. Awesome. Well, not really, because who's maintaining that? How do you make sure there are no vulnerabilities in that image? That's where static scanning is going to help you: it looks at the software versions and the libraries, and compares them against a database of vulnerabilities, CVEs and so on. The way containers are built is actually very handy for this, because you always start from a base image; when you detect a vulnerability in the base image, it propagates to the other containers built on top of it, so this can be implemented in an efficient way. You probably want to implement this at the registry level, so if you're using Docker Hub or whatever repository you're using, you'll probably have that there already. There are a bunch of different open source tools for implementing this, for example CoreOS Clair, which powers Quay. But is that enough to cover all the use cases? Can we say that if our image passes static scanning it is secure? I wouldn't say so.
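To illustrate, here is a hypothetical Dockerfile in the spirit of the one described above (the URL and application name are made up for the example):

```dockerfile
FROM ubuntu:16.04

# Fetch a prebuilt binary over plain HTTP: no TLS, no checksum, no signature check
RUN apt-get update && apt-get install -y wget \
    && wget http://example.com/myapp-latest.tar.gz \
    && tar -xzf myapp-latest.tar.gz -C /usr/local/bin \
    && rm myapp-latest.tar.gz

CMD ["/usr/local/bin/myapp"]
```

Nobody knows who maintains that tarball or which library versions it bundles, which is exactly the kind of thing static scanning is meant to catch.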
Because containers are basically black boxes: it's very difficult to look at what's happening inside. So we need a technique to look inside and see how the processes inside the containers are behaving; then we can audit, we can enforce. Those are the use cases. There are a bunch of different tools for looking at containers at runtime. seccomp is a facility in the kernel; it works great, but it's very basic, so you can allow or blacklist some system calls. When you put it together with BPF you can do some additional stuff, but it's still kind of tricky to work with. SELinux and AppArmor also work with containers, but you know them already: they don't integrate with your orchestration tool, and they don't have high-level concepts or rules for creating policies. You have auditd as well. And we created Falco. How many people here know Falco? Raise your hands. Just a few, less than SELinux. Okay, so I hope this is going to be useful. First, to understand Falco, I'm going to explain a little bit about how Sysdig, the open source technology, works, and then we'll move on to Falco. Sysdig looks at all the system calls being executed on your host. Basically, we hook into the Linux kernel tracepoints, and that way we can see absolutely everything. There are functions on syscall entry and syscall exit that we can hook into, which also lets us understand how much time is spent in every syscall. Unlike other tools, for example strace, where you have to attach to an existing process or launch a new one, Sysdig's approach is different: we just see everything and filter out the stuff we are not interested in until we reach the visibility we want. So, this is a simple diagram of the architecture of Sysdig.
We hook at the kernel level and capture all the system calls, and because containers are just processes running in different namespaces, we don't really care whether it's a native application, a Docker container, or rkt: all of them just work. A little more on the architecture of Sysdig. At the moment it's a kernel module, because the facilities that already exist in the kernel were not enough to reach the visibility we wanted. This might change in the future, but as it is now, it's a small kernel module that copies all the system calls into a shared ring buffer. From there, a set of libraries and user space processes decode all the system calls from the raw events. We can actually decode known protocols like HTTP, SQL or memcached using additional scripts called chisels. Looking at it from an instrumentation and orchestration perspective, it works like this: on the host, we copy the system calls (actually events, as I'll show you in a second) into the ring buffer. The user space process typically runs in a container, and it talks to your orchestration tool, like DC/OS Mesos, which is the example I'll show you today. That way you can correlate the low-level entities of the system calls with the high-level entities that are resources in your orchestration tool: applications, tasks, all that. If we look at that shared ring buffer, we will see something like this, what we call the event stream, where we can see the different system calls and do a bunch of different things with them. We can save them to disk for later analysis (we'll see this in a second for doing forensics), or we can analyze them live, filter out stuff, and understand what the processes are doing. But how can we use system calls to understand what's happening inside the containers?
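As a sketch of that save-and-replay workflow, assuming sysdig is installed and its kernel module is loaded:

```shell
# Dump the live event stream to a capture file for later forensics
sudo sysdig -w trace.scap

# Replay the capture later, applying any filter offline
sysdig -r trace.scap container.id != host
```

Reading a capture back with `-r` doesn't need the kernel module, so postmortem analysis can happen on a different machine entirely.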
Well, if we know how things work under the hood, we know that if we find a clone or execve system call, it's because a new process has been spawned inside that container. We can look at open and close to see if files are being opened, and at socket, accept and connect to look at network activity. That is, basically, how you can understand what's happening inside just by looking at the system calls. But we'll see in a second that with Sysdig and Falco you don't need to go that deep, because we provide enough structure on top of all this. And to be completely fair, the activities or events we put in that ring buffer are not just plain system calls as you would see in an strace output. We actually attach a lot of metadata, contextual information, so the libraries in user space can reconstruct what's happening. For example, we can see the process a specific system call came from, the parent process, or the remote IP address for system calls affecting a socket. That's what provides that level of visibility. So, now that you have a bit of an understanding of Sysdig, let's look at how we can implement a security tool on top of it. By looking at what the system calls and the processes are doing, we can detect suspicious activity. We define that in a rule set, and that rule set is created using the Sysdig filtering language, which is very similar to tcpdump's, which most of you are probably familiar with. It works very well with containers, because we look at the processes at the host level: even if our user space process runs in a container, from there we can see what's happening in all the containers being executed on the same host.
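The syscall-to-activity mapping above can be explored directly with sysdig filters; a sketch (the output format string and filters are illustrative):

```shell
# New processes spawned inside containers (execve after clone)
sudo sysdig -p"%evt.time %container.name %proc.cmdline" \
    "evt.type=execve and container.id!=host"

# Network activity: connect syscalls issued from containers
sudo sysdig "evt.type=connect and container.id!=host"
```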
In Falco, you can configure notifications for when any of these rules trigger, like running commands or writing to a file, and as I was saying at the beginning, it is entirely open source. So, a few examples (and I'm looking back because my screen here doesn't work very well). If we want to alert on a shell running inside a container, we can write a rule like this: container.id is not host, so it's a process running in a different namespace, and the process name is a shell. We can also create lists and match against them: if a file being written is below one of the directories in this list, tell me about it. And if something changes the namespace of a specific process, and it's not Docker or Sysdig, which are the only privileged guys on my host, there's probably something wrong happening. These are just a few examples. Once these rules trigger, you can send notifications to syslog, a file, standard output, or even to an arbitrary command, so you can send an email notification, a webhook POST, whatever it is. With this I finish the boring slides, and I'm going to jump into the dangerous territory of demos. First I'm going to demo Sysdig Falco, the open source tool, in detail, and then we'll look at some of the cool magic of Sysdig Secure. I have here some configuration files. falco.yaml is the main configuration file for Falco, with a bunch of different settings: where my rule files are, where I'm sending the notifications. This is pretty much the standard out-of-the-box configuration. The Falco rules file contains the default rule set; I'll show you later how this is shared, and the syntax of this language is the same in Falco and in Sysdig Secure. It's a super long file where you can define all these rules, and we can define macros, aliases basically, to simplify things.
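The examples described above look roughly like this in Falco's rule syntax (a sketch based on rules from the default rule set of that era; `open_write` is a macro defined there):

```yaml
# A shell spawned inside a container (container.id != host)
- rule: run_shell_in_container
  desc: a shell was spawned inside a container
  condition: container.id != host and proc.name = bash
  output: "Shell spawned in a container (user=%user.name container=%container.id cmdline=%proc.cmdline)"
  priority: WARNING

# A write below one of the listed binary directories
- rule: write_binary_dir
  desc: an attempt to write below a set of binary directories
  condition: evt.dir = < and open_write and fd.directory in (/bin, /sbin, /usr/bin, /usr/sbin)
  output: "File below a binary dir opened for writing (user=%user.name command=%proc.cmdline file=%fd.name)"
  priority: ERROR
```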
If we look at the lowest level of the Sysdig filtering language, a write means the event is either the open or the openat system call, the open has the write flag set, and the file descriptor is a file. Instead of having to write all that every time, we can define it as a macro, so we can work at a higher level. The default rules file is full of helpers, macros and definitions like that. Then you work in a different file that looks like this, where I have defined a rule I use a lot: I've got this container, and I know the process that should run in it; if anything that is not that process runs, for any reason, send me a notification. In this case, "rule" is just the name, plus a description, but what's really interesting here is the condition. Basically, I'm looking for a new process being created, which is the first macro. Second, I want this to be running inside a container; I don't want to look at processes running natively on the host. Then I'm pulling in some container metadata, in this case the container image, and checking that it's called nginx. And finally, if there is any process in that container that is not called nginx, I trigger a notification. Some of you are probably going to say: you can rename a process. Yes, and we have a rule to detect processes renaming themselves. Okay, I'm going to try to keep this simple. Then I set an output message like this, and a priority. I'm running this at the moment on my laptop, to keep things simple, and this is how you run Falco: as a container, in privileged mode, because it loads that kernel module; I mount a bunch of different volumes to get information about the host, and I mount my configuration files.
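The privileged-container invocation described here looks roughly like this (a sketch; the image tag and mount paths vary between Falco versions):

```shell
docker run -d --name falco --privileged \
    -v /var/run/docker.sock:/host/var/run/docker.sock \
    -v /dev:/host/dev \
    -v /proc:/host/proc:ro \
    -v /boot:/host/boot:ro \
    -v /lib/modules:/host/lib/modules:ro \
    -v /usr:/host/usr:ro \
    -v "$PWD/falco.yaml:/etc/falco.yaml" \
    -v "$PWD/falco_rules.yaml:/etc/falco_rules.yaml" \
    sysdig/falco
```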
So I'm going to run that container. And then here, on this side, I'm going to run an nginx container; I'm going to call it nginx. Let's keep an eye on the output here. Now I'm going to pull this up from my history: I'm going to run a shell in that container. So I'm here, inside my nginx container, and I can see there was already a notification, part of the default rule set, saying: hey, a new shell has been spawned in this container, and a terminal has been attached. Here is the user who executed it, the container ID, the new process, and exactly what was executed. But now, since I created that rule that says "if any process running in here is not called nginx, trigger an alert", anything I execute, like ls or cat resolv.conf, shows up: if we look back at the output, we can see how we are getting those messages there. So this is the use case of Falco: gaining visibility inside the containers, creating rules, a policy of things we want to look for, and triggering alerts on those. As I was saying at the beginning, this is pretty unique, and there are big names, big companies, already using it. But at the end of the day it's kind of a do-it-yourself thing, and some of the people we work with were asking for something easier to build on. This is where we came up with Sysdig Secure. I'm going to do a very quick demo of something similar. What we see here is a different cluster with a few containers running inside. It doesn't really matter, but so you believe me, this is my DC/OS Mesos interface. I have here a WordPress application, with a client making fake requests to my application, and then the database. And I have also deployed the Sysdig agent on all my nodes using Marathon.
When I look at this from an application perspective, I see all the containers, but I'm interested in this one. Here I can see this client talking to WordPress, which at the same time talks to MySQL. So I decided to create a rule to look for some specific anomalous activity, and one of the rules we are going to see executed is this one. There's the name, "new shell running in a container", a description, a severity, and the scope where I'm applying it; I can leverage any Marathon metadata for this. If we look at the rules editor, you see it's exactly the same language we saw with Falco, so you can actually reuse all your Falco config. And what's very interesting is the actions. With Falco you can just send out a notification; that's it. Obviously, you could wire that into your orchestration tool and then take actions to modify what your containers are doing, but here we have already made it available for you. We can do three things. We can stop or kill the container, to prevent the attacker from going any further. Actually, when thinking about containers, uptime is not something we should be proud of: the less uptime your containers have, the less likely it is that they have been hacked. We need to shift to that different paradigm. We can also pause a container, if it's a database or a legacy application, something that cannot be killed immediately. And finally, we can trigger a Sysdig capture. If you remember the architecture I showed you at the beginning, we could take that event stream, all those system calls plus metadata, and analyze it or dump it into a file. So this is something we can do automatically. And a very cool feature of Secure is that we can include a number of seconds from before the alert was fired.
That's going to help us understand, in a security context, for example, how the attacker hacked into my container: in the syscall capture file I'm hopefully going to have that information. While five seconds may look like a very small window, we need to understand that when attackers hack into containers, because they know containers are highly volatile, those attack attempts are most probably automated, so it shouldn't be that bad. In any case, it's a compromise between the memory assigned to the agent and the amount of time you want to keep. Obviously, after the notification you can include as many seconds as you want. We can also send notifications anywhere, really. What I'm going to do now is show you a rule I created this morning, which is where the action happens. It's a similar thing: here I'm looking at someone writing below a binary directory. Who should be modifying a binary directory if the container was built at build time? And the actions are to kill the container and take a capture file. So now I'm going to minimize all this and hopefully SSH into the right node. To put this in perspective: I talked about static scanning at the beginning, and then about this runtime or dynamic scanning. Let's say it's safe to assume that all the images I'm running in production in my cluster are safe, that they don't have any well-known vulnerabilities. Still, there are two use cases covered by, for example, detecting a shell running in a container. Number one: someone using a zero-day vulnerability and doing a command injection. The other: someone like me who has decided to SSH into a production server and docker exec into that specific container (sorry, that's not the right idea) and start hot-fixing stuff, because I can. Why shouldn't we? Well, you shouldn't be doing this, but you can. So, I executed a shell inside the container.
I'm looking at the files, but now, if I go back to Sysdig Secure and filter here, you see this box: it's yellow now, there is one alert. If we switch to this view, it basically means that someone executed a shell in the container. We could automatically detect it and give you some information: when it happened (just a few seconds ago), the severity, the policy that was triggered, the container image and the container name, the host where it was executed, the container details and the alert details. Actually, if I go here, you can see the ls command that I executed. This is really, really cool for understanding what attackers are doing, or anyone really, even someone from your team who decided to exec into a container: connecting to a remote server, starting to mine bitcoins, whatever the hell they are doing inside your container. But we can do more. If you remember, in that policy I created I said: if someone writes below any binary directory, I want to stop or kill the container. So, all right, let's do that: I'm going to echo foobar into a file under /bin and execute that. And you see I didn't get to do anything else, right? Straight away, the container has been killed. Let's take this ID and check: hopefully it has been rescheduled on the same node. Yes. So if you look, this is the previous container, which exited because it was killed, and Mesos started a new one automatically: it's orchestrated, it's safe. In principle, the best way to stop an attack on a containerized application is just to kill the container, because the orchestration tool will create a new one from a clean image. All right. Now, if I go here, three minutes ago, we can see that in addition to running a shell, I got this other alert triggered: someone modified our WordPress container.
We can see the commands, which are the same as before, because I didn't run anything else. If I go back, there is a very, very neat feature. So far we have seen how we can stop attacks by killing containers, and in the case of a shell we can also see what was executed. But if it's a process making an unexpected connection, it doesn't make sense to look at commands. Instead, we can do the following: all those system calls can be dumped into a file, if we configure that in our policy, and the captures become available here. And we can open them with a new UI, a new Sysdig UI called Sysdig Inspect. In this case, Inspect is available right here, but it is entirely open source as well, so you can also install it on your Linux host or your Mac laptop, and use it for troubleshooting as you've long been used to, but now also for postmortem analysis and forensics of your container interactions. For the ones who knew Sysdig before: if you remember, we had two options, the command line tool, which requires using all those hackish filters in a tcpdump style, or csysdig, which was an ncurses UI similar to htop. Still, it was kind of funky to use, because you had to understand all those low-level concepts: file descriptors, system calls, system call errors, all that. With Sysdig Inspect we took a slightly different approach: the idea is that we help you correlate high-level concepts, like alerts coming from Sysdig Secure, processes running in containers, file activity in the file system, network activity as you can see here, network applications, executed commands, even performance and logs, things that anyone with basic DevOps or Linux concepts can understand, all the way down to the system calls.
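For reference, the two older command-line workflows mentioned here look like this (a sketch; the chisel name comes from the sysdig distribution):

```shell
# The ncurses UI (htop-style) over a capture file
csysdig -r capture.scap

# The plain CLI with a chisel, e.g. top files by bytes read/written
sysdig -r capture.scap -c topfiles_bytes
```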
If you have a look here, in this file we have 5.5 thousand system calls, so it's a lot of them. The idea is that you can click on any of these boxes and correlate all the information; we can see it down here for this specific capture file. There are a couple of things I want to show you. We can see some patterns of file activity here, and we have the alert here: this is a clear example of my capture including a number of seconds from before the actual alert was triggered. So it works; everything has been live. The other thing we have, on top, is the file that was modified. Using this timeline, it's easy from a human point of view to correlate activity by looking at these graphs, and I can use these sliders to narrow in on the specific time frame I'm interested in. With all these boxes at the top, I can isolate things and look at, for example, all the file activity. In this case, when my alert was triggered, two files were modified: number one, the binary file I put in /bin, the /bin/hack file; the other one, the bash history file, which bash wrote automatically. But I can do even more: I can go into this specific file, decode the system calls that wrote it, and see the actual contents of the file. This is very, very powerful. This is the first tool that gives you all the information you need to do that postmortem analysis: to understand how they hacked into your container, what they did, what kind of information they could access, credentials or data exfiltrated from your database, and see whether they actually sent that information somewhere else, so you have a data leak, all that. Even if your container doesn't exist anymore (as you saw, we killed that container, it doesn't exist anymore), we could still do all this troubleshooting with Sysdig Inspect.
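Decoding what was written to a specific file, as described above, can also be done from the command line with the echo_fds chisel (a sketch; the file path is the one from the demo):

```shell
# Print the buffers written to /bin/hack, rendered as printable text (-A)
sysdig -r capture.scap -A -c echo_fds "fd.name=/bin/hack"
```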
Another cool example, actually part of our demo, but since I have some time I wanted to show it to you, is this one. Hopefully it's this one we have here. Yes... no, it's not in here. Capture files, events, last hour... sorry, this should be one day. Let me see where we have that container. Not last hour; one day, yeah. In this case it was a somewhat more complex attack. We saw someone do a shell injection, execute a shell, download a rootkit, and uncompress the rootkit. They couldn't proceed any further because we killed the container, but going back to my capture file, I can again do this postmortem analysis, bringing in my notification, my file activity, my network activity, and, for example, the commands. In this case, I can see that the commands were executed basically at the same time my alert fired, and there was a spike of download traffic. So this time, instead of filtering on file activity, I'm going to filter on the commands, and I can see all those commands again: the shell, the download, the uncompress. Now I can filter on this specific command or process, and here on the left side we have different views: connections (it was a tar, it didn't have any connections), directories, file errors, system calls. I can look at the system calls, but unless you know how things work, that's not meaningful; looking at the files, though, everyone is going to understand and see how the file was uncompressed. And I can go to any of these files, decode it, rendered in a readable way, and reconstruct all the components of that specific rootkit my attacker executed. So, this is everything I wanted to show you in this presentation. Do I have a thank-you slide or something like that? Probably I should. Yeah, there you go. This is everything I have for today. Falco is, as I said, an open source, community-based project.
We have contributions from many people. If you're not into writing code, just contribute new rules for any of the applications you work with; we will be very, very happy to accept your pull requests. You can also join our Slack community and discuss; there is a mailing list too, but you know, these things are kind of low-volume. And if you'd like, you can come to the booth and talk to us about Sysdig Secure as well. Now I think we have a few minutes for questions. Anyone? Hello? Yes.

Don't take this the wrong way, but I'm going to play the devil's advocate a bit. It seems we can't go another week without something getting hacked; the last couple of weeks were kind of crazy, everything gets hacked. How do you assure that Sysdig is not itself leaky? Because if I understand correctly, basically anything I have in my applications will go through it: unfiltered credit card numbers, data, you name it, it's going to be sucked up by Sysdig.

Yeah, okay, that's a good question. Basically, all the analysis happens at runtime, so the data is not leaving: both with the open source product and with the commercial product, the analysis happens in that specific container. It's not that we are going to send the data somewhere else, so that constrains a little bit where the data lives: it's always on the same host. There is no leaking unless you take a Sysdig capture, in which case it is sent somewhere else, or the command history; yes, that goes to the backend. The other thing is that the Sysdig container itself is not exposing any external service, and you can also write Sysdig Secure rules to monitor Sysdig itself. But at the end of the day you need to define a boundary around what you are going to trust and what you are not going to trust. It's the same thing as running a credentials vault.
You store all your authentication, passwords, certificates in a service that is in charge of distributing them across all your infrastructure. What happens if your vault gets hacked? Well, then there's access to everything. So it's a bit of a compromise: yes, you get deep visibility, you can see everything, but you have to trust the tool. This is a decision you need to make.

That makes sense. Can I ask one follow-up question? How do you do this with your hosted solution, with regard to basic things like, I don't know, transport security?

Yes. With the hosted solution, all communication gets encrypted end to end, basically. If you are really concerned and you don't want to share a database with any other customer because you have super private data, you can install on-prem, and then it's your responsibility to make sure everything runs smoothly, and you can implement your own security policies. Thanks.

More questions? Feedback?

Regarding Falco rules, is there a way to create at least a base sample template based on existing activity?

At the moment it's not possible. It's a feature on the roadmap of the commercial product, and maybe some pieces will end up appearing in the open source tool; I don't have either the authority or the information to talk about that, so I don't know, but I know it's part of the commercial product. You can also implement it your own way: you can take Falco, and someone already wrote about this on the internet, send all the events into a logging system like ELK, and then do machine learning there on top of that. So there are multiple different approaches. At the moment it doesn't exist in either solution, but it's going to be available very soon.

More questions? All right. Well, thanks very much for listening. I hope you liked it, and if you have any other questions, just come around.