Well, hello everybody and welcome again to another OpenShift Commons briefing. I have my good friend Knox from Sysdig, who is going to give us a briefing on security and forensics best practices on OpenShift and tell us all kinds of WTF things. I'm loving that you used that in the title because I think it's the first time anyone's ever used that, so that's great. I'm going to let Knox introduce himself; he'll do a good job of it. He's promised a pretty lively demo. If you want to ask questions, ask them in the chat. We'll make him pause and take a breath and he can answer questions. Otherwise we'll have a live Q&A at the end where we'll reiterate anything that doesn't get answered in the chat, and we are recording this session. So Knox, take it away. Awesome, thanks Diane. So, hey everyone. I'm Knox from Sysdig. I work on the product marketing side for Sysdig Secure, and we've got some awesome content for you today. We're going to be covering mainly a live demo: I'm going to go in, cause some vulnerabilities to happen, and then look at a live OpenShift environment, how you can do forensics in there, and a bunch of other stuff. So I'm going to try to get through these slides as quickly as possible and then get down and dirty in the command line and show you everything else that's going on there. So what's on tap for today? We're going to talk a little bit about security with containers at a high level: how it's changing how you architect your applications, what you need to build out your systems, and how you can actually enable better security through OpenShift and containers. Then we're going to move on to the architecture you want for collecting data from your systems, look a little more at the Sysdig Secure capabilities and how we can help you deliver secure services through OpenShift, and then go into the live demo and instrumentation from there.
High level on Sysdig: we're the container intelligence company, and we provide unified security, monitoring, and troubleshooting from a single instrumentation point. For background on us, our founder was the co-creator of a popular network packet analyzer that many of you have probably used, called Wireshark. From there he launched sysdig, our open source project, which is kind of like if you took tcpdump, lsof, strace, and htop, smashed them into one, and then layered on some container goodness: that's our open source tool. That launched in 2013, with a million-plus downloads and hundreds of thousands of users. From there we've launched our container intelligence platform, which is made up of Sysdig Monitor and Sysdig Secure, and we have 300-plus enterprise customers using that to safely and securely deploy containers into production. We have really deep integrations with OpenShift, in use at major enterprises. So we take all that OpenShift metadata to heart and allow you to layer it on for service performance monitoring, service-oriented security, all those kinds of things. What we're really going to talk about today is the convergence of two challenges: first, operating containers in production, what that means and how it looks for different groups, and then how you can enable effective, scalable security for these diverse PaaS workloads that you're running on OpenShift. All right. Everyone refers to containers as black boxes, and they're great for development and great for operations, but what does that mean for your NOC or SOC teams that are there to get visibility into those systems? On the development side containers are awesome: they're black boxes, you can put your code in them, you can execute it, they're repeatable. And on the operations side, you can move them around really easily.
Now my database doesn't have to be on one node; it can be on multiple nodes, scattered out through OpenShift, and there are all these layers of orchestration making it really easy to deploy applications in multiple locations. But how do you keep track of all those containers as they're moving and scaling without going and putting in a sidecar container or something like that? And how do you make sense of these services that are now scattered across multiple nodes, as that single logical service, or that OpenShift deployment and the pods that roll up into it? So these are some of the challenges you'll face, but with challenges always come opportunities, and we really think containers offer an opportunity for better security. First, there's a controllable attack surface. A single isolated container, that black box we looked at before, should really only have one process running inside of it, so it's a whole lot easier to lock that down and get that isolation through cgroups and namespaces for a controllable attack surface. You also have a much easier time doing robust configuration and change management. It's really easy to roll back to a certain image, or make a change to an image and push that out across your whole infrastructure really quickly, compared to what we're seeing in the typical VM model right now, where people are going out and having to patch different servers and different services. With containers, you can just kubectl or deploy a new OpenShift service, and from there you've deployed a new version out to your entire infrastructure. That same process isolation also makes it really easy for you to do anomaly detection.
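To make that patch-by-redeploy point concrete, here is a minimal sketch using the OpenShift CLI; the deployment name and image tag are hypothetical placeholders, not anything from the demo environment:

```shell
# Sketch: instead of patching servers in place, roll a fixed image out
# to every pod of a deployment. "wordpress" and the tag are placeholders.
oc set image deployment/wordpress wordpress=wordpress:4.9-patched

# Watch the rolling update replace pods across the cluster, and roll
# back in one step if something looks wrong.
oc rollout status deployment/wordpress
oc rollout undo deployment/wordpress
```

The same shape works with `kubectl` on plain Kubernetes.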
On the anomaly detection side: if you've got an nginx container and it's communicating over non-standard ports, or if there's some other process besides nginx running inside that container, it's a whole lot easier to spot that something's going wrong inside this container than in a VM where you'd have tens or hundreds of processes running on top of it. Containers and OpenShift also really lend an opportunity to do more zero-day threat protection and behavioral monitoring: looking at the activities coming from those containers and running them through a pipeline, so that against those easily baselineable activities you can spot anomalies and really look for fundamentally malicious behavior happening on those systems. So how can you deliver that security to your environment? You can really do that if your security stack is architected for containers. What do I mean by that? You need full visibility into every single container running across your infrastructure, down to the process level of what's running inside and the system calls, without having to do any per-container instrumentation. Next, you need automated, adaptive security policies: policies you can scope to a certain OpenShift deployment, or even the individual pods that roll up into a deployment, that will scale as new services come out and really rely on the metadata that your orchestrator, like OpenShift, is providing. The thing I think is probably the most important is non-disruptive, unified instrumentation. If you're deploying 10 to 20 containers per host, and we're really seeing that density grow as more and more companies move to containers in production, you really don't want to have a sidecar container or go and inject some process inside each one of your containers.
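The nginx example from a moment ago can be expressed directly as a filter in the open source sysdig CLI (the instrumentation described later in the talk). This is a sketch, not the product's rule: it assumes sysdig is installed with its kernel module loaded, and the "expected" port list of 80/443 is an assumption about your deployment:

```shell
# Sketch: flag any outbound connection an nginx process makes to a
# remote port other than 80 or 443 (the port list is an assumption).
sudo sysdig -p "%evt.time %container.name %proc.name %fd.name" \
  "evt.type=connect and proc.name=nginx and \
   not fd.sport in (80, 443)"
```

A Falco rule expresses the same condition declaratively and raises an alert instead of printing.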
So you need a single, unified instrumentation that isn't going to mess with the Docker daemon or sit as a sidecar to each container. You need to keep that instrumentation lightweight, from a single point on the host, without having to think about each container as an individual point you want to instrument. And last, being service and orchestrator aware: understanding exactly how OpenShift is deploying your containers across your infrastructure, having specific policies and actions tuned against that, and giving you the ability to use that metadata to track down all the commands that were executed across a specific OpenShift service or the pods within it. All right, now that we've gone through what you need to think about with OpenShift and the containers running there, and how you're going to deploy and operationalize security, let's look at our architecture and how we've addressed some of these problems. Before we get into that, I'll talk a little about where Sysdig Secure fits in your security stack. We're really focused on two main aspects here: runtime security and forensic security. Runtime security is doing your intrusion detection, lateral movement, and data exfiltration: seeing if your database has spawned an unexpected outbound connection, if sensitive data was read from it, and really looking for fundamentally malicious behavior happening on your host. That can be through system calls, processes, ports; pretty much everything that makes a system call we'll be able to pick up and then layer on this runtime security. The other side with containers is forensics. A lot of times when a container gets killed, you're basically screwed. You've lost all your data. Someone starts a shell inside that container, you can't really see what's going on, and they've killed the container.
It's all gone. We've built a unique way to get a buffered capture of all the system activity, pre and post any security violation, so you have the full breadcrumb trail of every single thing that's been happening in your environment. This is something I'll go into much deeper in the demo. On top of that, we integrate with your existing platform security and IT security. We'll integrate with platforms and pull in events from there: if you have vulnerabilities that are discovered, RBAC, things like that, we integrate with those, set up user credentials, and can ingest events from them into our events API, allowing you to have a really tight integration with us and OpenShift. And for your existing IT security, our product is entirely API driven: sending every single executed command from that OpenShift environment to your existing SIEM or logging tool, or doing all your user tracking for your governance committees and things like that. All the data we collect can be fully exported out to any other system. All right, now I'm going to get into the architecture. Our container intelligence platform is really built up of two main components that I'll get into further. The first is ContainerVision, our ability to see all app activity, network activity, and host activity without instrumenting any of your containers. From here, we'll also automatically discover any application metrics, system metrics, all that kind of stuff on the monitoring side, from the individual container running on the host. The second is ServiceVision, where we enrich every single piece of data we collect and send through our data pipeline with all the metadata that OpenShift is exposing. All of that can be sent to our backend, which you can use as a SaaS service or deploy on premise on your own infrastructure.
And that backend is something that can be fully managed by OpenShift or Kubernetes as well. Okay. From that single instrumentation and single backend, we layer on three different products: you can use Sysdig Secure, Sysdig Monitor, or Sysdig Inspect, and there's no performance impact from using any of them on top of each other. You're really going to have that single container per host that gives you full visibility into everything that's running. All right. I've talked a lot about how we use this data, the data you want to collect from containers, but how do you actually go about collecting it? What we're looking at now is a simple host. I've got a host OS running here, a custom container, and an open source app, something like nginx, running here. What we do is deploy our agent as a container or a process running on that host; our container is Red Hat and OpenShift certified. From there, that container is going to load a unique piece of kernel instrumentation that we have. This instrumentation is part of our open source tool and our open source security tool, Sysdig Falco, and it's used on millions of machines, at government agencies, all that kind of stuff. That kernel instrumentation is going to see every single system call happening from every single container through a non-blocking read, and then put it into a ring buffer where our agent can process it in user space to see all commands and events and capture performance metrics. Basically anything that's happening on that host, we can see, detect, protect, and troubleshoot from that single instrumentation point. On top of that, we've layered on a rules engine: rules for any file access, port scanning, any connection or executed program running on that host or within that container. We'll detect at the agent level and then allow you to do policy enforcement.
So killing a container, pausing a container, committing a container to quarantine it and run it through forensic workflows later, all from that agent running on the host. And then the last component of our architecture is ServiceVision. Typically you look at your infrastructure from a physical view, and if you're looking at an OpenShift infrastructure, it's going to look a lot like this: multiple different VMs with a bunch of scattered containers running across them. But how do you make sense of that as logical services? That's where we integrate with OpenShift directly and allow you to think of a logical service based on any piece of that OpenShift metadata. You can enforce and explore policies based on any piece of this metadata, and this is something we've done for hundreds of different companies: with OpenShift on premise, using SaaS, really whatever you want. That's something we're here to support you with on your OpenShift journey. And now, the thing that everyone's been waiting for: let's get to the demo. Before I do, this is a time where I usually like to pause and see if there are any questions about our instrumentation, how we collect data, and things like that. I don't see anything popping into the chat yet, so I'm going to dive into the demo, and I'm sure that's going to give us some questions. Okay, cool. All right. What I'm going to do now is something you'll definitely never see from any other security or monitoring vendor live on a demo: I'm actually going to do a live instrumentation of our container intelligence platform, running on a containerized host. As you can see now, we've got no data coming in. I've got Sysdig Monitor running here, I've got Sysdig Secure running; really, nothing's going on in these environments. So now I can pull up my instance and run a quick docker ps.
You can see I've got a simple WordPress application running here: a load balancer, a database, some WordPress services. Now let's go and get the command to run our agent. I can copy it right here, and let's switch over to Sysdig Monitor so we can see the real-time data stream in. I'm actually just going to remove the -d flag so we can see the output here. All right. What this is doing is loading our kernel module via DKMS: we look at the existing version of your kernel and build the module on top of it, so it's not going to require a kernel restart or anything like that. If we scroll up within here, you can see we're doing a bunch of different checks: whether Kubernetes is running, StatsD metrics if they're in the environment, JMX sampling, all of that automatically. Pulling any certificates, searching for any AWS metadata or Mesos metadata, and pulling all of that automatically. And now if we look over in Sysdig Monitor, you can see all the different containers that were running have shown up. We can drill into something like this WordPress container and see its performance. But let's actually look at it from a topology perspective. This is one of the things that anyone in the PCI compliance space will recognize: you need to see every single network connection that's going on in your environment. So if I go over to hosts and containers here really quickly and click back on the entire infrastructure, I can see every single connection coming into this host, then drill down and see all the containers running there. I can even see the SSH process running on this host from the ISP provider that we have in our office. And then I can go down all the way into all these containers and see the individual process running inside, the network impact, and all the network connections.
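For reference, a per-host Sysdig agent install of this shape looks roughly like the following sketch. The access key is a placeholder, and the exact flags can differ by agent version, so treat Sysdig's install docs as authoritative:

```shell
# Sketch of a per-host Sysdig agent as a privileged container.
# ACCESS_KEY is a placeholder; the host mounts let the agent build and
# load its kernel module via DKMS and observe host processes/containers.
docker run -d --name sysdig-agent \
  --restart always \
  --privileged \
  --net host --pid host \
  -e ACCESS_KEY=<your-access-key> \
  -v /var/run/docker.sock:/host/var/run/docker.sock \
  -v /dev:/host/dev \
  -v /proc:/host/proc:ro \
  -v /boot:/host/boot:ro \
  -v /lib/modules:/host/lib/modules:ro \
  -v /usr:/host/usr:ro \
  sysdig/agent
```

Dropping `-d`, as in the demo, keeps the container in the foreground so you can watch the DKMS build and environment checks scroll by.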
So we installed our agent on the host, it automatically discovered all of this, and had this pre-built mapping. You can actually go down a step further. Let me switch this to container name really quickly and go over to something like MySQL. We're going to automatically detect that MySQL is running and start pulling things like number of requests, number of errors, top queries, slowest queries, slowest tables, all automatically, without you needing to do any additional instrumentation. Here we're actually inspecting that TCP connection and reading the file descriptor, so you just install the agent and all of this is discovered out of the box. So we've seen some of the visibility that gives you automatically. Now let's go in and do some stuff that you wouldn't expect or want to happen in your environment. I'm going to go back over here. Now you can see that we've got the Sysdig agent running, and let's go in and exec into one of my WordPress containers. All right. Now that I've shelled into that container, let's start doing some stuff that really should never happen in your environment. I'm going to take the curl binary and copy it over ls. So basically what I've done is hidden the curl command as something that can be executed as one of your standard system binaries. Then from here I can run ls openshift.com: I've essentially replaced ls with curl and masqueraded it as a different process. So let's go over to Sysdig Secure and see what we've detected. First off, no one should ever be able to shell into a container, based on what we've just seen, because now I can hide everything inside that container and execute stuff across my entire system once I've got access inside. So let's look at what we actually discovered in Sysdig Secure.
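The binary-masquerade trick above can be sketched as follows. To keep the example harmless and runnable anywhere, it hides date behind the name ls instead of curl; the attack shape is identical to the demo's cp of curl over ls inside the container:

```shell
# Sketch of masquerading one binary as another, in a throwaway
# directory. In the demo this was curl hidden behind "ls" inside a
# container; date is used here so the example is safe and self-contained.
workdir=$(mktemp -d)
cp "$(command -v date)" "$workdir/ls"

# Put the directory first on PATH: "ls" now secretly runs date.
# Monitoring that trusts process names sees "ls", not what it does.
export PATH="$workdir:$PATH"
hash -r        # clear the shell's command cache so "ls" is re-resolved
ls +%Y         # prints the current year, not a directory listing

rm -rf "$workdir"
```

System-call-level monitoring catches this anyway, because the syscalls "ls" makes (network connects, in the curl case) don't match the name.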
First off, there's a whole lot more red than before. I can open up that host, click on that wordpress-one container, and switch over to this list view, and we can see: first there's a terminal shell that was run in a container, then there was a write below a binary directory, and then there were system processes that had unexpected network activity. I can click on any one of these and get full details about the scope where it happened, the individual container, the host, and then the full output here with user commands and everything else that was running. I can also hop over to the command history, click on the host, and see every single command that was executed, with full scoping of where it happened in my environment. I always like to do the install live to show how easy it is to get up and running and the out-of-the-box visibility you're going to get, from both a performance monitoring perspective and a security perspective. All right, now I'm going to hop over to a more robust OpenShift environment that we've got running and look at some of the different things going on there. First off, I'll start with our policies. Right now we're looking at them by severity, so we can see some of the high-severity policies running in my environment: different file policies, network policies, things like that. But since you're using OpenShift, deploying and managing services and exposing those to developer groups, you really want to think of those as the logical entities you're trying to protect. So now I can look at my entire environment based on scope and see the policies running across my entire infrastructure, protecting the hosts and containers, and then drill down into specific deployments.
So I can go in where we're using deployment name equals redis and try to detect if there's some unexpected outbound connection from Redis. It's your database; you don't want it reaching out to the outside world, and you can tune these policies on a deployment-by-deployment basis. I can also add other data services to this if I wanted, so we can look at it from that same logical perspective for something like MySQL and drill down from there. From any policy, you can take actions: stop the container, pause the container, or create a Sysdig capture. The really unique thing here is you can tune, per policy, how much system data you want to capture pre and post any policy violation. And since we're writing every single system call to that .scap file, for anyone in the audience used to using something like Wireshark or tcpdump, this is like a tcpdump of your kernel: you're going to see every single system call pre and post that security violation and can troubleshoot it later, and we'll go through some of those examples. Then from here, you can send it out to PagerDuty, Slack, VictorOps, OpsGenie, or send it to CloudForms via webhook, all automatically. All right, now that we've seen how policies are created, let's go look at some of the events that have happened in this OpenShift environment. If you're looking at your infrastructure from a typical perspective, you'll see something like this: a bunch of different hosts with containers running on them, and you can see which containers have had events and things like that. But like we've talked about before, with OpenShift you really want that service-oriented security, and we can switch over to this deployments view here and then drill into something like this Java namespace.
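For a sense of what that .scap capture is underneath, the open source sysdig CLI exposes the same mechanism. Assuming sysdig is installed on the host, writing and replaying a capture works much like tcpdump with a pcap; the file and container names here are placeholders:

```shell
# Sketch, using the open source sysdig CLI. Record every system call
# on the host to a .scap file (like tcpdump -w, but for the kernel).
sudo sysdig -w violation.scap

# Later, replay the capture offline and filter it, e.g. down to one
# container's activity (the container name is a placeholder).
sysdig -r violation.scap container.name=wordpress-one
```

Because the capture is just a file, it survives the container being killed, which is the whole point for forensics.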
One of the easier ways to look at those events is to go back to that topology map and explore where violations have happened based on that logical infrastructure. Here we've drilled down into a specific namespace; we can see all the deployments running in it, drill into a specific deployment, and then actually see where the event happened at a pod or container level, plus all the different network connections and dependencies. So let's drill into this policy violation that happened with Redis. We can see, okay, there's an unexpected outbound connection, and then a sensitive file was read. The really cool thing here is I have the full scoping of where this happened from a logical perspective as well as the host and container. So if I need to go and quarantine that host, I can do that, but I also have full knowledge of which OpenShift deployments, pods, or services this actually went and affected. We can drill down further and look at the output here. We can see, all right, there's an unexpected outbound connection from this f-test command, and then after that a sensitive file was opened for reading by that same f-test command, and they're actually reading from /etc/shadow. So they've immediately come in, gotten access to my hashed passwords, and are reading from them. This is one of your classic data exfiltration examples that could be happening in your environment. From here you can tune this: okay, if I ever see something like this happen again, kill the container, or customize any of the out-of-the-box policies that we have in our environment. And all of these policies have been pulled in because our rules engine integrates with our open source tool, Sysdig Falco.
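The sensitive-file-read detection above can be expressed directly as a sysdig filter (a Falco rule encodes the same condition declaratively). This sketch assumes the open source sysdig CLI and shows the idea, not the product's exact rule:

```shell
# Sketch: flag any process opening /etc/shadow, printing which
# container and process did it. This mirrors the "sensitive file read"
# policy from the demo, expressed with the open source CLI.
sudo sysdig -p "%evt.time %container.name %proc.name %user.name" \
  "evt.type in (open, openat) and fd.name=/etc/shadow"
```

In the product, the same condition is scoped by OpenShift metadata (e.g. deployment name equals redis) instead of applying host-wide.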
So we have tons of policies that have come in from the open source community, things that cloud.gov or Yahoo are using to protect their infrastructure. And since we've been monitoring millions of containers for years with Sysdig Monitor, there are a lot of policies we've auto-generated based on classic workloads or baselines that we've seen from containerized infrastructure. Let me take one of the questions that just popped up: is there an ability to stop an entire pod altogether if a security violation is detected? We can do container-level stopping of those containers. One of the things you can also do is webhook out to the Kube API to go and stop a pod from there, but the enforcement is done at the container level right now. Okay, there was one other quick question. Doug was asking, what is the persistence store technology for all the events and logging for the Sysdig agent? Yeah, great question. Our backend architecture is really made up of two main technologies. The first is Cassandra: all of our time-series metrics and things like that are sent to our Cassandra database. Then all the command history and events, and on the monitoring side we also pull in any Docker event or Kubernetes event, so if you've got a CrashLoopBackOff or something like that, all of that data is sent to Elasticsearch. Those are the two main components. Both of those have REST APIs, so if you want to export any of the data we're collecting, you can send it out. We have a lot of customers taking command history data or event data and sending it out to things like Splunk. Cool. All right. Carry on. Thanks. Cool. Awesome. That was a good break before we get into some forensics, so thanks for sending those in. All right.
So now I can go into this WordPress environment, this namespace here, and drill down into the specific WordPress deployment running there. We can see this policy violation that happened where there's a shell in a container, the same thing I showed before. I can click on it and see the same logical information we saw earlier, but here with this policy we actually recorded a capture, and there are commands associated with it as well. So I can click on view commands, and this is going to drop me back to that point in time when the event happened and give me all the commands that were executed around that specific security violation. From here, I can see, all right, the user spawned a shell, they curled down a URL, and then untarred it. I can click on any row and get really deep information about that specific command. We can see from the full command line here that they actually curled down a rootkit. So this looks sketchy; I'm definitely going to want to do further forensic analysis here. We can see the working directory, the PID, user ID, shell instance, all that kind of stuff automatically. And then that user untarred that download, so they've done some pretty malicious stuff here. I can click over to captures now and do further analysis of that capture from the point in time when the event happened. If I click on this button here, we're going to open up Sysdig Inspect. Sysdig Inspect is our forensics tool for doing a full analysis of every single thing that's been happening in your environment. You can see all file access patterns, network activity, app log messages, your full HTTP requests and payloads; all of this data is captured here in a really easy-to-filter way. So let's set the stage first by looking at when that notification happened, and then I can isolate a specific time range around that notification.
We can actually look at sub-second granularity in here if you want to, and then start overlaying things like file access patterns, network bytes in, and the executed commands we saw earlier in the command history. So this is right here when that user spawned the shell, and then we can see, okay, this is probably when they curled down that URL; there's a bunch of different file access, things like that. Any one of these tiles you can drill into, get really rich information about, and use as a jumping-off point. Right now we can see those same executed commands: the user started a shell, curled down that rootkit, and then untarred it. From here I can drill down and do further analysis of what actually happened when that tar process was executed. If I double-click on that tar process and then switch over to this files view, now I can see every single file that was written at the point in time when that tar process was executed. We can see that the rootkit was basically unzipped: it wrote a README and an install script, all that kind of stuff. From here we can go another step further and use our I/O streams functionality to look at the individual contents that were written to each file. Just taking a step back: we detected that a user spawned a shell in a container, kicked off a capture, and then could drill all the way down to which individual rootkit they installed, how it was unzipped, every single file that was written, and the actual contents written to those files. Any questions right now about the forensics side, Sysdig Inspect, how that works, things like that? There's one question here: can network payloads be inspected and viewed? Yeah, so let's go into this HTTP request. From here we can see basically all the different contents, the headers, things like that.
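As an aside, the same drill-down can be reproduced offline with the open source sysdig CLI against a recorded capture. This is a sketch: the capture filename and the file name filtered for are placeholders standing in for the demo's artifacts:

```shell
# Sketch: offline forensics on a recorded capture with the open source
# sysdig CLI. Filenames are placeholders from the demo narrative.

# Every write the tar process made while it ran (the "files view").
sysdig -r violation.scap -p "%evt.time %fd.name" \
  "proc.name=tar and evt.is_io_write=true"

# Reconstruct the data actually written to one file (I/O streams view).
sysdig -r violation.scap -c echo_fds "fd.name contains install.sh"
```

The `echo_fds` chisel prints the raw bytes flowing through matching file descriptors, which is also how HTTP request payloads in the capture can be read back.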
So with these HTTP requests we're pulling in and seeing the full individual payload; I think that answers the question. Awesome. And then you can get your fork counts, fork trees, all that kind of stuff within here. If you want to see all the different processes that were running, drill down into the individual system calls from a process, all the way down to really every single thing that made a system call during that point in time, you've got full visibility. The really nice thing is it's buffered, so you can see what led up to that security event, how they might have gotten access, and then how it affected all the systems and everything else that ran afterwards. That's about all I had planned on the demo side today. Right now I'd really love to open it up to questions and hear everything else you guys have to say, or if there's anything else you really want me to show. Well, what I would like you to show, while we're waiting for people to ask questions, is perhaps your final slide on how to get a hold of you, so that while we're asking questions, anyone who's watching the video can see where the resources are to contact you afterwards. Here we go. So I actually don't have one of those slides prepared right now. The easiest way to contact us is always on our website. Within there you can start a free trial of Sysdig Monitor or Sysdig Secure, request a further demo, all that kind of stuff. So if you go to sysdig.com you'll have full access. And if you want to follow up with me directly, my email is really easy: it's just knox at sysdig.com. So you mentioned Falco. Where is that hiding on the internet and on GitHub? And maybe a little bit about how you differentiate the open source stuff and the commercial stuff. Yeah, so Falco has got its own GitHub page. Falco is really the rules engine.
So it doesn't allow you to take actions or pull in any of that OpenShift metadata or things like that, but it's our open source security rules engine. It's kind of similar to if you had combined some things from SELinux, some things from OSSEC, some stuff from strace, and built it for containers; that's where Sysdig Falco sits. It's really that behavioral activity monitor, so it'll just send alerts to standard out and things like that. Sysdig Secure is really built for securing and monitoring that entire platform, where Falco is more of a single-host behavioral activity monitor. Perfect. All right. Well, I am not seeing any questions, and your demo was awesome and it didn't crash in any way, shape, or form, so I'm pretty impressed with that, since you said this was the first time you were doing this variation on it. Kudos to you for pulling it off. Again, we'll give people a couple more seconds here if there are any questions; there's a bunch of people on there, but they've asked most of their questions during your presentation. So I'd just say thank you very much for your presentation. I know the folks from Sysdig will be in London with us at the upcoming OpenShift Commons gathering that we're hosting again on January 31st in London. If you're looking for information about that, you can just go to commons.openshift.org and there's information there. There's one person asking a question now about product differentiation between some of the other security applications, like Twistlock or Aqua Security. Do you feel confident about that, or would you like to hold it? We could maybe have a panel on that sometime soon. That's a great panel question. I'll just do the highest level, and then we can follow up directly a little bit more about individual features and functions.
We're a container intelligence platform, so you're really going to have this single agent running on your host that's going to give you full monitoring visibility, full troubleshooting visibility, and security visibility from that single instrumentation point. So it's really one way to get full insight into every single thing in your system, rather than having separate monitoring agents and things like that that are adding more attack surface or more of a performance hit. That's the high level of how we would compare against those, and then I can follow up directly with more information about individual features and how they compare. I think that's a good one for a panel at an upcoming gathering or event; it would be interesting to hear that. But I do appreciate you taking the time today, and everybody who's joined us: we will put the slides up on blog.openshift.com shortly along with the video, and it will be up on our YouTube channel as soon as it's processed and ready, so probably by the end of the day. So thank you again, Knox, for joining us. Awesome, thank you for having me, Diane, and thanks everyone for all the questions you sent in today.