Like I said, I'm actually positively surprised that so many of you are here, because I know it's early; I wouldn't be here otherwise. Let's dive into the topic anyway. I want to talk a bit about auditing and what's happening around that. Generally, you know, security is always something like this: I think yesterday Facebook was in the news for losing, I don't know, 50 million user credentials or something like that. So it's a joy for everybody, and oftentimes we're just saying "it's fine", and then stuff develops, and at some point you figure out, maybe because you hear it in the media, that you are in the media, and then you say "oh, this is no longer fine, and everybody is talking about us and complaining". Basically we don't want to get to that point; we want to figure out earlier that stuff is happening. Obviously there are no silver bullets. I'm not promising you the final solution, nobody should; I'm just discussing one problem and a possible solution. I want to start off with auditd. Is anybody using auditd? How much? A bit, okay. So auditd does auditing of kernel events; in user space it writes out the audit events, and then you can inspect them. You have ausearch and aureport to see what your system is up to, and you can define various rules for what you want to monitor.
Basically it's stuff like file access, network access, system calls, whether they ran successfully or unsuccessfully; all of that can be audited, and you can figure out what has been going on. It generally looks like this: you have the application that is running, and in the kernel there are three possible event sources that the audit daemon can catch: user messages, task events, and syscall exits. On any of these you can then define an exclude rule, and whatever is caught by one of those three and passes through the exclude filter will be logged by the audit daemon. That's basically how you get to the events. Then you have a syntax to define those rules and to get at the audit events afterwards. What that looks like: `sudo aureport` basically shows you "this is what I have in my logs", just the statistics. You can see we have, for example, 29 logins on that system, two users, five terminals, and so on. It's all based on the log, which is /var/log/audit/audit.log. And here, for example, you have one event; this was the audit daemon.
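The rule syntax mentioned above normally lives in /etc/audit/rules.d/ or is loaded with auditctl. As a rough sketch (the key names after `-k` are made up for illustration), typical rules look like this:

```
# Watch /etc/passwd for writes and attribute changes, tag matches with "identity"
-w /etc/passwd -p wa -k identity

# Log every openat syscall that fails with EACCES (permission denied)
-a always,exit -S openat -F exit=-EACCES -k access-denied

# Log all executions of /usr/bin/passwd
-a always,exit -F path=/usr/bin/passwd -F perm=x -k passwd-exec
```

The `-k` key is what lets you find matching events again later with `ausearch -k identity`.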
You can see we have a DAEMON_START event; we have msg=audit, then the Unix timestamp, and after the colon the event ID. So you can see this is when auditd was started, and at the end the result, res=success. Then you have various other message types: you always have an event type, this msg field with the timestamp, probably a process ID or audit ID, the command that was run if it's a process, and whether it succeeded. All of that is in the log, and that's what aureport uses to build its statistics about what is happening on your system. It's nice to have that log file, but once you have more than one system you don't want to look into log files on each host; you want to centralize that and figure out where stuff is happening and what is happening. By the way, if you want to understand the log better, Red Hat has a very nice documentation page where they walk you through various examples of what all the parts of the log mean. Another thing that is nice: they have example rules you can use, which look something like this. On their page you have a set of rules that you can just take and then learn how to define certain things, for example failed login attempts, somebody trying to access something they were not allowed to access, or running something they were not allowed to run. The syntax might look slightly weird, but these are examples that are collected there and you can just reuse them. One thing that I found slightly lacking: if you are doing anything with namespaces and containers, that is still work in progress, so knowing which namespace was doing what.
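To get a feel for that msg=audit(timestamp:event-id) layout, a single audit line can be picked apart with standard tools. This is just a sketch; the sample line is fabricated but follows the documented format:

```shell
# A fabricated DAEMON_START line in the documented audit.log layout
line='type=DAEMON_START msg=audit(1538400000.123:4105): op=start ver=2.8.1 pid=1053 res=success'

# Pull the Unix timestamp and the event ID out of msg=audit(TS:ID):
stamp=$(echo "$line" | sed -n 's/.*msg=audit(\([0-9.]*\):\([0-9]*\)).*/\1/p')
event=$(echo "$line" | sed -n 's/.*msg=audit(\([0-9.]*\):\([0-9]*\)).*/\2/p')
result=$(echo "$line" | sed -n 's/.*res=\([a-z]*\).*/\1/p')

echo "timestamp=$stamp event=$event result=$result"
# prints: timestamp=1538400000.123 event=4105 result=success
```

In practice you would use ausearch rather than hand-rolled sed, but this shows how regular the format is, and also why parsing free-form logs like auth.log is so much more painful by comparison.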
It's broken up into lots of tasks now, which seem to be tackled slowly, but that is still kind of work in progress. So the idea now is that we want to get all of that information and ideally centralize it somewhere, and not have to log into each instance and run aureport. Why am I talking about that? I want to centralize all of that, and I work for Elastic, the company behind the Elastic Stack. The ELK Stack, anybody using that already? Okay, probably directly or indirectly somewhere. I always have the warning: yes, this is kind of a hello-world example, it's very simplistic, but all the code and everything is online, so you can try it out afterwards if you want. That is our stack, if you've never seen it: Kibana is what you generally interact with for the UI, Elasticsearch is where you store data, and then we have two components to collect data, and the Beats are actually the thing that is collecting all of the stuff that's relevant for us. That's the good old ELK Stack, that's where the name is coming from, and that's why we have the elk. The problem was there is no B in ELK, and at some point we came up with this: the ELK with a bell, the "ELKB". But we're always about scaling, and at some point even marketing realized this is not very scalable, because what happens if we add another open source project? Then we have to add another letter and redo the entire thing. So we had that for a short while.
This is the official ELKB with the bell, but we kind of got rid of that, and now we just call it the Elastic Stack. Whatever open source projects we have, we can just put them into that stack, and we don't always have to redo the marketing. Anyway, so now it's just that stack, all Apache 2 licensed, so you can go crazy and use it. We have something called Filebeat, which is generally there to collect log files. I always say it's like `tail -f`, but over the network and on steroids. Basically what it's doing is tailing the various log files that you have on your system, and it tries to get meaningful information out of them. What that could look like: if you've never seen Kibana, this is Kibana. I'll just show you some pre-built dashboards since, well, I'm lazy, I didn't build them myself. This is not the one I wanted. Let's say we want auditing events, and we have this Filebeat thing here, which is basically tailing the log file I've shown you before, and it's just collecting those events. Over the last 15 minutes we probably didn't have that many events, so let's take the last 24 hours, because I set that instance up at night. You can see here that's when I created my instance, then we had a lot of events, and you can generally see what kind of events we collected.
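A minimal Filebeat configuration for tailing the auth log might look like the sketch below. This is an assumption for recent Filebeat versions, not the exact config from the talk; the paths and output host are placeholders:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/auth.log   # Debian/Ubuntu; RHEL-family systems use /var/log/secure

output.elasticsearch:
  hosts: ["localhost:9200"]  # assumed local Elasticsearch instance
```

Filebeat also ships a `system` module that parses auth-log lines into structured fields, which is what feeds dashboards like the SSH one shown later.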
Is it something that a user did, did we add a group, was there some path access; you can see which users were doing things, what commands they were running, maybe some GeoIP information, though I don't have that in my example here. All of that we can parse out of the auth log. The problem is that the auth log is a pain to parse, because every line looks kind of different and has a different number of elements, and at some point we were kind of fed up with how hard that is, and we redid the entire thing. So we've seen the demo, but then we wrote a more specialized Beat. A Beat is a lightweight agent or shipper, and we have various Beats for different purposes: Filebeat to tail log files, Metricbeat for system and application metrics, Packetbeat for network data, Heartbeat for pinging, and Auditbeat for anything security related. The idea of Auditbeat generally: we're using the auditd rule syntax, so you can just reuse the rules that you had for auditd, but we're correlating related events immediately, we're resolving the user IDs, and we push that to Elasticsearch directly. So you don't need to write a file and then parse it back with a mess of regular expressions. Why did we not use eBPF? Because eBPF needs fairly new kernels, and a lot of stuff is running on, I don't know, CentOS 6 or whatever ancient kernels
people are using; and, well, we wanted to have a solution for those as well, which is why we went the auditd route. It gives you a lot of powers, maybe not all of them, but there is a lot you can do. You can run it side by side with auditd, and we think, at least, that it's easier to configure than auditd. And it supports Docker, because we can enrich events from the Docker daemon: we basically know in which container or namespace something is running, so we have the container side covered as well. Just to give you a quick idea of what that looks like: auditbeat.yml, if I could type it. No. Oh yeah, that's true, thank you; it's too early for me. This is generally the configuration that you have. First off we're saying, okay, we don't want to rate-limit this, this is how many events we keep in the backlog if we cannot reach Elasticsearch, do we want to include raw events, and so on. And then, after that pipe, we have the general definition of our rules. This might be slightly hard to read on blue, but there are different kinds of things we can do: we can watch files, that's `-w`, or we can define audit events, and those always start with `-a`, followed by an action and a filter. The action can be `always` or `never`, and the filter defines which kind of events you want to collect. You can list syscalls with a capital `-S`, and with the `-k` keyword we can tag events so we can find them afterwards. I have set up some rules here, for example: everything that affects identity I tagged with `identity`, so anything that touches the files /etc/group, /etc/passwd, /etc/shadow, and so on, any change to those, will be logged with `identity`. If my developer user, and I'll demo that afterwards, is reading /etc/passwd, that is being logged and tagged as a developer passwd read. We're logging permission errors.
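Putting those pieces together, a hedged sketch of what such an auditbeat.yml might contain; the tag names mirror the examples from the talk, but the numeric `auid` and other details are assumptions and vary by version:

```yaml
auditbeat.modules:
  - module: auditd
    audit_rules: |
      # Anything touching the identity files gets tagged "identity"
      -w /etc/group -p wa -k identity
      -w /etc/passwd -p wa -k identity
      -w /etc/shadow -p wa -k identity
      # A specific user (audit uid 1001, assumed) reading /etc/passwd
      -a always,exit -F path=/etc/passwd -F perm=r -F auid=1001 -k developer-passwd-read
```

The `audit_rules` block takes the same syntax auditd uses, which is exactly the point: existing rule sets can be reused almost verbatim.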
We are checking for specific processes and network connections. We're also checking, for example, this one here: is one of my sudo users using their powers to read the home directory of another user, which might also be interesting. We're logging all executed processes and anything that is elevating privileges. So those are just some example rules. The other thing that we add over auditd is that we can check the file integrity of specific files or folders. Here I'm just checking my web root, and any time somebody changes my website I will log that and know which user has been changing my files. So if you're suddenly serving some weird content or malware, you want to monitor what you're serving, and then you can figure out at least which user was changing that content and when; then you know who did it, so you can lock that user out, and you also know when you started serving something bad. In the end I'm tagging my instances with some other attributes, and further down you have the connection string to Elasticsearch and things like that. Okay, let's try to do something here. First off, I want to SSH into my instance. Let's assume this is failing; let's try passwords that don't work. Okay, that has failed. Where will that information end up? The first thing that I'm using here is actually just a regular log file, /var/log/auth.log, which I can tail with the regular Filebeat, so you can get to that information as well. Just to give you an idea of what that looks like, because, well, everybody is trying to brute-force SSH: in the last 15 minutes this was just me who tried. No, this was not just me. This is the last 24 hours; okay, somebody found my instance again and tried stuff.
Let's switch over to a longer time range. Okay, somebody has been trying to brute-force my instances quite aggressively. Nobody was successful in the last 15 minutes, which is also nice. Obviously this was me when I tried to log in, and probably this was me too; let's see where the root requests are coming from. Okay, it's China. Normally it's either China or Russia when you let this run for a while. So you can see here, from China we had a bunch of login attempts, and those were the three failed login attempts from my own side with that elastic user. Now, by the way, let's say I'm only interested in the European ones: this is a filter where you draw the geo bounds, and then you can see this was just me trying to log in, and everything else was coming from China. And you can see this was me; since I'm roaming on my phone, it thinks I'm in Austria, you always end up in your home country, basically. Let's get rid of that filter again, and just to show you, over the last seven days, since this has been running longer already, you can see somebody was very active, and today somebody found my instance again and tried their luck. And you can see these were my own successful login attempts, this one with the public key and this one with a password, which you should not do, but to keep the demo simple I'm using a password. You can also see on a map (a) where the users are coming from, and yes, it's mostly China, and (b) what usernames they are trying; it's always funny what people try, and you get an idea of what systems people are trying to attack. So this is the auth log; that was just a side quest to see how that is going. Now let's log in correctly, and I now want to restart nginx.
Let's say `service nginx restart`. And I'm not allowed to do that, because I need to log in as another user. Let's say I want to use my elastic admin user; I hope I remember all the passwords. I authenticated correctly, and now another event should be logged. First off, just to give you an idea of what that looks like: in the auditing events we have a general overview. Let's not take the last 30 days but just the last 24 hours, for example, and you can see here I recreated my instance, then I had a lot of events of executed programs and login events, and now I've been doing stuff again, logging in, authenticating, so you have more events. You could filter down to specific events now and see just those, or which errors happened, and things like that. But the dashboard I'm interested in now is not the overview: if I head over to the auditing events, I have executions, and I've just restarted nginx, so I should be able to see that. Let's filter down to the last 15 minutes again, and you can see this was my elastic user who restarted something. If I filter down to the elastic user, you can actually see down here all the actions that this user has taken; here I was restarting nginx as that user. By the way, I didn't really show you the raw events; maybe that's something I should do as well. These are all the auditing events in the last 15 minutes, I had around 4,000 events, and if you unfold one of them here you can see the auditing data, and whether something was successful, like my root user unsetting something in sshd. You have all of that information, and we also enriched it with some other information.
For example, here you see the host information, so you know where this is running and what operating system it is. And since this is running on AWS, I also enriched it with the cloud information, so you can see which instance ID, which region, which availability zone, and so on. You can then filter down and drill into whatever operating system you're interested in, or which availability zone, or wherever something bad is happening, to actually get to those events. Okay, so that was this user; let's do something else. Let's log in with my admin user. And let's say we're interested in home directories: I'm the admin user, but I want to take a look at what my regular user is up to. Let's see, we have a secrets file here; well, that looks kind of tempting. So: the secrets.txt in the elastic user's home directory. This will require sudo, obviously, and if you run it with sudo it tells me, okay, the content of the secrets file is "my secret", which is not all that interesting. But we have collected all of these events, and we can filter down on those as well.
I basically have a filter here, and the tag I'm interested in is power abuse: this was the rule where I said that if an admin or a user with sudo permissions is reading the home directory of another user, I want to log a power-abuse event. If you filter down on that, you can actually see who has been up to what: you can see this user has used his root privileges, with `less`, to read this file. It's kind of cool that you can track what everybody has been up to on your system. Okay, since we're running low on time, the one thing you need to take my word for now is that the file integrity part works as well: if a user were changing the website on my server, it would be logged too, which user changed my website and when. Depending on the operating system, we use different kinds of event handlers to watch for changes, and what we're doing is hashing the files in those folders. The default is SHA-1, which is kind of a nice trade-off between performance and hash quality. If you want the most performant one, XXH64 should be the fastest hashing algorithm we have in there, but you can pick the hashing algorithm yourself. Basically it's hashing all the files in your folders, and whenever one changes it knows, okay, this user changed this file from this hash to this hash, and those are just events that we collect as well. Let's skip that demo for time. To conclude, I always explain our stack as a bit of Lego, because you have all these building blocks and then you can build whatever you want. It's not a solution you buy, it's not some cloud service that you just plug in, but you can configure whatever makes sense for your environment; there is some plugging involved.
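The hash-and-compare idea behind that file integrity check is easy to sketch outside of Auditbeat too. A toy version using sha1sum in a temporary directory (sha1sum and mktemp assumed available):

```shell
# Set up a throwaway "web root" with one file
dir=$(mktemp -d)
echo "hello" > "$dir/index.html"

# Record the baseline hash of the served content
before=$(sha1sum "$dir/index.html" | cut -d' ' -f1)

# Somebody defaces the served content...
echo "defaced" > "$dir/index.html"
after=$(sha1sum "$dir/index.html" | cut -d' ' -f1)

# A differing hash means the file changed since the baseline
if [ "$before" != "$after" ]; then
  echo "index.html changed: $before -> $after"
fi

rm -rf "$dir"
```

Auditbeat does the same thing continuously and additionally records which user made the change, which is the part a plain hash comparison cannot give you.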
So you need to put it together in the right way. We've quickly looked at auditd in general and why it's a pain to centralize those logs or events; we've looked at Auditbeat and what it can do, and then at some logs and dashboards and what was being collected there. By the way, why did we start with this? We basically had to dogfood something, though we never like to say dogfood, we always say "drink your own champagne". Why? Well, we have a cloud service, and there they wanted security events, so they started off with Auditbeat to have something that audits all the events they have, and that's how we got here. If you want to play around with it, you can log in; you cannot create any new visualizations and you cannot delete any data, but you are logged in automatically into a dashboard, and then you can play around with all the dashboards. If you want to create your own events, this is not the sudo user but a regular user, you can also SSH into the box and just play around with it. So if anybody feels like giving it a try themselves, this is where you can do that. And by the way, if you want to see the configuration and the code, maybe I should add that to the slides. Generally, this is the repository; it's all automated, so I have Terraform and Ansible to set up my cloud instance, and in the templates you have all the configurations. In this example you will mainly be interested in the Auditbeat and Filebeat ones, but you can get all the configurations and what I've put together there, if you want to check it out and apply it yourself. With that, I think we're done and pretty much on time. Are there any questions? By the way, before you run off, I always take a picture so I can prove to my colleagues that I've been working today, because they normally don't know where I am. So smile everybody, wave. Yeah, wave. Thank you.
Any questions? We have time for, like, two more minutes; come to me afterwards if you have more questions. By the way, I have stickers over there; grab them on your way out if you want some. [Question] About the xxHash: is it really a good idea to use a non-cryptographic hash for security validation purposes? Well, it depends; I guess it's a bit of a trade-off between speed and how strong it should be, but you can pick from lots of algorithms. Is that okay or not? At least it's our current approach. We are mainly concerned with speed, actually, to be honest, especially if you watch larger folders, so that it will not slow down your system; you can also throttle how much scanning you want to do per minute. So our main concern was speed; it's not about having the perfect cryptographic algorithm for that, and, I mean, collisions are not that common, depending on your algorithm. [Comment] Most modern server CPUs will have acceleration for the common SHAs, and they're typically in the range of gigabytes per second in terms of hashing speed, so it should be acceptable as long as you use that acceleration. Okay, yeah, I mean, we always have this "I'm running this on my Raspberry Pi and you killed my Raspberry Pi"; the use cases are very wide, or we have a very constrained Docker environment, so we're a bit cautious. But yeah, switch out the algorithm if you want another one; I think that should get you covered. Okay, we have time for one more question. [Question] I've seen the Terraform and Beats configuration in the GitHub repository; do you also publish the Kibana dashboard setup there? All the dashboards I've shown are built in; I didn't build a single dashboard, all of that is built into the Beat. Basically, when you connect the Beat to Elasticsearch, you can just say "insert those dashboards for me", and they're there. So all the visualizations you have seen, everything is built in already. Okay, I will try it out, thank you.
Sure, either online, or just install it yourself. Thanks a lot. Thank you, Philip, and give him a round of applause.