He is giving the presentation on serverless log analysis on AWS. George is an incident responder at Verizon Media, where he has the chance to work on complex problems at scale. He's originally from Greece and has been living in the US for the past four years. He got his master's in cybersecurity from Stevens Institute of Technology in Hoboken, New Jersey, and holds the GCIH and GNFA from GIAC. Please give a warm Blue Team Village welcome to George.

Thank you. OK, wait, because I have a loud voice, let me fix this. Hey everybody, how are you all doing? All right, we're going to talk today about serverless log analysis on AWS. Hopefully I'm not going to keep you long, so you can go to the parties afterwards. Hope you guys are having fun. I'm George, as was said; I'm a security engineer at Verizon Media. I work for the Paranoids' FIRE team, which is our Forensics and Incident Response Engineering team. This is my handle if you want to follow me anywhere. I'm also a first-time speaker, so if I stumble or anything like that, that's why.

So first of all, why are we having this talk? We're having this talk because of the security challenges that we have on the cloud and, as a result, all the different challenges that come with them, right? Everybody wants to move to the cloud, whether it's because of auto scaling, or because they want more space, or they need more power; they have all this infrastructure and they want to move it to the cloud. Everyone wants to move to the cloud. And that's where the problems come in, because engineers are still learning the tech stack of the cloud and all the new services that companies like Amazon or Google are spinning up all the time, so they're not able to keep up. And my last point on this slide: what about cloud security? Because everything in IT, as far as I know, is built without security in mind at the beginning, so we have to build it in afterwards.

So yeah, everyone wants in, and then we have fails all the time, because people create Git repos and copy AWS keys into those repos. And there have actually been a lot of critical incidents out there because of that. I don't know if you guys heard, but like two weeks ago a big bank, I'm not going to say the name, had a security incident because of some AWS keys that were exposed. I'm not going to say more about that. Then we also have security group misconfigurations, and other AWS resource misconfigurations, that always cause problems, right?

But what happens if we have amazing cloud security teams? What happens if our engineers are amazing at what they do? We still have fails. Even if we have the pros defending us, we still have fails. Why is that? Because we always have the server-side vulnerabilities that are never going to go away. We have SSRFs and RCEs out there, anything in the OWASP Top 10, anything that can lead to exposure in our on-prem environment but also in our AWS environment. Does anyone recognize this IP here at the bottom? If you don't, this is the metadata service that AWS has. Essentially, if you get an SSRF on a host that is hosted on AWS, you're going to be able to pull a lot of information out of it, and potentially AWS keys that will then give you access to the rest of the environment, right? So with that, we know that our environment is going to get compromised.
Whatever we do, it's going to get compromised at some point, right? So apart from the defenses, and I'm not going to say anything about security defenses in this talk, we need the forensics. We need to know the artifacts. We need to know where all our logs are. We need, first of all, to enable logging, because some people do not even enable logging. We need API logs, access logs, and even more.

So here I come to CloudTrail. Who here knows what CloudTrail is? Let me see hands. Awesome. CloudTrail is essentially a service that records all the API calls, and any other AWS actions, that happen in your AWS environment. It records all the activity from EC2, it can be security group changes, anything that happens with auto scaling, like EC2 instances being spun up or shut down, anything like that, or any of the other services that I list here. By default it's enabled for up to 90 days, and you can access it really easily if you have a single account: you go to the CloudTrail service and access these logs. They're going to have everything that happens in your environment, apart from things like host-based logs, network logs, et cetera; it's just for your AWS resources. Beyond that, you can store these logs in an S3 bucket and keep them longer, so depending on your retention policy you can store them there for six months, a year, whatever your policy is, and you can even move them to colder storage on AWS.

Now, some of the most important artifacts. This list is not exhaustive, but these are some of the artifacts that we need to know on AWS. For example, we can catch script kiddies with the user agent, because if there's a script kiddie attacking your environment, you're going to see unusual user agents making changes in your environment, or accessing your environment, that probably shouldn't be there, right? And that's something you need to investigate. Then, in actually every CloudTrail log, you have the source IP address. It's a good practice to keep an inventory of the IPs that normally access your AWS environment, and then if you see something odd, a new IP or something like that, you should investigate. It might just be a developer working from home, but you definitely need to investigate. Then we have the event name and event source, which define what you're looking at, and the user identity fields, which essentially identify the identity behind the actions that are happening.

Then I really want to point out the response elements, because it's going to matter as we move forward in this talk. The response elements are going to be in your CloudTrail logs only when there is a change in your environment. So if you see someone in the CloudTrail logs trying to make changes to security groups, but you don't see anything in the response elements, like the field has no information, it means they failed; no change was made. But if the response elements field is populated, it's going to have all the changes that were made. And last but not least, the error code.
The error code field in a CloudTrail log is really important, because there you're going to see any access-denied activity; any user that was not supposed to access some resource is going to generate an error code, and you're going to find it there. It's a good indicator that something's wrong.

So now let's talk about the traditional methods of doing log analysis on CloudTrail logs, and I want to run a quick quiz here in the crowd. Who here has actually done investigations on CloudTrail? All right, not too many. Who of you uses grep or jq or any other command-line tool? All right. Who of you has an ELK stack that you just send the logs to? OK, so we have half and half. For the ones who use grep or jq or any other command-line tool: it's amazing, it's really cool to have that knowledge and do it on the command line, but all these CloudTrail logs are already stored in an S3 bucket, and you have to go download them locally, or to a server, to do the analysis. First of all, it's costly to download 10 or 20 gigabytes of logs from the S3 bucket, and it's also more time-consuming. Then, for the people who have an ELK stack set up, or use, for example, the AWS Elasticsearch Service: it's amazing, but it's so costly, it's so costly. You can also use CloudWatch, which has some limitations, or, if you have a lot of money, I don't know who you work for, you can have your AWS logs sent to your SIEM and pay all that money, right?

But wait, why did we come here this late on a Friday, right? To talk about something that is more efficient, that scales, and that is fast. So how do we actually do AWS logs at scale at Verizon Media? We have somewhere around, I'm just going to say more than a thousand AWS accounts; I'm not going to say the exact number. So we have to find a way to manage all this, right? And if we want to do an investigation, how are we going to find those logs? What I suggest you guys also do, if you have more than tens or hundreds of accounts: you need to have all the different account owners send the logs your way. You need a centralized security AWS account where all the logs are centralized, like a SIEM, except it's just going to sit on AWS in an S3 bucket. So you have them all pointed at your S3 bucket, where you have them stored, where you keep backups, where you have them encrypted with a key that stays with you, et cetera, et cetera. And please, first of all, encrypt. Don't have the logs just sitting there, because they're really valuable for an attacker as well: they may not let them exfiltrate data or pull off another breach, but if they find your logs, that's a really good resource for them. So let's say we did all that and we have them centralized. What do we do now?
For those of you who don't know, Athena is actually the ancient Greek goddess of wisdom, and she also happens to be a serverless query service on AWS. Athena is amazing: she lets you query structured data stored in an S3 bucket, and the only thing you need to do is point her in the right direction and she'll do all the magic. I mean it, no really, she will do all the magic, but there are some details we're going to talk about.

So, how it works. You first need to have your logs stored in an S3 bucket. And wait, we already have these logs, the CloudTrail logs, stored in an S3 bucket; we don't need to do anything, we just need to point Athena in the right direction. Then you define the schema in a CREATE TABLE query, which actually already exists; you don't need to know Hive or write CREATE TABLE queries. It's already published by AWS: if you search for "create table CloudTrail", you're going to find the query ready for you. The only thing you need to do is change the location of your logs and name the table the way you want. What Athena essentially does is look at the logs, query them, and create a table for you. So what do you need to do afterwards to do your investigation? Just SQL. You don't need anything else: you just go to Athena, use SQL, and do your investigation. No downloading, no ELK stack, no big money.

And as we said, Athena is a serverless service: you don't need to spin up anything, no instances. It's fast, cost-efficient, and scalable. It's actually really, really cheap: it's five dollars per terabyte scanned. And if that doesn't make a lot of sense to you, what five dollars per terabyte means is that AWS essentially charges you per query, and I'll give you an example. Let's say you have some monitoring on your CloudTrail logs and you have Athena scanning one gigabyte, which wouldn't even be the case, you'd usually have a much smaller amount of data on CloudTrail, every hour, 24 hours a day, for 30 days. So again: one gigabyte of logs, every hour, 24 hours a day, for 30 days, is only going to cost you three dollars and 56 cents. Do you know how much it would cost to just have an ELK stack sitting there not doing anything, just an ELK stack on the cloud? A hundred and fifty dollars for a month, just sitting there, not doing any analysis; imagine doing the analysis as well. It's really cheap, it's really efficient, it's essentially Presto in the cloud, and it's available through the AWS console, which you're going to see right after in my quick demo, through an API, and also through the AWS CLI.

So essentially, the framework I'm suggesting today, for all of you to take back home and work on with your teams and hopefully implement, and you're going to see how to implement it in the demo, is a collaborative framework for teams to work together. You essentially have a space in AWS Athena where you can save queries, and other analysts can reuse them and make them better. They don't need to have any SQL skills; they just need someone who creates an amazing SQL query for investigations, and they can go back and reuse it. Then management actually gets great oversight of your investigations, because Athena stores your queries for 90 days. So let's say you work on an investigation and your management wants to check out your results: you give them a report, but they can also go into Athena and see the results from your queries, and which queries you ran, going back 90 days. And AWS doesn't even make you run a query again; it stores the results for you, and you can just click on a past query and access its results. It's that simple.
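(For reference, going back to the CREATE TABLE step: a trimmed sketch of the documented query described above is below. This is illustrative, not the full AWS version; the field list is abbreviated, and the bucket name and account ID are placeholders to swap for your own.)

    -- Trimmed sketch of the AWS-documented CloudTrail table definition;
    -- see the Athena docs for the complete field list.
    CREATE EXTERNAL TABLE cloudtrail_logs (
        eventVersion STRING,
        userIdentity STRUCT<
            type: STRING,
            principalId: STRING,
            arn: STRING,
            accountId: STRING,
            accessKeyId: STRING,
            userName: STRING>,
        eventTime STRING,
        eventSource STRING,
        eventName STRING,
        awsRegion STRING,
        sourceIpAddress STRING,
        userAgent STRING,
        errorCode STRING,
        errorMessage STRING,
        requestParameters STRING,
        responseElements STRING
    )
    ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
    STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    -- Placeholder bucket and account ID; point this at your own CloudTrail prefix.
    LOCATION 's3://your-cloudtrail-bucket/AWSLogs/111122223333/CloudTrail/';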
OK, with all that said, on to the next slide. Over here I have some simple queries for different types of investigations for you. The first one at the top is a query where we look for what I talked about before, the error code: we can look for unauthorized operations on AWS resources, or access denied. The second one is really important, and it's actually one we could have used on the breach that happened two weeks ago: we look for a specific key that has potentially been compromised, or that we know was exposed, and we can see all the activity that happened through that key. As you see over here, I have the database and the table in the query, because these queries are in my saved queries tab; I save all the queries for the other people in my team to use, and this is just a placeholder: each time you can change the database and the table, and just put in the key you want to look for. Then the last one just looks for changes in security groups.

Next, and actually I should have put this slide after the next one, because I haven't yet said that we can use Athena for other logs, but over here, as you can see, the first one just checks for console logins on your AWS resources. Then the next two, as you're going to see on the next slide, show that you can use AWS Athena for more than CloudTrail logs; you can use it for essentially any structured logs that you have on AWS or want to put on AWS. The first one looks at Apache access logs, and you can see a path traversal attack, essentially trying to find a password file on a web server. In the second one, you look for whatever string you want, maybe a SQL injection, or an XSS, or anything else you want to look for in your Apache access logs, that got a response code of 200.

So the point here is that Athena can work with any type of logs you want, as many logs as you want; they just have to be structured. You just need to define the schema, and for some of them, maybe most of them, the schema is already out there; AWS has the schema for a lot of them. Or, if you have a custom log or something you couldn't find, you can define the schema yourself, and by schema I just mean the different columns and the different fields that your logs have.

Let's go to the demo. Let's see if this video is going to play or not. So here I'm just on the AWS console, and we can easily access Athena: you'll find it in your history if you've used it before, or just click Athena in the services tab. Now we are inside Athena from the console. On the right you see the query editor, where you can write your queries, and you can have up to 15 queries open at the same time; we're going to see that afterwards. On the left you have your databases. This is a personal account, so I don't have a lot of databases set up, but what I recommend you guys do is have a database per type of investigation: for example, a database called CloudTrail, a database called Apache Access Logs, a database called ELB Logs, and then a table for each of your investigations, named after that particular investigation. Over here we have the saved queries tab, which I already spoke about: you can have senior analysts, or the amazing SQL engineers you may have on your team, create really good queries that your team can then reuse in the future to perform investigations.
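(For reference: the slide queries themselves aren't legible in the recording, but hedged sketches along the lines described, assuming a CloudTrail table like the one created above, might look like this; the error codes, key ID, and table name are illustrative.)

    -- 1. Unauthorized operations / access denied, via the error code field
    SELECT eventTime, userIdentity.userName, eventName, eventSource, errorCode
    FROM cloudtrail_logs
    WHERE errorCode IN ('AccessDenied', 'Client.UnauthorizedOperation');

    -- 2. All activity performed with a potentially compromised access key
    --    (the key ID below is a placeholder)
    SELECT eventTime, eventName, eventSource, sourceIpAddress, userAgent
    FROM cloudtrail_logs
    WHERE userIdentity.accessKeyId = 'AKIAEXAMPLEKEYID'
    ORDER BY eventTime;

    -- 3. Changes to security groups; a populated responseElements field
    --    means the change actually went through
    SELECT eventTime, userIdentity.userName, eventName, responseElements
    FROM cloudtrail_logs
    WHERE eventName LIKE '%SecurityGroup%'
      AND responseElements IS NOT NULL;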
They don't need to come up with their own queries; they have them over here, and it's just that easy. The only thing you need to do is select the query and fill in the placeholders that I have in each one, which are going to be the database, the table, and the location; I'm going to talk about the location right after. Let's move along a little bit.

Then there's the history tab. In the history tab you have all the queries that you've run for the past 90 days. These queries are saved, together with their results, in a new S3 bucket actually, and you can access them within 90 days, and the amazing thing about it is that you don't have to run them again; the results are right there. And as you see, if you look at the encryption type field, I have them all encrypted. This is actually really important, and it's a rookie mistake that I made in the beginning: you have to go to your settings, right here at the top right. It's not going to look like that in the beginning; it's going to look like this, and if it's like this, it means that when you run a query, your results are not encrypted even if your CloudTrail logs are. So make sure you enable this and select the KMS key that you already use to encrypt your CloudTrail logs; otherwise we're failing as security folks, right?

OK, so now let's go back. Actually, let me stop this so I can explain. This is a personal AWS account that I had, and I used an amazing project that's out there, if you want to test attacks on AWS, and defenses, and CloudTrail logs, and Athena: it's called CloudGoat. It's on GitHub, and it's an amazing project by some amazing folks. They have some CloudFormation scripts, again, it's called CloudGoat, that create a vulnerable AWS environment, and they even have scenarios for you to run different types of attacks; I think they have five or six different scenarios you can attack. So I created an environment in my own personal account, I didn't want to use our corporate data of course, and then I ran a scenario so we'd have logs and could see how this works.

So first of all, let's go and create a table. We already have the logs, and this is the CREATE TABLE query that's already published by AWS; you don't need to create it, right? It's a Hive query that creates the schema, and the only thing you need to do, over here at the bottom, and I did it really quickly because I edited the video, is change the placeholders, which are going to be your account number, the location, the region, and the date for which you want Athena to go and create a table from these logs. So over here, really quick, run query, my table was created, and then you can click preview table to check whether your table was actually created properly. If there was a mistake, and sometimes there are mistakes, most of the time you just need to fix the schema. Over here on the left you can see the schema of your table, which is essentially all your different CloudTrail fields; each of these is a column in your table. Perfect.

So in our scenario, we were notified that this user named Solo had some suspicious activity happening on his account. So we have... wait, this happened really quick, let me go back. You see, I went to the saved queries and selected the user activity query that I already have there from my senior colleague, because, let's say, I don't know SQL. I select this query, and the only thing I need to do as a junior analyst is change the user name I'm looking for, and maybe also change the database and the table that I want to query.
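(For reference: minimal sketches of what these two saved queries, the user activity query selected here and the suspicious-IP pivot used a moment later, could look like; the table name, user name, and IP are placeholders.)

    -- User activity: every event tied to a given user name
    -- (userIdentity is a struct, hence the dotted field access)
    SELECT eventTime, eventSource, eventName, sourceIpAddress, userAgent
    FROM cloudtrail_logs
    WHERE userIdentity.userName = 'solo'
    ORDER BY eventTime;

    -- Suspicious-IP pivot: every event originating from one source IP
    SELECT eventTime, userIdentity.userName, eventName, eventSource
    FROM cloudtrail_logs
    WHERE sourceIpAddress = '203.0.113.10'
    ORDER BY eventTime;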
And as we're going to see here, at first I didn't get results, because I had no idea how to change it properly, and I didn't edit that out of the video: I have to look at the user name inside the user identity, because user identity is a struct. Sorry about that; that's why that happened. OK, and now we get only two results, and we see that this user Solo was actually accessing some CodeBuild projects and listing the projects. Why would that be? We don't know exactly what happened yet, but we see there's a source IP address over here, so we're going to check whether there were any other actions from that IP address. We copy it, we go again to our saved queries, where we have one for looking into suspicious IPs, I've already put it in here, run the query, and we get all the results that came from that IP. So we pivoted from user Solo to that particular IP, and we see, oh, there are so many different logs over here. Let's see: from this IP, first user Solo was accessed, but then something else happened. Another user, Calrissian, got accessed, and you see some actions over here in the event name, CreateDBSnapshot: they are creating a snapshot of one of our RDS databases. This is serious, so let's see what's going on afterwards. If you go down in the logs, I didn't have time for a really thorough demo, but if you look down here at the bottom, you can actually see that the attacker was able, with the credentials they got through CodeBuild, to create a snapshot of our database, because the user Calrissian, whose credentials they accessed, had those permissions; then they changed the master password of the database. So then they had access to our secret database, with all our secrets over here, and we can see all of that in the CloudTrail logs, all the actions, and we can query it really easily with Athena. OK, let me see how I can get back into presentation mode; anyway, you'll see my bookmarks too.

So apart from all this, what else can we do? We can do threat hunting, and we can automate all of it: we can have queries that we use for our threat hunting activities, and we can have automated hunting in place using Step Functions and Lambda functions in AWS. For the people who don't know, Step Functions is essentially a way of putting your serverless resources together so they work as one; think of it as an orchestration tool. You can have Athena query your logs, then feed the output to a Lambda function, then have the output from the Lambda function create another table, and so on. Something I didn't add to this presentation, actually, but that I want to mention, is that you can even get really amazing diagrams out of this using QuickSight. Check it out; I didn't add it to the presentation, but it's something you really should look at. It's called QuickSight, and you can feed the results from Athena into QuickSight and get some really cool diagrams out of it.

And I also want to give you guys a suggestion, whether you use AWS Athena or not: a really cool way to track abused STS keys. STS keys are the temporary credentials that AWS creates and assigns to your EC2 resources. They're created automatically and assigned to an EC2 instance; they're not created by you, and they're temporary, I think the default is six hours. These credentials are actually the ones we mostly see being abused in the wild, because these are the credentials someone is going to grab using an SSRF attack on your environment. So here's the way we can track them, even if we don't know that something was exposed on a GitHub page or something like that. We can find all the AssumeRole events in our CloudTrail logs, and when we have an AssumeRole event, that means a key was assigned to an EC2 instance. Normally, that key would then only be used from that particular EC2 instance. Sometimes developers do some funky stuff, so you might see some edge cases, but normally that's what happens. So if you look at the AssumeRole events, you're going to see where each particular key was issued, and if we don't have a developer edge case and we see a key being used from a different IP or a different EC2 resource than the one in its AssumeRole event, that's something we're going to investigate.
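(For reference: not the exact query from the talk, but since temporary STS key IDs start with "ASIA", one hedged sketch of this idea is to surface keys that were used from more than one source IP.)

    -- Temporary keys seen from multiple source IPs; after a key is issued
    -- via AssumeRole to one instance, more than one IP is worth a look
    SELECT userIdentity.accessKeyId AS access_key,
           COUNT(DISTINCT sourceIpAddress) AS distinct_ips
    FROM cloudtrail_logs
    WHERE userIdentity.accessKeyId LIKE 'ASIA%'
    GROUP BY 1
    HAVING COUNT(DISTINCT sourceIpAddress) > 1;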
So if there's one takeaway for you from this talk, this is the one: take it and go look at your environment.

We can also make this even faster if we partition the data that we have in CloudTrail; in some cases you can make it even 99 percent faster and up to 90 percent cheaper. We can compress and split the files, bucket the data, optimize file sizes, and also just optimize our SQL queries as well.

In conclusion, AWS Athena is fast, cost-effective, and easy to use, and I hope each and every one of you considers it. Go play with it. I definitely recommend setting up a personal account, using CloudGoat to create logs that you can look at afterwards, starting to do some investigations on your own, and then potentially bringing it to your organization. Thanks, that was my talk. If you have any questions, grab me here or later.

So you're asking about the history. Right now AWS keeps the Athena history for only 90 days. The queries and results are stored in an S3 bucket, so you could manually go to that bucket and move them somewhere else, but on the dashboard you're only going to see 90 days. I looked into that; they keep just 90 days of history.

So it's data scanned, and most of the time you don't even reach one terabyte of CloudTrail logs. Most of the cases that I've worked on have been, I don't know, 5 gigabytes, maybe 10 gigabytes if you look at a whole month, which is going to be really, really cheap. So it counts the data that you query, and that's why I say over here: partition your data. Because if you go and scan your whole account for the whole year, it might be 100 gigabytes, but if you partition the data and specify that you only want to look at those 10 days, it's going to be 10 times less. So yes, it's the data that you scan.

So the question is about the best format. Yes, the best format is Parquet. You can scan other data too, like JSON, and you can have zipped files, anything, but Parquet is the best, so AWS recommends putting it in Parquet. Yes, that's a tough question; in my opinion, I'm not sure. I know that the best is Parquet. Very good. Thank you so much, guys.