Hi everybody, thanks for having me here in the Cloud Village. Let's begin. So, about me: hello everybody, my name is Felipe Espósito, also known as Pr0teus. I'm a security researcher at Tenchi Security, a startup focused on cloud security. That's my handle on Twitter, my email address is there below, and I'm from the beautiful and dangerous city of Rio de Janeiro, Brazil.

So, my motivation for this talk: every other day we have some data leak, mostly because some S3 bucket was misconfigured, or something along those lines. And there is nothing wrong with S3 buckets getting that much attention: they are probably the most prevalent and easiest-to-find misconfigured AWS resource on the internet, and they store a lot of data. According to GrayhatWarfare, there are at least 394,000 public S3 buckets holding more than six billion documents, and that's a huge amount of documents, right? But AWS has other services too, and some of those services can also be misconfigured and exposed to the internet as well. Today we are going to talk about six of them.

But first: in 2020, Scott Piper published a list of AWS exposable resources on GitHub. This list has 20 or more resources that can be exposed to the public or to another AWS account. The list is on GitHub, it's pretty easy to find, and it makes it pretty easy for anyone to understand which AWS API you have to reach to expose each resource. Based on that list, we chose some services to understand a little bit better and tried to hunt for them on the internet.

Another pretty interesting project was Endgame, from Kinnaird McQuade. He released a tool early this year capable of backdooring AWS resources by making them public or exposing them to another AWS account, an attacker's account. This kind of tool is helpful to show us the impact that a backdoored AWS resource can have on different environments, and it helps blue teams create detections for that. So cheers to him for doing that for us.

And in 2019, Ben Morris did a great job analyzing exposed EBS volumes. EBS volumes are like hard drives: you attach them to EC2 instances and store things there, right? And EBS volumes can be shared with other AWS accounts or made public on AWS. So he did a great job analyzing those public EBS volumes, and he was able to retrieve secrets, keys, and PII. In order to find those volumes, he just had to query the AWS API to get a list of those resources, so it was kind of easy to find them. He did a great talk, he had some challenges to solve, and he did it pretty well, so congrats to him.

But all those exposed resources and data came from somewhere. I'd like to say sorry first, because just as gymnastics at the Olympics has some compulsory movements, we have an obligatory slide to cover before diving into AWS managed resources, and that is the shared responsibility model. Yeah, unfortunately. Okay, so the shared responsibility model dictates that AWS is responsible for patching, updating, and maintaining the managed resources, and the customer is responsible for configuring the resource policies and security groups, and for the security of their data.
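Going back to Ben Morris's EBS research for a moment: here is a minimal sketch, assuming boto3 and valid AWS credentials, of how one might query the AWS API for publicly restorable EBS snapshots the way his approach implies. The region and printed fields are illustrative, not his exact tooling.

```python
# Minimal sketch: enumerate publicly restorable EBS snapshots in one region.
# Assumes boto3 and valid AWS credentials; region and fields are illustrative.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
paginator = ec2.get_paginator("describe_snapshots")

count = 0
# RestorableByUserIds=["all"] asks for snapshots any AWS account can restore.
# Expect a very large result set; this just counts and prints identifiers.
for page in paginator.paginate(RestorableByUserIds=["all"]):
    for snap in page["Snapshots"]:
        count += 1
        print(snap["SnapshotId"], snap["OwnerId"], snap.get("Description", ""))

print(f"{count} public snapshots in us-east-1")
```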
So all the data that I found in this research was only possible because customers failed to protect their own stuff. Just some stats, pretty quickly: I was able to find at least 20 publicly readable SQS queues and 19 publicly writable ones, was able to reach more than 690,000 CloudSearch documents, more than 30 unauthenticated ActiveMQ consoles, and more than five terabytes of exposed data on Elasticsearch itself. And just a quick disclaimer, it's important to note: I didn't access any of that data. I just counted the data that it was possible to count without touching it, because I don't want to go to jail. That's it.

So, which data sources did I choose for hunting AWS services? Well, I chose two data sources for passive DNS: RiskIQ, the community edition, and SecurityTrails. SecurityTrails is 50 bucks, not that expensive, so yeah, it's a good source. I also had some other sources, like the Wayback Machine, subdomain enumeration, and code repositories.

For code repositories, one thing I did that was pretty interesting was to take the shhgit project. It's a project on GitHub whose objective is to keep querying the GitHub API trying to find secrets, credentials, and passwords that were leaked when developers committed code to GitHub. I just modified it to find AWS service endpoints. I also had the idea to use Google Cloud BigQuery to search for AWS resources in a public dataset: GCP has a public GitHub dataset, it goes back to 2019, and it has more than two terabytes of data, so it seemed worth checking out, right?

And finally, I wrote some Python scripts to query the GitHub search API endpoint. The problem is that the API only comes back with the first thousand results, so I had to play around a little bit and implement some tweaks and dorks to get enough relevant data. But on the good side, the advantage of querying GitHub is that some resources may be protected by username and password, and sometimes developers make the mistake of committing those along with the code they are writing, so it's worth looking into.

For subdomain enumeration, I read a really good blog post from Ricardo Iramar. He wrote an article comparing nine subdomain enumeration tools, and based on his findings I picked the following for this research: Findomain and Amass, from OWASP. Amass is capable of getting information from several sources, like the Wayback Machine, DNSDumpster, and RapidDNS, for instance. We also tried to use Shodan and Censys for a few specific cases, and we'll show those results on the slides to come, but that was not that successful.

Besides those data sources, I tried to use certificate search to identify some unique URLs, but as we are going to see, AWS uses wildcard certificates for most of its services, which makes certificate search useless for identifying individual hostnames. And as we are going to see during the talk, most AWS resources have a random component in the DNS name, which should prevent brute forcing; even if we could see the random part on certificates, we would still need to find the name provided when the resource was created, and that would be a brute force, a shot in the dark. Brute forcing wasn't in scope for this research, so I didn't try that; I didn't go that way.
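To make the GitHub part concrete, here is a minimal sketch of querying the GitHub code-search API with per-region dorks to work around the 1,000-result cap. It assumes the requests library and a personal access token in a GITHUB_TOKEN environment variable; the Amazon MQ dork and the region list are illustrative, not the exact queries used in the research.

```python
# Minimal sketch: hunt AWS service endpoints via the GitHub code-search API.
# Assumes `requests` and a token in GITHUB_TOKEN; dorks are illustrative.
import os
import time
import requests

API = "https://api.github.com/search/code"
HEADERS = {
    "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

def search(dork, max_pages=10):  # 10 pages x 100 = the 1,000-result cap
    results = []
    for page in range(1, max_pages + 1):
        resp = requests.get(API, headers=HEADERS,
                            params={"q": dork, "per_page": 100, "page": page})
        resp.raise_for_status()
        items = resp.json().get("items", [])
        if not items:
            break
        results.extend(item["html_url"] for item in items)
        time.sleep(6)  # be gentle with the code-search rate limit
    return results

# One narrow dork per region, since each query is capped at 1,000 results.
for region in ("us-east-1", "us-west-2", "eu-west-1"):
    urls = search(f'"mq.{region}.amazonaws.com"')
    print(region, len(urls))
```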
Right, so I do have some metrics here. I call it "valid DNS" when the hostname I got from a data source still resolves to an IP address, which means the resource on AWS is still active. And I call it "public IP" when the IP that the URL resolves to is not in the RFC 1918 private ranges. I chose these two metrics to decide which data sources would be good ones.

So the first managed service that I explored was DocumentDB. I know that we can't directly access DocumentDB from the internet, because it's always deployed in a VPC, and that means we need at least an EC2 instance inside that VPC to connect to first and make a tunnel through it to reach the DocumentDB database. But I was able to find some instances on Censys that have an HTTP proxy tunneling DocumentDB to the internet. Well, people get creative in the ways they expose their data, right?

For DocumentDB, from those data sources: from what I call historical GitHub, which is the GitHub data from the BigQuery search, I got seven unique results; from GitHub search, querying the API, I got 117; SecurityTrails came back with 33 results; and PassiveTotal, from RiskIQ, with just one. But the truth is that only 37% of the domains I found were valid, so around 62% of them were too old, no longer valid at all. The highest percentage of valid DNS came from SecurityTrails, followed by GitHub search; PassiveTotal trailed a little bit.

Another service that I tried to explore was Amazon MQ. When you create an Amazon MQ broker, you have two kinds of queues to choose from: RabbitMQ and ActiveMQ. When you create an ActiveMQ broker, by default it's created inside a VPC and it has a security group to protect it, so it's not directly exposed to the internet unless the administrator changes the security group to allow 0.0.0.0/0 inbound. But when you create a RabbitMQ broker, on the other hand, it's always exposed directly to the internet, so it's only protected by the login and password you set when you create it.

Right, so I was able to find more than four thousand RabbitMQ brokers on Shodan, which is pretty straightforward. For the rest, I was able to find around 1,500 with Amass and Findomain, two on historical GitHub, five on GitHub search, 145 on SecurityTrails, and none on PassiveTotal. This is the screen from Shodan: around 1,600 results just for us-east-1, and it's pretty straightforward, you just click on one, and if you know the username and the password you can log in and see the queues.

Right, but what about unique URLs? We found several hundred unique URLs, and most of the valid unique URLs came from Amass and Findomain. If you check the chart, Amass and Findomain have 563 unique URLs and SecurityTrails has 145; but when you check for valid URLs, SecurityTrails comes almost as close as Amass and Findomain.
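Since these valid-DNS percentages keep coming up, here is a minimal sketch of the two metrics defined at the start of this section, using only the Python standard library; the hostnames below are made up. One caveat: ipaddress's is_private covers the RFC 1918 ranges plus a few other reserved blocks, which is close enough for this triage.

```python
# Minimal sketch of the two metrics: "valid DNS" (the hostname still
# resolves) and "public IP" (the answer is not a private address).
import socket
import ipaddress

def classify(hostname):
    try:
        ip = ipaddress.ip_address(socket.gethostbyname(hostname))
    except (socket.gaierror, ValueError):
        return "invalid DNS"
    # is_private matches RFC 1918 plus loopback/link-local, a superset
    # of the private ranges described in the talk.
    return "private IP" if ip.is_private else "public IP"

for host in ("example-broker.mq.us-east-1.amazonaws.com", "gone.example.com"):
    print(host, "->", classify(host))
```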
And you remember that I said that for ActiveMQ you have to change the security group to allow 0.0.0.0/0 inbound? Yeah, we found at least 30 cases where the administrator did just that, so I was able to access the ActiveMQ console just by doing an HTTP GET on the URL, on port 8162.

Well, another managed service I tried to explore was Amazon CloudSearch, and Amazon puts an Elastic Load Balancer in front of CloudSearch. One interesting thing to note is that the CloudSearch hostname is built from the name you chose, a random string, the region, and cloudsearch.amazonaws.com. Looking for that on Shodan, I could reach at least 38,000 CloudSearch servers on the internet, but on Shodan you only have the IP address, and since Amazon puts an Elastic Load Balancer in front of the service, without knowing the DNS name we are unable to get the correct virtual host. Which leaves us with only Amass, GitHub, and passive DNS as data sources.

For those data sources, I tried to check whether the DNS was valid and to do an unauthenticated search for the letters 'e' or 'a'. I chose those letters because they are really common in almost all languages, and if the resource policy of the CloudSearch domain was set to star, I would be able to query for them, right? So, just checking for valid DNS, I found that SecurityTrails was the most prevalent with more than 30, Amass and Findomain around 10, and passive DNS only two valid entries. But on those 42 valid DNS entries I was able to reach at least 609,000 documents, so it's a rich amount of data, right?

Well, another service that I tried to explore was Amazon SQS. SQS queues are exposed by URL, and that's kind of tricky: any URL under sqs.<region>.amazonaws.com is valid and resolves to an IP address, but the queue itself may or may not exist there. That makes passive DNS and subdomain enumeration techniques completely ineffective, so we only had historical GitHub and GitHub search to hunt for Amazon SQS. Doing this research, I found that more than 4,050 URLs from historical GitHub were not public, so I couldn't read from or write to them. But I found that at least twenty were publicly readable, and at least 19 of those SQS queues were both readable and writable.

The interesting part about the publicly readable ones is that you can intercept the messages from the pipeline and perhaps get some important piece of information. The publicly writable ones can lead to more serious vulnerabilities, like SSRF or even remote code execution, depending on which part of the code consumes the messages from the queue. For instance, if there is a Lambda with some kind of vulnerability and we are able to write anything we want, something unexpected, to the queue, that Lambda might be exploitable.
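As a sketch of how the readability and writability of a discovered queue URL might be probed: this assumes boto3 with credentials from an unrelated AWS account in the environment, and a hypothetical queue URL. Note that the write probe actually enqueues a message, so it is not a purely passive check.

```python
# Minimal sketch: probe whether an SQS queue URL found on GitHub is readable
# and/or writable from an unrelated AWS account. The URL is hypothetical.
import boto3
from botocore.exceptions import ClientError

queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/some-queue"  # hypothetical
sqs = boto3.client("sqs", region_name="us-east-1")  # region taken from the URL

try:
    # VisibilityTimeout=0 so any peeked message stays available to consumers.
    sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                        VisibilityTimeout=0, WaitTimeSeconds=0)
    print("publicly readable")
except ClientError:
    print("read denied")

try:
    # This writes a real message to the queue; only do this with authorization.
    sqs.send_message(QueueUrl=queue_url, MessageBody="probe")
    print("publicly writable")
except ClientError:
    print("write denied")
```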
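And going back to CloudSearch for a moment, here is a minimal sketch of the unauthenticated single-letter count described above. The hostname is hypothetical, and treating size=0 as a count-only query is an assumption about the 2013-01-01 search API; the hits.found counter is what makes counting without reading documents possible.

```python
# Minimal sketch: count documents on a CloudSearch domain whose resource
# policy allows anonymous search. The hostname below is made up.
import requests

host = "search-mydomain-abc123xyz.us-east-1.cloudsearch.amazonaws.com"  # hypothetical

total = 0
for letter in ("e", "a"):  # very common letters in almost all languages
    resp = requests.get(
        f"https://{host}/2013-01-01/search",
        params={"q": letter, "size": 0},  # size=0: ask for a count, no documents
        timeout=10,
    )
    if resp.status_code == 200:
        # hits.found is the total match count reported by CloudSearch.
        total = max(total, resp.json()["hits"]["found"])

print("documents reachable:", total)
```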
And the other service that I tried to reach was Amazon Redshift. Amazon Redshift is like a PostgreSQL database, a cluster, a data warehouse, and Amazon also exposes Redshift through DNS entries. Every time Amazon exposes something as a DNS entry, it's one-to-one, one DNS name to one IP address; when it's behind an Elastic Load Balancer, you might have two or more IP addresses for that entry. So it's kind of easier to find Amazon Redshift.

To publicly expose a Redshift database to the internet, you must first create an Elastic IP address, then configure Redshift to use the Elastic IP address you created, and then change the security group to allow 0.0.0.0/0 inbound for Redshift. For Redshift, Amass and Findomain were able to find only 42 results, historical GitHub only seven, GitHub search around 192, SecurityTrails 226, and PassiveTotal only two. Checking whether the unique DNS entries were valid or not, most of the SecurityTrails ones were valid, that is, there was an IP address resolving for that DNS name; GitHub search did worse than Amass and Findomain, whose valid-DNS rate by source was around 19%. And one last thing to note: I couldn't find any Amazon Redshift publicly exposed to the internet. So can I hear a hallelujah? Yeah, at least one good thing.

And the last service that I tried to find and explore was AWS managed Elasticsearch. When you create an Elasticsearch domain, you have to choose whether you want to create it public or inside a VPC. When you create it inside a VPC, you get a hostname like vpc-name-randomstring.region.es.amazonaws.com, and when you create a public one, which means it's reachable from the internet, you get a hostname like search-name-randomstring.region.es.amazonaws.com. Elasticsearch is also exposed through Elastic Load Balancing, so we needed to know the exact hostname to get to the host. For Elasticsearch you can also choose the type of authentication: you can choose between Cognito, basic auth over HTTP, IAM, or none, and Open Distro has another configuration that's a bit more specific.

Well, looking for public Elasticsearch domains: Scott Piper published on the fwd:cloudsec Slack group that one year ago he was able to find 387 public AWS managed Elasticsearch domains, and the last time he reported, in February, he was able to find 359. He advised AWS to close those publicly reachable managed services. I tried to reproduce his search and was only able to find five. So yeah, AWS did a really good job closing those public managed Elasticsearch domains; what I believe they did was put a load balancer in front of them, so we can no longer reach them directly by IP address.

Well, some stats for managed Elasticsearch. Looking for valid DNS entries, SecurityTrails was able to find more than a thousand valid DNS names, followed by PassiveTotal, Amass and Findomain, and GitHub search; GitHub search was way down. But the percentage of valid DNS that I found was only 41%, so only 41% of the hosts I found were still valid, right?
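Here is a minimal sketch of how a public Elasticsearch endpoint could be triaged before looking at the results that follow. Mapping a 200 to "open", a 401 to basic auth, and a 403 to IAM-signed requests is a heuristic assumption, and the hostname is made up.

```python
# Minimal sketch: triage a managed Elasticsearch endpoint anonymously and,
# when it is open, count indices and total stored bytes. Hostname is made up.
import requests

host = "search-mydomain-abc123xyz.us-east-1.es.amazonaws.com"  # hypothetical

resp = requests.get(f"https://{host}/_cat/indices?format=json&bytes=b", timeout=10)
if resp.status_code == 200:
    indices = resp.json()
    # With bytes=b, _cat/indices reports store.size as a plain byte count.
    total_bytes = sum(int(i["store.size"]) for i in indices if i.get("store.size"))
    print(f"open: {len(indices)} indices, {total_bytes / 2**40:.2f} TiB")
elif resp.status_code == 401:
    print("HTTP basic auth")
elif resp.status_code == 403:
    print("IAM auth (signed requests required)")  # heuristic assumption
else:
    print("other:", resp.status_code)
```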
Well, for those that were valid, I tried to get some information. I was able to determine that more than 800 used the IAM authentication type, a little more than a hundred were open to the internet, and a little less than a hundred had basic authentication. And for those that were open, I counted 11,000 indexes and at least 5.1 terabytes of data exposed to the internet as well.

When we try to understand which data sources found the public ones, we find that SecurityTrails found 58% of the open managed Elasticsearch domains; but when we compare that with the amount of data that was exposed, Amass and Findomain found more than 40% of it. So if you're trying to look for exposed data, Amass and Findomain are a little bit better than SecurityTrails in that case.

So, some closing thoughts. Well, take the shared responsibility model seriously, and please do it well: make sure you're not exposing any of your data via resource policies that allow Principal star or equivalent. For this, you can use security tools to automatically detect and mitigate those kinds of misconfigurations, for instance a CSPM, AWS Security Hub, or AWS Config rules.

Due to the dynamic nature of the cloud, the historical GitHub dataset was obsolete: as the data stops in 2019, most of the services had been created and destroyed by then and were no longer valid. We spent around $250 in BigQuery resources for almost no valid hits. So yeah, sorry boss. And passive DNS has only partial visibility of the cloud infrastructure, but it's still a good source for hunting AWS managed resources.

So yeah, that's all I have for today, guys. Do I have time for questions?