Hello, and welcome to day one of Red Team Village at DEF CON. I'm Ben Sadeghipour, and I have Tanner Barnes, aka StaticFlow, here with me. We want to talk about how to identify assets in the cloud. We did some really cool research, but before we jump into it, we're going to do a quick agenda and intro on who we are, and give you some background on why we did this project. So let's just jump into it. For the agenda: we'll talk about who we are, why we focus so much on the cloud, and our solution, and talk about how we looked for vulnerabilities and mistakes at scale. And then we'll also have some really cool examples of things that we accomplished and got some bounties with in the past few months. Before we get into it, I want to give some background. Tanner and I were going to work on this for an entire year, so this was supposed to be for next DEF CON. But things happened, we wanted to experiment with it, and you can't say no to a chance to talk at DEF CON. So we're doing a smaller version of this today as part of DEF CON, and hopefully next year we can come back with more examples and more in-depth research. A quick about me, and then I'll let Tanner introduce himself afterwards. Again, I'm Ben Sadeghipour; most people know me as NahamSec. You've probably seen me on Twitch if you're an avid Twitch user. I'm currently Head of Hacker Education at HackerOne, and I also create content and stream on Twitch. You can follow me on all social media under the name NahamSec.

Awesome. Yeah, like Ben said, my name is Tanner Barnes, StaticFlow on Twitter and some other things. And I apologize for my dog, I hope it's not terribly loud; there's not much I can really do about that. So I'm a developer originally by trade.
And I've been doing pen testing and cyber solutions for the last year and a half now. I'm an occasional bug hunter, but I help a lot more with other bug hunters, like Ben and some other people I know, by building tools for them. There's my GitHub, where a lot of the stuff I make lives, and I've also just recently started doing some live coding on Twitch at that handle.

Tanner, before we jump into this: I know DEF CON has this tradition where when you speak at DEF CON, you have to take a shot. I don't know if you have one ready, but I picked one for myself. So cheers. Go ahead.

Cheers, man. Sorry, I did not; I could have run downstairs, but I don't think we have the time.

Cool. So that's the big question, right? Why the cloud? Why did we get into doing all this? There aren't currently a ton of great solutions for looking at cloud assets. You've got things like Censys and Shodan, but some of those can be really expensive: Censys can be $900 a month, and Shodan licenses for their APIs aren't necessarily the cheapest, though they're definitely worth looking into. So we wanted to see how we could look into cloud assets, and what that would look like expense-wise. The other part is: who isn't in the cloud these days? You'd be hard pressed to find a large Fortune 500 or Fortune 50 company that's not running some part of their infrastructure out in the cloud. So it's a huge target-rich environment. If you look at just Azure, AWS, and GCP, you're looking at something like 88 million IP addresses. And that's just in what I'd call their EC2 or compute space; we're not talking about stuff like Lambda functions,
which have pools of IP addresses. That's 88 million IP addresses that could actually be spun up for somebody's instance. And this is also a big part of where organizations, especially large ones, spin up their quick and easy development boxes, things they're quickly testing on the edge; it's a great, cheap, easy way to deploy that. So if we can get really good coverage and insight into what's happening in the cloud, we can find some really cool targets and get some really useful data out of it. Next slide, please.

Yeah, so this is an example of some quick-and-dirty data we grabbed from our database. You're looking at about 1,000 "corp" domains and 13,000 "internal" domains. We found it fascinating how many of these cloud DNS names have stuff like "internal," "corp," and "dev" in them. It's odd; initially I would have thought you wouldn't see a lot of that. That tells you the type of things companies are putting out there in the cloud and on the edge. Then there are things like APIs; a lot of cloud APIs are obvious. And if you look at some specific target options, like anything on Yahoo or OAuth, you're looking at a hundred-thousand-odd targets, all out there in the cloud.

So this is my favorite part. When Ben brought this idea to me, I was immediately hooked, and I thought it was going to be a real fun challenge. So we're getting to how it works and what we did. It's built in Go. I've been on a huge Go kick; if anybody knows me, or if you look at my GitHub, it's kind of funny. It started when I was doing a lot of Java work; I call last year the year of Java for me, it was a lot of Burp extensions and things like that.
And then I got into writing Go code and I just got hooked, so pretty much everything I've built since then has been in Go. It's super great; goroutines make work like this especially easy, and it's a language built for concurrency. So what does the tool we built do, at its core? We take IP addresses, hit them on a certain port over TLS, pull the common name and DNS name records out of the certificate that comes back, and store those tied to the IP address. So we're building a huge database of domain names to IP addresses in the cloud, unmasking who owns these 88 million IP addresses and trying to differentiate targets from them. As far as the actual code goes, there's not a lot of magic there; it was mainly a starting point for fetching that data. The real magic comes from scanning it at scale. Again, with 88 million IP addresses, you're not going to be able to hit all of that from your own box; that's an untenable amount of data to go through. So scaling it is super important. This is my favorite slide; it was a lot of fun typing, and I sent it to like eight people when I wrote it. We were looking at about 535,000 /24 CIDRs across large bounty targets and cloud infrastructure, across 11 main ports. So you're looking at about 1.4 billion unique scan targets. I just recently created version two of the scanner, and I was able to get a full scan of that space down to 15 minutes. That comes out to a rate of about 1.6 million unique scan targets a second, with 28 to 29 million identified targets across those 11 ports. And my favorite thing is that all of that costs about $25 in AWS charges, which to me is wild. But there's a flip side to any coin like that.
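Their scanner is written in Go, but the core idea, handshake with the host and keep the names on the certificate, fits in a few lines of any language. Here's a minimal Python sketch (function names are ours, not theirs; the network part is shown but the demo below runs on a canned certificate dict so it works offline):

```python
import socket
import ssl

def extract_dns_names(cert):
    """Pull host names out of the dict shape that ssl's getpeercert()
    returns: subjectAltName DNS entries, falling back to the subject CN."""
    names = [val for typ, val in cert.get("subjectAltName", ()) if typ == "DNS"]
    if not names:
        for rdn in cert.get("subject", ()):
            for typ, val in rdn:
                if typ == "commonName":
                    names.append(val)
    return names

def tls_grab(ip, port=443, timeout=3):
    """Handshake with ip:port and return the names on its certificate.
    Note: a real scanner disables verification so self-signed certs still
    answer, which means fetching getpeercert(binary_form=True) and parsing
    the DER; verification is left on here to keep the sketch short."""
    ctx = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=ip) as tls:
            return extract_dns_names(tls.getpeercert())

# Offline demo using the documented getpeercert() structure:
sample_cert = {
    "subject": ((("commonName", "internal.corp.example"),),),
    "subjectAltName": (("DNS", "internal.corp.example"), ("DNS", "dev.example")),
}
print(extract_dns_names(sample_cert))  # ['internal.corp.example', 'dev.example']
```

Mapping each result back to the IP you dialed gives exactly the domain-to-IP database the talk describes.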
There were a lot of lessons learned along the way. One thing I learned right off the bat: do not try this with Lambda. Lambda seems like a really simple solution for this, but it turns out they will not let you spin up 535,000 Lambda functions at once, which is not shocking. I tried that once and my AWS bill was insane, so that was definitely a no-go. There have definitely been some heavy-hitting AWS bills along the way, working through this to get to that mythical $25 scan. As far as the code goes, at this kind of scale, making the underlying code quicker is obviously super important. So I had a lot of fun building a really simple solution and then going through with the profiling tools built into Go to carve out every wasted piece of memory and wasted CPU cycle, to make it as fast as humanly possible, or I guess as computerly possible, so that when I put it on the scaling solution it would run quicker and cost less money. One of the funniest lessons we learned from all this: be careful who you're scanning and what you do with the data you get from the scan. The cool thing about this scanner is that it's actually very light on the target's network, because we're not making a full TLS HTTPS connection; it's unlikely you'd even see it in the ingress logs to your server, since we're not completing a full handshake. What we mistakenly did once, and we'll get to this in a bit, is that we had a really cool fingerprinting tool we'd built, and we decided to try to fingerprint our entire database. We got some very hilarious LinkedIn messages and Twitter DMs from some very worried CISOs who thought we were trying to take over or attack their entire network. So: with great power comes great responsibility, especially with a data set this large. So it's cool, right?
Domain-to-IP mappings. It's a lot of fun, and you can get some really cool insight into what's out there in the cloud, but it's just a start. We have all this data, but what are we going to do with it? We landed on some core things to do with the data. We have these 28 million hosts that we found across 11 ports, but obviously XYZ Laundromat's website isn't going to pay us for that sick IDOR we found on their site. So we need to filter that down to bounty targets. There's a great repo we'll get to in a bit for just that; I'm blanking on the guy's name, but it's basically a bug bounty targets aggregation list that keeps a list of wildcard scopes on bug bounty programs. So we use that as a seed, plus some custom entries we put in there as well, to run concurrently with the scan and filter those hosts off into their own database. So while we have a list of every domain we found, we also have a specific database just for bug bounty targets, and that helps us filter the data down and pipe it into some of these other tools. Another really cool thing we'll get to is that we started diffing scans. Initially we ran one a week or one every other week, and we realized we could get a lot more coverage and find even cooler vulnerabilities if we did this twice a week and looked at what had changed between those two days. There's obviously a lot of overlap, but the really new hosts, we realized we were catching them when they'd only been in existence a day or two, and that was really helpful for focusing on what to look at first. Another really cool one is host discrepancies, that is, virtual host discrepancies with the Host header. We have all these targets, but again, you're looking at millions of them.
What's a key differentiator for things that might be interesting? One of those things is how they handle virtual hosts. I'll come to that later in the slides, but you'll find some hosts in this data, and we found some, where you can hit it by IP and get one result; hit it by its domain and get a different result; and hit it with a localhost Host header and sometimes get access to stuff you wouldn't have normally. And the final thing was fingerprinting: being able to pull titles, response lengths, and status codes, and use that data to fingerprint specific services. That lets us key in on large technologies we knew we wanted to go after, like the Grafana bug that Ben's going to talk about a little later, and hone in on a specific technology. So, like I said, that's the repo. It's great. If you're starting out with a recon setup and want to know where to start doing recon, that repo right there is a great guide. Like I said, 1.4 billion targets creates a lot of noise. Obviously not all 1.4 billion come back, but out of those 28 million, we need to filter down pretty quickly to what can actually get us some bounties. I'm so sorry, guys. So being able to filter like that gets us down to a smaller target set. Next slide. Right, so like I said, scan diffing. Multiple scans have some overlap, but the big usefulness is determining the new hosts. Say we run on Tuesday, and we run on Thursday: what's changed? The new hosts from Tuesday to Thursday are going to be really interesting because they're really fresh. Along with that, we also started doing a historical archive.
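The Tuesday/Thursday diff boils down to set arithmetic over the two scans' host lists. A toy sketch (the data shapes here are ours, just to show the idea):

```python
# Each scan maps domain -> ip, e.g. as loaded from the scan database.
tuesday = {"api.corp.example": "10.0.0.1", "grafana.corp.example": "10.0.0.2"}
thursday = {"api.corp.example": "10.0.0.1", "jenkins.corp.example": "10.0.0.9"}

# Hosts that appeared between the two runs: fresh, look at these first.
new_hosts = thursday.keys() - tuesday.keys()
# Hosts that vanished: candidates for the cold-storage archive.
removed_hosts = tuesday.keys() - thursday.keys()

print(sorted(new_hosts))      # ['jenkins.corp.example']
print(sorted(removed_hosts))  # ['grafana.corp.example']
```

The value of the diff is exactly what they say on stage: the `new_hosts` set is small and fresh, so it's the first place to spend manual attention.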
The way we've been doing it: we run on Tuesday, and then when Thursday's scan runs, Tuesday's data goes into a historical database, the previous Thursday's goes off into cold storage as a flat CSV file, and Thursday becomes the newest data; it continually shifts like that, so we always keep two scans of targets. That way we can always go back if we want to do a long-run, months-long analysis of, say, how Verizon spins up domains in the cloud. We can look at all that historical data and see, oh, this is how they name their hosts when they deploy things to the cloud, which helps us in further scans. So yeah, like I said, the virtual host discrepancies are a lot of fun. We found some really cool targets that way that you wouldn't have found, or wouldn't have known were immediately interesting, if all you had was "here's the domain and here's the IP." You're just looking at a wall of text; what are the interesting targets in there? This example is just something I grabbed out of the database last night while looking through it. You couldn't hit this host by going to the domain, and a localhost Host header wouldn't work, but if you hit it just by IP, you got access: it was an AEM instance that was definitely meant to be temporary and had a lot of things exposed on it. Doing this v-host checking lets us see those really quickly, at a glance. So this is a tool we're actually dropping at the end of the talk. You give it a target with an IP address, a domain, and a port, and it runs three checks.
It tests visiting the IP with the Host header set to the IP, then the IP with the domain as the host name, and then with localhost as the Host header. You can see all three side by side and, really quickly, at a glance, spot what's different in the outputs. Yeah, and I'll let Ben take it away for some of the cool things we found in the data.

Yeah, so again, there are millions and millions of IP addresses. There's so much of this that one person, or two people, can't go through it all. So we wanted to be as fast as we could with exploitation, because from the time we decided to do this talk until today, we had roughly two or three months. We wanted to do things fast, we had to do things automated, and we had to know what we were doing. That's why we went after all these automation pieces: getting the response headers and the response code based on the v-host and the Host header, and watching for anything weird that could lead us to a vulnerability. So we needed to decide a few things first. Are we going to go after targets that are huge, like, for example, Verizon Media? Or do we want to go after one vulnerability across multiple organizations? We'll talk about each of those in a little bit. So the first approach: you want to go after a single target. Cool, what do we know about these targets? If you watch any of my Sunday streams, I do a lot of hacking on Yahoo. I know how they deploy things, what mistakes they make, and where they make those mistakes. What are their APIs? What do they document, and where do they store the documentation? And also, where are the internal and corporate hosts? If you hack on Verizon Media, you know that corporate.yahoo.com isn't the only thing; there's other stuff.
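Looping back to Tanner's v-host tool for a second: the three checks reduce to the same GET sent to the same IP with three different Host headers, then a diff of the answers. A rough Python sketch (names are ours, not the released tool's; the demo runs on canned results rather than live requests):

```python
import http.client
import ssl

def vhost_probe(ip, domain, port=443):
    """GET / from the same IP three times, varying only the Host header,
    and record (status, body length) for each attempt."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # scanners must tolerate self-signed certs
    results = {}
    for host in (ip, domain, "localhost"):
        conn = http.client.HTTPSConnection(ip, port, timeout=5, context=ctx)
        try:
            conn.request("GET", "/", headers={"Host": host})
            resp = conn.getresponse()
            results[host] = (resp.status, len(resp.read()))
        finally:
            conn.close()
    return results

def worth_a_look(results):
    """Flag hosts whose three answers disagree; those discrepancies are
    where the interesting behavior (like that exposed AEM box) hides."""
    return len(set(results.values())) > 1

# Offline demo: identical answers are boring, mismatched ones get flagged.
boring = {"1.2.3.4": (200, 512), "a.example": (200, 512), "localhost": (200, 512)}
spicy = {"1.2.3.4": (200, 9000), "a.example": (404, 120), "localhost": (200, 512)}
print(worth_a_look(boring))  # False
print(worth_a_look(spicy))   # True
```

Comparing (status, length) pairs is a crude but fast discrepancy signal; you could swap in titles or response hashes for fewer false positives.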
So again, take Verizon Media: Swagger UI, they're notorious for it, and there are other variations of Swagger you can find on there. Port 4443 is very interesting, because from what I've read in blog posts, and from what I've seen on that port, a lot of the time it's meant to be internal: accessible by another app, not directly visited by the user in the browser. So you want to really understand your target, understand the weird things about it that matter, and make sure you fingerprint for those and look for them actively while you're hacking on it. The next approach is obviously the multi-target thing: you just want to spray and pray. You spray whatever vulnerability you want across the entire bounty table that we have. That's what we set up, our bounty table: a set of targets, including private programs and public programs, that we hacked on. We wanted to go after the quick ones, the quick wins. What's going to get us paid? What's going to give us some good examples for this talk? Again, time was not on our side. So we looked at CVEs; there were a ton of good CVEs dropped in the past three months, a lot of stuff about heap dumps and memory leaks that comes with Spring Boot and Jolokia. You also want to go after API documentation, because if you have the API documentation, you're ahead of the game: you know how the API works and what it expects, which gives you a boost when it comes to fuzzing an API. And of course, you also want to look for sensitive or internal tooling or applications that aren't meant to be publicly accessible: Jenkins, GitLab, GitHub, Jira, Grafana, you name it. So make sure you understand what those sensitive apps or internal tools are, and then make sure you have a fingerprint for each. We'll talk about all of those in a little bit, too.
But again, like I've been mentioning this whole time, fingerprinting is very, very important. You have to know how to identify these things properly. So the first thing is to understand what you want to go after. In our case, we focused heavily on API docs, because APIs can be huge; it could be 50 endpoints. If you know how to use them because you have the documentation, cool: it's really fast, really easy, you just go through and fuzz them. Spring Boot stuff, really easy: if a heap dump is left behind, you pull the heap dump and go through it for creds. But you have to understand what makes each of these things unique. Is it something in the response headers? Something in the response body when it replies? Or is it a specific endpoint, an image, an API endpoint? When you hit that endpoint, does it come back with a keyword that says this is obviously that particular application? We'll have examples of all this in a little bit. So Swagger, for example. If you're looking for a Swagger doc, like here: if you hit /swagger-resources, it actually tells you where Swagger is being stored. You don't have to look for it, you don't have to brute force for it. And you can fingerprint on the Swagger version, because no matter what the version is, that keyword is going to be in there. So you use it as part of your fingerprint in the response body: if it has the Swagger version under /swagger-resources, that's a hit. That's our fingerprint.
If you're looking for a swagger/api-docs in JSON format, the kind that doesn't announce where it lives, you've got to brute force for it, but you're brute forcing en masse, so you want to look for basePath. Every single JSON Swagger documentation I've seen so far has the base path in there, because you have to tell the user where the API base path is, right? That's our main way to fingerprint the JSON-formatted API docs from Swagger. And obviously, before people start reporting Swagger finds to bounty programs: Swagger by itself is not a vulnerability, unless it's some internal tool and the company wants to pay for it. I never report them by themselves, but it's really helpful when you're not playing hide and seek with the API: you don't have to brute force the API routes, you don't have to brute force the parameters, you don't have to figure out whether it's a PUT, POST, or GET request. All these things add up, and as long as you have this information, it's easier to enumerate. We also went after GraphQL at some point. What we did here: if you curl your host with /graphql, the response comes back as a 400 Bad Request and it tells you, hey, the query is missing. Again, this isn't the only way to do it, but it was the easiest and fastest. So as long as the response code came back as 400 and the body had the words "query" and "missing" in it, we knew 100 percent there was at least some sort of GraphQL on there to look at a little further. And of course, if you're going after stuff like Jenkins, Grafana, or Jira, they're easy to enumerate: there's always an endpoint, always a logo, always an API endpoint, or even better, the keywords are right there in the headers or in the HTTP response body itself, like "Grafana" and "Jenkins." So make it easy. This is super simple.
As long as "Grafana" is coming back somewhere in there, case-insensitive, whether it's a capital G or not, and the root directory is redirecting us to /login, then we know that's a Grafana instance, right? These are examples of the fingerprints we came up with to avoid as many false positives as possible. But just because you have these fingerprints doesn't mean you'll identify things properly; you have to have some sort of tooling. Eventually we released a tool, which we'll talk about later on. But in the beginning we didn't have enough tooling; we were still experimenting with what we wanted to use. So we started using this tool called Meg. Big shout-out to Tom if you're watching this: TomNomNom, amazing developer, amazing hacker, can't say enough good stuff about this guy. Check him out on Twitter and go to his GitHub repo and look for Meg. Meg lets you spray one or more endpoints across one or more hosts: you can give it a single endpoint and tell it to hit a list of hosts, or customize both sides. The good thing about it is it saves a file called index under a folder called out, and within out/index it records the response code for each request, a 302, or as you can see here, 404, 401, 503. This is very useful. On top of that, as I showed on the last screen, it saves the entire HTTP response in a folder: if you run it on IPs, the responses are saved under the IP address; if you run it on domains, you get a folder per domain with everything that came back. This is very, very useful, because we can use this data to see if our fingerprinting is working.
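The fingerprints described so far, Swagger's version keyword, GraphQL's 400 plus a query-missing message, Grafana's /login redirect plus the brand name, all reduce to a status code plus a case-insensitive substring. Here's a sketch of that rule shape in Python (the exact needle strings are paraphrased from the talk; tune them against your own data):

```python
# (name, expected status code, substring that must appear in the body)
FINGERPRINTS = [
    ("swagger", 200, "swagger"),
    ("graphql", 400, "query"),
    ("grafana", 200, "grafana"),
]

def match(status, body):
    """Return the names of every fingerprint this response satisfies."""
    lowered = body.lower()
    return [name for name, code, needle in FINGERPRINTS
            if status == code and needle in lowered]

print(match(200, "<title>Grafana</title>"))           # ['grafana']
print(match(400, '{"message": "query is missing"}'))  # ['graphql']
print(match(503, "Service Unavailable"))              # []
```

Keeping rules this simple is exactly the advice from the talk: simple fingerprints run fast enough to re-tag every host as soon as a new scan lands.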
So before I jump into this: there is 100% a better way to do it than a crappy one-liner in Bash. But the whole point is to show you don't have to be a wizard or a monstrous developer with a huge developer background to do this. As long as you know what you're looking for, and you're not going after millions of targets, you can do it with Bash. I'll explain how the whole thing works. The first thing in the code I showed: we grep for the response code in out/index. Remember, out/index is where Meg records the request you sent and tells you whether it came back as 200 or not. So right now I'm grepping for anything in out/index that came back 200. That's a lot of matches, right? Then we feed that to cut, and we say, hey, cut out everything after the space right here, because I want to know where each response is stored. Then I sort it and make sure it's unique; you don't strictly need to do this, it's just a habit of mine to make sure I don't have any duplicates. So now we have the location of every single saved HTTP response that came back 200 for the thing we were looking for. Cool. Now we want to loop through, read every one of those files, and check it against our fingerprint. So we do a quick xargs, and for every file we feed it to grep. We use dash H so it shows the file name where the fingerprint matches, and dash i so it ignores case: if it's capitalized and we didn't give it a capitalized letter, grep ignores that completely and still runs the fingerprint.
So, for example, if we're looking for Swagger, for the fingerprint we put swagger-resources, sorry, we put the Swagger version keyword, and it loops through all those files and says: okay, this one has the Swagger version in it, this one doesn't, this one does, and keeps going. Then you cut it again, because you want just the paths of the ones that match: grep with dash H gives you the location of the file along with the matching line, so you cut that down to only the path so you can confirm it manually. I saved this whole thing under search, so I have a command called search on my box: I type search, give it a response code and a fingerprint, and it does the whole thing. So let's look at some examples of how we used Meg to eliminate false positives. As you remember from this slide, if you hit /swagger-resources, it gives you the Swagger version. That's our fingerprint, and it comes back as 200, obviously, because it exists. Cool. We have a list of hosts from Netflix, and we hit /swagger-resources on every single host we'd found, in this case from the database we had before we started fingerprinting things. That immediately came back with a ton of IP addresses matching the fingerprint: every IP address on here came back 200, has the Swagger version keyword, and probably has the location of the API docs in it. Very useful stuff that just extends the attack surface, because now you have a ton of APIs to go after. Same thing with GraphQL: I search for anything that returned a 400 and check whether it has the word "missing" inside the response.
And I cut it again just to get the IP addresses: hey, I don't care about the response path Meg stored it under, just give me the IP address, because I'm going to automate some stuff on top of it and see if I can pull information out of it. But that doesn't really scale, so I'll let Tanner talk about the fingerprinting tool we wrote and why we did it.

Yeah, so it's that same thing, right, where we have all this data. The command-line tools and bash one-liners obviously work, but not when you're feeding them some giant, couple-hundred-thousand-line file of targets. The cool thing we were able to do with this pipeline and framework we built is take the data from the scans and pipe it directly into these tools. So GoFingerprint wraps up that whole bash one-liner: take a bunch of targets, get the results from their web requests, and look through them for the things I'm interested in, for the targets I'm interested in. That's really what it does. It's set up internally for us so that when new scans come in, this triggers automatically and enriches the data, and all the records we keep are internally tagged with these things. You'll see this on the repo: there's an entire fingerprints JSON file that you can add to, or create your own, for finding your own targets and fingerprints in the data.

Yeah, and to add to what Tanner said: you want to keep these fingerprints as simple as you can, so you can quickly go through and tag stuff. It's very important to fingerprint quickly, so you can go through your data and identify these things as soon as they're spun up.
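Ben's bash search command and Tanner's GoFingerprint both boil down to the same loop: read out/index, keep the entries with the right status code, open each saved response, and grep it for the fingerprint. Here's a small self-contained Python version of that loop (the fake out/ directory built below just stands in for real Meg output; the index line format mirrors Meg's "path url (status)" layout):

```python
import os
import tempfile

def search(out_dir, status, needle):
    """Mimic the 'search' workflow: filter out/index by status code,
    then case-insensitively grep each saved response file for the needle.
    Returns the paths of the matching responses."""
    hits = []
    with open(os.path.join(out_dir, "index")) as index:
        for line in index:
            path = line.split(" ")[0]
            if f"({status} " not in line:  # e.g. keep only "(200 OK)" entries
                continue
            with open(path, errors="ignore") as resp:
                if needle.lower() in resp.read().lower():
                    hits.append(path)
    return sorted(set(hits))

# Build a tiny fake Meg out/ directory to run against.
out = os.path.join(tempfile.mkdtemp(), "out")
os.makedirs(os.path.join(out, "1.2.3.4"))
hit = os.path.join(out, "1.2.3.4", "aaaa")
miss = os.path.join(out, "1.2.3.4", "bbbb")
with open(hit, "w") as f:
    f.write('HTTP/1.1 200 OK\n\n{"swaggerVersion": "2.0"}')
with open(miss, "w") as f:
    f.write("HTTP/1.1 200 OK\n\n<html>nothing here</html>")
with open(os.path.join(out, "index"), "w") as f:
    f.write(f"{hit} https://1.2.3.4/swagger-resources (200 OK)\n")
    f.write(f"{miss} https://1.2.3.4/ (200 OK)\n")

print(search(out, 200, "swaggerVersion"))  # only the swagger-resources hit
```

The grep/cut/xargs one-liner does the same thing; a script form just makes it easier to bolt on automation afterwards.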
Should we talk about some bug bounty examples? Let's do it. Cool. So again, please remember that a lot of these are from bug bounty programs I've hacked on, so they're heavily redacted, unfortunately; I'm hoping my redactions were done properly. We changed a lot of details, and it's really hard to use real-life examples. I don't blame these companies, because you don't want to end up at DEF CON and then end up in the news. But remember: a lot of these vulns are super simple, yet they were all found in the last three months. They're very, very low-hanging fruit, and the point is to show that if you have the right data, and it doesn't even have to be to the extreme we went to for this talk, and you know how to process and store it, you can find a ton of good, easy bugs with it. The first one I'll get out of the way has no screenshots; I couldn't get it approved, unfortunately. But it was really unique, because I've gotten the same exact XSS six times in a row: the same exact payload on the same exact endpoint on the same target, over and over, for the past few months. How does that work? About a year ago, and big shout-out to SpaceRaccoon for this one; if you don't know SpaceRaccoon, Eugene, awesome hacker, go follow him on Twitter. We were collaborating on this private program on HackerOne. I casually sent him a link, and he casually sent me back an XSS. Never thought about it again, never spoke of it, never looked for it again. But me being me, I naturally added that endpoint to my wordlist; hey, you never know when it might be useful later on, right? Well, "later on" happened. I'm working with Tanner, we dump another set of data, and I rediscover that this demoapp.html thing is back up. And we know this thing; we've already seen it. So, having access to our fingerprinting tool, I go back to our JSON file.
We write a fingerprint for this HTML thing, and we say, hey, if you hit this endpoint and it comes back as a 200, check and see if this JavaScript file is being called. And if it is, then we know there's an XSS in there, because they're not patching the XSS, they're just removing it every time we find it. And we just keep monitoring it. So every time there's a new scan that goes up, you grab the data, you feed it to the fingerprinting tool, and you look for this particular endpoint on that target, and if it comes back, you can get a bounty from it. And I've done this six times so far. Super easy, super low-hanging fruit, but it just comes down to knowing how to monitor for it. The next one: first of all, kudos to Rhynorater, Justin Gardner, for getting the CVE number with 1337 in it, as he calls it — CVE-2020-13379. It's an unauth SSRF in Grafana. I'm not talking about the PoC at all during this talk, no details around it at all. If you're interested in the technical aspects of this vulnerability, then I highly recommend going and watching his talk from HacktivityCon last weekend, and he has a blog post out for it. So definitely check those out if you want to learn about this vuln in particular. So this is the fingerprint that we came up with. We talked about this already: if you hit root, it 302s to /login, and if it has the word Grafana in the response, then boom, you have it. But it's not always on 443. We've also seen it regularly come up on port 3000, and I've also heard Justin say it comes up on 3001 as well. So you have some fingerprints. The port 3000 thing is cool because the host doesn't need the keyword Grafana in it. Most people call it grafana-dot-whatever, like grafana.site.com, but in a lot of cases on 3000 it could be a wildcard. And you have a higher chance of it being Grafana because it's on that port than you do on 443, right? Because tons of people are hosting stuff on 443. So how does that work?
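The "302 to /login plus the keyword" fingerprint can be expressed as a tiny classifier over a saved response. This is our own sketch, not the actual tooling — the decision logic is from the slide, but the function name, file handling, and the canned response are made up for illustration.

```shell
# classify a saved HTTP response as a possible Grafana instance:
# a 302 redirecting to /login, with the keyword somewhere in the response
is_grafana() {
  grep -q  '^HTTP/1.1 302' "$1" &&
  grep -qi '^Location: /login' "$1" &&
  grep -qi 'grafana' "$1"
}
f=$(mktemp)
printf 'HTTP/1.1 302 Found\nLocation: /login\n\n<a href="/public">Grafana</a>\n' > "$f"
if is_grafana "$f"; then verdict=grafana; else verdict=unknown; fi
echo "$verdict"
rm -f "$f"
```

Against a real scan you'd run this over every saved response for ports 443, 3000, and 3001, and tag the matches in the database.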
Well, we pull everything that's on 443 from the bounty table. And again, the bounty table is what Tanner and I have put together with targets that are valuable to us. It was just a list of maybe 30 companies that had a wildcard program and were useful to us. We shoved them in there, and we're going through them every week, or almost every other day. So we pulled everything that has Grafana in it, and everything that's on port 3000. You hit /login on all of them with meg, and once you hit it with meg, you do the same thing: you have it search for 200s, and you want to make sure the keyword Grafana comes back, so you know Grafana is actually on these things. And this returns all the possible instances. You can skip the login part. If you don't care to fingerprint for it, you can just directly hit the PoC and tell it to fetch the /latest endpoint on the AWS metadata service, and instead of looking for Grafana — /latest replies with user-data, meta-data, and I think another folder — you can fingerprint to see, okay, if my SSRF works and the response of the PoC had the word meta-data in it, it's exploitable, so flag it for me and give me the results. But we didn't really need to do this, because we already have Grafana tagged in our bounty table. So all we have to do is go back to our database, type in our regular query and say, hey, I want you to give me every domain that was tagged as a potential Grafana. And we could have written a workflow for it where it automatically exploits it, but I'm against doing that, because I like to do that part manually to make sure nothing's going wrong. You never know if there's a typo or something wrong in your PoC, so I like to do this manually. So we get all those potentially vulnerable instances, go through them manually, and make sure they're vulnerable. But again, you can automate it with meg, or write your own workflow of, hey, if it fingerprints as Grafana, hit the PoC.
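The /login spray and filter reduces to a short grep chain. A sketch under stated assumptions: the real first step would be something like `meg /login grafana-candidates.txt out/` against the hosts pulled from the bounty table, but here the meg-style output is faked with canned files so the filtering logic stands alone, and the host names are invented.

```shell
# fake two meg-style saved responses, then keep only hosts that
# answered 200 with "Grafana" in the body
out=$(mktemp -d)
mkdir -p "$out/host-a" "$out/host-b"
printf 'HTTP/1.1 200 OK\n\n<title>Grafana</title>\n' > "$out/host-a/r"
printf 'HTTP/1.1 200 OK\n\n<title>nginx</title>\n'   > "$out/host-b/r"
hits=$(grep -l 'HTTP/1.1 200' "$out"/*/* | xargs grep -li 'grafana' \
  | xargs -n1 dirname | xargs -n1 basename)
echo "$hits"
rm -rf "$out"
```

Only host-a survives both filters; those survivors are the "possible instances" to verify manually against the PoC.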
If the PoC gives you metadata, cool, it's vulnerable, so send a webhook to Slack or Discord and let us know. But I skip all that; I like to do it manually. And that was really easy to do. Again, like I talked about, we filtered the DB, it showed all the assets tagged as Grafana, we ran it through meg, and it gave us some really good results. But what was cool was that we realized some of these targets weren't just doing this in one instance — there were a lot of them. This is actually a report that Justin found, and we realized there were, I want to say, 12 other instances, just because we were looking at the right domain, the right internal domain, if that makes sense. I can't talk about the company, unfortunately, because of how private it is. But they were fascinated by the fact that we found so many different IP addresses that were vulnerable to this thing. And of course, that wasn't the only company that was vulnerable to it. There were a ton more, but these were just the most recent ones that we had found. And all of these were bountied at some point, I believe, something big or something small. In a lot of cases smaller, because this was a PoC that was about 40, not even 30, days old. We were just reporting it without the expectation of a bounty, so if they gave us a bounty, it was just an additional benefit. In a lot of cases, it was more than just one asset, so we're identifying a list of vulnerable hosts in the company's infrastructure and sending it to them, which kind of made things more interesting. So, key takeaways. Again, companies may have the same app running in multiple instances on other IPs, but multiple instances doesn't mean they're going to give you multiple bounties. Believe me, I've dealt with that a number of times.
But multiple instances sometimes — I want to say in most cases — means you're going to get a higher reward or a bonus. But also, give the teams an appropriate time to patch. You don't want to report a CVE the day it drops. You don't want to be that person that reports it the same day. I get it, you know, it's there, and it's the security team's job to fix these things, but it takes time. It's not overnight. You've got to let them go through their cycles and let them patch it. So if you decide to report it too early and don't get money, don't get upset. It's a part of the game. Give them enough time before you start mass-reporting these. So we actually went and also sprayed /admin across the entire bounty table, just because we wanted to see, like, hey, some of these IP addresses are coming back with corp or internal within them, and they're not supposed to be accessible publicly, more than likely, right? Like the domain wouldn't load, but if we hit it by IP, then it loaded some stuff. So cool, let's run /admin and see what comes back. So again, we did meg, /admin, a list of IPs, and we searched for anything that came back that had the keyword login in it and was a 200. A lot of them came back, but we didn't have the time to go through all of them and brute-force them. But there was one that was particularly interesting to us, which was this thing that redirected /admin to some MGMT folder login page. Cool. All right, well, why are we going from /admin to this thing? This has to be something interesting. Let's just look at it. Now, of course, when we go to MGMT, it has an admin login page, as you can see on the right. Okay. And the username and password for the admin were admin-admin. And as you saw, the reaction — any time I put in admin-admin and it works, that's probably the most accurate reaction. But that worked. It got us logged in, but that's not really the fun part.
You know, there's nothing cool about saying, okay, I found an admin folder and admin-admin worked. It's cool, it's an easy bug, it's an easy bounty. But the bigger question is: are they doing this in multiple places? Is this being reused elsewhere? Are they using this for other applications? How does this look? Is this just a single app? Let's look at it. So the app name is on the left side; we renamed it to XYZ. The permutation here — it's not really a permutation, it's the platform where the game is hosted. So whether it's iOS, or Android, or other platforms that you can play this game on. The target is obviously the domain, which I wish I could disclose but can't. And then the core app is the thing they internally call it when you're logged in. And it wasn't called "core app"; it was something very unique to this game that made you go, okay, this endpoint isn't going to come up on other IP addresses by accident. But if we find an IP address that doesn't have a hostname, for example, within their CIDR, but it comes back with this folder, it means it's owned by them and it's running the same code. Okay, cool. Now let's go back to our database. We're going to look for this app name, you know, the XYZ part right here, and we're going to say: pull everything that has this XYZ portion. I don't care what's before and after it, as long as that keyword is in the middle and it has the domain — I wish I could disclose it — dot com in it. Give me the results, and I'm going to spray it with meg and see how many come back. So 15 of them came back, to be exact. I think some of them were patched by the time we were fingerprinting for more of these, but 12 of them actually allowed us to log in with admin-admin. And each of them had access to different user data and PII because of the different platforms. But from what I heard from the internal team, there were roughly about 1 million users using this thing.
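That "keyword in the middle, under the target's domain" query is essentially a substring match over the hostname column. A stand-in sketch over a flat file, with xyz and site.com as placeholders just like in the talk, and the hostnames invented:

```shell
# pull every hostname that contains the app keyword and sits under
# the target's domain; these hostnames are made-up stand-ins
f=$(mktemp)
cat > "$f" <<'EOF'
play-xyz-admin.site.com
cdn.site.com
xyz-ios.site.com
xyz.unrelated.com
EOF
matches=$(grep 'xyz.*\.site\.com$' "$f")
echo "$matches"
rm -f "$f"
```

The two xyz hosts under site.com survive; those are what you'd then spray with meg for the admin login page.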
And I'm 99% sure there was RCE by design in some of the links that were in the admin panel. So again, it was a cool bug, but you go from one bounty to, hey, it's the same bug — I'm not saying give me multiple bounties, but there are multiple things that could be affected by this. The team investigates it more, they look into it more, and it usually turns out with a better bounty, to be honest. All right, this has to be one of the most insane and easiest vulns that I've seen in my life. This is a mega-corp that I know for a fact everybody watching this presentation has probably heard of, but you definitely won't find out who they are. Unfortunately, because this just got patched and fixed last week, we didn't have enough time to ask for permission, so there might be a blog post for this later on. But before we talk about the bug, let's talk about a few fingerprints and things that are valuable. If you've ever looked at Spring Boot's documentation, there are a few things that are really interesting about it. Every endpoint that comes back within Spring Boot Actuator is interesting, but two of them are really, really interesting because of the data they give you. One of them is the HTTP trace, usually at /httptrace or /trace. That gives you information about the HTTP request and response exchanges. So whatever request you send the server, it stores it in the trace, up to like 50 of them, for example. So if you're brute-forcing, it also stores those in the trace, and we'll talk about that in a little bit. And the heap dump, obviously, gives you a heap dump from the application's JVM. And according to the documentation, if you hit a Spring Boot application, it comes back with a Content-Type like this. So if you hit /trace, the response is going to have this in the header. That's what the documentation tells us. Cool. Let's use this to fingerprint some stuff and go after an org.
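That Content-Type check makes for a cheap actuator fingerprint over saved responses. A sketch, not the actual workflow: the function name and the canned response are ours, and the version suffix in the vendor content type varies by Spring Boot release, so the grep matches only the stable prefix.

```shell
# flag a saved response as a Spring Boot actuator endpoint by its
# vendor Content-Type (the vN+json suffix varies by release)
is_actuator() {
  grep -qi 'content-type: application/vnd\.spring-boot\.actuator' "$1"
}
f=$(mktemp)
printf 'HTTP/1.1 200 OK\nContent-Type: application/vnd.spring-boot.actuator.v2+json\n\n{}\n' > "$f"
if is_actuator "$f"; then verdict=actuator; else verdict=other; fi
echo "$verdict"
rm -f "$f"
```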
ZLZ and Ziad — Ziad's going to kill me for capitalizing the Z in his name, but that's fine. If you don't know them, go follow them on Twitter. They really, really wanted to hack this mega-corp. They were on a mission: we have to own this company, we have to be in there internally, we want to see what this company looks like from the inside. Okay. So they dumped every single asset they could find for it, from all of our resources, beyond our IP scans, whatever we have — historic data, old data, new data — we just dumped it all into one text file. We sprayed every single one of them with /httptrace and looked to see which ones come back. So in one of our exchanges, we came across the following URL. Again, XYZ is the app name; it's just to protect the company, because you don't want to disclose who they are, and the app name would make it pretty obvious. But it was xyz-internal-prod, and then site.com, /actuator/httptrace. Okay, well, let's look at it. When we looked at it, this thing comes up. This is the information that's being stored: a timestamp, what kind of request it was, the URL they made the request to, all the headers, including the cookie. And they also had some other stuff right here — the user agent, all right. An internal-prod host on a mega-corp is what got our attention. But the cookies that we had in that previous screenshot, the first set of cookies that we found, were not useful. We couldn't log into the app that we were hacking on. But we knew if we looked deeper, because we had the heap dump, we could probably find better credentials or cookies. But the one thing that worked out was, we realized: okay, there's xyz-internal-prod on site.com. Does that mean there's xyz.site.com? Is there an xyz-internal? An xyz-prod? And it turns out all these different permutations exist. And every single one of them has the same exact mistake of having /actuator/httptrace available to hit.
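The naming pivot from xyz-internal-prod can be generated mechanically. A minimal sketch, with xyz and site.com as placeholders as in the talk — in practice you'd feed a bigger suffix list:

```shell
# expand one discovered name into sibling permutations worth probing
perms=$(for suffix in '' -internal -prod -internal-prod; do
  echo "xyz${suffix}.site.com"
done)
echo "$perms"
```

Each generated host then gets the same /actuator/httptrace check as the original.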
So naturally, we went through one of them, and we found that in one of the actuator HTTP traces, there are requests being made to these different things. And again, these are all renamed, but there was this thing called CoreApp — this is the core of the app that we were using — there's the application, there's the viewer, and there's XYZ-services, where XYZ again is what was in the subdomain name. So there are four different apps running on this, and they each have an actuator, and they each have their own heap dump and HTTP trace. So we dumped them all. We saved all the results from all of them, we also saved all the heap dumps, and we went through every single one of them, grepping for cookies. And eventually we found a working session that actually logged us into the website. We loaded them into Burp with match and replace, and we eventually got access to a lot of internal sites that we shouldn't have had access to. So again, just to explain the whole thing and wrap it up: we found cookies that didn't work. We found more instances of this thing being vulnerable in other places within their network. One of them actually had working employee credentials. We plugged them into Burp and authenticated to an internal app. And by accident, when we navigated to other websites, we realized it was automatically logging us in, because we had a match and replace setup that we completely forgot about. It got to the point that none of us knew how we were logged in, and then we realized ZLZ had a match and replace that he forgot to take off. But it gave us access to some really cool stuff. There were 3,652 people — I think these are all employees of this company — and we had access to their accounts on this thing. This one had a little bit of a trick with the path normalization stuff, but it still gave us access because we had cookies. This one is called My Sales Comp. I don't know what it is, but it gave us access to some sales stuff that they did.
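The "grep the heap dumps for cookies" step looks roughly like this. A sketch only: heap dumps are binary, so you strip them to printable runs first, and the dump bytes and cookie name here are fabricated for the demo.

```shell
# reduce a (fake) binary heap dump to printable strings, then grep
# for session material; SESSION=deadbeef is invented sample data
f=$(mktemp)
printf 'JAVA PROFILE 1.0.2\0\0SESSION=deadbeef; Path=/\0\0noise' > "$f"
creds=$(tr -c '[:print:]' '\n' < "$f" | grep -iE 'cookie|session|authorization')
echo "$creds"
rm -f "$f"
```

As the lessons-learned slide says, the same pass is worth running for basic auth, keys, and Authorization headers, not just cookies.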
This one was redacted, but you can see it says welcome, with the name right here in the top right corner. I think this thing had some weird functions where it let us impersonate other users. We never explored who those users were, because we were already too deep in this company's assets, so we just deleted the cookies and sent the email. We apologized for going this far, but we wanted to show impact and let them know this is very, very serious and it should be fixed. We learned some lessons from it. First of all, dig deep. It's not just cookies; there are credentials that could be in there. It could be basic auth, it could be keys, it could be authorization headers. There could be a lot more information in there than what I just explained. The biggest lesson learned for me is to stop brute-forcing when HTTP trace is available. That's because if you're monitoring the HTTP trace for cookies and somebody else like me is sending like 200,000 requests to find another folder that's on there, it's going to mess up someone else's work while they're monitoring for cookies. The mega-corp that we mentioned here wasn't the only company that was vulnerable, so we actually went after the entire bounty table that we had and did the same exact thing. But before I show those examples, please, please understand: understanding your target is very, very important. If you see one mistake being made on one app, you can bet it's going to happen again and again and again within their infrastructure. So this is a few examples of it. It was just us throwing a bunch of these IP addresses in and pretty much fingerprinting for it. Now the good stuff. I'll let Tanner explain what we're doing next with all these things that we've talked about. Awesome. Thanks, Ben. Yeah, it's been, to say the least, an exciting couple of months for sure, crawling through this data. Next slide, please.
Yeah, so these are the two tools that I mentioned at the start: GoFingerprint and VHostScan. So GoFingerprint takes a list of URLs and the fingerprints.json. There's an example one in the repo; you can obviously add to it or take away the ones you think are dumb, however it suits you. And up there in the top right is the output you'll get back: the domain or the IP, whichever one you're feeding into it, then a colon, and then whatever it was tagged as. So you'll get back all the positively identified services. The last one is VHostScan. So again, like I mentioned before, you give it a CSV file with a hostname, IP, and port. And right now it doesn't return a JSON file — it's still kind of a new tool, and I'll be pushing some updates to it — right now it just kind of dumps JSON objects, so they're just printed out to the terminal. You've got the domain, the IP, and the port from what you gave it. Then it has the status code from hitting it by IP, the response length by IP, the status code and response length from hitting it with a Host header of the domain, and then the status code and response length from hitting it with a Host header of localhost. And then a final field called domain accessible, which will tell you whether or not you can hit it by its domain, the DNS record. The point of that was, we noticed in the data it was a pretty consistent identifier: if you had a target where, even though there was a domain name or a CNAME in the certificate for the IP, you couldn't hit it by domain — you could only hit it by vhost — typically that ended up netting us some more interesting targets. So we put that in there as well; it's a way to differentiate them further. Next, please. This is the gold mine. I think we've been excited to share this last slide for a very long time, and I'm very excited for Tanner to release this right now. Yeah.
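That "dead DNS but live vhost" signal can be captured as a one-line predicate over VHostScan's fields. To be clear, this is our own guess at a reasonable heuristic written as a sketch, not the tool's actual logic, and the field order and values are invented for the demo.

```shell
# flag records where the domain no longer resolves publicly but the
# server still answers when you supply it as a Host header
interesting() {
  # $1 = status by bare IP, $2 = status with Host header, $3 = domain resolvable?
  [ "$3" = "no" ] && [ "$2" = "200" ]
}
a=$(interesting 404 200 no  && echo flag || echo skip)
b=$(interesting 200 200 yes && echo flag || echo skip)
echo "$a $b"
```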
So this is going to be real fun to watch. I really hope it stays alive; I have a hunch that the little micro instance it's running on is probably going to die. Regardless, if you go to api.recon.dev, /search, and then give it a domain — right now it's just a simple, really dumb API — just give it a domain you're interested in. You don't need a period at the front of it, just the root domain you're after. Or really, I mean, you can do multiple levels, right? You can do like foo.bar.com; we just don't need that first period. And you'll get a JSON array returning the raw domain, the raw IP of the target, and then those two things with the port they were found on as a full URL. So it's really dirty — it's obviously very raw data — but it's super fun. I think, Ben, you've got to pull it up, right? We can give like a... Yeah, I can actually do a demo of it, if it isn't already struggling with how many people are hitting it. No, probably. I mean, yeah, obviously. So I'll show you quickly. A lot of you know how much I loved the public DNS data sets, and I didn't like what happened with them recently. So we thought we'd do it on our own. It kind of looks almost the same; you just have to get the domain out of it. This one is just for oath.cloud, because if you hack Verizon Media, you know what's up. So this is what it looks like when you get the data from it. Yeah, so if you try to hit that and it takes like four hours to come back, that's not us — that's all your other fellow hackers that are keeping you away from the really juicy findings. So if you want the API to stay alive, coffee goes a long way to keeping these things running. And again, we worked on this last night, so we're going to work on it a little bit more and make it better as we have more time. But we just wanted to make sure there was something here for everybody to look at. And yeah, once the talk is over, we'll look through it and see where it goes. But that's it, quickly.
We're almost out of time, so that's the conclusion. First of all, understand your targets, as always. How are they deploying their apps? How do they do it? Knowing the mistakes they make is very, very important. Again, familiarize yourself with things that you see frequently. If you understand how these things work, it's easier to exploit them, and fingerprinting becomes easier. Make sure you have a solid understanding of the apps and the backends. Also, get a good database of assets you care about, and make sure you keep it updated and keep historic data. Again, you don't have to go to the extreme that we did. A lot of this data is available on Censys and Shodan; you just have to know how to work them. We just had the luxury of being able to do this, and we really wanted to do some cool research around it, and that's why we kind of reinvented the whole wheel again. And again, most of these are old tricks. Nothing I've talked about is new, but it just shows that it still works against huge mega-corps. You know, as an example, the heap dump thing that I showed you: for someone with malicious intent, that's a goldmine, right? It's easy to pivot — you already have credentials. Who knows, we didn't go and test any of these apps that we had access to, but I imagine one or two of them had SSRF or RCE or something on them that could have elevated our access. Also, don't just collect assets and spray random word lists at them. I mean, there's nothing wrong with doing that, but the chance of you getting good results is a lot less with a target that you don't understand, versus a company where you know what folders and files to look for, what the naming conventions are, and so on. Also, again, the wider you go, the more targets you have to play with, but again, it all comes down to how you process your data. And I see a lot of talk about bug bounties on Twitter, especially every year with DEF CON coming up.
I want to say that most of these bugs that I've talked about were found in the last few months, and there's a good $50,000 worth of bugs in here. So bug bounties — do they pay? Absolutely they pay, but you have to spend the extra time to learn about your targets' behaviors. How do they operate? What do they do? And you have to find impactful bugs; keep that in mind. Again, bug bounties are about impact, not just finding vulnerabilities. So the higher the impact, the better, and this was a good case for it. I almost don't have a reason why I put this on here — obviously the semicolon trick always works. In one of the slides that I showed, I mentioned the semicolon path normalization stuff; this is how we did it. But I also wanted to show how important it is to incorporate this into your recon. You can see it here: there was a CVE that came out not too long ago on one of the VPN service providers, and you can see they're all connected. And of course, if you're not doing this right now in 2020, you're probably two years behind. And last but not least, thank you for watching. Thank you to the Red Team Village for having us. And of course, thank you to Ziad, ZLZ, d0nut, TomNomNom, erbbysam, and Rhynorater. And if I forgot anybody that helped us throughout this research, I apologize. But thank you, everybody, and thanks for having us. Thanks, guys. It's been great. Awesome. Thank you so much, guys. Amazing presentation. Thank you so much for supporting not only DEF CON, but the Red Team Village as well. And as a reminder,