So, this is my testing box. Before we get started with the talk: all of these ideas are a methodology for doing recon. I have, with basically bubble gum and popsicle sticks, stitched this together into some automation, just bash scripting. A lot of this stuff can be done automatically; some of it takes contextual knowledge, but most of it can be automated. So, we're going to choose one of two targets. Both have bug bounties on Bugcrowd, so I'm not doing anything illegal by enumerating them during the talk. You have your choice of Twitch.tv or Tesla. Who wants Twitch? Okay, who wants Tesla? Okay, Twitch is the winner. All right, so let me start this up; this takes a while to run. This is literally the script I use when I'm red teaming or bug bounty hunting. We're going to go through the talk and then the output, but it takes a little while to run; hopefully it'll be done by the end of the talk. Okay, emergent recon. I already did the intro. The only thing I'll say additionally is that I'm a dad, I love my kids, and that's my girl winning, well, not winning, but solving her first CTF challenges at an OWASP CTF in Santa Barbara. Really proud of her there. I'm also a huge gamer; I play a bunch of games, so you can probably socially enumerate my battle tag or something like that and add me on Steam or WoW or whatever I play. All right, the first section: discovering IP space. One of the methods I use to discover IP space is keyword searching by organization. Now, these slides show Tesla, but the tool we're running is going against Twitch. The best place to search for what's called an autonomous system number is a site called bgp.he.net. Why do you want to find an autonomous system number? If you're a large enough organization and you run your own network, well, the internet really isn't globally connected computers, it's globally connected little networks. Not little at all, actually; they're very large. They're called autonomous systems. When you have a large enough network, you have to register it, and when you register it, you have to register a description or a name for your company. This is one of the only sites that lets you keyword search to match company names to autonomous system numbers. So here you have Tesla Motors, and you can see their registered IP space starting on the right-hand side there, 209.133.79.0/24, Tesla Motors Inc. Now, what's awesome about keyword searching this type of data is that they might have registered another autonomous system number under a different entity name. Here you can also see on the left-hand side that searching for just "Tesla" also came back with Tesla Engineering Group, so they might have a different autonomous system that I want to enumerate for a wide-scope bounty. Then there are always the verbatim registries, ARIN and RIPE. These have whois data and reverse whois data on anybody, and they both have web services that let you do keyword searches on things like Twitch or Tesla or whoever. This will start giving you back IP space that you can add to your list of things you're going to hack; all of this goes toward building a list of things you want to hack in your red team engagement or your bug bounty hunting. So, someone said Shodan earlier. Shodan also lets you search by the organization tag, so here you can say org:"Tesla Motors".
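A minimal sketch of the same queries from the command line, assuming the official Shodan CLI (pip install shodan) and an API key; the Tesla org string is illustrative:

```bash
# One-time setup with your API key
shodan init YOUR_API_KEY

# Everything Shodan has indexed under the organization tag
shodan search --fields ip_str,port,hostnames 'org:"Tesla Motors"'

# Broader keyword search (many more, noisier results)
shodan search --fields ip_str,port,hostnames 'tesla'
```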
You can also search just by the keyword Tesla, but you'll get a lot more results. This will start giving you everything Shodan has. What Shodan is, is an internet spider that goes a little deeper, keeps indexes, and is available to hackers. It profiles technology stack, IP information, and certificate information a little deeper than any regular spider does, and it keeps it all in a large database that you can query almost for free; even the paid version of Shodan is really not that expensive. So here you can search Tesla Motors and find a whole bunch of systems already out there. You can also search by title. Shodan parses out page titles, which can be really useful, because if you're looking for a specific type of device or technology they might have on their network that's vulnerable, you'll see it right away in the Shodan search results. So, this is a little stopping place, and I actually don't show this very much because it's not really hacking, but it's super useful for me when I'm doing this large-scale recon on a site like Twitch: I organize all my testing inside of mind maps. You could do this in an Excel spreadsheet or OneNote or something like that, but I use XMind, and I thought I'd show you how this works out. So here I've started a campaign, or a bounty hunt, against Twitch. The top-level node in my mind map just says it's Twitch the company, because I know Twitch has a lot of domains and they're not all named Twitch.tv; I'm going to discover that with the tool we just looked at and some other tools we're going to look at. I'm just going to fill this out as we go along. So if I go look for Twitch's IP space, I'll add a tree here and just say "IP space", and then, oh, that's not spelled right, good job, I'll keep that range here; I can go look it up in a second. And then I'll start adding domains on the top-level node. Now, this doesn't look like much when you start, but by the end you'll have a lot of data you can work with in this tree. What it ends up looking like is something like this: I have nodes for a top-level domain that's in scope, Twitch.tv in the upper right-hand side, and then lists of subdomains and outputs of tools. This is how I organize my information. And how do you eat an elephant, with all of these sites you're finding, all these subdomains? One bite at a time. I test each site individually, one site at a time, once I find them. I do an abbreviated web hacking methodology on each of these sites once I know they're real and owned by Twitch: content discovery, dynamic parameter discovery, fuzzing, basic cross-site scripting checks, basic SQL injection checks, default passwords, all that kind of stuff. And then I mark them by progress. If it's green, I've made a first pass over that domain and it didn't have a vulnerability; the check mark is whether I've finished testing the site or not. So here we have an entry at the bottom that's red with no check mark: that means I found a vulnerability and I'm probably still testing it. Orange is, I've put it off till later.
Something about that site has caused me to say, I don't want to do this right now; I'm going to put this work off till later, because maybe this subdomain has some special technology I just don't want to dive into right this second. And I do this for every top-level domain. So down here you can see that Twitch actually owns justin.tv and CurseForge; they also own Curse, Twitch App, and Curse App. Those are all top-level domains they own, and I'm going to do this same recon methodology for each one of those top-level domains. All right, so: discovering new targets. Let's see here. Here we go. So, Shodan was the last thing. Now that I have some IP space, Shodan is giving me some information; I have maybe the main IP space from Twitch.tv itself, which I know is their main site, plus some other information I got other places. Now I want to see if they have different kinds of brands, not just Twitch.tv, and maybe not just their IP ranges. I want to find out if Twitch has acquired anyone recently. I want to see where they're linking to off their main site; there are a lot of good tools to do this. And I want to do some tracker analysis of their ad and analytics tags, because this will reveal other places where they're using those same ad and analytics accounts, and I can add those because they're probably related to Twitch. So, acquisitions is pretty simple. This hasn't changed much in the last year: Crunchbase is still the number one place to find acquisition data. A lot of day traders use Crunchbase to figure out information for trading or investing in companies, startups, whatever. It also has a lot of news, but in the subsection that says acquisitions, you can drill down into any company and see pretty concretely who they've acquired, and it's updated really, really frequently. So here you can see that in the last few years Tesla has acquired Grohmann Engineering, SolarCity, and Riviera Tool. Now, when you take over an organization like this, if you're a big parent organization, you probably decommission all of their IP space and probably their servers; you dead-link all their DNS entries or remove them altogether and you nuke all their CNAMEs. But really, that doesn't actually happen 100%. They probably still have some cloud infrastructure out there that they forgot was up and running; they've probably got customer data leaked some places. This happens all the time. So if it's a wide-scope program like Tesla's, which says "we care about all security vulnerabilities" in their bounty, I look at these two: I look at SolarCity, I look at Riviera Tool, to see if those domains, where websites used to exist, reveal sites that maybe have vulnerabilities in them. So, link discovery is this idea of finding out what the main site is linking to. There are often a lot of outgoing links from a website, or even incoming ones. You can do this recursive link discovery in a tool called Burp Suite. How many of you have used Burp Suite before? Very good. All right, so we're going to do this real quick. We'll try it on Twitch, see if it works, if the demo gods love me. I'm going to boot up my Burp testing profile here and put Burp on the right. When I started this morning my license had expired and I had to buy a new one; I was like, oh, this is painful. All right, so now we have Burp, and our browser on the left. Let's do some things before we start.
So the first thing we want to do is make sure the Spider in Burp is not passively spidering as I browse; I don't want that, I just want to instrument a specific thing right now. So I'm going to disable "passively spider as you browse". I'm going to say the max link depth for this exercise is one, and the max parameters I want to crawl is 25. And then, when it sees a login form, Burp usually prompts for guidance, which is super annoying if you've ever used Burp before, so you can either say don't submit login forms or automatically submit a set of credentials. For this exercise I usually choose don't submit login forms. Number of threads is fine. And forms in general, I'm just not going to submit forms for this spider. So I've set up my Burp Spider settings, and now I'm going to set up some scope. If I go to the Target tab and then Scope, what I really want is to add anything that says twitch. Now, Burp has some different functionality here that they've launched in the last year. You have the verbatim ability to add a real domain or URL in scope, and normally when you right-click on a site and add it to scope, that's what it's doing. I don't actually care about that, because I don't know any domains except the main one yet. What I actually want is to say that anything with the keyword twitch is in scope for this project. So if you click this box right here that says "use advanced scope control", you can hit Add, and you no longer have to supply a fully qualified URL; here I can just say twitch and hit OK. So now that becomes my scope for this project. I'll say yes here: do I want to limit history to just that scope. Okay, so now let's go to twitch.tv. All right, make sure Burp's on. Oh, that was a dude streaming from his bed; that really confused me. All right, so we're now proxying Twitch through Burp. You can see Intercept has said, yo, there's traffic coming through here, so we're going to turn Intercept off, just let everything go through, and then go back to our site map. All right, so already, just by visiting the main page, we have some subdomains that are hot-linked off the main page. And the idea of this link discovery is that we're going to iteratively spider everything we find, to find more subdomains. So first let's just choose gql.twitch.tv; we'll spider this. Oh, there we go, twitch.tv; we'll just spider everything if possible. Yeah, right-click, Spider here. All right, the spider has already started to return a whole bunch of content, not just on twitch.tv now, but also from the things we chose a second ago. So now we start to get a pretty good map of stuff that Twitch is related to. And because I'm parsing on a keyword, we also get the benefit of seeing which vendors they integrate here: some of these are not twitch.tv domains, they're ad systems or other things that have twitch in the URL or the domain, which can be pretty useful information to gather on the target. So the idea is that I would recursively keep selecting and spidering until I built out a huge map of domains.
And those are going on my list of things to test. I need to test every single one of them, which seems like a daunting task, but if you abbreviate your web testing methodology, you can absolutely do it. I had a bounty the other day, well, the other month, that was upwards of I think 5,000 found live hosts. I spent about a month on it, and I think I made 20 grand in that month, so it's not bad. Okay, so you would then select all of these and spider them. Now, how do you take this information and add it to your mind map, or just make it into a list so you can feed it to other tools? Burp actually doesn't have a copy-all-targets function, which I hope they add pretty soon. What you have to do, and you have to have the pro version, is select everything here, right-click, and choose Engagement tools, Analyze target. Analyze target builds you a PDF report of all the domains and information about what dynamic parameters they have. If you save the report, I won't do it right now, it gives you a PDF, and this is the most effective way to just copy and paste the list of targets you had on the left-hand side right there. There just doesn't exist a function to copy all targets out of the left-hand side. So save this as a PDF, open the PDF, copy the table, and dump it into something else. Any questions? All right, quite a few hosts for Twitch. What? Yes, to do that you need the professional version of Burp, unfortunately; that doesn't exist in free. All right. Yes? You use what? ZAP? Good question. I don't think ZAP has that function either, but you can use it for everything in here; the same stuff applies. ZAP has a spider, and in fact that spider right now might arguably be better than Burp's, because it handles JavaScript really well. Although Burp just published a blog post recently saying they're updating their whole spider engine to be awesome; I'm so excited about that, so maybe that changes in a couple of weeks. But ZAP will absolutely work for this workflow, as will Charles and some of the other interception proxies. All right, we'll nuke these. So that was the demo for link discovery. Okay, so someone mentioned whois data: reverse whois is a method where we can start to find related domains or IP space for our targets. This is a relatively new tool called DomLink, written by a guy named Vincent Yiu. He does this really cool Twitter thing called red team tips; he has over 200 red team tips that he tweets out every other day or so, and there are some really useful nuggets in there if you're doing red teaming or bug bounty in general. He created this tool based on a site called WhoXY. WhoXY is a website that offers an API for reverse whois data that's really cheap to access, and it has a free tier with a free number of queries. So he created a tool that recursively looks at a couple of things for an organization; the most useful one is the organization name. I'm going to show real quick what this looks like.
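For reference, the first step DomLink automates is just reading the registrant organization out of plain whois output; a minimal sketch (the reverse-whois pivot itself goes through the WhoXY API and needs a key):

```bash
# Pull the registrant organization to use as the pivot keyword.
# Field name is the standard ICANN whois label; formatting varies by registrar.
whois twitch.tv | grep -i 'Registrant Organization'
# e.g. Registrant Organization: Twitch Interactive, Inc.
```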
This tool will query the WhoXY database with an API key that you sign up for; you can do I think 150 queries before your free tier is gone, and it'll help you search for any site that has the same registered company name as whoever registered your main target. So let's see if this works for Twitch. I'll make this bigger in a second. Oh, caps lock was on; uber hacker up here can't get in with caps lock. Okay. All right, some of my test boxes. Let's see, the tools, and I think DomLink is where I'm putting all this stuff. Okay: python DomLink, let's see how Twitch has their whois information set up, twitch.tv, -C. Okay, see if this works. So their registered name is Twitch Interactive, Inc., which I think is actually correct. So this tool is going to ask me: do you want me to check the whole database for Twitch Interactive, Inc. and give you back the domains? And I'm going to say yes. Now, this can get iteratively recursive, because they might have actually registered under Twitch, or Twitch Interactive, or Twitch Engineering; people register stuff very weirdly. In a lot of companies there's no standard way this gets done unless you're very buttoned down with how your registration works. So a lot of the time it will actually alert me on a couple of these; it'll say, I found five with the keyword twitch in them, maybe we should check all of these out. And here we go: we have a ton of host data for other sites that are Twitch-y. TwitchCon, obviously their conference, so they host a domain for that, and that's in scope. Some of these others, twitch-with-a-3 or whatever, they might just be parking, but that's okay, I still want to check them out. Once, I did this with a keyword for a large manufacturer, and what I ended up finding was a domain that looked nothing like any of the other domains I'd seen or any of their brands, but they had indeed registered it with the same company name. It turned out the company was hosting a whole bunch of, new school, well, new school for them, portals for code repositories: Jira, Jenkins, all kinds of CI/CD stuff. And they thought the protection on this was that they never told anyone about the URL; it was only known internally, security by obscurity, so they thought they didn't need to put authentication on it. So I walked in, stole all their source code and passwords, and rooted the Jenkins server through the Script Console. And that was really just right out of this tool, so this can be really useful to find targets. Twitch, Amazon, right, yeah. So now you have this output; what do you do with it? Basically, it's as simple as copying and pasting it into the mind map. So if I go back here, this is where the fun is. Each one of these should get a node, if copy and paste works. Okay, well, you get the idea: twitch.tv, justin.tv, twitchcon, every one should get a node. All right, let's go back to the presentation. So, okay, so far, useful? Yeah? Okay, cool. Questions? Question. Yeah, it's a different database; it's a reverse-whois-focused database, and they're parsing more fields out of the whois information than that site is. I've used both; I find this one way more effective. Okay. So now we also want to find, we're still on this track of trying to find top-level domains, acquisitions, other sites that are related to Twitch, because they'll be in scope for a large-scope bounty. Now, every company uses ad and analytics tags, right?
Yeah. Yes, you can, yeah, through that same one, WhoXY, or DomLink. It has multiple options: you can specify that you want to search by company keyword, you can specify registrant name, you can specify a whole bunch of stuff. So just check into the options on that one. Yeah. So, BuiltWith is basically this company that does technology profiling and ad and analytics tracking. They check out every site on the internet: they spider it, and by looking at leftover source snippets or text files or just default configurations of some frameworks, they know that your site is running Fastly, and is hosted with this X server technology, and is using these JavaScript frameworks, and they also know which ad and analytics packages you're using by the format of the key you embed in the web page. Now, we can use this to our advantage, because if you're a company like Twitch, you have a New Relic key and a Google Analytics key or some of these other keys, and luckily for us, BuiltWith lets us look at the relationships between sites. So here I've drilled down into Twitch, and those are their analytics keys in the middle pane right there. I can click on any of those and see where else their analytics keys showed up on the internet, what other sites are in the BuiltWith database. This lets me find brand new stuff I might not have seen before, like bini.tv and, you know, Boost the Bun; these are all streamers that Twitch has promoted so much that they now have their own Twitch-hosted sites, so these might be in scope for the bounty, they might be custom code, you never know. So basically, to do this, you can go to the BuiltWith site and just search there, or you can use the Chrome extension, which I can show you now. So in my browser up here, if I go to Twitch again, or actually, let me do it in my testing profile, no, that's not my testing profile, close it, there it is, okay. So now that I'm at Twitch, the BuiltWith extension is installed, it's just in the Chrome bar, and you can click it. Maybe Burp's messing it up real quick. Oh, there it is, okay, cool; it just takes a little longer because it's going through the proxy.
So here you can see I get some tech information, and this is useful in hacking anyway, to find out the stack they're using, what JavaScript frameworks they're using, and you can drill down into this back here. Sorry, this is also very useful in other parts of the methodology: knowing what they run, because you're going to look for 0-days in the frameworks they use, or whatever. So this is useful, but then you go to the second tab up here. Yeah, question: Wappalyzer doesn't do any of the ad and analytics tracking, it just does technology profiling. It'll do just as good a job at that, if not a better job, than BuiltWith, but yeah, it's limited on the other functions. So let's try the website. Yeah, another question. Yeah, so that's the problem: I would have scaled this if I could. The problem with BuiltWith is that it's a paid tool, they really want you to pay for it, and actually it looks like they updated it right as I was doing this presentation; it used to work right inside the bookmarklet, and now it looks like I have to log in with a free account to get the relationship information, or maybe use the search on the website. They have an API; it's really expensive to use. But I don't know of anyone else doing this analytics-code tracking across multiple domains, so I'm kind of stuck with it if I want to use this method. If anybody knows of anything cool like that, I would love to know. So let's run the main site. All right, here we go: relationships for twitch.tv. I can see here's their UA code right here, the Google Analytics code, and if I click on that, now I get the domain information. Let me make this a little bigger. So here on the left-hand side I can see related domains. I also get a heat map over here of how strongly they're related, although I'll be honest, I don't really know how to read that graph well yet; I use the data from the table most often. So, start the tree: this is kind of the same stuff we were looking at before, some guild sites, faceless.tv. This happens a lot, actually, with this kind of linked analysis: if they're going to do a beta product and they haven't even let anybody know about it, a lot of times I can look at this data and know they're doing it. Like, I knew way before some video games had even launched or gone into beta that Twitch had already partnered with them, gotten sites ready, and integrated the analytics code into the pages, so I end up knowing beforehand. Not really useful for a hunter, but exciting for a gamer. All right, yeah: you do have to visit them. The indicators, I wouldn't say of compromise, but the indicators of ownership are usually that the site has a privacy policy that links back to Twitch, or a trademark in the footer that links back to Twitch; that's how I usually know these are related. Yes, yeah, absolutely: you could use DNS registration information, you could go back to the whois information and verify that these match, whois lookups and stuff like that, absolutely. Sometimes I'm reckless enough that I don't do that, but it just depends. All right, so that's tracker analysis, basically. So, other things you can just do, things we just talked about: trademarks exist across many sites. You have to embed your trademark and privacy policy in any site you launch for business nowadays; they protect you legally.
So searching for things like "Tesla © 2016", "Tesla © 2015", "Tesla © 2017", combined with inurl:tesla, is a quick Google dork to try to find sites related to Tesla. Yes? Honestly, I don't look at that when I'm doing bounty hunting; I don't care as much. That's a horrible answer, but yeah. I would report it, though. I've had that instance before: I found a site, it has the privacy policy, but I can see it's obviously not managed by the company, it's a third party, and it's vulnerable to something I picked up right away; it has some kind of search functionality or reflected text, and it turns out to be vulnerable to cross-site scripting. In a bug bounty situation I just ping the customer and say: I found this site, it should be in scope because you have a wide-scope program, but it also might not be, because I don't know if you own it; it has your trademark on it. I just say those words, and a lot of times they'll be like, thank you, we'll contact the owner of the service, we think we know who it is, and they'll usually award me a bounty just for having found something they had no idea was up, or had lost track of. Okay, so now we've done IPs and brands and related sites, and now we're going to get into discovering subdomains. So, subdomain enumeration: now that we have tesla.com or twitch.tv, those top-level domains, we want to start finding subdomains of those sites, and those can individually map to IPs and be their own applications, which are all in scope for bounty hunting or red teaming. There are really two methods to enumerate subdomains: subdomain scraping and subdomain brute forcing. Subdomain scraping is the idea of taking search engines, databases like Censys and Baidu, DNS databases, SSL certificate databases, even VirusTotal and the Wayback Machine; there are about 65 total sites around the internet that harvest large sets of data about domains, or maybe they weren't even made for that, they were made for other things, but they allow us to do searches to find references to domains, and they all, in some way or another, offer access via an API or can be scraped pretty easily with some Python. This is relatively new, actually; we weren't doing a ton of this in pen testing until the last couple of years. Nobody was really looking at scraped information off the internet to identify the subdomains and assets of a company in a red team engagement, but now it's all the rage; this is actually one of the best methods to find subdomains and the secret sauce of your clients or your bug bounty targets. So there are a ton of sources, and there have been two advances. One was a tool called Sublist3r, which everybody used for a really long time; it was maintained for a while, then kind of fell off and wasn't really maintained. Then two authors recently released two tools that are epically good: Amass, and the next tool I'm going to talk about, Subfinder. They each have different sources and different features, and I can't decide which one I'd use if I were asked to pick one tool, so I just script them both together. When I started that scan at the beginning of the presentation, that's just some bash in the background running them concurrently, and the reason the tools take so long is that they're scraping all those sources right now.
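A minimal sketch of that glue, assuming both tools are on the PATH; flags vary by version (newer Amass releases use the enum subcommand):

```bash
#!/usr/bin/env bash
# Run both scrapers concurrently, then merge and dedupe the results.
domain="$1"

amass enum -d "$domain" -o "amass-$domain.txt" &
subfinder -d "$domain" -o "subfinder-$domain.txt" &
wait  # each hits dozens of sources, so this is the slow part

sort -u "amass-$domain.txt" "subfinder-$domain.txt" > "subs-$domain.txt"
wc -l "subs-$domain.txt"
```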
So this is a run of Amass against Netflix, who also have a public bounty. Between Amass and Subfinder, it goes out to I think 65 sources, and it also offers some cool stuff like permutation scanning. So the first thing it does is scrape all those sources for references to netflix.com, and then it tells you: on this page I found media.netflix.com and geo.netflix.com and something like xout104.netflix.com, I have no idea what that is. So it gives you back a list of targets, and you can see that with this methodology, for a big company, you're starting to gather hundreds of targets you can go after, which for a bug bounty hunter is good: you can hack the main site, or you can focus on these other sites; there's really no limit to the kind of stuff you can find. A lot of my buddies are on the Walmart red team, and they use this same recon methodology to find stuff that's just been left out there, especially when Walmart acquires a new company; they do the same enumeration. So, the other helpers this tool has: it includes some reverse DNS stuff, but it also has what's called permutation scanning. You see here, the second result was media.netflix.com, on the right-hand side, and that's great. What permutation scanning does is add common keywords to that subdomain, things like 1, user, dev, or prod, joined with a dash or a dot, as in 1-media.netflix.com or dev.media.netflix.com, prepending and appending them to try to find additional hosts related to that subdomain, and then it checks whether they resolve. If one does resolve, it gets added to the list. That's called permutation scanning in subdomain enumeration, and Amass includes a function to do it.
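A minimal sketch of the permutation idea itself; the wordlist and naming here are illustrative, not Amass's exact internals:

```bash
# Generate prefix/suffix permutations of a discovered subdomain.
# Resolve candidates.txt afterwards (e.g. with massdns) and keep what answers.
sub="media"; domain="netflix.com"
for word in dev stage prod test admin 1 2; do
  echo "${word}-${sub}.${domain}"   # dev-media.netflix.com
  echo "${word}.${sub}.${domain}"   # dev.media.netflix.com
  echo "${sub}-${word}.${domain}"   # media-dev.netflix.com
done > candidates.txt
```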
The other tool is Subfinder, written by Ice3man; awesome hacker name, I love it. This one has a multi-resolver brute forcer built in, so if you want to do brute forcing and subdomain scraping in the same tool, you can use Subfinder, and it's pretty efficient. It can also output JSON, so you can feed this stuff to other tools; in particular, Subfinder now supports Aquatone output, so if you wanted to use Subfinder for discovery instead of Aquatone, which is a framework for OSINT that finds the same kind of stuff, you can take Subfinder's output, put it into Aquatone, and use Aquatone for the later phases of what its author calls a domain flyover, but really it's just subdomain analysis. I used to have a fancy table saying what was better about each tool, back when it was Sublist3r and some others like enumall, which is a tool I wrote; I've completely deprecated my own tool, I just want to use what works. It doesn't matter anymore: Subfinder and Amass are the two tools you want to use. They're the best of breed right now, and they will be for quite a while; they handle the most sources and they're the most effective. Okay, so that's subdomain scraping; now, subdomain brute forcing. How am I doing on time? Five minutes? Oh god, okay. Subdomain brute forcing is the idea that you just try to resolve a whole bunch of candidate names, like admin.twitch.tv or media.twitch.tv, and if you get a resolve, it means that site exists, or there's some kind of DNS redirect. And this is time consuming; brute forcing anything, passwords or DNS entries, is time consuming, and pen testing doesn't need to take that long. There are newer tools nowadays, and the one we're going to talk about is massdns, sorry, not masscan: massdns. What used to take a day, or maybe a week, with a large wordlist for brute forcing subdomains, massdns cuts down to a minute and 24 seconds. How it does this: it's written in C, first of all, so it's very fast, and then it uses what are called multi-resolvers. Basically, instead of using one DNS resolver to do the resolution, it cuts your wordlist into groups, distributes it across ten or so resolvers, and then brings it all back to you in the same tool. So it drastically cuts down the time for this type of brute force: a one-million-line dictionary runs in one minute and 24 seconds, which is unheard of. This is what everyone's using right now. So, what about those wordlists? On the right is every tool I've ever known in my pen testing and bug hunting career for subdomain brute forcing. Fierce, which a lot of people used for a long time; Knock; a whole bunch of tools came out for subdomain brute forcing over the last ten years, and they all have different wordlists, and then individual projects came out with wordlists of their own. I basically catted and uniqued them all into this one list called all.txt, and that's what I use with massdns; that's what I feed massdns.
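A minimal sketch of that run, assuming a resolvers.txt list like the one shipped alongside massdns:

```bash
# Turn the wordlist into FQDNs, then resolve them across many resolvers.
# -t A asks for A records; -o S is massdns's simple output format.
sed 's/$/.twitch.tv/' all.txt > fqdns.txt
massdns -r resolvers.txt -t A -o S -w resolved.txt fqdns.txt

# Names that answered, with massdns's trailing dot stripped
awk '{print $1}' resolved.txt | sed 's/\.$//' | sort -u
```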
Okay, so there are some other, newer kinds of projects out there; one is called CommonSpeak. What CommonSpeak did is use BigQuery on a whole bunch of sites to parse out their subdomain structure and their URL structure. The subdomain data is awesome: it really gives you keywords and terms for subdomain naming that are pretty new. They went out and analyzed Hacker News, HTTP Archive, and Stack Overflow, and basically, if you think about it, they're spidering these sites, and every time they see a URL or a domain mentioned, they capture the subdomain, add it to a list, and then do some analytics on it and say: all right, new-school companies are probably using these names for their subdomains. So if it became fashionable to name your subdomains after Harry Potter characters, running a BigQuery search against one of these datasets might tell you that that's how people are naming their subdomains now, which is almost like language analysis added to subdomain enumeration. They also tried to do the same thing with URL data, how URL paths are structured. That has been less useful for me; I'm still waiting for the author to sell me his dream on why it's super useful, because when I looked at it, it's very application specific: everybody has a different URL path, and there aren't many common occurrences unless you're using purchased, licensed software, something like that. But the subdomain data is awesome; I've integrated it into the all.txt file, so it's in there now, and you really can just use the all.txt file. Any questions? All right, so, other ways to find subdomain data. No, he's taking a picture. All right: if the target has DNSSEC enabled, there's this idea of how DNSSEC links each subdomain to the next one set up for your organization. I'll be honest, I don't know exactly how this works; I'm not a DNS expert, and maybe someone in this room knows it better, but I know the records reference each other when you set up DNSSEC. So there's this idea of what's called NSEC walking, and there are three tools for it: ldns-utils, nsec3walker, and nsec3map. What they do is iteratively go through every domain: you hand them the first one, and they look for the next one referenced in the NSEC chain. What happens when you run one of these tools is you get back something like an old-school DNS zone transfer, which is amazing, because nobody lets you do zone transfers anymore; if they did, a lot of this would be moot, just give me the zone transfer, great. So if the target has DNSSEC enabled, you can use one of these tools. There's a whole presentation about this that Bharath Kumar gave at Bugcrowd's LevelUp conference a while back; he walked through doing all of this. It's called Esoteric Subdomain Enumeration Techniques, and it was excellent; I really loved it. Other ways you can search for domains: searching GitHub or GitLab or sources like that, and you can also just do some Google dorking. All right, so now we've got a giant list of targets in our scope, or campaign, for red teaming or bounty hunting. What do we do now? Well, it's very general: in pen testing, you do port scanning. For a long time we were using Nmap, and nothing against Nmap, it's a great tool and I use it in the methodology, just in a different place, but it's slow; it's really slow. Things like ZMap and masscan are infinitely faster for a general port scan. Here, masscan doing a full port scan of a large target's ASN takes about 11 minutes to finish, and that's 65,000 hosts; that is super fast for a port scan. And I'm running this on a mid-tier DigitalOcean box, so it's not like I'm working with the most bandwidth, either. Nmap would have taken a week. I remember when I was doing pen testing full time, when I was just a scrub: we would kick off this really ugly Python script, and it would have to kick off Nmap at the beginning of the script against a large company's ranges, and we'd come back four days later and it would finally have completed. Now, Nmap has gotten a lot of tuning to get faster, there has been a lot of tuning, but masscan still blows it out of the water. So what I do is the initial port scanning with masscan, the fast scan of the IP space with every port, the full range from 1 to 65,535. If that tells me a port is open, then I feed it to Nmap, and Nmap only gets the ports I know are open. I use Nmap there because it's stronger in other areas: it lets you do version scanning, it lets you add Nmap Scripting Engine checks. So I feed the masscan output to Nmap, and as you can see, this is a methodology that can obviously be automated; you can script all this together.
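A minimal sketch of that handoff, with an illustrative target range; note that masscan prints one line per open port, so this naive version may touch a host more than once:

```bash
#!/usr/bin/env bash
# Fast full-range discovery with masscan, then version/NSE scanning
# with nmap on only the ports masscan found open.
range="203.0.113.0/24"
masscan -p1-65535 --rate 10000 -oG masscan.gnmap "$range"

grep 'Ports:' masscan.gnmap | while read -r line; do
  ip=$(echo "$line" | grep -oE 'Host: [0-9.]+' | awk '{print $2}')
  ports=$(echo "$line" | grep -oE '[0-9]+/open' | cut -d/ -f1 | paste -sd, -)
  nmap -sV -sC -p "$ports" -oG "nmap-$ip.gnmap" "$ip"
done
```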
Yes, yes, absolutely. Yeah, that's actually a typo, so thank you. Yes: sometimes masscan does come back with false positives, because it's asynchronous, really. I don't have a great way around that right now. When I get that port data back in my automation, it sometimes looks like every port is open on a box, and since I'm logging the port scan to a flat file, that flat file ends up being a huge number of megabytes and it slows down my tool. I'm still trying to figure out a way to not do that, so it's in my issues list at home. But it happens less often nowadays; I think if you tune masscan really well, you can get around some of it. Yes. Yes, this is true. Okay, so the question was that a lot of places blacklist you when you start scanning. This is a running joke, I don't know if you've heard it before. What happens is that a lot of cloud-based WAFs, Akamai and Cloudflare and some others, when you start getting into port scanning, or you start sending a lot of web requests against a host, or you send even one iota of attack traffic at a site, they'll put you on a global blacklist, which blacklists your whole house. And then your wife, my wife, will come to me like: why can't I get to Amazon? Why can't I get to United? Literally, she was trying to go to a funeral for her mom's mom, and she couldn't buy a plane ticket because I'd blacklisted our house, RIP, on this global blacklist. So this is a running thing: I test over a VPN all the time now, and I make sure to have a VPN with a quick proxy-switch function. I use IPVanish, because it has a whole bunch of functions I like, like the checkbox that won't let you connect to the internet unless the VPN is enabled, and a button that just says quickly change IP. So once I get blacklisted on one IP, I'll just switch to another. But I don't test over my home network anymore; VPN, check. Questions? Okay. All right, so now you've got back some port scan data, and immediately you're going to start noticing interesting things. Obviously you'll notice the 443s and the 80s, which are what we're going to test in web testing, but you're also going to notice remote administration protocols and database servers and things like that, and all of those can be brute forced with password lists. This is the part where you've got to make sure you have permission. So when you do the masscan and get the output back, you serve it to Nmap, and with Nmap you use the -oG flag to get the greppable output. What you can do with the greppable output of an Nmap service scan is feed it to this tool called BruteSpray. BruteSpray is pretty cool: it takes the Nmap .gnmap file, parses it for all remote administration protocols, and, using Medusa as the core technology under it, brute forces everything in that .gnmap file for credentials, concurrently, which is kind of useful. So here I've specified a password list: it comes with a default user and password list, and that's honestly enough for me. I'm not trying to hardcore brute force into stuff; that's usually not even in scope for the bug bounty, though in red teaming it is. So maybe you want to use a more advanced username and password list here; a lot exist out there, and SecLists is probably one of the best places to find password and username lists. But I'll just use the default one here, and I'll run this after masscan and Nmap finish. So it locates the file, and you can see that inside my .gnmap file it identified nine FTP services, nine SMTP, eight SSH hosts, one Telnet, and one MySQL, and it'll brute force all of these for common usernames and passwords and alert me when it finds something.
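A minimal sketch of that invocation, per the BruteSpray README (default wordlists; swap in SecLists via -U/-P if you want):

```bash
# Parse the .gnmap output for remote-admin services and let Medusa
# try the default username/password lists against each, concurrently.
python brutespray.py --file nmap-203.0.113.7.gnmap --threads 5 --hosts 5
```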
So now I've started looking at remote administration protocols, but I also have all of this data for websites, 443 and 80, and those are actually where I spend the majority of my time on a bug bounty: websites. But now I have so many; I have hundreds of domains for a large-scope company. Think of the largest company you can, an Accenture, a Microsoft. You do this analysis for Accenture or Microsoft and you're getting back, well, I know for Microsoft it's in the thousands of live hosts that you're looking at. I don't know how many of you have ever run a campaign against a thousand hosts; it's a lot of work, and you really need to know where to start. You could start at the top of the list, but you can also do this method of visual identification. What you do is use a tool called EyeWitness. There are other tools that do this: Aquatone does it as well, there's another called httpscreenshot, and there's another tool that somebody just told me about outside at the Recon Village tables that's even better for this, but I haven't tried it yet, so it's at the end of the presentation. The idea is that you find some tool that visits all your URLs or domains and takes a screenshot. That's all it does: takes a screenshot and dumps it in a folder. Now you can open up that folder with large thumbnails enabled and just look through and eyeball it: yeah, that one redirected to the main site, I don't care about that; this one redirected to a help site, kind of don't care; oh, this looks like an admin back-end login, really care about that. So you visually start identifying the things you care about out of this large list, and that helps you prioritize what you test first, so you get a return on your time. What EyeWitness did that I liked a lot is that you can feed it just a list of domains, without saying whether each is HTTP or HTTPS, and EyeWitness specifically will try both for me; I don't have to create multiple lists with HTTP and HTTPS and feed those to the tool, it does it for me. But these screenshot tools are also kind of slow and prone to error, because they use things like PhantomJS, which is just kind of a garbage fire sometimes. So you have to take it with a grain of salt that you'll maybe get some false negatives with this approach. If I'm doing this against a smaller target, if I've found less than a hundred live hosts, I still recommend just loading them all in a browser and visiting the pages yourself with some Chrome plugins; over a hundred hosts, I'll do something like this and automate the screenshots. Yes: it is in EyeWitness, they have a function where you can dynamically add a set of ports to the end of the domain, which I actually don't think a lot of the other tools trying to do this support yet. Yeah, like if they were running an SSL/HTTPS service on another port, right, yeah.
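A minimal sketch; the port flags are the feature just mentioned, and the values here are illustrative:

```bash
# Screenshot every host; bare hostnames get both http:// and https:// tried.
./EyeWitness.py --web -f subs-twitch.tv.txt -d twitch-screens \
    --add-https-ports 8443 --timeout 10
# Then browse twitch-screens/ with large thumbnails and eyeball for logins.
```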
Okay, so this next one is kind of new and actually really simple, but who here has used the Wayback Machine? It's one of my favorite sites. Archive.org and the Wayback Machine are awesome if you've never used them: it's a website that takes periodic snapshots of most sites on the internet that it can find, and it will show you the front page, or maybe several pages, of a site with a date and time stamp, and it keeps the image up there. So when you start going through these sites you've found, you're going to notice that a lot of them are infrastructure that isn't serving a real web page; there's no application, it's really an API or some kind of infrastructure that just needs to be hosted, but it's exposing a web server for no apparent reason. You'll notice as you go through your testing that you get a lot of blank pages, or stuff that looks like it's been nuked but once had a lot of content. So what you can do is go look in the web archive history and see: was there once sensitive content here? I had done this and been successful in a couple of cases, but a buddy of mine, Brett, who's a bug bounty hunter, it was kind of nice to see that he's found some really high-impact vulnerabilities using this method. So how do you automate it? There are some tools to do it: one called waybackunifier and another called ReconCat support scraping the links to the snapshot images of the site you're looking way back into. These get integrated into your methodology: when you see sites that have little content on them, or that look like they used to be sensitive but have since been fixed, you go do your Wayback enumeration, and then, bingo, it's a configuration page with a private API key that's still being used, and they thought removing the web page was the solution, but the content was cached on the internet. All right, so we did Wayback enumeration, we did visual identification. So now, for each one of these sites, I probably have a node in my mind map, and I just start going down the list. The first thing I do is identify the platform with BuiltWith. The current plugin, well, I have a Python script that I'm going to release, I just have one bug to fix because they change the API all the time and I need to change it to use a free account, but I've built a Python script that will scrape BuiltWith and give you back the technology profile for your site, so you can integrate it into your own tools. I'll release that afterwards and maybe tweet about it on my Twitter. Or you can use Wappalyzer, which somebody mentioned, and there's another one called WhatWeb, which is also a technology profiling script; there are a whole bunch of them, and they're all pretty good at this point. I want to see what my target application runs: are we looking at ASP.NET, are we looking at PHP, what JavaScript frameworks? I want to start getting that information, because, well, ASP.NET has request validation, so cross-site scripting is not going to be super successful much of the time unless I'm looking at a custom piece of code or something like that. PHP is very prone to path traversal attacks or command injection and old-school vulnerabilities like that, so I've got to put my 1990s hat on and be ready to attack PHP. So BuiltWith gives me that information. And then there's also retire.js, which is super cool, and which gives you the full version list of all their JavaScript libraries, which is easily cross-referenced with: have there been any vulnerabilities since the version they're using? So immediately I know if they're using outdated JavaScript libraries, and whether there's cross-site scripting or anything else in those.
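A minimal sketch of a retire.js pass, assuming Node's retire is installed globally (npm install -g retire); the wget mirror step is just one way to grab the scripts:

```bash
# Mirror a site's first-level JavaScript, then check it against
# retire.js's database of known-vulnerable library versions.
wget -r -l1 -A .js -P ./site-js https://www.twitch.tv/
retire --path ./site-js
```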
There's also a newer-ish Burp plugin, the Vulners software vulnerability scanner. You load it into Burp and it basically does what retire.js does, but for the server stack. Think of what Nessus gives you: when you do a Nessus scan, it comes back and says, this server software is this version, I know by the header, or because I identified an install file that was left behind, and that version has a vulnerability in it. This does that through Burp: you set it up, and every time you visit a new site, it looks at those key indicators and says, yeah, this is old, it probably has a CVE associated with it, you should go check whether that's exploitable. All right, so I'm on a main site, I've technology profiled it, I kind of know what I'm going up against, and I've maybe found some CVEs I have to verify. The next big hurdle is parsing JavaScript. Every site nowadays uses heavy JavaScript, and dynamic spiders, even Burp in its current incarnation, are just not good at traversing JavaScript; no technology is really great at it. So to be great at this I have to add separate tools to my methodology. ZAP is actually really good for this: ZAP's AJAX Spider is like a headless browser that will execute a whole bunch of functions on a page, and it basically lets you instrument a JavaScript-heavy application and find parameters, and even things like DOM-based cross-site scripting, very easily. So the AJAX Spider is pretty cool. The other one is called LinkFinder, a standalone tool that you feed a URL or a list of URLs. It will pull down everything it sees in the source code of all the pages of a spider: full URLs, absolutely referenced URLs, dotted relative URLs with at least one slash, or just references to files. It goes through all the JavaScript files on a site, parses these out for you, and builds them into links, so you can visit them in your browser or directly through Burp Repeater or something like that. This has been really successful with things like API functions: maybe a large piece of JavaScript loaded on a page describes how to implement the whole API, but since you're there as a user exercising one function, you're only executing one one-hundredth of what the API can do. Well, now you get the paths for all of the API and how to instrument them, and if they haven't set up access control properly, having this mapping and knowing how to work the API without documentation is absolutely glorious; you can find vulnerabilities. These things reference configuration files sometimes, too. We used to miss out on a lot of this information in JavaScript files; now we don't so much, because we have helper tools like LinkFinder. Very similar is JSParser, actually written by Ben, who was in the room earlier, I don't know if he ever got in, which is sad, but he helped write this; it does the same thing, parses out paths referenced in JavaScript. So how do you feed these tools those pages? It's pretty simple: you go to your top-level target in Burp's site map, you go to Engagement tools again, you have to have pro, you say Find scripts, you copy the selected URLs that have scripts on them, and then you feed JSParser or LinkFinder that list, and they'll automatically go do their magic and find URLs you need to add.
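A minimal sketch of a LinkFinder run over a whole domain, dumping endpoints to the terminal:

```bash
# -i is the input URL, -d walks the whole domain's JavaScript,
# -o cli prints results instead of writing the HTML report.
python linkfinder.py -i https://www.twitch.tv -d -o cli
```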
All right, so now we've parsed JavaScript and we have a good map of the application, what it's doing, et cetera. Okay, so now we want to do content discovery. The idea of content discovery, or directory brute forcing: who has done this before? Anybody? Directory brute forcing? No one's used DirBuster before or anything? Okay, there you are, I was worried for a second. The idea here is that you have twitch.tv, and you've spidered twitch.tv and used it as a user, and you know all these paths and functions it's executing, but that's not the whole story. There are absolutely back-end URLs used by service staff like admins; there are usually configuration pages installed by frameworks, or login pages installed by frameworks; there's just a whole bunch of stuff behind the scenes. How do I find out about all that when I look at a site? I do directory brute forcing, with something like DirBuster. Now, DirBuster is super old school; nobody uses DirBuster anymore. We've moved on to command-line tools that instrument the same thing DirBuster does, but much faster. Like Gobuster: Gobuster is one of my go-tos. Another is dirsearch, which I've been using more lately. The reason you'd use one of these tools over the other is the amount of information it gives you back, and the control you get over the directory brute forcing. Five minutes? Okay, got it. So these tools let you run through a large list of directories. The best list right now is RobotsDisallowed, by Daniel over here: he went out and collected the robots.txt files from the top Alexa sites and built them into a list for us to use. Because if you think about it, a robots.txt file is what developers don't want you to find. So now we go to every place they don't want us to look, and we look at it, and that's what a hacker does. So Gobuster plus RobotsDisallowed: pretty win. There's another list that's pretty good for this: I took RobotsDisallowed and, again, the wordlist from every directory brute forcing tool I ever found, and catted them into one list. It's pretty crude, but it's still pretty good; it works for me. It's a large list, so it will take a long time to run against a target.
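A minimal sketch with Gobuster's 3.x syntax (older versions drop the dir subcommand); the wordlist filename is illustrative, use whichever RobotsDisallowed size fits:

```bash
# Directory brute force: 50 threads, results saved for the mind map.
gobuster dir -u https://www.twitch.tv \
    -w RobotsDisallowed/top10000.txt -t 50 -o gobuster-twitch.txt
```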
This Backslash Powered Scanner list, called params, is in a folder on PortSwigger's GitHub. It's also really useful for API fuzzing: if you're up against a REST path, like you find an API and it's a REST-based API, this list is really good for finding REST-based API functions when you're doing a black-box web audit.

All right, we're almost done, I swear. The next one is this idea of auxiliary testing, just some random stuff I wanted to add in here. A lot of people are committing bad stuff to GitHub. A lot of the time they do it on accident: they commit passwords, they commit config files, private keys, a whole bunch of stuff. So basically, the first thing my script does when I kick it off, like the one earlier, is build a set of links that are searches against GitHub. You can see I have a word here, password, which is actually the one you hit on the most with this kind of analysis, and I build a link: github.com/search?q= plus my domain (that's the dollar sign one there, so in this case it would be twitch.tv) plus password, and that's the search. If I click on this out of my console it takes me to GitHub (I have to be logged in) and it searches all GitHub projects for any committed code that has twitch.tv in it and password equals. And invariably I will find a lot of these. This seems dumb and simple, but it happens all the time, it's worth a lot of money in bounties, and it's indicative of a lot of risk to your organization. It just happens: people spin up custom code projects, they forget, they commit it to their own repo, and even if they remove it later, the reference is still there in the history. So this is a big thing that happens all the time. I build these links dynamically while the subdomain scraper and brute forcer are running in the background, which takes a long time, and I manually go through each one of these GitHub searches to find out if they've committed stuff. I'm trying to stack my activities so that I'm never wasting time.

Something new I've started to do is favicon analysis. The favicon is the little icon associated with your tab; it shows you, hey, this is the Adobe favicon or whatever. I pull that down from the main website, which I usually already know when starting one of these, and I hash it. Then I pull down every favicon I see on every site in that port scan and in the referenced subdomains, and I check whether the favicon matches. If it does, I know that host is probably owned by that company, and it might be something I need to test. So favicon analysis; that methodology is pretty new. I confirmed it with a friend of mine who's pretty good at recon, and it's actually worked out for him as well: he found a couple of referenced IP ranges in the cloud that he didn't know for sure were targets, but because he could verify via the favicon that they were, he was able to get permission to hack them when nobody else had ever seen them, and he found a couple of bounties on that. Super cool.

This is the last one; I actually found out about it at the table up there: gograbber. It's the same idea as httpscreenshot or any of those screenshot tools, but it's a new one. What I like here is that it's written in Go, so it's probably faster. I'm going to try it when I get home. I heard it's pretty good, so I don't really have any data on it, but if you want to stand up something quick, you can try it when you get home.
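A sketch of that link builder, in the same spirit as the script on screen: the dollar sign one is the target domain, and the keywords beyond password are illustrative stand-ins for my longer list.

```bash
#!/bin/bash
# $1 is the target domain, e.g. twitch.tv
domain="$1"
# A few of the secret-hunting keywords; the real list is much longer.
for word in password passwd id_rsa credentials secret; do
  echo "https://github.com/search?q=%22${domain}%22+${word}&type=code"
done
```

And the favicon check is really just hash-and-compare. A minimal sketch, assuming your candidate hosts sit in a file (hosts.txt is a placeholder name):

```bash
#!/bin/bash
# Hash the known-good favicon from the main site.
ref=$(curl -sk https://twitch.tv/favicon.ico | md5sum | awk '{print $1}')
# Pull the favicon from every candidate host and compare hashes.
while read -r host; do
  h=$(curl -sk --max-time 5 "https://${host}/favicon.ico" | md5sum | awk '{print $1}')
  [ "$h" = "$ref" ] && echo "${host} serves the same favicon; probably theirs"
done < hosts.txt
```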
So if we go back, this is the total methodology wrapped up in kind of bubbles, with a catalog of the tools I use alongside it. And this is iterating all the time; it's always changing. Every couple of months something happens where I change up my methodology. I'll wait until all the cameras go down, I guess. Yes? What? No? What is that? Maybe? Okay, yeah, absolutely. I don't have any experience with that; I would love to add it to my methodology. Let's talk. Absolutely, yeah.

Okay, so that's most of it. Let's look at the automation that ran. Okay, so here is... holy crap, all right, that was a lot of stuff. So this is my automation. It's just cobbled-together bash scripting; it's nothing special, right? I started it on twitch.tv at the top. First it says it's running the mass subdomain finder and massdns on twitch.tv, so that's going to take a while. Then it gives me the Crunchbase links for Twitch's acquisitions. I'll take those and paste them into my browser: Crunchbase uses Distil, so I can't automatically pull this down on the command line; I actually have to visit the page as a human, because Distil is really good bot protection. So I just build the link, go check it out, see who they've acquired, and add them to my mind map.

Then I also want to see if there's any kind of directory structure that twitch.tv uses, or anything that's referenced very highly in Google search results. So this is a Python Google scraper, a Google browser basically (I can't remember the name of it), and I use this command line browser to pull back Google queries. It gives me links that are referenced on Google for twitch.tv, and I start looking at these to see if there are any strong correlations of sites I should test.

Then it builds my GitHub list for me. So, passwords: I'll just grab this and copy it, or open it in the browser. Really hoping nothing shows up... So, somebody has a project here with a JSON config file that maybe has credentials in it. Here's a bot config, probably for scraping Twitch, probably not run by one of their own employees. Here's a conf file that doesn't actually specify a password. Here's somebody who's put a variable in for a password, user and pass. That's really secure. OAuth: so this one's going to use OAuth, yeah. I'll look through multiple pages here. When you're looking through the GitHub output you can sort by best match or recently indexed; recently indexed gives you a good view of whether anybody committed something really recently and may have forgotten to pull it out. So I look at both sorting views of this data, and I do this for all of these keywords: password, id_rsa, passwd, and so on. There's a long list of these that I took from another tool; I can't remember the name. (Dan, do you remember the name of that tool? Yeah, some other GitHub dorking tool; it does some of the analysis automatically, but it was built for a different use case.) So I took that and put it into building these links.

Then the scraper and brute forcer finished processing, and I want to make sure these sites exist and aren't just stale references or taken down, right? Because it's scraping, it's taking information off the internet; you don't know if a site has maybe been taken down. So I built my own script to resolve everything. It's not super fancy, but it resolves IPs. So a lot of domains were found. A lot. (What? What is it? Oh, okay, I've got to go.) So anyway, you can do this at home; this is not fancy scripting.
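The resolver really is nothing special; something like this does the job. A minimal sketch, assuming the scraped names landed in subdomains.txt (both filenames are placeholders):

```bash
#!/bin/bash
# Keep only the scraped/brute-forced names that still resolve to an IP.
while read -r sub; do
  ip=$(dig +short "$sub" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1)
  [ -n "$ip" ] && echo "$sub $ip" | tee -a resolved.txt
done < subdomains.txt
```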
I get the list of IPs to analyze, I get the links for the sites, I load them in the browser, and I test them doing web hunting. That's it. Thank you very much.