already has. So at this conference we don't really need to explain what recon is or why it's good to do. So let's start with the question of why even automate. Why is there a talk about automation? Because if you're watching this you obviously like to hack, so why would you step away from doing that hacking manually? I think it's a good question, and my best answer is that you don't actually have to choose between automation and manual hacking. You can have both. More specifically, what I'm saying is you can use automated recon to feed your manual hacking. So while you're asleep or gaming or relaxing, you can still be finding stuff to explore later. So that's the why, and that's why I think automation is cool: it goes hand in hand with manual testing. But this talk is about the how, not really the why.

So one of the things I hear about a lot is breaking down tools or methodologies or security in general into vulnerability assessment or penetration testing or bounty hunting, and drawing strong lines between these. But I actually think those distinctions are a bit arbitrary when it comes to automation, and I prefer to abstract security into questions rather than categories. So I like to break down all my testing into individual, specific, distinct questions. I get this idea from the UNIX philosophy, which talks about making each program do one thing well and expecting the output of one program to become the input of another. So I try to do the same thing with my automation.

On the right you can see a number of questions I might want to ask about a target, and I would say this applies almost every time. I'm a very web-focused tester, so these are obviously very web-focused questions, but it's a giant list, and it includes network security as well. It's a giant list of questions that I pretty much want answered about every target. So my automation is a way of asking and answering those questions for any arbitrary target.

And that brings us here, where we have specific, discrete questions being answered by specific pieces of code. In this case we have two questions: what are the target's subdomains, which is a question captured in check_subdomains.sh, and which of those subdomains are running web servers, which is captured in check_webserver.sh. So that's what I'm going to talk about today: a way to ask lots of different security questions continuously and then do fun things with those answers.

And if you want to see what kinds of questions you might want to ask for a bug bounty, for example, I really recommend you start with a good methodology. I think the best explanations of the power of methodologies come from my best buddy Jason Haddix, who, by the way, is also speaking at this conference. Jason not only talks about the steps in his bug hunter's methodology series, but he shows them visually and maps them out into mind maps. It's really good stuff. I really think he has the best content out there around web and recon methodologies.

So let's look at a few examples of these modules. Let's start with the simple case where you have an external IP range and you want to know what hosts are live in that range. The question there would be something like: for a given IP range, which hosts are live, or which are likely to be, based on the fact that they're serving common services? That question produces a seed file, live-ips.txt, that becomes the input to countless other modules.
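To make that concrete, here's a rough sketch of what a module like that could look like. The filename, port list, and flags here are just illustrative, not my exact code:

```bash
#!/usr/bin/env bash
# check_live.sh -- illustrative sketch only.
# Question: for a given IP range, which hosts are likely live,
# based on them serving common services?
# Usage: ./check_live.sh 203.0.113.0/24
range="$1"

# Probe a handful of common ports; any host with one open counts
# as live. Grepable output (-oG -) makes the IPs easy to extract.
nmap -p 22,25,80,443,8080 --open "$range" -oG - \
  | awk '/Ports:.*open/{print $2}' | sort -u > live-ips.txt
```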
So before I go more into modules, I want to say something real quick about the level of the code in your automation. I think this is likely to come up for anyone who thinks about automation or has actually tried to write their own. I like to think of it as two extremes, with frameworks on one side and completely custom code on the other. Completely custom code is like writing your own port scanner in C or Go or something, so you're interacting fairly directly with the kernel; it's low-level programming, right? And with frameworks, you're calling something like Amass's enumerate function to get a list of IPs, for example. I personally prefer to sit right in the middle of those two and write extremely small, unixy modules that each leverage a lower-level utility. For example, I like to do my ASN and IP range lookups from ipinfo.io. So in my automation, I write small little wrappers for each of those discrete functions rather than using a framework. I think that hybrid approach gives you a great combination of unixy control, with very small things doing very small jobs, without needing to rewrite things like curl or nmap or masscan on your own. So that's the level I like to live at, right there in the middle.

So here's another module that is fundamental to my workflow, which is just getting the full HTML output for a page. Here I'm using a headless Chromium browser via the command line so the request gets accepted by as many web servers and defensive systems as possible, because curl is pretty much denied by a lot of things by now. And once I have that raw HTML, I can do all sorts of stuff with it locally via different modules: getting all the JavaScript files, parsing it to see if the page might be marked as sensitive, looking for artifacts that might indicate the application or tech stack, looking for fields that might be known to be vulnerable to certain injections, etc. I have at least a dozen of these just for parsing HTML. But you have to have the HTML to start with, so this is a good fundamental module, or script, or piece of code to start with.

Another thing you always have to do, so another module here, is flesh out the scope of your target, which often involves pivoting from one TLD to another related TLD. But you don't necessarily even have the name of that domain, right? So if you're looking at Tesla, you're going to find other Tesla-related domains, but there are some Tesla-related domains that don't have Tesla in the domain name itself. One technique I like to use for that is following redirects to the target domain. There can be flaws here; you can have other sites or third parties or whatever that redirect to your target but aren't actually related, so you've got to do other checking to make sure that doesn't happen. But it is generally a fairly low-noise, high-signal method of finding new stuff. And to do this, one of the tools I like to use, again, is ipinfo, and also host.io, which is related. These are both a key part of my entire automation stack because, again, I'm writing small modules that call ipinfo and host.io explicitly to get one particular function and then produce an output from that. They are how I do a lot of the lower-level tasks, like getting ASNs for a company, getting IP ranges from ASNs, etc. Once again, there are a million tools you can do this with.
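Here's a sketch of the kind of tiny ipinfo.io wrapper I mean, and how one module's artifact feeds the next. The script name and filenames are illustrative; the JSON endpoint and its "org" field are ipinfo's public API:

```bash
#!/usr/bin/env bash
# get_asn.sh -- tiny wrapper with one job: given an IP, ask
# ipinfo.io which ASN/org announces it. Illustrative sketch.
# Usage: ./get_asn.sh 8.8.8.8   ->   AS15169 Google LLC
ip="$1"

# The public JSON endpoint returns an "org" field like
# "AS15169 Google LLC". Add ?token=YOUR_KEY for higher limits.
curl -s "https://ipinfo.io/${ip}/json" | jq -r '.org'

# Chaining then looks like: one module's output is the next one's input.
#   while read -r ip; do ./get_asn.sh "$ip"; done < live-ips.txt \
#     | sort -u > asns.txt
```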
But in my opinion, the key to solid unixy automation is having something as low-level and as non-abstracted as possible that you trust, and I just happen to use ipinfo for that. Some people use Hurricane Electric for that, other people use some of the DNS services that have API access; there's a whole bunch of them out there. And this is just for my automation workflows, which is what we're talking about here. If I want to go do some manual exploring and just check some stuff out, I'm extremely partial to Amass for that, especially since they keep adding functionality. It's now an OWASP project, and Jeff Foley, who runs that project, is just fantastic. So I really love Amass; it's my favorite all-around recon tool. I really enjoy it.

Alright, so now you've seen a number of modules, and you get the workflow for creating modules, right? You understand that they have to be small and have discrete output that is then consumed by another discrete piece of functionality. So here's what it starts to look like when you want to answer a complex question: the output of one becomes the input of another. This is a simplified view of a workflow. We're going from a company to its domains, we're taking domains and getting ASNs, we're taking ASNs and getting ranges, and we're ending up with ranges.txt. In actuality, you might have multiple sub-modules that add sources or do cleanup or validation for another module. domains.txt, for example, might have five different modules feeding into it, and maybe one or two cleanup mechanisms that go and prune out noise and make sure no junk is added. But now take that to the nth degree: you can have one piece of output that feeds ten different modules, and all of those modules on the right can in turn feed each other or produce their own outputs. Over the last five years, I've created like 50 of these things for my own use, and the fact that you can just automate them is completely insane. I'm actually in the process of putting some of these on GitHub. I meant to do that before this talk, but I should have some up soon, and they'll be the ones I'm least embarrassed to put out there. I'll probably do some cleanup for a release.

And that brings us to the automation piece, right? There are lots of fancy ways to do automation, but this is all about what you can do on a really cheap Linux box with the tools available. You can just use cron for automation; it comes free with Linux. And you can use it not only to run your modules but also to send you notifications when things are found. You just need to figure out what needs to finish before other things start and wire that all up, and you can use code inside a module to make sure one thing has finished before another starts — interesting stuff that becomes fairly obvious once you start wiring things up.

And finally, once all the modules are running continuously via cron — so you've got these discrete pieces of code producing discrete outputs, they're all wired up using cron, they're running continuously — you can rig them up to notify you when they find something. This is super easy to do via email, Slack, or really anything with an API. I'm really partial to Amazon SES for email: you can send tons of emails with it for like pennies a month, and you can do it all from the command line.
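Here's roughly what that wiring looks like in a crontab. The module names, paths, and times here are assumptions for illustration, not my exact setup:

```bash
# Illustrative crontab: order matters, so discovery jobs are
# scheduled to finish before the modules that consume their output.
0 2 * * *  cd /opt/recon/tesla && ./get_domains.sh
30 2 * * * cd /opt/recon/tesla && ./check_subdomains.sh
0 3 * * *  cd /opt/recon/tesla && ./check_webserver.sh
# Diff today's artifacts against yesterday's and mail anything new
# (e.g., via the AWS CLI's `aws ses send-email`).
30 3 * * * cd /opt/recon/tesla && ./diff_and_notify.sh
```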
And of course, you can even set up your own Slack channel where you monitor your favorite targets or bounties or whatever, and send yourself a Slack message when your automation finds something new.

So another thing to consider is how to collect and maintain and deploy all of this to the internet, right? We have the scripts themselves — the code, the modules, whatever you want to call them. We have them automated via cron, and we're now sending alerts from continuous monitoring via email or Slack or whatever system. Now the question is: okay, how do I build a box that does this? You don't want to have it running on your home system, shooting out of your home connection; that's a bad idea. So the natural way to do this is to build yourself a Linux box somewhere and just start hacking on it, right? You start adding code and scripts, you pull down some libraries and some modules and some third-party or open-source tools, and you link all this stuff up. And that works. But the problem is, once you want to replicate that somewhere else, you have to redo tons of work to make that box identical.

So what I started doing a while back was using Terraform and Ansible, combined with GitHub, to manage all the code and the configs. You have a self-contained directory for a new target that you want to monitor, like Verizon Media or Tesla or whatever program it is. If I think up or hear about a new technique, I make that change in the local copy and just redeploy the box using Terraform and Ansible. Or if I want to monitor a new target, I create a copy of that working config with the seeds for the new target, push it, and it goes out via Terraform and Ansible and builds itself inside DigitalOcean or AWS or wherever you want to build it.

And the crazy thing is, as soon as you press go, within a couple of minutes the box comes up and goes live. And because you have cron configured already, it just starts monitoring, right? It just starts working automatically. You immediately have your email wired up, you immediately have your Slack wired up, and you can set up the variables such that it just starts working with the exact correct names for your new target. It starts finding all the domains, finding all the subdomains, pulling all the websites, testing all the ports, crawling, doing vulnerability analysis on the sites themselves for cross-site scripting and RFIs and all the different techniques you want to use. All the automation just spins up and starts kicking off, and it's all connected to your alerting. So you can literally go in, set up a new target locally in Terraform and Ansible, press go, and within two or three minutes you start getting the alerts. That is incredibly powerful.

So what I love about doing automation this way is that when I hear about a new technique, I don't just say, oh, that's cool, and maybe write it down or mostly forget about it. If it's really cool and I want to remember it, I make a note that I need to turn it into a module, and then I go in and add that module to Ansible in the local config, right?
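So the whole flow for standing up a new target ends up looking something like this. The directory layout, seed file, and playbook names are illustrative assumptions, not my actual repo structure:

```bash
# Rough sketch of spinning up a new monitored target.
cp -r targets/template targets/tesla               # self-contained target dir
echo "tesla.com" > targets/tesla/domains-seed.txt  # seed the new target

terraform -chdir=targets/tesla apply -auto-approve      # create the box
ansible-playbook -i targets/tesla/inventory deploy.yml  # push modules, cron, keys

# A few minutes later the box is live, cron is running the modules,
# and alerts start arriving over email/Slack.
```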
So for example, Jason just posted a thing on Twitter, I think last week or maybe the week before, about crawling CVE Details, the website, looking for URLs, because CVEs are often talking about URLs that are sensitive or dangerous. He was like, why don't I just crawl that and make a list of URLs? And I was like, oh, that's super smart. So it's on my list of things to do to go create a module that does that. I mean, I wouldn't have thought of that — I've thought of similar things, but I didn't think of that — and it's now on my list. So you can go directly from a cool idea that someone else had, that you hadn't thought of, to a module that incorporates that knowledge. It's just really frustrating to me to go to a talk or something, be super excited about all the stuff that you hear, and then two weeks later you're like, do I even remember any of that? Well, with an automation stack, you can convert your knowledge and your learnings into something tangible and repeatable, right?

So everyone you're seeing on this slide here is some combination of a hunter, a tester, and a content creator, and you should absolutely be following their work. These people are putting out really cool tools, they're putting out YouTube videos, they're helping the community, and they're super accessible. You can just ping them on Twitter and be like, hey, what do you think of this technique? Very accessible, very knowledgeable, and producing great stuff for the community. So you should absolutely be following their work, constantly learning from them and anyone else like them in the community. It's just really good to be tapped into people like this.

All right, so that's what I wanted to talk about today. As a quick summary, the biggest takeaway from my approach is to break up your testing into discrete commands. You want to stay as close to a trusted source as possible when you're writing those commands, so avoid abstraction. You want to make sure your output is extremely well named and clean and goes into artifacts that can be used by other modules. You want to chain your commands together into a methodology and schedule them with cron — you can use whatever you want, but cron's there and it's free. And you want to lock your configs into repeatable deployments using Axiom or Terraform and Ansible. Axiom is another really cool tool for this — really, really cool stuff. A guy named Ben made it, and it deploys automatically into DigitalOcean; it's very similar to what I'm doing with Terraform and Ansible. You should definitely check out that project. You want to follow the people I mentioned and stay aware of the newest techniques — they're putting out great stuff, you just need to follow them. Trust me on this, and you'll find other people to follow by following them. And finally, when your automation workflow brings you fruit, go and hack on it manually, for fun or profit or whatever.

This is how you can get a hold of me if you want to chat more about this. There are a number of people thinking about this problem right now, and some really cool frameworks coming out around it as well. So if you want to chat about it, just hit me up. And thanks again to the Red Team Village for having me. We'll see you next time.