Well, so it's, um, it's gonna be just 2.9 right now, and originally we were gonna have another person introduce Sam, but he's a little bit camera shy, so I guess I've got to do it. So I guess the best way I can introduce our next speaker is: Council of 9 black badge winner Sam Erb.

Thank you. Thank you. Hi, my name is Sam. Today I'll be presenting on hunting certificates and servers, so let's get started. Our team, Council of 9, has won two black badges through the badge challenge. As was mentioned, I'm also a software engineer at Akamai. One thing I have to be very clear about here: the opinions expressed here are my own, no one else's. And be careful when you connect to any host online, and always seek explicit permission before attempting to exploit, etc.

So there are three parts to this. Part one is scanning the internet. This is actually a DEF CON story. As part of the badge challenge, we noticed that the person who ran the badge challenge every year, 1o57, would use the same TLDs over and over again. In particular, he was very fond of the .codes TLD. Unfortunately, there are not a lot of .codes domain names, so we picked up on this and started monitoring for any new .codes registration online, as well as scanning the internet for hosts that contained a .codes domain. We'd also then search for the relevant WHOIS records as well.

This was a pretty closely held secret on our team. There were about 10 people or so that knew this at the time, and I'm really only sharing it now simply because the contest is largely over in its current form anyway.

So our scanners actually worked. They picked up one domain. We got a Slack notification that great.codes was registered, and it contained LosT's handle, therefore it must be him, obviously. It contained a series of puzzles.
So we visited the website. We spent about 72 hours not really sleeping, solving the puzzles, thinking it was a registration for a contest or something similar. We sent through the solution on the final step, this was November 2nd, 2016, and LosT replied back, essentially, "I don't know what you're talking about, and it appears you've been trolled." So I sent great.codes a message on Twitter asking who he was and what he was doing. It turns out we weren't the only ones who got trolled: all of the other teams who were competing against us at DEF CON that year also got trolled. And yeah, it actually turned out it was one of the 10 people that we had told about this, who trolled the rest of us using our own scanner.

So that was kind of my first introduction to scanning the internet and doing large-scale reconnaissance in the IPv4 space, and that really got me thinking about what else I could find and search for on the public internet. One thing I'm very familiar with is TLS certificates, and as you might know, TLS certificates contain hostnames.

So before I talk about scanning the internet for TLS certificates, I have to talk about TLS, and this is going to be really basic. I'm sorry if you're a TLS expert. In the TLS handshake, in the ClientHello, you send over what you're connecting to, or who you're asking for, in the Server Name Indication (SNI). Then the server returns a certificate back to you, and then there's a handshake that occurs that derives the secret that's used to communicate. So that's kind of a high-level view of how TLS works. If you go to any HTTPS website today, this is what your browser will be doing.

So I want the hostnames that are found in that second step there, and I wanted an efficient way to get these hostnames at scale, ideally from as small a server as possible. I'm not operating with any budget here.
I'm running a VPS with a single CPU core, so I cut off the rest of the connection. When I get a certificate back from the server, I just immediately kill the connection and don't continue along with the heavy crypto that's involved in doing the rest of the handshake. This isn't really anything crazy; it's entirely by the spec. You're allowed to exit out of the handshake at any point if something goes wrong. So by doing this, I'm able to save CPU costs on my server and connect to many more servers as cheaply as possible.

So what's in an X.509 certificate? Just to give you an example, this is one for google.com, and as you can see, it contains many, many hostnames. This is fairly common. You'll see either a wildcard or just a list of domain names.

So, as I kind of mentioned previously, I scan the internet for these. I set up a scanner, and for every IPv4 host I did two things. First, I ran masscan over it, which just checks whether or not port 443 is open. I do that because nothing I write will be faster than masscan, so this is a good first filter to prove that 443 is open. I then wrote a Golang program that sends a TLS ClientHello, gets the server certificate back, and then just immediately disconnects. The pseudocode roughly looks like this. There are a few parsing steps in between, but this is essentially all that I run. I had to modify the Golang TLS stack to get this early return, because it's not really a standard thing to do. It was actually surprisingly easy.
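The slide pseudocode is not captured in this transcript. As a rough stand-in, the following Go sketch shows one way to approximate the "grab the certificate, then disconnect" step with the stock `crypto/tls` package instead of a modified TLS stack, so it is not the speaker's implementation. It relies on the `VerifyPeerCertificate` callback returning an error to stop the handshake once the certificate has been seen. The names `grabNames`, `demo`, and `errStop` are made up for illustration, and the demo only talks to a local `httptest` server so that nothing on the internet is touched.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net"
	"net/http/httptest"
	"time"
)

// errStop is a hypothetical sentinel used only to abort the handshake.
var errStop = errors.New("certificate captured, stopping handshake")

// grabNames connects to addr ("ip:port"), sends a ClientHello without SNI,
// and returns the DNS names found in the leaf certificate. The handshake is
// never completed: the callback returns an error once the cert is parsed.
func grabNames(addr string, timeout time.Duration) ([]string, error) {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	conn.SetDeadline(time.Now().Add(timeout))

	var names []string
	captured := false
	cfg := &tls.Config{
		// No ServerName is set, so no SNI is sent. Verification is skipped
		// because the goal is to read the certificate, not to trust it.
		InsecureSkipVerify: true,
		VerifyPeerCertificate: func(rawCerts [][]byte, _ [][]*x509.Certificate) error {
			if len(rawCerts) == 0 {
				return errors.New("server sent no certificate")
			}
			leaf, err := x509.ParseCertificate(rawCerts[0])
			if err != nil {
				return err // Go's parser is strict; odd certs fail here
			}
			names = append(names, leaf.DNSNames...)
			captured = true
			return errStop // abort instead of finishing the handshake
		},
	}

	hsErr := tls.Client(conn, cfg).Handshake()
	if captured {
		return names, nil // the handshake error is expected in this case
	}
	if hsErr == nil {
		hsErr = errors.New("handshake finished without a certificate")
	}
	return nil, hsErr
}

// demo runs grabNames against a throwaway local TLS server.
func demo() ([]string, error) {
	srv := httptest.NewTLSServer(nil)
	defer srv.Close()
	return grabNames(srv.Listener.Addr().String(), 2*time.Second)
}

func main() {
	names, err := demo()
	fmt.Println(names, err)
}
```

The sketch checks a `captured` flag rather than comparing the returned handshake error, so it does not depend on how the library wraps callback errors. How much work the early abort saves likely depends on the negotiated TLS version, and the local test server will log a handshake error because the client hangs up on purpose.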
I really love working with the Golang TLS stack. It's much, much easier than working with OpenSSL, if you're familiar with OpenSSL.

So this is what I set up and ran, and it took about 72 hours or so. I got about 12 gigabytes of data back, which was hostname and IP address combinations. I then asked myself some really basic questions, like: am I finding every host? This scanner identified 51 million hosts online in the IPv4 space. A 2015 paper that I found identified 42 million. One really interesting thing is that Shodan only finds 42 million today. I don't understand the discrepancy between my scanner and Shodan. I suspect that Shodan has been blocked from parts of the internet, but I don't have any data to back that up.

And I have to ask myself: am I finding every certificate online? The answer really is no. In that ClientHello, I could specify an SNI. It's a common server configuration that the SNI in the TLS ClientHello is used to differentiate clients, and based on what you pass in there, the server will return a different certificate back to you. So this scanner will miss all of those.

Another thing I have to point out is that the Golang X.509 parser is very strict. If your server was doing something weird with certificates, or really just had any sort of non-standard X.509 cert, it likely will not pass through Golang's X.509 parser. There's actually a blog post on how to parse malformed certificates there, if you're interested.

So I also have to ask who else is doing this, because none of this is really new or revolutionary, in my opinion.
So I set up a TLS server, I just captured traffic, and I got the PCAP back. I actually found three servers that did the exact same thing that I'm doing. Two of them were from universities, which were scanning the internet for their own research purposes, and one of them was from a hosting provider in Germany. So clearly there's somebody in Germany who's doing the exact same thing and simply not publishing it in any paper that I found. I believe that they're doing the same thing as I am because of the way that they closed the connections, and because there was no SNI sent in the connections they made to me. So clearly they weren't trying to connect to a host. They were simply trying to scan an IP range, or connect to my IP address, to just get back the certificate that I would return.

So I now have 12 gigabytes of hostname and IP address combinations, which is a lot of data, and 12 gigabytes of text is not the most friendly form to work with. So this is part two, which is how to search large DNS data sets.

This actually came about because I wanted to use the Rapid7 data sets. Rapid7 publishes DNS data online for FDNS and RDNS requests, that's forward and reverse DNS requests, and you can just go to their website and download it. It's a great resource. I love it. It's just really hard to work with, because it's 10 gigabytes of compressed text, which expands out to about a hundred gigabytes of uncompressed text. So this always took a long time to search. Every time I wanted to search through it, it took about 20 minutes, which is a little bit insane. So I found myself trying to write, or trying to edit the script, I should say, in better or faster ways to decompress and grep, or to use more disk space to grep faster, and this obviously just took a long time. I actually wrote a blog post about this, which is linked there, back in February. So, in order to sort the data,
I took advantage of the DNS structure. DNS is structured such that when you make a DNS request for, in this case, blog.erbbysam.com, you actually first go to the root, which is roughly ".", and then you go to com, and then you go to com.erbbysam, and then you go to com.erbbysam.blog. So you can take advantage of this in order to sort the data. I reversed every string, roughly, and sorted it. At this point, because the data is sorted, in order to find a hostname that I'm looking for in this data set I simply have to binary search it, which is an order of magnitude faster than grepping through the files, if you're familiar with how binary searches work. I talk about this in much more detail in my blog, if you're interested in the technical details, and there's code online showing how to do this.

This is actually something that's hosted online as well. I put this online using a Golang web server. It's available today at dns.bufferover.run/dns, and you get back data from the Rapid7 data sets. I use this myself. It's a great way to just quickly grep through them if you're searching for something. I also posted the runtime on there. The runtime is usually a fraction of a second to binary search this data. And I also linked to my GitHub account here, which has the code to convert the data into a searchable format that's usable by the server.

So one day I woke up and I checked the traffic for this website, and I saw this, and I thought, man, I'm a really good blogger. But what had actually happened was that I got pulled into something called Amass, which is a utility for hostname reconnaissance that's published by OWASP. This was completely unknown to me. Luckily, the server held up.
I'm a little bit proud of that, personally. But that was the source of all this data, and the fact that it's so heavily cached means that everyone's searching for the same hostnames, or they're repeatedly searching for the same hostnames over and over again, which I found really interesting. I actually don't log data on this. I don't actually know what everyone's searching for, and frankly I don't really want to.

So now that I have a good data set and a good way to search it, I want to put it all together. Similar to the DNS records with Rapid7, I hosted this online at tls.bufferover.run/dns. This is actually online today, and you're welcome to try it out. It's literally just a combination of the data set that my TLS scanner picked up and the server behind the Rapid7 data set. This is actually fully automated at this point, so it will refresh once a week.

So I then have to compare myself to what else is out there. I'm mostly just curious to see if this is even necessary. shodan.io should contain similar results. It's not free. I found that my TLS scanner tends to pick up some more results than Shodan, which I found interesting, and that comes back to my previous point that I think Shodan is being blocked by large parts of the internet, but I don't have any data to really prove that. Certificate transparency monitors are awesome, but they only contain publicly trusted certs, and they don't link back to where the cert came from. Rapid7 actually has a TLS data set, but it's only the new certificates they encounter in their scans every week. They don't have historical data unless you have an account with them. And there are many others. The OWASP Amass tool has a great list of existing resources online.

So, just to give a demo here. Obviously, hack yourself first: plug in your company's name.
See what you find. I find that really interesting. Also, when I build these tools, one of the first things I do is run them against .mil and report whatever I find back to the DoD vulnerability disclosure program. So if you're interested in finding vulnerabilities, go look at the 473,000 results here, find a hostname, try and exploit it, and report what you find.

So I did that, and I found a WebLogic remote code execution, from a 2017 exploit, that was still online, and I was able to exploit it and report it. I do this simply to test my tools, but at the same time, because the military has such a variation of technology online, since it's all subcontracted out, it's not all PHP, it's not all Java, your tools will likely pick up something interesting. Actually, one interesting thing about this is that they blocked outbound traffic, which would likely make automated scanners fail here. So I was able to demonstrate this by injecting a sleep of 12 seconds, and the result then took 12 seconds to return back to me, demonstrating RCE.

So yeah, hack yourself first, hack your military first. Thank you. I guess, questions?

I use Linode. With Linode, if you are doing something for security research, you can get a security researcher designation on your account, which means that as long as you follow their rules and you have a page that links to what your tool is doing when it connects to other hosts, they'll have an automated reply to anybody filing abuse complaints against you.

No, it's only... it matches based on a reverse of the DNS name. So you could do, like, .mil or army.mil, and it would return you everything underneath that zone.

Any other questions? Thank you. I hope this was interesting.
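The reversed-and-sorted lookup described in part two, and the "everything underneath that zone" matching from the last answer, can be sketched in a few lines of Go. This is a minimal in-memory illustration under stated assumptions, not the code behind the hosted service: it assumes hostnames are reversed label by label ("blog.example.com" becomes "com.example.blog"), and the names `reverseLabels`, `buildIndex`, and `underZone` are hypothetical. A real data set of this size would more likely be binary searched in a sorted file on disk.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reverseLabels turns "blog.example.com" into "com.example.blog".
// Leading and trailing dots are dropped, so ".mil" becomes "mil".
func reverseLabels(host string) string {
	parts := strings.Split(strings.Trim(strings.ToLower(host), "."), ".")
	for i, j := 0, len(parts)-1; i < j; i, j = i+1, j-1 {
		parts[i], parts[j] = parts[j], parts[i]
	}
	return strings.Join(parts, ".")
}

// buildIndex reverses every hostname and sorts the result, so that all
// names under one zone end up next to each other.
func buildIndex(hosts []string) []string {
	idx := make([]string, len(hosts))
	for i, h := range hosts {
		idx[i] = reverseLabels(h)
	}
	sort.Strings(idx)
	return idx
}

// underZone returns the zone itself (if present) and every name below it.
// Both lookups are binary searches over the sorted index.
func underZone(idx []string, zone string) []string {
	key := reverseLabels(zone)
	var out []string

	// Exact match for the zone apex, e.g. "mil.army".
	if i := sort.SearchStrings(idx, key); i < len(idx) && idx[i] == key {
		out = append(out, reverseLabels(idx[i]))
	}

	// Contiguous run of names starting with "mil.army.". Searching for the
	// prefix with the trailing dot skips neighbors such as "mil.army-lab",
	// which sort between the apex and its children.
	prefix := key + "."
	for j := sort.SearchStrings(idx, prefix); j < len(idx) && strings.HasPrefix(idx[j], prefix); j++ {
		out = append(out, reverseLabels(idx[j]))
	}
	return out
}

func main() {
	idx := buildIndex([]string{
		"www.army.mil", "army.mil", "navy.mil",
		"army-lab.mil", "blog.example.com", "mail.www.army.mil",
	})
	fmt.Println(underZone(idx, "army.mil"))
	fmt.Println(underZone(idx, ".mil"))
}
```

Searching for the prefix with its trailing dot is a deliberate choice: a hyphen sorts before a dot, so a name like army-lab.mil would otherwise sit in the middle of the army.mil results and cut a naive scan short.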