All right. Thanks for coming. My name is Johnny Long, in case you're in the wrong spot. The presentation today is called Google Hacking for Penetration Testers. How many of you have seen me speak before? Show of hands? Okay. How many of you have seen me speak on the topic of Google? Okay, cool. Thanks for coming and seeing me speak about Google again. Next year it will be different. It will be Yahoo.

I have about an hour-and-a-half talk here, which I'm told is just not possible, so I'm going to be flying through some stuff early on. The format of the talk is pretty straightforward. I go through some technical mumbo jumbo in the beginning. It's a little bit dry. So for those of you that aren't technical, feel free to nap then, because things get interesting towards the end and we have a nice fireworks show at the end. So those of you that nap, do it now.

Also, after the talk, in the next room, just on the other side of this wall, we're going to do a book signing for the book by the same name, written by some guy named Johnny, that has gotten decent reviews and has lots of pretty pictures. So you might want to check that out. I'll be doing the book signing across the wall.

Also, we're going to be doing a contest. This worked out fairly well for Black Hat, and I expect it to work out here. Cell phone rules: for every ring of a cell phone, it's $5 to a charity of my choice. That usually works. We're going to be doing a live Google hacking contest. For those of you that know a little bit about this topic and have come armed to actually win some prizes in the form of books from Syngress, I have two folks in the back that are helping me out. Josh and Nick, can you put your hands up? Back here, against the wall. If you guys are interested in entering the Google hacking contest, here's what we're looking for: queries that are not already public that cough up sensitive information. It's sort of a gentleman's agreement that they actually work and that they're actually unique.
But here's the deal: to claim your prize, you need to come up in front of all these people and in front of that camera. We will have your email address, and we'll inform all the other contestants of your cheating, lying ways. And we will give them your email address, so be warned. But anyway, seriously, if you guys have some interesting queries that you'd like to send to the guys in the back, we've got lots of prizes that we'd be happy to hook you up with.

All right, so let's dig in. This is the part that I'm going to go short on. How many of you have used Google? Show of hands. Everybody else has to leave, that's it. All right, how many of you have used Google with advanced operators? Show of hands. Good, excellent. So I will go pretty light with this. Advanced operators are pretty straightforward. You basically put them right in with your query, just like your other search terms. The syntax is an operator, followed by a colon, followed by a thing. I'll call it a search term. There's no space before or after the colon.

So let's get an idea of what some of the operators are. This is ripped right from a Syngress book. It's the first time I ever enjoyed copyright infringement. It felt great. Anyway, you've got all the operators down the left-hand side: what the operator is, followed by the purpose of the operator, followed by a couple of things that identify where the operator can be used and whether or not it plays nicely with the other operators. One of the things you'll realize when you start messing with this stuff is that some operators are just plain nasty. They just don't like to be messed around with. So this table sort of helps sort things out.

Here's how advanced operators can work. I'm going to skip this one because this is a bad demonstration. And by the way, I have 209 slides. Does anybody have more than that at DEF CON? Sweet. I have PowerPoint fu. All right, we will get through them all.
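As a minimal sketch of the operator syntax just described (operator, colon, search term, no spaces around the colon), a query string could be assembled like this in Python. The function name `build_query` and the placeholder site `example.com` are illustrative assumptions, not anything from the talk or from Google.

```python
def build_query(terms):
    """Join (operator, value) pairs into a single query string.

    An operator of None means a plain search term with no operator.
    """
    parts = []
    for operator, value in terms:
        # Quote multi-word values so they are searched as a phrase.
        if " " in value:
            value = f'"{value}"'
        # operator:value, with no space on either side of the colon.
        parts.append(f"{operator}:{value}" if operator else value)
    return " ".join(parts)


# example.com is a placeholder, not a site from the talk.
print(build_query([("site", "example.com"), ("filetype", "php"), (None, "login page")]))
```

Operators and plain terms sit side by side in the same string, which is all that "put them right in with your query" amounts to.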
All right, let's use advanced operators to actually find this page. This is a page from my website. And what I'm going to do is I'm actually going to define this page with advanced operators. That's true. My PowerPoint fu skills are gone now. You will be able to get this off my website after the conference. I felt very uncomfortable pushing the 40 megs worth of Google joy up on my server with all of you evil people around. So after the con, you guys can pick it up. I have yo-yo skills, though.

So let's define this page. Here you go: intitle:"I hack stuff". Notice we're using a phrase. We're looking for a phrase in the title. I'm building a Google query here to define the page. Next, filetype:php. This is a PHP page. Then numrange. Numrange got a lot of press last year through no fault of my own. And I still won't describe the evil uses of numrange. But if you haven't figured it out by now, all hope is lost for you. Numrange is absolutely evil.

I know you. I think you're going to need to come up here and drink that if you're bringing it up. Yeah, come on. Who wants Mudge to sit up here? What are you going to do with my DNA? I've already got it elsewhere. None of you heard that. All right, on with the show.

Let's go a little bit farther: intext. Intext is a little bit goofy, because intext sort of does what you think Google searching normally does. It finds stuff in the text of a web page. But the bottom line here is you throw all of this crap together into one big nasty query and you get something that looks like this: one result. You've basically used advanced operators to narrow down your search very efficiently. So there we go, one hit. That's advanced Google searching.

I've gotten a lot of flak about the term Google hacking. So let's show where the line is, okay? Google hacking basically is looking for pages that look something like this. Can anybody tell me what this URL might describe? Thanks much. Okay, I'm hearing e-commerce, right? Shopping cart, yeah.
I hear credit cards, there you go. Well, we'll show you what it describes, but here's the query: inurl:admin inurl:orders filetype:php. It's a nice query. Throw it into Google and you find pages that look like this. It's the back end of an e-commerce site. It's the administrative page for an e-commerce site. So you've got customer orders, you've got order amounts, you've got ship dates, okay? And then on the bottom right you can see we've got payment details, including bank information and credit card information. Okay, this is the heart and soul of Google hacking. It's basically using Google, first of all, to do things that it wasn't necessarily designed to do, and secondly, to find things that aren't necessarily public. All right.

Now, I came up with this interesting term called Googledork, which by my own definition is a foolish or inept person as revealed by Google, all right? The term Googledork has gotten pretty popular despite the fact that Google has not put it on their web page anywhere. But it's become obvious lately that Google has started blocking queries, okay? And I call the blocking of malicious queries the GDDS, for the Googledork Detection System. This is just a blurb. I'm going to fly past this. This is basically legalese so that the white hats in the community don't yell at me for, you know, revealing this thing.

Bottom line is Google's blocking queries. They've never before stepped into this game of messing with your queries. They've been in the ivory, well, I don't want to say ivory tower, it's got negative connotations, but they've never jumped into the security game. The problem that I have with that is that they're not doing it very well. Let's take a look at why. Anybody run into a page like this when they're doing a Google query? Show of hands. Anyone? You all are evil. Everyone look around. People with their hands up are evil, okay?
This screen comes up when you try a malicious query and Google just doesn't like it, okay? So what we're going to do is we're going to actually try to bypass this screen. This is what the query looks like in the form of a URL up at the top. Down the middle, for those of you in the back that can't see, I've basically highlighted the query part. It says inurl:admin.php.

Well, here's uber hacker trick, 0day of the talk, number one, okay? I've got lots of them, all right? Unfortunately, this is like 0day from like '84. Our first bypass technique involves just changing the case of any of the words in the query. So how about admin.PHP? What do you think happens when we fire off any of these queries? Lo and behold, it goes through, okay? Now, I'm not going to pick on Google too much for this one, because about two days after I talked about this in Japan for Black Hat, magically this got fixed, which was cool. I counted that as a personal victory. I was like, all right, well, let's keep going. So that was round one.

Let's talk about round two. Round two is query shifting. Here's an interesting query. This one is for an index.php file that has phpBB2 somewhere in the URL, and they block it as a malicious query. So call this query shifting. Basically, what I'm going to do is use the same query for all intents and purposes, but with different operators. So I'm going to look for phpBB2 in the URL still, a file type of PHP, and the word index. It's essentially the same query, and it goes through. Shift around Google's operators, make the search a little bit different, and it goes through. Query shifting.

This one's my personal favorite. Some of these are getting fixed, like, as we speak, and some of them are still lingering, so your mileage may vary. Anyway, "Powered by phpBB" gets blocked. Now, this is elite. Just wait for it. Check out what you do to get the query to go through.
Okay, putting spaces between the letters in the word "powered" lets the query go through. Now, you would think that this would just not work. Google would take one look at this and go, you're a friggin' idiot. Well, they'd do it nicely, and it would be in pleasing colors that weren't offensive. But normally, they'd fix it for you. Well, Google does actually fix this query for you. They take one look at it, and they're like, what you meant to say was "powered", but hold on, let me check the query to see if it's evil. Not evil. Okay, I'm going to fix it so that I'm searching for what you're looking for, and I'm going to search for "Powered by phpBB". You can tell it's searching for that because down in the actual query results, it highlights the words "Powered by phpBB". Down here. 0day from '84.

All right, here's another example. Same thing, "Powered by phpBB 2", blah, blah, blah. Throw some spaces in there. It also works if you throw pluses between the letters in the URL. That's also elite.

How many of you have seen this message? Show of hands, okay? Those 10 people are really evil, all right? This is the message that you get when you violate Google's terms of service that you've all read. You're laughing at me. I saw a lot of hands. I saw a lot of people in the audience say, yeah, we're Google users. So how many of you Google users have read the terms of service? That's what I thought. Yeah, all right. Well, Google's terms of service says don't do automation, okay? That's the long and short of it. They give you some alternatives. The alternatives sort of suck, but they say don't do automation.

I got this message by actually curling for a Google query. I used curl, there's my query, and I fired it into poo.html, which doesn't mean anything, by the way. I viewed the page, and we get this forbidden message. Now, can anybody tell me exactly what's different between curl and any other web browser that Google might be interested in? Thank you. 0day, here we go.
How about if you throw a user agent into curl and actually try it again? What do you think happens? Yeah, throw a user agent in there. Now, I threw a really nice user agent in there. I did a little bit of research, got out my sniffer. I was feeling all uber. I'm like, ooh, look at the nice user agent from my browser. Copy it into curl. I'm like, yeah, I'm the man. And then I started thinking, you know, I really can't type as well as I used to, and I actually looked at my curl command line. I had this typo in there. I actually had "F irefox", with a space, as my web browser. And then I thought to myself, I wonder what happens if I just throw, you know, like "poo" in there as the user agent? What would happen? Take a guess what happens. It works. All right. So Google's looking at the user agent, but they don't give a flying flip exactly what you put in there. It just has to be in there somewhere, which makes sense. They don't want to block the newest browsers from getting search results. So anyway, that's how you get past some automation.

All right. Question is, why do this? Bottom line is Google's not going to protect you from evil searches. You've got to take the battle into your own hands. All right. Let's keep moving. I'm going to fly through this because this is pretty basic stuff, and it was fluff for the people at Black Hat because they're not as smart as you guys. Just kidding.

All right. Bottom line is there are some interesting things to do if Google queries just don't work out for you. This is an example of a query looking for email addresses. Say we're some spam troll and we're out there looking for email addresses. So we look for @gmail.com. Well, Google throws away the at sign. Okay. The bottom line is there's lots of data out there in Google that can be mined very effectively for all sorts of different things, and it can be mined in such a way as to not send any packets to the target that you're trying to get more information about. We'll talk about anonymity in a minute.
Bottom line is, if Google queries don't work for you, and yeah, this is really elite too, you know: using a regular expression on an HTML file dump of a query page gets you the same result. For example, I use Lynx to actually do the same query, pull the results down, and then I look in that file for Gmail addresses using a regular expression. Okay. Pretty straightforward. It's a little bit hard doing it this way, you know, with Lynx, but if you search more and more pages, you can actually get more results.

Actually, this one was done with a program called EmailMiner from SensePost. How many saw the SensePost talk today? Okay. SensePost has a really cool tool called BiDiBLAH, which they released today, and they do a lot of things to actually do open source information gathering about targets using Google. Basically what I'm doing here is showing you some of what they do behind the scenes. You know, there's nothing terribly technical about it, but it's an interesting way to actually profile a target. So, lots of email addresses. We troll through more results, we get more hits. I threw this in there because it's just funny, and I like ripping stuff out of people's books.

Other places to find email addresses: how about Excel spreadsheets called email.xls? Yeah, I like that one. How about Outlook address books? Yeah, you name it, it's in there. How about registry files that actually have Internet Account Manager entries with POP3 passwords in there? All sorts of places.

Well, the same thing can actually be done to map a network, and this gets a little deeper, but the process of actually mapping out a network might look something like this. You get a list of targets. You find related targets, perhaps outside that domain. Those of you that saw the SensePost talk sort of know where this is going. You expand that list of targets, using a couple of different methods.
We're going to look at a couple of different ways to do this. You verify those targets. You discover more targets, and you do vitality scans to see if they're up. That's sort of how the process works. Let's look at some interesting ways to do this.

First, locating targets. Some quick ways to locate targets. Lots of stuff in there. Your eyes are probably all drawn to the last one, which, yes, does say scraping turds. I don't know where the turds and the poo and all that stuff is coming from. It really is not a statement about me personally. It's just how it came out. Okay, that was a bad way to say that. All right, you get the idea.

All right, a very simple query: site:microsoft.com. Again, we're having a nice little conversation with Google, trying to do recon on microsoft.com. site:microsoft.com lists every page Google has from microsoft.com. All right, pretty straightforward. Problem is, the results suck. If you're trying to acquire targets, the targets are rather limited. Top of the list is www.microsoft.com, which the entire freaking planet knows exists. As an attacker, we don't care about www.microsoft.com. Bottom line is, we want to get rid of it, so we change our query: site:microsoft.com -site:www.microsoft.com. As you can see, when this query comes back, we're not looking at WWWs anymore. We're getting interesting target names.

An attacker sort of scrapes these target names to the side and collects them, which gets to be a little bit tedious. This isn't a great way to do it, because you're going to run into a query cap of 32 search items in a query. So you keep subtracting sites from this list and eventually you're going to run into that limit, but you're getting interesting results: research.microsoft.com, partner.microsoft.com, blah, blah, blah. All right, some other ways to do it. Back to regular expressions.
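Before moving on, the exclusion approach just described (start with a site: query, then subtract each host you have already collected until you hit the term cap) can be sketched roughly as follows. The function name `exclusion_query` is a hypothetical helper, and the cap of 32 terms is simply the figure quoted in the talk; the code only builds the query string and does not contact Google.

```python
MAX_TERMS = 32  # per-query cap as quoted in the talk


def exclusion_query(domain, known_hosts, max_terms=MAX_TERMS):
    """Build 'site:domain -site:host ...' without exceeding the term cap."""
    terms = [f"site:{domain}"]
    for host in known_hosts:
        if len(terms) >= max_terms:
            # No room left: the remaining hosts cannot be excluded in one query.
            break
        terms.append(f"-site:{host}")
    return " ".join(terms)


print(exclusion_query("microsoft.com", ["www.microsoft.com"]))
```

Each round of results adds new host names to `known_hosts`, so the query grows until the cap stops it, which is the limitation the regular-expression approach is meant to work around.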
Perform the query in an automated fashion, slamming all the results of the HTML into a big file, and use regular expressions to go through that file to look for URLs that end in microsoft.com. So you're still using Google to do the actual query, but you're not necessarily limited by Google's search functionality. You can look for more interesting stuff. Now, you can not only do this with the search result page itself, but you can do this with the entire HTML content of the website as well, without throwing a single packet at the target. We'll talk about that next. Blah, blah, blah, you run the thing, you get stuff.

A big so-what slide here. Yeah, I know there are other ways to actually recon targets. The hook here is actually doing recon on the targets without sending packets to them, so that's where we're going to go. Now that you've read all that, next slide. Automated crawling: Google frowns on it, so we could use some other techniques to actually get things done. I think I'm probably going to skip all of this because... blah, blah.

I do want to highlight this. This is a nice tool, again by SensePost, and I love this stuff. BiDiBLAH is a very powerful tool. I encourage you to go to SensePost's site and check it out. But if you're not necessarily interested in all the functionality of BiDiBLAH, and you're more interested in the techniques that BiDiBLAH uses to actually map out a network and you want to expand on that, some of these tools are available from SensePost as well. Bottom line, run this Perl script with a domain name, and it comes back and gives you a list of subdomains, DNS names, and target names using Google queries. Pretty straightforward. Okay.

On to the turd thing. I came up with another term, Google turds, which Google hasn't adopted yet either. I think it's because turds are usually brown, and Google doesn't use brown in any of their color schemes. Not to be too graphic about it, but a Google turd is a Google query that's broken.
It shouldn't return results. It's syntactically incorrect. Can anybody tell me what's wrong with this query? site:nasa. There's no .gov. Right. It's a busted query. Problem is, we got results. That's the heart and soul of a Google turd: a busted query that gets results. These results could be interesting. We could do something with these results. There are lots of ways that these could have shown up, syntax problems on the site when it was crawled, yada, yada, yada, but we'll come back to this in a little bit.

All right. Also, there's a lot to be gotten from relationships between sites that you can actually work out with Google. I'm going to go really light on this, too. But the bottom line is, if you can scrape all of this stuff from Google and get information about microsoft.com, how much interest do you have in finding other sites, other than microsoft.com, that could potentially be targets and are owned by the same entity? It's something you probably want to do.

Well, in order to figure out relationships between sites, we need to have somewhat of a process. For example, if I post a link on my site to microsoft.com, does that mean that there's a relationship between Microsoft and me? No. Anybody can throw up a link to microsoft.com. Now, let's ask it another way. If Microsoft also has a link to me, is there possibly a relationship now? Possibly. In fact, it's fairly likely. This is the heart and soul of link weighting, okay? Weighing links to and from sites to determine possible relationships. Bottom line here is that a two-way link weighs more than a one-way link, and you can use all of this information when you're scraping all this data from Google to figure out relationships. All right? I just said that.

All right, now, the question is, what can we use to determine links to and from a target? The answer for everything in this talk is Google: the link operator, all right? link:www.microsoft.com shows all the sites that link to Microsoft's website, okay?
The problem is the results. For example, MapQuest, second in the list. Does Microsoft have anything to do with MapQuest? I hope not. Do they? I don't know. Well, bottom line is the results are pretty bad. Also, link doesn't play well with other operators, so link is very limited for these purposes, okay? If you try to use link and site together, the query's busted. It doesn't work, all right? So, blah, blah, blah.

Again, SensePost to the rescue. They have all sorts of tools that do stuff like this, and I'm really giving them a lot of press because they deserve it, okay? They've taken a lot of these really cool ideas with Google, and they haven't done DEF CON and Black Hat talks on them necessarily, but they've quietly implemented them in various tools that make sense. One of the tools is BiLE, the Bi-directional Link Extractor. Basically, BiLE automates the process of figuring out link weights.

Let me give you an example. Who is SensePost? You guys probably know, but let's pretend we don't. A Google query doesn't give you much of an answer, okay? But if we throw sensepost.com into BiLE, we find out that there are lots of links to sensepost.com from lots of different servers, and here's where they are. BiLE will also go through and link-weight each of those relationships to figure out which is the strongest, and this is the output that you get. You'll notice sensepost.com is at the top. It's the site most strongly related to SensePost that's out there. The second is blackhat.com, and these are in sorted order. Then we've got Packet Storm Security and securitylab.ru. Bottom line is you can tell pretty quickly that SensePost has something to do with security, and they're involved in the security community. It's all done with Google queries, Google queries and link weighting.

All right, interesting question. Can Google be used to locate hosts that Google doesn't even know about? Sort of a brain teaser. Sure. But we'll need to do some background stuff first.
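The link-weighting idea described above (a two-way link counts for more than a one-way link) can be illustrated with a small Python sketch. This is not BiLE's actual algorithm, which the talk does not spell out; the weights 1.0 and 2.0, the function name `weigh_links`, and the `.example` domains are all made-up assumptions for illustration.

```python
def weigh_links(links_out, links_in, one_way=1.0, two_way=2.0):
    """Score related sites; mutual links get the larger (assumed) weight.

    links_out: sites the target links to.
    links_in:  sites that link to the target.
    """
    scores = {}
    for site in set(links_out) | set(links_in):
        mutual = site in links_out and site in links_in
        scores[site] = two_way if mutual else one_way
    # Strongest relationships first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data, not results from the talk.
outbound = {"a.example", "b.example"}
inbound = {"b.example", "c.example"}
for site, weight in weigh_links(outbound, inbound):
    print(site, weight)
```

In this toy data only `b.example` links in both directions, so it sorts to the top, which mirrors the kind of ranked output shown on the slide.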
I'm skipping that. All right. So we're at the point where we've located some targets. We've seen some interesting ways to locate targets. How about expanding the target list? Here are some ways that you can expand target lists. We'll talk about just a couple of them. If you've got a list of hosts, if you've got a domain, food.com, you can brute force the domain with nslookups. You could use every known word in a word list. You could do all sorts of interrelationship stuff. But that stuff doesn't always work.

You could do fuzzing. For example, if I churn up bhst03.ryzen.com, I could fuzz that and see if bhst02 is there. Same with something like www33. It's basic fuzzing to help us expand our host list. You can also do it with names. Let's say we get these two hosts, apple.food.com and orange.food.com. Can anybody give me some other hosts that might exist? Mango is the first answer. Slick. Caliente is not a fruit, is it? Passion fruit. You were at Black Hat, weren't you? That was the number one answer, passion fruit. It's much more passionate than the ones I came up with, but there you go. Cherry. I came up with mango, watermelon, pear, strawberry, banana. Didn't come up with passion fruit. So you've done some basic expansion, some really basic fuzzing, to figure out more targets that might be out there.

Let's figure out a more interesting way to do it. Anybody here use Google Sets? Good. Google Sets actually takes a couple of words that are related and tells you the next words in the series that might appear, using Google's database. For example, apples and oranges fired into Google Sets gives you a list that looks like this. Here we've got pears and strawberries and blueberries and figs and kiwis. It's a pretty good list, okay? You could fairly easily expand these, but let's try another exercise. We've got investor and foundation as host names. Expand those. Okay, I'll cut to the chase because there's not a lot of answers coming.
It's tough, so let's throw it into Google Sets and see what Google Sets comes up with. Investor and foundation thrown into Google Sets gives us this awesome list of things like board members and business directory and metal roof and owner-occupied. Man, you guys are dumb. You didn't call out any of these. Well, the problem is Google Sets did its best, okay? The bottom line here is, we didn't give it related terms.

So the next question I pose to you is, armed with a list of words, and we're thinking host names here, right, how can we find the ones that are the most related? The answer for everything in this talk is... thank you. All right, here's one way to do it. Take the host names, every host name that you have, and split them up into unique pairs. For example, apple, pickle, orange and fred are host names. Break them up into unique pairs: apple-pickle, apple-orange, apple-fred, blah, blah, blah. And now we Google for each pair to get something that looks like this. Okay, based on the Google results, which of these somewhat related host names are the most closely related? Apple and orange. That pair has the most hits. It's a reasonable assumption.

So you get a massive host list, feed it in pairs through Google queries, and get Google to tell you which ones are the most closely related, and then you throw those two into Google Sets, or you automate this process like we have on the bottom left to get the same results. Bottom line, apple and orange wins. Google figured out which of the words were most closely related.

So, let's pull it together. I'll pull it together with pictures. All right, we get a list of targets. Here's sdsu.edu, list of targets: physics, music, bio, geology. Okay, they're college courses, right? So, let's feed, and this is automated here.
Let's feed all of these words into Google to come up with the two that are the most closely related, and we find out that chemistry and physics are in fact the two most closely related words. What we do next is feed chemistry and physics into Google Sets to get a list of expanded host names, okay, and then stick them onto sdsu.edu. And what we're going to do is send a single nslookup to figure out if these hosts resolve. Again, all automated. And as you see, a lot of the hosts resolve. Okay, pretty nice. We can also go in and use Lynx to see if they're web servers, you can do port scanning, whatever.

Now, the only reason I'm checking to see if these are web servers is because of this very chic and very hard to read table. The table is interesting because it basically sums up what we found: 13 new hosts that we didn't have before. Okay, we found 13 new hosts using this method, seven of which Google knew nothing about. Google couldn't find any of these hosts. Four of those hosts were web servers. So, Google didn't even know about these web servers. So, the answer to the question, can you use Google to find hosts that Google doesn't even know about, is yes. Thank you. All right.

Verifying. I get a little bit into verifying. Bottom line is you can use lots of stuff to verify the targets, but we're going to try to do it in a more stealthy fashion. One more example. Remember the NASA sites, all right? What was missing from the end of them? .gov. There were 74 results in this Google turd query. If I slap .gov onto the end of each of these hosts, how many of them do you think resolve? All of them. Okay, the bottom line is these were posted to the web. They were broken. When we resolved them, we found targets. The point is, there are lots of ways to get interesting host information without sending any packets to the target. All right.

All right, now about zero packet scanning. This is sort of meant as a joke.
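To make the host-expansion pipeline above concrete, here is a minimal Python sketch of its offline parts: splitting words into unique pairs, picking the pair with the most hits, and turning an expanded word list into candidate host names. The hit counts are invented placeholder numbers (the talk gives none), the search and Google Sets steps are left out entirely, `example.edu` is a placeholder domain, and all function names are hypothetical.

```python
import socket
from itertools import combinations


def unique_pairs(words):
    """Every unordered pair of words, each pair listed once."""
    return list(combinations(words, 2))


def most_related(hit_counts):
    """Return the pair with the highest hit count."""
    return max(hit_counts, key=hit_counts.get)


def candidate_hosts(words, domain):
    """Stick each expanded word onto the domain to form a host name."""
    return [f"{word.lower()}.{domain}" for word in words]


def resolves(host):
    """One DNS lookup per candidate; True if the name resolves."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:
        return False


pairs = unique_pairs(["apple", "pickle", "orange", "fred"])
# Placeholder hit counts; a real run would get these from search results.
made_up_hits = dict(zip(pairs, [120, 900, 15, 80, 10, 12]))
print(most_related(made_up_hits))
print(candidate_hosts(["Chemistry", "Physics", "Biology"], "example.edu"))
```

The `resolves` helper is defined but not called here, since it performs a live DNS lookup; in the talk's workflow it would be applied to each candidate to see which names actually exist.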
Zero packet scanning does not mean passive. Okay, I'm not talking about passive here. I'm talking about zero packets from you, the source, to your target. So, let's try to figure out some ways that we can actually scan a target without sending any packets directly to it. It's about having fun with Google, or, remember when security was fun? So, please, no comments about the zero packet thing. All right.

For example, DNS resolution isn't a big deal. Port scanning is. That flags IDS monitors, so we don't want to do that crap. So, let's find some interesting ways to get Google to help us out. Here you go. One Google query finds forms that look like this. This form lets you send an email to somebody, from somebody, with a subject and a body, and it comes from their web server. You do one Google query, you find this form, you fire off email from them. Isn't this fun? One Google query finds sites that'll gladly run the finger command for you. Notice I said that without opening myself to any finger jokes. One Google query finds sites that will actually ping for you. One Google query will find sites that will gladly port scan a target for you. Zero packets from you to the target. You proxy your connection through this site. They're doing the port scanning, plus you're coming in proxied. Fun.

About this one: this one's a little long-winded, so I'm going to blow through it. Another site that lets you do port scanning, limited ports. See the limited ports? Here we go with more 1984 eliteness. What do you think the :81 on the end of that URL refers to? A port. What do you suppose that little red dot that says "port closed" means? Port closed. Let's change the URL to 80, and we get a green dot, which means port open. We write a little script that actually fires through all those URLs to see which image is returned, and we have an automated solution that actually bounces a port scan off of another site.
Zero packets from you to the target, and you get your port scan at their expense. You can even build an HTML page on your local workstation that looks like this, if you really must have green and red dots to break into a site. Some of you do, and that's okay.

How about one Google query to find open proxy servers? Okay, when you run out of proxy servers to bounce all this crap off of, fire off a Google query and find some more. Open reverse Squid proxies. Another CGI proxy: one query to find CGI proxies. Okay, this one's kind of fun. "Please choose URL/encoding", and then they give you a little box, and you put a URL in there, and what do you know? It bounces you to a site, proxy style. Here I am viewing my website through a Chinese government proxy. Oh, did I say Chinese? I meant, I was thinking about Chinese food, and it just came out. Next slide. Not the food, but the word.

All right, here's one Google query that actually does all this stuff for you, finding sites that'll do all this crap: whois, finger, nslookup, host, DNS queries, dig. Again, one Google query. This site will do your bidding. Okay. This one will find NQT. I covered NQT last year. It's got some interesting limited functionality. But bottom line here is Jimmy Neutron actually turned this into a nice little Perl script to bounce scans off of an NQT server so that you can do all this stuff off of NQT. These tools are available from my website if you want to take a look at those.

Okay, I'm skipping this one. This is basically a Java SSH client. You can do all sorts of nasty stuff with this, but it's from your machine. It's just kind of neat, so skip that. Oh, wait, come back.

Is there a way to anonymously web crawl a target without pounding it with packets? Of course. How many of you have ever used a cache link? Keep your hands up. How many of you have ever read the banner at the top of the cached page? Good. I've got to say, the average is better than it was at Black Hat. All right?
Now, for those of you that haven't read this page, shame on you, but let's figure out what happens when you click on a cache link. All right? Here's a tcpdump. We click on a cache link. We make a connection to, in this case, the Phrack website and to Google. Okay? So we've got some stuff going on. Question is, what was that that happened behind the scenes? What is this? It's an image. Okay? So, up in the little header, Google was kind enough to tell you this cached page may reference images which are no longer available. Click here for the cached text only. So, let's click here for the cached text only. Sniff it again. This time the entire conversation is between me and Google. We got the HTML off the site, safely from Google, without touching the Phrack website. Bottom line here is it's very easy to get HTML content without sending any packets to the target. What you do with that content is wide open. Run regular expressions through it. Do whatever you want. Bottom line is you're not touching the target. You're talking to Google. Question is, what happened behind the scenes? Top URL is the cache link. Bottom URL is the cached text only. There's only one difference. It's got &strip=1 on the end. Okay? So, let's see how we would do this in the real world. Right click on a cache link. Save the address to the clipboard. Paste it up to the address bar. Slap &strip=1 onto the end. You're getting the HTML of the web page without sending any packets to the target. Easy, right? No, it's not. You're supposed to say no, it's not. Okay, well, how about a Firefox plugin that actually does this for you? Right click on any URL on any page and you get these nifty little options. Passive cache Google this link. What do you think that does? Okay. Very nice plugin written by Brian Baskin. It's available from our website. Not from Google, don't say Google. Alright, so you right click on the link and it takes you to the passive cache Google image of that page.
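The manual step of slapping the strip parameter onto a cache link can be sketched in a few lines of Python. This is only an illustration of the URL rewrite described above, not the plugin's actual code; the function name `text_only_cache_url` and the example URL are made up for the sketch.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def text_only_cache_url(cache_url: str) -> str:
    """Return the cache link with strip=1 added to its query string.

    strip=1 is the only difference the talk points out between the normal
    cache link and the "cached text only" link.
    """
    parts = urlsplit(cache_url)
    if "strip" in parse_qs(parts.query):
        return cache_url  # already a text-only link, leave it alone
    query = parts.query + "&strip=1" if parts.query else "strip=1"
    return urlunsplit(parts._replace(query=query))


# Hypothetical cache link, used only to show the shape of the rewrite.
print(text_only_cache_url("http://www.google.com/search?q=cache:example.com/"))
```

The rewrite goes through `urlsplit` rather than plain string concatenation so that a link with a fragment on the end still gets the parameter in the right place.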
You don't have to hit the target. The one under that is nice too. Passive cache archive this link. How about pulling the page out of archive.org to see what it looked like in the past? Okay, so two very nice options for getting web pages without touching the target. Right Firefox. So, yes, you can do zero packet recon against the target. And we're flying, we're not letting you read, we're going on. Yeah, I'll talk about this. This is kind of neat. I didn't necessarily invent this thing, but it's kind of a neat situation. Let's say you go to a web page that you find in Google and it's down. You click the cache link and sometimes the cache link's even broken. Sometimes you get that error message. Well, the question is, is there any way we can recreate that web page without, you know, without hurting ourselves, and the answer is cache sliding. Somebody else came up with this, and for the life of me I can't figure out who it was. So, if you know, please tell me, so you don't think I'm making this up as my own uber-eliteness stuff. Alright, so we have this query. We can rebuild the web page just from the snippet. The way you do it is pretty straightforward. What you do is you find the page that you're looking for, you narrow down the page with a query, and you actually throw words from the snippet on the tail end of your query. Okay, it's pretty straightforward. But what you do is you keep sliding your search queries forward so that you can read more and more of the snippet. So, for example, you can see I queried for native's project next to each other, right? The next query, I query for album entitled, and notice it's showing me before and after my search term. I'm recreating the web page without the web page even being there. The cache is gone, the real web page is gone, bottom line is you keep sliding it forward and forward until you recreate the entire web page. Just a quick second about this, user agent spoofing.
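Looking back at cache sliding for a moment, the two moving parts can be sketched in Python: building the next query from the tail of the current snippet, and stitching each new snippet onto what has been recovered so far. This is a sketch under assumptions, not the original author's method. The function names, the two-word window, and the sample text are all hypothetical, and actually running the searches is left out.

```python
from typing import List


def next_slide_query(base_query: str, snippet: str, window: int = 2) -> str:
    """Append the last few snippet words, quoted, to the narrowing query."""
    tail = snippet.split()[-window:]
    return '{} "{}"'.format(base_query, " ".join(tail))


def stitch(recovered: List[str], snippet: str) -> List[str]:
    """Merge a new snippet onto the recovered text using the longest overlap.

    The snippet shows words before and after the search term, so its first
    words normally repeat the last words we already have.
    """
    words = snippet.split()
    for k in range(min(len(recovered), len(words)), 0, -1):
        if recovered[-k:] == words[:k]:
            return recovered + words[k:]
    return recovered + words  # no overlap found, just append


# Made-up text standing in for two successive result snippets.
page = stitch("the quick brown fox jumps".split(), "fox jumps over the lazy dog")
print(" ".join(page))
print(next_slide_query("site:example.com", "the quick brown fox jumps"))
```

Each pass feeds the output of `next_slide_query` into a search, then passes the new snippet to `stitch`, until the snippets stop producing new words.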
It's entirely possible for you to change your user agent to make your browser look different. How many of you have ever changed your user agent to Googlebot? Excellent. Nobody at Black Hat raised their hands. That's pretty cool. Well, surfing the net as Googlebot has some interesting ramifications. For one, lots of badly configured sites, like forums and news sites, actually lock people out depending on what their user agent is. For example, you got this advertising on your site, right? You want Google to see your advertising and all that garbage and be able to troll through your forums so that it can send visitors your way, alright? But you want users to actually sign up for an account. So you block anybody that comes in that doesn't have Googlebot as their user agent. So if you want to view the site, set your user agent to Googlebot and get whatever you want. Now, these people ain't too bright. They'll be the first to admit it. This is not security, okay? I don't think it was meant to be. And a lot of sites do this cloaking a better way, which is you look at the user agent, plus you look at the IP they came from, see if it came from Google. That's a better way to do it. But bottom line is you can have an awful lot of fun with this. Alright, the fireworks show. Welcome to the Google hacking showcase 2005. All you people that are sleeping, wake up now. There you go. Let the games begin. The best and worst of this year. Alright, let's start pretty easy. How about a VNC server? How many of you have used VNC or know what it is? Yeah. You install it on your server. You install a client. You point it at your server. You control the keyboard and the mouse. It's pretty nice. It requires a username. It requires you to know the port that it's listening on. Alright, how about one query that finds real VNC servers that will actually send you down a Java-based client that you can use, and is also nice enough to fill in the host name and the port that you should connect to?
Alright. And one modification to the query and guess what, you can find sites that have no username or password associated with them. One Google query, you're moving the keyboard and the mouse. That being called Google hacking, just go. Alright, thank you. Alright, this one's weak. Next. Print servers. My laptop even thought it was weak. Alright, one query that actually finds Axis print servers. Now print servers are boring little beasts. You know, unless you're FX and those folks that do really ungodly things to printers. But printers get a really bad rap. Now, I'm going to show you some more 0-day here. This is really elite stuff. Press, you should start writing right now. Oh, you're in five minutes? Alright, let's fly. Alright, the zero day here is you actually click on that little oval right there with your thing that's a mouse, right? You click the left button with your finger and you aim it at that little oval that says configuration wizard, and what do you think happens? It orders Chinese. No, not really. You enter configuration management. How about webcams? I hate freaking webcams. These webcams, however, are really interesting. This gives shoulder surfing a new name. This one's nice because it's pointing at big barrels of explosive looking things and it has a laser button there. So you can shoot the things. I'm pretty sure that's what it is. I didn't click it. Alright. Anyway, this is the inside one. No, not really. But it's like eBay or something. I don't know. Very interesting. How about this one? A SpeedStream DSL router. Very nice. Again, here we go with the uber hacker tricks. Please do not print this in the press. What you do is you take the mouse left button again and you click on that oval. It says disconnect. What do you think happens? Bye bye. How about this one? One query finds Belkin routers that are misconfigured. You can lock them out of their own wireless network for their own safety. Alright, how about this one?
Who can guess the system password on this box? That's bad news. This is a query that finds a Windows Small Business Server. Attention: this site is for employee use only. Public access to our website can be found by clicking here. So you're supposed to be there and not here. What you find is pages that let you set up the administrator user. And again, left button, little oval, proceed to log in, create an admin account. Got it. Printers suck unless you can see the print jobs that they're actually printing. How about these? Microsoft Word documents. Some have to do with religion. Some have to do with aphrodisiacs. Problem was it was the same user. Not sure. This is like a distributed denial of service. It lets you do all this stuff to it. You turn on NetWare and AppleTalk at the same time. It's just evil. One query to find firewalls. How about SmoothWall and IPCop boxes that need updating? Snort IDS front ends. How many of you know about that Cisco problem? Yeah, there's a lot of technical stuff into how that works. You can probably do something similar with a query like this to just find open Cisco devices for people that are stupid. Open switches. PHP-Nuke is very nice. How about this one? A query that finds PHP-Nuke sites. There are no administrator accounts yet. Proceed to create them by clicking here. For security reasons, it is best that you create the super user right now by clicking here. Parking lot Nazi cam. You can see the CIO vault over his convertible when he comes into the building from the handicap parking spot. Multiple cameras, multiple camera views, including one called Woody. I didn't click. Timelapse video recorders. That's not the end of my presentation. You guys knew I was out of time, so you went right to the end of my presentation. We are very close to the end. Sorry for the next person. Very sorry. Going back or going back? Okay, let's go to the... Ah, here we go. Sorry, it will be worth the wait. There we go.
All right, I won't press the wrong button again. One query that lets you find... I don't know what this is, but I think that stands for master bedroom. How about this one? One query that lets you find e-power switches, web-controlled sockets in somebody's house, electrical sockets. Google for the default password, and you get these rectangular buttons, which, it's sort of like crypto. You click the rectangle, in this case, that says power and restart. Yeah, all right. How about this one? Lots more things to turn off, including things like on and off buttons. These are ovals, easy to find. Turn off lamps and motion detectors. You can even turn off their media rack, their aquarium, and their electric bong. How about this one? One click to turn off somebody's Christmas lights. It's just rude. Usernames and passwords. How about digital camera? Sprint, T-Mobile. Counter-Strike server configs with clear text passwords. MSN contact lists. How about vCalendar files? MSN CC mail mailbox files. Open SQL servers. No username or password required. Point and click, own the database. SQLite, same deal. Netscape history files, including clear text POP passwords. And the surfing list, which includes IBM and hotchicks.com. These are sysprep.inf files that actually show the clear text admin password and the product ID for Windows that they use to install it. All right, let's go on. Yeah. How about IPsec final encryption keys? VPN user profiles. Now, VPN is very secure, so these are encrypted. Unfortunately, though, these are not. How about Nessus scan output? Explorer Windows. Yes, that does say WinNT System32. Yes, the oval does read delete. Police reports. Sensitive government documents. Yes, sensitive government documents are on the web. How about this one for the FOUO raincoat? The FOUO ensemble. The step-by-step instructions for putting on the coat, which should be classified, I think. And whatever the hell this guy's doing.
I don't know what he's doing, but he's probably smiling, but he's not. Government research project. This is Boba Fett. The silence of the lambs guy. Darth Vader. With the drinking attachment. If Darth Vader's at DEF CON and he wants to drink, he shouldn't have to show his ugly head now, should he? So, yeah, we got all sorts of stuff. How about this one, authorized user only? I'm at the end of the presentation again. Yes, it did say authorized user only. Do not distribute, do not put on web. I am completely out of time, but let's do our last two slides here. For a shirt, can anyone guess what this username and password should be to break in here? Very good. Next one? Next one's a little bit easier. How about? How about? There we go. Sorry, I'm out of time, guys. All right. Yes. How do you break into this site, folks? It says admin equals false in the URL. Yeah, let's change it to admin equals true. And congratulations, you now have...