How many of you are using FOCA right now? How many of you love FOCA? How many of you love the guys who are presenting FOCA? No, no, no. How many of you girls? Well, let's start with the presentation.
First of all, a quick introduction about us. This is Palako, he's a friend of mine from a long time ago. He's working as a software architect at Yahoo. I'm Chema Alonso, I work in a Spanish company called Informática 64, and I've also been a Microsoft MVP in Enterprise Security for five years, but Microsoft has nothing to do with this presentation. It's just life.
So let's start from the beginning for the new people. How many of you don't know anything about FOCA? Okay, for you, a quick introduction about what FOCA is. Well, first a quick introduction about what FOCA is not. FOCA doesn't have anything to do with the Freedom of Choice Act. FOCA doesn't have anything to do with the Fellowship of Orthodox Christians in America. FOCA is not religious, so she doesn't understand. This is a FOCA. That's a FOCA. That's a lovely animal, it's a seal. And, of course, it's true. And you guys have been asking us for the last year: FOCA only runs on Windows, when is it going to run on Linux? But FOCA eats penguins. We cannot port this to Linux. She will eat it. Exactly.
So now, for the people who are new to this talk, a quick introduction: previously on FOCA. This whole thing started in our company, well, in his company. These guys do penetration testing all year long. And one day they saw how important the stuff inside documents, like metadata, is as a first step in a penetration test. There are users, network shares, all the stuff that we're going to see in a minute. So it all started with this tool called MetaExtractor, which then became FOCA. And as you can see it's in Spanish, because in Spanish it's better. So what FOCA does is basically analyze a whole bunch of different file types.
It's going to look inside every possible kind of office document, like DOC, PDF, PPT, and so on. It's going to look inside EPS, images and so on and so on. So we've done more stuff in the second version, but overall it just analyzes all these kinds of documents that have information inside.
So what kind of information? Everybody knows what metadata is. You usually have a properties dialog and there it says the owner of the document or stuff like that. So yeah, that is metadata. That is useful. But there's a whole bunch of stuff in the document itself that, last year or so, we categorized as lost information and hidden information. We're not going to go over that again, but basically all over the document there is information that is useful. You can find who created the document, who has modified the document. So those are users that we might use. There's information about which operating system the document was created on, and with which software. If it was printed, which printer was used? So you can discover if the printer is local or remote. If the printer is remote you maybe have an IP, and you have ACLs, so you know that this computer has access to that IP. There are paths. If the document was stored locally or remotely, and if it was stored remotely then you have the share. So again you have either an IP or a name, with paths. If the document was combined with information from a database, you have the metadata about the database, like the name of the database, the schema, like tables and columns, all that kind of stuff. Well, and so on and so on. So there's a lot of stuff that we've been using for that.
One example is pictures. Pictures have GPS information in them. So it's not only office documents, graphical documents also have information. And of course the GPS information. Yeah.
So let's go with the first demo. This demo is about FOCA as a desktop tool, a tool that you can use to extract metadata, to analyze documents and so on. A quick demo with this.
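To make the "who created it, who modified it" point concrete, here is a minimal sketch of pulling those two fields out of an Office Open XML file (DOCX, XLSX, PPTX) using only the Python standard library. It is not how FOCA itself does it. The helper names `read_core_properties` and `make_sample_docx` are made up for this illustration, and the sample file is built in memory so the example does not depend on any real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(data: bytes) -> dict:
    """Return the creator and last modifier stored in an OOXML file.

    An OOXML document is a ZIP archive; the user names live in
    docProps/core.xml. Missing fields are returned as None.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    wanted = {"creator": "dc:creator", "last_modified_by": "cp:lastModifiedBy"}
    result = {}
    for key, tag in wanted.items():
        node = root.find(tag, NS)
        result[key] = node.text if node is not None else None
    return result


def make_sample_docx(creator: str, modifier: str) -> bytes:
    """Build a tiny stand-in archive with only the metadata part (illustration only)."""
    core = (
        '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s">'
        "<dc:creator>%s</dc:creator>"
        "<cp:lastModifiedBy>%s</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    ) % (NS["cp"], NS["dc"], creator, modifier)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx("alice", "bob")
    print(read_core_properties(sample))
```

For a real file you would pass `open(path, "rb").read()` instead of the sample bytes. Older binary formats (DOC, PPT) and PDF store the same kind of information differently and need their own parsers.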
First of all I'm going to use this version, not this. This one. And if you want to use FOCA as a desktop tool to analyze documents, you only have to drag and drop the file and just right-click to extract the metadata.
The first demo I'm going to do is with a PDF document. This document is the installation guide of a Linux. It's a Linux distribution in Spain called Guadalinex. It's based on Debian. And of course you can see a lot of information about how to install Linux. Linux is good. It's a very good operating system which can be used for a lot of things. And the guide is in Spanish, because Linux in Spanish is better. Exactly. And the only thing that we are going to do is drag and drop the file into FOCA and extract the metadata. And of course you obtain a lot of information about this document. In this example it's quite funny, because Linux is perfect for everything but writing documents, so they are using Microsoft Word on a Windows machine. It's quite nice because this Linux distribution is supposed to be for end users. So it's quite nice.
The second demo is about a picture. With FOCA you cannot download pictures from the internet, but you can drag and drop any picture that you have on your desktop machine, just like this. In this example it's a picture that you could probably use in any of the social networks that you are connected to. Well, as you can see there's a lot of white space in the picture. You can see here, white space, white space, white space. Are you seeing the white space? White space, white space. Well, maybe this can be your profile picture. And of course you can drag and drop the picture into FOCA and just right-click, extract metadata. And with FOCA you can access the EXIF information. In this example it's quite nice because the thumbnail is complete. You can see the foot, the guy, the foot.
And of course the face is reflected on the TV screen. And of course you can see here, no. And of course the GPS information, the place where the picture was taken. Don't do that on your Facebook or Twitter. That can be very bad. So let's go back to the slides. And let's start with more things.
So what we've just seen is that you take one single document and just extract information from it. You can either do funny stuff or you can get in trouble. Like that famous Tony Blair affair with the weapons of mass destruction that weren't there, and all that stuff. That's how the whole metadata thing got famous. But FOCA is not really about that. What FOCA does is, this is yours. So what FOCA does is, instead of taking one document, we're going to take as many documents as we can. So we're going to go to a web search engine. We're going to download as many documents as we can. And we're going to look at this information in all those documents. When you have one single document, you're going to have like one user, one path, one printer. That's not very useful for a penetration test unless you want to do some social engineering. When you have a thousand files, like with the Missile Defense Agency, then you start putting this all together and you get a whole bunch of information. If you take the FBI with almost 5,000 files, you put all this information together, and you saw last year what happened. So you can get a fairly accurate and very complete map of the network.
So what FOCA used to do, as of last year, was go to Google and Bing. We explained why we did this. Google is not very good with file type extensions and was missing a lot of documents. So we're combining the results from Google and Bing, downloading the documents, and extracting all the information that we think is useful. And then what FOCA is really good for is that it takes all that information, clusters everything together, and prints a nice diagram of the network, a user list and so on.
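The "put it all together" step can be pictured with a small sketch. This is only an illustration of the idea of clustering per-document metadata, not FOCA's actual code. The record layout (`users`, `printers`, `paths`) and the helper names `aggregate` and `unc_host` are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Matches the server part of a UNC path such as \\srv2\hp4000.
UNC_HOST = re.compile(r"^\\\\([^\\]+)\\")


def unc_host(path: str):
    """Return the lowercase server name of a UNC path, or None if it is not one."""
    match = UNC_HOST.match(path)
    return match.group(1).lower() if match else None


def aggregate(records):
    """Merge per-document metadata into one counter per kind of value.

    `records` is a list of dicts, one per document, e.g.
    {"users": [...], "printers": [...], "paths": [...]}.
    Server names found in UNC printer or share paths are collected under "servers".
    """
    summary = defaultdict(Counter)
    for record in records:
        for kind, values in record.items():
            summary[kind].update(values)
            for value in values:
                host = unc_host(value)
                if host:
                    summary["servers"][host] += 1
    return summary


if __name__ == "__main__":
    docs = [
        {"users": ["jsmith"], "printers": [r"\\srv2\hp4000"]},
        {"users": ["jsmith", "mdoe"], "paths": [r"\\files01\reports\q1.doc"]},
        {"users": ["mdoe"], "printers": [r"\\SRV2\hp4000"]},
    ]
    result = aggregate(docs)
    print(result["users"].most_common())
    print(sorted(result["servers"]))
```

One document gives you one user and one printer. The counters only become interesting with hundreds of documents, when the same server and user names keep showing up.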
For those that weren't here last year, we're going to see it again in some demos. After last year's demos, we are not going to do any demo with the FBI or with the Missile Defense Agency, just because of some legal issues.
So other than looking at the information that is inside the documents, what we do is this: let's say that there's a server called web1. Then there's probably a server called web2, web3. So we're going to try those. We're going to use Google Sets also, to see, if there's a server called, I don't know, poland.devcom.com, then maybe there's another server called germany.devcom.com. So we're going to be trying all those. And then in the next slide, you can see, I think this is from Novell. Novell? Yeah. Yeah. So, well, there's some Novell information there. These are printer shares from a search in ODF documents. So you can see things like Gamma. So then we're going to try Alpha, Beta, all that stuff. You can see SRV2. We're going to try SRV1, all right?
So this year, we are going to do the demos with the White House, with .gov. Wait a second, wait a second. How many of you were here last year? Can we do this with the White House? Yes. Well, it's a tough task to do it manually, so we did the project beforehand. These are all the documents that we were able to download from the White House. As you can see, there are more than 2,000 documents. As you can see, in the new version of FOCA there is a tree with the different file types. So there are 1,700 PDF files, PPT, PPTX, WordPerfect documents, Excel files, ODD, .ex, and so on. And of course, as you can imagine, they are not cleaning the documents. So just by clicking on any document, you can discover a lot of information. This editing time is scary to me. Also with the presentations, with the slides, indexes, a lot of information. This is the Executive Office of the President, with Microsoft Office, the mail. This is an internal domain which is not public on the internet.
You can only ping this domain if you are inside the network. It's impossible to obtain any information from it from outside. And of course, more data, Excel files, some pictures. I don't know, one of these, maybe with a thumbnail. In this case, it's not naked. And with FOCA, with FOCA 1, you get a list with the user names of the guys who are not cleaning the documents. You can trace the documents from which the data was taken. Just click on it and FOCA takes you to the document. You can analyze the documents and so on. An Excel file with a lot of servers, as you can see. Printers and so on. Well, it's scary. Of course, all the information that you can extract, that you can discover, you can export to a text file and so on.
And in the end, just by clicking on this magic button, which is Analyze Metadata. Oh, it's working again. I'm going to stop. Which is Analyze Metadata. You are going to receive this information, which is the internal map of the network that you can describe with the metadata. As you can see, there are people using Windows machines, XP. Of course, people using Windows Vista. All that is without even pinging the White House. This is just from the metadata in the public documents. And of course, there are 19 servers which are Windows servers. This is one in the internal domain, which is the O3. Probably there are O2 and O4 and so on. This is an internal domain. It's impossible to resolve the IP address and so on. And for any of these servers you can discover which users are working on it remotely, the operating system and so on. That's what FOCA 1 was doing until one week ago.
So let's see what's new in the new version of FOCA. Today we release the new version, FOCA 2, which is bigger than the first one. So it's even bigger. This is basically a summary of what we're going to be talking about for the rest of the session. This is what is new in FOCA. So the main characteristic of FOCA 2 is this: when we were doing FOCA before, it was all about the documents.
It was all about what was inside the documents and all that data. And then we realized that we were missing a really big thing that was right in front of us, which is the URL from where you're downloading the documents. So just by looking at http://, whatever, whatever, and the path and so on, all that URL contains a lot of information that we're going to be examining. So that's the main thing in FOCA 2, and we're going to go over how, just from a URL, we could take this to the next level. We created a recursive algorithm that analyzes that URL, and the URLs that you find in the documents, as you're going to see. We've improved the ways of gathering information. We have a new search engine.
And the other big thing in FOCA 2 is that this tool has been the first step in every penetration test that we've done for the last year. And then we just thought that it was so convenient that we could improve it a bit and get more stuff into the same framework, so that you can continue your work without leaving the platform. So we've started to do some software recognition and some other interesting stuff. And of course, a reporting tool to document what you discover. So in a minute, you're going to see, well, you've already seen more or less what the tool looks like. You've seen that it's pretty easy to use. You're going to learn all the new stuff. And it's just a very quick session. You're going to be able to generate nice reports and do a pretty complete job.
So one of the biggest things that we've done for FOCA 2, in this new version, is that we've added a new search engine. Before, we were using just Google and Bing. In this slide, you can see that even for the White House, if you just use Google, you get 332 documents. If you use just Bing, you get a bit more, 375. Now we have a new guy called Exalead. And this guy is discovering more than 2,000 documents. So we are combining the data from all three sources.
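Combining the three sources mostly comes down to merging URL lists and dropping duplicates. A minimal sketch of that idea follows. The normalization rules here (lowercase scheme and host, drop the fragment) are an assumption made for the example, not a description of what FOCA does, and the example.org URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase the scheme and host and drop the #fragment, so that
    trivially different spellings of the same document compare equal."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, "")
    )


def merge_results(*result_lists):
    """Merge result lists from several search engines, keeping first-seen order."""
    seen = {}
    for results in result_lists:
        for url in results:
            seen.setdefault(normalize(url), url)
    return list(seen)


if __name__ == "__main__":
    google = ["http://Example.org/docs/a.pdf"]
    bing = ["http://example.org/docs/a.pdf#page=2", "http://example.org/docs/b.doc"]
    exalead = ["http://example.org/docs/b.doc", "http://example.org/docs/c.xls"]
    for url in merge_results(google, bing, exalead):
        print(url)
```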
And we have a whole bunch of documents to look at, almost 3,000 for the White House. And with what we've been telling you so far, you understand how important the volume is. So one document, that's fine. With a few thousand documents, we can do a lot of stuff. So these guys are a really nice addition to FOCA.
Another feature that we've added to FOCA is PTR scanning. The idea is that once you are able to discover an internal IP address of an internal network, it's possible to connect to the internal DNS server, change the query type to PTR, and scan the internal network segment. So in this example with my university, they love me. We look for the internal DNS. We change the DNS server to Neptuneo. And we change the query type to PTR. From that moment, we can send queries to that DNS server like this. In this example, as you can see, it's an internal IP address. It's 192.168.46.21. And the internal server is answering us with the name of the server. It's just an IP address. We can get the name server and ask it about the internal IP addresses. And of course, about the whole segment. Now this is included in FOCA 2.5.
And talking about IP addresses, once we have an IP address, we would like to know how many domains are being served from that IP address. So there's a nice feature in Bing where you type ip, a colon, and then the IP. And then you get every virtual domain or alias that is being served from it. So for shared hosting and so on, that's how we do it.
Shodan is something that we really love. I guess most of you, if not all of you, will know Shodan by now. It is a search engine for banners. Like when you go to an HTTP server and you do a GET or a HEAD, you get information about the server. Or an FTP server, or every possible thing that you can imagine, is there. So when you're doing a penetration test and you are in this first stage of gathering information, you don't even have to run a vulnerability scanner against the organization.
You just go to Shodan and they have everything indexed for you. If you're not targeting anything specific, you just go to Shodan and say, I want an IIS version 5, or I want an Apache 2.0, whichever, because you know that there's a vulnerability in it, and then you have it. We started using Shodan in FOCA by parsing the web results. But FOCA was becoming pretty popular and we were increasing the traffic. So we actually contacted John, the guy who wrote Shodan. We basically asked for permission to keep using the service. And not only was he pleased with the idea, but he created a JSON feed for us. So now it's an even nicer integration. And there you can see that we have all the information that we want, like IP addresses and software, all the stuff that we've been talking about.
So what's the idea with FOCA? The idea is that before we download the file and extract the metadata, we can do a lot of things with the URL. In this example, there is a URL with a document in it. As you can see, it's a Microsoft document. But the URL is giving us a lot of information that we can use. So let's use it. First of all, it's HTTP. So it's a server. So let's get the banner and extract information from that server. Then we've got a domain. We can use that domain to discover in which part of the network one of those servers is that we weren't able to place in the network map. So it's good for us to discover all the domains that we can. So domain.com is a domain. Of course, if we've got the domain, we can also get the name server, the mail exchanger, and the information in the SPF record about the IP addresses of internal servers. Of course, we can do the same with the subdomain. We can try to discover more IP addresses of internal servers. And try to verify all the unplaced servers that we discovered with the metadata. So we can try the server name.
If we discover that there is a printer at \\server01\printer1, we can try to discover whether server01 is at server01.domain.com or server01.subdomain.com. Of course, apple1 is a host name. And apple1 is the name of a server. So we can add DNS prediction, looking for apple1 to apple10 and so on. And we can do the same with Google Sets. We can ask Google for a list of words related to apple and try to discover more servers.
And it's not only the information that you can see in the URL. It's also the information that you cannot see. There is an IP address associated with it. So we're going to resolve, reverse resolve, the IP address. We're going to get the banner of the server by using the IP address and not the domain that we were using before. Once we have the IP, we're going to use Bing's IP search to get all the domains that are sharing that server. And we're going to repeat that for every domain that we've just found. We're going to connect to the internal DNS servers with all the information that we've just got, all those IPs, and do all the same things again. PTR scans and everything, on what we've just discovered. And for everything, absolutely everything, that we've just discovered in all those steps, for every new URL that appears, we're going to repeat the algorithm.
Then there's more stuff there: we're going to look for patterns in the URL itself. We've been doing this for the data, so we're going to do it for the URL. Like, you have a possible user name there, so we're going to use that. And of course, we also have paths in the URL. So we've got three paths in the URL on which we can try directory listing. That is good if you are doing a penetration test. Of course, we can search for insecure methods. We can try to discover if PUT, DELETE, or TRACE methods are available on every path. We can fingerprint the web server using 404 error messages.
We can try to fingerprint the server by looking for application error messages. We can try a dictionary attack with a list of names on all of the domains that we discovered. We can try a zone transfer, and we can search for any URL indexed on Google related to this host name and do the same. And of course, then we can download the file, extract the metadata, hidden info and lost data, and sort all this information out.
So the problem with this is that we've just converted it into something massively difficult. It's like, there are 30 steps there that you have to know how to do. There's hardcore hacker stuff here. I mean, I guess the interface of FOCA now is really complicated to do this, right? As you know, I'm a Microsoft guy, so there is a big button in FOCA. You only have to click on it and FOCA does everything for you. That's nice. All right.
So the document is coming from this URL, and we've been analyzing that URL and seeing how powerful that was. So we just thought that every possible URL is going to be interesting. So we've added to FOCA the possibility of just crawling more and more URLs. We added a couple of setup and properties panels, and this is a screenshot of one panel where you can set up a few of the options that you want to use for pretty much everything that we've been describing so far. So like web search and DNS search and all that stuff. And of course the algorithm is going to work by itself with that Search All button. But then there's stuff that might be time-consuming. There's stuff that you know you're not interested in for some reason in this particular penetration test, or even stuff that is illegal in your country, like a zone transfer. So there are these properties panels where you can basically customize everything.
So let's do a demo with the White House. You know we can get in trouble. Yes we can. Well, let's do a quick demo with the White House.
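A few of the URL steps described above (parent domains, directories to try for listing, a possible user name, and the apple1 to apple10 style of host prediction) can be sketched in a few lines. This is a sketch under assumptions, not FOCA's algorithm: the sample URL is invented, the helper names are hypothetical, and treating a `~name` path segment as a user name is just one common convention.

```python
import re
from urllib.parse import urlsplit


def parent_domains(host: str):
    """'apple1.sub.domain.com' -> ['sub.domain.com', 'domain.com']"""
    labels = host.split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels) - 1)]


def path_prefixes(path: str):
    """Directories worth probing for listing: '/a/b/c.doc' -> ['/a/', '/a/b/']"""
    parts = [p for p in path.split("/") if p]
    dirs = parts if path.endswith("/") else parts[:-1]  # drop the file name
    return ["/" + "/".join(dirs[:i]) + "/" for i in range(1, len(dirs) + 1)]


def numbered_siblings(hostname: str, upto: int = 10):
    """'apple1' -> ['apple2', ..., 'apple10']; keeps zero padding ('server01')."""
    match = re.fullmatch(r"(.*?)(\d+)", hostname)
    if not match:
        return []
    stem, digits = match.groups()
    width = len(digits)
    names = (f"{stem}{n:0{width}d}" for n in range(1, upto + 1))
    return [name for name in names if name != hostname]


def analyze(url: str) -> dict:
    """Split one document URL into follow-up targets for later steps."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    segments = [s for s in parts.path.split("/") if s]
    return {
        "scheme": parts.scheme,
        "host": host,
        "domains": parent_domains(host),
        "paths": path_prefixes(parts.path),
        # Assumption: a '~name' segment hints at a user name.
        "usernames": [s[1:] for s in segments if s.startswith("~") and len(s) > 1],
        "host_guesses": numbered_siblings(host.split(".")[0]),
    }


if __name__ == "__main__":
    info = analyze("http://apple1.sub.domain.com/~chema/dir/file.doc")
    for key, value in info.items():
        print(key, value)
```

Each guessed host name would still have to be checked against DNS, and each path against the web server. The recursion in the talk comes from feeding every newly found URL back through the same function.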
This is the girl. This is the demo that we did before with the metadata extracted from the documents. As you can see, all the documents are stored locally. So it's on my laptop. I'm going to delete everything after this talk. And we are going to create a new project with the White House. We create the project and just click this magic button which says Search All. So we only have to click the Search All button. FOCA is now searching for the documents on the internet using Google, Bing, and Exalead. And as you can see there is a big list of links. We didn't even download the documents. As you can see, the documents are still on the web server. But FOCA is analyzing the URLs.
So if we go to this new panel, Network Data, we can see that there is a lot of information that FOCA has been collecting from the URLs. So for instance, we see that the White House is using two IP addresses, and there is another server called MQM. I like it. Which is using two IP addresses. There is a special server called APP2. And of course FOCA has been analyzing a lot of information at the same time. So if we click on this domain, as you can see, there is a bottom panel which is being used to collect all the information, and FOCA is looking for all the paths in the URL. Looking for all the documents related to this web server. Looking for directory listing and looking for insecure methods. This is automatic. So we just click Search All. And in this example, in the White House, in APP2, there are four insecure methods on one path. So you can PUT a file. We can upload a web page, a file, to the server and so on. Well, just click the button. That's all.
So let's stop this project and let's do another one with the Spanish Linux. And we are going to do the same. In this example there are only a few documents, 60 or 70 documents. And this is quite nice because FOCA is working, and at the same time FOCA is analyzing the information from all the documents.
And this is Guadalinex. As you can see, there are directory listings that you can open automatically and search through. There are 33 insecure methods on the server. You can do a lot of things. You can do it manually, but FOCA does it for you. You don't have to worry at all.
And of course, if you want to do this but not only by searching documents, you can go to the DNS search panel. Choose the options and just click this other big button which says Start. So Start, and now FOCA is working. Let's see the options. I would like fingerprinting. Okay. As you can see, FOCA is trying Google Sets right now inside the algorithm. And if you have a look at the log you can see what FOCA is doing. Basically FOCA is doing all the things that we were presenting five minutes ago. It's looking at Bing, it's trying to resolve all the internal IP addresses with the PTR scanner. It's using Shodan, it's resolving. FOCA is working. FOCA is working. And that's all.
And you've got three different views. One for the network, not only the internal one. Another one only for the domains. Another one for the IP addresses. And once you finish, in this example, this is a hosting server. As you can see, it has a lot of domains. We can skip this IP address because I don't want to spend more time on it. So just click this other big button, Skip to Next IP. FOCA will change to another one. And so on. So once you finish you can try the software recognition, which is the same. Fingerprinting HTTP, fingerprinting SMTP and mail servers, fingerprinting using Shodan. And technology recognition, looking for ASP files, PHP files, JSP files and so on. With this third big button. So go ahead. This is yours.
Another tool that has been added to FOCA is DNS cache snooping. The idea is just to discover where the internal users are browsing to. So what FOCA does is just connect to the internal DNS server and set the query to non-recursive.
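A non-recursive query is just an ordinary DNS query with the "recursion desired" (RD) bit in the header cleared. The sketch below builds such a query by hand with the standard library and reads the answer count from the reply. It is a minimal illustration of the technique, not FOCA's implementation. The function names are hypothetical, and an answer can also mean the server is authoritative for the name rather than holding it in cache.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, recursion_desired: bool = False,
                query_id: int = 0x1234) -> bytes:
    """Build a DNS query packet (qtype 1 = A record).

    With recursion_desired=False the RD flag is 0, so a caching server
    should answer only from what it already knows.
    """
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: id, flags, 1 question, 0 answers, 0 authority, 0 additional.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def is_cached(response: bytes) -> bool:
    """True if the reply has no error code and at least one answer record."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    return rcode == 0 and ancount > 0


def snoop(server: str, name: str, timeout: float = 2.0) -> bool:
    """Send one non-recursive query over UDP and report whether it was answered."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _addr = sock.recvfrom(512)
    return is_cached(data)


if __name__ == "__main__":
    packet = build_query("example.com")
    print(packet.hex())
```

Calling `snoop(dns_server_ip, domain)` over a list of domains, and repeating it over time, gives the kind of picture shown in the demo. Only the packet-building part runs without a network.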
That means that this DNS server is going to respond only if the domain name is in its cache. So we are going to try a big list of domains, and we are able to know whether each domain is in the cache or not. So you can do it all day long. And in the end you are able to discover what the users are doing in the company.
So let's do a quick demo. A new project. And it's here: Tools, DNS Snooping. And in this case we are going to use the Spanish national train service. Because Spanish is better. So just the domain name. Click on the DNS server. Just click on the server. Just load the file, in this example a big list, and Snoop DNS. So in Spain right now it should be, oh no, FOCA. FOCA knows it's not working. Let's do it again. Don't worry. There are a lot of demos. Don't worry. Tools, DNS Snooping, A. Domain, renfe.es. Obtain DNS server. Just load the file and click on Snoop DNS. And oh, here it is. These are the places where the people are browsing right now. Facebook of course, because it's the most important thing in their work. And it's 3 a.m. They are using Microsoft score. They are reading the newspapers, Spanish papers. They are moving money from one bank to another. Banking and so on. And you can use a monitor service that is going to repeat the search every 60 seconds. So you can monitor what they are doing internally. So that's not bad.
And another option that we've added to FOCA is a reporting module. The idea is that you've got a lot of information, but you want this information to be in a document. So now we are working on this. This is a module in which you only have three different reports. The first one is about documents. The second one is about metadata. And the third one is about the domains. And you only have to select the data in which you are interested. Then you can select different kinds of graphics and you obtain this very nice report. I've got one here about the White House, which has 478 pages with a lot of information.
No, I don't want to help Adobe. Well, and this is how it looks for every document. You've got a PDF, a DOC, an Excel file, a CSV, and so on. And at the end of the document, you've got the graphics. Very nice graphics. Graphics, that's all. So I want to do the last demo. A big one. Not that one. Not that one. A big one. So not now. A couple of slides from you, and then let me do the last demo.
So there's a FOCA online tool, just to try it out if you don't want to download it. But it's only going to allow you to try it with one document. So you're not going to be able to do the whole network map, it's just extracting the information from one document. But it's online, so you can try.
And this is one of the ways you can protect your organization from a tool like FOCA. This is a plugin that you install in Internet Information Server. And what this plugin is going to do is, every time that your server is serving a document, it's going to clean it. So on the hard drive of the server, or wherever you're storing it, like SharePoint or something, the document is going to remain intact. So it's not modifying the document, but it's cleaning it on the fly. So whatever you are obtaining is clean. There are a couple of options. You can either just clean everything, so there will be no metadata there. Or you can use a template, because you want all the documents in your organization to look like they were made with the same software or by the same user, stuff like that.
And if you want to clean individual documents, there are a couple of tools to do that. Last year we saw how the tools that existed before didn't really work very well. So we created a couple of them. OOMetaExtractor is one of them. So with this tool, you can clean all the information, all this metadata that is in OpenOffice documents. It's kind of hard because you have to do it manually, one by one. So there's the tool.
Well, and before the last demo, some announcements. We are going to be in room 113 for questions, not complaints, only questions. And tomorrow at seven o'clock in the evening, in this room, we are going to deliver another talk about connection string attacks, about how to get into a control panel to manage the databases and so on.
And I'm going to do a demo with the US Army with the new FOCA. It can be. No, no. It's not illegal. It is, actually. No, no, no. It's not illegal. So this is the free version of FOCA, a new project. Just army.mil. Is this the domain? Yeah. Create, and just Search All, and FOCA starts working, searching for documents. And as you can see, we've got the information about the network, which is, oh, they've got a lot of servers. And of course, they've got, hey, they've got insecure methods. Can I do a demo with them? No, no. They've got the TRACE method and so on. Well, we could be here until tomorrow, but no, we have to go to the parties.