All right, I'll go ahead and get started. One of the problems I run into a lot in OSINT is one of the most basic ones: finding some string, some piece of information, in a huge pile of data. It seems really simple. It seems like it shouldn't be a problem. And yet I keep running into it again and again. I found this tool, and I want to add it to all of your tool belts. It's a very simple one, but it's a very effective one, for me at least. It's called Recoll.
The short version of this talk is that there's a dude, that guy, who made a really cool thing called Recoll. I bumped into it one day, and I'll tell the story of that. I think it's really interesting. It's really awesome. It was really useful for me, and I want the rest of you to be able to use it as well. So again, one of the big problems that I keep running into is that we have all kinds of information, too much information, and I have a heck of a time digging through it. I ran into this quote while looking for something interesting to explain how it was and figured it was just relevant enough.
So one of the things that happened recently was I submitted a talk to Source Boston, where I was going to dig through a bunch of the NSA's leaked documents, look at a lot of the programs they had, and compare them to what a normal person could really do if they had a few million dollars and a lot more programmers lying around. Long story short, the NSA's stuff is not magic. It's an interesting talk. I had a good time. But I started by looking at this page of all the EFF's documents and going, oh man, now I've got to write a talk. It's going to be a giant pain in the neck. So I started by writing a little Python script to pull down all of the EFF's released documents, and ended up with this: nearly a gigabyte of PDFs. Not really searchable. They weren't really OCR'd. And basically, this is what happened to me at that point.
I've got a talk to deliver in a couple of weeks, and I'm looking at something like 882 megabytes of PDFs and it's just blank face, fear, sweat. It was terrible. Until I bumped into this. This was essentially the thing that saved my ass on that talk, and I think it can be really useful for you guys as well. It's essentially a desktop full-text search tool, like it says there. But it can do a heck of a lot more that they don't tell you right away. It's a huge understatement to say that all it does is full-text search on the desktop.
So to get started, the basic steps are really simple. You install and open Recoll. You look at it. You think, wow, that is so many buttons. You look at the user manual. You realize the user manual is also huge and ignore it. Then you realize there are too many buttons to figure out how it works on your own, and you end up reading the user manual anyway. I'm here so that you don't have to do that, because it turns out there's only one really quick thing you have to do to get started. From there, it's all nice.
To get started, you actually need an index. They don't really tell you this, and it took a long time for me to figure out how the heck to get the PDFs in. Essentially, what you've got to do is set up a place that Recoll looks at and builds what it calls an index. The index is Recoll gathering metadata ahead of time so it can search that quickly and then direct you to the documents you need, much faster than if it had to search through an actual gigabyte of PDFs every time. What this looks like is here. If you go to the preferences tab and into the top directories, it'll drop you right into a place where you can toss in whatever directories you want. And the cool thing is that you can put almost anything in those directories and Recoll will eat it up and let you search through it. In my case, it was a ton of PDFs. But in your case, it can probably be zip files. It will transparently open those. It can be mail.
It can be PDFs, as before. And later on we'll even get to how you can save web pages and search through those as well. So the short version of this: if you're interested in looking at the talk, you can check that out. The wrap is unfortunate. But if you point it at about a gigabyte of PDFs and then have it search through them, this is Recoll itself. Again, you go to the preferences tab and set up the index here. I had it pre-indexed because it takes a couple of minutes, so it's already set up and ready to roll. The short version is you've got a gigabyte of PDFs and you're not sure how to search through them. Instead, you can just put PRISM in, and here they are. The really interesting thing is that you can go ahead and open them, and it will have already set up the search for you in here. So as you poke through, it can just drop you to whichever one.
Again, this isn't a revolutionary thing. This is not something that's going to ridiculously change lives. You're searching. But boy, did it save me a lot of time when I had nearly a gigabyte of PDFs to go through. And that carries over to other things as well. If you're trying to search through mail, if you're trying to search through zip files, whatever you're looking for, this can basically eat it up and spit out search results for you.
So that's sort of the basic functionality, right? That's the really simple case. You've got some PDFs, you need to search through them. But there's a heck of a lot built on top of this that I think is really interesting. First off, there's actually a Firefox add-on, built by the same person, that lets you save the pages you visit in Firefox and then search through those full text as well.
So if you've looked at something like Hunchly, this is basically the exact same thing, except it requires a lot more fiddling to work. But it's free, so it's sort of up to you whether you're looking to spend money for a better experience or to use something open source. This is actually incredibly useful if you're digging through a lot of web pages and then later on down the line you're like, oh yeah, I think there was some mention of this at some point, like a few days ago, but maybe it's not there anymore. Maybe it's been taken down since, who knows. This will save a local copy of all of them and index them automatically so you can search through them. It's really interesting.
The setup for this is kind of a pain in the neck. It took me a while to figure it out, but once you figure it out, you're like, oh, oh, okay. All right, that was it. Essentially, it's not really documented, but that is where the Firefox plugin saves all its stuff to. Once you dig through and find that directory, you basically add it to the index the same way as you did before. But then it will still not work, and you'll be confused until you do the last thing, which is to go to preferences, set up the web history, and have it cache the web history there as well. I'll put up a blog post with these instructions, or you can take a picture now, because it's kind of a pain in the neck to dig through and find this on your own. You've seen the index before, but this is the new thing: you've got to go over to the web history, add in that same directory there, and set some space for it to store stuff.
Then the way to actually save pages is really interesting. You can index pages manually, with "index this page" or "always index this site", or you can just set a regex and say everything on this domain I want to be saved, or everything where this word or this regex appears I want to be saved.
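As a rough illustration of the regex idea, here is a minimal Python sketch of rule-based page saving. It is not the add-on's code: the rule list, the `should_save` name, and the choice to match against the URL are all assumptions made for the example, and the add-on's own rule syntax may differ.

```python
import re

# Hypothetical rules for illustration only; the add-on's own syntax may differ.
SAVE_RULES = [
    re.compile(r"^https?://([^/]+\.)?eff\.org/"),  # everything on one domain
    re.compile(r"prism", re.IGNORECASE),           # any URL mentioning a keyword
]


def should_save(url):
    """Return True if any rule matches the URL."""
    return any(rule.search(url) for rule in SAVE_RULES)


for url in [
    "https://www.eff.org/nsa-spying",
    "https://example.com/news/PRISM-explained",
    "https://example.com/sports",
]:
    print(url, should_save(url))
```

The first two URLs match a rule and the third does not, so only pages relevant to the current investigation would be kept.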
Rules like that are an interesting way to go through and only save the things that are actually relevant to whatever you're working on at the time, so you have less to search through. Finally, if you go to File and rebuild the index, that will clear out everything you had there before and completely rebuild the index from scratch. If you update it, it'll just add the new stuff in. If you rebuild, it'll wipe it out and start over. That has tripped me up a couple of times, so you've got to be careful.
The things that are especially neat with this: it still has the full-text search, which goes through all of the code, all the stuff that wouldn't necessarily display in a browser. There's also a copy saved locally, which is really, really useful if you're looking at stuff that's ephemeral and has a tendency to disappear. And you don't really have to do anything. If you set up a whole domain to be indexed, then all you've got to do is browse through it, look through stuff, make sure you click all the links, and you'll end up with a full-text searchable database that's useful for whatever you're looking for.
So again, this probably reminds you of Hunchly. It's basically an open source version of the same thing that I just accidentally stumbled across one day and have never heard of anyone using before. I don't know if this is something that is commonly known and it was just me, but I figured I should introduce it to you guys. Again, Hunchly is almost certainly way better. This is kind of a pain in the neck and crashes from time to time, but it is free, so.
But the cool thing is that this is still not all that Recoll can do. Again, it'll eat anything. It has a ton of little features under the hood that seem not particularly interesting until one day you need to open up KWord files, and then you'll be very happy. So again, you can throw just about anything into that index. It'll be able to eat it up and let you full-text search it.
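To make the update versus rebuild distinction concrete, here is a toy inverted index in Python. It is a sketch of the general idea only and is not how Recoll is implemented; the `ToyIndex` class and the file names are made up for the example.

```python
import re
from collections import defaultdict


class ToyIndex:
    """A toy inverted index: maps each word to the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of document names

    def update(self, docs):
        """Add documents to the existing index without touching old entries."""
        for name, text in docs.items():
            for word in re.findall(r"\w+", text.lower()):
                self.postings[word].add(name)

    def rebuild(self, docs):
        """Throw away everything indexed so far, then index docs from scratch."""
        self.postings.clear()
        self.update(docs)

    def search(self, word):
        """Return the sorted names of documents that contain the word."""
        return sorted(self.postings.get(word.lower(), set()))


index = ToyIndex()
index.update({"a.pdf": "PRISM collection details", "b.pdf": "budget overview"})
index.update({"c.pdf": "more on PRISM"})
print(index.search("prism"))  # ['a.pdf', 'c.pdf']

index.rebuild({"c.pdf": "more on PRISM"})
print(index.search("prism"))  # ['c.pdf']
```

A search only looks up one word in the table instead of rereading every document, which is why indexing up front pays off. It also shows why a rebuild can be a surprise: anything not passed in again is gone.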
There's also a web UI. If you're sharing documents among multiple people, perhaps it's interesting for you to be able to do that search over the network and have multiple people collaborate on the same system. The reason the web UI works at all is that it's in Python. There's a Python binding underneath all of this. So if you're looking to script things, the SDK for this is actually pretty interesting and not that difficult to learn. There are only like six or seven functions, and you can query with them and pull out whatever information you're looking for. Again, full-text search in Python. Getting Recoll set up and calling it from Python was actually easier and more effective than using Python's PDF libraries to open up those PDFs and try to pull information out of them. It was actually faster to set this up than to use the usual Python PDF libraries, which was shocking and awesome.
So the next time you need to search, I hope that this is something that's in your tool belt. It's really simple to set up if you're on Linux, and there's a Microsoft Windows version as well. I haven't tried it, but it's probably even easier to set up than on Linux, because of course. It's something that can eat almost anything and give you a full-text search, and it's worth taking a look at. You guys can follow me on Twitter, and I'll put out the blog post with the instructions and slides from this talk, probably on Tuesday when I have recovered. So yeah, long story short, I hope this is in your tool belt next time you need to search through something. And if you have any questions, just let me know.
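For a sense of what scripting against the Python binding might look like, here is a minimal sketch. It assumes the Recoll Python module is installed and an index has already been built, and the calls (`recoll.connect`, `db.query`, `execute`, `fetchmany`, and the `url` and `title` attributes) follow the Recoll manual's Python API as best understood here, so check them against your installed version. The helper names `format_hit` and `search_index` are made up for the example.

```python
def format_hit(url, title):
    """Format one result as a tab-separated line, handy for piping elsewhere."""
    return f"{title or '(no title)'}\t{url}"


def search_index(query_string, limit=20):
    """Run a query against the default Recoll index and return formatted hits.

    Assumption: the Recoll Python module is installed and an index exists.
    """
    # Imported lazily so format_hit can be used without Recoll installed.
    from recoll import recoll

    db = recoll.connect()        # opens the default configuration's index
    query = db.query()
    query.execute(query_string)  # same query language as the GUI search box
    return [format_hit(doc.url, doc.title) for doc in query.fetchmany(limit)]
```

Calling `search_index("prism")` and printing each returned line would give one hit per line, which is easy to pipe into other command-line tools.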
So the web UI, if you access it over the network, I don't think you can collaborate as far as multiple people typing at the same time. But you could both access the web UI with the same underlying documents, which, if you're trying to control documents or something, I think would be an easy way to give multiple people access without having to copy a terabyte of documents over to multiple machines.
I believe so. Basically, from what you have in this web UI, it doesn't look like you can actually go through and fiddle with stuff directly. I'm sure once you downloaded these, you could do what you wanted with them. But I don't see any way in here, nor did I find one, to go in and change documents.
Yep. So it comes with a whole bunch of different text extraction utilities, including OCR. I think the way that I did the piping with that was through Python, although I'm not sure if there's a way to do that directly on the command line. I know some of the things, like the indexer, can be run from the command line instead of from the GUI. I'm not sure if there's also a command-line argument that lets you pipe out the stuff in it, or if you'd have to just write a little Python script. Again, that's probably four or five lines of Python that you can run on the command line and pipe out from there.
Yep. That's a good question. I didn't try that. There's probably a way, but I have no idea what it would be. Cool. Thanks, everybody.