Hello everybody and welcome. First of all I'd like to thank you all for taking the time to be here; I hope we'll make it worthwhile for you. My name is Martin Holst Swende. I work for a Swedish company called 2Secure as a senior security consultant. I've been with 2Secure for a couple of years; before that I worked as a programmer, mostly with Java. Nowadays I mostly do web app security and pen testing during working hours, and programming in my spare hours. I really like the open source movement and I try to participate and contribute wherever I can: I've contributed a few scripts to Nmap NSE, some features to WebScarab, and some bits to Mallory and w3af. I also like to write a lot of different tools to automate and make my workflow more efficient, and sometimes that ends up in usable applications. HatKit is the primary example of that; it became an OWASP project a few months ago. With me today I have a friend and colleague, Patrik.

Hi, my name is Patrik Karlsson. I work for the same company as Martin, also with IT security, focusing on web application security and databases. I've been a speaker here before, at DEF CON 15, where I presented a talk on SQL injection and out-of-band channeling. Like Martin, I try to contribute as much as I can to the open source community, mainly to Nmap at the moment, where I've committed a script or two.

So what we're going to do today is talk about the HatKit project. I'm going to start off and talk broadly about what the project is all about, and then Martin is going to go into more detail about the tools.

As some of you know, testing web applications is a complex task. Looking at methodologies like the OWASP testing methodology, as a tester you have to cover a number of different areas, including web server configuration, authentication, session management, input validation and so on. In order to do so you usually need some sort of tool or framework to work with. There's a whole range of tools, from fully automated scanners, where you just click a button and it spits out a report with all the vulnerabilities, down to smaller tools specializing in looking for, say, SQL injection or cross-site scripting. And then of course you have the proxies, which aid in manual testing: you can intercept the communication, modify stuff, view all the headers and so on.

A typical part of an intercepting proxy is obviously the intercepting part. But if you look at the proxies available today, they come with a bunch of different features, which we can see in this table. What we've tried to identify is which components are present in these proxies, and whether they actually have to be in the proxy or not. In most cases you could leverage these features from tools other than the intercepting proxy, as long as you have access to the collected data; only the interception itself has to live in the proxy. Looking at these typical proxies: in my experience, having tried a few different ones, they all have their strengths and weaknesses. But one thing that bothers me quite a lot is that they're usually very resource intensive.
Especially when you do testing with Internet Explorer, Safari or Chrome, where you have to set the system-wide proxy settings, you end up routing all the traffic through the intercepting proxy: OS updates coming in, video streams and so on. After a few hundred megabytes of data the proxy tends to get a bit slow. So that's one of the challenges. Also, the views of the data collected by the proxy are usually pretty static: you can view a table of all the collected data, but you usually can't modify that view much. When you want to analyze the collected data you usually get plain searches, or regular expressions in some cases, but limited possibilities to actually work with the data once it's collected. And the post-processing capabilities are limited too: if you want to take the collected data and run it through another tool, to do a SQL injection test for example, in most cases it's not obvious how to extract the data from the intercepting proxy.

So with this project we've tried to address the drawbacks listed here. The project is actually two different tools. The first is an intercepting proxy with a very, very lightweight feature set and footprint. It's a recording proxy: it records all the data it sees into a MongoDB database. The second tool is used later on to look at the collected data stored in the database; Martin is going to talk more about it and show you how to get a very dynamic view of the collected data and do post-processing.

The HatKit Proxy is based on the OWASP Proxy written by Rogan Dawes. It comes with all the usual stuff: interception, reverse proxy support, syntax highlighting. It has a fully qualified and a non-fully qualified mode, which allows you to modify things in the HTTP protocol that you usually can't modify through other proxies. It also has TCP interception, in an early beta stage, which allows you to intercept raw TCP communication, modify it in real time, and then let it through to the server. That's what I'm going to focus on in these last few slides, before I hand over to Martin, who's going to talk about the Data Fiddler, the other part of the HatKit project.

So, TCP interception. We provide two ways of doing it. One is manual interception, where you intercept the packets and get an editor in which you can edit each packet. The other is scripted, through processors: you can write your own code in Java through the BeanShell integration. Each TCP session gets its own BeanShell interpreter, so you have the possibility to keep state within a TCP session. If you collect some interesting data in the first two packets, you can keep it in a registry and then use it in the sixth or seventh packet, where you actually need it. The registry is basically just a hash map with string keys where you can store whatever data you want; the sketch below shows the idea. So, with that, I thought I'd show you some demos of the TCP interception part.
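A minimal sketch of such a stateful processor, written in Python for readability (the real processors are BeanShell, i.e. Java, and the hook signature here is made up, not HatKit's actual API):

```python
# Called once per intercepted packet; "registry" is the session's
# string-keyed map, persisting across all packets of one TCP session.
def process(packet_no, payload, registry):
    if packet_no == 1:
        # Harvest something interesting early in the session
        # (the offsets are arbitrary, for the example only)...
        registry["token"] = payload[8:24]
    if packet_no == 6 and "token" in registry:
        # ...and splice it in later, where it is actually needed.
        payload = payload.replace(b"\x00" * 16, registry["token"])
    return payload  # whatever we return is what gets forwarded
```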
What I've done is set up a quite large ERP application, which consists of a client part and a server part. The server part is really just a database: it's a thick client, so it connects directly to a database, and there's no application server in this particular example. What we're going to look at is how the thick client connects to SQL Server directly, using a common application account, and how, once connected to the database, a user logging in to the ERP application causes it to query a table for a username and a specific password in order to log the user on. We're going to see some different scripts that can be used to analyze this traffic, and to manipulate it too.

So here's what I'm hopefully going to show you now, if everything works. This is what the user interface looks like. First we set up a forwarding address: we listen on our interface on port 1433, and all packets that come in on that interface are then forwarded onwards, to a SQL Server instance that listens on port 1435. Then we say that we want to process all those packets using a BeanShell script; at this point I'm specifying a script called Microsoft SQL Server downgrade. What it does is attempt to downgrade the authentication process of the SQL Server connection. I don't know if you're familiar with this, but SQL Server supports different types of authentication, and what I'm going to do is try to downgrade to the weakest one, where the password is merely obfuscated with a simple XOR scheme and sent over the wire, allowing us to decode it instantly.

So we start up the proxy and see the console. Switching to the application and starting it up, we see a connection coming in, and we see that the BeanShell script successfully downgraded the encryption. We can see the login from the client, using an account which I've redacted since it's the name of the product, with the password enterprise123. With access to that account you can do pretty much anything in this ERP application, since it has the highest privileges in the application. So that's what you can do with that script.

We're going to look at a few other scripts as well. Here's the same application again, with the logo discreetly removed. We start up the proxy again, and this time we choose another processor, called the MS SQL query sniffer. We try to authenticate to the application, and we see a bunch of SQL queries going over the wire to the server. Among them is a specific query: select something, password, alias from g_users where the alias equals the one we put into the authentication form. So what the application is doing is retrieving the encrypted password from the database; when you log in through the login form, it compares your encrypted password to the one retrieved from the server. Does anyone see a problem with this? So we open yet another BeanShell processor. What it does is match the query traffic we just saw in the sniffer, and simply replace the password with an empty one.
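An aside on that first demo: the weakest SQL Server login scheme is the classic TDS obfuscation, where each password byte gets its nibbles swapped and is then XORed with 0xA5. Decoding it really is instant; a small Python sketch (the actual proxy scripts are BeanShell):

```python
def decode_tds_password(buf):
    """Undo the TDS login password obfuscation: XOR each byte with
    0xA5, then swap its high and low nibbles. What remains is the
    plaintext password in UTF-16LE, as it travels on the wire."""
    out = bytearray()
    for b in buf:
        b ^= 0xA5
        out.append(((b << 4) | (b >> 4)) & 0xFF)
    return out.decode("utf-16-le")
```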
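And a sketch of what that third processor boils down to, again in illustrative Python with a made-up byte pattern (the real BeanShell script matches the g_users query exchange):

```python
# Scan each packet for a byte pattern and substitute a same-length
# replacement, so no length fields elsewhere in the TDS stream break.
NEEDLE = "SecretHash".encode("utf-16-le")  # hypothetical stored password
FILLER = "".encode("utf-16-le").ljust(len(NEEDLE), b"\x00")  # an empty one

def process(payload, registry):
    if NEEDLE in payload:
        print("found and replaced pattern")
        payload = payload.replace(NEEDLE, FILLER)
    return payload
```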
So in the proxy I just specify that I want to use a different processor, pick the DEF CON demo one at the top, and apply it. Then we go back to the application once again, log in using a blank password, and we can see "found and replaced pattern". And it brings us right into the ERP application, without knowing the user's password. So those are the kinds of things you can do with the processors automatically, without having to modify each and every packet. The scripts I've just shown will be included in the release of the tool that we're doing now, and there will also be some updates to these scripts after this presentation. I'm going to end here and pass the mic over to Martin.

All right. So now I'm going to focus a bit more on the second part of the project, which is the HatKit Data Fiddler. I'll try to answer the whats and the whys and the hows, and I'll do some demos too.

So what is it? As we've already mentioned, it's a tool, or framework, to analyze web traffic. Going a bit more into detail, we could describe it as a platform where several applications have been implemented on top of some common components: UI components such as filters and window handling, and a shared database layer. The database layer itself is based on MongoDB, which is a so-called NoSQL, document-storage type of database. One idea which has been important during the development has been to reuse existing tools as much as possible, not just rewrite the same old tools in a different programming language or something. So the aim is to make it a platform which can use existing tools and pre-recorded data as much as possible.

And what does it do? As of the DEF CON release there are four such applications implemented. The ones that exist today are: TableView, which is kind of like the first tab of WebScarab, where you get information shown in a table in a highly flexible manner which can be tailored by the user; the aggregator, which does traffic and pattern aggregation; and a third-party plugin which, among other things, can utilize w3af and ratproxy to analyze pre-recorded traffic, and which can also be used to export data to other proxies. There's also some common functionality to filter data, in order to work with just the parts of the data that are relevant; you can basically untaint your data if you happened to catch some OS updates or whatever. And there's a cache proxy, which is still in alpha.

So, TableView. It gives you a highly customizable way to get an overview of an application flow, and it's very simple to write and reuse the kind of view that you need for your particular scenario. What do I mean by scenario? Well, in one scenario you might be interested in analyzing user interactions: you use two different browsers, maybe against the same target, and try to see if one user can access information belonging to the other. There you might be interested in being able to differentiate based on user agents. In another scenario you might be more interested in analyzing server infrastructure issues, so maybe it's the server banner or the headers you care about. And another time you might be analyzing encoding mistakes and potential cross-site scripting issues, stuff like that.
I'll make a demo, but before I do I'd like to spend a little more time on the core of the Data Fiddler, which is the data that has been stored inside MongoDB by the proxy. The traffic is stored as parsed objects in the database. What does that mean for us? It means that when we select what we want to load from the database into our application, we can specify criteria on things which are deep inside the objects themselves. We can say, for example, that we only want to work with objects which have a certain header, say a set-cookie with a certain cookie, or where a certain parameter is present in the JSON. It also means that when we load the data, we don't have to load the entire objects; we can load just the parts we need. For example, if we just want to work on the server infrastructure, we only have to load the headers. And when we get the data into the application, it still retains the same structure. So to some extent MongoDB is very similar to an old-school object database, except that it's platform independent.

Okay, so let's move away from the nitty-gritty details and see some action. What you see here is the Data Fiddler, and this is not how it normally looks: I cleaned it out a bit so I can show you how to populate it with something interesting, because currently it's just showing the database identifier of each object in the database. So I open the settings for this, and I'm also going to double-click on one of the objects. That brings up something called the object inspector, which loads the complete object from the database so we can see how it's structured. There you see, for example, the response, and the response contains headers. Each header is stored as an attribute in the headers dictionary, basically. This one has a set-cookie, and it contains an ASP cookie with a value and the path attribute.

We go back to the settings. We're interested in the request object. On the left side we define variables; that's what we load from the database. So in variable v1 we load the request object, or node. Then we add a column: on the right side we define what we want to see in the table, so I can just type in the variable v1 there, and optionally add a title. If you have your coder goggles on, you might see that what appeared in the second column is the string representation of a Python dictionary with unicode keys and values. Everybody see that? But that's not very user friendly, is it? So in the column definition we can instead reach into the method attribute, which gives us the request method. And all of a sudden you can see that there are GETs and POSTs. You might also see that coloring is enabled; the coloring is just a hash of the text value, which can be good to have if you want to see where a value changes. Maybe I'm not interested in the particular value and want to save some screen real estate, so I can disable the actual printout of GET and just keep the color. And we can save this view for later if we want, so we don't have to redefine it.

Now, the column definitions are really just Python. It's pure Python which is evaluated, and since it is Python we can write any kind of Python code there. And there are some helper functions which can help us produce nice, readable strings for our tables.
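To make that storage model concrete, here's a minimal pymongo sketch of the kind of deep selection and partial loading this enables. The database and collection names and the exact field layout are assumptions for illustration, not necessarily HatKit's exact schema:

```python
from pymongo import MongoClient

db = MongoClient("localhost", 27017)["hatkit"]

# Criteria can reach arbitrarily deep into the parsed objects:
# here, only transactions whose response set a cookie...
criteria = {"response.headers.set-cookie": {"$exists": True}}
# ...and partial loading: fetch only the request headers,
# not the whole stored transaction.
projection = {"request.headers": True}

for doc in db.http.find(criteria, projection):
    print(doc["request"]["headers"].get("host"))
```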
For example, we're going to use a helper function called fq host, which takes the request object and produces the fully qualified host name. So we can just write fq host on the variable v1 there and add the third column, and now you have the fully qualified host in the third column. In a similar manner we can decide we want to see the parameters too: there's a param string helper, so we add a new column and write param string. And again, you can write arbitrary Python there.

All right, so what I just showed you is how you can start to populate and write your own view definitions, so you can get exactly what you want. Now I'm going to show you some more advanced usage, where we use the functionality to reach deep into objects. The example scenario here is that I might be analyzing a web application which is based on Ajax, where the workings of the application aren't visible in the URL; they're only visible deep inside the request and response bodies. So I need to reach into those to visualize it.

Okay, what you're seeing now is the normal view of the Data Fiddler; we haven't made any changes to it. We enter the setup again, go to the tab called database filtering, and start on the sub-tab called native clauses. Here we can specify clauses which say which objects to load, and we type that response.json, that node, has to exist. If it exists, we get the object into our application. We can test it: there are 113 of these objects. Okay, that sounds good. We apply it, and now the only things in our view are the JSON responses.

Now we also want to reach into the JSON for our viewing, so we need to modify this view a bit, and I've prepared this. There's a JSON view definition here; let me load it. What appeared in column 2, as you can see, is that I reach into response.json.trends, which is in v6. From that I pick out the attribute which is the date, and in that list I take the zeroth element and its query attribute. So it's an attribute of an object within a list, within an object, within the JSON. I'll show you here what I mean: we're trying to reach one of these query objects inside of that, and we want to follow it through this application flow, through these requests. So I just apply that.

And now we see that in several places there are type errors. That's because even if all the responses had JSON, they didn't all contain these objects. So to fix that final detail, we go into slightly more advanced usage: the JavaScript expressions. If you type JavaScript expressions in the filters, they are actually passed down into MongoDB and evaluated on the fly by MongoDB. You write a function there which returns true if you want this object sent back to the application. The one I'm loading here basically just checks whether there's JSON, whether it contains the trends object, whether that contains the list, and so on; if so, return true. We apply that, and there we have the flow of the application. It's all the same data, but what you've seen is how we can reach deep into the JSON and define views that show only exactly what we're interested in.

I forgot to mention it, but when you're in the table view we've also integrated diffing of requests, so you can see the diff using KDiff3 or whatever diff tool you like. You can view the response content with your platform's default editor for JavaScript or HTML, or you can override that and use Eclipse or whatever.
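Those JavaScript filter expressions map onto MongoDB's $where clause. Here's a sketch of the kind of predicate just described, reusing the db handle from the earlier snippet; the trends and query field names follow the demo's JSON, the rest is illustrative:

```python
# A JavaScript predicate, evaluated inside MongoDB per object;
# returning true sends the object back to the application.
js_filter = """
function() {
    if (!this.response || !this.response.json) return false;
    var trends = this.response.json.trends;
    if (!trends) return false;
    for (var date in trends) {              // the date attribute
        var list = trends[date];
        if (list && list.length > 0 && list[0].query)
            return true;                    // zeroth element has a query
    }
    return false;
}
"""

json_docs = db.http.find({"$where": js_filter})
```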
Another Data Fiddler application is the aggregator. If the table view is a way of representing data in a one-to-one format, the aggregator instead walks through the data on the database side and collects the interesting pieces, sorted by a specified key. This is a feature of MongoDB, and it's very similar to map-reduce, if any of you are familiar with that.

So this is the aggregator, a tree view. I open the setup here, and the one that was predefined already is "aggregate paths". We're on the basic tab now, where you can load predefined combinations of reduce functions and sorting keys. The "aggregate paths" one you're seeing just aggregates the paths, sorted by the host name. What do we end up with if we do that? We end up with a sitemap (sketched in code below). There are a lot of predefined ones here, so we can play around a bit with them. There's aggregate paths by HTTP status. You can aggregate server banners by host, for example, so you can see all the server banners from Twitter and Google and whatever; this is just some randomly collected data I'm using. Here we list the response headers: that lists all the unique header keys we have in our collection, and also counts how many times we saw them.

And this is a bit interesting: if we go into the advanced tab, we can see that the key is a static key called "one", which means that everything is aggregated into the same entry. We can change that now. Say, for example, we write request header host, and bam, all of a sudden we have the same thing but sorted by host. So if you're analyzing infrastructure, for example, and you suspect that certain paths are served by different servers, you can do the same thing but sorted by path. Some other basic scenarios that are really useful: checking all the parameter names that are used. For example, if you suspect there are remote file inclusion possibilities, or insecure direct object references, it can be pretty useful to look at the names of the parameters. And if something seems interesting, you can add another node to the tree where you look at each individual value too, just aggregate everything, and there we have it, with the value maps.

As we've mentioned several times, one basic idea in the HatKit framework is to reuse existing tools as much as possible, because functionality is an asset, but code is a liability. And there is a mechanism inside the framework which we call the third-party plugin. What it does is load data from the database and, one by one, let the plugin process that data. Such plugins can do a lot of interesting stuff, and so far we've implemented four. One is the plugin for ratproxy analysis. Perhaps you're already familiar with Michal Zalewski's ratproxy: it's a mostly passive proxy which analyzes the data on the fly as it goes through the proxy. In order for us to make use of ratproxy, we had to trick it a little bit. Let me skip ahead here... oh, that image didn't come out well, so I'll just explain it in words. The Data Fiddler starts listening on a port. Then it starts the ratproxy process and tells it to use that port as its forwarding proxy. Then we start feeding ratproxy traffic. Once we get the request back on the Data Fiddler side, we can send the stored response back, and we can collect ratproxy's output.
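Back to the aggregator for a moment: the talk-era Data Fiddler drove this through MongoDB's group/map-reduce facility. With today's MongoDB, the same sitemap aggregation can be sketched with the aggregation pipeline; field names are assumed as before:

```python
# Group on the host header and collect each host's unique paths,
# all server-side; this is the "sitemap" aggregation.
sitemap = db.http.aggregate([
    {"$group": {
        "_id": "$request.headers.host",           # the sorting key
        "paths": {"$addToSet": "$request.path"},  # unique paths seen
        "count": {"$sum": 1},                     # how many times
    }}
])
for host in sitemap:
    print(host["_id"], len(host["paths"]), "unique paths")
```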
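And the ratproxy feeding trick, sketched with the standard library: each stored request is replayed with ratproxy as the forwarding proxy, while the Data Fiddler's own listener (omitted here) answers with the stored response. The port and field names are assumptions:

```python
import urllib.request

# ratproxy listens on 8080 and forwards to the Data Fiddler's
# listener, which replies with the recorded response; ratproxy
# analyzes each exchange in passing.
proxy = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = urllib.request.build_opener(proxy)

for doc in db.http.find({}, {"request": True}):
    req = doc["request"]
    url = "http://" + req["headers"]["host"] + req["path"]
    opener.open(urllib.request.Request(url, headers=req["headers"]))
```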
And this method of export has also been generalized into a generic proxy exporter. It works exactly the same way, except that you manually start whichever proxy you want to use, WebScarab or Burp, and configure it to use the Data Fiddler as its forwarding proxy. And there's a caveat: it doesn't handle SSL properly right now, unfortunately. There's also a WebScarab exporter, which exports data in a format that WebScarab can read. Unfortunately, when you do it that way, WebScarab doesn't process the data, so for that you would use the generic exporter instead.

And finally we have w3af. I guess most of you have at least heard of it: it's a web application audit and attack tool, and it contains a lot of functionality. One of the things I like most about it is the grep plugins, the greppers. A grepper is just Python code which takes a request/response pair and searches it for stuff. It can be different things: stack traces, internal IP addresses, social security number disclosures, or potential cross-site scripting issues. And the w3af third-party plugin hadn't been released before; it was released earlier this week, in the DEF CON release.

So I'm going to do some demos of these plugins. With the ratproxy plugin, all you need to do is basically tell it where the binary is, and then you run it. The thing with ratproxy is that it's kind of slow, because our application doesn't know how many requests ratproxy will make; sometimes it makes several requests from one inbound request, so we have to wait for a timeout before we can move on to the next. There we go: we gather the output, and there you have it. That's basically the raw output from ratproxy.

Then I'll show you w3af. Now, w3af is all Python code, so it's much more efficient for us to interact with: we don't have to trick it at all. We can just reach into the code, monkey-patch it a bit, and use only the greppers. You just need to enable it and tell it where the w3af code with the greppers is, and bam, there are the results. I'd like to mention that the integration with both of these, ratproxy and w3af, requires the third-party tool to actually be installed, or at least accessible, on your machine. We didn't just take the code from w3af and put it in our project; we tried to give some added value to both projects by doing it this way.

And the final demo I'm going to show you here is the generic exporter. So we set here that we want to use the proxy exporter plugin, tell the plugin where the target proxy is listening, and define where we want the Data Fiddler to be listening. Okay. Then we start our proxy; in this case it's WebScarab. We check that it doesn't have intercept on, because if it does, it will disrupt the process, the Data Fiddler will time out, and bad things will happen; the sky will come tumbling down. We check that it's configured to use a forwarding proxy, which is the same as our Data Fiddler: port 9999. Okay. We click the summary tab so we can see what's happening, and then we hit run. And as you can see, we now start populating WebScarab with data. You might notice that the plugin window has a filter tab, like most windows in the Data Fiddler, so you can actually filter if you want to send some of the data to WebScarab but not everything; perfectly doable. Untainting your data again.
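To give a feel for what the greppers do, here's an illustrative pass over the stored traffic in the same vein; the patterns are stand-ins, not w3af's actual ones, and the response body field name is an assumption:

```python
import re

# Toy leak detectors in the spirit of w3af's grep plugins.
leaks = {
    "stack trace": re.compile(r"at [\w.$]+\(\w+\.java:\d+\)"),
    "internal IP": re.compile(r"\b(?:10(?:\.\d{1,3}){3}|192\.168(?:\.\d{1,3}){2})\b"),
}

for doc in db.http.find({}, {"request.path": True, "response.body": True}):
    body = str(doc.get("response", {}).get("body", ""))
    for name, pattern in leaks.items():
        if pattern.search(body):
            print("possible %s in %s" % (name, doc["request"]["path"]))
```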
And yes, of course this is a fully functional WebScarab once you've imported the data this way. So there you have some processing: you can see that WebScarab has detected some potential cross-site scripting issues.

Okay. All right. So, there are some upcoming features. There is a cache proxy. The cache proxy starts a little HTTP forwarding proxy, matches each request against what is stored in the database, and returns the best match. It can be configured either to be closed, so that it just returns a 404 if there's no match, or open, so that it goes ahead and fetches the remote content if it can't find it in the cache. So what can you do with this? Well, you can resume a Nikto scan, for example. You can also use it if you want to gather screenshots or videos outlining what you did on a particular assignment; after the assignment is completed, you may no longer have access to the target. This feature is already at the alpha stage implementation-wise, and if you're a Python hacker you can definitely get it working without any major problems. The thing is, there is no UI implemented for it yet, so unless you want to hack, you're going to have to wait a little bit.

Fuzzer integration: we hope to integrate directly with JBroFuzz, so that if you have a request that you want to send as a manual request, you can just send it to JBroFuzz and it pops up. Currently we've only implemented "send to browser", and you can also right-click and copy the fully qualified request, which you can paste into the manual request editor of the proxy of your choice. We're also planning to add some more advanced text search capabilities, either via PyLucene or Whoosh.

And as I mentioned, there was a new release earlier this week, of both the proxy and the Data Fiddler, containing all this stuff I've been talking about. So why would you use it? Well, to be able to make sense of large bodies of complex information, and to get exactly the information you want out of your body of data. You can download the source from Bitbucket, and the released binaries too. There is documentation on the OWASP website; unfortunately there's been a lot of development during the summer, and not everything has been documented on OWASP yet. In order to get it running you need Python, the Qt4 bindings, and the MongoDB driver, and of course access to a MongoDB instance with pre-recorded data. w3af and ratproxy are optional. We've got it working on Linux and Mac OS X; I think you're out of luck if you're running Windows.

So who is this meant for? Well, obviously for web application testers; that's the perspective I've been giving here today. And it's not for testers who just want a point-and-click tool; it's for someone who really wants to take control of the data and really fiddle with it, basically. But there's also another angle on this, and that's server administrators, who can run the proxy as a reverse proxy and use it to log all incoming traffic. They can then use the Data Fiddler to analyze user interaction, for example to detect malicious activity or perform post-mortem analysis. And since we're using MongoDB, one bonus feature we got is a backend which can scale massively and potentially handle very large amounts of data.

We'd love to get some feedback and new members to the project; this is still very much in development. Please don't hesitate to join the mailing lists. And I'd like to thank you all again for listening.
We won't be doing Q&A in this room, but if you join us in the Q&A room, wherever that is, we can show you some more hands-on demos and answer all your questions. Thank you very much.