Hello everyone. Welcome to our talk and welcome to DEFCON 0x0f. Let me do a brief introduction, since this is our first time presenting at DEFCON. I'm Ben Feinstein. I'm a security researcher with SecureWorks in Atlanta, Georgia. My background is primarily as a software engineer on a variety of security tools and security products, and early on I worked in IDS. To my right is Daniel Peck. He's also a researcher with SecureWorks and a proud graduate of the Georgia Institute of Technology. He's likely our strongest reverse engineer, and a jack of all trades on a lot of the other tasks we tackle on the research team in Atlanta.
So our talk involves malicious JavaScript. Some of you may ask, why should you care? Well, if you've seen some of the talks this week over at Black Hat and here at DEFCON, you can see that JavaScript is a pretty severe security problem. A lot of malware and spyware downloaders use JavaScript to get malicious code onto your system and execute it. As we've seen in some interesting talks this week, browser exploitation is a major topic. Dynamic scripting in the browser can be used to turn your browser into an attack point into your internal network or your LAN. Scripting is also frequently used for information leakage: your browser history, cookies stored on your system, lists of sites you frequently visit, and stored passwords and credentials cached in the browser. So JavaScript has access to a lot of dangerous information and some pretty severe functionality. What we're primarily looking at today is its use as an evasion and security bypass technology. So, just to give a really brief background of JavaScript, and I apologize to those of you who are already familiar with this.
Back in the 90s, Netscape came up with this concept in their early browsers. The idea was to offer a scripting language so that website authors could make the browser experience richer for the user. After Netscape introduced JavaScript, Microsoft came along with their own version called JScript, and eventually it was standardized as ECMAScript under the auspices of the ECMA group. What is not generally understood is that a lot of the parts we associate with JavaScript aren't really part of the standard. One example is the document object, which is really defined in the browsers. It's typically just a browser extension. It happens to be mostly compatible across them, but if you've ever tried to write scripts that work reliably across a lot of these browsers, it can be a challenge.
One of the reasons this is a security issue is that JavaScript tends to blur the line between data and code. What I mean by that is that to a lot of security devices, the JavaScript coming back from a web server is viewed as just data. It's simply part of the response to the web request you sent, coming back over the wire. In reality, it's code, and it's going to be dynamically interpreted and executed inside your browser, running on your local area network. We'll talk about that a little more later.
So, like all great tools and technologies, there tends to be feature and functionality bloat. A lot of this involves Ajax and Web 2.0 technology. The XMLHttpRequest object is a very powerful part of JavaScript that lets you asynchronously control requests between the browser and the web server. And as our experience has shown, more features always means a larger attack surface to exploit and greater potential for security-critical bugs. So, I'll hand off to Daniel. He'll talk a little more about the Web 2.0 effect. Web 2.0, any grant.
If you've tried using a browser lately with JavaScript turned off, using the NoScript plugin for Mozilla or the various other ways to turn off JavaScript, it's pretty unusable these days. Most of your major sites have so much JavaScript integrated into them that it's just not feasible to turn off JavaScript as a security measure. While this advice might have been good in the late 90s and early 2000s, when a lot of the security problems with JavaScript were first coming to light, it's not really practical anymore for anybody but the most hardcore security people. It's not practical in a grandma-and-mom kind of atmosphere.
So there are some real problems here. JavaScript is a vice of your typical web developer and designer. It's thrown out there without a lot of thought. It's just put in to make something happen. There's not a lot of training involved, and most scripts are put together from examples on the web. These examples tend not to be the most secure or the most well written in any way, shape, or form. There are big JavaScript library repositories online that you pull stuff down from, and you edit the code just enough to make it do what you want. In a lot of cases you're not really sure what's going on. Continuing with the many popular sites: try MySpace, try YouTube, try any number of other sites with JavaScript off, and they're pretty much unusable. It's a staple of the web, it's here to stay, and it's only going to get more and more involved in our daily web browsing habits.
Is it really that dangerous? We get this question a lot about JavaScript. Is it all theoretical? Are these things that are out there but not really being exploited? Well, we saw some examples in the Month of Browser Bugs.
Number 25 is a good example of a vulnerability in JavaScript itself. The vulnerability involved iterating over a native JavaScript function using a for loop, and after a while you would get a denial of service and potentially remote code execution. Number eight was an example of how JavaScript makes exploitation easier without being an exploit of JavaScript itself, which is primarily what we'll be talking about. In this case, an ActiveX control was created and one of its functions was called 65,000 times in a loop. You get an integer underflow, and then you can successfully access pretty much whatever part of memory you want.
Some other things are happening with these tools these days. The GNUCITIZEN group has the JavaScript AttackAPI. I don't know how many of you are familiar with it. I haven't used it extensively myself, but it's becoming a more and more popular and quite mature tool for what it does. Things are really coming together there, and it gives you a great framework and tool set for JavaScript exploitation. Among other things we saw last year, browser-based port scanning was a big deal. The guys at SPI Dynamics, along with Jeremiah Grossman and a handful of other people, had the same idea at about the same time and started working through it. And recently Dan Kaminsky, who talked at Black Hat and here this week, just a few hours ago, talked about scanning inside the network using Flash, which is similar, and there's some overlap there that we'll talk about briefly. There are a lot of things the browser allows that aren't part of your usual applications.
Phishing and cross-site scripting. Cross-site scripting is the seemingly ubiquitous vulnerability. More are being created each day than are being found, and they're just going to continue to be there.
What used to be just a cute little attack, where you could show that a website was coded poorly by popping up an alert, is now usable for serious malicious activity, and we're seeing that more and more every day. One example was the eBay seller ratings issue recently, which allowed a seller to change their icon to that of a power seller with over a thousand positive remarks or positive transactions. Again, that might not matter to you and me, but it matters to most normal people. They're going to check and say, oh well, this seller has over a thousand quality sales, I'm safe doing business with this guy, which lures you further into the phishing trap. Following that, address bar spoofing. There have been so many vulnerabilities that allow this to happen that it would be a waste of time to enumerate them all, but it seems like there's one every couple of months, and it is incredibly useful in phishing or any number of other attacks that involve information stealing.
A couple of post-mortems of real-life events from the last year, too. There was the Super Bowl XLI Dolphin Stadium site hack. It dealt with the web server being compromised. There was iframe injection, which tried to exploit two Microsoft vulnerabilities that were already patched but still widely unpatched in the populace. One involved VML, and the other one, I believe, was MDAC, but don't quote me on that for the 0704.
So this brings to light the whole JavaScript issue that Ben spoke about earlier. It really blurs the line between code and data. We see that a lot in QuickTime embedded movie files. They can now have JavaScript in them. Maybe not the best design decision, but it's there, and you can run arbitrary code through a movie. Years ago, would you have thought that was possible? Probably not. They were just movie files. Shockwave Flash.
Shockwave is going with the inner-platform effect, where you have not only the browser running on the OS, you have Shockwave and Flash running in the browser, and you now have full access to a network stack. Again, we're starting to see things creep up, and they're probably only going to get worse. A big thing, and fairly underrated this year, is the Adobe PDF cross-site scripting vulnerability. This was released early in the year, I think around March. Basically, it allowed arbitrary JavaScript: alerts, cross-site scripting, anything else, very easily. It was trivial to exploit.
So now we're going to move into some of the obfuscation and evasion techniques that JavaScript uses to hide what the actual exploit is doing and to evade IDS, IPS, and all sorts of other detection mechanisms. There are a variety of evasion and obfuscation techniques that are typically used in the wild and that we see in JavaScripts. Probably the simplest to understand and the simplest to implement is whitespace randomization, or randomized comments. The reason this is effective against certain security technologies is that if you're inspecting the content returned from a website, or a web request on the wire, with a signature-based analysis or detection technology, the attacker can insert any amount of whitespace or random comments throughout the JavaScript, and it's going to break signature detection. The byte stream on the wire is going to be completely different from the content your signature is looking for, whether that's a particular malicious JavaScript call or a particular section of script you've written a signature for.
Another common technique, one that's not only applicable to JavaScript but is used in a lot of attacks, is string encoding and decoding. Typically you can encode strings using hex characters, the ASCII equivalent, percent encodings, or Unicode encodings.
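As a rough illustration of the two techniques just described, the following is a minimal sketch rather than anything shown in the talk. The `signature` string, the `naiveMatch` helper, and the harmless `<iframe src=evil>` marker are all hypothetical stand-ins for whatever a real wire-level filter would look for.

```javascript
// Hypothetical signature that a wire-level filter might search for (an assumption).
const signature = 'document.write("<iframe src=evil>")';

// The same statement with random whitespace and comments inserted.
// A JavaScript interpreter treats it identically; the byte stream differs.
const padded = 'document /*x9*/ . write  (  /*q*/ "<iframe src=evil>" )';

// The same string literal, percent-encoded and decoded only at run time.
const encoded = 'document.write(unescape("%3Ciframe%20src%3Devil%3E"))';

// A naive detector: plain substring search over the response body.
function naiveMatch(body) {
  return body.includes(signature);
}

console.log(naiveMatch(signature)); // true
console.log(naiveMatch(padded));    // false
console.log(naiveMatch(encoded));   // false

// Decoding recovers the original marker text inside the interpreter.
console.log(unescape("%3Ciframe%20src%3Devil%3E")); // <iframe src=evil>

// Hex and Unicode escapes inside a string literal denote the same character.
console.log("\x3c" === "<" && "\u003c" === "<"); // true
```

The point of the sketch is only that a byte-for-byte comparison sees three different inputs where the interpreter sees one program.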
Some that I did not list up here include wide and extended Unicode character sets. We've recently seen some vulnerabilities in, or bypasses of, intrusion detection technologies based on unusual wide and extended Unicode encodings.
Another relatively simple technique that we see in the wild is string splitting. If you anticipate that a security technology is going to look for a particular string value on the wire or in a web page, you can take that payload and break it up into any number of strings, then concatenate those strings together dynamically at interpretation and execution time, and naive or signature-based analysis of the content on the wire is going to fail. A very simple example: take the string "lots of detections fail". You could break that up into four strings scattered throughout the script and concatenate them together. The string will be produced dynamically in the browser, but on the wire you're never going to see it all together.
Another technique is integer obfuscation. Imagine, if you will, that you're an attacker and you want to encode a particular integer value in your script, be it a port number or a key to some value, for any number of reasons. Well, there are essentially infinite ways you can represent that integer. You can create it dynamically using any of the mathematical operators at your disposal. A very, very simple example: if you wanted to encode 31337, you could perform five additions, and that value will be created dynamically in your script. Very simple stuff.
On to a more sophisticated JavaScript attack technique, heap spray, and its more advanced cousin, heap feng shui. Alexander Sotirov gave a great talk this week at Black Hat reviewing heap spray, his heap feng shui attack technique, and his HeapLib tool. I hope to do it justice, but the basic description is that the JavaScript is executing in the browser.
The activity of that JavaScript is allocating memory in the browser's heap. So, if you have the client dynamically interpreting your script, you can allocate arbitrary segments in the heap and fill them with NOP sleds and shellcode. Then what you need is an exploit that's going to overwrite a function pointer and redirect it back into the heap. Typically, you would use a NOP sled whose bytes are also a valid heap memory address, so that when you overwrite the function pointer and redirect back into the heap, it will interpret that value as a memory address, jump to it, and then you're in your NOP sled or your shellcode. I'd recommend you check out his slides or read up on that more. It's a great technique.
JavaScript is also very dynamic. Any function or any variable can be renamed. You could rename the document object to foobar. So, for a security technology that's looking for particular calls, or for the instantiation of a particular object or class, the attacker could have redefined that name earlier in the script to whatever he or she chooses. That's another way you can defeat security technologies: dynamically reassigning the names of your functions. You could, for instance, assign the decoding or encoding functions to different names that seem more benign, and that's a potential way to bypass a security technology.
On to a slightly more sophisticated cousin: block randomization. Typically, security technologies aren't going to interpret this JavaScript and walk through the loops in it. So, pairing this with string splitting or integer obfuscation, you can create for loops, while loops, and do-while loops in which the loop body dynamically crafts your payload for exploitation. Most technologies aren't going to understand and execute this loop properly, create the shellcode, and then be able to match that against whatever signature or analysis they're trying to do.
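The string splitting, integer obfuscation, renaming, and loop-based techniques above can be sketched together in a few lines. This is a minimal, harmless illustration under stated assumptions: `fakeDocument` is a hypothetical stand-in for the browser's `document` object so the snippet runs outside a browser, and names such as `foobar` and `tidy` are made up for the example.

```javascript
// String splitting: the marker text never appears contiguously in the source.
const p1 = "lots of de", p2 = "tect", p3 = "ions ", p4 = "fail";
const joined = p1 + p2 + p3 + p4;
console.log(joined); // lots of detections fail

// Integer obfuscation: 31337 produced by five additions.
const port = 10000 + 10000 + 10000 + 1000 + 300 + 37;
console.log(port); // 31337

// Renaming: a stand-in for the browser's document object (assumption),
// reached through a different name, plus a benign-looking alias for unescape.
const output = [];
const fakeDocument = { write: (s) => output.push(s) };
const foobar = fakeDocument;
const tidy = unescape;
foobar.write(tidy("%68%69"));
console.log(output[0]); // hi

// Loop-built payload: the word only exists after the loop has executed.
const codes = [101, 118, 97, 108];
let built = "";
for (let i = 0; i < codes.length; i++) {
  built += String.fromCharCode(codes[i]);
}
console.log(built); // eval
```

A scanner that never executes the script sees four fragments, a sum, two unfamiliar identifiers, and an array of numbers, none of which matches a signature written for the final values.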
Typically, these techniques are used in conjunction with one another, which increases their effectiveness. So you're seeing more and more malicious JavaScripts that combine all these techniques into one really ugly mess. We want to credit the folks over at Metasploit for coming up with the eVade-o-Matic module. They discussed these techniques and wrote about them, and they came up with the classification we're using here. To put up an example of what we'd consider a highly obfuscated JavaScript, this is something that was found in the wild. It is very difficult for an analyst to manually figure out what this is doing, whether as part of an incident response or as an intrusion detection analyst trying to look at this attack and see what the implications are, or even whether it is an attack at all. Could it be benign? By the way, I'll buy anyone a beer if they can tell me what this does before we get to that slide later.
Anyone who wasn't at the Black Hat talk.
Ah, good caveat there.
So, like so many good ideas, Caffeine Monkey was born at the local bar. Myself, Ben, and another security researcher were sitting around discussing current and future exploitation techniques, and where browser exploitation and the web are going in general when it comes to security. We wanted a couple of things in this tool, since we wanted to do a lot of categorizing and fingerprinting with it. We wanted a central database for collection and an easy way to get to the analysis. We wanted this collection to include web pages, JavaScript, and anything else that you find on the web. We wanted it categorized in various buckets that you could combine or break apart, down to the lowest level. We also wanted a very easy mechanism to feed samples into the various browsers and other things we were using to identify and analyze these results.
One caveat, in fairness, is that during spidering these JavaScripts were identified by MIME type, which we realize does not catch everything, but it allowed us to get through the data very quickly without dealing with a lot of HTML parsing and reparsing as each layer of the onion is peeled. We felt it gave us enough to get there. We also wanted a safe and lightweight alternative. As we'll talk about later, a lot of the current methodologies for detecting and analyzing malicious JavaScript are not lightweight. They involve multiple VMware images, lots of snapshots and restarts, and lots of time and resources for the individual researcher or analyst to set up. If you're working an on-the-spot incident and trying to figure out exactly what something is doing, you don't want to deal with all that. You want a quick and easy way to tell what's going on. As for safe, a lot of the exploits we have seen already account for the current ways of containing them. They do things like closing off textareas and breaking out of sandboxed areas, which defeats the fairly naive approaches a lot of analysts are using right now. We'll talk about those some more later.
So to get this done, thankfully we had a lot of great open-source software to work with. SpiderMonkey is what we used as our basis. We extended that into Caffeine Monkey. SpiderMonkey is a great piece of software. It's been around for a long time and it's very mature. It's been in development since 1995 in one form or another, and it's extremely extensible. Basically you can drop a JavaScript interpreter into any program you write, with very easy bindings between the two, and you can control your function input and output with JavaScript. That made it very easy for us to work this into a quick prototype. We also had the Heritrix web crawler from the good people at archive.org, another very mature piece of software.
If you ever need to spider a collection of web pages, it's a great tool to use. We also thank the good people at the University of Michigan, who had some scripts for taking apart those archive files and giving us easy access to them. This tool has been and will be released under the GPL version 3. Pretty much everything about it is GPL. It's on our website, and we'll get to the links later on. We hope the community will continue to use it and extend it if they find it useful, and work from there.
So basically the heart of Caffeine Monkey is that it wraps and logs functions at almost the syscall level of JavaScript. We're not just walking through the script syntactically. We're hooked in at a deeper level, somewhere alongside the debugger. Instead of manually running a debugger, stepping through all the JavaScript and all the loops and cataloging everything, you have logging functionality that runs and writes output. Currently it's very verbose, but that's because we wanted an easy way to do statistical analysis, and the best way to do that is to be as verbose as possible.
So here's a screenshot of the Heritrix web crawler. You can see a couple of the jobs: finished, finished, ended by operator. It's very easy to work with. I can't say enough about how easy it was to set up and use effectively. Here are some more views of it downloading, with some progress bars. Nothing special, but again, it was very easy, and we're very grateful to have this to build on.
But moving on to what you're here to see, the demo of Caffeine Monkey. Here you'll see we're at the command line on a Unix-based system, and we've got the highly obfuscated JavaScript from the earlier slide pasted in there. We were going to do this live, but like all demos, those are doomed to fail, so we figured screenshots were the best bet. If you want to play with the tool yourself later, you can download it. So here's some of the logging being printed out. We just write these to files.
You can't analyze a big log as it's scrolling across the screen, because it produces too much output. But as you see, new strings are being created here. Every time a string concatenation occurs, which happens a great deal in both malicious JavaScript and merely obfuscated JavaScript, new strings are created, thousands of times throughout the script. You see documents being built up here: document.write, document.write. And as you can see further on, it's producing the strings that it's going to write into the HTML page. Moving further down, you see eval is hooked in. You can see exactly what's being passed into eval. It's a document.write. It continues on, having to make some new strings because, again, string concatenation was passed in to eval. And finally you see document.write being called with a script tag and a source pointing to an external JavaScript.
For completeness, as it turns out, this particular one was a highly nested advertisement for the Spanish-speaking Adult FriendFinder world. That's what you'll find: there's a lot of redirection, and you just keep passing through these onion layers. So imagine walking through this script yourself, only to find out that it's just pointing somewhere else, where you have to go through it all again. Not a fun day.
So here's the result, like I said. This is the obfuscated JavaScript all boiled down. Not a lot there, but a lot of code around it to make it happen. If you wanted to see the original, here it is again. You can see how it's working. It's a lot of charAt calls and a lot of String.fromCharCode calls, and it's got a little cipher in there. I think this turned out to be your typical Caesar cipher, moving by an offset of 24 to wrap around. So that's what it started as, and after the tool ran through it, this is what it ended up as. And the tool is very quick. This is a matter of a second. It takes no time at all.
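The hook-and-log idea from the demo can be approximated in plain JavaScript. This is only a sketch under stated assumptions: the talk describes hooks inside the SpiderMonkey interpreter, whereas this version wraps `eval` and a stand-in `document.write` from script, and the two-layer sample is built on the spot rather than taken from the real obfuscated script. The offset of 24 echoes the figure mentioned above; `hookedEval`, `fakeDocument`, `counts`, and `next.js` are hypothetical names.

```javascript
// Per-function call counts and a verbose log, as a rough analogue of the tool's output.
const counts = { eval: 0, write: 0 };
const log = [];

// Stand-in for the browser's document object (assumption: no real DOM here).
const fakeDocument = {
  write(s) {
    counts.write++;
    log.push("document.write: " + s);
  },
};

// Logged replacement for eval. The sample code sees it under the names
// "eval" and "document" because they are passed in as function parameters.
function hookedEval(src) {
  counts.eval++;
  log.push("eval: " + src);
  return new Function("document", "eval", src)(fakeDocument, hookedEval);
}

// Build a two-layer sample: the inner layer is shifted by a fixed offset.
const SHIFT = 24;
const shift = (s, k) =>
  Array.from(s, (c) => String.fromCharCode(c.charCodeAt(0) + k)).join("");
const inner = 'document.write("<script src=next.js></script>")';
const outer =
  "var s = " + JSON.stringify(shift(inner, SHIFT)) + ', out = "";' +
  "for (var i = 0; i < s.length; i++) " +
  "out += String.fromCharCode(s.charCodeAt(i) - " + SHIFT + ");" +
  "eval(out);";

hookedEval(outer);

console.log(counts);              // { eval: 2, write: 1 }
console.log(log[log.length - 1]); // document.write: <script src=next.js></script>
```

The log shows each layer as it is peeled: the outer script, the decoded inner script handed to `eval`, and finally the string handed to `document.write`. Counts like these are the kind of per-function frequencies that could feed a statistical comparison between samples.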
Moving on to some of the pitfalls of current technologies, and why we developed the tool. Well, there are a lot of honeyclients out there. It's an area that's been very ripe for research in the past few months and years. You had the MS Strider project, which was the first widespread use of a HoneyMonkey type of project: a piece of software going out, scouring the web, and running against multiple versions of IE. Now, there's not a lot of information about this, but from what I understand and from what I've gathered, a lot of exploits were found using these tools, and a lot of the patches we've seen come out were a result. Other things that have come along are the MITRE honeyclient project, Capture, and HoneyC. These are all relatively similar in the way they act. Each has caveats one way or the other about what it's designed to do better or worse. The problem with a lot of them, though, is that they're very heavyweight and resource intensive, like I spoke about earlier, using either VMs and multiple snapshots or heavy-duty hardware and lots of drive space, and in general taking a long time and a lot of resources to use.
There are also the other methods in use, which are very high interaction from the human perspective and very slow to work. There are analysts, and I've seen them do it, who step through this obfuscated JavaScript manually, and it takes them hours to step through it and figure out exactly what's happening. It's a real waste of human time, not to mention very error-prone. Mess up one or two steps in something like the example we showed, and you're not going to have a clue what happened, because if that result is used later on, you've thrown off your whole analysis and you're going to have to start over. Speaking of the safety issue, again, the textarea wrap: I wouldn't trust it under 0-day conditions. I don't think anyone should.
It's a good way to get your browser DoS'd and your analysis bypassed, making you, again, start all over and find a different approach.
So Daniel talked about the web crawler component of the software. Essentially we wanted to spider some websites, starting with seed sites we thought would yield interesting samples of JavaScript. What immediately came to mind was the MySpace social networking site. This is a site you're all familiar with, and one of its characteristics is that it allows user-defined content to be uploaded, including scripting. To hop right back to the Heritrix crawler, you can see we defined several jobs, in the Heritrix terminology. A job starts with a list of seed URLs that it's going to begin crawling from. From those seed URLs it will run either until it has completed the task and spidered the entire tree, or until the operator cuts it off. As you can see, we targeted MySpace initially. We also targeted warez and serial number sites for pirated software. Using some Google hacking, we identified a list of .edu servers that had been compromised and were hosting pornography and malware. After those results, we decided maybe we should check out some .mil.countrycode sites that had been compromised and were hosting malware and porn. There are some out there, and it's quite easy to find them with Google, using the URL flag to look for .mil and then searching for various terms indicative of pornography.
Also, stopbadware.org is a great site if you're looking for compromised web servers. shadowserver.org, I believe, is another one. The guys at StopBadware have an exhaustive list of sites they've identified as being malicious or attempting to exploit your browser, so if you're a security researcher you could use these lists. Some of these sites will already have been taken down, but many of them will not, and you could use them to perform some research. I've also heard that Google, which has its own internal effort to identify malicious sites and filter them out of its search results, is now starting to share more information and more of its list of bad sites.
So what did we find? Here I'm talking about the MySpace run. We spidered MySpace for about three and a half days and collected about eight gigabytes of data. In that eight gigabytes there were approximately four to five hundred unique JavaScript documents. The vast majority was text/html. This is the crawl report of our MySpace crawl. You can see there were almost 121,000 unique HTML documents returned and 78,000 JPEGs, but we only found 500 JavaScripts. Again, as Daniel said, we were identifying JavaScripts by their MIME type. Obviously that's an incomplete solution, because you can embed scripts in various ways, and you don't necessarily even need a script tag for your JavaScript to be dynamically interpreted. But that really wasn't the problem we were trying to solve, and we thought that looking at MIME type would be a good enough approximation and would get us some data pretty quickly. So we got eight gigabytes of documents, and you can see we had 4.9 gigabytes of text/html.
We found lots of obfuscated scripts: cookie manipulation, adware or ad syndication, tracking of your behavior. But essentially it was all benign. Out of these 400 from MySpace, we did not identify anything that we could really say was trying to exploit your browser. Sure, there were ad
syndicators that were trying to get their advertisements past ad filters, but that's really not what we were concerned about at this point. There are all sorts of technologies for tracking people visiting your website, where they click, and how long they linger in places, and a lot of these scripts also use these obfuscation and evasion techniques, because they don't want various filtering technologies to be able to block them. So we were somewhat surprised. Based on our limited run at MySpace, it seemed to be a relatively cleaner ship than we thought we'd find. So kudos to the security team at MySpace, I assume.
So how do we go about classifying these JavaScripts? Once we had collected a corpus of JavaScripts from MySpace, we went looking in the research community and found some researchers' websites with examples of nasty JavaScripts. As Daniel talked about, we had hooked a bunch of key functions in the JavaScript interpreter: everything from object instantiation and string concatenation to escape, unescape, and the character-code functions, those kinds of things. The idea was to look at the usage of all these methods and calls, and to compare a run over the corpus of benign samples against some samples of known malicious JavaScript. Essentially, we were profiling the script by instrumenting the interpreter. And like I said before, we saw some benign uses of these obfuscation and evasion techniques.
This first graph here, and I apologize if you can't make out the titles, is four samples of malicious JavaScript, numbered one through four. We broke each down into a stacked bar based on the frequency of the calls we had hooked in the interpreter. The very bottom of each stacked bar, in light blue, is document.write. You can see document.write isn't called very frequently, once or twice or three times in these samples. The large stack above that is string instantiation, and string instantiation turns out to be very frequent in these malicious JavaScripts. In fact, to generate
usable graphs that really showed anything, we had to scale the string instantiation down by a factor of about 50. Calls to eval are indicated here, along with element instance calls on the document object and object instantiation.
To compare this visually with our samples from MySpace, we broke down all the JavaScripts from MySpace by domain name and took the top ten by the number of unique JavaScripts returned. On the far left, by far the most JavaScripts were hosted at the MySpace.com domain. These could be people's profiles, or they could be MySpace's own internal JavaScripts to manage login, cookie tracking, and so on. That goes all the way to rounding out the top nine, actually, because we had a tie at the bottom. She is very popular on MySpace; many links lead to her website, and she hosted several JavaScripts that we ran through our analysis engine. Of note also is photofile.ru, so apparently hosting photos on Russian networks is quite popular on MySpace, for whatever reason. Here you see that document.write is relatively more common. document.write is being called a lot more in these scripts, string instantiations are less frequent, and calls to eval are of a different magnitude.
Putting these two next to each other, I think you can visually see some of the patterns we found. The four stacked bars on your left are known malicious JavaScripts. The rest are the top nine domains in our MySpace run. It immediately jumps out at you that document.write is rarely called in these malicious scripts, but it's called all over the place in the benign JavaScripts. Another thing that struck us is that, relatively, string instantiation is happening much more in the malicious JavaScripts. I can speculate on why that is. Like we said before, the string splitting method is used throughout these malicious scripts, and the benign scripts don't seem to be splitting up their strings or instantiating new strings through concatenation nearly as frequently. Also, you see eval is called
more, relatively more frequently, in the benign samples than it is in the malicious samples. And object instantiation, the top bar in pink, is happening much more in these monkey chow malicious samples than in our MySpace spider run over here. When you download our tool from the website, the data this is based on is available in the form of a spreadsheet and a CSV file. If you email us, I'll also provide you a database dump if you're curious.
So what's the future of Caffeine Monkey? Well, the code is being made available on the research tool/blog section of our corporate website, because like most people in here, we're owned by somebody. There are a lot of directions this can move, and it'd be great if the community as a whole took it up, expanded on it, and saved everybody some time. It takes analysts more and more time to go through these scripts, especially if you're working in an IPS, IDS, or any sort of incident response capacity; finding out what really happened is going to take a long time. So potential things that can happen are inclusion in a web proxy of some sort; IDS/IPS, as a heuristics-based addition to signature-based platforms or yet another point of logic for heuristics-based platforms; or anything else. Another crazy inter-platform thing that could happen is a Firefox plugin that would run your JavaScript before it ran your JavaScript. Crazier things have happened. I'm not suggesting that one, but if anybody wants to do it, knock yourself out; it might be very useful, and you'd probably learn a lot in the process.
So moving on from there, we've got about seven or eight minutes for Q&A before we move into the smaller room, so if anybody has anything, please raise your hand. Yes?
You know the profile you guys do, can I show some of your videos?
I'm sorry? Yes, the logging shows, command by command, everything that has been hooked as it runs through the tool. It shows everything that happens for every, among other things. Oh, here, this is just a
selection of ones that we believe would show the best output. There are other ones that are output. We haven't fully put everything together into Caffeine Monkey, because Mozilla and IE both extended JavaScript somewhat past the basic engine approach, but a lot of things are in there that would be called. Sparkly shirt?
The question was: is the presentation on the CD? Yes, it is. It's the skeleton version of it; it's not full, but it's pretty close. Yes, the white paper should also be on the CD. You don't see the, we'll go, it should be downloadable from the same location as the tool, which again is right there. Sound guy or not. Right.
So the question that was asked is about application security in this context: it seems like you're going to have to have an application that allows this to run and that scans it beforehand. And that's exactly a lot of the problem we're seeing, that these are just run straight out beforehand. Your AV might catch it, it might not. With AV, it all depends; somebody takes the first hit. AV isn't based on preemptive detection, except with some of their heuristics models. So correct, that is correct. Some of this sandboxing technology that the gentleman up here is mentioning could be used, and should be used, in a lot of web browsers, and it probably will be like that in the future.
Right. Essentially, our web crawler was set up on one mid-level, recent-vintage Dell workstation. It was hanging off a residential broadband connection, and we gathered the eight gigabytes off MySpace over approximately three and a half days. I believe we had 20,000 HTML documents, 77,000 JPEGs, and 500 JavaScripts in that corpus. I don't have the numbers off the top of my head for the other runs, but that gives you an idea of how long that took.
What really concerns me is the heap spray and Feng Shui attacks. Basically, they yield directly into remote code execution through the browser to detect, and they're quite reliable. Based on my understanding of heap spray, and
I'm not the expert on this, essentially the JavaScript, since it's running in the browser, is allocating heap space in the browser's heap when you're doing things. So you could spray a particular NOP sled that is a valid heap address, followed by your shellcode, and then you need to combine it with one more thing, the final punch, to essentially overwrite a virtual function pointer. It's going to redirect you back into the heap. When you overwrite that function pointer, like I said, it's a heap-address NOP sled that's going to jump you, or trampoline you, over to your shellcode. From what I can tell, the HeapLib tool that Alexander Sotirov released is quite reliable, and it's improving on the initial heap spray attacks.
Right. The basic SpiderMonkey you're going to download doesn't have a document object; it doesn't have a navigator object or a location. So we implemented the document object, we implemented navigator, we implemented location, the screen object; I can't recall the others. So we were adding in functionality, and we were also hooking a lot of the existing calls. We hooked the internal function that does string instantiation, we hooked the escape and unescape functions, we hooked the internal function instantiation. In that method, we didn't care about the syntax or the semantics of the script as it's presented; we're essentially hooked into the interpreter and profiling the dynamic execution. We weren't the first to come up with this idea: Didier Stevens had a blog post where he was exploring this idea, and there are a couple of others who had mentioned it. I think we may be the first that added that much more functionality to it. We'll be happy if you guys check out the code; if you've got improvements, we'd like to see them. So it's out there for anybody to use.
If that's all the questions, we may be moving over to the Q&A track room across the hallway. Thank you.
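The profiling idea described in the talk, hooking key calls and comparing how often each one fires, can be illustrated with a minimal Python sketch. This is not the Caffeine Monkey code, which hooks SpiderMonkey internals. The `CallProfiler` class, the stand-in functions, and the toy script below are all hypothetical names invented for illustration, and the counts they produce are not data from the talk.

```python
from collections import Counter
from functools import wraps


class CallProfiler:
    """Counts calls to hooked functions (hypothetical helper, not the real tool)."""

    def __init__(self):
        self.counts = Counter()

    def hook(self, name):
        """Return a decorator that records one call under `name` per invocation."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                self.counts[name] += 1
                return func(*args, **kwargs)
            return wrapper
        return decorator

    def relative_frequencies(self):
        """Each hooked call's share of all recorded calls."""
        total = sum(self.counts.values())
        if total == 0:
            return {}
        return {name: n / total for name, n in self.counts.items()}


profiler = CallProfiler()


# Stand-ins for interpreter internals; a real tool hooks these inside the engine.
@profiler.hook("string_instantiation")
def new_string(value):
    return str(value)


@profiler.hook("document.write")
def document_write(markup):
    return None  # a real hook would log the markup being written


@profiler.hook("eval")
def js_eval(source):
    return None  # a real hook would log the source before evaluating it


def run_toy_script():
    """Mimics string splitting: many small strings are joined, then evaluated."""
    parts = [new_string(p) for p in ("doc", "ument", ".wr", "ite")]
    payload = new_string("".join(parts))
    js_eval(payload)


if __name__ == "__main__":
    run_toy_script()
    for name, share in sorted(profiler.relative_frequencies().items()):
        print(f"{name}: {share:.2f}")
```

Comparing relative shares rather than raw counts lets scripts of different sizes be placed side by side, which is roughly what the stacked bar graphs in the talk do visually.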