In case it's a little too early, the talk we're having right now is HTTP IDS Evasions Revisited. I'm Dan Roelker, and I guess we'll get started now. Maybe you're asking yourself, who is this guy? Well, just a quick background on what I've done. Currently, I work at Sourcefire Incorporated, basically the corporation behind Snort now. I came on board with Marc Norton to redesign the Snort engine for high-speed networks. We did that in about two months, and you guys might have seen that in the release of Snort 2.0 in the September-October timeframe. What I did after that was work on some protocol decoders for Snort, most notably a new HTTP decoder that's going to be coming out in the next month or two, but don't quote me on that. Before that, I was a lead developer on the Dragon IDS. I did, again, some application protocol decoders, plus high-speed packet capture and event correlation with Randy Taylor. So I've been working with application protocol decoders for two, three years now, and I want to take some of the knowledge I've picked up over those years and bring it over to you guys.
How many of you in the audience are actually involved with IDS, either developing IDSs, deploying IDSs, or managing IDSs? How many people in here? How many of you guys actually compare the differences between the IDSs, try out different tests? All right, good. Well, I hope you'll be able to take some information away with you today, along with some tools that I'll be releasing after the talk. They're actually probably on the DEF CON website now, or if you want to, you can download the most recent versions at idsresearch.org. Maybe you're asking yourself who these strange-looking people in the front row seats are. Well, they work at Sourcefire too, and they have free Snort t-shirts available for people afterwards.
So if you want to talk to some of these guys up here, you guys want to raise your hands? If they didn't bring enough Snort t-shirts, I'm sure you could give them your name and mailing address, and I'm sure they can send you some. So anyway, just wanted to plug that, so hopefully my expenses will now be paid.
So before we begin, anyone with a laptop, or maybe just a really great memory: if you're running Linux, what you may want to do here is a "man ascii" to pull up an ASCII map. With a lot of the URL encodings that we're going to be doing, it's going to be easier for you to reference what we're encoding. If you're just running Windows and you trust the DEF CON wireless network, then you can Google for an ASCII map as well and pull that up. As we actually start going through the different URL encoding types, I think you'll find that pretty handy.
Let's go through a quick presentation outline here. We're going to talk a little bit about the history of HTTP IDS evasions. Why were they important? What tools were developed that kind of started this all out? We're going to talk a little bit about the evolution of the IDS and HTTP evasions. We're going to look at probably the two main types of IDSs, protocol analysis IDSs and pattern matching IDSs: where they started, how they evolved with HTTP evasions, where they are now, and where that leaves HTTP evasions currently. Then we're going to talk about two general types of application layer evasions. I want to talk about the general types because, for one, I'm hoping they're easier to understand than some of the more detailed applications we're going to talk about. But also, if you understand these general types of application layer evasions, you'll be able to do your own IDS evasions, look at protocols yourself, and try to figure out how you can mangle some of the different options there. So we're going to talk about these two.
And the first of the two we're going to discuss in this presentation is invalid protocol parsing. Is anyone familiar with the tool SideStep by Robert Graham? Couple people. Well, basically that tool helped bring about this type of application layer evasion. It works like this: if you can fool the IDS into thinking that a certain protocol field is something else, then you've effectively evaded the IDS. For example, if you're parsing the HTTP protocol and you think the HTTP version is actually the URL, then you're not going to decode it correctly and you're not going to be able to match malicious URLs.
The second type we're going to talk about is invalid protocol field decoding. This means that even if you parse the protocol field correctly, if you don't know what data is actually in that protocol field, then you won't be able to find what you're looking for. For instance, if you're looking at the SNMP protocol and you don't really understand the ASN.1 encoding, then you're not going to be able to match certain things. Same thing with the URL: if someone does a basic hex encoding on the whole thing, but you're expecting to see plain text, like you're expecting to see /cgi-bin/phf, and instead you're seeing percent this, percent that, you're not going to be able to detect it. So those are the general types of application layer evasion that we're going to talk about.
Then we're going to go through the two types and talk about practical applications. For the protocol field decoding, we're going to talk about nine types of URL encoding. We're going to start out pretty easy with the hex encoding and get a little more complex from there. Two of these encodings are valid on both Apache and IIS, but the rest are really thanks to Microsoft, which adds a little more complexity. So we're going to go through all those and give you some examples.
Hopefully you'll be able to take away from the talk all these different types of encodings. But if you're still a little hungover from last night, you can always look at the tools I'm releasing, because they cover all these encodings as well. So you'll have a little help with that. From there, we're going to move into the protocol parsing. We're going to talk about some new evasions, like request pipeline evasions and how they work, and we're going to talk about some parameter content encoding evasions. For those cases where you're accessing a database, or you have some malicious parameters where you're trying to exploit some CGI errors in the parameter field, you want to make sure that an IDS can't see that if you're trying to evade it. This evasion helps you with that. And then after that, it's basically tool time. We'll talk about the different tools that are being released with this talk.
So I guess let's get started here. A little history on HTTP IDS evasions. What attacks are some of the most common and effective attacks in the Matrix? Yeah, I know Trinity used an SSH exploit to shut down the emergency power grid. But trust me, it's web attacks. They come in all forms and sizes. You don't have to wait around for someone to discover a buffer overflow in your favorite web server. There are lots of different ways you can do it. You've got CGI errors. You've got database hacking. You've got cross-site scripting attacks, all that kind of stuff. And most companies, governments, and banks all have some sort of web server available. So these are really some of the most common and effective attacks that you can do. And because of this, IDSs really, really, really want to catch these things. Otherwise, they look kind of foolish if they go into bake-offs and someone runs a web attack that they should catch. But hey, guess what? They didn't.
With most of the web attacks, if you've been noticing recently, when they come out it takes about two days before they get turned into a worm that shuts the internet down for a couple of hours. So we want to make sure we can catch those things.
OK, now we're going to talk a little bit about the tools that helped bring about HTTP IDS evasions. You can never get through this talk without mentioning Whisker. This was perhaps the first and most effective web scanner that employed IDS evasion techniques. I think there are about 10 or so. When Whisker was released with the IDS evasions, it really shut down some IDS companies. They were really scrambling to catch the same old attacks they had always caught before. I did a little work with Dragon to help with that. A lot of the IDS vendors really had to catch up.
I want to make a distinction here between Whisker and SideStep. The Whisker evasions really played off of either web server quirks or quirks within the IDS's HTTP decoders. For instance, one of the Whisker attacks is that you embed an uppercase "HTTP/" version string in the URL. That was done in hopes that the IDS would look just for that string, HTTP/, and then say, OK, I found that, so I think the URL ends here. That falls into the invalid protocol parsing evasion we discussed a little bit earlier. But a lot of the other ones just had to do with web server quirks. For instance, Apache would accept a tab instead of a space to separate the GET method, the URL, and the HTTP version. So they're very effective attacks, but they're much different from the tool SideStep. I bring up SideStep here even though it's not known for web attacks or HTTP evasions, because when the tool was released, it brought about a very good and very influential idea in how to evade IDSs. Basically, Robert Graham used the application protocols against the IDSs.
It wasn't working off of quirks in the actual protocol decoders in the IDS. It really delved into the application layer protocol and exploited, I guess I could say, some of the lesser-known characteristics of that protocol. The two types of evasions we're going to talk about really stem from the SideStep type of evasion. Basically, we've got the request pipeline and the parameter encoding, and they come from this because they're utilizing aspects of the HTTP protocol itself.
So, moving on. Evolution of IDS and HTTP evasions. We'll start from the beginning here. You basically have two different types of IDSs, the main ones anyway. You've got the kind that says protocol analysis will catch your attack, and you've got the kind that says pattern matching will detect what you want it to. At the beginning, before any evasions came out, they behaved kind of similarly. For instance, URLs at that time weren't encoded with any type of encoding. So the protocol analysis IDSs would work by parsing the protocol, finding the URL, and then doing some pattern matching to look for malicious URLs, whereas the pattern matching IDSs just searched the whole payload to see if there was any malicious URL content anywhere in it.
However, when the evasions did come out, there was a stark contrast between the two. The protocol analysis IDSs were one step ahead, because they had already been parsing the protocol. All they had to do was implement the decoding algorithms to decode those encodings correctly. The pattern matching IDSs first had to write a protocol decoder for the protocol being evaded, and then had to implement the decoding. Once that was done, and given a couple of years, the two converged again, and they behave very similarly now. So that means that, since they behave similarly, we'll be able to use the types of evasions I'm discussing here against both types.
Current state of IDS and HTTP evasions. Where's the technology now? It's been kind of stagnant for a couple of years. What does that mean for the evasions we're going to talk about? It means we can use these against both types of IDSs. So just keep that in mind. To give you some specific examples, you could use them against RealSecure, Snort, Dragon, and any of the other open source IDSs as well.
So we're going to talk a little bit more now about the two general types of application layer evasion. Invalid protocol parsing: like I said, that stems from SideStep. It really comes from the fact that most IDSs are not exactly pros at protocol analysis. What you do for this type of evasion is research the application protocol. Look for areas of the protocol that you think are a little obscure, and test whether your IDS actually interprets those correctly. These are very effective evasions because they're not trivially fixed. When one of these comes out, a lot of times IDSs have to completely redesign and re-implement the protocol decoder, because there are fundamental flaws in their parsing algorithm.
The invalid protocol field decoding is what I said before: even though an IDS parses the protocol and finds the field correctly, if it doesn't understand the data that's there, then it can't interpret it correctly. For instance, one of the things we'll be talking about is what happens when you do a POST request and you encode the POST payload with base64 encoding. Does your IDS go down to the payload, decode that using a base64 algorithm, and then match against it? Probably not. That's how the protocol field decoding evasions work. They're both very effective. However, invalid protocol parsing is probably the more serious of the two, because those evasions in general mean some kind of redesign of the IDS's protocol decoder.
All right, so now we get into the more practical application side of this. If you haven't got that ASCII map out in front of you, don't worry, I'm going to be helping you out a little bit. We're going to try to get through these URL encodings fairly quickly, because they're not the most exciting things for me to talk about or for you to listen to. Hopefully you'll be able to get the main gist from the presentation. And again, given what I'm going to bring up in this talk, you should have the tools to go do additional research on your own and know what to look for.
The first one we're going to start out with, since we're starting simple and getting more and more complex, is your good old hex encoding. It's been around for years. It's been an HTTP standard for years. Most IDSs handle this, and most web servers handle this as well. So this is pretty much a universal encoding. Is it that effective against IDSs? Not at all. But I want to start here because some of the later encodings we're going to talk about build on it. The concept is that you look up in your ASCII map the character you want to encode. You look up the hexadecimal value, not the octal or the decimal, but the hexadecimal. For example, what we're going to be using throughout this presentation is the capital letter A. The hexadecimal value for that is 41. So how hex encoding works is that in the URL, you escape it with a percent and then you put the characters 4 and 1. When that gets sent over to the web server, or an IDS intercepts it, it sees the %41, knows what to do with it, and decodes that into a capital letter A.
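To make the hex encoding concrete, here is a minimal Python sketch. The helper name `hex_encode` is made up for this illustration and is not taken from the talk's tools; decoding uses the standard library's `urllib.parse.unquote`.

```python
from urllib.parse import unquote


def hex_encode(text: str) -> str:
    """Percent-encode every byte of an ASCII string, e.g. 'A' -> '%41'."""
    return "".join(f"%{byte:02X}" for byte in text.encode("ascii"))


encoded = hex_encode("/cgi-bin/phf")
print(encoded)           # every character becomes %XX
print(unquote(encoded))  # a standard URL decoder recovers the plain text
```

A decoder that skips this step sees only percent signs and hex digits, which is why a plain-text signature no longer matches.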
Pay attention a little bit to this encoding, though, because, like I said, we're going to use it through probably the next four types of encodings. So you guys should be familiar with this, especially those of you who are more in the IDS field.
The next one we're going to talk about is the double percent hex encoding. You probably saw this in some of the Bugtraq posts that came out about it a couple of years ago. Basically, this is the same as the normal hex encoding, except you're encoding the percent first, and then you follow that percent with the hexadecimal value of the byte you want to encode. For example, if you've got that ASCII map out in front of you, you look up percent; its hexadecimal value is 25. So to hex encode that normally, you write %25. You then follow that with the byte you want to encode. In this case it's A again, and the hexadecimal value is 41. So you've got %2541. And thanks to Microsoft IIS, you get a double decode. It goes through it once and then goes through it again. On the first pass, the %25 gets turned into a percent. And then on the second pass, the %41 gets turned into an A.
Next, double nibble hex encoding. Again, this works on Microsoft IIS. Basically, each hexadecimal nibble of the byte we want to encode is itself encoded using the normal hex encoding method. How many of you were wondering how many "encodings" I could get into one sentence? All right. What that means, going back to our example with the A, is that we've got the hexadecimal 41. To encode this using the double nibble hex encoding, we look up 4 in the ASCII map. Its hexadecimal value is 34. We look up the 1. Its hexadecimal value is 31. And we encode those using the normal hex encoding, so %34 and %31. And we make sure we put that other percent right out in front of it, giving %%34%31. So on the second pass, it'll pick that up.
So on the first pass, the %34 gets turned back into the 4 and the %31 into the 1, which leaves us with %41, and %41 turns into A.
OK, first nibble hex encoding. Anyone want to guess what web server? That's right, Microsoft IIS. This encoding is very similar to the double nibble encoding; the only difference is that only the first nibble is encoded. In this case, the first nibble of A is a 4. So we encode that as %34 and just leave the 1 as 1. You get %%341. On the first pass, the %34 turns into 4, which leaves %41, and by now, hopefully, everyone knows that %41 turns into A.
All right. Second nibble hex encoding, IIS. Just like the last one we talked about, except the second nibble is encoded and the first one's left alone. So we've got the percent, the 4, and then %31, because the hexadecimal value for 1 is 31. That gives %4%31. On the first pass, the %31 turns into a 1, leaving %41, which turns into A.
OK, quick-break encoding. I'm just going to try and make a joke here, break it up a little bit. I guess that didn't go over so well. So anyway, we've got about four more URL encodings to go. It gets a little worse now before it gets better. After these encodings, we're going to go into the invalid protocol parsing evasions, which are a little easier to understand and not so much into the bits and the bytes.
OK, UTF-8 encoding. I'm going to try and go through this fairly quickly. If this is the first time you've heard of UTF-8 or tried to decode it yourself, you're probably not going to get all of it in this presentation. But go to unicode.org, things like that, and you'll be able to learn more about it if you're interested. Basically, UTF-8 is a way to encode values that are greater than the single-byte maximum of 255. Most web servers will accept UTF-8 values up to 64K, that is, 65,535. We're going to talk about two UTF-8 byte sequences here, the two-byte sequence and the three-byte sequence.
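Stepping back to the four double-decode variants just covered (double percent, double nibble, first nibble, second nibble), the sketch below builds each form for a single character and applies two decoding passes. It is an illustration only: the function names are invented here, and `decode_once` is a simplified stand-in for one decoding pass, not IIS's actual routine.

```python
import re


def hex_encode(ch: str) -> str:
    """Normal hex encoding of one ASCII character: 'A' -> '%41'."""
    return f"%{ord(ch):02X}"


def double_percent(ch: str) -> str:
    """Encode the percent sign itself, then append the hex digits: '%2541'."""
    return "%25" + f"{ord(ch):02X}"


def double_nibble(ch: str) -> str:
    """Hex-encode both hex digits of the byte: '%%34%31'."""
    hi, lo = f"{ord(ch):02X}"
    return "%" + hex_encode(hi) + hex_encode(lo)


def first_nibble(ch: str) -> str:
    """Hex-encode only the first hex digit: '%%341'."""
    hi, lo = f"{ord(ch):02X}"
    return "%" + hex_encode(hi) + lo


def second_nibble(ch: str) -> str:
    """Hex-encode only the second hex digit: '%4%31'."""
    hi, lo = f"{ord(ch):02X}"
    return "%" + hi + hex_encode(lo)


def decode_once(text: str) -> str:
    """One decoding pass: replace each well-formed %XX, leave anything else alone."""
    return re.sub(r"%([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), text)


for encode in (double_percent, double_nibble, first_nibble, second_nibble):
    wire = encode("A")
    first_pass = decode_once(wire)
    second_pass = decode_once(first_pass)
    print(f"{encode.__name__:15} {wire:10} -> {first_pass} -> {second_pass}")
```

An IDS that decodes only once stops at `%41`, while a server that decodes twice ends up with `A`.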
How the UTF-8 encoding works is that the high-order bits of the first byte in the sequence give the number of bytes in the whole sequence. As you see in the two-byte sequence, the first two high bits are set, followed by a 0. What that tells the algorithm is that this is a two-byte sequence. The bits after the 0 are part of the value we're going to be decoding. Then the extension byte after that has the high bit set, followed by a 0, so you have six more bits. You add those together and you get 11 bits total, so a two-byte UTF-8 sequence gives you a max hexadecimal value of 0x7FF. The three-byte sequence is very similar. Again, you've got the three bits, 1, 2, 3, that tell you it's a three-byte sequence, followed by a 0. So you've got four bits in the first byte, six bits in the second, and six bits in the third. Together that gives you 16 bits, for a max hexadecimal value of 0xFFFF.
And the final showdown here: UTF-8 encoding is supported by both the Apache and IIS web servers. We're kind of breaking out of the mold here, not just IIS, but getting Apache in there. I also wanted to let you guys know that the newer UTF-8 and Unicode standards don't really allow you to encode ASCII characters in multi-byte UTF-8 sequences. The current versions of Apache now follow this, but I don't think IIS does yet.
So we're going to do a quick example here. Again, this is going to be our example of A, which is the hexadecimal value 41. What this looks like in UTF-8 is %C1%81. Remember, any time you want to encode a byte like this in a URL, you usually have to escape it with a percent. In this case, as you see here, the C1 is the first byte and the 81 is the second byte. And if you go one step further, I'm X-ing out the bits that belong to the UTF-8 framing so you know what to do, which leaves us with 1000001, which is a hexadecimal value of 41, which equals A.
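The bit layout above can be checked with a short Python sketch. The function names are made up for this illustration, and `lenient_decode` only models a decoder that does not reject overlong forms; it is not any particular server's code.

```python
def overlong_2byte(cp: int) -> str:
    """Encode a code point up to 0x7FF as 110xxxxx 10xxxxxx, percent-escaped."""
    if not 0 <= cp <= 0x7FF:
        raise ValueError("two-byte sequences hold at most 11 bits")
    b1 = 0xC0 | (cp >> 6)      # 110 + top 5 bits
    b2 = 0x80 | (cp & 0x3F)    # 10  + low 6 bits
    return f"%{b1:02X}%{b2:02X}"


def overlong_3byte(cp: int) -> str:
    """Encode a code point up to 0xFFFF as 1110xxxx 10xxxxxx 10xxxxxx."""
    if not 0 <= cp <= 0xFFFF:
        raise ValueError("three-byte sequences hold at most 16 bits")
    b1 = 0xE0 | (cp >> 12)
    b2 = 0x80 | ((cp >> 6) & 0x3F)
    b3 = 0x80 | (cp & 0x3F)
    return f"%{b1:02X}%{b2:02X}%{b3:02X}"


def lenient_decode(seq: bytes) -> int:
    """Decode one 2- or 3-byte sequence without rejecting overlong forms."""
    first = seq[0]
    if first >> 5 == 0b110:
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, value = 3, first & 0x0F
    else:
        raise ValueError("not a 2- or 3-byte lead byte")
    for byte in seq[1:length]:
        if byte >> 6 != 0b10:
            raise ValueError("bad continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


print(overlong_2byte(ord("A")))                       # the talk's example form
print(overlong_3byte(ord("A")))
print(chr(lenient_decode(bytes([0xC1, 0x81]))))       # lenient decoder yields A

try:
    bytes([0xC1, 0x81]).decode("utf-8")               # a strict decoder refuses
except UnicodeDecodeError as err:
    print("strict decoder rejects it:", err.reason)
```

The difference between the lenient and strict paths is the same gap the talk describes between servers that still accept multi-byte ASCII and those that follow the newer standard.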
I know that's pretty fast, but if you're really that interested in Unicode and UTF-8, you can Google your heart out.
All right, back to Microsoft IIS again. Remember how I told you that you had to escape bytes with a percent? Well, guess what? You really don't have to do that with Microsoft. All you really have to do is send the byte. So for example, take the UTF-8 sequence we just talked about, which was C1 81. All you do is send the byte C1, followed by the byte 81, and IIS will interpret that. What makes it kind of cool as well is that you can still encode the first byte with a percent. So you can send %C1, and then send just the bare byte 81. A lot of IDSs don't really mix and match these things. We'll get to mismatch encoding a little bit later, but it's something to keep in mind, because when you mix these encodings up, it really works well against different IDSs.
All right, %u encoding. We're almost done with the encodings. This is basically Microsoft's little way of encoding Unicode code points up to 64K. Its syntax is %u followed by four hex characters, which equals four nibbles, or two bytes. How you encode it is that you actually put in the Unicode value. For example, if we want to do A, its value is hexadecimal 41. You put zeros in front of that, and you end up with %u0041. Don't get confused here. I've seen a lot of people do this. They see, for example, that we did the UTF-8 encoding with C1 81. A lot of people say, oh, well, that's the code point for A, so I'll just put C1 81 in here. So they'll have %uC181. That's wrong. What you really put in here is the actual Unicode value. And this is very effective against Microsoft, because what they've done for us is that, with a certain code page, an ASCII character, say A, may map to something like eight different Unicode values. So you'll have values like 0041, 0100, 0102, 0104, et cetera. You can just put any of those in there.
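A small Python sketch of the %u form follows. The names `percent_u`, `decode_percent_u`, and `BEST_FIT_EXAMPLE` are invented for this illustration, and the table holds only the three extra values the talk names for A; a real server's mapping depends on its code page, so treat the table as an assumption.

```python
import re

# Illustrative only: code points the talk says can map to 'A' on some code pages.
BEST_FIT_EXAMPLE = {0x0100: "A", 0x0102: "A", 0x0104: "A"}


def percent_u(ch: str) -> str:
    """Write a character as %u plus its four-hex-digit code point: 'A' -> '%u0041'."""
    return f"%u{ord(ch):04X}"


def decode_percent_u(text: str, best_fit=None) -> str:
    """Decode %uXXXX escapes, optionally folding code points through a best-fit table."""
    table = best_fit or {}

    def replace(match):
        code_point = int(match.group(1), 16)
        return table.get(code_point, chr(code_point))

    return re.sub(r"%u([0-9A-Fa-f]{4})", replace, text)


print(percent_u("A"))                                # %u0041
print(decode_percent_u("%u0041"))                    # A
print(decode_percent_u("%u0100", BEST_FIT_EXAMPLE))  # A, but only via the table
print(decode_percent_u("%uC181") == "A")             # False: UTF-8 bytes are not a code point
```

The last line shows the mistake the talk warns about: pasting the UTF-8 bytes C1 81 after %u names a different code point entirely.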
So with the same encoding, %u0041 equals A. You could also do %u0100, which also equals A. Or %u0102, which also equals A. So you get a lot of those. You can thank Microsoft for that, because if you have a different code page, the ASCII characters are mapped to different code points as well. So it's pretty convoluted.
And now we're on the last encoding. This is actually the most interesting one. As I said a little bit before, what mismatch encoding does is take the other eight encodings we just talked about and put them together. At the end here, you can see I've got a little quiz for anyone who's interested. You can write that down. It translates to one letter. So give it your best shot and see what you can translate it down to. And if you want to double check with me afterwards, just come up and ask me. But yeah, mismatch encodings are very effective against IDSs, and against most IDS analysts too. Even if they catch the packet, do you think they could decode that? Probably not. So just keep that in mind.
All right, so we're done with the encodings now. We're on to the practical applications of invalid protocol parsing. To me, these are actually the more interesting IDS evasion types because they're a little more general. I think a lot of people can understand them, and they're very, very effective against IDSs. If your IDS doesn't catch one, the vendor is pretty much going to have to really think about how they want to redesign the protocol decoder. We're going to talk about the two types that I mentioned previously, the request pipeline evasions and the parameter content encoding evasions. Both of these, like I said, are pretty effective.
Request pipeline. So what is a request pipeline? Anyone know? Raise your hand if you know what a request pipeline is. Anyone? Okay, well, basically it's a new feature in HTTP 1.1.
It basically allows your web client to send multiple requests in a single packet. This is different from the keep-alive option, because with keep-alive, you send a request with the keep-alive header and the web server will keep that connection open. Then you can send another request, say, a couple of minutes later. Pipelining is different because you actually send the requests in the same packet. And as we look at that, it's actually very effective in evading IDSs.
How does it work? Well, most IDSs assume that one HTTP request packet is going to contain one URL. But with request pipeline evasions, that's not the case anymore. You can put as many in there as you want. So for the first couple of URL requests, we send some pretty benign requests. We'll look at slash, the index, we'll look at the content, maybe download the logo GIF, or whatever we want to do. And then in the last URL request, you put the malicious URL you want to access. If you leave it in plain text, it's probably not going to work for you, because what a lot of IDSs still do at this point is pattern matching over the whole packet for certain malicious content. So what you want to do is encode that last malicious URL with your favorite type of encoding. Hex encoding works just fine, and it's universal for all web servers. You encode it like that, and most IDSs won't ever decode that URL.
So here's an example. As you see, we've got the benign requests. Always remember when you're doing a request pipeline that you have to use HTTP version 1.1. And because of that, you always have to have the Host header, because that's mandatory in the HTTP 1.1 protocol. If you don't put the Host in there, odds are the web server's going to say, I don't know what you're talking about. So remember to put the Host header in there.
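As a rough sketch of what such a pipelined payload looks like on the wire, the Python below only builds the bytes and sends nothing. The host name `www.example.com`, the paths, and the helper names are placeholders chosen for this illustration, intended for testing an IDS in a lab you control.

```python
def hex_encode_path(path: str) -> str:
    """Hex-encode every character of a path except the slashes."""
    return "".join(ch if ch == "/" else f"%{ord(ch):02X}" for ch in path)


def build_pipeline(host: str, paths: list[str]) -> bytes:
    """Concatenate several HTTP/1.1 GET requests so they travel as one payload."""
    requests = [
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"  # Host is mandatory in 1.1
        for path in paths
    ]
    return "".join(requests).encode("ascii")


payload = build_pipeline(
    "www.example.com",                   # placeholder test host
    [
        "/",                             # benign
        "/content.html",                 # benign
        hex_encode_path("/cgi-bin/phf"), # last request, hex-encoded
    ],
)
print(payload.decode("ascii"))
```

An IDS that decodes only the first URL in a packet, and otherwise pattern-matches the raw bytes, never sees the plain-text path in the third request.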
So as you see, we just access the index and content.html, and then we encode our last one. If an IDS happens to be looking for the content /cgi-bin/phf, it's not going to find it there. It'll just see a bunch of percents. But that's what the encoding turns into, /cgi-bin/phf, and you can access things like that.
Okay, parameter content encoding evasions. This evasion's particularly effective when you're sending any sort of malicious parameters. If you're doing database hacking, you're putting in some select statements that aren't really that good, like select all credit card numbers or whatever. If you're trying to evade an IDS, you want to make sure that's not in plain text. It also works well against CGI input errors, with the parameters at the end, or if you find an unpatched IIS server, and there are still some out there, you can use your cmd.exe with parameters.
So how does this one work? Well, you could do a GET request, put the parameters after the question mark, and encode them with whatever URL encoding you want. But most IDSs decode the URL and the URL parameter field too, so you can't really get away with that anymore. What you do now is change that GET request into a POST request, and you put the parameters down after the header section, in the HTTP payload. So we've moved the parameters into the payload, but it's still in plain text, so an IDS is most likely going to find the malicious content it's looking for.
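A minimal Python sketch of that GET-to-POST move follows. It also base64-encodes the body and labels it with a Content-Encoding header, which is the next step the talk describes. Everything here is illustrative: the host, the `Qalias=test` parameter string, and the function name are placeholders, nothing is sent, and whether a given server actually honors `Content-Encoding: base64` is an assumption carried over from the talk's example.

```python
import base64


def build_encoded_post(host: str, path: str, params: str) -> bytes:
    """Build a POST whose parameters sit in the body, base64-encoded."""
    body = base64.b64encode(params.encode("ascii"))
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Encoding: base64\r\n"      # header value as used in the talk
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


request = build_encoded_post("www.example.com", "/cgi-bin/phf", "Qalias=test")
print(request.decode("ascii"))

# The parameters no longer appear in plain text anywhere in the request.
print(b"Qalias" in request)
```

To match on the parameters, an IDS would have to notice the header, find the body, and run the matching decoder before applying its signatures.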
So we make it a little tougher by using the Content-Encoding header in the POST request and picking your favorite type of encoding. You could zip it, or you could base64 encode it, or whatever you want to do. You then run your parameters through, say, base64 encoding and put the result into the payload. So what an IDS now has to do is say, okay, this is a POST request. What type of encoding is it using? And then it has to go down to the payload and decode that. IDSs just don't do that right now, because it would be very time consuming and would lead to potential DoS attacks against the IDS, because there's a lot of valid POST traffic. It's a little scary.
So this is what it would look like. You've got the POST to /cgi-bin/phf. Again, people see that phf and say, oh, it's an attack. But say, for example, you're accessing a database like Oracle. You can't really write signatures for Oracle database access, because there's a lot of valid traffic going across there, so you'd just get a lot of false positives. So if the attacker is accessing a valid resource on the web server, you're not really going to be able to detect this attack. As you see, we set the Content-Encoding to base64, we set the Content-Length to 15, and then the parameter section for /cgi-bin/phf is, for example, Qalias, or whatever you want to do. Again, an IDS will be looking for Qalias. It's not going to find it there in the base64 encoding.
All right, so that's the more formal part of this talk. Now we're going to talk a little bit about the different tools that are available for you guys to use. Basically, there are three different ones. First is encoder.c, which does all the different encodings we just talked about. This is a command line version. It runs on both Windows and *nix, whatever you want to use: Linux, BSD, whatever.
And then we've got HTTP Chameleon, which was written by Marc Norton, my colleague. He basically wrapped up the functionality of encoder.c with a really nice Windows GUI. And it's okay: just because you use the GUI doesn't mean you're less elite than anyone else. I actually find it pretty useful, because it keeps a lot of the history around and keeps a lot of the different templates that you can use, and hopefully it'll make this a lot more accessible to everyone.
And then the final one we've got is the Microsoft UnicodeMap.c. What that basically does is you run it on your Windows box, your IIS server, whatever you want, and it dumps all the code pages and the code points that are on your system. Maybe that's just for your information, if you want to know what code pages your system's using. It'll also be very useful when the new Snort HTTP decoder comes out, because the new configuration allows you to specify certain code pages and code points for individual HTTP servers, so you can make sure you catch the specific attacks. To give you an example, a lot of people try to decode Unicode code points sent to Chinese IIS web servers using the American code point mappings. And that doesn't work, because A equals different code points on the Chinese web server. So if any of you guys are administering or watching web servers in different places, say in Hungary or China and America, it's good to know this, because all a lot of IDSs do is default to the American code page, and that's wrong. So I just wanted to bring that up.
So we'll talk a little bit now about the tools. We'll start off with encoder.c. This is a command line tool that allows very fine-grained encoding of HTTP URLs. You can encode each character in a different type of encoding if you want.
You can throw in directory traversals, multiple slashes, backslashes, whatever you want to do. Also, in case you don't have Netcat on your machine to pipe what this prints out across the wire, there's some simple socket support built in. So you can just port this program over and you'll be able to do some research on web servers or do some IDS testing. And I would suggest that if you do run Windows, or you like Windows, use HTTP Chameleon instead. It's a lot more useful, as you'll see, because I'll do a little demonstration here in a little bit. Okay, encoder.c usage. Basically you've got three different ways you can encode text. You can encode a single character by just doing slash a with the character, where slash a is the encoding type; slash a in encoder.c happens to be ASCII encoding, or hex encoding. If you want to encode a whole string, you basically just put the string in the braces there. And if you want to do a code point that's outside the range of a normal ASCII character, say you want to encode 1FF or something, you just put the code point in there. Remember, this is kind of like C syntax, so if you're using a hexadecimal value, put a zero and a lowercase x in front of that value to designate that it's actually hex. Okay, HTTP Chameleon, written by my colleague Mark Norton, with the Windows GUI. I think it's much more usable. And I guess we'll do a quick demonstration here with it so you guys can check this out. This is pretty much what the GUI looks like here. As you can see, it's got an about box and a help box. That help's actually pretty advanced, so download this at your leisure and go ahead and read through it. It gives you some quick starts, how to use different things, how to save some templates for yourself. It also explains the different types of encodings here.
Like, for example, first nibble, slash I, et cetera, things like that. How it works is this way. We'll take the basic Unicode attack from years ago; we'll select that one. It comes down here, and this is the basic attack. It's basically a template, so you can decide how you want to encode it. So, for example, you could say, okay, scripts. I think many IDSs will be able to find that, and that shouldn't be normal access. So we could do a short UTF-8 encoding. So we'll double-click that. It pops up what I highlighted; I didn't highlight the whole scripts, just a little bit of it. And then you click okay and it puts in the encoding for that. Then, for example, we've got winnt/system32. Let's maybe encode that a little bit, obscure it a bit. Maybe use double hex encoding for that. Double-click on there, it comes up, you click okay, and it puts that one in there too. Then there's cmd.exe; you definitely don't want that in there. So you do this one and maybe do a mismatch encoding on that. And we'll say, okay, fine. All right, so you kind of have that. Then once you're ready, you click encode and up comes your encoded URL here. So anyone want to try and read that out for me now? All right, so basically you've got all that, and you're hoping at this point that your IDS can actually decode that. So now you put in your HTTP host and you can send the request. The HTTP response box spits out what you got back, so you can tell if it was successful or not. It's kind of interesting; try it out against your home web server or some of your test web servers or whatever to see what you can get. So it works like that. The other thing that's kind of nice is, let's see here, for example, if you do this and get this up here, the Unicode Explorer.
I think it's having some problems with the XP that I'm running here, which, by the way, is my wife's laptop. The company I work for couldn't give me one; I guess they didn't have any laptops available. So, mad props to GW Law. So basically you've got a couple of different things here; let's see if this will come up. Sometimes, sometimes. So we've got this. Now I want to explain... oh, not responding. Oh, I just rooted my box. I'm cool. So let's see here. Okay, wait, Unicode Explorer. So basically what this does is, you've got your code pages you can select. These are all the code pages that are currently on XP. You've got, for example, Latin 1, which is the default on all American servers, so click that one. And now you can go ahead and pick your favorite character you want to encode. Remember I was talking about those different values for A? Well, there you go. Not only do you have 0x41, but you've got, like, six other ones or whatever. So feel free to pick one; you pick one here, it comes up. And then you basically have a reference as well. You can say, well, what other code pages on the system also refer to code point 0102 as A? So you can scroll down here and say, oh, okay, cool. So you can see that you've got that. I also wanted to give you an idea here of the request pipeline. Clear these out here. Okay, here we go. I've actually encoded it first because it's a little easier to read. As you see here, we've got the first URL request, normal. We've got the second, logo.gif. And we've got the third, /cgi-bin/phf. Well, remember I was telling you, don't leave that /cgi-bin/phf in plain text there. So basically we're just gonna take it all, highlight it. I don't want that space there. There we go. And then I'll just do a simple hex encoding on this. Okay, looks good. Click okay. Then encode it. So we're back here.
Everything looks the same except /cgi-bin/phf's gone. So that's kind of the idea behind it, all right? So like I said, play around with it. If you find something not working or you have some trouble with something, send us a note; we're probably gonna be releasing a new version that fixes some of the problems in the next week or two. So feel free to check back at idsresearch.org for the latest. So let's see here, back to the slide show. We're almost done at this point. All right, so msunicodemap.c: like you saw in the Unicode Explorer that eventually came up, it basically just tells you what the different code pages and code points are. And slide 37, and we're done, maybe a little early, but thanks for being a great audience and not heckling too much. Download the updated tools at idsresearch.org. And I may be releasing some code that'll basically scan a web server and tell you what types of encodings work on that web server, so you can make your IDS testing a little easier. So all right, questions, anyone? I've got one right up here. Yep, good job. All right. Nice. I guess we'll have to find out. I can't really talk too much about it; I've been told not to talk too much about it. It should be coming out, though, in the next couple of months, and it's gone through some pretty rigorous testing, so we'll give it a shot. And the question here? [Audience question, partly inaudible.] Okay, the question was basically, instead of actually decoding the URL, could you just try and detect that encoding's been used? It's a good question. You definitely can do that.
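A flag-only detector along the lines of that question might look something like the sketch below. The pattern names and the choice of what counts as suspicious are assumptions made for illustration; this is not how Snort or any particular IDS does it.

```python
import re

# Hypothetical checks for encodings a normal client rarely needs to send.
SUSPICIOUS_ENCODINGS = {
    # %uXXXX style code point encoding
    "percent-u": re.compile(r"%u[0-9a-f]{4}", re.IGNORECASE),
    # an encoded percent sign followed by two more hex digits
    "double-percent": re.compile(r"%25[0-9a-f]{2}", re.IGNORECASE),
    # a percent sign nested inside another escape
    "nested-percent": re.compile(r"%%|%[0-9a-f]%", re.IGNORECASE),
    # hex escapes for plain letters and digits, which never need escaping
    "hex-alnum": re.compile(r"%(?:3[0-9]|[46][1-9a-f]|[57][0-9a])", re.IGNORECASE),
}


def flag_encodings(url: str) -> list[str]:
    """Return the names of the suspicious encodings seen in `url`."""
    return sorted(
        name for name, pattern in SUSPICIOUS_ENCODINGS.items() if pattern.search(url)
    )


print(flag_encodings("/a%20b/index.html"))
print(flag_encodings("/%63gi-bin/phf"))
print(flag_encodings("/%u0041"))
```

Notice that the output only says an encoding was used. It says nothing about which resource was actually requested.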
The only problem with that is that you really won't be able to get the attack that was sent. You'll get the alert that says, hey, this URL was encoded, but unless you're a decoding expert on URLs, you probably aren't going to know what the actual exploit was. So we actually want to try and do those decodings if we can. Okay, question over here? Yeah, yeah. Oh, like how's Ethereal against that? Okay, basically the question was, say you capture some of this traffic with Ethereal; Ethereal doesn't decode URLs, so you're kind of on your own. Yeah, exactly. Yeah. So, okay, next question, right there in front. Okay, the question was, do these encodings work across HTTPS, and is there any way that we're going to change the HTTP Chameleon tool to support that? Yeah, we may in the future, and yeah, they definitely work over that. Hey, anyone know what time it is? I want to make sure I'm not going over time. What is it? Oh, okay, so we've got some time. All right, more questions? This one down here? Yeah, yeah, this actually deprecates some of the double decodings for IIS, but not the Unicode code points. So, again, if your IDS doesn't know the different values for the Unicode code points, then you'll be evaded that way. But yeah, that's a good question. By the way, the question was, if you use URLScan for the IIS web server, does it help you with some of the decodings and not make them all available? And it does not make them all available. Question in the back there, I thought we had? Yeah, I haven't tested against that, so I don't know. He had asked, does the EI web scanner stuff work with this? I'm not sure. Yeah, URLScan with IIS actually does deprecate certain double decodings. So it kind of just leaves you with the hex encoding, the %u encoding, the UTF-8 encoding, the bare byte encoding, and all the multiple Unicode code points you can encode. Question down here again?
Okay, like in the tool? Okay, the question was, if you find a really messed-up encoded URL, can you paste it into the tool and have it spit out what it was? Not at this point, but that is a feature we'd like to add. So, okay, a question there too? I'm not sure. I'm not an expert on configuring IIS and Apache. With Apache you probably could, I'm pretty sure, but I'm not sure about IIS. Okay. Question over here again. Trying to get both sides of the room. Yeah, actually there is, because say you're trying to access a webpage whose URL has Arabic characters. The only way you can do that with the URL is to actually encode it using the valid UTF-8 encoding. Or Hungarian as well: if you're trying to use some of the letters that aren't in the American alphabet, that are actually Unicode code points, you would have to encode those in the URL to access it correctly. Does that answer your question? Yeah, he's wondering, if he's not really too worried about the decodings and code point encodings, then yeah, you can definitely block those at the router or however you wanna do that. Question there. What is it? Okay, the question was, have I tested this out against OpenBSD firewall normalization? No, I haven't. I doubt they incorporate this, though. But yeah, I'm not really sure on that. Okay, question over on this side. I'm sorry, I didn't really hear you. Okay, what about them? The browsers, when something gets sent back, usually the browsers will be able to handle Unicode encoding. I haven't really tested much what happens if I actually send all these different types of encodings to a browser; I've been more server based, so I'm not really sure. You're welcome to test that out, though. Okay, question here. Yeah, it definitely is a flag.
He had asked, all these different encodings, if you see them across the wire, aren't these things you wanna alert on anyway? Yeah, they are definitely things you wanna alert on. But at the same time, an IDS really should decode these things as well, so you know what's been accessed. Otherwise, if you see something like a, let's see here... oh, I don't have it up anymore. But if you see something like that /cgi-bin/phf string encoded, you're probably gonna have no idea what actually was accessed: was it even a valid URL on your server, or were you compromised? So, but yeah, okay. More questions? Okay, and one more after this last question. So go ahead. These encodings aren't exploits in and of themselves. These are really IDS evasions. The only solution to these is for your IDS to be smart enough to handle all these things. Otherwise, you're probably not going to be able to handle these correctly or even detect them or anything like that. Yeah, and we should be out fairly shortly. Okay, last question. I'll see if there's anyone from this side, because I've been answering a lot from that side. All right then, right here. Did I already ask... let me get this guy because he's a newer one; go for it. They already do. Yeah, Apache. They already support Unicode encodings, yes, in URLs. But remember, they only use the UTF-8 standard that we had talked about, not the %u and all that stuff. All right, so that is the questions and answers. Thanks a lot, people, for coming out. If you want to talk to me afterwards, just come by. Well, brother, thanks a lot. All right.