I didn't know this was going to be at 4:30, so I apologize if I interrupted your 4:20 break. Anyway, thanks for coming out. It's been a long day, and an especially long week if you've been at Black Hat. We'll talk about some new techniques in SQLi obfuscation, meaning new ways of hiding your SQL injection attacks, or ways of evading web application firewalls. And if you dare use the Wi-Fi here, these slides are online right now. It's a slightly older version, but I'll repost the latest later tonight. So I'll let that bake for people to type it in. It's not that many characters.

Cool, who am I? My name is Nick Galbreath. I work at Etsy. How many people know who Etsy is? Oh, yeah, right on. Okay. For those who don't, we're an online marketplace for artisans and handmade stuff. It pretty much owns the cyber steampunk category if you're into that, which seems to be a popular meme here at DEF CON. We're a top 50 American website and did about $500 million of transaction volume, so we see a lot of crazy stuff on our site, and this is some of the research that came out of it. My specialty is managing some of the groups that handle security, fraud, and a bunch of enterprise features. Previously I worked a lot in cryptography; Wiley published a book of mine about 10 years ago. And I've been fixing broken auth systems and password storage systems in applications for 15 or 20 years.

So this came out of some work with my colleague Zane Lackey, who's right here in the front row. How can we determine if our website is under attack? How do we know when people are probing us and poking at our systems? We're a little larger than... the problem is we're sort of in this mid tier. It's very difficult for us to put a true web application firewall in front of all our stuff. So we do a lot of monitoring in the application on the back end, trying to detect SQL injection.
And so one of the questions was, how can you actually tell if something is a SQL injection or not? I started going down this path, and it turns out SQL is gargantuan. The '92 spec is 625... hello? Test? We back? Everyone hear me? Test? Yeah, okay, we're back. 625 pages of just pure text. The latest BNF specification, what does it say? 120 pages of pure text. It's gargantuan. And no one implements it to spec. Everyone's got special rules, everyone's got exceptions, everyone's got special cases.

So when I was investigating how people determine if something is a SQL injection or not, we kind of end up with something like this. This is a big, big mess. Part of the reason I just sort of threw up when I saw it is that, just by inspection, I see bugs. I see bugs all over it. It's like, oh, you just add an extra space to your input and you're going to get by this application firewall. Not to pick on PHPIDS, they're nice guys. ModSecurity has the same problems: hard-wired limits, patterns missing whitespace at the end, all sorts of ways of getting around it. And so I was like, we've got to be able to do this a different way.

Did anyone see my talk at Black Hat a couple of days ago? Hey, a couple of people? Awesome. Okay. The 30-second version of it, or maybe a minute: libinjection is a new C library that's easy to port to other languages. It's open source right now, and it handles parsing and tokenizing user input in a totally different way than regular expressions. It creates a series of tokens, does some reductions, and then matches the result against known SQL injection attacks. It was released a couple of days ago. Here are some URLs for it. I'll let that bake for a minute. It detects SQL injection attacks pretty well, with a really low false positive rate.
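The tokenize-then-match idea described here can be sketched in a few lines of Python. This is only an illustration of the general approach, not libinjection's actual code or API: the token codes, the five-token cutoff, and the `KNOWN_SQLI` set are all invented for the example, and the reduction step mentioned in the talk is left out.

```python
import re

# Toy token classes. The single-character codes are invented for this sketch.
TOKEN_SPEC = [
    ("s", r"'(?:[^'\\]|\\.)*'?"),                      # string, possibly unterminated
    ("1", r"\d+(?:\.\d*)?"),                           # number
    ("c", r"--[^\n]*|#[^\n]*|/\*[\s\S]*?(?:\*/|\Z)"),  # comment
    ("U", r"union(?:\s+all)?\b"),                      # UNION / UNION ALL
    ("&", r"(?:and|or)\b"),                            # logical operator
    ("k", r"(?:select|from|where)\b"),                 # a few other keywords
    ("n", r"[a-z_][a-z_0-9]*"),                        # bare word
    ("o", r"[=<>!+\-*/|&^%~]+"),                       # operator
    (",", r","), ("(", r"\("), (")", r"\)"),
]
COMPILED = [(code, re.compile(pattern, re.I)) for code, pattern in TOKEN_SPEC]

# Hypothetical fingerprints, standing in for a list built from real attacks.
KNOWN_SQLI = {"1&1o1", "s&1o1", "1Uknk", "sUknk"}


def fingerprint(text, max_tokens=5):
    """Turn user input into a short string of token codes."""
    pos, codes = 0, []
    while pos < len(text) and len(codes) < max_tokens:
        if text[pos].isspace():
            pos += 1
            continue
        for code, rx in COMPILED:
            m = rx.match(text, pos)
            if m and m.end() > pos:
                codes.append(code)
                pos = m.end()
                break
        else:
            codes.append("?")  # character the toy tokenizer does not know
            pos += 1
    return "".join(codes)


def is_sqli(text):
    """Flag input whose token fingerprint is on the known-bad list."""
    return fingerprint(text) in KNOWN_SQLI
```

With this toy list, `is_sqli("1 OR 1=1")` is flagged while plain input such as `"42"` is not. The point of the design is that matching happens on token types rather than on raw characters, so extra spaces or changes in letter case do not change the fingerprint.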
While I was writing this, I'd been studying the spec and all these rules, and writing a whole bunch of code, plus a bunch of stuff I decided I didn't care about right now. And what I did is I pumped tens of thousands of real SQL injection attacks into it. They came from everywhere: published attacks, how-to guides, random stuff we see at Etsy, and the output of tools like sqlmap. So just tens of thousands of real SQL injection attacks. I pumped them all through and ran code coverage analysis at the end. And lo and behold, large parts of my parser weren't invoked, which means that code has never been used in SQL injection. Then there's a bunch of features I didn't implement, and that didn't seem to stop anything, which means the things I didn't implement also weren't being used in SQL injection. That's kind of weird.

So basically there's a bunch... that's a good talk. There's a bunch of SQL that's lying around that has never been used in SQL injection, which also means web application firewalls probably aren't detecting it either. If attackers don't know about it, it's unlikely defenders do. So what I'm going to do is walk down the list of a lot of dark corners in SQL. Some you may know about, some you may not. This is going to be great for building new fuzzers and vulnerability scanners, and certainly on defense, for people detecting this stuff, these are some things they need to know about. So let's get started.

NULL. Our friend NULL, a popular token in SQL. Let's take a look. This is the easiest one, so I start with it. It's so trivial, but it's a good example. In MySQL, NULL, which if we popped back a few slides we'd see peppered through that regular expression, can actually be written as backslash capital N. Backslash lowercase n is not NULL. It's a newline.
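A small Python sketch shows why the case of that backslash-N matters to a filter. The three filter functions are hypothetical stand-ins for signature rules, not taken from any real WAF; the only fact assumed about MySQL is the one stated above, that `\N` with a capital N is another spelling of NULL.

```python
import re

# Raw strings keep the backslash, so this is literally: 1 UNION SELECT \N, \N
payload = r"1 UNION SELECT \N, \N"
classic = "1 UNION SELECT NULL, NULL"


def keyword_only(text):
    """Hypothetical rule that only knows the NULL keyword."""
    return re.search(r"\bnull\b", text, re.I) is not None


def lowercase_then_match(text):
    """Hypothetical rule that knows about \\N but lowercases the input first."""
    normalized = text.lower()  # turns \N into \n, so the \N branch can never hit
    return re.search(r"\bnull\b|\\N", normalized) is not None


def case_preserving(text):
    """Check for \\N on the original input, before any case folding."""
    return keyword_only(text) or "\\N" in text


for name, rule in [("keyword_only", keyword_only),
                   ("lowercase_then_match", lowercase_then_match),
                   ("case_preserving", case_preserving)]:
    print(name, rule(classic), rule(payload))
```

Only the case-preserving check catches the `\N` form. Lowercasing is not a harmless normalization here, because it changes what the database would have seen.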
So that means if anyone does normalization, like a tolower on the input, where they just lowercase the whole string, and you used a capital \N, their patterns wouldn't match it. It would just slip right through. It wouldn't match these basic regular expressions. Do people sort of know what I'm talking about? Does this make sense? Yes, yes, yes, yes. Cool. All right. So that's a great case. I didn't see it in sqlmap, in their algorithms for fuzzing things. That's a great example of something that can get past firewalls just by using \N.

So let's do some more exotic stuff. There are lots of unusual functions in Postgres that are not used elsewhere. Some of these things are functions in Microsoft, but they're not in Postgres, so the signature is a little different. Sometimes they expect parentheses at the end, sometimes not. And then you have these peculiar functions that are unique to Postgres. So if you're building out probes like the classic AND or OR 1=1, these are some other things that might slip through. They're very unusual. Only Postgres does them.

So let's try some things on numbers. When I was writing the parser in libinjection, it turned out that parsing numbers is surprisingly difficult in SQL. There's an amazing amount of diversity. Here's just an example with floating point numbers. I didn't actually write this in true regular expression format, and it turns out you can't really write it as a single regular expression. If you do, you're going to get a ton of false positives. That's one of the tradeoffs I found: yeah, you could use regular expressions to do this, except you're going to have a ton of false positives. And you can't just detect; you have to detect accurately. So if you just go down this list, there's, what, see, one, another... okay, fifteen. Fifteen different variations, with possible unary operators.
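To make the diversity of number syntax concrete, here is a rough Python comparison of a naive numeric pattern with a somewhat broader one. The list of literals is illustrative only. Which of these forms a particular database accepts is exactly the vendor-specific behavior under discussion, and the broader pattern is an assumption for this sketch, not a complete or database-accurate grammar.

```python
import re

# The kind of pattern a simple signature might use for "a number".
NAIVE = re.compile(r"[0-9]+(\.[0-9]+)?")

# A broader sketch: optional sign, leading or trailing dot, optional exponent.
BROADER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

# Illustrative float spellings; acceptance varies by database.
LITERALS = ["1", "1.5", ".5", "1.", "1e10", "1.e3", ".5E-3", "+1", "-.5e+2"]

seen_by_naive = [s for s in LITERALS if NAIVE.fullmatch(s)]
missed_by_naive = [s for s in LITERALS if not NAIVE.fullmatch(s)]

print("naive pattern recognizes:", seen_by_naive)
print("naive pattern misses:    ", missed_by_naive)
```

A fuzzer can take advantage of this by substituting any of the missed spellings wherever it would normally write `1`. A detector has the harder job, since a pattern loose enough to cover every spelling also starts matching ordinary text.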
And Oracle has its own thing where you can throw some other stuff at the end. It's a complicated thing to do in a regular expression. So what I'm thinking is, if you're writing something to fuzz SQL or fuzz a website and you just use 1, or maybe a simple floating point number, there's a lot more diversity available, especially numbers that start with a dot, and exponential numbers. I've never seen those in any SQL injection. That's probably going to trip up a bunch of regular expressions.

And then I've noticed the parsers themselves handle things differently. Just the spacing, and how the parser figures out where a number ends and the next token starts, is very different from database to database. Sometimes it's a syntax error, sometimes it's not, which means it's going to be really tough to write a regular expression that handles these database-specific parsing rules. You can see a few examples there: exponential, floating point, all sorts of delimiting.

And this one I like a lot, because basically you're able to throw in a number that's not a number. So if you want to do your AND or OR 1=1 type checks, these are great. They're not actually numbers. They look like just regular words. I haven't seen any WAF handle these unusual cases. So if you're building out arithmetic expressions, these are great things to try in order to bypass firewalls. Making sense so far? Cool. All right. It gets more interesting in just a few minutes.

Some more stuff: hexadecimal literals. I've seen this a few times, when you're building out strings and you use the chr function or the char function. But again, it turns out there are special-case exceptions for everything. Postgres has its stuff, MySQL actually... sorry, Microsoft actually just lets 0x be a number. Quoting, non-quoting, complicated stuff. And more: binary literals. I didn't know about this. Why the hell do you need this in SQL? I have no idea.
But it's there. Oh, binary numbers, and they're different, and it's case sensitive. So if you used an uppercase version, it wouldn't work. Interesting.

More stuff: money. I am not a Microsoft expert, so I'd love some feedback on this. We already said floating point had 12 different variations, maybe more. Now we've got commas. Okay. Dollar signs. Okay. How does it cast? I don't know. I'd love some feedback or some help here. Again, if you're doing that 1=1 trick or the 1 not equals 0 trick, you can try a money value with a dollar sign instead of a regular integer. That's probably going to snap some regular expressions. It's a really good test case.

The fun part. MySQL is known to have some unusual commenting rules. If you're familiar with MySQL, it has a whole bunch of version-specific commenting, and all sorts of different ways of doing it. And in particular there's the, hey, my pointer is here, awesome, the pound sign comment. This has been well used in SQLi attacks. However, if you're attacking Postgres, pound is actually an operator. It's not a comment. So if your WAF says, hey, if there's a comment, nuke it, then against Postgres you'd actually be eliminating real code. And that's really interesting. If there's a firewall or WAF that's actually changing the input, this is a great trick: if you know the target's Postgres, throw in these fake operators that don't do anything, and it's very likely you're able to inject something courtesy of the WAF itself rewriting stuff. And there's just a ton of interesting oddities in MySQL comments. They're pretty well known, but I just wanted to call out that it's very complicated to tease these out.

But there's more. Postgres, and I don't know who did this, actually has recursive, or nested, C-style comments. I've never seen this before. It doesn't work in C.
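The effect of nested comments on a comment-stripping filter can be sketched in Python. The payload below is made up for illustration and is not the one from the slides, and both functions are simplified: they ignore string literals and every other comment style.

```python
import re


def strip_comments_naive(sql):
    """C-style assumption: a comment ends at the first */ after /*."""
    return re.sub(r"/\*.*?\*/", "", sql, flags=re.S)


def strip_comments_nested(sql):
    """Track nesting depth, the way a nesting-aware lexer would."""
    out, depth, i = [], 0, 0
    while i < len(sql):
        pair = sql[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(sql[i])  # keep only text outside every comment
            i += 1
    return "".join(out)


# With nesting, everything after the leading "1 " is one big comment.
payload = "1 /* a /* b */ UNION ALL SELECT 2 /* c */ d */"

print(repr(strip_comments_naive(payload)))
print(repr(strip_comments_nested(payload)))
```

The naive version closes each comment at the first `*/`, so the text between the two inner comments survives and `UNION ALL SELECT 2` appears in the rewritten string. The nesting-aware version removes the whole span. A filter that rewrites input under the wrong comment rules can therefore produce SQL that was never live in the original.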
And that last example: what happens when you remove all the comments? There are a lot of WAFs that say, hey, I'm just going to nuke the comments. Let's go over here, chop it out. What happens when you remove the comments with a regular expression? They both go away and all you're left with is UNION ALL. That's pretty cool. And you can do more than one level. It just goes on forever. It's a completely bizarre feature. I don't know what it's doing there.

It gets worse. Strings. How am I doing on time? Pretty good. This one is borrowed from C: two strings next to each other actually merge into each other. So SELECT 'foo' 'bar' is actually SELECT 'foobar'. It just merges into one. So you can chop up a query into tiny little string bits with all these parentheses or quotation marks and make a real mess for a regular expression to parse. And again, Postgres and some others require a newline between them.

I'm going to whip through Unicode because it's such a mess. It's something I really think needs a little more investigation. Sometimes you can specify that strings are Unicode with either a capital N or a lowercase n, sometimes it's case sensitive, and I'm not sure about the escaping rules. It's another great way of bypassing stuff. MySQL has ad hoc character sets, so you can make up all sorts of crazy stuff. Fantastic.

With this one, you're screwed. You're screwed. I mean, I didn't even know about this. You're probably never going to use this as an application developer, but as an attacker, oh, why bother using quotes? I'll just use dollar signs. And guess what? They're nested. So go figure out the parsing rules for that. You really have to go character by character by character to figure out what's going on.

This one, unfortunately... maybe the best thing is just to go down to the bottom here. I'm sorry for the people on my right. Okay, it's a whole bunch of stuff. The main thing here is that the escape character is specified after the string. It's a postfix operator.
So it means you don't even know what you're parsing until after you're done with it. Got it. 10 minutes. Thank you. When you get the slides, you can click the link and try to decipher this stuff, but it's almost impossible to decipher. And this one I think is really good for busting WAFs.

Same thing with Oracle. You can use q and pick your own delimiter. And this function call thing is really interesting because it's basically deferred evaluation. You can just pass straight SQL into another function. And there's something going on with delimiters. If someone's an Oracle expert, I'd love to talk to you about exactly what is going on here with the other delimiters. It's complicated stuff. Bad for defense, good for offense.

All right. Just a few more minutes here on expressions and operators, with some tips that might be useful if you're attacking things. Random operators. They're really strange. Cube root in Postgres. How many people have ever used a cube root? Silence. No one. Why is it there? Factorial. Like, why? More. Lots of things. Bitwise XOR, Oracle only. Not less than, not greater than. I've never seen that operator before. You know, the jokes write themselves. Right.

So let's take this up one more level. You see a lot of the classic OR 1=1, which is really easy to detect. And this is me just sort of testing my own... we screwed up. That's the talk to go to. 1=1. What I haven't seen, and I did this only in my own testing, is: why do that? Why not use actual functions? Anything that evaluates to true or false is what you need here. So hey, you can go back to high school math. I probably made a mistake here. I'm almost positive. It's kind of a degree in mathematics. But so be it. You can get a lot more creative with these test cases, and they're going to be really tough to figure out how to solve for.

We all know about UNION and UNION ALL.
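Going back to the point about replacing 1=1 with anything that evaluates to true, here is a rough Python sketch. It uses SQLite through the standard `sqlite3` module purely as a convenient stand-in for checking that each expression is true; SQLite is not one of the databases discussed in the talk, and the signature regex and the candidate expressions are invented for the example.

```python
import re
import sqlite3

# Hypothetical signature for the classic tautology.
CLASSIC = re.compile(r"\bor\s+1\s*=\s*1\b", re.I)

# Expressions that are always true. All use only built-in SQLite features.
CANDIDATES = [
    "1=1",
    "2>1",
    "abs(-1)=1",
    "length('a')=1",
    "3 between 1 and 5",
]

conn = sqlite3.connect(":memory:")

# Which candidates does the database evaluate as true?
always_true = [e for e in CANDIDATES
               if conn.execute(f"SELECT ({e})").fetchone()[0] == 1]

# Which candidates does the signature catch when used in an injection string?
flagged = [e for e in CANDIDATES if CLASSIC.search(f"x' OR {e} -- ")]

conn.close()

print("true in the database:", always_true)
print("caught by signature: ", flagged)
```

Every candidate is true as far as the database is concerned, but only the literal `1=1` matches the signature. Function calls and range tests give a fuzzer an effectively unlimited supply of equivalent conditions.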
There's also an INTERSECT statement, which I was not familiar with. I think someone clever could do something with it. I don't know what. But I can absolutely guarantee that not every web application firewall is banning that one. So that's something else to poke around on.

This one is sort of a side note, and it has come up. It's used all the time, and there are things that poke through it. What's interesting about IN lists, where you do something like SELECT * WHERE id IN (1, 2, 3), is that they're a pretty common thing. To my knowledge there is no platform, no system, no framework, no prepared statement that actually lets you pass in an array of values. Which means everyone who's using an IN list is writing custom code and not using parameter binding. And knowing developers, they're not going to validate input. So if you're poking at things and you have some form input that takes a range or plurality of inputs that you pass in the query string, that's a prime target. Because even if they're good developers, there's no parameter binding. It's custom code to build this query. So that's sort of neither here nor there, but it's a real weak point on the application side.

So if I'm so smart, how come we don't see these all the time? I think it's because dumb attacks work. We haven't seen a lot of diversity because the dumb attacks work. Or there aren't published reports on more interesting attacks. The good news is that this is a rich area for research, so it's something I look forward to working on.

So what am I going to work on next? libinjection, which I'll give you a link to in just a minute. I really need to add more rules and more parsing types, and I really need to publish stats on its output. It really needs more testing. There's a whole bunch of things. There are bigint types, Postgres has arrays and hashes and all sorts of stuff. Oh, and a regular expression engine, just to make things more interesting.
Those things can go into infinite loops. There's all sorts of stuff. You can bust the stack on that. That's a real ripe area. And character encodings. It is just so complicated. It's the user, it's the application, it's the OS, it's the client library talking to the database. There's just a whole area of opportunity for mistakes.

There are some references you can go read on the website on your own. And thank you very much. Here are the URLs. I'll leave that up, and I'm happy to take questions afterwards. Thanks a lot for coming.