Alright, I guess we'll get started. Hi everyone, my name is Sohan and I'm talking about evolving exploits through genetic algorithms. Before I jump into genetic algorithms, though, I want to give you a little background on who I am. I've been doing this for many years. I do programming, I love viruses and worms, I've been trained as a computer scientist, and I do penetration testing in the daylight hours. But, you know, I'm still a noob.
This talk came mainly out of my computer science interests, my job, and my inner laziness wanting to come out. I was looking at my job, and what I do on a day-to-day basis is exploit web applications. There are a number of problems associated with performing this task. The major ones: it is driven by the customer, so you have to provide them what they want. There's a small scope: you are only allowed to hit a tiny portion of the site, so you have to have scalpel-like efficiency. You can't hit the whole web server with a hammer. You only have a limited amount of time, usually very short, as in a day, two days, three days. And it's all report driven, because it's based on giving a report to the customer.
These problems are what has been driving me to look into this area. My methodology was usually to run as many scanning tools as possible against a web application, then manually poke at the areas that come up as suspicious. From there, if it does turn out to be exploitable, I write an exploit for it. But there are a couple of problems inherent in that approach. The code coverage is inherently small, because I'm trying to limit the amount of code that I view on a day-to-day basis.
I want to view less code, and make sure that the code I'm viewing is actually potentially vulnerable, rather than just anything. Also, the inspection of suspicious areas that are discovered by, say, web scanners or manual testing is time costly as well. Additionally, developing a working exploit for a site takes time, because there might be additional blocking mechanisms in place, like a WAF, a web application firewall. You can see you have SQL injection, but all of a sudden you don't really have SQL injection, because there's an additional layer you have to break through.
There are a number of really good tools out there for exploit discovery and development. I use Acunetix, Burp, ZAP, and sqlmap very frequently, and they're all fantastic tools. But running some of the other scanning tools, like Nessus and Nmap, I realized there's this problem. There's a very big similarity with an existing industry, and it's a fundamental problem with web application scanners as we know them today.
So, what up bitches? It's funny. He thought you were clapping for him. He's like, well, I, you know, said sqlmap, what? Okay. All right, you know why we're here? Wow, this first time I had... there you go. That's what I'm talking about. At the very back in the gray. No, in the hoodie, man, bring your Skittles up here. Oh, what is this called? Thank you. Oh my god. That was awesome. The Price Is Right. You are here. All right. Thank you, sir. Wait, what's your name? Connor. Connor. Connor represents all of you who are first timers at DEF CON. So, foundational problems with current techniques. Sorry, that's all I know. I think he was talking about scanning. Oh, scanning. Scanning and software. Oh my god. Look, he's got a countdown timer.
Oh shit, you only have five minutes to go, dude. Four minutes. Wow, that sucks. Well, thank you for the alcohol. I appreciate it.
So, back on track. The foundational problem that we have with web application scanners is that the current main technologies are built around a signature-based system. They have an understanding of what a potential exploit could look like. They throw it at the web server, and then if they retrieve a favorable or unfavorable result, they mark it as a finding. So I thought, hey, why not take genetic algorithms and apply them to web applications? Why not take your average basic SQL injection and go from something that a web application firewall can easily protect against, and a programmer can easily defend against, to something that is harder to stop? This whole process of evolution was really fascinating to me.
So for this talk, we're going to use genetic algorithms to make exploits for SQL injection and command injection, and our attack surface is HTTP and HTTPS, so it's web-based parameters. We're not going to cover anything else. This could be applied to a number of different things, JSON, Ajax, what have you, but for the scope of this talk, we're talking about SQLi and command injection. The tool I wrote for this talk is called Forced Evolution. It takes this concept of: I'm going to use genetics to write exploits for me, so I don't have to do it myself. It's the inner lazy programmer.
So, what is a genetic algorithm? A genetic algorithm is essentially this: you create a large number of things, in this case exploit strings, and you look for a certain solution that these things will provide, in this case an exploit.
Then you score all the strings' performance using some sort of vague, ambiguous fitness function. We'll get into the fitness function for our case later, but there is a way of determining, using numbers, that this is a better injection string than the previous one. So our algorithm is a loop. While we haven't found the solution, we score, we kill off all the low-performing strings, we breed the strong-performing strings, the ones that are more efficient, or bypass better, or exploit better, and we also mutate the strings randomly. Once we have found a correct exploit, we display it.
The tool Forced Evolution does exactly this. We create a large number of pseudo-random strings. We are drawing on the history of all previous SQL injections and command injections, well, all that I could find, and using them to influence the population of creatures that we breed. So we're not losing evolutionary progress; we're progressing forward. We create a large number of strings and we breed in what we know has worked in the past, but we use that just to influence the population. We don't actually say, okay, we have a set of signatures, because then we're back to the original problem. Then we go through the exact same process as a generic genetic algorithm: we send the string as a parameter value, either POST or GET, what have you, and then use the response from the server to determine the score. This could be many things, so we have a good deal of granularity in how we can score a string. And then, just like the rest, we cull, we breed, we mutate, and when we find a string that successfully exploits an app, we display it.
So there are a number of things that we also need to talk about. Like, what is this fitness function?
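The score, cull, breed, mutate loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the code of Forced Evolution: every name here (`evolve`, `breed`, `toy_score`, `ALPHABET`) and every default value is an assumption, and the fitness function is a stand-in that compares a candidate against a known target string instead of scoring a server response. The `seeds` argument stands in for the idea of breeding known-good strings into an otherwise random population.

```python
import random
import string

# Characters the toy population may draw on (an arbitrary choice for this sketch).
ALPHABET = string.ascii_letters + string.digits + " '\"=-#;()"


def random_string(rng, max_len=12):
    """Create one pseudo-random candidate string of length 1..max_len."""
    return "".join(rng.choice(ALPHABET) for _ in range(rng.randint(1, max_len)))


def breed(a, b, rng, mutation_rate):
    """Join the front half of one parent to the back half of the other,
    then occasionally replace one character at random."""
    child = a[: len(a) // 2] + b[len(b) // 2:]
    if child and rng.random() < mutation_rate:
        i = rng.randrange(len(child))
        child = child[:i] + rng.choice(ALPHABET) + child[i + 1:]
    return child


def evolve(score, is_solution, seeds=(), pop_size=200, max_generations=500,
           mutation_rate=0.3, rng=None):
    """Return (winning string or None, generations used). Assumes pop_size >= 2."""
    rng = rng or random.Random()
    # Seeds influence the population; the rest is pseudo-random.
    population = list(seeds)
    population += [random_string(rng) for _ in range(pop_size - len(population))]
    for generation in range(max_generations):
        ranked = sorted(population, key=score, reverse=True)   # score
        winner = next((s for s in ranked if is_solution(s)), None)
        if winner is not None:
            return winner, generation                          # display it
        survivors = ranked[: max(2, pop_size // 3)]            # cull low performers
        children = []
        while len(survivors) + len(children) < pop_size:       # breed and mutate
            a, b = rng.sample(survivors, 2)
            children.append(breed(a, b, rng, mutation_rate))
        population = survivors + children
    return None, max_generations


def toy_score(candidate, target="' OR 1=1"):
    """Stand-in fitness: matching positions minus a length penalty."""
    matches = sum(c == t for c, t in zip(candidate, target))
    return matches - abs(len(candidate) - len(target))


if __name__ == "__main__":
    best, generations = evolve(toy_score, lambda s: toy_score(s) == 8,
                               rng=random.Random(0))
    print(best, generations)
```

Because the survivors are carried into the next generation unchanged, the best score in this sketch never goes down from one generation to the next. Whether a run converges within the generation limit depends on the random draw.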
Like, how do we define whether this string is better than another string? There are a couple of things that we can look at. Does it cause weird behavior? Is the string reflected? There might be a potential for XSS in this. Does the string cause an error? And if so, is our SQL injection or command injection displayed inside of that error? That gives us additional information as well. And does the exploit string cause goal data or sensitive data to be displayed, so that we can see that, potentially, this is a good exploit?
Once we've found out what a creature's score is, we breed the top scorers and kill the underperformers. I can't really say the majority, but a good chunk of genetic algorithms use this genome crossover, and it works really well in our domain, because we have these variable-length SQL injection strings that we need to breed against each other. The breeding process consists of cutting each string in half, mixing halves, and then mutating them. The current implementation that I have in the tool is that two parents create four children and also survive themselves. So they pass on their genes, and they also live to see another day, until someone is better than them.
Now for the next step: what do we mean by mutating strings, or mutating our exploits? So, yeah, my whiskey, oof. For the mutation rate, I found it's usually best to have it variable. There are a number of operations that we can use, but it all boils down to three essential operations: mutation, changing a single byte in a string; adding information; and removing information. So it's somewhat like natural evolution.
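To make the scoring questions, the crossover, and the three mutation operations concrete, here is a hedged Python sketch. None of it comes from the tool itself: the function names, the `ERROR_MARKERS` list, and the score weights are illustrative assumptions. The talk only says each string is cut in half and the halves are mixed into four children, so the particular four combinations below are a guess. The sketch works on characters rather than raw bytes, and it approximates "the injection is displayed inside the error" as "the response contains both an error marker and the payload".

```python
import random

# Hypothetical substrings that suggest a server-side error (an assumption).
ERROR_MARKERS = ("sql syntax", "syntax error", "warning:")


def score_response(payload, body, goal_text, baseline_body=""):
    """Score one response; higher is better. The weights are arbitrary."""
    text = body.lower()
    reflected = bool(payload) and payload in body
    score = 0
    if body != baseline_body:
        score += 1            # weird behavior: differs from a normal response
    if reflected:
        score += 2            # the string is reflected back
    if any(marker in text for marker in ERROR_MARKERS):
        score += 3            # the string caused an error
        if reflected:
            score += 2        # and the error response echoes the payload
    if goal_text.lower() in text:
        score += 100          # goal data is displayed
    return score


def halves(s):
    """Cut a string in half (the front half is shorter for odd lengths)."""
    mid = len(s) // 2
    return s[:mid], s[mid:]


def crossover(a, b):
    """Two parents produce four children by mixing their halves."""
    a1, a2 = halves(a)
    b1, b2 = halves(b)
    return [a1 + b2, b1 + a2, a2 + b1, b2 + a1]


def mutate(s, rng, alphabet, rate=0.2):
    """With probability `rate`, apply one of: change, add, remove."""
    if rng.random() >= rate:
        return s
    op = rng.choice(("change", "add", "remove"))
    if op == "add" or not s:
        i = rng.randint(0, len(s))                      # add information
        return s[:i] + rng.choice(alphabet) + s[i:]
    i = rng.randrange(len(s))
    if op == "remove":
        return s[:i] + s[i + 1:]                        # remove information
    return s[:i] + rng.choice(alphabet) + s[i + 1:]     # change one character
```

Keeping `rate` as a parameter fits the remark that a variable mutation rate tends to work best, since a caller can raise or lower it between generations.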
Say, for example, the pre-mutated string is ABCD. The mutations that have been applied to it are: an X has been prepended to the string, the B has been deleted, and the D has been mutated to an F, which leaves XACF. Hopefully that gives you some idea of what we're saying. We're not doing anything crazy; we're just picking a random part of the string and changing it a little. So that's how we mutate the strings.
Now there are a couple of things to keep in mind, because we have this algorithmic process of breeding, killing, breeding, killing, so our population is going to vary. The mutation rate versus search speed is very important. If we mutate too quickly, if we say every single part of every single attack string is going to change, it's essentially throwing random data at the web server. It's really not efficient, and it's not worth doing. It's taking a bunch of dice, throwing them in the air, and hoping you get all sixes. So it has to be tuned down to the point where it is an efficient search. There's also the string cull rate versus the repopulation speed. If you cull more than you breed, the number of strings in your population will decrease, and vice versa. If you repopulate too quickly, they'll be like rabbits and they'll denial-of-service your own machine.
With these things in mind, I went ahead and compiled a couple of statistics on the leading-edge tools. I did Acunetix, Burp, OWASP ZAP, and sqlmap, as well as Forced Evolution. This is just the raw data, but I'll go through some charts to show you how it compares to them. The number of requests sent to the server is very significant. Forced Evolution sends on average maybe 10,000 to 30,000 requests to a server.
So this is not exactly a stealth attack tool, but we'll get into some of the pros later. The time to exploit is usually dependent on network latency, so these numbers will fluctuate a little bit. Forced Evolution does perform well compared to some tools, but not very well at all compared to others. I also did the same statistics for SQL injection. There, the total number of requests to the server decreases dramatically, because SQL injection has a finer way of expressing the score associated with the fitness function. It's easier to score one string higher than another because you have more information to do so. So it's naturally more efficient, because it depends on that fitness function, that scoring mechanism, to determine what string lives and what string dies, and so it reaches a solution faster. The time to exploit decreases proportionally as well.
With that, let's go ahead and try a demo. May the demo gods be gracious, because this does depend on Python's import random. So let's hope everything works. There we go. Okay. Ah, this is terrible. I'm sorry. Okay. So we have a generic web application here with a login form. It is vulnerable to SQL injection: I'll type in just some random characters and it doesn't bring back correct input, and there are other problems with it as well. So we know that a vulnerability exists there, and we can discover this vulnerability, or this suspicious area, like we talked about previously, through other scanning tools. Now all we have to do is point Forced Evolution at it and it'll go ahead and exploit it for us. Let me see. My VMs all of a sudden changed size, sorry. Okay. And Forced Evolution will be up on GitHub about 15 minutes after the talk.
So, the command line options, I wish I had my glasses: we have a target, and for this we'll just do localhost, and we have an address of the vulnerable web page. In this case that will be sqli index.php. Then we also have the vulnerable variable, which I believe is password, although I believe both would work. And then the method. The method previously was displayed as POST, or, I'm sorry, GET, but the tool has both options. And then the other variables we'll just include for completeness. We'll include the username. Typo? I would be dangerous if I had my glasses. Okay. Username equals, let's just say, Defcon. Then we also have what will constitute a valid exploit. In this case we want to get to the administrative area of the site, so our goal text will be administrative. We'll just put admin, because the tool will search any response that it receives back, parse it, and determine if it has that string in it. On the right-hand side I have a tail of the current requests coming into the web server, so as we start running the tool, that will jump up. Wish me luck. Here we go.
All right, right now it has created a large number of strings. Well, actually not that large; it's only about a thousand. It's running them against the web server currently and scoring them based on the response it receives back. It's taking the top performers and breeding them. So right now we're at generation two, three, nothing crashed. Okay. Because this is based on random strings, sometimes the solution is found extremely quickly and sometimes it takes a while. But because of the influence of the previous database, this will become much, much faster. I'll take this back over to my side.
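As a rough illustration of how the demo's inputs fit together, the sketch below builds the form parameters for one candidate string and checks a response for the goal text. It is an assumption-laden sketch, not the tool's interface: `build_params` and `reached_goal` are hypothetical names, the case-insensitive match is a guess, and nothing is sent over the network. Only the parameter names (`password`, `username`), the value `Defcon`, and the goal text `admin` come from the demo.

```python
from urllib.parse import urlencode


def build_params(candidate, vulnerable="password", other=None):
    """URL-encode the fixed variables plus the candidate string placed in
    the vulnerable variable. The same encoded string can serve as a GET
    query string or as a POST body."""
    params = dict(other or {})
    params[vulnerable] = candidate
    return urlencode(params)


def reached_goal(response_body, goal_text="admin"):
    """True if the goal text appears anywhere in the response.
    Case-insensitive matching is an assumption of this sketch."""
    return goal_text.lower() in response_body.lower()
```

For example, `build_params("a b'c", other={"username": "Defcon"})` yields `username=Defcon&password=a+b%27c`, and a response body that mentions the administrative area would satisfy `reached_goal`.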
So, the pros and cons of using genetic algorithms. For the cons, there are a couple of major ones. This is not a very stealthy attack tool. As you can see, it generates a large number of requests to the web server, and that's inherent in genetic algorithms as a whole. There's also a small potential to inadvertently destroy the database and operating system, so I wouldn't run this against, yeah, a production environment. Job security? I don't know. And it is a slower process to develop and test exploits, at least from the front end. I'm sure anyone in the audience, when they see that SQL injection, they, brr, write it out, but see, the program took, you know, 20, 30 seconds to do it. And genetic algorithms will always be suboptimal to source code analysis, because there's just more code coverage you can do.
But the pros for genetic algorithms, and for using them to create exploits, are fantastic. They're really cheap in CPU, RAM, hard drive, and human time. You can run this on a Raspberry Pi. Your only limiting factor is the network speed: how far away are you from the web server? As far as my time goes, I can just turn it on and it runs. I don't look at it again. It's good. And I feel it has more complete code coverage than other black-box approaches, because not only does it have the signatures that the other black-box approaches have, it also isn't bound by a box of someone saying, this is what we know a good SQL injection to be. It doesn't have that definition. It's limitless in its approach to the solution. And so that takes us to the, yeah. This tool will break web applications in the future. It might not do it efficiently, but as the database of SQL exploits grows, it will do it more efficiently.
Another huge pro for this is automatic exploit development. I don't have to invest my time in sitting down and figuring out: oh, okay, I've got SQLi. Oh, okay, there's a WAF. Oh, okay, there's something else. Oh, okay, there are filtering rules. This doesn't need to know about those. It just cares about that question and response. So it's really fantastic in that regard. And the last, biggest pro for this is emergent exploit discovery. Since this isn't bound by what we know as, okay, this is a valid exploit, this will create new things, new ways of approaching problems that we haven't seen yet. For that reason, I think it's absolutely fantastic, and I think we should pursue this. So, in conclusion, you can download the tool, give me about 15 minutes. And there's my contact info.