Good afternoon. Let's get started. In two weeks' time. No questions? And instead of assignment four replacing our lowest homework, can we do it instead of the final? No. Good try. It doesn't matter because it's not happening. I said, instead of assignment four replacing our lowest homework, could it replace the final? Yeah. No. I'm just going to say no to anything. And about today. What if you already have 100% on the three homeworks? Do you get a bonus for what you do on four? No. You're doing homework four because it's prepping you for the project CTF. But you can't stand not to do it.

Let's get to vulnerabilities. Now we get into the core of web vulnerabilities. So where was the instance we talked about where code and data were mixed? What does that even mean? Folks, I can't hear the guy answering the question, so stop talking. So let's say you have data in the PHP file, and the front end is in the same place. What about making variable names based off of strings that you stored? Making variable names based off of strings. So that is turning data into code, I would say. So eval is a functionality to turn data into code. But what about the mixing of code and data? So I can say in the PHP program, yes. But as an attacker, can you manipulate a PHP program? No, right? Unless you have write access to the server, in which case you're in for a fun time. Otherwise, you have to assume you can't change the code that's running on the server. Query parameters? Any form. That's your input.

But what is the actual mixing of code and data? What does that mean? Not necessarily in web. On the stack? Yeah, so what code and data is mixed on the stack? Fundamentally, on the stack you have the data. That's where the program stores the user's data. There's also the control flow information of where to jump back to after you execute this function.
So essentially, it's not really code in there, but the control flow of the program is left on the stack. Yeah. As an example, using MySQL with PHP as an example of data and code, would that all be considered code? All considered code, kind of. It would be like HTML that contains JavaScript code. HTML that contains JavaScript code? Maybe, yes. I think we'll get to it later. So the pointers on the stack are data values mixed in memory, but they're also code themselves. Is that what you mean when you say that? Not quite.

So here is the canonical example, what we originally talked about. I know it seems like forever ago, but at the very beginning of the class we talked about Captain Crunch and his magic whistle. In the phone network, your voice transmission used the same line that the network used to send, essentially, the code for how to route and set up a call. So by sending certain frequencies, you could change how your call was routed, or you could make it so that you got the call for free. You have the same channel, and you're mixing the user's data with what are essentially control signals, or code, which causes this problem. So this is kind of the first instance of that.

Web applications have a lot of areas where you have this mixing of code and data, and we pointed out a lot of good ones. Take the HTML page that's delivered to your browser. That HTML you can essentially think of as data. HTML is not really code, nothing's executing. The HTML just tells the browser how to interpret and parse things. But the JavaScript is code that's now running in your browser. And these are mixed and sent in the same channel, all in one HTML page. So basically the way to think about this is: it happens anywhere that strings are concatenated together in order to produce output for another program or parser.
So in the case of JavaScript, you have a web application that's concatenating strings together to build the HTML output to send. If an attacker can control one of those inputs and the inputs aren't sanitized, they can control what code is executed. Similarly, as we'll see with SQL queries, if the application builds a SQL query by concatenating strings together, we can maybe try to control that query, because we're essentially just building up a string that we pass to a SQL parser, and we can try to trick that parser.

So this shows up in all types of areas. Even in HTTP: in that same TCP stream, we have the headers of the HTTP message, which can be thought of as the code in some sense, and then we have the data of the request or the response. HTML, SQL queries, the command line as we'll see. SMTP is another interesting one: when sending out emails, you can sometimes trick the SMTP server so that you essentially control the headers and alter the email that's being sent.

So the first ones we're going to look at are classic OS command injection attacks. Where did we look at these for binaries? We didn't call it OS command injection, because we were on an operating system. What was the idea here? Somewhere you can put code, and then you move the instruction pointer to that code. So that's for exploiting a buffer overflow and giving your shellcode to execute. Command injection is different. Is it related to setuid binaries? It's all related to setuid binaries, so yes. Is this for web applications? This is for web applications. We're talking about binaries right now. We looked at one that's applicable to binaries. Yes. So just like we talked about with binaries, if the application calls system using your input, you can inject something like backticks to get it to execute arbitrary commands.
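As a minimal sketch of the pattern just described (not the lecture's own code), the following Python builds a grep command by string concatenation and hands the whole string to a shell. Python's `subprocess.run(..., shell=True)` stands in for a PHP-style `system` call; the file contents, the names `lookup_vulnerable` and `demo`, and the payloads are all hypothetical, and a POSIX shell with `grep` is assumed.

```python
import os
import subprocess
import tempfile


def lookup_vulnerable(expression, path):
    # The user's data is pasted into the same string as the command,
    # and the shell parses the result as a whole.
    command = "grep " + expression + " " + path
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def demo():
    # Hypothetical data file standing in for whatever the application searches.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("alice 555-0100\nbob 555-0199\n")
        path = handle.name
    try:
        normal = lookup_vulnerable("alice", path)
        # A semicolon ends the grep and starts a command of the attacker's choosing.
        injected = lookup_vulnerable("alice /dev/null; echo INJECTED; grep bob", path)
        # Backticks are evaluated by the shell before grep ever runs.
        backtick = lookup_vulnerable("`echo bob`", path)
    finally:
        os.unlink(path)
    return normal, injected, backtick


if __name__ == "__main__":
    for output in demo():
        print(repr(output))
```

The harmless `echo INJECTED` marks the spot where an attacker would place any command the web server's account is allowed to run.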
A similar thing happens in web applications, which makes sense: you're writing an application, and you don't want to have to redo functionality. So if you need to, let's say, grep for something in a file, well, just call the grep program using the system function. The problem is that if the web application is not filtering that input, if the user can pass things like backticks, they can get code execution and execute arbitrary commands. And essentially, this is the same thing happening again. We're building up a string in our web application, which we're going to pass essentially to bash, which tries to parse it. So we can trick it into parsing and executing different commands by using the pipe or backticks or dollar sign parentheses. Then we can get it to evaluate code. So system, eval, popen, include, require, all this cool stuff.

And the implementation will look like this. This is something that would be normal. I don't want to demonize any of the developers, because it's understandable to write code that looks like this. You're calling system, you're grepping a file because you're trying to look up some phone book entry. So, of course, the attacker can do anything. They could mail themselves your /etc/passwd file. What are they limited to? Can they do anything? Character count? Possibly a character count imposed by the application. Who's executing this code? The web server. So the permissions that we have through this application are whatever permissions the web server has, or the web application, which depends on your installation. Maybe www-data, which I think is the default on Ubuntu.

But think about it from our perspective. We're accessing a web application, which means we're remote to that system. We don't have a user account on that system. We can't SSH into it like we did in homework 3.
We just have a web interface, and from that web interface we can now execute arbitrary commands to get us onto that system. And then when we're on there, we can maybe use a privilege escalation attack to try to get root access on that machine, and try to access the setuid binaries. So in the web context, we normally care about accessing setuid binaries because essentially we're acting as a remote user.

So if they do this, putting double quotes around the expression, is this safe? Super safe. It's not safe, because you could still just inject quotes to break out of the data, and then put your command in there and carry on. Exactly. So we can have the start of our input be a double quote, which matches the opening double quote, and then we can do a semicolon and whatever we want. What do we have to worry about? The closing double quote at the end, right? The important thing to remember with all these string concatenation things is that whatever comes after us will always be appended. So we need to make sure we handle that double quote and the phonebook.txt. So we can still do this. Now we've bypassed the double quotes there.

So, now if we do it like this, is it safe? What's the difference? Is that okay? I think the key is that it depends on what the semantics are of system, not literally a system call, but this call to the system function in this programming language. So, I believe, essentially what it'll do is exactly what you said. Basically we've pre-parsed all the arguments to this program. So no matter what we pass in as the expression, there's no way we can alter the parsing of the command. You can think about it this way: what the call is actually going to do is call execve, where argv[0] is going to be grep, argv[1] is -e, argv[2] will be our expression, and argv[3] will be phonebook.txt. So there's absolutely no parsing going on in here. Therefore it doesn't matter what characters we're passing in through the expression.
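The two variants discussed above can be compared in a short sketch. This is an illustration in Python rather than the language on the slide: `lookup_double_quoted` wraps the input in double quotes but still goes through a shell, while `lookup_argv` passes a pre-split argument list so that no shell parsing happens at all. The function names, the file contents, and the payload are hypothetical; a POSIX shell with `grep` is assumed.

```python
import os
import subprocess
import tempfile


def lookup_double_quoted(expression, path):
    # Still one string parsed by a shell; the quotes are just more characters
    # that the attacker can close early.
    command = 'grep "' + expression + '" ' + path
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def lookup_argv(expression, path):
    # The arguments are already split: argv[0]=grep, argv[1]=-e,
    # argv[2]=expression, argv[3]=path. No shell is involved.
    result = subprocess.run(["grep", "-e", expression, path],
                            capture_output=True, text=True)
    return result.stdout


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("alice 555-0100\nbob 555-0199\n")
        path = handle.name
    # Close the opening quote, run a command, then open a new quote so the
    # appended closing quote and file name are absorbed by a harmless echo.
    payload = 'x" /dev/null; echo INJECTED; echo "'
    try:
        return lookup_double_quoted(payload, path), lookup_argv(payload, path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    quoted_output, argv_output = demo()
    print(repr(quoted_output))
    print(repr(argv_output))
```

In the argument-list form the payload is just an odd search pattern that matches nothing, which is why the characters in the expression stop mattering.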
So there's no shell parsing involved at all. This is important when you're trying to find these vulnerabilities, because as you know, web applications are going to be written in any language. So you may have to go in and say, okay, this looks like a kind of weird call to system. Is it doing shell parsing or not? You need to look up the documentation to understand how it's being used, and what's the secure way or the insecure way.

So preventing this comes back to sanitization. This is something we've been preaching for a while. Never trust outside input. I feel like we've hammered this a lot, so this is good. But you have to do it in all of these important contexts. You can never trust outside input.

Languages will provide built-in sanitization for this. In PHP there's escapeshellarg, but it's very important to read the exact semantics of each of these functions so you know exactly what it expects and what it prevents. The question is, when you call this function, do you put the result inside double quotes? Does it add quotes, or does it not? You need to know the exact semantics, otherwise you can have something that looks safe but actually is not. So you have to know that PHP's escapeshellarg adds single quotes around the string and escapes any existing single quotes, so you can pass it directly to a shell function and you don't have to put extra quotes around it. And there's a difference: there's another function, escapeshellcmd, that escapes any characters that might be used to trick a shell command into executing arbitrary commands. So there are different ones, and they have different semantics that you need to understand. Otherwise, you can still end up with code execution.

So why do we want to execute code?
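To illustrate why the exact semantics of a quoting helper matter, here is a hedged sketch using Python's `shlex.quote`, which plays a role similar to PHP's `escapeshellarg` for POSIX shells (its exact behavior is not identical, which is rather the point). The helper names, the file contents, and the payload are hypothetical. The second function makes the mistake the lecture warns about: it wraps the already quoted value in extra double quotes, and command substitution comes back to life.

```python
import os
import shlex
import subprocess
import tempfile


def run(command):
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # Return stdout and stderr together so side effects of injected commands show up.
    return result.stdout + result.stderr


def lookup_quoted_correctly(expression, path):
    # shlex.quote returns a token that is safe to paste into a POSIX shell command.
    return run("grep -e " + shlex.quote(expression) + " " + shlex.quote(path))


def lookup_quoted_twice(expression, path):
    # Looks careful, but inside double quotes the helper's single quotes become
    # ordinary characters, so $(...) is evaluated by the shell again.
    return run('grep -e "' + shlex.quote(expression) + '" ' + shlex.quote(path))


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("alice 555-0100\nbob 555-0199\n")
        path = handle.name
    payload = "$(echo INJECTED >&2)"
    try:
        return lookup_quoted_correctly(payload, path), lookup_quoted_twice(payload, path)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    safe_output, unsafe_output = demo()
    print(repr(safe_output))
    print(repr(unsafe_output))
```

The same helper is safe in one call site and unsafe in the other, so the documentation for whichever function a codebase uses is worth reading before trusting a call that merely looks sanitized.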
Yeah, so we're trying to escalate our privileges from a remote attacker to a local attacker, and get some kind of access onto this system so that we can go a little bit further.

Okay, so we already saw that in PHP there's a mechanism that allows you to include code: the include and require functions. Other web application languages have similar types of inclusion mechanisms, and if the application is configured incorrectly, we may be able to inject attack code directly into the application. And there are actually multiple different ways to do this.

One way is if you can upload files to the system. Let's talk about PHP. With Apache, by default, if you have PHP installed, any file that ends with .php in your web directory will be interpreted as PHP code and executed. So you can upload code that gets executed. Or, if you can control the parameter to the include function: let's say you can't specify the exact name of the file that you're uploading, but you can upload a file which has PHP code in it. Then if you can control the include function, you can point it at your uploaded code, and it doesn't matter that this file doesn't have a .php extension. PHP will still read it and execute that code.

As we saw, PHP also has the allow_url_fopen functionality, which allows opening remote files. So if you control the parameter to an include or require, you can put in a URL which points to your server, which outputs PHP code, which will then be downloaded and executed on the server. You may also be able to use classic dot-dot attacks to influence the path that's used to locate the code.

So in PHP, this allow_url_fopen, as I just mentioned, allows a URL to be used. So you can do something like this. In your main app, you have an include path, and you include the include path concatenated with library.php, which seems fine. In library.php, you're including the include path concatenated with math.php. So this is an example that shows two different things.
One is this problem of allowing auto-creation of global variables in PHP. Because I can access library.php directly, I can specify any value I want for the include path in the URL as a query parameter. So now I can force it to fetch and download code of my choosing. So what does this do if I make a GET request to /includes/library.php with the include path set to http://evil.com/? What's the PHP code on the server going to do? What file is it going to request? Yes. So if I create a text file called math.php, I mean a file that my server does not interpret as PHP but just outputs as PHP source code at my location math.php, that code will be fetched by the server, downloaded, and executed. So I can include a remote shell, whatever I want to do. Now I'm executing with whatever permissions the web application has.

You said modularize? What do you mean by that word? Did I? It was in one of the previous slides. Modularize? Probably we were talking about making things into modules. So rather than having one giant PHP file, you split out the functionality into different modules and different files that you can include.

Okay, so now we get to SQL injection. We're looking at a form, a basic username and password form. We've all seen something like this. And remember, when we're testing these things, we can always see the HTML of the page. So you should always be in there testing things and looking at the HTML of the page. So here I have a form whose action is login.asp. I have a method that's POST. In there I have a table, because I'm using terrible table-based layouts. I have a username with an input text field, a password with an input password field, and a submit button and a reset button. So already, just from this one page, what can I tell? It submits with POST, so the data will be in the payload. Yeah. And it's probably an ASP application, right? That's something that's interesting. Cool. So now if we look at the code on the back end, we can see that there's a login function.
So now we're practicing reading other languages, just to try to understand what they're doing. We have to remember that the exact syntax will be different, but the semantics are all kind of the same. And if you don't know something, you can always look it up, right?

So here we have a variable username, which is taken from Request.form of username. This makes sense: Request.form is probably the form that got submitted. Password is Request.form of password. Then we create some server object. We may not know exactly what this is, but now we can see a SQL query. This is interesting. We create a variable called sql, where we are selecting from pubs.guest.sa_table, where username is equal to the username and password is equal to the password.

And in the SQL, why are we wrapping the username and password in single quotes? Yeah, because we want to enclose all of the user's data, right? If you think about it, in this SQL query, the data that we're passing to the query is what the developer intends to be inside the single quotes. And everything else is essentially the code of the query, which specifies the semantics of the query that we're trying to execute. Select everything from this table, where the username is this, and the password is this.

We're not even seeing the rest of this yet, but what is it probably doing? What's it going to do with this result? What's that? Run the query. Run the query, and then what? Yeah, probably check to see if there's actually such a user in the system, right? This is a classic login form. Well, it's actually a bad one, because we're just using the password directly and not doing hashing and all that stuff, but we're not going to get into that. So we can see they open a query to the database. They use this open. So now we can kind of infer, okay, this rso must be some way to access the database in ASP.
It's passing in the connection that was passed to the login function. It's performing the query. Then it's checking: if the record set object, the rso, is empty, deny access. So close the record set and output Access Denied; otherwise grant access and output Access Granted. ASP is similar to PHP here: everything that's not within the percent tags is going to be output directly. So, super simple.

But even in something this simple, we have a core problem: an attacker can put arbitrary values here for username and password. And by putting in the right values, they may be able to alter the query that gets performed. So fundamentally, when you think about it, what is this function trying to do? Yeah, but why? At a high level, what is it trying to do? Trying to authenticate the user, exactly. It's trying to say that you can access this application if you know a username and password. So we check in the database, and the database has these users, usernames and passwords. So the thing we want to do is issue this query and get access to this system without what? A username and password. A username, yes, but we can find a username, I think we've talked about that before. Fundamentally, we want to access it without a password. Because they're not sanitizing the parameters they're putting in there. Yeah, so let's take this.

So this is the query. This is what's happening at runtime. We can see that, fundamentally, the web application code is concatenating strings together in order to generate a SQL query that it then sends to the SQL server. And the SQL server does what whenever it sees this query? Before it executes it, what does the SQL server get? A sequence of bytes, right? So then what does it have to do? Parse it. It has to parse it to check that it conforms to the SQL language, right?
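The login logic just walked through can be sketched in Python with the standard `sqlite3` module. This is an assumption-laden stand-in for the ASP code on the slide, not a translation of it: the `users` table, its single row, and the function names are hypothetical. What it preserves is the key property that the query is built by concatenating strings, so the database only ever receives one sequence of bytes to parse.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database with a single known account.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    conn.commit()
    return conn


def build_login_query(username, password):
    # The single quotes are the only thing separating the user's data
    # from the code of the query.
    return ("SELECT * FROM users WHERE username = '" + username +
            "' AND password = '" + password + "'")


def login(conn, username, password):
    rows = conn.execute(build_login_query(username, password)).fetchall()
    # Empty record set: deny access. Otherwise: grant access.
    return "Access Granted" if rows else "Access Denied"


if __name__ == "__main__":
    conn = make_db()
    print(build_login_query("admin", "s3cret"))
    print(login(conn, "admin", "s3cret"))
    print(login(conn, "admin", "wrong"))
```

Printing the built string is a useful habit when reasoning about injection, because it shows exactly what the SQL parser will see.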
So it does lexing and parsing, generates the query that is intended to be executed, and then it executes it, right? So let's think about the structure of this query. Actually, I don't know the entire SQL syntax, but we... That's a fun feature. This is weird. Give me two seconds. Why does it want to do this? Huh? Maybe it's one of the browser issues. No, that's not it. You've got something selected somewhere. All right. Try the highlighter. Okay. What if you press the T next to the pen? That's the entire thing. All right, it's cool. We're just going to draw it, but not like that. Okay.

So, select star. These are the important parts of the query. It's a select. Not an update or an insert. It needs to be a select. Star, then the from section. Then we have a where clause. And the where clause has two different conditions, so we have an AND. Let's go like this. We have two clauses here, and each one is an equals. Let's simplify this and write U for the username. Okay. So this is the overall structure of our query, right? The SQL parser, when it parses the input, will create a tree like this based on the syntax of the query.

And so what's our goal again? Yeah, we want to be able to pass input to the program that will allow us to log in without giving a valid username and password. So, how did the SQL engine know to parse this as: username is the column on the left of the equals, and my input is here on the right? How did it know that? What is the exact syntax in the query that told it that? Yeah, username equals, and then whatever is inside those single quotes, right? So, we want to try to log in without a password. What would happen if I just passed username foo, password bar? Am I going to get into the application? No, that's going to be access denied.
So, if I type something weirdly large, like with buffer overflows, if I just keep typing forever, is that going to affect the application? No. The size of my input doesn't matter. The SQL server, we can already see, is using the single quote to delimit our input, right? So what if we put a tick in there? What would happen? Let's say our name had a tick in it, like an Irish name. What's going to happen? It's going to be a what? A SQL syntax error, right? Think about the actual query that's going to be executed. We have this password bar, right? So why is this invalid syntax? Yeah, the problem is that this single tick, this single quote, is matching this other single quote, and what's left over after it is not valid SQL.

So for the SQL compiler, when it generates this decision tree, where we have conditions and it goes either left or right and checks these two values, U equals U: can I increase the size of that decision tree? Because all I need is a true value from this decision tree. Right, so the goal is... When we put in huge input, we're fundamentally not changing the parse tree that we end up with, right? The tree is still the same. But here we have the tick, and the problem is that we have this other tick after us. So you want to add another clause. What would you want to add? Let's do that. So this is going to be tick, or 2 equals 2, and then password equals the password. Let's add it in here. Syntax error. Why? What's the problem here? This final single quote. So what if we do, let's see, we can add: and tick tick equals tick, so that the quotes balance. So does this change how we parse the select?
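The observations above (input size does not matter, a lone tick breaks the syntax, balanced ticks parse again) can be checked with a small sketch. It uses Python's `sqlite3` as a stand-in database; the table, the function name `try_login_query`, and the sample inputs are hypothetical, and other SQL servers may word their errors differently.

```python
import sqlite3


def try_login_query(username, password):
    """Build the login query by concatenation and report how the parser reacts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    query = ("SELECT * FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    try:
        rows = conn.execute(query).fetchall()
        return "parsed, %d row(s)" % len(rows)
    except sqlite3.OperationalError as error:
        # A stray tick ends the string early and leaves invalid SQL behind.
        return "rejected: %s" % error
    finally:
        conn.close()


if __name__ == "__main__":
    print(try_login_query("foo", "bar"))                # ordinary input
    print(try_login_query("a" * 10000, "bar"))          # huge input, same parse tree
    print(try_login_query("o'malley", "bar"))           # lone tick: syntax error
    print(try_login_query("admin' AND '' = '", "bar"))  # balanced ticks: parses again
```

The last input changes the shape of the where clause without producing an error, which is the first sign that the query structure is under the attacker's control.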
Does it change how we parse the star? Does it change the from table? No. Why not? Yes, because everything before our input is essentially hard coded. So this is another fundamental thing to understand about SQL injection: we can't control anything before our injection point in the query. All we can change is that point and what comes after.

So we can see that we've changed this into a new where clause. I don't know exactly how it's going to parse it, but let's say it's got two ANDs and an OR. So we've fundamentally changed this where clause by adding additional elements here, and this is the core root of a SQL injection vulnerability: being able to alter the way that a SQL query is parsed. That seems like kind of a trivial thing to do, but what is this going to do? What is the thing I'm worried about here? Yeah, I'm worried about this AND password part, because I'm not 100% sure exactly how SQL is going to parse the ORs and the ANDs and what the precedence order is. I'd have to test it out on my local SQL server, and I'd also make sure that I'm using the same server as the application I'm trying to test against, because MySQL may parse it differently than Postgres or MSSQL.

So where did we have this problem before, of injecting stuff and having other stuff come after us? We had the same problem when we were injecting a command into a system function call, because we always had stuff appended after us. So what was our solution for that? A semicolon, and then what? Yes, comment out the rest of the line, exactly. So it depends on the exact syntax. We know this: or 2 equals 2. We know that if we say where username equals something, or true equals true, then when we execute that, what rows are going to be returned from the database? All of them, because the where
clause is going to evaluate to true for every single row, and so it'll return all the rows in the table. So the question is, how do you do line commenting in SQL? Do I need this other stuff? No. Does it matter what I put into the password? No. So then how does this change the query? The query will then look like: or 2 equals 2, and this part won't be in there. We'll still have the text of this AND password equals this, but how does the SQL parser parse this query now? How many clauses does the where clause have? Two? One high-level OR. So an OR, and then on the left is username equals whatever, and on the right is 2 equals 2. So essentially you can think of this part as all commented out, and the SQL parser will not look at anything there.

Yeah, it would depend on the parser. I believe all the parsers allow that, and that would be fine. A new line is totally fine in a SQL query. So you'd have to look up the exact documentation of the parsers.

But this doesn't mean we can't use the other technique. We can maybe do a semicolon, and then what can we do after that? Fundamentally, like I said, once we're in here, we're controlling the query from this point on. We can't change the select that we had, we can't change which columns, we can't change what table it's accessing. But depending on the system, we may be able to, let's say, insert into users and create a new admin user. Again, this depends on the configuration of the server. Some servers and some programming language libraries will allow you to execute multiple queries like this, separated by a semicolon. Some won't. If they do, you can insert arbitrary data into the database.

Don't you usually use 1 equals 1 rather than 2 equals 2? It's exactly the same. So this is called the tick or 1 equals 1 dash dash technique. We just saw this. Yeah, so this is actually a better way of showing the parsing. The parser ignores everything from the dash dash to the end of the line. You mean the source code? We may not have access to it,
so how would we know what to do with that? Yes, we'll talk about it. Fundamentally, that's what makes web application testing so difficult: often you don't have access to the source code, so you have to come up with a set of black box ways to test and figure that out. We'll go over those.

So this is the fundamental building block of SQL injection, and we can actually use it for more. We don't always have to do tick or 1 equals 1. That may not be something that interests us. We may want to do other interesting things.

We saw injecting into insert statements. If we can inject into one of the values of an insert, we can then create multiple users, and maybe we can control those new values in some interesting way.

Update. This would actually be a really bad one. If there's a SQL injection here in this password value, we can drop this where clause and just set every single user's password to whatever we pass in, which would be fun. We can also target a specific user by adding a where clause to say where user ID is 1, or username is equal to admin, whatever we want.

Delete statements. This would be very bad too. Why?
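The tick or 1 equals 1 dash dash technique discussed above can be sketched against the same kind of concatenated login query. As before, this uses Python's `sqlite3` as a stand-in, and the table, function names, and payloads are hypothetical. The second helper illustrates the remark that some libraries refuse multiple semicolon-separated statements: `sqlite3`'s `execute` is one that does, while other servers and drivers may behave differently.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")
    conn.commit()
    return conn


def login(conn, username, password):
    query = ("SELECT * FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    rows = conn.execute(query).fetchall()
    return "Access Granted" if rows else "Access Denied"


def user_count_after_stacked_attempt(conn):
    # Try to smuggle a second statement in after a semicolon.
    payload = "'; INSERT INTO users VALUES ('evil', 'pw') --"
    try:
        login(conn, payload, "x")
    except (sqlite3.Error, sqlite3.Warning):
        # This particular library rejects more than one statement per execute().
        pass
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    print(login(conn, "admin", "wrong"))           # Access Denied
    # The where clause becomes: username = '' OR 1=1, with the rest commented out.
    print(login(conn, "' OR 1=1 --", "anything"))  # Access Granted
    print(user_count_after_stacked_attempt(conn))
```

The password value is irrelevant in the bypass because everything after the dash dash never reaches the parser as SQL.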
They can delete all of the users in your database. So fundamentally, this is kind of a core problem here: the web application has permissions on the database to do essentially everything. Most web applications talk to the database and log in as a single user. They're not logging in as you. They usually do their own custom authentication and authorization. So the web application has more permissions on the database than your user has, and now with a SQL injection you can fundamentally do anything to the database that the web application itself could do.

Deleting is very bad. I probably don't have to reiterate that. But imagine how many websites actually have good backup policies and backup functionality that works. For those of you that work, think about it: does your organization have a good backup policy in case your database gets deleted or hosed? Yeah, a lot of times the backups are there, but when something actually happens is when you find out, oh, we missed some bug, we haven't been capturing this data for months. It's not a fun place to be in.

So, very briefly: semicolons can be used to separate queries. If the SQL server and the web application framework allow this, now you have total control of the application. You can write arbitrary queries.

So how do we identify SQL injection? There are multiple ways to approach it, and it comes back to this idea of doing testing with the web application where you generate some theory or hypothesis that says, hey, I think there could be a SQL injection vulnerability here; if I give this input and there is a vulnerability, then I should see this output. One way to do this is called the negative approach. The idea is that you try to get it to crash. For instance, if you put a tick in for a user parameter and the web application gives you a 500 error, that likely means there's a SQL injection vulnerability there, or some kind of vulnerability there. So that's a fun thing to test: just put a tick in there. I mean, maybe if
you're lucky the application will give you the whole error message, where PHP will tell you, hey, there's an error in your SQL syntax right here, and then it'll actually leak some of the SQL query for you. This is why it's always good practice on any web application to shut off all of those error messages.

The other way is really interesting, and much more subtle. The other way is the positive approach. Where I usually see this is on blog-type applications, where there'll be an ID parameter that says which blog article to look at, and usually the values will be one, two, three. So if you make a request for something with an ID of one plus one, and it gives you back the blog article that has ID two, you know that you've likely executed some SQL there, and so that query is vulnerable to SQL injection.

So these are the two different approaches, two different ways to look at it. I basically do both. I try just throwing stuff in. I usually try single quotes and double quotes, and usually just try to get it to crash. If you start with a crash, then you build from there.

So does SQL injection seem bad? Yes. Why? You can log in and bypass authentication, insert into tables, add data, delete your data. But we didn't talk about how to get all of your data, because we saw that with the query you execute, you're limited to only querying the table that's hard coded into that query. So with a SQL injection, yeah, maybe you can change it to give you all the users instead of just a specific user. But fundamentally you still want to be able to extract all the other tables. If there's one SQL injection vulnerability, I want to be able to get your credit card table, or your username and password table, and try to crack the passwords. So how do you imagine we do that? It comes down to the union operator. The union operator, which we briefly talked about with SQL, is used
to essentially merge the results of two queries, right? So you have your first SELECT query, and then you have a UNION with a second SELECT query to add the results of that query to the first one. Something like this: SELECT REPEAT('a', 1) UNION SELECT REPEAT('b', 10). That returns one row with a, followed by one row with ten b's, and that will be what's actually returned here.
So this is exactly what we're looking for, right? We wanted something that would come after our injection point. We can't change the first part of the SELECT statement, but we can inject a UNION operator followed by another SELECT statement in order to try to query another table. So if the original query is SELECT id, name, price FROM products WHERE brand is equal to some parameter b, we can modify this query by passing whatever, foo, tick, UNION, and then some special UNION query to try to query from another table.
But what's the important thing about a UNION query? What does it require the two SELECT statements to have? Yes, you need the same number of columns, and possibly the same types, depending on the system. And if we don't know what query is actually executing, that's not something we know beforehand. But we'll see we may be able to guess a little bit, depending on what output the application gives us. So over here, this would be a query for some e-commerce application that's going through and showing you all the IDs, names, and prices of the products, right? So you would probably start by guessing at least two columns, a name and a price, but you may not know exactly what they are. But if you did, you'd be able to do a UNION SELECT user, pass, middle FROM accounts, and now you've extracted all the usernames and passwords from the accounts table, so they will all be shown in there.
What else did I have to know here? I had to know the number of parameters, so three. Yeah, I also needed to know this table name. It depends on the database again, but like MySQL, which we just talked about as the
second most popular database, has a standard table that you can query that lists all of the other tables. So you query that table, you get the other tables, and you can get the column names of those tables. So that's the other thing, right? We need to know the table to access and also the columns of that table, but we can access that information pretty easily.
But now we have this problem: how do we determine the number and types of the query's columns? What do you think? What should we do? There's a table that gives you the names of the tables and columns, but here we don't know. All we know is that there's a SQL injection vulnerability somewhere in a SELECT query that's pulling these products, but we don't know that this SELECT statement used three columns. And we have to build a UNION SELECT that goes after it with the exact right number of columns, because if we put two it will give an error, and if we put four it will give an error. Yeah? Put a new SELECT statement in the union that has a specific type of value, like just a one if you're expecting an integer, and then see if it returns that value or not, and then try a different combination.
So essentially, one thing we can do to get rid of types is use NULL. If we use NULL, it doesn't matter what the type of the original value is; we can always use NULL and it will work. So we can get rid of types for now. Then we have to decide how many. So essentially we just try. We start with one: we input a UNION SELECT with one NULL. If it is not correct, it will give us a 500, it will throw some error. If it is correct, it will, I think, add one more result with NULL, NULL, NULL. So we try NULL, then NULL, NULL, then NULL, NULL, NULL. And for these we don't need a table; we don't need to select NULL from a certain table, so we can use this without knowing anything else. On the previous example, once we got to three we'd say, yes, this works. Now we know exactly the number of columns, and then we can
use what Adam just said (and I'm not talking about myself in the third person, sorry). Now we need to know the types of the columns. So we can use this same structure. We can use all NULLs, and remember, we're doing experiments, right? So we only want to modify one variable at a time. So we try putting a number into the first position, or we can put a string there, and we can try strings and integers in different positions in order to understand whether each is a string value or an integer value. Then, when we've done that, we can query these database-specific tables. I'm not going to go into these; there's tons and tons of information on the internet about them, but each of these databases has its own specific one.
So how do you know which database the web application is using? What was that? Maybe from the error message, if it tells you it's a MySQL error. Let's say it doesn't tell you that. Yes, that's a really good approach. So SQL is a standard that I believe most of the databases should support in the same way, but each of them, Oracle, MS SQL, MySQL, has database-specific functions, technology, syntax. In MySQL there's even a database-specific comment, so you could make a comment, and only if it's MySQL will it actually treat that as a comment. So you would input something that would mess up the query parsing on other databases, but if it's MySQL, it parses correctly, and so you'd be able to tell by testing which database it is. You can also check for any of these tables too; once you have the UNION query, you can try it for any of these.
And so this becomes the basis of your SQL injection attack. So now we've widened it: with any SQL injection on a SELECT statement, we can now extract all of the tables, columns, and rows in your database, which is something that's terrifying. On the web, cross-site scripting is still more prevalent than SQL injection.
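The UNION steps described above (count the columns with NULLs, then pull rows from another table) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database instead of PHP and MySQL, and the table contents, `build_db`, `vulnerable_search`, and `find_column_count` are all hypothetical names made up for the example. SQLite is also lenient about column types, so the type-probing step from the lecture is not needed here, although other databases may require it.

```python
import sqlite3


def build_db():
    # Hypothetical schema loosely mirroring the lecture's products/accounts example.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER, name TEXT, price REAL, brand TEXT);
        INSERT INTO products VALUES (1, 'widget', 9.99, 'acme');
        CREATE TABLE accounts (user TEXT, pass TEXT);
        INSERT INTO accounts VALUES ('admin', 's3cret');
    """)
    return conn


def vulnerable_search(conn, brand):
    # Deliberately unsafe: user input is concatenated into the query string.
    query = "SELECT id, name, price FROM products WHERE brand = '" + brand + "'"
    return conn.execute(query).fetchall()


def find_column_count(conn, max_cols=10):
    # Try UNION SELECT NULL, then NULL, NULL, ... until the query stops erroring.
    for n in range(1, max_cols + 1):
        payload = "x' UNION SELECT " + ", ".join(["NULL"] * n) + " --"
        try:
            vulnerable_search(conn, payload)
            return n
        except sqlite3.Error:
            continue  # wrong number of columns, try one more
    return None
```

Once `find_column_count` reports three columns, a payload such as `x' UNION SELECT NULL, user, pass FROM accounts --` passed to `vulnerable_search` returns the rows of the accounts table through the products query. The trailing `--` comments out the closing quote that the application appends.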
SQL injection is not as prevalent now, but the impact is what matters: one SQL injection vulnerability allows you to access everything. So Albert Gonzalez, the hacker we talked about way back in the beginning of class, who the Feds claim stole $242 million through hacking Heartland Payment Systems and TJ Maxx: they used SQL injection vulnerabilities to get in and extract those credit cards.
So one thing you may think of is, hey, as a defensive measure, I'm not going to output any errors. I'm not going to change the page at all based on whether the SQL query failed or not. Or maybe you can't do that, but I'm going to completely disallow any error messages. So the question then becomes, well, is this enough? More so than hiding the error messages, what if you don't even see the result? Well, that's actually a separate issue; we'll get to that one in a second. But let's say we can't see any error messages.
So let's say there's a news site where we access press releases through this id=5 parameter. A SQL query is sent to the database, which is exactly what we want: select title and description from press releases where id equals 5. What's different about this one from the other ones we looked at? Yes, one of the parameters is the integer 5. What's different about this query that's being sent from the other ones we looked at? It came through a URL parameter? But the other ones did too; you can assume they did. What's different about the query itself? It's not doing a SELECT *. What else? Oh, so let's say this is the one created from our user input, so it's not a hard-coded 5, it's using our input. Single quotes: our input's not in single quotes. So this is a great example of a SQL injection vulnerability that does not need single quotes, because of the way the query itself is written. So you can query for something like 3 plus 2, and make sure you get page 5, and then you can tell it's vulnerable. Okay, so all error messages are
filtered by the application. So, A, we have a thing where it's only fetching one page from the database, right? So fundamentally we can't try to extract additional rows with the UNION. And fundamentally we don't get any error messages. But we still want to be able to extract information from the database.
So we first try injecting 5 AND 1=1. We have to make sure this is all encoded properly, but if we inject 5 AND 1=1, what should that return to us? The same result, exactly. It creates this query, so press release 5 should be returned. We know that if we append AND 1=1 to a query, it's always going to be true. But when we're injecting other statements, we don't have any other information; we can't directly see anything else. So what we know is, if the same record is returned, if that page 5 is returned, then the statement must be true.
So essentially, what we can do is ask binary questions of the SQL server. If you think about it, with every request we make to the server we're getting back one bit of information, because we can put in a complex boolean condition, just like we did with AND 1=1, and we know that if that whole thing is true, then we'll get article 5; if not, we'll get an error. What does that tell us? So for example, we can ask the server, is the current username hacker? Press release 5 AND username = 'hacker'. The response to this will tell us yes or no. But this is super useless, because this is something where we just have to guess and randomly poke. We may be able to ask something like, is there a table that exists? So you could ask a question like that, or you could ask, does there exist a user in the users table named admin? But what we can actually do is use a super interesting binary search in order to extract information from the database. So what you do is you make a query. We know
which table has the list of tables and columns, so you would query that table. In SQL you have these really complicated functions that you can use, so you can say things like: if the substring, the first character, of the first row is less than whatever's in the middle of the alphabet, m or n. If it's less than m, that would be your conditional. You know, true or false, you've just halved the first character's possible range. So you can do that binary search on the first character until you find the exact value, then you can do that for the second character, third character, fourth character. You keep doing that, and so you figure out the one table name, and you keep going. Obviously you would not want to do this as a human, because that's horrible, but you'd write a program that's able to do this. So you do things like this: this would tell you if the username is less than some value, the question mark here, and then that would hone you in on exactly what the value is. So it's classic binary search, but you can do this all 100% automatically. And using this, by just being able to extract one bit of information at a time from the database, you can dump everything: all of the tables, all the columns, all of the data from the database, using just this one bit of information. It's a really powerful thing, and there are automatic tools to do this for you, so you don't have to do it by hand or write your own, but you know how you could do it if you had to. It's basically brute forcing it, but brute forcing in a smart way, so you brute force it essentially byte by byte. Yes, it's a series of queries. It's like playing that game, what's that game, 20 questions. Playing 20 questions, but with an unlimited number of questions, so you can just keep asking questions that narrow down the space of what the value is until you know exactly what each value is.
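To make the one-bit-at-a-time idea concrete, here is a minimal sketch of the binary search. It is an illustration under assumptions, not the tooling mentioned in the lecture: it uses Python's sqlite3 with an in-memory database rather than a real web application, `page_is_returned` stands in for "did I get press release 5 back or not", and the schema and the helper names `build_db`, `page_is_returned`, and `extract_string` are hypothetical. The SQL functions `substr`, `unicode`, and `coalesce` are SQLite's; other databases spell these differently.

```python
import sqlite3


def build_db():
    # Hypothetical data: one press release and one user to be extracted blindly.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pressreleases (id INTEGER, title TEXT, description TEXT);
        INSERT INTO pressreleases VALUES (5, 'Launch', 'We launched.');
        CREATE TABLE users (username TEXT);
        INSERT INTO users VALUES ('hacker');
    """)
    return conn


def page_is_returned(conn, id_param):
    # Vulnerable on purpose: the numeric parameter is concatenated without quotes.
    # Errors are swallowed, so the caller only learns "page" or "no page".
    query = "SELECT title, description FROM pressreleases WHERE id = " + id_param
    try:
        return len(conn.execute(query).fetchall()) > 0
    except sqlite3.Error:
        return False


def extract_string(conn, subquery, max_len=64):
    # Recover the subquery's result one character at a time by binary search.
    result = ""
    for pos in range(1, max_len + 1):
        lo, hi = 0, 127  # assume ASCII; 0 means "no character at this position"
        while lo < hi:
            mid = (lo + hi) // 2
            cond = f"unicode(substr(coalesce(({subquery}), ''), {pos}, 1)) > {mid}"
            if page_is_returned(conn, f"5 AND {cond}"):
                lo = mid + 1  # character code is above mid
            else:
                hi = mid      # character code is mid or below
        if lo == 0:
            break  # ran past the end of the string
        result += chr(lo)
    return result
```

Calling `extract_string(conn, "SELECT username FROM users LIMIT 1")` rebuilds the username using seven yes/no requests per character. Pointing the same function at the database's table-listing table (`sqlite_master` in SQLite) recovers table names the same way.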
So this is one super interesting type of SQL injection vulnerability, the blind SQL injection technique. The other interesting area is second-order SQL injection. The idea is your input is sent to the application and it's put in the database securely, right? So there's no SQL injection when the data goes into the database. But later on, somewhere else in the application, that data is pulled out of the database and used in a SQL query unsanitized, because the developer is erroneously assuming that the data had already been sanitized before it was put in the database and so it's safe to use. But fundamentally it's not. So you think of things like, maybe there's a guest book where users can put their usernames and passwords, and then there are statistics run over those about how many users, I don't know, do something, and those queries use the user input. This is important because even if the application escapes single quotes and does everything correctly on the way in, there can still be SQL injection vulnerabilities.
Another place where this comes up that's pretty interesting would be in a password-changing form. So let's say the user set their username to be john, tick, dash, dash. The application safely escapes it when it gets put into the database, and now the database holds the value john, single quote, dash dash. Now if the attacker changes their password, and the application is using the username from the database as part of the query, we've altered this second query's semantics and syntax and are able to change it to do whatever we want.
So let's say something like this: a register page where we're inserting the username and password into users. And here, even though we're concatenating strings together, we're using the PHP function mysql_real_escape_string, which is the way to prevent SQL injection vulnerabilities in MySQL. Then I'm changing the password. So now we get the new password, we then issue a query to get the username and
password from the users table, we then fetch it, and then we run the UPDATE to set the new password. And you can see here that the developer thought about escaping things, right? They thought, oh, this new password comes from the POST request, so it's definitely dangerous, I need to escape this. But they do not escape the username, because hey, it came from the database, right?
And I'll also mention here, for second-order SQL injection, you can also take this to nth-order SQL injection, where data comes from one table, and then maybe goes into another table, into another query, into another thing, and then it's used unsanitized. So essentially you should never trust user input, anytime, anywhere. If it could potentially be user input, or could be in the future, you should escape it before it goes into a query.
So how do you solve this? Well, it's easy: you just completely sanitize every single place where any possible user input can ever touch the database. Which actually is insane. I mean, you think about how many places that happens; it's almost all of the places. The other way to do this is stored procedures. Anybody use stored procedures before? Yeah? What database? MS SQL. Yeah, they're huge on doing that, right? So with MS SQL you write all the functions that you need, store them in the database, and you just call those functions.
The best way to do this is really with prepared statements. So the idea is you pass the string of the query you want to make to the database first, the database parses the query, and then you say, and this is the data that I want to be used. So in essence, let's go back here, and we'll see the exact example, you'll essentially pass select star from users where username equals data-to-come-later and password equals data-to-come-later. So the SQL server will then create this parse tree without any
user input. Then when you say, hey, here's data 0, here's data 1, it will put those into the proper places, and the SQL server knows that the data should never change the parsing of the query. So when you write something like this, you call prepare and you say select star from users where username equals colon name and password is equal to the SHA-1 of the password concatenated with a salt, limit 1. So does this string ever change? Is there any way an attacker can influence the string that gets passed to this prepare call? Using usernames? How? Using the URL? What's the type of the string? What's the value of that string? Exactly this string right here. It is never, ever a different string. You can look at it: there are no dollar signs in this string, so we're not doing any of that string interpolation or anything like that. We're not concatenating anything with this string. This string will always be passed to this prepare function exactly the same, no matter what input the user gives, and this is why it is fundamentally secure. So now the database will analyze it, parse it, create the parse tree, and then later you bind this name parameter to the value of dollar sign name, which came from the user, and the same thing with the password: dollar sign password came from the user. Then you have to tell the database to execute that query, and then you can fetch the result. Could you ashes correctly the string? No, you couldn't. Correct.
So with prepared statements, what's the downside here? You have to do like four times the work, man. It's so easy to just do query, dollar sign user, dollar sign name; so much easier than doing it correctly. Yeah? The database can compile the structure only once, correct? If I have millions of queries like this, it will just have to bind the parameters and not actually parse and compile each time, so I think this would be faster. That's smart, it is. I don't know if you can reuse prepared statements in different areas. It's basically wherever you would use a normal query, you
would use it in this way. First, you still have the same number of queries to write in your application. You could think that your program would be smart about it, but yeah, that would be interesting. You also can sanitize input, but as you can see, it's very difficult to properly sanitize every single input. It's very easy to make mistakes, and only one mistake will allow an attacker to control this. With a prepared statement it is not a problem, because the SQL server has already parsed the query, so it knows that what you're passing here is only data for this name parameter. You can think of it like variables: you're passing a variable called name, and it's not re-parsing that name. This is great. So we've just finished SQL injection. We'll get to cross-site scripting on Wednesday, and I'll see you here. Thanks.
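As a small sketch of why binding parameters is safer than concatenating strings, the example below runs the same login check both ways. It is an illustration under assumptions rather than the PHP code shown in class: it uses Python's sqlite3 with named placeholders (`:name`, `:pass`), and the schema, the `SALT` value, and the function names are hypothetical. The password is hashed in Python before binding instead of inside the SQL, and SHA-1 with a single salt only mirrors the lecture's example; it is not a recommendation for storing passwords.

```python
import hashlib
import sqlite3

SALT = "example-salt"  # hypothetical fixed salt, for illustration only


def hash_password(password):
    # Mirrors the lecture's "SHA-1 of password plus salt" example.
    return hashlib.sha1((password + SALT).encode()).hexdigest()


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute(
        "INSERT INTO users VALUES (:name, :pass)",
        {"name": "admin", "pass": hash_password("hunter2")},
    )
    return conn


def login_concatenated(conn, name, password):
    # Unsafe: the username becomes part of the SQL text that gets parsed.
    query = ("SELECT * FROM users WHERE username = '" + name +
             "' AND password = '" + hash_password(password) + "' LIMIT 1")
    return conn.execute(query).fetchone() is not None


def login_prepared(conn, name, password):
    # The query string is a constant; user data is bound after parsing.
    query = ("SELECT * FROM users "
             "WHERE username = :name AND password = :pass LIMIT 1")
    params = {"name": name, "pass": hash_password(password)}
    return conn.execute(query, params).fetchone() is not None
```

With the username `admin' --` and any password, `login_concatenated` lets the attacker in because the comment cuts off the password check, while `login_prepared` just looks for a user literally named `admin' --` and finds none.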