Welcome back. We now start a new thread, on software vulnerabilities and web application security. Much of this talk will be on web application security. We would first like to figure out what the vulnerabilities are, what the attacks are and then, of course, what the defenses are. For that we should understand, to some extent, how a web application is constructed and what exactly is going on at the client side and the server side.
Here is a client, basically a browser, and the server. What you typically have is a multi-tiered server, but for simplicity we will consider just two tiers: presentation plus application, and the database tier. So where is the vulnerability, and what sort of attack are we talking about? Let us look at the goals first. Web attacks can have various goals, for example stealing user credentials such as a password or a PIN. There are also things like DoS or DDoS. What we are going to consider right now is more along the lines of stealing user credentials.
A web application very often produces HTML. There is an HTTP request from the client to the server and a response back, and typically this response includes, amongst other things, HTML plus JavaScript. The HTML often contains a form. The form is presented to the user on the screen, with multiple fields to be filled in: name, address and so on. We refer to these as HTTP request parameters. After the fields are filled in, the parameters are sent to the server in the next HTTP request, not in the response. So there is an HTTP request and an HTTP response, then the parameters go out in the next HTTP request, and so on.
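To make the idea of request parameters concrete, here is a minimal Python sketch of how filled-in form fields are encoded for a request and decoded again at the server. The field names and values are made up for illustration; they are not taken from the lecture slides.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields, as a user might fill them in on the browser.
form_fields = {"name": "John Brown", "address": "12 Main Street"}

# The browser encodes the fields as name=value pairs joined by '&'.
encoded = urlencode(form_fields)
print(encoded)

# At the server the same string is decoded back into parameters.
# parse_qs returns a list of values per name, since a name may repeat.
decoded = parse_qs(encoded)
print(decoded)
```

The point to notice is that the server receives plain strings. Nothing in the encoding itself says that the name field must contain only letters.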
Now the vulnerability here is that the software behind the page is expecting a user name, which could be an alphanumeric string or just an alphabetic string such as John Brown. But the user might type in something that looks more like JavaScript code, and that goes to the server, which treats it as if it were an ordinary name. The server did not anticipate this: the form parameter should have been an alphabetic string with a simple user name or a password, but instead it contains JavaScript.
What happens is that the form parameter is sent through a request, and there is often an opportunity for that parameter to be reflected back. For example, if the user says his name is John and that John goes in through the HTTP request, a dynamic web page might create something like "Hello John". So whatever the user typed, that form parameter, is now reflected back. Suppose that instead of John there is a script tag, which is an HTML tag. Whatever is between the start script tag and the end script tag is supposed to be interpreted as JavaScript. If that gets reflected from the server to the client, it will execute on the browser, because the browser has no way of knowing whether it is part of the original page or something that was reflected back. That script could do all sorts of things: for example, it could read cookies and send them to some attacker's site.
So these are some of the vulnerabilities. Basically, the user has typed in some input and that input has not been sanitized. In the same way, unsanitized input could modify an SQL query: it could completely alter the semantics of the query and do something quite different from what was expected or anticipated.
We will see such examples in this lecture, and we will also see a demo. Then in the lab, perhaps the next lab on Monday, you will start working on some of these attacks using a deliberately buggy application which is available on the net. It is called DVWA, Damn Vulnerable Web Application. This application has multiple levels of security. Level 0 is the lowest level, which can be easily attacked, and then you have level 1 and level 2. If you set it at level 0 it is easy to attack, at level 1 it is somewhat harder, and at level 2 it is harder still. There is a full menu of attacks, including cross-site scripting, both persistent and non-persistent, SQL injection, cross-site request forgery and so on. In this course, in this workshop, we will concentrate on two of these: cross-site scripting and SQL injection.
A brief word on cross-site scripting, or XSS. A website is said to have a cross-site scripting vulnerability if it inadvertently includes malicious scripts crafted by an attacker in pages returned by it. These malicious scripts are not intended by the developer of the website, but somehow they got into the page. Notice that most pages these days are dynamic web pages. That is to say, the original web page will include many other things, and some of those things could have come from a user. What if something malicious also got included in the process?
An example of something malicious that could get included is a bunch of JavaScript like this. You have an opening script tag and a closing script tag, which are standard HTML tags, and in between you have something that might look very innocent but is actually a bunch of JavaScript, and it is part of the web page. So when the page is received by the browser, the script gets executed.
The malicious code may, for example, read browser cookies on the victim's machine and ship these off to an attacker's web server. There are some very interesting attack vectors that you could include. There is full script injection, which looks like this. There is also partial script injection, which does not even have these tags but is a fragment of JavaScript that becomes malicious when it is concatenated at the right place in a web page. It is not even necessary that the attack vector is JavaScript; it could simply be HTML. So there are many different and creative attack vectors, which we encourage you to try out and experiment with in the lab.
Now, there are two types of cross-site scripting. The first, persistent, is easier to understand; the non-persistent type is a little more complicated. What is persistent? The malicious code on a web page is saved on the web server, and when an innocent user downloads the web page, the malicious scripts execute on that user's browser. To start with you have a web page in HTML and JavaScript with which everything is fine, but then for some reason some code gets injected into this page and is saved on the server.
An example of this is users updating their profiles on a social networking site. These profiles may be read, that is downloaded, by other users through their browsers. Suppose the owner of a profile puts some JavaScript in there. Then when a visitor downloads that profile, the script that was injected will also execute on the visitor's browser, and it might do things that are not expected. So this is an example of a persistent attack: you load it once and 100 different people download the same thing, because it is saved there on the web server.
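The profile example above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the dictionary stands in for the web server's storage, and the names `save_profile`, `render_profile` and the user `mallory` are hypothetical.

```python
# In-memory stand-in for the profile store on the web server (hypothetical).
profiles = {}

def save_profile(user, about_me):
    """Store the profile text exactly as submitted, with no validation."""
    profiles[user] = about_me

def render_profile(user):
    """Build the HTML that every visitor to this profile downloads."""
    return ("<html><body><h1>" + user + "</h1><p>"
            + profiles[user] + "</p></body></html>")

# The profile owner injects a script once...
save_profile("mallory", "<script>alert('hello')</script>")

# ...and every later visitor receives the same stored script.
page_for_visitor_1 = render_profile("mallory")
page_for_visitor_2 = render_profile("mallory")
print(page_for_visitor_1)
```

The injection happens once, at save time, but the script is served on every download, which is what makes the attack persistent.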
So that is not so difficult to understand. The more complicated thing is a non-persistent XSS attack. It exploits the fact that some servers echo back certain user input to the client without validating it. An example: the page asks you for your name and you type in John. That John is sent as an input parameter from the client to the server, and the server uses it to create a dynamic page which includes your name. The server then sends back the HTML to the client, which displays on the screen "Hello John, have a happy day". So whatever the user typed in is sent and reflected back. Suppose he enters his name as Prashant; the server then responds with "Hello Prashant, good morning to you".
Or you might have a search field in a toy store's catalog, and you are searching for a Barbie doll. In the search field on the browser you type in Barbie doll. That goes as an input parameter to the server, the server looks through its catalog, its entire database, to find whether the Barbie doll is in stock, does not find it, and responds with "I am sorry, we do not have Barbie doll in stock". The same thing that was sent is echoed back. So there are many occasions to echo back user input.
So the first thing you want to do, to determine whether a particular website is XSS vulnerable, is to determine whether it simply reflects user input without sanitizing or validating it. In this case, for example, the server has echoed back the person's name. Now, instead of typing in his name as Prashant, what if the user types in <script>alert('fire')</script>? This is a bunch of JavaScript, and the browser knows it is JavaScript because it is enclosed within script tags.
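Here is a minimal Python sketch of the echo behaviour just described, together with the simplest possible check for it. The function names are hypothetical, and a real test would of course go through HTTP rather than call the page builder directly.

```python
def greeting_page(name):
    """Vulnerable on purpose: echo the parameter into the page unchanged."""
    return "<html><body>Hello " + name + ", have a happy day.</body></html>"

def reflects_verbatim(render, probe):
    """Return True if the page built from `probe` contains it unmodified."""
    return probe in render(probe)

# An ordinary name comes back inside an ordinary greeting.
benign_page = greeting_page("Prashant")

# A script typed in as the "name" comes back as part of the page markup.
attack_input = "<script>alert('fire')</script>"
attack_page = greeting_page(attack_input)

is_vulnerable = reflects_verbatim(greeting_page, attack_input)
print(attack_page)
print(is_vulnerable)
```

If the probe string comes back unmodified, the page is a candidate for non-persistent XSS, since a browser receiving `attack_page` cannot tell the injected script apart from the rest of the page.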
So he types this in for his name, and the server-side software does not check that a name cannot look so weird. It simply echoes it back, saying hello followed by this thing, and what happens then is that it gets executed by the browser and displays an alert box, for example.
Here is a simple example. This web page asked me for my name, so I typed in my name, and the next thing you know it reflects it back. How is it reflected back? The server software does it. So it says "Hello Bernard". Now, instead of typing my name I type something else, that same kind of thing: <script>alert('hello')</script>. I type this, it goes to the server, and the server says hello followed by this string. It just assumes that this string of characters is a valid name. It does not check whether the string is actually alphabetic or whether there are special symbols like a bracket and so on. It simply reflects it back and, lo and behold, when it gets reflected it gets executed, and the JavaScript that executes displays an alert box with hello inside.
So this is the first problem, a sign of things to come: the website simply reflects back user input without validating or sanitizing it.
Now, there are several ways in which you can try to prevent XSS. You can do it at the server side and you can do it at the client side. We will discuss many of these in great detail because this is one of our research areas here. One possibility is to validate and filter all user input at the server side, for example by maintaining a blacklist of everything that should be filtered out of user input. For example, if in a name you see a symbol like a bracket or a quote, you filter it out straight away.
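A blacklist filter of the kind just described might look like the following Python sketch. The particular patterns and the name `blacklist_filter` are assumptions made for illustration, not the filter used in the demo, and a blacklist like this is easy to get wrong, so treat it as a picture of the idea rather than a complete defense.

```python
import re

# Assumed blacklist: opening or closing script tags, in any letter case...
SCRIPT_TAG = re.compile(r"</?\s*script[^>]*>", re.IGNORECASE)
# ...and characters that should never appear in a plain name.
BLACKLISTED_CHARS = re.compile(r"[<>\"'()]")

def blacklist_filter(value):
    """Remove script tags first, then any remaining blacklisted characters."""
    value = SCRIPT_TAG.sub("", value)
    return BLACKLISTED_CHARS.sub("", value)

print(blacklist_filter("Bernard"))
print(blacklist_filter("<script>alert('hello')</script>"))
```

An ordinary name passes through unchanged, while the script input loses its tags, brackets and quotes, so what is reflected back is no longer executable markup.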
So single quotes, double quotes and angle brackets should not appear in an email address, for example. A better solution in some cases, or in most cases, is the equivalent of a whitelist. If you are asked to type your email address, the server makes sure there is an @ sign, there are dots and so on, so that the input follows a certain template. So there are whitelist approaches and there are blacklist approaches. The blacklist is not just about individual characters: you also look for the script tag, for example, and if the filter sees a script tag it removes it. Typically you do the filtering using regular expressions. So for both the blacklist and the whitelist you might write regular expressions on the server side as part of your server application. There are also special functions in PHP which can do some of this for you. We will see that in the demos today and on Monday.
Now another example of injecting code. XSS is basically a code injection attack in which you are injecting JavaScript. You could also inject SQL. Here is the deal. You must have seen this kind of URL; it is called an extended URL with a query string. There is the URL, then a question mark, then a parameter name and a parameter value, and between two name-value pairs there is an ampersand sign. So it says that ID is equal to this and password is equal to this. Where this came from is a form on the browser, in which one of the fields was an ID field and the other was a password field. You entered this and you entered this, and if it is an HTTP GET request, this entire string is appended to the URL. The server software will typically take these values and use them to complete a query.
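The two ideas in this passage, the extended URL and whitelist checking with regular expressions, can be put together in a short Python sketch. The URL, the parameter values and both patterns are made-up assumptions; in particular the email pattern is a deliberately simplified template, not a full address grammar.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical extended URL; host, path and values are invented.
url = "http://www.example.com/grades?id=123&password=secret99"

# Everything after '?' is the query string of name=value pairs joined by '&'.
query = urlsplit(url).query
params = {name: values[0] for name, values in parse_qs(query).items()}
print(params)

# Assumed whitelist templates: digits only for an ID, and a
# simplified local@domain.tld shape for an email address.
ID_PATTERN = re.compile(r"[0-9]{1,10}")
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")

def is_valid_id(value):
    """Accept the value only if the whole string fits the ID template."""
    return ID_PATTERN.fullmatch(value) is not None

def is_valid_email(value):
    """Accept the value only if the whole string fits the email template."""
    return EMAIL_PATTERN.fullmatch(value) is not None

print(is_valid_id(params["id"]))
print(is_valid_id("123 OR 1=1"))
```

Using `fullmatch` rather than a partial search matters here: the whole input has to fit the template, so anything appended to an otherwise valid value is rejected.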
So this is a query that the server already has, with places left for these values. It simply takes this value and puts it in there, and takes this other value and puts it in there. Most of you will recognize this as standard SQL. The template for the most basic SQL query is: SELECT the following columns, or attributes, here student ID and GPA, the grade point average, or what in IIT we call CPI, cumulative performance index. That is your score, sort of. So select these two attributes FROM a students table, say of the students admitted in the year 2009, WHERE this predicate holds: the student ID is this and the password is this. You have to match this value in the database under the student ID column and this value under the password column.
The server-side software completes this query. Part of it may be prepared already; the software just extracts this value and this value from the HTTP request message and puts them in place. The finished SQL query is then submitted to the database engine, the SQL engine, for execution.
Now, again, how can we abuse this? You have to think about how the hacker might be planning things. Instead of typing a name for the student ID he might type something like this, but more interestingly, instead of a normal password he may type this in, so that the predicate reads: password equals this OR x equals x. The SQL engine parses the WHERE clause as a Boolean expression, A AND B OR C, that is AB + C, where plus means OR. Since AND binds more tightly than OR, this is (A AND B) OR C. Now C is x equals x, which is obviously always true, and if C is true then the entire expression is true.
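The always-true predicate can be tried out safely against an in-memory SQLite database. In this Python sketch the table name, column names, sample rows and the exact attack string are assumptions chosen to mirror the description above; the query is built by string concatenation on purpose, because that is the vulnerable pattern being discussed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students09 (student_id TEXT, password TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students09 VALUES (?, ?, ?)",
    [("101", "apple", 8.1), ("102", "mango", 9.3)],
)

def build_query(student_id, password):
    """Vulnerable on purpose: paste request values straight into the SQL text."""
    return (
        "SELECT student_id, gpa FROM students09 "
        "WHERE student_id = '" + student_id + "' AND password = '" + password + "'"
    )

# Correct password: one row. Wrong password: no rows.
honest = conn.execute(build_query("101", "apple")).fetchall()
wrong = conn.execute(build_query("101", "guess")).fetchall()

# Injected password: the predicate becomes (A AND B) OR 'x'='x'.
attack_query = build_query("101", "guess' OR 'x'='x")
attack = conn.execute(attack_query).fetchall()

print(attack_query)
print(honest, wrong, attack)
```

Because the OR term is true for every row, the attack query here returns the records of all students, not just the one whose ID was typed in.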
So the entire expression is true regardless of what you put in here. What it might do next is return the GPA for the student with this ID, or it might return all of the students' records and display them on the screen. That really depends on the details of how the SQL engine works, but in any case this is a serious problem, because it is giving you information about other people that you are not supposed to see.
So this is the first abuse. The second abuse could be something like this: in the student ID field you type 123 OR 1 equals 1, followed by this symbol, which in some dialects of SQL marks the start of a comment. The way this is parsed is: student ID is equal to 123 OR 1 equals 1, and ignore everything after the comment symbol. Now 1 equals 1 is obviously true, so this whole predicate is true and the query will return either this record or perhaps some other records as well. So this is another attack vector.
Yet another attack vector is much more dangerous. Here the student ID is 123, the statement is then terminated by a semicolon, and after that comes DROP TABLE student09. This is very dangerous, because what this query does is drop the entire table. So an attacker might not just read your tables but even modify them and, worse still, delete them.
These are some examples. They are all in the text, we will show you more examples in the demo, and you will be encouraged to try out more of these in the lab.
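The last two attack vectors can also be tried against a throwaway SQLite database. This Python sketch rests on several assumptions: the table and column names are invented, `--` is used as the comment marker because that is what SQLite accepts, and the second attack deliberately uses `executescript`, since Python's `execute` accepts only a single statement. Whether stacked statements work in a real application depends on the database API it uses.

```python
import sqlite3

def fresh_db():
    """Create a throwaway in-memory database with two sample students."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE students09 (student_id TEXT, password TEXT, gpa REAL);"
        "INSERT INTO students09 VALUES ('123', 'apple', 8.1);"
        "INSERT INTO students09 VALUES ('456', 'mango', 9.3);"
    )
    return conn

def build_query(student_id, password):
    """Vulnerable on purpose: paste request values straight into the SQL text."""
    return (
        "SELECT student_id, gpa FROM students09 "
        "WHERE student_id = '" + student_id + "' AND password = '" + password + "'"
    )

# Attack 1: always-true condition, then a comment that hides the password test.
conn = fresh_db()
comment_attack = conn.execute(build_query("123' OR 1=1 --", "anything")).fetchall()
print(comment_attack)

# Attack 2: end the statement with a semicolon and append DROP TABLE.
conn = fresh_db()
conn.executescript(build_query("123'; DROP TABLE students09; --", "anything"))
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(tables)
```

In the first attack the password check never runs because it sits inside the comment, and in the second the table no longer exists once the script has finished.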