All right, it is 10:30, second-to-last class. Everyone excited? You don't seem that excited. The semester's almost over, it's almost summertime. Where's all the, like, ooh? All right, I guess you guys all have jobs that you don't wanna go to or something this summer. Okay, so the plan for today is we're gonna finish everything we're gonna talk about for web security, and then on Friday, and I'll remind everybody on the mailing list, we'll have a practice CTF in class to prepare for the final CTF. Once the servers are all set up, which should be today, I will communicate the information about how to access everything to the team leaders. The team leaders are responsible for communicating to their teams, right? There's 100, well, allegedly there's 120 of you in this class, right? So contacting each of you directly is kind of a problem. That's why each of your teams has a leader: that's who I'll talk to, and that leader will be responsible for distributing that information to everyone else. So please, if you can't get access to a computer for Friday, let me know. I'll try to see if maybe we can reserve a computer room and have a little overflow section over there, just like we'll do for the actual final. So let me know, because I want everyone to actually know what's going on, know how everything's gonna work, know how we're gonna submit flags, how the whole system looks. That way on Wednesday, no, Monday. On Monday, okay, I've got a final on Wednesday. On Monday, we'll be able to hit the ground running and hopefully we won't have too many problems. Questions on the plan? Anything? Projects due Friday? Project reports due Friday? If there are no questions, I'll just stand here for a while. Magic, success. So you tell me what was different between that time and the other three times.
Okay, so we've been talking about manipulating and tampering with client-side information, right? We saw that the values inside hidden form fields are sent to the client's browser, and therefore we as hackers can modify and alter that data. But hidden form fields are not the only place where the server asks us to store data on the client. Cookies can also be used to store state on the browser, right, this is how sessions work. The ideal way of using cookies for sessions is that you give the browser some random value and you store on the server side that this random value corresponds to this user. That way you can link all those requests up together. But cookies can be any arbitrary data, they don't have to just be a random session cookie. And so, just as we saw in the example on altering hidden form fields, we can also alter cookies trivially, right? So if a web application, for instance, is storing your user ID in the cookie, what happens if you just change that number and make a new request? If it's using that as your session information, it's gonna think that you are that user. And there's actually all kinds of nonsense that can happen here. What if I can guess your session? What if you're using only, I don't know, six bits of random information or something to generate your session? What if your sessions aren't actually random? What if they're predictable? Then I can maybe try to predict your sessions, and once I can predict a session, I can pretend to be other people on your system, right? You'll have no way of knowing, because this bit of information is stored on the browser. You just get this back and you see, oh, new session, great, right? URL parameters, right?
Oftentimes, applications may want to store state or some information in the URL parameters. When you click on a link, right, they use the GET parameters there. But that comes from us, from the user. Perhaps, instead of before when the price was a hidden field on a form, maybe the price is a parameter in the URL. We can change those parameters, we can do whatever we want, right? Why would a developer want to do this? Any ideas? It's easier, right? It's actually a lot easier to develop an application like this than to deal with sessions and store the session state on the server side. Because now you need a database, right? You have to have some database or something to store that server-side state. It would be a heck of a lot easier to write web applications by just pushing all that state onto the client, right? But the problem is now the client can change and tamper with that data. And the other thing that maybe comes to some of your minds, so who's taking, like, a crypto class? A kind of crypto, right? So one of the ideas is, oh, you just encrypt that information, right? That way the client just gets this opaque blob, and when they send that back to the server, the server decrypts it and then uses the values, right? I will argue that that can be secure, but it is very difficult to make that kind of thing secure. For instance, let's say you're encrypting the price. You're putting the price in the URL parameter. You encrypt it, right? But you encrypt the price separately from, let's say, the order ID. And then you use that information later. So why can't I go through your website and buy, I don't know, what's something that's really cheap? Try to think of something that's cheap. An LED, what is an LED, like 10 cents? I can go buy an LED and get your encrypted price that I know is 10 cents, encrypted.
I store that, and then later on I want to buy a monitor, and I replace the price of that $2,000 monitor with the 10 cent price there. One answer is that you would use something like a random salt. But even if you use a random salt, the server has to store what salt was used on that request, and it has to do more work to link those requests together. Or you have to actually HMAC the entire thing that you send to the client. But even then there are still tricks you can play. Actually, maybe not, I don't know what the current state of it is, but I would argue it could never be secure executing on the client side. Just like DRM, I mean, it's been a cat and mouse game for the past 20 years. Anything where the client has complete control of the hardware and software, I don't think you could ever make it secure, and you've got key issues for encrypting that data. Well, the key is just on the server. I mean, if you can break into the server side of the application, you can do whatever you want anyway, you can get the keys. So you assume the keys on the server are secure. But the problem is there are a lot of subtle problems that can come up when doing crypto. You're basically giving the client access to an oracle that will encrypt whatever they want. Maybe they can forge a new token, essentially, or get you to encrypt new data that means something in a different context. All kinds of crazy stuff. So yeah, I'd say in general it's much better to rely on the session, because the session is much simpler, right? Give them a random value, and then you store on your server side any state that you need associated with that session. Can you do the client-side version correctly? Yes, you definitely can. Is it easy for an average developer to do correctly? No. I think it goes back to don't roll your own crypto, don't try to write your own crypto library.
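To make the "HMAC the entire thing" idea concrete, here is a minimal sketch in Python, not anything from the lecture slides. It assumes a made-up server key, made-up item names, and a hypothetical `item|price` message layout. The point is only that the tag covers the item and the price together, so a tag obtained for the 10 cent LED does not verify when it is pasted onto the monitor.

```python
import hashlib
import hmac

# Hypothetical secret: in a real deployment this lives only on the server.
SERVER_KEY = b"demo-key-kept-only-on-the-server"


def sign(item_id: str, price_cents: int) -> str:
    """Return an HMAC-SHA256 tag over the item and its price together."""
    # Assumption: item ids never contain '|', so the message is unambiguous.
    message = f"{item_id}|{price_cents}".encode()
    return hmac.new(SERVER_KEY, message, hashlib.sha256).hexdigest()


def verify(item_id: str, price_cents: int, tag: str) -> bool:
    """Recompute the tag on the server and compare in constant time."""
    expected = sign(item_id, price_cents)
    return hmac.compare_digest(expected.encode(), tag.encode())


# The client legitimately receives a tag for the cheap item.
led_tag = sign("led", 10)

print(verify("led", 10, led_tag))          # honest request
print(verify("monitor", 10, led_tag))      # LED price replayed on the monitor
print(verify("monitor", 200000, led_tag))  # LED tag reused with the real price
```

Only the first check succeeds. This sketch does nothing about replaying an old but valid pair (for example a sale price that has since expired), which is one of the subtle problems that makes the server-side session the simpler choice.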
So yeah, some frameworks, like Flask, will actually store session information in the client cookie, but they use an HMAC, and the library takes responsibility for doing that correctly. Cool. So yeah, maybe we can manipulate query parameters to try to change the price of things. Okay, we're back to that. Okay, what's the Referer header? What is the Referer, and is this how you spell "referrer"? Right, so it's in what kind of header? An IP header? A TCP header? No, it's an HTTP header. Yeah, it's an HTTP header that gets sent on HTTP requests. So an HTTP request can have a Referer header, and this header will be Referer, colon, space, and then a URI that shows where the user came from, what page they were on when they clicked a link. So it's defined, blah, blah, blah, in the spec, and the idea is the user agent is supposed to send this, right? So how do you actually spell referrer? Two Rs. Two Rs in the middle, yeah. So you have to ask yourself, was this a very clever ploy to save one byte on every single HTTP request, because they knew this would be used by millions of devices across the world? No, it was just a typo. And nobody caught it until people were already using it. And at that point, it was like, ah, well, shoot, okay, I guess we're not going to change it now. So this is literally part of every single request. Everything you do in a browser, it's sending this header, and it happens automatically. So the question I want you to think about is, can this be trusted? If you're a web server, can you say, oh, the user came from the cart page, that means that what they're sending me is good, that means they're a logged-in user? No. And why not? You're shaking your head. It could be a malicious browser. Yeah, it could be a malicious browser. It could be us. We've already seen how to write an HTTP server. Equivalently, we can write an HTTP client.
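As a rough illustration of that last point, here is a short Python sketch of a client that claims whatever Referer it likes. The URLs are made up (`shop.example` is a placeholder, not a real site), and the request is only built and inspected here, not sent.

```python
import urllib.request


def build_request(url: str, claimed_referer: str) -> urllib.request.Request:
    """Build a GET request whose Referer header is whatever the caller says."""
    # Nothing checks that we ever visited claimed_referer: the header is just
    # a string chosen by the client.
    return urllib.request.Request(url, headers={"Referer": claimed_referer})


# Hypothetical URLs, for illustration only.
req = build_request(
    "https://shop.example/checkout",
    "https://shop.example/cart",
)

print(req.get_method())           # GET
print(req.get_header("Referer"))  # https://shop.example/cart

# Passing req to urllib.request.urlopen() would send it with that header.
# Only do that against a server you are allowed to test.
```

From the server's point of view this request is indistinguishable from one a browser sent after a real click on the cart page.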
We can use a command line tool like curl, which allows us to completely control the headers that we send. So we can make it appear as if we came from any URL we want. Yeah? No, you shouldn't, but we did for years and years and years. Yeah? To do what? You want to share? Sorry. Let's see. You don't have to name names. We did a pay-per-view app that was tied to streaming, and we used the Referer to make sure that they were coming from the right page. I mean, this was '99. You didn't think about that stuff back then, and people didn't have near the tools that they have these days to do that. It took a much more sophisticated user to do it. So that was like an authentication thing? You authenticated them first, and then at each step of the process you just assumed that if they came from the previous page, they were authentic. Thank you. They only did it in one spot. But yeah, definitely. We had it in all sorts of places. I mean, I can think of several apps off the top of my head that we did back in that time frame where we used it all along. Yeah, it's very easy to do. And this is part of why you're taking this course, or why you've taken this course at this point: to think about things this way. To think, is this actually secure? Who is sending me this Referer header? The client, it's coming from the client. Therefore, it's dirty and has to be treated as untrusted. There's nothing that guarantees that the request actually comes from there. And part of the problem is that developers can assume that because it's an HTTP header, it's trustworthy. But that is often not the case. So yeah, this is actually a great example, Eric. Oftentimes developers use the Referer to control access. They say, okay, if you're coming from the login page or the home page, then I know you're still logged in. So if I make a request to that next page and give it the Referer header it expects, I'm in. The Referer also causes problems now with HTTPS versus HTTP.
So you're on an HTTPS page, which means that the only parties that know you're visiting that particular page should be your browser and the server that you got that page from. Everybody on the network knows you're making a request to google.com, but they don't know exactly which page you're requesting, because it's all encrypted. But if I click a link on one of those pages and the browser sets the Referer header to the HTTPS URL that I was on, right, it's leaking what page I was on. So now there are rules about when browsers will not send Referer headers when they come from HTTPS pages. I actually don't know all the rules, but I think they will not send it to any HTTP page, because then it would be leaked in the clear. I think it will send it to HTTPS pages, but I'm not 100% sure. Maybe it won't do cross-domain either, since that could be an issue. So with curl, we can use the -H option to specify arbitrary headers in our request, and this allows us to change or add any kind of header we want. Okay, so we talked about forms. What was one of the first cases where I tried to convince you that JavaScript is useful on the web? Because it helps the user, right? So client-side validation. Validation of what? Like, you don't want to go back to the server just to find out whether you entered a correct email or something. Right, so if you think about an HTML form or user input, right? Using JavaScript to check user input. And actually, if you look at the HTML5 spec, it's really cool. You can put a regular expression in an attribute, and the input has to match it. You can actually do this without any JavaScript, which is pretty cool. You can say that certain fields are required. You can say minimum and maximum lengths. You can say that this field is an email, so the browser will do email validation for you. You can specify patterns. You can do custom validation using JavaScript.
So if you have this on your web page, and you get a request from the user from that form, right, we know a form will generate either a GET or a POST request, depending on the method of the form, to the action URL on the form element. So you get a request. What can you assume about that input data? Let's say I have a form. One of the fields is required. The second one is an email, and the third one has some pattern that I specify. So now I get a request. They hit enter, or whatever. I get a request to the action of that form, some POST request. What can I assume about the input? It's a string. All of them? Yeah. It has been validated on the client side, and I'm sure that those fields are as they should be. So we're assuming that those fields are validated: the first field was required, the second field is an email, and the third field matches some pattern. Yeah, so the problem comes with assuming, right? We can hope that our validation was done properly. In 99% of our benign cases, just regular people using our application, these things will be done: the first field will be there, the second field will be an email, and the third one will satisfy the pattern. But we have absolutely no way of ensuring that every single request that is made satisfies these things. Because as we saw, you could literally change the HTML page in your browser, right? You can remove the required attribute. You can remove the type email. You can remove the pattern. Or you can just use curl to make that HTTP request directly, right? So you can make direct POST requests to that server, putting in any values you want for that first field, the second field, and the third field. So all of these can fundamentally be bypassed. And once again, it's because this data is living on the client, right?
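Since every one of those client-side checks can be skipped, the server has to repeat them on whatever actually arrives. The sketch below is one way that could look in Python for the three-field form described above. The field names (`name`, `email`, `code`) and both regular expressions are assumptions made up for the example, and the email check is deliberately loose, not a full address validator.

```python
import re

# Illustrative patterns only; both are assumptions for this sketch.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
CODE_RE = re.compile(r"[A-Z]{3}-\d{4}")  # stands in for the form's pattern


def text_field(form: dict, key: str) -> str:
    """Return the submitted value as a string, or '' if missing or not a string."""
    value = form.get(key, "")
    return value if isinstance(value, str) else ""


def validate(form: dict) -> list:
    """Re-check on the server everything the HTML form asked the browser to check."""
    errors = []
    if not text_field(form, "name").strip():
        errors.append("name is required")
    if not EMAIL_RE.fullmatch(text_field(form, "email")):
        errors.append("email is not a valid address")
    if not CODE_RE.fullmatch(text_field(form, "code")):
        errors.append("code does not match the expected pattern")
    return errors


# A request from an ordinary browser that ran the client-side checks.
print(validate({"name": "Ada", "email": "ada@example.com", "code": "ABC-1234"}))

# A hand-built request that skipped them entirely.
print(validate({"email": "not-an-email", "code": "1 OR 1=1"}))
```

The first call returns an empty list and the second returns three errors. The browser-side checks stay useful for saving round trips, but only the server-side result decides whether the request is accepted.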
We're asking the client to do some validation, which is great. It actually helps us cut down the amount of communication. But the downside is that we can't trust anything that comes back to us. We always have to perform that validation ourselves on the server side, right? This is true whether it's a browser or a mobile application. With a mobile application, well, how many people have rooted their Android phones? I think a lot of people haven't, right? So if you root your phone, you can change the proxy or firewall settings to redirect all traffic going out of your phone into, let's say, your laptop, where you can run a man-in-the-middle SSL proxy and dump all the traffic that's coming from your phone. So you can use HTTPS from your mobile device to the server to make your REST requests or whatever, and I can still sniff all of that information. So now I, as an attacker, know exactly what your HTTP endpoints are, and I can make requests to them, and my requests are not going to go through the validation that happens in the app, so I can feed any arbitrary information in there. It's actually really fun to man-in-the-middle your own phone. It's really cool. I think I had to do that for crawling the Google Play Store. I man-in-the-middled an Android device to see what requests it was making to the Google Play Store so that I could mimic them in a Python script. So Google Play thought it was a phone trying to download an app, but it was really a Python script. Okay, so that was all about trusting client-side data, right? Fundamentally, this is one of the other lessons from the web: you need to be very paranoid, right? Anything that comes in externally should not be trusted.
Maybe even things that come from inside should not be trusted, just to be safe, unless you specifically have a reason to trust that information. Okay, so we talked about web applications. We talked about how web applications are complex, right? Much more complex than just a simple website. We talked about different types of web applications. We have web applications that need to be dynamic and interact with us. Something like, let's say, Wikipedia, where we can go and edit and change things and we're constantly changing the state of the application, right? But in a very basic form of Wikipedia, could you develop a web application that had no user accounts? I mean, not should you, but technically could you? Yeah, the basic core of Wikipedia is that anybody can edit a page, right? Would you have pandemonium if Wikipedia suddenly transitioned to this model? Yes. Yes, for social reasons, right? Because people are terrible, that's what I'll say. Some people are terrible, I'll say. All of you are awesome, right? But for most sites, not only do we wanna establish a session with the user for one interaction, we actually wanna establish who this person is. Is this the same person that I saw a month ago, a year ago, so that I know what information I wanna store about them? So we have authentication, right? Two of the key things we need in most web applications are authentication and authorization. So what is authentication? Username and password. It can be username and password, but more generally? Is that it? So if I just give any username and password, I should be let in? Identifying a person. Yeah, I mean, at the end of the day, right, you wanna try to identify, or answer the question, who is this person, this user, who's interacting with my system? Does this guarantee that they are the same person? No, it's very tricky, right?
What if I give my username and password to one of you? Then you could access the system as me, but you're not actually me, right? So even identity is a tricky problem. Like, how do you know that I'm actually your professor and not my twin who's in town visiting and decided he wanted to teach a lecture on web security? I don't have a twin, or maybe I do, but. Right, how would you know? You wouldn't know. You'd look at my ID and see if we look identical, right? You have my fingerprints? Maybe I have other ones. I burned myself last night, my fingerprints are all gone. So what happens if we're able to break the authentication system of a website? Yeah, then we can impersonate other people, right? We can pretend to be other people on the website. So what is authorization? What's the difference, and the relationship, between authorization and authentication? Once you identify the user, it's then to identify what he's capable of doing. Right, what are his roles and what are his. Right, exactly. So the idea is that authorization answers the question, what are you allowed to do? Now that I know who you are, how do I determine what you're allowed to do? Maybe that's using roles, right? Like, I'm the professor, you're all students in the class, and you have TAs. We all have distinct roles. Or maybe it's based on groups. Maybe the different groups from your group projects are allowed to do certain things, right? The typical way this is done in a web application is that you have some administrator, right, some kind of important user. You have regular users, and you have guests. These are people that maybe aren't even authenticated, but they still have some authorization. Like on Wikipedia, you can go right now and edit any Wikipedia page. You are not authenticated with Wikipedia, right?
You're some unknown anonymous user. But you can authenticate to Wikipedia, and then you may have more permissions, more authorization to do different things on Wikipedia. So what happens if we break authorization? Yeah, so we're able to do things that we're not allowed to do, right? We're able to do more than the system means for us to be allowed to do. And these are kind of intertwined, right? If I break the authentication of an admin user, I can now perform actions as an admin, which means I'm exceeding the authorization of me, the person. But if I have some way of taking my user account, which is a regular account, and changing it to admin privileges, now I'm increasing my authorization, right? I'm breaking authorization. So it's kind of mixed. There are good ways to think about this, but to me they're very similar. It all really comes down to, at the end of the day, what can you do, right? If I break into your user account on, let's say, I don't know, Google Apps or something, Google Drive, right? It's not that I'm now an admin on Google Drive, but now I can see all of your documents. Which is definitely not a good thing. Okay, so how can we attack these things? These things are central to web applications, right? If you had to guess, how many username and password pairs do you have on websites? Which is a great question. I'm actually gonna pull out my phone so I can look up the answer for myself. What's in there? Somebody guess. You wanna play a game? Guess how many accounts I have on websites. 400 to 663. 463? If that's correct, I'm gonna be very upset. 60? Long game offline, should be good. This is why I don't want you to have my fingerprint. Does it not just say how many? Yeah, left, left, left. Whoa. Okay, it's more than 35. I could show you the list, but I'm just not gonna do that.
All right, if you figure out how to have LastPass show you, then you're like, science. I wish there was a way to do it. Google saving. I have a lot, let's see. I don't think it's quite 400, because I can scroll through it without my thumb getting tired. But definitely, I think, more than 40. Probably I'd say in the hundreds, 150 to 200, something like that. This goes for everything, right? Your ASU username and password, your Google password, your Facebook password. I'm not gonna get too much into it, but if those are not separate things, they should be separate things, right? You should have different passwords for all these services. Because we've seen how terrible web applications can be, right? So if you sign up for random website X with your email, and your password is your Gmail password, and that random site X has a SQL injection vulnerability, then an attacker can use that to download all the usernames and passwords. And the very first thing they do is they see a Gmail address and they try that Gmail account with the password that they saw. And I'd say probably five to 10% of the time that's gonna work right off the bat. Right, so at the very least your email password should be different from all your other passwords. Because if somebody breaks into your email, they can go to any site and say, oh, I forgot my password, and the site will send the reset email to them, and then they can change it. Yeah. That's the right way to do it, but I have different levels. So for the random sites, you know, posting on a bulletin board somewhere that I don't care that much about, I use the same password for all of those, just to make it easier. Then there's a top 20 sites that all have different passwords. Nice. Yeah, I use LastPass, so I have a completely different password on every site. I don't know any of my passwords. Which works until you find sites that don't work with LastPass. Uh, well, there are a few for sure.
Yeah, you can copy it and paste it in, that's basically what I do. This all kind of works. There's some offline stuff where it gets weird, but. Anyways, so how do we attack authentication? Using everything we've learned in the class, right? We're talking about web applications. How do we attack authentication? And what's our goal here? To impersonate someone else, right? So how do we do this? You want to break in somewhere. What are you gonna do? Okay, yes, fiddle with the session to try to get somebody else's session. Yeah, that's tricky. I don't know exactly which one you're breaking there, that's why I don't like talking about them separately. But let's think more about usernames and passwords, right? Because let's say it's a short-lived session, so that's not quite as good as having somebody's credentials. So I want to break authentication by getting their credentials. What do you do? What can you try? You can try phishing to get somebody's credentials. Yeah, you could try phishing. So phishing would be, send them an email, right? Have a link to a page that's not facebook.com, it's your site, and it looks exactly like facebook.com. It has a login form, but that login form goes to you. So you can get their credentials that way. What about using some of the network attacks that we've seen? What could you use, what would you do? I would look at the website just to see how the form's laid out, and then if they're not using HTTPS, you could sniff the network traffic as it goes across, and you could have every password at that point. Yeah, so if the target of the login form, the action, is not over HTTPS, then you can eavesdrop on either those credentials or even the session information, right?
I don't know if you remember this, but a while back there was a big uproar because on Facebook the login page was encrypted, but every other page was not. So a person developed a tool called Firesheep, which would listen on the wireless network interface for Facebook requests, and when it saw one it would steal the session identifier, and then with one button you could basically access Facebook as the person whose session you just sniffed. Think about a college campus like this. Well, here we're using encrypted Wi-Fi, but this was back when that was pretty rare. You go to Starbucks and you can just see everyone. You can perform ARP spoofing attacks to make people route their requests through you, right? So this is why we talked about all the networking attacks, because they are still relevant for performing these web attacks. Right, what are some other ways? So yeah, we can steal it, right? Keep a record of all the keystrokes on the keyboard. Ooh, log all the keystrokes. I'd say at that point you've kind of owned them. Once you can execute code on somebody's computer, yes, you can basically do anything, right? Even more. So let's say we can't execute arbitrary code on your system, not quite yet. But if we could, yes, we could install a keylogger, we could log everything. There we go, we've got everything. But what are some other ways, yeah? Social engineering. Social engineering, so how's that different than phishing? Social engineering is probably where you're actually interacting with a person and somehow get them to tell you something important. Yeah, so I could call you, or somebody could call you, and be like, hey, this is ASU's IT support. There's a problem with your MyASU. You're not gonna be able to get grades this semester. It's probably gonna impact your graduation. I need to perform some debugging on your account, and I need you to authorize that.
Can you please provide me with your ASU ID? Great, okay. And to verify it for security, what's the password on the account, to make sure that you're the actual owner? Okay, what else? Huh? Yeah. Brute force. Brute force, yeah, we can do that. We can also just try to guess, right? We can try logging in, or we can try to brute force sessions. Yeah, so SQL injection, like with SQLi you can pretty much download the database. Yeah, we can somehow completely bypass the authentication, right? This could be using Referer headers to trick the application into thinking that we were already logged in. Or we could use SQL injection vulnerabilities to get around authentication. Or, there was actually a problem in Dropbox a while ago where they pushed out a change and it was not actually checking that your password matched the password in the database. So it would just let you in no matter what username, what email address you put in. Yes. Luckily they discovered this very quickly, right? So this is essentially bypassing authentication. If there's some way you can trick it into letting you in, then you're good, right? So SQL injection, and another one is session fixation, which we'll get to today. So, for eavesdropping, right? If the HTTP connection is not protected by SSL, or even, well, no, I don't wanna say that. Even if you're using something like a VPN to tunnel your traffic, or something like Tor, right? So Tor is encrypting, well, basically the packets that you send are, like, onion encrypted. You send it to the first node, it decrypts the outer layer, then sends it to the second one, which decrypts the next layer and sends it to the third one, and then your packet goes out, right? But if you're using HTTP, all of the traffic that's going out of that exit node is unencrypted.
And actually that's how a lot of people try to gain visibility into the Tor network and see what people are doing: by looking at and logging the HTTP requests that people are making out of exit nodes. Then you can use this to identify people if they're accessing some sites that are not over HTTPS. Let's say I see that you're logged into Twitter, you're logged in as a certain Twitter user, right? Now I can link that Tor session with you. So this is how oftentimes they can try to break Tor, right? So we saw that with HTTP basic authentication the username and password are sent in the clear, or not exactly in the clear, but base64 encoded, which is essentially in the clear. If it's submitted through a form, we can grab it. If the authenticator is a cookie, a URL parameter, or a hidden field in a form, we can sniff it. For developers, to do this securely, the Secure flag on cookies tells the browser, hey, only send this cookie out over HTTPS. Do not send this cookie out over HTTP, right? Because I could trick you into clicking on an HTTP link to the server. And even if the server only ever supports HTTPS, without the flag you will send the cookie on that first request, because you're making a plain HTTP request, and even if the server redirects you to the HTTPS version, you've still leaked that cookie through that initial request. So this is why this flag was introduced on cookies, to add this security mechanism to browsers. Okay, brute forcing, right? If our authenticator has some kind of limited domain, we can try to brute force it. Like, some websites have, well, hopefully none that we know of, but sometimes they have something like a four-digit PIN. How many combinations of a four-digit PIN are there? 9900, 9900. It's weird that you did that that quickly, but. Yes, this is why I want people who know how to do math. Well, I'll assume you're close, I'm not gonna check it. (Counting 0000 through 9999, it's 10,000.) Right, so, how long?
So we actually tried, some of us tried brute forcing CRC32. We could do that in a couple of hours. And this is a lot less, right? And so, right, so if you think, well, maybe the website does not have this problem, but maybe the mobile application allows you to use a PIN to log in, right? I can try to log in through the mobile site, right? At the end of the day, it's going to an HTTPS server. It's just making an HTTP request. I can make it look exactly like it came from that phone, right? So what's the problem? So what do you have? So, like a lockout policy on the account that says, okay, after 10 tries in a minute, you're gonna be blocked from logging in for two or three minutes. So does that keep you out? No, why not? What are you gonna do? After the lockout, you get another 10, so you can actually just spread this out. I mean, if it's a high value target, hackers can wait for $10,000, $100,000, right? I can try it over months, I can try it over a year, I can try it every day, yeah. A lot of people use birthdays for PINs too, so you can narrow down the domain. Yeah, you can actually search through the PINs in a smart way, not just randomly, right? But what about, so this is if I'm interested in breaking into a specific account. What if I'm interested in just breaking into any account? So let's say I can enumerate the account names, their email addresses, whatever, maybe they're on the site, right? So your lockout policy is per account. So why don't I try 0000 on every single account? Am I gonna get something? Yes, probably. Heck yeah, probably, right? And then I try birthdays. I try all ones, probably first. Or one, two, three, four. Yeah. I'm gonna get a ton of people, and I'm not even gonna hit your lockout policies. And I've already cracked a significant fraction of your users' usernames and passwords, right? This is actually why lockout is very tricky to enforce, and why it's hard to make brute forcing difficult.
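The "try one PIN across every account" idea can be sketched with a toy model. Everything here is hypothetical: the `PerAccountLockout` class, the account names, the PINs, and the limit of 10 failures are made up to illustrate the point, and the lockout never resets, which is stricter than the policy described above.

```python
from collections import defaultdict


class PerAccountLockout:
    """Toy login service with a per-account failed-attempt limit (hypothetical)."""

    def __init__(self, pins, max_failures=10):
        self.pins = dict(pins)
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def login(self, account, pin):
        if self.failures[account] >= self.max_failures:
            return "locked"
        if self.pins.get(account) == pin:
            return "ok"
        self.failures[account] += 1
        return "fail"


def spray(service, accounts, guesses):
    """Try each common PIN once per account instead of many PINs on one account."""
    cracked = {}
    for pin in guesses:
        for account in accounts:
            if account in cracked:
                continue
            if service.login(account, pin) == "ok":
                cracked[account] = pin
    return cracked


# Made-up accounts; three of the four use a very common PIN.
service = PerAccountLockout(
    {"alice": "0000", "bob": "1234", "carol": "8361", "dave": "1111"}
)
cracked = spray(service, ["alice", "bob", "carol", "dave"], ["0000", "1111", "1234"])
# No account saw more than 3 failures, so the limit of 10 was never reached.
```

In this model three accounts fall after at most three guesses each, and the per-account counter never gets close to triggering.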
Fundamentally, if the users have bad passwords, I can try a bad password across everyone, right? And even if you say, well, I'll restrict by IP, and so I'll do a lockout policy per IP, well, great, I'll change my IP every x number of lockouts, right? I'll hire a botnet and make requests from 10,000 IP addresses. Each of them gets 10 requests. I can make 100,000 requests. I'm gonna get some accounts across all of those, because each attempt only needs one request. Okay, so we can do this. We can also, right, if for whatever reason the session values are not random, if they're not cryptographically random, if they're based on the server time and the number of processes that are running, then we can actually try to break the authentication token, right? If we can get some valid authentication token, then we're in, right? Or what if they're just sequential? If they're just sequential, then we can easily try to brute force backwards and forwards. If passwords are user specified, then we can maybe use some smart reasoning: people who develop password crackers look through all the leaked password databases and build models of which passwords you should guess first, to be more efficient at cracking passwords, right? And looking at the values, maybe we can try to infer their structure, like here, November 9th, 2010, 9, 15, 10, right? So we can maybe try to see if there's a secret value. Maybe it depends on time, right? If we can figure out how this value changes over time, then we can try to break these values. And if the authenticators last for a long time, right? If your authenticator, your cookie, is valid for months or years, you have a lot more chance that you're gonna hit something and find something good. Okay, so bypassing authentication, right? So getting completely around authentication. So we can bypass form-based authentication, right? So we think, okay, we go to a website. That website has a login button.
We click the login button. It takes us to a page with the login form. We fill out our username and password. We hit enter. It takes us to a home page, right? So what about HTTP and the web means that we can't just try to access the home page, the user's profile, directly? So after the login, we are given some unique value that we have to send back to the server to show that we have already been authenticated. Maybe, if it's done correctly. What if I forget to check that on the user profile page or on some home page, right? So oftentimes, developers assume this because of how they use the application. I go to the website. It brings me to the main index page. I have a login field. I log in with the correct login. Now that I'm logged in, I can see more of the application, right? That page has more links on it. And then I can go and access those things which require a login. But oftentimes, it's very easy to get around. You have to make sure, right? The entire nature of the web means that anybody can make any request to any resource you have at any time, right? So you can't assume that they can't guess those links or whatever, right? And so this is called forceful browsing. So just saying, well, what if I'm not logged in and I try to access your company's financial information? Is it gonna ask me to log in or is it actually gonna give me that information? Oftentimes it will give it to me, because developers forget to do this. You have to check on every single page, on every single request, that the user was actually authorized. Authorized and authenticated, right? Those are the two big things. Sometimes there are weak password recovery procedures, right? So when you click on, hey, send me a new password, that usually includes some random value to link that request back. What if that's not random enough and I can guess what that value was? Then I can reset your password to be whatever I want. Attacks on authorization. So we actually saw an example of this. I mean, a lot of the binary vulnerabilities we study, right?
Are examples of authorization attacks, right? Yeah, basically the binary has some implicit policy of, hey, I should only do X, Y and Z, right? But we get it to do A, B and C and to read files Q and R and do whatever we want, right? So actually the same problem, a very similar problem, exists on the web. We can still have path or directory traversal attacks. By using dot dot slash, dot dot slash, we can get the web application to read a file that we shouldn't be able to access. This actually is incredibly common. If the program is just opening a file based on our user input, we can use directory traversal to get it to access any file. So we can do something like, if there's a show.php that takes a file as a parameter, we can do dot dot slash, dot dot slash, all the way to /etc/passwd, right? So what does this give us? Specifically, getting /etc/passwd through a web request. Usernames, right? Usernames on that server, right? Valid user accounts on that server. So then maybe I want to try to brute force those usernames, to try to break those accounts and then get access onto the server, right? /etc/passwd, despite the name, as we know, does not actually have the passwords. Yes, not anymore. But if I know the user accounts, I can try to brute force SSH logins into those accounts. And the other thing is this: we can get around all kinds of firewall stuff, because the dot dot slashes, we can URL encode them, right? With the percent encoding, we can do all kinds of stuff so it's hard to detect strictly from looking at the HTTP request. We can do forceful browsing, so the application developer is assuming that we go through the application in a certain order, when really we can make any request to any part of the application at any time. There can be issues with automatic directory listing. So what's actually automatic directory listing?
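The dot dot slash attack described above, including the percent-encoded variant, can be sketched in a few lines. This is an illustration under stated assumptions, not the lecture's code: `BASE_DIR`, `naive_path`, and `resolve` are hypothetical names, the example assumes a POSIX filesystem, and it is written in Python rather than PHP.

```python
import os
from urllib.parse import unquote

BASE_DIR = "/var/www/files"  # hypothetical directory the app means to serve from


def naive_path(user_path: str) -> str:
    """What a vulnerable script effectively does: join and open."""
    return os.path.normpath(os.path.join(BASE_DIR, user_path))


def resolve(user_path: str) -> str:
    """Decode the input, resolve it, and refuse anything outside BASE_DIR."""
    decoded = unquote(user_path)  # so %2e%2e%2f is treated like ../
    base = os.path.realpath(BASE_DIR)
    candidate = os.path.realpath(os.path.join(base, decoded))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the served directory")
    return candidate


# The naive join walks straight out of the served directory.
escaped = naive_path("../../../etc/passwd")
```

The check compares resolved paths rather than scanning the raw string for `../`, since the encoded form would slip past a simple string filter.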
If you go to a directory on a web server which does not have an index.html, it will actually list all the files. Yeah, so by default Apache, if you tell it to serve a website out of a certain directory, and you access a directory that does not have an index.html, or actually there's a whole list of them, index.php and so on, it will just, by default, list the files in the directory. This actually occurs on my website: if you go to, I think, slash classes slash teaching or something, it shows you that. But clearly that's fine, because there's nothing secret there, right? You can already access any of that information. But if I give you a link to reports and you should only be able to access, I don't know, your company's reports, but you go to slash reports and it shows you the reports for every company, right? Now you may be able to access any of those. Then there's parameter manipulation, so you can oftentimes change URL parameters or change GET parameters in order to access content that you shouldn't be able to access, right? If the resources that are accessible are determined by the parameters to a query. So actually I can tell this story: me and my advisor were pen testing a credit card processing company. And so we got access to one account, right, on the system. So we get access there and they have a way to review your reports. And when you look at the URL, it's something like whatever the domain was, slash reports, slash a long integer number. You're like, huh, that's weird. And then you go to the other reports and it's also a long integer number. Like, what happens if I increment this a little bit? And so by incrementing that you're able to get credit card reports for every other company in the business, because this part was a legacy part of the web application. It wasn't actually checking that I was authorized to view this specific report, right?
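The missing check in that story can be sketched as follows. This is a made-up model, not the audited application: the `REPORTS` table, the company names, and both function names are hypothetical, and the report IDs are small integers only to keep the example readable.

```python
# Hypothetical report store keyed by the integer that appears in the URL.
REPORTS = {
    1001: {"owner": "acme", "body": "acme transactions"},
    1002: {"owner": "globex", "body": "globex transactions"},
}


def get_report_unsafe(report_id: int) -> str:
    """Legacy-style handler: trusts the ID in the URL and nothing else."""
    return REPORTS[report_id]["body"]


def get_report(report_id: int, logged_in_company: str) -> str:
    """Same lookup, but checks that the requester owns the report."""
    report = REPORTS.get(report_id)
    if report is None or report["owner"] != logged_in_company:
        # Same error either way, so the response does not reveal which IDs exist.
        raise PermissionError("report not found or not authorized")
    return report["body"]


# Logged in as acme, incrementing the ID from 1001 to 1002:
leaked = get_report_unsafe(1002)
```

With the unsafe handler, the incremented ID returns another company's data; with the ownership check, the same request is refused.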
And so then we wrote a crawler to go through all possible combinations of that to show them what we could do, right? And download all of that client information about credit card transactions, right? So we can do this and change these parameters and maybe try and get somebody else's content, right? And here we're not trying to get admin privileges, right? We're just trying to see content that we shouldn't be able to see, which is from other users. So this is basically what it was. There's some kind of profile or reports page and it has a report ID. And so by altering that, we could get anybody's report. I wanna talk about this very quickly. Okay, we talked about register_globals, how register_globals is a PHP setting that automatically creates variables in our PHP scripts based on GET and POST parameters. So it turns out we can use this to cause unexpected execution paths to be followed, by essentially introducing variables inside the PHP script. So let's say we have some feedback page, we have PHP, we say if name and comment are set, we open the file, user feedback, we write to that file, we close that file, we say, hey, feedback submitted. So this would be how we would use it. So this would automatically create the name and comment variables in here. But what if I have something that says, hey, if the password is some secret, unguessable nonsense, then I say you're an admin. So this would be like some key you give your admin to log into the application to get admin privileges, right? Now I have a check that says, hey, if you're admin, then show you all the secret admin stuff, because you're an admin. But what happens if I GET example.php with the password of foo, and I set admin to be one, right? Once again, the parameters come from the user, right? The user can arbitrarily add any parameters they want to a URL for an HTTP request, right? Well, what's gonna happen is this check asks, is foo the same thing as the secret, right? It's not. So this gets skipped.
But then for admin, PHP with register_globals says, yeah, there's an admin variable. And its value is one. So awesome, so I'm gonna show you the secret admin stuff. So this is how you can actually add parameters to the PHP script to get it to execute in a way the developer never intended, right? The developer intended that one of two things happens: either you're an admin and you see the admin stuff, or you're not an admin, based on this condition. But this condition is setting a variable that I can set myself, because I control the parameters that get passed in with register_globals. This is why register_globals is such a horrible, horrible, horrible idea. Very quickly, oh, I actually think we've gone through everything. Okay, so now we're actually at the end. So there's all kinds of issues with misconfiguring your server. So if you have an FTP server running on the same machine as the web server, right? Because often people want to update their websites through FTP. It turns out that FTP sends the username and password in the clear too, so you can actually sniff FTP credentials. Then we could maybe upload with FTP to a cgi-bin directory and get the server to execute arbitrary programs. We may be able to execute programs as if they were part of the web application. This is actually a super cool one. If the website actually allows you to upload files, which happens a lot, right? So you upload an image. And most times, most web applications don't want to store an image inside the database, right? They usually will store it, actually nowadays in like S3 or something, but if they store it on that server, they'll store it to like an uploads directory. But if you can control the name of that uploaded file, you could have it end in .php and upload PHP content instead of an image. Now when you access that file directly, Apache is gonna say, oh, here's a PHP file. I'm gonna interpret this and execute it as PHP. I think we're pretty much done with content here.
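One way to blunt the upload trick just described is to never store the file under the name the client supplies. The sketch below is a hedged illustration, not a complete defense: `ALLOWED_EXTENSIONS` and `stored_name` are hypothetical names, it is written in Python rather than PHP, and it does not inspect file contents or address how the web server is configured to treat the uploads directory.

```python
import os
import secrets

# Assumed allow-list for an image upload feature.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def stored_name(client_filename: str) -> str:
    """Pick the on-disk name ourselves; keep only an allow-listed extension."""
    extension = os.path.splitext(client_filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed: " + repr(extension))
    # Random base name, so nothing else from the client's name survives.
    return secrets.token_hex(16) + extension


safe = stored_name("holiday.PNG")        # random 32 hex characters + ".png"
renamed = stored_name("shell.php.png")   # the ".php" part is discarded
```

A name like `shell.php` is rejected outright, and a double extension like `shell.php.png` is stored under a fresh random name, so the attacker no longer controls what the file is called on disk.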
So any of you have any questions on web security stuff?