Cool. Thank you for the introduction, Joaquin. I really appreciate it. I'm actually super excited to be here talking to an OWASP chapter, especially because I was realizing, as Joaquin was telling you about all the cool things that OWASP does, that a lot of my knowledge about web security came from reading the OWASP guides and the OWASP Top 10. I remember being a master's student, and my advisor's like, okay, we need to include this vulnerability, this vulnerability, and this vulnerability. And I'm like, uh, I don't really know what those are. You go look them up, and that leads you to a lot of guides that tell you more about them. So I'm really excited to be here today to talk to everyone. What I'm going to focus on in this talk is black-box web vulnerability scanners. So how many people know about these tools, use these tools? Some of you? Okay, cool. And really, what I want to start is a little bit of a conversation. It's very easy for us in academia to sit in our ivory towers and do security stuff without ever talking to practical security people about what actual challenges they have, and how they use these tools that we try to study and improve. So if I could take one thing from this, it would be feedback from you about how you use these things: where do you see these tools going? How can these tools help you be more effective in your job? So, a little bit about my background. This is UC Santa Barbara, for those that don't know. Yes, it is incredibly beautiful; it's right on the water. This is the campus, and this is probably where I spent most of my time, in the engineering buildings. I did a five-year bachelor's-master's there. I'm a Gaucho; that's the mascot of Santa Barbara. It's an Argentinian cowboy. There, you learned one thing today.
So I did kind of like a four-plus-one, like ASU has: a bachelor's-master's at Santa Barbara. After my master's, I got into research; actually, some of the work I'm going to present is from my master's work there. I did that, and then I was like, man, I'm so done with academia. I'm going to go work in industry. I'm going to make a ton of money. It's going to be awesome. And I went to Microsoft, where I worked full-time as a software developer in Redmond. I actually loved that. I really like software development, but I really missed security. And I found out that I really like doing research: doing cool new stuff that nobody's ever done before. I like to tell students: at Microsoft, I was doing something that was new to me, I'd never built that kind of system before, but I was still building the same kind of thing. In academia, you get to tackle new, interesting problems that maybe nobody's ever solved or even thought about before. So I went back. I went back to Santa Barbara and did a PhD. And while I was there, Shellphish was my hacking team. We competed in a bunch of hacking competitions. We did compete in DEF CON. I don't know how well we actually did; I don't really want to look at the results, but we never did great. They keep changing things on us. Like, one year we showed up and it was all IPv6, and all of our protection stuff was IPv4. And one year it was ARM, and I was like, ah, too old to learn another architecture. But that's the way it goes. So then I was super stoked: I got a position here at ASU in Tempe. In August it'll be my second year, so I'm still relatively new to the area, still trying to deal with the heat out there. But I'm surviving, and I'm starting a hacking group at ASU called pwndevils. Some of my teammates are in the room watching. So we competed in the DEF CON quals. How did we do? I think we got 67th. Hey, 67th. Out of 256.
That's not too bad. But we're moving on up. We're moving on up. I had a whole whiteboard in one of the other rooms filled with crazy equations that the DEF CON organizers created for us. Just think about it. So that kind of brings you to where I am here. I have, I would say, the tiniest bit of industry experience. I've done some professional penetration testing, I think like three engagements, with my advisor. But I don't have your experience of what you see every day out there. So I like to cover the basics so we're all on the same page. Web applications are really important, right? I think we can all agree on that; otherwise, you probably wouldn't be here, unless you're just here for the free food. We talk to people through them. We put our money online. And personally, I'm appalled any time I have to physically go into a bank. Like, what do you mean I have to go into the bank? I barely even go in there. There are online services now where we can put our healthcare records. We can file our taxes. And remember the problems the IRS was having with people filing other people's returns? All of these are web applications, accessible over the web. This is such an important part of our lives that when things go wrong, they go really wrong. So I like this example: UCLA had 800,000 people's information breached. I was actually one of those 800,000 people. I did not go to UCLA, but because I applied there, my data was part of that breach. One of my favorite hacker stories, one that I like to tell in class, is about Albert Gonzalez and his crew, who stole credit cards from TJ Maxx and other companies. He got, I think, 20 years in prison for causing about $200 million worth of damages.
And what they were exploiting, it's actually great, I think you may have seen the American Greed episode on his crew, with his beautiful SQL injection attack scrolling across the screen. That's the crazy technical vulnerability they used to get all these credit cards. The Target breach: 110 million credit cards stolen. I was also part of that one. The OPM breach last summer, right? Were you part of that one? I actually don't think I was, but who knows. But we're talking about people, right? Even if you weren't in it, if somebody you know was part of that breach, then because it contained all the sensitive, security-clearance-level stuff, your information was also contained in that breach. And so your information's out there. I talked to some FBI agents about this, and they said it's very worrying that somebody has all this data about all these people who hold very sensitive positions. Yeah, so OPM: 22.1 million people. It's crazy. And honestly, you could ask anybody from any of these companies; this happens to everyone. It's not you, it's everyone, right? I like the JPMorgan Chase hack specifically because they say the hackers came in through the front door. What do they mean by the front door? Well, they mean the hackers exploited an overlooked flaw in one of the bank's websites. And that, I think, really underscores the importance of web applications. To me, honestly, if a company doesn't have a website or some web presence, it may as well not exist. Like, how many people decide where to eat based on Yelp reviews and then checking a menu? If I can't see a menu on my phone before I go there, then unless somebody tells me it's the best tacos around, I'm not going there. And so, why web applications?
It kind of helps to think about the attacker's motivation. Why are they after web applications? Well, those of you who have looked at web applications know that sometimes they're a jumble, right? They're thrown together; there are tons of different technologies all coming together just to serve one HTML page. So much stuff is happening on the back end. I know I've personally been guilty of this as a PHP developer back in my day: not knowing what the heck I was doing, gluing and taping things together until they worked, and pushing it out there, and now it's on the web. And not only is it on the web, it's open 24 hours. That's the beauty of the web: anybody can access it at any time, from anywhere. Unfortunately, the exact same thing is true for attackers. They can attack our websites, our web applications, anytime, from anywhere. And we increasingly keep sensitive data in these web applications. So this is a slide that may be familiar to some people; I know at least one person here has taken one of my prior classes. These are all the technologies that I go over in my grad web security class. Think about all the things you have to learn about and understand really deeply to be able to either create a secure application or to find vulnerabilities in these applications. And really, this is what I see: it's a big spaghetti of different technologies all mixed together. It's really easy for vulnerabilities to happen, and it's really easy for the developers of these web applications to make mistakes. Okay, just to refresh: what is a web application? What do I mean by this? We have our browser. We have some kind of server running some server-side code, maybe it's PHP, and we have some backend database, maybe it's MySQL.
The client makes an HTTP request to the web server, and the web server processes it using the server-side code; maybe it makes some queries to the database to fetch some data and gets data back. Finally, the goal of the web server is to return an HTML page, with CSS and JavaScript, in an HTTP response, which is sent back to the browser. The browser then takes those raw bytes it got, interprets them, and displays them to the user in a nice, beautiful graphical user interface, which the user can then interact with, and the whole cycle repeats. So where can vulnerabilities occur? I just want to cover SQL injection and cross-site scripting very basically, so that when we talk about these tools, whose goal is to automatically find these vulnerabilities, we know how they do that and what they look for. So SQL injection comes in here in the picture: the interaction between the web server and the backend database. The idea is that there's some code on the backend server that constructs a SQL query, and it does so by concatenating strings together. So here we're creating a SQL query in PHP. We want to run the query SELECT * FROM users WHERE name = ..., that is, select everything from the users table where the name column is equal to some value supplied by the user. And this is very natural for developers: they're creating a query, so they concatenate some strings together. And if the name is something normal, the server-side code just concatenates these strings; PHP puts the strings together and sends the result off to the backend database. Remember, the backend database just gets a sequence of bytes, and then it has to parse it to understand: okay, what was the structure of this query, and how do I interpret it?
So it interprets the query and says: okay, select everything from users where the name is 'Joaquin'. Sure, I can do that. But we can do other things, right? What if the attacker can control this name variable? Let's say it comes from the user: from part of the URL, from the query parameters, or from a POST request from a form. If an attacker can control this and submit arbitrary data, well, they can input something like a'; DROP TABLE users. Now, the server-side code does the exact same thing as in the previous situation. It just concatenates strings; that's all it's doing. It concatenates these strings together, takes the result, and sends it to the SQL server, where the SQL server now has to parse it and figure out what query the developer actually intended to execute. And in this case, the attacker has tricked the SQL server into executing two queries. One query selects everything from users where the name is 'a', and the other one drops the users table. Now we've just lost our whole users table from one query. And this is not the only thing you can do with a SQL injection vulnerability. You can also extract all of the tables, all of the information in the database. You can insert and change data. It's really, really bad, right? They can steal your database, they can delete your database, and they can arbitrarily change your database. Just one little mistake, and anybody on Earth can do this. Any questions so far? Feel free to raise your hand. Okay, so that's SQL injection. The other type of vulnerability that's really popular, and I don't know off the top of my head where it falls in the OWASP list, hopefully somebody can tell us, comes in between here, with the server-side code's generation of the HTML page.
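To make the mechanics concrete, here is a minimal sketch in Python with an in-memory SQLite database standing in for the PHP/MySQL setup on the slides (the table contents and account names are made up). One caveat: SQLite's `execute()` refuses stacked statements, so instead of the DROP TABLE payload this sketch uses the classic `' OR '1'='1` payload, which restructures the query to return every row:

```python
import sqlite3

# In-memory table standing in for the application's users table (made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("joaquin", "s1"), ("adam", "s2")])

def lookup_unsafe(name):
    # Vulnerable pattern from the talk: the query is built by string
    # concatenation, so the database parser sees attacker-controlled syntax.
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(name):
    # Parameterized query: the name is passed as data, never re-parsed as SQL.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

# Normal input behaves the same either way.
print(lookup_unsafe("adam"))          # one row
# The injection payload changes the structure of the concatenated query...
print(lookup_unsafe("' OR '1'='1"))   # every row comes back
# ...but is just an odd-looking name to the parameterized version.
print(lookup_safe("' OR '1'='1"))     # no rows
```

The fix is the same idea in PHP with prepared statements: the user-supplied value never reaches the SQL parser as syntax.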
And that's one of the most famous vulnerabilities: cross-site scripting. The idea here, and even though we think of these as two different vulnerability classes, they really stem from the same problem, is that the web application is generating some language, in one case a SQL query, in the other case an HTML page, by concatenating strings that another parser then has to interpret. So let's say we have this PHP code. I think it's PHP; I also do some stuff with ASP, so it gets a little mixed up. Here we're echoing the name variable, and what this means is that everything that's not inside the angle-bracket-question-mark tags, everything not inside those purple tags, is constant. At runtime, the server-side code substitutes whatever's inside that name variable right there in the output. So it's essentially concatenating strings again: it concatenates everything before the purple tags, then whatever the name variable holds, then everything afterwards. So if the name happens to be Adam, the server-side code substitutes that name variable. Then the important thing is that the server-side code just sends an HTTP response back to the client. The client then has to take that, parse it, and interpret it as HTML to make sense of what it got from the server. So it parses it, interprets all those tags as HTML tags, and the screen shows "Hello, Adam." Now, what if I instead have the name be some fragment of HTML? Once again, the same thing happens.
The server-side code simply inserts whatever's in that variable into the output and sends it to the browser. Now the browser parses it, and when it sees these script tags, it says: oh, this is inline JavaScript code, I should execute it; this must be what the developer of this web application wanted to happen. And we'll see a JavaScript alert. And this is really the fundamental problem of cross-site scripting: an adversary can trick your browser into executing JavaScript, and that JavaScript runs in the vulnerable site's origin under the same-origin policy. What I really like about cross-site scripting vulnerabilities is that once the adversary gets you to execute arbitrary JavaScript, they can do really interesting things. They can make requests to the server on your behalf. Say this is Facebook, and they use a cross-site scripting vulnerability in Facebook to get you to execute some JavaScript code. They can have that JavaScript post on your wall, post on your friend's wall, friend random people. If it's a bank account, they can try to initiate a transfer. They can basically impersonate you and do anything that you could do on that web application. That's why this is a very serious vulnerability. And the very interesting thing is that the server doesn't know you didn't do that. It just sees the request come in, and it goes: okay, yeah, I'll post that message, I'll transfer that money. Okay. So now we get to the main event: black-box web vulnerability scanners. These are usually commercial tools, though there are also open-source ones. Anybody familiar with any of these? Which ones are you familiar with? Burp, from PortSwigger, yes. Anybody familiar with any of the other ones? Acunetix? Any ones that are not on this list? I need to update my logos.
WebInspect. ZAP. Okay, those I have to add. NetSparker, yeah. All different types of tools. So if we break apart what these are: they're black box, which, as we'll see, means they have no knowledge of the source code of the application; they exist completely outside of it. Web: they're targeting web applications. Vulnerability scanners: they're trying to find unknown vulnerabilities in these applications. They're trying to find SQL injection and cross-site scripting vulnerabilities in a web application with no user interaction. That's the holy grail. I think of them as point-and-shoot tools: you just point one at a website, it does what it can to find some vulnerabilities, and then it brings you back a report that says, hey, I found these things, here's what they are. And they definitely have benefits, right? One of the beauties of these tools is that, because they're black box, they don't care what server-side language your web application is written in. They try to attack and find vulnerabilities in a Haskell application just as well as in a Ruby application, or PHP, Python, whatever crazy language you want to write your web application in, because they access it just like a user would. So that's nice; that's, I think, the main benefit. So how do these tools work? Well, they treat the web application completely as a black box. They don't know whether it has a database; they don't care. All they do is make HTTP requests, get HTTP responses back with the HTML, CSS, and JavaScript, and keep doing this.
They go throughout the web application: making requests, getting responses, going to another part of the application, making new requests, getting new responses. And the goal is to infer the existence of a bug in the web application, and to give you a report that says: I think there is a SQL injection vulnerability here, because I sent this request and got back this response, and that response means there should be a SQL injection vulnerability. This is great, right? It's an automated tool. So how do they actually work? They have three main components. They have a crawling component: they try to discover and interact with as much of the application as possible, because their whole goal is to find vulnerabilities, so they want to crawl and cover as much of the application as they can. They have an attack module that says: okay, try this input, it's likely to trigger some kind of fault. For SQL injection, usually what they do is try inputs with ticks, all kinds of stuff; they'll put ticks in to try to cause a SQL error to occur, and then they'll look for keywords like "error" or a SQL error message in the response and say, hey, a vulnerability actually occurred here. For cross-site scripting, they use the huge list of payloads, which I believe is now maintained by OWASP; it used to be RSnake's cross-site scripting cheat sheet. They have a bunch of inputs that they try. And the other important part is the analysis module. This gets back the response and tries to answer the question: okay, was it vulnerable? Did that page actually have a vulnerability, based on that attack? And these components feed into each other, because you may discover more pages as you're attacking, and then you want to crawl those pages; and you may want to attack while you crawl.
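The tick-and-keyword heuristic described above can be sketched roughly like this. The `fake_app` handler and the keyword list are stand-ins I made up so the sketch is self-contained; no real scanner is implemented exactly this way:

```python
# Error strings a scanner's analysis module might grep for (illustrative list).
SQL_ERROR_KEYWORDS = ["sql syntax", "mysql_fetch", "unclosed quotation"]

def fake_app(params):
    # Toy endpoint mimicking a vulnerable page: an unescaped tick in 'id'
    # leaks a database error message into the response body.
    if "'" in params.get("id", ""):
        return "You have an error in your SQL syntax near ''"
    return "<html>profile page</html>"

def attack_and_analyze(request_fn, params, name):
    # Attack module: append a tick to one parameter while keeping the
    # others at their default values (as the talk describes).
    probed = dict(params, **{name: params[name] + "'"})
    response = request_fn(probed).lower()
    # Analysis module: infer a SQL injection from error keywords.
    return any(k in response for k in SQL_ERROR_KEYWORDS)

print(attack_and_analyze(fake_app, {"id": "42"}, "id"))  # True: likely SQLi
```

The same skeleton works for cross-site scripting by swapping the probe for a script payload and checking whether it comes back unescaped in the HTML.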
That's kind of an interesting design choice in how the tools work. So we tried to ask the question: are these tools any good? It's actually kind of difficult to answer, right? Because, well, you run the tool on your website, it tells you some vulnerabilities, you find some. I don't know, what do you think? Was it worth it? Was it worth the 10 grand, or whatever these tools cost? I don't actually know any numbers; if somebody wants to secretly give me some data on what people are paying for these, that would be awesome too. But that's the key problem with black-box tools: okay, it told you there are two vulnerabilities. Does that mean you don't have any more, you're all done, you can go home? How many vulnerabilities did it miss? And why did it miss them? So that's what we tried to measure. We created an intentionally vulnerable web application called WackoPicko. And it's a pretty fully featured web application, I'd say. There are user accounts: you can register as a user, you can log in. You can upload photos, purchase a photo, comment on photos, all kinds of good stuff. You can see it was done in the Web 2.0 time, with all the rounded corners. Oh, also, you know my last name is French? I don't actually speak French. There's a guy who created a wiki about all the vulnerabilities here, so I stole his screenshot, and I noticed I don't actually know what the French on it says, in case it's offensive. Okay, so we put 16 intentional vulnerabilities in here: some normal ones, cross-site scripting things, SQL injection, all the way to command injection, all types of vulnerabilities mixed together.
There was a logic flaw in the application where you could submit a coupon multiple times and drive the price down to zero. Maybe if somebody's a fast counter, you can see that this list is probably not 16; you have to spot them before the audience does. We also had multiples of the same vulnerability class: we had stored and reflected cross-site scripting, obviously. We also had some vulnerabilities that were behind the login, so we could see whether the tools would log in to the website if we gave them a user account. We had some vulnerabilities that were only accessible after a multi-step process. To post a comment, you had to fill out the text and click the submit button, which took you to another page that said, hey, here's what your comment's going to be, would you actually like to post this? And then you say yes. So there's a two-step workflow there, to test the scanners in that sense. So we did this, and then we ran, I think it's 11, yeah, 11 of the vulnerability scanners against our web application. We actually ran them in three different configuration modes; I'll skip the details so we can get to other cool stuff. And basically, the takeaway is that no tool found over 40% of the vulnerabilities, and combined, even across all of the tools, they didn't find more than, I think, 50% of the vulnerabilities. And you can see there's a huge variance across these tools: on some vulnerabilities Burp was actually pretty good, while other tools found far less. So there's a huge range in what they found and didn't find. So it's interesting to look at: well, what did they miss?
There was a weak session ID on an admin page, and they missed that one. There was a weak password: one of the user accounts had the password admin/admin, and they didn't get that. Parameter manipulation, this is an interesting one, where you could modify a parameter to access pages that you shouldn't be able to access, kind of like horizontal and vertical privilege escalation. Forceful browsing: you could forcibly browse to certain pages that were supposed to be protected until after you purchased them. And the logic flaw. There was also a stored SQL injection that they didn't catch, because it was a second-order SQL injection: if you created a user account and then later changed your password, it would trigger the second-order SQL injection. When uploading a picture, there was a directory traversal vulnerability, and none of them managed to find it. And the stored cross-site scripting behind the login, this was very strange: they did not actually detect it. And one more interesting thing. So those were the missed vulnerabilities, the false negatives. And this is one of the only ways you can actually measure precisely how many vulnerabilities a tool misses: because I created this thing, I know how many vulnerabilities are in there, and therefore I can tell how many they actually missed. Then there are false positives. A false positive is when the tool reports a vulnerability, it says, hey, I found a SQL injection here, and you go and look and verify: no, there is no SQL injection. Is this the bane of existence of everybody who's ever used these tools? Or false positives in general, from any automated system?
Yeah, so the false positives ranged widely between the tools: some tools had zero, one tool had over 200 false positives, which, yeah, just try to imagine triaging that. The average was about 25. One of the key things I was trying to look at is why. Just curious, did you find any correlation between the number of false positives and the coverage they got, like, the more vulnerabilities they find, the more false positives they have? I do not know. I don't think it was exactly like that, and I think the 200-plus tool would throw things out of whack, because I know that tool found a lot of vulnerabilities, it was towards the top in terms of finding vulnerabilities, so I think that would probably skew things. But that would be an interesting thing to look at. One thing was, one of the tools reported a server path disclosure on every URL that started with a slash. Oh, /user/home, that's a local server path disclosure. No, that's a URL. And to be clear, beyond the one unintentional vulnerability I'll mention, these were actual false positives, where the tool legitimately said, hey, there's a SQL injection or a cross-site scripting in here, when there clearly was not. So this is probably the fault of the analysis module of those tools: it thought it was finding a vulnerability when really it wasn't. Super weird stuff. These tools are weird. So one of the things we tried to do is compare pairs of these tools and say, well, maybe one is better than the other. Because we're academics, we like to make graphs about things, right? So here we're doing what we called a dominance graph.
So the idea was that, for instance, Acunetix was strictly better than AppScan: Acunetix found six vulnerabilities, and that set covered the five vulnerabilities that AppScan found. So it was strictly better; it found all the same vulnerabilities plus one more. And that's how we ordered the graph. And in a really interesting case, N-Stalker found one vulnerability that nobody else found, and the same with Burp, which found another one that nobody else found; but then each also missed a really simple one that everybody else found. Did any of the scanners find vulnerabilities that you didn't intend to program in? Yes, one. A very tricky one. It was kind of like an error page or something that leaked the server name; you could change the query, I don't remember exactly how it was done, but I saw it and I was like, ooh, that's good. And then, it's really terrible, you have to start over and rerun your experiments to see if any of the other tools found that one. Luckily, that was the only one; as far as I know, that's it. This is a good illustration of how hard it is to create these testbeds. So, how do these tools actually work in practice? A lot of them, think about a form: you're trying to fuzz a form, or you're trying to fuzz query parameters of an application. You may have a parameter foo and a parameter bar, and they have some default values. When you want to fuzz and test one of the parameters, what do you do with the other one? Because maybe you change it, and you change the application's behavior, and now it's no longer doing what you think it should do. So they tend to use the default values. They had interesting cross-site scripting attacks; some of them were better than others.
Some of them had really interesting command injection payloads that would use backticks to run a ping, and then they would test for timing differences between sending that payload and sending a normal payload. That was a really cool technique. SQL injection: Burp used a similar timing trick, injecting, I can't remember if it was a sleep function or something like that, so it could find blind SQL injections by detecting timing differences between its requests. That was really interesting. Remote code execution: some of them had clever ways to test that too; I think some even ran services listening on a port to see whether a payload was able to connect back. So from doing all that work, we looked at it and asked: what was the biggest problem? What was the core problem with these tools? And it turned out to be something that, for a human, is kind of silly: the login form. The tools were testing the application, and even when we gave them a valid username and password, they would test the application and then accidentally log themselves out. And they didn't know they were logged out, and they didn't know they should log back in. This really stems from what I call the shotgun approach of these tools: rather than interacting with the application the way you as a human would, making a request, getting a response, clicking on things, browsing the website, they fire off tens or hundreds of requests per second to increase the efficiency of their tests. They get back a bunch of responses and pass them to the analysis module to try to understand what's going on. The key problem is: what happens when one of those requests logs the tool out of the web application? All the other requests are now not testing the application in the same state.
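A toy model of that failure mode, with a made-up `ToyApp` class standing in for the real web application and its session state:

```python
# The scanner queues many URLs at once; one of them happens to be the
# logout link, so every later response silently comes from the guest state.
class ToyApp:
    def __init__(self):
        self.logged_in = True  # the scanner was given a valid session

    def get(self, url):
        if url == "/logout":
            self.logged_in = False
            return "bye"
        if url == "/view" and not self.logged_in:
            return "please log in"
        return f"contents of {url}"

app = ToyApp()
batch = ["/view", "/logout", "/view", "/profile"]
responses = [app.get(u) for u in batch]
print(responses)
# The first /view was tested as a logged-in user; the second /view (and
# everything after it) was tested as a guest, and a stateless scanner
# has no idea the difference exists.
```

In the shotgun approach the batch is fired concurrently, so the tool cannot even tell which responses came before the logout and which came after.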
And so really, this is one of the big problems. I was actually just thinking about OWASP; I like the fact that it says web applications in the name. A lot of people say websites, but websites are just informational things that never change. A web application is a complex software application with state that you're interacting with over time. This is the key problem: these tools treat a web application as if it were a static site that they can crawl willy-nilly, without any repercussions, when really that's not the right model. To illustrate what we mean by state: say we have a super simple web application where at first the view page is not accessible. First I access the home page, then the login page, and now the view page is accessible. I can access the view page and the logout page, and when I access the logout page, the view page is no longer available. If we think about this in graph terms, we have two states: we're either a guest or a logged-in user of the application. As a guest, we can make as many requests as we want to the home page without changing the state. Once we log in, we can make as many requests to the home page as we want, we can make requests to the view page, and we can log out and go back to being a guest. This is an incredibly simple notion of state, and the same thing applies when you use any application: as you interact with it, you are changing the state of that application. You log in to Amazon, and now you can do more and see more things. You post a comment, which permanently changes the state of the application.
So our goal was: can we build a black box scanner that can infer this state graph from the outside, with absolutely no knowledge of the website? There's a concept called a Mealy machine, which is basically this graph augmented with outputs. It says: when I'm in the guest state and I make a request to the home page, I get response A. When I log in, I get response B. Now, in this new state, if I make a request to the home page, I get response C. So we infer the state of the web application by interacting with it: we request home and get A, we request login and get B, we request home and get C, we request view and get D. And what we realized is that, entirely from the outside, we can infer that the state of the application has changed, because we made a request to the home page and got a different response back: A and C are different. If we model the application as a Mealy machine, then in a specific state, a specific request always produces the same output. So if I visit the home page and get some response A, and I later make that same identical request and get back something different, the state must have changed; I must be in a new state now. I could have passed through multiple states in between that I missed, and there are all kinds of ways this could go wrong, but this is the key building block we use to construct these graphs. First we have to answer: how do we tell if A and C are different? We're getting back an HTML response, and if there are only trivial changes we don't really care, but if something big changed, we do. So we need to determine: I made a request to the home page before, and I made the same request to the home page now; are the responses meaningfully different?
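The Mealy-machine intuition, make an identical request twice and compare the outputs, can be sketched like this. `ToyApp` is a made-up stand-in for a real web application; an actual crawler compares full HTTP responses rather than strings.

```python
# Sketch of the Mealy-machine intuition: replay an identical request and,
# if the output differs, conclude the application's state has changed.
# ToyApp is a hypothetical stand-in, not a real application.

class ToyApp:
    def __init__(self):
        self.logged_in = False
    def get(self, page):
        if page == 'login':
            self.logged_in = True
            return 'welcome'
        if page == 'home':
            return 'home (user)' if self.logged_in else 'home (guest)'
        return '404'

def detect_state_change(app, probe='home'):
    before = app.get(probe)   # response A
    app.get('login')          # some request that may change state
    after = app.get(probe)    # response C: the same request, made later
    return before != after    # A != C  =>  the state must have changed

app = ToyApp()
print(detect_state_change(app))  # True: 'home (guest)' vs 'home (user)'
```

The subtlety the talk mentions is real: the probe only tells you *that* the state changed, not which of the intervening requests changed it.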
But which request, out of the ten or however many I made, actually changed the state of the application? How do I infer which one did that? We need a way to do that too. The state-change detection just tells us: oh, I'm in a new state; oh, I'm in a new state. But I may just be logging in and logging out, flipping between two states. So we need a way to collapse this chain of observations into a graph, to understand what the actual state graph looks like. I see we're running short on time, so I'll skip ahead to a bit of the evaluation. We used four scanners: wget, Skipfish, w3af, and our state-aware crawler. The important thing is that the state-aware crawler used w3af's fuzzing engine; we just crawled in our state-aware way and fuzzed in every possible state. It's actually kind of cool, because we can tell what state we're in, and if we end up in the wrong state we can use the graph to transition ourselves to the state we want. We evaluated this on a lot of different applications, one of which was WackoPicko version 2, because we had to remove a time-based feature. This is part of the problem, and I guess it's why we get to do crazy things in academia: have a wild idea, take all the time-based stuff out, and see if we can infer the state without any of it. So, a pretty good set of applications. We used a recursive wget as the baseline, and we measured code coverage on these applications: how much of the application's code were we able to execute? Because, as we learned earlier, a tool that never executes a piece of code can never find a vulnerability in that code. It intuitively makes sense, and that's why we used code coverage as our main metric.
So, how much of the application can we reach? Across our applications, in I think all the cases, we did better than the other three. But the important part is the difference between the blue, which is us, and the red: that difference is what being state-aware gets you, the ability to exercise more of the application's functionality and test it in all these different states. One interesting thing: Skipfish came in lower than wget on some of these. Why? I have no idea. Same on this vanilla forum one. These are the oddities; weird things happen. And these are some of the graphs the tool produces, which is really cool. This is the state graph it inferred for WackoPicko. Some of it makes sense if you stare at it for a while, so I won't do that here. But: log out, log in or register; when you're in that state, you can add a comment, which moves you over here; and there are all kinds of different ways to move through this graph. All of these edges are state-changing requests. Any questions? Is there a tool that does this? Yes, it's open source, but I'll give the research-ware disclaimer: it worked once. People have used it; it is out there for the brave and willing, or for grad students whose advisors make them use it. That's how research software goes. There are a whole bunch of limitations. Fuzzing this kind of graph gets harder as the graph gets bigger. And we don't really have a good way of dealing with something like a comment: adding a comment changes the state of the application, but every comment you add is really the same add-comment functionality appending to a list. It's not transitioning you to a completely new state that you can never recover from, although in our model it kind of is. That's tricky. Anyway, I'll go over the rest really quickly.
That's some of the work I have done. Those papers are published, and you can check them out on my website; there are links to all of it there. WackoPicko is open source, and it's also part of the OWASP Broken Web Applications Project, as one of the apps in there. So if you want to check it out, there's a great VM from OWASP that you can download and run, and you can access WackoPicko and play with the vulnerabilities. Now I want to talk a little bit about the future: what are the challenges for these tools? One change is that applications are starting to say, let's get rid of this back end, this server; let's move the database into the browser. So now when I make a request, I get shipped back an HTML page, CSS, and JavaScript, where the JavaScript is really the key to everything. This is what we call a client-side web application: it runs 100% in your browser, it uses local storage APIs that store data in your browser, and maybe it makes some requests to a server. I think it's really interesting to think about these tools here, because these applications still have problems with cross-site scripting; there's DOM-based cross-site scripting, which we still want to be able to test for and understand. So the question is: with these new client-side web applications, how do these tools work on them? Do they work? Should they work? We took Backbone.js, AngularJS, and Ember.js, and we created a little note-taking application in each of these three frameworks. Then we asked: how do these tools do? Are they able to crawl these applications at all?
They're still web applications, but now they're running on the client using JavaScript. And I found that, surprisingly, I think one study I saw said that of the top 1,000 domains, 3% were using one of these client-side frameworks, which is really interesting and shows that they're definitely on the rise. So here's the note-taker app, one of the three versions. The first thing that jumps out is this big cross-site scripting banner; hopefully none of you code these into your apps. This is actually our detection mechanism, because part of the problem is this: with a normal scanner, when you tell it to crawl a web application, you can sniff all the traffic it generates and the responses it gets, and analyze it later to see what it did, what requests it made, and what kinds of things it fuzzed. Here, everything runs in the scanner's internal browser, so how do we ever know where it went in the application and what it tried to populate? These tools are meant to find vulnerabilities, so this is the world's most obvious cross-site scripting form: whatever you type in goes to a clear cross-site scripting vulnerability on a completely different page. That way the scanner will tell us, hey, I saw cross-site scripting on the index page, whatever it's called, index.php, which shows that it actually reached the index page; and if it reaches the new-note page, it will report a different vulnerability for that endpoint. It's a workaround, because it's really difficult to track these things when they all run on the client. One of the interesting things about these applications is what you can see up in the URL bar, although maybe you can't, because it's very tiny.
So we have this index.html page, and then we have the fragment, the hash (or the hashtag, I guess you could say), and then slash note slash new. That's what's really interesting about these client-side web applications: they still have the concepts of routing and URLs. But everything after that hash mark is not sent by the browser to the server. The server, and in this case there isn't even a server, would only get a request for index.html; the client-side framework, Ember.js here, then uses that fragment to figure out what functionality you wanted. So it's still like a web application in that different URLs mean different things, but each client-side framework can do its URLs differently. We also have a search feature. This is still ongoing work; we haven't published it yet, but I thought I'd share it because it's fun. Our initial result: we ran 14 scanners, and because we haven't published, I won't say which ones. Only three were able to crawl the application at all. Which is kind of depressing, or exciting if you're a researcher. And no scanner was able to crawl all of the versions of the website, which is super weird: some crawlers could crawl the Ember.js version just fine, but not the AngularJS one. We're still trying to figure out why. We always want to answer why, because that's how we learn from it, but it's really hard when everything happens inside the tool, they're all closed source, and you can't instrument them. Some of them are cloud services, so we're literally not even running them on our own machine; we're just asking them to crawl our site. So it's really interesting. I want to end with... are there pen testers in the room? I know there's a good number of them.
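A quick way to see that split between path and fragment, using Python's standard library (the URL is just an example):

```python
from urllib.parse import urlsplit

# The fragment after '#' never reaches the server: a browser requesting
# the URL below only asks the server for /index.html, and the client-side
# router (Ember, Angular, Backbone, ...) interprets the rest.

url = 'http://example.com/index.html#/note/new'
parts = urlsplit(url)
print(parts.path)      # '/index.html' -- what the server would see
print(parts.fragment)  # '/note/new'   -- what only the client-side code sees
```

This is also why server-side logs tell a crawler nothing about which client-side "pages" it visited.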
Cool. So these are some of my colleagues from Shellphish when we were at DEF CON. I think this is the hotel room we were in; we borrowed those tables, and we were in this big suite in the Rio doing the hacking. Anyway, my goal is to put you all out of a job, pen testers. If I could make an automated black box web vulnerability tool that could do what you do, I would be super happy, because that's the dream. The more I think about these tools, the more I think of them as little AI systems: we're trying to take all of your domain knowledge about how to break into a web application and squeeze it down into an automated tool that anybody can fire at any website to find bugs. I think that's really the strength of these tools. I'm probably not going to put you out of a job anytime soon. But here's where this really matters: you're at an OWASP meeting, so you're already ahead of 90% of the developers out there when it comes to security awareness. The key problem is that most developers don't even know about these issues. If we can give them the tools and say, here, use this; maybe I showed today that the tools aren't great yet, but they're getting better, and with cool new research we can keep improving them. That way we can improve the security of the entire web ecosystem, by enabling our fellow developers who don't have security expertise to test their own websites and find vulnerabilities before they're exposed. I heard a good talk once where somebody said these black box tools pick the low-hanging fruit on the tree of vulnerabilities. So what I want to do is raise the bar so we can reach the fruit a little higher up, keep raising the bar, make the job more difficult for attackers, and really make the web safer.
Some of the challenges I see: improved crawling capability, because you can't find a bug in code that you never execute; that's fundamental. Improved attack functionality: instead of just throwing a huge list of payloads at every input, can we be smarter and make the attacks more efficient? Improved analysis: can we find new techniques for new vulnerability classes? The tool essentially has to already know what an attack is and how to tell whether it succeeded. Things like logic flaws are incredibly difficult to detect in a black box manner, because that logic lives in one specific application. And finding hidden bugs: there's some work where your system uses another system, which uses another system, and data propagates through all of them; at some point some downstream system can be corrupted. It's worth thinking about how all these things are intertwined. So with that, I'd like to thank you all for listening and being so patient. You said there was a tool that somebody asked about; is there a name for it, or where can we find information about it? Oh, somebody asked about the state-aware crawler. It's on my GitHub; that's really the best place for it. You can also go to our group website, or to my personal website; there are links to all my stuff there, and it's linked from the WackoPicko page too. The client-side work will be released once we get it published; we have to publish first, then release. If you have any questions, please reach out; talk to people, work with people. One of the things I'd love to do is figure out how professionals use these tools and how effective they are in your hands versus a normal developer's. That's something that's really hard for me to measure: how are these tools actually helping people?
You mentioned that you had issues with the scanners logging out and not knowing they were logged out. How did you build the logic so the tool knows whether it's logged in or out, and logs back in before it continues scanning? Yes, so that was the state-aware crawler. If we were able to infer a difference in the state of the web application, we could actually test for it: we could use that home.php request and say, if I get back A, I'm in this state, and if I get back C, I'm in that state. And in the graph the transitions are labeled too, so I can say, I'm in this state but I really want to be testing in that state, and follow the transitions to get there. The other challenge is: how do you test the login request itself? Every time you fuzz the login request, you may log yourself out, and that's basically what our tool did: make the fuzzing request, check what state it's in, and if it's in a state it thinks it shouldn't be in, transition out of it. And as a worst-case fallback, we also gave it the ability to roll the VM back to the beginning and replay all the requests up to that point, so it's in a known-good state. Yes? A simple observation: you talked about inferring whether you're logged in by requesting the home page over and over, which is a page you already had access to as an unauthenticated user. The more direct and clear way is to request the view page, because then the application won't allow you to view it, so you'll probably get a... Well, it should not, right?
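That recover-to-a-known-state step can be sketched as a graph search. The graph, state names, and request names below are invented for illustration; they mirror the guest/user example from earlier in the talk, not the actual tool's data structures.

```python
from collections import deque

# Sketch of the recovery step: the inferred state graph labels each edge
# with the request that causes that transition, so if fuzzing knocked us
# into the wrong state we can search the graph for a request sequence
# back to the state we wanted. The graph here is hypothetical.

GRAPH = {  # state -> {request: next_state}
    'guest': {'login.php': 'user'},
    'user':  {'logout.php': 'guest', 'view.php': 'user'},
}

def requests_to_reach(current, target):
    """BFS over the state graph; returns the requests to replay."""
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for request, nxt in GRAPH[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [request]))
    return None  # target unreachable from current state

print(requests_to_reach('guest', 'user'))   # ['login.php']
```

BFS gives the shortest replay sequence, which matters because every replayed request is one the fuzzer is not spending on testing.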
If you get redirected to the login page or the home page, or you get an access denied or anything like that, you're getting a really clear signal that you are not in the authenticated state. Whereas with the home page, depending on how home pages work, a lot of them serve the same content whether you're logged in or not. So what you'd want to do is request the authenticated page and confirm whether you get the authenticated content or get redirected back. So, I definitely glossed over this: the basic idea we used was the link structure of the pages. We looked at the links and forms on the page, and the key idea is that the links and forms describe what you can do with the application. If there's a new link or form to a page you've never seen before, then it's likely that the state has changed. So on the logged-in home page, you'd see a link to the view page. But you're right, in the real world everything is crazy and varies; there are tweets being pulled in and ads rotating, so we punted on a lot of that. And I like that idea, because the view page would have completely different responses: I wouldn't have to parse the HTML of the page and look for links. If you get a 302 versus a 200 response, it's already over, and you know you're in a different state. So, people actually use these tools in their jobs. I would say one of the number-one frustrations we encounter is that once the scan is done, it's done. You come back next week and scan again, and there's just no matching-up between the two: maybe you fixed something, maybe you broke something new. You'd want tracking across scans, like, we saw this more than six months ago, does it still exist? So, kind of built-in
regression testing. I've thought about many of these as well; you need to think at a bigger level than a single scan. One concern I had about the list of missed issues that all the scanners had was whether those were issues the scanners even claimed to look for. Some of the logic stuff, the authorization things, the parameter tampering: without context they don't necessarily even claim to look for that, because they can't know the difference between ID 1 and ID 2, whether those are both yours or not. There's no way for them to even know that, so I don't think they would claim to be trying to find those. I agree; well, I don't know, I don't work for any of these companies. It's a very good point. This is all a summary of a 12-to-15-page paper, so there are a lot more details in there, but yes, some of those cases I agree are not quite fair. It's more that that's where we want to go. Detecting logic flaws is very difficult, but take parameter tampering or forced browsing: if you know about the state of the application, you can maybe check for forced browsing by saying, there was no link to this page in this state of the application, but there is a link in that state; what happens when I make that request in the first state? But yes, I totally agree. The more interesting thing is the differences between them. In some applications, some scanners found vulnerabilities without any configuration at all, including some behind the login, which means they successfully created a user account without being told a username and password, which is pretty impressive. And some of the tools only found the vulnerabilities behind the login when we actually gave them the username and password, or used their proxy to log in; there were basically three configuration modes.
I think the differences between them are more interesting than the things all of them missed, and a lot of those misses were deliberately hard challenges, like cross-site scripting in a Flash applet, or cross-site scripting through a dynamically created JavaScript form. Those I think are big issues. But what about apps that detect a scan and adapt to it? Say you try to log in three times, and the app detects that maybe you're not inserting tokens with your logins, and now it just locks you out and shows you the login page every time. I guess I should have prefaced this: my ideal way for these tools to be used is in your own environment, against your own application, in a closed environment you control, where you can disable that. Yes, but it is interesting to think about that kind of example: can the tools detect it, or do they just break? They give you a bunch of knobs to tweak that maybe you don't understand; maybe that's one of the knobs. Yeah. And the thing you didn't talk about in terms of difficulties for scanners: cross-site request forgery defenses. Did any of the applications use anti-CSRF tokens or something like that to invalidate form requests? That's interesting, because then you need to have first obtained a valid request token in order to successfully make posts. That can be a problem: maybe the tool crawls a page successfully the first time, because it's just submitting the form on the page, but if it tries to submit again, it doesn't have a valid token. Does it identify that, and is that blocking things? Yeah, so, does WackoPicko have anti-CSRF tokens? Then did you get false positives from any of the scanners complaining about cross-site request forgery? I don't think so, because this was back in 2010, and I don't know if they were all testing for that. I don't have
the specific versions right here, but yeah, that's a really big thing. Are you aware of machine learning research in this space? In what sense? I'm wondering what would happen if you took some really mature thing like Watson and coached it with, you know, here's a thousand different pentesters hitting the same application, here's what all the scanners do, now go at it. Not that I'm aware of, but that's actually the direction I'm starting to think in: machine learning, artificial intelligence, those kinds of approaches. How can we make the tool behave more like you do when you're actually on a website? Because it's the same thing: when you're testing, you're reverse engineering that application in some sense as you use it. So I'm thinking about ways we could automatically reverse engineer the application, those kinds of things. Yeah, we'd love to train something crazy. I'd be worried about it going too far and actually exploiting things. Well, you probably want to look at DARPA's Cyber Grand Challenge; they're presenting that. Yeah, so my team... unfortunately I left Shellphish after they had already qualified for that, after they had started the CGC stuff. Those are a little bit different, because there they have the binaries; it's not web applications. You can apply some of those principles, but it's so hard without having the code; that's the key problem here. I think you could definitely apply some of those approaches to the client-side apps, analyzing the JavaScript. Part of their challenge is that they have to patch the vulnerabilities without access to the source; they just have the binary. Actually, when we did DEF CON, they had some CGC binaries in there. It isn't fair: people are building systems to automatically break these, and
we're here trying to do it by hand. So how general is your scanner? The applications you evaluated were very simple ones. I don't know, to be honest. It's general in the sense that it's abstracted a bit from the framework, because one of the key problems I have with a lot of these scanners is that they assume query parameters are always going to be in the query part of the URL, where we all know different frameworks do things differently: some put parameters in the path, some do all kinds of things. So we're more general in that sense, and we can handle different types of applications; we had a Ruby application, a PHP application, different kinds of apps. For this version, the one requirement was that no outside actor could change the state; it needed to be the user driving the state changes. It gets really interesting when you have multiple users of the system and the state gets changed by some other person. But I don't have any insight into the theoretical Mealy-machine expressiveness of this approach; it's very much a series of heuristics that work together, and then you get these full graphs. Kind of off topic, but did you see the thing this week where they were manipulating fan speed to get data out of a machine? Crazy, like secretly exfiltrating data by creating a covert channel with the fan speed and then listening to it with your phone, actually transferring data. It's actually really cool; I didn't know if anybody had researched it more. I didn't look at it too much. I didn't either, but yeah, there are covert channels in every kind of system. So anyway, I think my time is up, so thanks. Thanks a lot, Adam. We'll finish with the next slide, just a closing slide. Thanks to everyone for coming, and see you at the next chapter meeting.