Hi, everybody. Before we actually get into this talk, I wanted to give a quick explanation. Dr. Wayne Zage and Dolores Zage were supposed to be up here with me. They sent me an email on Friday saying, hey, by the way, we're not showing up. I was like, well, okay, why aren't you showing up? They said, our son has to move to Sandia National Labs, so we're going to help him; DEF CON is not a big thing, you can handle it, you're a grad student. So I was like, okay. I added their email addresses to the slides at the last minute, which was the advice I got from some of the DEF CON goons, so if you guys want to email them about any of this stuff, go ahead. But then yesterday I was like, well, this is based on 20 years of metrics research, and I did a lot of the security side of things but not as much the metrics side, so how am I supposed to really do a good job explaining this? So they sent me their research papers. So then I had 20 years of research papers. I went through what I could, took notes, and basically we'll get through the talk the best we can, and I'll answer your questions to the best of my ability. I have a pretty good understanding, but like I said, I would have preferred for them to be up here with me. The speech must go on.

For the presentation outline, we'll do a quick overview of S²ERC, which used to be SERC, the Software Engineering Research Center, and is now the Security and Software Engineering Research Center, and then get into the actual metrics and the vulnerability analysis. Let me get through these; I've got some notes. This is the first DEF CON talk I've actually had to take notes for. Basically, S²ERC is a National Science Foundation university cooperative research center. It's funded by NSF, it has ten-plus universities and around 50 researchers, and it's all collaborative: companies, government agencies, and everybody else sponsor us to do research projects for them. The participating universities I'm not going to read off to you; you can see them, they're in the slides, they're on the CD. We also have a ton of industrial affiliates. This is not really the stuff I care about as much.

Okay, here's the design metrics timeline. Like I said, it's about 20 years of metrics research. It started in 1986, and I believe it all came out of Purdue University; it's Dr. Wayne Zage's baby, from what I can tell. The first research paper they sent me was from Purdue, so I'm guessing that's where he did this research. Let's go ahead. Okay, I guess they were going to click through, and this is really annoying. They got an Alexander Schwarzkopf Award for their research; I'm guessing they wanted that in here to say, hey, it's credible. Like I said, this is their content, so I have to basically stumble through some of it.

Okay, here's where we actually get to the good stuff: an overview of the design metrics. There's an external view of design complexity and an internal view of design complexity. When we're talking about faults in software, the more complex something gets, the more error-prone it's going to be.
So when they started this stuff, it all goes back to reliability. This is 20 years ago; nobody really seemed to care about security. But they did care that their software didn't crash. If you have a navigation system or whatever else, you don't want that thing going down. So reliability was the focus for software when this research started. We look for outliers: you have these modules that come up as very error-prone, fault-prone, and that's what we're looking for. And there are all the people that have funded us. This particular project, taking the reliability metrics and applying them to security, was funded by Army Research Labs. So let's get through it. We've computed metrics on university-based projects, Computer Sciences Corporation's STANFINS system, systems from Army Research Labs, Harris's ROCC project, a Magnavox PBX system, and Northrop Grumman. The stuff that I was actually part of is the open source projects; I didn't do the metrics on these earlier ones, those were Wayne and Dolores.

Now, on the results for the reliability metrics: they have correctly identified, in the worst case, 76% of the defect-prone modules. That means the worst run was 76%; it's almost always higher than that, generally closer to 80 or 90%.

So now we get to some definitions. Our external metric, as you can see, is inflows times outflows, plus fan-in and fan-out. Let me go over that nomenclature, because I'm not sure how standard it is; this is what I got from their papers. An inflow is any data coming into a module. It doesn't matter if it's parameter passing, global data, or reading a file; any data coming in is an inflow. An outflow is any data coming out of a module. Say your module writes to a file, that would be considered an outflow, or it could be passing something out; anything coming out of that module is an outflow. Fan-in is a count of the modules that actually use the module in focus, the module in focus being the one you're running the primitives on. And I should make sure to define what I mean by module: any addressable, callable piece of code. These reliability metrics aren't limited to C or Java or Ada or Perl or any other language, and because languages are so different, we just say a module is a callable piece of code. Any time I say module in this talk, that's what I'm referring to. So fan-in, like I said, is a count of the modules that use the module in focus; if you want to measure reliability for this module, how many modules call it? That's your fan-in. Then your fan-out is how many modules it calls. As you can see here, it's actually pretty simple once you have it all defined: you plug in the counts for the module and, in this example, the result is 23. So let's go ahead, and we're going to break that down into inflows, outflows, fan-in, and fan-out. Then we have an internal metric, which is a measure of the internal complexity of a module. The equation for that uses your central calls, your data structure manipulations, and your input/output.
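Since the slides only show the formulas, here's a minimal sketch of both metrics as I read them from the papers. This is my own illustration, not the actual S²ERC metrics code: the counts are made up, and I'm reading the "fan-in and fan-out" part of the external metric as the product of fan-in times fan-out.

```python
# Minimal sketch of the two design metrics described in the talk.
# Illustration values and my own reading of the formulas, not the
# actual S2ERC metrics code.

def external_metric(inflows, outflows, fan_in, fan_out):
    """External view of design complexity for one module.
    inflows/outflows: data items entering/leaving the module
    (parameters, globals, file reads/writes, return values, ...).
    fan_in: how many modules call this one; fan_out: how many it calls."""
    return (inflows * outflows) + (fan_in * fan_out)

def internal_metric(central_calls, dsm, io, w1=1, w2=1, w3=1):
    """Internal view of design complexity: central calls, data structure
    manipulations (DSM) and input/output, each multiplied by a weight."""
    return w1 * central_calls + w2 * dsm + w3 * io

# A hypothetical module with 4 inflows, 5 outflows, 1 caller, 3 callees,
# 6 central calls, 10 data structure manipulations and 3 I/O operations:
print(external_metric(4, 5, 1, 3))   # -> 23
print(internal_metric(6, 10, 3))     # -> 19
```

The counts here are chosen so the external metric comes out to 23 like the example on the slide, but the slide's actual counts may well have been different; the point is just that a module that touches a lot of data and sits at a busy spot in the call graph gets a high score, and high scores are what flag it as fault-prone.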
Now, W1, W2, W3: I was confused about those, so I went back to the research papers. All those are is weights. If you wanted to give more focus to data structure manipulations, you could give W2 a weight of two or a weight of three. For all of our metrics, unless the people funding us to run the metrics request a specific weight, W1, W2, and W3 are always one. For the Army Research Labs SMART project, the values of W1, W2, and W3 were always one. Just to make that clear. So then let's get on, sorry guys.

So, extending design metrics technology to a software security engineering process: really what we're trying to say is, how do these reliability metrics actually apply to security? Looking at defect rates, a deployed software package of one million lines of code has about 6,000 defects. If we assume 1% of those defects are security vulnerabilities, there are at least 60 different opportunities for someone to attack a system that is a million lines of code. So really what we're saying, and I'll get into this a little later, is that within the reliability faults you're going to have security faults. You have a ton of reliability faults, and a small number of those will actually be security faults.

Let's get through here; this is what I'm talking about. You have your defects, and within those, some are actually exploitable. That's what we set out to show when we did this project for the Army: that these metrics we've been using, proven over 20 years for reliability, can actually find security faults. So when you look at the parallels: a fault-prone component is likely to contain faults, and on the security side, a vulnerability-prone component is likely to contain vulnerabilities. A failure-prone component is probably going to have failures in the field; those are the ones you really want to catch, because that means your stuff is going to crash. On the security side, you're going to have attack-prone components, which are likely to actually get attacked in the field. And when we did an analysis on a large commercial telecommunication system, we found that the presence of security faults did correlate strongly with the presence of the general category of reliability faults. When we saw that, which was before this project, it really led us to believe we could use these metrics that have been used for reliability, and that our error-prone modules are most likely going to be vulnerability-prone as well.

So, our project objectives: we wanted to investigate that overlap, like I was saying, and see how much overlap there is and whether our system will actually find vulnerable pieces of code. The systems we did are Firefox, OpenSolaris, OpenSSH, and Drupal. And Apache; I don't know why that wasn't on there. We figured it would be really good if 50% of the components containing documented vulnerabilities were identified. By documented vulnerabilities I mean things like CWEs or CVEs; basically, everybody here that does security knows you have patches, and the stuff gets reported for the most part, so you have reported vulnerabilities.
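To make that success criterion concrete, here's a minimal sketch of the kind of check this boils down to, with invented module names and scores rather than anything from the actual study: rank every module by its metric value, take the top slice of the ranking, and count how many of the modules with documented vulnerabilities fall inside it.

```python
# Minimal sketch (my own illustration, not the S2ERC tooling) of checking
# how many known-vulnerable modules land in the top slice of the ranking.

def hit_rate(metric_scores, vulnerable_modules, fraction=0.10):
    """metric_scores: dict of module name -> design metric value.
    vulnerable_modules: set of module names with documented CVEs/CWEs.
    Returns the fraction of vulnerable modules found in the top slice."""
    ranked = sorted(metric_scores, key=metric_scores.get, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top = set(ranked[:cutoff])
    return len(vulnerable_modules & top) / len(vulnerable_modules)

# Hypothetical data: 40 modules, 4 of them with documented vulnerabilities.
scores = {f"mod{i}": i for i in range(40)}    # fake metric values
vuln = {"mod39", "mod37", "mod36", "mod2"}    # fake vulnerability locations
print(hit_rate(scores, vuln))                 # -> 0.75
```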
So what we did is gather all those reported vulnerabilities on these products, run the metrics on those specific versions, and ask, okay, were we actually able to identify these vulnerability-prone components? That's the whole process, basically: you take your system, you build a list of all your vulnerabilities and what version of the code they were in. This is where my big part in the research project came in. I would take a report of a vulnerability and then figure out, okay, what is actually causing that vulnerability? What's the actual vulnerable piece of code? That's really what you need to know, because when the metrics give you a specific module, you have to know where the vulnerability actually was, or else you can't say whether it was identified or not. If a report just says, hey, Apache has a buffer overflow, but doesn't tell you exactly what module the buffer overflow is in, it doesn't do you any good. So my part was to find exactly where those vulnerabilities were, so that when they ran the metrics code, we could say whether the metrics actually identified them all.

So there you have our CWEs and how many we collected for the different versions. We did OpenSolaris; I'm not sure what the version was on that. Firefox, where we had 2.0.0.1, 2.0.0.2, and 2.0.0.5. OpenSSH, and Apache httpd; excuse me, I'll have the version numbers in just a second.

Everybody knows what Apache is; I don't really need to go over this. For the vulnerabilities in version 1.3.1, you had 144 files and eight vulnerabilities identified. With the metrics, we actually had 87.5 percent, or seven out of eight vulnerable modules, identified in the top 10 percent, which is a huge win for us. If you are a code auditor and you want to know where to look, that's a huge win: if it says, hey, this is the top 10 percent, and that's actually where the vulnerabilities were, then that's pretty good. At least in my opinion. So there's a graph of some of the actual output; it shows, hey, these are the modules that are likely to be vulnerability-prone.

OpenSolaris; I think everybody knows what that is. We had 22,600 files, and 23 modules updated by vulnerability fixes, totaling 37 changes; five is the largest number of changes on one module. So we wanted to identify 23 modules out of 22,600 files, which is roughly one in a thousand. So yeah, pretty tricky. Dolores wrote some code for this that she calls her divide and conquer; let's see, there we go. She got that code to work and pick the system apart, because I guess there's a lot of weird stuff in there that was messing up her preprocessor engine. Once she got that working, we had 60 percent, or three out of five vulnerable modules, and 69 percent, or nine out of thirteen changes, identified, which is pretty good. Not as good as the reliability numbers, but for a system that complex, with that much code, we still felt those were pretty good results.

Firefox; again, another description, and everybody knows what Firefox is. We did a general analysis, and we did a lot more on this one because the Army was a lot more interested in Firefox, so I actually did multiple versions instead of just one specific version.
And we found that at least 51 percent and at most 86 percent of reported vulnerabilities were in data handling. So data structure manipulation, which we talked about earlier, was the best predictor of vulnerabilities in Firefox. What we ended up concluding from that is this would be one of the times where, if you're going to analyze web browsers, you would want to give more weight to the DSM term than to anything else.

OpenSSH; everybody knows what that is. For 3.8p1, 31 vulnerable modules were identified, and 18 of the 31 vulnerable modules were in the top 10 percent, which again I'd say is pretty good. It makes your life a lot easier if you're looking for security bugs when the top 10 percent is actually accurate. A tough thing with this research is that there could still be unknown vulnerabilities in the software. So when it gives you the top 10 percent of vulnerability-prone modules, all we can actually check against is the vulnerabilities that have been identified. Our numbers could actually be higher: if there are unknown vulnerabilities in that top 10 percent, we could have much better numbers, but we don't know. We didn't do source code auditing and try to find new exploits in OpenSSH or Apache or anything like that; we were just looking at existing vulnerabilities and trying to see if we could find those.

Then we get to Drupal. That's actually new research going on right now with a grad student named Zeeley. Drupal's in PHP, so again a different language; the other ones were all C and C++. And everybody knows what Drupal is. What we ended up doing with Drupal, well, some of my work, Dolores ended up naming the Drupal Vulnerability Miner. It was a lot easier than Firefox, where I was manually collecting this stuff; here I just wrote a Perl tool that we called the Drupal Vulnerability Miner. The Drupal community has a security site that's really good about reporting the vulnerabilities and exactly where they're located. So the Drupal Vulnerability Miner goes out, parses through that data, and automatically generates what we needed: we have these Excel spreadsheets that show, okay, this is the vulnerability, this is where it was in the code, this is what caused it. Then, when we run our metrics, we compare the two. That was just something I did that made things a lot faster.

On the PHP source code analysis, Dolores is still designing the PHP-to-XML tool (I'm not; she is). Basically, we have a generic XML definition, so whether you have C or Java or PHP or Perl or whatever, instead of the metrics scanning the source directly, you have a parser that converts it into a standard XML representation, and then the metrics run on that. So if we need to add a new language to our metrics, we just write a parser that converts it over to this XML definition; that way we're not language-specific. I think they said that maybe eight or ten years ago one of the government people came to them and said, hey, we need to do Ada, and they had only done C, so that's the whole idea behind making it a little more generic.
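To show the shape of that language-independent idea, here's a minimal sketch of a generic per-module representation feeding the metrics. The element and attribute names are invented for illustration; the actual S²ERC XML definition isn't shown in this talk.

```python
# Minimal sketch of the parser-to-XML-to-metrics pipeline described in the
# talk. The XML schema below is invented for illustration only.

import xml.etree.ElementTree as ET

GENERIC_XML = """
<module name="check_login" language="php">
  <inflows>3</inflows>
  <outflows>2</outflows>
  <fanin>4</fanin>
  <fanout>1</fanout>
  <central_calls>5</central_calls>
  <dsm>9</dsm>
  <io>2</io>
</module>
"""

def metrics_from_xml(xml_text):
    """Compute the external and internal metrics from the generic module
    description, so the metrics code never sees PHP, C, Ada, etc. directly."""
    m = ET.fromstring(xml_text)
    count = lambda tag: int(m.find(tag).text)
    d_e = count("inflows") * count("outflows") + count("fanin") * count("fanout")
    d_i = count("central_calls") + count("dsm") + count("io")  # weights = 1
    return m.get("name"), d_e, d_i

print(metrics_from_xml(GENERIC_XML))   # -> ('check_login', 10, 16)
```

Adding a new language then just means writing another front end that emits this same XML; the metrics side doesn't change.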
So after showing all that on this project, the benefit we felt, like I said earlier, is knowing where your vulnerabilities reside. Especially if you're on a team trying to find vulnerabilities in your code by auditing the source, it's very helpful to know where to start looking first. Of course, skilled auditors are already good at this; we've talked to people from different sponsors who are very good at it, and they said it was still helpful to them, because they tend to have a pattern of what they look for, and there's stuff they don't usually look for, and this helps with that.

And being able to analyze all these different technologies matters: you have your web server, OpenSSH, et cetera, et cetera. A lot of the scanners out there will say, hey, I can look at C code, and especially a lot of your static analysis just says, I can find buffer overflows in C; it's a specific thing it's looking for, a very specific pattern, and it's very limited. With ours, we're trying to say we can take any language out there. Those scanners work for you, but let's say all of a sudden, for some reason, your company decides to use OCaml, and you ask, well, are there any OCaml vulnerability scanners? Probably not. So you could actually get the XML definition into our system, run the metrics on it, and it would show you the fault-prone and vulnerability-prone modules and identify other security weaknesses. Because, like I said, we didn't identify 100 percent, so we think there should be metrics out there that will help better find all those vulnerabilities. And one thing they asked me to mention while I was out here: if anybody has any systems they want analyzed, you can contact them; if you have the source available, all of that is no problem. And that's actually it. Let's go on to questions.