 We're going to do a partnership, or we can do a partnership. So now we're going to start very broadly. So what are applications? What are we talking about here? What do you think of when I say applications? There's no wrong answer. A set of programs which are built to work together to achieve goals. OK, programs that work together to achieve goals. Web applications, desktop applications, what else? Mobile applications, what else? Embedded applications? I don't know, would you call those applications? Probably. I mean, they're trickier, because then you get down into what is part of the application. But those are definitely devices that are running code, so I would definitely put those in there. What about server applications: the DNS servers, the Apache web servers, the IIS web servers? These are all different types of applications, but fundamentally, the way we'll think about it is that they provide some sort of service. So let's get back to the idea of programs working together to accomplish some goal. They're either running locally, say if we're processing files or anything like that, or they can be running remotely. So they can be running on somebody else's server, which maybe wants to offer a service to other users. This is why it was so important to study the network before we do this. So what determines the behavior of an application? Input, so the input from the user, what else? Is that it? The programming? The code. What's that? What it's running on? What it's running on, so any other abstraction layers, libraries, those kinds of things. What else? The resources it needs. What was that? The resources it needs. Resources, so its behavior could change dynamically based on the size of the server and how much memory it has, those kinds of things. What else? The network, maybe what kind of network setup it has, or what network devices it has access to. The user. The user, well, we said user input; I'd say those are similar.
The system. Has anybody installed software? Never, you've never done that? Have you installed a server application before, like Apache? What do you have to do after you install it? Do you just install it and use it? You have to configure it. So the application's configuration affects its behavior, sometimes substantially. So all of these things contribute, right? So we can think of an application as, first, the code. The code obviously has a huge impact on what this application actually does. But a lot of it is also the data, so the input from the user that's being processed. And the environment, where you can kind of roll up all the things we talked about: the server, the network, the configuration. And so what is our goal as attackers? If we wanted to attack an application, what do we want it to do? What was that? We may want to bring it down, but at a high level, what do we want to do? Make it do something it isn't designed to do. Yeah, so make it do something it's not supposed to do. If it's not supposed to crash when we send it stuff, and we send it stuff and it crashes, right, we've got it to do something it's not supposed to do. If it's not supposed to give us everyone's social security numbers, but we send a SQL injection that makes it give us all of its social security numbers, we made it do something that it was not supposed to do. And so fundamentally, that's what we are thinking about. We need to try to get these applications to violate the security of the system. Yeah? Intended by whom? What do you think? Well, you say things like, make it give out student IDs. I mean, you could argue that if the program in fact gives out student IDs, it was intended to give out student IDs. So I would say intended by the original programmer, or by the owner. These are all different people, with different ideas of what the intention of the program is. Yes, so this is what makes this tricky and difficult.
So I like to... did I put an example in here? No, I don't think so. Okay. So yes, I like to give an example similar to that. Let's say I told you there was a website where I could change any page on that website to be whatever I want. Is that a vulnerability? It depends. If it's CNN, is that a vulnerability? If it's Wikipedia, is that a vulnerability? No, it isn't. Well, it depends, they have locking mechanisms, but whatever, I can still edit a lot of Wikipedia, right? So context is incredibly important. That is your job as a security analyst: to understand the application, to understand what it is supposed to do. And that actually leads to a lot of arguments sometimes between security researchers and developers, because the developers say, well yeah, it's supposed to do that. So maybe the security analyst misunderstood the purpose. Like you said with student ID numbers: if there was a vulnerability where anybody could access all the student ID numbers in this class, that would be a huge vulnerability that's actually violating student privacy laws. But when I go to my NSU, I can see all your student IDs, right? Because I have to, for business reasons, for teaching purposes, right? So the same behavior in different contexts and different applications could be insecure or could be secure, so it's incredibly important to think about that. Good point. So when we think about security, right, we want to violate confidentiality: the application says this should remain secret, but I can read it. We can violate integrity: the application wants this data to not be changed by some random user, but we can change it. Or we can maybe violate availability: this application should be available to all network users, and if we take it down, it's not available to them. So, to do this, what can we influence?
So we said the behavior of an application is determined by the code it runs, the data that it runs on, and the environment in which it executes. What can we control? The data. Fundamentally, the input is essentially the only thing we can change. If we can change the code, that's a huge problem, right? If we can already change the code, we can fundamentally make it do anything. If we can change the environment... we'll see that sometimes we can change the environment, sometimes not. That depends on where the application is running. If it's a local application running on our machine, yeah, we can change the environment and mess with it all we want. If it's a remote application on a server that we don't have access to, then we can't modify or change the environment at all. So this is really important to keep in mind. And so this is, at a high level, how I like to think about applications. We have the application, right, the code of the application, and it's running in some environment. On top of that, you need to think that it's running on an operating system, so you can maybe separate out the environment from the operating system. It oftentimes has access to a network. There's a file system. There are other processes on the computer that can maybe try to talk to this application. There's a terminal where you as a user can use the computer. So if I'm sitting at my computer at a terminal, am I local? Is that a local application I'm using? Yeah, it's a local process on this machine. So when I boot up a terminal on my machine, I can run ls and cat. Those are all programs that are running on my local machine. Now when I SSH into, let's say, the submission server and I run the mysql command because somebody forgot their password again, am I local or remote? Both? Do you want to explain?
Something's running on your computer that's communicating with the code running on the server. So in that way you have both a local connection and a remote connection. Yeah, so even in that case, I'm connecting remotely. But essentially, for this terminal part, I'm just connected remotely to the terminal of that computer, and it's exactly the same as if I'm sitting in front of that computer typing in commands, right? This is an important thing to think about when you think about remote versus local distinctions, because I have access to that machine. I have as much access as if I was logged into that machine sitting directly in front of it. At least for the programs, for example, right? So when you think about where the application gets input from: well, it can get input from the terminal, from a local user who's using the application. It can get input by reading files from the file system. It can get input from the network, from talking to other network services. And it can get information from other processes on the system, right, so this is remote procedure calls. So what are all the ways that we can influence an application? The file system, right? We can mess with the file system, and maybe that'll change the behavior of the application. What else? Network. What was that? Network. The network. Maybe we can mess with the network. What if we inject packets into the network, or drop packets, or sniff? What else? The terminal. We can give input that's weird, right, to try to make it crash. And other processes, right? Yeah, exactly. So this is why I like this diagram. What else? There's one other thing. The environment. Yeah, if we can control that environment and change the way the program executes, then maybe we can influence what this application does.
So these are all the things you need to think about when you're trying to break an application: what are all the things that I control that can be input to this program? And this applies whether you're talking about an application running on your desktop, or applications that are running on a server, or applications that are running on your phone, right? All of these kinds of things are important in all of those domains. So what we're going to be looking at and studying is application vulnerability analysis. What's vulnerability analysis? What's a vulnerability? Are you almost? How easy is it to break something? It's usually hard. Yeah. Something that we can find in the system? Yeah, so a vulnerability is any bug that can be used to compromise the security of the application. It's by definition something that's not intended, right? So that's a vulnerability. Vulnerability analysis is essentially finding vulnerabilities, so analyzing a system in order to identify the vulnerabilities that are in that system. Specifically, when we talk about application vulnerability analysis, we're focusing on finding vulnerabilities in applications. When we look later at web application vulnerability analysis, we're going to look at how to find vulnerabilities specifically in web applications. All the same, what types of vulnerabilities can there be? What was that? Code related. Code, so in what kind of way, though? Like maybe there's a bug in the code, like a memory corruption bug or something, some undefined behavior in the way the code is written. The code itself has a bug. Yeah, so that would be one. Design. So what would be an example of that? Yeah, so I can make one. It uses a data structure that you can take advantage of if you're aware of it. Okay, that's good. So Python and other languages have had this with web requests. So the hash table, everybody knows hash tables, right?
So what does a hash table implementation give you for access and insert? What's the big O? O(1), all the time? On average, assuming that the items are distributed evenly throughout your hash table. If you're able to force all the items in the hash table to hash to the same bucket, then it degrades to what? Yeah, O(n), so it's super slow. And so there have actually been denial-of-service vulnerabilities where attackers took advantage of knowing how Python, or whatever the programming language was, would hash values. They would send key-value pairs to the web server that would all hash to the same value, and it would slow down the server so much that it would cause it to crash. So yeah, there the program's fine. Or another design flaw would be sending a secret password in the clear over UDP, right? Or anything like that. Then anybody on that local network can sniff it. So the way I think about this, the code is doing exactly what it's supposed to do according to the design, right? The problem is fundamentally in how this thing was designed in the first place. Still a problem with the code. We'll see you guys later. So yeah, we can think of these as design vulnerabilities, where there's a problem in the design, and implementation vulnerabilities, where there's some problem in how the code was implemented. You can think of this as a design flaw versus an implementation bug. So what else? Are these the only things? Configuration. Configuration, right? If we go even below this, what about when we deploy this application? You can have the world's most secure application, but if you deploy it on a shared web server with world-readable, world-writable directories where anybody can access or delete those files, then it's fundamentally not secure. So you have to think about all these things.
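To make the hash-collision degradation concrete, here is a minimal Python sketch. It is not the attack from the lecture: instead of crafting strings that collide under a real hash function, it uses a hypothetical `CollidingKey` class whose `__hash__` always returns the same constant, which has the same effect on the table. The names `CollidingKey` and `count_comparisons` are made up for illustration.

```python
class CollidingKey:
    """Hypothetical key type: every instance hashes to the same value."""

    comparisons = 0  # total number of __eq__ calls across all instances

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Constant hash: every key collides with every other key.
        return 42

    def __eq__(self, other):
        CollidingKey.comparisons += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def count_comparisons(n):
    """Insert n colliding keys into a dict and return the number of
    equality checks the dict needed to do so."""
    CollidingKey.comparisons = 0
    table = {}
    for i in range(n):
        # Each insert must be compared against the colliding keys already stored.
        table[CollidingKey(i)] = i
    return CollidingKey.comparisons


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, count_comparisons(n))
```

Doubling the number of keys roughly quadruples the work, because each insert costs O(n) instead of O(1) on average. With well-distributed hashes the same loop needs almost no equality checks at all.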
This is why, when you're doing vulnerability analysis, you think about not only what the design of the code is and how the code is written, but also where this code could be deployed and how it is commonly deployed. Or if you're looking at one specific instance, you want to look at whether this instance is installed properly and securely. Another type of deployment problem that happens all the time? Default passwords. Happens all the time, right? And that's a problem with deployment. The code is properly checking usernames and passwords. It's not like there's a design flaw that lets you get around that. It's not like they are not checking passwords correctly. It's because at deployment time, they didn't follow the step that says to create a new password for the administrator. So design vulnerabilities are, as we said, flaws in the overall logic of the application. So this is another name for logic flaws. Another example I like: you know how you can have coupons on a website which will reduce the price of an item by, let's say, 20%? So what if the website allows you to keep applying a coupon over and over, to finally reduce the price to zero? Or, you can put the quantity in a shopping cart of how many items you're buying. Have you ever tried to buy a quantity of negative two? It should reject that, right? But what if it doesn't? What if it says, oh yeah, great, you're getting refunded $200 for the purchase? So these are design-level problems in the logic of the application. Usually these boil down to some kind of lack of authentication or authorization checks. If there are any problems there, it can also be erroneous trust assumptions, so trusting another machine or trusting a certain user when maybe you shouldn't be trusting that user. And you know, there's actually a ton there.
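The coupon and negative-quantity examples can be sketched in a few lines of Python. This is an illustrative checkout function, not code from any real shop: the function name `cart_total`, the coupon code `SAVE20`, and the coupon table are all assumptions. It shows the two checks whose absence creates the logic flaws described above.

```python
from decimal import Decimal


def cart_total(unit_price, quantity, coupon_codes, coupons):
    """Compute an order total while rejecting two classic logic flaws.

    unit_price   -- price as a string such as "100.00"
    quantity     -- number of items requested by the user
    coupon_codes -- coupon codes submitted by the user (may contain repeats)
    coupons      -- hypothetical table mapping code -> Decimal discount fraction
    """
    # Flaw 1: a negative (or zero) quantity would turn a purchase into a refund.
    if not isinstance(quantity, int) or quantity < 1:
        raise ValueError("quantity must be a positive integer")

    total = Decimal(unit_price) * quantity

    # Flaw 2: applying the same coupon repeatedly drives the price toward zero.
    # De-duplicating the submitted codes means each coupon counts at most once.
    for code in set(coupon_codes):
        if code not in coupons:
            raise ValueError("unknown coupon: " + code)
        total -= total * coupons[code]

    return total


if __name__ == "__main__":
    table = {"SAVE20": Decimal("0.20")}
    print(cart_total("100.00", 1, ["SAVE20", "SAVE20", "SAVE20"], table))
```

Note that nothing here is a memory bug or a crash. Without the two checks the code would still run exactly as written, which is what makes this kind of flaw hard to find automatically.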
And part of the research that I do is in automatically identifying vulnerabilities, and identifying logic flaw or design flaw vulnerabilities is incredibly difficult. It comes down to this problem of the intended functionality of the application versus how the application was actually coded. So you can look at the code, right, you can analyze the code to see what the code does. But where do you get that intended functionality specification from? Nowhere. It doesn't exist, it never exists. Maybe they tell you that it exists; they're lying to you. They're like, oh yeah, we have UML diagrams that describe it. No, nobody has that, ever, right? It's just inside the developer's head. So you have to essentially look at the code, infer what they likely meant, and then see how the behavior diverges, and that can help you identify a vulnerability. So it's actually an area that has a lot of research and really cool stuff going on. Another classic case of this is what's called the confused deputy problem. So let's say you are completely locking down a Windows machine, and you say, I'm going to make it so that no program except for, let's say, Internet Explorer can talk to the web, right? That might be a security policy you want to implement. Now let's say that I, as another process, can't send a request to the internet. I really want to: I have these credit cards I want to exfiltrate and send back to me. So I can't do it, but who can? Internet Explorer. So what if I can trick Internet Explorer into sending out my data? What if I can just use remote procedure calls to ask Internet Explorer to open up a URL and make that request for me?
So there you have this problem where we've trusted this entity. In this case, Internet Explorer is the deputy; they're super trusted. But if we can confuse them, if we can get the deputy to shoot people on our behalf, or arrest people on our behalf, right, now they're acting as us. So it's the same as if we were doing that action ourselves. And that is a very difficult problem to solve. What we're going to focus on here is mostly implementation vulnerabilities. At a high level, you can think of this as the application not being able to handle unexpected input. So it gets some input that either wasn't expected or that it was coded incorrectly to handle, and it causes the application to do something insecure. So this could be unexpected input, right? Maybe we give it input that it's never been tested on before, and that causes it maybe to crash or do something worse. Unexpected errors or exceptions: what if we delete the file that it thinks it's going to read from? What does it do then? Does it just crash? Does it do something else? Unexpected interleaving of events: you get this a lot, especially with multi-threaded code. What happens if A happens, and then B happens, and then C happens as well? Does that cause a deadlock? Does it cause resources to go haywire? Is there some way we can interleave events to cause this to happen? Unfiltered output: oftentimes we'll see applications have to be careful in what they generate as output, so if they're not doing this correctly, that can lead to vulnerabilities. There's a whole host of things here. We're spending a lot of our time focusing on this because a lot of these things require in-depth technical knowledge of how these systems work so that we can exploit them. Deployment vulnerabilities are, as we said, caused by an incorrect or faulty deployment or configuration of the application.
It could be installed with more privileges than it should have. For instance, the design says, hey, this application is secure, assuming you run this program not as the administrator. And you download it and try to run it not as the administrator, and it doesn't work. So the first thing you do is type sudo to run it as root, and magically everything works when you run it as root. But the downside is you're now running that program as root, so if there are any vulnerabilities in there, the attacker is going to become root. Or think about all the things that you rely on for the security of your application, like files not being readable or writable. Maybe somebody else changes that. Maybe there's some weird configuration change. Easy-to-guess passwords, as we said, right? There's a lot here. And the way you can tell the difference is that if it was correctly deployed, then this vulnerability would not exist. So this is not inherent in the application itself. You can kind of think of this as a hierarchy. The design flaws are at the top. They affect every single instance of that application. They also are more expensive to fix because they require completely re-architecting and rethinking things. Implementation bugs affect all the instances too, but you can fix them fairly easily. They're usually a one-line fix, right? But now when you get to deployment vulnerabilities, my installation may be secure, but your installation may be insecure because of how you deployed it. So we touched on this a little bit: remote versus local attacks. When you do research on security, you think in terms of threat models. So we think: I'm an attacker, what capabilities does the attacker have? Can they sniff your traffic? Can they spoof your traffic? What capabilities do they have? You should also be thinking like this as an attacker.
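One deployment check mentioned above, files that other users can write to, is easy to sketch in Python. This is a minimal illustration that assumes a POSIX system where permission bits behave in the usual way; the helper names `is_world_writable` and `demo`, the file name `app.conf`, and its contents are all hypothetical.

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True if any user on the system may write to path (POSIX mode bits)."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)


def demo():
    """Create a throwaway config file and check it under two deployments."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "app.conf")  # hypothetical config file
        with open(path, "w") as handle:
            handle.write("admin_password=changeme\n")

        os.chmod(path, 0o666)   # careless deployment: everyone can write
        loose = is_world_writable(path)

        os.chmod(path, 0o600)   # careful deployment: owner only
        tight = is_world_writable(path)

        return loose, tight


if __name__ == "__main__":
    print(demo())
```

The application code is identical in both cases. Only the deployment differs, which is the sense in which one installation can be secure while another is not.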
What can I do? What are all the things that I can do in a remote versus a local setting? So what are some things? Can you give me an example of something you can do locally that you can't do remotely? Rip open the hard drive and start... Too local, too local. Yeah. Too local. So we'll keep our ideas constrained to the digital and not the physical, although techniques for that do exist. Actually it's super cool, if you want to look at cool stuff. There are people who've done research where they shut a computer down, take the RAM out, put it in another computer, and actually read what was in your computer's memory. It's called the cold boot attack. And the reason it's called a cold boot attack is that RAM contents degrade more slowly if the memory is cold. So they take one of those spray cans, turn it upside down, and freeze your memory, essentially. Make it really cold. Then they shut down the computer, pop the memory out, pop it in a new computer, and read out all your data. So physical attacks are super cool and interesting. We're going to confine ourselves to digital attacks. You can read the source code or binary. Yeah, you may be able to, it depends. But oftentimes you can: if you're local on a machine, especially if it's you running the application or you can see the application that's running, you'll know exactly what application is running and you'll have access to at least the binary. Yeah, I'd still call that physical. But yes, you can often do that, and you could maybe use some weird BIOS stuff. Yeah. Switch users on an operating system. Yeah, maybe you'll be able to try it with different users, right? If I'm on a local system and I have multiple user accounts, maybe I can switch between user accounts. Maybe I can try to crack the passwords of the other users on the system to get somebody else's account. What else? Yeah, what's next?
Databases, accessing a database. You may have access to more data on a local system versus a remote system. When you're local, you can maybe run multiple instances to see if there are any of these multi-threading interleaving problems. Yeah? So for remote, are we assuming we don't have full remote access through a remote terminal? Correct, that's local. We're considering that local because we're locally on the machine. The terminal is remote, but the application sees us as if we were local. We have all the privileges we would have if we were local, except for physical access. So the demo is when you keep it. Wait, can you say it again? So remote is just when you send packets? Yes, remote would be a remote service, like a running DNS server, right? Yeah. All these things, why are you saying that you can't do them when you're remote? It should be fine, even for something like switching users. Can you switch users on my system? So if I'm running an HTTP server here, how can you get me to switch users? You can switch users all you want on your machine, but how can you get me to do it? Suppose I'm root... But you're remote, you're not on my local machine. You have no access to my local machine, right? No user account or anything. If you have admin, then you can get in, and you're local at that point. But you're assuming we're a user on the system who doesn't actually own the system? Yes, that's another thing to think about, right? If we're administrators on our system, we are the security, right? We own everything. We may want to try to find bugs, but we would do that by pretending to be other users, right? Say, what could this user find? Yeah, good points. But the other thing to remember is, if I'm offering a remote service that you're just accessing over TCP or UDP, and you do not have a user account on my system, you're not local, right?
So you can't change files on that system, for instance. Unless maybe you can, right? Maybe I'm running an HTTP server on there too, and you can use that to drop some files on there, but if you're purely remote, you have a lot fewer capabilities. Let's see some of them. So when we're local, we have more vectors to get into the program, right? We have our local interaction, where we can give input to the program. We can maybe tamper with the file system. We can maybe create another process to try to do some remote procedure calls to that process. We may be able to mess with the environment, right? And this is very key; a lot of people get this distinction muddled in their heads. If we are remote, we can't do any of those things. We don't have access to the system. We cannot force anybody to do something for us. So we can't add or delete a file. We can't give direct command line input or command line arguments. We can really only send input to the system over the network. If you're remote, does there have to be an application listening on a port so you can do anything to the computer? Yes. Or, not only that: you can think of a case where maybe an application is running that goes and fetches data from another system and uses that data, so that would be input. It's kind of a silly example, but let's say an application is taking in Twitter feeds, right? So you can put in your Twitter username, and it's taking your input and pulling it into the application. So there maybe you could take the application over, even though it's not necessarily running a server and listening on any port. Nine times out of ten, when we talk about a remote application we're talking about a server that's listening on a port. Using cross-site scripting, we can delete data, right? So would that be a remote attack or...?
It depends, we'll get to that later. So local attacks are usually easier because we have more information and we can control more things. We can control the environment. We usually know exactly what application is running, so we can use that information to find vulnerabilities. Whereas remotely, we only get access over the network. A key subset of remote attacks is unauthenticated remote attacks. This is an attack that we can perform, or some kind of vulnerability that exists, before we ever try to tell the server who we are. So if there's some vulnerability before we log in with a username and password, this means anybody who can send these packets to the server can exploit that vulnerability. Right, it's not just us; you don't have to have a user account on the system first. And the goal with remote attacks is usually that we want to be able to execute as that remote application, right? If we can control that remote application, we still can't do everything. We only have the permissions of whatever that remote application has. So in essence, another way of thinking about this is that a remote attack allows you to transition into a local attack, because now you're locally on the system with the permissions of the web server, or whatever user the server is running as. Then you may be able to use a local attack to escalate up to root or the administrator of the system and go forward from there. So usually this is how attacks will go: attackers get local access through the remote attack, then they use that to get more privileges, and then they spread out. But in general, remote attacks are much more difficult to perform, because you don't even know what's there, you don't even know who you're talking to. So this is why reconnaissance on the network becomes so important. You want to know who you're talking to, because maybe you already know about a vulnerability.
Okay, so we need to think about the life cycle of an application so that we can walk through all the steps and understand how to break them. And we're going to focus here, as we'll see. So the developer, the author, writes the code in a high-level language. Then what happens? Yeah, so it depends. There's a big fork, depending on whether it's a compiled language like C or an interpreted language like, I was going to say Java, no, it's not really. Yeah, like Python, right? So we take a C example. The application is translated. Hopefully some of you have been taking compilers; this will make all of what we're doing a lot easier. But compilation is essentially taking this programming language and producing an executable, and it's saved to a file. And of course there are differences between interpretation and compilation, which we'll see in a second. So, how does this file get turned into code that runs on your computer? There is no magic. Destroy all notions of magic. What was that? It's turned into assembly. What was that? It's turned into assembly. So it gets turned into assembly. So that's the translation. It gets translated into an executable form. On Windows these have the extension .exe. Linux doesn't have a specific extension, so you have to go by the format: it's in the ELF file format. We'll go into the details of that later. The application is somehow, and it's not magic, which we'll look at, loaded into memory. It starts executing, right? The operating system allows it to execute. So the application starts running, and then the application terminates and it goes away. So if we look at interpretation really quickly, the idea is that the program we want to execute is usually kept in its original programming language, like your Python .py file. And to run that, what do you need? An interpreter, which is what?
Thank you much for all of this. A program that was compiled. Yes, it's another program that was compiled. It's an executable file, just like other programs that you write. There's nothing special about the python program versus the cat program versus the bash program versus the ls program. These are all programs that were written in some language and compiled down to some executable form, which we'll look at. So Python is special, or we think of it differently, because its job is not to do anything by itself. Its job is to do what? Interpret a Python file, right? So it takes Python source code as input and executes each of the instructions according to the Python specification. Of course, there are all kinds of details; maybe it's not going line by line through the Python. There are oftentimes intermediate translations, so the .pyc files are the Python byte... is it byte? I think it's considered bytecode. And each instruction is parsed and executed one at a time. That's the main difference here. And one of the interesting things that we'll come back to, which will be a theme, is that in a lot of interpreted languages it's easy to generate and execute code dynamically. So what does this mean? You could take user input, transform it into code, and run that if you want. Isn't user input just any kind of string? A string, yes, that's the key, right? So the program can take a string that it has dynamically created, from reading a file or getting data from the user. It can take that string and turn it into code and execute it as if it was written right there in the file. So these are usually the eval family of functions. Python has it, JavaScript has it. They're all here in the email, but we won't talk about that. Compilation: so what are the steps in the C compiler at a high level? What happens first? So we want to... Isn't there some sort of a pre-processor?
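A minimal Python sketch of the "string becomes code" idea follows. The function names `unsafe_calculate` and `safer_parse` are made up for illustration. `eval` is the built-in from the eval family mentioned above, and `ast.literal_eval` is a standard-library function that only accepts literals, shown here as one possible contrast rather than as a complete defense.

```python
import ast


def unsafe_calculate(expression):
    """Evaluate a user-supplied string as Python code.

    Whatever the string says gets executed with the program's privileges.
    """
    return eval(expression)


def safer_parse(text):
    """Parse a user-supplied string as a Python literal only.

    Numbers, strings, lists, dicts and similar literals are accepted;
    function calls and attribute access are rejected with ValueError.
    """
    return ast.literal_eval(text)


if __name__ == "__main__":
    # Intended use: the string is simple arithmetic.
    print(unsafe_calculate("2 + 3"))

    # Unintended use: the "input" is code that imports a module and calls it.
    print(unsafe_calculate("__import__('os').getcwd()"))

    # The literal-only parser accepts data...
    print(safer_parse("[1, 2, 3]"))

    # ...but refuses the same code string.
    try:
        safer_parse("__import__('os').getcwd()")
    except ValueError as error:
        print("rejected:", error)
```

The second call is harmless here, since it only reads the current directory, but the same mechanism would run any other expression an attacker chose to send.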
Yes, so the first thing is a pre-processor, which does what? Makes it most of it. No, it makes it more technical. The pre-processor gets rid of all those pounds, right? So the #define, or the #include, which literally copies the code into there. So the first thing that runs is a pre-processor that gets rid of all the macros and evaluates all the flags, all the defines; everything like that is handled in the pre-processor. Then, at a high level, now we have a C program, and we're going to turn that into assembly code, right? So we turn that into some architecture-specific assembly. So what are some assembly languages? MIPS. MIPS. Let me see what the first one is. x86. x86. ARM. SPARC. AMD64. AMD64, yeah. So we're going to focus in this class on x86, but everything that we're going to learn is applicable to all these other architectures. You just have to keep that in mind as we go through them. I feel like x86 goes back to DOS. Is that where the name comes from? I thought every DOS machine was using x86. I think it has to do with the processor. I think it's the 8086 that it was the instruction set for. And future pieces of hardware basically decided to be backwards compatible with that, so they supported the same assembly language in later chips, and it's stuck throughout time because nobody wanted to rewrite the applications. I don't know how much it has to do with DOS, but the really interesting thing now is that even if the programs are x86, inside the processor it translates the x86 instructions into microcode, which is what it actually executes. So there's even another layer of abstraction below x86. We won't even get into that. So you can actually, if you've never done this before, see what assembly your program will turn into.
You can run gcc -S on your C code and it will output the assembly for x86 or whatever architecture you're running on. -m32 is going to be a super important option for everything we do here, because this will compile in 32-bit mode even if you're running on a 64-bit operating system. All right. Let me get this off here.