All right, folks, can you hear me? Yes, I see some thumbs up. Awesome. So we're going to try something new and different. I'm on travel, in a hotel room right now, but we're going to try to teach a class. Gokul, can you mute yourself? I'm hearing feedback and it's really weird. Or maybe I can do that? Oh, cool, I can mute you. Okay. So what I need you to do is wave if you have a question. The people in the front row, I can see you very well. Yeah, exactly like that. Then I'll unmute you so I can answer your question. Are we good? Cool.

How's the assignment going? Do you see everything coming together? All of the bandit levels and all the exploitation material we've been learning coalesce in this one assignment, so it really is the crux of the class. If it wasn't brought up already, this is basically the final assignment: there's this assignment, a final, and we're done. Don't all start clapping at once.

All right, let's get started. Hopefully you all viewed the lecture that was posted yesterday, because we're going to go right on from there. If you didn't, you should definitely go back and watch it. That lecture was about x86 assembly code, looking at how binaries actually get executed. I know in your previous classes you may have looked at MIPS binaries, but we're looking at real x86 binaries because, as we'll see, when we get to memory-corruption vulnerabilities you really need to understand what's going on to be able to exploit them. What we're going to talk about now is different types of attacks. This is a slide that I take from my grad class.
That's 545, and to give everyone an idea of what we talk about at that level, this is a very broad range of the types of vulnerabilities we cover. This is an undergrad class, so we're going to touch on what I think are three of the most interesting and cool types of vulnerabilities and how to exploit them. We'll dig in here and go over them, and this will help with assignment six. Any questions? All right, cool.

Okay, so we'll first look at file access attacks. I have to do some kind of interaction, I can't just talk all the time, so: what are some ways that we can take over an application and trick it into doing something it's not supposed to do? Can you say it louder? Sorry. Beat someone up? Oh, he said be someone else. Trying to be someone else, not beat someone up. Yeah, that makes more sense. Although from this class I would expect either answer, really. So, trying to impersonate someone, which is essentially breaking the authentication of the system: tricking the system into thinking that you're somebody else. What else? Yeah, in the back. Manipulate the stack. What that boils down to is injecting data into the system. If we can corrupt and change the memory or the code of the system, then maybe we can take it over and make it do something it's not supposed to do. We'll get into specifically what that means and how to do it. But the way to think about that is user input. It's really important to look at a binary, or any kind of application, and ask: what are all the different ways that I, as an attacker, can get input into this application?
Because if I can get input into this application, then maybe I can use that input to trick it into doing something it's not supposed to do. So what are all the different ways that we get input into an application? As attackers, what are our capabilities?

Arguments? Yeah, the command-line arguments, so argv. We talked about this earlier. Even when we run a program from the command line and type ls, what does the shell do? The shell uses the PATH variable to figure out which ls we're talking about, and then calls execve on /bin/ls, passing along whatever arguments we typed. So fundamentally, when we execute another program, we control argv: we control all the arguments that get passed into that program. That's great, what else?

Standard input. Yeah, that's a great one, and it's the one we're most familiar with. We run an application, we give it input, and it processes that input. You've done this yourselves in the secure house examples. In homework two, you had to give input to other people's secure house applications to get them to crash. That's an instance of attacker-controlled input causing an availability problem in an application. Great, what else?

Like fopen, opening a file? Files, right. So why do attackers control files? Or when can attackers control files? Is it all the time, every single file? I'm sorry, I didn't hear you. If you make the file, then you control the file, so a program has to be careful about which files it reads, because it might read a file made by the adversary. Yeah, perfect, thank you. And shouting is appreciated.
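As a rough illustration of the shell example above, here is a minimal C sketch of launching a program with an argv and an environment chosen entirely by the launcher. The helper name run_with_inputs is hypothetical. The explicit environment goes slightly beyond what was said in class at this point; it appears only because execve takes it as its third argument.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `prog` with an argv and environment chosen
 * entirely by the caller, the way a shell (or an attacker) would.
 * Returns the child's exit status, or -1 on failure. */
int run_with_inputs(const char *prog, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the new program sees exactly this argv and envp. */
        execve(prog, argv, envp);
        _exit(127); /* only reached if execve itself failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv such as {"ls", "-l", NULL} and the program /bin/ls would mirror what the shell does after it has resolved ls. Nothing in the launched program gets a say in what those strings contain.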
So if you can convince an application to read a file that you, the attacker, either created or control, then that is another way to feed input to the application. As an attacker, we may be able to control which file is opened, by altering the path to the file, or we may be able to control what the contents of the file are. In all of these ways, and again this depends on how the application actually works, we may gain the ability to control the application. That's why I'm reiterating this from the last lecture: understanding all the different ways that input gets into a program is incredibly important. This exact same mindset will help you on assignment six. Look at each of those applications and ask: what's going on here? How does it work? That's the reconnaissance phase of breaking the bank. Then you need to understand exactly where your data feeds into the program, and how you might use that to subvert its behavior. Cool.

Somebody mentioned fopen. A program technically can't touch files by itself. Who can actually open, read, and write files? The operating system, through system calls. Let's see if I can bring up my terminal. Can everybody see the terminal, even in the back? Thumbs up in the back, thank you. Okay, so we look at the open function. This is the open syscall: we call open and pass it a pathname and flags. Have you all written applications that open files before, a C or C++ application? Yeah? How do you do that? Use the fstream library? Use what? The fstream library to do what? Well, with a path string.
Yeah, I'm asking: to do what? To open a file, right? So literally, you're coding a C or C++ application. How do you open a file? With a path string? Yeah, you give it a path, as a string. Oftentimes an application hard-codes that string, so it opens a specific file based on a string we already know. Let's see what that looks like. I guess we can pull up Emacs. We'll do a lot of demos. Is this big enough? Can you read that in the back? More or less? Okay.

So we want to call open to open a file. What file do we want to open? Somebody give me a file. Man, that was lame. How about /home/root/grades? Everyone likes grades. Then come the flags, and we want to open it for reading. There we go. This gives us a file descriptor back.

Can you make the font larger? The font, okay. That's tricky, I don't remember how to do this in Emacs. Is Max there? Does anybody else remember how to make the font bigger? I had a function for it. There we go. The solution is obviously to execute some Lisp code and change this to something like 22. How is that? Better? Okay, cool.

So we have this line of code here, and it's a natural line of code to write. Let's go through some examples. When we make this open call and pass in this path argument, do we know exactly which file is going to be opened? Yes. Why? Because the path string specifies a specific file. Right, because it's an absolute path: /home/root/grades. Either that file and those directories exist or they don't, but we'll always access that same file. How is that different from this second example, where the path is just grades?
One's relative and one is absolute. Right, so what does that mean? Can you explain? A relative path is interpreted from the perspective of the directory you're in: say you're in the directory home, then whatever path you give has to be from the perspective of home. An absolute path is from the perspective of the root, which on a Linux system is slash. Right, perfect.

So the important thing is that a relative path is always in reference to your current working directory, and every process has its own current working directory. Normally, when you execute a program, its current working directory is whatever directory you launched it from. So when you open something like this, grades, it looks in the current directory to see if there's a file called grades, and if there is, it opens it. That's the point of a relative path: it's resolved based on where the process is currently executing. Whereas the absolute path is always going to get the same file.

So as attackers, when we're attacking a setuid binary, who controls what directory the program executes in? The attacker. Yeah, the attacker, perfect. I wanted you to yell it louder, really get into the spirit of things. Just like we control the arguments that get passed into a program when it executes, so we control argv, we also control what directory the program executes in. That means we control where the program looks for the file. Perfect.

Second thing. Let's say I'm writing a function here, call it grade_student, and I'm passing in a name parameter. You can all see this, right?
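To make the relative-path point concrete, here is a small sketch, not the code from the demo. The helper open_grades_from is hypothetical: it plays the part of whoever launches the program by picking the working directory first, and then performs the same relative open of grades.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: choose the working directory the way the person
 * launching the program would, then open the relative name "grades".
 * Returns a file descriptor, or -1 on failure. */
int open_grades_from(const char *working_dir)
{
    if (chdir(working_dir) != 0)
        return -1;
    /* No leading slash: the kernel resolves this name against the
     * current working directory, whatever it happens to be. */
    return open("grades", O_RDONLY);
}
```

The open call is identical every time; which file it reaches depends only on the directory that was chosen beforehand. With the absolute path /home/root/grades, the chdir would make no difference.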
So now, rather than having my grades all in one file, because I'm very smart, I'm going to give each of you an individual grade file based on your name. Let's say grades is now a directory, and inside it there's a file named after you. How would I open the proper file? Concatenate the string with name? String concatenation, exactly, perfect. My C is slightly rusty, so let's see. We'll use strcat. We take /home/root/grades, then a slash, then name, and that gives us the path. Is that right?

This is why we always like our man pages. We can look here and it says strcat appends the source string to the destination string, overwriting the terminating null byte. Oh, and it warns about buffer overruns, which are a favorite avenue for attacking secure programs. That's true. So strcat doesn't give me a new string; it writes into a buffer I have to provide. What gives me a fresh string to work with? Anybody know? strcpy? Yeah, all right. This is why it's hard to code correctly in C.

Let's do char *path = malloc(...). We want to do this correctly. We'll also have a char *dir that holds the directory. We want a buffer whose size is the string length of name plus the string length of dir, and then we need a plus one for what? The null byte. Yes, perfect. So now we have enough space for the path. Then we first strcat the directory onto path, and then we strcat name onto path. Strictly speaking, malloc doesn't zero the buffer, so path has to start out as an empty string, for example by using strcpy for the first piece, before strcat is safe. Now we should have a string that is the directory concatenated with the name, with enough bytes for everything. And now we open path.

Should we verify that this is correct? No? You're all happy that this code works, everyone's 100% clear? Okay. The first thing to ask yourself is whether this is natural code. Another way to do it would be sprintf, so string printf.
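Written out, the function dictated above might look roughly like the following sketch. The names build_grade_path and GRADES_DIR and the trailing slash on the directory are assumptions made for illustration. The buffer is explicitly set to the empty string before strcat, because malloc does not initialize memory. Note that this version deliberately keeps the property under discussion: name is used without any validation.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/* Directory from the lecture example; the trailing slash is an
 * assumption so that dir + name forms a valid path. */
static const char *GRADES_DIR = "/home/root/grades/";

/* Returns a malloc'd "<GRADES_DIR><name>", or NULL on allocation
 * failure. The caller frees the result. */
char *build_grade_path(const char *name)
{
    size_t len = strlen(GRADES_DIR) + strlen(name) + 1; /* +1 for '\0' */
    char *path = malloc(len);
    if (path == NULL)
        return NULL;
    path[0] = '\0';            /* malloc does not zero memory */
    strcat(path, GRADES_DIR);  /* path = "/home/root/grades/" */
    strcat(path, name);        /* path = "/home/root/grades/<name>" */
    return path;
}

/* Opens the grade file for `name` read-only; returns an fd or -1.
 * No validation of `name` is done here. */
int grade_student(const char *name)
{
    char *path = build_grade_path(name);
    if (path == NULL)
        return -1;
    int fd = open(path, O_RDONLY);
    free(path);
    return fd;
}
```

For a name like alice, build_grade_path produces /home/root/grades/alice, which is what the developer has in mind.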
But is this something that people normally do in programs? Yes. Why? Because we never know what the user will give as input. Right, our programs often do things based on user input. Even something like a SQL server: depending on which table or database somebody wants to access, which is basically user input, it may need to read from a different file. What are other examples where you choose a file based on user input? A saved game? Yeah, a game takes input from the user in order to open the right saved-game file. What else? A word processor? Yeah. This is really fundamental to everything you do: you're opening files based on user input.

So looking at this code, let's think about the developer's intention at this open statement. What files did we want this program to open? Grade files, but specifically where? In that path, as files under that directory. You can see the developer's intention: only open files inside /home/root/grades, chosen by the person's name. So what's the problem? Does that constraint actually hold? No. Why not?

Here's another important thing to think about. You're an attacker looking at this function. What do you control? Do you control the dir variable? No. Why not? It's static? Yeah, it's hard-coded as this string. You fundamentally cannot change this variable.
So what do you control? You control path, but how? It comes from name. Yeah, from name. You as an attacker control name, which is the input to this program. And what does the program do with name? It concatenates it onto dir, essentially: first it puts dir into path, then it appends name, and then it calls open on the result. So as an attacker, by controlling name, can we violate the developer's assumption that we only open files inside /home/root/grades? How? With ../../? Yeah, but what exactly do you want to input? You're the attacker. Dot-dot-slash three times, and then etc/shadow. Okay, so the attacker's input for name is the string ../../../etc/shadow.

Let's walk through this in a comment, like pseudocode. Can you all see this comment? I know the colors aren't so great. We know dir is this value. We know path gets enough bytes for the string length of name plus the string length of dir plus one. At this point path is meant to be the empty string. After the first strcat, path is /home/root/grades/, and then we concatenate name onto it. So path will be /home/root/grades/ and then what? What happens to these dots? Do they get mixed up with anything? They're just appended. Right, just appended, so path is /home/root/grades/../../../etc/shadow. Is this allowed by the code we're looking at? Is anything preventing us from putting these dots in there? No. It's fundamentally just a string. What is a string? A sequence of bytes followed by a null byte, a zero. And we didn't violate that at all.
And then when we pass this path to open, what does the operating system do? What file does it open? That file. Yeah, it takes this path and resolves it to the actual file. It goes: home, root, grades, okay, up one, up one, up one, then etc, then shadow. Great, here's the /etc/shadow file. Did the developer ever intend for a user to be able to open /etc/shadow? No. But as an attacker, we're able to violate the developer's assumptions and make the program open whatever file we want. This is the root cause of a lot of file access vulnerabilities.

Sorry, I need to make you small so the screen shows up in the recording. Question: is this why usernames are so often not allowed to contain slashes? Do you mean the usernames on our system, like for homework six? Or for any system? I can't say that's the only reason, but yes: on most systems your username maps to a home directory, so I think they're very conservative about which usernames they allow. Actually, every year somebody manages to create a username that's weird, that Unix doesn't like, and that breaks my scripts. When I was creating the homework six server, I had some code in there to replace certain characters with underscores, and somebody found a new one. I think it was a slash, in fact. Somebody had a slash in their username and that broke stuff, so I had to change the slash to an underscore, and then everything worked.

So this is colloquially known as the dot-dot attack. And the reason we're talking about it is that, as we just said, a lot of other applications build up paths and open files based on user input.
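Going back to the walk-through of home, root, grades, up one, up one, up one: the sketch below mimics that resolution purely on the string. The helper collapse_dots is hypothetical and ignores symbolic links, which the kernel does follow, so it is only an approximation of real path resolution. A real program would more likely call realpath from stdlib.h.

```c
#include <string.h>

/* Hypothetical helper: lexically collapse "." and ".." in an absolute
 * path, roughly the way the components are walked during resolution
 * (symbolic links are ignored). Writes the result to out; returns 0 on
 * success, -1 if the input is not absolute or does not fit. */
int collapse_dots(const char *path, char *out, size_t size)
{
    if (path[0] != '/' || size < 2)
        return -1;
    size_t len = 0;                 /* length of the result so far */
    const char *p = path;
    while (*p != '\0') {
        while (*p == '/')           /* skip separators */
            p++;
        size_t n = strcspn(p, "/"); /* length of the next component */
        if (n == 0)
            break;
        if (n == 1 && p[0] == '.') {
            /* "." stays in the same directory */
        } else if (n == 2 && p[0] == '.' && p[1] == '.') {
            /* ".." drops the last component; at the root it stays put */
            while (len > 0 && out[--len] != '/')
                ;
        } else {
            if (len + 1 + n + 1 > size)
                return -1;
            out[len++] = '/';
            memcpy(out + len, p, n);
            len += n;
        }
        p += n;
    }
    if (len == 0)
        out[len++] = '/';
    out[len] = '\0';
    return 0;
}
```

Feeding it /home/root/grades/../../../etc/shadow yields /etc/shadow: each dot-dot removes one of grades, root, and home, and the remaining components are appended at the root.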
It's not just the C programs that we looked at, it's all kinds of programs. Websites do this a lot. Have you ever uploaded a picture to a website? It has to store that image data somewhere. What if it used your name to decide where to store it? You could potentially overwrite other files on the web application server. So this comes up in a lot of different contexts.

Here's another example on the slide, which we basically already did. By inputting as many dot-dots as we need, we can escape the directory that the developer wanted us to stay in and access any file in the file system. The other name for this type of vulnerability is a directory traversal attack. Why? Because we can move to arbitrary directories at will. Cool. Questions on this?

So how do we defend against this? How do we change this code to make it safe? Add an if condition to check whether the string has any slashes or anything like that? Yeah. In C I think you could use something like strtok, though I can't remember exactly how you'd check, but we can write it in pseudocode. First, which characters don't we like? Dollar sign? Why don't we like dollar signs? Because environment variables could get injected into your code. True, but we're talking about paths right now, so let's ignore that. For now let's just say: if there's a dot or a slash in name, return null or something. This is not valid C code, but you get the idea: if there's any dot or any slash in name, then return. Essentially, what we're doing here is blacklisting the characters we think are dangerous for name.
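The pseudocode check can be turned into valid C in a couple of lines. This is a sketch using the standard strpbrk function rather than the strtok mentioned in passing; the function name has_forbidden_chars is made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Blacklist check: returns true if name contains a '.' or a '/'.
 * strpbrk returns a pointer to the first character of name that
 * appears in the second string, or NULL if there is none. */
bool has_forbidden_chars(const char *name)
{
    return strpbrk(name, "./") != NULL;
}
```

A caller would test has_forbidden_chars(name) before building the path and bail out when it returns true. The check is only as good as the list of characters in that second string.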
So we know that name is used in open, and we're saying: let's block any dots or slashes in name. Is this safe? Safer, sure, and being safer is always better. But is it the best solution? What's another way we could do it? Here we're blacklisting characters that we know to be bad. You could specify what is allowed instead? Yeah, we can whitelist: specify which characters are good. If I were writing a regular expression, I could say a through z, A through Z, zero through nine, so alphanumeric: lowercase letters, uppercase letters, and digits. I could check name with that regular expression, or in C I could write a loop that checks whether every character of the string is alphanumeric, and return if one isn't.

So what's the fundamental difference between those two approaches? Oh, there's something wrong with the green right now, you can't see this really nice expression I wrote. Okay. One has more to check than the other? Maybe there's a difference in the checking code in this case, but a blacklist could be just as long. Depending on the specific attack we're trying to prevent, a blacklist could be even more restrictive or have even more characters to check, so I wouldn't call that a fundamental difference. A whitelist is nicer because we know what we want, while we may forget things we don't want? Right. Both can have problems, but it's easier for a whitelist to be correct because you can be very conservative. Can you have dashes in file names? Yeah, you can. Am I allowing that here? No, I'm not. I'm being very conservative in what I accept as valid input.
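The loop version of the whitelist might look like this sketch. The function name is_alnum_name is hypothetical, and rejecting the empty string is an extra assumption that was not discussed. Note that isalnum depends on the current locale; in the default C locale it accepts exactly the letters and digits listed above.

```c
#include <ctype.h>
#include <stdbool.h>

/* Whitelist check: returns true only if every character of name is a
 * letter or a digit. */
bool is_alnum_name(const char *name)
{
    if (name[0] == '\0')
        return false;   /* assumption: an empty name is not valid */
    for (const char *p = name; *p != '\0'; p++) {
        /* The cast avoids undefined behavior for negative char values. */
        if (!isalnum((unsigned char)*p))
            return false;
    }
    return true;
}
```

A name such as my-file is rejected even though it is a perfectly legal file name, which is exactly the conservative behavior described above.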
The other thing is about the blacklist. What we blacklisted before were the characters dot and slash. Are those the only dangerous characters in an open path, on every single possible file system? Probably not. The short answer is I actually don't know. On Windows, for example, I think you can write paths with backslashes, something like \foo\bar\something, and it will open that. So different operating systems behave differently here. This could be a problem for both blacklists and whitelists, but the key point about blacklisting characters is that the blacklist has to be complete. If it's missing even one dangerous character, you can have problems. Questions?

Okay, sorry, I'm just clicking on random stuff. We're going to go to PATH and HOME. We may have already done this before, but if we have, it should be pretty easy. Everybody see the screen? All right, cool.

When I type ls, how does the shell know what program to execute? It searches your PATH for the binary. Yeah. So what command can I run to tell me the exact binary that's going to be run for ls? which. Yeah, which tells us exactly what program will be executed. If we do which ls, it tells us /bin/ls, and if I type /bin/ls, I execute it directly.

Now, as we mentioned, this all comes down to documentation and understanding what's going on. The execve syscall takes a filename as its first parameter, along with argv and an environment pointer. That filename is a path to the program, absolute or relative, and execve uses it as given; it does not search PATH for you. So a bare name like ls won't find /bin/ls.
So anyway, we need some way to avoid typing /bin/ls every time. I want to type ls and have the shell figure out which ls I'm talking about. This is done with an environment variable called PATH. The PATH environment variable specifies a colon-separated list of directories to search for a binary. When I type ls, the very first thing my shell does is look: is there an ls in /usr/local/sbin? Is there an ls in /usr/local/bin? Then /usr/sbin, then /usr/bin, then /sbin, and finally /bin, and that's where it finds it, so it stops and uses that one. It's a kind of short-circuit: as soon as it finds one, it stops and executes it. Great.

Okay, how does that help us? Well, we're writing some code, continuing our amazing C grading program, and we want to write a function that shows the person who runs it their current grade. We'll keep it simple and go back to just one grade file. At this point, and I have the highest confidence in all of you, you can write a program that reads this file and outputs it, right? Roughly how would you go about it? Open the file, then what? Open it in read mode, and then? Yeah, you call the read function to read the file in chunks of bytes, and use write to write those bytes to standard out. You can do that. But this is a Unix system, and one of the core principles here is that you shouldn't reinvent the wheel if you don't have to. Is there any other program on a Unix system that you can reuse, that will show you the contents of a file?
Yeah, cat, right? The cat command. If I go and cat something, I don't know what you want me to cat, let's do cat test, I can see the contents of the file. We'll do it again so it's up towards the top. I call cat, I give it a file name, and it prints hello class.

It turns out there's a great function in libc called system that executes a shell command. So let's try that. We could do open file, read file, write to standard out, which is at least one, two, three, four, five lines of code plus a loop that we have to get right. Or we can replace all of that with a single call to system. Which approach would you choose? It depends on what you're doing it for? Why does that matter? Because you might be trying to be more secure, or efficient in the code you're running, or efficient in the code you're writing. True. I think there's an interesting sort of Occam's razor in effect. If two pieces of code do the same thing and are the same in terms of security, you should choose the simpler one with less code. It's likely to have fewer bugs, and the less code you have, the less you have to maintain. So if I can replace ten lines with one line that does functionally the same thing, that's attractive. Maybe there's a worry: am I running this on systems that don't have cat, or that have a weird version of cat? That could definitely cause problems, and it's something to think about. But if I know I'm going to run this on a Unix-like system, I can be fairly confident that it'll have cat and that this will do exactly what I want without me worrying about it. Yeah? That makes it more vulnerable, because of the cat.
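For comparison, the longer approach mentioned above (open the file, read it in a loop, write to standard out) could look roughly like this sketch. The function name copy_file_to_fd and the 4096-byte buffer are arbitrary choices for illustration, not something shown in the demo.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copies the file at `path` to the descriptor out_fd.
 * Returns 0 on success, -1 on error. */
int copy_file_to_fd(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {   /* write may accept fewer bytes than asked */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
    }
    close(fd);
    return n < 0 ? -1 : 0;  /* n < 0 means read itself failed */
}
```

Calling copy_file_to_fd("/home/root/grades", STDOUT_FILENO) does what the one-line system call with cat is meant to do, without involving a shell or any other program.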
Well, sure, that's why we're building this up right now. I will say there is a way to do this securely. Cool. But let's dig in and get under the hood. Again, remember, this all comes down to what's actually going on. Here we call system, which despite the name is a libc library function rather than a system call. But what does it do? Let's look: man system.

Can you hear me? It says my connection is unstable. Can you hear me now? Okay, my connection was unstable, but I didn't say anything important.

Let's read this nice documentation. We pass system a character pointer, a string. The system library function uses fork to create a child process that executes the shell command specified in command using execl. Note that it says execl, not execve; we'll look at that in a sec. It calls exactly /bin/sh, passes sh as argv[0], passes -c as argv[1], and then passes our command as argv[2]. system returns after the command has completed.

So what if I run /bin/sh -c ls myself? What do you think this is going to do? Create a separate shell and execute ls. Going back to my example, if I pass in cat test, this is going to execute cat test. So how does sh know which cat to execute? The PATH. Just like when I'm sitting in front of my terminal and I type cat test. This is the important thing to understand: calling system is exactly as if you were at the shell prompt, typing cat /home/root/grades. That's exactly what's going to happen.

Okay, so now, if this program is a setuid program, who controls the environment variables of the program when it runs? The attacker. Right. As an attacker, we control that.
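The behavior described in the man page can be approximated in a few lines. This is a simplified stand-in, not libc's actual implementation: the name my_system is made up, and the signal handling that the real function performs is left out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simplified stand-in for system(3): fork a child, then have the child
 * run /bin/sh -c command. Returns the wait status, or -1 on failure. */
int my_system(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* argv[0] = "sh", argv[1] = "-c", argv[2] = command.
         * execl hands the current environment, PATH included, to sh. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127); /* only reached if execl itself failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The path to the shell is absolute, but the command string is interpreted by that shell, so a bare name such as cat inside it is still looked up through whatever PATH the process inherited.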
We control PATH. So what does that mean here? Actually, before we get there, let me reiterate another thing: what did the developer intend? We were the developer writing this line of code. What did we want to happen? To put the grade on standard out? Yes: to execute the program cat, meaning /bin/cat, and output /home/root/grades. But can an attacker violate those assumptions? Yeah. How? Change the PATH so that cat points to a different program. You want to try it out? This will be very simple.

I need stdlib.h, and int main. Okay, there we go. I'm going to call system, and I don't have the grades file here, so let's just do system on cat test for right now. test is the file we just had in this directory. Everyone agree with this code? It's roughly the same as what we had. The text is kind of hard to read? Oh, sorry. How about now? Is it worse? That's better? Okay, high contrast, cool. So everyone agrees this code roughly does what we were talking about before? All right, let's compile it and see if I made any mistakes. Okay, cool. We just ran it, and we can see that it prints hello class.

We can use various programs to understand more about what's going on. strace executes a process and tells us all of the system calls that the process makes. There will be a lot of output here; do not be freaked out. We're going to try to find the important one, the call that runs cat. If we look up at the top, we have an execve of ./a.out. That's us executing the program. Then stuff happens to set up the loader and all kinds of other things. There should be a fork in here. Okay, this does not show what I wanted to show. We'll ignore that for now and figure out what exactly I want. The fork.
Yeah, okay, I want -f, I think, to follow the fork because we're going to fork a child. So let's see if that works. Well, here's the read and the write of hello class. Okay, open test. Where's the execve though? It happens sometimes with live demos. I feel like it's got to be close. There we go, found it, woo, success. Okay, can you see that line? Cool, so this is exactly what happens when we make that call to system, right? This is straight from the documentation. It's going to call execve on /bin/sh. Argument zero is sh, argument one is -c and argument two is cat test. Okay, perfect. And then /bin/sh is going to look up and figure out where cat is.
So simple binary, right? We looked at this code, simple binary. How do I trick it into executing a version of cat that I want it to execute? Change your PATH. Change the PATH to what? Something before the real cat. Our cat before the real cat? Okay, so let's say I had my PATH just be slash. So this is one way that you can run a program and change the environment just while you run it. So this is basically saying, well, let's try to set PATH to nothing. I wonder what that will do. Yeah, okay, perfect. So why did it say cat not found? Right, because there is no a folder here. It has no idea where to find this. And so it can't find cat.
Okay, so we need a cat. What do we want our cat to do? Maybe just say, you were hacked. Is that good? You can't see it. No, no. How about now? So it's a simple shell script that says you were hacked. So we know that if this executes, then we were able to trick that program into executing a program of our choosing and not their choosing. I need to make it executable. So, okay, I can execute that version of cat. It says you were hacked. Great. So now I run a.out. It still says hello class. So what am I doing wrong? You're not setting the PATH. I'm not setting the PATH. So what do I set the PATH to? So I can set, okay, so PATH equals a.
Okay, that didn't work. What about cat? Why isn't this working? How do I fix this? You need to put a directory. I need a directory. What directory do I want to use? Your current one. Which is what? Dot, the current one. Boom. There we go. You were hacked. Right? So why did this happen? It's all amazing. We need to reference the current directory. Yeah, so we changed the PATH to be the current directory. And so /bin/sh, when it executes and it's trying to execute cat test, it says, okay, well, what cat program does this person want me to execute? Or does this program want me to execute? And then it says, okay, I'm gonna look in the PATH. Well, the PATH is dot. I'm gonna look in the current directory. Is there a cat? Yes, I'm gonna call ./cat test. And that's how that gets executed. So we're able to completely change the executable. So if a.out was a set-UID binary, then what permissions would cat be executed with? Yeah, root or whatever. If it's set-UID, whatever the elevated permissions are from that set-UID binary, perfect.
So how come changing the PATH variable didn't change it on your entire system? Like, cat still works? Yes, so there's two ways to change the PATH variable. You can do, let's say, export PATH equals foobar. So now I've changed the PATH using export. I've changed the PATH for all other commands that I'm going to execute here. So I've changed this environment variable. If I say PATH=. and I say, what, env? Oh, yeah, yeah, okay, that's actually good. So env prints out the current environment. So if I look in here, yeah, I can see that PATH. So with this style, you're able to execute one command while changing this environment variable for just that one command. It doesn't affect your PATH once this command is finished executing. Does that answer your question? Yeah, thank you. Yep, okay, PATH there, great.
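The lookup the shell performs here can be sketched in a few lines of C. This is an illustration only: find_in_path is a hypothetical helper, not a libc function, and real shells add details such as caching. It walks a colon-separated list left to right and stops at the first directory that contains an executable with the requested name, which is why putting an attacker-controlled directory first wins.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: search a PATH-style list for `name`.
 * Writes the first "dir/name" that is executable into `out` and
 * returns 0, or returns -1 if no directory has a match.
 * Sketch only: it does not rule out directories named `name`. */
int find_in_path(const char *name, const char *path_list,
                 char *out, size_t out_len)
{
    const char *start = path_list;

    while (start != NULL) {
        const char *end = strchr(start, ':');
        size_t dir_len = end ? (size_t)(end - start) : strlen(start);
        int n;

        if (dir_len == 0)   /* POSIX: an empty entry means the current directory */
            n = snprintf(out, out_len, "./%s", name);
        else
            n = snprintf(out, out_len, "%.*s/%s", (int)dir_len, start, name);

        /* Accept the candidate only if it fit in the buffer and is executable. */
        if (n > 0 && (size_t)n < out_len && access(out, X_OK) == 0)
            return 0;

        start = end ? end + 1 : NULL;   /* move on to the next entry */
    }
    return -1;
}
```

With a list of "." and an executable file named cat in the current directory, the helper would settle on ./cat, mirroring the demo; with "/bin:/usr/bin" it would find the system copy instead.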
And it's not just system. There are other functions: execlp and execvp also use the PATH variable to locate applications. So it's important to look at and to understand, what program is this trying to execute? How is it trying to find that command? And given that, can you alter the PATH variable in order to change what program is executed?
In a similar vein to the PATH environment variable, and I'll do this real quick: when I do cd tilde, how does the shell know what tilde is? Home. It should be an environment variable, and it's specifically the environment variable HOME. But if I set HOME equals /root and I do cd tilde, that didn't work. Yeah, there we go, okay, cool. So by changing the HOME environment variable, I change where cd goes. So again, this is another instance where I have what looks to be something fixed, right? I have a tilde which specifies the home directory, but by altering this environment variable, I can change what that home directory is. So similarly, if an application uses a home-relative path by using a tilde, then an attacker can modify the HOME variable to control execution. So does everybody understand these two vulnerabilities, or have questions about this?
So how do we fix these? You're my expert security engineers. How do I change this code so that it's no longer vulnerable to somebody messing with the PATH? Well, specify the actual path to the cat you want. The full path to the cat I want, which is /bin/cat. So if I do this, now I'm telling the shell exactly what to run, because remember system calls /bin/sh, and we know that if we give the shell the exact program to execute with an absolute path, it has nothing to look up. It doesn't need to do anything else here. So if I do this, compile it down, a.out, hello class, and we'll go back to my vulnerable one. Oh, no, not that. Right, so here we ran our attack by changing the PATH, but we're no longer able to change what program gets executed.
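One way to take the absolute-path idea a step further, sketched here as an assumption rather than as what was typed in the demo, is to skip the shell entirely. execv (no trailing p) runs exactly the pathname it is given, so neither PATH nor shell parsing is involved. The function name is made up, and the sketch assumes cat lives at /bin/cat.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: print one file with /bin/cat, using no shell
 * and no PATH search. Returns the wait status, or -1 on error. */
int cat_file_no_path_lookup(const char *file)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* "--" ends option parsing, so a file name that starts with
         * '-' is not mistaken for an option. The cast is needed only
         * because execv takes char *const[]; cat does not modify it. */
        char *const argv[] = { "cat", "--", (char *)file, NULL };
        execv("/bin/cat", argv);    /* exact pathname, PATH is ignored */
        _exit(127);                 /* only reached if execv failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Running the caller with PATH=. should make no difference to this version, because nothing in it ever consults PATH.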
Cool? There's a question. If you go to the beginning of your program, like the first line, system, and then PATH equals, would that overwrite anything that the user would put as a PATH? Yeah, okay, so the question is whether you can change the PATH from inside the program. Let's see. Actually, I think this would work. I'm fairly certain this will work. So here we're hard-coding a PATH. If we tried to do it up here. They cannot see it. Oh, sorry, let me finish writing this. And nobody likes my color choices, do they?
Okay, so we have two instances here, right? So this first one, this export PATH equals /bin. This is not gonna take effect for the other calls to system, because each call to system forks another process, so the changes to that child's environment don't propagate. I believe that the second one actually would fix it, because we're specifying, hey, for this command use /bin. But of course, here it's kind of silly. We know exactly the command we want to execute, /bin/cat.
To further answer your question, we can use putenv. We can use the function putenv to change the PATH environment variable for our program. So in that case, yes, we can specify it. So we have two ways of doing this. We can test this really quick. What was the, oh, stdlib.h, so if we go, oh, same thing, oh, you're good. So what we can do is we can do a putenv. This changes our program's environment to have a new PATH of our choosing. And then if we do cat, so let's go, and hopefully you can see this here. So by calling putenv, we're changing the environment for our program, and that way we've controlled it and hard-coded it for the rest. So if we try this, yes. So we've been able to fix it here because we're specifying exactly the PATH and we're essentially not using the attacker-controlled PATH. Yeah, good question. Other questions? Nope. Cool. All right, so digging deeper into system.
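The putenv fix from the question above can be sketched as follows. The helper name pin_path and the directory list /bin:/usr/bin are assumptions for illustration; the right list depends on which directories are trusted on the target system.

```c
#include <stdlib.h>

/* Hypothetical helper: replace whatever PATH the caller supplied with
 * a fixed value before any shell is spawned. Returns 0 on success. */
int pin_path(void)
{
    /* putenv stores a pointer to this string rather than a copy, so
     * the string must outlive the call: a static array does that. */
    static char trusted_path[] = "PATH=/bin:/usr/bin";
    return putenv(trusted_path);
}
```

Calling pin_path() as the first statement of main, before system("cat test"), means the shell started by system inherits the pinned PATH rather than the one set by whoever launched the program.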
So applications invoke external commands to carry out specific tasks, right? As we saw, using cat to cat out a specific file. What if we essentially combined the two things that we were looking at? So what if we had a call to system, even with something like /bin/cat hard-coded, but then the file parameter was something we as an attacker controlled? As we saw, system executes by calling /bin/sh -c and then whatever string you pass to it. popen is another one that does this similarly.
And so let's look at this simple example. So here's some C code. We have a buffer here. We're going to cat /var/log/ and then percent s, where percent s is argv[1], right? And so we're going to concatenate, and we're going to put argv[1] wherever this percent s is. So what's the intention of this line? What does the developer want to have happen here? Yeah, to cat a log file. Yeah, so to cat a log file that's given from argv[1], right? So this could be a program to output log files, and it's specifically only supposed to output log files that are in /var/log.
So we already know as an attacker, well, one thing we can do is we can use a directory traversal to cat out any file, right? So how would we do that? What would we put in for argv[1] to trigger a directory traversal? Yeah, ../../etc/shadow, or whatever other file we want. We can get it to output any file we want. But we want to do more. We want to be able to actually execute commands as the root user, not just read files. We want to be able to do arbitrary things. So how can we trick /bin/sh into executing different commands? A semicolon? A semicolon? Why is a semicolon important? Because it ends one command and starts a new one. Yeah, so we could do something like foo; cat /etc/shadow. Okay, let's look at this. Okay, here's that same code. So what kind of command do we want to execute? Let's say we want to delete a file.
Do we like deleting files? Sure, right? We can mess with the system, mess with availability. So what do we input for argv[0] to delete a file? Or argv[1], sorry. So a semicolon, space, rm and, like, the wildcard? A semicolon, space, rm, and then, yeah. Oh, that's pretty vicious, okay. Yeah. Cool, so after this snprintf call, right, what is command gonna be here based on this string concatenation? So command is gonna be cat space /var/log/ and then semicolon space rm star. Everyone agree? That is the result of this snprintf based on this argv[1]. Yes, guys, are you nodding? Perfect, okay.
Then when we call system, why does system end up thinking that we're trying to execute two commands? Semicolon. What was that, louder? The semicolon. The semicolon, why specifically? Why does it even care about the semicolon? You can put multiple commands on one line if you separate them with a semicolon? Yeah, and we know that system ends up calling /bin/sh, which is the shell. So it interprets whatever commands we give it as if we're on the shell. So all the cool, awesome things that we can use on the shell, like semicolons to execute multiple commands, or backticks to execute commands, or dollar sign parentheses, anything we can do as if we were sitting at a terminal, that's what we can use here. So here, /bin/sh gets this string and says, oh, you wanna do cat /var/log/; rm *, great. I know that those are separated by a semicolon. This is two different commands. One is cat, let me find the program for cat, execute that, and then execute rm star. And then it'll start deleting all the files in the current directory. I will not run this because that would be bad. Hello? Okay, cool. And so this fundamentally gives an attacker more capabilities than they would have just by reading files. So what are some other things we can do?
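The string construction just described can be reproduced without running anything dangerous. The sketch below only builds the command string so it can be inspected. build_command follows the snprintf pattern described in the lecture, while is_plain_log_name is one possible allowlist check added here as an assumption, not something the lecture presented as the fix.

```c
#include <stdio.h>
#include <string.h>

/* Mirrors the vulnerable pattern: the argument is pasted into the
 * command string unmodified. Returns 0, or -1 if it did not fit. */
int build_command(char *out, size_t out_len, const char *arg)
{
    int n = snprintf(out, out_len, "cat /var/log/%s", arg);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Hypothetical check: accept only names made of letters, digits,
 * '.', '_' and '-', and reject anything containing "..". */
int is_plain_log_name(const char *arg)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789._-";

    if (arg[0] == '\0' || strstr(arg, "..") != NULL)
        return 0;
    /* strspn counts the leading run of allowed characters; the name
     * is acceptable only if that run covers the whole string. */
    return strspn(arg, allowed) == strlen(arg);
}
```

Passing "; rm *" to build_command yields the two-command string discussed above, and "../../etc/shadow" yields the traversal; a caller that refuses to build the command unless is_plain_log_name returns 1 would turn both inputs away before they ever reach a shell.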
We could read files, we can delete files, what else? Are you guys talking? 'Cause the sound is super weird. Run other programs, run other programs. Wait, wait a second, maybe it's me. Hello? All right, I can see the wave, I cannot hear you. Maybe you'll see this on the video and hear it. It sounds like you're talking from the bottom of an ocean, I don't know exactly what's happening. Okay, so I'm gonna finish up with an example. Okay, I was just gonna ask you what I just said, but I suppose that's on me now. Hello?
So, okay, so finishing up here. So yeah, okay, we can read files, delete files. We can also execute arbitrary commands. We can do really cool stuff. We can have what's called a reverse shell. You can look this up, you can find examples of reverse shells online, where a program can talk to you. Basically, a reverse shell will connect back to you on a given IP address and port, and you can type commands as if you were on the system. Essentially, we could do anything we want. We could drop our SSH key as the user so we can access their system. We can drop PHP files into their web server to execute PHP code. We could access their database. We now have full code execution as that user. So we can do all kinds of super cool stuff. Change their password to what we want. I think we can do that. No, maybe you have to know their password. But anyways, fundamentally, we can do all kinds of cool stuff.
So I wanna show you one super cool example of this, and I'll get you done in time. So in 2014 there was a bug discovered in how bash processes environment variables. And essentially the idea was that a bash process can pass through environment variables not just data, but also function definitions. And it turned out that if you set an environment variable whose value started with parentheses followed by a function definition, bash would essentially interpret that and evaluate it as code.
And it was executed by bash. So this seems like a weird behavior, but fine, like, I don't know what's really the security implication here. So it turns out that you can execute arbitrary code by appending commands after the function definition. And so for instance, if you had a limited-access SSH account, so anybody use git over SSH? Yeah, so git is literally giving you SSH access to their system, and they're setting it up such that your SSH access is limited to only be able to essentially talk to git. But as part of SSHing in, you can set environment variables. So environment variables are set based on the user for that command. So what you could do is you could SSH to their system and set an environment variable that would execute bash code on their system as that user. So you could pop any of these restricted-shell SSH accounts. So this was the command: the original command got put into an environment variable. You could cat /etc/shadow, you could kind of do whatever you wanted in here.
The other area where this was highly vulnerable is web applications. So specifically CGI-based web applications, which there were still a lot of at the time, basically pass HTTP request data from the user to the application through environment variables. And so you could execute arbitrary code on a web application simply with one web request. So this is actually Shellshock. I think it was one of the first instances of, like, a vulnerability with a name and logo and website and all this stuff. So you can go look this up. The other super interesting thing is that this bug was present in bash for over 20 years and nobody figured it out. It took literally that long for somebody to realize, oh wait, there's this weird functionality that nobody uses that actually lets you get arbitrary code execution. So there are insane bugs still out there, like these command injection style vulnerabilities.
The other thing I wanna say before I let you go is that these types of command injection vulnerabilities are incredibly important. They still occur in almost all types of applications. So we're looking right now at binaries, but they still exist in web applications, PHP applications, Ruby on Rails. Basically every type of environment where you can shell out and execute commands on the shell is potentially vulnerable to these types of vulnerabilities. So it's super important to look at that. All right, I got four minutes left. Gokul, if there's any questions, you'll probably have to chat them, 'cause I can't hear anything you guys are saying. Any questions? I'm sorry? When are we getting our test grades? Test grades. The test grades? Test grades. Oh, that's really weird. Probably by next week. I think by next Tuesday. Yes. I'm sorry. Same question? Yeah. Okay, I assume everything's going good. I will see you all on Tuesday. Thank you.