Alright folks, let's get started. Sorry about the screen. Oh, there is class next Monday. It's not a holiday, in case you were thinking that. What holiday would it be? Presidents' Day. Do we get the day off? You do not have the day off. Okay, there's still class. And then next Wednesday, not this coming Wednesday but the 21st, we're having an in-class CTF.
Okay, so let's continue where we left off on Wednesday. We were talking about the classic directory traversal attack, the dot-dot attack. Can somebody give us the high-level notion behind this attack, or vulnerability? Getting the password file, something like that, sent back to the user. Mm-hmm. By going back up the file system. Right, so the idea is that if user input is used to open a file, the user can add dot-dot components in order to traverse to any directory the attacker wants. You'll see this a lot. This is a small snippet of C code where we're building a path: we append the name of some file from the user to the path, and then we open that file. So if the user controls that path, they can fundamentally get you to open any file on the system.
And we talked about how to sanitize this. We talked very broadly about two different approaches. The first approach we talked about was a blacklist approach, where basically you define which characters are not allowed in user input. So you would say: aha, I know that the dot character is not allowed, therefore I'm going to disallow all dot characters. That may or may not make it secure, depending on how that input is being used in some other context.
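The traversal and the two sanitization approaches can be sketched in a few lines. This is only an illustration, not the C snippet from the slide: the base directory `/var/www/files` and the function names are made up for the example, and `posixpath.normpath` stands in for what the kernel does when it resolves `..` components (ignoring symbolic links).

```python
import posixpath
import re

BASE_DIR = "/var/www/files"  # hypothetical directory the program means to serve from


def naive_path(user_input: str) -> str:
    """Join user input onto the base directory, then collapse any '..' components."""
    return posixpath.normpath(posixpath.join(BASE_DIR, user_input))


def is_allowed_name(user_input: str) -> bool:
    """Allow-list check: accept only a-z, A-Z and 0-9, so '.' and '/' can never appear."""
    return re.fullmatch(r"[A-Za-z0-9]+", user_input) is not None


if __name__ == "__main__":
    print(naive_path("report"))               # stays inside BASE_DIR
    print(naive_path("../../../etc/passwd"))  # escapes BASE_DIR entirely
    print(is_allowed_name("../../../etc/passwd"))
```

With the allow-list in front, the traversal string is rejected before any path is ever built, which is why it is easier to reason about than a list of forbidden characters.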
A better approach is to say: okay, this is user input, and it should only be alphanumeric characters. Then you know you're set and nobody can do a dot-dot attack, because all you're allowing is lowercase a through z, uppercase A through Z, and zero through nine.
Another trick, and I don't think I mentioned it, is chroot. chroot is, I believe, a system call to the operating system. You tell the operating system: I want to change things so that my default root, my slash directory, is actually a subdirectory of the overall system. This is actually how I run the code that each of you submits to the submission server. I create a chroot, you get your own directory, and that's where your code is executed. So your code thinks that directory is the slash directory and it can't go out of that, whereas your slash directory is actually something like /var/chroot and then a random ID for your chroot. That's why, unless there's a vulnerability in the chroot system call, you can't jump out of there and get access to my files or the rest of the operating system. So that's fun.
Other types of file manipulation attacks that we can talk about. We talked about this at length: as we are executing an application, we control the environment that it executes in, especially when we're talking about running local applications for a local attack. So we can modify the PATH and HOME environment variables. What can we do with these? What would be a situation in which modifying the PATH could allow us to control a program? You would change the path, for example from /usr/bin to another directory that actually contains a modified ls and other modified files. Then when other users run ls, they're actually running the modified ls. So how do you change their path? Can you just change another user's path?
I'm just changing the path. But you can only change your own environment's path, right? So then how does that vulnerability manifest itself? With a symlink. There would be a symlink. Now, but if you can make a symlink in /usr/bin, you already have access; you can put whatever you want in there, right? I know. You could just drop your own binary into /usr/bin.
Let's talk about assignment one for a little bit. Those of you who did the extra credit, did you implement gzip from scratch? No. The LZ encoding? No. How did you do it, if people would like to share? I piped the output to gzip, stored it in a file, and then sent that file. So you're using a different program on the server in order to do your work, right? And this is kind of the Unix philosophy: you write one program that does one thing well, and then other programs can use it to do whatever they want to do. So in this case, if you want to just output the contents of a file, yeah, you could write something to read it and output it, or you could just call cat. Or if you're looking for, let's say, a certain phrase in a file, you could write a script to go through and look for it, or you could use grep, right?
And so when you use something like system, the system library function, or I believe popen does this as well, and you say popen ls, how does it know which ls to open? Or if we say gzip, how does it know which gzip program to actually execute? The path; it looks in the PATH. And so if you're executing a setuid binary that's running as root, and you control the PATH, then if that setuid binary calls ls using system or something else that uses the PATH, you can trick it into executing whatever you want, and that program will then run as root. So you can pretty much fundamentally get it to do whatever you want.
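To make the lookup concrete, here is a small sketch of how the first matching directory on PATH decides which `ls` gets run. It uses Python's `shutil.which`, which performs the same kind of search a shell does. The "attacker" directory and the fake `ls` are invented for the illustration, and the fake is only located, never executed.

```python
import os
import shutil
import stat
import tempfile


def first_match_on_path(command: str, path: str):
    """Return the executable a PATH-based lookup would choose for `command`."""
    return shutil.which(command, path=path)


def make_fake_ls(directory: str) -> str:
    """Create a harmless stand-in for an attacker's 'ls' in `directory`."""
    fake = os.path.join(directory, "ls")
    with open(fake, "w") as f:
        f.write("#!/bin/sh\necho 'not the real ls'\n")
    os.chmod(fake, stat.S_IRWXU)  # rwx for the owner so it counts as executable
    return fake


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as attacker_dir:
        make_fake_ls(attacker_dir)
        poisoned = attacker_dir + os.pathsep + "/usr/bin:/bin"
        print(first_match_on_path("ls", "/usr/bin:/bin"))  # the system ls
        print(first_match_on_path("ls", poisoned))         # the stand-in wins
```

Nothing about the program being attacked has to change; only the environment it inherits does, which is what makes this dangerous for a setuid binary.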
Similar type of thing: where is HOME used when we're talking about paths and file systems? So what is home? There's no place like home? It's your user directory. It's assigned to you by the sysadmin. Yeah, and we actually saw where it's stored, right? It's in the /etc/passwd file. Every user has their home directory in there, and that's how your HOME environment variable is set up when you first log into a machine. But how is it actually used? If you were a program, and let's say I wanted to write into a dot folder in your home directory, how would I reference that path? You can use tilde username? Yeah, so you can use tilde slash the directory, and that way, for whoever's running that program, the tilde will expand based on the HOME environment variable to whatever that user's home directory is. But again, the same logic applies. If a setuid binary is using tilde to either write to or read from somewhere, we can completely change that HOME environment variable to get it to open different files in different locations and do all kinds of fun, crazy things.
So the important thing to remember when we're talking about PATH and HOME attacks is that with the execve system call you have to specify the path to the executable itself, because the operating system doesn't want to guess. The operating system wants you to tell it exactly which binary you want to execute, so it does no lookup in PATH for you; typically you pass in an absolute file path. So libc offers... actually, this is a good one. Let's look at this man page. Yeah, I changed to zsh and now I don't have any of my pretty colors. So if I do man execve, this will take you to a system call. Yeah, okay, so somebody was asking earlier on the mailing list: how can you execute shell scripts?
This is exactly why; it says it right here. The operating system will look, and if the script starts with #! and an interpreter, then it will execute it as a script. Let's see, man execve. So we're looking at the man page of execve. I really just want to see if it says anything about absolute paths. Alright, I can't find it, but the point is that execve itself never searches PATH. So we pass things into execve in exactly this format (sorry, this is a little off; there we go, sorry about that). The arguments to execve are a character pointer, the filename; then the argv vector, so a character pointer pointer, which is null terminated; and then another character pointer pointer, the environment.
You could say this is a low-level primitive: you must call into the operating system this way or else literally nothing happens. But as a programming construct, it's a lot to have to set up, right? Oftentimes we want to be able to call a program like somebody mentioned, piping output into gzip and taking advantage of pipes. The operating system's execve doesn't know anything about that. But if we look at the man page of system, if we do man system, we can see that the system library function uses fork to create a child process that executes the shell command specified in command using execl. This is a libc function. Do we have examples? There we go: we can do things like call system("foo"). We can see we're not specifying the path, so PATH is going to be used to look up the foo executable. If it finds it, it uses it, and it will also parse all of the arguments. So I believe, yeah, okay, so this is the important thing.
So system is exactly the same thing as calling /bin/sh, passing in sh as the first argument, then -c as the next argument, which means execute this command in the shell, and then the actual command that you pass in. This is why, when you call system, you can do things just as if you're on a command line: piping, redirecting output to a file, catting a file and piping it to grep, and doing all this fancy stuff. You're literally passing that string into the shell, just as if you were on a command line typing it in directly. But eventually that has to boil down to an execve call. And so the shell is the one that's doing the lookup of the filenames in the PATH. This is very important, because it comes up in a lot of contexts: how is the programmer actually executing a process that it wants to do something for it?
So there's a whole family of these. I just did man execl, and that shows you the man page of exec, which has execl, execlp, execle, execv, execvp, and execvpe. These are all different interfaces in libc that eventually call down into execve. It took me a long time to realize what the differences were between all of these, so this is me trying to explain, having sat in your shoes wondering why there are so many of these different calls. You can see from the function signatures that they do different things. For instance, execl takes in a path, the path to the executable we want to execute, and then a series of arguments. So just like printf, this is a function that takes a variable number of arguments. Here we put however many arguments we want, and when we're at the end we pass a null pointer to terminate the list. Yes? Is the return value a process ID, or does it indicate success? Let's look for return and see what it says. Return value: the exec functions return only if an error has occurred. The return value is negative one, and errno is set to indicate the error.
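The relationship between system and execve described above can be sketched as follows. This is a simplified model, not libc's actual implementation: the real system() also adjusts signal handling, which is omitted here, and the function name `system_sketch` and the fixed PATH value are assumptions for the example. One difference from C: Python's `os.execve` raises `OSError` on failure rather than returning -1.

```python
import os


def system_sketch(command: str) -> int:
    """Roughly what system(3) does: fork, then exec '/bin/sh -c command' in the child."""
    pid = os.fork()
    if pid == 0:
        # Child: execve replaces this process image and only comes back on failure.
        try:
            os.execve("/bin/sh", ["sh", "-c", command], {"PATH": "/usr/bin:/bin"})
        except OSError:
            pass
        os._exit(127)  # only reached if execve failed
    # Parent: wait for the child and report how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # the child was killed by a signal


if __name__ == "__main__":
    # The shell, not execve, understands the pipe and finds 'tr' via PATH.
    print(system_sketch("echo hello | tr a-z A-Z"))
    print(system_sketch("exit 3"))
```

Because the command string is handed to a shell, everything the shell does (PATH lookup, pipes, redirection) is in play, including whatever PATH the caller's environment supplies if it is not replaced as it is here.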
So it's just like execve, because all of these functions actually call execve, which means they replace the currently running process with whatever you specify. This is why system is nice: it does a fork for you. It forks the process, and in the child process it executes whatever you said you wanted. Some people are also playing with popen, so I just did man popen. That is another way to execute a child process to do something for you. At the end of the day, they all have to call execve, because that's what the operating system eventually does. All right, cool. Sweet, okay.
So of that whole family of libc functions, most of them, execlp, execvp, system, popen, will look things up in the PATH. If you don't specify the absolute path of the file you want executed, it will be looked up in the PATH. And it's the same thing, as we said, with the home directory. If a setuid program is, let's say, opening something at tilde slash myfile.txt, you can change HOME to point to whatever you want. The other concept that was brought up earlier is that you can, if you want to, open a specific user's home directory. You can do tilde username, like ~root slash whatever.txt. And I think in that case you can't control it, because it's going to look in /etc/passwd for that specific user's home. But anyway.
So how do we fix this? Specify the whole path. Specify the whole path every single time, right? How can we be guaranteed that that's cross-platform? You specify a hard-coded path on Linux, and then it doesn't work on NetBSD. Would it involve other programs that can change the path on the system? No, you know it's vulnerable; that's the point. You still want to write code, right? You want to write a setuid program that uses other programs to do stuff, but you don't want it to do so in a vulnerable way, right?
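The HOME point above is easy to see directly. The sketch below shows that a path written with a leading tilde follows whatever HOME happens to contain. The helper name `expand_with_home` and the directories are made up for the example, and `os.path.expanduser` is used here as a stand-in for any code that trusts HOME when expanding `~`.

```python
import os


def expand_with_home(path: str, home: str) -> str:
    """Expand a leading '~' the way a program trusting $HOME would."""
    saved = os.environ.get("HOME")
    os.environ["HOME"] = home  # whoever launches the program chooses this value
    try:
        return os.path.expanduser(path)
    finally:
        # Restore the caller's environment so the demo has no side effects.
        if saved is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = saved


if __name__ == "__main__":
    print(expand_with_home("~/myfile.txt", "/home/alice"))
    print(expand_with_home("~/myfile.txt", "/tmp/attacker"))
```

The same source line names two different files depending on an environment variable the caller controls, which is the whole problem for a setuid program.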
So you're coding this app, right? So how do you do it? You can figure out the OS, right? So maybe you can decide based on the OS type what the path needs to be, but I don't think that solves all the problems. Yes. You could detect the OS and do different things, which, depending on how cross-platform you want to be, you could do. It's actually a hard question, because a lot of these programs are, I believe, defined in the POSIX standard to be at certain locations. So you can always use /bin/ls on any Unix, POSIX-y system, and it will be the ls executable. So it's a great question, and it's important that you think about it. The answer sounds easy: you just always use absolute paths, which may or may not work depending on what program you're trying to execute and where the operating system has put it.
Same thing with the home directory. Don't use the tilde in your paths when you're opening files, for exactly the same reason. So you should always, always, always, always, always use absolute paths. It's one of those things where the convenience of using relative paths, or of just saying system("ls"), seems like what you want to do, but it is very much not what you want to do. And it's really important that you write code correctly the first time. Let's say you're writing code now and you're thinking: well, this is a non-setuid binary, just a little binary I'm making for myself, so who cares? So you just do whatever you want. And then five years later you're writing a setuid program and you think: hey, I had that code that did this thing, let me just copy and paste it into my new project. That happens all the time, right? Actually, sometimes, for me, the longest part of coding something is figuring out when I last did that thing and finding that code in that file so I can steal it and use it. So you hope you did it correctly in the first place.
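One way the "always use absolute paths" advice might look in practice is sketched below. It assumes `ls` lives at `/bin/ls`, which, as discussed, needs checking on each platform you target. The constant names and the replacement environment are choices made for this example rather than anything from the lecture.

```python
import os
import subprocess
import tempfile

LS = "/bin/ls"  # assumed location; verify it on each platform you target
CLEAN_ENV = {"PATH": "/usr/bin:/bin"}  # fixed environment, nothing inherited from the caller


def list_directory(directory: str) -> list:
    """Run ls by absolute path, with no shell and no inherited environment."""
    result = subprocess.run(
        [LS, "-1", "--", directory],  # an argument list, so no shell parses the input
        env=CLEAN_ENV,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "a.txt"), "w").close()
        print(list_directory(d))
```

Two things are doing the work here: the executable is named by its full path, so PATH is never consulted to find it, and the child gets an environment the program chose rather than the one the caller supplied.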
So always, always, always, absolute paths. Don't use relative paths. It's actually pretty easy. It's always nice when you have a security recommendation where, if you just follow it, you will be safe and you don't have to worry about anything. So if you're aggregating paths throughout your program, is that okay? In what sense? Like you have constants, maybe for an absolute root, some basic paths, and then you append onto those paths, as long as you control that and users can't. As long as you control the input, right, exactly. So it's all about who can control or modify that input. If you're ever taking user input, and this even applies if you're reading something from a file, that could potentially be user controlled. So when I say absolute paths, I don't necessarily mean that in your program you have to have /usr/bin or /bin, the entire path, written out. But even if you're calculating it dynamically, it still needs to be 100% controlled by you, the developer, so there's no way that an attacker could influence it. Any other questions? Alright, you're learning just enough to be dangerous. That's going to be awesome.
So, the other thing we're going to look at is another type of attack. And this is actually kind of interesting when you think about it: all of these vulnerabilities are, let's say, weird features or weird corner cases of different aspects of the system. There's the file system, in the case of dot-dot allowing you to go up to the parent directory, and the behavior of the shell, in the case of using PATH and HOME to look up executables. And here we have links. So what are links, symbolic or hard links, used for? If you have different versions installed, like Python 3.0 or whatever, it'll just link back to python if that's your default.
Yeah, so a symbolic link is a file system concept that, if you're more web focused, you can think of as a redirect. If you try to open this file, it says the file you actually want is located over there. This is really great if you want to have different versions of Python. There's the split between Python 2, which is usable, and Python 3, which is terrible, which I should probably not say, but I'm old and set in my ways, and my students are slowly pulling me up into Python 3 and I'm going kicking and screaming. But anyway, the point is that you want one interpreter at /usr/bin/python that is the Python interpreter for your system, but you still want to distinguish between Python 2 and Python 3. So you put programs in /usr/bin and call them python2 and python3, and then, rather than duplicating whichever one you want and having that executable be in two places, you create /usr/bin/python as a symbolic link to /usr/bin/python2.
Let's actually see if this is the case on the system that we're using. Excuse me, I'll make it bigger. So let's do ls -la /usr/bin/python. Yes, okay, good. So this is the important thing about symbolic links: the links themselves are open to everybody. There are no access control permissions on the link. We can tell it's a link because there's an l in the permissions on the far left. So does this mean that anyone can read and write the python2.7 file? No. Exactly. The privileges are checked on the target. If we look at ls -la /usr/bin/python2.7, we can see that it is owned by root, group root, and it is readable, writable, and executable by root, readable and executable by the group root, and readable and executable by everyone. So I can't write to this file, because I'm an other. So the symbolic link is accessible to anyone, and the access control checks happen when we access the actual file.
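The same distinction that `ls -la` shows can be made programmatically: `lstat` looks at the link itself, while `stat` follows it to the target, where the real permissions live. The sketch below builds a throwaway link to demonstrate. The file names mimic the python and python2.7 example but live in a temporary directory, and the helper names are invented.

```python
import os
import stat
import tempfile


def describe_link(link_path: str) -> dict:
    """Compare the link itself (lstat) with the file it points to (stat)."""
    link_info = os.lstat(link_path)    # does not follow the link
    target_info = os.stat(link_path)   # follows the link to the real file
    return {
        "is_symlink": stat.S_ISLNK(link_info.st_mode),
        "target": os.readlink(link_path),
        "target_mode": stat.filemode(target_info.st_mode),
    }


def make_demo_link(directory: str) -> str:
    """Create a stand-in 'python2.7' file and a 'python' symbolic link to it."""
    target = os.path.join(directory, "python2.7")
    link = os.path.join(directory, "python")
    with open(target, "w") as f:
        f.write("stand-in for an interpreter\n")
    os.chmod(target, 0o755)  # rwxr-xr-x, like the listing in the lecture
    os.symlink(target, link)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(describe_link(make_demo_link(d)))
```

A program that needs to know whether it is about to follow a link has to ask with `lstat`; `stat` alone will quietly describe whatever the link points at.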
This is actually similar in concept to the PATH attack. When we call system, system("ls"), we know that the shell is going to use the PATH in order to find an ls executable. And we can control that PATH, so we can get it to do whatever we want. So let's say you have a setuid program that is opening up a file called password. How does the operating system know which directory to open that file in? When it calls open, which is a system call to the operating system, how does the operating system know which file to open? So you write some code that calls open, the open system call. Let's do man open and, nope. I think it's man 2 open, for the second section. So with open, you pass in a pathname and the flags. The flags are read only, or read-write, or however you want to open the file. So there's a pathname. All right, I was trying to look and see if we could get it to tell us. Ah, there we are. If the pathname given is relative, then it's interpreted relative to the directory referred to by the file descriptor dirfd, rather than relative to the current working directory of the calling process. Oh, that's openat. I was like, that's wrong, that's not what's supposed to happen. Okay, that was just more confusing, because that wasn't the documentation for the call we're interested in.
But we call open, and we put in a pathname, password, and we put in whatever flags we want: read, write, and create if it doesn't exist. Then the operating system sees that. How does it know which password file to open or create? There is some order for checking directories. For example, in Windows, the order is the current directory, and then it checks other directories to resolve the file name.
If it's a symbolic name, it tries to resolve the actual file name, and after that, if it finds it in the current directory, it opens that file. But how does it do it on Linux? Is this what the inode...? No, that's a little too low level. So it's similar. If we look at our nice friend the /proc file system: I mentioned that the operating system keeps data about every process. We talked about the process ID, the owner of the process, the group owner of the process, and all that lovely information is in /proc. We'll do /proc/self. I think we want cwd. There we go. So every process has a current working directory associated with it. Oh, that's tiny. So with ls -la /proc/self/cwd, this says that it is a symbolic link to /home/adamd. If I cd to slash and then I do ls -la /proc/self/cwd, it will say that it is now a symbolic link to slash.
So every process, when it executes, has a current working directory. And when you say something like open password, it looks in the current working directory. So unlike Windows, which will try to look in other directories, a Unix system will only look in the current working directory. If the file is there, it will open it. Depending on your flags, it will create it; otherwise it will tell you that the file doesn't exist. So who controls where the process starts? Who controls the current working directory? If a process calls execve, the current working directory is not passed into execve, so it must be inherited by the child process from its parent process. So how do you change it? chdir changes your working directory.
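The inheritance described above can be demonstrated with a child process that opens the relative name `password`. In this sketch, `subprocess.run(..., cwd=...)` plays the role of typing `cd` in a shell before launching the program, and the child is a one-line Python script standing in for the program under attack. The helper names and file contents are made up.

```python
import os
import subprocess
import sys
import tempfile

# Child program: opens the relative name "password" and prints what it finds.
CHILD = "print(open('password').read(), end='')"


def run_child_in(directory: str) -> str:
    """Start a child whose working directory is `directory`; nothing else differs."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        cwd=directory,  # the child starts with this as its current working directory
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def make_dir_with_password(contents: str) -> str:
    """Create a fresh directory containing a file named 'password'."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "password"), "w") as f:
        f.write(contents)
    return directory


if __name__ == "__main__":
    print(run_child_in(make_dir_with_password("the real file\n")), end="")
    print(run_child_in(make_dir_with_password("the caller's file\n")), end="")
```

The child's code is identical in both runs; which file it opens is decided entirely by the directory its parent chose to start it in.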
So you call chdir. This is a system call, because it has to update the process table to say that this process is now working in a different directory. And now all relative files will be opened in that specific directory. So let's say I'm a setuid program that opens a password file. I'm opening a file called password in the local directory, and I'm checking whether your password matches the file in that directory. And let's say it normally runs from /home/passwordchecker, and there's a password file there. How could we mess with that program? Sorry, could you say that again? No, I just made that up, so I don't think I can remember it. But we have a password-checking program, and the way it works is that it calls open with password, because it's checking the password file. Normally it runs from /home/passwordchecker, and it's opening a file that only it has read and write permissions on, because it is a password checker and it's running setuid, so that way it can read it.
So how might we be able to influence and change which file it decides to open? Before it starts opening or writing the file, we call chdir. Yeah, so we can call chdir, and this is exactly how cd works. When you say cd whatever, cd tilde, or cd /home/adamd, all it's doing is calling chdir with whatever path you pass. So I can create a password file in my local directory, and then I can execute the program, and the current working directory it inherits is the one I am currently in. And therefore I can get the application to open up whatever file I want. But now let's say it's a root setuid program and it will append whatever input I give it to, we'll stick with the same name, a password file.
So what would be a file that I would want to append to in order to escalate my privilege on a Linux system? It's /etc/passwd, right? I can create a new user account and give it whatever password I want, to create my own account on the system, and I can create it with the user ID of 0. Or I may want to edit /etc/sudoers. See, I can't even look at it. The /etc/sudoers file has directives for who can call sudo on the system, so if we can add a line there, we can get our account to be able to run the sudo command. Being able to append to a file of our choosing is really awesome, but the program is only opening a file called password. If I cd to /etc and then execute this program, it appends my output to a file called password there, and that's not what I want. I want it to change /etc/passwd, p-a-s-s-w-d. So how can I get around that? How can I force it to append to whatever file I want? Symbolic links. Yes, it's bringing it all home. I can execute it in a directory of my choosing, and I can create a symbolic link from password to whatever file I want on the system, and that will allow me to append anything anywhere on the system, which is super awesome.
So that's the idea behind link attacks. Usually a program will think: hey, I want to edit, whatever, the dot password file or something. By creating symbolic links, we can trick the application into editing a different file than it thinks it is. And we can even play really cool games. I think we'll get into it more later, but even if they check whether the file is owned by you or whether you can write the file, if they open it later there's actually a window where you can switch it from being a local file to being a symbolic link. We'll see that in a second, so I think I'm getting ahead of myself. This has actually been used in real-world vulnerabilities in setuid binaries. There was this one where it would create a temporary file, and the temporary
file name was easily guessable, and it didn't check whether that file already existed. So you would create a symbolic link from that temporary file name to the thing you wanted it to open, and now it's editing whatever file you want, /etc/passwd for instance. So this is the dtappgather utility in CDE. It would try to edit this file, /var/dt/appconfig/appmanager/generic-display-0, and it doesn't check whether it already exists before opening it. So here's /etc/shadow; this ls -la /etc/shadow shows the shadow file.
We didn't talk about the shadow file. So in this terribly named /etc/passwd file, less /etc/passwd, p-a-s-s-w-d, how many passwords do you see here? Is my password in here? No, it's an x. What does an x mean? The first field is the user name, then a colon, and then the second field is possibly the password. An x means look somewhere else: look at the shadow file. They realized very quickly that putting the passwords, even when they're hashed, into a file that everyone can read is a bad idea, because anyone can try to crack the hashes. So everything is in the shadow file. Let's do sudo ls -la. We can see the /etc/shadow file is readable and writable by its owner, readable by the shadow group, and everybody else cannot read that file. So if I try to cat that file as the ubuntu user, I can't. But I'm a sudo user, so I can look at it with sudo less /etc/shadow, and now you can see for the adamd and the ubuntu users that in the second column there's actually a hash. You can try to crack this hash. Either I'm using a really strong password here, I don't know, or my password is password. It doesn't matter anyway; this is a local VM that you cannot get into, so whatever. Oh, I think this testing one literally is password. Okay, somebody should crack that and verify. Cool. So this is the idea: now if we can change this file or add to this file, let's say
if we were able to. There's our lovely root user at the top here, and we could change the second column here, this value. I'm not 100% certain, but I believe the exclamation point in the second column means that you can't log in as this user, so it will deny you from logging in. Or maybe only remotely, I don't know. Connor, can you look that up?
So the idea of this attack: by making a symbolic link to /etc/shadow from /var/dt/appconfig/appmanager/generic-display-0, when you call dtappgather, the setuid binary, it tries to make this as a directory and tells you that the file exists, but it actually changes the permissions on that file, because it calls chmod to change the permissions on there. And now you can see the shadow file is readable by anyone. So now you can get access to this file and try to crack all the passwords in there. Pretty cool.
So what do we learn from link attacks? How do we defend against this in our code? In the last attack, if we go back, what's the fundamental problem here? If you were writing this code, how did they write it and what was the problem? They didn't check whether the file existed. Before doing what? Yeah, so in this case I think they may not have read or written it yet, but they changed the permissions on it. They set the permissions on the file that they were opening, and they didn't check whether the file already existed before opening it, before chmod-ing the file, so somebody could replace it. And there's clearly another problem here, that anyone can create this /var/dt/appconfig/appmanager/generic-display-0 file, although I don't know the purpose behind that. So the idea is that while you're coding, you need to be very careful: is this actually the file that I think it is, or is it a symbolic link to another file? And you can do these things: you can open a file and check whether it is a symbolic link. And so you need
to be very careful when you're coding these things. You should look for unexpected types. If you think the thing you're opening is a file but it's actually a directory, that could be a big problem. Symbolic links. Temporary files: you shouldn't just create, let's say, a temporary file called .temp, which is easily guessable by an attacker. There is secure temporary file generation. I think it's mkstemp; there's an s for secure in there, m-k-s-t-e-m-p. Oh, you mean this second bullet here. See, this is the problem with the recording setup: I don't have the presenter mode, so I don't even know what's next on the slide.
What is kind of horrible about APIs is that they will give you a secure version of something but not tell you that mktemp is the super insecure version. The only place I've seen this done really well was in, I think, the Windows 8 Metro APIs for JavaScript programming, where the name of the eval function was something like eval insecure, so that you knew it was a really insecure function and you should not use it. So if you want to use it, it's on you and that's your fault. Whereas from an API design standpoint, if you have something that's called secure, why would you ever use the insecure version? That doesn't make sense. You should get rid of it or call it something scary. So which one is the insecure version? These are all secure. The s is the secure version; the insecure one is just mktemp.
So the mkstemp function generates a unique temporary filename from the template, creates and opens the file, and returns an open file descriptor. It creates it and opens it at the same time and returns you a file descriptor. Let's see the difference with mktemp. So man mktemp, I think I need to do man 3 mktemp: make a unique temporary filename. mktemp takes a template, and see, the difference is the return value. This returns a character pointer, which you then have to pass
to the open function in order to open the file that was just created and get a file descriptor. So think about how many milliseconds, or think about how many CPU instructions, can occur, especially in other applications, between the call to mktemp and the call to open. You have no idea whether what you're opening is the same file that was created, because it's being referenced here by a character pointer. Whereas mkstemp, so man mkstemp, returns an int: it returns the file descriptor. So you can immediately call read and write on that file descriptor without having to worry about whether this is actually the file that was randomly created to be a temporary file. Yeah, so it's actually very easy: you just use the secure version. Awesome.
Which gets us to TOCTTOU attacks, which probably have the best name, and they're pretty easy to... so, writing out the name, it's time of check to time of use. The idea is that if there's a gap between when you check some security criteria and when you use the thing you checked, that can be a security vulnerability. This idea of time of check to time of use comes up in a lot of different contexts, so this is a really important vulnerability class. And the idea goes back to what I was talking about a minute ago, or many minutes ago; I have no idea, time has no meaning up here. So let's say you're a setuid program. You want to edit user files, but only files of the user that's executing you, and you don't want them to trick you. So remember we talked about how setuid works in Unix: there are several types of IDs, the effective user ID, the saved user ID, and the real user ID. So what you can do as an application is ask the operating system, "Hey, if I was executing as my real user ID, the person who actually executed me, would I be able to open this file?" And the operating system will tell you yes or no. The problem is that you then have to open that file by using the
file name to specify which file you want, and at that point you have no guarantee that the file is the same thing that you originally checked. So if you think about a timeline, there's a time, call it time zero, when you're checking that some security property is met. So you're checking: is this file owned by the person executing me, even though I'm a root program, and could they open it? And then there's a gap where something could happen, and then at time t2 you're actually using it for a security operation. If the adversary can get in at a t1 that's in between there and alter your security constraints, then they've completely owned your application. The other way to think about this is that it's almost like a race: as an attacker you're trying to get really lucky. So if we think about it... yeah, okay, cool. I'm going to just draw a little picture to give us a way to think about this. So we have here our t1, which is our time of check, and we have, let's say, here our t2, our time of use. And the idea is, as an attacker, if we can get in between these at t3, we can change the resource, or invalidate some constraint, or otherwise do something important. So if there are, let's say, two CPU ticks or clock cycles between t1 and t2, what's the likelihood that we can get in between them? Not very good. Yeah, that's probably as mathematical an answer as I'm looking for. Not very good. It's really hard to time it exactly right so that the CPU executes one instruction of their program, then executes your instruction to change the thing, and then executes their next instruction. So as an attacker, how can we increase our chances of this occurring? Take away CPU resources, like making that gap wider. Yeah. So this is why I wanted to draw this diagram. If we make the gap between t1 and t2, and this is really cramping my style, if we make the gap between t1 and t2 wider, let's say we can increase it to, I
mean, even on the order of, let's say, 500 cycles, or if we think about half a second, that's a huge window. Or if we increase it to the order of minutes, then we know we're good and we can definitely do it. And you can actually do this; there are some super interesting things here. One way to do this is to create a lot of noise on the system. We talked about how, if we create a lot of applications running at once on the system, then when the victim program executes, if every other process is trying to do t3, then it's much more likely... well, hey, okay, that's actually two different ideas. So one idea is we just create a lot of CPU churn, so a lot of things are happening, and it's highly likely that the process is going to be evicted from the CPU between t1 and t2, thus increasing the time in which we can get an attack in. So we can widen this gap. If it's reading from a file, we can cause a lot of traffic on the file system, which will slow it down. If it reads from the network, we can do similar things. The other idea: how many times do we need to get this correct? Just once. This is one of the fundamental things you should take away from the course, and this is why defense is hard: a defender has to defend against things every single time, whereas an attacker only needs to be correct once. So even if the window is small, well, two cycles is really short, but let's say I have like a 10 millisecond... see, these are so fast it's hard to come up with a realistic time, but like a few-millisecond window. I just keep trying, and at some point I'll get lucky. So I have maybe a couple of programs doing t3 and testing whether they succeeded, and at a certain point, bang, I'll get it, and then I'm done. I don't need to do any more. So if I can keep doing this and guessing, all I need is to get lucky once. The other really cool thing I want to talk about has to do with links especially. There were some researchers who showed that, basically, if this is the file access check, could
the real user open this file, and then this is opening that file, what you can do is slow down that second part, opening the file, by creating an absurd number of symbolic links in really deeply nested directories. So you make the operating system have to go open 20 directories, and then that's a symbolic link to another thing that's another 20 directories, and that's a symbolic link to another 20 directories, and you do that up to the operating system's limit. Thus you're really increasing this time, and then it's really easy to hit that window. So these are all cool tricks. This is why I think it's another good instance of the security idea that attacks only get better over time. When this attack first came out, people said, "Ah, that's a really cool theoretical attack, but who's going to do it within that super tiny window?" Well, then you start thinking, "How do I make this window bigger?" and you come up with a cool attack. So attacks always get better, always get more realistic, and their requirements get relaxed. Could you set a limit on the number of symbolic links? Yes, and there are normal limits, right, on how many you can have, but still, fundamentally, the attacker can increase that time just by doing file system activity, all kinds of stuff.
Another way this can show up is a data race condition. What's a race condition? What was that? Where processes are competing for memory. Yes, you have multiple threads or processes accessing the same memory, and they're doing so without getting an exclusive lock on that chunk of memory, so you can have all kinds of crazy stuff happen if one of them has write access. This is also a kind of time of check to time of use. So the system call I was talking about, since I literally just talked about this, is the access system call. It returns an estimation of whether the real user could actually do that open; it checks it. So this is how you have to use it, this is the C code that uses it. At the top
level it's: if access, with the file name and the flag W_OK, is equal to zero, and if that's true, then open the file, so call open with the file name and the flag O_WRONLY, write only, and then write to it. So the idea is, even if you compile this code and there really aren't many instructions between the access system call and the open system call, an adversary can still get you to open files of their choosing. Yeah? So why can't you combine them into an access-and-open system call, so nothing happens in the meantime? Ah, yes, so why can't you just combine these into something? What would you want that something to return? Let's say... right, so you want it to return a file descriptor, exactly. But can a user space program do that? That's tricky, right, because a user space program can only call the system calls that are available to it, and really the operating system is the only one that can ensure that nothing happens to this file. I mean, actually, you may be able to get around it if you open it first with, like, an exclusive lock and then maybe call access on it. That could work, I don't know, but it's not a natural way; it's almost a backwards way of doing things. So yes, it would be much better if the operating system provided a primitive, access-and-open or open-with-access or something like that. What was the purpose of the read write, the O_WRONLY? Oh, that's just a flag. I mean, this is adapted from real code, from real vulnerabilities, so this is just what the code has. It just means open the file name with write privileges. So that doesn't help? No, no, the flags here don't have anything to do with it, except for the fact that you're writing to it, so you want it to be something that normally you could write to. If you were reading from it, it would still be useful, because you could read from, let's say, /etc/shadow if you could get this to work. So yeah, the flags there are
only really about what this allows you to do. All right, so we just got to our lessons learned. We want to use system calls that return file descriptors. So I'm going to go back to the proc file system; it's full of so many good things. So ls -la /proc/self... ah, and, super interesting, /proc/self is itself a symbolic link: for whatever process is accessing that link, it's a link to that process's ID. It's turtles all the way down. So ls -la /proc/self/fd shows all of the file descriptors that are open. So here, doing ls -la /proc/self/fd, we can see there are four files in here that are all symbolic links: zero, one, two, and three. So zero is what? Standard in. Zero is standard in, one is standard out, two is standard error. What's this three? Why does that exist? Why is there a file descriptor number three? And where are file descriptors zero, one, and two linked? What is that? Yeah, it's /dev/pts/0. So this is my terminal. My terminal is a device, so this is how the output of that application actually gets to the terminal that I'm viewing. I actually don't know if this will work, but let's say we echo some text into /dev/pts/0. Yeah, so it just got output to me on my terminal, which is the same file that my standard output is linked to. So what's this fourth file descriptor, with number three? Yeah, I think it's probably related to this shell, actually, the shell that you opened. So if you notice... oh, the last command? Well, not the last command, but what is /proc/self? The current process. So what is the process that's examining the /proc/self directory? The ls command that I just executed. And the ls command has to open what directory? /proc/self/fd. And we know that /proc/self is actually a symbolic link to /proc/ plus the process ID, which makes sense of why this ID number is different from when I just did ls -la on
/proc/self, which went to 45422, whereas the second invocation went to 45426, because these are different ls commands that have different process IDs. So ls has to open this directory, right? It makes sense that there is a file descriptor three, because the ls command called open on that directory. So in there, /proc/self/fd/3 is a symbolic link back to that same fd directory. Crazy, right? What was I talking about? I don't know. That's cool. File descriptors, okay, that's where this was going. I knew I started with a purpose. I did write it down. So the idea is, when you're opening up, let's say, a file for writing, you don't want any other process to be able to swap or change that file out from under you while you have it open. So the list of file descriptors is one of the other things that the operating system keeps in the process table. It's keeping all the things we talked about, which I don't want to enumerate, plus all the open file descriptors of that process and how each one maps onto the file system, and it ensures that, unless the process tells it otherwise, that descriptor keeps referring to the same file, so no other application can swap the file out from under you by playing games with the name. So this is why you want to use functions that return a file descriptor, because that means the operating system is holding that file open for your process. Okay, that's the long version of why this actually makes sense and why you want the operating system to do it. This also means, which intuitively makes sense, that for two different processes running on the system, file descriptor zero may not be the same file in each of them. Standard input or standard output, let's say file descriptor one, which makes sense, because if I'm redirecting something to a file, then it would make sense that that is... I just want to see if this will work. /proc/self/fd: I'm going to redirect ls -la /proc/self/fd to test. So if we look at test, yes, okay, so we
can see that now file descriptor one is a symbolic link to /home/adamd/test, which is that file I just created. So this is because I used redirection: bash executes the ls command, and all that redirection does is tell the process that file descriptor one is now that file that was just opened. Yes? So it's ls -la /proc/self/fd, redirected with that angle bracket to test, and I'm inside /home... let me just cat out this file. So I'm inside my /home/adamd... oh, I am; adamd is the host name. Okay, good, that makes sense. So this is what bash is doing: it's setting up file descriptor one, and you can easily do this in your own program. There are, I believe, the dup and dup2 system calls, which can copy a descriptor. So you open a file, let's say it's file descriptor three, and then you say, "Actually, I want three to be file descriptor one," and now when you write to standard out you'll be writing to that file. So this is how all piping and everything works, through system calls like this. So this is our goal: we want the operating system to act as the exclusive lock on a file for us. We don't want to have to do the checking ourselves. So we want to use functions, sorry, system calls, that return a file descriptor, and that way we can capture that file. And it's the same idea with mkstemp, which does this. The key difference between mktemp and mkstemp is that mktemp returns a character pointer to a string with the file name, which you then have to call open on, but mkstemp actually creates the file, opens it, and returns a file descriptor, so you know you have exclusive access to that file. What time do we have? Eight minutes, okay.
So, on the subject of what happens when we execute something: the system function, we looked at that. At a high level, how does system work? The system libc call, it calls /bin/sh. What else does it do? It's running your command in /bin/sh, but how does it actually execute
that? It calls execve with the things that we just saw: it calls execve on /bin/sh with sh -c and your command. But if it just did that, it would replace your process with the command that you ran. Does it replace your process with the command that you ran? No. What does it do? What was that? It opens a terminal and runs it? There's no such thing as terminals here; a terminal is just running bash. It creates a new process. It creates a new process. How does it create a new process? With fork. So it calls fork, which duplicates your process, and then the child process calls execve. And again, this is one of those things that only the operating system can do. So if we look at man fork, there we go: create a child process. fork takes no arguments, because there's nothing to change about its behavior; it just creates a new process. And we can read the description from man fork, man 2 fork because we want the system call: fork creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process. The child process and parent process run in different memory spaces, so if the child changes any memory, it won't change the parent at all. I guess it's similar with kids. But at the time of fork, both memory spaces have the same content. So the child is an exact duplicate of the parent except for one key difference. The process ID is something kept by the operating system, not in the process itself. What was that?
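The fork-then-execve pattern just described can be sketched in a few lines of C. This is a minimal sketch, not the real libc system: the helper name run_via_shell is hypothetical, signal handling is left out, and the child is given an empty environment, which is an assumption made here for the example rather than what system does.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: roughly the fork + execve dance behind system(cmd).
 * Returns the shell's exit status, or -1 on error. */
int run_via_shell(const char *cmd)
{
    assert(cmd != NULL);

    pid_t pid = fork();              /* duplicate the calling process */
    if (pid < 0)
        return -1;                   /* fork failed, no child exists */

    if (pid == 0) {
        /* Child: fork returned 0. Replace this copy with /bin/sh -c cmd. */
        char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
        char *const envp[] = { NULL };   /* assumption: empty environment */
        execve("/bin/sh", argv, envp);
        _exit(127);                  /* only reached if execve failed */
    }

    /* Parent: fork returned the child's PID, so we can wait for it. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_via_shell("exit 7") from a main function should hand back 7 in the parent, while the parent process itself is never replaced, because only the forked copy calls execve.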
Memory space? It has exactly the same memory layout. I don't know, not always. Sometimes when the parent dies the child dies; that's more complicated, and I don't understand everything there. What about the files, like standard in? Those will be exactly the same. It's like cloning yourself. Think about it, it's a similar problem with people: let's say you have a sci-fi machine that does a perfect clone of yourself. Which one is the original? But here we don't have a clone; we have a parent and a child. Does the child run in the same process? No. I just remember that when we checked whether the child had been created successfully, its PID was zero. But what PID? Isn't that the return value of fork? Yes, so fork itself. Literally all the operating system does is make a copy and then change the EAX register, so that, I believe, the child is returned zero, whereas the parent gets the process ID of the child, and it can then do things with it, refer to that process, and kill it or do whatever it needs to do. That's the only difference. You can think about it as a duplicator that copies everything but changes one thing: it gives one of you a name and tells the other one that name, so you know who's the original and who's the clone. So as it's doing this, it makes an exact copy, which also includes all of the open file descriptors that the parent has. Yes, okay, good. So think about it just like when you were writing your web crawler and you were calling gzip in order to do gzip functionality because you didn't want to do it manually. You're a setuid application, you want to call some other application to do something, and you want to be secure, so you call something like /bin/ls or /bin/whatever; you want to use some other program. If you don't use the close-on-exec flag that you can pass, then the new process inherits all of the open file descriptors. So what if the program opens the /etc/shadow file and then calls fork and then execve on some program? Now that program can read the /etc/shadow file. So it's a
super cool attack, because it actually brings home all the things we were just looking at with file descriptors. So there is a change-password command, the chpass command, on OpenBSD; we're just looking at a real historical vulnerability that involved this. The idea is that it would create a temporary copy of their version of the /etc/shadow file, and then it would spawn an editor to display and then modify your account information. So you think: they copy the database, and then they do fork and execve of vi, or vim, or emacs, or nano, whatever you want, right? And so now that editor is running with all the open file descriptors. So let's say you know that it just opened the database, so file descriptor three is going to be the database. So you, inside your editor, can actually open up /proc/self/fd/3, and now you're opening up that file and reading from it. So yeah, you can do all these tricks, and you can use vi and get out of vi into a shell. So yeah, it's super important that when you're running a setuid application and you're calling fork and exec, there are no open file descriptors left over. And this will be the end of the file games that we're playing. Now we're going to move into command injection attacks, which are actually closely related to all the things we've been talking about with system and popen and execve and exec and the derivatives of execve. So it's important to make sure you've got those clear before we begin on Wednesday.
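The descriptor leak described above, and the close-on-exec fix, can be observed with a small experiment. This is a sketch under stated assumptions rather than code from the lecture: the helper name exec_child_inherits_fd is hypothetical, /dev/null stands in for a sensitive file such as the password database, and the check relies on the Linux /proc/self/fd directory shown earlier.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L      /* for O_CLOEXEC under strict compilers */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a program started with fork + execve
 * still has our descriptor open, 0 if it does not, and -1 on error.
 * Linux-specific: the child looks for /proc/self/fd/<n>. */
int exec_child_inherits_fd(int use_cloexec)
{
    int flags = O_RDONLY | (use_cloexec ? O_CLOEXEC : 0);
    int fd = open("/dev/null", flags);   /* stand-in for a sensitive file */
    if (fd < 0)
        return -1;

    /* Shell command the child will run to test for the descriptor. */
    char script[64];
    snprintf(script, sizeof script, "test -e /proc/self/fd/%d", fd);

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: descriptors without close-on-exec survive this execve. */
        char *const argv[] = { "sh", "-c", script, NULL };
        char *const envp[] = { NULL };
        execve("/bin/sh", argv, envp);
        _exit(127);                      /* only reached if execve failed */
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    close(fd);
    if (waited < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;                        /* child could see the descriptor */
    if (WEXITSTATUS(status) == 1)
        return 0;                        /* descriptor was closed on exec */
    return -1;
}
```

On a Linux system, exec_child_inherits_fd(0) should report 1, which is the leak, and exec_child_inherits_fd(1) should report 0. For a descriptor that is already open, the same flag can be set afterwards with fcntl and FD_CLOEXEC.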