Wednesday, everyone. We'll start class in a minute. Any questions? Two questions? Zero questions? All right. Is there a slight hand? Am I going to have to raise my own hand and ask a question? We don't have to do the class. Okay, it's on you. I can hear all of you talking; the acoustics in here are really good.
Okay, so this week we've been talking about attacks on local applications and attacks on Linux binaries. Some of these attacks generalize to any type of application, but we were really focused on local attacks. The next attack we're going to talk about is what we call TOCTTOU attacks, which stands for time-of-check to time-of-use attacks. These are actually one of my favorite types of attacks. The idea is that the program makes a check, say, what are the permissions on this file, and then later on in that program's code it does something to that file. Let's say it first checks whether the file is owned by root, and if it is, it opens the file for editing. Now suppose it uses strings to specify these files. It says: check the permissions on the file with this name, this file path. Later it says: open the file with this file path. What happens there? That's going to be the issue.
This is a very generalizable problem, and the idea is essentially what we've just been looking at. At time T1 you check the security properties of something and come away with some assumption A. At time T2 you use that entity, assuming your check is still valid. In this case you open the file based on the file name and assume the property still holds. An attacker can act at some time between T1 and T2 to invalidate that assumption. There are a lot of different ways you can think about these things. This is slow.
Do you still see this? I don't know why. It sometimes weirdly does this with a QuickTime recording. So there are many different ways this can manifest itself as a vulnerability. There can be data race conditions when you have multiple processes that all have access to some shared data. If you don't have any kind of locking mechanism ensuring that only one process at a time can check the data and then use it, you can run into this TOCTTOU vulnerability.
The way this actually happens in Linux applications is similar to the is-root check we had. There's the access system call, which answers the question: would the real user of this program be able to access this file? The real user is the user who started the program, as opposed to the effective user ID, which is the setuid user. And as we saw, the open call takes a string. So both of them take strings. The code looks something like this: if access(file, W_OK) returns zero, then open the file write-only, and then write to that file descriptor.
So what's the key problem? How do you fix this? Who's not able to open it? Open uses the effective user ID, so it's using the setuid permissions. The program is executing as root, so it says yes, root can open this file. Validate again after opening? Yeah, how though? Yes, using the file descriptor. That is the key here. We have these strings that point to a file, but they don't pin down an exact file. Once open returns, we know the file descriptor we have refers to the same file for the entire time that file is open. We can write to it and know nobody is going to switch out the file and make us write to a different one. The file exists, and the operating system itself holds onto it for us. It says: okay, I'm keeping this file open for you, the one you specified with a path string for the file name.
And I will make sure that any time you write using this file descriptor, it goes to this file, and any time you read from it, it comes from this file. So the key problem is that both the access call and the open call take file names, and the core problem is that anything can happen to the file system between those two steps. It's not always the case that the same file is examined in both calls. The way to keep yourself safe is to use the versions of system calls that take file descriptors instead of file path names. That's the thing we just talked about: perform the file descriptor binding first. Like we said, you may also be able to use some functionality to lock the file: lock the file first, then do your checks, and then open the file. Through that whole time you know that only your process is going to access that file.
You can also get this problem with the old mktemp call we talked about, which has a newer, secure replacement. The old one gave you back a file name, and you then had to open that name yourself to read and write the temporary file. In the meantime somebody could change what that name points to and get you to write somewhere else entirely.
Questions? Go ahead. So you said there could be a time lapse between checking and using, right? Yes. And what if you just use a semaphore, let's say, to lock the file? Yes, in that case, the key is that there's always going to be a difference between the time of check and the time of use. Why don't you put a sleep in the middle and just lock the file? So the basic idea is that there's always going to be some time between those two operations, unless the operating system guarantees that they execute atomically. So there's always going to be a time delta.
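To make the access-then-open race concrete, here is a minimal Python sketch; it is an illustration, not the code from the lecture slides. `racy_open` mirrors the check-by-name, use-by-name pattern described above, while `open_then_check` binds a descriptor first and then runs its check on that descriptor with `fstat`. The function names, the ownership test, and the use of `O_NOFOLLOW` are all assumptions chosen for the example.

```python
import os
import stat


def racy_open(path):
    """Check by name, then use by name: the path can be swapped in between."""
    if os.access(path, os.W_OK):           # time of check (uses the real UID)
        # Window: another process may replace `path` right here.
        return os.open(path, os.O_WRONLY)  # time of use (uses the effective UID)
    raise PermissionError(path)


def open_then_check(path, expected_uid):
    """Bind a descriptor first, then check the file that descriptor refers to."""
    # O_NOFOLLOW makes open fail if the final path component is a symlink.
    fd = os.open(path, os.O_WRONLY | os.O_NOFOLLOW)
    try:
        info = os.fstat(fd)  # describes the opened file, not the current name
        if not stat.S_ISREG(info.st_mode) or info.st_uid != expected_uid:
            raise PermissionError(
                f"{path}: not a regular file owned by uid {expected_uid}"
            )
    except BaseException:
        os.close(fd)
        raise
    return fd
```

The first function is racy no matter how quickly the two calls follow each other. The second has no such window, because the check and every later write go through the same descriptor.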
The question is whether anybody can tamper with the file system, or anything else, in that time period. In this case, if you lock the file and the operating system guarantees that nobody else can access it, then absolutely: it doesn't matter how long your delta is, nobody else can do anything to that file between those two points in time. Same thing with opening a file. When you open a file and get a file descriptor, that file is held open for your process the whole time, and nobody can swap a different file in under that descriptor.
So could this lock be limited to a certain time window, so that it expires if you're inactive? It would depend on the semaphore or locking mechanism you're using. But if you want to be secure you wouldn't want that, because then somebody would just have to find a way to get you to sleep long enough that you lose the privilege. You need to guarantee that for the entire time delta nobody else can touch that file. And we're talking about files, but this applies anywhere.
Can you make another check after checking whether the file is available? Check that for a particular time it's not open, and then check again. In that case, though, why do the first check? You should just do the second check, because you're already saying the first check is useless. In that case we'd be saving time, if the delta is spread across multiple processes. Yeah, I'd rather be secure than efficient. You're still performing double checks every time, even when you're not being attacked. I would say the better way is to err on the side of safety when it's safety versus a performance benefit. I would always err on the side of safety.
This gets back a little bit into how applications read and write files. There are three file descriptors that every Unix application starts with. What is the first one? What number are we starting with? Zero; we're computer scientists.
What's file descriptor zero? Standard input. What's file descriptor one? Standard output. And file descriptor two? Standard error. So what happens when you open a file? What descriptor does open return? A number greater than two: usually three is the first one, the next will be four, and the one after that will be five. You shouldn't depend on the specific number, though. What you get is essentially an opaque integer that means nothing to the application by itself; it's how the operating system maps this number to an open file. So by default you always have three file descriptors open, and after that new ones usually go three, four, five, six.
Lots of setuid applications need to have files open to perform their task. This is exactly what the change-shell program does: it opens /etc/passwd and writes to /etc/passwd to make the change. In the interest of not repeating functionality, you may need to fork and call out to an external process. You may want to grep something, or do whatever you want to do. Just like some of you did in the web server assignment to get gzip working: you called out to gzip to actually do that work. Same thing in the web server when you got a command to run: you essentially had to fork a process to run that command.
So we talked about forking a process. What does that do? Yeah, fork creates an identical process. What's the only difference between the two? The process ID is different, but that's not necessarily in the process itself. Different what? No. The return value of fork: that is the only difference between the two processes, the child and the parent.
I don't remember the exact semantics offhand, but I believe the parent gets back the process ID of the child and the child gets back zero. So that's the only difference between the two. So what does that mean for file descriptors when we fork? Same for both. Whatever was standard input for the parent is now standard input for the child. Same with standard output, standard error, and any open files.
Then when we want to call out to another program, we have to call exec, which replaces our current process image with a new program that it loads from disk. That's how we said a brand new, different program gets executed: we call fork and then exec. And on exec, what happens to those file descriptors? Yeah, the new program also has access to all the open file descriptors. So the new program inherits the standard input, the standard output, and the standard error of the original process, and all of its other open file descriptors as well.
So let's say you have a setuid program that reads, let's say, the /etc/shadow file. It opens it, which means it has a file descriptor, and then it calls exec to call out to some other program. What does that program now have access to? The file descriptor. It can read from file descriptor 3 or 4, whatever it is, to read that open file. That's exactly the problem of file handle reuse: these file handles are passed along to forked children and to whatever those children exec. Using this, we can try to read any files that were opened by the setuid process. There is of course a way to do this securely. Also of course, it is not the default.
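The inheritance described above can be observed with a short Python sketch. One Python-specific assumption needs flagging: since Python 3.4, descriptors created by Python are non-inheritable by default, which is the opposite of the C default the lecture describes, so the sketch has to opt in with `os.set_inheritable` to reproduce the leak. The helper name `child_can_read` and the file contents are hypothetical, and `subprocess.run` stands in for the fork-then-exec pair.

```python
import os
import subprocess
import sys
import tempfile

# Child program: read from a descriptor number it never opened itself.
CHILD = "import os, sys; sys.stdout.write(os.read(int(sys.argv[1]), 100).decode())"


def child_can_read(inheritable):
    """Open a file, spawn a child, and report whether the child could read it."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"secret")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # inheritable=True corresponds to the close-on-exec flag NOT being set.
            os.set_inheritable(fd, inheritable)
            result = subprocess.run(
                [sys.executable, "-c", CHILD, str(fd)],
                close_fds=False,      # let descriptors obey their inheritable flag
                capture_output=True,
                text=True,
            )
        finally:
            os.close(fd)
    return result.returncode == 0 and result.stdout == "secret"
```

With the descriptor marked inheritable, the child reads the parent's file without ever knowing its path. With the flag cleared, the descriptor is closed at exec and the child's read fails.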
This is another instance where insecure API defaults mean that developers will write it insecurely first, because they don't know how to do it securely. If the close-on-exec flag is not set, the file descriptor remains open across exec. And of course these descriptors might have write access, so if any of these open files are important, the child process can read or write them. I believe so: you can set it either at open or later, before the exec. I think it is a per-descriptor flag, but I'm not 100% sure on that, so if someone wants to look that up, that would be good.
Great. So this is not an abstract vulnerability. There have been real instances of it. The chpass command is like the passwd command on Linux: it lets you change your password. This is on the BSDs, which have a different system; they don't necessarily have /etc/passwd and all that. To do its job, chpass created a temporary copy of the database and let you edit it. It opened up an editor so that you could change your entry, and when you committed, it took that copy and merged it back.
This actually makes sense from a software engineering perspective. If you're writing, let's say, a change-password program, would you want to write an entire editor too? No. And would you want to force your specific editor on the user? No, because you'll get people like me who hate using Vim and you'll force them into Vim, or if you pick Emacs you'll force the Vim people into Emacs, or the nano people out of nano. It's all kinds of madness. So why not just let the user decide which one they want? With certain editors, you can hardly even call them editors now. In vi, there's an escape-to-shell function that lets you run shell commands from inside the editor. So you could get a shell, and that shell would have the open file descriptor for the temporary password database.
Which you could then read and write in order to change the entries. This is actually a pretty cool vulnerability. It's multi-step. The original chpass is the one that has this open file descriptor. It invokes vi, from there we can escape to a shell, and that shell still has the open file descriptor from the original chpass. By writing to that file descriptor, all the changes we make get merged back into the real database, so we can make ourselves root. We can do any kind of fun stuff we want, because the change-password program has to be root; it's running setuid root. So if you want to be secure and prevent this, you have to be sure that no open file descriptors are inherited by the programs you call. This is a pretty simple-ish fix, but it's something that is easily overlooked, because like I said, it's not the default.
The next type of vulnerability we're going to talk about is command injection. This occurs pretty much all over the place: you'll find it in web applications, lots of places. The idea, like we said, and as some of you did: rather than writing your own gzip library, why not call out to the gzip program on Linux? This happens a lot in all types of software. We don't want to deal with this ourselves, so let's reuse somebody else's functionality and call this tool or command. So how do you run an arbitrary command on Linux? Yes, system; you all looked at that as part of the web server. It's almost like that was actually relevant to this class. So how does system actually work? What does it do, at a high level? Yeah, so it does a fork and exec on whatever you give it, but how does it know what to run? Is it just a fork and then an exec? What about standard out, does it get saved?
I don't think system itself actually captures standard out at all, unfortunately, which is why a lot of you used different function calls for that. I don't know what the system call does in Python, or in PHP. Would it look up the command in the string in the PATH environment variable? Yeah, it has to look that up in the PATH. Is it just used for executing commands? Can you just run /bin/ls? Can you run executables? Can you pass arguments to those executables? Can you redirect the output of those executables to a file? Is that normal program functionality? Is that just forking and exec-ing a program? Can you pipe input between programs? So it's the equivalent of being on the command line? On the command line, exactly. It's just as if you were in front of a terminal typing the command.
So there are actually a lot more steps between the fork and the exec, because we have to parse this string to determine what commands to execute. There are so many things Bash looks for. Are there any file redirections? Are there any pipes? Are there any background processes? Are there any ampersands or ORs to execute commands conditionally? Are there any semicolons for multiple commands at once? This is all shell functionality. So what system actually does is exactly the same as executing /bin/sh -c with your string. The -c tells sh that you're giving it a command to execute, and /bin/sh does all the rest: it parses the string as if you had typed it on the command line. Same thing with popen. Some of you used popen. popen does the same thing: it forks and execs a shell.
So what does the language for this input string look like? What characters are important? Dash isn't really important to the parsing; it's important to the program itself. The program decides how it wants to read its arguments.
Spaces are important. What do spaces do? They tokenize the input. They mark the difference between the executable we're trying to execute and the arguments to that executable. What else is important? Which slash? The escaping slash, the backslash? The escaping slash. What else? The number of arguments is not even part of the string; the shell has to parse the string and figure out how many arguments there are based on the spacing.
What else? What was that? Quotes, double quotes or single quotes. This is how you pass an argument that contains spaces to a program. You can have any spaces you want inside the double quotes, and if you want to pass in a double quote itself you have to escape it. What was that? Dot. What does the dot do? As far as the shell's parsing goes, dot does nothing special, so the dot's not important at all. You put your hand up. Do you want to answer anyway? Yeah, the ampersand symbol. The ampersand has two different uses. A single ampersand means run the preceding command in the background. Two ampersands next to each other mean conditionally execute the next command if the previous command returned true, which means its return value was zero.
What else? Semicolon. Semicolon separates commands. Semicolon says everything after this is no longer an argument; it's a new command that I want you to execute after the previous command. What else? Pipe. Pipe also has two different uses. What is a single pipe? Yeah, take the output of the program on the left and make it be what? We can actually talk about this incredibly precisely, which is how you should be thinking. The pipe means: take the standard output of the program on the left and make it the standard input of the program on the right.
And the operating system creates a buffer between the two, into which the writer can write up to a fixed amount. After that it basically blocks: if the writer tries to write more, it sleeps and waits until the other process reads from the buffer. As the reader reads from the buffer, that frees up space for the writer to write more. It's literally just about setting up the file descriptors. Bash asks the operating system for this buffer between the two programs and wires it up through their file descriptors.
So we have bar for piping output, and we have double bar for OR: execute this command, and if that was false, execute the command on the right. Arrow, what's an arrow? Angle brackets, yes. I think an arrow has to have the dash first, preferably two dashes, otherwise that's weird. So the angle brackets are for redirection, and they actually do a lot more things. A single right angle bracket redirects standard output to a file. What does Bash do when it sees that? It says: good, create this file, now I have a file descriptor, make that file descriptor the standard output of the program, and go. That's all that happens. If you're redirecting input, it opens the file and sets the program's standard input to be that file. That's why programs only ever have to read from standard input and write to standard output, and we still get this beautiful redirection and piping functionality. Bash is changing what standard input means for that program, but the program just thinks it's reading from standard input.
So we have redirections. This is also why you can do, and I'm doing this off the top of my head, the two-and-angle-bracket redirect. That redirects file descriptor two. I think you write 2>&1: two, then the angle bracket, then ampersand one.
Which redirects standard error to standard output, and that's how you can capture both in one stream. All kinds of complicated stuff. What else? There's still other functionality we haven't touched on. What is that? Percent? Question mark. What does question mark do? I'm asking you because I don't know. I shouldn't say that. What are you imagining it does, though, in the shell? Is that shell functionality? I don't know either. In regular expressions it's a quantifier, so grep may use a question mark. Right, but I think that's an argument to grep; that's grep-specific functionality. I don't know whether it's shell functionality. It might be, because you do have wildcards. The asterisk is wildcard functionality. That's actually shell functionality that looks for all the matching files in the current directory and substitutes their names in.
What else? Dollar sign. Incredibly important. What are all the things you can do with a dollar sign? Like $HOME? Yeah, HOME. So environment variables. The dollar sign is a way to access environment variables. $HOME means: at this point, before you execute the program, replace this with the current value of HOME in the environment. Is that the only thing dollar sign is used for? I can't remember. I usually don't use that to set it. I don't know if that's a standard thing or not. There are also loop variables, which are another use. Right, okay, so loops, like a for or a while. Do you use the dollar sign on that? Actually, I don't know. I guess, okay, you can define a variable that way in the loop. Actually, we can go another direction: dollar sign parenthesis, which executes the command inside the parentheses, takes its output, and puts it right there in the command line. Yeah, we'll do this. Let's see how this connects. Cool.
So we want a simple example, even though we basically already did one. Here we're building up a cat command on a log path from argv[1].
So one thing is that this is a setuid binary. What's one thing it is vulnerable to? The length of the argument copied into the string? Maybe, but we haven't talked about that yet. Any file. Right, from here we can read any file, so it's already vulnerable to a dot-dot attack. But beyond outputting any file we want, we can do just what we've been talking about: use a semicolon to run anything we want. So this is not a small thing, because you essentially get full access to the system as if you were this user. These bugs have been around a long time and they still remain today.
Like I said, these show up in several other contexts. One of those is the Shellshock vulnerability. Does anyone remember hearing about this? Too long ago? In 2014 a huge bug was found in Bash that had been there, I want to say, 20 or 25 years. This is kind of a combination of things, but the idea is similar to what we've been talking about. Bash instances inherit the environment from whoever forked them, so a Bash instance inherits all the environment variables. And there was this functionality of Bash where, if you wanted to pass a function from one instance of Bash to a child instance, you would create a special environment variable that looked a certain way, and then Bash would execute it in order to get that function definition. So Bash, literally every time it started up, would look through all the environment variables, see whether any were defined this way, and take that code and basically execute it. So it looked like this: if you set an environment variable whose value started with parentheses followed by a function definition, it would be executed by the interpreter in order to create the function. On the surface this doesn't seem to be a problem.
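The widely circulated check for this bug (CVE-2014-6271) put a function definition in an environment variable and appended an extra command after it. Below is a minimal Python sketch of that probe; the variable name `x`, the marker strings, and the helper name are illustrative assumptions. On a patched Bash only the requested command runs. On a vulnerable Bash the appended command would also have run while Bash imported its environment at startup.

```python
import shutil
import subprocess


def shellshock_probe():
    """Run `bash -c 'echo test'` with a crafted environment variable.

    Returns the output lines, or None if bash is not installed.
    """
    bash = shutil.which("bash")
    if bash is None:
        return None
    env = {
        "PATH": "/usr/bin:/bin",
        # Function-definition syntax followed by a trailing command.
        "x": "() { :;}; echo vulnerable",
    }
    result = subprocess.run(
        [bash, "-c", "echo test"], env=env, capture_output=True, text=True
    )
    return result.stdout.splitlines()
```

If the returned lines contain `vulnerable` in addition to `test`, the installed Bash executed code taken from an environment variable's value.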
You can change your own environment and inject environment variables, but every instance of Bash you run will be executing as you, the user. Bash is not a setuid program, because that would be horrible. So on the face of it this seems like a simple little problem. But it actually causes two problems.
One is that you can pass environment variables when you ssh. You've actually seen this as part of the Bandit challenges, the ones integrated with WeChall: you can specify in your ssh config which environment variables to pass to your remote shell. And if you've ever wondered how you can clone a repository from GitHub over ssh with your ssh key without getting a shell account on GitHub's servers, the way they do it is that in the authorized keys file you can specify that certain users can only run certain commands, so you can create incredibly limited shells. As another instance, if you wanted one machine to be able to access another machine without a password to do backups, you can set it up so that any time that program sshes to that system, it only runs the backup command and nothing else. But with this bug, if I can pass arbitrary environment variables to a remote server, I can get it to execute any commands I want, which essentially destroys that restriction. So it completely hosed these limited-access ssh accounts. You could do things like this to get the /etc/shadow password file.
A really incredible problem here was web applications, and specifically CGI web applications. We haven't talked about web applications yet, but at a high level CGI is a technology that lets you write a web application in any language; all you do is create an executable file. So you can have a CGI web application that's C code.
But as part of this there's a defined protocol for how parameters are passed from the user's web request to your CGI program, and it happens to use environment variables. Certain request parameters are set as environment variables, so remotely, just by making one single web request, you could get arbitrary code to execute as the user running that web application. So you can execute arbitrary code through a web request using this vulnerability. This is why it was such a huge issue, and there was a really big push to roll out patches and fix servers. It was a huge deal because this was such a widespread problem in so many different scenarios.
So how do you fix this? Sanitize the input. Only accept certain commands. If you're expecting a command, yes, you can do that. What if you're expecting file names? Yeah, we talked about that: whitelist versus blacklist. I'd say this is a really good demonstration that blacklists are very hard, because there's not a single character you can eliminate to fix this. There's a whole host of characters. So one way would be to require that the user input, before you ever add it to the command, is alphanumeric and nothing else. Another way is to use sanitization functions. It depends on the specifics. Fundamentally, yes, there are functions you can call to properly escape, but you have to be careful even with those, because usually you have to place the result inside double quotes. So even with sanitization, it can look like it's done correctly but actually be incorrect and cause problems. It depends on the assumptions of the sanitization routine.
The other way is to never use these functions: never use system and popen. If you use the exec functions, such as execve, you pass argv vectors directly to the operating system.
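The difference between these options can be seen in a small Python sketch, which is an illustration under stated assumptions rather than the lecture's own example. The hostile input string and the three helper names are hypothetical; `subprocess.run` with a list stands in for an exec-style call that takes an argv vector, and `shlex.quote` stands in for an escaping function.

```python
import shlex
import subprocess

# Attacker-controlled "file name" that smuggles in a second command.
USER_INPUT = "/dev/null; echo INJECTED"


def via_shell(name):
    """Like system(): the whole string is handed to /bin/sh -c and parsed."""
    return subprocess.run(
        ["/bin/sh", "-c", "cat " + name], capture_output=True, text=True
    )


def via_shell_quoted(name):
    """Still goes through the shell, but the name is escaped as one word."""
    return subprocess.run(
        ["/bin/sh", "-c", "cat " + shlex.quote(name)],
        capture_output=True,
        text=True,
    )


def via_argv(name):
    """Like execve(): the name is a single argv entry and is never parsed."""
    return subprocess.run(["cat", name], capture_output=True, text=True)
```

In the first helper the shell sees a semicolon and runs `echo INJECTED` as a second command. In the other two, `cat` receives the entire string as one file name, fails to find such a file, and nothing else runs.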
So there, there's absolutely no parsing involved, and that's the key. The problem is that /bin/sh is parsing: the shell parses the strings that are passed in. So you basically throw that out, but then you don't get any of the cool stuff like redirection or piping in your commands. So you have to be very careful with this. It may show up in a lot of places. Any questions on this before we continue? Does exec only execute one single command? Yes, only one command. You're essentially giving up flexibility by making it much more rigid, but you get added security, because you know only this program will execute. Even if I pass argv[1] along as an argument and the user can put whatever they want in it, that argv[1] is passed directly to the program I execute. There's no parsing involved there.
I forget what I did on Monday. Now we come to overflows. This is super important, because we're getting into the class of overflow vulnerabilities.