We'll get started. One unfortunate thing is I left my microphone for the computer somewhere else, so speak up when you ask questions, because we're using the microphone on the top of the laptop, which is a really bad microphone. So it's good that you're all here today, because it's going to be really poor sound quality for everyone else online. So, you know, you do what you can.

All right, I'll take, like, three minutes' worth of questions. Project three, I mean assignment two? The class is confused? Both parts? I don't know. I've got to say, I know of some people, 20 people, who have got part two. I don't know how many people have 100%, but it's probably a good number, I think. Anything else? Let's get started.

Setuid: why do we care? Yeah, our end goal is to get root privileges, but it doesn't necessarily have to be root, right? Setuid means that a program is going to be executing with somebody else's permissions, which is really awesome, because then we can use that. If those permissions are different from ours, like another user's, or if they're root's, the superuser's permissions, then that's really great as well.

So before we start in on the attacks, I wanted to take a little bit of time to talk about... Yeah? But if you do, you should let me know, because that would be an unintended vulnerability. No, I don't think so. Maybe some hackers in here are running campaigns of misinformation, deliberately trying to trick you. Yes? In part one you're running it as root. They're very different; they're not even being run on the same system. So yeah, you don't need root for part two. You shouldn't. If you do end up getting root, you can tell me. That's unintended.

So, when we compile a program, what's the file that's on the system?
We saw on Friday the change shell program, chsh, or passwd. What kind of file formats are those? What kind of programs are they? Is that a Python file that's doing that? What is it? It's an executable binary. So what kind of format is it in? Yeah, so it's an ELF binary.

So if we want to try to find some bugs or some vulnerabilities in it, because we know, hey, that's a privileged program, when I execute it it's running as root, what do I do? Do I just look at the ones and zeros of the file? Maybe you could do that. So I kind of heard it over here: what does disassembly mean? Does that mean, like, when you get some Ikea furniture and move it out of your house, you have to disassemble it, and then when you put it back together you have that one piece that doesn't go anywhere? Any Ikea experts? So what does disassembly mean? Say it again? Yeah, so taking that binary, those ones and zeros in that ELF file, and converting it back into x86 assembly code.

So why is that useful? Is it useful? It's a lot easier to read assembly code than it is to read the raw binary, right? But can we then take that back up to C code? Not always, and yeah, it's a lot more difficult, because there's no guarantee that this is actually a C or C++ program, right?

So with disassembly, the idea is you want to take that binary program and identify the x86 instructions of that program. Is it easy or is it difficult, and why? Opcodes may not match up because there's a lot of different hardware? Maybe. I mean, it definitely could be that the binary you want to analyze is for some MIPS machine you've never heard of. But assuming it is for a machine that you know, then what are some properties you can leverage? Yeah, you know all the registers, you know the layout of the machine, you know the mapping between the assembly instructions and their raw hex values, right? So then is it just trivially easy? So where do you start?
So how do we know where to start? Where do we start in that binary? The binary is just a whole bunch of ones and zeros. How do we know which ones and zeros to start with? Yeah, so in the ELF file, right: not even actually the main function of the program, but the ELF file format defines the entry point of the program. So we need to be able to look at that ELF file, identify the entry point of the program, and start disassembling from there.

So from there, is it trivial? Can we just go through and crank out opcodes and assembly instructions? Optimization flags? So there could actually be things that we're missing when they're compiled away, which we'll never be able to see. It could be obfuscated, right, which is kind of a similar thing.

So what's the big difference between x86 and ARM? What are some of the differences? So what does that mean, not fixed-length instructions? How big are the instructions on ARM? Assuming 32-bit, every ARM instruction is 32 bits. So when you're disassembling that, it's kind of easy in some sense: you know, okay, every 32 bits there's going to be a new instruction. But x86 instructions can vary all the way from one byte up to 15 bytes. And so, assuming you properly decode the first byte, then you can be good.

And so some of these disassemblers work just by linearly parsing the program and saying, okay, start at the main function and just try to decode everything into x86 instructions. So what types of control flow constructs does assembly like this have? We're talking about jumps; what kinds of jumps do we have? Unconditional and conditional, right: some jumps that we take all the time, and some that we take some of the time. What else? That's another fixed, usually constant, jump.
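The point about the ELF format defining the entry point can be made concrete with a short sketch. This is not a tool from the lecture, only an illustration in Python; the function name `elf_entry_point` and the hand-built header are made up for the example. It relies on the standard ELF header layout, where the entry address (`e_entry`) sits at byte offset 24 and is 4 bytes wide in 32-bit files and 8 bytes wide in 64-bit files.

```python
import struct

def elf_entry_point(header: bytes) -> int:
    """Return e_entry, the address where execution (and disassembly) starts."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little-endian
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4).
    fmt = endian + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, header, 24)
    return entry

# Hand-built 64-bit little-endian header fragment, for illustration only.
fake_header = (
    b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    + struct.pack("<HHIQ", 2, 0x3E, 1, 0x401000)
)
print(hex(elf_entry_point(fake_header)))
```

On a real binary you would pass in the first 64 bytes of the file; `readelf -h` reports the same value as "Entry point address".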
Yeah, yeah. Okay, that's crazy. Luckily, on x86 we can't directly write to the instruction pointer. So when we do a call or a jump, most times we specify the address, right? We say, okay, jump five bytes ahead or six bytes ahead. So that would be a fixed, or direct, control flow: hey, go to this place; or maybe it's conditional, go to this place sometimes.

Can we jump to registers, to the value of whatever's in a register? We actually can. With indirect jumps we can say, okay, jump to whatever's in EAX. So then can we know statically where that jump is going to go? Possibly, maybe, but we may have to do some sophisticated analysis. It could be that that value is computed at runtime based on hashing things and all kinds of crazy stuff. So this is actually one of the things that makes even disassembly difficult, because we may not know all the code paths through the program.

And so some of the tools, the very simple, stupid tools, just do linear disassembly: just start disassembling bytes. Some are more sophisticated in that they look at the jumps: they will analyze one function and then go to each of the jump targets and analyze those functions. So they do this kind of recursive attempt to follow the control flow of the program.

And the one I actually use a good amount of the time is just the incredibly simple objdump. So let me turn on this VM. objdump is the GNU disassembler. It's an incredibly simple linear disassembler. So if I run objdump -D a.out, the -D means disassemble every section in this file, so it's going to try to disassemble the whole thing. And I'm going to pipe this through less, because it's going to be a lot of stuff. I can see here that some of these are actually bad, not valid instructions, because it's just trying to interpret bytes as x86 instructions. If I search for main, there should be a main here. So this is actually... let's see, I'm going to do -m32 for 32 bits.
Oh, I may not have the libraries. Let's see. This is the problem of doing demos live. We'll go back; we won't worry that it's not exactly the architecture we're talking about.

Okay, big difference between 64-bit and 32-bit: the R. I actually don't know what that stands for, but I'm going to say it's a really large register. That's 64 bits now instead of 32. But we can see here, from this main function, it's decoded okay: push the base pointer, move the stack pointer into the base pointer, push r13, push r12, push rbx. So we can see that for this main, the disassembler has been able to identify the instructions that are executed at runtime. And so we have calls, and we have compares, and we have jumps. It's going to jump to this location, so if I search for 40193b, hopefully I will find that. So this is that label that it's going to be jumping to from here to do all this stuff.

Okay, so this is kind of the most... it works, okay. So there are a lot of tools, so where to look? Radare is a whole suite of program analysis tools. It does reversing, some vulnerability analysis stuff, it does disassembly of binaries, some forensic stuff. The really cool thing is it has bindings to Python for a lot of these features, so you can use it as a library in order to automatically disassemble a program, debug a program, do all kinds of cool stuff. It's also free, which is also a cool feature when using these tools.

Another, like the most popular tool, is IDA Pro. So this is the state-of-the-art tool for reverse engineering and disassembly. It has an incredibly sophisticated disassembler. It's got a nice graph view, so you can see how the control flow goes throughout all those pieces. It can do disassembly for just about every architecture you can throw at it; it will be able to disassemble it. It even supports, well, with an additional plug-in...
It will even support decompilation. So what's decompilation? Yeah, so not exactly the original source code, because it can't guarantee that, but it will output a C representation of that binary. How close it is depends on a lot of things. You can integrate gdb or other debuggers with IDA. You can also script IDA.

The downside is it is very expensive. Anyone want to guess how much a license for IDA costs? It's like fifteen hundred, I think, for just a license, and the decompiler is like two, three, four thousand dollars per architecture: so x86, ARM, 64-bit. Yeah. The nice thing is there is a limited version that is available for free. So if you want to download it and play around with it, there's an older version that's free on their website. And also, when you start using it, the user interface is just terrible. I just have to tell you that off the bat. But it's one of those tools where it is the best in its class and people use it despite the interface being terrible. So there's a whole book, The IDA Pro Book, that was written by somebody else, and it's about how to actually use IDA.

There's other, newer stuff. There's a tool called Hopper that is a disassembler. It includes a decompiler; I don't know exactly how good it is yet. It's a commercial product, but you can use it for free and it's actually not very expensive. I mean, compared to IDA, everything's cheap compared to IDA. But you can get very far with objdump and your brain. So those are nice things.

Okay, any questions on this before we get on to our attacks?

Okay, so we're looking at a few different attacks against Linux and Unix systems, based on all the knowledge we've learned about binaries and the systems. We're going to talk about all kinds of attacks. So what would be remote attacks against a network service? What was that?
Yeah. So why would remote attacks be useful or interesting? How much access do you have? Let's say it's a system like, let's think of maybe assignment two, part two. What access do you have on that system? Does anybody here have a user account on there? Can you SSH into that machine and run arbitrary commands? No, right? But it's offering you a network service. And so if you're able to somehow trick that system into executing arbitrary code that you decide, now you essentially are on that system, and then you can leverage maybe other attacks to gain other visibility on the system.

Some of the stuff we'll talk about can be applied even at the operating system level. So why would remote attacks against the operating system be useful? Why not just attack network services and be done with it? Multiple hosts, in what sense? Yeah, so with the operating system, there are multiple applications running in that operating system. If I just remotely exploit one application, well, then I have the permissions of that application; I can do whatever it can do. If I have the operating system, now I can mess with any application on that system, right?

Other kinds of things: even remote attacks against a browser. The website that you visit may load some advertisement, which loads some unintentionally malicious, or maybe intentionally malicious, code, which then attacks your browser and then gets on your system. So this is another class of attacks that's able to go from remote to you.

They're incredibly rare nowadays, but I believe there were vulnerabilities in, like, the TCP/IP stack. So you take a buffer overflow in there.
You send data, and now you're executing arbitrary code in the OS, in the kernel space of that operating system, so you completely own that operating system. USB? There you still have physical access, so here I'm more interested in Internet-accessible stuff. But yeah, even local printing things, like printing services: anything the operating system does network-wise oftentimes can be exploited this way. They're very rare, because it means you can get to the OS level from completely outside of the system. So it's very rare that you find these types of things nowadays.

So yeah, say we're local on the system. We want to try to exploit setuid applications. We talked about that, because we want to try to expand our privileges through those applications. And we may even want to try to do local attacks against the operating system, right? So what are the differences, maybe, between remote attacks against the operating system and local attacks against the operating system? Remote attacks are more difficult. Why? Yeah, so think about the attack surface. If you think about all of the operating system code where there could possibly be bugs or vulnerabilities, what the operating system exposes on the network, the network code, is just a small subset of that. But once you're on the system locally, you can make system calls. You can interact with the operating system in all kinds of ways, so the attack surface is a lot broader there. So you're more likely to be able to find a vulnerability.

Okay, so on Unix, on Linux, which is what we're looking at, kind of as we said, almost all of the local vulnerabilities exploit setuid root programs. Because this way you don't really worry about the Linux code itself being secure or whatever. It's another application that is necessary for the administrator.
That's setuid. And the other, about one... these are, obviously, don't quote me on these numbers, these are approximate. The other tiny fraction of vulnerabilities actually exploit the kernel itself.

Okay, so we want to exploit setuid applications. What can we do if we're local on a machine to exploit those applications? What can we control? When we call functions, we control the parameters. What else? Standard input, yeah, any input to the program. The environment, yeah, so it may read from environment variables. We saw that the environment variables are actually in the process space of the executing application, so it can very well be that we can get input into the application that way.

So the inputs we have: the command line, any arguments you pass there; the environment variables during execution. We can maybe provide input through files, by specifying certain files. If it's dynamically loading libraries and we're able to influence that process, we may be able to get our own code in there. And potentially any socket input, sockets being TCP/IP sockets or maybe Unix sockets: maybe we're able to write into a Unix socket to get into the application. And so this is the way you need to be approaching how to break these local applications: okay, what is this program doing? What inputs is it really reading from, and how can I influence that?

So how does it interact with the environment? How do applications interact with the environment in ways that we could potentially influence? Right, so we are able to run multiple of these processes at the same time; maybe there's a race condition or something there. We could maybe try to start a process, kill it, start another process, and maybe it leaves behind some temporary files. What else?
Dependencies, in one sense. So usually, well, okay, yes. Yeah, so maybe through the environment we can influence what things the trusted program runs. We can't actually change the code of the trusted program itself. I think the operating system will actually, if we're able to overwrite a binary, drop the setuid bit so it no longer runs as root. I believe that's how that works.

So what kind of environment interactions do processes do? What kind of things do your applications do? Read files, open files, right? Can we mess with the file system? Yeah, we can create files, access files. Can we trick it into maybe accessing our file instead of a trusted file, something that we have permissions to do? With processes, maybe we can send signals. The process forks other things; maybe we can influence it that way.

And so really what this comes down to is trying to define, okay, what is this application doing? We had that abstract model of an application using the operating system, but oftentimes just defining, okay, what is the application and what does it do, from a development perspective, is really difficult, because it can do all these things. It can access all these files, and it could be that by changing one file that it's able to access, we're able to trick it into executing arbitrary code or into doing something on our behalf.

So we're going to look at a lot of different types of attacks here. We're going to look at things where we can influence file accesses; we'll get into the details of all of these. We're going to look at command injection and memory corruption attacks. These are all still incredibly prevalent, even in completely different domains.
So for instance, command injection is still incredibly popular on the web. And memory corruption: even though we've known about these possibilities, and the Morris worm was almost, what, 30 years ago, and it relied on memory corruption vulnerabilities, that's still an incredibly active area of research in terms of attacks and defenses. So people say, oh great, we came up with this new way to prevent memory corruption vulnerabilities, and then other people will break it and find other ways around it. So it's very much a cat-and-mouse game.

Okay, so file access attacks. When you run a command, when you say a.out or chsh, how does bash know which program you want to execute? Yeah, right. So in general there are actually kind of two ways to think about this. One way is when we execute a program. I know for some of you, for assignment one, your programming language, like Java, didn't have a system() call, so how did you figure out how to execute things? So one way, we'll see, is when we want to execute a program, we check an environment variable called PATH to look for which program to execute. PATH says, hey, this is the list of places to look for programs to execute.

What about when we open files for reading and writing? If I just say open foo.txt, where's that? What file are we opening? The one in the current working directory. Who controls the current working directory of a program? The person who executes it, right? So maybe by running the program in a certain folder, we could change what files it actually opens and writes. And so file access attacks, very broadly, are all about doing this: can you trick an application into somehow violating its security just by messing with what it's opening, what files
it's opening, what things it's reading, all that kind of stuff.

Okay, so pretty much the most basic of these attacks, which actually still works on a surprising number of web applications, is what's known as the dot-dot attack. So what does dot-dot refer to? Yeah, so in Unix, when you're talking about folders and hierarchies, what does dot refer to? The current directory. And dot-dot is what? The parent, right.

So when you want to open a file, let's say from the user, how do you do it? Usually when you write applications you open files; how do you do it? Yeah, you get the file path from the user, maybe concatenating some strings together, right? And the user in this case is the attacker. We've got to be thinking that every time users give input to our application, this input can potentially be malicious. This is the way we have to be thinking.

And so the idea is the application is building some path to a file by concatenating values from the user. For instance, in the C example, we have some initial path and then the user file. So we're concatenating these values together, and let's say it opens this path with read-write. So am I restricted by this? One of the things we need to think about is, from this code, what does the application intend for the user to do? If you were coding this application, why would you write it like this? What files do you want this to open? Does everybody agree that, whatever this initial path is, you only want files to be opened in the initial path? Do you agree? That's kind of implicit in the security of this application. Say the initial path is, right,
/etc/program1, you know, and then that's where the users store their log files or something, and so this is a way to access their log files. Yeah, we're not talking about that right now. We want to think about what the intention of the programmer was in writing that, right? We want to try to get to the intent behind this. It could be a thousand different things, right, but a reasonable assumption would be: okay, whatever the initial path is, you should probably only open files there, in the initial path.

But what happens exactly if the user inputs something with dot-dot characters in it? Then what's going to happen? What can the user make this program open? Something in the parent directory. And then what about that directory's parent? And then could they specify literally any file on the system? Yeah, right. With the dot-dot you can go dot-dot, dot-dot, dot-dot, all the way to the root. You can actually keep going; you don't even need to know exactly how deep you are: ../../../../ and so on. Then what if you want to open /etc/passwd, or /etc/shadow, which is a root file? You want to open /home/adamd/.ssh/authorized_keys, or maybe id_rsa for my private key.

So yeah, this is exactly what the dot-dot attack is. The attacker provides file names with parent directories in them, and this allows them to circumvent the intended security policy of this application. The other name for it is the directory traversal attack. And this works on web applications too. On web applications, if you're writing to a file and you use this pattern, you can put dot-dots in to get the application to write to some other file, which maybe you can leverage for other purposes.

So okay, what do we do? How do we fix this? What are some strategies for how to fix this?
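To see the dot-dot problem without touching any real files, here is a small sketch in Python rather than the C from the slide. The base directory `/var/app/logs/` and the function name `build_path` are made up for the illustration; `posixpath.normpath` is only used to show where the concatenated path ends up once the `..` components are collapsed.

```python
import posixpath

INITIAL_PATH = "/var/app/logs/"   # hypothetical base directory

def build_path(user_file: str) -> str:
    # Naive concatenation of a trusted prefix and untrusted input.
    return INITIAL_PATH + user_file

requested = build_path("../../../etc/passwd")
print(requested)                       # /var/app/logs/../../../etc/passwd
print(posixpath.normpath(requested))   # /etc/passwd

# Extra "../" components at the root are harmless to the attacker,
# so they do not need to know how deep the base directory is.
deep = build_path("../" * 10 + "etc/passwd")
print(posixpath.normpath(deep))        # /etc/passwd
```

The prefix is still present in the string that gets opened, which is why simply checking that the path starts with the initial path would not help here.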
If it ever contains a dot-dot, ignore it. Is that enough? What do you do, do you actually just ignore it? Accept only absolute paths, in what sense? The whole path? But here I've given the whole path, right? I've given this whole initial path.

Yeah, so there's actually a couple of issues here. One way to think about it is, okay, disallow any dot-dots in the user file. But if you allow slashes, now they can open a file not just in the initial path. Let's say they can't go up with dot-dot; with slashes they can go into subdirectories, which could be just as bad, depending on our application. But our security policy is, hey, we really want to just restrict them to this initial path, right? Only in this directory do I want them to be opening and writing files.

So yeah, there are two ways to think about this. One way is, okay, we blacklist: we search for bad things in the user file. Another way is, okay, we do this concatenation and then we do a security check that says, okay, this had better be in this folder, and if it's not in this folder, then we stop and we don't do anything. Yes? Okay. Yeah, so that's a very good point, I was trying to get to that too. That's another reason why just blacklisting things is dangerous, and taking a whitelist approach is much better, where you only accept known good. So I would do something like, okay, I would check the user file: it's got to be just alphanumeric here, a through z, zero through nine. If it's anything else, chuck it; I won't allow you to create a file. Because yeah, okay, what if there's a space, like dot, space, dot? And it could be that, you know, it depends on what's happening here between path and the open; maybe other things are being applied to it. What was that? What if we do want a dot, for an extension? Maybe, but maybe I don't want to allow them to do that, right? Because, yeah, that becomes trickier then.
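The whitelist idea, accepting only names made of letters and digits and rejecting everything else, can be sketched in a few lines of Python. The function name `is_safe_name` is made up for the illustration, and the exact character set is an assumption based on the "a through z, zero through nine" rule mentioned above (uppercase letters are allowed here as well).

```python
import re

# Known-good characters only: ASCII letters and digits, at least one of them.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9]+")

def is_safe_name(user_file: str) -> bool:
    # fullmatch requires the whole string to match, not just a prefix.
    return ALLOWED_NAME.fullmatch(user_file) is not None

for name in ["report1", "../etc/passwd", "sub/dir", ". .", "notes.txt", ""]:
    print(repr(name), is_safe_name(name))
```

Note that this deliberately rejects `notes.txt` too. Allowing a dot for extensions means deciding exactly which dotted names are acceptable, which is the trickier case raised above.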
What do you allow, what do you not allow? Yeah, so we can use other functions to try to have the operating system help us determine where the file is. Yes? Does it... maybe it maps all dot characters; maybe there are other dot Unicode characters in other languages. But isn't it just the ASCII dot that's mapped by the operating system to the parent directory?

Another thing I just thought of: what about symbolic links in that folder that will link you out of that folder? Maybe there's a symbolic link from inside the initial path to the root, or to somewhere else, so now maybe you can get outside through that. All kinds of issues. So the basic idea is you need to either heavily sanitize that input or use a whitelist to guarantee that the user cannot create a file that violates this security policy.

There's actually another cool technique, and this is actually what I use in the submission system; all of that stuff is done this way. chroot is a syscall, yeah, I believe it's a syscall, that makes an application execute with a new root directory, a new root file system.
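Going back to the idea of letting the operating system tell us where a file really is, and to the symbolic link problem: one way to sketch the "concatenate, then check it is still in the folder" approach in Python is with `os.path.realpath`, which collapses `..` and follows symbolic links. The function name `is_inside` and the temporary directory layout are made up for the illustration, and a POSIX system is assumed for `os.symlink`.

```python
import os
import tempfile

def is_inside(base_dir, user_file):
    """True if base_dir/user_file resolves to a location under base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." components and follows symbolic links.
    target = os.path.realpath(os.path.join(base, user_file))
    return os.path.commonpath([base, target]) == base

with tempfile.TemporaryDirectory() as tmp:
    initial = os.path.join(tmp, "initial")
    os.mkdir(initial)
    os.symlink(tmp, os.path.join(initial, "escape"))   # link pointing out of the folder
    print(is_inside(initial, "log.txt"))         # True
    print(is_inside(initial, "../secret"))       # False
    print(is_inside(initial, "escape/secret"))   # False
```

This is only a check. If an attacker can change the directory contents between the check and the later open, the answer can be stale, so it is not a complete defense on its own.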
So basically it says, hey, change root. So you do a chroot to the initial path, and now, to your application, when it tries to access /, all it accesses is that initial path. Literally no matter how many ../ it does, it can never go outside, because to your application that is /.

Actually, yeah, chroot was kind of the first of these. They are also called jails in BSD. It's kind of gone full circle, where you first had chroots, and then you had virtualization, so you're like, oh great, you have all this isolation with virtualization, and now with containers, like Docker and all that stuff, we're kind of coming back to: oh, run the processes on the same machine, but have them be separated somehow.

So even when you're running a root application, right, for assignment two part one on the submission system, you're not actually executing in the submission system's directory. You're executing in your own little subdirectory. So you shouldn't be able to get out, but there's always some stuff that could happen, especially when you're running as root. And also I'm running you as root, but with only the capability to connect to a raw socket. That was the other little trick there. So there are other root capabilities that I took away, I think, so there should be fewer things you can do.

And chroot is also useful for something like a web server. We know a web server should only touch a certain directory; in most cases it's under /var. So if we put our web server in a chroot, now we can guarantee that it's not going to affect anything else on the file system.

Okay, another type of file attack is PATH and HOME attacks. These are really fun. So PATH, as we saw, is an environment variable, and it determines how the shell, how bash or sh, searches for commands, right? So what's the format of PATH? Yeah, colon-separated values of paths, or directories, right?
So it goes from the left: search first in that first directory for that file name; if you can't find it, look in the next one; if you can't find it, look in the next one; and then if you can't find it anywhere, you say that the file can't be found.

So the idea is, if the application itself, if our setuid application, is invoking commands without specifying the exact path, then maybe we can get it to execute either a different version or something that we control. Specifically, the exec calls execlp and execvp use the shell's PATH, so you can call them with just, say, ls, and they will look in your PATH variable to try to find the ls program. So if we change our PATH, because we control the PATH, we control the environment that this program executes in, we can get it to execute our program and not somebody else's.

So what is the HOME environment variable used for? The home directory. Yeah, so it specifies the current user's home directory. So how do you usually access that when you're doing file accesses or something? The tilde. Yeah, the tilde. And so when the shell sees that, it knows, okay, look up HOME in the environment variables and expand that to determine where to go. So if an application is trying to access something like ~/myfile.txt, the attacker can execute that and change the HOME environment variable to point anywhere, and so it'll create this myfile.txt wherever we want. You know, maybe that's a good thing, maybe it's a bad thing.

Let's stop here for now, because I want to come back and show you an example of PATH and HOME attacks, so we can see what that looks like. So we're back on Wednesday.
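As a rough illustration of the left-to-right PATH search described above (this is not the demo promised for the next lecture), here is a Python sketch that mimics the lookup. The names `resolve` and `make_fake_tool` are made up for the example, the two temporary directories stand in for a trusted system directory and an attacker-controlled one, and a Unix-style `:` separator is assumed.

```python
import os
import tempfile

def resolve(command, path_value):
    """Return the first executable called `command`, searching PATH left to right."""
    for directory in path_value.split(":"):   # Unix uses ':' as the separator
        # An empty entry in PATH means the current directory.
        candidate = os.path.join(directory or ".", command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

def make_fake_tool(directory, name):
    """Create an executable placeholder script (illustration only)."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\necho placeholder\n")
    os.chmod(path, 0o755)
    return path

with tempfile.TemporaryDirectory() as trusted, tempfile.TemporaryDirectory() as attacker:
    make_fake_tool(trusted, "ls")
    make_fake_tool(attacker, "ls")
    print(resolve("ls", trusted))                    # the trusted copy
    print(resolve("ls", attacker + ":" + trusted))   # the attacker's copy wins
```

Whoever controls the front of PATH decides which `ls` a program gets when it asks for the command by bare name, which is why a privileged program that does so is exposed to this attack.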