So we just called the technical people, and we'll get somebody out here to try to fix that, but we'll go with what we've got. The slides are on the website if you want a copy; otherwise I'm recording the screen, so you'll see everything that's on the screen in the video recording.

All right, so where we left off: we looked at how programs get compiled, we looked at how programs get executed, and what the x86 language looks like. Now we need to look at what the operating system actually does when it loads your program into memory to start execution. We've kind of been talking about this for a while, but it's always important to remember that the operating system has to take your program that's on disk, a series of bytes, and turn it into an executing process. It's actually pretty simple and straightforward, although I guess simple is a relative term.

First, it's going to parse the ELF file format that we saw, which specifies exactly which segments and bytes on disk go to which memory locations, and what permissions each of those sections has in memory. What's fun to play with, if you haven't played with this on a Linux system, is /proc. I think they call it a pseudo file system, I don't know the correct term. If you do ls /proc, it'll show you a bunch of information about the applications running on the system. Every process that's running on the operating system has a unique ID, so if you go to /proc/ and then some process ID, you can see all kinds of fun stuff inside there. One of the really interesting things is maps, which, if you cat that file, will show you the memory mapping of that application. So let's take advantage of the fact that I don't have to do anything to record this. Let's pull this up.
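As a rough illustration of what one line of a maps file contains, here is a minimal Python sketch that splits a line into its address range, permissions, and backing path. The sample line is made up for illustration and is not taken from the demo; `Mapping` and `parse_maps_line` are hypothetical helper names.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # address just past the region
    perms: str   # e.g. "r-xp": read, no write, execute, private
    path: str    # backing file, or "" for anonymous regions


def parse_maps_line(line: str) -> Mapping:
    # Fields: address range, permissions, offset, device, inode, optional path.
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


# Made-up sample line in the same layout as a real maps entry.
SAMPLE = "00400000-0040c000 r-xp 00000000 08:01 131104    /bin/cat"
mapping = parse_maps_line(SAMPLE)
print(hex(mapping.start), hex(mapping.end), mapping.perms, mapping.path)
```

On a Linux machine, the same function could be applied to each line of the maps file of a real process to list its regions.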
So as this is going: is black the worst color here? Cool, so I'll just use the bottom little corner of the screen, which will be great for everyone. All right, IP address, SSH there. All right, let's crunch this down. Great.

Okay, so first ls /proc, and you can see there's a lot of stuff in here. We can do ls /proc/ and some process ID. That's annoying, there are a lot of things in here, so we can cat that one's maps. This shows us, I don't even know what process this is, all of the memory mappings for this process. Wow, this is a big one. Okay, cool thing: there's a nice link, /proc/self. self maps to the process itself, so any application can access its own proc information by going to /proc/self. And so here, because it's the cat program reading it, it's going to be the memory mapping of cat. We can see /bin/cat: we have a readable and executable segment at these memory locations, a readable segment at these memory locations, readable-writable, readable-writable, and then we can see all of the libc libraries that are included inside here. So anyway, just fun stuff to poke around in.

Once it's loaded and those mappings are decided, the instruction pointer is set to the memory address specified in the ELF header, and the CPU starts executing from there.

The memory of an application is roughly like this: for a native x86 32-bit application on a 32-bit operating system, the upper gigabyte is reserved for the kernel. You can look up the reasons why this is. Your program occupies basically the three gigabytes underneath that. On a 64-bit operating system, the kernel is actually not there.
I don't know exactly why, but it's interesting: if you look at pointers that are towards the top of the program's address space on a 32-bit system, they'll start with bffff, and on a 64-bit system that's running a 32-bit application they'll be, I believe, starting with ff, but that will definitely change.

So when a process executes: at the beginning of the class we looked at the application and all the different ways that we can get data into it. Specifically, when we think about a process, what are the ways that data gets into our application? File-based, user input. So how does the process get all of those things? Through the operating system, through system calls, so they don't actually exist when the process is created, right? What else? Some of the memory that the application itself includes. So that's loaded by the operating system, by the ELF loader, which puts that data from the file into memory. What else? You already answered, so somebody else can answer. Some other processes or programs interacting with this program? Yes, but in order to get that, it again needs a system call to actually get it.

What other data gets passed to your program? If you're writing C code, how do you pass data in? On the command line. How do you access that in your program? Yeah, the arguments to the main function, right? First you have argc, the number of arguments, then you have a character pointer pointer that points to an array of character pointers, which are all of the arguments passed to the program. Does anyone know what the third argument is? The third one is the environment pointer, envp, which is similar to argv in that it's a character pointer pointer. Let's look at it. So execve is the operating system's system call that actually executes a new program.
You can think of this as the entry point into an application: execve is the system call that takes a filename, so the filename will be the ELF file that you want to execute. The argv pointer is a character pointer pointer to an array of null-terminated strings that you want to pass in as the argv vector, and then the environment pointer is a pointer to the environment. What is the environment? So like PATH, yeah, we mentioned that. If we run env, it shows us all the environment variables in our environment. It's a series of key-value pairs: we have a key, an equals sign, and the value. So all of this is passed into the application.

When one application wants to exec a new program, it asks the operating system: please execute this. It passes in the name of the ELF binary to execute, the argv parameters, and the environment pointer parameters. Then, when your program executes, you have a main function that has argc, argv and the environment pointer. That's always passed into every application that executes, and this is one of the main ways that data gets passed into the program.

The question is, in terms of memory, how does this actually get passed in? You have to think: the operating system took these bytes from disk to load your program into memory, so it has to choose somewhere in this three-gig space to put your program. But it also needs to include, in the memory of your program, all the argv parameters and all the environment parameters, so that all of that data gets inside your program. The where is pretty arbitrary, but it needs to be somewhere in your application's memory space. And it really just goes top to bottom.
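The three things handed to execve (a filename, an argv list, and an environment of KEY=VALUE strings) can be sketched in Python, whose `os.execve` wraps the same system call. This is only a sketch under assumptions: `build_envp` and `run_program` are hypothetical helper names, and `run_program` forks first so that the calling process is not replaced.

```python
import os


def build_envp(env: dict) -> list:
    # The environment reaches the new program as flat "KEY=VALUE" strings.
    return [f"{key}={value}" for key, value in env.items()]


def run_program(path: str, argv: list, env: dict) -> int:
    """Fork, replace the child with `path` via execve, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execve(path, argv, env)  # does not return on success
        except OSError:
            os._exit(127)               # exec failed inside the child
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


print(build_envp({"PATH": "/usr/bin:/bin", "SHELL": "/bin/bash"}))
```

On a Linux machine, calling `run_program("/usr/bin/env", ["env"], {"GREETING": "hi"})` should start env with an environment containing only that one pair, assuming env lives at that path.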
So the operating system will first put all of the environment strings and then the argv strings, the actual bytes themselves. Then you have the pointers, so the arrays of pointers to those strings. Then finally you have argc. And then we have our stack section, so that's kind of considered the start of your program. What this means for you, the really important thing about this is, let's say your memory, wait, I'll hold that thought. I'll come back to that.

The stack is then going to grow downward. Could someone close that door? Thanks. In this class, I mean obviously you can draw a stack in any direction, but in this class we will always draw a stack with higher addresses at the top, going down. Can you somehow get this screen down? Okay. Can you get the mics working? So in here, our stacks will always go from higher addresses to lower addresses. You have to think about what this means when you're looking at stack operations: when the stack grows, the stack pointer is being decremented, it's moving down.

So let's say I have a stack pointer at some memory address, let's say 8000. What does that mean in terms of a stack as a data structure? What can you infer about the memory locations that are above that and below that? Yes, it represents the top of the stack. So if the stack is growing down, what does that say about the memory addresses that are above the stack pointer? When you add something onto the stack, which direction is it going? Down. So that means that memory locations that are below the stack pointer are free, or garbage, you could consider. Those are all free, and memory addresses that are above the stack pointer are all used memory addresses.

So, for the stack, the operating system will reserve space in the application's memory space, because, remember, it has to allocate the memory segments that it wants the process to actually use.
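The downward-growing stack described above can be modeled with a toy class. This is a sketch, not how a real CPU is implemented: `ToyStack` is a hypothetical name, the 4-byte word size is an assumption for a 32-bit machine, and 8000 is the example address from the discussion.

```python
class ToyStack:
    """Toy model of a stack of 4-byte words that grows toward lower addresses."""

    WORD = 4

    def __init__(self, top: int):
        self.sp = top       # stack pointer: address of the most recent push
        self.memory = {}    # address -> value

    def push(self, value: int) -> None:
        self.sp -= self.WORD            # growing the stack decrements sp
        self.memory[self.sp] = value

    def pop(self) -> int:
        value = self.memory[self.sp]
        self.sp += self.WORD            # the slot just vacated is now garbage
        return value


stack = ToyStack(top=8000)
stack.push(0x41)
stack.push(0x42)
print(stack.sp)     # 7992: two words below where we started
print(stack.pop())  # 66, the last value pushed
print(stack.sp)     # 7996
```

Everything at or above `sp` is in use; everything below it is free space the next push will overwrite.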
So there is a limit. Even though we think of the stack as infinite, we will run out of memory, and that is one of the classic programming problems besides a seg fault: the stack overflow, right? You're going past the stack boundary. As we'll see, if you have a recursive function that's continually allocating things on the stack, you'll eventually overflow that stack.

After that, you have any shared libraries, so that's what we saw in the maps: libc will get mapped somewhere in your application's memory space. And then we have the heap. What's the heap used for? free and malloc. Essentially I think of it as programmer-managed memory, or if you think of C++, any objects you new or delete are all created on the heap. And just like the stack grows downward, the heap essentially grows upward. It makes sense: you have these two memory regions that are going to change dynamically through time, so you should probably put them at opposite ends and make them grow in different directions. Yes? Sure, yes. Let me see what we're talking about. All right, then we have the data segments at the bottom, and then the code, but all of this can vary.

So how are access controls implemented on Linux systems? What's the access control mechanism? Like, how do you know that you can read or execute a file on a Linux machine? Yes. Oh, if you do an ls -l, those are the permissions, the ones you set when you do a chmod or whatever. So what are those permissions? Read, write, execute, and there's like the capital S too. We'll talk about that in a second, yeah. So read, write, and execute are the basic three bits of permission, and then what are the others? We'll talk about setuid in a second. So you have read, write, and execute, but how do I know, is it just that anyone can read, write, or execute it?
Yeah, so there are the three read, write, execute bits for the owner, the person who owns that file. Every file has an owner and a group associated with it, and these are members of the system: the users on the system are all defined in /etc/passwd, P-A-S-S-W-D, and all the groups on the system are in /etc/group. From there, every file has read, write, and execute permissions, in addition to some others, for the owner of that file (can they read it, can they write it, can they execute it), for the group of that file, and then for other, so everyone else on the system.

This is really important for understanding how you actually access a file. If you wanted to create some program that you didn't want to let the whole class execute, and you're all on one server, you would make that file so that only you have execute privileges on it. Or, for instance, you could make it so that nobody else can read your files by setting them to 6-0-0, or however you want to do it, so that only you, the owner of that file, can read it.

What about the differences? Read, write, and execute for files is pretty clear, right? Is it? I mean, that's more of a question. You can read it, you can write it and change the contents of that file, and you can execute it. What about on a directory? Hey, that sounds like something, awesome. So what about on a directory? Yeah, so the idea is you have the same bits, read, write, and execute, but they mean a different thing when applied to a directory. Does it make sense to execute a directory? No, it doesn't make any sense, I don't even know what that would mean, right? So in the console, yay, thank you.
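To make the owner, group, and other triplets concrete, here is a small Python sketch that decodes an octal mode such as 600 and picks the triplet that applies to a given requester. It is a simplified model: `describe` and `allowed` are hypothetical names, the requester is assumed to be identified by a numeric user ID and a set of group IDs, and special cases such as root are ignored.

```python
import stat


def describe(mode: int) -> dict:
    """Split the low nine permission bits into owner, group and other triplets."""
    text = stat.filemode(stat.S_IFREG | (mode & 0o777))  # e.g. "-rw-------"
    return {"owner": text[1:4], "group": text[4:7], "other": text[7:10]}


def allowed(mode: int, file_uid: int, file_gid: int,
            uid: int, gids: set, want: str) -> bool:
    """Simplified check: `want` is 'r', 'w' or 'x'. Ignores root and ACLs."""
    bits = describe(mode)
    if uid == file_uid:
        triplet = bits["owner"]     # owner bits take precedence
    elif file_gid in gids:
        triplet = bits["group"]
    else:
        triplet = bits["other"]
    return want in triplet


print(describe(0o600))
print(allowed(0o600, 1000, 1000, uid=1000, gids={1000}, want="r"))  # owner
print(allowed(0o600, 1000, 1000, uid=1001, gids={1001}, want="r"))  # someone else
```

With mode 600 the owner can read and write, and nobody else can do anything, which matches the "only you can read it" setting mentioned above.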
All right, so in the context of directories, and actually we should probably look it up: I know execute means you can traverse into the directory, so you can access subdirectories inside that directory; read, I believe, means you can list the files in that directory; and write means you can add or delete files in that directory, assuming you have the other permissions to do so. So this is how you can, let's say, give somebody access to a subfolder in your home folder: you can put execute permission on your home folder and give them access to some subdirectory, so they can get to it if they know the path. All right, cool.

So we talked about permissions, we talked about read, write, and execute. The question is, when you execute a program, let's say you're in bash and I type in ls, which is going to try to, I don't know, list some directory: how does the operating system know who is executing this ls? When you're all accessing the system, how do your ls's not get confused, how does it know who is accessing what or executing what? The process ID, you can think of it just as a unique identifier, so by itself it doesn't carry any additional information.

Yeah, so it's inherited from, essentially, whatever process calls execve, right? You've got to think: your shell is, let's say, bash. When you SSH into the machine, SSH creates a bash process for you and sets the standard in and standard out of bash to be essentially your socket. When you type in ls, bash then uses the PATH to figure out that you want /bin/ls, and then it calls execve on /bin/ls with the argv parameters that you passed in and the envp parameters that are also passed in there.
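The PATH lookup that a shell performs before calling execve can be sketched roughly as follows. This is a simplification, not bash's actual implementation: `resolve_command` is a hypothetical name, names containing a slash are not handled, and a made-up set of paths stands in for the file system so the example does not depend on the machine it runs on.

```python
import os


def resolve_command(name: str, path_value: str, is_executable=None):
    """Return the first PATH entry that holds an executable called `name`."""
    if is_executable is None:
        # Default check against the real file system.
        is_executable = lambda p: os.path.isfile(p) and os.access(p, os.X_OK)
    for directory in path_value.split(":"):
        candidate = os.path.join(directory or ".", name)
        if is_executable(candidate):
            return candidate
    return None


# Made-up file system contents for illustration.
fake_fs = {"/bin/ls", "/usr/bin/env"}
print(resolve_command("ls", "/usr/local/bin:/usr/bin:/bin", fake_fs.__contains__))
```

The directories are tried in order, so whichever entry appears first in PATH wins when two directories contain a program with the same name.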
So when that happens, when the operating system creates this new process, it says, okay, this new process essentially inherits the same user ID as the process that created it, right? You can easily check this out. You can run things like, I like htop, can you see that at the bottom? This is showing me everything that's running on my system. You have the PID, the process ID, you have which user owns this process, and a whole bunch of other data and information.

Why this is important is because, when a process wants to, let's say, open a file, or open a directory for reading to list the contents of that directory, who does it have to ask to do that? The operating system, through a syscall, right? It has to eventually call a syscall like open to open a file with certain flags, let's say for reading, and the operating system has to decide: are you authorized to do whatever action you want to do on this file? So how does it determine that?
Your user ID: it looks at your user and group permissions and then identifies whether or not you're allowed. Exactly. But when we say your user ID, we mean the user ID that's associated with that process, right? There's no concept of you. You are actually not doing anything on the system; you are invoking programs to create a process on the system that is acting on your behalf. That's super important, and this is what's checked. So if you try to list out some directory that you don't have access to, the operating system is going to stop you, or if you're trying to cat some file that's sensitive, the operating system will stop you, because it checks: does the user ID that's running this process have access to this file based on the permissions on the file system?

It turns out that it's not really expressive enough. I mentioned earlier today that the list of users on the system is stored in /etc/passwd. So what else is stored there? Maybe all the groups for all the users, or maybe that's in /etc/group, I don't know, or maybe in both. It's a weird thing, so let's actually look. Before you freak out, the password part is a misnomer, as we'll see; it's an old-style name. You can see the format of this file is an incredibly simple, old file format, in that it's records separated by newlines with fields separated by colons. You've got the username, what used to be the password in that second field, and then the user ID, and another ID. I don't actually know, I guess I should stop describing all of these fields, especially when I don't know them. But I do know there are two important fields here. Let's see, where am I? The second-to-last field is the home directory of the user, and the last field is the shell that the user wants. You can use sh, you can use bash, you can use zsh, whatever shell you want.

So if you wanted to create a secure multi-user system, what permissions would you put on the /etc/passwd
file? Let's think about it this way: who should be able to create new accounts on the system? The system admin. I don't want random people to be creating accounts on my system by default. Maybe I want to change that, but that would be a reasonable secure default. So what permissions would you have then on the /etc/passwd file? 7-0-0, if you didn't want anyone else touching it. So who owns it, first? You need an owner and a group. If I own it, that means I'm the admin; then the group, whichever group it is, I'll give them zero permissions so nobody else can do anything; and other, meaning nobody else, gets the last zero.

Okay, so the first thing: when we talk about Linux systems, root is going to essentially be our stand-in for the admin. The root user is user ID 0, and they can essentially do anything they want to the system, so you can think of them as the admin of this system. Now, is it owned by root? Actually, if you run id: I'm not root, I'm just a normal user. You can see I have a lot of group IDs here, but fundamentally I'm not root.

So what would be an argument for having everyone be able to look at the /etc/passwd file? We can see the permissions: it's owned by root, group root; the owner, root, can read and write it, the group root can read it, and everyone else can read it. So why would you want it to be world-readable? A legitimate reason. I mean, you can make up a reason, but it should be legitimate. What was that? When a process opens this file with that permission, right. So why should I, as a normal user, be able to read this? If I was collaborating with someone and they had something pretty cool in their home directory, I'd like to use that to look up their home directory. Yeah, you might want to know where people's home directories are, because Linux is
a multi-user system, so multiple users can work on a system simultaneously. And in this case, for example, with the processes on a system, if you didn't have access to this file you wouldn't actually know who the user is that's running a process, or who the owner of a file is. Yeah, so when you think about it, when the operating system is actually storing, let's say, files on disk, and the owner is root, would you store the string root, like r-o-o-t, as the person who owns this file, or would you want to store an integer, like user ID 0 owns this file? You'd go with integers for sure, and that's what they do, which makes sense, because if somebody changes their username you don't want to go in and update every single file on the system to that new username. So it's actually nice that everyone has access to the mapping between user IDs and usernames, so that with nice things like ls you can actually see who's who on the system. So there are legitimate reasons here.

Now, like you said though, it clearly makes sense that arbitrary users should not be able to just write to this file, because they could create new users. And a fun fact: if you can get write access to the /etc/passwd file and you change your user ID to 0 in that file, you theoretically will then get root privileges. I think I did this last year and I think I got locked out of my machine, so I will not be demonstrating it, but I have done it before, and you can do it on your own virtual machine and play with that. The OS says, oh yeah, Adam, here's your password, great, and gives you user ID 0, which just also happens to be root's user ID, so now it's essentially like you're root on the system. So clearly this is a very security-sensitive file.

But when I want to change my shell from bash to, let's say, zsh, should I have to bug the admin user to change that for me? Does anybody know how you change your shell? There's an environment variable, SHELL, I thought. Yes, but
the problem is that when you log into the system, it has to look up what to execute first, and that's actually how that variable gets set. Yes? There's also an rc file, an .xinitrc, which will set up your windowing system and call your other things; I think you can set the default shell there too. Possibly, but in that case you're getting the shell that's in this file to execute another shell for you. So are you just stuck with bash forever because of some terrible administrator? You want to use zsh: I'm going to install it, and then I'm going to want to use it. So how do I change my shell to zsh? A profile or rc file? I want to use something crazy, I want to use something different, but with all of those things, bash is going to execute first and then it's going to exec and replace bash with whatever you have.

So there's a command called change shell, chsh. Let's see what happens when I do this. To prove there's nothing up my sleeves: as we saw here, my shell had been bash. So I, a normal user, just wrote to and changed this /etc/passwd file. Did I just completely own the security of this operating system? I used the tools it provided to change it.

Yeah, we talked about how when I execute a program, when I type ls here, ls is running as my user. So what's the difference, why can't I just vim /etc/passwd and go down to adamd? Is doing this what got you locked out? Really close. No, it wasn't: I can change my user ID here, but I can't actually write it, because it's read-only, right? So even if I do that: can't open file for writing. There's no possible way to write out this file, because I cannot write to this file. I believe that, but when you changed your user ID, was it in the same file, this file?
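For reference, the colon-separated /etc/passwd record discussed above has seven fields, with the home directory second to last and the login shell last. A minimal Python sketch of parsing one record follows; the sample line and the `PasswdEntry` and `parse_passwd_line` names are made up for illustration.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str   # historically the password; today usually a placeholder
    uid: int        # numeric user ID (0 is root)
    gid: int        # numeric primary group ID
    gecos: str      # free-form comment, often the full name
    home: str       # home directory
    shell: str      # program started at login


def parse_passwd_line(line: str) -> PasswdEntry:
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


# Made-up record in the standard seven-field layout.
SAMPLE = "alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"
entry = parse_passwd_line(SAMPLE)
print(entry.name, entry.uid, entry.home, entry.shell)
```

Changing a shell amounts to rewriting the last field of the caller's own record, which is why the question of who may write this file matters so much.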
So this doesn't work: vi can't edit or change that file. But clearly I just saw that change shell, chsh, was able to change that file. So let's figure out what the differences are in the file permissions between them. Okay, there we go. So we have vim, basically: this application is executable, I can execute it, and it's owned by root, which makes sense, because you wouldn't want me to just be able to change vim's binary to be whatever. So what's the difference between these two, besides the nice red color? What's the difference in the file permissions here? The s in the first group, on the owner.

What this means is that there are actually more bits than read, write, and execute: there's also set UID and a sticky bit. With set UID, there's both set user ID and set group ID. Essentially, this tells the operating system: when you execute this program, execute it with the permissions of the owner of the file, not of the person who executes the file. So in this case, when we're running change shell, what user is it running as? As root, which is why change shell can change the /etc/passwd file.

So what does this mean if I find a vulnerability in the change shell program that allows me to do whatever I want? I can do whatever I want to the system. I can change /etc/passwd; I've essentially rooted it that way. I can change anybody's password on the system, I can change root's password to whatever I want. So this really is the core of local security on a multi-user Linux-type system. For the attack surface, you can try to find a vulnerability in the kernel and exploit the kernel to get arbitrary code execution there, and clearly then you can do whatever you want on the system. But the much better way to go is to analyze all these binaries that are set user ID, because these all essentially have root privileges, and if you can find a vulnerability here, you'll be able to take over the application. And the truth is slightly more
complicated, in that there's your real ID, which is the actual ID of the user, and there's an effective ID, which is normally the same as the real ID unless you're executing a program with the set UID bit, in which case it will be the file's owner, here root, and that's what gets compared. There's actually a really good research paper on this called Setuid Demystified. It's like an 11-to-15-page research paper describing exactly what all of these things mean.

Did we talk about this in networking? The ports less than 1024 are privileged. You have to be root to be able to listen on any of those ports that are below 1024. So why is that? To establish ports for processes? Sure, but you can have an established port of 2048. HTTP could be on 2048; why does it need to be on port 80? That's the standard. Usually most standardizations have a reason.

This is actually a good tip for your future careers: assume at the beginning that there's a reason for things. When you get into your career and you get into some code base and it feels like there are ridiculous architecture choices, it's very easy to deride these fools who came before you for making this terrible, terrible architecture. And then you start talking to people and you start realizing they had insane budget constraints, they were dealing with all these weird systems, so all those decisions were actually legitimate choices at the time. There are still a lot of cases like these where there are good reasons; you have to dig and try to find them.

So think about it this way: on a system, who can create an application that listens on port, let's say, 2048? Anyone. Is your web server important?
Yeah. So you're running a server, maybe it's running a web server, and maybe you're giving other people access to your system. This web server is running on port 2048, and then one of your users writes a fork bomb or something, which you all always end up doing somehow every semester, and I have to come up with new ways of trying to defend against them. You end up taking down the web server; the web server shuts down and releases port 2048. What can any user on the system then do? Create a server listening on port 2048. So fundamentally, all you need to do is kill the server once, and any user on that system can then bind to that port and serve whatever they want as that system. This is why almost all privileged applications have a port less than 1024: because only the admin of that system, essentially only root, can bind to that port.

Now think about the implications of that. Do you want a web server that's running as root on your system? No. Everybody on earth can talk to that web server, right?
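The privileged-port rule can be stated in a couple of lines of Python. This sketch assumes the traditional Unix threshold of 1024 described above; a particular system may be configured differently. `is_privileged_port` and `try_bind` are hypothetical helper names, and `try_bind` is defined but not called here because its result depends on who runs it.

```python
import socket

PRIVILEGED_LIMIT = 1024  # ports below this traditionally require root to bind


def is_privileged_port(port: int) -> bool:
    return 0 < port < PRIVILEGED_LIMIT


def try_bind(port: int) -> bool:
    """Attempt to bind a TCP socket on localhost and report whether it worked."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind(("127.0.0.1", port))
            return True
        except OSError:  # covers permission denied and address already in use
            return False


for port in (80, 2048):
    print(port, "privileged" if is_privileged_port(port) else "unprivileged")
```

Run as a normal user on a default Linux setup, `try_bind(80)` would be expected to fail while `try_bind(2048)` would be expected to succeed if nothing else holds that port, which is exactly the gap the fork-bomb scenario exploits.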
That's the entire point of having a web server. So we don't want that, but we also don't want anybody else on the system to be able to bind to that port. So you need to have a way to give up your privileges, and that's another aspect here: dropping privileges. All of the daemons that are running on your system that bind to these ports start out running as root, and then as soon as they bind to the port they give that up and start executing as a different user, so they have different user permissions. So anyway, I think it's important to mention; these are the important practical considerations that you need to be thinking about when understanding these systems.

Okay, so these are a bunch of the set user ID functions. We basically went over all of this at a high level; I think these are the important parts. So we have to think about this as: any potential vulnerability in any setuid application on a system could be a local privilege escalation vulnerability. But these are all compiled binary applications, so we need to be able to understand how to find vulnerabilities in them.

We talked a little bit on Monday about disassembly. One way to think about it is that you have kind of the normal application life cycle. Well, first, I mean, you're supposed to design an application, right? Create models, design it, not just randomly type keys. So you design, then you write the code, and then you compile that down to an application. That's the engineering process, and the reverse engineering process is the reverse: you have this binary application, and you want to disassemble it up to assembly, and you may want to try to decompile it or reverse engineer it, which is usually the term used for working by hand to try to come up with
what C code was used to create it. And then there's even stuff that can try to look at source code and abstract out the models that were used to design that application.

So disassemblers are fairly straightforward. We looked on Monday at objdump. objdump is a fairly good disassembler; it's what's called a linear disassembler, so it literally just tries to disassemble the program starting at a certain byte. So what's the problem with linear disassembly? I guess think about it this way: what's the goal of disassembling? If you're disassembling, you want to actually know which lines were executed; if you're only following a linear path, you can't be sure where a control transfer actually took place; for example, a function could call another function. Yes, okay. So think about it this way. First, we have to separate out execution time: we're not trying to trace the control flow at run time. Disassembly is usually a purely static process, where you have the bytes and you want to turn them into assembly code. But the idea remains: we want to see all of the code, all of the assembly code in this case; we want to see everything that the program does. So if you think about it, we want to see the code that approximates all the program's behaviors, which includes all paths of all jumps. We talked about how there are indirect jumps, which go from here to somewhere else, and we talked about how x86 is a variable-length instruction set.

So you have an entry point, so you know the start of the program, and you can start disassembling from there, and then you can figure out branch targets, so you can figure out where else to, actually, I'm getting ahead of myself, that's recursive. Linear just tries to keep disassembling from whatever offset: disassemble, disassemble, disassemble. It's a weird property of binary code that it's not exactly self
healing, but eventually it gets back on track. So linear disassemblers work pretty well. A recursive disassembler tries to follow the control flow: it starts at the main function, disassembles that, sees what other functions are called and where all the branches go, and disassembles from there. The problem then, of course, is indirect jumps: if it cannot statically compute the target of a jump, it's going to miss code. There are lots of disassemblers out there; I'm just going to briefly mention some of them. Radare2 is a disassembly tool, or program analysis tool. These are all very nice tools. It's free, which is always a good plus. IDA Pro, and now is probably a really good time for you guys: IDA Pro is like the gold standard of disassembly, and a lot of reverse engineers use IDA Pro. I will tell you, this is one of the most difficult pieces of software to use. Don't let that scare you; it's super easy to throw a binary into it. But there's literally, I believe he's a professor at the Naval Postgraduate School, he wrote a book that's essentially how to use IDA Pro. He's not involved with the company at all; he just realized this is a very poorly documented piece of software, so he wrote a book on how to actually use it properly. But it is the state of the art. The commercial product is expensive, but we're in luck, because they just upgraded the free version to IDA version 7. So this is Linux, Mac, actually probably not Mac; Linux and Windows. And does 7 do 64-bit? Yes. What else does it do? Do you know when 7 was released? 7 is the latest? Yeah, that's really cool. Yes, it will definitely work for this. If you want to play around with IDA, feel free to get the free version: download it, use it, play with it. Like I mentioned, I use Hopper; I think it has a limited session of like 30 minutes. All of these tools are really nice in that you can add comments to any of the lines so you can
kind of keep track of what's going on. Some of them have decompilers that try to take a function and turn it into C code. IDA has an even more expensive decompiler; Hopper has a limited decompiler. But I guess the important point I should stress is that none of these tools are magic. Nothing does the job for you; they just make your job slightly easier. I know for some students it's very easy to think, if I just had a better tool, if I had the pro version of IDA with all the plugins and Hex-Rays, everything would be so easy and I'd be this binary ninja. And it's like, no. The free version of IDA is nice, but you still have to actually go through the program, spend the time, spend the effort, and figure out what's going on. Really, there's no substitute for experience. But you all have, I know, the ability to do it; you just have to put in the effort and do it, and you'll feel like a magician when you finally get these things. And you don't need any expensive tools. All you need is your brain, which I guess kind of was expensive, but it's free for you now, so you should use it. Okay, so now we can talk about all the ways that we can attack a UNIX system. Cool. So we talked about different locations, depending on the capabilities of the attacker. If we are remote with respect to the server, what does that mean? What capabilities does a remote attacker have? What was that? Can they listen on ports? Can they listen on ports on the system that we're interested in? No, but they can listen on their own ports; maybe the server is sending stuff. What else can they do? Yeah, so they can interact with and send anything to any port that's open on that system, right? So we're either going to try to attack a network service, some service that's listening on some port on that system; or we may try to attack the operating system itself, if there's some vulnerability inside the operating system that allows us to take control, which is obviously very awesome; and then we may want to remotely
attack a browser; that's more about trying to get into somebody's physical machine. Locally on the system, we can still do all of these other things, right? You've got to think of this in terms of the capabilities getting greater and greater as we get more permissions. If we're local on that machine, which means we can execute a process of our choosing, then we can try to find vulnerabilities in setuid applications or find vulnerabilities in the operating system. What you'll find is that a remote attack against an operating system is very rare nowadays. Why is that? Fewer privileges? I'd say no; operating systems are doing the same if not more of a job now. You've got to think, they have IPv6; the TCP/IP stack is growing significantly more complicated. Easier targets? Easier targets in what sense? Applications, yeah, but why are applications an easier target than an operating system? No, firewalls don't really protect you, I don't think; we'll talk about those. Yeah, there are a lot of abstractions in the system and they do rely on permissions, so that's why it's a little bit harder than hacking into an application: the application is the thing that's exposed, but the operating system is not actually that easy to get access to. And also it gets regular updates, and there are a lot of people in the community working on it every day. I'd say it's more the last thing than anything else you said. The thing to think about is, if you teleport yourself back to the 2000s, there were an insane number of remote operating system attacks. You have to remember the people who were designing these operating systems. I mean, there didn't used to be classes like this; they didn't want anybody to teach this. I have stories from older professors who, when companies found out they were teaching people how
to exploit vulnerabilities, they were like, why would you do that? Why would you train the hackers? Which is super funny, because now they hire people with security skills like crazy. But looking back, the people developing these operating systems didn't have any security knowledge, and they weren't aware of the possible security mistakes they were making. We looked at Smurf attacks and these different types of attacks on the network; the same things were true of the operating system. So for a long time there were a lot of these. This is where the worm concept came from: taking over an application or an operating system and self-propagating that code. But there are far fewer operating systems out there than there are, let's say, applications. So people like Microsoft, Linux, everyone got better at writing more secure code at the operating system level, which is why, if you want to attack something, you go up and attack the application. I would rather look for an exploit in a web server than in, let's say, the Linux kernel. It's not that they don't exist; it's just that they're less likely, I would say, at this point. But they still happen from time to time, and when they happen they're super cool; you should look at them. So we're going to look at local attacks, attacking Linux applications, and we're going to study some very specific attacks. When I say attacks, I mean vulnerabilities and exploits: a code vulnerability that allows an attacker to completely control a Linux application or alter its execution. And the important idea is that, while we'll look at specific attacks, the ideas behind them can be applied in many different contexts, so it's important to be thinking about that as we're studying these things. And we're going to be focusing on applications because, by the same logic as to why remote operating system vulnerabilities are so rare, the same thing happens on the local system, right? Finding
a kernel-level vulnerability is much rarer than just finding a vulnerability in, say, a setuid application that allows you privilege escalation. And so this is where thinking about all the inputs to the application is super important, because fundamentally we can't change the execution, we can't make an application do something it's not supposed to do, if we can't change the input to that application. That's why we spend a lot of time thinking about all the different ways our input gets into the application. So when we call execve to execute a new program, we're passing in arguments, and those arguments, as we saw, get copied into the memory of the running process. The environment: the person who execs a program gets to choose the environment variables, right? So argv, all of the argv parameters, all of the environment pointers; and during execution, file input, socket input. These are all ways we can get data into an application. And it's actually super tricky, because thinking about all these ways, you really have to get creative and start thinking, man, what if I did this? Sometimes, we'll see, you can even exploit an application with a timing gap of, what, 15 milliseconds: if you can get the process into a certain state in that window, then you can take over the application. And it's such a small window, like, how can I get that to happen? But there are tricks you can do. As an attacker, you only need to be successful once, right? Even if it takes a million tries, computers are very fast; they can try very quickly. So this is kind of a menu, going forward, of the types of attacks that we're going to focus on. There'll be things related to accessing files, command injection, as you'll see, and memory corruption. Memory corruption is kind of the more classic one that we'll talk about: stack
overflows, heap overflows; basically, if the attacker can change the memory of our application, what can they do? Let's get into it; we'll be able to do a few of these today. So there you go: file access attacks. With this entire class of vulnerabilities, we think about manipulating the file system, because the process has to interact with the file system to do things. So if we can trick it into doing something it didn't expect to do, that can be as good as getting, let's say, arbitrary code execution. One of the main ways is the classic dot-dot attack. The idea here is, let's say you have an application that takes in user input. This would be a pretty classic way to write an application: you have different users of your application, and each user gets a special folder inside your application's directory. The name of that folder is their username, and inside it you put a password file for that user. So somebody tries to use your system and you go, great, what's your username and what's your password? And you check: does that password match what's in that file? It seems like a reasonable thing to do. We're all reasonable people; you can see yourself developing an application like this. But the entire security of this depends on checking the password that's in that directory. So if I as an attacker, going after, let's say, the root or admin user of this application, can trick the application into opening a file of my choosing, then I get to choose that password, and I know that password. The basic idea is really right there in the name, which you can't really see, which is cool. So there are a few types of, I'm not sure what the correct term is, maybe path shortcuts. How do we refer to, let's say, the current directory? Dot. A single dot anywhere within the path string, when we're talking about files, means the current
directory. So this is why, when you have an executable like a.out and you want to execute it from your current directory, you need to do ./a.out, right? If you just type a.out, it says it can't find that. And why does it say that? Because the current directory, the dot, is not in your path. Yes, okay, good. So if I just write a.out, it says command not found; with ./a.out it does a little more. If I echo my PATH, the reason is that bash is using my PATH environment variable to look up commands in these directories. The PATH environment variable is separated by colons, and bash checks every single directory in there to say, is a.out in any of those directories? And if it's not, it says command not found. So if I change my PATH and add dot in there, it will always look in the current directory for the command I want to execute. That can be very bad, depending on how you do it. So the way to think about it: when we type in a.out, or even something like ls, bash asks, is ls in /home/adamd/bin? No. Is it in /home/adamd/.local/bin? No. Is it in all of these? Finally we'll get to /bin somewhere here. So let's say I change my PATH, because I'm super cool and I'm so tired of typing dot slash; it's really annoying and it's really hampering my programming. So I say PATH is equal to dot, right? Here's my PATH, and what's going to happen when I type in a.out? It's going to execute. Isn't this awesome? I did something stupid like this before: I had a normal command, like make or something, and it used my local version instead of the one I wanted. So let's say I'm developing something and I wanted, I don't know a good reason for it, but okay, we'll go with this. Now when I type in ls, is it actually going to execute the ls that I want? No, it's going to execute the one in this directory. And this is based on the current directory. Do you always control the current directory? No.
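To make that lookup concrete, here's a rough sketch in Python of the kind of search the shell does with PATH. This is an illustration only, not how bash is actually implemented: the function names `resolve_command` and `make_executable` are made up, and passing the current directory in as a `cwd` argument is just a convenience so the example doesn't have to change directories.

```python
import os
import stat

def resolve_command(name, path_env, cwd):
    """Roughly mimic how a shell finds `name` using a colon-separated PATH."""
    if "/" in name:
        # A name containing a slash (./a.out, /bin/ls) skips the PATH search.
        candidate = os.path.normpath(os.path.join(cwd, name))
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
        return None
    for entry in path_env.split(":"):
        # "." (or an empty entry) means the current directory.
        directory = os.path.join(cwd, entry or ".")
        candidate = os.path.normpath(os.path.join(directory, name))
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None  # this is where the shell would say "command not found"

def make_executable(path):
    """Create a tiny placeholder script and mark it executable (for the demo)."""
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
```

With a PATH that only has a system bin directory, `a.out` is not found but `./a.out` is. Put dot at the front of the PATH and an `ls` sitting in the current directory wins over the system one; put dot at the end and the system one wins as long as it exists, which is why that's safer but still not something I'd do.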
Let's say I did an ls on /home and figured out all of the user directories on a system, and I found out that I could go into one of your directories. So I cd into your home directory and then I type in ls. What ls is going to execute? The one that's in your home directory. And if you were very clever, you could write that ls program so that it actually does what ls does, but also creates a file owned by the person who executed it, which is me, that has the setuid permission set, and that you and only you have access to. So you can set this up such that you basically get permissions as me, because now I'm executing whatever code you want. So doing this is super dangerous; don't do that. I don't want to recommend putting it at the end; I think you could probably do that and you would be safer, but I still would not do that. I'd also say that file resolution has that problem too, because the current working directory can sometimes override standard file resolution for the C compiler. Say it again, sorry? File resolution can be affected by that too, when compiling. Yes, we'll talk about something similar with compiling. This is weird. Did the backslash ls help with that? I mean, sorry, forward slash ls; backslash means non-aliased, I thought. So yes, I can access the file directly, because an absolute path means execute that exact file. So this is a good lesson: don't do that. We'll make sure that's gone, because that was super annoying. So we saw how that works with dot: it's the local directory. So how do you go to the directory above? Dot dot slash. Dot dot slash, dot dot slash; is there a limit? Do you eventually reach a base? You eventually get to the root, to slash, but you can put as many dot dots as you want. I mean, there's a limit, but you can put all of those. And so the idea here is, if I can control the path that you're using to open a file, and if I can include dots in that file path, specifically dot dot, I can go outside of that directory that you
thought you were in and give my own path. So in that case I can say my username is ../../home/adamd, so the file it opens is my own password file under home/adamd; the password is whatever is in that file, and it will think that I'm whichever one of the users. And it seems like a silly thing, because really, when you think about it, why would a period be a dangerous character? Think about how to prevent this: it's just a period character, which can appear in any English text. It's not like they're inputting crazy, weird, non-ASCII characters, right? And this even comes up on the web; we'll talk about this in about a month and a half, so put it in the back of your brain. This is why we study these kinds of things: because even this, which seems to be a very local thing, occurs in lots of different contexts. It's also called a directory traversal attack; that's important to know and store in your brain. So how do you defend against this? Just sanitizing the path? That's a hard problem, though. This means that anything the user can possibly control should not be used as a path. Is there a way to give an absolute path, starting from slash? But even if you start from slash, say it's /var/mysql/lib or whatever, if I control the path after that, I can use dot dots, an arbitrary number of dot dots, to get out of that to any other path and any other file that I want. Yeah, so there are a couple of different ways. For a lot of these things we want to think about whitelists and blacklists. So what does a whitelist mean? What's that? Yeah, so a whitelist pretty much means the things you're permitted to do, and a blacklist means the dangerous ones. I see; think about it in terms of input. So the idea with a whitelist is that you specify what good input looks like. You say the input can only
be alphanumeric characters, right? And that way you know there are no dot dots, there's nothing else. A blacklist goes the other way: it says no dots, or no dot dots, in my input. These have pros and cons; we'll talk about them in a second. Before we go though, a couple of announcements. I'm checking my calendar to make sure my dates are right. So in two weeks, we're gonna have another CTF in class, on the 21st of February. This will be a team-based CTF. Connor's gonna send an email. By the 20th, I want you to have a team, and this is gonna be your CTF team and your project team, so they're the same. It'll be four to six people: minimum of four, because it's a lot of work and you don't wanna be just one person; max of six, because more than that, it gets insane. So start talking to each other and coming up with teams, and use the mailing list to find people. If you don't know anybody, stick around, talk to people in your class. Everyone here is mostly friendly. No, everyone here is very friendly. I've been working with people. I've been through the art school. Art, they got me. That would be a good way to get to me. Oh wait, the other really important thing: your team has to come up with a good team name, like a hacker name, a full-on hacker team name.
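Going back to the dot-dot attack for a second, here's a minimal sketch in Python of what the whitelist idea might look like for that username-and-password-file example. Everything specific in it is an assumption for illustration: the function names, the rule that usernames are 1 to 32 letters, digits, or underscores, and the layout of one `password` file per user directory. It also adds a second check that the resolved path still sits under the application's directory, which is not something we covered above; treat it as one possible belt-and-suspenders step, not the definitive defense.

```python
import os
import re

# Hypothetical whitelist: only letters, digits, and underscores, 1 to 32 chars.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")

def naive_password_path(base_dir, username):
    """Vulnerable version: user input goes straight into the path."""
    return os.path.join(base_dir, username, "password")

def safe_password_path(base_dir, username):
    """Whitelist the username, then confirm the result stays under base_dir."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("username contains characters outside the whitelist")
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, username, "password"))
    # realpath resolves dot dots and symlinks, so this catches anything
    # that still manages to land outside the application's directory.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("resolved path escapes the application directory")
    return candidate
```

With the naive version, a username like `../../home/adamd` resolves to a password file outside the users directory. The whitelisted version rejects that username outright, because dots and slashes are simply not in the allowed set.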