So now we're getting closer to more security-type exercises. These are each basically challenges where you've got to break something. You have two weeks. The first part is a little warm-up, and it should not be very difficult. There is a website called OverTheWire that has a bunch of really cool war games. If you want to get really into hacking and exploitation, this is a great way to do that. You're going to be doing levels 0 through 20 of the absolute beginner one. Nothing fancy here: this is just getting used to accessing a Linux server over SSH, working on the command line, and that kind of stuff. It's not meant to be super difficult, but depending on your experience level, you may need to learn something to do it.
To track your progress, we are going to use WeChall, which is a site that keeps track of how people are doing across all different kinds of challenges. In the next day, we're going to email each of you individually with a random user ID. You'll sign up on that website with this user ID, and then you'll set up your OverTheWire account so that when you break a level on OverTheWire, it gets posted to your WeChall account. That way we can check that you're actually breaking all the levels, and that's how we'll be grading and following your progress. For this assignment, you just need to submit a README file with a description of how you broke each level. At the deadline we'll pull down where everybody was and how many levels they broke, and that will be your score for that part.
One important thing to note on this section, and I call it out here: these war games have been public for a long time. If you search, it's not hard to find solutions and walkthroughs for these levels, but that's not really the point. The point is, if you can't do these easy ones, the next homework assignment is going to be insanely difficult for you.
So you should spend the time learning this stuff and getting it down now. That way the next one is going to be a lot easier. Questions on this?
The second part is that you'll be pen testing a new awesome startup. It does not look fancy. I don't think I remember the IP address offhand; something with 8.7, I don't know, and I don't want to accidentally give anyone the wrong address. I have this set up so that you can access the service from the outside. Essentially, there's a brand new awesome startup that has created this really cool service where you can upload code that they trust, and they will run that code. Your goal is to read through the description and play with it. There will be a file called secret.txt in the working directory where the server is running. Your goal is to get the contents of that file, submit that to prove you broke the level, and submit a README that describes how you went about reading it. I think we've distributed the address on the course mailing list, because I'm not posting it out here for everyone on the Internet to pen test our stuff. Do we have any language restrictions? No, do it your own way.
The third part is to gear you up for what we're doing now. We're studying assembly language, and as part of breaking binaries, you need to be able to read and understand assembly language. So, hopefully this works correctly: there's a binary here. It's a program that asks you for a password and tells you if it's right or wrong. Your goal is to extract the password from this binary by studying and analyzing the assembly code. You'll need to know x86, or you'll need to learn it. You will use file, objdump, readelf, whatever tools you need, to understand how this program works and to extract the password from it. Any questions on that? Yes. What do we need to submit? The password, and any files you may have used to crack the password.
Just in case there's anything you used. Similarly, in part two, submit any files you wrote or used, everything, and a README. So, yeah, these are more about breaking things than programming.
So, will we be submitting a password and checking it? How do we verify that? Do you run a test case and say, oh, you've got the right password? Yes, the server will tell you that you're correct. And you've got a limited number of chances. I think it's something like 17, or maybe it's like 100. Don't try to get it by guessing. There's a way to do it that's much easier.
I think they're unrelated. They're three pretty orthogonal parts, so they shouldn't depend on each other. So, yeah, if you get stuck on one, go to another one. That's a good way to look at it. Also, start thinking about them now, because that will help you.
Good question. Yes. Where are some good resources to learn about x86? The Internet. Okay, use your Google powers if you don't have some. I think there's the book on x86 assembly, and there's all kinds of good stuff. The other thing is, you shouldn't just read a whole book on x86, right? You want to accomplish a certain task. I'm always task focused on the things that I need to learn. So go through there, see what instructions are used, and figure those out so you can start piecing together what's happening.
Okay. Do you have grades for the last assignment? I do not yet. We're just going through and grading the assignments to see if anybody is using any library or anything else like that. So we'll be adjusting going forward, hopefully not too much. It was pretty good overall, though, I think, from my brief view. Yeah. Are we going to see your test cases for the last one? I can't tell you that. The test cases only have to do with the specification, I promise. You can come see me in office hours and we can talk about it.
I'm happy to do that. But I'm not going to broadcast everything.
So, suddenly the x86 instructions are going to be more important. Mary talked about this: we covered all the different ways that you can address and access memory in x86, so I won't cover that again. There are various classes of instructions. There are instructions to move data around. There's MOV, the move instruction, which is the one we've seen: move from a memory location to a register, or move between registers. There's also exchange (XCHG), so you can exchange the values in two registers. You can PUSH and POP. What do you think these are? Yes, specifically, PUSH pushes a value onto the stack: the value inside a register, or, if the argument is an immediate value (a hard-coded constant), that value. It's incredibly important to understand what's going on when you push things onto and pop things off the stack. We'll go into much greater detail when we study function frames and how the layout of the stack looks. But for now, the important thing is that when you push something onto the stack, it decrements the stack pointer. The stack pointer register itself moves as a byproduct of this instruction. Similarly, when you pop something off the stack, the stack pointer moves too, in the other direction.
There are all kinds of cool binary arithmetic operations: add, subtract, multiply, divide, increment, decrement, all kinds of stuff. There are many nice resources online about what each of these does and the exact semantics. There are all the logical operators that you're used to: ANDs, ORs, XORs, NOTs. And there are ways to transfer the control flow of the program. When we talked about assembly code, we said the code we write doesn't usually just start at the top and execute until it gets to the bottom.
We want to jump around and have branches. So we have jump instructions. JMP is an unconditional jump, so this is: always jump here. The CALL instruction calls another function. What's the difference between a call and a jump? What's the difference between the instructions themselves? Does the call instruction itself return? Exactly. A jump essentially just says: set the instruction pointer, the EIP register, to wherever we're jumping to. That means the next instruction the CPU executes is wherever EIP points. CALL says: set the instruction pointer to the function I want to execute and also push the address of the next instruction, the one that would have been executed, onto the stack. Essentially, you can think of it at this point as leaving a bread crumb on the stack. That way, when the function is done doing whatever it wants to do, it knows where to jump back to. That's the key distinction between those two. Return (RET) is the opposite of call: it takes that bread crumb and goes back to wherever we called that function from.
INT is for interrupt, so this is how you signal an interrupt. IRET, I think, is return from an interrupt.
And we can do all kinds of cool comparisons. The compare instruction (CMP) takes two operands, either constant values or registers, and the bits of the EFLAGS register are set based on the result of the comparison. For instance, the zero flag (ZF) will be set when the two operands are equal. There are various jump instructions that each correspond to one of these flags. JNE is jump if not equal, also called jump if not zero. Specifically, that means jump if ZF is zero, which is what the compare or subtraction instruction leaves behind as a byproduct when the values differ. Jump if equal (JE) is taken if the zero flag is one. Then JAE.
I believe this is jump if there was no carry. Then there's jump if greater than or equal to.
There are two types of control flow, and this is actually going to be a separate concept that we'll come back to later. One is: hey, jump, let's say, 10 instructions ahead. A fixed location. That is pre-computable; it's static and known before run time. The other is: jump to whatever value is inside the EAX register. So when might you use each? In your high-level code, why does that functionality exist? Yeah, for conditions, for branches. For branches you have fixed locations: if this is true, execute these five instructions; otherwise jump over them and execute these other five instructions. Loops as well: fixed jumps back up to the top.
So when would you ever need an indirect control flow? Yeah, you have function pointers in C. Even in a language as low level as C, you can pass a function pointer to a function and call through it. There you have to be able to jump to an address held in a register. What other language features? In object-oriented programming. Inheritance? Yeah, inheritance, right? You have a call to a method, but you may not know which actual function gets called, because it depends on what type that object is at run time. The way that actually works is that there's a table of functions; the code grabs that table based on the current type of the object and jumps dynamically to that function. So the target is not computed statically; it's decided at run time.
Does switch use that? Switch uses straight jumps. A switch actually translates very cleanly to assembly code, as you're saying: if it's this, do this; if it's this, do this. But I think you could make a jump table for a switch statement. Is that decided at run time? Yes. But the table itself is fixed. I think it would depend on the compiler. I'm not sure exactly what a compiler would compile this to.
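To make the indirect control flow discussion concrete, here is a minimal C sketch of the table-of-functions idea. The `Shape` type and every function name in it are hypothetical, invented for illustration; a real C++ compiler lays out its method tables differently. The point is only that the call through `s->area` is an indirect call: which function runs depends on an address stored in the object at run time.

```c
/* Hypothetical example: run-time dispatch through a function pointer. */
typedef struct Shape Shape;

struct Shape {
    int (*area)(const Shape *self); /* filled in when the object is created */
    int width;
    int height;
};

static int rect_area(const Shape *self) {
    return self->width * self->height;
}

static int triangle_area(const Shape *self) {
    return self->width * self->height / 2;
}

Shape make_rect(int w, int h) {
    Shape s = { rect_area, w, h };
    return s;
}

Shape make_triangle(int w, int h) {
    Shape s = { triangle_area, w, h };
    return s;
}

/* The compiler cannot know the target here, so it has to emit an
   indirect call through whatever address is stored in s->area. */
int shape_area(const Shape *s) {
    return s->area(s);
}
```

Calling `shape_area` on a value from `make_rect(3, 4)` and on one from `make_triangle(3, 4)` goes through the same call site but reaches two different functions.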
But one way to do it would be to have an array, and then figure out, based on the comparison, which place to jump to based on the table you made. I think they're called jump tables. Or you could just translate the switch statement into an if-else chain.
At the assembly level, would it be that we have to put a value in a register and then jump through it? So, it's just like any other branch statement, right? You're branching based on the value of one variable. So you test that variable and say, okay, if it's 10, or it's 1, then execute this branch. Otherwise jump over it, and then check: if it's 20, execute this other branch, or jump over it. That would be if the switch statement is translated like an if-else chain rather than a table. Cool.
Input and output. There are ways to do input and output to peripherals. You will almost never see this, so I'm going to skip over it for now.
One important instruction is the NOP instruction. So what does this do? Nothing. Why does it exist? When you're creating an instruction set architecture, you have a limited number of instructions you can possibly have. Why would you waste one on something that does nothing? Timing. Sleep. If you're waiting for input. Yeah, there are a lot of reasons, right? There may be times when we just want to loop and do nothing. Still, we could have a check and then jump back to that check, and that would essentially be a no-op. Or we could add 0 to a register. That could also be a no-op.
In general, to harken back a long time to when you learned about pipelining and CPU architectures, there were some architectures where you would have to manually put no-ops in between instructions so that they did not conflict in the pipeline. So that would be another reason why. I don't know specifically why x86 has it, but it's there and it will be important.
Okay, system calls. We talked about system calls. What's a system call? It's a terminal. No, a call into the kernel. Kernel, yes. It's a call from a user-space program into kernel functionality. This basically lets a program ask the kernel to do things on its behalf. When you open a file, that's a system call that happens in the kernel. The kernel is the one that decides whether you actually get to open this file or not, and that's where all those file permission checks and everything important happen. Most of the time when you're coding a C program, you will not make system calls directly. Oftentimes, this happens in the library. For instance, on Linux, there are system calls for read and write. If you read from file descriptor 0, that's usually standard input. If you write to file descriptor 1, that's standard output, and if you write to file descriptor 2, that's standard error. This is a system call from your program into the kernel to actually write your program's output.
Now, what do you normally use to write output from your program in C? printf. So what is printf doing? Yeah, it's giving you a lot more functionality on top of just, hey, output this many bytes. I believe the write system call takes the file descriptor, the buffer, and the number of bytes you want to send. With printf, you don't have to pass all that. You just pass it a string and it will print it out.
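As a rough illustration of the difference between printf and the underlying write call, here is a small C sketch that prints a string using only `write`. The helper names `write_string` and `say_hello` are made up for this example; `write` itself is the standard POSIX wrapper, which takes exactly the three things mentioned above: the file descriptor, the buffer, and the byte count.

```c
#include <string.h>
#include <unistd.h>

/* Write a whole NUL-terminated string to fd using only write(2).
   Returns the number of bytes written, or -1 on error. */
ssize_t write_string(int fd, const char *s) {
    size_t len = strlen(s);
    size_t done = 0;

    while (done < len) {
        /* write may accept fewer bytes than requested, so loop. */
        ssize_t n = write(fd, s + done, len - done);
        if (n <= 0) {
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Hypothetical helper: print the greeting on standard output (fd 1). */
ssize_t say_hello(void) {
    return write_string(STDOUT_FILENO, "Hello World\n"); /* 12 bytes */
}
```

Unlike printf, nothing here parses a format string or buffers output; the caller has to know the length up front, which is the work the library normally does for you.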
Or you can get really fancy and have a fancy printf format string with a lot of format specifiers and pass a ton of arguments to the function. The key idea is that the kernel provides this kind of basic functionality and the libraries provide other functionality on top of that. So oftentimes you don't actually need to make system calls directly, but they're going to be incredibly important when we start writing shellcode, because there we don't want to use the system library; we want to call into the kernel directly.
The way you do that is with interrupt 0x80 (int 0x80) on Linux on x86. This is how you signal to the kernel: I want to make a system call. There's a protocol that we have to get into for how to actually do this, but the EAX register holds the number of the system call you're trying to make.
So we have a super simple Hello World assembly program. In the data section we have a string, Hello World, with the label hw. Then we have our text section, so now we have code. We say that we want main to be a globally accessible symbol so that everyone else can see this main function. Then in here, we do things like move 4 into EAX, move 1 into EBX, move... now what's this doing? Yeah, this moves the address of the string hw into the ECX register, and then we move 12 into EDX. How many bytes is this string Hello World? A bunch. Not very specific. It should be 12. Yeah, with the newline, 12. And then we do int 0x80.
From here, you can kind of piece apart the calling convention. We're calling into the kernel, not calling a regular function, so we're not going to have a call instruction. What we have is: we move the system call number into EAX. System call number 4 is write. You can look this up: there's a header file that defines all these symbols and specifies the number of each system call. So 4 goes into EAX and 1 goes into EBX.
That would be the file descriptor we want to write to. 1 is standard out, so this string will be printed on the standard output stream of our program. ECX is the second parameter: the buffer that we want to print from. EDX is the third parameter, and it specifies how many bytes we want to print. So we set all this up and then trigger the interrupt, just like a call instruction. You can think of this really as a function call into the kernel. We call into the kernel, and the kernel is able to return. When it returns, we move 0 into EAX and then return. Moving 0 into EAX sets up the return value of main. We haven't gotten into it yet, but this is how functions return values on x86: by putting a value into the EAX register. Questions on assembly? It's actually kind of fun to do. We'll do a little bit more of it later. It's kind of fun to get down really close to the CPU and just mess with the instructions. You don't worry about variables.
Okay. So we've been saying, very briefly, that we don't make system calls directly in our code, right? We use libraries. But how do those shared libraries actually work? What are the two different ways to use a library? Static and dynamic. What's static linking? The library is compiled into the executable itself. Whereas dynamic linking is what? It's resolved at run time. So it loads the library at run time. So how could you actually do that? Think about it. How do you compile code that's going to call other code that only shows up at run time? Use a linker. What does that mean? What does a linker do? You have all your function calls point to a table, and then you update that table once you know where the function is. Classic computer science response, right? Saying, hey, we don't know where we're going to call, so let's add a level of indirection, and then we can change those indirections, right?
Because one option, the very naive and super straightforward way, is to just put some placeholder in for every call. If you're going to call printf, just say call PRINTF, I don't know, maybe in all caps, whatever. You put that placeholder every place that you call printf. Then, before the program executes, the loader goes through and replaces each placeholder with the actual address of printf in the library once it's loaded, right? That has a lot of problems, because you may not know exactly where the library is going to be, you have to calculate a lot of addresses, and you have to fix up every single place where the function is called, right? Because you have all these calls to printf.
So the main way around this is indirection. Instead of calling into the function directly, you get a value from some table. You know printf will always be at a fixed offset in that table, so you get the value at that offset and jump to it. Essentially it's an indirect jump into printf. This way, all the linker has to do when the program loads up is say: okay, here's the C library, great, it's loaded in memory, now I have to put the addresses of all the functions that are used into the correct places in this table. And bang, everything in the program works. That's exactly what happens.
This is what the PLT (procedure linkage table) and the GOT (global offset table) do. The PLT contains all these little trampolines that fetch an entry from the GOT, so there are multiple levels of indirection: fetch the actual entry from the global offset table and jump to it. The idea is that when we call a library function, we call some stub in the PLT, and that little bit of code gets the address from the global offset table and jumps to that address.
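The table-plus-trampoline idea can be modeled in a few lines of C. This is a toy sketch under loud assumptions, not how the real dynamic linker is implemented: `offset_table` stands in for the GOT, `call_double` and `call_negate` stand in for PLT stubs, and `load_library` plays the loader. All of these names, and the two "library" functions, are hypothetical.

```c
typedef int (*lib_fn)(int);

/* Stand-ins for library functions whose addresses the "program"
   does not know when it is compiled. */
static int lib_double(int x) { return 2 * x; }
static int lib_negate(int x) { return -x; }

/* Each imported function gets a fixed slot number. */
enum { SLOT_DOUBLE = 0, SLOT_NEGATE = 1, SLOT_COUNT = 2 };

/* Toy "GOT": a writable table with one address per imported function. */
static lib_fn offset_table[SLOT_COUNT];

/* Toy "loader": once the library is in memory, fill in every slot. */
void load_library(void) {
    offset_table[SLOT_DOUBLE] = lib_double;
    offset_table[SLOT_NEGATE] = lib_negate;
}

/* Toy "PLT stubs": fixed code that fetches a slot and jumps through it.
   Callers only ever reference these stubs, never the library directly. */
int call_double(int x) { return offset_table[SLOT_DOUBLE](x); }
int call_negate(int x) { return offset_table[SLOT_NEGATE](x); }
```

After `load_library()` runs, `call_double(21)` reaches `lib_double` without any call site having been patched; only the table was written. The design choice is that the stubs never change while the table does.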
So one cool thing to do, which you'll have to do for part 3, is to look at that binary and see how this functionality actually looks in the code.
And the cool part of the way this is done: why load a library that's never called, right? You may have a complicated program, or you may have lots of libraries. So the idea is that the first time through, the PLT entry points to code that basically says, hey, look, this is the library function that was called, go resolve it. Once it's resolved, the GOT entry is filled in, and all the subsequent calls go straight to the loaded library function.
The key thing here: for this to function, should the program ever be able to change the procedure linkage table, the PLT? No, it shouldn't ever change, right? These are always little stubs that say, hey, grab something from the global offset table and jump to it. Now what about the global offset table? Fundamentally, this is a table where, at run time, we have to write the addresses of all the library functions that we're using in our code. So fundamentally this table must be writable. That's actually a very important fact that's going to come up when we talk about exploiting all kinds of vulnerabilities; format string vulnerabilities are one of the big ones. So the fact that we can write to the global offset table is incredibly important.
So what's happening when a program runs? Just like we talked about at the end of networking, one good way to check yourself is to think about everything that happens when you put google.com into the address bar of your browser and hit enter. Another interesting exercise is thinking about what happens when you're at a bash prompt and you type in ls and hit enter. What are the exact steps that happen in order for that program to be executed? So what does bash do? Is bash anything special? Kind of.
I was thinking, you know, when I first asked that question, I should give you guys time to think. It is a shell. There are certain programs that are designated as shells. The idea is that there isn't anything really special about the program itself; what's special is what it does. It accepts input from you and executes programs based on that. So, not anything super special.
So how does bash work at a high level? What happens when you type in ls and hit enter? The first thing it has to ask is: which ls are you talking about? Which exact program? The ls program that's in the current directory? The one in /bin? The one in /usr/bin? The one in /sbin? There could be many different ls programs, so it needs to know exactly which one you mean. There's actually a procedure, as we'll see, by which it figures out exactly what you mean. So let's set that aside and say bash resolves it to /bin/ls.
Yes, this is all it does. Bash is incredibly stupid. All it does is figure out which program you want to execute, and then it forks. What does fork do? It creates a new process that's exactly the same as the old process. Exactly the same except for what difference? The return value of fork is different, so each process can know whether it's the parent or the child. The parent is going to remain as bash, and the child is then going to call exec with the command you passed in, /bin/ls. And what does exec do? Yeah, it does all this. Really, exec is a wrapper around the execve system call, so eventually exec will call execve, and then the operating system will say: okay, this process called execve. That means it wants to completely change itself from bash into /bin/ls. So the OS reads the ELF header of the file, loads everything at the proper memory addresses, and starts executing at the entry point.
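The fork-then-exec pattern described here can be sketched in C roughly as follows. This is a minimal illustration, not what bash actually contains: `run_program` is a hypothetical helper name, it skips the path lookup step entirely by taking an absolute path, and it already includes the parent waiting for the child, which is how a shell knows when to show the prompt again.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with the given NULL-terminated argv and
   return its exit status, or -1 if it could not be run or did not
   exit normally. */
int run_program(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1; /* fork failed */
    }

    if (pid == 0) {
        /* Child: fork returned 0. Replace this process image with the
           new program. execv only returns if something went wrong. */
        execv(path, argv);
        _exit(127);
    }

    /* Parent: fork returned the child's PID. Wait for it to finish. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing `"/bin/ls"` with an argv of `{"ls", "-l", NULL}` would run ls as a child and hand back its exit code once it finishes.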
Then the program runs, and bash waits, listening for when its child PID is done executing. When the child is done, bash gives you the prompt again and asks what you want to do next. So at a high level, this is what shells do. If you ever have a chance to write your own shell, I highly recommend it. It's a good learning experience.
So this is what we're looking at: how does the operating system take this file, right? That's the important thing to remember. It's just a file on disk, but the OS turns it into a running process. It's actually pretty simple, right? We already saw the ELF file format. The OS parses that header and copies everything into memory.
A very cool thing to look at on a Linux system is /proc, a special file system that has information about the processes running on your system. You can cat /proc/PID/maps, where PID is the process ID of the process you're interested in, and that will show you the memory layout of the process and exactly where things are. So the proc file system is really cool. You can see it has all of the processes that are running on this system. Does anybody remember? I think there's also a link to access the current process. So we can look at the maps file. Can you all read this text? Maybe you can't see it. Let me make the window bigger. We'll pipe this through less.
What this is telling us is some interesting stuff. We can see that the memory region from 00400... to 0048a... is readable and executable, and I don't know what the p means, but you'll notice that it's not writable. So it's read and execute only in memory, and this entry means that the hud-service file under /usr is mapped at that location. You can see that for all of these addresses, so you can see everything about where these things are mapped. Anyways.
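The same maps file can be read from inside a program. Below is a small Linux-only C sketch that opens /proc/self/maps (the entry for the calling process) and counts the mappings whose permission field contains a given flag. The function name `count_mappings_with` is invented for this example, and the parsing assumes only the leading "start-end perms" part of each line.

```c
#include <stdio.h>
#include <string.h>

/* Count mappings of the current process whose permission field contains
   the given flag ('r', 'w', 'x' or 'p').
   Returns -1 if /proc/self/maps cannot be opened. */
int count_mappings_with(char flag) {
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL) {
        return -1;
    }

    char line[512];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char perms[5];
        /* Each line begins with "start-end perms", for example
           "08048000-08049000 r-xp" (illustrative addresses). */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3 &&
            strchr(perms, flag) != NULL) {
            count++;
        }
    }

    fclose(f);
    return count;
}
```

Calling `count_mappings_with('x')` in any running process should report at least one region, since the code being executed has to live in an executable mapping.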
My question is, why does the hud-service file have three different mappings in memory? You've got one mapping that's executable, one that's just readable, and one that's writable. Yes. So let's look. The first thing is, I don't know what this file is, so I'm going to run file on it. file tells me it's an ELF 64-bit executable, so it's actually an executable program. And I can run readelf, which will show me all the different sections in the ELF file. This is what we looked at earlier. So for instance, we have all of these sections. The table isn't displaying well right now, but we can see that the text section is allocated and executable but not writable. We can see there's a read-only data section which is just allocated; it's not writable. And, again, here's the GOT. The GOT is writable and allocated. So basically those three parts of the binary are just being mapped into different memory locations. Rather than being mapped right next to each other, whatever loaded it just decided this is where to put them.
So the OS decides where to put everything in memory. For the segments that are defined in the ELF header (sorry, not executables, segments), where it should put them in memory may be fixed based on the file itself. Or it may need to do some relocation. When we get into ASLR, we'll talk about how, if the binary is compiled specially, the operating system can relocate and change where the code segments are, so they're not in the same place every time. Then the OS sets the instruction pointer to the entry point specified in the ELF header, and the program starts executing from there.
So then what does the address space of the program look like? For a normal x86 process, I've found this does vary if you're on 64-bit versus 32-bit. If you're running a 32-bit application on a 64-bit system, the memory layout will look slightly different.
But usually, and for assignment 3, you'll be on just a 32-bit system. There, the top part of the 32-bit address space is reserved by the kernel. So you'll never see any addresses that start with all f's, because those are not for your program; those are reserved for the kernel. Your program gets everything below that. So the 0xbf... addresses, all the way down to the 0 addresses, are what's mapped for your program. If you're running it on a 64-bit operating system, I've seen that you get basically everything, so your addresses start with f's. I don't know why.
So when your program executes, what does it use as input? When you write a program, how do you get access to the command line arguments that are passed? In C, how do you access them? The second argument to main. The first one is an integer, argc, that tells you how many arguments are in the argv vector. The second is a char pointer pointer, argv, which is a null-terminated array of pointers to each of the arguments. Just like for any C function, these arguments are on the stack, above your function's frame. So if you think about what the stack looks like, above you is going to be argc, which is an integer, so 4 bytes. And above that is going to be what? argv, a pointer to pointers.
I actually want to dig into this, because it's something that I found really useful. The very first thing the kernel does is create the environment and argument section. We look at memory from the top of memory to the bottom of memory; this is how we're always going to draw the stack from now on. At the very start of the stack are all the strings of the environment and the strings of argv. This is really important for understanding exactly what happens when you give input to a program. When you change the parameters of a program, how does that change the memory layout of the program?
So we have the strings, then we have the pointers. We have to have those tables: this is a pointer to the first string up here, this is a pointer to the next string up here. Then we have argc. This slide is actually missing a step: there should be pointers up to the actual data for each of these things. Then we have argc, and then we have our stack. The stack, as we'll see, grows depending on how the program executes. All the function calls push new information onto the stack, so the stack keeps track of the current function calls so that we can go backwards. It's also used for any temporary storage that a function might need.
Yes. Is the stack copied when you fork a child? So when you fork a process, that's how you get a child. When you fork, the forked process has exactly the same memory layout and everything that the parent process has. So it's all shared? It's not shared: if you write in the child, you won't see it in the parent, unless you explicitly set up shared memory. It's separate, but the data starts out the same. This is part of the problem behind a lot of classical vulnerabilities: you have a forking web server that forks, and then the child is able to read some secret password that was input in the parent, because it's still there in the child's address space. So it can read it, but once it writes, it gets its own copy.
How much memory does the stack get? That's a good question. I think it varies, but it has a fixed limit. It's not infinite; at some point it stops, and we'll see why. So the stack has a fixed end point. Actually, this is a good question: I believe you should be able to see, in the ELF sections, a stack section, and from there the size that's allocated to the stack. Is there a termination character, like 0 or something?
The termination character is the fact that the next byte after that is not mapped to your program, so when your program tries to access it, the kernel will give you a segmentation fault. Is it possible to specify the stack size? Probably yes. I've never had to change it or mess with it, but I'm sure you can. What exactly do you mean by the space it might need for the arguments? Oh, for the stack specifically. So depending on the arguments, that will change the top part, and depending on the environment variables, that will change the next part. And then that kind of pink section in there is space that your program can use as it's executing. It can use that whole space all the way down to the end of that arrow, and it can't use any more. So the stack will grow down when it's being used, and when it's being freed it will move back up. So if you have an infinitely recursive call, you'll keep allocating memory on the stack until you try to write to one of those bytes that are outside that memory segment, and so it will throw a segmentation fault. So is it based on the compiler? I believe you should be able to change it based on compiler options, but I don't know. Good question.
Then we have space for any shared libraries, so any libraries that we're going to need will be loaded there. Then we have the heap. So what's the heap?
Every time we call malloc, we get memory given to our program. Technically malloc is a libc construct, so the kernel has no idea about malloc and free. The kernel has a system call called sbrk, which allows the program to increase the size of its heap, and that's how you get more memory from the kernel. I think you can also decrease it as well. So we have our heap, and the heap grows up, right? And similarly it has a fixed location where you cannot get any more. Now things change and get a lot easier when you have 64-bit address spaces, because then these things are very far apart.
Then after the heap there's a data section: the .data section, and usually the other one is called .bss. The .data is the initialized variables, the .bss is the uninitialized variables. And these are what kind of variables in your C code? Global. Where are local variables stored? Then we have, so we have the .bss and the .data, and then finally at the end, usually marked read-only, is the .text segment, which is your actual code. Usually this is around the 0x08 range. So some of this is by standards. For instance, argc and argv are actually specified in the POSIX standard, because you need to be able to write a main program in C and you want to be able to run it on any OS that's POSIX compliant, and you need to be able to read in arguments and everything. So that's actually all specified in the standard. The rest of this is probably a mix of maybe some standards, maybe conventions. Do you necessarily need the .bss above the .data? You could have swapped them, whatever.
Now that we understand how the operating system takes these raw bytes and turns them into an actual executing process, we need to understand what that means. What is this process? So what is a process? Is it just code that's executing? It has an ID. It has an ID that the OS gives it, right, so it has a unique ID for the time frame that it's
executing. Memory space. It has some memory space, and it has its own distinct memory space. That's the other aspect: the operating system and the hardware will enforce the fact that you should not be able to modify the memory of another process. Life cycle. Life cycle, you want to expand on that a little bit? Well, how the process starts, how memory allocation is going to take place, how it is going to execute for this frame. Yes, so the process definitely has a life cycle. In that life cycle, is it executing or not? The OS may only have, I know it sounds archaic, one CPU to actually execute on, so it may have to schedule one process to execute at a time. So it may have to do time sharing and kick some processes off, let somebody else execute, and then finally come back and start executing the other one, right? So they may not all be executing at the same time.
What else do they have? When a process tries to access a file, how does the operating system know whether to allow that or not? Was that a file descriptor? So a process does have open file descriptors, right. There are three open by default: standard input is 0, standard output is 1, standard error is 2. It may open new files, and then it will get a file descriptor that it can read and write from. But how does the OS know to allow it to open a file? Let's say we're all on a shared system and I were to write a program that writes to your ~/.ssh/authorized_keys file. If I did that, I could put my public key there and be able to SSH into the server as you. The user that started the program. So there's not necessarily permissions, it's a tricky thing, especially with Android now, but every process has a user ID and a group ID, and this specifies what permissions this process has when it's running. And it's actually a little bit more complicated: each process has a real user ID and group ID, an effective user ID and group ID, and a saved user ID and group ID. So
the real ID defines the user who started the process, right? This is why, when you're running not as an administrator on most systems, when you run, let's say, your browser, that's running as you. You're the user that's running that browser, which means the browser can do anything that you, the user, could do on the system. Essentially you're vouching for this process and saying, yes, I'll allow it to execute on my behalf and do whatever I can do. The effective ID is used to decide what it's actually allowed to do. This is kind of a fine distinction that we'll get into a little bit later. Saved IDs allow programs to drop and then try to regain privileges.
So take, let's say, a web server. We'll talk about it in a bit, but to bind to any port on a server less than 1024, you need to be the root user. But do you really want your web server, which is accepting requests from who knows where, to be running as root on your system, which, as we just saw, would mean that it's running with the permissions of root? No. But it needs to run as root to bind to the port. So the server starts up, binds the port, and then drops privileges. It actually doesn't use the saved IDs, we'll talk about that later, but it makes a call that says, I'm now going to be executing as a non-root user, the www-data user, and that's the user I'll be executing as.
OK, but let's talk about something. So why is this important? Yeah, so it's great for the administrator, right, because the administrator can kind of parcel out chunks of functionality to various programs: just run this program to change your own shell. It's really dangerous if you mess it up. Yeah, so what if I can take over and control this chsh program? What if I can trick it into editing any file, not just /etc/passwd? Maybe I can get it to add something to root's authorized_keys file, or I can get it to change the actual password file, the /etc/shadow file, which actually
has all the passwords. So any vulnerability in this chsh program allows me to execute with the permissions of that program. In this case they are more than my permissions, because this program has the authority to write to that file and I do not have that authority. And if you break out of it, you're actually running as root? Yes, exactly. So that is the key, that's what we want to do. All right, this is a really important part, so we're going to stop here, come back, and dive into more detail later.