All right, cool. Good morning, everyone. It's Thursday, and I hope you're all doing well out there in quarantine land. Okay, a couple of things. Actually, there are updates on all the assignments.

On the web of trust assignment, I'm sorry that I gave some of you an accidental heart attack and some additional stress. There was a bug in my grading script and it missed a whole bunch of your submissions, so they weren't included. I knew right away there was a problem when I got 20 or 25 emails. No, it wasn't certain users. There was a bug in my grading script where it wasn't picking up your submissions and wasn't importing them into my local keyring, where I have everybody's stuff. So when I went to create it, the signatures you submitted were not on there. I'm not really sure why exactly that happened; it's kind of a baffling bug for me. Anyway, we fixed it, regraded, and got that all sorted out, so I apologize for that.

The other interesting thing that came up is that, of course, we got more signatures, which means we got more scams. There's actually a person who got 27 signatures on their adversarial key, so congratulations to that anonymous person. That was the winner. That's pretty impressive, but it's still not that much when you think about it, right? We have about 350 people in the class, so that's less than 10% of the class. For as high as that sounds in terms of scamming, you all actually did a great job of detecting that and being vigilant.

Cool. Okay, assignment five. I wanted to discuss this a little bit, because another funny bug happened here. I know some of you did the extra credit part of assignment five. How many people in the chat attempted the assignment five extra credit?
Yeah, so some people got it, and a lot of people tried it. For the people who tried it, one of the most difficult things is that there was no intelligence given about what the distribution of the key was. The funny thing I want to share with you is that usually the distribution there is pretty difficult; I think only a few people have actually done it in the past. What happened was that when I was creating the Gradescope version, I had a bug and I duplicated the line from part four. So it was the exact same distribution as part four: only five characters. It wasn't all lowercase, but it was five characters, right? It was uppercase and numbers. So that was not exactly what I wanted.

It's funny, I was thinking about it. When you have no intelligence about what the distribution of the key space is, it's actually much more difficult to even approach the assignment, right? Because if you knew the exact key space, you could easily search through it. I'm not going to say what I was going for, because maybe I'll use it in the future, but definitely more than that. So no, everyone does not get extra credit; that's not how that works. It still fits within the spirit of it, because nobody knew what the distribution was. It just happened to be easier than I thought it was going to be, which I thought was pretty funny.

Cool, okay. Now on to a new assignment: assignment six. Before I start, I'll say this is one of my favorite assignments, and I hope you'll find it pretty cool too.
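Going back to the assignment five key space for a moment: a rough way to see why a known five-character key is easy to search is to count the candidates. The sketch below assumes an alphabet of 26 uppercase letters plus 10 digits (36 symbols), which is one reading of "uppercase and numbers"; the function name `keyspace_size` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of candidate keys: alphabet_size raised to the power `length`.
 * No overflow check, which is fine for the small inputs used here. */
uint64_t keyspace_size(uint64_t alphabet_size, unsigned length)
{
    assert(alphabet_size > 0);
    uint64_t total = 1;
    for (unsigned i = 0; i < length; i++)
        total *= alphabet_size;
    return total;
}
```

Under that assumption, `keyspace_size(36, 5)` is 60,466,176 candidates, which a simple loop can exhaust. The search only becomes easy once the alphabet and length are known, which is the point made above.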
This is essentially the culmination of all of the things that we've been talking about here. We're going to put them into play, and you're going to be hacking binaries and other types of things. This combines everything we've learned and done with the Bandit levels, so the Bandit levels will come in super handy here. You'll be reversing stuff, all kinds of fun stuff.

You have a lot of time on this assignment, but these are kind of like puzzles and challenges, so it's very difficult to leave them all for the end. Please start early. It's due on May 1st, and we're not going to take any late submissions, because finals week is the week after that. I want everyone to focus on this assignment, be done with it, and then focus just on the final. Yes, this is going to be the last assignment.

The goal is that you're going to be breaking a number of levels. We'll walk through this; actually, I'll do it right now. Assignment six is currently live on the site, but you don't have any way to start it yet because you don't have a username and password. I'll be providing those to you later today, so stick with me on that.

When you get all set up, you'll SSH into this machine. You'll notice it's just Ubuntu 18.04. This is why we went over those Bandit levels, so you know how to interact with this and how to work on the command line. If you run ls -la on /var/challenge, you'll see a number of different challenges in that directory. These are essentially different challenges that you have to break. Let's look at one of these.
This is where it gets into why setuid is important. The way this all works under the covers is that in each of these levels under challenge there is some kind of executable. We can see it's a set-group-ID executable: the s bit is set on the group execute permission. This means that it runs with the privileges of the just-execute-me group.

We also have some handy tools. There's a utility called score that, when you run it, outputs who has solved what on which level. It's like a cool leaderboard where you'll see people moving up as they play the game.

I'll even show you how to break the first level, called just-execute-me. Spoiler alert: all you have to do is just execute it. So you do this, and it says congratulations, you broke this level, adding you to the group just-execute-me. If I run score now, I'll see that I have a check mark under just-execute-me. So there you go, that's how you break this first one. Yeah, this is the easy one. It gets more difficult.

Okay, some other tools you need. There's a program called leet. Think about it this way: with this just-execute-me challenge, the way I know that I've broken the level is that it adds me to the group just-execute-me. If I control this binary, I can get it to do whatever I want, and what I want it to do is execute this command, /usr/local/bin/leet, because that will add me to the level. That's the entire point of the leet command: once you break one of these levels and you're operating as that group, the leet command adds you to that group forever, and that's how it actually counts as your point. So it'll depend.
Let's actually look at objdump -d. I can't remember what this does, so let's look at the main function. Yeah, so it's calling execve, which is execute with some arguments, and if we run strings on it, I'm sure we'll see what it does. Yeah, /usr/local/bin/leet. And if we run it with strace, we can see the system calls that it makes. If we go to the very top, we can see execve. So basically all that just-execute-me does is call /usr/local/bin/leet. That's all it does; it executes that command.

Some of the challenges will call it for you, and that's why when you break one, you'll get the level. If a challenge doesn't call it, then you'll have to force it to call it, using some of the vulnerabilities we'll talk about. We actually haven't covered all the things necessary to break all the levels. We'll do that as we work through the binary hacking and application security slides. But start now: there are plenty of things to work on already. It's super important.

Okay, cool. There's a list of tools here, so if you don't know these tools, I highly encourage you to check them out: objdump, gdb, ltrace, strace. Wireshark is the network tool we talked about. scp is very handy for copying files from your local machine to the server, or from the server to your local machine so you can analyze them there. Let's see.
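To make the execve step described above concrete, here is a minimal sketch of running another program by path. It is not the source of the challenge binary: the helper name `run_and_wait` is hypothetical, and it forks first so the caller survives, whereas the level as described seems to call execve directly. The path spelling `/usr/local/bin/leet` is as heard in the lecture and may differ on the real server.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with no extra arguments in a child
 * process and return its exit status, or -1 on failure. */
int run_and_wait(const char *path)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the target program. */
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };
        execve(path, argv, envp);
        _exit(127);               /* only reached if execve failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as `run_and_wait("/usr/local/bin/leet")` would show up in strace as an execve of that path in the child, which is the same system call seen in the trace of the first level.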
Okay, let's see, answering some of the questions. Yeah, you'll get a username and password combo later today; I'll make a post on Piazza on how to do that and where to get it. Yes, you can change the passwords that you get, that's totally fine. Recommendations on which ones to start with? Yes, we are still having a final; we'll talk about that in a second.

Okay, evaluation. The entire assignment is essentially out of 100 points. Each level is worth 12 points, with a maximum of 105 points that you can earn from breaking levels. I'll let you do the math on that, but essentially it means that if you break all 10 levels, you get 105 points.

Now, there is another extra credit, which is really just a tool to incentivize you to start early. If you finish five levels, which is roughly 60% of the assignment, is that right? If you finish them by the 24th, so basically a week before the deadline, you'll get an additional 10 points on the assignment. This stacks with the previous one, so the maximum you can receive on the assignment is 115 out of 100. Cool.

Then, submission instructions. If you write any kind of code, just like in previous assignments, submit that along with a README. The README is really important here, because we want to see that you know how to break these levels. So submit a README file that has your name, your ASU ID, and some description of how you broke each of the levels. There'll be a Gradescope assignment where you can submit this.

There is a bug bounty on this server. I'm giving you permission: if you manage to get root, which means you have the privileges of the root user, you'll get 50 additional points. The reason this is here and not part of the extra credit is that it's not an intended thing. It's a bug bounty. No, it's never... well, I'll tell you a story in a second.
So yeah, it's not intentional, but if it does happen, I want you to tell me about it so that I can fix it, and you'll get 50 points. I borrowed the concept of this assignment from Giovanni Vigna, who is a professor at UCSB. When I was an undergrad taking his undergraduate security course, he had an assignment like this, with a very similar bug bounty. The very first thing I did was figure out what Linux version it was. I found that it was vulnerable, downloaded an exploit, and got root on it basically immediately. I mean, if you have the ability to get root, you should be able to get the rest of the points very easily as well, so that's on you.

Anyway, I've seen it done in the past. Nobody has done it in any of these assignments. I like to think I'm more careful in setting these things up, but hey, who knows. It's more there to encourage me to make sure that everything is secure. But if it's not, I want you to take advantage of it and get all the points.

Any questions overall on the assignment? Cool. All right, this should be fun.

When should you start for maximum efficiency? Honestly, as soon as possible. It's easy to get stuck on some things, so you want to make sure you start early. I think that's going to be the best way to go. We'll discuss the final exam later; I want us to focus on this. We also have an in-class CTF next Thursday. I think that's the 16th, is that right? Yeah. So we'll talk about the CTF on Tuesday.

I'm telling you, I've seen students fail because they start this late. Because of the nature of the assignment, it's not like a coding assignment that you can just crank out in one day. It's very possible to get stuck on these things.
I'll tell you this: once the assignment is up, it'll become clear which levels are easiest and what to start with, so I don't feel like I need to give you that information. It will come up organically from the list of users. Oh, there will be, and that's on them. You're all adults; if you want to procrastinate, that's on you. Like I said, we'll talk about the CTF stuff on Tuesday.

Cool. Now let's move on. We're going to talk about x86. Okay, so back to assembly.

To refresh our memory, what we're focusing on here is this: for a program that's compiled from a high-level language like C, what is the assembly language that the CPU actually executes, and how does that happen? We've looked at how memory operations work: how to move data from one register to another, from memory into a register, and from a register back out to memory.

There are a number of other operations. There's mov and xchg. We'll definitely go a lot into push and pop; those come up a ton. There are also all our fun binary arithmetic operations: add, subtract, multiply, divide, increment, decrement. And there are all the logical operators we're familiar with: and, or, xor, not, all the things we would expect, just like in normal programming languages.

I actually don't know the difference between idiv and div; you'd have to look it up. For a lot of these, I don't know exactly what the semantics are, but they're all very precisely defined. If you look up x86 idiv, it will tell you exactly what the semantics are. Oh, maybe the i is for immediate? That kind of makes sense.
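On the idiv versus div question: idiv is the signed divide and div is the unsigned one. A small sketch of why that matters is to divide the same 32-bit pattern both ways. The function names below are made up for illustration, and the cast from `uint32_t` to `int32_t` relies on the usual two's-complement behavior of gcc and clang.

```c
#include <assert.h>
#include <stdint.h>

/* Treat `bits` as a signed 32-bit value and divide (roughly what idiv does). */
int32_t divide_signed(uint32_t bits, int32_t divisor)
{
    assert(divisor != 0);
    return (int32_t)bits / divisor;
}

/* Treat `bits` as an unsigned 32-bit value and divide (roughly what div does). */
uint32_t divide_unsigned(uint32_t bits, uint32_t divisor)
{
    assert(divisor != 0);
    return bits / divisor;
}
```

With the pattern `0xFFFFFFF8`, the signed reading is -8, so dividing by 2 gives -4, while the unsigned reading is 4294967288, so dividing by 2 gives `0x7FFFFFFC`. Same bits, different instruction, different answer.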
Yeah, I don't know. If somebody looks it up, just link it in the chat.

We also need the ability to alter the control of the application. When we talk about control, what are we actually talking about here? Anyone? Okay, from the chat: idiv and div are signed and unsigned. Yeah, so, control: controlling the flow of the program, the control flow, meaning which things execute. It's almost second nature at this point, but we write some code like: if foo is equal to 10, then print hello, else print goodbye. Wow, okay, this typing is not going great. So you know roughly what happens here: which instructions get executed changes based on this value. We need some way of doing that in assembly language as well.

We have various kinds of things. jmp is an unconditional jump that just says go here; it's kind of like a goto statement. call and ret are going to come up a ton, and we're definitely going to go over them. This is how we're going to pull back the covers and understand, when I call this print function, how that actually happens and how the CPU knows to go back and where to go back to. int and iret are for interrupts: int raises a software interrupt and iret returns from an interrupt handler. We're not going to get into that a ton.

These are what we call unconditional transfers: whenever this line of assembly executes, it always jumps to the other instruction. Then we have conditional jumps: jump if not equal, jump if equal, jump if... I can't remember what ae means; jump if greater than or equal, I think.

The other distinction is that control flow can be direct, where we say: if this is not equal, jump exactly to this point in the program. Otherwise, we can have indirect transfers.
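The if/else example and the direct versus indirect distinction can be sketched in C. This is only an illustration: the function names are invented, and the comments describe what a compiler typically emits, which can differ with optimization.

```c
#include <assert.h>
#include <stddef.h>

/* The lecture's example: which branch runs depends on the value of foo.
 * Typically compiled to a compare followed by a conditional jump. */
const char *greeting(int foo)
{
    if (foo == 10)
        return "Hello";
    return "Goodbye";
}

typedef int (*op_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Indirect transfer: the target is a value held in a variable (usually a
 * register at run time), not an address fixed in the instruction. */
int apply(op_fn fn, int x)
{
    assert(fn != NULL);
    return fn(x);
}

/* Picks the target at run time, then transfers control through the pointer. */
int pick_and_apply(int which, int x)
{
    op_fn table[] = { add_one, twice };
    return apply(table[which ? 1 : 0], x);
}
```

Calling `add_one(5)` by name is a direct transfer; `pick_and_apply(1, 5)` reaches `twice` only through a pointer chosen at run time, which is the indirect case.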
An indirect transfer says: hey, jump to wherever is in this register. Then there are other miscellaneous things: input/output instructions, and nop, the no-operation instruction.

Why would we want a no-operation instruction? It seems kind of silly. Why would we want something that does nothing? Yeah, so on some CPUs we actually need a nop after a jump because of pipelining, which is kind of crazy. We may want to delay certain operations. There are different kinds of reasons, so there are legitimate reasons why you need this. It's kind of interesting.

Okay, so if we go back to our diagram. You will be taking an operating systems course that goes into all the concepts of an operating system and what it does, so we're not going to get into all of that. From the chat: EAX is a register, yes, this is the register EAX.

We talked about how our application is running on the processor. This is the very nice thing that we've built up from programming. You've all written applications that do stuff. Have you ever had to worry about talking to a hard drive? So here's the hard drive. That's kind of a crappy drawing of a hard drive, but it's connected to your computer. It's an HDD, it has a spinning disk, and it speaks... what is the protocol? SATA, I think? No, SATA is the connection; there's a protocol that's used to talk to it. There are hard drives and there are SSDs, and those are different. Your hard drive can be RAIDed. No, RAID is different; it's on top of hard drives. Man, I cannot remember what that protocol is. NVMe, is that it? Okay, we'll go with that for now. SCSI, I think, is different too. Anyway, it doesn't matter.

Okay, cool. So somebody has to tell the hard drive:
"Hey, get this data at this sector." Right? So when you're reading and writing files, what's actually happening? In computer science, and throughout all of computing, we've created these beautiful abstraction layers. Your application doesn't have to say: okay, if there's a hard drive connected, issue this command to the hard drive; if it's a floppy disk, do this command; if it's an SSD, do a different command because that has better performance; if it's a flash drive, use yet another type of command.

So we've created this layer. How do you read and write files in your app in C? No, that's C++. Yeah, okay, I'll allow it. So: open, read, and write. That's one way. There are other ways, and as we'll see, these are kind of wrappers around different things. Anyway, you can call these functions to open a file, read from a file, and write to a file. What's actually happening is that a bunch of logic in your operating system handles all of the complexity of how to actually talk to the hardware device.
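As a small sketch of the open, read, and write interface just mentioned, the two helpers below write a buffer to a file and read it back. The helper names and the file path in the usage note are made up; only `open`, `read`, `write`, and `close` are the real POSIX calls. Nothing here knows or cares what kind of storage device is underneath.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Create or truncate `path` and write `len` bytes; returns bytes written or -1. */
ssize_t write_file(const char *path, const char *data, size_t len)
{
    assert(path != NULL && data != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, data, len);   /* the OS handles the device details */
    close(fd);
    return n;
}

/* Read up to `cap` bytes from the start of `path`; returns bytes read or -1. */
ssize_t read_file_prefix(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap);
    close(fd);
    return n;
}
```

For example, `write_file("/tmp/demo.txt", "hello", 5)` followed by `read_file_prefix("/tmp/demo.txt", buf, sizeof buf)` should return 5 both times on a normal Linux system.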
But your app doesn't have to deal with any of that complexity. For this to work, we need some kind of mechanism, because if you remember, your application is essentially running on the CPU and its instructions are executing. We need a way to signal to the operating system: hey, I would like to open a file for reading. And then: hey, I would like to read 20 bytes from this file that you opened for me, or I would like to write 20 bytes to this file that you opened for me. I think there's close, too.

Similar things happen with networking, like we talked about. This is the beautiful reason why our applications don't have to deal with the TCP three-way handshake: the operating system does it for us. All we ask the operating system is: hey, I'd like to make a TCP connection to this IP address on this port. And then the operating system takes care of everything.

To make these calls happen, we need a way for applications to talk to the operating system, and that is through system calls. This is how your application talks to the OS. The tricky thing is that this varies a little bit depending on the exact architecture and on the operating system itself. Yes, we are only focusing on 32-bit in this section, a hundred percent.

So when we want to ask the operating system to do something for us, we invoke a system call. Our libraries will usually invoke it for us, but on 32-bit x86 Linux what we do is issue interrupt 0x80. int means interrupt, and 0x80 is just, by convention, the interrupt that says the application wants to make a system call to the OS. The eax register contains the number of the system call we want to make. We can use this to write our own, and I guess we'll do this right now.
I think this could be fun. We can quickly write our own hello world application in assembly. A hundred percent.

Let's see, I have this. We'd look up the Linux x86 system call numbers. Why does this look so ugly? All right, so we can see that write is hex 4 in the eax register. Then the other parameters: the file descriptor to read and write from is in ebx, the buffer to read and write from is in ecx, and the number of bytes is in edx. We can see all of this, and we can see there's actually a ton, right? There are over 337 possible system calls in Linux, which is kind of crazy. We don't need all of this for now. This is just to show you where you can get this information. Also, this is exactly how computers work, so if you want to understand security, you need to understand explicitly what's going on.

Cool. So we can walk through this code. Let's try it. I'll ssh into my server here, 365. So I don't have emacs. That's great. I'll install it very quickly.

Cool. Okay, so we have a .data section, and we're going to make the hello world string. We're saying we want a string somewhere in memory, "hello world\n". Then a .text segment. Now we're saying that we're going to have a main label here that we want to be global, so that when we compile this, everyone can see it.

Let's look at this up here. Oops, that's not what I want. Oh, that's cool, it puts you up there. Okay. And the very cool thing is you can use man pages to look at the manual here. Is it difficult to read the text? Yeah? Okay, how's this? Okay. Well, now I have to use vim. So thanks.
Okay, so if we look at what this system call is: write writes to a file descriptor. We're writing to a file descriptor, from a buffer, some number of bytes. One of the really important things to understand is that by default there are three file descriptors that are always open for every process in Linux: standard input is file descriptor 0, standard output is file descriptor 1, and standard error is file descriptor 2.

So essentially what we want to do is call write on file descriptor 1, standard output, with the string "hello world\n". And how many bytes is that? One, two, three, four... 12. I could have cheated and looked at the slides. So how can we actually accomplish this?

We can look at our syscall table. This is really small. There we go. We can look here and say, okay, write. I first need to move $4 into eax. Nope, we'll look at this in a second; we're actually going to debug this, it's going to be great. So we move 4 into eax, which we know because the table says the system call number is 4. We then move 1 into ebx, the file descriptor, so we're going to write to file descriptor 1. Now we move $hw into ecx. What this does is that when we assemble this into a binary, the assembler puts the string somewhere, and then, as we'll see, the instruction moves the address of that string into the ecx register, which is what we need. Then we move 12 into edx. We need to issue int 0x80, and at that point it should write out hello world. And then, because we're very nice people, we're going to exit cleanly.
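The register setup just described maps closely onto the `syscall()` wrapper that glibc provides on Linux, so the same call can be sketched in C without writing assembly. This is a sketch rather than a translation of the lecture's file: `raw_write` and `hello` are made-up names, and on x86-64 the wrapper uses a different instruction and system call number than the 32-bit `int 0x80` path.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the write system call by number instead of through write().
 * SYS_write is the number for the current architecture: 4 on 32-bit x86
 * (the value loaded into eax above), 1 on x86-64. */
long raw_write(int fd, const void *buf, unsigned long count)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, count);
}

/* Same arguments as the assembly walk-through: fd 1, the string, 12 bytes. */
long hello(void)
{
    static const char hw[] = "hello world\n";
    return raw_write(1, hw, sizeof hw - 1);   /* sizeof counts the NUL, so subtract 1 */
}
```

Calling `hello()` should print the line and return 12, the byte count that went into edx in the assembly version.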
So if we look here, it should be one. Oh, that's interesting. Has this been wrong for a long time? Move $1 into eax... oh, okay, no, that makes sense. And ret, so we're going to return from this function to whoever called us. We'll look at this later, but whatever is inside eax when a function returns is the return value.

So if we write this out... well, let's just compile. Oh, cool. Okay, it's installing a bunch of packages, because apparently we don't have all the things here.

While this is happening, I'll answer the question. The other beautiful thing the operating system provides is that even though an application is essentially executing on our CPU, the operating system has protections in place so that the app should not be able to take down the whole OS, no matter what it does. If you remember, you'll have different applications running on your operating system, so the operating system needs to do super cool things like deciding how much CPU time each process gets, and it's constantly switching between them. You also wouldn't want one application to be able to crash or take down another one, so there are protections in place for that.

So now we can see we have an a.out file. We can run it, and we see that it says hello world, which is exactly what we wanted.

We can use a very cool tool that I showed earlier. strace is a system call tracer. It uses the ptrace debugging interface to output all of the system calls that this binary makes. There's actually a lot of stuff that has to happen, because it's loading up all the libraries and everything. But as we get to the end... wait, huh, it says stat 1. I would have thought it would say write.

Okay, let's debug it now. Let's see, do I even have gdb? No. Okay, install gdb. Okay, yeah, we can do a few things here.
So we can debug a.out. I'm going to put a breakpoint on main. gdb actually has an amazing user manual that covers how it works and what it can do. Really, everything you can learn about debugging with gdb is in there. It's amazing.

So I set a breakpoint on main; b is breakpoint. I'm now going to run the program with run. If you're used to gdb, you'll try to do something like l to list the source code, but here we're debugging assembly code. So what I'm using is x, examine. I'm telling it: interpret what I'm giving you as instructions, do 20 of them, and start at $eax.

Now we can see memory, and this is super cool. We're seeing exactly how things are laid out in memory. Here at main, at exactly 0x4004d6, we have our instructions, and these are almost exactly the instructions that we wrote. We have move 4 into eax and move 1 into ebx. Our move of $hw into ecx has been replaced with this address. So let's look: what is at that address? x/s. If I look there, I see that at this memory location inside our program are the bytes h, e, l, l, o. And I can also do 12... is it c? Yeah, so this is outputting each of the bytes. These are examining the exact same memory location: this one interprets it as a string, and this one says show me 12 characters in hex, so at the byte level. Here it's showing me 48, and if you look in an ASCII table, 0x48 is capital H, 0x65 is lowercase e, then 6c 6c is l l, then o, space, w, o, r, l, d, newline.

So we can see these bytes are in memory. When gcc compiled our assembly into this ELF file, it recorded that these bytes of hello world\n should be loaded at that specific location. Then we can use ni to go to the next instruction. Oh, I see what's happening.
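The byte-level view that gdb printed can be reproduced with a few lines of C, which may help when checking a string against an ASCII table. The helper name `hex_bytes` is made up, and the sketch assumes the string begins with a capital H, since the debugger output above started with 0x48.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format `len` bytes starting at `p` as space-separated two-digit hex values
 * in `out`. `out_cap` must be at least 3 * len and at least 1. Returns `out`. */
char *hex_bytes(const void *p, size_t len, char *out, size_t out_cap)
{
    assert(p != NULL && out != NULL);
    assert(out_cap > 0 && out_cap >= 3 * len);

    const unsigned char *bytes = p;
    char *w = out;
    for (size_t i = 0; i < len; i++)
        w += sprintf(w, (i + 1 < len) ? "%02x " : "%02x", (unsigned)bytes[i]);
    *w = '\0';
    return out;
}
```

For the 12 bytes of `"Hello world\n"` this produces `48 65 6c 6c 6f 20 77 6f 72 6c 64 0a`, matching the H, e, l, l, o, space, w, o, r, l, d, newline reading given above.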
Okay, let's kill this real quick. The problem is I didn't put -m32, and of course I don't have all those libraries installed. Okay, that's fine. That makes sense: it's because I didn't compile it as 32-bit.

So I can single-step, going instruction by instruction, and I can use ni, next instruction, to step over function calls. Let's single-step. Looking at rip, I can see that I previously moved 4 into eax. I can do info registers to print out the values of all the registers, and I can see that rax has 4 in it. My next instruction is going to move 1 into ebx. Step instruction, show my registers: I have 1 in ebx. My next instruction is going to move that hw value into ecx, and then we can see ecx has the address of our string. We're then going to move 12 into edx, then do int 0x80, which will print out hello world, and then we'll return.

As you can see, the gdb interface is not great. There are tons of different configurations you can use to make gdb better for debugging, especially binary debugging. Usually I have GEF set up on my machine; I've gotten used to it. So let's install it and see how things change.

Now, if I make this a little bit bigger, it shows me very nice output. At the top it shows me all of my registers, rax and so on, and I really want to get rid of these. It shows me exactly where I am in the binary, so I can single-step instructions and see all of this update as I go. Yeah, there are other tools you can use for reverse engineering specifically. I really like this one.
It's pretty nice. It shows you a lot of cool information. For instance, with rcx, it knows that this memory address points to something, and it figured out that it's a string. It shows you the legend here that marks what is a string, what are stack locations, heap locations, all kinds of stuff. Cool. Oh, I should do this on the other machine. Okay, that's fine for now.

So this is a brief look into how system calls work and how we can write assembly code that actually does stuff. And this happens all the time. Normally, in some function, in main say, you'd write... this is not valid, but you'd maybe write printf, something like this, right? What's happening is that printf is a function inside libc. The library provides this printf function, but under the covers printf does a lot of work and then eventually calls write, just like this. This is why, typically, when you write C code you're talking to a library that does the system call for you, so that you have a nicer interface. But under the covers, all that's happening is that these things get translated by your library down to these system calls.

Cool, all right. So we did that. We've seen we can compile this, and we can look at the file format here. We can see it's an ELF file, and you know what, I'm sick of looking at the 64-bit versions. I'm going to copy this over, and we're going to run gcc -m32. There we go. Now we have a 32-bit binary: we can see that it's an ELF file and it's 32 bits.

Now I wonder, if I strace it... yeah, perfect. Okay, that should have been my first sign that something was wrong. Here, in this write system call, I can see that it's writing hello world. So that's cool.
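The idea that printf does its formatting in the library and then hands the finished bytes to write can be sketched with a toy version. This is not how glibc implements printf (the real one buffers output and handles far more cases); `tiny_printf` and its 256-byte limit are invented for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Toy stand-in for printf: format into a local buffer in user space, then
 * make a single write() call on standard output. Returns bytes written
 * or -1. Output longer than the buffer is truncated. */
int tiny_printf(const char *fmt, ...)
{
    assert(fmt != NULL);

    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, sizeof buf, fmt, ap);   /* library work, no system call */
    va_end(ap);

    if (n < 0)
        return -1;
    if ((size_t)n >= sizeof buf)
        n = (int)sizeof buf - 1;                   /* formatted text was truncated */

    return (int)write(STDOUT_FILENO, buf, (size_t)n);   /* the actual system call */
}
```

Running a program that calls `tiny_printf("%s %d\n", "hello", 42)` under strace should show one write of 9 bytes to file descriptor 1, the same kind of line seen in the trace of the assembly program.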
That works exactly as I would want it to: strace shows you exactly what system calls are being called. And now I can step through here. We'll see that the instructions changed, but now I can do things like use the eip register, which is what I thought I was doing anyway. Okay, cool.

Now, as we saw, and this is an important concept to keep in your mind: remember, this a.out file is just a file on disk, right? It's just bytes sitting on the disk. It's actually no different from any other file, like this hello.s file. There's no difference there. The difference is that a.out is in a format where the operating system knows how to take those bits and bytes in the file, put them into memory, and then start executing. So when I run ./a.out, it executes that program for me.

And you can actually look. We're not going to get super into it, but the /proc file system has a lot of information. So ls /proc/self... let's see, we're looking at maps. If you look at this file (I can't scroll... there we go), we can see that /bin/cat is mapped at various memory addresses, and roughly what those mappings look like.

So basically this ELF file gets loaded into memory. The operating system lays it out exactly how it shows here. It maybe does some relocation, which we won't talk about right now. And then it sets the CPU's instruction pointer to the start address that we saw in the ELF file. So if we run readelf, we can see the entry point address at 0x3c0. That's what it uses as the entry point, and then it starts executing.

Okay, so a couple of things. We're going to talk about how memory is laid out in a process.
All this stuff is intricately tied together once we ask: if an adversary has the ability to write arbitrarily into the memory of a process, what kinds of things can they do, and how does that work? And specifically, how does the layout work?

So the very first thing you'll notice is that in this class, whenever we draw memory layout, we're going to draw it with high memory at the top and low memory at the bottom. In all of these examples, we will always have high memory at the top. So this would be in 32 bits, as we talked about: the highest memory address will be all f's, 0xffffffff, and the lowest memory address will be all zeros.

Again, this is another super cool thing that operating systems provide to applications, if you think about it. This application can technically address effectively this whole memory range, 2 to the 32. But you can have multiple apps that all think they're talking to this whole memory range, and the operating system manages this with virtual memory in such a way that they each have their own view of what memory looks like. So the operating system tells super cool lies to the applications. Okay.

Now, when we write a program like... yeah, I do have emacs. Okay, except you said emacs was ugly. So if I make a little test.c file: what's a typical main function? What are the arguments that I can have here? argc... and I just did some stupid vim thing and overwrote it. Cool. int argc, char pointer pointer... oh god. Oh, there we go. Wow. argv. And actually there's another argument you can put there that most people don't worry about, which is the environment pointer. And then I can do things like, let's say, printf, percent c... percent d, argc.
So I'm going to print out what argc is. I'm going to maybe printf percent s, backslash n, argv[0]. Let's also do envp[0] and see what that outputs. And because I'm a good person, I'm going to return 0.

So I can compile. We're going to compile this again as 32-bit. So that compiled. If I run this, I can see, let's see: argc was 1, argv[0] was ./a.out, and this was envp[0]. Now if I add foo bar here, argc is 3, argv[0] is still ./a.out, and the environment pointer is this.

So let's debug this. Oh, I want to install GEF again, because that's much easier to deal with here. Okay, stuff happens. So now we're at our first call to printf. One of the cool things about GEF is you can see it's telling us that printf is being called, and that the first argument is a pointer to the string percent d backslash n, with an argument of 1. And we can step over the... shoot. Okay, next instruction. So the printf will print out 1, and then this next one, I guess it compiles it down to a puts. Okay, this is not great for this. Oh, because it's a constant string. Yeah, that makes sense. Okay. No, don't you... don't you... okay.

So when we execute this program, the important thing to remember is that when we type ./a.out, we're actually not executing anything ourselves, right? What we're doing is asking our shell, in this case bash: hey, I would like to execute ./a.out. And then it will execute it. And if we type in any extra parameters, we want it to pass those to the program. But how does this actually happen? This data needs to get from essentially our command line, or really from bash, into the program. What actually happens is it all comes down to execve. So if we look up a syscall reference: execve is a system call, right?
So the other thing that a program can't just do by itself is execute another program. It actually has to talk to the operating system, and this makes total sense, because we talked about access control and everything else the operating system has to do, the permission model and everything like that. And how is that enforced? Well, it's enforced because applications can't execute things themselves. They need to ask the operating system by calling execve.

So how do you do that? You pass in the pathname, and then you pass in argv, an array of string pointers, whose last entry I believe is null (that's how it knows how many arguments there are), and then a pointer to the environment. And essentially what happens is all this data gets passed into your program: the number of arguments is passed in as argc, argv is essentially what's passed in here, and the environment pointer is passed in for you to use. There's a ton of stuff here, but essentially everything eventually calls execve.

And then your operating system has to create the process. When it takes your program and executes it, it needs to be able to pass in that data. So first, your environment variables and all of your argument strings are actually placed in memory. This will definitely come up when we talk about how we can use these things to exploit buffer overflow vulnerabilities. We have pointers to them, we then have argc, and then we have a section of the program called the stack.

So why do programs need varying amounts of memory? Or another way to put this: do programs need a fixed, static amount of memory? And what are some examples? What kind of program doesn't need a fixed amount of memory? Chrome, that's a good example. But why does Chrome need a dynamic amount of memory? Yeah, we need more because we may be reading unknown user input.
With Chrome, we're going to a web page and visiting it, and we don't know how much data it's going to send us. So we need the ability to use different amounts of memory. As we'll see, the stack is used for function calls, and that'll be super important. The way we're always going to draw the stack is that it grows from high addresses to low, so it's always going to grow down.

And then we have some shared libraries and other stuff, and then we have another type of data, the heap, which grows up. We'll see different ways this works, but essentially, when you call malloc and free in C, that memory is being managed on the heap for you. What's the C++ equivalent? Is it new and delete? I believe that's the equivalent. Yeah, that's allocating things on the heap, and that allows your memory to grow in a different kind of manner. And then you have your data sections, and then finally you need your text section, your actual code.

So this is roughly the layout of how memory looks to your applications. And it's kind of nice: you have this dichotomy where there are two different memory areas that can grow. You have your stack, which grows down towards lower memory, and the heap, which grows up, so they can each grow in their own direction. Cool.

Okay, so we've actually looked at this a lot; we've looked at disassembly a ton. Disassembly is this idea that we want to be able to take those raw bytes.
So if we look, uh Right, if we look at hex hex dump, this is actually the bytes that are inside that a.out file So some of it at the top is this is all probably elf header stuff that we don't need to worry about But this stuff in here is actual code and unless you're I guess so well versed in this you can look at this and see the matrix for what it is Really we need something that can decode this for us. So we need Some kind of process and if you think about it What we're talking about here is we originally had a c file that we So we took a c file to assembly And then we'll just call it a like an object file or maybe I will call it an exe for right now like a binary um So you can think of this maybe is like compiled And this is assembled Right and so one way to think about this and this is where this term comes from So this is kind of the forward engineering process right as you make something you have a c file It's compiled to assembly and then assembled to an executable Part of what we're doing is kind of a reverse engineering process where now we have an executable We need to go to the assembly through a disassembler and then we need and then a Great thing to do would be to be able to go back and go here to a decompiler Um Yeah, sorry. This is uh and so as we'll see Disassemblers are Essentially trivial and easy. They're they're fairly easy. I'd say not trivial Um So we saw that with object dump, right? I can use uh object obj dump a dot out Oh, I need to say dash d disassemble everything and here trying to interpret everything as bytes So if I look here at this main function I can see okay, here's The code that the compiler generated that corresponded to my program. Remember this was the prints and the print devs and stuff Um So here I can learn and understand what this program is doing by using a disassembler to look at that So I'll talk about different types of things. Um, so there's actually a tons of Types of tools. 
I'm going to point you to some stuff right now. radare2 is a disassembly tool. It has all kinds of reversing and vulnerability analysis features, and it has really good scripting capabilities. If I remember correctly, it's more of a command-line tool, similar to gdb in style, but much better.

I will say IDA Pro is kind of the state-of-the-art tool for reversing. I actually don't think I even have it here. It supports disassembly of binary programs, but it also supports decompilation: it has a decompiler that tries to decompile assembly code to C code. The current Hex-Rays decompiler basically costs something like $10,000 for one license, usually for one type of assembly language. So if you wanted to support different things like x86, x86-64, and ARM, each of these costs roughly on the order of $10,000. So this kind of shows how special-purpose these tools are in some sense. But they do have a version that's available for free, so you can check that out. IDA Pro is currently considered the state of the art, or one of the best tools for doing this.

Hopper is one that I used to use. It's a disassembler that has a decompiler that's not great. It's roughly a hundred dollars.

Super cool, and kind of a new thing, is Ghidra. Ghidra is an open-source reverse engineering suite that the NSA actually released. So this is super cool; you can check it out. It's written in Java. It's relatively inexpensive, right? When you're talking about a $10,000 tool or a couple-thousand-dollar tool, zero dollars is a little bit less. So Ghidra is pretty cool. It can handle a lot of different architectures.

Let's see what I do have installed right now. What I'm going to do is copy that a.out file from the server locally. So scp is secure copy: from this server, this is the file, and I want it copied to dot, locally.
So this should copy it here. And then I think I have... there's another product called Binary Ninja that also has a free version to use. I guess I haven't run it in a while. So, just to show you roughly what these things look like, if I open a.out here:

What this has done is it shows me the start location. This is the standard libc stuff that happens at startup. Over here I have all the symbols it's pulled from my binary. I can go look at main. And where's my nice... I'm still getting used to this; I don't use it a ton. Oh, there is no graph. Okay, well.

But let's see the cool things here. One thing you'll notice is the syntax is different, right? The default is Intel syntax. A cool thing is I can double-click here and see, yeah, printf. You can kind of see it on the screen (I can't show it with my cursor), but this is showing me the data at that location, and we have the bytes 25 64 0a. So that's actually percent, d, newline, and then a null byte. This is what's getting passed into my printf.

Another cool thing is cross-references. This shows me everything that calls puts. So if I'm trying to debug a large program... I don't know how useful this will be, but here's something I was looking at for a CTF challenge. No, I want main. There we go. Okay, now you can see more of the power, at least here. Now what it's doing is showing me the graph. Each of these blocks is essentially what we call a basic block.
So it has no jumps in the middle of it. But here, for instance, yeah, here we go: here's something where it's running some code, it's getting something from the user, probably with fgets, and then it has two branches. You can see the two branches here: if this condition is false it goes down this branch, and if it's true it loops back to itself. So this gives you a high-level overview of the graph of this program and what it's doing.

And the super cool thing is, you know, here are functions. Oftentimes in the compilation process the names of functions, the names of variables, these types of things go away. But I can do things like name this one main loop, and then it will update that, so I can see all the references to main loop. I can see it's figured out that it takes different arguments. Maybe this is the fd, I don't know. So I can name different arguments and figure out what's going on. And you can see here, when it's able to determine that the thing being pointed to is a string, it actually shows it to you in the output. So that's just one example of this.

I will say, if you want to learn more about reversing, you should talk to Dr. Fish Wang. He is, no joke, probably one of the best reverse engineers ever. He's debugged bugs in the Windows kernel itself that caused slowdowns of his computer. So he is able to reverse engineer and understand the Windows kernel. I don't know if he covers it in any classes; you'd have to talk to him or ask him about that.

But yeah, cool. Okay, so objdump is kind of the standard thing. I will say you don't need any of these fancy tools, especially for these challenges. They're nice to have, but you can get pretty far with just objdumping a file and seeing what's going on. Cool. Okay, where are we? Cool.
So, as we saw, the reason why all these skills are important is, if you think about software or systems that you care about, oftentimes all you have is the executable, right? For something like, let's say, PowerPoint or Windows, all you have is the executable. So if we want to understand how it works, as far as reverse engineering goes, we need to be able to use these tools.

But that's not the only tool. Like we saw, debugging is actually incredibly useful to either understand what's going on or identify bugs in a program, so it's a really important part of reverse engineering. You can think of it like this: with disassembly, we look at the program statically to see what's happening; we can also debug a running program in order to understand it.

Like we saw with gdb, I didn't even go into half of its power. One of the super cool things is the ability to script it. For instance, if you're reverse engineering a binary and it's doing some kind of hard-coded password check, you can write a gdb script that, every time it hits a breakpoint, prints out a certain value in memory and then continues. This allows you to do really cool runtime behavior analysis of what's going on. gdb is amazing. I highly recommend learning about it and using it for debugging. Like we said, there are other extensions and stuff; we actually already saw that, so that's great.

Cool. That's actually a good place to stop, because when we pick up on Tuesday, we'll talk about general ways of identifying vulnerabilities, and then we'll start getting into actual vulnerabilities and how to exploit them.

I guess we do have a minute. Any questions, or are we good? We'll talk about finals as we get closer. Yeah, assignment six will be released.
The CTF we'll talk about on Tuesday. Yeah, the Gradescope for assignment six will go up later today, but you only need that to submit. Usernames and passwords will be given out later today, so keep an eye on Piazza. I'm not sure exactly what mechanism I'm going to use to send you the passwords, so we'll see. Yeah, we'll talk more about the CTF on Tuesday. No, it'll be in groups, but we'll get into more details on Tuesday. Cool. Yes, you can set up public key authentication for assignment six. That's highly recommended. Cool. All right, see you guys. Stay safe.