 Ah, one minute late. Ah, we're so close. Okay. You probably don't need that. Cool. Okay. Why is it, like, impossible to get to this thing? Live streaming. Connor, can you check if I'm streaming? I clicked buttons, right? It is live? Oh, hello people. Okay, I'm trying to find you. Where? Says it's offline. Stupid cache. Live now, there we go. I really gotta figure out a better way to do this. Okay. Cool. In the chat, online, everyone in the chat say something if you can hear me, so I know that the sound's working this time. Okay. Before we move on to the next module, where we're gonna be learning about assembly language and specifically the joys of x86-64, let's do, thank you, chat, let's do 10 minutes of questions on the Talking Web assignment. Okay, 10 minutes on the clock. You already asked me at four o'clock, so you don't get another one. No questions? Just move on, everyone's done? Yeah. They'll be posted after this. If you wanna talk, leave. Yeah. The next module will be on assembly. It will be released after this one. I don't wanna overlap things to add additional pressure to people. If you super wanna get jump started, it will be some of the modules from, where's it? In Fundamentals, Assembly Crash Course, it'll be some of these. I don't know exactly if it's gonna be all, and actually it probably will be all 23 of these, but I'm gonna go through and make sure they make sense and are good. You can definitely do them now if you want, just like the other stuff, but you won't be on the leaderboard, if, I guess, some of you actually care about that. The leaderboard doesn't start until the assignment's released, which won't be until right after the last one's done. Yes? We didn't have an assignment. What? There's an assignment, due Monday. I mean, it's due on Monday. It was assigned on Monday. Yeah, there was a whole class we had on Monday that I posted with the lectures that you're responsible for watching. Is it posted on? Correct. 
It's posted on the syllabus, where all the other lectures are. Web Basics, this one launched the thing and has everything about how to access it. This is the course website, the course syllabus. Are you in the right class? This module is released, it is ready. Yeah. How do you know if your ASU ID is connected to your account? Good question. On the Discord, your role will change. That should be running every 10 minutes. If it's not, there are two very important things to do. Check that in your settings, so settings, up and right, on Discord, this should have your Discord name as it is in the Discord. So join the pwn.college Discord. If you don't see this, you'll see something that says to link the accounts. So A, it has to be linked. It's a two-step process. The other step is you go into our class, you go into course, and this is the syllabus page, and then you go into identity, and then you can type in your ASU student ID here. Once those two things are in, roughly every 10 minutes, something gets checked and then gives you the role on Discord. Yeah, yes. Is there any requirements? Well, yes, because we use Discord for all the announcements. There are people taking this class that aren't at ASU, so we wanna know that A, you're in this course and you're a student in this course, and that only the people with the role in Discord are the ones mapped to this class who should have it. So that way it's not intermixed, and we as this class have our own way to talk about things. Other questions? No, we have a whole, let me show you. I hope nothing sensitive comes up, we'll see. Discord, you failed me. Oh, software. There's a section that says ASU 365 spring 2023, right? Fall 2023, it's not spring yet. And it won't even be spring, it'll be spring 24, huh? Okay, so yeah, in here, yeah, here it is. ASU CSC 365 fall 2023, so you should see this. 
You should also see, let me see, a great person here has this role here, ASU CSC 365 fall 2023. You'll get this nice, what is this, like, light peach color role looking thing. Yeah, so you have to follow the steps I just mentioned, right? To make sure that you can get it: making sure your student ID is in here, and then making sure Discord is linked. Those are two steps. Yeah. Is your student ID in here? Then you need to message me because maybe you joined the class late and you're not on our list of students or something weird has happened. Any questions on the assignment? You have four minutes left. Yeah. All assignments. Yeah. What's happening in four minutes? We're covering new content. Yes. About challenges, correct? Louder. We can ask about challenges, correct? Of course, yeah, sure. I'm having trouble with 26 and JSON. Yeah, yeah, yeah. Great, good question. Go to the dojo, Talking Web. Yeah, it's level 26 and 27. Ah, this is a hard one. Or not a hard one, this is just slightly tricky. So one thing, why is that thing flashing? Is the stream still working? I don't see it live on my stream. Oh yeah, okay, cool. Okay, so one good resource is json.org. This is the specification of JSON. It's just a data exchange format. How do you send data from one place to another? It's a very common data format on the web. This is why we included this in here, because if you do web stuff, you will see a lot more JSON. And it may be, what is it, like 230? Is that, no, 210, where you learn this, like, syntax of these graphs. No, this is the first time you've ever seen this in your life? It's not true. I didn't take it too fast. Yeah. I've seen it in Minecraft before. Okay, great. Awesome. I swear somebody told me that this was in one of your lower division classes. But anyways, this says that, hey, you define an object: it's a left curly brace, and then white space, and then, let's see, it can just be the closing curly brace. That's a JSON object. 
But if we go into the challenge, which is not there, if we go into the challenge, what do you guys want to see, the workspace or the desktop? Bellow. Workspace. Workspace, okay, I hate the workspace. Why do you choose the workspace? You can actually copy and paste it off, yeah, but nobody likes setting up the SSH, so that's why I gave you the two options. Okay, it's okay, okay, I'll deal with it, don't worry. I'm okay. All right, new terminal, so I can set up my terminal. What's the first thing I do here? Yeah, run the challenge to see what I'm supposed to do. Can y'all see this or do I need to make it really big? Better? Okay, so the challenge is telling me make an HTTP request to this IP address on port 80 to get the flag. The HTTP request must specify a content type HTTP header of application slash JSON. Must send an HTTP post with the body as a JSON object that has a pair with a name of A and a value of this. Cool, so looking back at JSON, or I would see what is so, I would look for maybe pair. And right here on the what is JSON? A collection of name value pairs. In other languages, this is an object record, it's actually very similar to a Python dictionary. You're already learning about dictionaries in there. And so, yeah, so an object is an unordered set of name value pairs, so we can go back, we can see it's telling us that we need to, a body as a JSON object, so if it's an object, how are we gonna know that it's an object? We just looked at that syntax. What does an object start with? Left curly brace, and it must end with a right curly brace, and then there will be any number of strings, white space, colon, and then a value, and so that's the unordered set of name value pairs, so it's telling us it wants an object with a pair that has the name of A and the value that it's given. So if we were just gonna write that out in like a scratch file, so what would that be? 
So I start with curly brace, I guess end with curly brace, actually I don't like you, stop doing this. File, new text file, great. Okay, oh, we're out of time, okay, we'll finish this one though, this is a good question. So what we say, okay, start with a curly brace, has to end with a curly brace. So what was the challenge asking us? An object, so we have an object with curly braces, great. It's a pair, what's the name of the pair? A, but let's check the syntax because this says string, colon, value. But what are strings, how do we specify strings? We'll go to here: a string is a sequence of zero or more characters wrapped in double quotes. So we look at the syntax here, what does a string have to start with? Double quotes. Can it start with a single quote? No. That's valid in Python and other languages, but it's not valid JSON, right? So we have to follow the specification, and we can use anything except for a double quote or a backslash is what it says here, otherwise we need to escape it with backslashes like we're used to. So we'll go back here, we'll start our object and say, aha, we can't do this, we need to put double quotes around it, we need to do a colon and then what? The value, and we didn't look at value. Value could literally be anything. All we're doing is syntax, right? Looking at this syntax here, we're not even trying to memorize anything, yeah. The people on the stream can't see the workspace. Oh, sounds like a problem. Should we fix it? All right, sorry folks. Must have been when I readjusted the sizes or something. Why is the size different now? It worked fine on Monday, this makes no sense. Okay, did anybody message about that? Okay, now it should be good. The recording is messed up, but I narrated, I guess, hopefully pretty well as we were going through this thing. Thank you for pointing that out, okay. So now we get to a value, what is a value? Yeah, it could be a string or a number or an object or an array. What's an object look like? 
Yeah, curly braces again, great. So we could do this nested as many times as we want to make arbitrarily complex objects. Can we have an object for the name of a name value pair? How would we be able to find out? So if it's name, colon value, what can we put for name here? String, it has to be a string, it's not a value, which means it can't be an integer or anything like that, it has to be a string. Okay, so let's look at what the challenge is asking of us and a value of this, is this an integer? No, that's kind of a trick question and maybe it's a hexadecimal number, but it's not like a normal integer. I guess we should specify that it's a string value, this is a good thing to be more specific. So we'd look at this, we'd say, okay, is it a string, is it a number, is it an object, is it an array? Well, it's definitely not an object, it's not an array, it's not the values true or false or null, so it's probably a string, so we add this value here. So what would you call this in JSON if you had to parse this as JSON? So object first, and what does that object have? Yeah, name value pairs, what's one, so it's unordered, which means there's no order of the name value pairs, but what's one, there's only one pair, what's its name? A, and what is its value? Yeah, the string containing this random thing. Does that match up with what the challenge is asking of us? Yeah, so now what's our goal here? So we have the payload, hopefully, we've looked at JSON, we've looked at the specification, we've derived from looking at the specification what this should be, so what type of, yeah, so what type of request do we need to make, so what's the method of our HTTP request? We'll use netcat, it's telling us we need to use netcat, but when we specify an HTTP request, we have a method of the request. 
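The object derived above (one name/value pair, name `a`, string value) can be checked mechanically. Here is a minimal sketch in Python; the value `abc123` is a made-up placeholder, since the real value is whatever the challenge prints.

```python
import json

# Hypothetical placeholder: the real value comes from the challenge output.
value = "abc123"

# A JSON object with a single name/value pair. json.dumps emits the
# double-quoted name and string value that the specification requires.
body = json.dumps({"a": value})
print(body)

# Parse it back to confirm it is an object whose pair "a" holds our string.
parsed = json.loads(body)
assert isinstance(parsed, dict) and parsed["a"] == value

# Single-quoted strings are fine in Python source but are not valid JSON.
single_quoted_rejected = False
try:
    json.loads("{'a': 'abc123'}")
except json.JSONDecodeError:
    single_quoted_rejected = True
print("single quotes rejected:", single_quoted_rejected)
```

Writing the body by hand in a scratch file, as in the lecture, works just as well; the parser is only a way to double-check the syntax against the specification.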
POST, yeah, we know because it's telling us, it's telling us we need to send an HTTP POST with the body as this JSON object, and it's also telling us that we need a content type HTTP header. Can HTTP header names have spaces in them? No, so how do I figure out what actual header to put in here? Google, content type HTTP header. We can look at the first one, ah, it's Content-Type, right? It was telling us in text, I guess, maybe it could be more precise, but life's full of impreciseness. So, okay, so this is looking like what the body needs to be. We're gonna make a, what did we say, what kind of request? A POST. It doesn't tell us where, so we can just do it to slash and see how that goes. What HTTP version do we need? Actually don't remember, is it like this? Yeah, yeah. Cool, then new line. So that's the start line, what comes after that? Headers. So we know we need the content type because the challenge is telling us this, and we just looked it up, that it is Content-Type. If you dig into the spec, you can find out, I think, that HTTP header names shouldn't care about capitalization, but I don't know if that's for certain, so we'll do exactly as the spec said, or what this documentation said: capital C content, dash, capital T type. And do you need the single quotes? This is a great thing we could look up at this point. I don't wanna do that, but we could look it up. If that fails, that will be one place we look. You see as I'm going through this, there's things that I'm like, yeah. Maybe that was gonna be a learning moment later when we tried this and it didn't work, but no, I just misspelled that, okay. Yeah, but as I'm doing this, I'm also thinking about, okay, what things do I know are probably correct? I'm pretty sure this body is correct because I checked that thing. I'm pretty sure the POST is correct because the challenge is telling me this. 
I'm pretty sure this is correct because I used it in all the other challenges. The single quotes around this are the one thing I'm not 100% certain about, so we may need to play with that later, yeah. Yeah, so that would be the other thing. I was gonna try it first and then fail, so let's just try this since we got a person here giving input. Why do I have a blank line between my headers and this JSON object? Yeah, because this is how HTTP specifies I have no more headers. I could have an arbitrary number of headers here. You've seen this. I can have Host, foobar. I can put, I don't know, X, whatever I want, blah. I can literally add as many headers as I want. The only way that the HTTP specification says that there are no more headers is there's a blank line at the end, right? And that's part of the specification we looked at on Monday. So start line: method, path, the version, always on the start line. Then an arbitrary number of headers, zero or more headers, a blank line, and then the body. So let's not do this. We'll do one blank line there. All right, let's try it. That looks fuzzy to you all, or is it just me? It looks a little fuzzy. Yeah, I don't know how to fix it, so it's not gonna get fixed, but I just wanna know if it was me. Okay, so netcat to local, nc 127.0.0.1 80, I'm gonna paste all this in. Yeah, I'm sure, right? I think I don't know how to paste. Okay. Ah, perfect. Okay, so it's responding to me that I sent a bad request. Remember we talked about HTTP status codes? 400 means I messed up making the request. And it tells me incorrect Content-Type value, 'application/json' should be application/json. So what does this tell us we should probably do? Yeah, remove the quotes around there. Cool, so we can try this again. In your terminals, you can hit up, and that takes you to the previous commands. You can keep hitting up to scroll through them. All right, okay, cool. So this is where we get kind of a weird error. 
So now we get a 500 error, 500 being the server blew up. In this case, I think the server was supposed to blow up, but let's look at this. So it says JSON, bunch of Python junk, JSON decoder, JSON decoder, expecting value at line one, column one. During handling of that, failed to decode a JSON object. So basically it's telling us that it couldn't get it. And if we go back and review the lectures where we talked about, hey, the thing to be very careful about when you're specifying POSTs, again, another thing like we talked about: the only way that the HTTP protocol says that you're done with headers is a blank line. Well, similarly, if I'm sending input, how does the server know that this doesn't go on forever, that I'm not sending just thousands and thousands of bytes? Right, because it's gonna get it in a certain order, so it needs to have some way of knowing how many bytes it gets. So that's when we need our nice friend, the Content-Length header. And we maybe say, oh yeah, I remember Adam said that in class or somebody mentioned it on the Discord. Rather than just blindly typing in content length, we can do another Google search, length HTTP header, what is the Content-Length header? Indicates the size of the message body in bytes sent to the recipient. So this means we need to specify how many bytes we're sending. So we need Content-Length. What should I put in here? One, why? There's more than one. How many more than one? Yeah, a lot. All these characters, you wanna calculate this by hand? You can. Let's see, another way we could do this is, so wc is the word count. It prints the newline, word, and byte counts. So 44 bytes. This may include the newline at the end, but let's not worry about that too much, unless that comes up here. All right, and then we get our flag. So we made the request correctly. 
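Putting the pieces together (start line, headers, blank line, body, and a Content-Length that matches the body), a request like the one typed into netcat can be built as follows. This is a sketch under assumptions, not the challenge's reference solution: the HTTP version, the `Host` header (added because HTTP/1.1 expects one), the path `/`, and the placeholder body are all illustrative choices, and `build_post` is a hypothetical helper name.

```python
def build_post(path: str, body: str) -> bytes:
    """Build a raw HTTP POST request carrying a JSON body."""
    body_bytes = body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"               # start line: method, path, version
        "Host: 127.0.0.1\r\n"                      # assumption: HTTP/1.1 expects a Host header
        "Content-Type: application/json\r\n"       # no quotes around the value
        f"Content-Length: {len(body_bytes)}\r\n"   # byte count of the body only
        "\r\n"                                     # blank line: no more headers
    )
    return head.encode("ascii") + body_bytes


# Placeholder body; the real value comes from the challenge output.
request = build_post("/", '{"a": "abc123"}')
print(request.decode("utf-8"))
```

Counting with `len()` on the encoded bytes avoids the trailing-newline question that comes up with `wc`, which counts the newline at the end of the input if there is one.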
Makes sense? And I'll delete this part of the recording so nobody can watch this video. Just kidding. Sorry, that was mean. All right, now we're on to lower levels. So as a little preview of where we're going, you've started out with learning about how to talk to a web server. We're now gonna go super far down, almost as far as you could go in a computer, at a very low level, at least in some sense on the programming level of how far down you can go. Does this thing follow me? No. And then we're gonna use that knowledge to actually build a web server. You'll be building a web server that parses HTTP requests, just like you're learning how to make now. So it's gonna be super cool. You'll learn and apply your assembly knowledge to build programs and build the little web server using only assembly. Then we're gonna go from there and use that knowledge to learn about buffer overflows, memory corruption and all kinds of cool stuff. So you're gonna be experts in all of this. Slide show, let's see if this works. Okay, I'm not Yan in case you haven't noticed. But Yan is letting me use his slides. I think the last slides on this that I have are unfortunately out of date. I think Yan would be embarrassed if I used them. So it'd be better to use his. So first, how many of you have been exposed to assembly language before? Hey, very good. Yeah. And what language was that? MIPS. Yeah, MIPS is the best. I know you were talking about this the first week, but anyways, I don't really have any preference. I don't care. But just as a refresher, if you don't have any experience of why we need this: right at the bottom of everything are just these gates, these little logic gates, and these control our machines. 
And when you actually type in your Python program and make a web request, it's literally these gates, like electrical signals going through these gates, that cause things to happen all the way from your computer to the Wi-Fi router to all the switches along the way until finally the server at the end. And so it's important to remember that, at least when you think about one program, one machine, even if you're using a language like Python, which isn't directly executed, fundamentally something needs to execute on your computer's CPU in order for something to happen. So for instance, how many of you have written a C program? Okay, good. I wanted to make sure that it was a sea of hands. So you've written a C program. Can you tell the CPU to execute your C program? Have you tried? Why not? Seems like that'd be nice. Just execute it, boom. Here's the program, it's code. What was that? It wouldn't understand it. It wouldn't understand it, right? Because your CPU doesn't speak C. It doesn't speak C++, it doesn't speak Rust, it doesn't speak Go, it doesn't speak whatever crazy language you're using that compiles down. It speaks a very specific language that they make very fast. And with tools like a compiler, the compiler translates your high level C instructions into something that the CPU can actually execute. This is why, I used to teach 340, it was one of my favorite classes, because it's a beautiful mix of the theory, there's theory of how to parse things, how to understand things at that level, but it's applied in something you literally use every day in this compiler. But we don't do all of our programming in compiled languages. We also do other things in Python. You're using Python, maybe some of you for the first time. Hopefully it's not too hard. Once you get to know the basics here, everything will flow pretty easily from there. When you're executing Python, do you compile Python? 
How do you run a Python program? Do you do it? You sit there line by line like, ah, print hello world, you just say hello world, you keep going? You use a program called Python, right? You do python and the Python file. Otherwise there's other tricks, you can put a shebang at the start of your Python file, the hash and then the exclamation point with usually /usr/bin/python, and that's telling the system what program you want to interpret it with, but the point is you're using another program. Most Python implementations that you use are written in what? Yeah, it's written in C. I actually don't think it's C++, I think it's mostly C. It's literally called CPython, so I guess that was a good guess on my part. But the Python interpreter is a C program that's compiled to a binary, and then it interprets your Python script, and it's what's actually executing on the CPU, carrying out all of your Python instructions. Same thing for JavaScript, same thing for Java, all of that stuff. And so coding is hard enough without having to think about all the detailed logic gates. Did you guys take any logic gates style classes? Did you build anything with a breadboard to make things blink and stuff? Yeah, that's the extent of my knowledge here. If you're a computer engineer, I think that stuff is very cool, but I just want hardware that works so I don't really care about this stuff, but there is a lot of fascinating stuff in here. And so what we're thinking about is at this high level, and it's all about abstraction. So specifically here, we're talking about how do we think about accessing the resources on a computer? So you have your CPU. Hopefully you've seen some kind of diagram. It's like a little jibby guy, right? Yep. Your screen is, like, messed up on the stream. Is it the sharing of this thing? I don't know what's causing it, but the right half of the slide. This is Google Slides. This is definitely a Google Slides problem. You know what? They tell you, use Google Slides. 
They say PowerPoint is for old men, but then you use it and it sucks. Oh, it's still broken. Even though I'm doing all of this talking, I'm pontificating. I'm gonna be cut off. I mean, you can see that it's cut off. No, no, okay. Thank you. The question is why? Why is it cut off? Where did it go? Okay. Well, this is insane. You know, all this technology may not be worth it, everyone. All right, can we just do this on the fly? Can I delete it and re-add it? Yeah, if you've learned anything from computers, it's that just restarting it or unplugging it and plugging it back in actually works. This is absolutely absurd. Okay, cool, cool. Everything makes, oh wait, is the camera? Ah! Yeah, how do I, like, Z? Under the sources, you move the source. Everything's fine. We're jamming. Okay, let's watch the thing. Ta-da! All right, thank you for that. Okay, incredibly frustrating. Oh, because I couldn't see the CPU and the CPU. I must have thought it was all for the right thing. Okay, so we have the CPU, a little piece of silicon that implements all these logic gates and stuff that eventually gives rise to computation. But of course, if you just had a CPU, is there any data on your CPU? What does it do computation on? It uses the logic gates to perform computations, but what does it do computation on? Memory, but that's not on the CPU. Yeah. Registers, which are just little other logic gates. I guess logic gates was correct, because it's all logic gates, but a register is just, like, special flip-flops and logic gates in order to store values. So there is some storage directly on the CPU, right? All the little components there have to work directly on registers, but it's very limited, as we'll see. And so of course, we want to write programs that use gigabytes of memory, like Slack or Discord, that just eat memory. The more you give it, the more it will take, right? But we don't want that to live on the CPU. 
If we did that and made that, that would be super expensive. Yeah. Okay. No. So we may want memory, external memory. We may want our data, our data may be too big to live in memory. We may have terabytes of data. We want to put on disks and we want to talk to other people. So our CPU has a way of doing that. There is, like I said, drilling down a little bit more. We have our registers, which is where we can basically take data. You can think of taking data from memory or disk onto our CPU and then like performing computations on it and then putting it back maybe. And this stuff is actually very complicated. If you, the further down you go, there's all types of actually a lot of the CPU core, like the surface area or layout of the CPU is spent on caching to make things faster. So when you go get one piece of information from memory, the CPU manufacturers know, hey, if you're getting one thing from memory, oftentimes you're going to go through an entire array of things. And so rather than just bring that one bit of memory in, I'm going to bring that bit of memory in and also bring like 4,000 bytes around it. Well, that's all right, but anyways, like a page or whatever of memory around it, put it in the cache so that you can access it. The rest of those accesses are already on the CPU and you don't have to go all the way back to memory. Anyways, take architecture courses to learn more about this stuff because it's crazy. And your memory may actually, your computer may tell your operating system, I should say tells your programs, yeah, I have tons of memory, don't worry about it. Like you can actually write a program now that uses, I don't know, 100 gigs of memory and your computer actually won't blow up. It will allocate whatever memory it can and then push stuff down to the disk and swap that out to disk where it's insanely slow. 
And anyways, it gets even more complicated when you break down caches. If you've purchased CPUs, you've probably looked and seen the size of the L1 and L2 caches. Also, as you can see here, modern CPUs have multiple cores, so you're gonna have multiple things that execute concurrently on your CPU. And they kind of, as you can see in this example, share resources, like the L2 cache is shared between these two cores, which are the solid black lines. So each core has its own ALU and control unit. Okay, any questions on high level computer architecture at this level? An interesting thing to note is, well, Von Neumann is, I guess, the designer of this architecture. We name this type of architecture a Von Neumann machine. Wait, is that right? Yeah, yeah. Anyways, not important except for historical reasons, but you can go look up these people. Cool, so that's the architecture. Now, we need some way to control that. So that's what assembly boils down to. It's trying to solve the problem of, hey, I have these registers, I can do computations, but how do I actually specify that? How is a programmer able to talk to the CPU? So this is that same diagram we just used, but looking more into the CPU. And what computers are very good at, and we'll look at it a bit more later, is ones and zeros, right, binary. Like again, we don't wanna write C programs in binary, ones and zeros, because that would be insanely annoying. But programs like compilers take our languages like C or C++ and compile them down to ones and zeros that the CPU can actually execute. And so, oh, there we go. So I don't know, does anybody, like, enjoy it? Have you done, I hope, the exercises of translating binary to decimal and other types of things and doing the powers of two and stuff? Yeah, please more, yeah. So the idea is, hey, the CPUs only execute binary, ones and zeros, but as a human, I don't wanna write ones and zeros. 
I guess I would do it if I had to or somebody was paying me a lot of money, but fundamentally that's not my idea of a fun time. So what they created was a textual representation of the binary code that the CPU interprets and understands. And this is assembly. So assembly is literally the contract for how to program your CPU. And in fact, this is actually even a lie. Everything I've just told you up until now is a lie, because your CPU does not actually execute those binary instructions. It actually translates them itself down into microcode instructions. So you may give it one instruction to add two registers together, and it may under the hood translate that down to several micro instructions that all happen. But what it does is it guarantees that you can't tell. So that's what the assembly language is, and what we're doing at this level here is the contract that the hardware people give the software people that says, okay, you tell me to take whatever's in register RAX, add it to what's in RBX and put it in RBX. I will make sure that happens. So by the time you next read RBX, that value is updated. And you don't have to worry about how I do that, it's just done. This is why processors and stuff are insane. So, oh, yeah. The other important thing to remember when you're studying and looking at any assembly language is this: there's a one-to-one mapping with the binary, here in the upper right. I actually don't know. I'm gonna assume this is correct from Yan, this zero, one, zero, one, zero. This instruction in this binary means that this is a push instruction, and the register that it's referring to is RBP, this one, zero, one. If you change any of these bits, it may be a different thing. With x86, if you change any of these bits, the instruction may be longer or shorter, that's not important, but the important thing is this is a one-to-one mapping. 
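The one-to-one mapping for the slide's example can be sketched as a tiny assembler and disassembler for a single instruction. On x86-64, `push` of one of the eight original 64-bit registers is a single byte whose top five bits are `01010` and whose low three bits are the register number, so `push rbp` is `01010101` (0x55). The function names below are made up for illustration, and the sketch deliberately ignores the registers `r8` through `r15`, which need an extra prefix byte.

```python
# Register numbers used in the low three bits of the one-byte push encoding.
REG_NUM = {
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
    "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
}
NUM_REG = {num: name for name, num in REG_NUM.items()}

PUSH_PREFIX = 0b01010  # top five bits of the opcode byte


def assemble_push(reg: str) -> int:
    """Text to binary: 'rbp' -> 0b01010101."""
    return (PUSH_PREFIX << 3) | REG_NUM[reg]


def disassemble_push(byte: int) -> str:
    """Binary to text: 0b01010101 -> 'push rbp'."""
    if byte >> 3 != PUSH_PREFIX:
        raise ValueError("not a one-byte push instruction")
    return "push " + NUM_REG[byte & 0b111]


encoded = assemble_push("rbp")
print(format(encoded, "08b"), hex(encoded))
print(disassemble_push(encoded))
```

Changing the low three bits selects a different register, and changing the upper bits yields a different instruction entirely, which is the sense in which the text and the bits correspond one to one.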
So when you're writing assembly language, an assembler can easily take this and translate it directly to ones and zeros that the computer can execute. I'm not getting any of these chats. Okay, so yeah, so the idea is assembly, it's assembled into binary. Assemblers are actually very easy to write. It's nothing incredibly complicated there. Here's a very cool picture of Kathleen Booth, who invented this concept of assembly in the late 1940s, early 1950s. And it was a big invention, as we'll see, because you used to have to actually write stuff out in, like, hexadecimal. So having a program, an assembler, that would take your human input and translate it into binary code was super important. So, all of this is, I'll log in here, okay. What's this person asking about on here about GitHub videos? Okay, is that just a random question? Cool, okay. So assembly tells the CPU what to do. This is all it's doing, right, it's gonna be things like: move some bytes from memory into this register, add this register to that register, if this condition is true, then change the execution to this location. So, essentially you can think of these as sentences, and this is kind of in my head how I approach these things. And what we've covered up to here, when we think of any assembly language, is not specific to x86-64. This can also be applied to MIPS or any other language that you're thinking about. You can also have weird languages, and weird languages are something I am a fan of. Like you can have architectures that don't have, what's the word I'm looking for, registers. So you can use stack machines and all kinds of weird stuff, but this is, like, the standard of most systems now. So anyways, an instruction in assembly you can think of as a sentence. And then the question is, what should this instruction do, right? Just like a sentence has a verb. I've forgotten all of these terms, right? 
I actually don't think in terms of verbs and nouns anymore, but think back to your days of, I don't know, elementary school or whenever you learned English, right? You were learning about these grammatical concepts. So what do we want it to do? That'll be the operation of the instruction. The noun is gonna be the operand. Like, what do we want that done to? There may be more than one operand, or there may be zero. It may be implied what it is. So, cool. This is what you should understand. So I guess, before I get there, is anybody scared of assembly? Why? Yeah. Because I had to take a, I don't guess you tracked out most of the specifically hard rules. Mm, yes, it can be hard. What's hard, the syntax? Oh yeah, go ahead. Cool. Consider this therapy for any of your past trauma with assembly. You will learn to love it. It is great. I actually love programming in assembly because it's very simple. I don't code giant things in assembly. I'm not an insane person, but whenever I have to do it, either for security or for writing some cool assembly code or something, it is super nice to just not have to worry about any of these high-level primitives. So, cool. It is fundamentally simple, right? It just boils down to these sentences. There's an operation, there's an operand, and things happen when those instructions are executed. But again, it's very simple. You're taking a computer science course, which is probably the best way to describe all of you. You have to think logically, right? And for me, this is one of the lowest levels of thinking about logic. Like, aha, well, this adds that to that. There may be nuances that come up that we will discover and learn about together, but fundamentally, all this stuff is documented. It's all knowable and learnable. You can learn this stuff. Nothing is magic.
This is where I may quibble slightly with the slide that says it's the simplest programming language. Lisp, I think, is the simplest programming language, especially in terms of syntax, but I think that's, like, a Yan-and-I disagreement here. But yeah, you can definitely become an expert in assembly in a week. And many of you already have experience in assembly, so I have the utmost faith that you will be excellent. So what things do we care about? Right, so we're talking about nouns. So we have, for instructions... Oh yeah, great point that somebody mentioned in the chat: the game RollerCoaster Tycoon, if anybody's ever played that game, was written purely in assembly, which is insane if you've seen that game. It's incredibly complicated. It was one person that did it, which is very impressive. And you can do it. You just have to be very careful as you're coding things. Just like, you know, C and pointers can be hard, but once you get the hang of them, you can master that too and move forward. So anyways, the basic way to think about it, and we'll see this, is that the CPU is concerned first with data that we directly give it as part of the instruction. So you can think of it like the instruction itself literally says, move the value one into some register. That one is literally part of the instruction. In this analogy, that's cash in your hand. Whereas data that's in a register on the CPU is not in the instruction. For example, move the value from RAX into RBX. You're taking whatever's in RAX and putting it into RBX, rather than taking the constant value one from the instruction. And so that would be like money in a cash register. You still have to open the machine, but it's right in front of you. It's very easy to do. Versus when we get to storage, and here storage is a little bit of a misnomer in some sense, because really the only thing the CPU can talk to is memory. So the CPU can say, okay, move this value from this register into memory.
And that's it. That's all the CPU has to do. These are the only three ways that data can ever enter the CPU. Does this jibe with the diagram that I showed you earlier? Yeah. I don't know why, but I imagine caches as, like, a bunch of buckets. Just think about cache like magic at this level. Just ignore caches. Caches are great. You want more of them. It's only when you get to the super advanced stuff, like, do they do that in 466, or is it this year? Yeah. So they do speculative execution attacks in 466, so there you do need to understand them. Then it stops being magic and starts being something you have to understand. But for our purposes, you can completely ignore caches. Yeah. You can think of it just like a magic genie: you're like, ah, I'd like that thing, and it just magically appears in your hand. And then you're like, cool, and you just use it. Okay. But what about disk? Is disk one of these three things that I talked about? I had it in that diagram. We can stand here for a while. It is not one of those three things. See, one person was listening. But why was it on that slide earlier? Yeah, it is slightly more complicated than all this, and I don't want to get into all the details, but for most programs that we write, you, as a user-space program, do not ever talk to the hard disk. You don't get to put things on the hard disk. You don't get to talk to storage. You actually don't even get to talk to the network. All you do is talk to the operating system, and the operating system manages that on your behalf. And how that's done, how that data gets there, could actually be just through memory or other ways. So yeah, for the most part, for right now, you don't ever talk to hardware directly. That's why these are the only three things you need to think about. And this is again important for simplifying the problem down when you're conceptualizing: okay, how do I write this assembly program? How do I do this?
There are only three kinds of data you ever need to think about: data that's part of the instruction, data in a register, or data in memory. Cool. And then we want it to do stuff, right? If computers just held data, I guess they'd be more like a library in that sense, but we don't want it to just store data. We want it to do stuff with that data. We probably wouldn't have jobs if computers couldn't do stuff and just held stuff, I don't know. So what do we want it to do, right? Well, they don't do magic, right? Computers do things that computers are good at. It's a hardware circuit; it does number stuff. So the things that we've determined computers are very good and very fast at are adding things together. When we say data, that could be any of those three types of data, right? Adding some data together, subtracting data, hey, basic math stuff, multiplying some data, dividing some data, or moving data from one location to another. And that's kind of all you need for computation, right? Take data from here, add it, boom, put it back. It's actually not that complicated at the end of the day. But we don't just want that. We also want the ability to say, hey, if this condition holds, then do something else, right? That's the other part of programming that we all know and love: not just moving data around, but actually changing what executes based on conditions, usually based on the data. So we may want to compare two pieces of data with each other and say, if this piece of data is greater than that piece of data, then jump to here. Or we may want to test some other properties of the data: is this piece of data a negative value? Is it a positive value? Yeah, and the bolded ones here are literally the mnemonics. Is that how you pronounce it? Mnemonics? Mnemonics. The mnemonics for each of these x86-64 instructions. Literally add, sub, mul, div, mov, cmp, and test.
And if you can master those, I don't know that there's a lot more you need to do. Like, oh well, there's some other stuff, push, pop, but those are the basics. You get this, everything's easy. Cool. Like we mentioned, the specific dialect and language the CPU speaks is different. I was just talking with somebody before class who was thinking about upgrading, getting a new computer. They were asking, since I use the M1 Mac, or the Apple Silicon Mac, what I thought about that, and I said, honestly, it's a great machine. The problem is it's an ARM machine, so it only speaks ARM, and a lot of the stuff I do, like CTF challenges, is x86 binaries, and it's a pain in the butt to run an x86 Linux virtual machine on my Mac that has everything that I want. So I actually have to have remote servers that I can SSH into and do analysis there. I think Connor actually uses the Dojo instead of using his Mac for this purpose. But there's all kinds of stuff. ARM, x86, PowerPC. This is probably well before your time, but Macs used to be on PowerPC, and then they moved to Intel, and that was a major headache of a switch. Those Intel chips were x86, and now they've moved from x86 to Apple Silicon, which is based on ARM. And so that's a big pain in the butt. MIPS, which you all learned and love, is what I'm hearing. RISC-V, PDP-11, all kinds of stuff. And when we talk about its simplicity, these are essentially, I actually don't know which ones are three operands, but one, two, three, four, five. There are only five examples here of the syntax of an assembly instruction. Just an operation; an operation and one operand; why is that repeated? Just messed up my counting, right? That seems right, right? Yeah, you all agree. Okay: one operand, two operands, or three operands. Done, that's it.
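The "operation plus operands" sentence shape and the three places data can live can be sketched as a toy interpreter. This is a hypothetical Python model, not how a real CPU or any real tool works: the `ToyCPU` class, the `("imm" | "reg" | "mem", value)` operand tuples, and the way `cmp` just remembers a difference instead of setting flags are all simplifying assumptions. The destination comes first, as in Intel syntax.

```python
MASK64 = (1 << 64) - 1  # registers and memory cells hold 64 bits here


class ToyCPU:
    def __init__(self):
        self.registers = {"rax": 0, "rbx": 0, "rcx": 0}
        self.memory = {}        # address -> value (simplified: one cell per address)
        self.last_compare = 0   # stand-in for the real flags register

    def read(self, operand):
        kind, value = operand
        if kind == "imm":       # data that is part of the instruction
            return value & MASK64
        if kind == "reg":       # data in a register
            return self.registers[value]
        if kind == "mem":       # data in memory
            return self.memory.get(value, 0)
        raise ValueError(f"unknown operand kind: {kind}")

    def write(self, operand, value):
        kind, where = operand
        if kind == "reg":
            self.registers[where] = value & MASK64
        elif kind == "mem":
            self.memory[where] = value & MASK64
        else:
            raise ValueError("cannot write to an immediate")

    def execute(self, operation, dst, src):
        if operation == "mov":
            self.write(dst, self.read(src))
        elif operation == "add":
            self.write(dst, self.read(dst) + self.read(src))
        elif operation == "sub":
            self.write(dst, self.read(dst) - self.read(src))
        elif operation == "cmp":
            self.last_compare = self.read(dst) - self.read(src)
        else:
            raise ValueError(f"unknown operation: {operation}")


cpu = ToyCPU()
cpu.execute("mov", ("reg", "rax"), ("imm", 1))       # mov rax, 1
cpu.execute("mov", ("reg", "rbx"), ("reg", "rax"))   # mov rbx, rax
cpu.execute("add", ("reg", "rbx"), ("reg", "rax"))   # add rbx, rax
cpu.execute("mov", ("mem", 0x1000), ("reg", "rbx"))  # mov [0x1000], rbx
cpu.execute("cmp", ("reg", "rbx"), ("reg", "rax"))   # cmp rbx, rax
print(cpu.registers["rbx"], cpu.memory[0x1000], cpu.last_compare > 0)  # 2 2 True
```

A conditional jump would then look at `last_compare` to decide which instruction runs next; the sketch leaves that out to stay short.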
And so the history here is crazy. I kind of like to think of these things from almost a historical perspective, like, why are we still talking about x86? Well, it really started with, I mean, the first one was the 8085 CPU. I wish the release dates were here. Connor, could you look up the release dates of the 8085 and the 8086? I roughly know, but I don't wanna guess. That's where the 86 comes from, this 8086 machine. And then what Intel did was make sure that the instruction set was backwards compatible across all these things. 76 and 78. 76 and 78, wow. I don't know, I would have guessed the 80s based on the number of eights in those names, but what do I know? Okay, yeah, so back in the 70s. So rather than redesign the architecture every time, or have it be that a binary compiled for one version wouldn't run on the other, which would be a massive pain in the butt, Intel put a lot of effort into making sure these things were backwards compatible. What this means is there is an insane number of x86 instructions. Like, I am not familiar with every single one of them. They are all documented, they are learnable, but there are a lot of them. And part of that is because of this legacy baggage. They have to support it. If they ever decided, from 76 to now, to support a specific instruction, they basically have to keep supporting it in order to be backwards compatible. Okay, a brief, I'm not even gonna rant, but a brief statement about syntax. You would think that with something this simple, operation operand, operation operand operand, there'd be no dispute about what you mean. But take this one here, operation operand operand. So in x86, when you move something from one location, or not move, but add. Add and move are both good examples. So when you add, and you're gonna add, did I say that a different way? Starting over.
When you're moving data from one operand to another, which is the source and which is the destination? Do you move it from the left to the right, or from the right to the left? Yeah. So the important thing is that it actually doesn't matter, but you'd better specify which it is. And that's one of the super annoying things about these syntax wars: they have different ways. In Intel syntax, you move the thing on the right into the thing on the left. Which to me is not quite as intuitive, but it does have benefits, and honestly, almost everything uses that. So we're gonna use that. For some reason, AT&T decided to create a completely different syntax that flips that fundamental distinction, which now means you have to know which syntax you're using when you're looking at assembly code, whether it's AT&T or Intel. Please, for the love of everything, just use Intel syntax, and everything will be fine in this class. You can easily tell, because AT&T has percent signs in front of the registers. So if somebody's showing you code and you see percent RAX instead of just RAX, that means it's AT&T. So you can very nicely suggest that they reformat it so that you can easily understand it and the syntax is all the same. I guess this is a good point: Intel made the architecture, so they should define the syntax for how we look at things. Questions on that? Anybody feel very strongly the other way and want to argue? I don't actually really care. Cool. All right. Now that we've looked at that, we will look at data. This is where I think more of the complications can come in, and this is where attention to detail with assembly is incredibly important. When you're coding in C or Python or C++, you don't often need to think about how the CPU stores your information and in which order the bytes go. But that's what we need to think about. So, okay, we talked about binary a little bit, but digging in, right? Binary is a base two number system.
Every digit is a bit. The digit's gonna be which numbers? Zero or one or two? No, it can't be two, because that doesn't make sense. Binary, only two digits: zero, one. If you actually look in the Brickyard, there's a door on one of the elevators that has a two in one of the binary numbers. It's just there to mess you up. I stared at that thing for weeks until I figured out what was wrong with it. It's a two in one of the binary streams. It's really trippy. I don't know who put that there, but it is definitely there if you look for it. So, pa-pa-pa, cool. So again, this is something you learn as computer scientists. The way we represent numbers doesn't really matter, right? Why does assembly use base two? I have more than two fingers. I have 10, actually. Why not use a better one? Yeah, louder. Yeah, so it all goes down to logic gates, right? The logic gates all deal with ones and zeros, on and off. So it's a very natural translation of that into representing numbers. There are examples of weird ternary computers where you have three states, and for those, you do base three. So the numbers would go zero, one, two, one-zero, one-one, one-two, and so on. Cool. But I don't know. It's impossible to use binary. I'm very confident in saying that. If you love looking at binary numbers and talking about numbers in binary, I think that's great, and you may be a computer, but that's fine. So we actually have better ways of representing this, and it's usually not decimal. Decimal numbers usually don't align well with the round numbers in binary. So for instance, I don't even know how many digits this is, one, zero, one, two, three, whatever this is, 128 in decimal. So changing one bit, right, of this number to get this number takes you from 128 to 192 or 224, but that's hard to see. Like, I know 128 is a number with just a one and all zeros, because it's a power of two, but the other numbers, I don't know if they have weird binary representations.
That is why we use things like octal and hex. So we actually use different number bases. I guess it's kind of insane. It's like, we're stuck with this binary thing, but binary's hard, and decimal doesn't work really well, so let's invent new bases in order to look at things. Everybody familiar with hex and octal? Cool. So octal is base eight. If you're not familiar with it, it's usually prefixed with a zero, which is the stupidest thing anyone's ever done. It's incredibly easy to mistake one of these octal numbers for a binary or a decimal number, but you can see these all start with zero, and you can see it goes up to seven and then rolls over. So each digit there is zero through seven. The one that I am definitely more comfortable with, and that comes up way more, is hexadecimal. So this is base 16. Digits go from zero through nine, and then we hit a problem, because, wait, our number system doesn't have a way to represent 10 as a single digit. So we add letters to it, like, I guess, weirdos or something, I don't know. So we have A, B, C, D, E, all the way up to F. Why is F not there? It's insane. It's a very important letter. Okay. So yeah, we can see up here that zero through 15 are all one digit: zero, one, two, three, four, five, six, seven, eight, nine, A, B, C, D, E, F, and then it rolls around to one-zero. And these map much more nicely onto binary digits, since each hex digit is exactly four bits. So oftentimes in examples of code we post in the slides, you'll see hexadecimal values where that makes sense. Sometimes decimal values make sense. So you will need to be familiar with both. Find a good calculator; the calculator app on the Mac has a programmer mode where you can switch easily between the bases, which is what I use a lot. Use whatever tool is good for translating between them as you need to.
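As one possible "calculator" for switching bases, here is a small Python sketch that prints the 128, 192, and 224 example in binary, octal, and hex. The helper name `show_bases` is made up for illustration. Note that Python writes octal with a `0o` prefix, while C uses the bare leading zero complained about above.

```python
def show_bases(value: int) -> str:
    """Format one number in decimal, binary, octal, and hexadecimal."""
    return f"{value:3d}  0b{value:08b}  0o{value:03o}  0x{value:02x}"


for number in (128, 192, 224):
    print(show_bases(number))
# 128  0b10000000  0o200  0x80
# 192  0b11000000  0o300  0xc0
# 224  0b11100000  0o340  0xe0

# Going the other way: parse a string of digits in a given base.
print(int("11000000", 2))  # 192
print(int("010", 8))       # 8, the C-style leading zero read as octal
print(int("ff", 16))       # 255
```

In the hex column, each digit lines up with exactly four bits of the binary column, which is why setting one more bit is easy to spot in hex and hard to spot in decimal.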
Like, memory addresses are often hexadecimal, because if you looked at them as binary values, they would look insane, and it's hard to tell the difference between them. But with two hexadecimal memory addresses it's very easy to see if they're 16 bytes apart, because it'll be a one-zero difference. So you can eventually figure that out. I do know people who can do hexadecimal math in their head. I'm not one of those people. I can do, I guess, very basic stuff, like plus or minus 16 or 32 or whatever, but for anything fancy, I'm using a calculator, because I don't want to mess it up. Okay, we talked about this even on Monday, but this is a nice refresher, right? So, how do I express text? Again, in the computer, you just have the ones and zeros, and we've tricked the computer into representing these as numbers by grouping them together. But a lot of the things we do are not necessarily on numbers, but on text, right? We want to see words, results, whatever. And, like, our programs are written in text. Everything comes from ASCII, which encoded the English alphabet in seven bits. There are helpful things here if you want them, but this is the ASCII table on the right, a super condensed version if you need to find something. So someone was asking about space. So two, wait, where's zero? Yeah, it's 20, but, oh, two zero like this. Here we go, three zero, four zero. Oh, that's a really backwards way to do it, but I guess if that's what you like. Okay, and then ASCII evolved into UTF-8, which you can go look into. It's, I don't know, much more complicated than one byte is one letter, because we eventually realized, like, oh, there are other languages than English, so we should really be able to represent their text too. So that's all set up there. But now, so we understand we can group bits together.
We can take bits, a binary number, and interpret that as a decimal number or as a hexadecimal number, but we don't have unlimited bits on our CPU. The CPU can't just fundamentally make numbers as big as it wants. Why? Yeah, but why? The response was that it's restricted by the size of the registers, which is correct, but why? Buses, yeah, that's still correct, but deeper. Yeah. Louder? Overflow? Ah, it's related, yeah. Yeah, there are only so many gates that can fit on there, right? If you wanted to double the size of every register, you'd need double the space or double the transistors, and then all of a sudden your chip is gonna be big, and then you have to deal with heat, and all these kinds of crazy physical constraints that I don't ever wanna deal with, and I'm glad that there are people who do. So yeah, fundamentally, those are all correct, but at the end of the day, you need a standard, and your CPU can't just support, technically maybe it could, but anyways, CPUs don't support arbitrary-size calculations. If you want to add, like, I don't know, 100 gajillion to 300 gajillion, you can do that, but the CPU won't do it by itself. So we need a way to think about that, and the way we typically do that is by grouping bits together into a higher-level representation: bytes. So, all right, this is a bit of a weird historical nugget, and there's really no fundamental reason for it. You can find crazy machines that Yan has played with that have weird byte sizes, where every byte is six bits, seven bits, eight bits, nine bits, 12, 16, whatever, but the eight-bit byte stuck. So almost every modern architecture uses eight-bit bytes. So from here on out, when we talk about bytes, we're talking about eight bits, and that lets us group and reference bits. Usually, depending on the architecture, what this affects is how you address things. So when you reference something in memory, what are you referencing?
Oftentimes you can't say, I want this specific bit in memory; you specify a byte. So when you say, transfer me one byte from memory into a register, that will transfer all eight bits. You can't say, just give me that one bit. It'll always give you those eight bits. Or yeah, there are eight bits in a byte. Cool, so more grouping, because the original CPUs only dealt with bytes. So you only got, this is why anybody really good at the original Pac-Man, or probably Ms. Pac-Man, I guess I don't know. I think, is that one of the ones that goes forever, and after a certain point you loop around on the level number, or the score counter? But maybe that's 99, so maybe it's binary-coded decimal, I don't know, but yeah. Talking about the bit glitch. I was thinking, I know there's some arcade game that used a byte to represent the level, so once you get past level 256, or 255, it loops back to level zero, and you just keep going. 256 on Pac-Man, yeah. So yeah, because they, I don't know, didn't really care about that, and the registers were only a byte. I don't know if that's true for that time, but anyways. So now, when we talk about the size of the architecture, the key difference, like we talked about a little bit, between 32-bit architectures and 64-bit architectures is fundamentally the size of the registers. There's also some other weirdness, but I like to think of it as, how much data can you store in one register? So how much can you compute on? It can also determine how much memory you can address, but for now, we can think of it like that. So these are actually important terms, and it will be helpful going forward to learn these terms and use them correctly. One of my favorite words is a nibble, which is half a byte. Why? It seems counterintuitive. I said you can't really reference anything smaller than a byte.
So who cares how to talk about those four bits within a byte? But it turns out that for some security things, you may know some of those eight bits for certain. You may know a nibble's worth, so you may know four bits of information, but the upper four bits may be unknown, and you may have to brute force them. But brute forcing that is way better than brute forcing the whole byte, which is 256 things to try versus, for four bits, 16. Did I just do that off the top of my head? Can't tell if I can trust your nods yet, but until somebody tells me it's wrong, I'll say it's right. Cool, so a nibble is half a byte, so four bits. A byte is one byte, eight bits; that's super annoying. I guess I haven't seen this too much, but going all the way to the end, a quad word is eight bytes, or 64 bits. This is the normal size of computation on 64-bit architectures, so the register RAX is a quad word, which means it's eight bytes. A double word is four bytes, so half of that, a 32-bit number. And then a half word, or a word, that's really annoying, is 16 bits. That's stupid. Okay, I'm sure that won't come up at all. Okay, so like we said, how to express numbers. Again, this is another key thing that I love about learning assembly, because it rips away the notions that high-level languages give us of strings, of integers, of pointers. The CPU does not care. The CPU only deals with the bits that are in memory and the bits that are in the registers. And for 64-bit machines, that means it only cares about 64 bits at a time. Now, there's insane specialized hardware stuff that you can use to compute on more things, but a 64-bit integer can represent from zero up to whatever the heck this is, which I'm not gonna say out loud, but it's a very large number. But is that the biggest number you can think of? How can you think of a bigger number?
Add one, and then boom, what happens on the CPU? It overflows and goes back down to zero, right? So fundamentally, you're constrained in what numbers you can represent. Oh, that's literally the next part here. That's funny. Cool, now we have the problem of, okay, we just have bits, but how do we represent negative numbers? There's no negative sign to add to these things. So the first idea was to add a sign bit, but that actually makes the math hard. Hopefully this is all a refresher, because two's complement is the nice thing where you still have a sign bit, so you can easily test whether a number is negative by looking at the uppermost bit. If it's a one, the number is negative. If it's a zero, it's positive. And if it's negative, you flip all the bits and add one to get the magnitude. Study this if it's confusing, because negative numbers can definitely come up in here. Okay, ah, we'll stop here. Why is it not talking about endianness? Do I do that later? Yeah, I mean, it has to. It's gotta be here. Why isn't it here? Oh, um, you know.
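Here is a minimal Python sketch of the two ideas above: a 64-bit value wrapping back to zero, and two's complement for negative numbers. Python integers never overflow on their own, so the sketch fakes a 64-bit register by masking; the helper names `add64`, `negate64`, and `to_signed64` are made up for illustration. Negation uses the standard two's complement rule of flipping every bit and adding one.

```python
BITS = 64
MASK = (1 << BITS) - 1  # all 64 bits set: the largest unsigned value


def add64(a: int, b: int) -> int:
    """Add like a 64-bit register would: anything past bit 63 is dropped."""
    return (a + b) & MASK


def negate64(value: int) -> int:
    """Two's complement negation: flip all the bits, then add one."""
    return ((value ^ MASK) + 1) & MASK


def to_signed64(value: int) -> int:
    """Reinterpret 64 bits as signed: a set top bit means negative."""
    if value >> (BITS - 1):
        return value - (1 << BITS)
    return value


print(MASK)                      # 18446744073709551615
print(add64(MASK, 1))            # 0, the wraparound
minus_one = negate64(1)
print(hex(minus_one))            # 0xffffffffffffffff
print(to_signed64(minus_one))    # -1
print(add64(minus_one, 1))       # 0, so -1 + 1 works with plain addition
```

The last line is the reason two's complement is preferred over a bare sign bit: the same adder handles positive and negative numbers with no special cases.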