So, yeah, those of you who know me know that I'm always talking about how everything's too high level. At DEF CON 22, I was talking about how a lot of security tools were too high level and how that screws analysts. Then more recently, I got this here, if anybody picked this up: I wrote in the good book about how escape is too high level. And now I'm going to be talking about how assembly language, in certain situations, can be too high level. Just a note: in the speaker room back there, there was a minor disaster, so my Kali image broke, but whatever, I'm going to roll past that. I got screenshots, so it's no big deal.

But first, before I begin, some shouts to start with. Most importantly, the person that helped me with all the art you're going to see in here, Kurt Cocaine, my fiancée. She's back there right now, not wearing the hamburger earmuffs. Funny side note: I know my handle is kind of straight edge, and it's kind of ironic that hers has the name Cocaine in it. Also Fat Cat Fab Labs, a hackerspace that I hack out of in the West Village; NYC 2600; and also DC201, because DEF CON. Currently we don't have an NYC chapter, so we go over to Jersey for that.

So anyway, about me, in the context of this slide, not about me in general: as a teenager in the 90s on Windows 3.1, I wanted to learn to program, but I wasn't exposed to any kind of programming languages. I didn't know about BBS numbers to even dial. I was in a little silo. But I kind of had the idea that programs still had to be editable in some kind of language and some kind of editor, so I opened up Notepad and naively dragged Calc into Notepad to see if I could edit this program. And what I saw was kind of discouraging.
I'm like, I don't know this language, whatever this is, this garbage, but I want to. So eventually, as a side note, I did start programming in machine code directly, and at CactusCon last year I live demoed programming Hello World in stock Windows 3.1 using debug.

But back to my journey of assembly language. When I first tried to learn assembly language, I tried on the TI-82, just because I thought it'd be simple. It's the Z80 chip, and I followed a tutorial to clear the screen and that worked out great, and then I tried to venture out on my own and clear the memory. So I gave up on that. Then when I was in college, I learned officially, more academically, on a Motorola chip, the M68HC11. That's when I really learned assembly and really fell in love with it, and really learned the relationship between machine code and assembly language, because one of the first challenges I gave myself was to write a self-modifying program, kind of like a virus, but not really. Just a program that wrote itself back into memory and executed itself. And to do that, you have to understand the machine code side of it, not just the assembly side of it.

And then Lost: he used to come to our Phoenix 2600 meetings when I used to live in Phoenix, and he's the one that got me into all things Parallax. He gave me my first BASIC Stamp and he coerced me into using the Propeller chip, and I learned assembly on it and wrote some audio stuff. I've also learned the machine code for this as well, and the relationship is fairly one-to-one.

Then, down the road, my previous employer voluntold me to do GREM training, the GREM certification through SANS. So this is when I actually formally learned x86. Ironically, it's the last architecture that I've learned, not the first one. And the screenshot is all the Intel manuals for x86.
I've actually read them all cover to cover. Except for volume four, which I guess is a new thing now.

So, to get into this a little bit, I want to start with my feelings about assembly language and machine code in x86 and their relationship to each other. So I introduce you to my mental image of the Infosec Bro: somebody that bro-explains, that oversimplifies things. Here's a scenario, like I'm witnessing this. This is kind of a quote of the type of things that you would hear from a bro explainer about assembly language. "Seeing it's a one-to-one mapping, they're basically the same thing." "You can take machine code and you know exactly what assembly it is." "It's just ones and zeros," he's saying. "There are no other layers of abstraction between assembly and the processor." That's okay.

The downside is, I didn't make these quotes up. I took them all out of this book. And I'm sorry, I know a lot of the authors are DEF CON attendees. I know, but I honestly don't blame the authors, actually. Even one of the authors blames the publisher. This is an Amazon review. The one that shows at the top for this book is from one of the authors of the book, and at the very end he said, well, I can't, let me try to zoom here. That's the thing he says.

So this talk is about how assembly language and machine code are not one-to-one, and I'm going to go into this in gruesome detail.

So, here's the disappointing part. I had an awesome little example in Kali. It was just a toy vulnerable program and a toy exploit. It wasn't like I was trying to drop an 0-day; it was just to demonstrate what you could do with raw machine code, with an understanding of it. At least I had a screenshot of one of the crucial parts of debugging the vulnerable program.
But here's what the program does, and just as a point of reference, if you guys want to play with it yourself, you can. So I at least put a little note up here, and I'll make it bigified. It's also in the most recent issue of 2600; these examples are listed there. The vulnerable program is called kitty. I call it that because it's just like cat. It just cats out a predefined text file, which is file.txt. And it has a limited buffer on the stack of 16 bytes, purposely naive. So file.txt is the exploit for it, and to run it you just run kitty. Okay.

The crucial part that I was talking about here is this moving ESP to ECX and then jumping to ECX. It's kind of like your typical JMP ESP, but indirect: in our theoretical example we weren't able to find a JMP ESP anywhere, but we were able to find this, moving ESP to ECX. Now ECX has ESP, so now you can jump to ECX.

The crucial thing to note here, though, is the machine code 8B CC, which is so blurry. If you were to use a tool like nasm_shell and you typed in MOV ECX, ESP, you're not going to get 8B CC. You won't. What nasm_shell gives is officially what Intel says you should do. But there are redundancies.

And I'm not really saving this for last. I'll just show you the tool that I was talking about in the program guide, and I'm going to jump in and out of it. If I do MOV ECX, ESP, this is what my tool does. It's like nasm_shell. Up at the top it gives 89 E1. That's what your nasm_shell is going to give. But my tool, irasm, or the interactive redundant assembler, also gives one of the alternates, 8B CC. And for some of the other instructions, we'll see, there are way more alternates. You'll have like 8 variations that work. So, some of the tools I'm going to use in this talk: irasm, you saw it.
m2elf is another tool; it allows you to program in direct machine code. So I can say 32 C0. It's direct machine code, and it tells me what this is: XOR AL, AL. Or I could also type 30 C0, different machine code, but it's still XOR AL, AL. It's not one-to-one.

So, to go through the one-to-one kind of philosophy here, we're looking at the ADD instruction. This is all from the Intel manual; you'll see a lot of screenshots from the Intel manual. 04 is the machine code in this context for ADD. In this context it adds an 8-bit value to the AL register. And this is what it looks like in the debugger, in Evan's Debugger. You have the 04 for ADD, and 42 is the data that we want to add into AL, which is the 66 in decimal that you see over there. We step through it one step, and you see EAX has 42. And that's what we want.

To take it up a notch, let's do an increment. 40 is our machine code for that. Really, 40 is the machine code to increment the first register, which is EAX. They go in order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. So 41 would correspond to ECX, 42 would correspond to EDX, and it works like that. So there's all of them, 40 through 47. That's incrementing all of our 32-bit registers. Unless you're 64-bit, and then it's a prefix and it gets confusing.

Then, taking it up one more level, we've got the MOV instruction. It's kind of like increment, where we have B, and then the 0, 1, 2, 3 after the B is what register it corresponds to. Then we also add the immediate byte we want to move into the register. Since these are 8-bit values, those are the 8-bit registers in order, and these are our variations of that. So we have B0 for AL, B1 for CL, and that's what the corresponding machine code looks like. However, on the left is that original screenshot, and there's another one on the right.
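
As a rough illustration of the opcode-plus-register pattern described so far (ADD AL with `04`, INC with `40` through `47`, MOV with `B0` through `B7`), here is a minimal Python sketch. The function names are made up for this example, and the 8-bit register order (AL, CL, DL, BL, AH, CH, DH, BH) is taken from Intel's register numbering rather than from the talk itself.

```python
# Illustrative sketch: register numbering in Intel's order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def add_al_imm8(imm):
    """ADD AL, imm8: opcode 04 followed by the immediate byte."""
    return bytes([0x04, imm & 0xFF])


def inc_r32(reg):
    """INC r32 in 32-bit mode: 40 plus the register number.

    In 64-bit mode the bytes 40-4F are prefixes, so this form does not apply there.
    """
    return bytes([0x40 + REG32.index(reg)])


def mov_r8_imm8(reg, imm):
    """MOV r8, imm8 (short form): B0 plus the register number, then the immediate."""
    return bytes([0xB0 + REG8.index(reg), imm & 0xFF])


if __name__ == "__main__":
    print(add_al_imm8(0x42).hex())        # 0442
    print(inc_r32("edx").hex())           # 42
    print(mov_r8_imm8("cl", 0x42).hex())  # b142
```

Note that the byte `42` shows up both as INC EDX and as the immediate in the other two instructions; what a byte means depends entirely on where it sits in the instruction stream.
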
On the right we have the same assembly, but completely different machine code. The reason for that is, in the manual, we have a different encoding we can use. We can move an immediate 8-bit value into a register or a pointer, but because we have the option of a pointer or a register, we can still pick a register, and hence the redundancy. So it's a simple example of how it's not one-to-one.

Now I'll cover probably the most complicated, and one of my favorite, examples of it not being one-to-one, or of the abstractions. The assembly in this example is too high level. The machine code is even too high level, and so are the mathematical concepts that we're trying to demonstrate with this instruction, like how to do math in base 1 and base 0, because that makes sense.

So this is what this instruction is supposed to do by default. It takes a 2-byte value, it splits it up, and we're not really adding the halves together. We're kind of smashing them together: we're taking the BCD values and putting them together as 79. And the hex value of that, 4F, is what goes back into the register. In this case it's AX, so the result goes into the AL register. So that's what it's supposed to do. But, you know, it's BCD, and we have a byte for each digit, which could go way above 10. Weird things like that.

Another weird thing is, it's a base 10 conversion, but in this case D5 is AAD, and then we get this 0A that shows up after it, which we don't actually get to say in assembly language, but in machine code you can. Intel says you can do that too, but you have to do it in machine code. So we can mess around with that and do different bases, which is kind of cool. So let's do that. Base 6: we have a couple of base 6 digits that are valid. We smash them together like that, that's the hex value of it, and it goes in like that. And there's a screenshot of it.
Of course, I step through, so you see that that value actually does show up in EAX, the 17. Base 2, this is even easier: 1 and 1, we put that 1-1 together, that's 3, we put that back in there, and that works too.

So now let's get ignorant. We're going to use invalid values. So 05 and 6F. 6F would be a digit in base something really, really high. In this case it's not hex; if you can imagine a single digit with that value, up in the hundreds, you'd have to have that many symbols for it. So we take those values separately, we kind of add them together. I don't know how to visually represent that, but that's the closest I get. And then by the process of magic, we get A1, and it goes in there, and that's actually what happens. Not an error; that's what happens. We'll get to why in a second.

But let's try base 1, because that's a thing. We split those values up. I don't know, if it's base 1, 0 is the only valid character, right? We split them apart, add them together, we get 0. I mean, whatever, no surprise there, but that's what happens for that.

But what about base 0? What symbol do you even have for base 0? I don't even know what to choose, so I just put BEEF in for my value. You separate those out, add them together, and by the process of magic you get EF, and that's actually what will work on the processor.

So why is this happening? It is intentional, though. We're getting toward microcode, although the manual only gives you the pseudocode, which is why it's hard to trust what's actually going on under the hood. But to simplify it, because that's a little bit too obscure: AL gets AH times the base that you supply, plus AL. That's all it's doing. And it turns out that that abstract mathematical concept is converting bases, which is kind of profound in a way.
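
The formula just stated, AL = AH times base plus AL, can be checked against the examples from the talk with a few lines of Python. This is only a model of the arithmetic, not of the processor: the function names are invented here, the result is truncated to 8 bits and AH is assumed cleared afterward as Intel's pseudocode describes. The base 6 input `0x0305` is an assumed value chosen to produce the `17` seen in the screenshot, and base 10 for the `056F` example is inferred from the `A1` result.

```python
def aad(ax, base=10):
    """Model of AAD: AL = (AH * base + AL) truncated to 8 bits, AH cleared.

    Returns the new AX, which equals the new AL because AH becomes zero.
    """
    ah = (ax >> 8) & 0xFF
    al = ax & 0xFF
    return (ah * base + al) & 0xFF


def aad_bytes(base=10):
    """Machine code for AAD with an explicit base: D5 followed by the base byte."""
    return bytes([0xD5, base & 0xFF])


if __name__ == "__main__":
    print(aad_bytes().hex())       # d50a, the default base 10 form
    print(hex(aad(0x0709)))        # 0x4f  (BCD digits 7 and 9 -> 79)
    print(hex(aad(0x0305, 6)))     # 0x17  (assumed base 6 digits 3 and 5)
    print(hex(aad(0x0101, 2)))     # 0x3   (binary 11)
    print(hex(aad(0x056F)))        # 0xa1  (invalid digits, still computed)
    print(hex(aad(0x0000, 1)))     # 0x0
    print(hex(aad(0xBEEF, 0)))     # 0xef  (base 0 just discards AH)
```
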
It also shows that mathematics is kind of not reality, you know. And so this is us working out every single example I went through with that simple formula. So it works. It's what we wanted. If we give it the right input, it actually converts bases. It's kind of elegant. But if we give it invalid crap, it still does something. So why would you use it? No real reason, but it is kind of a novel way to clear out the AH register, I guess, if you do base 0.

So now I'm about to go through like 30 slides on one of the most complicated encoding mechanisms in the Intel processor. It's the ModRM plus the SIB byte. It's what allows us to write assembly language with pointers, really. It's the way you can encode pointers. In pointers you can have a base register, you can have a scaled register, like EBP times 2 or something like that, and you can also have a fixed offset, either 8-bit or 32-bit. So some examples of what you see in assembly language for the pointer part: EAX plus EBX times 2; EBX plus 33; ECX times 8 plus this longer hex value; and maybe just a displacement. They're all optional, but of course you have to have at least one, or else what are you referring to?

This is what the ModRM table looks like, and we'll get to the SIB table too, but it's a lookup table. You line one of your operands up here, and then the other operand, which could be a pointer, over here, and you just find where they align in the table. We're going to work through a lot of examples to make this clear, to see the proof of concept. For all these examples I'm going to use XOR, just to keep it consistent, so you know what the 31 in all these machine code examples refers to, and by that I mean this 31 up here, right? So our first example is XOR EAX with EDX.
So the EDX I'm talking about in the table for that second operand is this EDX here, and the EAX, which is not a pointer, it's just a register, we find down here. If we follow it on the table, we end up with this D0 over here, and that's our D0 over there. So that's how that works. That's what's happening. An assembler is converting it like that.

If we do XOR with ECX as a pointer and EAX, then first of all we have this EAX up here for the second operand, and we've got to locate ECX as a pointer, and we find it here. That gives us the machine code of 01 after our 31 for XOR, and that's our machine code up here.

So we're just kind of ramping it up, getting more complicated as we go. Say our pointer is ESI plus 0x42, and EAX is going to go into that pointer. First of all, the easy part, that EAX up there. Now, this second section here is the one that has all the 8-bit displacements, because that's the displacement we're using, and then we've just got to find the one that's ESI. So, 8-bit displacement, ESI. That's what gives us our 46 machine code here, and that's why we have our 31, 46, and then 42 is just referring to that part right there, that 42.

And then it gets more complicated. This is kind of like the previous example, only we're doing a 32-bit displacement instead of an 8-bit. I only included it because I wanted to show endianness in this. So first of all, there's the ESP part, that's easy. And then EBX plus a 32-bit displacement, and it's our A3 right here. So we have our 31, A3 up here. And then you notice I have like FFF elite. You see it kind of backwards there. That's Intel being little endian. I think it's little endian. I call it reverse endian myself, because it doesn't make sense. I'm a guy that learned on Motorola, after all; that's the right way in my mind.

Then there's XOR with just a displacement here. I have EAX up here.
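
The table lookups walked through so far amount to packing three bit fields into one byte: 2 bits of mode, 3 bits for the register operand, and 3 bits for the register-or-memory operand. A minimal Python sketch of that packing follows, reproducing the `31 D0`, `31 01`, `31 46 42`, and `31 A3` examples. The helper names are invented for illustration, and the 32-bit displacement `0x11223344` is a placeholder, not the value on the slide.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def modrm(mod, reg, rm):
    """Pack the ModRM byte: mod (2 bits), reg (3 bits), r/m (3 bits)."""
    return (mod << 6) | (reg << 3) | rm


def xor_rm_r(mod, rm, reg, disp=b""):
    """Opcode 31: XOR r/m32, r32. `rm` is the destination side, `reg` the source.

    mod 3 = plain register, 0 = [rm], 1 = [rm + disp8], 2 = [rm + disp32].
    ESP and EBP in the r/m slot have special meanings in memory modes
    and are not handled by this sketch.
    """
    return bytes([0x31, modrm(mod, REG32.index(reg), REG32.index(rm))]) + disp


if __name__ == "__main__":
    print(xor_rm_r(3, "eax", "edx").hex())                # 31d0   xor eax, edx
    print(xor_rm_r(0, "ecx", "eax").hex())                # 3101   xor [ecx], eax
    print(xor_rm_r(1, "esi", "eax", b"\x42").hex())       # 314642 xor [esi+0x42], eax
    # Placeholder displacement; struct.pack("<I", ...) emits it little endian.
    disp32 = struct.pack("<I", 0x11223344)
    print(xor_rm_r(2, "ebx", "esp", disp32).hex())        # 31a344332211
```

The last line shows the "backwards" byte order mentioned in the talk: the displacement `0x11223344` is stored as `44 33 22 11`.
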
And this is where I can do just a displacement. We don't have the option to do an 8-bit displacement, but we do have 32 bits, which is fine because we just pad it with zeros. And that's what we have. We have our 31 right here. Then we take that 05 from the machine code, and that's our 5 there. And then our 00 00 00, you know, reversed, and our 42 at the end. So that's that.

I'm almost done with these really tedious examples. I just want to show you how complicated it can get. In this case we're going to do scale. This means we're going to have to use the SIB table after the ModRM, scale meaning ECX times 4. So first of all, EAX, that second operand, that's easy. Then we know that we're going to have an 8-bit displacement, and we know we're going to do ECX scaled. This dash-dash thing is what means "use the SIB table". So this is the dash-dash that has an 8-bit displacement, which is what we want. And then we'll deal with the rest, the EBX and ECX, in the SIB table.

So here we are on the SIB table, a different table. EBX we select up here for that first EBX. And ECX times 4, we find there. That's how we do that. We get our 8B right there. And that's how we have our 31, then 44 from the ModRM table, 8B from the SIB table, and 42 from this displacement. It seems complicated, but it's so logical. You're just looking the stuff up on a table.

Now we're going to get a little bit complicated. We're going to poke through the exceptions first, and then we're going to go through some of the redundancies. So first of all, ESP as a pointer: say we want to encode that. You'll notice we don't have a register for ESP here. So we have to do kind of a hack. It's hard to call it a hack, because it's the official way to do it. But we're going to use a SIB, this dash-dash, to go to the SIB table. And then in this case we can say that for the scaled register, there is none.
But while we say none for the scaled register, for our base register in the SIB we do have ESP in this one. And that's how we do that hack. So we have our 31 for XOR, 04 from the ModRM table, and then 24 from this table.

One question that I first asked when I was looking through this and encoding it manually is: we have this none here, but what's the difference between that none and this none and that none and that none? Because we're not scaling any registers, so it doesn't matter. That's literally saying none. What's the difference? None.

That was us using the SIB table as a hack to put in a base register with no scaled register. But we could do that with registers other than ESP. So this is me just doing it with EAX. The first instruction there is how you should encode that, 31 00. But these are all the other alternative ways to do it with the SIB byte. Of course an assembler is not going to do that, because that's more bytes.

How about scaling the ESP register? Nope. You just can't. It's impossible. If you try to do it in an assembler, it's going to give you an error. It's interesting: it's a general purpose register that you can't scale with. And we'll run into complications with that too.

Another example: say we want to scale EAX times 2 and add EBP to it. Well, EBP is kind of weird too. In this case we can't use this other format down here. We really are using the SIB byte to do this. We're doing EAX for that second operand, and we're going to use 44 for the machine code to go to the SIB byte. In this case we have EAX times 2, which we find here. And this asterisk means this. So in this case, from the ModRM table, I use the second encoding, which refers to this displacement 8 plus EBP, which is how we get the EBP over here. And it's actually doing a displacement of nothing. That's how an assembler chooses to encode that. It's not straightforward.
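
The SIB examples above follow the same bit-packing idea as ModRM: 2 bits of scale, 3 bits of index register, 3 bits of base register, with index value 4 (the ESP slot) meaning "none". Here is a hedged Python sketch that reproduces the byte sequences mentioned in the talk; the helper names and the `xor_mem_sib` signature are assumptions for illustration only.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
NO_INDEX = 4   # the ESP slot in the index field means "no scaled register"
USE_SIB = 4    # r/m value in ModRM that says "a SIB byte follows"


def modrm(mod, reg, rm):
    return (mod << 6) | (reg << 3) | rm


def sib(scale, index, base):
    """Pack the SIB byte: scale (2 bits), index (3 bits), base (3 bits)."""
    scale_bits = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    return (scale_bits << 6) | (index << 3) | base


def xor_mem_sib(reg, base, index=None, scale=1, disp8=None):
    """XOR [base + index*scale (+ disp8)], reg using opcode 31 and a SIB byte.

    Sketch only: with base EBP the caller must pass a disp8 (even 0),
    since mod 0 with base 5 means "disp32, no base" instead.
    """
    mod = 0 if disp8 is None else 1
    idx = NO_INDEX if index is None else REG32.index(index)
    out = [0x31, modrm(mod, REG32.index(reg), USE_SIB),
           sib(scale, idx, REG32.index(base))]
    if disp8 is not None:
        out.append(disp8 & 0xFF)
    return bytes(out)


if __name__ == "__main__":
    print(xor_mem_sib("eax", "ebx", "ecx", 4, 0x42).hex())  # 31448b42
    print(xor_mem_sib("eax", "esp").hex())                  # 310424
    # [eax] through a SIB byte: the ignored scale bits give four redundant forms.
    print([xor_mem_sib("eax", "eax", scale=s).hex() for s in (1, 2, 4, 8)])
    # ['310420', '310460', '3104a0', '3104e0']
    print(xor_mem_sib("eax", "ebp", "eax", 2, 0).hex())     # 31444500
```

Because the scale bits are meaningless when there is no index register, every "none" row in the SIB table decodes to the same pointer, which is where the extra encodings of `[eax]` come from.
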
We're getting into that territory. There's an implied scale of times 1. This is technically valid assembly that an assembler will accept, but it's kind of BS on the back end. What's really supposed to be happening here is, EAX is the base register and ECX is the scaled register. It's just that in this case it's times 1. So that's how it encodes it. It still needs the SIB byte to encode that.

Or say we have ECX times 1. Well, we could encode that with a SIB byte if we were doing it manually, but your assembler is not going to do that. It's going to interpret what we're doing. It's going to do a little dance, and it'll actually encode it with ECX as the base register, without even using the SIB byte.

Then ESP times 1. Well, I said you can't scale ESP, so you'd think you can't do that. If you were to write this in your assembler, you're going to get an error. But if you write EAX plus ESP times 1, it'll actually work, because your assembler, and NASM is the one I use, might just make ESP the base register instead and then scale EAX times 1, because that's valid, by the commutative property. And sometimes it just ignores you and chooses fewer bytes.

With that commutative property example, this is me just showing how you can switch EBX and ECX and you get different machine code, but logically it's the same thing. With EBP you can do it, but the machine code is going to be a little bit bigger, because EBP is encoded a little bit weird, as I showed before. And with ESP you just can't do it, because in one case it would be forced to be scaled, which is impossible.

Another little trick for redundancy is: put a null in it. In this case, these two assembly instructions look the same, but they're not. In one of them I'm using an encoding with an 8-bit displacement of nothing. And then you can do a 32-bit one as well, you know, just nothing.
And then you can do commutative and mix and match and put a null in it, and you've got all kinds of redundancies. That's kind of the point of this whole talk: you've got redundancies and different ways to do things.

Another basic redundancy is the ModRM redundancy. Where this plays in is with instructions where you're moving things around or doing logical operations: you cannot do a memory-to-memory operation. You have to do either register to register, or register to memory, or memory to register, but not memory to memory. So that's why they have to include two different encodings. In this case, for compare, 3B is the encoding with a register or a pointer as the source and a register as the destination. The second one, 39, has just a register as the source and either a register or a pointer as the destination. Because register to register can be encoded in both, you've got a redundancy. And this is actually the exact type of redundancy that I had in the screenshot before with that toy exploitable program. That's why that worked: because of this redundancy. And then this is just the ModRM table showing that C0 encoding for EAX and EAX.

Some more interpretive dance with the SIB byte. Say we did EAX times 2. Is it the same as EAX plus EAX? Well, to an assembler it is. I actually just write this in source, and this is what it looks like when it gets disassembled. The assembler is choosing to encode the EAX times 2 as EAX plus EAX. It's ignoring you and doing interpretive dance. The reason for that is, if you were to directly encode it in machine code, using the SIB byte and doing multiply by 2, it actually requires more machine code to do it.

And I should take a step back. With all this experimentation, I say that I do it directly in machine code. The way I do that is with this tool that I wrote, m2elf, that I showed you earlier.
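
The register-to-register redundancy described above can be made concrete with a short sketch: the same operation has one opcode for "r/m is the destination" and another for "reg is the destination", and when both operands are registers either one works with the operand fields swapped. The function name is invented for this example; the opcode pairs (`39`/`3B` for CMP, `89`/`8B` for MOV) are the ones from Intel's manual that the talk refers to.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def modrm(mod, reg, rm):
    return (mod << 6) | (reg << 3) | rm


def both_directions(op_rm_r, op_r_rm, dst, src):
    """Return the two register-to-register encodings of `op dst, src`.

    op_rm_r: opcode of the "op r/m32, r32" form (dst sits in the r/m field).
    op_r_rm: opcode of the "op r32, r/m32" form (dst sits in the reg field).
    """
    d, s = REG32.index(dst), REG32.index(src)
    form_a = bytes([op_rm_r, modrm(3, s, d)])
    form_b = bytes([op_r_rm, modrm(3, d, s)])
    return form_a, form_b


if __name__ == "__main__":
    # mov ecx, esp: the form an assembler typically emits, and the alternate.
    print([b.hex() for b in both_directions(0x89, 0x8B, "ecx", "esp")])
    # ['89e1', '8bcc']
    # cmp eax, eax: both forms share the C0 ModRM byte.
    print([b.hex() for b in both_directions(0x39, 0x3B, "eax", "eax")])
    # ['39c0', '3bc0']
```

This is the same pair seen earlier in the toy exploit: `89 E1` and `8B CC` both execute as MOV ECX, ESP.
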
So like I could do, what was it, 33040, 33040, and I get that representation there. This is an interactive mode. What I originally wrote this for was to write out a whole program in machine code, and it actually spits out an ELF executable, so you can use it for that too. So, just a point of reference, that's that tool.

NASM is tolerant to your bullshit. You can write something like EAX times 5, even though you can only do times 1, times 2, times 4, times 8. You can do EAX times 2 minus EAX, even though minus isn't a thing at all. Because NASM is smart, it's cool. I praise NASM. EAX times 5 is pretty much the same thing as EAX plus EAX times 4, which is a thing. And EAX times 2 minus EAX is just EAX, and that's a thing. So NASM is tolerant to your bullshit and will do that too.

So now I'm kind of done with the ModRM as a big thing. Now I'm just going to go through all kinds of random miscellaneous loose ends, and when I'm done with all that, I'll talk about the tool more. I wrote a tool so you don't have to think about this stuff, so it's automated. So you can go from nasm_shell to maybe irasm for other things.

First of all, I'm going to talk about TEST, this particular TEST encoding with a register or a pointer as the source and a 32-bit register as the destination. This is actually not a thing. There's no encoding for it, although you can still write assembly like it. But before I do that, as kind of an analogy, there's compare. I'm showing the two different compares that you have and what they look like. This is me doing both forms of that, a pointer to a register and then a register to a pointer. This is disassembling it. We see it's different and everything.

But then we go back to this TEST thing. Say we really did try to do, in the first case, a pointer to a register. This is what we get. The assembler gives you the same thing for both. Why is that?
Well, Intel doesn't have an encoding for that first one. This is the only encoding that it has. So why is that? Well, with compare and test: compare is like a subtraction, but it doesn't store the subtraction, it just sets the flags. Test is like an AND, but it doesn't store the ANDing, it just sets the flags. With compare, if we look at doing a subtraction, 5 minus 3 or 3 minus 5, if we switch them around, the result is different. Whereas with test, if we switch those around, it's the same thing either way. Hence why you only need one encoding.

And then this is just kind of a miscellaneous 64-bit trick. Earlier I was saying how with increments it's 40 through 47. Well, if you're 64-bit, that 40 through, actually, 4F is a prefix that modifies the instruction after it.

Now, after that, let's talk about fencing, which I think is kind of like a semaphore, but I might be completely off on that, because I've never used these fence instructions. This is our Intel manual screenshots of the machine code for LFENCE, SFENCE, and MFENCE. And this is me writing those instructions, and then I'm disassembling it, and this is the result I get. This is logical; this is the machine code that the manual gave.

But then there's also this. If I go in here, we have a bunch of LFENCEs, and you'll notice that the first part's the same and then you've got E8, E9, EA, EB, and the same kind of thing with the other ones. It starts with this F0 and just kind of increments from there. This is normal, this is fine, says Intel. It's not a weird thing that I discovered. Intel says you can do it, so I did it, and it works, and I don't know why it's there, but it works. So you can write some more machine code that you can't write in assembly, because assembly's too high level.

So here's another thing.
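
The run of alternate fence encodings described above comes from the low three bits of the last byte being ignored, so each fence has eight byte sequences that decode to it. A small Python sketch that enumerates them, assuming the documented `0F AE` opcode with register-field values 5, 6, and 7 for LFENCE, MFENCE, and SFENCE (the function and table names are invented here):

```python
# Value of the 3-bit reg field in the ModRM byte for each fence (opcode 0F AE).
FENCE_REG = {"lfence": 5, "mfence": 6, "sfence": 7}


def fence_encodings(name):
    """All eight encodings of a fence; the low 3 bits of the last byte are ignored."""
    reg = FENCE_REG[name]
    return [bytes([0x0F, 0xAE, 0xC0 | (reg << 3) | rm]) for rm in range(8)]


if __name__ == "__main__":
    for name in ("lfence", "mfence", "sfence"):
        forms = [e.hex() for e in fence_encodings(name)]
        print(name, forms[0], "...", forms[-1])
    # lfence 0faee8 ... 0faeef
    # mfence 0faef0 ... 0faef7
    # sfence 0faef8 ... 0faeff
```

The first entry in each list is the form an assembler emits; the other seven are only reachable by writing the bytes directly.
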
To make things easier, not really for the programmer but for the processor, since it's fewer bytes: because it's so common to compare a value with the AL register, and AX and EAX, there's a specific encoding just for it. So even though you can do an 8-bit, 16-bit, or 32-bit value with any register or a pointer, in this case AL is so popular that it gets its own machine code. But as you're probably guessing by now, of course you can encode AL with the general form too. So because of that, here are some more redundancies.

Similar to that, with rotate instructions and bit shifting instructions, it's really common to shift by just one, so that gets its own encoding. They also have the encoding here to do an 8-bit value, and weirdly enough you can write a shift count all the way up to 255, which doesn't make sense for a register that's too small for that to make a difference. But anyway, same kind of thing. It gives me more redundancies here.

There are also branch hints, where there's no reason I'd ever want to use one other than the fact that Intel says there are no mnemonics for it. So of course I want to write a branch hint, because I can't write a branch hint in assembly. A branch hint is just a prefix that you put in front of any instruction that would branch. In this case, I put the 3E in here, and it hints to the processor that there might be a branch. I don't even know if it's even used anymore, but whatever, you can.

I really like this next one, because in the manual this instruction, machine code wise, doesn't exist, but it does. To me it does. So this is writing an example, just in source here. The reason that Intel doesn't have a separate encoding for shift arithmetic left is because it's logically the same as just shift left. So they really only use one encoding for it.
So if I do a move and then the shift left and the shift arithmetic left, when I disassemble it, both of my shifts got converted to a shift left. If we look at the Intel manual and look at the machine encoding for it, it is identical for both of these instructions. That's weird. That slash 4 thing you saw is really represented by this binary 100, and they throw SHL and SAL on the same part of the table.

What I see, though, and what some of you might be seeing if you're sitting close enough to see this table, is there's a blank spot over here. And I'll make note of this number, 6. So I'm going to try to do it manually here. This is, instead of slash 4, making that 4 part of the byte a 6, and I now have SAL, and here are all the different versions of it. So now I can do SAL. So, mission accomplished with that.

There's a hidden TEST too. I like looking at these tables and finding empty spots to try to see what they actually do. This blank part here is actually just a TEST, just like the one right next to it. Although some disassemblers can't even disassemble it. With me using EDB, it just says it's a data word and then there's a move right after it, where really this is the machine code for TEST EAX, and this move isn't even a move; it's the operands for that TEST instruction. When you step through it and execute it, it actually does run as a TEST. So if you're looking at this disassembly, you'd be mistaken about what it actually does. There is no move.

I call this set of slides "load ineffective address", even though the instruction really means load effective address. What it really does, and I'll zoom into this instruction here, is it doesn't treat this as a pointer the way you normally would. It just kind of does a mathematical operation.
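
Stepping back to the SAL and hidden TEST encodings described above: the "slash 4" and "slash 6" notation refers to the 3-bit reg field of the ModRM byte being used as an opcode extension instead of a register. A minimal Python sketch follows. The helper name is invented; the specific opcodes are assumptions, since the talk does not name them: `D1` as the shift-by-one group, and `F7` as the group where slot 0 is the documented TEST and slot 1 is the blank one. The immediate `0x11223344` is a placeholder.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def group_op(opcode, digit, reg, tail=b""):
    """Encode `opcode /digit` on a register: the reg field holds the opcode extension."""
    return bytes([opcode, 0xC0 | (digit << 3) | REG32.index(reg)]) + tail


if __name__ == "__main__":
    # Shift group, shift by one (assumed opcode D1).
    print(group_op(0xD1, 4, "eax").hex())   # d1e0  documented SHL/SAL eax, 1
    print(group_op(0xD1, 6, "eax").hex())   # d1f0  blank slot, reported to act as SAL

    # TEST with a 32-bit immediate (assumed opcode F7); placeholder immediate.
    imm = struct.pack("<I", 0x11223344)
    print(group_op(0xF7, 0, "eax", imm).hex())  # f7c044332211  documented TEST
    print(group_op(0xF7, 1, "eax", imm).hex())  # f7c844332211  blank slot next to it
```

A disassembler that does not know the blank slot will stop at the first two bytes and try to decode the immediate as a separate instruction, which is consistent with the bogus "move" seen in the debugger.
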
So whatever is in RAX and whatever is in RBX, it multiplies RBX by 8, adds RAX, and adds 10. That's how I wrote this instruction, but you can use any pointer math you want. Then it takes whatever that value is and literally puts it in EAX. That's what it's used for. So in this case, I'll start with RBX: RBX is 30, times 8, plus 10, plus the 5 in RAX, gets you 255. And that's why, when I ran through it all the way, we have FF, 255, as the result.
That's what it's supposed to do, but this instruction assumes that the second operand is a pointer and the first operand is a register. If you write anything else, it's not going to work and it's going to give you an error. Of course I want to try the wrong things. I'm a hacker, I want to see what the wrong things do. So I type LEA EAX, EAX and I get an error. But this uses the ModR/M table to encode it, so of course in machine code you can still write it the wrong way. Which is what I did, and my debugger tells me this is invalid, and I actually get an illegal instruction fault there. So this garbage screws up, but the cool thing, to me at least, is that I can write something that I couldn't in assembly, even though it will crash. It's still kind of cool.
And then this is the last major section of this talk about redundancies. It's prefix abuse, and it's kind of one of my favorites. So first of all, byte swap (BSWAP). It's an instruction that reverses the order of the bytes, the 8-bit values, in one register. And really you only have the option of doing a 64-bit register or a 32-bit register. You'd think, why can't I do a 16-bit register, because there are at least two bytes in it? You would be able to just swap them. You can do it with exchange if that's what you really want to do. But in my head I'm like, why can't I do it with a BSWAP?
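The LEA arithmetic worked through earlier in this part (base plus index times scale plus displacement) can be sketched in a few lines. The function name lea and its parameters are made up for illustration, and the numbers are read as decimal, which is what makes the total come out to 255.

```python
# Sketch of the address arithmetic LEA performs: base + index * scale + disp.
# No memory is touched; the sum itself is the result. Names are illustrative.

def lea(base: int, index: int, scale: int, disp: int, bits: int = 32) -> int:
    """Return the effective address, truncated to the destination register width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors of 1, 2, 4, or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)

# Values from the talk: RAX = 5, RBX = 30, scale 8, displacement 10.
result = lea(base=5, index=30, scale=8, disp=10)
print(result, hex(result))  # 255 0xff
```

The mask models the result landing in a 32-bit register like EAX, so an overly large sum wraps instead of growing.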
So anyway, I try to write it anyway because I want to do the wrong things. AX is a 16-bit register, I try it, and of course I get an error. But if you're writing assembly with 32-bit operations or 8-bit operations, there is machine code dedicated to those. If you want to do a 16-bit operation, you're actually using the machine code for a 32-bit operation, with a prefix put in front of it that overrides it into being 16-bit. And it's used a lot. There are the 66 and 67 prefixes. So that's how we get BSWAP AX. But like a lot of other hacks like this, it turns out it doesn't actually swap the bytes. It doesn't do nothing, though, and it doesn't give an error. What it actually does is clear out the AX register. So again, yet another clever way to clear out the AX register. But it's still kind of interesting, because it's a way to clear it out that actually does a thing, and does it consistently, but you can't do it in assembly. Although you can XOR AX with AX, or move zero into AX, or whatever.
Then there's also the repetition prefix. This is mostly for string operations. You just repeat the same operation over and over again, and it decrements the ECX register to keep track of that. That prefix is F3 in machine code. If you put it in front of an instruction that isn't a string operation, it turns out that it's just going to do nothing. There's one weird exception, though. Anybody that knows assembly, and there might be a few in the room, do you know what the machine code for a NOP is? 90. I heard a lot of 90s. Yeah. So with that in mind, this one maybe people might not know. If you do, just shout it really loud. Do you know what the machine code for pause is? Show of hands. Anybody? Oh, fucking Joe. It's F3. So a repetition prefix, F3 90 I should say. F3 90 is pause. So the repetition prefix is F3. The machine code for NOP is 90, which actually is just the code for an exchange (XCHG EAX, EAX).
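A small sketch of the two prefix tricks just described: putting 66 in front of BSWAP EAX to get the "BSWAP AX" bytes, and putting F3 in front of a NOP to land on the bytes for PAUSE. The helper name prefixed is made up; the byte values are the documented ones, and what the processor does with the 16-bit BSWAP is the talk's observation rather than something this code checks.

```python
# Sketch: a prefix is just one extra byte in front of an existing instruction.
# Helper and constant names are assumptions for illustration.

OPERAND_SIZE = 0x66  # operand-size override prefix
REP = 0xF3           # repetition prefix

def prefixed(prefix: int, insn: bytes) -> bytes:
    """Put a single prefix byte in front of an already encoded instruction."""
    return bytes([prefix]) + insn

bswap_eax = bytes([0x0F, 0xC8])                # documented: bswap eax
bswap_ax = prefixed(OPERAND_SIZE, bswap_eax)   # no mnemonic for this one

nop = bytes([0x90])                            # nop, really xchg eax, eax
rep_nop = prefixed(REP, nop)                   # "repeat a NOP"
pause = bytes([0xF3, 0x90])                    # documented encoding of pause

print("bswap ax:", bswap_ax.hex(' '))
print("rep nop :", rep_nop.hex(' '), "same bytes as pause:", rep_nop == pause)
```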
But given that that's the machine code for these two different things, what if I repeat a NOP and then, just to compare, put a pause right below it? Of course repeating a NOP doesn't actually repeat, because it's not a string-based instruction. But if I do that and disassemble it, I get that. Almost what you would expect, weirdly enough. But again, it's cool when you know what's going on under the hood in machine code: you can do a pause in assembly by writing something ignorant like repeating a NOP, which is actually not a NOP at all. And that's the machine code in the Intel manual there, to show the two instructions side by side. You've got the 90, as you guys know, and then the F3 90 for pause, even though F3 is a repeat prefix.
This one is totally trolly. There's no real good reason to do it, but I love it. So here's some proof of concept code from "Smashing the Stack for Fun and Profit," from a very old issue of Phrack. I modified it a little bit to be 16-bit, for reasons. So what happens if you prefix a prefix? Like if I put 66 before 66, does it override again? Does it double override? What it does is nothing. Combine that with the fact that in x86 the maximum instruction size you can have is 15 bytes. If you make an instruction that's 16 bytes, you get an error. I've tried. So you can do something like that, which is amazing. It's the same program, the same shellcode. It logically works exactly the same, except it looks like that. Every instruction is 15 bytes. And something about that just seems elegant to me. I love that. Because x86 is not a fixed-size instruction set. There are some architectures where the size of every instruction is exactly the same, like that Propeller architecture I was talking about earlier.
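The 15-byte padding trick above could be sketched roughly like this. The function name pad_with_prefix is made up, and the example instruction (66 31 C0, which reads as XOR AX, AX in 32-bit or 64-bit code) is an assumption picked because it already carries the 66 prefix, so repeating it should not change the meaning.

```python
# Sketch: pad an instruction to the 15-byte maximum by repeating a prefix
# it already carries. Names and the example instruction are assumptions.

MAX_INSN_LEN = 15  # architectural limit on x86 instruction length

def pad_with_prefix(insn: bytes, prefix: int = 0x66, total: int = MAX_INSN_LEN) -> bytes:
    """Prepend copies of `prefix` until the instruction is `total` bytes long."""
    if total > MAX_INSN_LEN:
        raise ValueError("x86 rejects instructions longer than 15 bytes")
    if len(insn) > total:
        raise ValueError("instruction is already longer than the target size")
    return bytes([prefix]) * (total - len(insn)) + insn

xor_ax_ax = bytes([0x66, 0x31, 0xC0])  # xor ax, ax with its operand-size prefix
padded = pad_with_prefix(xor_ax_ax)
print(len(padded), padded.hex(' '))
```

Running every instruction of a program through a helper like this gives the fixed-width look the talk describes, at the cost of a much larger binary.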
That's an example of one where every instruction actually is the same size, but Intel is so confusingly not that, until you do prefix abuse. And this is another example, repeating every instruction, even though nothing repeats because none of these are string instructions. So you've got that.
Full offsets. This is an interesting one. Let's look at this example. It's just an XOR where the pointer is RAX plus RAX and EAX is the second operand. If I write that in source, assemble it, and then disassemble it again, I end up with the same kind of instruction. In the assembly part it looks exactly the same, but the machine code is shorter. Why? Well, the reason is that in the machine code up here, you don't see it in the disassembly, but there is an implied 32-bit offset that happens to be nulls. We put nulls in it. So we can try to trick it a little bit. First of all, I'll try to write those nulls out in my assembly, although it's still interpretive dance. It doesn't listen to you, because assembly is too high level, right?
But it seems like that's pointless. Why would I even go through that exercise? Well, the reason is that there is a multi-byte NOP, and they actually do abuse the ModR/M table to make multi-byte instructions like that. So I can try to replicate what Intel recommends, write those, and end up with that, which is totally not what Intel showed, so that's garbage, that's bullshit. So maybe I can try to trick it and not put nulls in there, so it can't take the nulls out and make it smaller. It's a little bit better, a little bit closer, but still bullshit. So really you've got to write it in direct machine code. But why do that when you can just repeat a bunch of prefixes, get even more bytes than they give you, and have a weird-ass NOP sled? I don't know.
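As a sketch of the full-offset idea above: the same RAX plus RAX pointer can be encoded with no displacement or with an explicit 32-bit displacement of zero, and Intel's recommended 8-byte NOP uses the long form. The helper names are made up; the resulting bytes match the documented encodings.

```python
# Sketch: short versus "full offset" encodings of a [rax+rax] memory operand.
# Helper names are assumptions for illustration.

def modrm(mod: int, reg: int, rm: int) -> int:
    return (mod << 6) | (reg << 3) | rm

def sib(scale: int, index: int, base: int) -> int:
    return (scale << 6) | (index << 3) | base

RAX = EAX = 0
USE_SIB = 0b100  # r/m value meaning "a SIB byte follows"

def mem_rax_rax(opcode: bytes, reg: int, full_offset: bool) -> bytes:
    """Encode `opcode` with a [rax+rax] operand, optionally with a zero disp32."""
    mod = 0b10 if full_offset else 0b00            # mod 10 means disp32 follows
    disp = (0).to_bytes(4, "little") if full_offset else b""
    return opcode + bytes([modrm(mod, reg, USE_SIB), sib(0, RAX, RAX)]) + disp

xor_short = mem_rax_rax(b"\x31", EAX, full_offset=False)  # what assemblers emit
xor_full = mem_rax_rax(b"\x31", EAX, full_offset=True)    # same meaning, 4 more bytes
nop8 = mem_rax_rax(b"\x0f\x1f", 0, full_offset=True)      # Intel's 8-byte NOP

for name, code in [("xor short", xor_short), ("xor full", xor_full), ("nop8", nop8)]:
    print(f"{name:10} {len(code)} bytes  {code.hex(' ')}")
```

Because a disassembler prints a zero displacement the same way as no displacement, both XOR forms show up as identical text, which is why the difference is only visible in the bytes.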
So this is just kind of a placeholder slide, in the PDF version only, so you can see some of the instructions that I'm going to demonstrate. But this is the part where we get to see irasm in action, just to see a few more instructions than this one that only gives you two things here. First of all, I'm going to start with an ADC instruction, and really the instruction itself doesn't matter so much. I'm going to show you an interesting thing that it does with the pointer. So EAX, EBP plus EAX, and then EDX; the second one doesn't matter. So I get this. It actually does a forced commutative property, but the official machine code is that, and a redundant version of it is actually less machine code. So your assembler doesn't always try to reduce the machine code. But that's because it's the commutative property and kind of weird. I can do OR EAX, 50 and I get all kinds of different things for that. I'm just trying to show you what this can do. That's SFENCE. It's doing that for you automatically. I can do a jump of zero, for one, and there are slightly different byte sizes for that. I can do a really long one here. And I don't know if the speaker goons flagged me, but I'm actually getting close to done. So say no. We'll do this really long instruction here. Elite twice. So, forced commutative property. I'm just showing I can take all that crap and encode it for you. Push ECX. You know, you've got that. Just showing some of the things that it can do, which is kind of cool. And this is not like the NASM shell, where it's a wrapper around NASM. It's a full assembler written in Ruby. And I'll give a link to it in a second.
Lastly, I just want to show a cool trick with self-modifying code. This isn't just for exploitation. If you can do machine code stuff at a low level, you can do cool tricks like self-modifying code. You can do different stego, Hydan is an example. But you can do even more with this knowledge.
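One way to read the "forced commutative property" from the ADC demo above is swapping which register is the base and which is the index. The operands below (ADC [EBP+EAX], EDX) are an assumption, since the exact instruction on the slide isn't recoverable from the audio, and the helper names are made up. The encodings themselves follow the ModR/M and SIB rules: EBP as a base always needs a displacement byte, so moving it to the index slot saves one.

```python
# Sketch: the same sum of two registers, encoded with base and index swapped.
# The chosen operands and helper names are assumptions for illustration.

def modrm(mod: int, reg: int, rm: int) -> int:
    return (mod << 6) | (reg << 3) | rm

def sib(scale: int, index: int, base: int) -> int:
    return (scale << 6) | (index << 3) | base

EAX, EDX, EBP = 0, 2, 5
USE_SIB = 0b100
ADC_RM32_R32 = 0x11  # adc r/m32, r32

# adc [ebp+eax], edx with EBP as base: mod 01 plus a zero disp8 is required
ebp_as_base = bytes([ADC_RM32_R32, modrm(0b01, EDX, USE_SIB), sib(0, EAX, EBP), 0x00])

# adc [eax+ebp], edx with EAX as base: no displacement needed
eax_as_base = bytes([ADC_RM32_R32, modrm(0b00, EDX, USE_SIB), sib(0, EBP, EAX)])

print(len(ebp_as_base), ebp_as_base.hex(' '))
print(len(eax_as_base), eax_as_base.hex(' '))
# Caveat: an EBP base defaults to the SS segment and an EAX base to DS, so the
# two are only interchangeable where both segments map the same memory.
```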
So just showing you a simple thing: incrementing and decrementing, in this format, are really only one bit of machine code different. I show the binary difference down there, and that's the effective difference. So if you write self-modifying code, you have this machine code here. In these two examples it's exactly the same machine code. When we go through it, we have move, sub, SBB, and so on in the first one. But when you execute all the way through, really it's move, sub, add, XOR, because the self-modifying code is actually modifying that one bit in those instructions.
And that trick I actually use at CactusCon, coming up in September in Phoenix, Arizona. I'm doing a little talk called Boot and Play. It's all about 512-byte boot sector games. Somebody in PoC||GTFO did Tetris, and that inspired me. So I did a Tron game. I have some friends who did other games that I'll be showing in there. Goose, are you here? He wrote something cool. Yeah, okay. So he's co-presenting with me for that. And I also wrote a bunch of crackme-type puzzles that are boot sectors as well.
So yeah, you guys saw the tool, and that's pretty much all of it. I don't know if I have time for questions. I'll take them. If not, the Goons can shut you down. But I left this as the last slide for links: my blog, where I talk about how assembly is too high level, my Twitter, and the two tools that I was going through. So I'm assuming if there are questions there's probably going to be a microphone, maybe. I don't know. Okay, shout really loud or come up close. Or if there are no questions, that's easier for me. Okay, Joe wants to ask a question, which is going to be terrible. What's your question, Joe? I don't have any more info. I don't know why. Oh, try it. It might. I haven't done anything with IDA. Yeah, okay, no, I will. He was asking if doing these tricks confuses IDA Pro. So I haven't really played around with that.
Because for me, those kinds of things don't interest me as much. But I know IDA Pro is really, really good at dissecting. So it might not trick IDA. Yeah, you.
Okay, so the question was: was there ever a point where machine code was one to one with assembly? For Intel or x86, I don't know, but if there was, it was a long time ago, because a lot of these weird things that I was going through exist because of all the backwards compatibility. But really I do want to say no, just because off the top of my head one of the first things I think of is that thing where you can't do a memory-to-memory operation. So you have to have those two different encodings, and because of that you have that redundancy. For that reason alone I would say probably no. But for other architectures, Propeller specifically, I can almost say that it's 100% one to one. There are a couple of weird things where there's a little bit of difference, but I still wouldn't say that technically makes it not one to one. So because of that I love Propeller. It's weird, there are no interrupts, it's a really weird architecture. There's no stack and all that kind of stuff. But are there any other questions? Do I have time? Okay, time. I'll be in the hangout room if you guys want to ask other things. Thank you, thank you.