Let's do these branch instructions, then. Okay, so we will need a table of instructions: we have BPL, BMI, BVC, BVS, BCC, BCS, BNE and BEQ. In fact these come in pairs: plus and minus, overflow clear and overflow set, carry clear and carry set, and BNE and BEQ. The encoding uses the bottom bit of the A field to indicate whether it's a positive or negative branch, which we will be exploiting later. So that's 10, 30, 50, 70, 90, B0, D0, F0, blank entry. So, down in our parser. Here we are going through looking for instructions; here we have the conditional branch instructions. Now, a branch instruction always takes a simple expression as a target, so all we need to do is call parse expression. I don't remember how this worked anymore. Parse expression, yeah, returns void. And then we need to actually emit the thing. Now, we're going to be using almost exactly the same code here, so we're going to take this out of line and put it up here in our record management; the type goes here. This is getting more complicated quickly. So this will actually fail with the branch instructions, because we're using getInsnFlags to try and figure out what type of opcode it is, and it needs to return a relative branch. In fact, in real life, one of these branch instructions will never be used with anything other than a text segment label, so we'll always be going through this path. If you actually use a constant, this code here will emit the constant directly without encoding it correctly, which might be useful, maybe. But it is wrong. We're going to have to deal with that, but we'll do that later. So down in parse, we are going to call add expression record, and it does actually want the opcode, and we're going to want to change that as well. And in the conditional branch, it's going to be the same. So go up to add expression record, which is here, like so. That does not build, because getInsnFlags is actually defined down here.
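As a rough illustration of the branch table and the "bottom bit of the A field" trick mentioned above, here is a minimal C sketch. The opcode values are the standard 6502 ones; the names OP_BPL, is_branch and invert_branch are made up for this example and are not taken from the assembler's source.

```c
#include <stdint.h>

/* Standard 6502 conditional branch opcodes. Every one has the bit pattern
 * xxy10000: xx picks the flag (N, V, C, Z) and y, which is bit 5 and the
 * bottom bit of the A field, picks "flag set" versus "flag clear". */
enum {
    OP_BPL = 0x10, /* branch if N clear */
    OP_BMI = 0x30, /* branch if N set */
    OP_BVC = 0x50, /* branch if V clear */
    OP_BVS = 0x70, /* branch if V set */
    OP_BCC = 0x90, /* branch if C clear */
    OP_BCS = 0xB0, /* branch if C set */
    OP_BNE = 0xD0, /* branch if Z clear */
    OP_BEQ = 0xF0  /* branch if Z set */
};

/* Hypothetical helper: true for the eight opcodes above. */
static int is_branch(uint8_t opcode)
{
    return (opcode & 0x1F) == 0x10;
}

/* Hypothetical helper: flipping bit 5 turns each branch into its opposite,
 * for example BEQ into BNE. */
static uint8_t invert_branch(uint8_t opcode)
{
    return opcode ^ 0x20;
}
```

For instance, invert_branch(OP_BEQ) gives OP_BNE, which is the property a long branch can exploit.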
I want to move that up, and getInsnLength here as well. Okay. So that will now generate the record. So if we do BEQ label and then an RTS and we run it, it will actually do something, but it will not be the right thing. What we've got here is an F0, which is a BEQ, followed by a 3, which is the low byte of the address of label here. This is because our generator code down here thinks this is an ALU instruction, and we are going to want to change that. Right. Okay. We are going to have to rework the way we analyze instructions. So we're not going to do this anymore; we're going to do something else. Basically, I've been looking at this table and there are a lot of very similar instructions. For example, here in C equals 1, this block, these are all the ALU instructions. But if you notice, over here all the logical shift instructions use the same encodings for these B values. So we can reuse the same code that we use for the ALU stuff for these and get them for free, complete with turning absolute values into zero page values, for both. But we can't do that for stuff like Y-indexing or pointer dereferencing, because they are not supported. Likewise, if we go up here, we've got this column here of absolute instructions, all except this one, which is special, and we've got some zero page ones here, etc., and we could reuse the same code. But then we've also got these oddities, and there should be a CPY somewhere to go with CPX. Oh, it's right there. These use the same encoding as B equals 2, even though they're in the B equals 0 column. And likewise, we've got JSR abs here, which is the same encoding as B equals 3. So in fact, we are going to need to be cleverer about the way we analyze this stuff. I think what I want is a function that, given an opcode, returns the effective B column of the instruction, which will then be used for encoding the result.
So this would return B equals 2, this will return B equals 2, this will return B equals 0, and so on. So how do we do this? Well, the obvious thing is that for anything in the C equals 1 row, we just pull out the B and return it. And we can actually special-case this; that would be useful. There is no value assigned to that, but we can figure out what it is. So that is C equals 1, B equals 2, A equals 4. So that is an illegal encoding. Okay, what is next? Well, we can pull out these. These are all in C equals 2. So if the B value has the bottom bit set, then we can pull out the B value. The bottom bit is binary 0100; if that is non-zero, we can do that. For the others... actually, I keep forgetting, the user is not supplying any of these opcodes. So we can't get invalid ones, because my parser will never generate instructions containing invalid opcodes. So anything that's not in one of these columns is a one-byte encoding, an implicit parameter with no payload, except for this one. So that has an effective B value of 2, which is actually that. Otherwise, it's a... there is no B value for implicit. Let's add that. All right, so we have now covered blocks 1 and 2. C block 0. In fact, there are four valid C block values, but it looks like C equals 3 is never used. If you look at this table, yeah: C equals 3 is this column, this column, this column, and this column. These are never used, at least for the stock 6502, which is all I'm currently targeting. If this works, I'm eventually going to want to add 65C02 support, which adds a whole bunch of extra opcodes, but let's deal with that much later. Okay, so C equals 0. Anything in this column and this column is implicit. And of course, I realize that this stuff is only going to be used by the extra parsers for dealing with expression-based opcodes, and none of the implicit opcode stuff will actually go through that path.
So we'll return something anyway, but we should never get there. So we can ignore these. Not this one; we can't ignore that one. So anything in B equals 1, 3, 5, or 7 is going to be the indicated B value, except for this one. So again, that is looking for the bottom bit of the B field being set. Except for that one. However, this is encoded as an absolute value. It's not like a zero-page indirection; this is a 16-bit thing. So from the perspective of the instruction encoding, this is the same as one of these, and we don't need to pull it out as being special. The other thing is that the branch instructions have to be relative, and these are in B equals 4. So we are going to have to create a special fake B encoding for that as well. And we can't use 256 there because we're using a byte to store all these things. So we're going to use these three fields here, which puts it out of range of these. So that's 5, 2, 5. Okay. So if it's not a relative branch, then it must be one of these. So if it's 20, that is special. And then we have these three instructions. So what's the easiest way of doing this? Well, we know that if B is zero and C is zero, then the only field that matters is A. So A0, C0 and E0: A is 101, C is 110, E is 111. Yeah. Okay. That's right. So the top bit is one, and these two bits can be anything, so we don't care about those bits. The B field must always be zero. The A field must always be zero... sorry, that's the C field at the bottom. So the top bit is set and the B field must be zero. The C field has to be zero because we've tested that here, but the B field must be zero too, so we want to test those. So if this expression is true, then it's one of these, and that's a return of B imm. So this chunk of code should decode every instruction here and tell us what encoding it is. And then this code is going to, given a B value, return various properties of it. Now, our two extra properties.
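To make the decoding rules just described a bit more concrete, here is a minimal C sketch of an "effective B" function. It assumes the usual aaabbbcc view of a 6502 opcode and assumes, as stated above, that the parser never hands it an invalid opcode. The names (effective_b, B_IMPLICIT, B_RELATIVE and so on) and the use of 8 and 9 as the two fake B values are illustrative assumptions, not the actual source.

```c
#include <stdint.h>

/* B column values; 0 to 7 are the real bbb field, 8 and 9 are fake values
 * for encodings the bbb field cannot express (hypothetical numbering). */
enum {
    B_XPTR = 0,      /* (zp,X) */
    B_ZP = 1,        /* zp */
    B_IMM = 2,       /* #imm */
    B_ABS = 3,       /* abs */
    B_YPTR = 4,      /* (zp),Y */
    B_XINDEX_ZP = 5, /* zp,X */
    B_YINDEX = 6,    /* abs,Y */
    B_XINDEX = 7,    /* abs,X */
    B_IMPLICIT = 8,  /* one byte, no operand */
    B_RELATIVE = 9   /* conditional branch */
};

static uint8_t effective_b(uint8_t opcode)
{
    uint8_t b = (opcode >> 2) & 7; /* bits 2 to 4 */
    uint8_t c = opcode & 3;        /* bits 0 to 1 */

    if (c == 1)
        return b;                  /* ALU block: B is used as-is */

    if (c == 2) {
        if (b & 1)
            return b;              /* zp, abs and indexed forms */
        if (opcode == 0xA2)
            return B_IMM;          /* LDX #imm is the odd one out */
        return B_IMPLICIT;
    }

    /* c == 0; c == 3 is unused on the stock 6502 and also lands here. */
    if (b & 1)
        return b;                  /* includes JMP (abs): a 16-bit operand */
    if (b == 4)
        return B_RELATIVE;         /* BPL, BMI, ..., BEQ */
    if (opcode == 0x20)
        return B_ABS;              /* JSR abs */
    if ((opcode & 0x9F) == 0x80)
        return B_IMM;              /* LDY #, CPY #, CPX # (A0, C0, E0) */
    return B_IMPLICIT;
}
```

With this sketch, 0xAD (LDA abs) and 0x20 (JSR) both come back as B_ABS, 0xF0 (BEQ) as B_RELATIVE, and 0xA0 (LDY #) as B_IMM.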
In fact, let's rephrase that as 8, 2 and 9, 2 because that will allow us to put these here. So a relative is a relative B prop, and an immediate... now hang on, this is implicit, implicit, then relative. So this will get the properties of a given B code. The instruction length: we want to add this on. But again, actually, I think we can be cleverer here. How many fields do we have left? We can actually put the size in the top two bits. So X pointer is two, zero page two, immediate two, abs three, Y pointer two, X-indexed zero page two, Y-indexed three, X-indexed three, implicit one and relative two. So this then allows us to just do this. Right, and that's not going to build, because this is actually going to be called getBProps. We're also going to add a getInsnFlags... getInsnProps. So this is going to be: we get the B props of... no, that's wrong. We get the B props of the B of the opcode, and getInsnLength here actually wants to call getInsnProps, and now we want to change getInsnFlags to getInsnProps. And we have to get rid of B prop relative. No, we can put B prop relative here, because we don't need a field for implicit, as it's never going to go through this code path. So we do want to take out this. In fact, there are no fields present there. And this is actually 10 elements. OK, clang-format has done a sterling job of formatting this table for me. Now, I forgot to take the size of the binary, so I don't know whether that's bigger or smaller. It's probably bigger because of this. Now, there is actually something I want to do, which is... do I have a canned command line? I do have a canned command line. So this is a disassembly of this program, and this is the getInsnProps function. I believe that it has inlined, looking at this, get B. Which makes sense, because this is the only thing that calls it. But it starts about 108.
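The "size in the top two bits" idea from the properties table discussed above can be sketched in C roughly as follows. The lengths are the ones read out in the video; the flag names, their bit positions and the helper names are assumptions made for the sake of the example.

```c
#include <stdint.h>

/* Hypothetical property flags in the low six bits. */
#define BPROP_ZP       0x01 /* operand is a zero page address */
#define BPROP_ABS      0x02 /* operand is a 16-bit address */
#define BPROP_IMM      0x04 /* operand is an immediate byte */
#define BPROP_PTR      0x08 /* indirect through a zero page pointer */
#define BPROP_SHR      0x10 /* shrinkable: has a shorter zero page form */
#define BPROP_RELATIVE 0x20 /* conditional branch */

/* Instruction length (1 to 3) packed into the top two bits. */
#define BPROP_LEN(n) ((n) << 6)

/* Indexed by effective B value: 0 to 7 real, 8 implicit, 9 relative. */
static const uint8_t b_props[10] = {
    BPROP_ZP | BPROP_PTR | BPROP_LEN(2),  /* 0: (zp,X) */
    BPROP_ZP | BPROP_LEN(2),              /* 1: zp */
    BPROP_IMM | BPROP_LEN(2),             /* 2: #imm */
    BPROP_ABS | BPROP_SHR | BPROP_LEN(3), /* 3: abs, shrinks to zp */
    BPROP_ZP | BPROP_PTR | BPROP_LEN(2),  /* 4: (zp),Y */
    BPROP_ZP | BPROP_LEN(2),              /* 5: zp,X */
    BPROP_ABS | BPROP_LEN(3),             /* 6: abs,Y, no zero page form */
    BPROP_ABS | BPROP_SHR | BPROP_LEN(3), /* 7: abs,X, shrinks to zp,X */
    BPROP_LEN(1),                         /* 8: implicit */
    BPROP_RELATIVE | BPROP_LEN(2),        /* 9: relative (short form) */
};

/* Hypothetical accessors. */
static uint8_t get_b_props(uint8_t b)
{
    return b_props[b];
}

static uint8_t bprop_length(uint8_t props)
{
    return props >> 6;
}
```

Packing the length this way means the instruction length falls out of the same table lookup as the flags, which is presumably why it saves a separate length function.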
Just looking to see how this works, just looking to see how big this is. Actually, these shifts here are part of this, I think. Yeah, they are. And 151D here is obviously the address of our flags table. So it's done quite a lot of inlining. But anyway, we start at 108D and the end of our function appears to be 1122. So what I was wondering is whether it is worth eliminating all of this in favor of a 256-byte lookup table. But no, this code is a lot smaller than that. OK, so that does produce the same code as before, which is nice. All right, a few other things we want to change. We no longer have an ALU record; we have an expression record, because we're going to be using this record for any opcode that requires a complex expression as a parameter. So, place code, here. Now, to figure out what to do here, we want the... yeah, this is actually correct. This is: if this thing is referring to a zero page value and the opcode is shrinkable, shrink it. So here we want to get the B value of the opcode and get the B props of the B. In fact, we don't need the B itself; we just need the B props. So we can do that. So, OK: shrink anything which is pointing into zero page. So, if B props and B prop relative: this is the place where we have to figure out how long our branch instruction can be. If it is in range, then it's going to be two bytes, because it's one of these. If it's out of range, it's going to be five bytes, because it's a negated branch plus a jump. So here we want to calculate the delta, which is going to be the difference between the current program counter and the target. We can only branch to something in the text segment. So if there is no variable, or the variable is not referring to the text segment, then we give up; that is not allowed. OK, so we now know that we have an address, which is the contents of the... no, we've already... let me think.
Have we written the address into the offset of the text symbol? Yes, we have. Therefore, the expression code here has... this is wrong. So what I was trying to do here is: if we reduce an expression to a computed node, then we look at the base variable and add on the offset; otherwise, we just take the offset from the variable. But that is incorrect. So there are two cases. The first case is that we are jumping to a label. What we want to end up with is: variable is label, offset equals zero. The other case is we're doing this, at which point we want variable equals label, offset equals two. Now, notice that in neither case are we taking into account the actual address of the variable here. So in fact, rather than picking up the offset out of the variable, which is where the symbol's address is being stored, we actually want to load it from here. And then here we add on the value in the immediate expression. So that should be jump A, and there is in fact a third case, where we have A as label plus two and then jump A plus one, at which point we want variable equals label, offset equals three, because it's the sum of these two. OK, so, in our length calculation code, which is here, we want the target address, which is the address of the variable plus the value in the expression, minus the current program counter, which is apparently zero. Well, when we process the BEQ, the program counter should be zero and label should not be zero. Print all the values: all zeros. Fantastic. They should not all be zeros. This is the first time through place code, so it hasn't actually set any of the addresses in any of the symbols, which is irritating. OK, we're going to need to keep track of which pass we're running in, because pass zero is going to be special. No, I hate that. So now in place code, if we're in pass zero, then the length is always going to be five. Otherwise, go through the main code.
So we should now... and we always want to do another pass, so we tell it that it's always changed. So now it should do two passes. There we go. And the second time around, we have actually set it: it has already calculated the address of label here as address six. Therefore, our offset for the branch is six. So this is five bytes, this is one byte, and therefore label is at address six from the beginning of the text segment. So that is producing the right result. So once we have our delta, essentially, if the delta is in range, we know this is going to be three bytes long. If it's out of range, then we know it's going to be five bytes long. But this is also not going to work. The reason it's not going to work is that on the second pass through, we need to know whether this has made the instruction shorter. If it's made it shorter, if this is now less than five, then we need to set changed so that we can do another pass, because the addresses of everything after this will have changed. And it's not sufficient just to do this, because on the third pass it'll go through here, increment PC by three and set the changed bit, so it'll just keep iterating forever doing passes. We have to track whether this instruction has, in fact, been shortened. So how are we going to do that? Where is our expression record? I think we need a byte of flags in here. It seems a waste for only one bit. Can we stash it in the opcode? Not really. I mean, we have opcode values that are invalid; we could simply pretend that these are the corresponding long forms of these. But that will make life difficult when it comes to the 65C02, because there are opcodes in here. Can we put this in the variable? No, we can't. And the offset's in use. I think we're just going to have to put another field in, a length field, which has got the old length of the expression thing. So in fact, we're in the wrong place. We want place code.
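The pass logic described above (always five bytes on pass zero, shrink when the target is in range, and only flag a change when the stored length actually drops) can be sketched like this. It is a simplified stand-in, not the real place code: BranchRecord, place_branch and the 0xFF "not yet placed" marker are assumptions, and it uses the two-byte short form and the "PC after the branch" base that the 6502 actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down expression record for a branch. */
typedef struct {
    uint8_t opcode;
    uint8_t length;  /* length from the previous pass; 0xFF = never placed */
    uint16_t target; /* label address plus expression offset */
} BranchRecord;

/* Decide the branch's length for this pass. Returns the number of bytes to
 * advance the program counter by, and sets *changed when the instruction
 * got shorter, so the caller knows another pass is needed. */
static uint8_t place_branch(BranchRecord* r, uint16_t pc, int pass,
                            bool* changed)
{
    uint8_t len = 5; /* inverted branch over a JMP abs */

    if (pass != 0) {
        /* Symbol addresses are only meaningful after pass zero. A short
         * branch's offset is relative to the byte after the instruction. */
        int delta = (int)r->target - ((int)pc + 2);
        if (delta >= -128 && delta <= 127)
            len = 2;
    }

    /* Only ever shrink. Comparing against the stored length is what stops
     * the "changed" flag being raised on every later pass. */
    if (len < r->length) {
        r->length = len;
        *changed = true;
    }
    return r->length;
}
```

Starting the stored length at 0xFF means the very first placement always counts as a change, which lines up with wanting a second pass unconditionally.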
So OK, if there is no length set... no, we can do that elsewhere. OK, if the length has changed, record the old length and exit. OK, and we have some code here for adding an expression record, where we're going to set the length to max int for a byte, so that when we actually calculate the length, it's always going to be smaller. OK, so now what do we get? F0 04 60. And the label here is zero, one, two, three. So that's the wrong thing. That's because down here in write code, we actually want to emit the delta. This is all getting annoyingly complicated, and we have spread the logic over quite a lot of the program. So we've got bigger, much bigger. Have we gotten five, six, two, three, five, eight, eight to two, two hundred bytes longer. OK, so down here in write code, we actually want to fetch the B props of the opcode. So if it's a relative branch, then we do the relative branch thing. Otherwise, we know that it's a normal two- or three-byte value and we know how to emit those. The one difference is that we actually have the length of the instruction here, so instead of having to call that, we can just say: if the length is three, do this. Otherwise, we're up here, and going to steal this delta line. Well, we want to say if the length of the opcode is three, it's a relative branch, so we're going to do write byte of the opcode, write byte of the delta. Otherwise, if it's not three, it's five, which means we want to do a long branch. So that's going to be write byte of the opcode XORed with... because remember, I said that we wanted to flip the bottom bit of the A field to negate the condition. So this is where we're going to do it. So this is bit five. Then we are going to jump three bytes ahead. Then we are going to emit a jump instruction, which is a 4C. Then we are going to emit the target address, like so. We haven't kept track of the program counter in the write code stuff. We need to do that. That's irritating.
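Here is a small C sketch of the two emit paths just described: a short branch, or an inverted branch that hops over a JMP to the real target. The output buffer, write_byte and emit_branch are hypothetical names for this example, and the sketch uses the two-byte short form that the 6502 actually has rather than the three assumed at this point in the video.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer standing in for the assembler's real writer. */
static uint8_t out[16];
static size_t out_len;

static void write_byte(uint8_t b)
{
    if (out_len < sizeof out)
        out[out_len++] = b;
}

/* Emit a conditional branch located at address pc and aimed at target.
 * length is the size chosen during placement: 2 (short) or 5 (long). */
static void emit_branch(uint8_t opcode, uint16_t pc, uint16_t target,
                        uint8_t length)
{
    if (length == 2) {
        /* Short form: the offset is relative to the byte after the branch. */
        write_byte(opcode);
        write_byte((uint8_t)(target - (pc + 2)));
    } else {
        /* Long form: flip bit 5 to negate the condition, skip the three
         * bytes of the JMP when the original condition is false, then
         * JMP abs (0x4C) to the real target, low byte first. */
        write_byte(opcode ^ 0x20);
        write_byte(3);
        write_byte(0x4C);
        write_byte((uint8_t)(target & 0xFF));
        write_byte((uint8_t)(target >> 8));
    }
}
```

For a BEQ at address 0 aimed at 0x00DB in long form, this produces D0 03 4C DB 00: a BNE over the jump, then the jump itself.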
Luckily, we actually have the value ready to hand in the length field there. And those are the only two records we have that emit stuff. So now what do we get? We still get a four. Why did we get a four? OK, it thinks the address of the label is four, which it is. That's correct. That's doing the right thing. No, it's not, actually. That's because a branch instruction is two bytes long, not three. There we go. F0 03 is branching to this address. Good. How do we now branch to distant addresses to test that? If we do this, do we have enough working for that to work yet? We haven't done a jump. It should work. F0 05. LDA zero. So something is wrong there. That's done the right thing. That has not done the right thing. So that wants to be an else if: if this is an expression pointing at zero page, and it's shrinkable, then the result must be two bytes long. Otherwise, we go with the default length. So I think that our default length might be wrong. Yeah, it is two bytes. Let me try that one again. It is two bytes. OK, so this should be an LDA with an absolute value; that will be one of this column. That would be AD. So B equals three, abs, this one. So it should have a size of three. One one, and is abs and shrinkable. What's the opcode? It has produced an A5. That's the wrong opcode. That's a zero page opcode. So that would be a problem with the parser. So if there is not a token variable, which there isn't, and the token value is small enough to fit, then it's zero page. Which this is not. So have we correctly parsed the token value? We have not. Why not? We know we can parse numbers. Our token value is the right length. There's nothing complicated here. This should work. We know it has worked in the past. So is it something wrong with our expression parser? We read the token. This should set token value. And then we return a number. Yes, that is correct. That has, in fact, not worked at all.
Let me double check the source. Oh, 0x. I think that might not be working. Let me try this. No, that hasn't worked either. OK, so we can see it accumulating the value. Let me just put that back into hex to make it more obvious what's going on. And then it will return token number, and then we get number equals zero. So unless... OK, token value is 1234 when we return from read token. We call read token here. Interesting. C, then. C is two. Two is a number. This does not make any sense at all. So we are here. We have just read the first token, which is LDA, and it has returned in the A register a one, which is a token ID. So we go again and set our program counter here. We have now just received a number, so A equals two. So we compare with two. We're now here. Jump to Mono F8, which is the end of the function, RTS. And we should be down here somewhere; we're somewhere inside... we've just finished parse expression, so we're probably here. So we're now going to read the token value variable into A and X, which is zero and zero. So now it's zero. OK, we're here. 2C34, 2C35. Right. 1234. We've just read token value. So it's now doing something. This will be the printf happening. Then it does a JSR. It does something with a two. Right, the two is the token number. So it does the print, which is this JSR. And then we end up here. So we put a breakpoint here and go. OK. Now we look at where our token value should be, and it's 1234, which it should be. OK. So we've just jumped to the end of the function, which should be more or less here. Don't know what this is doing. OK, it is actually going through the return process, because there's our RTS. OK, we're now here. So let's look at our value. 2C34. It is still 1234. So, OK, well, I found something interesting out. So here's a memory dump. This looks like the source code from our program.
And there is the initial tab. So this obviously loads at 2BB4. 2C34 here is where our variable is. And I can see it's just been erased. I had a dump up here somewhere; I can't remember where. I did verify that it was actually in memory. So we've got 2BB4 plus, and the buffer size should be 128, 2C34. So four bytes have been overwritten at the end of the buffer, which is very suspicious. The question is, of course, why? Well, let's take a look at our globals. Where is token value? So token value is here. We are using parse buffer as our output buffer. Hang on a second. I've got some symbols. OK, so 2C34 here is BSS start, which is the start of the uninitialized data area, which is where we would expect to find variables. Now, where is output buffer? 2C5F, which is here. We haven't done anything with that yet. So this is the parse buffer. You can see we've just read a zero. Yeah, that's the initial zero of the 0x. We didn't use the parse buffer for the rest of the number; we just read it directly. You see here, there's a fragment of an LDA. Therefore, I bet that this is the input buffer, 2B8E. That's a very odd place for it. Buffer. Where is our input buffer? There we go. CP/M default DMA. OK, this is a 128-byte... 2BB4. Yeah, this is a 128-byte block of memory that is given to you by CP/M. It normally contains the command line. So that is at 2BB4. So why would it be overwriting the next thing? Now, I don't know, to be honest. Let's take a look at our read code. Well, this sets the address the data is being read to. It then reads another sector. Our file is only one sector long. This should be fine. I know what's happening. So, I took some time to add watchpoint support to my emulator and did some debugging and added some printfs. And then you can see that it is reading our token value of 1234.
But then it's calling parse expression start, which is surprising, because it should have parsed the number here. But of course, read token here may not actually parse the number then and there. It may have been parsed previously and then pushed back onto the token stack, so, with peek token. So in fact, we have a push token here. So that is exactly what has happened: we push the token back onto the stack, then we call parse expression again, which of course immediately discards token value and token variable. So what we're going to need is these. So when we push a token, we actually also push these, and when we read that, we then set our token variables. So now when we run it, we get read token 1234. Then we get parse token start. Then we get parse token 1234, indicating that we've got the right number. So let me just get rid of a lot of this tracing and take a look at our source file. So F0 06, that is the branch instruction; AD, which is LDA with an absolute value; then 34 12. OK. So we want to waste 200 bytes. So if we add, call it 70 of these, like so, that should add enough padding between the branch and the label to put the branch out of range. So what do we have? We have D0, which is BNE, so it's flipped the condition. Then we have 4C, to 00DB. 00DB. Well, DB is there. So it has, in fact, generated a branch to the wrong place. No, it hasn't. 4, 8, A, B. No, no, it has... it's put it in the right place. It's just we have garbage at the end of the file. Excellent, that has worked. It's probably not worked correctly. Let me just see if I can find... for reasons I will explain. And let me just try and find a 6502 disassembler. OK, I have a disassembly of our file here. So it has... that's interesting. Oh, yeah. It's turned the BEQ into BNE, and then it's put the jump. So it has done the conditional jump to the right place, which is the instruction immediately following, which is good.
But the actual jump instruction going to DB has landed one after the RTS, which isn't so hot. There's also a CD here. That's because it's disassembled garbage, and it thinks it's a genuine jump there. I think you can tell it to stop that; I can't be bothered to find it now. So the question is, why has it jumped to the wrong place? Well, let's take some of these out. I'm going to assemble it again, and then disassemble it again. So that has branched to, well, after... the wrong place. This suggests to me that we are getting the instruction lengths incorrect. So down here in record expression, this is our three, and then we just put down the flat address. And the address should be right. So we see an address of nine for that, and that has in fact branched to B. That seems odd. So this one worked. I know what's going on. I've forgotten to take into account the size of the branch instruction. So here we're emitting a constant three, which means jump over the three bytes of jump instruction. So the point where things are calculated from is after the branch. So let's put a plus two in there and see if that makes a difference. Let's try a minus two and see if that makes a difference. And that has jumped to the right place. And now if we put all these back again, that is jumping to D9, which is also the wrong place. Great. Okay, so let's change that to... we don't have a jump instruction yet. Let's change that to LDA label and see if that produces anything relevant. LDA D9. D9 is correct. Okay, so if this is a long branch, it is getting the lengths confused, meaning that it is failing to put things in the right place. Okay, so let's take another look at this. If it's a relative branch, we start at five and we go down to either two or leave it at five. When we write, we always increment the program counter by the recorded length of the instruction. All the zero page stuff will be wrong, but I'm ignoring that for now.
Zero page relocation stuff will be wrong. So what are the actual instruction lengths? Which should be right? Record bytes here has advanced the program counter by what looks like the correct number of bytes. What about here? That looks right. So why is this wrong? Because I'm an idiot. I put the minus two in here rather than down here, so the address was always going to be wrong. Let's try that again. Okay, I've changed my program slightly, so it's now got an LDA and the BEQ to the same address. You can see that the LDA is using address 0B. The jump is using the address 0B. 0B is the instruction after the RTS. The branch here is jumping over the jump. This is all fine. It is now working. Given that that particular mistake was really stupid and I've been doing this for a while now, this seems like a good stopping point. Let's drop this off. Rebuild. Let's see what the damage is. Yikes. 6K. But we do at least have most of the assembler. Just wondering what I can do to simplify or shrink things. But let us look at that next time. So see you then.