Greetings, 6800 friends. This is video number three in the series on coding up a 6800 CPU in nMigen. Just a few words before we get started with the actual coding, about some of the comments on the previous video.

First, this board does have an external flash for the FPGA. It's right over here. It stores the program for the FPGA, which implies that on power-up the FPGA loads its configuration from the flash memory, which I guess means there is some small startup delay. I don't know exactly how fast it configures itself, but there you go.

Some of the other comments asked why you would choose nMigen over, say, Verilog or VHDL. To that I just say: personal preference. I just want to use a different language. If you would like to use a different language too, you can go right ahead. And you can take your language snobbery and... subtract six internet points.

On that note, why don't we get started with some coding? Here is a file that I'm calling core.py. It's going to contain the core of the CPU, so it doesn't include things like the pins, tri-states, that sort of thing. I start with a whole bunch of imports that are generally useful, and the class is just going to be called Core. I'm putting a comment in here to remind myself exactly what the core is and what it does not include, specifically I/O for the actual pins.

This is, again, the skeleton of the framework. I have an init method, a ports method that returns a list of signals, and the elaborate method, which creates a module and returns it. In main, I start up the main parser and parse the args, and at the end I call main runner. I'm also creating a top-level module and adding the core to it as a submodule. I'm also defining a new clock domain that I'm calling phase one (ph1), and I'm adding that to the module's domains.
Now I can do things like m.d.ph1 to execute things on the positive edge of that clock domain. I'm defining a reset signal here, and on this line I'm just extracting the clock signal out of the clock domain. On this final line I am basically saying that phase one doesn't have its own reset: its reset is going to be this global reset. I define a global reset in case I want to use it for simulation. Finally, the ports over here consist of the core's ports plus the clock and the reset. That's really all I wanted to do here.

The reason I'm not defining phase two at this point is that I think, for now anyway, phase two is really for things like pins, specifically for sampling what we're reading on an edge of phase two. Everything internal, I think, happens on phase one.

Now I'm going to define some signals. I'm going to have a 16-bit address signal, and data in and data out. Remember that when you're not dealing with pins and you have bidirectional signals, you really have to split them into inputs and outputs, so here I have data in and data out. I'm also going to have a read/write signal, and I should say that one is read and zero is write. This does correspond to the definition of the read/write pin on the CPU, but here it really just says whether I am looking for input from the address or I definitely want to write data out to the address. If read/write is one, I'm just reading; if read/write is zero, I definitely want to output whatever is on data out to the address. I specify that the read/write signal should reset to one, because obviously I don't want to come out of reset and have it write. That wouldn't make any sense.

The next thing I'm going to do is define all the registers. Here we have registers A, B, X, SP, and PC.
We know we're going to need an instruction register, and we know we're going to need a temporary 8-bit register and a temporary 16-bit register. I've defined all of these as reset-less. What that really means is that when the system first starts up, I believe they will all get either random values or zero values, but when you reset the processor after that, that is, when you toggle the reset line, none of these registers will be reset to a specific value. The reason I want this is that during formal verification I may want to toggle reset, but I really want the registers to have random values, or values that formal verification sticks in them, in order to test instructions and make sure they work. That's really the only reason I made them reset-less. If I don't specify reset-less, the default reset value is zero, so you can see here that address, data in, and data out are all going to default to zero.

Next I'm going to define the buses that we talked about way back in part one. We're going to have an ALU with two inputs, so these two 8-bit signals, source 8-1 and source 8-2, are going to be the inputs to the ALU, and the output of the ALU I'm just going to call ALU 8. Similarly we have a 16-bit bus, source 16, which goes to a 16-bit increment/decrement unit. The output of that unit I'm just calling inc/dec 16. Those are my buses.

The next thing I need is some way of specifying what goes onto these buses and where these buses write to, so I'm going to define what I call selectors. Let's take a look at the first one. It's called source 8-1 select, it's a signal, and its shape is an enumerated type that I'm calling Reg8. Let's look at what that is. Here is my Reg8 enumerated type.
It's an integer enumeration (an IntEnum), and I've specified all of the 8-bit values that I think we're going to need. NONE basically means doesn't apply. Then we have A and B, the high and low bytes of X, the high and low bytes of SP, and the high and low bytes of PC. We have our temporary 8-bit register, the high and low bytes of the temporary 16-bit register, and then data in and data out. I've just assigned them unique numerical values.

Down here, what this effectively means is that in order to get a value onto source 8-1, I select one of those enumerated values, and whatever that register is goes onto source 8-1. Same thing with source 8-2.

Now, ALU write. What I really want here is that any of these registers, say A, B, X high, X low, can be written from the output of the ALU. And I should be able to write any number of them at once, not just one, which means I basically need a bitmap. So here I'm taking the length of Reg8. We have 14 values, so this is a 14-bit signal, and I set a one for each register I want to write from the ALU output. We'll look at that in a moment.

I want to do the same thing for 16-bit values, so I've set up a Reg16 enumerated class. NONE again means doesn't apply, and then we have our 16-bit values: X, the stack pointer, the program counter, the temporary 16-bit register, and the address.

For now, that's good enough. Let me replace the ports with address, data in, data out, and read/write. Those are really the only public ports I'm interested in. When I do formal verification, of course, I'll want to know the state of the registers so that I can make sure they were set to the correct values, but for now this is fine. So let's start on the elaboration.
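As a rough illustration of the two enumerations and the write bitmap described above, here is a minimal plain-Python sketch. The member names, the numeric values (0 through 13 and 0 through 5), and the helper write_mask are assumptions made for this example; the video only says that the values are unique and that Reg8 has 14 members.

```python
from enum import IntEnum


class Reg8(IntEnum):
    """8-bit sources and destinations. Names and values are assumed."""
    NONE = 0
    A = 1
    B = 2
    XH = 3
    XL = 4
    SPH = 5
    SPL = 6
    PCH = 7
    PCL = 8
    TMP8 = 9
    TMP16H = 10
    TMP16L = 11
    DIN = 12
    DOUT = 13


class Reg16(IntEnum):
    """16-bit sources and destinations. Names and values are assumed."""
    NONE = 0
    X = 1
    SP = 2
    PC = 3
    TMP16 = 4
    ADDR = 5


# Width of the "write" bitmap: one bit per enumerated register.
ALU8_WRITE_WIDTH = len(Reg8)


def write_mask(*regs):
    """Hypothetical helper: build a bitmap with one bit set per register."""
    mask = 0
    for reg in regs:
        mask |= 1 << reg
    return mask


# Writing A and B at the same time sets bits 1 and 2 of the bitmap.
mask_ab = write_mask(Reg8.A, Reg8.B)
```

A selector holds exactly one enumerated value, so only one register can drive a source bus at a time, while the bitmap allows any combination of destinations.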
I want to start by implementing the source 8-1 bus. Basically, I want to look at source 8-1 select and copy whatever register it names onto source 8-1. I could do something like this. I start with a switch statement: if source 8-1 select is set to A, then set the source 8-1 bus to self.a. Then I can just copy that for B and so on. And of course I have a default, with m.Default, where I set the source bus to zero.

I would need to fill in all of these cases by hand. But this is Python, and remember that with nMigen we are writing code that generates the HDL. So instead I'm going to define a function that I'll call, say, source bus setup. It needs the module, so that I can refer to m.d.whatever, and I want to pass in the bus that I'm setting up and the signal for its selector. I could copy all of that in here: the thing I switch on is now the selector, and the bus, instead of self's source 8-1, is just bus, and so on. I haven't really saved myself any work yet, but bear with me. This is getting somewhere.

To actually put this logic into the module, I just call self's source bus setup and pass it the module, the bus I'm interested in, which is source 8-1, and its selector, source 8-1 select. The nice thing about doing it this way is that I have two buses, so I can just copy the call and that will copy the logic. Now I've set up my two buses.

I would like to use a for loop to generate all of these cases, so here's something like what I want. I want some sort of map that maps the enumerated value to the actual register.
Now I can just say: with case e, where e is the enumerated value, in the combinatorial domain, copy whatever register that is to the bus. Obviously I need a register map, so I'm going to pass the register map in here. And I still need to construct it, so let's create one.

This is my map of 8-bit registers. It's a dictionary from the enumerated value in Reg8 to the signal that corresponds to that 8-bit register. I've also added a Boolean. The only case where it is false is Din, and the reason I need this, eventually, is that data in is not something I can write to. In other words, it's a read-only register. True means that I can write to it and false means that I can't.

Let me take self's Reg8 map and put that in the calls. And here, the first element of the tuple is the actual signal. Now I've got the type correct: this is a dictionary from IntEnum to a tuple of Signal and bool. The first element is the signal itself, and the second is whether it's read/write or read-only. Because I'm reading onto the source bus, I don't need to pay attention to that Boolean here. For the destination, obviously, I do.

So let's write something to set up the destination. Again, it takes a register map, the bus that I'm writing from, and the bitmap. For every element in the register map, as long as it isn't a read-only register, if the bit for that enumerated register is set in the bitmap, then I write the first element of the tuple, which is the register I want to write to, from the bus on phase one. In other words, I can set any bits I want in this bitmap, and on the next positive edge of phase one, whatever registers those are will get written from the bus.
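The source and destination setup described above can be modeled in plain Python to show the idea of looping over a register map instead of writing every case by hand. This is only a behavioral sketch, not nMigen code: registers are dictionary entries rather than signals, the enumeration is trimmed to three registers, and the names REG8_MAP, src_bus, and dest_bus are made up for the example.

```python
from enum import IntEnum


class Reg8(IntEnum):
    """Trimmed-down enumeration for the example (values are assumed)."""
    NONE = 0
    A = 1
    B = 2
    DIN = 3


# Map from enumerated value to (register name, writable?).
# DIN is the only read-only entry, as in the narration.
REG8_MAP = {
    Reg8.A: ("a", True),
    Reg8.B: ("b", True),
    Reg8.DIN: ("din", False),
}


def src_bus(state, reg_map, select):
    """Combinational read: the selected register's value, or 0 by default."""
    for enum_value, (name, _writable) in reg_map.items():
        if select == enum_value:  # plays the role of one generated Case
            return state[name]
    return 0  # plays the role of the Default case


def dest_bus(state, reg_map, bus_value, bitmap):
    """Clocked write: return the register state after one ph1 edge."""
    next_state = dict(state)
    for enum_value, (name, writable) in reg_map.items():
        if writable and (bitmap >> enum_value) & 1:
            next_state[name] = bus_value
    return next_state


state = {"a": 0x11, "b": 0x22, "din": 0x33}
selected = src_bus(state, REG8_MAP, Reg8.B)
# Request writes to A, B, and (illegally) DIN; DIN must be left alone.
after = dest_bus(state, REG8_MAP, 0x55,
                 (1 << Reg8.A) | (1 << Reg8.B) | (1 << Reg8.DIN))
```

Because neither function cares how wide the registers are, the same pair works unchanged with a 16-bit register map.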
In this way, I can write registers A, B, the high byte of X, and the high byte of PC all at the same time from the destination bus. So now let's call it.

Because I have not actually said that the register map has to be an 8-bit register map, I can use these same functions to set up the 16-bit buses. First, I'm going to have a Reg16 map. None of these are read-only: I can always write to X, SP, PC, temp 16, and address. Now all I have to do is copy one of these calls. There's the Reg16 map, the source is just called source 16, and its selector is source 16 select. That sets up putting a value onto the bus.

Now we have two destination bitmaps: source 16 write and inc/dec 16 write. Remember that inc/dec 16 is the output of the 16-bit incrementer/decrementer. If I don't want to use that, I need to write directly from the source 16 bus. So I need to set up two destinations, both using the 16-bit map. The first is written from source 16, and its bitmap is source 16 write. The other bus is inc/dec 16, and its bitmap is inc/dec 16 write. Those are all the buses.

The thing is that I could, in theory, set up source 16 write and inc/dec 16 write so that they have the same bit set. Then you run into a problem: on phase one you write something, let's suppose the PC, from source 16, and then you write it again from inc/dec 16. The way nMigen works is that the last assignment takes precedence. In other words, if I were to write the PC from both buses, the only write that actually takes place is the one from inc/dec 16. Again, in theory that should never happen, and I can always check that using formal verification.
But for now, I think this is good enough.

Now, in terms of what happens to the processor when it goes into reset, we can see in the data sheet that there is this table of vectors. One of the vectors is called restart, and it is located in memory at FFFE and FFFF, the very last two bytes of addressable memory. From the diagram above the table, you hold the reset line low, which puts the processor into reset because it's an active-low signal, and a whole bunch of cycles go by. Apparently you have to keep reset low for at least eight clock cycles. Then you can release reset. What happens next is that FFFE appears on the address lines, and we read the data at address FFFE. That will be the high byte of the PC we're going to start at. After that's read, we output FFFF on the address lines and read the low byte of the PC. Once we've got the PC constructed, we output it to the address lines, and that's where we start reading the very first opcode of the program.

There's another, slightly more complex diagram over here showing exactly what happens. As you hold the reset line low, eventually the address bus gets FFFE and remains there until the reset line goes high, at which point the address lines remain at FFFE, except that we now do a read. Then they go to FFFF and we do a read, and then we go to the new PC constructed out of those two bytes. We can always reset the processor after that, and again the address lines will go to FFFE, and then you release reset and they go to FFFF, et cetera. That is what we need to do in our processor.

To begin with, I'm going to define an internal state that I'm just going to call reset state. When reset gets asserted, any signal that doesn't have a specific reset value and is not marked reset-less gets reset to zero. So reset state starts at zero.
The idea is that once you release reset, reset state zero will cause FFFE to be output on the address lines. Then we increment reset state to one, and that will cause FFFF to go out on the address lines. Then we increment reset state to two, and so on. Eventually we hit a sort of run state, where we remain until the next reset.

In elaborate, I'm going to call a function that I'm just calling reset handler. This will set up all the logic to reset the processor. Here's the first part of it. It gets the module, and what we do is switch on the reset state. For reset state zero, on the next positive edge of phase one, we set the address lines to FFFE. I'm also going to set the read/write line to one to make sure that we are still doing a read, and we increment the reset state to one.

When we output FFFE on the positive edge of phase one, that's when the memory read starts. So in the next state, I want to read that byte in and store it somewhere, and also increment the address lines. We can see that here: in reset state one, I set the address lines to FFFF and the read/write line to one, and I store whatever is on the data in lines into temp 8. And of course I increment the reset state to two.

During state two, we're reading the low byte of the program counter, and here's what state two does. First I'm defining this statement with Cat, which concatenates values. This is the low part and this is the high part, so really I'm constructing a 16-bit value. The low byte is whatever I've read in on data in, and the high byte is whatever I previously stored in temp 8. This is basically the program counter's reset value, and on the next positive edge of phase one I'm going to load it into the program counter.
I'm also going to load it into the address register, because I know that's what I'm going to be reading next. I set the read/write register to one, and I set the reset state to three. I'm not going to have a case three for the reset state, because I want to remain in state three, so there would be nothing to do. In fact, way back here after the reset handler, I can say: if the reset state is three, in other words we're now running, then just start decoding. For now I'm going to create the decode function as doing nothing. I'm not going to write any decoding at this point.

So let's simulate this and take a look at what a reset looks like. I'm going to comment out the main runner and add a fake memory. This is just a dictionary of address to data, and as before, I'm going to generate some logic that implements this fake memory. I switch on whatever the core's address lines are, and then I iterate through the memory dictionary. There's address and data, and for every such pair, for case address, I simply load data in with the data at that address. What happens if there's an address that's not in here? Well, I want to put something on the data input lines anyway, and what I've decided to do is return FF, just for fun, because F stands for fun. So this is my fake memory.

Now let's actually hook up some signals. The first thing I'm going to do is create the simulator and add a clock to it, which is going to tick at one-microsecond intervals in the phase one domain. The next thing I'm going to do is add a synchronous process, called process, synchronized on phase one. We'll define that in a moment.
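For reference, the reset sequence and the fake memory described above can be modeled in a few lines of plain Python. This is a behavioral sketch under stated assumptions, not the nMigen code from the video: the class name ResetModel, the method clock, and the example memory contents are invented for illustration, and the reset-less registers are simply initialized to zero here.

```python
# Example memory contents chosen for illustration: the restart vector at
# FFFE/FFFF points at 0x1234, where a single byte 0x01 lives.
FAKE_MEM = {0xFFFE: 0x12, 0xFFFF: 0x34, 0x1234: 0x01}


def read_mem(mem, addr):
    """Fake memory: unmapped addresses read back as 0xFF."""
    return mem.get(addr, 0xFF)


class ResetModel:
    """Models only the reset state machine, one call per ph1 edge."""

    def __init__(self):
        self.addr = 0          # resets to 0
        self.rw = 1            # resets to 1 (read)
        self.reset_state = 0   # resets to 0
        self.tmp8 = 0          # reset-less in the video; 0 is an assumption
        self.pc = 0            # reset-less in the video; 0 is an assumption

    def clock(self, mem):
        # Din is whatever memory returns for the address currently driven.
        din = read_mem(mem, self.addr)
        if self.reset_state == 0:
            self.addr, self.rw, self.reset_state = 0xFFFE, 1, 1
        elif self.reset_state == 1:
            self.tmp8 = din                    # high byte of the vector
            self.addr, self.rw, self.reset_state = 0xFFFF, 1, 2
        elif self.reset_state == 2:
            new_pc = (self.tmp8 << 8) | din    # same role as Cat(Din, tmp8)
            self.pc = new_pc
            self.addr, self.rw, self.reset_state = new_pc, 1, 3
        # No case for state 3: the model just stays in the run state.


def run_reset(mem, clocks):
    """Clock a fresh model a number of times and return it."""
    model = ResetModel()
    for _ in range(clocks):
        model.clock(mem)
    return model


model = run_reset(FAKE_MEM, 3)
```

After three edges the model has driven FFFE, then FFFF, and has loaded both the PC and the address lines with the vector it read.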
At the end of main, we write the VCD file and then run the simulator. The traces that we're outputting are just whatever the core's ports are.

So what is the process that's going to stimulate the simulation? Here it is. All I'm doing is a bunch of yields, which for a synchronous process cause the process to wait for the next edge. Remember that there's always one edge before we start the process. So there's two edges, three, four, five, six edges. Basically, when it comes out of reset, which is what happens when the simulation starts, we wait for six clock cycles to go by. Then I assert reset, wait one clock cycle, de-assert reset, and wait another bunch of clock cycles to see what happens.

Let's run the simulation. The simulation has run, so now let's see the output. And I just forgot to start the X Window server. I'm doing something different from last time, when I ran GTKWave on Windows. Here I'm running GTKWave under WSL, and in order to do that I need an X server started.

All right, here's the output. Again, it's in the picosecond timescale and we are operating in the microsecond timescale, so let's zoom out and go to the very beginning. Here we get the very first clock edge that always happens. We can see that out of reset, the address lines have gone to FFFE, we're in reset state one, and the fake memory is outputting 12. On the next clock cycle, we enter reset state two. We've output FFFF to the address lines, we've stored 12 in temp 8, and the data lines are outputting 34. On the next clock cycle, we go into reset state three. The PC has now been loaded with 1234, and the address lines have also been loaded with 1234.
And the fake memory is responding with 01, which would be our first instruction. So that's what reset did.

Now, if I scroll a little further along to the point where we reset... let's see, where's the reset line? Here's the reset line. We reset over here, and then we come out of reset. Once we come out of reset, the address lines have reset themselves to zero, and you can see that the memory is outputting FF. But then we enter reset state one again and output FFFE, the memory outputs 12, and so on, until finally we jump to the reset vector, which is 1234, and we start again. Again, we're not doing any decoding, so nothing actually happens on the following clock cycles. But the point is that we've written code to simulate putting the processor into reset, we've looked at the waveforms, and we see that the processor does indeed do the right thing.

What we really have to do is use formal verification to make sure of this. After a reset, we want to assert that the first address is FFFE, the second address is FFFF, and, whatever the memory responds with, the next address consists of those data bytes. So let's see if we can do some formal verification of this.

Here's some formal verification. First of all, I have added this little switch so that I don't have to keep commenting out the simulation part and the verification part every time I want to switch from one to the other. Here, basically, I'm looking at the past, four cycles ago, which is what this extra argument to Past means.
So if four cycles ago reset was high, and three cycles ago it was low, and two cycles ago it was low, and one cycle ago it was low, then assert that two cycles ago the address was FFFE, one cycle ago the address was FFFF, and right now the high byte of the address is what Din was two cycles ago and the low byte of the address is what Din was one cycle ago. That is my check to make sure the thing does exactly what it's supposed to do.

I've also added a cover statement. I want to see what address it can come up with when it comes out of reset. I want the address to be nonzero, because zero is the reset value for address, and I want the address to be less than FFFE, because FFFE and FFFF are where the reset vector is read when it comes out of reset. So let's see what happens.

First I generate the code, and I'm going to call it core.il. Now here's the SBY file. I'm going to run BMC and cover, and I'm just going to run for a depth of 10 clocks. You can see that I'm reading in core.il, and the files section lists core.il. Let's go ahead and run formal verification on core.sby. We can see that cover passed, and BMC passed as well.

Cover has a trace, engine 0, trace 0, in the core_cover directory. So here's GTKWave. Let's take a look at the clock, reset (okay, so it didn't manipulate reset), the address, and data in. We can see that when it came out of reset it went to FFFE, then FFFF, and then to address 0001, which is fine. So that really is formal verification.

I suppose one thing I should also do at some point is verify that core.pc is the same as core.address over here. Let's just add that and rerun formal verification: first compile, then run verification, and indeed BMC still passes. So that's good. And then I can always change the cover statement to PC.
Let's see what happens if I do that. Recompile, run, and open GTKWave. Let's go into the core, first pull up the clock, and take a look at the core's PC. It went to 3FFF. Let's see what the address lines are: 3FFF. So this time it chose a reset vector of 3FFF. That's fine. So that is formally verified.

In order to support actual processing of instructions, I have added this cycle signal, which can go from 0 to 15 and tells us where in an instruction we are. We always start at cycle zero. So the first thing I need to do is make sure that when we come out of reset, cycle gets reset to zero. It should really do that automatically, because by default registers are reset to zero, so I shouldn't need to, but I'll do it anyway to be explicit.

The next thing we're going to do is add some logic after our reset handler to handle fetching and executing. Here's the code block. The idea is that if we're in reset state three, which basically means we're running, and we're in cycle zero, then we do a fetch of the opcode. Otherwise, we execute whatever instruction we got.

Let's define the fetch method first. Here it is. Remember that when we come out of reset and we're in cycle zero, the program counter is already being output on the address lines. At some point during the cycle, the memory comes back with the data at that address, and that appears on data in. So at the end of the cycle, we store whatever is on the data in lines into the instruction register and go to cycle one. We always set read/write to one, so we're doing a read, and we set the program counter and the address to the next address in memory.
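The fetch step just described amounts to four register updates on one clock edge. Here is a minimal plain-Python sketch of it; the dictionary keys and the function name fetch are assumptions for the example, and the 16-bit wraparound is written out explicitly because Python integers do not wrap on their own.

```python
def fetch(state, din):
    """Return the register state after the ph1 edge that ends cycle 0.

    `state` holds the current register values and `din` is the byte the
    memory returned for the address currently on the address lines.
    """
    next_pc = (state["pc"] + 1) & 0xFFFF   # a 16-bit register wraps around
    next_state = dict(state)
    next_state["instr"] = din    # latch the opcode
    next_state["cycle"] = 1      # move on to the first execute cycle
    next_state["rw"] = 1         # keep reading
    next_state["pc"] = next_pc   # point at the byte after the opcode
    next_state["addr"] = next_pc
    return next_state


before = {"pc": 0x1234, "addr": 0x1234, "instr": 0x00, "cycle": 0, "rw": 1}
after_fetch = fetch(before, 0x01)
```

Returning a new dictionary rather than updating in place mirrors the way synchronous assignments all take effect together on the clock edge.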
By default, most instructions will have to do this. Some won't, like, for example, a return from subroutine, or maybe a jump. Well, okay, a jump actually has operands, so we are going to have to read the next byte, but a return from subroutine has no operands and goes to some other program counter. Nevertheless, we're going to do this anyway, mainly because if we look at this table in the 6800 data sheet, we can see what happens on the address lines on every cycle of every instruction. Let's find return from subroutine. You can see that cycle one is the fetch, but during cycle two the address is automatically the opcode address plus one. Of course, we eventually change to wherever the next instruction is supposed to be. All instructions work this way, so we may as well do the same.

That's really all there is to fetch. The important thing is that we load the data input lines into the instruction register and go to instruction cycle one.

For executing instructions, we switch on whatever instruction we have in the instruction register. I'm going to define a default, and these are going to be the illegal instructions. Right now, the only thing that happens on an illegal instruction is that we set the cycle register back to zero, output the current program counter to the address lines, and of course get set up to read the next instruction.

This is not how illegal instructions actually work on the 6800. In fact, there is a transistor-level simulation that you can go to, put in an illegal instruction, and see what actually happens. Sometimes it just goes to the next instruction. Sometimes it actually executes something that isn't documented. For example, there's a test instruction where, when you execute it, the address lines just start incrementing every cycle, and that's it.
In fact, you can't get out of that mode unless you reset the processor. So eventually there will be no illegal instructions, only undocumented instructions. But for now I'm just going to say that any time I'm not decoding an instruction I know about, I'll go to the next byte and use that as the instruction.

Now that we have fetch and execute, I guess we can implement an instruction, and what simpler instruction can there be than NOP? I'm going to add a case here in execute for NOP. This is the bit pattern for NOP: it's just hex 01. And I'm going to define a separate function to handle NOP. What should NOP do? Well, nothing. In fact, it should do basically the same thing I'm doing for illegal instructions: simply go to the next instruction and start again at instruction cycle zero. That's all there is to implementing NOP. It's not very interesting, but let's see what happens.

In our fake memory for the simulation, I'm going to add an 01, just so that we can see the instruction actually executing. Again, this is not going to be any different from executing illegal instructions, but at least we can see the program counter incrementing. Let me add a couple more yield statements before I do the reset, and let's go ahead and simulate this. I set simulate to true, and now I'm going to compile the core. That worked. Well, actually, I ran the core, and now we have this test.gtkw, which I can pull up with gtkwave test.gtkw.

There we go. We can see that reset state is one, reset state is two, reset state is three, and now the program counter is 1234. Data in is 01, and we can see that the instruction register gets loaded with 01 just as we get into instruction cycle one, which of course does nothing.
We then go back to instruction cycle zero, the program counter is at 1235, and data in is again 01. Then of course we execute that and nothing happens, and we go to 1236, and so on to 1237. Then we do a reset and we go to FFFE, FFFF, 1234, 1235, and so on. If you look at, say, 1236, where we have no memory, I'm returning FF, and if we look at the instruction register, it gets loaded with FF. But of course FF is an illegal instruction, so it basically does the same thing as a NOP.

So maybe we should implement an instruction that's a little more interesting, like, for example, a jump to a specific address. Okay, that was fairly straightforward. Here's our jump instruction, specifically jump extended, which takes two bytes as an operand: the address to jump to. The opcode is 7E.

Here in the code, we can see that for cycle one, which is just after the fetch, we take data in and use it as the high byte of temp 16, because that is going to be the high byte of the address to jump to. Then we do the usual thing of incrementing the program counter, also setting the address we want to read next to that, and going to cycle two.

In cycle two, data in has the low byte of the address that we want to jump to. However, we probably don't want to put it in temp 16, because that would only take effect on the next edge of the clock, which means we would need one more cycle to transfer temp 16 to the program counter. So instead, we just use a temporary statement that concatenates the low byte, which is data in, and the high byte, which is in temp 16.
And just load that into the program counter and the address to be read on the next cycle, and set the cycle to zero, which basically finishes up the instruction. So we can simulate this. So here what I've done is, at the beginning of our program at 1234, I've used 7E A0 10, with A010 as just an example address to jump to. And then at A010, I've just got a NOP. So let's go ahead and run that and see what happens. Okay, we've run it. And let's take a look at the waveforms. And we can see, if I make this a little bigger, that we go to 1234. And where is the instruction? Right here. There's the instruction, 7E. We go through cycle one and cycle two, reading A0 and 10. And when that instruction finishes, the program counter is now A010, as is the address line. So that is just a NOP, so that's 01. And then we go to cycle one of that instruction and then cycle zero of the next instruction, and those next instructions, of course, are all illegal because we have nothing in memory there. So A011, A012, A013, and so on. So we've successfully jumped to an address. Now, we haven't formally verified this, so all I know is that it works in this one specific case. So it would be kind of nice to formally verify whether this works in all cases, jumping from any address to any address. We won't do that right now. What I do want to point out is the possibility of refactoring some of this code. So if you look at this code, we can see that we're setting the address lines from the program counter, we're setting the read-write line to one, and we're setting the cycle to zero. So this basically means we've finished up an instruction and we're going to read the next instruction. Now, notice that we do the exact same thing over here, basically ending the NOP instruction. And at the end of the jump extended instruction over here, we do almost the same thing, except instead of loading the address lines with the program counter, we're loading the address lines with some other address, namely new_pc.
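The cycle-by-cycle behaviour of NOP and jump extended described above can be sketched as a small software model. This is plain Python rather than nMigen, and the names `Core`, `tmp16`, `step` and `run` are assumptions made for illustration, not the code shown in the video. It only mirrors what the waveforms show: fetch on cycle zero, high byte on cycle one, low byte and jump on cycle two.

```python
# Plain-Python model of the fetch/execute cycles; illustrative only, not nMigen.
NOP = 0x01
JMP_EXT = 0x7E


class Core:
    def __init__(self, memory, reset_pc):
        self.memory = memory   # dict: address -> byte; unmapped addresses read as 0xFF
        self.pc = reset_pc
        self.addr = reset_pc   # models the address lines
        self.cycle = 0         # 0 = fetch, 1 and up = execute
        self.instr = 0
        self.tmp16 = 0

    def step(self):
        """Advance the model by one clock edge."""
        data_in = self.memory.get(self.addr, 0xFF)
        if self.cycle == 0:
            # Fetch: latch the opcode and point at the next byte.
            self.instr = data_in
            self.pc = (self.pc + 1) & 0xFFFF
            self.addr = self.pc
            self.cycle = 1
        elif self.instr == JMP_EXT and self.cycle == 1:
            # First operand byte is the high byte of the target address.
            self.tmp16 = data_in << 8
            self.pc = (self.pc + 1) & 0xFFFF
            self.addr = self.pc
            self.cycle = 2
        elif self.instr == JMP_EXT and self.cycle == 2:
            # Second operand byte is the low byte. Combine it with the stored
            # high byte and load PC directly, so no extra cycle is needed.
            new_pc = self.tmp16 | data_in
            self.pc = new_pc
            self.addr = new_pc
            self.cycle = 0
        else:
            # NOP and unknown opcodes: just go fetch the next instruction.
            self.addr = self.pc
            self.cycle = 0


def run(core, steps):
    """Return the address seen on each of the next `steps` cycles."""
    trace = []
    for _ in range(steps):
        trace.append(core.addr)
        core.step()
    return trace


if __name__ == "__main__":
    memory = {0x1234: 0x7E, 0x1235: 0xA0, 0x1236: 0x10, 0xA010: 0x01}
    print([hex(a) for a in run(Core(memory, 0x1234), 6)])
```

Running the model over the example program gives the same address sequence as the simulation: 1234, 1235, 1236, then A010 and onward.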
Now, it would be tempting to refactor this into a Python function like this. So imagine that we basically copy these four lines into a function called end_instruction. And then what we would do is call end_instruction here, and here, and here. Now notice that we're not assigning to the program counter here because we've already got the program counter, so that's why that's left out. But you could conceive that you could call end_instruction here with just self.PC. So here's end_instruction, and we're passing it an address as a statement. And the reason that I pass it as a statement is that this line over here is actually a statement and not a signal. So the idea here is that I would just set the PC and the address to whatever address you sent, set read-write to one, so we're reading, and set the cycle to zero. Now, this would be fine if you were refactoring for Python, but we're not really refactoring for Python, we're refactoring for the hardware. So instead, what we're going to do is define a one-bit signal, basically a flag that says, okay, we want to end the instruction. And of course, we're going to need an end-instruction address register to store the address of where we want to go after the instruction has ended. So I'm going to create a function called end_instruction_handler, which will generate the logic for handling when we want to end the instruction. So these are the things that we want to do when we want to end the instruction. And of course, I need to add a statement: with m.If(self.end_instruction). So if and only if end_instruction is set, these are all the things that we want to do on the next edge of phase one. So now I can go ahead and replace things like this with something like this. So that's nice. However, what I also need to do is, by default, set that to zero.
So way back in end_instruction_handler, or even before that, I'm simply going to set end_instruction to zero. So what this does is say that end_instruction is always zero unless it's overridden. And the only place that we override it is when we end the instruction, right over here. So let's go ahead and replace NOP with that. And let's go ahead and replace the end of the jump instruction with all of that. So in this particular case, the address that we're going to jump to is new_pc. And that's that. Now let's simulate it just to be sure that it works. Yep, okay, that worked. So let's pull up GTKWave and just make sure that we are still jumping to the same address as we were. 1234, 1235, 1236, A010, A011, and so on. So that worked fine. Now it's possible that there's another place that we can put that, specifically here. Okay, so we've got this reset vector. And really, all we're doing is jumping to the reset vector. It's as if we were executing a jump instruction. So let's go ahead and grab those two lines, stick them over here, and replace the address with reset vector. Now we don't need to set the PC, the address, or the read-write line, or reset the cycle to zero, because now that's done for us. Now, in this particular case, we are setting two signals, and there is no further hardware refactoring that we could do. But we could do a software refactoring. Now is where we can define our end of instruction. So define end_instruction. And I'm going to take a statement as the address. And basically, we just do this. And now, instead of having to remember all of this stuff to do, all I have to do is say self.end_instruction. And in this case, it would be new_pc. I can do the same thing for the illegal instruction, where all I'm going to do is jump to self.PC. Same thing with NOP. And the same thing with, let's see, when we're ready to run, when we come out of reset. So this is just going to be reset vector. And I guess let me move this down.
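The two-level refactoring described above (a hardware flag with a single shared handler, plus a software helper that sets the flag) can be sketched as follows. This is a plain-Python model under stated assumptions, not the nMigen code from the video: the names `end_instruction_flag`, `end_instruction_addr` and `clock` are illustrative, and assigning the default before the per-instruction logic stands in for a default assignment that a later assignment overrides.

```python
# Plain-Python model of the end-instruction flag pattern; illustrative only.
class Core:
    def __init__(self):
        self.pc = 0
        self.addr = 0
        self.rw = 1      # 1 = read, 0 = write
        self.cycle = 0
        # Recomputed every cycle, like combinational signals.
        self.end_instruction_flag = 0
        self.end_instruction_addr = 0

    def end_instruction(self, addr):
        """Software helper: one call replaces setting both signals by hand."""
        self.end_instruction_flag = 1
        self.end_instruction_addr = addr

    def end_instruction_handler(self):
        """One shared copy of the end-of-instruction logic."""
        if self.end_instruction_flag:
            self.pc = self.end_instruction_addr
            self.addr = self.end_instruction_addr
            self.rw = 1
            self.cycle = 0

    def clock(self, execute):
        # Default first: the flag is 0 unless this cycle's logic overrides it.
        self.end_instruction_flag = 0
        self.end_instruction_addr = 0
        execute(self)                   # per-instruction logic for this cycle
        self.end_instruction_handler()  # applied once, for every instruction


if __name__ == "__main__":
    core = Core()
    core.cycle = 2
    core.clock(lambda c: c.end_instruction(0xA010))  # last cycle of a jump
    print(hex(core.pc), core.cycle, core.end_instruction_flag)
```

The point of the design is that the per-instruction code never touches PC, address, read-write, or cycle directly when it finishes. It only raises the flag and supplies an address.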
Okay, and of course, I forgot the modules. And end_instruction is already taken, so let me rename that to end_instruction_flag. Okay, so now we were able to replace those two lines with just one line. And now we don't have to remember which signals we need to set, what order we need to set them in, and so on. All we have to do is call end_instruction with some statement. In this case, the statement is just the register, but in other cases, the statement is a concatenation, so, for example, like that. So let's again run this and make sure that we can run the simulator. And that's still looking good. So A0, I guess that's 10, 11, 12. And we're jumping to 1234 out of reset, which is correct. So all this stuff is correct. Great. And we can see, let's see, is there an end-instruction flag? Yeah, right here. So that's kind of interesting. We now have an end-instruction flag, which goes high whenever we're on the last cycle of an instruction, which could end up being important for things like interrupts. So anyway, that is a very nice refactoring. And also, what I want to note is that we haven't yet made use of any buses. So for example, here, remember that we said that we wanted to have a 16-bit increment-decrement unit. And the problem with doing something like this is that we have now set up an adder, possibly two adders depending on what Yosys's optimization does. So at least one adder, if not two, strictly for this one particular case when self.cycle is one and the instruction is jump. So that's not great. So we're going to have to fix that, but we won't fix that right now. Maybe that would be premature optimization. I do want to wait until we have maybe one or two more examples of this. And then I can look at it and see exactly what we need to do, just like we did for the end-instruction pattern.
And the same thing applies in fetch, where we also have an adder, possibly two, just during the fetch cycle. So that's one adder over here, and one adder just for jump. So that's obviously not going to be great. And in fact, if we were to instantiate this and synthesize this onto an FPGA, we might actually see the effect by looking at the number of lookup tables or carry units or whatever. And then if we replace this with something a little more parsimonious in terms of hardware, maybe we'll see the number of LUTs and carry units go down. But again, we're not going to do that until I get a couple more examples of doing this. And then I can figure out exactly what I need to do with the increment-decrement unit and how to set the buses and so on.

So to make formal verification a little more regular, so that I can apply it to any instruction that I like, I have created this class called FormalData, which contains a snapshot of the registers before the instruction and after the instruction. And then I can just compare what I need to. The instruction field here just contains the instruction that I want to verify. Then there's snapshot taken: because I'm going to take a snapshot before the instruction, I need to know if I have to take a snapshot after the instruction. And then I also want to keep track of any reads and writes that the instruction happens to do, so I can make sure that, number one, they're okay, and number two, that if any of these registers depend on what I read or wrote, they are correct. So this is the read function, which should be called on every read that the CPU does during an instruction. So if the snapshot is taken and we have enough space in our array, just put in the address that we read and the data that we read, and increment the number of addresses that we read. Same thing with write. So that's just exactly the same thing.
So with pre-snapshot, basically, I pass in the instruction along with the standard registers A, B, X, SP, and PC. I first of all zero out the number of addresses read and written, state that I've taken a snapshot, and store everything. If there is no snapshot to be taken, then I want to make sure that I set snapshot taken to zero. And finally, for post-snapshot, again I pass in the current values of the registers A, B, X, SP, and PC, and I save them in post A, post B, and so on. And then I can do some comparisons.

Now what I've also defined is this class, which is basically an abstract base class, called Verification, and it contains two functions: a valid function and a check function. The valid function should return true if this is an instruction that I'm interested in formally verifying. So for an instruction that has only one bit pattern, this would just compare against that bit pattern, but if it's a little more complicated than that, then I can add more complicated logic. And what check does is look at the pre-snapshot and the post-snapshot and make sure everything is okay. So as an example, here is the formal verification for the jump instruction. The verification is valid if the pattern is 7E, because that is the jump instruction that I'm looking at. For checking, basically, in the combinatorial domain, I'm asserting that A, B, X, and SP have not changed and that I have not written any addresses. On the other hand, I should have read two addresses, because I'm checking jump extended, which has two bytes for its operand. So I want to make sure that the first read address is equal to the PC plus one and the second read address is equal to the PC plus two, and also that the PC after the instruction finishes is simply the concatenation of the first data byte with the second data byte.
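The snapshot-and-check structure described above can be illustrated with a small software analogue. This is a plain-Python sketch, not the nMigen classes from the video: `Regs`, `Snapshot`, `JmpExtCheck` and the `max_accesses` limit are assumed names, and `check` returns a boolean where the hardware version would emit assertions.

```python
# Plain-Python analogue of the pre/post snapshot idea; illustrative only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Regs:
    a: int
    b: int
    x: int
    sp: int
    pc: int


class Snapshot:
    def __init__(self, max_accesses=4):
        self.max_accesses = max_accesses  # models the fixed-size arrays
        self.taken = False
        self.instr = None
        self.pre = None
        self.post = None
        self.reads = []
        self.writes = []

    def pre_snapshot(self, instr, regs):
        # Clear the access logs, record the registers, mark the snapshot taken.
        self.reads, self.writes = [], []
        self.instr, self.pre, self.taken = instr, regs, True

    def no_snapshot(self):
        self.taken = False

    def read(self, addr, data):
        if self.taken and len(self.reads) < self.max_accesses:
            self.reads.append((addr, data))

    def write(self, addr, data):
        if self.taken and len(self.writes) < self.max_accesses:
            self.writes.append((addr, data))

    def post_snapshot(self, regs):
        self.post = regs


class JmpExtCheck:
    def valid(self, instr):
        return instr == 0x7E

    def check(self, snap):
        pre, post = snap.pre, snap.post
        unchanged = (post.a, post.b, post.x, post.sp) == (pre.a, pre.b, pre.x, pre.sp)
        if not unchanged or snap.writes or len(snap.reads) != 2:
            return False
        (addr1, hi), (addr2, lo) = snap.reads
        return (addr1 == ((pre.pc + 1) & 0xFFFF)
                and addr2 == ((pre.pc + 2) & 0xFFFF)
                and post.pc == ((hi << 8) | lo))


if __name__ == "__main__":
    snap = Snapshot()
    snap.pre_snapshot(0x7E, Regs(1, 2, 3, 4, 0x1234))
    snap.read(0x1235, 0xA0)
    snap.read(0x1236, 0x10)
    snap.post_snapshot(Regs(1, 2, 3, 4, 0xA010))
    print(JmpExtCheck().check(snap))
```

The difference from the real thing is important: this model checks one concrete run, whereas the formal tool checks the same properties over every input sequence it can construct within the bound.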
Now you'll notice that I have this function here just called plus16, and this is to get around that gotcha that I talked about in the last video, where if you add a 16-bit number to a 16-bit number you get a 17-bit number. But then obviously, if you want to compare that and you get an overflow into the 17th bit, well, now your comparison won't work. So plus16 just adds the two values and then truncates to 16 bits. Same thing with plus8: you add the two values and truncate to 8 bits. Now what does the CPU look like? So way back in, let's see here, the constructor, I pass it an optional verification. So this verification is going to be something like this formal jump class. So if it's present, then down here, okay, I save it and I create a FormalData structure. And of course, again, the FormalData structure is all this stuff. And obviously, if there is no verification, then I don't want any of these signals messing up the CPU and just taking up space. So if I am doing verification, then this will set up all of the signals that will store all the data that I can compare. Now here I've added an extra function called maybe_do_formal_verification. So what does that look like? Let's scroll to maybe_do_formal_verification. So here, if we are doing verification, then as long as we're in the run state, in other words the reset state is three, and also we're on instruction cycle zero, then we look at this instruction. Now remember, it hasn't yet been put into the instruction register, but it is on the data lines. So if this instruction is valid, in other words it's the one that we're looking for, then take a pre-snapshot. Otherwise, declare that we've taken no snapshot.
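The truncation gotcha mentioned at the start of this passage is easy to reproduce in software. Here is a minimal sketch in plain Python, where unbounded integers play the role of the wider sum. The function names follow the spoken "plus16" and "plus8", but the bodies are an assumption about what such helpers do, not the video's code.

```python
# Plain-Python sketch of truncating adds; the helper bodies are assumed.
def plus16(a, b):
    """Add two values and keep only the low 16 bits."""
    return (a + b) & 0xFFFF


def plus8(a, b):
    """Add two values and keep only the low 8 bits."""
    return (a + b) & 0xFF


if __name__ == "__main__":
    pc = 0xFFFF
    next_addr = 0x0000                 # a 16-bit address bus wraps around
    print(pc + 1 == next_addr)         # False: the raw sum is 0x10000
    print(plus16(pc, 1) == next_addr)  # True: truncated before comparing
```

Without the truncation, a check like "read address equals PC plus one" would fail in exactly one corner case, PC at FFFF, which is the kind of case a formal tool will find immediately.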
Now when do we do the actual comparison? Well, again, if the cycle is zero and we've taken a snapshot, and this is only going to be true on cycle zero of the next instruction, then we do a post-snapshot of the current values of the registers, which of course have not changed during cycle zero of an instruction. And then we call the check function, which simply checks based on the instruction and the FormalData structure. All right, now we know that we have to add some hooks, so here are those hooks. So for example, for jump extended I do two reads: the first read is for cycle one and the second read is for cycle two. And in both reads, basically all I do is pass it the address lines and the data lines, and that is the address that I've read and the data that I've read. So I do that for cycle one and I do that for cycle two, and that stores the data that I've read. And that should be about it. Now in main, what I've done is add an argument called instruction to the parser, so now I can specify which instruction I'm interested in formally verifying. You don't want to verify all the instructions at once, because you'll just be there for, you know, possibly days, if not more. So when you create an instruction, you can write the formal verification class for it and then formally verify that instruction. Then, when you're interested in providing a release, you can just run through all the instructions to make sure that everything still works.
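The command-line arrangement just described, an optional instruction argument that selects which instruction to verify, might look roughly like this with the standard argparse module. The option name `--insn` and the subcommand names are assumptions made for illustration; the video does not spell them out at this point.

```python
# Sketch of an optional "instruction" argument; option names are assumed.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="6800 core build script (sketch)")
    # Hypothetical option: which instruction to formally verify, if any.
    parser.add_argument("--insn", default=None,
                        help="name of the instruction to formally verify")
    actions = parser.add_subparsers(dest="action")
    actions.add_parser("generate")
    actions.add_parser("simulate")
    return parser


def choose_mode(args):
    """Verify when an instruction was named, otherwise just simulate."""
    return "formal" if args.insn is not None else "simulate"


if __name__ == "__main__":
    args = build_parser().parse_args(["--insn", "jump", "generate"])
    print(args.insn, args.action, choose_mode(args))
```

Keeping the instruction optional means the same script still works for plain simulation, and a release run can loop over instruction names one at a time.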
So here's the verification. I start out with None, but if we have an instruction, then we're going to do formal verification. If we don't have an instruction, we're just going to do simulation. So I've used some Python magic here to look for the specific class that we want to load in for the particular instruction. So the name of the file is formal_ followed by the instruction that you want to look for, and this file over here that I've written is formal_jump. Okay, so that just loads up the proper verification class. And then I instantiate the core: where before I had no argument, this time I have the verification argument, which again could be None, in which case there is no verification that's going to happen. Now for verification. So first of all, notice that I've kept my reset verification, so I still want to make sure that reset works. But I've added a few more lines. So the first thing is just a cycle counter. I just want to count up the number of cycles that I've run so far, and the reason that I want to do that is this clause over here. Forget about this one; apparently I don't need to force a reset. But over here, what I want to do is make sure that on cycle 20, and that's fairly arbitrary, but it does need to be far enough out to have enough cycles to be able to encounter any instruction and maybe a few more instructions, so on cycle 20 I want to cover the case that formal verification was able to generate the instruction that I'm looking for. So obviously, if this cover statement never goes off, then formal verification will never be able to create that instruction in order to verify it in the first place. So that's what the cover does. And then I'm also going to assume that the instruction appears on cycle 20. And if we force formal verification to create a sequence of signals such that the instruction appears on cycle 20, then all the assertions will happen. So let's run that and see what actually happens. So I'm going to
compile the code in generate mode because I'm doing formal verification, so I need the ILANG output. But before the generate argument, I put in the instruction option and I say jump. Okay, so that compiles, and now I can run formal verification. Now let's take a look quickly at the .sby file again. So the only change that I've made is that I've changed the depth to 24. Obviously the depth needs to be something greater than 20; I just chose 24. But other than that, there really is no change. So let's go ahead and run the .sby file. Okay, and that took a little bit longer because there is a lot more logic to check. So if we look at the first section, this is cover, so it passed, and we'll take a look at that trace later. And if we look at the second status, it says pass. That's for bounded model checking. So everything worked. Now I can show what happens if I go to my formal specification and change something. For example, let's suppose I want the post A, or the A after the instruction, to be one plus the A before the instruction. And also let's suppose I wanted to say that the number of addresses read had to be three instead of two. So this should fail. So I'm going to recompile, okay, and then I'm going to run. Now the cover should pass. Actually, the cover failed, because when it covers the instruction the assert statements are still in effect. But what we're interested in is really bounded model checking, and it basically said, okay, by step six I know that it failed, because I was able to find a sequence of instructions that led to a jump instruction, and I checked the formal verification assertions, and two of them were violated, at lines 16 and 23. And of course line 16 is the erroneous A assertion and line 23 is the erroneous addresses-read assertion. So let me change those back so that everything is correct, and let me run formal verification again. Okay, and you can see that it took a little bit more time, since it's not a trivial proof: it did take two seconds.
As I add more and more instructions, there's more and more logic, and formal verification will have to work a little bit harder, because there are just more instructions to go through and some instructions have certain effects that may affect other instructions. So let's take a look at the trace of the cover statement. So let's see, what do we want to look at? Well, first of all you want the clock, so let's grab the clock. I guess we want the address lines and the data in. Now of course it looks very random, because when doing formal verification the engine will choose basically whatever it wants. What I'm really interested in, I guess, is the instruction itself, so let's go to the instruction register. Let's find that and pull it up. Okay, so here is the instruction register. We can see right away there's 7F, but remember that what we did was assume that the instruction appears at cycle 20. So let me pull up cycle, or cycle2 as I called it. Okay, so cycle 20, and let's see, this is 18, 19, and 20. So yes, by the end of cycle 20 the 7E instruction appeared, so that is the instruction that we are looking for. So you can see that it was a 7E, and the data in was, well, 7E and 00, and we can see right over here that we did indeed jump to 7E00. Now if bounded model checking failed, it would give you a trace showing how exactly it failed. So let's take a look at that. We can modify, say, A, and then we can rerun formal verification. So we compile, run verification, and we can now look at the BMC failure. Here is the trace file, right here. Now let's take a look at the clock again, and the address lines, the data lines, and the instruction. Where is the instruction? Here we go. Okay, so here is our 7E instruction. So really, to find out what's wrong, we need to take a look at the assertion that failed, which was on line 16, which was about the contents of the A register before the instruction and the contents of the A register after the instruction. So we can look for that. Here is post A, and where's pre A? Okay, and
here's pre A. So basically what we would do, oh, and we can also look at the valid, no, there is no valid signal. Okay, so we can see that here's the instruction right here, and here's the end of the instruction over here. In fact, I have an end-of-instruction signal, don't I? Yep, so that's the end of the instruction right there. And we can look at pre A and post A, and we can see that yes, indeed, pre A is zero and post A is also zero, and that violates our assertion that post A should be pre A plus one. So based on that, we can then go debug the logic and see what we did wrong. So I think we can conclude this video right here. So that's the end of part three. What we've done is put together a very simple CPU. We've implemented the NOP and the jump instructions, and we've also formally verified the jump instruction. So I think in the next video what we're going to do is implement more instructions, and we'll probably start using the buses. We'll probably start implementing the ALU, with more instructions and more formal verification to come. So until part four, I will see you then. Take care. Bye.