So I think it's time for another live coding session and, just to break from tradition, today I am not going to be writing an assembler. I'm going to be writing a disassembler.
So, for some context: at the weekend I went to my favorite shop that sells things other people have thrown away, and I got myself a Canon Typestar 210 electric typewriter. This is a thermal typewriter. It's a microcontroller-based device: you type on the keyboard and it prints onto paper using a thermal transfer printer. I already have one of these typewriters, which I turned into a USB keyboard, but when I saw this one, which is considerably bigger and has a full-sized keyboard, I got it hoping that it would be a similar architecture inside but with a socketed ROM, because that would mean I could change the software. And yes, it did indeed have a socketed ROM. But the processor, which took me a long time to identify, was not the same as in the other typewriter. The smaller one used a 6303, which I've been getting interested in, but this one uses a CPU called the TLCS-90, which I had never heard of before.
It's made by Toshiba. They have documentation and it's online. It's a really interesting thing because it is a redesigned Z80. So it's got all the Z80 registers and all the Z80 instructions, but they've also added a huge number of extra instructions, including lots of proper 16-bit operations. So you've got, here, AND with HL. It's got hardware multiplication and division. It's got a radically redesigned addressing mode system, which means that the instructions are now all orthogonal. It's got things like stack frame addressing modes and a zero page form, and any of the 16-bit registers can be used as an index register. It's got a 20-bit wide address bus, which allows it to address way more than 64k of memory, although sadly you can only use this for data.
You can't use it for program code. And they've also completely changed the binary encoding of the instructions, which makes it source compatible with the Z80 but not binary compatible, which is really interesting.
Here is the base page table of the instruction set. You can see, if you're used to the Z80 instruction set, that it's only got two rows of LDs here, because the other LDs have all been moved onto an extended page. Down here are the prefix bytes that take you to other pages, except that, unlike the Z80, the prefix byte encodes the register that's going to be used as part of the addressing mode. Instructions can be up to six bytes long.
It's a really interesting thing and I would like to find out more about it, except that, because nobody's ever heard of it, there are very few programs around to work with it. I found a couple of assemblers, but they're all weird DOS things. MAME will emulate it, because there are a few arcade games that use it, but not very many, and that's about all I found.
So I pulled the ROM out of the machine and dumped it, which took longer than I was expecting. So I do have the 256k ROM from the original typewriter, and you can see here some text messages that appear in the ROM, all to do with typewriters. So if I can figure out how this works, I can write my own software, program it onto an EEPROM (I've got one), stick it in the machine and make it do my own thing.
Sadly, it's only got 8k of RAM, which isn't really enough to run a real operating system. It would be nice to run CP/M on it. CP/M would run on it, but I would have to reassemble the CP/M programs for the new instruction set. It wouldn't run any existing CP/M software.
That makes it a little bit useless, but it would run. Of course, it's got no disk interface. It's got a rather good keyboard for input and a printer for output, though there is a little LCD on the typewriter as well. But mainly I'm doing this because it's cool.
So this is actually the second time I've tried this. The first time I got a couple of hours into it and then ground to a halt, because the approach I was using just wasn't working. Now, disassemblers are not normally particularly complicated, but this instruction set is tricky. The instruction encodings are all kind of weird.
For example, if I find a nice simple 8-bit instruction... here we go: add a value to the accumulator. We have, how many, one, two, three, four, five, six, seven different forms for all the different addressing modes. You've got: add a register, which on this architecture is two bytes rather than the one byte on the Z80; add a constant; add a dereferenced 16-bit register; add a dereferenced index plus displacement; add a dereferenced HL plus A, which is quite nice.
That gives you variable-sized byte array access very nicely. Then a dereferenced 16-bit constant, and a dereferenced 8-bit address in zero page.
Here are the encodings. You'll notice, for example, with this one we have the prefix byte, the E3, then we have the source address, then we have the opcode. And if we go down to, for example, the other form of ADD, which adds to a register that's not A or to a memory address, you can see that you have the prefix byte here, then the address, then the opcode byte, and then the actual constant being added.
What I did was try to essentially reverse engineer the instruction encoding rules. Here's the base page of instructions, and if you've used the Z80 you'll see it's quite different from the Z80 itself. We have only two or three rows of LDs where the Z80 would have masses, and we are limited to copying to and from A and copying constants into a register. Here is copying a constant into a 16-bit register. Here we copy 16-bit registers among each other. If we look down at one of the secondary pages that use a prefix byte, here are all the other LD instructions, because they use the bottom few bits of the prefix byte to determine which register is being referred to. And it's all quite orthogonal.
The problem is I kept finding weird little exceptions, and I got less and less sure that the rules I was trying to put together were in any way right. And they kept getting more and more complicated as I tried to account for the exceptions. So what I'm going to do instead is just brute force it, which is a shame because it would be so much nicer to do it elegantly, but this will be vastly, vastly more reliable.
So anyway, let's take a look at the code.
I have an empty boilerplate and my normal auto build window up here. So the first thing we need to do is, well, do that, because I haven't got around to putting it in the standard library yet. And let's do some argument parsing, and I'm just going to steal that from the assembler.
Oh yes, this is, by the way, all in Cowgol, my pet programming language. Right now it's compiling into, well, everything, but the one I'm going to run is the native 386 Linux version. I can also compile for a bunch of different architectures including CP/M, but I'm not going to bother for this.
So this program is called cowdis. It takes a mandatory input file name, an optional output file name and a start address in hex. The way this disassembler works is that it's just going to work with a raw binary file with no header, so you have to have an address parameter to tell it what the base address of the file is. We don't have a listing file. The input file name is the only mandatory one.
We need some variables up here. We've got the input file name, we've got the output file name, and we've got the input file, which is a file control block. This is a large structure that encapsulates whatever the platform needs to refer to a file, so on Linux that's a file descriptor plus the buffer. We want to open the input file, and if the output file name is not zero, try opening that as well. I need a variable for the start address, and I'm going to assume 16-bit addresses for the time being.
OK, parsing the start address. I want to use the AToI subroutine, which returns a signed 32-bit value and the advanced pointer to the argument. If the advanced pointer does not point to the end of the string, then this means it's an invalid number. So take the parsed result, cast it to a 16-bit integer, and that should be that.
Do I have everything? No, I haven't done the standard start error and end error stuff. So let's just copy all these.
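
As a rough illustration of the start-address check just described (parse the argument, reject it if anything is left over, then narrow to 16 bits), here is a minimal Python sketch. It is not the Cowgol from the session: `parse_start_address` is a made-up name, and Python's built-in `int` stands in for the parsing routine.

```python
def parse_start_address(text: str) -> int:
    """Parse a hex start address and narrow it to 16 bits.

    Hypothetical helper for illustration only.
    """
    try:
        # int() refuses trailing junk, which plays the role of checking
        # that the parser consumed the whole argument.
        value = int(text, 16)
    except ValueError:
        raise ValueError(f"invalid start address: {text!r}") from None
    if value < 0:
        raise ValueError(f"invalid start address: {text!r}")
    # Assume 16-bit addresses for the time being, so keep the low 16 bits.
    return value & 0xFFFF
```

Narrowing with `& 0xFFFF` mirrors the cast to a 16-bit integer; a stricter tool might prefer to reject out-of-range values instead.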
Oh, we need to-upper as well. To-upper should really be moved into the standard library; the standard library for Cowgol needs quite a lot of work. Line number... oh yeah, because I stole this from the assembler: if an error occurs during assembly, then it tells you what line it occurred at. 'FCB and uint8 are not compatible'... get the output file name. Oh, I like languages with typing. OK, that works.
So here is the executable that's been produced: 44 bytes of... 1,306 bytes of code. The Cowgol linker does really aggressive dead code removal, so if I didn't call parse arguments then it wouldn't have pulled in this subroutine here, or AToI, or the argv library, or the file library, and you'd end up with a minimal executable that just starts up and shuts down. Please ignore the size of this data segment; that's because it uses a cheap and nasty way to claim memory from the system. This is really intended for 8-bit systems, so on Linux I just told it, you know, use a megabyte and be done with it.
OK, now we need the actual main subroutine, which is actually going to do the work. The input file and the output file are loaded. Just thinking about how to output the result... let's just use print for now, so I'm going to ignore the output file.
So, for the input file: when we read instructions we want to parse them. We have to parse them, because otherwise we won't know how long they are and we won't know how many bytes to read. So we have to parse them as we read them. But we also want to copy the instruction into a buffer so that we can output the hex representation and the ASCII representation along with the disassembly. So we're going to create a buffer for that which is six bytes long, as that is the longest instruction that I've seen so far.
We may need to change that. And we need the instruction length, and we're going to create a read byte routine, which returns a byte. What this is going to do is read a byte from the input stream, write it to the buffer and advance the buffer length, so that every time we call read byte the byte gets automatically copied into the buffer. This means that in our loop, because we're going to continually read instructions from the input file until we reach the end of the file, at the beginning of each one we want to reset the buffer pointer.
Now, there is actually one other thing we need to do. The Cowgol file abstraction doesn't have an end-of-file concept, for various reasons combining laziness and CP/M, so we actually have to keep track of the current file position and the length of the file. The file length is the FCBExt of the input file, and the input file position is going to start out... I'll put that here. So every time we read a byte we need to remember to advance the file position, and we want to keep reading bytes until we reach the end of the file. So that should do it for me automatically, and it even compiled. Yeah, the file size is increasing; that's because it just pulled in FCBExt.
So here we're going to read each instruction in turn, and in fact I'm going to add a subroutine to do that, so that will go there. Just read one byte for now. After we read the instruction we want to print it. For now at least, we're going to print it right here. So the first thing is the address... no, hang on, what am I doing? It's the address. After we read the instruction we then want to write out the hex bytes. One day I must get around to doing for loops.
OK, so that builds. We have a small binary: 12k, but most of that is debug stuff. Hmm, 6k; bear in mind that of that 6k there's only 2k of actual code. The rest of it is, like, overheads.
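
The read loop described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the session's Cowgol: the class and function names are invented, the file position and length are tracked by hand to mirror the missing end-of-file concept, and every instruction is treated as one byte long because no decoding exists yet.

```python
import io


class InstructionReader:
    """Reads bytes one at a time, keeping a copy of the current instruction."""

    def __init__(self, stream, length):
        self.stream = stream
        self.length = length       # total file length, tracked by hand
        self.pos = 0               # current file position
        self.buffer = bytearray()  # bytes of the instruction being read

    def at_end(self):
        return self.pos >= self.length

    def start_instruction(self):
        # Reset the buffer at the beginning of each instruction.
        self.buffer.clear()

    def read_byte(self):
        data = self.stream.read(1)
        if not data:
            raise EOFError("read past end of input")
        self.pos += 1
        self.buffer.append(data[0])  # every byte read is copied automatically
        return data[0]


def dump(data, start_address=0):
    """Return one 'address  hex bytes' line per (one-byte) instruction."""
    reader = InstructionReader(io.BytesIO(data), len(data))
    lines = []
    while not reader.at_end():
        address = (start_address + reader.pos) & 0xFFFF
        reader.start_instruction()
        reader.read_byte()  # placeholder: one byte per instruction for now
        hex_bytes = " ".join(f"{b:02x}" for b in reader.buffer)
        lines.append(f"{address:04x}  {hex_bytes}")
    return lines
```

Because the decoder only ever touches the input through `read_byte`, the hex column stays in step with whatever the decoder consumed, however long the instruction turns out to be.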
Let's take a look at the DOS version. See, much smaller. And there's always the CP/M version, about the same size as the DOS version. But this should now work, so I should be able to run it on this one. Again... we haven't increased the address. Yeah, that was the thing I'd forgotten. Read instruction will read stuff into the buffer but won't advance the address; we're only going to advance the address after we're done with each instruction. There we go. But of course it thinks all instructions are one byte long and it's not doing any disassembly, so the results are actually quite boring.
So up here, in read instruction, we're actually going to have to parse the instruction and read the various operands. The way I was going to do that is simply to have four different tables, huge tables, well, 256 entries each, containing the instructions on what to do to parse these. It's simplest for the simple instructions like these, which take no parameters. For an instruction like this, which takes a single byte (this is a two-byte instruction where the second byte is the low byte of the direct page address), we somehow need to tell it to read a byte for use as the parameter. For the more complicated instructions like these, we have to tell it to read a register into wherever we are storing our parameters. For this one we want to read a 16-bit value, then we need to go to one of the other pages to read another byte, then we need to do whatever that byte says. And eventually, once we've finished, we need to output the instruction. So each of these entries is going to have to contain all this information, and the way it's going to happen is basically bytecode, sort of.
So let me define a... I'm just thinking about how to do this. So I'm not going to use a record, in fact.
It's just going to be an array of strings. So we're going to start with the base templates, of which there are 256. There are also going to be additional 256-entry template tables for the other pages: two, three. So in fact we're going to use up a complete kilobyte just on these lookup tables. Each one of these will contain a template string, which for these four is going to be simple, so I'll just do: nop, halt, di, ei.
So the idea is we look up the byte in the template table and then we do whatever we see here. If it's a lowercase letter, that's part of the instruction text, and it's just going to be emitted. If it's an uppercase letter, then we're going to want to do something. And is that actually correct? Yes, it is.
So we're actually going to have to have 256 items in this. And in fact the compiler will complain, because I didn't put a colon in there, because I've been programming in other languages. 'Wrong number of elements in initializer': we actually do need 256 entries here.
Now, we are going to want to have some empty slots. The base page has a few, mostly here. I don't know why they didn't use those. But the subsequent pages actually have quite a lot. Now, I could just do this... this should take us... look, that last one is INC (n). So there's actually going to be something in here that's a direct page address. Let's call that D. But in fact we're going to want to be cleverer than this. We may not want to emit the parameter at the point where we read it. So I'm actually going to do that to say: read a direct page value into parameter one, and here we emit parameter one. Because we could actually write this like this. I am making this up as I go along, by the way. Can you tell?
So the reason for doing it like that is because, for these, we are going to want to copy a register into a parameter but then follow another template from one of the other tables, and that's then going to need to refer to that parameter. And looking at these... yes, I was rambling. Let's just do that so it compiles.
Let's take a look at the CP/M version. Cowgol doesn't yet have string commoning, so each of these empty strings will turn into a single terminating byte. So here are the strings: nop, halt, di, ei, then empty bytes. Unfortunately the way initializers work means I can't just do this, and equally unfortunately... just thinking, can I do this? Ah, right. No, you can't. I was thinking of doing something like this and then referring to 'unknown insn'. Sorry, that would have to be like that for an array. Unfortunately I haven't got around to doing that yet, so I'm actually just going to leave these all the same and just soak up the hit. These have to be question marks because these are in the base page, so these will actually be printed. In the extended pages I'm just going to use empty strings, each of which will use up a byte. I should prioritize commoning up of strings. Unfortunately, to do that you have to load all the strings into memory and keep them in memory so that you can compare them against any new string coming in. So it's actually quite expensive, but it's worth having.
Right, so here is an empty block of eight instructions. OK, so we've done blocks zero and one, so we want to do two, three, four, five, six, seven, eight, nine, A, B, C, D, E, F. Hang on... OK, and that's built. So it's not like we have, you know, very many, but it turns out that the first instruction of the ROM is a DI instruction, 02, because that's where execution starts.
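
To make the table layout concrete, here is a small Python sketch of the idea so far: one template string per opcode, "?" for unknown base-page opcodes (so something still gets printed) and empty strings in the extended pages. The variable names are invented for illustration, and treating the four tables as one base page plus three extended pages is an assumption drawn from the description above.

```python
# One template string per opcode in the base page. Unknown opcodes get "?"
# because base-page entries are printed directly.
BASE_TEMPLATES = ["?"] * 256
BASE_TEMPLATES[0x00] = "nop"
BASE_TEMPLATES[0x01] = "halt"
BASE_TEMPLATES[0x02] = "di"
BASE_TEMPLATES[0x03] = "ei"

# Extended pages (reached through prefix bytes) start out entirely empty;
# an empty string marks an opcode that has no template yet.
EXTENDED_TEMPLATES = [[""] * 256 for _ in range(3)]


def base_template(opcode: int) -> str:
    """Look up the template string for a base-page opcode byte."""
    return BASE_TEMPLATES[opcode & 0xFF]
```

In Python the repeated "?" entries all share one string object, which is exactly the saving that string commoning would give the Cowgol version.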
So we can actually start work. This is an extremely crude way of doing this, and hugely over-complex, but I think it's the simplest way of getting a result that's, you know, more or less correct. I'm sure it would be possible to actually decode instructions according to the rules and collapse all this code by quite a long way, but, you know, it's difficult.
We also need a buffer to hold our output instruction, and I'm going to arbitrarily give that a size of 20. So, read instruction. This is the routine that just writes something to the output buffer and advances the length. If the thing we've read is a lowercase letter, just print it. Otherwise we're doing something more complicated, and in fact what we're going to do is just error out. That's the usual error you get if you miss semicolons. OK.
Right, we're actually doing quite a lot of the infrastructure. So after we have printed the hex we actually want to print the disassembled instruction. So we make sure it's zero terminated and just print it. So is it going to do anything useful? Yep, we have disassembled our first instruction correctly. Now, 1A has failed because it's just marked as unknown. For the time being we're going to check to see if the template string is empty and error out. Right, so we actually fail at address 0001 because 1A is not in the templates, or rather we don't have a template for 1A. So: eight, nine, A... 1A is JP nn. So this is going to be a simple address, which we then emit. So what is a parameter going to look like?
It consists of... let's have a quick scan to see how many parameters we could have. Here is an instruction with three parameters. We have the destination index register, we have the source index register, and we have the displacement. And if you look at the way the instruction is encoded, we have to read the destination first, then the displacement, then the source... uh, sorry, got that backwards again. In this form it's the source index register and displacement, and then the destination; the destination always goes last. But there are forms where the destination goes first, like this one for example.
So, just thinking of the template bytes. I'm actually slightly changing my mind about the way these work. There are two things we can do. We can write a value to the specified parameter as we read it, and then when we write it, we emit, you know, a register via one of the various lookup tables, or an address, depending on what type the parameter is. We would have to have a way to remember the type of the parameter. Or we can assume the parameters are just 16-bit values and have multiple writers depending on type, which actually I think I prefer.
So we can have up to three parameters, and we are not going to track which type each one is. The template is going to say: this is a direct page address, we're writing it to parameter zero, and we are going to write that as... no, actually, that's not a direct page address.
That is a byte. We are reading a byte into parameter zero, and then we are writing a direct page address from parameter zero. And I'm going to put this over here to make it clear that this is the read phase, and then this afterwards is the write phase. So how is our JP going to look? We need to read a word into parameter zero and we're going to write out an address. So this thing is going to be B or W, byte or word. And we're also going to have to have extractors for some of the various different types of register too.
So let's go and actually implement that. When it's a B we're reading a byte. And, just thinking about it, there are some things that we'll just write out directly as well as lowercase letters, so let's put these in here, such as space, comma, open parenthesis, close parenthesis. I think that's everything. This means that if we reach this particular code path then we know that we've got a byte that's going to do something, and these always take a parameter index. So we read that.
Yes, and actually I forgot that this should be @next. The @next keyword in Cowgol advances a pointer to the next element. When you add a value to a pointer, you're actually adding bytes. This is for a variety of reasons, one of which is to discourage people from using pointers too much, because pointer arithmetic is very expensive on 8-bit machines. For example, if you want to offset a certain distance into an array, you typically need to do a multiplication to determine at runtime how much you want to skip. It's much cheaper to add the size of the item. So we have @next and @prev, and if you want to actually jump around in an array by pointer then you either use indexing or actual explicit multiplication. I think that made sense.
OK, we want to read a byte from the input stream, so that is very simple: read a byte, cast it to a uint16. For this one we read a word, which is always going to be a uint16. We haven't done read word.
Let's implement that here. The TLCS-90 is little-endian, so read the low byte, then read the high byte. So what's that going to do? 'Bad template byte' for... an address. We haven't done addresses. So we now need to emit, using print byte here, the address that's being referred to by the parameter. And in fact I'm going to change my mind again, because we don't really care that this value is an address. We just want to emit it as a 16-bit value. What character can I use for this? We also have 8-bit values. I'm going to use N and M: M for a 16-bit value, N for an 8-bit value. There's a reason for this, which is that the encodings here use n for an 8-bit value. It's kind of arbitrary. So that's an M.
We want to write out a 16-bit value, and we're actually going to steal... well, I was going to say I was going to steal some code from the standard library, but it looks like the code in the standard library does not actually use UIToA, because it uses a much cheaper routine for writing hex. But we have UIToA in the code; we've pulled it in in order to print decimal values, which we haven't done yet. Also UIToA is a bit of a pain to use, so I am just going to steal these: print hex word of the parameter.
It occurs to me that this framework may actually work for other architectures.
OK, we have managed to disassemble: this just jumps to 0793 here, which is correct. That's where the rest of the code is. You know, this is the startup code for the system: disable interrupts, jump to where the main code is. It's then followed by this chunk of ASCII, which is 'Typestar 210', some kind of version string. I don't know if it's ever actually printed. So this is going to disassemble as garbage, but we have to start at the beginning and go through to the end. So let's just have a look at 54.
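
The template scheme built up to this point (a read phase using B and W, a write phase using N and M, with lowercase letters and a few punctuation characters passed straight through) can be sketched in Python like this. It is a loose reconstruction, not the session's code: the function name and the exact template string `"W0jp M0"` are assumptions, as is writing each command as a letter followed by a one-digit parameter index.

```python
LITERALS = set(" ,()")  # written out directly, like lowercase letters


def disassemble_one(data, offset, templates):
    """Decode one instruction at data[offset]; return (text, length)."""
    pos = offset
    params = [0, 0, 0]  # up to three untyped 16-bit parameters

    def read_byte():
        nonlocal pos
        value = data[pos]
        pos += 1
        return value

    def read_word():
        low = read_byte()   # little-endian: low byte first,
        high = read_byte()  # then the high byte
        return low | (high << 8)

    opcode = read_byte()
    template = templates.get(opcode, "")
    if not template:
        raise ValueError(f"no template for opcode {opcode:02x}")

    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        i += 1
        if ch.islower() or ch in LITERALS:
            out.append(ch)
            continue
        index = int(template[i])  # commands always take a parameter index
        i += 1
        if ch == "B":             # read a byte into a parameter
            params[index] = read_byte()
        elif ch == "W":           # read a word into a parameter
            params[index] = read_word()
        elif ch == "N":           # write a parameter as an 8-bit value
            out.append(f"{params[index] & 0xFF:02x}")
        elif ch == "M":           # write a parameter as a 16-bit value
            out.append(f"{params[index]:04x}")
        else:
            raise ValueError(f"bad template character {ch!r}")
    return "".join(out), pos - offset


# Hypothetical templates for the two opcodes decoded so far.
TEMPLATES = {0x02: "di", 0x1A: "W0jp M0"}
```

With these two entries, the bytes `02 1A 93 07` decode as a one-byte `di` followed by a three-byte `jp 0793`.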
Oh look, 54 is a nice simple instruction, so we can actually do the entire 5x band. So, let me find it... the bottom few bits are the register, and we're going to have to be able to decode these for this chunk down here. So we could easily write a routine to pull the bottom couple of bits from the current byte and stick that into a parameter, and then a routine to emit that parameter as a 16-bit register. And then we would be able to use the same string for all of these. Yeah, let's do that.
So this is a 16-bit register, which I'm going to call Q. So read Q zero... no, I'm not. O means the current opcode byte. So that just copies the current opcode byte into the parameter. Q is then going to be an emitter, and that's going to do the masking there. Right now this string is going to be duplicated in the binary. When I get string commoning in, that will be so much better. OK: byte, word, opcode. Very simple. Q.
So we need another table, which is the 16-bit registers. The encoding here... where is it? It calls them gg, so it's this table here. There's also a qq, which is very similar. Yes, it's going to be this table; the difference is whether register six is the stack pointer or AF. In this case it's the stack pointer. That's going to be BC, DE, HL, ... it's my screen recorder ... IX, IY, SP, like so. These two question marks correspond to the holes at three and seven, which you can see here. These are filled in down here by the constant forms, but we don't have to worry about them for now. So this has eight items in it.
So that allows us to emit a string. This is just: print Q registers of the parameter AND seven. So that will pull the parameter, mask it to discard any bits
we don't care about, look that up in the Q registers table and print it. And that doesn't work: 'expression was uint8'. Ah. Firstly, that's an array. Secondly, the value of that is actually a uint16, but the Q registers index is a uint8. And this is actually referring to an error on a different line, probably there, because B is a byte.
Oh, that's interesting. That's a compiler error: it can't compile this code for the 6502. I should go and investigate that, but for the time being let me just turn off the 6502 toolchains. Yes. OK, that's better.
So now we're going to run the disassembler. It crashes. Zero, one, two, three, four, five, six, seven... it's an array of strings. Has it compiled it correctly? I hope so. Yeah, let's start this at zero. OK, so the value being read out of Q registers is incorrect for some reason; it's not a valid string. So is the value we're getting out of this correct? As uint8, AND seven... that's found a seven, which is an invalid register, but it shouldn't crash. So zero, one, two, three, four, five, six, seven; it should print that. Interesting. So here's my print routine: read a byte, print it, keep going around. Should be nothing wrong there. A string is actually a pointer to uint8, so that is equivalent. So the values that have been stored in the array are actually pointers to strings.
If we take a look at that CP/M version again, you should be able to find... here we go. Here are the strings being emitted one at a time, and then following them is the array itself. So there's the last question mark. That 7F looks odd, but here is the array... oh, no, that's fine. The address is 067F, which in this file is 057F, so that's here. So that's actually the address of BC. 0682, 0685, 0688, 0688... yeah, something's wrong there. So what, is it somehow overrunning the end of the buffer?
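
For reference while following the debugging, the lookup being attempted here is tiny. A Python sketch, with invented names, of the Q emitter: mask the parameter (which holds the opcode byte) down to its bottom three bits and index the eight-entry register table, with "?" standing in for the holes at three and seven.

```python
# Index = bottom three bits of the opcode; "?" marks the holes at 3 and 7.
Q_REGISTERS = ["bc", "de", "hl", "?", "ix", "iy", "sp", "?"]


def q_register_name(param: int) -> str:
    """Return the 16-bit register selected by the low three bits of param."""
    # Masking with 7 discards the bits that belong to the opcode itself,
    # so the index can never run off the end of the table.
    return Q_REGISTERS[param & 7]
```

With the mask in place, every possible byte maps to one of the eight entries, so an invalid register number prints "?" rather than reading outside the table.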
Hello: 8049B23. And what's at 8049B23 does not look like a string to me. I mean, C02, C7... I wonder if this is a codegen bug. This is the code that the Cowgol compiler has actually generated for this program. So, Q registers... I hope I remembered to put a comment in. No, I didn't. OK, so where is Q registers actually being referred to? Template... so this is the if statement here, and this chunk of code is the call to print byte. Or if we go look for print hex word... there we go. That means now we're here, and the next thing is Q itself. So if it's not Q, jump away; otherwise ECX is going to be the address of Q registers. Yes, we dereference it. I wonder if ECX has been corrupted by something. Doesn't look like it. EAX is the actual offset. Hmm. I'm trying to remember whether, on the 386, this scales EAX automatically because it's a movl. It could be the case, actually.
OK, well, there's one way to find out, which is that instead of using the 386 version we use this version. In this version Cowgol compiles into C and then we compile the C. OK, that fails as well. That's a good thing; that means it's probably not a bug in the 386 code generator. That's very peculiar.
So we have got a value, but the value doesn't seem to be correct. I've used this idiom before in other code, so I don't see why this would be producing, you know, an invalidly initialized array. And we can print the address of the array itself. So the pointer we've got is some way before the array. It's in fact too far before the array, because the strings are emitted immediately after... I think they are, in this 386 version. No, they're not: all the strings are packed together by alignment. So all the one-byte-aligned things are emitted in a chunk, then all the two-byte-aligned things, then all the four-byte-aligned things.
You know what? I think I'm going to look into this offline, because this is not the interesting part of what I'm doing here.
This is trying to track down a bug in my code somewhere else. So I will do that, and: jump cut.
It is approximately 45 seconds later, because I'm an idiot. My print routine was wrong. It was never stopping at the end of the string; it was just going on forever until it crashed. Fantastic. Well, we have successfully disassembled this invalid instruction. I'm going to change that into this, just so we have a little bit more information.
So the next byte is 79, which is seven... zero, one, two, three, four, five, six, seven, eight... over here, isn't it? So what we are doing is adding a 16-bit value to HL. I'll just go and look that up... there we go: 79, n, m. Very simple. We've got all the stuff we need to do that. That's the whole 78 block. What we are doing is reading a word into parameter zero, then 'add hl', comma, M zero for the 16-bit value. And it's just this, seven times with different instructions. So: add, adc, sub, sbc, and, or, xor... AND, XOR, OR, in fact. That's only seven items, because the last one is CP, I just happen to remember; it's cut off at the end here. OK, so that has disassembled this bogus instruction, probably correctly.
73: that's this bank over here, just on the left. I think there should be a comma in there. This data sheet has got typos. And you'll notice that the ALU operations are always in the same order. That's direct page. So this needs to be... I didn't do D. I changed my mind about D and took it away; I'm going to put it back in again. You know, actually, it's cheaper to do it this way. So we've read the byte into the parameter; now we OR it with the base address of the direct page and write it out. And that doesn't work, because that should be a word. OK, we're getting a bit further.
What I was saying before I got sidetracked is that the ALU operations appear in the same order in multiple places in the instruction set. You can see them: they're here.
They're here. They're here. They're here, they're here, etc., etc. So as a future optimization we might just, you know, turn this into another parameter: tell it to pull the bottom three bits out of the opcode and print that as a string from the standard ALU operation table. And now I think about it, that's actually easy. Let's do that, like so. So: write the opcode to parameter one, print parameter one as an ALU operation. Why am I doing this? Because it means that all these become the same, which means that when I eventually do my string collapsing, it'll be smaller. But now of course I need to implement L, which means I need another table: add, adc, sub, sbc, and, xor, or, cp. And we need to add support for the L printer.
So this gave us adc, sbc, sub... that ain't right. What we've just done is, in fact, we're using the bottom few bits of the template byte rather than the value just read. So that needs to be like this. But I'm also going to have to change this, because if we use O here, we're actually going to pull the byte out of the last byte read from the instruction stream, which is of course the high byte of the 16-bit value. So we need to put that there. I've got myself in a muddle. I'm going to cut that and put it there. This entry here, that template... oh. Yeah, that doesn't mean the value zero; that's because I managed to chop off the W there. OK, that's better. That is now working.
So the next block is 31, which is this. This is copying an 8-bit constant into an 8-bit register, and again for these the destination register is encoded into the bottom three bits of the opcode. This is another standard table. This is the R table, we're going to call it.
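
The ALU trick, and the ordering pitfall that caused the wrong mnemonics, can be shown in a few lines of Python. This is a hand-written sketch rather than the template machinery from the session: `decode_alu_hl` is an invented name, it only handles the 78 to 7F block described above, and the output formatting is arbitrary.

```python
# ALU operations in the order they recur throughout the instruction set.
ALU_OPS = ["add", "adc", "sub", "sbc", "and", "xor", "or", "cp"]


def decode_alu_hl(data, offset=0):
    """Decode '<alu-op> hl,<16-bit constant>' from the 0x78..0x7f block."""
    opcode = data[offset]
    if opcode & 0xF8 != 0x78:
        raise ValueError(f"opcode {opcode:02x} is not in the 78 block")
    # Capture the operation from the opcode byte itself, before reading the
    # operand. Taking it from the last byte read would use the high byte of
    # the constant instead and pick the wrong mnemonic.
    operation = ALU_OPS[opcode & 7]
    value = data[offset + 1] | (data[offset + 2] << 8)  # little-endian word
    return f"{operation} hl,{value:04x}"
```

Because the operation name comes from a shared table, all eight opcodes in the block can use one identical template, which is what makes the eventual string collapsing worthwhile.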
So let's put in a These this is the same order that the Z80 users if I remember correctly So that the accumulator is actually the last eight bit register not as you would expect the first So what are these going to look like? well We want to write the opcode parameter one We want to read a byte into parameter zero. Let's just do that that way around It's then an ld followed by these destination register followed by n1 the payload and this is There's seven of them Not eight because the last one Is special So we're going to leave that for the time being So now we just need to implement r actually put this down here Let's see that doesn't work unexpected string line 93. No commas bad template character n Oh, yeah, that's the eight bit value. I didn't do that ld c comma 30. So that's three one three zero Three one ld c comma 30. Yep, that looks correct All right, the next block is two seven. Ah, that's that's That's special but not too special. That's just loading a value into a And in fact, we can do these at the same time Yeah, let's just do this one two seven is a direct page address So read a byte A comma Direct zero Okay, that worked Three seven right. It's our special one That that goes here and I'm going to have to go look that one up in the detailed docs. Where are they? lds right at the top So this is three seven. This is This form Which I'm a little bit confused by because that's got an r when I think it should have be an n There's an n here this will copy an eight bit value into a Uh an address in direct page Without going through a register Which is quite a nice feature to be honest So the first byte is the The destination the second byte is the payload So it's a direct page destination Followed by the payload that would be here Three seven cc zero zero. Yep. That looks okay. 
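A minimal sketch of the two cases just covered, the seven-entry R table and the special direct-page form at 37. This is Python for illustration only: `decode_ld_immediate` is a hypothetical name, and the direct-page base of 0xFF00 is an assumption, not something stated in the session.

```python
# 8-bit registers in opcode order; the accumulator comes last.
R_REGS = ["b", "c", "d", "e", "h", "l", "a"]

# Assumption: direct-page addresses live in the top page of the 64K space.
DIRECT_PAGE_BASE = 0xFF00


def decode_ld_immediate(code: bytes, pos: int = 0):
    """Decode the 30..37 block; returns (text, bytes consumed)."""
    opcode = code[pos]
    if 0x30 <= opcode <= 0x36:
        # LD r, n: register in the bottom three bits, then one payload byte.
        reg = R_REGS[opcode & 0x07]
        return f"ld {reg}, 0x{code[pos + 1]:02x}", 2
    if opcode == 0x37:
        # LD (n), n: direct-page destination byte, then the payload byte.
        addr = DIRECT_PAGE_BASE | code[pos + 1]
        return f"ld (0x{addr:04x}), 0x{code[pos + 2]:02x}", 3
    raise ValueError(f"opcode 0x{opcode:02x} is not in the 30..37 block")


print(decode_ld_immediate(bytes([0x31, 0x30])))
print(decode_ld_immediate(bytes([0x37, 0xCC, 0x00])))
```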
Oh, by the way, hex address 0010 is where the software interrupt goes. One thing they did change from the Z80 is they got rid of all the RST instructions, and they're just replaced with a single SWI instruction, which, now I remember about it, is here. That's it. All that does is an RST to this address. But what that means is that this is all real code now, rather than the mangled garbage we get when we try to disassemble text. So all this does is this: we start here, we read whatever is in this hardware register, we clear it to zero, and we jump to the routine that actually does the work. And then we have another vector here, which starts with a DI, so this is probably an interrupt handler, and then we jump to the routine that does the work. And look, the next byte is an FF, which is SWI. That would be garbage padding. Yeah, lots of garbage padding. On both the Z80 and the TLCS-90, the bottom of memory is the vector table for the interrupts and things. They're all eight bytes apart. So each one of these is turning the interrupts off and jumping to a routine at 3A9. So these will go on till we run out of vector table. Wow, we're getting quite a long way into it. Which looks about here. This is the same code we saw above again. No, these are more interrupts. Okay, here we are. Right, 80 is where things actually start.
I think I think this cars look kind of odd Anyway f8 oh f8 This is the first of our exotic pages so each one of these we need to copy the The opcode into A parameter And then go visit a different page Right, let's go find one of these in the wild So i'm looking for an f8 as the first well anything from above f8 f e f e really That does not seem to match honestly That's not a reg asp Yeah, let's find another one Let me come back to bitus later There we go test f8 plus g so g is the uh register number encoded into the bottom seven bits of the the uh prefix byte and then the Other parameter Is taken from the bottom seven bits of the suffix byte Well from our perspective, these are going to be straightforward. They're all going to be Write the prefix byte into Uh a Parameter and then we are going to want to change template and there are actually four banks Sorry, not change template change bank. There are four of these banks Oh, these ones are special Okay, I was confused for a moment because there are four banks, but five available slots as it were So if you include the base Then you've got this one this one this one this one and this one but actually Two of these share a How does that make sense? So this is the one from f8 to fe. This is the one we're using now is e o to e seven And f o to f three ah, right Yeah these And these use what use one bank these And these use another bank. So you've got the base the source bank the desk bank and the reg bank right Let's use their numbers. So I'm going to use t two to mean switch to bank two And then here We are going to need Another 256 entries So there Like this in fact I did that wrong. I want to copy this once So I can put the one in place here And then duplicate this one six times The reason for that is that allows me to search for the number I need to change Rapidly in vim. Ah, also, I need to copy more of them Not seven times. I wanted 14 times. So we should have eight here. 
So another eight should do it. So we now need to implement the T modifier. Oh, have I got myself confused? Because of course, we can't switch to the other templates until we've done all the operations for the instruction we want. Yeah, I hope that's going to work. So when we get a T byte, we need to be sure that we have finished reading all the prefix bytes. It's now time for the secondary opcode. So: read the secondary opcode, look it up in the appropriate table, and then continue. So we'll switch to the template read from bank two and then continue processing bytes. Okay, so what's that going to do? Unimplemented template for byte 3F. So we have read the template byte. No, actually, what's happened is that we've switched banks correctly, we've read an empty template, and we have terminated. So in actuality what we need is another error check block here, like this. Okay, that worked. We have switched to bank two. We have tried to load the template for 6C, which is one of these. So zero, one, two, three, four, five... wait, 6C: zero, one, two, three, four, five, six. That was one of these; that's this side here. Okay, I'll go look that up. That's AND. Those are the 16-bit ANDs. I want the 8-bit ANDs, which are here: F8 6C. Yep, that's the right one. So this is no parameters. We read a single constant 8-bit byte and then we emit that. So the G here is parameter zero. For these we're going to have to be pretty clear about the order of the parameters, or I will just get confused. We want to read a byte into parameter... no, we don't. We want to put the opcode into parameter one and the byte into parameter two, because this is an ALU operation. So it's the ALU operation based on parameter one, which is 6C, and then G comma N: we want the G register based on parameter zero, followed by N from parameter two. I think that's correct.
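The bank switch can be sketched roughly like this: the prefix byte stores its register number, then the secondary opcode is looked up in a second table. This is a Python illustration under assumptions, not the template-driven code from the session; `decode` and `decode_bank2` are hypothetical names, and only the 68..6F block of bank two is filled in.

```python
R_REGS = ["b", "c", "d", "e", "h", "l", "a"]
ALU_OPS = ["add", "adc", "sub", "sbc", "and", "xor", "or", "cp"]


def decode_bank2(reg: int, code: bytes, pos: int):
    """Decode a secondary opcode from the register bank (bank two)."""
    opcode = code[pos]
    if 0x68 <= opcode <= 0x6F:
        # 8-bit ALU operation between register g and an immediate byte.
        text = f"{ALU_OPS[opcode & 0x07]} {R_REGS[reg]}, 0x{code[pos + 1]:02x}"
        return text, 2
    raise NotImplementedError(f"bank 2 template for 0x{opcode:02x}")


def decode(code: bytes, pos: int = 0):
    """Decode one instruction; returns (text, bytes consumed)."""
    opcode = code[pos]
    if 0xF8 <= opcode <= 0xFE:
        # Prefix byte: remember the register, then switch to bank two
        # and carry on with the secondary opcode.
        text, used = decode_bank2(opcode & 0x07, code, pos + 1)
        return text, used + 1
    raise NotImplementedError(f"base template for 0x{opcode:02x}")


print(decode(bytes([0xF8, 0x6C, 0x3F])))
```

Raising on a missing entry plays the same role as the extra error check: an empty template in the second bank is reported instead of silently ending the instruction.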
So this is the instruction. We have just possibly passed f eight six c three f f eight G register zero, which is b. So that is correct six c three f good And I just want to check to see what d down here and we want bank two this one Uh, yeah, okay. So we've done this block here Right unimplemented template byte e f that's a base bank instruction E F that's the one here, which I can't see That will be desk n. So Oh, actually, we're going to have to do the whole desk block. So in fact these are going to be The same as here Except they are going to be going to a different block The desk blocks go to We were looking for the e eight bank Bank four bank four seems to be largely empty. Thankfully, of course Uh, e f itself is special because it's the it takes a constant so I'm gonna have to find one This was supposed to be a short video I feel like this architecture is working pretty well even though it's Rather verbose in terms of code filling out the Uh, the tables is going to be time-consuming But we are doing chunks at a time, which is nice. 
I'm looking for an EF. There must be one. Here it is. Okay. So we have a single byte of prefix payload, followed by the opcode, followed by a suffix byte. So the first thing is, let's just copy this. This is going to be bank three, just to save time, as I'm going to need it later. Actually, once this is done, it's probably going to be worthwhile changing at least some of these into code. And this is bank four. Because some of the banks are very empty; this is one of them. We need to go all the way back up to bank zero. EF, this one here. We wish to... and I think I have managed to stuff myself, because the parameters don't carry information about what they are. I mean, here we need to read a byte's worth of parameter, but the template can't tell the difference between this and this. We're going to need to use a different template to emit the instruction depending on whether it's an EF or an E8, because the E8 is dereferencing a register and the EF here is dereferencing a direct page value. But the way I've done this, each of the tables for the banks needs to use the same template. Now, I could get round this by using more banks, so that we have one bank for the registers, one bank for the 16-bit values, one for the 8-bit values, and then one bank for these three index dereferences, and yet another bank for this. But that's not going to work. So I think the only possible thing to do is to change all these templates that I've done to a different model, where the type of the parameter is set when it is written to the variable. And then when it's emitted as part of the string we just say "emit parameter whatever", and we have a generalized parameter emission routine that looks to see what type it is and does the right thing. Which is going to involve rewriting some code. Okay, I'm going to take a short break, be back, and let's give that a go. Okay, so let's have a look at this. What can a parameter be?
The obvious answer is a register; a constant value (well, a byte constant value or a word constant value); a condition code, but I'm going to leave that for now because I have an idea; or an index plus displacement. So these are really tricky. The problem is that each of these actually has two different parameters. You've got the index register and the displacement. So in fact F0 to F7, with the exception of these two, have to write two parameters, one for the index register and one for the displacement, while these write one value. So the template that applies to any of the sources or dests is going to have to be able to cope with parameters which are either an indirect register or an indirect index and displacement. Yeah. So I'm going to have to have a normal register, or indirected via IX, indirected via IY, or indirected via SP or HL+A. But for the same reason that I'm not going to do stuff with registers and condition codes, I'm going to leave all that for the time being. So a parameter becomes the type, which is one of these constants, followed by either a 16-bit value or a reference to a string. This @at syntax here is Cowgol's union type. Actually, that needs to go there. It says that this member must be placed at offset zero from the beginning of the record, so value and text actually overlap each other. Type here will come after both of them. So we still have three parameters. Now, the reason for text is that if we're dealing with a register, we are going to look up the value and just stuff the string pointer directly into the parameter. This will make actually writing things out much easier. So when you write a byte parameter, it will show up as an 8-bit constant. If you write a word parameter, it will show up as a 16-bit constant. These will show up as IX plus the displacement, based on value. And I think that is all we need. So let's take a look at these templates.
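The typed-parameter model described here could be sketched as follows. This is a Python stand-in, not the record from the session: the names `ParamType`, `Param` and `format_param` are hypothetical, `value` and `text` are separate fields rather than an overlapping union, and the output formats are assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ParamType(Enum):
    TEXT = auto()   # pre-resolved string, e.g. a register name
    BYTE = auto()   # 8-bit constant
    WORD = auto()   # 16-bit constant
    IX = auto()     # ix + displacement
    IY = auto()     # iy + displacement
    SP = auto()     # sp + displacement
    HLA = auto()    # hl + a, no displacement


@dataclass
class Param:
    type: ParamType
    value: int = 0
    text: str = ""


_INDEX_NAMES = {ParamType.IX: "ix", ParamType.IY: "iy", ParamType.SP: "sp"}


def format_param(p: Param) -> str:
    """Generalized emission routine: look at the type and do the right thing."""
    if p.type is ParamType.TEXT:
        return p.text
    if p.type is ParamType.BYTE:
        return f"0x{p.value & 0xFF:02x}"
    if p.type is ParamType.WORD:
        return f"0x{p.value & 0xFFFF:04x}"
    if p.type is ParamType.HLA:
        return "hl+a"
    # Indexed forms: register name followed by the signed displacement.
    return f"{_INDEX_NAMES[p.type]}{p.value:+d}"


dest = Param(ParamType.SP, value=8)
src = Param(ParamType.TEXT, text="bc")
print(f"ex ({format_param(dest)}), {format_param(src)}")
```

With the type stored at load time, a template only has to say which parameter to emit, so the same template can serve a register prefix and a direct-page prefix.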
So we're going to have to rewrite them all again So Ink x that's this one that takes a 8 bit constant. You know, it takes a direct page constant So i'm going to say put the direct page value into parameter zero Write parameter zero. What have we got here? Take a word put in parameter zero write parameter zero This is another direct page So I think this is actually going to make things easier. So this one Ha ha ha So this is these already 30 this this block We want to turn the bottom seven bits into a r register And put it in parameter zero We then want to take a Byte put it in parameter one So these become this This last one is the direct page version We want to Read a direct page byte into parameter zero Read an ordinary byte into parameter one and output Okay, this looks like it might be beginning to come together for these we want to read q registers from The opcode into parameter zero Yeah, q and r will always have to read from the current byte Things like d will have to read a new byte This is the alu op one So we want to load a alu op Into parameter zero This will be omitted actually hang on that will be omitted as text we want to read a Yeah, I want to read a word into parameter one And it's the text in parameter zero hl comma one and this is the same but What are these seven eight oh 16 bit values so alu operation seven zero These hang on that's wrong We're not reading a word. We're reading a direct page address Well these however are reading a word so Read alu op into zero read a word into one Meet the alu op hl comma one right So we scroll down and we're now at e Eight we are dealing with desk instructions. This is the one that flawed us before so We are reading a 16 bit value a q register from the opcode into parameter zero And we are then switching to table What does e8 use? table for For this last one We are reading a Oh, actually we do this one as well three so that's e8 this one We are reading a word value to parameter zero This one. 
We're reading a direct page value into parameter zero So the same table should therefore apply to all of those Okay Now we're on f8. We're on these register operations here This is reading a pants This is reading either an eight bit r register or a 16 bit q register G register q register And which one it is depends on whether it's a depends on the instruction It changes the interpretation of the register Oh great This means that we cannot decide what the register What the actual string value is Until we write it out Is there any kind of order to this? No, these are just jumbled And these ones aren't even like registers I was really hoping not to be able not to have to decide But I was really hoping to be able to decide what they were at this point Because I was hoping that when we read the register uh You know like these We would know when we read it it was a q register So we look it up in the q table and stick that string constant pointer Into the parameter So we're going to have to have text byte word and we're going to need r registers and q registers So the The r and q modifiers that we came up with for here are still going to apply that just instead of turning it into a text thing No, no, that's that's doing the same thing again, isn't it? This needs to be an abstract register. I think we're going to have to do Something pretty dubious So these will still apply that's just going to put a string constant into the parameter field But when we come to these we are going to We're going to use the a By to say this is a abstract register value from zero to seven And then we're going to add a special modifier Which will change the type of this parameter from an abstract register to a string constant Which will then be invoked by one of these other templates So this one f eight block is table two. So it's actually going to be this So what are the six eights? 
These these ones so we know that the G here is going to be a 8-bit register So we simply say this is going to be the uh the low We want the low part of this register call that six eight bits Yeah, and We are then Reading the alu operation in from the current opcode into l one. That's not going to Is that going to work? Now we want that to happen last But we do need to read a byte Into parameter one The alu operation goes into parameter two We then write out Parameter two, which is the alu operation We then write out The destination Which is in parameters zero that we have here converted to the appropriate register type And then here we write out the source value Yikes, so bank three templates. I believe was empty bank four templates There were some stuff in this Somewhere no, they weren't Yeah, I never got that far. Did I Okay, so now we need to change the state machine to do the appropriate thing So Be read to byte. In fact, we can be a little bit Cleverer here So param is pointing at the correct parameter Or it'll be a hyperspace pointer if it's a t So when we have a b what we want to do is to say the type is byte parameter The value is One of these w is exactly the same thing but word word O is gone D is a direct page. So this is actually going to turn into a word n and These have gone those are these are printers What do we have we've got You read a queue register from the current byte so this is going to be text and the text value is going to be queue registers current and Seven that will work our registers And we have alu operations And we have abstract registers Where the value is just the register value itself t remains Unchanged That build no it doesn't 1.5.0 Okay, this will fail when I run it. Yep, because we haven't done any of the printers yet and in fact In fact, this is now invalid because our printers The zero one and two Are a single byte. That's fine. We can put that here. 
I'm sure this could be more elegant but If it's a valid parameter number then call this routine to print it otherwise we go through this routine to load it writes a print parameter So this should actually work Unimplemented template for byte six one who we should have done. Oh No, we that should work. We should have done that one six one six one adca comma n Did I break the template? This is now mostly templates Six ones here So I think what's happened is like I have a bad template and it's read the wrong number of bytes So what is seven three seven four? zero one two three sbc Sbchl comma direct. Okay, that's Okay And what about this one seven nine seven seven wait seven nine seven oh seven five seven nine Yeah, that's a three byte instruction. All right Four is five four Yeah, push ix that also looks right So I don't really know why this is failing now But was working previously Okay, well, let's do that print parameter routine It should be fairly straightforward. I haven't yet figured out how to make vim's control p auto complete select an item On tab rather than return Okay, there's more but i'm not going to touch them for now. That's not working print print x word print x byte Right. Well that seems to be doing a thing Did I implement this? I think I did implement that last time. That's direct page into a which should be easy anyway six one So it's direct page into Now hang on a your operation to zero direct page into one print zero a comma one that is Yep common throughout six one seven two Yeah, I think we're going to template for byte two zero lda comma b Okay So R register in zero lda comma zero Two zero lda comma b. Yep, that's looks right d31 three two Yep d31 three zero p27 byte cc This is definitely different disassembly than I was getting last time So maybe it was just wrong last time i4 is five four really push ix. 
Yes, it is seven nine seven zero six five seven nine Yes, that's a three byte instruction and we do indeed have three bytes Okay Maybe it was just wrong last time So cc right, we're going to have to do those condition codes and that's going to need another table Are the condition codes here the same as they are in the other tables f l t le f l t le p n z n c p n z n c right, let's do a table l t u le p e z and if it looks like there's quite a lot of condition codes And then you will be very pleased to hear if you're a z80 programmer That this machine has Signed and unsigned comparisons Trying to do signed comparisons on the z80 is a Exercise in frustration. It's terrible Not as bad as it is on the 8080, but It's a grim grim business Go So this is going to need a new modifier byte, which we're going to call C And this takes the bottom Four bits so where were we zero of bank One So all of these are going to be condition codes into zero Oh under displacement I haven't done displacements Now this machine has a single byte and double byte displacements, which is rather exciting Let's do a byte for now All right, so so cc is cc 37 c See yep p o 37 There should be a long form jr somewhere. Here we go jp long form They use the same Unlike the z80. They're assuming that the Assembler is just going to do the right thing And pick the appropriate form depending on the displacement Which as a compiler writer is lovely. Okay c8 is special Because this does not have a condition code. It's just the It's wait. Did I write jr? I'm sorry. 
What I just said was rubbish Uh, for some reason I thought this was a jp Right that is less nice for a compiler writer Because I would much rather have the assembler take care of picking the appropriate form depending on where you're jumping to And in fact the assemblers I've done tend to do this and yes, this is using Yeah, you see our real a real instructions started at one zero, but because we are We have different synchronization now for some reason Then it's It's not disassembling from one zero. We re-sync with the actual instructions set here at one eight so the real Code is down here in eight three, right? We haven't done that yet so This picks the low byte of the I did Yes, because this is pointing to the right, which is where the low bytes are This is going to pick the eight bit version of the register text equals This is a r register That's already been done for me. So that's all I need Whereas this one Is going to pick a q register now. What does this done? This is the instruction that we were worried about which is f eight six c three f f eight Oh, and of course this is Okay, right that is the right register Uh, why are you there? That's a bad template bank two we are Here You should be ones compare b with three f f eight six c three f so If we go up to the eight bit section look at the full documentation. 
Here we go: F8 6C. So this is reading a byte from after the opcode. 6C is not CP. I think I put this... all right, okay. So L here does not read a byte from the stream, so it's actually pulling a byte from here, which is where the constant is. So that's wrong. Our instruction here, which is F8 6C, should be... F8 6D... where is 6C? Here we go: F8 6C 3F. It should be an AND. So we wish to read a byte. When we reach this template, we have read the opcode byte, which means L will refer to that. Then we read the byte of payload. AND B comma 3F. That's correct. Good. Right, template... actually, no, hang on. Before we do that, let's start doing displacements. For displacements, we need to add on the current address, and we're going to need some more modifiers to load them, and we are slightly running out of memorable ones. So we're going to have J and P: J for a short jump and P for a long one, for I hope reasonably obvious reasons. So these both take words... these both produce words. The value is the current address, plus the displacement sign extended, plus a constant. The constant varies depending on where the processor has its instruction pointer at the time when the addition is done. And this can vary from processor to processor, from the beginning of the instruction to the end to somewhere in the middle. And I cannot remember what the Z80 does, but it's likely not to be the same as this. So, 8-bit displacement. There should be a comment here somewhere. A lot of transfer instructions, and then we get onto hardware registers. Where are our jumps? I wish this table was in alphabetical order. Shifts, test... here we go, jumps. It just says PC plus d, so maybe that's just the address.
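The displacement arithmetic described here (current address, plus the sign-extended displacement, plus a processor-dependent constant) might look like this in Python. The names are hypothetical, and `bias` is left as a parameter precisely because the session has not yet pinned down which address the processor adds to.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def relative_target(base: int, disp: int, bits: int = 8, bias: int = 0) -> int:
    """Return the 16-bit target of a relative jump.

    base: the address the displacement is relative to (an assumption here;
          it depends on where the program counter is when the add happens).
    disp: raw displacement as read from the instruction stream.
    bits: 8 for the short form (J), 16 for the long form (P).
    bias: extra constant for processors that add from a different point.
    """
    return (base + sign_extend(disp, bits) + bias) & 0xFFFF


print(hex(relative_target(0x0037, 0x11)))           # short forward jump
print(hex(relative_target(0x0010, 0xFE)))           # short backward jump
print(hex(relative_target(0x0096, 0x1B00, bits=16)))  # long form
```

Masking with 0xFFFF keeps the result inside the 16-bit program address space, so a backward jump from a low address wraps rather than going negative.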
Okay So we want to take The value As a signed 8 bit uh value We then extend this to be a unsigned 16 bit value and add it onto the address for p Then we just add on the value What this is doing here is it's sign extending the bottom 8 bits to be 16 bits wide Now whereas where are our jr's? Instead of a byte this is going to be a J that's not right We didn't read the byte. That's what we didn't do Yeah, I was looking at uh these ones when I cut and pasted those things 37 plus 11 is 48. So Maybe that's right Let's find another jr No, we haven't met any yet Uh, okay now we're going to get on to byte 20 bank four Which is what bank four is this one? 20 we've got these so the source is the It's the r register baked into the secondary opcode The destination is the thing that we stuffed into parameter zero So if assuming everything works this should be relatively straightforward So we wish to Pull the r register Out of r1 Yes, and the the destination is always going to be indirect Which means we don't have we know it's going to have to be a We don't have to do the same horrible thing we did with the angle brackets so the Destination source. I think that is all there is to it and we want seven of them so What's this going to do? eight five Ld ffc cc comma b There aren't enough bytes in that five six seven eight nine Oh, no, that's that's I think right E f Is this one destinations direct page? Therefore the next byte is the direct page address, which is the same cc. We keep seeing And two zero is the opcode, which is store Yeah, that that code makes sense. It's loaded the value at that hardware register. It's masked it. That's not right Really should that be a b two seven cc two Seven No, that's an a okay, apparently that's right Good it's this actually working Or a byte a six bank two This is bank two a six is F e d c b a A six these are these bit operations. I just realized I could simplify We don't need this special reg type. 
We can use a byte for that Because these don't care what the actual type is And it allows us to use them here to get the bit number So we want bank two A block a six. Oh, that's actually These ones on the left. These are the shifts I Don't think we need a table for the shift instructions or the roll instructions No, we do they're here as well And you can only roll a's apparently let me double check that Sorry, you can roll registers Yeah, you can roll lots of things But they appear in different banks, which is why I'm not seeing them Also, it's getting late and I'm getting slightly scraggly so table time you see r of c rl s r a s r l and yes We need another Thing here for rolls and we've used r and we've used l We used s. We have not used s back to a six and these are all identical We need to read the Uh, the shift type into parameter one Amit parameter one amit register is this bank two Oh, this is bank three There are so many of these runs where all eight are the same that this will compress quite well into code Okay, what did we get f nine a six? Oh, yeah, this was the bank where we need to do the nasty shifty things So that needs to be like so All right f nine a six s l l c a six s l l And that is an eight bit register. So see good By 28 space bank That's these notes this one Uh, I don't know what that is. I'm going to have to look it up. That's another special two eight. Here we go load a into a register And I'm getting my eight confused with my s. Yes, I'm getting tired. I probably need to Uh have another break Probably for the night, but this is actually getting somewhere. It's now mostly a matter of lots of data entry so it's a R register Which is the destination Like this We have seven of these and then one of something else Where is Two f here it is two f n So two eight is ldb comma a Okay Back to Here so f Two right. 
This is our first index. This is SP plus something; it's reading something from the stack frame. And this is going to require more modifiers and actually dealing with the whole index stuff. So that actually seems like a good point to pause. So I will see you all in probably a few seconds. So it's the second day. I've loaded up my development environment. I've got the PDF open. Let's get going. So we had stuck at F2. This is this block of source indexed operations. So we're going to need to add a parameter that turns one of these index operations into the appropriate type of register, and we're going to call that, of course, X. So let me just check that X isn't already used. I've completely lost track of what modifiers we've got. No X's. All right. So to do this, it's very simple. We load the bottom two bits of the opcode into parameter zero and then switch to one of the other templates. For this one it is in fact templates three, and for the F4 block it's templates four. So we simply have this four times and this four times. Now, the implementation of the X modifier is going to not be as simple as I was thinking, actually, because, yeah, this was the thing I was talking about earlier. Each of these indexed operations really has two things we need to know, which is the register and the displacement, but our parameters can only store one value. So we're going to use the value to store the displacement and then have four different parameter types for the four different index registers. We're going to treat HL plus A as one of these even though it doesn't have a displacement. So we actually only had three.
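A small sketch of the X modifier as described: the bottom two bits of the prefix pick the index register, and every form except HL+A reads one extra signed displacement byte. This is a Python illustration; `decode_index_operand` is a hypothetical name and the parenthesized output format is an assumption.

```python
# Index forms in prefix order: the bottom two bits of the prefix select one.
INDEX_REGS = ["ix", "iy", "sp", "hl+a"]


def decode_index_operand(prefix: int, code: bytes, pos: int = 0):
    """Return (operand text, displacement bytes consumed) for an indexed prefix."""
    selector = prefix & 0x03
    if selector == 3:
        # HL+A is treated as an index form but carries no displacement.
        return "(hl+a)", 0
    raw = code[pos]
    # The displacement byte is signed.
    disp = raw - 0x100 if raw & 0x80 else raw
    return f"({INDEX_REGS[selector]}{disp:+d})", 1


print(decode_index_operand(0xF2, bytes([0x08])))
print(decode_index_operand(0xF0, bytes([0xFE])))
print(decode_index_operand(0xF3, b""))
```

Keeping the register in the parameter's type and the displacement in its value means a single parameter slot is enough, which is the point of adding one type per index register.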
So let's add another And make sure they're in the right order i x i y s p h l x i s p h l so So what this is going to do is We need to Create the register based on Let's go create the parameter type based on the bottom two bits, but then we need to load a byte into the actual value Which of course will vary depending on the value Uh Yeah, we can do that. It's it's a little bit annoying, but so Parameter type is not HLA like this and let's actually put this Up here So this takes the bottom two bits of the opcode Uh, we add on param x i x which gives us the right type here If it's not this one then read an extra byte for the displacement That should work And we also need the code to print a A value print parameter So that's going to be When param x i x so Except that these Uh displacements are signed So So that will print the sign and the value That needs to be a int eight. Okay, so i x i y sp and finally This one is simpler hl plus a Okay, so what does that give us? Right, it's selected the appropriate bank. It's read the opcode. Maybe correctly But we haven't put the actual template there so This is The five zeros of bank three. So that's these that's going to be one of these. These are the ex instructions So this is actually a q register Is it a q register? Yes, it's a q register in the opcode so that will be q one zero One twice followed by a gap Repeated so what does this give us? Ex sp plus eight comma bc f two eight five oh so f Zero one two it is indeed sp plus a offset followed by Five oh bc that looks like that has disassembled correctly Okay, back to the base page for one seven Ld ar hl comma dd. That's an easy one. So the uh Hope it's an easy one. The ld ar instruction is a 16 bit load Into hl using a displacement So it gives you a relocatable code And this is going to be I forgotten what the displacement bite is Don't actually want to put it down here Where is our displacement? 
Oh j and p of course So for one seven this will be Hl comma and this is a p displacement That's not done what I expected should have been the current address plus the word So current address should be nine six plus one b o o Oh, yeah, no that is that is right. I was just confused by b one and one b slight dyslexic moment there okay we're now on to four two bank four One two three four this block here Okay So these are all uh copying 16 bit registers Again, it's a q register in bottom followed by ld One ld zero comma one interesting Oh, yes, uh Okay, well first dark these need parentheses and we also Yep f 606 4 2 F Okay, its destination is sp plus constant and four zero one twos hl Okay, that's correct. I hopefully We have all the template stuff working. So it should now be data entry from from now on Which is possibly not terribly exciting to watch but If you're really lucky, I'll discover another major design flaw and have to rewrite it to open scratch 3a and As I'm implementing blocks at a time eventually we'll run out of blocks and we should have Uh coverage of the instructions that are used in this particular file so these are Loading a 16 bit constant into a q register So we need the opcode. We need a word I load destinations to register. That's the word But the last one 3f is special. That's a 16 bit thing and we'll deal with that later Okay, hl 4000 3a is Hl good six f bank four zero one two three four five six oh alu operations Bank four has very little in it actually So that's this block here now. We have the destination parameter zero the parameter one is going to be the Opcode which is an alu operation followed by a third parameter which is the byte So it is the opcode followed by the Destination followed by the byte eight times and as this is a slightly complex instruction, let's go and Check this against the detailed documentation Where are my eight bit operations? all right so efcc Here we go ef Yeah, efcc it's direct page. 
That's doing it right: 6F for the opcode, F0 for the payload. Yes, that is working. I'm slightly surprised. Back to the base page for E6. I've not done the source operations yet, apparently. Okay, so these are the same as the destinations, but different: they go to a different page, page three. So it should be that simple, really. Q register into parameter zero, except for the nn and n forms, and then we go to page three, and then we hit a missing template for block 70. This is page three; 70 is more ALU operations. The destination is HL and the source is the source parameter. So that's going to be the ALU operation, our destination is HL, then the source, seven times. ADD HL,SP: is that right? E6 70. Well, 70 is ADD, that's correct. The prefix byte was E6, yes, the source is SP. ADD HL,SP, good. 4A, bank three. For 4A we load an indirect value and put it into a Q register. So where is our bank three? Q register from the opcode, the destination is the Q register, followed by the source. So E2 4A just dereferences HL into HL. Let me check that's correct. 4A: destination is HL. E2: source is HL. Good. This one it did for us, except that I didn't put the parentheses in. Let me double check that one. Yep, the destination parameter is dereferenced, therefore I need parentheses: SP+04. Yep. Okay, and back to the base page. I strongly suspect that the way this is going to end up is that the base page, which is complicated, is going to be a lookup table and the other pages are going to be code. Although, now I look at this base page, it is much less complicated than it looks from the chart, so maybe not. Anyway, 96 is one of these. These are: increment a Q register, except for this one, which is special. So, really simple, no parentheses; that should be just that, seven times. 96: INC SP. Yep, that's right. 5E: POP AF, one of the registers that's not a Q register. Now,
We should have done pushes. Okay, this is wrong, actually, because these are not Q registers. But you know what, the only places where AF is used in the entire chart, except for up here with EX, which is kind of special, are this push and this pop. So let's just do that, and then here it's going to be the same. This means we don't need a chart for the registers which aren't Q registers and, much more importantly, I don't need to come up with a name for them. 5E: POP AF. Right. 1E is RET. It doesn't get simpler than that. And I think this one is going to be RETI, so let's just put those both in. And in fact, now we're here, we might as well do these ones as well. 08, 09, 0A: this is EXX; now these are EX DE,HL; EX AF,AF'; DAA, decimal adjust. That's gone out of fashion these days, now that BCD is no longer used much. I have never used BCD myself. And I don't know what this one is. I think it's probably going to be a DEC on an address, but I don't know whether it's 16-bit or 8-bit, so we'll deal with that later. Right, we actually got a lot further. INC SP, POP AF, RET, and then lots of SWI padding. FB... LD IX,01FF, yes, followed by a bad template char. Yes, that may have hit one of these. If we see a question mark, we actually want to print it, so that needs to be added to this list here. FB, okay. Yep, that's an 05. It's this here. It is not an instruction. Followed by an SWI, followed by a very obvious jump table, a massive jump table, followed by trampolines. Well, BC is obviously the parameter to whatever's at 0080. I'm not going to start reverse engineering this right now. So we're at 5E2, byte 1C. It's a call. And in fact, now we're here, I'm going to do... that is a long relative jump.
It was using a 16-bit displacement. We have an absolute call using a 16-bit address, followed by a long relative call using a 16-bit displacement. Okay, bank four, byte 37: 0, 1, 2, 3... 37 is this. Okay, yeah, this was one of the instructions, these two actually, which made me stop trying to reverse engineer the encoding, because these use the parameters the other way around to the rest of the instructions in this block. So this block is for destination instructions, where the destination is supposed to be encoded into the prefix byte. And in fact, looking at it, it's these ones which are wrong, because in these the source is encoded into the prefix byte. These ones are actually normal. Okay, bank four. So we need to read 16 bits from the instruction stream into a word, and then it's parameter zero, parameter one. And I'm also going to do... hang on, I'm putting that in the wrong place. Let's go there. And this is similar, but with a byte. Right, 3... oh, okay, I was careless there. That actually goes there, and the other one is 3F. 37, 3F. And that gives us a behemoth five-byte instruction with a fixed 16-bit address and an 8-bit payload. Yeah, that looks like it makes sense. All right, we're at 646, byte AB, bank two. Here is bank two: F, E, D, C, B, A; AB, that's this block of bits. Oh, it's these again. So the parameter is an 8-bit value.
So we do this... here, rather. We want to read the bottom three bits into parameter one, which is a... 1, rather. BIT 1,0. BIT 03,A. Yeah, that will do. FE AB: AB is 3. It'd be nice to lose that zero, but I think I'm not going to worry about it for now. We could actually do it, because A is only ever used to convert a value to a register using an angle bracket, or to print a seven-bit value, so we can actually distinguish, but I'm not going to worry about that for now. Byte BB, base page. BB: more bit operations. This one is SET with an indirect through direct page. Now these three blocks are interesting, because these give you a two-byte operation that will manipulate anything in the direct page. This is obviously for tinkering with hardware registers and hardware peripherals. So they were obviously thinking that this was going to be a heavily embedded system that did a lot of hardware control, because in most code you don't touch peripherals very much; you tend to have a library that controls them. So they were obviously expecting there to be lots of peripheral control code, and this would save space, or possibly performance. Anyway, block BB: it's going to be an 8-bit constant for the direct page. So put the bit number into parameter zero and the direct page value into parameter one. So, BB E6: set bit three at this address. Looks plausible. Right, what is that one? Let's go look at the detailed docs. 3F. That seems familiar. Oh, in one of the other pages it's familiar. Let's look at the detailed documentation for it. Here it is: load a 16-bit value into a direct page indirection, with the LDW opcode.
Oh, yeah, it's very similar to this one. So D0 for the direct page, a word into parameter one: LDW 0,1. And the reason why it's LDW is because the assembler cannot tell whether you meant this instruction for an 8-bit constant or this instruction for a 16-bit constant. If you were supplying a register as well, that would be easy. Here it is. Okay, that's plausible. Byte 6E: an ALU operation with an 8-bit constant. 6E, yep. 8-bit constant to parameter zero, print the ALU operation, A,1, seven times. And that was this one, I think. 6E: that seems to have worked. Code, code, code. 68B, byte 48. That's this one, whatever that is. Looking at the other instructions, that's probably writing HL to an absolute address. So check the detailed docs. Here we go. Now, 48... ah, it's the move HL to a 16-bit register; that'd be a Q register. So that's actually going to be the same as up here, but the other way around. It's a Q register to parameter zero. Wait, that's all there is. So, like that. 48: BC,HL, okay. 44... and that's the other way around. And there are gaps, actually. So 40 is Q register into parameter zero, followed by that. That's interesting. Isn't there a way to copy BC directly? Hang on: HL into BC, IX into HL. Okay, yeah, it does need two copies. 43. Really, 43? That's a bad opcode. That looks like an actual instruction stream, I mean, that makes sense. No, hang on, hang on. This is ASCII. That's a zero, that's a space. Yeah, okay, this is garbage. This is the end of the actual code, where we jump off to whatever's at 5B6, and then this is a string. Just for my sanity, let's just do this: print. So this will use a space there. I mean, nothing's lined up, but it will print the ASCII codes next to the instructions.
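Here is a minimal sketch of that idea, printing the ASCII rendering of each instruction's bytes next to its disassembly. It is Python for illustration only, the function names and column widths are made up, and unlike the version in the session it pads the columns so they line up.

```python
def ascii_column(data):
    """Render bytes as ASCII, replacing anything non-printable with a dot."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)

def format_line(address, data, text):
    """One disassembly line: address, hex bytes, ASCII view, then the instruction text."""
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    return f"{address:04x}  {hex_bytes:<18} {ascii_column(data):<6} {text}"

# Bytes that are really part of a string stand out in the ASCII column.
print(format_line(0x0688, b"VHD ", "?"))
```

Strings embedded between routines then show up at a glance, even when the decoder is happily turning them into nonsense instructions.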
So here we go: "VHD check". I bet that's going to be "VHD check", and this is a string produced by some kind of self-test routine, which means that, now I've located the string in the ROM, I can reverse engineer it, find out where the string is printed from, and try to figure out how to make the typewriter actually run that bit of code. But before we do that, we have to go to 4B. Here... should be there. Yep, "VHD check", followed by... oh, in fact, here, you see, this is loading the constant value 0688 into IX, which is the address of our string. I bet that this is going to be the length: 10 bytes, well, 16 bytes rather, that's hex. 1, 2, 3, 4... 688, 698, yeah, 16 bytes. So I bet that this routine is actually going to print it somewhere. 5B6, which it's not disassembling because it got desynchronized somewhere, which is a shame. Anyway, where are we now? We're all the way up at 712. We're actually getting through this. Bank three, byte 2E. 2E: dereference whatever it is and put the result into an 8-bit register. So, 8-bit register. Do we need to do the thing for this bank? It looks like we don't. Let's not do the thing. I'm not even sure that the thing would work if one's the wrong parameter type; it'd probably just use garbage. The destination is an R register from the opcode, into parameter one, so 1,0, seven times... or eight times. 2E is indeed here. E3 0DFA. Let me just be paranoid and check the docs. We're looking for LD A,nn. No, we're not, we're looking for this one. No, we're not, we're looking for this one. E3, yep, 0DFA is a 16-bit address, followed by the encoded register.
Okay, that's cool. 62, bank three. We are in bank three, so 62 is this block: ALU operations from the source parameter into A. So, print the ALU thing, A,0, eight times. Yep, that looks plausible. You see that both of these have the same prefix byte, so we don't know whether it's an LD or a SUB until we read the last opcode at the end of the operation, which is not something I've seen in an instruction set before. This is quite weird. There is a logic to it and a certain amount of elegance, but it's unusual. Right, bank two, DF. Bank two: F, E, D. Okay, these are return via a condition code. There's a note here to say that the prefix byte must be FE, but actually I am not going to worry about that. What we need to do is read a condition code, which I believe is in C, into parameter one, and then it's just RET 1. Okay, 16 times. 29. And let's put these in the right place. There we go. So this is comparing A with the constant 10 and returning if the carry flag is not set. DF is this one, RET NC. So that's actually doing an unsigned magnitude comparison. It's subtracting 10 from A, and if the carry flag wasn't set then we do the return, and it won't be set if 10 is less than or equal to A, I think. Now, the reason why I'm doing each instruction as we find it in this file is because this makes it much easier to test.
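To pin down the "compare A with 10, then RET NC" reasoning, here is a small sketch of the carry flag after an 8-bit compare. It assumes the Z80 convention (carry means a borrow, so it is set when A is below the operand), which the TLCS-90 presumably keeps since it is source compatible; the function names are made up for illustration.

```python
def cp_carry(a, n):
    """Carry flag after an 8-bit compare (A - n): set on borrow, i.e. when A < n unsigned."""
    return (a & 0xFF) < (n & 0xFF)

def returns_after_cp_ret_nc(a, n=10):
    """True if 'CP A,n' followed by 'RET NC' would take the return."""
    return not cp_carry(a, n)

# The return is taken for every A with 10 <= A, and skipped for 0..9.
for a in (0, 9, 10, 255):
    print(a, returns_after_cp_ret_nc(a))
```

So the pair works as an unsigned "return if A >= 10" test, which matches the guess that carry is clear when 10 is less than or equal to A.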
I could just start from the top of the data sheet and work through them, but that would also be much less interesting to do. Okay, we're still on bank two, 62: 0, 1, 2; 0, 1, 2, 3, 4, 5, 6. 62: ALU operation with an 8-bit register, and this is the block where we have to do the thing, as these are g's here. Do the thing, ALU operation into parameter one, and that's all we need: A comma, seven times. SUB A,B. Yeah. And on the Z80 this would be a one-byte instruction. I don't know what the code density of this thing is. It's got way more useful registers and way more useful instructions, but a lot of the things that on the Z80 were one byte are now multiple bytes. Bank four, byte C0. C0 are conditional jumps via a register, yeah, which was something the Z80 wasn't very good at. Where are we? We're on bank four. So we take the condition code into parameter one: JP condition code, destination. And this is using the addressing mode where you give it a 16-bit value. So yes, you can use these for going to a 16-bit constant address, or to jump to an address in a register, which is cool. And since we're here, and because it's almost exactly the same, I'm going to do the calls as well. Okay, so we've done both of those blocks. Byte 19 in the base page, technically one-nine because it's hex. That's this, DJNZ. Again, a nice thing that this has which the Z80 didn't: the Z80 only had the basic DJNZ, which is decrement B and jump if non-zero. Very useful for loops. This machine has a 16-bit form. So these two instructions, we might as well do them both. It's a 16-bit... DJNZ 0; DJNZ BC,0. Here we go: two bytes for a complete loop. Very nice. Then lots more instructions. This looks like main code. Where do we jump to on startup? It is the main code: we go to 0793. Yeah, this is the main routine that runs the entire typewriter. Turn interrupts off. They're already off, but never mind. Set the stack.
This tells us where we have some RAM, and I believe this is part of the built-in RAM on the system. And then we've got all these routines to set things up. Poke a hardware register: no idea what that is, but that looks like a built-in TLCS-90 hardware register, so I can look that up in the data sheet. Do things, and here we have a loop: we jump back to 07C2. That's halfway through this instruction, which makes me think... yeah, these are all one byte out, because this is obviously trying to jump to either here or here. Jumping to the next instruction is obviously wrong, therefore we have an off-by-one error in the 8-bit displacement. This one. So I'm just going to stick a plus one on that. So now that's... still wrong. 07C3 is halfway through this instruction. What's this loop doing? Okay, this is now pointing at an actual instruction. So this piece of code here is the main loop for the entire typewriter. We have all the setup and then we just go around here doing things. So what we're doing here: we load DE with a value, then we check the thing at this address, and if it is not zero we change DE to this other value. This is a very common idiom for "if this is zero, set DE to this, otherwise set it to this". Except that, well, no, we're also calling these subroutines, so that's not quite right. But anyway, we're doing a thing based on the contents of this memory location, and either calling this routine with DE set to 70E or this routine with DE set to 0806, which is interesting. What's at 0260? Push BC, set BC to 1, jump to 0080, and this will be a standard routine that does something. It's not immediately obvious. Anyway, let's go down to the bottom and see where we are. We've gone off the bottom of our main loop. We have padding, and we have an unimplemented opcode: 9F, base page. All right, that will be DEC something.
So let's go find DEC. INC, DEC... 9F, here we go. Decrease the value in a direct page memory location. Direct page memory location: DEC 0. And the other DECs are a Q register. Okay. That looks extremely suspicious as code, so I think this is garbage, but it is at least disassembling. 812, byte 80. This block here. Oh, this is exactly the same, but for INCs. So... hang on, hang on, have I got that wrong? 90... what's the thing I just did? 98, wasn't it? Yes, okay, good. Here we're DECing a Q register, a 16-bit value. Now we are INCing an R register, an 8-bit value. Okay, that's just straightforward. And this last one, which I might as well do, is a direct page: D0. So 80: INC B. So this takes us all the way down to 5F... 76, bank two: 0, 1, 2, 3, 4, 5, 6, 7. 76, this block: we are adding a Q register to HL. Well, we're doing an ALU operation of a Q register against HL. So it's in fact the same as this block here, but with a Q register rather than an R register, and HL. FA 76: OR HL,HL. That looks plausible. Back to the base page for 47. That, I think, is reading a 16-bit value out of direct page into HL. Two, byte 59: 0, 1, 2, 3, 4, 5. 59.
Ah, right, this is another case like the RETs where the prefix byte should be FE. So these are the block operation instructions that Z80 users will know and tolerate. They mostly look really useful, but only a few actually get used in real life. So the LDI, LDD, CPI, CPD: they do a thing and then increment a specified register. So the LD will copy from either HL to BC, HL to DE, or DE to HL; I can never remember which way around they are. It will then increment HL and DE and decrement BC; BC is a loop counter. This will do the same, but it will decrement HL and DE. So this will do one element of a copy; this will do one element of a copy in the reverse direction. These two are exactly the same, but they do compares and leave the flags set. The R suffix versions will then repeat the operation until BC is zero or, I believe, for the compares, until the appropriate flags are set. But I have never found a use for those. I think you may be able to do memcpy and memcmp with them. There are also, on the Z80, equivalent instructions that will repeatedly read from an I/O port, but they don't do any synchronization, so you don't know whether the hardware device is going to be ready for another read. So again, I've never found a use for them, and it looks like this machine has just left them out. Anyway... oh, haven't saved. LDIR. Yeah, here we go. This is the most common use of these. What this is doing is copying eight bytes from either here or here to either here or here. I think DE is the destination, actually. So where are we now? Bank three, A8: more bit operations. Am I in the right bank? I don't see any other bit operations. I'm sure I've done some. I've done these. Okay, A8, so the source is in zero; put the bit number in one. And these two are the same but different, so let's do those as well. Test bit three of this address. Looks... but the instructions do seem to sync up. Let me just double check that. Here we go: 3 of F9FB.
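For reference, going back to the block copy instructions: this is a small model of LDI and LDIR as the Z80 defines them, where the byte at HL is copied to the address in DE, HL and DE are incremented, and BC counts down. It is only a sketch in Python with made-up helper names, and it assumes the TLCS-90 versions behave the same way as the Z80 ones, which the session doesn't verify.

```python
def ldi(mem, regs):
    """One LDI step (Z80 semantics): copy (HL) to (DE), bump HL and DE, decrement BC."""
    mem[regs["de"]] = mem[regs["hl"]]
    regs["hl"] = (regs["hl"] + 1) & 0xFFFF
    regs["de"] = (regs["de"] + 1) & 0xFFFF
    regs["bc"] = (regs["bc"] - 1) & 0xFFFF

def ldir(mem, regs):
    """LDIR: repeat LDI until BC reaches zero (the check comes after each copy)."""
    while True:
        ldi(mem, regs)
        if regs["bc"] == 0:
            break

# Copy eight bytes, as in the LDIR found in the ROM (addresses here are invented).
mem = bytearray(32)
mem[0:8] = b"ABCDEFGH"
regs = {"hl": 0, "de": 16, "bc": 8}
ldir(mem, regs)
print(bytes(mem[16:24]), regs)
```

So DE is indeed the destination, HL is the source, and the whole thing is effectively memcpy(DE, HL, BC).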
That's this one. Oh, no, that's correct. I was getting confused about where the payload was. Anyway, we're at 8B3, base page. Which is it? Down here, 4F. This one. This will be the 16-bit version. Now, this will be this, but in the other direction. But I'm still going to look it up in the detailed docs. Direct page destination, HL source. Yeah, it is indeed the same as this but in the other direction. Okay: instructions, instructions, instructions. 951, base page, AC: bit operations. Now the bit operations are great, because I can do huge great swathes of them at once. In fact, I've already done SET, so D1... this one is RES. So, interestingly, the RES and SET instructions take the destination as the second parameter, where in the rest of the mnemonics the destination is usually the first parameter. Bank three, 87. 87: INC an indirect thing. Now that is just this, which means that this is going to be this, and these two are going to be INC word and DEC word. So being able to increment a value at an address, and a 16-bit value at an address, is quite nice. Wait a minute, doesn't the Z80 let you do that?
Where have we got to? BB4. Bank two, 38... this is bank three: 0, 1, 2, 3. 38: copy a 16-bit Q register into another register. Yeah, you can copy any 16-bit register into any 16-bit register, including IX and IY, which is really nice, but it takes two bytes to do so, unless you are copying in and out of HL. So HL now really is the 16-bit accumulator; everything is faster in HL. Anyway, bank two, 38 is: Q register in the opcode, LD 1,0. That is not right. Right, that's because we haven't done the thing. This is a Q register, this is a 16-bit operation, so we need to set this to be a register. Here we go: LD BC,DE. Good. Lots of code, lots of big instructions, four bytes. I did an 8086 back end for Cowgol the other day. Wow, that instruction encoding. It's so special. Byte BC, bank two: F, E, D, C, B; BC. Oh, more bit operations, excellent. These are the ones that operate on registers. So: set bit four of A. That's interesting. Is there an OR A,n option in the base page? Yes, there is. So they could have done exactly the same thing by simply ORing A with 16, and that would have produced precisely the same effect in two bytes. Yeah, okay. Still in bank two, 35. Hang on, this is bank two, 35: 0, 1, 2, 3, okay. Moving registers: that's the same as the ones on the other side, except using 8-bit registers rather than 16-bit ones. Okay, that looks good. C1... base page again, it's 14. It's one of these. Right, I'm just checking to make sure that these line up with the other index register forms, which they do. Sorry, the other Q register forms. So we can use Q register decoding for these three. Let's go find them. So there's 14: 0, 1, 2, 3, 4. So: Q register from the opcode, a word from the instruction stream, ADD 0,1. So, add 5 to IX. And now, you should be able to do this using one of these forms. Here we go. No, no, we want a... I'm looking for an ADD gg,x. I don't see one. Ah, yes.
Yes, I'd forgotten this: HL is the 16-bit accumulator, so most 16-bit ALU operations will only work on HL. So here they are. We have ADD HL,gg, which adds a... sorry, I'm not looking for gg, I'm looking for nn. Oh, yeah, but this works, because if it's an nn, then it will actually happen through the prefix byte. So this is the right instruction. So you can add a 16-bit constant to HL by doing, presumably, FE for the prefix, followed by your 16-bit value, followed by 70 for ADD HL,gg. But if you're adding a 16-bit value to IX, IY or SP, we have this abbreviated form in the base page. So it's interesting that adding a value to IX is cheaper than to HL. Anyway, bank three, byte 18. TSET. What is TSET? It looks like a bit operation, but... oh, it's TEST with a typo. Okay. I should say, there are typos in this. So, find that page. Right, we do need the thing, then we need to extract the bit, then it is like so. All right, I have a bit of a feeling I am in fact looking at the wrong page of the data sheet. So I don't need this, but I do need this. Okay, where have we got to now? 1226. Base page, byte 13. Right, this is multiply and divide HL by an 8-bit value. Oh, and we have a couple of trivial instructions we can put in: complement A, negate A. So this is straightforward: it's a byte value, followed by... it's a shame it's only an 8-bit multiply and divide unit, because that makes an enormous difference. But even just having 8-bit divide in hardware is nice. Look, you can divide by three in two bytes. 14. Right, so these are actually very similar to these, but using different addressing modes. So how is this working?
We have the source value in a register, okay, so we need to say that these are 8-bit registers while these are 16-bit registers. We are now done for multiply and divide. For ADD we are also done. No, we're not, we need this. The destination comes from the Q register in the actual opcode, while the source comes from the prefix byte. I think that's correct. So F8 14: 14 is ADD IX,BC. Good. You'll notice we're moving through the file in bigger and bigger chunks. When we reach the end, I will then go through the data sheet and fill in the remaining opcodes and just kind of hope they're right. Base page, 8F. 8F, that's... whatever that is. That will be decrementing a direct page 8-bit value. Yeah, we've already done this, so we might as well do the rest. So this one becomes... so. And then the next row down is the same, but with 16-bit values. And this is still direct page, so that's D0: INCW 0. And this should actually be DECW 0, because it's a 16-bit thing. So here we go: decrementing the thing in a direct page address. 2312. Ah, look what I found. It's exactly the same things that we had before, but in a different addressing mode. So multiply and divide: we don't need to do the things, all we need to do is emit the instruction. Multiply HL by source, divide HL by source. We need an ALU... no, not an ALU operation. Q register into one; ADD, destination register, source. So multiplying HL by the thing at FB41 looks plausible. We clearly haven't desynced. Base page, A4. A0, oh, shifts: 8-bit shifts by A. We saw these in another page. Yeah, it's the same as these. Did we make a table for that? Yes, the roll ops, which is S for shift. Good. So these should all be... A4: SLA A. A0: 0, 1, 2, 3, 4, SLA A, good. Where are we now? D6. The entire D block is junk, that's actually kind of surprising. Does this look like code? No, it doesn't, this is garbage.
It's a data table. Bank three, byte A5: a shift. So the value is in parameter zero, so load the shift into parameter one. Get the shift. Yeah, SRA. F9 4E, yes, that looks like code again. Base page, 5F is garbage; in fact all these holes are garbage in this row, so we can just do that. Bank four. Okay, I wondered when this was going to happen. This entire row is garbage. In fact, bank four has very few real instructions in it. So we just happened to hit a prefix byte for this bank, and then it's tried to decode the prefix according to whatever rules there are, and then the actual opcode turned out to be, well, garbage. So once I have added all the real opcodes, what I'm going to do is go through and take out all those question marks and make them empty strings, and then change the rules in the decoder so that if it sees an empty string it just prints a question mark and stops, because I don't want to fill in question marks in all of these. So for now, I'm just going to stick the appropriate question mark there, just so that we can get further on in the file. There we go, this is the one, because that allows us to proceed. This is all data table. So I bet this is going to be the same... or not, actually. 2D: 0, 1, 2. Yeah, these are all garbage. Lots of progress. Byte 2, bank four. I don't want to put question marks everywhere just yet, or even at all, because I still want it to error out at unimplemented templates. 4057, byte 1, bank four. Although, I know that this row is junk, so let's just do that. 85, bank four. Yep, nothing there, essentially. Hang on a second. So we've got this block at 20, we've got this block here from 30. Okay, we haven't done 38, these instructions here.
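The plan for holes in the opcode tables, where an empty template means "known to be invalid, print a question mark and stop" while a template that simply hasn't been entered yet still errors out, could look something like this. It is a hedged Python sketch: the table contents, exception name and function name are all invented for illustration, apart from the 1E, 5E and 5F entries, which follow what was found in the base page.

```python
class UnimplementedTemplate(LookupError):
    """Raised for opcodes that have no table entry yet, so missing work stays visible."""

# Hypothetical slice of a base-page table: "" marks a known hole in the chart.
BASE_PAGE = {
    0x1E: "ret",
    0x5E: "pop af",
    0x5F: "",
}

def render(table, opcode):
    """Return (text, keep_going) for an opcode looked up in a template table."""
    if opcode not in table:
        raise UnimplementedTemplate(f"no template for {opcode:02x}")
    template = table[opcode]
    if template == "":
        return "?", False      # invalid opcode: print '?' and stop decoding here
    return template, True

print(render(BASE_PAGE, 0x1E), render(BASE_PAGE, 0x5F))
```

Keeping the two cases separate preserves the useful behaviour of stopping at the first opcode that still needs data entry.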
Ah, I didn't actually fill in the... okay, here it is. This looks like code again. 539, bank four. Okay, right, now we are here, so that's this one. The destination is a Q register, so that's going to be 1,0, except for B, which is garbage. And that gives us... I don't like that. 39: DE comma the indexed value. Wait a minute, is that a dereference? Yes, it is. There should be parentheses around that. In most places I am writing the parentheses as part of the template, but that's one of our special indexed operations. That's one of these: F7, it's this one. So that is bank four. So is this a typo, and there should be parentheses around that? Let's look at the detailed docs. We are F7 39. F7... no, that's not it. That's very strange; I would expect to see an F. No. See, these are the 8-bit forms and these are the 16-bit forms, and this is clearly a 16-bit form. I'm looking for instructions that have rr on the left, so that's one of these. But the F7 prefix byte is a bank four operation, so it should be the one that it calls dest. So the prefix byte is mostly encoding the destination, except for exceptions, and I think this is one. Yeah, see, here is F7 with HL+A as dereferenced on the left. I've got that template wrong. F7 39 is this, but the 38 block is mostly encoding the destination register that appears in the secondary opcode rather than the prefix byte. So that is why we have a Q register and LD 1. So what we've got here is that the prefix byte is encoding an indexed operation, but then we're using it with the wrong operation. So the 38 opcode, even when it appears after a prefix byte, doesn't do a dereference.
It just copies a register to another register, or a register to wherever. But we're using it with the... yeah, I am thoroughly confused by this. I wonder if this is even a valid opcode. You see, if it was trying to dereference HL+A and put the result in DE, then it would be using this form, with an F3 prefix byte, not an F7. And it would also be using a different... well, is this coming from a different bank? So it's a completely different instruction. With F7 I would expect the destination to be indirected through HL+A. Okay, let's just double check the big table. Right, F7: destination, dereference HL+A. So it is this prefix byte here: indexed thing on the left, switch to template bank four. And now we are here, and this is an instruction that should work on registers. Yeah, I do not know what's going on there. It might be an invalid opcode. It might be a completely valid way of saying "add A to HL and put the result in DE". I think either I'm missing something completely, which is frankly plausible, or this data sheet is incomplete. So I'm just going to stick with this, and I think when I actually get things running on the real machine I will have to try it and see. Anyway, where are we? 4621, bank two, byte 89: 0, 1, 2, 3, 4, 5, 6, 7, 8, is garbage. 55D0. We are over a quarter of the way through. Bank three, byte FD is garbage. Bank three, byte 09 is garbage. Bank two, byte 08 is garbage. And in fact, you know what I said earlier? That's nonsense. I can always change things.
So let's just fill these out with question marks. We know that this row is garbage. We know that these are garbage. We know that this is garbage. These are not garbage, because these are TSET instructions, but we know that all of row 20 is garbage. 56D2, bank four, 0D: that entire row is garbage. So is the next one. So is this 28. We know that that's wrong, because that's actually referring to this hole here; I was misled by the way that there are actually nine instructions here and seven instructions here. Here, 30 is garbage down to there. 38, that one is garbage. 40, these two are garbage. All of 48 is garbage. 0, 1, 2, 3, 4... yeah, all of 50 and 58 are garbage. 60 is all garbage. All of these are garbage. Garbage everywhere. In fact, I'll do this, just to make my file a bit shorter, so that scrolling up and down doesn't take as long. In fact, it's all garbage from here on out: 9, A, B, C... nope, C's not garbage. Okay, all right, that gets us to 58F0. Bank three, byte 2. What do you reckon? That's garbage. This is bank two. Bank three, byte 2, it's still garbage. This is all garbage. Okay, we are continuing. I do wonder just how much code a typewriter has. Anyway, there are going to be two major bits of functionality, one of which is to compute the bitmaps that need to be sent to the printer. The total 256K ROM is, I suspect, nearly all font data. The other part of the functionality is the user interface. There is in fact a text editor, which will run on the little 16-character display, which will require work, but it's a character cell display, so it won't be that much work. Right, I don't know what that is. That's a DEC of some description, but it's being clipped. So we're looking for DEC, 0F. Here we go: DECX. Okay, INCX and DECX are strange. X is a flag which is set under some circumstances, but I'm not sure what. I think it is a flag used for, possibly, sign extension. I haven't quite pinned down where it's computed; admittedly, I haven't looked very hard
I haven't looked very hard, but what these two instructions do is: if X is set, then increment or decrement the value at that address; otherwise do nothing. So it's clearly an adjustment after some kind of operation. Maybe it is an instruction that allows you to do multi-byte arithmetic cheaply. Like, if you're adding an 8-bit value to a 16-bit value, then you add the bottom byte, but then you have to add zero to the top byte and include carry. So maybe you can use this instead to just add either zero or one to the top byte, depending on whether you need to carry or not. But then why would it be called X? I will have to look into that later. That is clearly garbage. "Bad template char: quote". Hmm. Okay, so quote is obviously a printable character. What this does is swap AF with the shadow copy of AF. This machine does have shadow registers, just like the Z80. This is the only place where a single quote is used in the entire instruction set. Bank 3, byte 98. 98... oh, it's a real thing. So 87, 8F, 97, 9F. I've already got one here. I've in fact done 80 and 8... wait a minute. Sorry, 98. That's this one. That's garbage. But let's fill all of these in; this is garbage. 98, bank 2, is garbage. In fact all of those two rows can be eliminated. All right, that's got us to 5BC9. Bank 4, FF. Hey, what do you reckon?
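The guess above about the conditional increment (add the low byte, then bump the high byte only if a flag says so) can be sketched in a few lines. This is only an illustration of that guess: modelling the X flag as a plain carry-out of the low-byte addition is an assumption, and `add8_to_16` is a made-up name, not anything from the data sheet.

```python
def add8_to_16(lo, hi, value):
    """Add an 8-bit value to a 16-bit number stored as two separate bytes."""
    total = lo + value
    new_lo = total & 0xFF
    # Assumption: the X flag behaves like "the low byte overflowed".
    x_flag = total > 0xFF
    # INCX-style step: increment the high byte only when the flag is set,
    # otherwise leave it alone.
    new_hi = (hi + 1) & 0xFF if x_flag else hi
    return new_lo, new_hi
```

For example, adding 0x20 to the pair (0xF0, 0x12) overflows the low byte, so the high byte is bumped; adding it to (0x10, 0x12) leaves the high byte untouched.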
Don't think anybody was expecting that one to be garbage. I mean, this is clearly not code. We may have actually run out of code, and it's data tables from here on out. Bank 2, byte 46: 0, 1, 2, 3, 4... that whole row's garbage. And so is 50. Bank 3, C4: garbage, and again everything from C0 down. Lots of progress. 6BF... 6BF4. I search for the last address before quitting less, so that I can use the history to find it again the next time around. Bank 3, 37. 37... the whole 30 range is garbage. In fact, so is 40. Not a lot there. Okay, we are now over halfway through, and very suddenly, starting at the 16K mark, we have code. Now, this machine has 8K of RAM. Is that code? No, that's not code. This is garbage. That makes no sense as instructions. So this machine has 8K of RAM and I don't know where it's mapped. I could make a guess based on things like where it sets the stack pointer on startup and where its variables are, but I know the CPU has some internal RAM as well, so it might be using that for those. But clearly something has started here. Anyway, 80B6: bank 3, byte 26. Bank 3, 26: it's all garbage. Okay, scrolling down. We're three quarters of the way through, and then it stops. But it still doesn't look like code. Bank 3, byte 10: that is an instruction, the first one for a while. RLD. Roll double? Let me have a look at the docs. Yeah, it's a 16-bit roll. Nice. Wait. Oh, this is rolling by four bits. This goes to this, this goes to this, this goes here, and this is unmodified. Is that a 12-bit roll from A to memory? I think it is. I don't know what to make of that. That's very strange. It must be useful for something. C303, CD48: this actually does look like code. It does look like code again. Good. Byte 18, bank 2. 18: it's our old friend TSET again. Wait. This is bank 3.
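The four-bit roll described above matches how RLD works on the Z80, where it rotates the low nibble of A and the two nibbles of a memory byte as one 12-bit group. A small sketch of that data movement, assuming the TLCS-90 version behaves the same way (the function name and the tuple return are just for illustration):

```python
def rld(a, mem):
    """Rotate the 12-bit group (low nibble of A, memory byte) left by four bits.

    Returns the new (A, memory) pair. The high nibble of A is left unmodified.
    """
    # Memory's low nibble moves up; A's low nibble fills the gap.
    new_mem = ((mem << 4) & 0xF0) | (a & 0x0F)
    # Memory's old high nibble lands in A's low nibble.
    new_a = (a & 0xF0) | (mem >> 4)
    return new_a, new_mem
```

On the Z80 this is typically used for shifting packed BCD digits, one digit per nibble, which may be the "useful for something" here too.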
That means that these two instructions I just put in shouldn't be there. They should be in bank 2, and they should be question marks. No, no, no. All right: 18, bank 2. I need to look at bank 2. I thought I was looking at the... never mind. 18, right. These are indeed our old friend TSET. So this is an 8-bit value. Just do that. It's an 8-bit register. The bit number is in the bottom bits of the secondary opcode. Okay: CD48, F9 18, TSET 0, C. Yep, there we are. Now, DD62. 5... byte E3, bank 2. This is bank 2. The whole E block is garbage. Both E and F are garbage. Not 60, that's the wrong address. Oh, what's happened? Hey, we reached the end of the 64K address space in the ROM, and now we start disassembling the rest of the ROM, which is going to be complete garbage, because on this machine you can't put code above 64K. So this is now all data tables. And, entertainingly, I also found a letter to Dr. Livingston in it, in multiple languages (I may have mentioned that before), which is obviously some test data. Good, we have now reached the end of our 64K address space, which means that there are known to be no more instructions to test on. So let's start from the top and look for instructions I haven't implemented. This all looks full. D0 is garbage. Okay, I have completed bank 1. Bank 2: 78. 0, 1, 2, 3, 4, 6, 7... 78 is here. These are garbage. C0, F, E, D, C. This is garbage. Okay, I've completed bank 2. Bank 3: this one is this hole here, and is garbage. 50, these two holes are garbage. 58 is all garbage. 68 is likewise garbage. 78, these are mostly garbage. Right, no empty instructions; I've completed bank 3. Bank 4 is fine. Right, I've completed bank 4.
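The TSET decoding worked out earlier in this stretch (register chosen by the prefix byte, bit number in the low bits of the second opcode byte) can be sketched as below. The register order in `REGS8` and the 0xF8 prefix base are assumptions chosen to agree with the one example read out above, F9 18; the function name is hypothetical.

```python
# Assumed 8-bit register order for prefix bytes 0xF8..0xFE.
REGS8 = ["B", "C", "D", "E", "H", "L", "A"]

def decode_tset(prefix, opcode):
    """Decode a two-byte register TSET, or return None if it is not one."""
    reg_index = prefix - 0xF8
    if not 0 <= reg_index < len(REGS8):
        return None
    if not 0x18 <= opcode <= 0x1F:
        return None
    bit = opcode & 0x07          # bit number lives in the bottom three bits
    return f"TSET {bit}, {REGS8[reg_index]}"
```

With these assumptions `decode_tset(0xF9, 0x18)` gives the same text as the line seen at CD48.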
This means all the instructions are interpreted, and there should be no more unimplemented templates anywhere. So this should go all the way to the end of the file if I press Shift-G. It takes a while, there's 256K of it, and there we are at the end of the file. And it has successfully disassembled everything. Good. Wow, okay, that was quite a lot of data entry. We are not completely finished, because we wish to emit the disassembly in a slightly cleaner fashion. So we're going to change our output buffer. Instead of just being the instruction, it's also going to be the complete buffer we're going to write to the output. So that needs to be longer. We're going to make it, you know, 80 bytes. There's a limit as to how long instructions can be, and this is enough to span one line of normal code. Our print routines here write to the output buffer at output_len. So what we're going to do is write various things to the buffer and then emit the buffer to the console or to the file in one go. However, we can't emit the hex dump until after we've read the instruction, and reading the instruction will write to the buffer. So we actually want to tell it (I didn't want to do that) where we want to start writing. So we're going to set up some constants here that describe the buffer. The address starts at position zero. Then we have three bytes.
This is going to be four bytes long. Then we're going to have the hex dump, which will start at the position of the address plus four characters, plus three characters for the separator. After the hex dump, we're going to have the ASCII. We can have up to six bytes per instruction, and the hex for each of these is going to be three characters long; that includes the separator at the end, but not the colon and the space. The instruction will appear after the ASCII, which will be six characters long, plus the separator. And I think that's it. So we just start writing the instruction at this point. Now, before writing the instruction, we need to clear the buffer to all spaces. So that's going to be: zero the output buffer. Wait a minute, I don't want to set it to zeros, I want to set it to spaces. So we have now read the instruction and disassembled it into the appropriate place. This means we now know how long the buffer is going to be, so zero-terminate it. Now we are going to write the address. Now we are going to write the hex dump. Now we are going to write the ASCII. However, we also want to put the separator in first, because we can't put that after the hex dump, because the hex dump is going to be variable width. And now we want to write the separator for the instruction, and then print the whole thing to the console. And here we have a nicely formatted disassembly. Okay, and we also need to rename output_len to output_pos, because it is no longer even slightly the length of the output buffer. Yep, this gives us a nicely columnated, tabulated disassembly. Okay, and instead of just printing it to the console, we want to say: if the output file name is not set, then print it. Otherwise (what did I call that?) write it to the output file, followed by a newline. Okay. So here it is printing directly to the console. If I set the output to "disassembly"... oh, it did something. Do we have a file? We have a file.
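The fixed-column line buffer described above can be sketched as follows. It is a rough Python rendering, not the code written in the session: the exact separator strings, the column constants and the `format_line` name are assumptions, and it only handles 16-bit addresses. The order of operations follows the description: clear to spaces, write the instruction at its fixed column first, then fill in the address, hex dump and ASCII.

```python
SEP = " : "                                   # assumed separator text
MAX_BYTES = 6                                 # longest instruction
ADDR_POS = 0
HEX_POS = ADDR_POS + 4 + len(SEP)             # after four hex digits and separator
ASCII_POS = HEX_POS + MAX_BYTES * 3 + 2       # three chars per byte, then ": "
INSN_POS = ASCII_POS + MAX_BYTES + len(SEP)   # after six ASCII chars and separator
WIDTH = 80

def format_line(address, raw, text):
    """Lay out one disassembly line in fixed columns."""
    buf = [" "] * WIDTH                       # clear to spaces, not zeros

    def put(pos, s):
        buf[pos:pos + len(s)] = s             # overwrite characters in place

    put(INSN_POS, text)                       # the instruction is decoded first
    put(ADDR_POS, f"{address:04X}")
    put(ADDR_POS + 4, SEP)
    put(HEX_POS, "".join(f"{b:02X} " for b in raw))
    put(ASCII_POS - 2, ": ")                  # fixed position: hex dump is variable width
    put(ASCII_POS, "".join(chr(b) if 32 <= b < 127 else "." for b in raw))
    put(ASCII_POS + MAX_BYTES, SEP)
    return "".join(buf).rstrip()
```

Writing every field at a constant offset is what keeps the columns aligned regardless of how many bytes the instruction took.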
It's got our stuff in it. Excellent. Quite a lot of stuff. Okay, and I think there is one more thing left to do, because it's all looking pretty finished, which is: let's just check that we can set the origin. Yep, that looks like it set the origin. All the addresses are different. Right. Now, it should be finished, except I actually want to check this, which I forgot to do earlier. We had to add this plus-two offset to the 8-bit displacement. Do we also need it for the 16-bit displacement? Now, the only way to find this out is to try and find a use of the instruction and see whether it makes sense. I suspect that if this has plus two then this needs to have plus two as well. But I actually want to... well, that's garbage. That's garbage. Yeah, these are data tables. This might not be garbage, but I suspect it is. The code looks a little bit too weird to make sense. That's clearly garbage. This is a counting table. Yep, this is also garbage. Right, there are no actual uses of JRL in the actual code. Where else can we find a 16-bit displacement? We have CALR. Nope. Go all the way up to... potentially? I don't think so, because this instruction makes no sense. They're just not using that addressing mode. Does B712 actually point at anything? It doesn't even point near an instruction. Okay. I wish I could search this, but it's just a bitmap. So there were some jump instructions. They were here, but these don't do relative displacement. So I think that the only two instructions that use that addressing mode are JRL and CALR, which should be... there's another typo here. You know what, I am going to assume that because there's a plus-two offset for this, then there must also be a plus-two offset for this, and I'm going to add this to here. Right, now.
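The displacement arithmetic being discussed can be sketched like this: sign-extend the raw displacement, then add it and a fixed offset to the instruction's address. The offset is left as a parameter precisely because the plus-two for the 16-bit form is the untested assumption made above, not something confirmed against real code. `relative_target` is a hypothetical helper name.

```python
def relative_target(insn_addr, raw_disp, disp_bits, offset=2):
    """Resolve a PC-relative operand to an absolute 16-bit address.

    raw_disp is the unsigned value read from the instruction stream;
    disp_bits is 8 for the short form or 16 for the long form.
    """
    sign_bit = 1 << (disp_bits - 1)
    disp = (raw_disp ^ sign_bit) - sign_bit   # sign-extend
    return (insn_addr + offset + disp) & 0xFFFF
```

With the default offset, an 8-bit displacement of 0xFE at address 0x1000 resolves back to 0x1000, i.e. a branch to itself. If a real JRL or CALR ever turns up, changing `offset` for the 16-bit case is the only thing that would need revisiting.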
Let's see how big this thing is. Probably quite big, because of all the tables: 4K of data tables. The CP/M version is 12K. That's quite big for a CP/M executable, and most of that is repeated strings. So let's take a look at... yeah, bank 4. Very nearly every row is repeated sequences of eight templates. Except for 20, 30, 38 and 40, everything else is the same. So let's actually just try... no, and that wants to be F8. So, does it match this row? Actually, we can be smarter than that. We can filter out most of the exceptions right here: 3, 7. We want LD... 0 from 1, 8, 9, A, B, 43 and 47. I think there are no more exceptions. There are no more exceptions. Okay, so we check to see if it's an exception. If it's not, we follow one of the rules. So this will be more code, because the code to do this is more complicated, but much smaller data tables, and I think this will be an overall improvement. And repeat that for this, and this, and everything else is garbage. So we reduce that entire 256-entry table to this function. So this has gone from 12142 bytes to... wow, we chopped 2K off. Does it still work? Probably. Okay, let me hold down the U button for a while and undo everything I did, save, and hold down Ctrl-R and put it all back again. Right, we now have a snapshot of the state of the disassembler before we made that change, so I should be able to just say: compare before, after. They're different. That's because these should be calls. All right, they're the same. So our two versions are equivalent. And we can do precisely the same for bank 3, and bank 2 as well. So yeah, let's just do that. I do wonder how long this video is going to be.
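The table-to-function change described above has a simple shape: look the opcode up in a small exception list first, and otherwise derive the template from the row it sits in. The sketch below shows only that shape. Every template string, exception and register name in it is invented for illustration and does not reproduce the real TLCS-90 bank tables.

```python
GARBAGE = "?"

# Illustrative only: entries that break the pattern of their row.
EXCEPTIONS = {0x27: GARBAGE}

# Illustrative only: one template per row of eight opcodes (row = opcode >> 3).
ROW_RULES = {
    0x20 >> 3: "LD {r}, {src}",
    0x28 >> 3: "LD {src}, {r}",
}

# Illustrative register names indexed by the low three opcode bits.
REGS = ["B", "C", "D", "E", "H", "L", "A", "?"]

def template_for(opcode):
    """Return the template for one opcode: exceptions first, then row rules."""
    if opcode in EXCEPTIONS:
        return EXCEPTIONS[opcode]
    rule = ROW_RULES.get(opcode >> 3)
    if rule is None:
        return GARBAGE                      # everything else is garbage
    return rule.replace("{r}", REGS[opcode & 7])

def build_table():
    """Expand the rules back into a full 256-entry table, e.g. for comparison."""
    return [template_for(op) for op in range(256)]
```

Being able to expand the rules back into a flat table is handy for checking that the function still agrees with the table it replaced.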
Okay, this template is zero. Three: the template is the same, but DIV... 17 is garbage. Our exceptions are at 53 and 57. Then we've got these. 7... return. There's still some more to do, but yes, this is so much better. It's 50. So it is. And we've done those two exceptions. 60, 70. Okay, these ones are more exceptions. So that's 87, 8F, and the same from line 0. So, disassemblers are supposed to be small and simple, and I originally thought I was going to knock this out in a couple of hours. I'm very interested to know how long this video ends up being once I splice all the bits together. Okay, capture that to "after". Diff. Right. Bank 2. There's a little bit more work in bank 2. Oh, yeah, how big is our file now? Yeah, we chopped off another good kilobyte and a bit. Okay, we're going to do the same thing we did for here. Are we? Yeah, we are. Consistency. Unfortunately Cowgol doesn't do range "when" statements yet. They'd be really nice to have, but they're actually harder than you might think they would be. I have a blog write-up on just how tricky case can be when you care about size. There are lots of really obvious optimizations to make that are totally not worth it. Okay, so these are all exceptions. And then zero. I think it's all plain sailing from here on in. 80. In fact, what I've essentially done is reverse engineered the instruction decode logic, which was what I originally tried doing and failed. But I seem to have done this through a rather strange backdoor method. But it seems to be working. So... what's wrong there? Oh, okay. Capture that to "after". Diff. Lots of breakages. Okay, what did I do wrong? I didn't implement this. What was it? These ones. Yeah, one at 10. And it does need to be one of these. Right, and it's done. We could probably rule-ify at least some of the base page, but I'm not going to, because frankly it's much clearer not to. How big is our executable? 8K.
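The before/after check used throughout this refactoring (capture the disassembly from the old build, capture it again from the new one, compare) amounts to finding the first line where the two outputs disagree. A minimal sketch, assuming both disassemblies have already been captured as strings; `first_difference` is a made-up helper, and in the session itself this was done with a diff tool.

```python
from itertools import zip_longest

def first_difference(before, after):
    """Return (line number, old line, new line) for the first mismatch, or None."""
    pairs = zip_longest(before.splitlines(), after.splitlines())
    for line_no, (old, new) in enumerate(pairs, start=1):
        if old != new:                      # also catches one output being shorter
            return line_no, old, new
    return None
```

A result of `None` means the rule-based decoder produced exactly the same listing as the table-driven one.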
Yeah, we managed to chop quite a lot off that. Let's look at the 386 version. Yeah, that's reasonable. And if you look at the code, there's still quite a lot of duplication, but that is a lot better. Awesome. So we have a fully functional disassembler for the TLCS-90. The next step, I suppose, is to write an assembler, and won't that be fun? Anyway, it's done now. I can smell my dinner, which has been there for quite a while. I hope you enjoyed this video. Please let me know what you think in the comments.