So, it's time for another live coding session. It has been a while since my last video, and that's because I've been working heavily on another project, which is this: Cowgol, a compiler for a language I made up myself, intended for, and to run on, small machines. So it will compile for the 6502 and the Z80 and so on from a PC, and you can also run the compiler on those systems themselves. It's intended to be as small and simple as possible while still being a modern, type-safe language with similar expressivity to ANSI C, although I've picked Ada-like syntax just because I happen to like it and it's my compiler. So I have this working. I've just released a demo. Where is the demo? There's a link to a demo somewhere on this page where you can see it in action on an emulated BBC Micro. I know that the compiler and the language are good enough to write major programs in because I've written the compiler in it. It is completely self-hosted and bootstrapped. But I want to actually try this in direct competition with the Amsterdam Compiler Kit, which is the compiler I've used in the past for cpmish, which, if I can type it correctly, is my open-source CP/M distribution. Now, a while back, I did a video where, in a marathon nine-hour session, I wrote an 8080 assembler in C for cpmish. What I want to do today is to rewrite that assembler in Cowgol so that we can see how well it works compared to C. Now, the assembler was a single C file, which is this. It was one and a half thousand lines of code. It's a very simple assembler; 8080 machine code is easy to assemble. The Amsterdam Compiler Kit actually generates pretty dense 8080 code, so that came out to about 11K of machine code. It's also pretty terrible machine code. It's dense because it uses lots of helper routines, and it runs chronically slowly. So I'm hoping that rewriting the assembler in Cowgol will produce a much faster and hopefully smaller assembler.
Plus, it'll be a good test of the compiler and a demonstration of the language and how it differs from C. So what I've got is an epically huge command line up here, which actually compiles the source code, which is in this window. The compiler is divided into several chunks. There's the front end, which is this bit. This takes the source language and does all the type checking and the raw code generation into an intermediate, bytecode-like tree format. That gets dumped out to disk as a .cob file. Then there's the back end, which takes the .cob file and actually does the machine code generation to produce a .coo file, which is marked-up source code, not actually a real object file. Then there's the linker. The linker is the really clever bit. I'll explain why later. But what this does is link together the multiple .coo files and emit an assembler file, which we then assemble using a third-party assembler into an actual executable binary. This is all for CP/M, so this is an 8080 binary. There is actually a Z80 version as well, but the ACK binary is 8080 machine code, so it would be cheating to use the Z80 version. So we can run this, and it takes a negligible time to run. I've tried timing it. It's below 10 milliseconds for an empty program. And it produces a huge binary: 20 bytes. This is because there's nothing in this source file. We can look at the generated machine code, which is this. So we've got the general header up the top, which actually sets things up and calls the main subroutine. Then we have the main subroutine, which does a little bit of initialization and then does nothing. The command line I entered uses the entr program, which watches a file and runs the command when that file changes. So every time I save this, it will automatically recompile. So we can do hello world and save. And over here, you'll notice the file size just bumped up to 98 bytes. And we can run this using the CP/M emulator. And there we go. Let me look at the machine code again.
Reload the file. The linker does dead code elimination, so only the standard library routines that are used actually get included. So here is our main subroutine. It's got the same setup as before. There's a single call to print here. Print is this routine, which is written in pure Cowgol, which simply calls print char for every byte in the string, which is this, which just calls the CP/M entry points, and prints a string to the screen. All right. Now, this video ought to be shorter than the one where I actually wrote this assembler because the algorithm is sorted. I don't actually need to change this. So a lot of this will just be a simple mechanical transcription. Excuse me, tea. However, I am going to rework things a bit to make it a bit more Cowgol friendly. Apart from anything else, the assembler has some bugs in it, which I'll attempt to fix if I can remember what they are. One of the things I did with the assembler was to try and honor the original Digital Research assembler syntax, which it does reasonably well, and also the command line user interface, which is terrible. The way the interface works is that normally you would type just asm test, and that would find a file called test.asm and assemble it. If you want to configure it, what you do is add special characters as a file extension on the input source file. What this means is that it doesn't look for a file called test.aa. It always looks for a file called test.asm. It's just that the first character means it expects to find the source file on drive A. The second character, I think, is where it puts the listing file. And the third character is the drive for the output. And that's just grim. So let's go for a much simpler and more traditional command line user interface. And that is where we're going to start. So let's have the original source code here. Now, you'll notice that there is no main function. In fact, the entire file just contains ordinary statements.
So we can just put a print statement in. But in fact, it makes things more efficient if we define a subroutine and then call it. I'll explain why later. So the first thing we want is to parse our arguments, and to do that we use the argv library. So if we save that, we get four bytes of extra binary. That's three bytes for the call to parse arguments and one for the ret at the end of this subroutine. The first thing we do is tell it to start parsing arguments. This fetches the command line from the system, which is CP/M in this case, and does some basic parsing. We now want to start reading arguments. Unlike in C, you don't get an array of arguments, because that's actually fairly complicated to do: you've got to put the array somewhere. So the way the interface works in Cowgol is that you call argv next, and it just returns a pointer to the next argument. If there are no arguments left, then stop. This "0 as string" thing is because the language is supposed to have a null token, but I haven't implemented it yet. So you have to take a literal zero and cast it to the type you want. Oh, yes. I haven't defined string. The square brackets are Cowgol's pointer syntax. A string is just a pointer to byte. And bytes in this case are unsigned. OK. So the file size has actually bumped up to 182 bytes. That's because it's just pulled in argv init and argv next. If you look at the machine code, here is argv init, and here is argv next. That's actually parsing the string byte by byte. 8080 machine code is not very dense. So the assembler is going to have three files it concerns itself with. There's the input file name. That is the text file that is being read, which starts out unset. There's the output file name, which is where the output binary is going to be written, and we're going to have a default for that. And there's the listing file name, which is where the listing output is going to be sent.
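The pull-style interface described above, where each call hands back the next argument and nothing ever builds an array, could be sketched in C roughly as follows. This is not the Cowgol argv library: the names `arg_iter`, `arg_init` and `arg_next` are made up for illustration, and the sketch assumes the command line sits in a writable buffer with one spare byte after its last character.

```c
#include <stddef.h>

/* Hypothetical iterator over a command line held in a writable buffer. */
struct arg_iter {
    char *pos;   /* next unread character */
    char *end;   /* one past the last character */
};

/* cmdline must have room for a terminator at cmdline[len]. */
void arg_init(struct arg_iter *it, char *cmdline, size_t len)
{
    cmdline[len] = '\0';
    it->pos = cmdline;
    it->end = cmdline + len;
}

/* Returns the next space-separated argument, terminated in place,
 * or NULL when nothing is left. An empty command line yields NULL
 * on the very first call. */
char *arg_next(struct arg_iter *it)
{
    while (it->pos < it->end && *it->pos == ' ')
        it->pos++;                    /* skip separators */
    if (it->pos == it->end)
        return NULL;

    char *start = it->pos;
    while (it->pos < it->end && *it->pos != ' ')
        it->pos++;                    /* scan to the end of the word */
    if (it->pos < it->end)
        *it->pos++ = '\0';            /* terminate, step past the space */
    return start;
}
```

The caller simply loops on `arg_next` until it returns `NULL`, so the only storage needed is the command line buffer itself plus two pointers.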
So if the first character of the argument is a hyphen, then this is an option. Otherwise, it's a non-option argument, and the only one we're going to allow for that is the input file name. So if the input file name has not been set, that's a syntax error. Otherwise, set the file name. Let me correct myself. If the input file name has been set, that's a syntax error, because you can only set it once. If, however, we've been passed an option, then we look at the option. If it's an O, this means that we want to set the output file name. So the output file name will become the next argument, or null. If it's an L, then this is setting the listing file name. If it's anything else, it's a syntax error. And we haven't defined the syntax error subroutine, so let's put that in. And that goes up here. Now, Cowgol supports nested subroutines. We could put this at the top level, as it's not actually using any of the variables in parse arguments, but it's more efficient to do it here, because the compiler knows that there is no way that syntax error can be called once parse arguments goes out of scope. So when the end sub is reached, this allows the compiler to discard any resources it had stored for syntax error, such as symbols, names, types, et cetera, which makes the compiler more efficient. And we're just going to print a simple syntax error like so and terminate. And that has compiled, and we now have a whole 400 bytes. That seems like a lot, but most of this is going to be standard library. So we've got exit with error, print char, print, argv init; argv next is quite big. And here we get our actual program. So here is syntax error. And here is parse arguments. And down the bottom, we need to do some verification. We need to make sure that the input file name is set. The output file name must be set. The listing file name is optional. OK. So that should work. So with no arguments, we get the syntax error.
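The option handling just described (a hyphen introduces an option, O and L each consume the following argument, and only one plain argument is allowed) might look like this in C. It is only a sketch of the logic: `struct options` and `parse_options` are hypothetical names, the arguments arrive as a NULL-terminated list rather than through the argv library, the default output name is a placeholder, and a syntax error is reported by return value instead of terminating the program.

```c
#include <stddef.h>

struct options {
    const char *input;    /* required, may only be given once */
    const char *output;   /* required, but has a default */
    const char *listing;  /* optional */
};

/* args is a NULL-terminated list of argument strings.
 * Returns 0 on success and -1 on a syntax error. */
int parse_options(const char *const *args, struct options *opt)
{
    opt->input = NULL;
    opt->output = "test.bin";         /* placeholder default */
    opt->listing = NULL;

    while (*args != NULL) {
        const char *arg = *args++;
        if (arg[0] == '-') {
            switch (arg[1]) {
            case 'o': opt->output = *args; break;
            case 'l': opt->listing = *args; break;
            default:  return -1;      /* unknown option */
            }
            if (*args == NULL)
                return -1;            /* option without a value */
            args++;                   /* consume the value */
        } else {
            if (opt->input != NULL)
                return -1;            /* input may only be set once */
            opt->input = arg;
        }
    }
    /* Final verification: input and output must both be set. */
    return (opt->input != NULL && opt->output != NULL) ? 0 : -1;
}
```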
With one argument, we get a syntax error. Why is that happening? That should hit this set input file name. Oh, we have set the output file name. That's interesting. So it has set... I am an idiot. Got these the wrong way around. That's worked. That has also worked. That should not have worked, because that has not set the input file name. So input file name should be null, and we should have hit the syntax error. It has indeed found an input file name. I think it has. On startup, CP/M puts the command line at address hex 80. The first byte is the number of characters in the command line. So 81 is the address of the first byte of the actual command line. So why has that not worked? Luckily, we have a debugger. It's a machine code debugger. This is the start of the program. We can take a look at the command line. There is nothing in it. No bytes. The E5 is junk inserted by the emulator to make sure you don't rely on the string being zero terminated, which it is not. This is exactly what we expect. It is always possible that there's a bug in the argv library, which actually seems plausible. The standard library has not had an awful lot of testing. I know it works well enough to run the actual compiler. This is cross-compiling from the x86 version of the compiler to 8080 machine code. But you can also compile the compiler into 8080 machine code and run that. I'm not doing it because it's slower, because you have to run it in emulation. But I have a real CP/M machine that I want to actually try this on later. OK, I'm going to ignore this for now. It's a bug in the argv library. It should not be returning an argument if the command line is empty. It's an edge case: if there are no arguments at all, it's obviously not checking for that correctly. Anyway, that should have worked. So we'll go with that. So the next thing we want to do is to actually open the files. There is another library, which contains a basic file abstraction. So we create those.
Files are referred to by FCB structures, file control blocks. These are quite large. On CP/M, they're about 140 bytes because they contain a buffer. The reason for that is so that we don't have to allocate the buffer from the heap. We do always need a buffer. So if you look over here, you can see, under workspace sizes, Cowgol's equivalent of data and bss. There are 18 bytes of global variables in use. If we save this, suddenly we now have 516 bytes of global variables, because we've got three of these buffers. We now want to open the files. We are opening the input file. Let's actually give ourselves some helper utilities. Can I take the address of scalar variables? Yeah, OK. That's because I've got input file name rather than input file. So FCB open in takes the address of the FCB block that represents the file we want to open, and the file name. OK. And we've gone up to a whole 895 bytes. So if you run this, you see, there is no file called foos, so that doesn't work. But we should be able to open our own assembly file. There you go. And the end goal of this is to try and assemble the assembler's assembly file with the assembler. I hope that made sense. And let's open the output file. And again, let's make a helper utility. Actually, do we want to do that? I think we're only going to want to use this in here. So let's move that subroutine inside parse arguments. So we want to open the input file and the output file. That's going to work. Yep. So that should have created, should have opened, and then immediately closed again, the output file, which defaults to test.bin. So there should be a file called test.bin, which there is, and which is empty. And we also want to optionally open the listing file. OK. So we should be parsing the command line arguments and opening all the files. OK. So where did we put our original assembler? It's here. Now, the original assembler is in C. This stuff up the top is the command line parsing.
It's complicated because it's dealing with raw CP/M FCBs. Cowgol abstracts over the CP/M FCBs in the file library, so I can compile this program for Linux 386 code or the 6502 or anything like that. We are going to put that banner in. Now, Cowgol has a malloc library, which turns the available workspace into a heap. We're not going to use it, or at least I'm going to try not to, because I want to manage memory manually instead. The way assemblers work is that they continually allocate stuff, but they never free it. So a heap is a waste of time. You may as well just have an incrementing pointer that starts at the bottom of memory and goes up. The original assembler worked the same way, so we're going to be following the same logic. It had the special variables CPM RAM top and RAM bottom, which are the addresses of the top and bottom of memory. Now, Cowgol has this as well. So we've got high mem minus low mem. High mem is the top of available memory, and low mem is the bottom. We want that in kilobytes. So is that going to work? No: expression was a uint16, yeah. This is because I'm trying to use print i32, which prints a 32-bit value, except that on CP/M addresses are 16 bits wide. So subtracting one pointer from another actually gives you an intptr, which on CP/M is a uint16. So we actually need to cast this to a uint32 before we can print it with print i32. We could print it with print i16. Then we wouldn't need a cast, but then this wouldn't compile on 32-bit systems, because their intptr is, of course, 32 bits wide. Okay, so we now have, getting quite big, 1600 bytes. You see, we now have a banner and we have 61K free. Most of this, if we take a look at the machine code again, is helpers. So here we have the 32-bit division routine. There's quite a lot of it, because dividing on the 8080 is a pain, and the remainder routine, which is used by the int-to-string code. We add 32-bit values, we subtract 32-bit values, left shift, some more 32-bit value stuff.
This is all standard library boilerplate. We're going to be using these probably in lots of places, so once we have them, it's not going to increase the size of the code much more. We compare 16-bit values. We compare 32-bit values. And here we have our program: print char, memset. Print, print a new line, which is not complicated. Here is the unsigned integer to ASCII routine. Yeah, this is all written in Cowgol. We could probably slash the size of this by rewriting it in native code. This actually turns a 32-bit value into a string, which is why it's got all these 32-bit operations in it. And here is print i32, which, as you can see, loads a 32-bit value, calls our UI-to-A routine, and calls print. Argv init, argv next. I can't remember what fill does. Oh yeah, this is part of the file library. This parses part of a file name into the weird FCB format used by CP/M. That's one of the things that's using memset. Open in, open out, all boilerplate. And this is where our program actually starts, with start, and we're a good three quarters of the way through the file. And here is our main routine. So we set up low mem and a few other things. Here is where we are printing the banner. These two lines will print the initial text string. Then we load high mem and low mem, do a 16-bit subtraction, shift left, sorry, shift right to get the number of kilobytes, cast to a 32-bit int, and print it. And finally we call parse arguments. Hang on, I can improve this. I can actually use a uint16 for this, because the print only has to print the number of kilobytes, not the number of bytes. And, you know, on the 32-bit runtime, you can see the compiler here is saying it starts out with one megabyte free and finishes with a bit less. A thousand is fine; it'll fit in a 16-bit int. So if we do this, we start out at 1618 bytes. And if we save this, it gets slightly bigger. Yeah, okay, let's go back to the original.
The reason why it's doing that is that the print i16 routine actually just calls print i32. So that doesn't help. Okay, anyway, we're opening our files. We're printing the banner. We can actually start parsing the file. Our original assembler, where are we? Here's all the emit code. Read token. So read token reads a thing from the input file, and it takes care of parsing numbers, strings, adding things to the symbol table, et cetera. And we are going to follow the same model. So read token returns a token. And the token we're going to define is a byte. Unlike C, Cowgol allows you to do native arithmetic on bytes. C always tries to promote values to machine words. This means that trying to run C on an 8-bit system is kind of frustrating, because 8-bit systems are typically very bad at doing 16-bit arithmetic. The 8080 is not as bad as some, because it does have some 16-bit operations. But you don't want to do it on the 6502. Excuse me, my voice is going already. This is not boding well. What tokens can we return? We have... how are we doing this? I seem to have decided that identifier tokens should be quite high. So clearly single ASCII bytes are tokens in their own right. Yeah, I wrote this ages ago. I can't remember how it works. We don't have enums in Cowgol yet, though I would like to add them at some point. So define some values for tokens, and line these up nicely. And let's add a test routine. Cowgol does basic type inference. So if you are declaring a variable with an initializer and the initializer is typed, then you don't have to specify the type explicitly on the variable. Some values are untyped, like constants, so you can't do that there. You have to tell it what type the variable is, because the compiler can't figure this out from the number. So if EOF, then break, end loop. Token is a partial type. It read one token, which is faked, and is the EOF. So how is this working? Oh yeah, I remember.
There's some really grim stuff to do with end of line handling, because the assembler syntax used by this archaic assembler, the one I'm actually copying, is awful. Yeah, because you use the exclamation mark to have multiple statements on a line. End of line is a Boolean. We don't have a Boolean type, so we just use a byte. Line number is a 16-bit value which starts out at zero. Zero. Let's not initialize that there. Let's initialize it here. And we do start with the end of line flag asserted, according to this code. So if EOL is not zero, then line number equals line number plus one, and EOL equals zero. So the first thing this does is to skip white space. Now I need to go look at the definition of read byte. Yeah, so some of the stuff here has got to do with CP/M buffering. CP/M doesn't allow you to read bytes from files. You can only read 128-byte chunks. Luckily, this is all handled for us in the file library, but it also does the listing code, and there's also an unget. So what we're going to do is... actually, where is unread byte called from? That's inside the parser, parser, parser. Okay, so we're going to define a routine that reads a byte from the file. Very simple, except not as simple, because we're going to allow one byte of pushback. So if you call getc, you'll get either the byte that you previously pushed or the next character from the file. So the first thing we do is skip white space. I'm wondering if there's a cleaner way to do this. Probably, but let's just do it the simple way. Backslash f and backslash v, I'd forgotten about those. I wonder if they're actually supported by the Cowgol compiler. Nope, they're not. Let's just ignore those for now. They'd be trivial to add at a later date. So the C code is using do while. I don't have a do while yet. I have a while loop, end loop, and a loop, end loop, and that's it. There is also no for. These are all things that would be very simple to add. I just haven't done them because they're not very important.
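As an illustration of the byte reader discussed above, here is a minimal C sketch that hands out single bytes from a source that can only be read in 128-byte records, with one byte of pushback on top. It is not the Cowgol file library: the `reader` structure and the `reader_init`, `get_byte` and `unget_byte` names are hypothetical, and an in-memory array stands in for the file so the example stays self-contained.

```c
#include <stddef.h>
#include <string.h>

#define RECORD_SIZE 128   /* the source is read one record at a time */
#define NO_BYTE (-1)      /* end of file, or "nothing pushed back" */

struct reader {
    const unsigned char *data;         /* stand-in for the file contents */
    size_t size, offset;               /* total size, next record offset */
    unsigned char record[RECORD_SIZE]; /* current record */
    size_t fill, pos;                  /* valid bytes, next byte to return */
    int pushed;                        /* pushed-back byte or NO_BYTE */
};

void reader_init(struct reader *r, const void *data, size_t size)
{
    r->data = data;
    r->size = size;
    r->offset = 0;
    r->fill = r->pos = 0;
    r->pushed = NO_BYTE;
}

/* Returns the pushed-back byte if there is one, otherwise the next
 * byte of the file, otherwise NO_BYTE at end of file. */
int get_byte(struct reader *r)
{
    if (r->pushed != NO_BYTE) {
        int c = r->pushed;
        r->pushed = NO_BYTE;
        return c;
    }
    if (r->pos == r->fill) {           /* record used up: fetch the next */
        size_t n = r->size - r->offset;
        if (n > RECORD_SIZE)
            n = RECORD_SIZE;
        if (n == 0)
            return NO_BYTE;
        memcpy(r->record, r->data + r->offset, n);
        r->offset += n;
        r->fill = n;
        r->pos = 0;
    }
    return r->record[r->pos++];
}

/* Only one byte can be pending; a second call overwrites the first. */
void unget_byte(struct reader *r, int c)
{
    r->pushed = c;
}
```

A tokenizer can then read one byte too far, notice that it does not belong to the current token, and hand it back with `unget_byte` so the next call to `get_byte` sees it again.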
So that will skip white space and exit the loop with the current character in C. If C is a comment character, then we loop until we reach an end of line or end of file. Now, the assembler syntax is kind of weird. The leading white space is significant. So let me just have a quick look at the main code. Actually, I tell a lie. It is not significant in this assembler. The first thing read... I can't remember. I honestly cannot remember. In some assemblers, if a word is left justified, it's considered a label, and if it's not left justified, then it's considered an instruction. But I happen to know that in this assembler, even if something is not left justified, it can still be considered a label, which means that if you make a typo in an instruction, what that will do is actually define a label, which is really confusing because it's syntactically valid. So what I am thinking about is the fact that it's checking for white space here and then it's skipping comments, but we're not checking for white space again after the comment. And I'm wondering if that is a bug or not. I think so. That will consume bytes until we get something that is a terminator: a new line, an end-of-file, or an exclamation mark. Okay, that's correct. Yeah, with most tokenizers, including Cowgol's, when you read a single line comment, it will consume the trailing line feed so that the next token read is the beginning of the next line. This one is different, in that it consumes the comment but not the terminator. So that is in fact correct. If it is an exclamation mark or a new line or an end-of-file byte, break. Okay, convert to uppercase. I do not actually have a ctype library. So let's just go and add a simple routine that does it the cheating way. No Unicode here. If c in is at least a lowercase a and c in is at most a lowercase z, then c in gets shifted down into the uppercase range. Notice the input and output parameter lists. You can actually have multiple input and output arguments. The output parameter currently can't be anonymous. I cannot say just return a uint8.
I have to give it a name. Again, it's a thing on my list of stuff to do, and they have to have different variable names. One day I would like to have it so that you can have the same name on both sides. This would avoid this extraneous assignment. Hang on a second. Was I turning this to uppercase or lowercase? Upper, good. So we have skipped the white space and comments, and we now have in our hands the first byte of the actual token. And the rest of the routine is a set of conditionals that do various things based on what it is. So the choices are: it's a digit, in which case this is starting a number. It's an uppercase character, in which case this is starting a symbol, which can be either an existing label, a new label, or an instruction. It's a single quote, in which case it is a string constant. Or it's a terminator character or a single byte token. So the first one is a digit. If it's a digit, then let's just stub this out. If it's an uppercase character. If it's a string constant. If it's a new line. If it's a separator, then we pretend it's a new line. If it's a zero... yeah, that's not right. If it's anything else, then it's a single character token, which includes a real EOF. So is that going to work? It's found a character. Now, what does our assembly file look like? Yup, it has successfully parsed the tab and has found the O of ORG. As it parsed the tab... that's whitespace, it should have ignored it. Right, so it's ignored the whitespace and it's found the O of ORG. Okay, so we're now up to two and a bit kilobytes. I do wonder, is this actually going to become bigger than the C version? If so, that'll be embarrassing. Cowgol code should be a lot denser than C, particularly ACK C. C is a stack frame oriented language where variables live on the stack. And stack frames are really expensive on a lot of 8-bit systems, especially the 8080. Cowgol is not a stack frame-based language.
It does static analysis in the linker to determine absolute locations for every variable in the program. This is this workspace stuff here. So instead of having to do arithmetic every time to fetch things from the stack, because the 8080 doesn't have an addressable stack, we can just do loads and stores directly from memory, which is way cheaper, especially for 16-bit values. But the ACK uses helper routines, including a lot of one-byte resets for accessing the stack, which is what makes it dense. The ACK's 8080 code is actually denser than SDCC's 8080 code, but it runs much, much more slowly. It'd be kind of embarrassing if it turns out to be bigger. Anyway. The first thing we want to do is to parse identifiers, and identifiers are looked up in the symbol table. So what we're going to do is fill up the token buffer with the string that we read from the file, and then once we're done, we look it up in the symbol table. So the first thing we want is a token buffer, which we're also going to want for numbers. So let's just create that here, which is an array of... we can't put that there, we have to put it here, because it needs to be accessible from outside the parser. Do we have a... yep. So token buffer is a 64-byte array, which we're going to put the token into, and this is also used for numbers and string constants. And token length is the fill of the buffer. Where is... here we go. So we're going to... token length is not initialized. That'll be because it's initialized elsewhere. It's left as a zero. No, it's initialized here. This is not a local. And check token buffer size is a helper, which we can put here. If token length equals size of... size of is a special keyword, which returns the static number of elements in the array immediately following. It only works on arrays and types of arrays. It means that we don't have to hard-code the size; we don't have to define a constant for it. So here is the loop that fills the buffer: the size, token buffer length.
C is... oh, I remember this. Wow. This weird little loop is because dollar signs in tokens are just ignored. I can do this, and that is the same as this, which is kind of weird, but we're going to duplicate that because it's actually used. To upper. If C is not dollar, then break out of the loop. Is ident is a routine that checks for valid identifier continuation characters. Identifiers can only begin with a letter, but they may contain numbers and underscores afterwards. So if it's not an identifier character, and we've already turned it into uppercase, so we can say: C between A and Z, or C between zero and nine, or underscore; otherwise break. And that will have left the last byte, the byte after the end of the identifier, in C. So we want to unput that. And for test purposes, let's add a zero to terminate the string, and print it. You'll notice here that in Cowgol, you cannot cast an array to a pointer. So we have to take the address of the first element of the array. And we have read an O, just an O. If C is a letter, or if C is a number, or C is an underscore, stop. We are correctly incrementing the length. So, I don't use not much. It's possible there's a bug. It's extremely possible there's a bug. I think there's a bug in not. Curses. This keeps happening to me. The language seems completely solid. And remember that the compiler itself is written in Cowgol. All the code you can see running up here is in Cowgol. Except every now and again, I will find a terrible, stupid bug where nothing should work, and yet the compiler works absolutely fine. And I think this is one of them. I think my not operator is not working correctly. And I know why as well. But I'm not going to worry about it for now. I mean, one of the reasons for writing this is to find these things. Right. So this is a really hacky way around not, but I can fix that later. So we have now correctly read a keyword from the source file.
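The identifier loop described above can be summarised in a short C sketch: bytes are upper-cased, dollar signs are silently dropped, letters, digits and underscores are accepted, and the first byte that does not belong is left unread. The function names are hypothetical, the 64-byte buffer size follows the token buffer mentioned earlier, and for simplicity the sketch reads from a string pointer rather than from a file with pushback.

```c
#include <stddef.h>

#define TOKEN_MAX 64   /* size of the token buffer, terminator included */

/* Valid identifier continuation character (after upper-casing). */
int is_ident_char(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_';
}

/* Copies an identifier from *src into token, upper-cased and with any
 * '$' characters removed. Leaves *src on the first character that is
 * not part of the identifier. Returns the length, or -1 if the token
 * does not fit. The caller checks that the first byte is a letter. */
int read_identifier(const char **src, char token[TOKEN_MAX])
{
    size_t len = 0;
    for (;;) {
        int c = (unsigned char)**src;
        if (c == '$') {               /* ignored inside identifiers */
            (*src)++;
            continue;
        }
        if (c >= 'a' && c <= 'z')
            c -= 'a' - 'A';           /* ASCII-only upper-casing */
        if (!is_ident_char(c))
            break;                    /* leave this byte for the caller */
        if (len == TOKEN_MAX - 1)
            return -1;                /* buffer full */
        token[len++] = (char)c;
        (*src)++;
    }
    token[len] = '\0';
    return (int)len;
}
```

With this, `ld$ax` and `LDAX` produce the same token, which matches the dollar-sign behaviour described in the text.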
So the next thing to do is to look it up in the symbol table. And for that, we are going to need a symbol table. Now, the way the C code worked is... this could be bad. It's a hash table. Here is the hash table, 32 buckets. And it's using the bottom five bits of the ASCII value as the hash. Each bucket points at a linked list of symbols. And the symbol table is initialized statically using this complicated structure here. Each of these initializers links into a single chain in the bucket. And I remember having to spend ages setting up the linked list manually. So LHLD points at LXI, which points at LDA, which points at LDAX, which points at L, which is the first item in the bucket. And then L is referenced... LHLD is referenced there. Oh yeah, that's the last item in the bucket. Yeah. And I hate that. So I'm not going to do it like that this time. The symbol table is still going to be a hash table, but we're not going to initialize it like that. But anyway, we want to define a symbol record. This will contain the 16-bit value and the callback that does the thing. The way Cowgol does function pointers is not the same as C. And this has the next symbol. And we wish to define the hash table itself, which is a simple array of 32 symbols. And that has pushed up our workspace again. Cowgol variables are not initialized by default, so we actually want to wipe this. We have to explicitly add this. So take the address of the first item in the symbol table and cast it to a pointer to uint8, because I don't have automatic casting to void-type pointers yet. Bytes of is a modifier that returns the number of bytes in the thing on the right. So that will initialize the symbol table to nothing. So this should give us a skeleton of our symbol table. So where is our read token? Right, search for this identifier in the symbol table.
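To make the bucket scheme above concrete, here is a small C sketch of a 32-bucket table indexed by the bottom five bits of a symbol's first character, together with the explicit wipe. The names `symbol_table`, `bucket_of` and `init_symbol_table` are hypothetical. Note that C zeroes static storage on its own, so the wipe is only there to mirror what the text says is needed when variables start out uninitialized.

```c
#include <string.h>

#define NUM_BUCKETS 32          /* 2^5, so the hash is the bottom five bits */

struct symbol;                  /* the record layout does not matter here */

/* Each bucket is the head of a linked list of symbols. */
struct symbol *symbol_table[NUM_BUCKETS];

/* Hash: the bottom five bits of the first character of the name.
 * For upper-case ASCII this maps 'A'..'Z' to buckets 1..26, so all
 * names starting with the same letter share one chain. */
unsigned bucket_of(const char *name)
{
    return (unsigned char)name[0] & (NUM_BUCKETS - 1);
}

/* Wipe every bucket. This relies on all-bits-zero being a null
 * pointer, which holds on mainstream platforms. */
void init_symbol_table(void)
{
    memset(symbol_table, 0, sizeof symbol_table);
}
```

For example, `LHLD`, `LXI`, `LDA` and `LDAX` all start with `L` (0x4C), so they all land in bucket 12, which is why they form a single chain in the statically initialized table described above.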
So in the original assembler, this is the only place where symbols get added to the symbol table. In our new assembler, it's going to be a little bit different. So we are going to create a find symbol routine, which looks up a symbol in the table. We take the bottom five bits of the first character of the string, which is our hash. We use this to find the pointer to the head of the bucket. Not the address itself, but the address of the pointer in memory. We now need to follow the chain down until we either find the symbol we want or we reach the end of the chain. So this gets us a pointer to the symbol. If it is null, we've reached the end of the chain, so stop. If it's not null, then this might be the symbol we're looking for. Symbol dot name, comma, name. Symbol is a pointer. Cowgol allows you to use dot notation to dereference pointers and to access structures. There's no arrow operator. So if these match, then we have found the symbol we're looking for, which means we can just stop and return it. Symbol is set to the right thing. If they don't match, then we wish to advance. So if we find the symbol, then we return here and symbol is pointing at the structure we're looking for. If we don't find the symbol, then we reach here and p symbol is pointing to the address of the null, which is the end of the hash table chain. Okay, now we want to make a slight adjustment. I was thinking that name would be a pointer to the string. But of course, we don't have a heap, so we've got nowhere to allocate the string. We could just put the string in memory immediately after the symbol. But if we're going to do that, then we can do this instead. This is in fact what the old assembler was doing as well. So we now have an unbounded array immediately after it. So we now have a variable-size structure where the last element is an array containing the string. An expression was a symbol used where a... interesting. The line number... we don't want an array of symbol structures.
We want an array of symbol pointers. And StrCmp is in strings.coh, right. Now of course, symbol name is an array, not a pointer. Therefore we can't pass it directly to StrCmp. So we need to do this, okay. So here we're going to need to allocate a structure for a new symbol. Remember that lomem is the bottom of unused memory. So we are actually going to increase lomem by the amount of memory needed for the symbol structure and the name. So that is going to be the length of the name plus the size of the symbol structure plus one for the string terminator. So the symbol appears at the bottom of memory and we advance lomem by the amount of memory we're allocating. Except we actually need to align that. Why is that not working? Expression was a uint8 used where a uint16... That might be an intptr. Ah, no wait, why is that not working? Line 81, uint8 used where a uint16. Lomem is a pointer to a byte, defined there. You can add intptrs to these. AlignUp. Have I... ah, AlignUp doesn't take a pointer. It takes an intptr. Okay, what we're doing here: this last line is not needed for CP/M, but it is needed for platforms that require data alignment. We're making sure that lomem is always correctly aligned for the architecture. In fact, AlignUp on CP/M is defined right here and does nothing whatsoever. But if I'm going to run that on, like, an ARM, then I would need that. So what we've done here is we've allocated a new symbol at the bottom of memory. So we are now going to initialize it. First thing is to copy the string. I just want to remind myself of the order of... yeah, yeah. I got about partway through rationalizing the string library, with the result that the parameters are not necessarily the right way around. So we copy the string. The value is zero. The callback is CB, which we're going to define in a moment. Name len is not relevant in this version. Next is the contents of psymbol.
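The allocation scheme described above (take the symbol from the bottom of free memory, then advance and align the pointer) is a bump allocator. A minimal C sketch follows, assuming a fixed `arena` standing in for free memory and pointer-sized alignment. `alloc_symbol`, `align_up` and `arena` are names invented for this illustration, not the Cowgol library routines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct symbol {
    uint16_t value;
    struct symbol *next;
    char name[];              /* variable-size tail holding the string */
};

#define ALIGNMENT sizeof(void *)   /* assumption: pointer alignment suffices */

static _Alignas(void *) uint8_t arena[4096]; /* stand-in for free memory */
uint8_t *lomem = arena;                      /* bottom of unused memory */

/* Round an address up to the next multiple of ALIGNMENT. */
uintptr_t align_up(uintptr_t address) {
    return (address + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1);
}

/* Carve a symbol plus its name out of the bottom of free memory. */
struct symbol *alloc_symbol(const char *name) {
    /* structure + name + one byte for the string terminator */
    size_t size = sizeof(struct symbol) + strlen(name) + 1;
    if (size > (size_t)(arena + sizeof arena - lomem)) {
        return NULL;                         /* out of memory */
    }
    struct symbol *symbol = (struct symbol *)lomem;
    lomem = (uint8_t *)align_up((uintptr_t)lomem + size);
    symbol->value = 0;
    symbol->next = NULL;
    strcpy(symbol->name, name);
    return symbol;
}
```

On a target with no alignment requirements, `align_up` could simply return its argument, which matches the remark that the CP/M version does nothing.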
So do I need anything else? I think that's all I need. Undefined label CB not found, right. Now, function pointers. In Cowgol, function pointers must always be associated with a type, and that's what this interface is doing. It's defining a type, symbol callback, and you can have any number of functions that correspond to this type, but you have to declare them. And the way you do that is we're saying: undefined label CB implements symbol callback is. So the reason for this is the static analysis to determine where variables are located in memory. The way it does this is it walks the call graph of the program and makes sure that each variable is assigned to a memory address that it knows will not be used when that particular subroutine is called. Variables can overlap if the linker can prove that the subroutines that own them won't be called at the same time. Now, function pointers make this really awkward, because function pointers can, in C, basically point anywhere. So we have to use the symbol callback type in Cowgol to let the linker know that when it sees a call to a symbol callback, it can only be one of the subroutines associated with this type. So that is this one. Undef label CB, undef, undef label CB. Right, that is not actually a real callback in my original assembler. It's actually being used as a marker value to indicate that this... oh no, no, there really is actually a callback. So simple error, unrecognized instruction. Okay, so we should now have the code to add a symbol to the symbol table. So we can invoke this down here. Yeah, I think there is actually very little to do here. Yes, I do actually need to assign the symbol to a variable so that the user of the parser can actually get at it. I think it's just like that. Yeah, that's worked. So 255 is the token for an identifier. It has read the identifier and added it to the symbol table and has then moved on to the next token, which is a number.
Now it's actually just added a new symbol, org, to the symbol table, and we actually want this to be an existing symbol, because it's an instruction. So we need to initialize the symbol table. This is what this slot here is. So I am not going to do that now because I'm going to move on to the number parsing. We'll wait until we can tokenize the entire file and then start dealing with that. I will however put this down here. Yeah, that hasn't stopped, it's just finished with an error. Right, numbers, let's read token. So this works in a very similar way. What we do is we read the number into the token buffer. We then look at the last character, which has got the base signifier. Dollar signs again can be omitted. Hang on, this is the same code as we're using here. I mean, it's exactly the same code. Oh, except for the condition. So we can actually simplify this. So in Cowgol, subroutines are cheap. Like, each one is four bytes: one byte for the return at the end of the sub and three bytes to actually call it. Because this takes no parameters and returns no values, no parameter marshalling is required. Nested subroutines can access variables defined in an outer scope. This is a corollary of the fact that all variables are statically assigned. So I don't have to deal with walking up through stack frames to find C. I can go directly to the value of C wherever we are. So it is very much worth using helpers like this. So for the number code, all we do is accumulate token byte while C is a hex digit. That avoids the same 'not' bug as before. That should read a hexadecimal value into the token buffer and exit with C set to the last byte read, which is not part of the number. So we want to push that back onto the input queue. Okay. We wish to fetch the base. The base is going to be the last byte, which is at token length minus one. Right. So it can either be a B, in which case the base is binary.
O, which means it's eight. Q, which also means it's eight. D, which means it's 10. H, which means it's 16. If it's anything else, and the last character is a digit, then we don't want to consume it, because it is actually part of the number. So we increase token length again. If it's another character, then stop with an error. Not found... yeah, no, okay. So we now know the base and we have a zero-terminated string containing the number. There, see: 100, which is exactly what we want, and the base should be 16. So all we do now is value equals... we want IToA. IToA returns a... no we don't, we want AToI, here it is. So it looks like the standard library AToI doesn't actually allow you to pass in the base. Well, oops, never mind, we will do it the hard way. Yeah, this code is assuming C-style numbers with a 0x, 0o, 0b or 0d prefix, rather than the archaic syntax needed by the assembler, which has got a suffix byte. Right, there is nothing at all complicated about this. This is all just completely standard number parsing. If C is greater than or equal to base, then simple error, invalid number. Token value equals token value times base as uint16, plus C as uint16. Notice that we're not parsing minus signs; that happens at a different stage. And we have now parsed the value into token value. So all we need to do is set the token to be a number and exit. Token value not found: I haven't declared that. It's doing something. Yeah, we get all the way to the end and there's an error, invalid base. So this should now print the line number in the source file of the error: line 296. What's line 296? So I think that what's happened here is that it's actually the last byte of the... hang on, why has that failed? So my thought is that that's actually the end of the line. We've actually seen lots of those before. Invalid base F, line 296. There's no F there.
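To make the suffix scheme concrete, here is a small C sketch of suffix-based number parsing along the lines just described. It is an illustration, not the assembler's actual routine: `parse_number` is a hypothetical name, it works on a complete token string instead of the streaming token buffer, and it returns an error code rather than calling an error handler.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parses a number whose base is given by a trailing letter:
 * B = 2, O or Q = 8, D = 10, H = 16, no suffix = 10.
 * Returns 0 on success and -1 on an invalid base or digit. */
int parse_number(const char *token, uint16_t *value) {
    size_t length = strlen(token);
    if (length == 0) return -1;

    unsigned base;
    int suffix = toupper((unsigned char)token[length - 1]);
    switch (suffix) {
        case 'B': base = 2; length--; break;
        case 'O':
        case 'Q': base = 8; length--; break;
        case 'D': base = 10; length--; break;
        case 'H': base = 16; length--; break;
        default:
            /* A trailing digit is part of the number, so keep it. */
            if (!isdigit(suffix)) return -1;
            base = 10;
            break;
    }
    if (length == 0) return -1;

    uint16_t result = 0;
    for (size_t i = 0; i < length; i++) {
        int c = toupper((unsigned char)token[i]);
        unsigned digit;
        if (isdigit(c)) digit = (unsigned)(c - '0');
        else if (c >= 'A' && c <= 'F') digit = (unsigned)(c - 'A' + 10);
        else return -1;
        if (digit >= base) return -1;       /* e.g. '2' in a binary number */
        result = (uint16_t)(result * base + digit);
    }
    *value = result;
    return 0;
}
```

Because B and D are also hex digits, the suffix has to be examined before the digits are converted, which is why the sketch strips it first.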
Okay, zero, zero, yeah, I keep doing that. The Cowgol assignment operator has a leading colon on it. There is no 0xFF... oh, oops, that's another compiler bug. That is returning the wrong format of... oh, no, this is the standard library, that's inline assembly, okay. Well, let's quickly fix that. Okay, so the assembler I'm using is the venerable zmac and it supports both syntaxes. And it'd be nice to have that as well for my assembler, but that was a hard-coded 0x in this line here. So let's save that and run make, as my dependency analysis isn't quite right. What do I need to regenerate? I don't think I need to regenerate anything. Just need to make sure this is saved. Finished. Excellent. We've now successfully tokenized the entire file. Let's try that again for performance, without the prints. There we go. So allocating all the symbols for all the instructions has actually only used that much memory, so we have loads to spare. So we have now... oh yeah, that file does not contain any string constants. String constants, that's the last thing here. Let's have a quick look at the original C source, wherever it's got it. There we go. So string constants get read into the buffer, but do we also do string escaping? No, we don't. No, we just look for the end of the string. Yeah, easy enough. So copy a byte into the token buffer. If C is not printable, then it's an invalid string. If the last character was a single quote, then stop. Actually, we want to grab one more character first, because we don't want to accumulate the leading single quote. Once we've finished the loop, then we have read the... so accumulate token byte pushes the current character and fetches the new character but doesn't push it. So if it's a single quote, then we haven't actually added that quotation mark to the buffer. So we don't want to unread the last thing. So in the old assembler, this is wrong. It shows how often string constants are used.
But what we do want to do is to terminate it and say it's a string. Okay, but we're never going to use that. So all right. So I believe that is our lexer done, and the core of our symbol table. So the next thing to do is to initialize the symbol table. However, I am going to take a very short break to get another drink because my voice is going. So as far as you're concerned, I will be back in seconds. Ah, the joys of nutrition-free carbonic acid flavored with chemicals, but it's got ice in it. So that's the main thing. Okay, symbol table. So the way the symbol table works is we have our table of symbol structures. Each symbol has a name, an associated value and a callback to actually do the thing. Now, the way we're going to work this... I'm just trying to think about where I'm going to put this. I think it actually makes sense to put it down here. We are going to define a static array of symbols that we will then go through and add to the symbol table on startup. I'm not sure this is going to work. Yeah, so I was actually planning on having, you know, like this, this A symbol. Well, the symbol is value, callback, next and name. So the value is seven. The callback is, let's just ignore that for the time being. Next is always going to be zero, and the name is going to be A. So what I was going to do is have a simple loop that just goes through and adds them to the symbol table. That's building the hash table, except the name is actually inline in the structure. So, yeah, it doesn't like this, because you can't initialize an inline string like that. So I think what I'm going to do is I'm just going to eat the extra two bytes of waste. I'm turning this into a pointer, like so. So we will actually have to allocate the symbol and the string separately.
This also has the advantage that for symbols where the string is already in memory, like they're going to be here, because this A is a static string in the program memory, we don't actually have to duplicate them into the heap. Well, 'heap'. So find symbol is now a routine that just returns the... this should be a... it returns the address of the bucket entry. Add symbol is going to be the routine that the token code calls. Wait a minute, no, we don't want that. This is just going to be inline; it saves a little bit of space. Well, four bytes, plus marshalling, to be honest. Plus, this way we actually know how long the symbol name is. We don't need to call StrLen. The name needs to be... yes, this is now a pointer, not an array, so we turn it back to that. Unexpected if. Yeah, it should be an end sub. Okay, does it still run? 16K free, really? That has allocated a lot of space. Now, it should only be two extra bytes per symbol. That should not be 16K. So that means it's used up the bulk of the... yeah, I think this is... let's do a little bit of refactoring and clean this up. Yeah, splitting this up wasn't a good idea. So we're actually going to create a strdup function which duplicates a string onto the heap. Nope, this is actually going to take the token in the buffer and copy it to the heap. And I just realized I have lost the content of my clipboard, so I seem to have lost this. And then we're going to have an add symbol. Yeah, okay, I know what's going on. Yeah, what I did was that I'm always adding the symbol, rather than only doing it if it doesn't already exist. So what this is going to do is it's just going to add the symbol. It's not going to add the string as well. The string will be added at a separate stage. So @bytesof symbol. The length is a uint8, so that needs to be an intptr, plus one.
So one of the things that Cowgol does is it has no automatic type conversion except for untyped constants. This is cumbersome in some ways, because it means I have to do this sort of thing. But it also means you know exactly what types are being used where, and in particular what type is being used for arithmetic. Because it doesn't promote values to machine words the way C does, it's kind of important to know whether your arithmetic's happening as a byte or a word. So CopyString... so down here, this just becomes name, that goes away completely. Should work. So that means this code down here: we find the psymbol. If it is null, that means it hasn't actually found the token. Expression was a uint16 used where a... Yeah, the line numbering in the Cowgol compiler is not quite right. So it's actually reported the error in line 94 as being on line 96. Unexpected if. Yeah, I keep doing that. Okay, now let's run that. What do we get? 51, that's better. That's the right number. So the reason why this is split up is this now allows us to call add symbol with a constant string rather than a string that's on the heap, because these now happen in two stages. So down here... although it occurs to me we're not going to be using add symbol to add these anyway, because the symbols already exist. Bah. Anyway, the order in which things happen is: we have the name, we have the value, and I believe you need to specify the other two values, as they're not initialized to zero yet. So that would be nought as symbol callback, nought as symbol. Initializer must be a number. Does the symbol callback have to be a number? That's supposed to work, but I think I may have... that's probably another compiler bug. So it might actually be this instead. It might not like the cast. Doesn't like the cast. Yes, that is indeed another compiler bug.
It's got a pretty simple way to distinguish between constant expressions, which you can use in things like initializers, and expressions that actually generate code. And I don't think that is working quite right for casts. So I think it believes that casts actually emit code and you're not allowed to use them in this context. So we're going to bodge around this. So let's create a subroutine to initialize the symbol table and call it from here. And we are indeed going to have a bodged symbol just so that we can refer to it here. I will fix this later. And we're going to have a big long array of the symbols we're going to add, and then a loop to go and add them. Now, we are going to be using find symbol to find each symbol, as you might expect. So we're going to iterate over every symbol in the array. @next here advances P to the next item of the type that it's pointing at. It's the equivalent of the C P plus plus, or P equals P plus one. Pointer arithmetic in Cowgol always happens in bytes, not in elements. That's a deliberate choice, because you shouldn't be doing pointer arithmetic other than just incrementing and decrementing, because pointer arithmetic on 8-bit systems is frequently woefully inefficient. So you can't, for example, index pointers at all. I can't do this. That's not allowed. I'd have to do that times the stride of symbol or something. So this is the normal idiom for iterating over arrays. This generates nice, efficient code. So we are going to look up the symbol. This should always be not found. The symbol is already created, so all we have to do is to hook it onto the chain, which we do by doing P next equals psymbol. But we know the psymbol is always null. So that's just going to overwrite the dummy value put in here. That should be all we need. Symbol and uint8 are not compatible in this context. Uint8, a string, of course. Yeah, that's right. It's the compiler being absolutely correct. That was my mistake. Expression was a uint8.
Yeah, I've done exactly the same thing here. My brain apparently thinks that strings and symbols are the same thing. Okay, so this should actually be initializing the symbol table from the values in this array. Okay, we now need to actually put the stuff in the array. That's the contents of this structure here. Now, symbols can come in several varieties. There are instructions, and the instructions all have a callback to basically tell the assembler what it's supposed to do. The value of an instruction is the byte that will actually be emitted. The 8080 is nice and straightforward in that there's a one-to-one mapping between opcode and byte. It doesn't have addressing modes. The other kind of symbol is a value, and these are initialized using these macros. Values are used in expression parsing, and the value encoded into them is used in the expression. Now, a value is actually an ordinary symbol with an EQU label CB callback. So what we're going to do is I'm just going to cut and paste these. We don't need any of this. These are all going to become like that. These are going to become these. This is an EQU label CB. These are going to become ALU source CBs. These are going to become simple 2B CBs, and that will not compile because we're missing some. We haven't defined any of these, and the operator precedence constants are not set, because the precedence table is another annoyingly complicated thing. Okay, I had better go and do that. One, two, three. I have to say I am quite looking forward to not having to think about expression parsing, because the shunting-yard parser that I used for the original assembler will work just fine in Cowgol. So we now need the various callbacks. There are many of these callbacks, and they're mostly placeholders used to produce errors. So we can define these up here. So operator callback implements symbol callback. The callback will only get called if the symbol is used as an instruction.
Note that these implements lines do not have parameter lists. The parameter list is taken from the parameters in the interface definition, which is here. Well, there aren't any, so it's not relevant in this case. Again, a deliberate design choice. It's one that I'm not terribly keen on, but it drastically simplifies the compiler, because when the compiler is reading parameter lists, it either reads the list and adds the parameters to the subroutine or, in a language like C, it has to read the list and check it against a subroutine that already exists to make sure they're the same. And by not allowing parameter lists in places where you're referring to a subroutine that's already been declared, and just copying the parameters from the declaration, it means I only need one set of parameter parsing, which vastly simplifies everything. So these are just going to be placeholders for now. We've got simple 2B CB and it compiles. Does it work? It does. So that's the first batch of symbols. So let's go and find the others and put them in. Where are they? I'm looking at the video where I actually entered all this information from the tables, which I actually have here for reference in case I need it. Wow, it took a long time. B to end if, E to M, mod to V, R, R, C to X, T, H, L. Okay. So this is somewhere where I should really know a bit more about the more exotic features of Vim. Anyway, this will do. So if we replace everything that's not a space up to the end of line with... excuse me. That worked. Initializer of wrong type, 368. Yeah, that's because I'm missing the... these should be, yeah, EQU label CBs. And all the values should have one. So then, right, simple 3B CB. So Cowgol's a fairly traditional compiler, and it generates a text file containing assembly that you then need to assemble. So it needs an assembler. For most systems, I just use the platform assembler.
And the assembler that I'm replacing now is actually supposed to be the CP/M platform assembler. However, when cross-compiling, it's a little bit problematic, because frequently for 8-bit systems the assemblers are kind of weird. The 6502 backend is using 64tass as an assembler, and 64tass is extremely weird and doesn't follow the standard conventions in a number of places, things like symbol names. Trying to produce source code that would assemble both on 64tass when cross-compiling from Linux and also on the assembler I found for the real BBC Micro was quite problematic, and I had to bodge things in a number of places. So I'm thinking that it'd be rather good to have my own assemblers for Cowgol's own special needs. That's not so much for the 8080, but particularly the Z80 and the 6502 run into issues because some instructions have limited range. So the 6502 only has two-byte branch instructions. You can only jump forwards or backwards a signed byte's worth of address space. That's minus 128 to plus 127. And knowing how big the code is isn't really the compiler's job. So Cowgol relies on the assembler turning a short branch into a long branch if it tries to jump too far. 64tass will do this. The same thing applies with the Z80, and the assembler I'm currently using, which is zmac, does do this, but I'd kind of like to stop using zmac because it's quite slow. So it would be nice to have my own suite of assemblers, and this is kind of an experiment to see how well that would work. Oh, how many of these are there? DB, DS, DW... is that all of them? I think so, just tidy up. Okay, and this will not compile because of all the missing implementations. So let's just stub these out. I've got one here. This is a typo. That was a tab that shouldn't have been there. Equals four. Spacing equals four, ET. Simple 1B, DB CB, DS, DW, RP CB. What is RP CB? Oh, register pair. That's what it is. ALU dest. That's an ALU instruction that writes to A.
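Returning to the branch-range point made above: the check an assembler has to make before choosing a short or long 6502 branch can be sketched in C like this. The function names are hypothetical, and the five-byte long form (an inverted two-byte branch that skips over a three-byte JMP) is one common way to do the expansion; it is shown as an assumption, not as a description of what any particular assembler emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 6502 relative branch is two bytes: opcode plus a signed 8-bit offset.
 * The offset is measured from the address of the following instruction. */
bool fits_short_branch(uint16_t branch_address, uint16_t target) {
    int32_t offset = (int32_t)target - ((int32_t)branch_address + 2);
    return offset >= -128 && offset <= 127;
}

/* Size in bytes of the code emitted for a conditional branch.
 * Assumption: the long form is an inverted branch over a JMP absolute. */
unsigned branch_size(uint16_t branch_address, uint16_t target) {
    return fits_short_branch(branch_address, target) ? 2 : 5;
}
```

Since growing one branch moves every later address, a real assembler has to repeat this decision until the sizes stop changing, which is part of why it is the assembler's job and not the compiler's.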
There's exactly one of it, which is RST, which is special anyway. Really? ALU dest. RST is extremely special. Why is that called ALU dest? Oh, and also INR? Ah, right. Yes, I know what's happening. Er, yeah. Ooh, I'd forgotten about that. That's kind of manky. Actually, am I misremembering how this works? So what it does when emitting these is the byte here has the value of the expression following it embedded inside it, because that's used to encode the register up here. So the registers all have internal numbering. So that's 0, 1, 2, 3, 4, 5, 6, 7, I believe. Which happens to match the encoding of RST. This assembler is extremely crude, and the way the registers work is they're just variables. EQU CB: these are used for defining symbols. We've got EQU CB for immutable values and SET CB for settable values. If, else, and end if are for conditional compilation. END CB terminates compilation. LXI is a variant of simple 3B. Is it? Oh, yeah. LXI takes a register pair and a two-byte value. MOV takes two registers, two 8-bit registers rather. And also MVI should go next to LXI. LXI loads a 16-bit value into a register pair. MVI loads an 8-bit value into a register. ORG is special because it tells the assembler where your code lives. Title is special because it does a title. That dates from the days when you did your assembly and printed out the log of the assembly. Okay, 5139 bytes. We're now half the size of the ACK assembler. We haven't actually put in any of the emission code yet, but, well, it seems to work. So we have now successfully initialized the symbol table. I think the next thing we do is actually start the... ah, that's useful. Remember I said earlier about white space? This is the syntax. Yeah. So we've got label, colon, followed by an instruction. We've got label, no colon, followed by an instruction. We've got an instruction on its own, and we've got a label on its own, and they all have to be distinguished.
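The remark about register numbers being embedded in the opcode byte can be made concrete with the standard 8080 encodings. The C sketch below uses hypothetical function and enum names; the bit layouts themselves are the documented 8080 ones (a destination register or RST number in bits 3 to 5, a source register in bits 0 to 2, a register pair in bits 4 to 5).

```c
#include <stdint.h>

/* 8080 register numbers as they appear inside opcodes. */
enum { REG_B, REG_C, REG_D, REG_E, REG_H, REG_L, REG_M, REG_A };

/* 8080 register pair numbers. */
enum { RP_B, RP_D, RP_H, RP_SP };

/* MOV dest,src: 01 ddd sss */
uint8_t encode_mov(uint8_t dest, uint8_t src) {
    return (uint8_t)(0x40 | ((dest & 7) << 3) | (src & 7));
}

/* MVI dest,imm8: 00 ddd 110, followed by one data byte */
uint8_t encode_mvi(uint8_t dest) {
    return (uint8_t)(0x06 | ((dest & 7) << 3));
}

/* INR dest: 00 ddd 100 */
uint8_t encode_inr(uint8_t dest) {
    return (uint8_t)(0x04 | ((dest & 7) << 3));
}

/* RST n: 11 nnn 111, same field position as a destination register */
uint8_t encode_rst(uint8_t n) {
    return (uint8_t)(0xC7 | ((n & 7) << 3));
}

/* LXI rp,imm16: 00 rp0 001, followed by two data bytes */
uint8_t encode_lxi(uint8_t rp) {
    return (uint8_t)(0x01 | ((rp & 3) << 4));
}
```

This is why one callback can serve both the destination-register instructions and RST: in each case a three-bit value is shifted left by three and merged into a base opcode.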
I think the next thing we do is we start work on this bit of state machine, which actually drives the bulk of the assembler. What it does is it reads tokens from the input file. It attempts to distinguish between labels and instructions, and it calls the appropriate callbacks to make things happen. And we're going to do that using a subroutine, parse, which is... sub... replace this code. We do parse: pass one, pass two. Between the two passes, we want to rewind the input file. Let me just remind myself of the syntax for that: FCB seek. The reason why we're trying to put code into subroutines is the static analysis I talked about earlier. The assembler, sorry, the compiler is capable of figuring out that parse here is never going to be called at the same time as initialize symbol table here, which means that temporary variables like this symbol and P here, and the ones in parse here, are never going to be in use at the same time. So it can place them in the same memory location. So adding a variable to this subroutine should not increase the amount of workspace. And I will try and demonstrate this. Let's just add a uint32. It's in the global scope. Therefore, it will not overlap with anything. Therefore, the workspace size will have to go up from 708 to 712. Put that in here and it drops back down to 708. This is incredibly important to save space on these old systems. The compiler would just not be possible without it. So what are we going to do? We initialize our parsing state. We copy our very useful comment. And here we have our loop where we actually read tokens. There is a call here to just try and abort compilation if the user presses a key. We could do this by calling CP/M directly, but the Cowgol standard library doesn't have it yet. So let's just quietly ignore that for the time being. Let's read a token. If T is the end of file, then stop. If T is a new line, do nothing. Just continue around the loop. If it's not an identifier, produce an error. If it's END CB, stop. Correct.
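The workspace-overlap idea demonstrated above (a local variable in a subroutine costing nothing extra, while a global always costs its full size) can be modelled with a tiny call-graph calculation. The C sketch below is a simplified illustration, not the Cowgol linker's actual algorithm: it assumes an acyclic call graph with no recursion, `struct sub` and `workspace_needed` are invented names, and the byte counts in the usage note are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLEES 4

/* A subroutine in a (hypothetical) acyclic call graph. */
struct sub {
    uint16_t locals;                         /* bytes of its own variables */
    const struct sub *callees[MAX_CALLEES];  /* NULL-terminated list */
};

/* Bytes needed while this subroutine is active: its own locals plus the
 * hungriest of its callees. Callees that are never active at the same
 * time can share the same memory, so only the maximum counts. */
uint16_t workspace_needed(const struct sub *sub) {
    uint16_t deepest = 0;
    for (size_t i = 0; i < MAX_CALLEES && sub->callees[i] != NULL; i++) {
        uint16_t needed = workspace_needed(sub->callees[i]);
        if (needed > deepest) deepest = needed;
    }
    return (uint16_t)(sub->locals + deepest);
}
```

For example, if a main program with 700 bytes of globals calls one subroutine with 10 bytes of locals and another with 6, the total is 710. Adding a four-byte variable to the smaller subroutine leaves the total at 710, while adding it to the globals raises it to 714.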
That is the opposite of the error I made before. Yeah, I stole the wordy syntax from Ada because after writing in Ada for a bit, I grew to rather like it. It is both easy to parse and avoids a number of simple human errors. Every construct starts with a word and ends with the same word. So you have to match up your ifs and end ifs and your loops and end loops and your subs and end subs, which makes it much harder to do the kind of mismatched braces that's so easy in C. Right, so this is the stuff that's trying to parse this. What we do is we try and identify what's the instruction and what's the label. We don't need the type spec there, because we know the cast means the type is set. If is label... what does is label do? Yikes, this is the expression reader. This is the parse arguments subroutine, and that should actually go down here so that it's closest to where it's being used. Here is the lexer. So is label returns whether the current symbol is a label or not. And it's a label if it's been set with EQU, if it's been set with SET, or if it's been referred to but not actually set, which means it's probably a label that gets defined later. Token symbol dot callback equals... These need to be defined here. EQU label. There is an EQU label. And, ah, yeah, yeah, okay. Right, I'm confusing the callbacks used to actually do the thing with the callbacks that are attached to the symbol that it's defining. So, right, those are defined up there. So if the current symbol is a label, then set the label and read another token. If the token's a colon, skip it. So we've read and dealt with the label, which means the only cases we now have to deal with are the instructions. So if we're not at the end of line, then we must be in one of cases one, two, and three. So therefore this must be the instruction. If we have an instruction, then we need to actually call it. But if a label was attached, then we need to set it here.
But only if the callback is not a SET or an EQU, I mean, if CB is not. Otherwise, set the label in all cases. All right, here's some stuff to do with the printer, with the listing, which we're going to ignore for the time being. All right, that is the bulk of the main state machine code. I've done that again. Set implicit label's not defined. Label: that has to be a global. So there's current insn, I believe. Okay, if there is no label, do nothing. If the label has already been defined to a different address than where we currently are, then we would produce an error. So we actually need a variable for the program counter, which is of course 16 bits. So if current label dot value is not program counter, and the callback is not undefined label CB. Can't remember whether I'm using simple error or start error, end error. Set the value, mark this as an EQU label. Now to run the assembler. Error in line one, expected an identifier. Is that complaining about org? Org is in the symbol table. This one or this one? So when assembling this, we read the org. The org is not a label, because it's defined as an instruction. So it should be skipping this clause. It's not a new line. It is an identifier. What do we have? We have a 254, which is a number. It's clearly found the second item, which makes me suspect it has actually picked org up as a label. Now, org is definitely here. So this could be that my hash table is borked. So it's failing to look up the symbol and is therefore adding org again in the wrong place. Or I got the wrong conditional here. Unrecognized instruction: undefined label CB. It should have called org. So it hasn't done this bit. It's not a label. It hasn't done this bit, which means that it's correctly identified that token symbol is an identifier. It then calls the thing and gets the wrong one. Okay, I am going to need a little bit of assistance here, I think. So let me go look up the zmac documentation.
So I think I want to tell this to emit a listing file. Yes. So this is zmac's output of the program here. So the reason for doing this is that this will then allow us to find everything and then run the debugger. So here is the parse code. So you can stick a breakpoint at address 1469, go, and we've landed here. The debugger here is showing Z80 syntax rather than 8080. That's fine. I want the address of the symbol table hash. Yeah, I'd forgotten that addresses... Oh yeah, this is the static array. So you can see address of name, value, callback and next, which is always initialized to WS plus 1652. And then we get on to the next one. So here is the code that initializes the symbol table. Symbol, WS plus 558, yeah. That is 1823. This does not look like... wait a minute, I know what I'm doing wrong. Oh no, no, that is right. Yeah, I thought that I was using pointer arithmetic and had just added slot onto the address, but no. Because pointer arithmetic happens in bytes, that wouldn't have scaled things correctly. So yeah, what this is doing: it loads the address of the symbol table. It then casts slot, which is an 8-bit value, to a 16-bit value. It doubles slot, which is the scaling, and then adds the address on. So WS plus 558 is the address of the symbol table, which is at 1823, and this is clearly blank. So that's not worked. So WS plus 666 is psymbol. So here we can see it load psymbol, dereference it, which is this fairly laborious piece of code, stash symbol in 669, check it against zero, et cetera, et cetera. All right then, let us stick a breakpoint at 1311, A. That didn't hit the breakpoint. Initialize symbol table is here. Find symbol is called here. So here is our code for initialize symbol table. Is that getting executed? Yes. Actually, did I try to set the breakpoint on the line number? Yes, I did. Okay, this is the line number, this is the address. So 8A6, go. Right, that's better. So HL is the address that it's trying to use. It's psymbol.
So this is the address it's found for the symbol whose name has been put in 186F, which is at 199, which is org, really, org. That's the first symbol this is adding. This definitely sounds like initialize symbol table isn't working, which is here. Okay, so breakpoint at 138, go. We're in initialize symbol table. Well, I can already see what the problem is, which is that here it's loading HL with the array, and here it's loading HL with the end of the array, except that it's actually the array. So when it does the comparison, it's always coming out wrong. This is another compiler bug. Fantastic, obviously we're not using that in the compiler either. Okay, I do think I know what this is. I'll be, oh, okay, I'm going to have to fix this one. So these are the rules the compiler uses to generate the code. So if you see an internal abstract syntax tree node representing, like, BLTS2, branch if less than, signed, two byte, this takes its two parameters in registers; the register allocator will take care of making sure the values are in there. This is what actually happens. So the bit I'm looking for is initializers. So, no way, that's correct. So this is the rule that takes care of loading the address of something into a register. It generates an LXI with an address parameter using this symbol and this offset, because it's clearly not added the offset. The offset, there should be a plus at the end of this, but there isn't. So this just passes it into eSIMref, which produces a reference for a symbol. And because it's a static initializer, it follows this code path. It should have added on the offset. Interesting, I don't like those errors. I very much don't like those errors. Ah, yeah, okay. That's because this routine is also being used to generate the label at the top of the initializer, this label here. Let me actually try and find the thing, this label. And if there's a plus zero here, then it's obviously invalid. So why has that not appeared here?
The only possible explanation is that the offset being passed in is zero, which is not right. I think I might have to take a break and look at this offline. I could use a bio break anyway. So there should be a test case for that. I wonder if this, if sizeof here has actually worked. Let me just try this and look at the code. Oh, oh, there's a plus eight now. Right, sizeof has not worked correctly. That's a front end compiler issue. And I can guess that it is in fact because this is an implicitly sized array. Yeah, I bet it's not setting that correctly. Now I know that I am using those in the compiler. So why that's not doing it right, I don't know. Yeah, zero. Sizeof returns the number of elements. Bytesof returns the number of bytes. Also zero. That's not great. Okay, well it's not this piece of code. It must be this. And it's probably in the parser. This is the compiler source code. It's a Lemon parser used for actually reading and parsing the program. This is the bit of code that deals with static initializers. This is where it should be filling out the current type with, if the current type is an array, then set the size. If I put an actual value there. Yeah, wrong number of elements for initializers. It's actually got that right. That's produced by this error probably. How can a compiler where the binaries are 20 to 30K have 2,540 build artifacts? Of course, I've just changed the compiler itself. It then has to go and rebuild absolutely everything. There we go. Produce the error and path one. That's this one. So I'm willing to bet that current type width is not zero. Therefore it's not initializing that stuff. I think that's, why would that not be zero? So braced initializers use a separate code path. Not braced initializer. Close brace. Check the end of initializer. I bet this is setting width of that here. You know what? I'm not going to worry about this for now.
I can actually just statically set the number of elements. So we're in line 515. 409. That's another thing to fix. Wrong number of elements. 409. 515. 106. This is very peculiar. Also not really helping me get my assembler done. The other thing to do is to not use a counted array at all and just put a terminator element at the end. Which I might actually do. 848. I'll add the width in bytes. So it does have a width. Yeah, this will be because it's an initializer containing braced initializers. And I bet that each time one of these things ends, it's increasing the known width of the outer array. Yeah, I'm not going to worry about this. Initializer must be a number. CB. Oh, I can't put that. Okay. Now that's kind of gruesome. It will do for now. So that instead of doing that test, I can just do if e.name equals 0 then break. Right. What that does is it checks to see whether the string name is empty. What's that done? Try that without the debugger. OrgCB. Right. It's done the right thing. The symbols are all added to the symbol table. And we have now actually reached the point where we're calling one of the instruction callbacks. Good. Right. So what does org do? Read an expression. Set the program counter. Okay. So now we're going to be doing the expression parser. Yay. I've got token values that have token number. So what expect expression will do, or at least the code up here that I'm about to put in, is that it will read an expression from the input stream and put the result in token number. So the org instruction here calls expect expression. It checks to make sure I'm not trying to move the program counter backwards and sets the program counter. Like so. Nothing complicated. None of these are particularly complicated. But we're now going to have to do the expression parser. And that is complicated. So the expression parser is, well, expect expression is simple. Read expression is not. This is read expression. In fact, it's not that bad.
I did all the hard work last time. You can see the debug code here from trying to make it work. It's actually used in several places. So expect expression is: if read expression is not used to token and then simple error, expected a single expression. End if, end sub. Read expression presumably returns a token. So read expression uses Dijkstra's shunting-yard algorithm and some stacks. So it's not just that routine here. It's all these helper routines as well, from syntax error down. What it does is it reads tokens from the stream. It pushes them on to various stacks depending on the precedence. It pops values off the stacks when it thinks it's time to evaluate them. It does evaluate them and eventually returns a value. Now the advantage of using Cowgol is that all these helper utilities can be nested here inside read expression, which saves memory. The disadvantage is that it can't really tell the difference between operators and values. So we need to have this seen value flag that needs to be set correctly. But anyway, let's do some stacking of values, I think. Yeah, I'm slightly flaking out, which is why I'm making the typos. I suspect I'm not going to be finished this session and I'm going to have to wait till tomorrow for the rest of it. I have an appointment later, so I will have to break this off. Okay, value. So there's nothing terribly interesting here. We have some, oops, equals value sp plus one. So we have some helper utilities for pushing and popping stuff from the stack with underflow and overflow checking. 308, unexpected if is, and there should be an is here too. Okay, apply operator is the routine that takes values off the stack, pops an operator off the stack and calls the thing to make it happen. And there is an additional table containing information about the operators, including another set of callbacks, which we're going to have to put in at this set of routines here. An operator callback takes a left, a right, and returns a result.
And there's also an operator structure. This contains information about which callback to use, the precedence of the operator and whether it is unary or binary. Here, that can go down there as well. Yeah, looking backwards and forwards between C and Cowgol keeps making me get the syntax and things like the ordering of types and names the wrong way around. Yeah, these are in alphabetical order. XOR, div, mod, mul, this one is unary. This one is also unary. It's just those two, or these two, are special. So in Cowgol, the right-hand side of a shift has to be a uint8. So we actually need to do that. Sub, XOR. We now need the operator table. This actually has to be in the same order as the set of enumerations further up. A of operators. The same bug that I spotted before is going to apply here. So we're going to have to be careful. Unary not is unary, was binary. Oops, missed one. So this is the element that represents parentheses. So I can't put null in because, you know, that bug again. Does it actually? Well, there's no callback. So it cannot actually be using it as a real element. Let's try that. So I don't actually think, yeah, provided apply operator is never called for parentheses, I think we're good. Pop the right-hand side. If we're binary, then pop the left-hand side. Call the callback with left and right. If we're not binary, call the callback with just the right. Operators is already defined. Expression was a uint8, used with a uint16, line 360. So the types of an index in Cowgol, right, this is in fact the same bug. The type of an index should be the smallest numerical type capable of addressing all elements in the array, which in this case should be a uint8 because there's only a handful. However, the bug that's causing width to be set is clearly setting it to the wrong thing, which means that it thinks there's a lot of elements in the array. So that will be a uint16.
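
A rough sketch of that index-type rule, written in Python rather than Cowgol. The function name, the list of candidate types and the choice to size by the highest index are all assumptions for illustration; this is not the compiler's actual code.

```python
def index_type_for(element_count: int) -> str:
    """Pick the narrowest unsigned type that can address every element.

    Hypothetical helper: the type names mirror the ones mentioned in the
    session, but the exact rule the Cowgol front end applies is assumed.
    """
    if element_count <= 0:
        raise ValueError("array must have at least one element")
    highest_index = element_count - 1
    for name, bits in (("uint8", 8), ("uint16", 16), ("uint32", 32)):
        if highest_index < (1 << bits):
            return name
    raise ValueError("array too large to index")


# A handful of operators fits in a uint8 index...
print(index_type_for(20))
# ...but if a bug inflates the element count, the index type widens.
print(index_type_for(848))
```

This is why a wrongly computed array size shows up as a type error somewhere else entirely: the index type is derived from the size.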
So what we're actually going to do is, index of is an operator that returns the index type of the array. So we explicitly cast that to the appropriate type. Again, we shouldn't need to, it's a bug, but that's a workaround. Expression was an operator callback, used with a uint16. Ooh, I know what this one is. That's another compiler bug. That's a precedence bug. Seriously, unexpected open paren? Wow. Try this. Yeah, I'm going to have such a laundry list of stuff to look at. What's happening here is the way this expression is being parsed, with dot for dereferencing and the callback, is incorrect. So instead of referencing it directly, I have to put it in a variable. That is not brilliant. And you have apply operator, push operator. This is the same code as here. Push and apply operator. Yeah, this is the core of the Dijkstra shunting-yard algorithm. You give it an operator that you want to push onto the operator stack and it checks against the precedence of the things already on the stack and it pops the stuff off as needed. A while. Interesting. The code here is actually storing operator indexes on the operator stack. Why don't we just store operator pointers? Because that way this just becomes, doesn't change. Where are we calling apply operator from? Oh, lots of them. Yeah, so this becomes, you don't need that anymore. So this becomes the top operator is operator table, yeah, index of values, index of operators. Top op precedence is less than reported pop precedence. Then apply operator. 375 value, not 376. Expression was a, whoops, it could have been nasty. This one too. Expression is uint8. That seems to be the bulk of the helpers. A couple of simple helpers. These two look like they could be simplified actually. And then we get into the actual routine that does the work. So it's a simple loop that reads a token and then does stuff with that token. So dollar is a special value that returns the current program counter. Yeah, so you see these, lots of these tests.
We can actually, we've seen value on operator, yeah. We can put the test in here. Because these subroutines can access the local variables of read expression. This makes this extremely easy, which then simplifies the code down here. So we can change this to want operator, want value. So dollar is an operator. Is that right? Expected operator, dollar is not a value. So dollar is not an operator. Dollar is a value. I want a value. That should be want value. So the reason for the flag is that you basically alternate between values and operators, with a few exceptions for things like parentheses. So that dollar only makes sense if it's after an operator. If it's after an operator, seen value will be zero. So yeah, push value program counter. Seen value equals one. And I'm using capital V's there. So if it's a number, want value, push value number, seen value equals one. Token string is also a value. These are kind of special. So the thing about strings is in most contexts, like using LXI, LXI H, 'AB', this will actually turn into a character constant where you get a 16-bit value that's loaded into H. This also applies to DW, which is the word one. That way you get a 16-bit character constant. But if you use DB, then the bytes are returned in the other order as just a simple string. So yeah. So what I'm doing here is that DB will set DB string constant hack and then call read expression. If the string constant appears not inside an expression, then we just return a string constant, which is extremely special. But we're going to copy that. The string constant hack is equal to zero. And value is equal to zero. Because we still want to allow you to use character constants in expressions in DB. Yeah. This actually returns a token. That returns a token. So we're going to use this here. So this code is actually evaluating the string constant because we've got this, which has value 65, and this, which has value 65 times 256 plus 66. Whatever that is.
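
To pin down "whatever that is", here is a small Python sketch of the character-constant arithmetic just described. The function names are made up for illustration, and the byte order shown for DW simply follows from the 8080 being little-endian; it is not taken from the assembler source.

```python
def char_constant(text: str) -> int:
    """Fold a one- or two-character string into a 16-bit value.

    The first character ends up in the high byte, so 'AB' is
    65 * 256 + 66. (Hypothetical helper, not the assembler's code.)
    """
    if not 1 <= len(text) <= 2:
        raise ValueError("character constant must be one or two characters")
    value = 0
    for ch in text:
        value = (value << 8) | ord(ch)
    return value


def emit_dw(value: int) -> bytes:
    """A 16-bit word is stored low byte first on the 8080."""
    return bytes([value & 0xFF, (value >> 8) & 0xFF])


def emit_db_string(text: str) -> bytes:
    """DB with a bare string just emits the characters in order."""
    return text.encode("ascii")


print(char_constant("A"))            # 65
print(char_constant("AB"))           # 16706, i.e. 0x4142
print(emit_dw(char_constant("AB")))  # b'BA'
print(emit_db_string("AB"))          # b'AB'
```

So the same two characters land in memory in opposite orders depending on whether they went through the expression evaluator, which is the reason DB needs the special case.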
And token length is not equal to two. Then character constant one. Okay. Plus can be used in a unary or a binary context. In a unary context, we just throw it away. If in a binary context, then we do the push and apply thing. So if the last thing we have seen is a value, then it's in the binary context. Push and apply operator op add. And that is an operator. So seen value becomes zero. The same logic applies for subtract. Except that in a unary context, we do want to do something. Multiplication can only happen after a value. So we want operator, likewise division. Parentheses are special because open parentheses are values. Close parentheses are operators. And I actually think I am going to have to add that parentheses thing here because I need something with a very high precedence, because I believe that push and apply here is going to attempt to look up the precedence via this table. So since we actually have to put something in the, excuse me, yawning, since we actually have to put something in the callback field, we're going to do that. It should never be used. Now in a close parenthesis, we actually do some special code: we pop and apply all the way back to the next parenthesis. So adding new operators onto the stack will never apply past the parenthesis there because the parenthesis has the highest precedence of all. We only pop that parenthesis when we see the close parenthesis that matches. So hope this works. Simple val not found. Token value not found. Token value? Number? Why do I keep typing value instead of number? DB string constant hack, that can go up here. String constant hack. Operator, we've got the close parenthesis. Identifiers. Token identifier. Okay. If the identifier that's just been picked up is an operator, then we push the operator. The operator index is the value as seen in the symbol. So if it's the wrong kind of operator, use a syntax error and apply operator.
If it's not an operator, then it must be a label, in which case we push the value. If it's not a label, it's an error. If it is none of the above, then we've reached the end of the expression and we actually need to stop. Unfortunately, Cowgol suffers from exactly the same design flaw that C does, where I can't actually put a break here because it will break out of the case statement. I'm actually kind of wondering whether I want the break statement to break out of cases. I don't do fall through, which is why there are no breaks at the end of whens. So that's the major use of them. So I think I could just take them out. That would allow me to put a break in here and have it work. I'd have to go through the existing code and look for occurrences. The simplest way to do this is to simply stick all of this in a subroutine, because then I can just do return. Syntax error not found. That's an informative error message, I have to say. Hey, the error messages in this assembler are so much better than the error messages in the original. You have no idea. It would just display one character in the margin of the listing to indicate that something was wrong. Read token, it's not equal to the token. Then syntax error. Apply operator. Yep, that needs to be uint8. 5 is the label is not found. That's because that's here. I'll move this up. Cowgol is a single pass language. There we go. Being single pass, well, the front end is a single pass and the back end is a single pass. This means that you have to declare things in order, which is not so great. This is actually Cowgol 2.0. Cowgol 1.0 was a multi-pass compiler and you could declare anything in any order and it worked pretty well, but it was huge and just very, very slow. When we reach the end of the expression, we then have to unstack everything that's been pushed onto the operator and value stacks and evaluate them. Once all the operators have been evaluated, the value stack should contain one item, which is the result.
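
The whole scheme (value stack, operator stack, the seen value flag, unary versus binary plus and minus, parentheses, and the final unstacking) can be sketched compactly in Python. This is an illustration under assumptions, not a port of the assembler: tokens arrive as a ready-made list, the operator set is cut down, and the precedence numbers are my own (here higher binds tighter and an open parenthesis is simply never popped by push-and-apply, which may be the opposite way round from the table in the session).

```python
# name: (precedence, is_binary, callback); results wrap to 16 bits
OPERATORS = {
    "add": (1, True, lambda left, right: (left + right) & 0xFFFF),
    "sub": (1, True, lambda left, right: (left - right) & 0xFFFF),
    "mul": (2, True, lambda left, right: (left * right) & 0xFFFF),
    "div": (2, True, lambda left, right: left // right),
    "neg": (3, False, lambda right: (-right) & 0xFFFF),
}


def evaluate(tokens, program_counter=0):
    values, ops = [], []
    seen_value = False  # True when the previous token was a value

    def want_value():
        if seen_value:
            raise SyntaxError("expected operator, got value")

    def want_operator():
        if not seen_value:
            raise SyntaxError("expected value, got operator")

    def apply_operator():
        _, is_binary, callback = OPERATORS[ops.pop()]
        right = values.pop()
        if is_binary:
            left = values.pop()
            values.append(callback(left, right))
        else:
            values.append(callback(right))

    def push_and_apply(name):
        precedence, is_binary, _ = OPERATORS[name]
        if is_binary:
            # Evaluate anything on the stack that binds at least as tightly,
            # but never look past an open parenthesis.
            while ops and ops[-1] != "(" and OPERATORS[ops[-1]][0] >= precedence:
                apply_operator()
        ops.append(name)

    for token in tokens:
        if isinstance(token, int):
            want_value()
            values.append(token & 0xFFFF)
            seen_value = True
        elif token == "$":  # current program counter
            want_value()
            values.append(program_counter)
            seen_value = True
        elif token == "+":
            if seen_value:  # binary plus; a unary plus is thrown away
                push_and_apply("add")
                seen_value = False
        elif token == "-":
            push_and_apply("sub" if seen_value else "neg")
            seen_value = False
        elif token in ("*", "/"):
            want_operator()
            push_and_apply("mul" if token == "*" else "div")
            seen_value = False
        elif token == "(":
            want_value()
            ops.append("(")
        elif token == ")":
            want_operator()
            while ops and ops[-1] != "(":
                apply_operator()
            if not ops:
                raise SyntaxError("unbalanced close parenthesis")
            ops.pop()  # discard the matching "("; seen_value stays True
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    if not seen_value:
        raise SyntaxError("expression ends with an operator")
    while ops:  # end of expression: unstack everything
        if ops[-1] == "(":
            raise SyntaxError("unbalanced open parenthesis")
        apply_operator()
    if len(values) != 1:
        raise SyntaxError("expected a single expression")
    return values[0]


print(evaluate([2, "+", 3, "*", 4]))
print(evaluate(["(", 2, "+", 3, ")", "*", 4]))
print(evaluate(["$", "+", 8], program_counter=0x100))
```

Because the helpers are nested inside `evaluate`, they can read `seen_value` and the two stacks directly, which is the same convenience the nested Cowgol subroutines give.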
Expression t. And read expression then returns the token that terminated the expression, which in the case of expect expression is the new line at the end of the line. So we should have enough that org runs. Expected value got operator, fantastic. So we have not seen a value, which means we want an operator. This is the wrong way around. Well, it's different. So 255 is a number. I think it's a number. Now it's an identifier. 254 is a number. Why has it done that in that order? So it should... Oh wait, those two prints were from the expression parser. So 254 is the number that we just read. But the next token should be a new line. Is our tokenizer actually returning new lines? Or is it consuming them as part of the whitespace? Should not be returning new lines. Intriguing. So what I think is happening is that the new line token... Well, it's reading the number. The number is being pushed. And then it's reading another token, but instead of getting the new line for the end of the line, which should terminate the expression, it is in fact skipping it and reading the identifier at the beginning of the next line. Let me just double check the terminators. The terminators are 10, backslash-n. That's correct. So the whitespace swallowing code only looks for backslash-r's. So this will dump every token read. 255, 254, 255. Right, it has in fact... Yeah, I did think that was it. 79, which is an O. 49, which is a 1. 76, which is the L here. It has swallowed the new line. It might have swallowed it at the end of the number. So this will push the new line back onto the stack. I should be able to check this by doing... Yeah. So that should have un-got the new line. It's possible my un-get is not working. Push c equals c. Push c equals c. That's not complicated. Okay, let's do this. I can see that it has read the 10 once. And it should have read it twice. That's because I'm an idiot. Much better. So by putting that inside the subroutine, it was being initialized every time.
So the value of push c was not being preserved. Right. We are now doing the org and moving on to the next instruction, which is the LXI. So let's do the LXI. And then I'm afraid I must take a break because I need some dinner. So here is an example where we call read expression and then check the terminating token. We expect it to be a comma. If it's not a comma, something is wrong. And this is starting to refer to the emitter code, which we haven't done any of. So that's probably the next major chore. For now, let's just do projection at line two syntax error. Right. This is in fact an expression. So we're trying to read the identifier top, which of course doesn't exist. It will create it automatically and then add one to eight to it. In what circumstances are we doing syntax error here? So top is not showing up as a label. This is probably because the tokenizer, the lexer routine that adds implicit labels, yeah, needs to be setting the type to, well, undef label CB. Let me just check symbol. Oh, wait, it is doing it. So is my label wrong? I changed that because it was wrong. Now it's wronger. Okay. Well, that works. And there is our first actual instruction. 01 080 01 is 01 LXI B, followed by a 16-bit value. So look at our source code. That has, in fact, SP 01. Whereas LXI SP, right, that should be 31. That's not so hot. So token number is the value of the expression. Have we actually set that correctly? I don't think we have. Token number equals values zero. Let's try that. Better, 31. That is the correct instruction. Good. Okay. So break time. I'm going to have some dinner. I do not know whether I will have time to come back to this today, in which case I'll have to come back to it tomorrow, at which point things might be set up differently. But I think that's some pretty decent progress. The program size is 7.5K, more or less. We've got most of the functionality of the core assembler.
We need to add a bunch of callbacks, but there's not a lot of code there. The two bigger things we haven't done are the emitter, which is actually pretty straightforward, and the listing generator, which is not straightforward, mainly because it's very poorly defined. I think that we will probably end up with something smaller than the ACK's, but not necessarily much smaller, but it should be a lot faster. And once it's done, I will actually try it for real on a real machine and see how that works. Plus I need to go through and fix the bugs. What am I looking at here? This is a read and stack. That's the big routine that does the expression parsing. So read token, stash it in a variable, check the result, call want value, call push value, set seen value, jump to the end. Cowgol actually generates pretty reasonable code for the 8080. Right, break time, back later. All right, here we are back again. It's a new day. I have a new cup of tea. And most importantly of all, I've managed to fix some compiler bugs. So the not operator should work now. And array sizes should be set correctly for arrays of structures with no set size. And one of the consequences of this is it's now setting the type of the index correctly. Hence the error message you can see over here. So I'm going to have to fix that. What we're doing here is the value of the symbol, which indicates for an operator which operator it is in the table here, is a uint16, but the operator table is expecting a uint8 as the index type. So that should be better. Yup, that built. We are at 7453 bytes of binary. And let us go and find that else. Where was it? I can't remember what I was doing now. It was in the lexer. So that would be read token. Here it is. So if we do that, that should, I hope, work now. I haven't tested this. So the binary got a little bit smaller. Expected value got operator. I am going to guess that hasn't worked. That's interesting.
So what the bug actually was is in the code that handles conditionals and deals with the expression short circuiting. You know, when it's evaluating this and here, if the first condition is false, then it knows it does not have to execute the second condition. This stuff was, if you use the not, it was correctly negating conditions themselves but wasn't negating ands and ors. And I put together a compiler test case and fixed the test case. But that apparently has, oh, hang on, hang on. Yeah, because just sticking a not on the front is going to apply it to this clause but not this clause. Let's try that and see what happens. Good. Right, that does work after all. That's a relief. It's always nice to have a little bit of credibility left. This does have a parenthesis around everything. Oh, wait a minute. I don't want to change that one. Was I doing this anywhere else? I don't think we were. Wait a minute. Right, that does still seem to be working. And the other one was the array size in the symbol table, because we put this nasty hacky marker in the end. So let's change this. So while P is not equal to the symbol, sizeof symbols, loop, and we can get rid of these three lines, that still works. Excellent. And if we go look at the machine code, here is the initialize symbol table subroutine. So here we have, let me actually just take a quick copy of this. So we take the address of the first item in the array and assign it to P. Here we take the address of the first item in the array and assign it to P, and that is these two instructions. We set J to be the number of entries. No, the size. I put that in as debug code yesterday and I need to take it out again. Anyway, that is now initializing to 848 and not 0. There we go. Here we reload the value of P, which is in WS plus 662. We have to reload it because there's a label here, and because execution can jump back to the label at any point, we cannot rely on it being in a register.
In fact, Cowgol doesn't try to remember things in registers across labels anyway because of this. Exchange copies HL to DE. Load HL with the address of the last item. Call a helper routine that just compares HL and DE. If they are equal, then we jump to the end of the loop, which is here, which then exits the subroutine. As I said last time, it actually produces pretty good 8080 code, mainly because 8080 code is incredibly simple. This comparison routine, by the way, you wouldn't think you needed one to just compare two 16-bit values, but it's kind of awful on the 8080, and the shortest way to do it is with a helper routine like this. What this does is... Why is that an LD? That is Z80 stuff. That's wrong. That should be a MOV. Move E into A, that is the first thing we are comparing. Compare against L. CMP will only compare against the A register, which is why we need this copy. If they are different, return now. Then we compare the high bytes in the same way. The reason why we haven't just inlined this is this return NZ is a single byte. But on the 8080, all comparison jump instructions, like this, are three bytes. We have to have one for the actual comparison jump at the end of the helper routine, and this call instruction is another three. We do actually save by... You know what? Let me start that explanation again, because that made no sense. If we inlined the comparison, we would have to inline the actual comparison bits. That's the MOV and the CMP, twice. Each time we would then need a comparison and branch, and that would be a JNZ here. The JNZ would be three bytes, and of course we need the one at the end. We would end up having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 bytes just to compare two 16-bit values. By having a helper routine, we actually manage to reduce that to a mere 6 bytes, which is still a lot, but is better. 8080 code is not what you would call efficient. The Z80 helped a lot with this, because it introduced two-byte branch instructions with a limited range.
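
For anyone who wants to check the byte counting, here is a small Python model of it. The instruction lengths are the standard 8080 ones; the exact sequence inside the helper is reconstructed from the description above, so treat it as an assumption rather than a listing of the real runtime routine.

```python
# Standard 8080 instruction lengths in bytes.
SIZES = {"MOV": 1, "CMP": 1, "RNZ": 1, "RET": 1, "JNZ": 3, "JZ": 3, "CALL": 3}


def compare_hl_de(hl: int, de: int) -> bool:
    """Model of the helper: True when the Z flag would be set on return."""
    a = de & 0xFF              # MOV A,E
    if a != (hl & 0xFF):       # CMP L
        return False           # RNZ: low bytes differ, bail out early
    a = (de >> 8) & 0xFF       # MOV A,D
    return a == (hl >> 8)      # CMP H, then RET with Z reflecting the result


def cost(instructions):
    return sum(SIZES[name] for name in instructions)


# Inlined at every comparison site: two compare-and-branch pairs.
inline_cost = cost(["MOV", "CMP", "JNZ", "MOV", "CMP", "JNZ"])
# With the helper, each site only needs a call and one conditional jump.
call_site_cost = cost(["CALL", "JZ"])
# The helper body itself is paid for once, however many sites use it.
helper_body_cost = cost(["MOV", "CMP", "RNZ", "MOV", "CMP", "RET"])

print(inline_cost, call_site_cost, helper_body_cost)  # 10 6 6
```

With those numbers the helper already pays for itself at the second comparison site: 6 + 2 × 6 = 18 bytes against 2 × 10 = 20 inlined.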
Okay, so that bit... That now works, so we need to press on with implementing some of these helper routines. Once we've done that, we'll do some of the emission code, and then we should actually have a working assembler. So where were we? We might as well start at the top. Title CB is... This was used in the original assembler to display the title of the assembly file in the listing, which traditionally went to a printer. We are not going to copy it to the listing file, because no one really uses that anymore. I'm just going to copy the code that we used in the C assembler, and all it does is print the title to the output, but only in the first pass. That actually brings me to another small point that we have to fix, which is the pass routine here takes the pass as a parameter, but we don't want to do that, because that needs to be a global variable, because we're going to want to reference it inside other routines, such as title CB, because we only want to print the title in the first pass. We haven't done expect instruction yet. The old code is using zero-based passes, the new code is using one-based. Okay, and that will probably... Oops, that's not right. We have implemented expect. Good. So here we have... This routine is to do with emitting the listing, so I'm going to ignore that for the time being. The EQU CB sets the value of a variable. Where is that? Yeah, that should not be there. So this is called when you do something like... Actually, there should be an example in here somewhere, like this. So you give it an implicit label, and you give it a value, and EQU will set the value of the label to the expression on the right-hand side. EQU is different from SET, in that EQU can only set the value once. It then becomes immutable. Now, we handle these in the same way. We just use the callback to remember whether it's an EQU or a SET. And it's slightly more sophisticated than you might think because EQU labels do get redefined in the second pass.
So we have to allow them to be defined to the same value that they are currently set to, which is interesting. So if the label is zero... If there is no current implicit label, this means somebody's just put an EQU on its own, which is not allowed. Okay. Read the expression. If the value of the label is not the value of the expression, that is, if you're trying to set it to something else, and the type of label is not undefined, then produce an error. We're surely in the second pass. Oh, yeah. Right. If it's undefined... Hang on. I have this condition rather backwards, I believe. This is another case of a not. We want to produce an error if it has been defined and the value is different. So now set the label. SET is extremely similar, but does not have the test. So we can just cut and paste that. It would be possible to common out some of this code. Oh, it's actually got a different test. Yeah, you can only redefine EQU... Yes, you can redefine undefined labels and SET labels, but not EQU labels. Okay. Oh, dear. IF was annoying. I remember that from last time. Yeah, let's just do IF now. So I will move IF, ELSE, and ENDIF up. So it's in the same order. So the way these work is... Okay, the Cowgol assembler doesn't actually produce these. You have IF followed by an expression that evaluates to either 0 or 1. And if it is 0, then it skips the first branch and continues assembling the else block. If it's non-zero, then it assembles the first branch and skips the second block. And the way we do this is when we want to skip a block, we just start reading tokens until we find an ENDIF or an ELSE and just discard them. Now, the way we handle the ELSE is in the normal case, that is, when you have IF 1, then the assembly of this branch happens perfectly normally and we hit the ELSE as a normal instruction. When we execute the ELSE, we start skipping tokens until we find the ENDIF.
If it's a 0, then the IF will skip the tokens of the true branch until it hits an ELSE, consume the ELSE, and then continue assembling perfectly normally from the false branch. And the ENDIF is then ignored. It's simple but kind of unintuitive. So we read in an expression; if the value of the expression is non-zero, we don't do anything, we just continue assembling as normal. If it is false, then we need to start skipping tokens until we hit the ELSE or the ENDIF. And what we do is we set a flag to stop the listing happening. We haven't done the listing yet. We read a token. We can't hit an end of file inside an IF, that's a syntax error. We want to keep going until we hit an identifier which is an ENDIF or an ELSE. Yes, this is slightly more interesting. We have to put that here because we haven't defined endif CB or else CB yet. So when we come out of the loop, we have just consumed the ENDIF or the ELSE. So we now need to consume the new line and turn listing back on. Note that this way of doing ELSE and ENDIF does not support nested conditionals. You have to be rather cleverer than that. But the original assembler didn't support nested conditionals, so I don't see why I should. Oh yeah, ENDIF. When ENDIF is executed as a normal instruction, it does nothing. Just thinking about this. Expect expression does not consume the terminator. Does it not? Expect expression returns the terminating token. So it will return the new line and consume it. So we don't need to consume the new line after the expression in IF. ELSE is extremely similar to the consume branch of IF, but of course it will only be executed when the true branch of IF is being followed. In the false branch, it will never be seen by the actual assembler because this code here will have consumed it. And we know that an IF ... ELSE can only be terminated with an ENDIF. Just copy the comment I put in. That does indeed work. Well, when I say work, I mean compile. Let us do...
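
The skip-until-ELSE-or-ENDIF trick is easier to see in a toy model. The Python sketch below works on whole lines rather than tokens and only accepts a literal number after IF, both of which are simplifications of my own; like the scheme described above, it deliberately does not handle nesting.

```python
def assemble_conditionals(lines):
    """Return the lines that would actually be assembled."""
    assembled = []
    position = 0

    def first_word(line):
        parts = line.split()
        return parts[0].upper() if parts else ""

    def skip_until(*stoppers):
        # Discard lines until one of the stoppers has been consumed.
        nonlocal position
        while position < len(lines):
            word = first_word(lines[position])
            position += 1
            if word in stoppers:
                return word
        raise SyntaxError("end of file inside IF")

    while position < len(lines):
        line = lines[position]
        position += 1
        word = first_word(line)
        if word == "IF":
            # Non-zero: carry on as normal. Zero: skip the true branch.
            if int(line.split()[1], 0) == 0:
                skip_until("ELSE", "ENDIF")
        elif word == "ELSE":
            # Only reached after assembling a true branch: skip the rest.
            skip_until("ENDIF")
        elif word == "ENDIF":
            pass  # executed as a normal instruction, it does nothing
        else:
            assembled.append(line)
    return assembled


source = ["IF 0", "mvi a,1", "ELSE", "mvi a,2", "ENDIF", "ret"]
print(assemble_conditionals(source))  # ['mvi a,2', 'ret']
```

Neither branch ever needs to know which case it is in: whichever directive gets executed simply eats everything up to its partner.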
Oh yeah, this one. We seem to be doing the hard ones first. The DB callback is the instruction that does bytes. And let me find one. Normally it takes a list of expressions which are bytes to insert into the output stream. It may also take strings, in which case the bytes in the string are emitted. I remember I talked about that last time, well, earlier in this video when we handled the relevant hack bit in the expression parser. And this is the code that is actually going to use that. So just copying the original code. Okay, so read expression will return token string if it's just read a string and the string constant hack is set. And I realize now that this actually introduces an ambiguity because if you have code like that, read expression will read the one and the terminating token is going to be a string. So it'll return token string. So in fact, that is actually going to confuse the DB code if you did this in real life, which in fact is not uncommon because all you need to do is skip a comma. Then it will lose the number. So we're actually going to add a new token type. It's in the token reading code. Right, this is the expression code. If DB string hack, blah, blah, blah, token equals token string hack. Right, wait a minute. Yeah, no, that is correct. So if we get string hack, we now need to emit the contents of the buffer. Let me just double check to make sure that I'm remembering to terminate the buffer, which I think I am. Numbers, identifiers, strings. Yes, I am terminating the buffer. And now we do not want to emit the actual terminator. So that can be after a string. Why do I do that? All we really want to do at this point is to, we do want to read the next token because when read expression returns string hack it has not read the terminating token. We need to do that here. If it's not string hack, then read expression has read an expression and returned the terminator. So in fact we want to do that. If the terminator is a comma, wait a minute, right. 
If the terminator is not a comma, stop. Otherwise that comma has been read or consumed either by read expression for a number or by this call to read token for a string, and then we loop back to the top again to read the next expression. So we end up here after we've read all the valid expressions and commas. And T is set to the terminator of the last expression. So that must be a new line. Otherwise it's a bad separator error. Value, that should be token number. And now we have DS and DW. DS is very simple. It reserves space in the output file. It does not actually add bytes. Let me rephrase that. It reserves a block of memory which is usually uninitialized. If it appears inside code then it will be zeros. So to do that we just read an expression and advance the program counter. DW is the simpler... Oh yeah, I need to turn the string constant hack back off again at the end. DW is the simpler version of this code. DW doesn't allow strings. So all we need is we read the expression, emit the value, and then we steal this bit of code to terminate. Right, 8,001 bytes. So that should be most of the complicated ones. We now actually start with the callbacks that produce actual instructions. The rp callback is the first one. This is an instruction that takes a register pair, like DAD, double add. It takes either BC, DE, HL, SP, etc. The reason why this table is saying B, D, H is because traditionally in 8080 machine code the register pairs are referred to by the high register of the pair. So BC is just B. This works because the encoding used by register pair BC is the same as register B. Yeah, go figure. So that is actually a zero. So the numbering is 0, 1, 2, 3, 4, 5, 6, 7. It's the low three bits of the opcode. So the high register of the pair always has the bottom bit set to zero. So all we do is we OR in the register pair to the instruction. And INX over here is exactly the same except for the bottom bit set to... Wait a minute, wait a minute. Nope, that's wrong. 
I'm getting my columns and my rows mixed up. The register number is ORed into the high nibble of the instruction. And still the... Oh wait a minute. Oh yeah, it's shifted by three. So in fact the bottom bit of the register encoding becomes the top bit of the bottom nibble of the byte. Yeah, shift by three anyway. That sounds very easy to do. So you just do current insn dot value. That needs to be a uint8. OR that with... token number... as uint8, AND six, shift by three. Right, rp callback. And yeah, the reason for the two casts is to generate nice efficient code. So this entire expression becomes this. So load current insn, which is a pointer, advance it twice, which takes us to the offset of value. Because we are reading a byte, we don't have to read the entire 16-bit value. The compiler is clever enough to do that optimization. So we just read the single byte. M, by the way, refers to the value at the address in HL. Remember that these H's refer to HL because it's a register pair. INX means increment register pair, as opposed to INR, which increments an 8-bit register. Yeah, 8080 mnemonics are interesting. So that puts the value in B. We then read token number into A. AND by six, shift left by three. Push it onto the stack. Why is it doing that? So the reason why it's pushed it onto the stack is because we're about to put B into A. But it should have actually tried to save that into a register because we've got all the other registers. You could easily put that in D. Interesting. It won't make a difference to the code size because the push and the MOV are all one byte, but it'll make it a little bit faster. And in fact, there is an optimization you could make here. It's doing the OR in the wrong order. So instead of... What it's trying to do is it's ORing A with B. If it knew how to OR B with A into A, then it wouldn't need the push and pop and MOV. It could eliminate these three instructions completely. 
But the register allocator's not smart enough for that yet. And then call emit 8. Now we've done ORG. ALU source. This is pretty straightforward. ALU source is an instruction that operates on... which reads a value out of a register. That's basically one of this big block. All these instructions do something to A with a register. And the register number goes in the bottom nibble of the opcode. So that is straightforward. Read an expression from the stream, current insn value as uint8, OR with token number as uint8. Likewise, ALU dest is an instruction which writes to a register, which is one of these. So we've got POP B, BC. No wait, that's a register pair. I actually remember from last time there's only a few of these. Oh, yeah. DCR and INR. Yeah, that's these. This column. And DCR over here. And likewise are these. That is extremely similar except we shift the number. Done RP. Right, simple one byte. These are instructions that take a single parameter as a byte. Examples are ACI, SBI, XRI, CPI, etc. Add with carry, subtract with borrow, XOR, all working on immediate values. So these are really, really simple. We emit the opcode. We emit the byte. Simple two byte operations are precisely the same except the... Wait a minute. Okay, I got myself confused. Simple one byte instructions are instructions where there is no parameter. Not an implicit register, nothing. All you do is just emit the opcode byte. Examples are RET, NOP, RNZ, etc. Simple two byte are the ones that take an opcode and a byte parameter. And simple three bytes are exactly the same except the parameter is 16 bits. Okay, LXI is the instruction that reads a 16-bit constant into a register pair. And this is a combination of simple 3B, which takes a 16-bit parameter, and the register pair. So we will actually just steal that. Wait a minute. I implemented that. Okay. MVI is a move immediate. It's the same as LXI where there's a register encoded into the opcode. Here they are. But there is a parameter byte. 
So you steal the LXI code because that deals with the separator. You don't want to do the shift, but otherwise everything there is the same. And the token value is a byte. And the MOV callback is the instruction that moves one register into another register. Both registers are encoded into the opcode. That's this massive block of registers here. The one exception is HLT, which occupies the encoding of MOV M,M. As reading from HL and writing back to HL makes no sense, this is actually used for something else. We're not going to bother to check for that because, you know, life's too short. So check to make sure it's a valid separator. Destination register is token number as a byte. Read the second expression. Source register is token number as a byte, and emit. The END callback is the instruction that terminates compilation. And in fact, this will never be called. So okay, that's done. Let's see what happens when we actually run it. So we take a look at our assembly. So 31 00 80 is this instruction. CD 00 00 is this instruction. This is the first pass. So it doesn't know the value of any symbols that are defined in the future. So that is coming out as zero, which is here. RST 0 is C7. Let me just double check that. RST 0 is here. C7, that's correct. We then define a label for MULT2. MOV B,H is 44. MOV B,H, 44. Yup, that's correct. MOV C,L, 4D. LXI H,0 is 21 00. Label. MOV A,B, 78. A,B. A,B, 78. That's correct. ORA C is B1. Where is ORA? Oh, it's an ALU op. So it's here. ORA B is B0. No. Oh yeah, sorry. ORA C, not A. Okay, B1 is correct. Expected expression for RZ. That's simple one byte. Okay, I know what I did there. Don't want to read an expression in. Okay, we get quite a long way. Line 314. Right, that's because... Yeah, that's that LD that should be the MOV. So actually I should be able to fix that by just doing make to rebuild the relevant bits of the compiler. And do that again. 
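The opcode arithmetic walked through above can be checked with a small Python sketch. The register numbering and base opcodes are the standard 8080 ones. The helper names (`rp_op`, `alu_src`, `alu_dest`, `mov`) and the single `REGS` table are illustrative assumptions, not the identifiers used in the session.

```python
# Register numbers as they might appear in the symbol table. A register pair
# is named by its high register, so B also means BC, D means DE, H means HL.
# SP and PSW share the value 6 (the slot M occupies for 8-bit operands).
REGS = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "M": 6, "A": 7,
        "SP": 6, "PSW": 6}

def rp_op(base, pair):
    """Register-pair instructions (DAD, INX, LXI, PUSH, POP).

    Masking with 6 drops the bottom bit, then the number is shifted up by 3.
    """
    return base | ((REGS[pair] & 6) << 3)

def alu_src(base, reg):
    """Instructions that read a register (ADD, ORA, ...): low three bits."""
    return base | REGS[reg]

def alu_dest(base, reg):
    """Instructions that write a register (INR, DCR, MVI): shifted by 3."""
    return base | (REGS[reg] << 3)

def mov(dest, src):
    """MOV dest,src encodes both registers into the opcode byte."""
    return 0x40 | (REGS[dest] << 3) | REGS[src]

# Base opcodes, i.e. the encoding with register number 0.
DAD, LXI, PUSH, ORA, MVI = 0x09, 0x01, 0xC5, 0xB0, 0x06
```

With these definitions `mov("B", "H")` gives 0x44 and `alu_src(ORA, "C")` gives 0xB1, matching the bytes read off the listing, and `mov("M", "M")` lands on 0x76, the HLT encoding.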
Plus, we also need to trigger a recompile of the assembler. 332. I bet this is the same. The Z80 uses OR. The 8080 uses ORA. So that should be here. Search for... And I believe XOR should be XRA. Yes, it should. 1067, INC A. Yup, that should be INR. This is IN. The assembler I normally use, which is ZMAC for this, is considerably more forgiving. It understands both Z80 and 8080 opcodes. Which is harder than it looks because they actually overlap. You can't support both at the same time. That's in a chunk of inline assembly in the CPM file handling code. I have to do a make clean for that because there's a bug in the build scripts and this file is not in the dependency tree, which I really should fix. So we just wait for all 2,703 build artifacts to be made. You can actually see it running ZMAC there. Sometimes. And it finishes. With 39K free. Okay, right, that bit's done then. It is now successfully reading, parsing and assembling this entire assembly file. Well, I assume successfully. I don't know if it's actually producing the right result, but we can always hope. But it's not actually doing anything with the result yet. It's just like spewing bytes out onto the screen. That's because I haven't done the emitter code. But the emitter code is not hard. So the assembler is a two-pass assembler. What it does is during the first pass, it discards all output because the point of the first pass is to define any labels. And during assembly in the first pass, it figures out where all the labels should go because of the size of machine code. Only during the second pass does it actually start writing stuff out. And the old code had some nasty CPM stuff in it, but the new code is going to be much simpler. So if we're on the second pass, then simply write the byte to the output file. Easy. Likewise, emit 16. Then we are going to write, the 8080 is little endian, the low byte, and then the high byte. Okay, now let's assemble our thing and see what happens. 
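For readers following along, here is a minimal Python sketch of the two-pass emitter idea: pass 1 only sizes the code so that labels get values, pass 2 writes bytes. It is an illustration, not the session's code. The `Assembler` class, the tuple-based program format and the `assemble` driver are assumptions made for the example. Note that the program counter advances in both passes, a detail the session only gets right later after some debugging.

```python
class Assembler:
    def __init__(self):
        self.pass_no = 1
        self.pc = 0
        self.output = bytearray()

    def emit8(self, value):
        # Bytes are only written in pass 2, but the program counter must
        # advance in both passes or the labels come out wrong.
        if self.pass_no == 2:
            self.output.append(value & 0xFF)
        self.pc = (self.pc + 1) & 0xFFFF

    def emit16(self, value):
        # The 8080 is little endian: low byte first, then high byte.
        self.emit8(value)
        self.emit8(value >> 8)


def assemble(program, origin=0x100):
    """Assemble a toy program given as (kind, argument) tuples."""
    asm = Assembler()
    labels = {}
    for pass_no in (1, 2):
        asm.pass_no = pass_no
        asm.pc = origin  # reset at the start of every pass
        for kind, arg in program:
            if kind == "label":
                labels[arg] = asm.pc
            elif kind == "byte":
                asm.emit8(arg)
            elif kind == "call":
                asm.emit8(0xCD)
                # A forward reference reads as 0 during pass 1.
                asm.emit16(labels.get(arg, 0))
    return bytes(asm.output), labels
```

Assembling a CALL to a label defined further down shows the effect: pass 1 sees the label as zero but still counts three bytes for the instruction, so pass 2 can fill in the real address.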
That seems to be taking rather longer than I expected. Interesting. But that will have produced a out test.bin. Test? Why do I call it test? So there is the output binary. And there is the binary that the ZMAC is producing, the one I'm actually running, and they are not the same. Fabulous. That means that something is wrong. I mean, it looks like machine code. You see here are the text strings that are down here. It stops at 1F00. What's in this one? Oh, I know what's happening. Yeah, that's really stupid. I'm forgetting to close the file. Because it's not closing the file... Oh, yeah. It is not flushing the last block to disk, which is why the output is truncated. Yeah, I am surprised that's taking so long. I mean, it's not a very good CPM emulator, but it should be faster than that. So test.bin is 8192 bytes. Cameras and files, 8111 bytes. That's actually normal. Because the CPM file system doesn't record the exact size of files, it only records the number of 128 byte blocks. We expect it to have padding up to a round number. So let's just see. C9 is the last byte at 1FAE. What have we got here? 1FAE does indeed have a C9 as the last byte. Well, there is only one thing to do now, which is let us try assembling the file that we've just assembled with the assembler that we just assembled. And that looks like it hasn't worked. Right. I know what's happening. I know what's happening. And yeah, I completely forgot about this. So what is happening is program counter here, as I put emit 8, right? So emit here doesn't just write a byte to the output file. What it needs to do is to advance the position in the output file to cope with the current program counter, because you're allowed to reset the program counter. You use the ORG command for this. If I were to do this, then this would emit three, six, seven bytes of code and then padding zeros up to 200. And then the rest of the code would follow. And we haven't done any of that. 
I'm actually slightly surprised that that could be causing a problem. I mean, it's something we do need to fix, because the assembler produced by Cowgol is actually pretty basic. There are no IFs. There are no ORGs apart from this one at the top. So I'm not sure why that would be causing a problem. Also, I just spotted this DSEG at the bottom here, which is a special instruction that we should not be emitting. That only makes sense for different types of assemblers. It determines whether it's data segment or code segment. But this assembler doesn't distinguish. So let's actually remove that. That is emitted by the linker. This is the architecture specific bit of the linker. So this is the code that emits the top of the assembler files. We've got our two instructions of boilerplate. We then call all the main subroutines in order, and there is normally one. Then we do a RST 0 to exit. Here is the footer code. Let's just take out that DSEG. But we do need to make the program counter stuff work. So I'll just wait for that to build. This is output listing stuff. Here we go, emit 8. Yes, that's annoying. OK, let's change this. So the first thing is emit 16 actually needs to work by calling emit 8, because emit 8 is going to have the actual logic in it. I mean, that's not difficult. We need a flag byte that goes in pass. So there are two states the assembler can be working in. Either the program counter is known and it is emitting bytes to the file. In this case, when you do an ORG, you need to add padding bytes. The other case is when the program counter is not known and it hasn't started emitting anything to the file. This is the case that happens at the top of the file because you can have instructions that don't... Let me start that again. So the first ORG statement cannot add padding. All it does is it sets the program counter for the next instruction that actually does emit bytes. 
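The two states described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the session's code: the `Output` class and the `pc_set` flag name are invented here, pass handling is left out to keep the focus on ORG, and the padding is done inside `org`, which is where the session ends up putting it.

```python
class Output:
    def __init__(self):
        self.pc = 0
        self.pc_set = False  # has the program counter been pinned down yet?
        self.data = bytearray()

    def emit8(self, value):
        self.data.append(value & 0xFF)
        self.pc = (self.pc + 1) & 0xFFFF
        self.pc_set = True  # once bytes exist, a later ORG has to pad

    def org(self, new_pc):
        if not self.pc_set:
            # First ORG: nothing to pad, just set the program counter for
            # the next instruction that actually emits bytes.
            self.pc = new_pc
            self.pc_set = True
            return
        if new_pc < self.pc:
            raise ValueError("program counter moved backwards")
        # Later ORG: fill the gap with zeros so that file offsets stay in
        # step with addresses.
        while self.pc != new_pc:
            self.emit8(0)
```

Python integers side-step the signed 16-bit comparison problem raised later in the session. In a language with 16-bit signed values the backwards check would need more care.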
So we have to track whether the program counter has actually been set or not, which is what that byte does. Ah, right, that's what I'm doing wrong. I'm not actually advancing the program counter whenever we emit a byte. I mean, we still need to do all the ORG stuff anyway, but this is why the program is crashing. There, easy. I'd forgotten that, yeah. In order to emit padding, we do also need to know the old value of the program counter so we know where it was. Okay, so we're trying to emit a byte. If the program counter has not been set, then don't emit any padding. Mark the program counter as set. Initialize the old program counter. Otherwise, emit padding bytes until we've emitted enough padding to make up the difference. So while old program counter is not equal to program counter, loop, like so. Thinking about it, it would actually be much easier to simply emit the padding in ORG because then we wouldn't need old program counter. Let's do that. Do I need program counter set? I don't think. Yes, I do, but only ORG needs to care. So I believe that all this stuff goes away and we're back to this code. Now, if we're in pass 2, output the byte, increment program counter. So we do need program counter set. Yep, I did leave that in. Let's simplify. Wait a minute. We can do better than that because we know that the new program counter is token number. Delta is... Okay, I was going to set the delta here and compare it, but that's actually kind of hairy due to signing issues. So we're dealing with 16-bit integers here. You've got to be really careful when doing magnitude comparisons of signed values. Delta is not zero. Emit a byte. Okay. So we use our new assembler to compile the assembler. Program counter moved backwards. That's because the... We just hit the beginning of pass 2 and we hadn't reset program counter, which starts at zero. Pause, pause. Label already defined. It's actually this label. 
It's spotted that this is the second time through, but it thinks the value has changed. Interesting. I'll just add some more useful tracing. Okay. That's one. That's two. 103 versus 107. Hmm. So zero, three, six, seven. So the old value that is set was 103, which is wrong. It should be 107. So the places where we can change the program counter are ORG and emit. Well, we're incrementing the program counter there, which means we're incrementing that. Yeah. Okay. Yeah. That's kind of dumb. So emit 16 was only incrementing the program counter during the second pass, because it was calling emit 8 inside the conditional. We want to do that. Right. Assemble. Pass one. Pass two. Done. Let's just actually compare our binaries. Byte 36. Great. So what's different? It's looking pretty similar. Byte 36 is, that showed that in decimal, which is 24 hex. So that's, yeah, there's a 3E there and a 39 here. Why is that changed? See, are there any values I recognize? Here's a double zero, which is probably this one. Yeah. 21 is load H, 310. We're here. Gone too far. Jump. Jump is C3. So where is there a C3 here? C3 0C 01. That's that jump instruction. So I had 3E 00, which is MVI A,0. That is coming out the wrong value. That means my MVI is incorrect. That's because I cut and pasted the wrong parameter. Okay. Let's try that again. Assemble. I was expecting this to be faster. Compare. Byte 341, line 1. Fantastic. Hoping that reached the end. Okay. What have we got? It's a C3, which is a jump. Wait a minute. 341 decimal. Right. C9, 7A. Now C9, if I remember correctly, is a RET. C9. Yes, it is. I am looking at the new files. 7A is wrong. Oh, why is that a C0? RNZ seems to have emitted a RET instead of an RNZ. And then compare is showing me one-based. Yeah. One-based offset. So I actually want to go to 340. Yeah. C9. 
Well, RZ is simple one B callback, should be returning this value here, which is C... That's not right either. C0. Oh yeah. Yeah. RNZ. Getting my RNZ and my RZ mixed up. So why is that returning C9? I mean, nothing much can go wrong there. There is only one RNZ in the symbol table. Yeah. It's probably this one, but it's not, because the previous byte should be zero. So there's C0, RNZ. So what's BD? BD, compare L. There is only one RNZ in the source file. Oh, it's another Z80-ism. Oh, yeah. Okay. I need to write... I can't just change the generated code. I have to make sure that this correctly errors out. Right. So what's happening is on the Z80, the way you do the conditional returns is you stick an operator on the end of RET. Tell it under what conditions you want it to return. So that is the same as the 8080 RNZ. But clearly what I'm doing is I'm reading the RET and emitting the return, the unconditional return for the RET, but then ignoring the NZ rather than producing a general syntax error. So we're now going to fix that. And I know why it hasn't produced an error. This could be interesting. So what it's done is it's read the instruction and done the thing in the callback, which is here. And then we returned to the end of the loop and went back to the top. It now thinks it's at the beginning of a line again. So it's read the NZ and is treating it as a label. So in fact, here, I need to expect a new line, which could also be an exclamation mark, except that's not going to work because reasons. Right. Because the expression parser here has consumed the new line. Yes. So we can't consume the new line there. But we do want to consume it here because this is a label on its own. No, we don't, because the new line could have been read. If it's an implicit label, then where are we? We read the token label here. We decide it's a label. We then read the next token to see if it's a colon. If it is, we read the next token. 
So at this point, T is set to the token after the label, which, in the case of a label on its own, is a new line. So we have consumed the new line. So we're going to have to consume the new line in instructions which do not take an expression as a parameter, because expect expression is going to consume the new line at the end of the instruction, which is going to be all the instructions except simple one B, where we need to expect the new line. And now we get a syntax error at line 316, the way the gods intended. Okay, now I need to go and fix that. How many of these things are there? Actually, let's just search for RET space. Not that one, not that one, not that one. Just this one by the looks of it. Good. Okay, that one's in the dependency graph. So I can just do that. Assemble. Pass one and wait. Pass two and wait. Done. Compare. 354, line 1. Right. It's an F5. And F5 is a PUSH PSW. Okay, is PSW in the symbol table? Yes, it is. And it's a six, which is... so push and pop take register pairs. So that should be zero, two, four, six. What is the incorrect value? C5. Interesting. This suggests that it has not correctly looked up the value of PSW. PSW is a magic register value, which refers to the AF register pair. It's got the accumulator and the processor status word, hence PSW. We use this to push A. So this looks like it's actually picked up a zero instead. Or push is... push and pop are both rp callbacks. So this could be a problem with the rp callback. So let's just take a look at and is six or with three. This was the one I looked at earlier. That looks right. Throwing my mouse across the desk. Okay, it's more of those than I thought. Just need to let this run because I've forgotten what... Oh, that's interesting. Notice it gets slower as it goes on. 354, which is 161. 161. It's found a four. Four. Four. But we know that PSW is six. C5 is push B. But if it got a four, how is it getting to C5? 
So we are shifting it left three, which is turning four into zero, zero, one, zero, two, two, zero in the top half. So that should be producing... This is 001020. That should be producing E5 push H. But it's not. It's producing C5. And I... Oh, yeah. And now I've changed the code so everything's moved. I'm going to have to let that run. Yeah. The way that's slowing down makes me think that I have a nasty exponential algorithm somewhere. Probably the symbol table lookup. But that's a hash table. That should be quick. Relatively quick. But most of my labels start with the same character. So what I bet that's doing is only 43. Now it'll be the C's. Every temporary... Every temporary label starts with a C. So, yeah, that will be it. Okay. That's going to need a slightly different hashing algorithm. But that should be easily doable. So what's that comparison? Byte 2? Oh. Byte 2? That's this. 24F9. 2244. This suggests that my... That this version, kaizen.sim and this version are out of sync. So I forced kaizen.sim to be reassembled. They're still different. That's extremely suspicious. Can I tell compare to show all differences? Oh, that's useful. There's quite a lot of differences. Differences all over the place. So 5 and 6 are different as well. But 5 and 6 are where main is. So we expect that to be different. 24. What have we got? Here is our LXI for H. So expect this to be down here. It's probably one of these. What's a dad D? It's a 19. It's there. Right. So the two bytes it's found that are different are mol no carry, which is here. But that's just... That's right there. So this is 18. This is 19. This is 1A. So D21A is pointing at the right place. Okay. Now let's take a look at the other one. D21B. That's the wrong place. And here is our dad D. And then there's another dad D. And then there's EB. Which is... Exchange. Right. It's repeated that byte. But it only did that. Because I cut and pasted that. 
So I don't actually know whether I make this many stupid mistakes when I'm coding on my own rather than on video. I suspect I just notice them more when I'm on video. I suspect when I'm working on my own I just move past it and it's done. I don't remember it. Okay. So this comparison has found two different bytes. This is good. This is what I like. Which are where we expect them to be. So... Got 161. 4, E5. Which is the PUSH H we expected to see. But in the file... The E5 has turned into a C5. Why has it done that? Now... This code is the same as this code. I just cut and pasted it. Emit 8 and print hex i8 both take bytes. So... They should be the same. Let's try that. That's kind of evil. But let's see what happens. So that has seen an E5. Twice. I mean... I don't think that's the right value. Ooh. Ha ha. Okay. That's dumb. That's really dumb. Program counters are not the same as file offsets. So actually let's just back out this stuff. So... This is file offset 161. CPM binaries start at 100. So this is program counter 261. So let's run this again. And 261 is a... 0, C5. Okay. Right. I mean it's wrong, but at least it makes sense. I'm also very interested by the fact that it's got a... No, no, never mind. It has actually read a zero as the parameter. So... This means that it has somehow failed to read the correct symbol. Hmm. If it is referencing a symbol that has not been set for some reason, then that will be initialized to zero. So why would it not be finding this PSW? Where's the lexer? So accumulate byte is converting it to uppercase. We are terminating it. We're looking it up. So that was program counter 261. It has looked up two symbols at program counter 261. 260 actually, because we haven't actually emitted the byte. No, it is actually 261. So this one will be the... The instruction. And this one will be the PSW. And I should be able to check that by doing this. 261. Z80-ism, another one. The Z80 refers to AF, not PSW. And the uninitialized symbols show up as zero. 
So I do not believe the original assembler did that. I mean, I do not believe the original assembler checked for uninitialized symbols. But I think that I am going to have to, just to maintain my sanity. And the easiest way to do that is to walk through the symbol table after pass 1 and produce an error if we see any symbols with the undef... of type undefined label. So we're actually going to have to check each chain individually in the hash table. Do like this. It would be nice if... Is that going to actually do the right thing? Or is it going to hang forever because I've forgotten something? It's going to hang forever because I've forgotten to do this. To move to the next item in the bucket. Right, undefined symbol AF. It would be nice to have the line number. But we don't record the line number when we define a symbol. We could. It would take up another two bytes in the symbol table. Now we're compiling this honestly not very big program. We've actually gone from 54 to 38K. So I mean there's enough space that we could put the line number in. But I think I don't want to... Simply that... I'm just going to do that to suppress the line number at the end because that's misleading. I don't want to do that simply because finding the symbol is easy enough with a search. And I don't want to use up the space. There's another one. Need to build. Save. Is that enough to force a recompile? Let's rerun our assembler. Wait for it to finish. Do the comparison. Right. It is now producing the same result. So I should just be able to test.bin just to prove that it works. Let me just change that to a more sensible default file name. Okay. We now have a functioning assembler. I mean it works. It is quote finished, unquote. There are a few things I need to do. One of which is to change the hashing function for the string. And we're just going to sum all the bytes in the string because it's easy. I think this is the cheapest way to do this. 
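The symbol table changes discussed here, walking every chain to report undefined symbols and hashing by summing the bytes of the name, can be sketched in Python. Everything named below (`SymbolTable`, `hash_sum`, `hash_first`) is hypothetical. In particular `hash_first` is only a stand-in for a hash dominated by the leading character, which is what the slowdown with labels that all start with the same letter suggests. The 32-bucket mask matches the `0x1F` mentioned in the session.

```python
NUM_BUCKETS = 32

def hash_sum(name):
    # Sum every byte of the name, then mask down to a bucket index.
    return sum(name.encode("ascii")) & 0x1F

def hash_first(name):
    # Stand-in for a weak hash: only the first character matters.
    return ord(name[0]) & 0x1F

class SymbolTable:
    def __init__(self, hash_fn=hash_sum):
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def lookup(self, name):
        """Find a symbol, creating it as undefined with value 0 if missing."""
        chain = self.buckets[self.hash_fn(name)]
        for sym in chain:
            if sym["name"] == name:
                return sym
        sym = {"name": name, "type": "undefined", "value": 0}
        chain.append(sym)
        return sym

    def define(self, name, value):
        sym = self.lookup(name)
        sym["type"] = "label"
        sym["value"] = value

    def undefined_symbols(self):
        # Walk every chain in every bucket, as would be done after pass 1.
        return [sym["name"] for chain in self.buckets for sym in chain
                if sym["type"] == "undefined"]

    def chain_lengths(self):
        # One count per bucket, handy for spotting an unbalanced table.
        return [len(chain) for chain in self.buckets]
```

Defining a run of labels such as `c1`, `c2`, ... and comparing `chain_lengths()` for the two hash functions shows why the first-character version degrades into one long list. Summing bytes is cheap, although names that are anagrams of each other will still collide.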
There is an entire science of cheaply and efficiently hashing strings, which I'm not going to even think about using slot in URLs. Okay. I can just do hash AND 0x1F. So hopefully... Types cannot be inferred from numeric constant. Quite correct. They can't be. Slot not found. And if we assemble this, with luck, it should be faster. That's not faster. That's definitely not faster. Interesting. Maybe... so this will just print the number of items in each chain of the hash table so that you know whether it's unbalanced. That's because I'm running the wrong version. That's faster. That's a lot faster. And here's our hash table, which is relatively balanced. Good. Yep. Okay. That is definitely an improvement. So there is, I believe, only one more thing we need to do. One more thing I need to do, which is to emit the listing, which is surprisingly useful when debugging things, because it tells you the exact value of every label, byte, et cetera. So what we're going to do... I am going to drastically simplify the listing format because one of the problems I had with the old assembler is trying to duplicate something even slightly like the listing format of the original was just awful. So I'm not going to. Instead, we are going to emit one line per instruction, which is fairly straightforward because you can do that here. So we want to have a buffer, which is going to be 80 characters wide, one standard line in the terminal. We only emit the listing in pass two. We need to track the number of bytes we've listed, which gets reset to zero at the beginning of each instruction. Whenever we emit a byte... I'm just thinking of field sizes. I need a comment for this. So program counter is... no, actually. Line number, program counter. Then we're going to have emitted bytes. I'm just wondering how many you can get away with. Let's say eight as a nice round number, followed by the instruction text for one instruction. So we start at, this is on column 15. 
We started on three, so we want 12 bytes. So if there are fewer than eight listing bytes, then we want to insert text into the... We want to insert the byte we're emitting in the right place into the buffer. So we're going to use UIToA for this. Let me just double check my stupidly expensive UIToA, because it does everything in 32-bit arithmetic. But we have it pulled in anyway because we're using print with numbers. And the pointer to the buffer is going to be 12 plus listing bytes times three. And this returns a pointer to the end of the string, which will be zero-terminated. We don't want it to be zero-terminated. We want it to be a space. So this should emit bytes. Now we want to save a bit of time. We only do this in pass two. We want to set the content of the listing buffer, or at least this much of the listing buffer, to all spaces. So we start at column 39. 39 minus three is 36, probably. We have memset in here. Oh, yeah, that's defined here. Yeah, the CPM version just uses a really simple one written in Cowgol, but you can actually... The back ends are allowed to define their own. We want to insert the line number. So again, UIToA. Line number, in base ten. Starting at listing buffer zero. So we could... This will not left pad the hex number. You have to do that yourself. Yeah, this is supposed to be a simple routine for doing... Well, simple numbers. Now I could change the routine to do padding, but it's in the standard library and I'd rather not. I mean, it's pretty terrible, that is. I should probably shift this out into the back end runtime so that it can be a nice small machine code implementation. And I should certainly fix this. This actually does the division twice. There are some helpers to... Basically that implement divmod. You only have to implement it once. You have to do the division once and it returns both the result and the remainder. 
Well, let's use UIToA for the line number, because that actually needs to be in decimal, and replace the terminator with a space. But let's just do our own hex conversion, which is fairly straightforward for two bytes. Just trying to think. There are two ways to do this. One is to use a lookup table of hex digits and the other is to actually do arithmetic. And I'm not sure what would be smaller, but let's do it the right way. So take the bottom nibble... is that right? Yes. No, I think that's right. I'll try and see. So that will do the bottom nibble, which actually needs to go into here. We need to start at the top. And we're also going to want a hex8, which I'm going to implement in the traditional way. So this code then turns into listing buffer 12 plus listing bytes plus 3. And we don't need to fiddle with the terminator. "Unexpected if". We keep doing that. Semicolon, missing parenthesis. Expression was a uint16, yeah. Listing buffer not found. I'm also going to do this just to terminate it. And then at the end of the loop, if pass equals 2, then... this is going to be strictly temporary... I should print listing buffer 0. What happens when we run our assembler? So we've got a line number. We've got our bytes in the wrong place, but we've got LXI. That's interesting. There should be three bytes here. But on line 5 we've got a C7, which is RST 0. Okay, well, the line numbers aren't right, I have to say. It's also only emitting... what's this? That looks like a string, yeah. So here is 1788. Here is the C9, RET. Then we have our string here, that's the first chunk of it. Two, three, four. That's nothing even like the right number of characters. Because that needs to be times three, a multiplication operator. That's better. This is looking much more like what we expect. So LXI SP, top plus 128, the top of memory. That's the top of the stack. The stack occupies the first 128 bytes of the program's memory. Call main, RST 0, empty line, that looks right.
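The two hex-conversion approaches mentioned above, a lookup table and plain arithmetic, can be compared in a small C sketch. The function names here (`hex_digit_table`, `hex_digit_arith`, `hex8`, `hex16`) are chosen for illustration and are not taken from the assembler's source. As in the listing code, `hex8` writes exactly two characters with the top nibble first and no terminator, so nothing needs to be patched afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Approach 1: index into a table of digits. Costs 16 bytes of data. */
static char hex_digit_table(uint8_t nibble)
{
    assert(nibble < 16);
    return "0123456789ABCDEF"[nibble];
}

/* Approach 2: arithmetic. No table, but needs a compare and a branch. */
static char hex_digit_arith(uint8_t nibble)
{
    assert(nibble < 16);
    return (char)(nibble < 10 ? '0' + nibble : 'A' + (nibble - 10));
}

/* Writes exactly two hex digits at dest, top nibble first, no terminator. */
static void hex8(uint8_t value, char *dest)
{
    dest[0] = hex_digit_arith((uint8_t)(value >> 4));
    dest[1] = hex_digit_arith((uint8_t)(value & 0x0F));
}

/* Writes exactly four hex digits at dest, high byte first, no terminator. */
static void hex16(uint16_t value, char *dest)
{
    hex8((uint8_t)(value >> 8), dest);
    hex8((uint8_t)(value & 0xFF), dest + 2);
}
```

With the corrected offset from the transcript, byte number `n` of an instruction would be written with something like `hex8(b, &listing_buffer[12 + n * 3])`, leaving one space between byte columns.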
One nice thing about the 8080, one of the only nice things about the 8080, is that most simple instructions are a single byte. It's not like the Z80 with extension prefixes everywhere. You can tell from just looking at the machine code basically how big it's going to be. So some operations, like this shift right, look long-winded, because they are long-winded, really. But it's only a small handful of bytes long. And given that all branches are three bytes, it is actually worth inlining things. Okay, and here's our string. One, two, three, four, five, six, eight, yeah. So we need to insert the program counter. I'm just thinking of ORG, because ORG changes the program counter. I think for simplicity, we're going to insert the program counter at the beginning of the instruction. So we're going to put that in here. Let's say hex16 of the program counter, and that will go here. We'll actually also want to insert a colon. So four, five, six, colon there. And we want to put the program counter at P plus six. And then we want another colon, at 10. That wasn't quite what I had in mind. Right, P is the wrong value. P is actually the address of the end of the string written by UIToA. I actually want to do this. Well, I actually want to do that. But here I want to do listing buffer four, and the address of listing buffer six. The other advantage is that, as the compiler knows statically the address of the listing buffer, no pointer indirection happens here. That's just a simple write to memory. So that is going to be five bytes. Load colon into A, store A into memory. And what's this done? That's looking almost plausible. It's a bit slow. It has actually produced all the stuff. So we're eventually going to want to block out this code so that it only runs if the user actually wants a listing because, frankly, it's slow. But there is one more bit we need to do. I'm just trying to think of the best way to do it.
We need to copy the text into the buffer. And I think that this is going to be the job of the lexer. So we're going to want to copy text from the source into the listing until we run out of space. And the simplest way to do that is to put it here. So the text starts at 39 minus 3, which is 36. Yeah, that's where we put our terminator zero in. And so we actually have 44 bytes of text. So, because we also need a terminator, the limit for listing chars is 43. If pass equals 2 and listing chars has not reached 43, then listing buffer at listing chars equals C, and listing chars goes up by 1. OK. One last thing we need to do to make that work is, just before doing the thing, we need to make sure the string is terminated. And that's complete garbage. I know why it's garbage: because we're actually also emitting all the control codes. So we don't want to put newlines in there. Of course, that's still complete garbage because I have forgotten to do this. Lots of nasty magic numbers in there. We should be using constants. OK. That's interesting. Oh, yeah. It's done that because we've done this in the wrong place. This should happen before we read the token, because otherwise we'll just reset over the string that we just read at the beginning of the line. OK, that's not so bad. That's looking moderately good, actually. It's got the line numbers on the left. Going to the end of the file takes a while. This is not the quickest. We might need to allocate another character for the line number. I think 4 bytes is not enough. How are the line numbers looking? So here is line 24, f3-mol. f3-mol2. Yeah, that's right, actually. Cool. Let's find a DB. Let's find a DB followed... I'll find a DW. OK. This is actually the lookup table for initializing the operators because it's... Why is there a double colon on there? This will only be... Hmm. Anyway, there should be a big DB at the bottom of this. Here it is. So here we have our eight characters. Ooh. Right, I know what that is. That's nice and simple.
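The idea of having the lexer copy source characters into the listing line until space runs out, while skipping newlines, can be sketched in C as follows. This is a minimal sketch under assumptions: the `listing_t` type and function names are hypothetical, and the numbers 36 and 43 are the magic numbers quoted above. It keeps a count of characters seen rather than characters stored, which is a design choice of this sketch so that an unget after the field is full stays consistent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LISTING_WIDTH 80
#define TEXT_START 36 /* from the transcript: 39 minus 3 */
#define TEXT_MAX 43   /* 44 bytes of field, minus one for the terminator */

typedef struct {
    char buffer[LISTING_WIDTH];
    size_t seen; /* printable source characters consumed on this line */
} listing_t;

/* Blank the line. This must happen before the first token of the line
 * is read, otherwise it wipes text the lexer has already copied. */
static void listing_reset(listing_t *l)
{
    memset(l->buffer, ' ', sizeof l->buffer);
    l->seen = 0;
}

/* Called by the lexer for every character it reads. */
static void listing_add_char(listing_t *l, char c)
{
    if (c == '\n' || c == '\r')
        return; /* control codes would garble the listing line */
    if (l->seen < TEXT_MAX)
        l->buffer[TEXT_START + l->seen] = c;
    l->seen++;
}

/* Called by the lexer when it pushes a character back. */
static void listing_unget_char(listing_t *l, char c)
{
    if (c == '\n' || c == '\r')
        return; /* newlines were never counted, so do not uncount them */
    assert(l->seen > 0);
    l->seen--;
}

/* Terminate the text field and return the finished line. */
static const char *listing_finish(listing_t *l)
{
    size_t len = l->seen < TEXT_MAX ? l->seen : TEXT_MAX;
    l->buffer[TEXT_START + len] = '\0';
    return l->buffer;
}
```

The newline check in `listing_unget_char` mirrors the kind of fix needed when an ungot end-of-line character would otherwise chop the last character off the listed text.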
This is because tokens have been ungot. And when we unget something, listing chars equals listing chars minus 1. Let's look at the DW again. What happened to the tail of these? What's happened to the last byte? What's happened is that this has ungot it. So that's not quite what I had in mind. What will be happening here is that at the end of the line it is ungetting something and then not getting it back again. It's probably ungetting the newline. So I think I should be able to easily hack this by saying C is not a newline. I mean, this is all really bodgy, but that's better. Okay. And now I believe we have a reasonably coherent listing file. Yeah, let's change that and use constants. X line number end is going to be... well, X line number plus 5, of course. X program counter is going to be X... I can't type... line number end plus 2. X program counter end is, of course, going to be X program counter plus 4. X bytes is going to be X program counter end plus 2. And the text is going to be at X bytes plus 8 times 3. Text end is going to be 79 minus X text. I'm trying to think how I want to do this. Let's just do that. Okay. So here, this is going to be X text. Listing byte is not going to be the number of bytes. It's going to be the X position in the string. So that is going to start at X bytes. Likewise, this is going to start at X text. The first colon is going to be at X line number end. The second colon is at X program counter end. Oh, yeah. If listing file name is not equal to 0 as string, and listing byte X is not equal to... what comes after the bytes is the text... X text, then write at X. There we go. Much more efficient, as that is now no longer doing a multiplication. We should see the code size go down, I hope. Where is read token? Listing char X is not equal to X end. Listing char X. In fact, we could use pointers here and make things even easier, but honestly this is clearer.
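The column arithmetic above is easier to follow written out as constants. The C sketch below is an illustration only: the identifier spellings are guesses at what was typed on screen, and `line_add_byte` is a hypothetical helper. It shows the switch from counting bytes to tracking an X position, so each emitted byte is placed without a multiplication.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Column layout of one 80-character listing line. Each field is defined
 * relative to the previous one, as in the transcript. */
enum {
    X_LINENUMBER     = 0,
    X_LINENUMBER_END = X_LINENUMBER + 5,     /* 5 decimal digits */
    X_PC             = X_LINENUMBER_END + 2,
    X_PC_END         = X_PC + 4,             /* 4 hex digits */
    X_BYTES          = X_PC_END + 2,
    X_TEXT           = X_BYTES + 8 * 3,      /* 8 bytes, 3 columns each */
    X_END            = 79
};

typedef struct {
    char buffer[80];
    int byte_x; /* column where the next emitted byte will be written */
} listing_line_t;

static void line_reset(listing_line_t *l)
{
    memset(l->buffer, ' ', sizeof l->buffer);
    l->buffer[X_END] = '\0';
    l->byte_x = X_BYTES;
}

/* Record one emitted byte. Once the byte field is full (byte_x has
 * reached X_TEXT) further bytes are silently dropped. */
static void line_add_byte(listing_line_t *l, uint8_t b)
{
    static const char digits[] = "0123456789ABCDEF";

    assert(l->byte_x >= X_BYTES && l->byte_x <= X_TEXT);
    if (l->byte_x != X_TEXT) {
        l->buffer[l->byte_x] = digits[b >> 4];
        l->buffer[l->byte_x + 1] = digits[b & 0x0F];
        l->byte_x += 3; /* advance by addition, no multiply needed */
    }
}
```

With these definitions the bytes start at column 13 and the text at column 37, one column further right than the earlier hard-coded 12 and 36, which matches the extra digit allotted to the line number.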
So these multi-branch conditionals are awful on the 8080 because of the 3-byte branch instructions and because comparing things is just generally a bit grim. To compare listing file name against zero, you have to load it into a register, which is 3 bytes. It's going to go into HL. You then have to move H into A. You OR it with L. So we're up to 5 bytes. And then you do your branch, which is another 3 bytes. So that's 8. And that's the optimised comparison, because we're comparing against zero. Comparing against another number, it gets even worse. This is a little better on the 8080... sorry, on the Z80. But not much. It's just all kind of terrible. The 8080 is a very simple processor. The Z80 is not a very simple processor but is in many ways just as annoying to actually write in. Okay, what's that done? Scribbled all over memory and crashed? Not quite. I think it's put the program counter in the wrong place. That should be X line number. Yeah, that should be X program counter. What's happened here? Oops, that should be X end. That's why none of the bytes showed up. That doesn't explain why the text hasn't shown up. That's not much better. Also that colon floating in the middle of nowhere looks dumb. Let's put that in a better place. Let's put that here. That means we don't need the line number end. So we can actually... No, that's right. I mean, that still doesn't explain why the text hasn't shown up. Or the bytes. If listing file name is not equal to 0 as string and listing byte X is not equal to the end. Okay, yeah, that's... Oh! It is in fact doing exactly what I told it to do. It's not emitting stuff because listing file name is not set. So we do this. That should work. But I need to stick a double hyphen after the emulator invocation. Why are we getting a syntax error there? Why are we parsing arguments? Yeah, right. This is tricky. This is because CP/M and my emulator actually convert the command line to uppercase.
So we could just compare these against the capital O and capital L versions. But I actually want this to work on other platforms as well. So we are going to have to call ToUpper on the byte we just read containing the character. That's better. That looks okay. Where's our DW? Here, there's our DB. Oh, that's not great. 0, 1, 2, 3, 4, 5, 6, 7, 8. That should be 9 times 3. Well, this garbage is garbage. But for the DW: 0, 1, 2, 3, 4, 5, 6, 7, 8. Oh! Okay, that's not right at all. Well, that's wrong. I think that could have been it. So there are two problems. One is that there are too many bytes being emitted. So let's take a look at that. There we go. That should not be bytes. And the other is that the text is garbled. But I think that's actually going to be text. I think that is because it's being overwritten by bytes. So there we go. That looks fine. You see, I wonder whether we could actually get all the bytes simply by, when we reach the end here, flushing the line to the output, making a new blank line, and continuing. Therefore, you get multiple lines of bytes. It would be fairly straightforward. But I don't think I'll bother. People won't be particularly interested in the byte values other than for, you know, the actual instructions. So we want to do this if listing file name is not equal to 0 as string and pass equals 2. Because that will skip all the work if no listing file name is set. It's still printing lots of stuff because I haven't done this one either. Listing file name is not equal to 0 as string and pass equals 2. Now, do you remember when we were doing the IF callback and it was setting the pass high in order to suppress writing anything to the listing file? So we haven't put any support for that in, but actually I think we don't need to.
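The case-insensitive option check described above, needed because CP/M uppercases the command line while other platforms do not, can be sketched in C with the standard `toupper`. The function name `is_option` is hypothetical, and the single-letter `-l` and `-o` option shapes are inferred from the transcript rather than taken from the assembler's source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if arg is exactly "-" followed by the given option letter,
 * in either case. Both sides are uppercased, so "-l" and "-L" both match
 * whether or not the platform has already uppercased the command line. */
static bool is_option(const char *arg, char letter)
{
    assert(arg != NULL);
    return arg[0] == '-'
        && toupper((unsigned char)arg[1]) == toupper((unsigned char)letter)
        && arg[2] == '\0';
}
```

The casts to `unsigned char` are there because passing a negative `char` value to `toupper` is undefined behaviour in C.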
I mean, it's going to be pretty hacky, but if we just take that stuff out, then what you'll see in the listing is, in the text section, you'll see IF and then your expression, and if it's false then you will see text from the next few lines appended on the end of the IF. Likewise for ELSE: it will just give up writing text when it hits the end of the line. So let's actually just chop those out for simplicity, and I completely forgot to do it entirely for ELSE, so that's straightforward. It also occurs to me that I'm doing this test lots, so I can simplify things a lot by using a flag byte. Up here we have: if listing file name is not equal to 0 as string and pass equals 2, then do listing equals one, else do listing equals zero. So every place where we're doing this test becomes: do listing is not zero. Cowgol does not support automatic coercion of types to booleans, so you always have to do the explicit comparison. I am still not entirely sure whether that's a good idea, but let's stick with it for now. So this simplifies all this stuff no end. Let's run that and see what it does. That seems to be doing the right kind of thing. Let's try that without the minus L. Yep, nice and snappy, relatively speaking. Okay, so all we really need to do now is close the listing file at the end of the program. What am I doing? Let's have a helper here to save sanity. Start error, "failed to close", name, and end error. Okay. And the one last thing is here: instead of dumping it to the console, we still want to terminate it. I think it's called FCBPutString, with listing file and listing buffer 0, and a newline, and that should actually be a pointer to an FCB. Okay, so assemble and write to listing. Yeah, the listing is slow. Let's see if that actually worked. Yep, that looks fine. One thing I do need to do, which is the very last thing before closing the file: I need to write an end of file character.
This is a convention in CP/M for text files, where you don't want garbage at the end of your file. You saw the zeros at the end of that. You put a Ctrl-Z to tell any text editor that this is the end. You can see it here, and anyone who's programmed in DOS will remember that this convention exists there as well, because DOS has its ancestry in CP/M, even though DOS does support files with exact byte sizes, so it doesn't need them. Okay, it's finished. I mean, there is nothing else to do. We have a working assembler. It is 9,013 bytes, which compares to... I'm just trying to remember the path... 10,712 bytes for the ACK assembler, so we've beaten it in terms of size. Let's just try running it and seeing what happens. I think we have to call that cowasm. It looks like asm.com. "Open input file". What file is it looking for? It should be looking for cowasm.asm, unless it wants it in capital letters. Excuse me. This is it writing the result banner. That's weird. Did I get the "couldn't open input file"? It hasn't actually tried to open the file. That's peculiar. This will be an interaction between cpmemu, which I wrote and is kind of hacky, and the ACK's command line parsing, which I also wrote and is a bit hacky. You're probably sensing a theme here. Let me just try this. I'll have to debug that later. That's frustrating. I was going to try to compare them up against each other. The ACK's output is... I will see if I can find it for you. It's not brilliant. It's not kept the assembler file. The ACK's output is very slow because, as I mentioned earlier, it uses lots of helper routines, but it is quite dense. But given that we are comfortably beating it by more than a kilobyte, that's over 10%, I suspect that cowasm here is way faster because it's not using all those helpers. The ACK uses helpers to pull values off the stack. Cowgol doesn't have a stack, so it's got direct access to all its variables.
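The Ctrl-Z convention described above exists because CP/M stores files as whole 128-byte records, so a text file has no exact byte length of its own. A hedged C sketch of both sides of the convention follows. The function names are hypothetical, and filling the whole tail of the last record with Ctrl-Z (rather than writing just one, as the assembler in the transcript does) is an assumption made here to keep stale data out of the final record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CPM_EOF 0x1A    /* Ctrl-Z */
#define CPM_RECORD 128  /* CP/M files are whole 128-byte records */

/* Writer side: append at least one Ctrl-Z after len bytes of text and fill
 * the rest of the final record with Ctrl-Z as well. Returns the padded
 * length, or 0 if the buffer capacity is too small. */
static size_t cpm_terminate_text(unsigned char *buf, size_t len, size_t cap)
{
    size_t padded = ((len + 1 + CPM_RECORD - 1) / CPM_RECORD) * CPM_RECORD;

    assert(buf != NULL);
    if (padded > cap)
        return 0;
    memset(buf + len, CPM_EOF, padded - len);
    return padded;
}

/* Reader side: the logical length of a text file is everything before the
 * first Ctrl-Z, or the full size if there is none. */
static size_t cpm_text_length(const unsigned char *data, size_t size)
{
    const unsigned char *eof = memchr(data, CPM_EOF, size);
    return eof ? (size_t)(eof - data) : size;
}
```

A reader that stops at the first Ctrl-Z works the same whether the writer emitted one marker or padded the whole record.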
I think this will absolutely beat the socks off the ACK assembler. It's smaller and faster. I can compile this on an actual 8080 or Z80 machine, which you certainly can't do with the ACK, even though the ACK is pretty small as C compilers go. One more thing I want to try. There's the command line that actually executes everything. Let's just see if I can time it. That's taking a whole 106 milliseconds to compile this assembler using Cowgol on my machine. Let's try that with the Z80 version of the compiler. We still use cowlink8080. We're using the wrong runtime. We're using the 8080 runtime rather than the Z80 runtime. Ooh, that's not so great. Oh, that's because I told zmac this is 8080 code. Minus Z? So that's produced Z80 code. So here it is. It's quite different from the 8080 code because the Z80's got more registers and more operations and everything. So we can actually use IX and things. Yeah, this is relying on... So what's going on here is that with some of the fixes I made to the linker, I've actually put 8080-isms in the Z80 code, because it uses the same linker for both. Yeah. Oh, hang on, hang on. No, I know what's happening: I have the wrong header path. So there we go. So how big is this? And that's not the cowasm.sim. Okay, so that's knocked 500 bytes off the size of the compiler... of the assembler. Let's try running it and see what happens. This could be hideously embarrassing as I haven't actually tested this. And of course now it's trying to compile the code with my 8080 assembler. Okay, I'm not going to bother with that. I'll beat this into shape and actually integrate it into the toolchain and everything. And hopefully I can stop using zmac and use a much faster compiler instead. You've seen how long the Cowgol toolchain takes to compile. I will also try it on a real machine and see how that goes. Anyway, that was a stupidly long video of me ham-fistedly programming in a language which I invented.
So I'm sure that is of absolute thrilling interest to everybody. But the last hacky video seemed to go down well, so I hope you enjoyed it. Please let me know what you think in the comments and do follow the links and check out my stuff.