So, the story so far: I am attempting to port the Fuzix operating system to the ESP8266. I have the kernel running up to the point where it's actually trying to load a binary. There's a block device on the internal flash and a file system that may or may not work. So the next thing we need to do is to actually load the init binary into memory, and then we can get started on the code needed to actually, you know, run it. But before we can do that, we actually need to have a binary to load. So I'm taking some time out of the kernel to instead work on the libc and application side of things, because we also need to make that work. Now, it has been a little while since I've looked at this, so I will have to figure out how it all works. Fuzix has its own libc and system call layer, so there is only one thing we need to port. And then we can go and actually build all the applications and put together a root file system with some stuff in it. So, what happens if you actually make this? It seems to be done. All right, that's just built the cross-platform things. We are going to have to touch this at some point. The system call library is actually generated from these commands here. So, let's take the... that's interesting, the Z80 ones aren't there. Well, let's take the 8080. So we have a little program here that reads in all the system calls and then spits out chunks of code that actually make the system call happen. This is the easiest way of generating all the system call stubs, which are the little machine code routines that actually call out to the kernel. For the 8080, calling out to the kernel is literally a jump into the kernel, to the syscall entry point. But I think we can do a little bit better than that. So, where does the actual... well, this is the actual libc. Oh, right, and there are multiple makefiles, one for each platform. Find them; here they are. So we are going to want... what's a decent 32-bit one? So we're going to want to copy this.
So you see, this is actually pretty straightforward. It just builds all the C stuff plus a bit of machine code stuff, including the dreaded setjmp and the slightly less dreaded CRT things. All right, so let's copy this, Makefile.68000 to Makefile.esp8266. Now, normally the userland, which is all the actual user applications that are going to run on top of Fuzix, is independent of what kind of platform you're on. So it doesn't care whether you're running on an MSX or an Amstrad PCW or anything like that. They all have a Z80. They all support the same Z80 ABI, so you use the same userland for all these things. That's not the case for us, unfortunately, because the ESP8266 has no memory manager. We have to bake the addresses where the executables are going to be loaded into the libc. So our libc is specific to the ESP8266 rather than just being a generic LX106 executable. In fact, that is not entirely true, as it is possible to generate relocatable executables. But I'm not going to, because that's work. OK, so loading, there we go. Right, so we need to tell it how to invoke the cross compiler. And we've already done that. It's in cpu-lx106/rules.mk. So we just copy this. The cross compiler is xtensa-lx106-elf-gcc, the assembler is xtensa-lx106-elf-as, and the archiver is xtensa-lx106-elf-ar. Platform is ESP8266. OK, the compiler flags. I believe this platform does not have a floating point unit, so that should work. This is 68000. There's another soft float, which we don't really need. OK, this is referencing the libc headers. I don't know what fixed is, but we'll get rid of it. Now, we may need some other things as well. So if you look at the kernel, we are using the long calls flag, -mlongcalls, which causes all calls generated by the compiler to be the long form with the full 32-bit range. This is because the kernel needs to call into the ROM and to other stuff like the code we put into the instruction RAM and so on. However, we don't need that. And there's force L32, -mforce-l32, which causes the compiler to always generate 32-bit accesses.
This is necessary to allow us to put data in the flash and the instruction RAM. Now, we are not going to be using either of those. Long calls we don't need because we're not calling into anything other than the libc. And force L32 we don't need because we're not going to put any of our data in the instruction RAM. Because we've got 64K of data RAM but only 31.5K of instruction RAM, it's actually better for us to put data, that's read-only data, into data RAM. Our loader will put it there when the binary loads. So we can do native byte and word accesses and not need this. This will also generate smaller code, which is good, and faster code, which is also good. OK, so that one is going to be generic. I don't think we want DriveWire, but we do want the user structs. And I suspect that could be all we need. OK, so let's try building it and see what happens. OK, we'll be able to find a file. I will actually go up here and set up my usual auto-builder. And this is going to be... this is called Makefile.esp8266. The entr command is really useful. You feed in a list of files; then entr will run the command in the double quotes whenever one of those files changes. So that's going to fail, because we do not have our tools command. That's actually referencing something else. All right, you remember I said there was a syscall generator program? This is trying to run it. So we're going to have to actually do that now. So it's going to be tools/syscall_esp8266.c. Put anything there. Oh no, that makefile's in the top level. That goes here. So this is easy enough. Let's put this in roughly alphabetical order. esp8266. OK, we will actually go up one so that, when we edit our syscall program, the auto-builder will auto-build it. OK, there's nothing there right now. So we are going to copy the 68... I'm in the wrong place. tools/syscall_esp8266.c. OK, that is here. We are going to copy the 68000 one.
Here it is, next to my bitrotted MSP430 one, which also works by calling directly into the kernel. All right, so this is going to be... this is just boilerplate. Fork. Fork is special. We don't care about it. Let's get the commenting correct. Commenting, indentation. It's wrong, but it's now sensible. Oh, I can't remember how to make clang-format change the indentation. All right, so now we are going to be using a facility of the LX106, which is the syscall instruction, which is intended for exactly this. Syscall is a special instruction which generates a particular kind of exception that the kernel will catch, and it is used for user programs talking to the kernel. So where is the detailed documentation for it? Here we go. It's just syscall. No, that's not the right one. This tells you how the instruction works, which is really not very exciting. What I'm looking for is the system calls. System call is used to talk to the kernel. The only restriction that the LX106 puts on you is that, if you're using windowed registers, system call request zero has a special purpose. That's generated by the chip itself. However, we do not have the windowed register ABI. It would be nice if we could use A2 for our own purposes. It wants you to put the system call request number into A2. So the thing is, the LX106 passes parameters into a function in registers when you're up to four arguments. Is this described? There we go. This is the ABI. This is how functions call each other. So A2, A3, A4, A5, A6, A7: six incoming arguments. So ideally, for a system call, all we want is the syscall instruction, nothing else. That will pass the arguments in registers straight to the kernel. We need one register for the system call number and then up to four parameters. We have six registers available. We do have to set the system call number. You know what? I'm just going to follow the ABI here, even though it's work. Do we have any varargs system calls?
Varargs system calls pass their parameters on the stack. Where does it get the list of system calls from? A header. So here are the system calls. Open is varargs. I think we just pass those as if they're four. No, we're going to have to do this properly. I'm going to have to find out how the compiler passes varargs. OK, what's going on here? The open system call, opening a file, comes in two forms. You've got the two-parameter version and the three-parameter version. The way this is typically implemented is not described in there. That should be in stdlib... stdio. Not stdio. unistd. Not unistd. Apparently like this. The way this is typically implemented is like this, with a varargs specifier. And these get passed specially. So that's actually going to be a bit of research. All right, so what we need to do in our generated machine code: we align, because we need to do that. Our input arguments are passed in A2, A3, A4 and A5. We want to put these into A3, A4, A5 and A6, with the system call number in A2. So let's put the system call number into A2, which is MOVI A2, %d. And this is the system call number. We can't actually put that here. We have to put this down here, after we move all the other arguments. And after that, we call the system call routine. So when we pass in our parameters, we need to move them into the right place. The number of parameters is anything from 0 to 4. If there is one parameter, it's coming in in A2 and we want to put it into A3. Likewise, if there are two parameters, A3 goes to A4; three parameters, A4 goes to A5; and four parameters, A5 goes to A6. Each one of these falls through to the one before. So if we get four parameters, this moves A5 to A6, A4 to A5, A3 to A4 and A2 to A3, puts the system call number in A2 and calls the system call. It might actually be possible, given the way this platform's system calls work, to use the same routine for all system calls. No, no, it can't, because we still have to do this.
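The shuffle just described can be modelled in a few lines of C. This is only a sketch: `struct regs` and `shuffle_for_syscall` are hypothetical names standing in for registers a2..a6 and for the generated stub, which in reality is a handful of Xtensa move instructions followed by `syscall`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the argument registers; only a[2]..a[6] are used. */
struct regs {
    uint32_t a[7];
};

/*
 * Model of a generated stub for a system call taking nargs arguments (0 to 4).
 * On entry the C arguments are in a2..a5. On exit they are in a3..a6 and the
 * system call number is in a2, ready for the syscall instruction.
 */
static void shuffle_for_syscall(struct regs *r, int nargs, uint32_t callno)
{
    /* Move the highest argument first so nothing is overwritten;
     * each case falls through to the one below it. */
    switch (nargs) {
    case 4: r->a[6] = r->a[5]; /* fall through */
    case 3: r->a[5] = r->a[4]; /* fall through */
    case 2: r->a[4] = r->a[3]; /* fall through */
    case 1: r->a[3] = r->a[2]; /* fall through */
    default: break;
    }
    r->a[2] = callno; /* must come last: a2 held the first argument */
}
```

The ordering is the whole point: loading the call number into a2 before the moves would destroy the first argument.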
OK, let's just stick with this. Now, varargs: I'm not sure what to do here. Let's write a test program. So our open looks like this; it's called topen for a reason. So we're going to call topen like so, and like so. And now we're going to compile it into machine code. Make sure it's optimized, to make the code easier to read. Oops, I forgot to put a parameter in, not that it will make any difference. And let's see what that made. OK, so, a surprising lot of code, actually. So here we have the standard entry point. Now, we allocate a stack frame, and on this platform stack frames must be a multiple of 16 bytes. We're saving A12, which is one of the callee-saved registers, so it's one of A8 up. So it is not an incoming register. Right, we get the pointer to our string into A12. So that's the first parameter, though it's interesting it's going into A12. Here is our second parameter, which is A3. We put the string pointer into A2. So now our two parameters are in A2 and A3. That's not really good code. We then save the link register, that is, the return address, onto the stack. This would normally go up here in the function prologue, but for some reason the compiler's not doing that. So that is passing the two parameters in A2 and A3, as expected. Right, now for the three-parameter version. This is just what I want to see. It's putting our three parameters into three registers, even though it's a varargs calling convention routine. Good. This means that for varargs we just pretend it's a four-argument subroutine. That seems 68k-specific, and the indentation's gone wrong. So let's just run that again. That's better. OK, so that is the entry to the system call. That's taking our parameters and calling the kernel. On exit, we want to check to see whether the system call succeeded or not. We do not believe... right, we don't need this. This has got to do with the 68k having pushed parameters onto the stack.
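A sketch of the kind of varargs test program described above, assuming an open-style prototype. `T_CREAT`, `last_mode` and the two helper functions are invented for illustration, and the body of `topen` exists only so the example runs on a host compiler. For the experiment itself, what matters is the disassembly of the two call sites when built with the cross compiler.

```c
#include <stdarg.h>

#define T_CREAT 0x0100 /* hypothetical "create" flag, for illustration only */

/* Last mode seen by topen(), or -1 if no mode argument was read. */
static int last_mode = -1;

/* Varargs prototype in the style of open(): the third argument is optional. */
static int topen(const char *path, int flags, ...)
{
    (void)path;
    last_mode = -1;
    if (flags & T_CREAT) {
        va_list ap;
        va_start(ap, flags);
        last_mode = va_arg(ap, int); /* only read when the caller passed it */
        va_end(ap);
    }
    return 3; /* pretend file descriptor */
}

/* The two call forms whose generated code is worth comparing. */
static int call_two_args(void)   { return topen("file", 0); }
static int call_three_args(void) { return topen("file", T_CREAT, 0644); }
```

If the compiler places all three values in consecutive argument registers at the second call site, as observed in the session, the stub can treat a varargs call exactly like a fixed four-argument one.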
On exit from a system call, the kernel will typically return two values: a single flag to say whether it succeeded or not, and the return value. However, the C functions that represent the system calls, such as open here, will typically return a non-negative number, meaning it succeeded, or minus one, meaning it failed. And if it fails, then the error code is put into the errno variable. So what we're doing here is... this is checking to see. Right, on the 68000, the return value is in D0, which is the right place for a return value from the system call function. The error code goes in D1. That's not quite what I expected. What this is doing is, the actual errno itself is going in D1. If it's not 0, then we consider there to be an error. So we store the error code in errno and return. That's actually easier to work with. So the logic we want here is: if the error code is not 0, something like that, where the result for us is going to be in A2, and the error code is going to be in A3. So how do we do this on this platform? Well, I could probably puzzle it out, but it's much, much easier to do this: if A3 is not equal to 0, errno equals A3; return A2. And now we compile it. And what do we have? Wow, that's small. So we test A3. If A3 is 0, branch forwards to the case where there is no error. Otherwise there is an error, so we want to put the value in A3 into errno. To do that, we put the address of errno into A4 and store A3 into it. Remember from several sessions ago: if you do MOVI with an address, it will turn it into an L32R and put the value here into a constant pool. Where is the constant pool? Here it is. And it'll do that for us automatically, which is nice. So that's all we need to do to store the value in errno. We're now just going to return the normal return code, which is in A2. So there is the end of our error case. And just do a ret. And the kernel will have made sure that A0, which is the link register, has not been modified.
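The exit-path logic above, written out as C. This is a sketch only: `syscall_return` is a hypothetical name, and `result` and `error` stand for the values the kernel is assumed to leave in A2 and A3.

```c
#include <errno.h>

/*
 * Model of the stub's exit path. result stands for a2 and error for a3,
 * as left by the kernel when the system call returns.
 */
static int syscall_return(int result, int error)
{
    if (error != 0)
        errno = error; /* record the failure code for the caller */
    return result;     /* the kernel's return value is passed through unchanged */
}
```

Compiling a function like this and reading the output is a quick way to get the branch-and-store sequence for an unfamiliar instruction set without writing it by hand.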
OK, so that should be our user space system call code. I'm sure there will be something wrong with it. This writes out the makefile to generate the system calls. This is, of course, going to be xtensa-lx106-elf-as. And we don't actually need to set anything else, because it's not used anywhere. So all the assembler sources, this is boilerplate. Good. Right, I think that's us done for that. So what happens when we save it? Built? Hang on, we didn't add it here. esp8266. Well, it says it built it. So there it is: tools/syscall_esp8266. It's trying to write out the makefile. OK, that's a slightly better error message. Ah, we did not create the directory to put all our assembler files in. So let's see what this does. OK, if we look here, it's generated all the system call stubs, but they're not assembling. So let's have a look why. exit.s, line 9. I'm too used to the 6502, and I instinctively put a hash sign there. Fantastic. So it assembled all the system call stubs. And it turns out that we do need that. It's now trying to... what am I doing, this one? So the reason why it's produced the error message 'rc: no such file or directory' is because it's just tried to call the ar archive command, which I decided wasn't being used and didn't define. So that's expanded to an empty string, and the rc, which is the first argument passed to the archiver, ends up being run as the command. That's just one of those things you recognize once you've been working with makefiles for a while. I hate makefiles. OK, it rebuilds them. It archives them. And I didn't put the newline in. It assembled them, but I think it put all the output files in the wrong place, so the archiver can't find them. Wonder where they went. That is completely not how to work the find command. I want to do that. Maybe it just didn't go anywhere. OK, well, now to fix that. The GNU as command takes, I believe, -o for the output file. Yeah. -c is supposed to put it in the correct place. But anyway, let's do that. OK.
Right, so it's now generated all of our system calls into a library, which has gone into syslib.lib, which is this. And we can use ar... apparently, we need to use the right one. ar... what are the commands? t. ar t, to list the contents. So here are all the .o files. An ar library is actually a file archive, for deep historical reasons to do with the origins of Unix. OK, right, now we need to do setjmp. So the thing about setjmp is it's very, very compiler-specific. And I have a feeling that I've got one already that I can just borrow slash steal. So this is a third-party mini libc, so I can fetch the source for it. So here is the register window mechanism. But we're not using that; we're using the call0 ABI. So this is the windowed version. Wow, there's a lot of this. Here is the call0 version, which is pretty normal. Yeah, that's straightforward. So I should be able to basically copy those. There ought to be a longjmp as well. So what setjmp does is it saves the execution state. That's the state of all the registers which need to be saved, and the stack pointer, so that it can be restored later. It's used for making non-local jumps around a program. I did refresh that, didn't I? Did I create the file? Apparently I didn't create the file. Actually, this is generic. OK, soft float doesn't work. Is there a longjmp? Yes, there is, in a separate file. Yeah, OK. So let's create that as well. And we'll also probably need to create our own header file for it in this, because the actual layout depends radically on the platform. All right, so let's take a look at this implementation. There's actually nothing particularly exotic here, which is nice. It's really good that this system isn't using windowed registers, because that makes life so much more complicated. So how is this going to work? setjmp saves the state into a structure. Oh, great, now we've got two of them. OK, anyway. .text, .align 4, .global setjmp, setjmp.
So the first parameter to setjmp is the pointer to the structure that we're writing to. So this just stores all the stuff. I can actually just copy that. It's so trivial that it's not copyrightable. This is the stack pointer. Do we get to use LR? Apparently we don't. All right, this doesn't mention a particular register name, so let's just go with A0. OK, so that works. Now, we do need to look at this to see how many slots there are, which is six. So then we go over here, and we need to define this accordingly. And we are actually going to leave that for now, because it does actually properly produce an error. So I'm just going to wait until we see that error and then implement it, because that way it'll be easier to test. OK, so we want to get rid of that other file we put in by mistake, setjmp_lx106.s. I want to fix this: the no soft float option. Now it's building huge great swathes of the libc, and it fails because we haven't implemented the CRT. I do want to implement longjmp, which is the counterpart to setjmp. It restores the state, and it works exactly the same way, but in reverse. So what the user sees is that they call setjmp and pass in a structure. The state of the system is stored into that structure. Then at some later date, somebody will call longjmp, at which point execution will jump back into setjmp. So setjmp will return again. The caller can tell the difference by the return value. The first time setjmp is called, when it saves the state, it returns zero. And you can see that happening here. When longjmp is called, you pass in a value, which is the new return value from setjmp. So we reload all these things. We restore the stack. The stack pointer is guaranteed to go backwards when this happens. It can never go forwards. Well, it won't work if you make it go forwards.
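From the caller's side, the behaviour described above looks like the following sketch. It uses the host's standard `<setjmp.h>` rather than the LX106 implementation from the session, and `bail_out` and `how_setjmp_returned` are hypothetical names. Standard C also specifies that a value of 0 passed to `longjmp` is delivered as 1, so the two kinds of return can always be told apart.

```c
#include <setjmp.h>

static jmp_buf env; /* saved execution state; its layout is platform-specific */

/* Jumps back into the setjmp() call below, making it return `value`. */
static void bail_out(int value)
{
    longjmp(env, value);
}

/*
 * Saves the state, then immediately jumps back to it.
 * Returns 1 if setjmp() came back with 1, or 2 for any other non-zero value.
 */
static int how_setjmp_returned(int value)
{
    switch (setjmp(env)) {
    case 0:              /* first return: the state has just been saved */
        bail_out(value);
        return -1;       /* not reached */
    case 1:              /* came back via longjmp() with 1 (or with 0) */
        return 1;
    default:             /* came back via longjmp() with something else */
        return 2;
    }
}
```

The call to `longjmp` happens while `how_setjmp_returned` is still active, which is the only situation in which jumping back is valid.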
There is one weird gotcha, which is that if you pass in a zero to longjmp as the parameter, because zero is used to indicate that setjmp has returned for the first time, this is illegal, so it turns into a one. And here we have some code that does it. We load... that's quite cunning. So on entry, A3 contains the value. So we load one into A2, which is the return value, the backup return value. And if A... hang on, move into A2, that's a neat instruction. So move A3 into A2 if A3 is not equal to zero. Let me double-check that. I think this search isn't working, right? Or maybe I should be pressing the right button. MOVNEZ. This is the floating point version, but it will have the same logic. Not quite the same logic. Let me find the real one. Move if not equal to zero. Yeah, if the contents of AT is non-zero, copy AS to AR. That's neat. I like that. Anyway, that should now work. That didn't build anything, which suggests that I need to add a thing to here. Yes, I do. longjmp_lx106.s. OK, and now we need to do the CRT. The CRT is the C runtime. It contains the code that gets executed when the binary starts. The kernel will load the binary into memory, and it will jump to the entry point, and it will arrive here. So this will need to do basic initialization: initialize stdio, set up argc and argv, et cetera, and the environment pointer, and eventually call the real main function. It's the CRT that's responsible for the argc and argv parameters arriving in a C main function. Actually, this is generic. So this should actually build now. No, it doesn't. You've got CRT.s, and we don't have a no-stdio version. The no-stdio version, which makes smaller binaries, we're not going to worry about for the time being. OK. Right, we now have a libc. It's missing the CRT stuff, so we'll not actually be able to compile anything with it. But there's our... oh, we have a maths library too. There's our math library. There's our termcap library. Here we go.
Curses, math library, readline, system call library, termcap. Has it built the libc? It might not have. Interesting. Anyway, let's actually write it. We know how the calling convention works, so we're actually just going to copy this with a few wrinkles. The section header here puts this function into its own special section, so the linker can put it in the right place. This is the... I believe this is the new lightweight Fuzix system call format. No, it is not. This is the new lightweight system call format, which is the one we're going to be using. The MSP430 version, which I was looking at, is obsolete. So let's do that instead. So, a magic number to say it's a Fuzix executable. This is the CPU identifier. Is that being used by anybody? This is the code that actually loads the binary... this is the bit where we're parsing the header, setting up arguments. OK, here's the header. header_ok validates the header. OK, we do need to set the system call library correctly. Ah, right. That explains what the... if you remember the other day, there was this sys_cpu kernel function that I was wondering about. That returns the CPU number, the CPU ID. So I bet that somewhere in here... here it is, exec.h. This is where our list of CPUs is defined. So we are just going to put one down the bottom. So this we don't care about. Actually, let's do this differently. We're going to declare ourselves as being an LX106 as the CPU type, with the subtype being ESP8266. This way we only use up one of these IDs. And we now need to go to here, tricks.s. No, I think it was actually in here. Yeah, so this is going to be ID 11. sys_cpu_feat needs to be defined in the... in here somewhere. So where did we put our CRT? And we're actually going to need to edit our makefile again. Do we? 1, 5, 9. That's a .s, all right. OK, the difference between lowercase .s and uppercase .S is that uppercase .S assembly files get pushed through the C preprocessor before assembly, which is kind of useful.
And in fact, we're going to have to move this, because our crt0 is specific to a particular platform type. And apparently, because I messed up the makefile, it already wrote that. So let's put that in here. esp8266... forget crt0_lx106, remove the actual file. Go to the makefile: platform. Yeah, OK. Right, now we can actually start writing our code. CPU type is LX106. Feature set is... this is the ESP8266 module. This is the base address, the place where the binary is loaded, which is meaningless for us. So we just leave it. binman... I thought the linker made these. OK, well, let's leave these in. This is the offset of the start address, which is that. This is the size of zero page, which is meaningless for us. Here we have the address of the signal handler, through which signals are delivered to the program. The kernel will force a call to here. And then we actually do our code. Let's do... I was right the first time, let's do it like this. I don't believe those exclamation marks will be honored correctly. Anyway, let's get rid of the old header. Comment this out. The environment pointer is 4 bytes. Unknown pseudo-op: data2. Is it called something else on this platform? .word, apparently. And I bet that data1 is going to be .byte. And of course, I forgot that a semicolon is not the comment character. data1 is probably .byte. Exclamation mark is not a comment character. Cannot resolve O. OK, we have a CRT header. No actual code, but. So we go back to our test program. Right, because this bit's a little bit subtle. We need to wipe the BSS. This is the portion of the data which is guaranteed by the C standard to be initialized to 0. And the way this typically works is it's something like this. Just a simple loop. We are doing it four bytes at a time. So what can this compiler turn this into? A fairly small amount. So MOVI into A2, which becomes bss start; A3 becomes bss end. Did I write that code correctly? This is interesting. I don't see an add.
I don't see it incrementing the pointer. Unless mov.n is doing it, but doesn't that just copy registers? Start at bss start. Keep going while the pointer is less than the end. And advance the pointer. So what was happening is the compiler had noticed that I was setting the pointer to 0 rather than the thing being pointed at to 0. Therefore, this addition was adding a value to something it knew was always going to be 0. Therefore, it was always setting it to 4. Sometimes compilers are smarter than they really ought to be: it has turned the thing into a call to memset. That's not what I wanted. I actually wanted it to do the code that I told it to do. Does this work? No. See, the reason why I don't want to call memset is, well, I could make it call memset. But honestly, this setup code, there's quite a lot of it, and I just don't care. We don't need to set up the stack frame, because there isn't one in this routine. There will be a way to stop it from calling built-ins. But it would be far too useful for GCC to actually list all the command line options in the man page or in the help. No, of course, we don't have a man page for these. That would be useful. OK, I'm going to have to do this myself then. So let's look at what this actually made. That's not useful. Hang on. I can nerf the memset optimization by doing that. That's better. OK, so A2 is our running pointer. A3 is the boundary at the end. A4 is a handy 0, which we're going to be using to actually write the thing. So BSS might be empty, so let's do the conditional first. If our running pointer is greater than or equal to the end, jump to the end. Otherwise, write our 0 to the running pointer location, increment the running pointer by 4 bytes, and loop. Right. So our BSS should be ready. The next thing we want to do is to initialize stdio, which is just a simple call like so. Now argc and argv are on the stack. So there will be something on the stack. There will be valid data there.
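The BSS-clearing loop described above can be sketched in C as follows. `clear_words` is a hypothetical name, and the `volatile` qualifier is just one way of keeping an optimizing compiler from replacing the loop with a call to memset; the session does not say which trick was actually used.

```c
#include <stdint.h>

/*
 * Zero every 32-bit word in [start, end). The test comes first, so an
 * empty range writes nothing at all.
 */
static void clear_words(volatile uint32_t *start, volatile uint32_t *end)
{
    volatile uint32_t *p = start; /* running pointer (a2 in the listing) */

    while (p < end) {             /* end is the boundary (a3) */
        *p = 0;                   /* clear what p points at, not p itself */
        p++;                      /* advance by one word, i.e. four bytes */
    }
}
```

In a real start-up file the two bounds would come from symbols defined by the linker script rather than from function arguments.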
The MSP430 has push and pop instructions. The way this one works is that argc and argv are values on the stack. And once they've been removed, what's left is the environment structure. And then above the environment structure will be argv itself. Now, we don't have push and pop. So what we're going to do instead is... we want to load... just wondering what order they are in. OK, R12 must be argc. So that'll be R2, stack frame offset 0. So that is argc. That is argv. Therefore R4, which is going to be our environment pointer, is the word above that. And is that... I think that's valid. The 16-byte alignment requirement is going to be a bit irritating. OK, so we actually want to store this. So S32I R4, R5, 0. So we're going to keep argc and argv in the first two parameter registers for calling main. And we're going to write the environment pointer into a global variable for access. It also gets passed in to main as the third parameter that no one ever uses. And we set the return address to be the exit function and call main. And that should be our C runtime. And registers start with an A: bad register name, right. That's not callx0. That is call0. That's also an A. Unknown opcode or format name 'call'. That should be a hash. OK, now we've got our C runtime. Good. So let's commit this. Are there any files that we need to add? Don't see anything offhand. Right, the next stage is we wish to build our applications, which live in Applications. Should we restart this? That will change the root directory of the tree viewer. There's a key to do it. I can't remember what it is. So we're not going to build everything just yet. We want init. So I'm going to have to build the util block. util contains the base Unix utilities. And the one we care about is init here, which is surprisingly complex. So we need a makefile. So let's copy the good old 68000 one to esp8266. And this should all look very familiar. This one is 6. Oh, yeah, and this one. No builtins, no soft float, no 68000.
And I think we do want that. What is actually in these things? Oh, right. It's stuff that I'm actually going to have to implement. When linking stuff, we do need the C library. We want libgcc. And this is the linker script that will actually generate the executables we want. And we can't use the standard way of doing it, because of our weird memory architecture. esp8266. And this is going to have to be different. Let's just change that to false for now. That will cause assembly to fail. OK, so what happens if we actually try to build this? OK, that looks sensible. So it compiled the C file with no warnings. It then failed because it couldn't find the linker script. Sorry, I got called away, so I'm trying to remember where I was. Yes, I was about to start work on the linker script. So normal platforms, that is, the traditional Unix style, have binaries that put the various segments all in order in a single address space. So this is the original. Actually, I will hold to esp8266. So this is putting all the... yeah, this is compiler nonsense we don't really care about. We have... this looks kind of odd to me, actually. Yeah, we have a text section, which appears first in the binary. This contains all the code. It also contains the read-only data. Then we have the data segment, which appears after it. This is read from the binary into memory to initialize all the variables assigned to the data segment. Then after that, we have the BSS segment, which contains zero-initialized variables. This does not get stored in the binary at all, because the values are all zero. After that is the heap. I also see a stack there, yes. This is a surprisingly complicated linker script, given how simple everything is. All right, let's do our version of this. So we've got two address spaces. We've got data and code. So our data lives here. So that's 3FFE8000, and it is 16K long. And we have a code memory, which lives here. And that is at this address, and it is 31.5K long.
So this should actually be fairly straightforward. The text segment goes into this. This looks like an ARM thing that we don't want. This is all that's going to go into the code. I'm not sure what this is for. Let's get rid of this address. This appears at the very beginning, so we don't need the alignment. We do. This looks... where is the actual code appearing? Oh, here. Here it is. And you know what? I'm going to chop out all this stuff. This is mostly used by C++, which, frankly, I don't really care about. So we're going to go with the simple version for brevity. We have no exidx. OK, we've got the data segment. The data segment also does not need to be aligned, because it's appearing at the beginning of this section. Our system doesn't have a GOT. Yeah, we don't want any of this stuff. Right, data. Firstly, let's put all the read-only data into the data segment. Then we put the read-write data into the data segment. That is a MicroBlaze thing. We don't care about any of this. Let's see. Is this actually used anywhere? No. Let's lose this. C++ constructors. No idea. No idea. Oh, initializers and finalizers. Yeah, let's lose all that. And this goes into the data segment and phdr. And the exception table stuff, don't care. Right, the BSS. This is the uninitialized data. So align, let's start. OK, that looks correct. And this also goes into the data segment, I think. There should be a way to prevent that from actually being emitted. I think I need NOLOAD. There we go. BSS: let's get rid of the explicit aligns. That's align 4, NOLOAD. Following this is the heap, which occupies the remainder of the space. Though we do actually want a stack. Although, now I think of it, I've forgotten about the udata block. The udata block now lives in kernel memory, so it doesn't appear in the data memory at all. So let's get rid of that, lots of junk. Let's leave the debug stuff, because this won't actually go into the final executable. But it will exist in the ELF file.
The stack, the user stack used by the process: where do we want to put it? Normally, the stack goes at the very top of memory. But I think I will actually put it in the BSS area. Not quite in the BSS area. Let me just quickly go look and see how the malloc routine figures out the top of the heap. Malloc is subtle and annoying. So right, it calls sbrk. OK, let's put the stack right here. It needs to be 16-byte aligned, not loaded. And let's make it 1K. That's going to be too much, but it should do. Where does the stack pointer get initialized? Probably the top of RAM. OK, so each process needs to have its own user stack, and this gets set up by the kernel. I'm just trying to remember where it gets set. Ah: place argument, environment, and stack at the top of user space memory. OK, so we do not need a stack block. We're always going to put it at the top of memory. OK, so let's try that make again and see what happens. Syntax error. The LD script syntax sometimes requires semicolons and sometimes requires the absence of a semicolon. Literal placed after use in header. OK, oh, that's awkward. Right, the L32R instruction, which we are using in many places, but in this particular situation we are using it in the header block. These turn into L32Rs. It loads a 32-bit constant out of a constant pool into a register. The L32R instruction takes an offset to subtract from the program counter. This means that the constant pool, or the literal pool in their terminology, has to appear before the use. Now, the header block has to be first because it's the header block. But the header block's literals have to go before it, and there isn't anywhere to put them. So we are going to have to change this code. And let's just line these things up. Actually, where is that tab stop? Now, the problem here... well, what we could do is to simply have something like this.
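As an aside on the L32R restriction just described, here is a minimal Python sketch of the reach check. It assumes the usual description of the encoding: a 16-bit word offset that is always negative, measured from the instruction address rounded up to a word boundary, with the optional literal-base mode ignored. The function name `l32r_can_reach` and the example addresses are made up for illustration.

```python
def l32r_can_reach(pc: int, literal: int) -> bool:
    """Return True if an L32R located at `pc` could load the word at `literal`.

    Assumption: the offset is a 16-bit field, extended with ones and shifted
    left by two, added to (pc + 3) with the low two bits cleared.
    """
    if literal % 4 != 0:
        return False                  # literals must be word aligned
    base = (pc + 3) & ~3              # instruction address rounded up to a word
    offset = literal - base           # always negative for a valid L32R
    return -0x40000 <= offset <= -4


# A literal placed shortly before the instruction is reachable...
print(l32r_can_reach(0x100, 0xF0))    # True
# ...but a literal placed after it is not, which is the header problem:
# code at offset 0 has nowhere earlier to keep its literals.
print(l32r_can_reach(0x0, 0x10))      # False
```

This is why anything that must sit at the very start of the image cannot use a literal load of its own.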
So this starts a new section. This is normal code. It can appear anywhere in the image. This is the header block and has to be first. The problem is that this is not an address. This is an offset, which can only cover 256 bytes. So one thing we could do is simply have this. Or we can put this and its literal block in its own section and make sure those appear right after the header, thus making sure that the entry point is within range. But I think for simplicity we'll do this. Actually, we don't want that to be a global. 62: junk at end of line. Now, what does this do? .text is not within region code. Yes, it is. Oh, address is not within region code. Also, I've remembered that there is something else I need to do, and I should do this to the kernel as well. In the utils Makefile for the ESP8266, we want another couple of compiler flags: -ffunction-sections and -fdata-sections. What this does is force each function to appear in its own section and each variable to appear in its own section, thus allowing the linker to do a much better job of optimizing things. Clean, make. That wasn't quite what I was expecting. Yeah, we need to do that with the library as well. Okay, and now we should be able to do this. Intriguing. One of the side effects is that the section names get suffixed with the function or data variable name. That makes it much easier to figure out what these are referring to. Okay, BSS end: we just haven't set this, so end equals dot. Undefined reference to sighandler. That's the signal handler. Is it actually defined somewhere? Oh, it needs to be in the... Where is that defined? Right, that's new since the MSP430. Let's take a look at the 68000. Uh, no. Let's take a look at the 8080. Right, here is the signal handler. I have to write it myself. Signals are called indirectly from the kernel through code which saves the non-reentrant OS glue. This needs to save everything onto the stack.
Yeah, let's just stub that out. That's a special instruction that goes to the debugger, if there is one, which there isn't, so it will cause a fatal exception. These need to reference the kernel. These need to reference the ROM, so these live... Now, we could simply do PROVIDE lines, like we did for the kernel. Yeah, here. I do not believe they will be in range for the linker for a standard call, so yeah, that's not going to work. So what instead we're going to have to do is... global udivsi3. Actually, that ain't going to work. Let me just... It might work. Jump has a longer range than call. So... Hang on. That's the wrong address. E268. modsi3. divsi3. Alignment, alignment. Dangerous relocation: cannot encode. Yeah, I think that's still out of range. Okay, we're going to have to do it the hard way, and we need a temporary register to put this in. I don't believe A6 will be in use, so I think we should just be able to do this. Okay. I mean, it might not work, but... But now we still need to figure out what's going on here. Can we get more information? Can we get a map? That wasn't really what I was expecting. Okay, so A4A is the end of memcpy. Huh? Well, here you can see the sections that each function goes into. So A4A is... Wait a minute. These are code addresses. These are data addresses. These should be... I think these are all just wrong. See, these are not the right addresses for... But it does know where the sections live, because it's displaying them here. Where's that linker script gone? So text goes to the code section. This goes to the data section. Data section. Junk. These must be prior to relocation. It doesn't make any sense otherwise. I have told it they're supposed to be in the right section. What's one of the other addresses? 1F60... 1F60. BSS. Not within section data. So I can see all the variables are here. So what is going on? I'm going to go away and do a little bit of research. Okay, well, I have made it work.
The trick is to tell it that the text section starts at the origin of the code memory area, and the data section starts at the origin of the data memory area. But I do not recall having to do this in the past. So either I'm misremembering, and this isn't how it actually works at all, or something else is going on. But I can now see in the map file, here, the addresses of the various things. The header is at 401... Wait a minute. Oh, that's nasty. Okay, so the header needs to appear at the beginning of the file, but it does not actually contain anything that's used at runtime, so it doesn't need to be loaded. I think it is loaded just because it's easier that way. So let's take a look at that 8080 version again. Okay, so the header does appear at the beginning of the text segment. Right, this means that my header needs to appear at the beginning of the text segment, so that it appears at the beginning of the file. As it's not used at runtime, we could actually just not load it. But anyway, that should work. So here is our BSS. This is not a big program. Here is our data, which is immediately preceding it. Here's the beginning of the data, at the beginning of our data memory area. Here's the end of our code. I will actually put in a text end equals dot. Yep, okay. I think that's working. Let's just try running the makefile and see what happens. Right, I disabled elf2flt. I've been thinking about how this works, so let's just put it back and see what happens. We can look at the binaries. I think what it does is it just pulls the appropriate phdrs out, the program headers, assembles the binary, and then patches the header. That's the wrong makefile. I want this one, and I want ELF2... Oh, these are commoned out. That's nice. Okay. rules.esp8266. So I believe this is all common. Let's just take a look at the 8080 version, which seems to be pretty modern. Seems to be missing quite a bit of stuff. This is an Amsterdam Compiler Kit platform.
It doesn't use the... It doesn't use GCC. That's also different. Okay. Let's take a look at the 68000 one. ELF2FUZIX is elf2flt. So elf2flt -s. I think that's the stack size. Let's make a small stack. Okay. So here we replace all this with include $(FUZIX_ROOT)/Applications/rules.esp8266. Okay. Now let's run the makefile and see what happens. I think we haven't set FUZIX_ROOT. Let's do this instead so we don't need to. Really? Rules with an s. I've had dinner and I've got my tea. Why am I making these mistakes? Okay. elf2flt. Where is elf2flt? Apparently it's nowhere. Is this a standard tool? No. Okay. I saw a reference to another thing called binman. So what does this do? User space binaries, doesn't have the common packing magic. Yeah. I don't like any of that. Okay. We're going to have to do this the hard way. So instead of calling elf2flt, we are going to do objcopy. The output target is binary. I think that will do. Input and output. Does this really not support the standard -o for the output file? Apparently it doesn't. Okay. So where is ELF2FUZIX used? Okay. Right. We can change this. All right. That built and linked a thing. So we now have a binary called banner. This doesn't look anything like right. 12 megabytes. Sorry, 1.2 megabytes. Yeah. That hasn't done what I wanted. What I was wanting to do is to tell it to convert the ELF file to a binary file by simply concatenating the various phdrs. But what it actually seems to have done... these are strings. These are the... Oh, this is the data. I think it's actually done more or less the right thing. So the data segment appears first in memory order, so objcopy has put that first in the file. The code appears 1 megabyte further on. No, it doesn't. The code appears at a slightly different... Hang on. We've got our LD file here. The code appears 1 megabyte plus 16K further on. No, apparently that's wrong too. The code is in here somewhere.
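The oversized file makes sense once you remember that a raw binary image spans everything from the lowest load address to the highest, with the gap filled in. Here is a rough Python sketch of that arithmetic. The data address is the one from the linker script earlier; the code address and both section sizes are assumptions picked only to illustrate the effect, and `flat_binary_size` is a hypothetical helper, not a real tool.

```python
def flat_binary_size(sections):
    """Size of a raw image covering every (load_address, size) pair.

    A raw binary has no headers, so the only way to keep sections at the
    right relative positions is to pad the gap between them.
    """
    start = min(addr for addr, _ in sections)
    end = max(addr + size for addr, size in sections)
    return end - start


DATA = (0x3FFE8000, 0x1000)   # data origin from the linker script; size is illustrative
CODE = (0x40100000, 0x2000)   # ASSUMED code origin and size, for illustration only

size = flat_binary_size([DATA, CODE])
print(f"{size} bytes, of which {size - 0x1000 - 0x2000} are padding")
```

Even with a few kilobytes of real content, the image comes out at over a megabyte, almost all of it padding between the two memory areas.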
It's here at the end of the file. And here is our header. So we actually want something slightly cleverer. All we need to do is to take the text segment and the data segment and concatenate them. That should work. Can I make objcopy do that? Okay, more reading time. So, when in doubt, write it yourself. Here is my version of elf2flt as a very small shell script. What it does is use objcopy to pull out the text segment, objcopy again to pull out the data segment, and then it concatenates them together. I just need one tweak: I need to tell it what toolchain to use. Let's clean and rebuild and see if it works. Okay, well, we have an obvious compilation failure there, but it has actually built several things. So if you take a look at our banner executable, this is the thing that our platform should actually run. We can see here at the top is our header. I don't think it knows how long a word is. Let's try a short instead and see if that works better. And that one actually turns out to be correct-ish. Okay, clean, build. Okay, that's better. So here is our magic number. Here is the LX106 CPU identifier and the ESP8266 feature byte. Base page byte, hints byte, code size, data size, BSS size, offset to the entry point, size hint, stack hint, zero page hint, address of the signal handler. Yeah, looking at this and thinking about it, I'm wondering if this is in fact the wrong binary format for the platform, because these pointers are actually two bytes long in the standard header. You know what? We're going to have to write our own loader anyway, so I'm just not going to worry about it. This is going to be a different format. So what does the 32-bit loader do? Right, uClinux binflat. Yeah, that's what the... So the elf2flt program that we don't have is actually from the uClinux code base, and it turns the ELF file into a uClinux format executable. Right, so that actually means that we need to rename this and change our rules.
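The extract-and-concatenate idea can be sketched in a few lines of Python. This is not the shell script from the video, just an illustration of the same steps. The section names `.text` and `.data`, the tool name `xtensa-lx106-elf-objcopy`, and all function names here are assumptions; only the `-O binary` and `-j` options are standard objcopy flags.

```python
import subprocess
import tempfile
from pathlib import Path


def objcopy_args(objcopy, section, elf, out):
    """Command line that dumps one section of an ELF file as raw bytes."""
    return [objcopy, "-O", "binary", "-j", section, elf, out]


def concat_segments(text: bytes, data: bytes) -> bytes:
    """The flat image is simply the code followed directly by the data."""
    return text + data


def elf_to_flat(objcopy, elf, out):
    """Extract .text and .data separately, then join them with no gap."""
    with tempfile.TemporaryDirectory() as tmp:
        parts = []
        for section in (".text", ".data"):   # assumed section names
            part = Path(tmp) / (section.lstrip(".") + ".bin")
            subprocess.run(objcopy_args(objcopy, section, elf, str(part)), check=True)
            parts.append(part.read_bytes())
    Path(out).write_bytes(concat_segments(*parts))


# Hypothetical usage, assuming the cross toolchain is on the PATH:
# elf_to_flat("xtensa-lx106-elf-objcopy", "banner.elf", "banner")
```

Pulling each section out on its own avoids the padding problem, because neither raw dump contains the gap between the two memory areas.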
We can keep that. Okay, you're going to make... Yeah, those are words. Now, we're not using binman anymore, which means that we can't rely on binman to initialize these. So we're actually going to have to do this ourselves. We need to fill out the header with the sizes of the various sections. And BSS start and BSS end. So in our CRT, code size is text end minus text start. And I really hope that the linker understands how to do this. Data end minus data start. BSS end minus BSS start. Right, fantastic. Can I do this? Great. Okay, so some toolchains are capable of representing pointer differences like this in the relocatable object file format, which means that this will actually get resolved by the linker and turned into a simple value. But it looks like we can't do this in this situation. So we are going to have to do something else, such as moving all of this into the linker, which does know this information. So our header segment now just contains a jump instruction. This will be the first, probably, three bytes in the executable. It will then be followed by the literals for our entry routine here, and then the entry routine. Okay, and now in our linker script here, we're going to manually emit all the various bytes that we need to generate the header. I will need to go look up what the various commands are. Okay, that's not complicated: BYTE, SHORT, LONG and QUAD. So the first thing that appears is... and I don't actually know what the comment character is. C-style. Fuzix executable, followed by a byte, which is 11, the CPU, followed by a one, which is the ESP8266, followed by a zero, followed by another zero. All right, followed by... is the word correct? No, it should be long. There we go. Okay, followed by the entry point routine, no size hints, no stack hints, no zero page hints, followed by a pointer referring to the signal handler.
Now, I'm not quite sure whether the text start should be here, meaning that the value is just the amount of code to load, not including the header, or whether the text start should be here, meaning that it does include the header. I'll do it like this to start with. So, define the symbol entry, make entry global, and our signal handler is also global. Okay, all right, that actually seems to be behaving itself. Let's take a look at the binary. Where did I put it? Here. So we have the magic number, CPU and feature, base page and hints, length of the text section, which in this case is 09F0 because this is little endian, length of the data section, 1514, these two bytes and these two bytes, then the length of the BSS. 84 is the entry point. That seems like a large number to me. No, it's right, it's here. You see, these are addresses in the literal table, and this is code. 0, 0, 0, and 40100AD is the address of the signal handler. So I would be slightly happier if that were aligned. Yeah, these are going to turn back into shorts. So that's back the way the... yeah, here we go. 090E14155C03. As we have 64K of data and 31.5K of code, shorts will fit those, and this actually makes the header the same as it is in the other systems that use this format. I think I got away with that sentence. Right, and that makes the signal handler pointer word aligned. Good. Okay, so what is the problem here? We have divdi and divsi. Well, we're going to have to put more of these in. You know what? I am going to try and remember how to do macros. Backslash 1, dollar 1. Let's try backslash 1. Helper udivsi3, 0x4...E21C. Helper umodsi3, 0x4...E268. Does that build? No. Too many positional arguments. Name, address. Backslash name. Did that do the right thing? Yes. I actually remembered how that worked. Right, so we need helpers for divdi3. I thought I still had the ROM's assembly. I know where it lives. divdi3. And divsi3.
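For anyone who wants to poke at a header like this outside a hex viewer, here is a small Python sketch that decodes the fields in the order read out above. The exact layout is an assumption: a two-byte magic, byte-sized CPU, feature, base page and hints fields, 16-bit sizes, four single-byte hints, then a 32-bit signal handler pointer, all little endian. The field names, the magic value and the sample numbers are illustrative only.

```python
import struct

# Assumed layout: magic, cpu, feature, base page, hints, text/data/bss sizes,
# entry offset, size hint, stack hint, zero-page hint, signal handler pointer.
HEADER = struct.Struct("<HBBBBHHHBBBBI")
FIELDS = (
    "magic", "cpu", "feature", "base_page", "hints",
    "text_size", "data_size", "bss_size",
    "entry_offset", "size_hint", "stack_hint", "zp_hint",
    "sighandler",
)


def parse_header(blob: bytes) -> dict:
    """Decode the first HEADER.size bytes of an executable image."""
    return dict(zip(FIELDS, HEADER.unpack_from(blob)))


# Illustrative values only; 0x1234 is a placeholder, not the real magic number.
sample = HEADER.pack(0x1234, 11, 1, 0, 0, 0x09F0, 0x1514, 0x035C,
                     0x84, 0, 0, 0, 0x401000AD)
for name, value in parse_header(sample).items():
    print(f"{name:13s} {value:#x}")
```

With 16-bit sizes the fixed part of this layout is 16 bytes long, so the 32-bit pointer that follows lands on a word boundary, which matches the alignment point made above.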
divsi3. Okay. So build that. Build that. umodsi3. Calling the ROM version of these helpers is a really good idea. Ooh, I actually got somewhere. Right, because using the version in the ROM means we're not using up any of our vital code space with them, which is extremely nice. Okay. We need to define that header, the library include for setjmp. Now we need to know what the compiler definition is for this platform. Is it lx106? No, it is not. How can we find out what it is? Do we see anything? This shows the commands given to the underlying compiler. That's not necessarily useful. Yeah. If you run this with more verbosity, does that help? Not really. There may be something in here that might be of use, but I suspect probably not. Yeah, I'm actually going to have to go away and research what this thing is. It may or may not be in the documentation. Probably won't be here. No. Oh, hang on, hang on. -dumpmachine: not useful. -dumpspecs: right, that looks more useful. Pipe that into less. No define. cpp options: nothing there. Debug options, cc1 options: this is kind of where I would expect to find this sort of thing, but it is not there that I can see. Okay, I'm going to go away and look this up. Okay, I found out how to do it, which is you invoke the -dM -E options and give it an empty file, and this produces 263 different hash defines. If we look around, we can see here are some Xtensa definitions. So the one we probably want is not Xtensa, because that's the generic one. What we want is this one, because if we are using the windowed ABI, we do not want to use this particular setjmp definition. So our jmp_buf is going to be, I believe it was six... two, three, four, five, yep, six, just checking. And so jmp_buf and attribute, this. Okay, so now let's try running that makefile again, if I can find it. Okay, we've got some more missing helpers: udivdi3.
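The -dM -E trick is easy to script if you ever need to hunt for a platform macro again. Below is a minimal Python sketch. The compiler name in the usage comment and the sample macro list are assumptions for illustration; `-dM`, `-E` and `-x c` are standard GCC options, and `/dev/null` stands in for the empty input file.

```python
import subprocess


def predefined_macro_command(cc):
    """Command that asks the compiler to print its predefined macros."""
    return [cc, "-dM", "-E", "-x", "c", "/dev/null"]


def parse_defines(cpp_output: str) -> dict:
    """Turn '#define NAME VALUE' lines into a {NAME: VALUE} dictionary."""
    macros = {}
    for line in cpp_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
    return macros


def matching(macros: dict, needle: str) -> dict:
    """Keep only macros whose name contains `needle`, ignoring case."""
    return {k: v for k, v in macros.items() if needle.lower() in k.lower()}


# Illustrative sample, not captured from a real toolchain run.
SAMPLE = "#define __STDC__ 1\n#define __XTENSA__ 1\n#define __XTENSA_CALL0_ABI__ 1\n"
print(matching(parse_defines(SAMPLE), "xtensa"))

# Hypothetical real usage, assuming the cross compiler is installed:
# out = subprocess.run(predefined_macro_command("xtensa-lx106-elf-gcc"),
#                      capture_output=True, text=True, check=True).stdout
# print(matching(parse_defines(out), "xtensa"))
```

Filtering by name makes it quick to see which ABI-specific macro is set, which is the thing a setjmp header would want to test.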
udivdi3 is here. muldi3, muldi3 is here. In fact, there's a bunch of moderately useful-looking libc things that we could use, and we'd have to disable the Fuzix libc ones. We would also kind of have to hope that any storage they used was compatible. For example, srand here, the random number generator, is clearly storing its seed at this address, so we would have to make sure to initialize it on process startup and save it on context switches. Yeah, not going to do that. umoddi3, umoddi3. So that actually got quite a long way. Undefined reference to htons, or rather, host to network order. Right, this appears to be a... yeah, this is nearly always done in machine code, and only compiled and generated on little-endian platforms. I can't remember if the LX106 has endian swapping. What was that, byte swap? I don't know. This is not what we want. Let's try endian. Yeah, there's a big-endian version of the platform. Okay, there's clearly not one, so let's just add the C version to our list. htons. Let's go build that. That's the wrong list. Come on. This one is a much messier one. Okay, htonl will be the same.
Okay, right, this is now trying to do the nostdio version. This is for programs that don't use stdio. The problem is that in our CRT here we actually have a reference to the stdio library, so just having this present will mean it always gets initialized, which means it always has to be linked in even if you're not using it. The way Fuzix does this is you can have two different CRTs: one which calls this and one which doesn't. So if you don't want stdio, use the one that doesn't call it and get smaller binaries. Okay, that's worth doing. This could all be factored out, and we could refactor some of this as well, but honestly it's not worth it. Okay, so we're going to turn this into... we actually have... okay, that's just for system calls. So we're going to do that. This is going to contain this stuff, including the signal handler. Okay, so the CRT no longer needs anything from here down. We then copy this to the nostdio ESP8266 .S file. We edit the... so this is a crt0 nostdio. Where are we going to put our helpers? Oh, we own this file. I kept thinking this was a common one. So, a new helpers ESP8266 .S file. Okay, let's edit this to take out the stdio. All right, build this. sighandler referenced in expression. This kind of suggests that our helper here has been assembled. Is it then included in the...? Yeah, here it is in the helpers. Don't think it's that. Well, let's just... yeah. Okay, undefined symbol sighandler. Right, that's probably because symbols referenced in the linker script don't cause things to be pulled in, and it just so happens that this may not be calling anything from our helpers file. So let's just take that out and put it right back in here, and here. We'll have to deal with that properly at some point when we actually implement signal handling. Did I just add that twice to the same file? Yeah, apparently I did. I'm not sure ESP8266 supports signal handling yet. A vague memory it's been worked on; could be wrong.
Okay, we build, build, build, build more, build... done. Fantastic. Right, so these are all executables which we could theoretically run on our platform. How big is our init? Well, let's find something small. decomp, I think it was. This is a decompressor utility. Looks like a binary. Do we have...? Yes, we've got true. true is incredibly simple: all it does is return a successful error code. I wonder what's actually in that. Yeah, that's about as simple as you can possibly get. It won't have the libc in it. So what we've got is the CRT code, some literals, the actual main function itself, exit, __exit, a system call routine. But yeah, it's 186 bytes, which is not too bad. We've got fsck, the file system checker, which is pretty big, but it's still occupying less than half of our available code space. Quite a lot less, I suspect. Okay, I can figure that out. Here we go: 9K of code, 2K of data, and under 2K of BSS. This is a nice one. I wrote that. It's a Forth interpreter. So 7K of code, 10K of data because it's got the Forth dictionary in it. I'm quite proud of that. It's an entertainingly insane Forth interpreter. You notice this thing at the top here and all these hashes: this is a C file which is also a shell script. You can run it as either. And it's also an awk script, which is embedded inside. And it's also, if I scroll down... not quite that far... oh wait, it is further down than that. It's been a while since I looked at this. But it's also a Forth compiler. When you run it as a shell script, it uses the awk script to read the source code, find these comments, compile each of these Forth words into bytecode, and embed the bytecode into the source file, so that when you then compile it as a C file you get all your pre-compiled Forth words and the big linked list, which is here, which makes up the dictionary. Quite pleased by that.
Anyway, I'm done for tonight. Next time I will actually be trying to load a binary into memory and see if we can make it do anything at all when we run it. We'll have to add a system call handler. We'll probably end up with, at best... sorry, yawning, it's quite late. At best we'll end up with a binary that runs, and then we see the system call handler being hit, and then everything falls over in a heap. Not having a debugger is going to be exciting for this bit. But let's commit all this. I think that's everything. Let me just check. Nothing which looks... oh, it's this. Yeah, I think that's it, and we are probably done for the day. There, I was just building the Utils directory, which contains most of the executables, but there's loads of other stuff in here, including a port of Colossal Cave that we can run on our embedded system, and some ancient languages (I believe this is PILOT), some stuff pulled from the Minix application set, this is a vi clone, there's even some games, a BASIC interpreter. All right, I am going to do something that doesn't involve more coding for a while. I hope you enjoyed this video. Please let me know what you think in the comments.