All right, welcome everyone. So today, we're going to talk a little bit about Game Boy programming, or actually about discovering how Game Boys work, using Forth. First, let us introduce ourselves. This is my colleague, David. I'm Tijn. We are working at a consultancy company in Amsterdam, Reaktor. And our main work at the moment consists of JavaScript programming, not always the most challenging stuff. So we decided to start a sort of after-hours hacking group called the Amsterdam Hackers with a few other colleagues.

So for the first project, we were thinking about what would be interesting to do, something novel hopefully, and something that we can learn something from. And we started playing around with writing some emulators. I started writing a CHIP-8 emulator, which is the most obvious place to start, I believe. And we were thinking, how can we make this into a bigger project? So we were thinking maybe we could do a Game Boy emulator, but that's quite a saturated field already, and I don't think we would be able to compete with some of the ones out there. A Game Boy game has also been done quite often. And eventually we settled on writing a compiler for the original Game Boy.

Okay, so as Tijn said, the Game Boy has been studied quite a lot already. So you can find plenty of manuals and tutorials about how to write games, how to use assembly and everything. But the key point of the project is to try not to use any tooling. We need the manuals; we cannot really write the compiler without the documentation on the hardware. But we try to avoid using emulators or debuggers or anything else as much as possible. At some points they were useful, but they should be something that isn't completely necessary.

Okay, so let's talk a little bit about the Game Boy architecture. The CPU is a custom mix between the 8080 and the Z80 CPUs.
The internal clock runs at four megahertz, but every instruction takes a multiple of four cycles. So in practice, the CPU runs at one megahertz. There are 32 kilobytes of addressable RAM, but only four kilobytes are working memory that you can actually use. There are another four kilobytes, but that is actually hosted in the cartridge and depends on the functionality of the cartridge that you're using. So there are eight registers, A to F. Some of them you can combine and use in a 16-bit way, but they are actually all eight bits, apart from the program counter and the stack pointer.

Then we have the video hardware, similar to the Atari talk that we saw this morning. You have a scan line that goes left to right and top to bottom. You have some space and time between each line, that is the horizontal blank, and between the last line and the top line, the vertical blank. The interesting thing is that the video memory is actually unusable while the hardware is drawing. So you have to synchronize your code with the hardware to only write to the video memory in the vertical or horizontal blank. This is quite tricky to get right, and what we saw is that if we start by writing the compiler, it's really difficult to get any feedback. You have to get pretty far just to get something on the screen. Additionally, there are some other subsystems, like sound. You also have input and timers, but those are something that we can work on later, and initially we focused on the video.

Okay, so how did we start? We read the manual, but as I said, it's not easy to write the compiler from scratch. The feedback loop is going to be pretty long, it's not incremental, and we would probably need some debugger or existing tooling to assist with this.
So instead, we decided to start with a working game that, as Tijn will show you later, was a hello world, the simplest ROM that we could find, and we're going to reverse engineer this binary using Forth into a readable program that you can actually modify later on. Then on top of that, we will write the compiler. Tijn will show you the steps more carefully later.

But let's explain Forth a little bit. Is there actually anybody in the audience who has used Forth before? Right, nice. So Forth is basically the simplest language that you can imagine, or at least it is for me. There is no syntax. Your code consists of space-separated words. That's all. You have numbers, and you have other words. Numbers push themselves onto a stack, and the words execute some action, usually taking parameters from the stack and pushing the result back onto the stack. You can, of course, then define your own words.

So in this example, five is pushed to the stack. We push one to the stack. Plus will pop two arguments from the stack and push the result, and dot will pop the result of plus and print it on the screen. The interesting thing here is that you can take just "one plus", or any subsequence of words, and extract it into your own definition. So we define here inc, increment, as "one plus". This property, the ability to extract any subset of words into your own definition, makes the language concatenative, and that's a property that's going to be very, very useful for reverse engineering the ROM, as Tijn will show you. Additionally, if you carefully pick the words that are going to make up your language, you can define domain-specific languages to model what you are trying to do. And Tijn will show you a very cool example of that as well, in the assembler. Right, so I will let Tijn explain now how this works in practice. Hopefully.
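To make the stack model and the "extract any subsequence into a definition" property concrete, here is a minimal sketch in Python rather than Forth. It is an illustration only, not the implementation used in the project: the function name `run_forth` is hypothetical, and it supports just numbers, `+`, `.` and colon definitions.

```python
def run_forth(source, stack=None, words=None):
    """Evaluate a tiny Forth subset; return (stack, printed output)."""
    stack = [] if stack is None else stack
    words = {} if words is None else words  # user definitions: name -> list of tokens
    output = []
    tokens = source.split()                 # no syntax, only space-separated words
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == ":":                      # ": name body ;" stores a definition
            end = tokens.index(";", i)
            words[tokens[i + 1]] = tokens[i + 2:end]
            i = end + 1
            continue
        if tok in words:                    # a user word is spliced in place of its name
            tokens[i:i + 1] = words[tok]
            continue
        if tok == "+":                      # pop two arguments, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == ".":                    # pop the top of the stack and print it
            output.append(str(stack.pop()))
        else:                               # anything else is treated as a number
            stack.append(int(tok))
        i += 1
    return stack, " ".join(output)


print(run_forth("5 1 + .")[1])
print(run_forth(": inc 1 + ; 5 inc .")[1])
```

Both calls print the same result, because `inc` is literally replaced by `1 +` before execution. That textual substitution is the concatenative property described above.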
Okay, so we settled on the language that we would use for this project, and we looked around on the internet for the simplest case of a ROM that we could find. Just a ROM that prints hello world to the screen and doesn't do anything else, it just stops there. This was already something that we had no clue how to make. So we just started from the ROM. This is part of the hex dump of the code. Obviously it's completely meaningless to us at this point, but we knew that this was a working ROM. To make sure that we would keep this ROM intact over the course of the process, we kept a hash of the file somewhere and verified at every step we took that we were still compiling the same ROM.

So we changed this file, pretty much just adding the word C-comma after every byte. C-comma is a word that takes its argument from the stack, so pretty much the value that comes before it, and writes it to a file, or ROM, basically. So in a way, this was our first version of the compiler. It would only produce a hello world ROM, but it's executing and producing something that actually runs on a Game Boy. Actually, at this point we didn't try to run it. But it's still kind of a meaningless sequence of numbers.

So we went to look for the offsets where certain data is stored. The Game Boy documentation and other wikis that we could find online show that a quite distinct part of the cartridge is the header, which contains information like the title of the ROM, a checksum to make sure that all the bits are intact, manufacturer information, et cetera. We identified the offsets of these, and we can see here, I think, the Nintendo logo on top, and at the bottom we have the example title for our ROM, and we can simply extract these into new words.
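The "C-comma after every byte" step and the hash check can be sketched as follows, again in Python as a stand-in for Forth. The helper names are hypothetical, the `$XX` hex notation follows Gforth, and the six bytes are placeholder data, not the ROM used in the talk.

```python
import hashlib


def to_c_comma_source(rom: bytes) -> str:
    """Turn raw bytes into a Forth-style program: one '$XX c,' pair per byte."""
    return " ".join(f"${byte:02X} c," for byte in rom)


def run_c_comma_source(source: str) -> bytes:
    """Interpret that program: hex numbers are pushed, 'c,' pops one value and appends it."""
    stack, rom = [], bytearray()
    for token in source.split():
        if token == "c,":
            rom.append(stack.pop() & 0xFF)
        else:
            stack.append(int(token.lstrip("$"), 16))
    return bytes(rom)


def digest(data: bytes) -> str:
    """Hash used to check that every refactoring step still yields the same ROM."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for the real ROM (assumption for illustration).
original = bytes([0x00, 0xC3, 0x50, 0x01, 0xCE, 0xED])
source = to_c_comma_source(original)
rebuilt = run_c_comma_source(source)
assert digest(rebuilt) == digest(original)
```

The point of the hash comparison is that any later rewrite of `source` (extracting words, introducing an assembler) can be checked the same way: if the digest still matches, the refactoring did not change the output.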
So now we have a logo word and a title word that still do the same thing, and due to the concatenative nature you can simply replace those sequences with the new words, and we get something that's a little bit more readable, if you imagine these definitions to be abstracted away somewhere else. We are still producing exactly the same ROM, but we can slowly start seeing a structure in the program. Of course, we keep repeating this for the rest of the program as well. At some point we have all the raw data extracted: some flags that indicate if it's a Game Boy Color game or a normal Game Boy game, the title fields, et cetera.

Then we are left with some bytes that we haven't translated yet, and that's pretty much the program itself, because of course there's no documentation that will tell you what the actual code looks like. So for this we reference the CPU manual, and we find that certain numbers are certain opcodes. Here's an example: the hexadecimal value 3C translates to an increment A, so the A register, and 04 translates to an increment B. So as a first step we can of course replace all these numbers with a new word that emits that number, and that will work fairly well.

But the nice thing is that we can again find patterns in the machine code as well. So what you can do is abstract away the operands of an action. The increment we can define separately and basically combine it with an A and a B register, as shown here. There's not really a meaning to the binary values; it's just that if you combine these numbers in this way, it will end up becoming an increment A or an increment B. And to use it, we can basically just run "A increment" and it will emit an increment A opcode to the ROM.
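One way to picture "combine the increment with a register" is to use the 3-bit register field from the CPU documentation. This is a sketch under that assumption. The slide from the talk is not reproduced here and may have used different intermediate values. The function name `inc` and the module-level `rom` buffer are hypothetical.

```python
# Register codes from the 3-bit register field of the Game Boy CPU opcodes.
# Code 6 is the (HL) memory operand and is left out of this sketch.
B, C, D, E, H, L, A = 0, 1, 2, 3, 4, 5, 7

rom = bytearray()


def inc(register: int) -> None:
    """Emit 'INC r', whose bit pattern is 00 rrr 100."""
    rom.append(0b00_000_100 | (register << 3))


inc(A)  # emits 0x3C
inc(B)  # emits 0x04
```

With the register passed as an argument, the two separate "emit 3C" and "emit 04" words collapse into one definition, which is the abstraction step described above.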
So of course we can continue doing this, and at some point we'll find more patterns in between the opcodes, and we can start to create a sort of DSL, or a pattern matching system, in Forth. So we define here the arrow word and the double-colon word. What it allows us to do is basically write a complete assembler, and if you look at the documentation of the CPU, you will find pretty much the same table that you see in the code here described in the manual as well. So we have an increment instruction, and for a normal register we have a certain opcode, for the specific HL register another opcode, et cetera. I think that's everything we can say. There are some nice additions here, in that we also keep track of the cycles. I don't think we use them, but you can extend that as much as you want and kind of create your own language structure.

So at this point it's really just a matter of a kind of boring process of looking up the opcodes and translating them to the correct assembly instructions, and eventually we end up with a program that looks like this. Again, this is the same program as the hexadecimal code that I showed before, or at least a part of it. So we have a full assembler written in Forth, with postfix notation, and you can pretty much use this to create Game Boy games already. I think a lot of Game Boy games were actually created directly in assembly, so yeah. With the nice advantage that it's still running in Forth, so you get word definitions for free. Whenever you see a sequence of assembly instructions, you can abstract them into a new definition that has a little bit more meaning to you, and we can continue that more and more until we have barely any assembler visible anymore, and a more descriptive program.
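The table-driven idea (one instruction, several operand kinds, each with its own opcode and cycle count) can be sketched like this. It is a Python approximation, not the arrow and double-colon DSL from the talk. All names (`Operand`, `TABLE`, `Assembler`) are hypothetical, and only `inc` and `dec` are included. The opcodes and cycle counts follow the published CPU opcode tables.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Operand:
    kind: str  # "r8", "r16" or "(hl)"
    code: int  # value placed in the opcode's register field


A, B = Operand("r8", 7), Operand("r8", 0)
HL = Operand("r16", 2)
HL_INDIRECT = Operand("(hl)", 6)

# instruction -> operand kind -> (opcode encoder, clock cycles)
TABLE = {
    "inc": {
        "r8": (lambda op: 0x04 | (op.code << 3), 4),
        "(hl)": (lambda op: 0x34, 12),
        "r16": (lambda op: 0x03 | (op.code << 4), 8),
    },
    "dec": {
        "r8": (lambda op: 0x05 | (op.code << 3), 4),
        "(hl)": (lambda op: 0x35, 12),
        "r16": (lambda op: 0x0B | (op.code << 4), 8),
    },
}


@dataclass
class Assembler:
    rom: bytearray = field(default_factory=bytearray)
    cycles: int = 0

    def emit(self, mnemonic: str, operand: Operand) -> None:
        """Pick the table row matching the operand kind and emit its opcode."""
        encode, cost = TABLE[mnemonic][operand.kind]
        self.rom.append(encode(operand))
        self.cycles += cost


asm = Assembler()
asm.emit("inc", A)            # 0x3C
asm.emit("inc", HL_INDIRECT)  # 0x34
asm.emit("dec", HL)           # 0x2B
```

Keeping the cycle count next to each encoding mirrors the remark that the cycles are tracked even if nothing uses them yet. An unsupported operand kind raises a `KeyError`, which plays the role of a failed pattern match.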
So at this point we're still checking that we're producing exactly the same file, so we're quite sure that everything is correct. Of course, we could have some typos in instructions that are not used, but that has been fixed later. So now we actually have a program that we can edit without having to understand at what offset we have to change what bytes, and we can produce a new version of our Hello World ROM with arbitrary text.

Of course, that's not where the story ends, because this is just an assembler in the end. We wanted to take it a bit further and implement Forth completely for the Game Boy. Ideally you wouldn't want to have to work with assembler at all in a project. So there are two ways of doing this. One, maybe the most logical way, would be to keep extracting patterns in your assembly code into macros, and build some libraries on top of that to do various things. Even playing sound you could do with just macros. And you can program your whole game or program in a macro language, basically, on top of your assembly.

Of course, we didn't do that. We went for the hard way, which is to actually parse every definition that you create yourself and store all of the code that you want to write in an intermediate representation. This is a bit more complex, because you're not directly emitting assembly anymore, but it allows you to do things like lazy emitting. So you can create a million definitions, and if you don't use them in your main code, we won't emit them to the ROM, which is quite useful with the 32 kilobytes that you usually have. Other things are optimizations. We have worked on tail call optimizations in this. There are a lot of other things we can do using an intermediate representation. So we basically started working on this.
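Lazy emitting can be sketched as a reachability walk over the intermediate representation: only definitions reachable from the entry point are emitted. This is a simplified illustration, not the project's actual data structures. The `definitions` layout, the word names and the `reachable` helper are all hypothetical.

```python
def reachable(definitions, entry):
    """Return the user definitions needed by `entry`, callees before callers."""
    seen, order = set(), []

    def visit(name):
        # Names missing from `definitions` are treated as primitives or literals.
        if name in seen or name not in definitions:
            return
        seen.add(name)
        for callee in definitions[name]:
            visit(callee)
        order.append(name)

    visit(entry)
    return order


# Hypothetical intermediate representation: word name -> words used in its body.
definitions = {
    "main": ["hello", "halt"],
    "hello": ["print-char"],
    "print-char": ["emit-tile"],
    "unused-sound": ["beep"],
}

to_emit = reachable(definitions, "main")
print(to_emit)
```

Here `unused-sound` never reaches the output list, so it would cost no ROM space, which matches the "million definitions" remark above.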
The idea is that at some point we redefined the colon word, which usually creates a new definition, into something that collects the definitions that you wrote for later compilation to the Game Boy. And slowly we added code primitives in assembler, so basically defining what swap does, what plus does, and so forth. Eventually, once we have all the primitives needed to have something working, we can add higher-level definitions on top of that, so basically using existing Forth primitives to create new Forth primitives, almost. And slowly but surely we are translating the entire ROM to Forth and not using assembler at all anymore, unless you want to optimize something later.

So this is more or less what, I think, the same program looks like. It's not binary-identical anymore, but it will produce the same Hello World, or Hello FOSDEM, output. You can see there's no assembler being used anymore. We have some quite high-level words defined. You can see at the top some constants for addresses that are commonly used, and words like cmovevideo that just move a block of memory into your video memory, kind of in line with the standard Forth implementation, if you're familiar with it.

Yeah, so there are a couple of limitations in our approach. Well, I guess not in our approach, but stuff that we just didn't do. If you're used to Forth, you might know that you're able to redefine words at runtime. That's not possible in our system, mainly because there's no keyboard input, so we didn't really know how to handle that. Plus, words are not stored as they are in Forth, where a word is a string; on the Game Boy we eventually just compile them to memory addresses. So I guess it's quite a static system once it's compiled.
Apart from that, we ran into some issues with the division between the read-only memory and the random access memory. On a computer this is pretty much all RAM, of course. On the Game Boy you have to take into account that data needs to be written to the cartridge, which is read-only. And once you want to modify variables, you'll need to explicitly copy them over to the other part of the memory, which is also quite limited. And part of it is used for the stack that we use in Forth as well.

Okay, so we implemented a basic Forth system that should be able to compile a Forth program into a Game Boy ROM. Of course, until this point I think we had only tried to write hello world examples, nothing too fancy. Maybe we did some moving around of stuff, but everything that we did was quite basic. So we decided to find a third-party Forth program, and this is Sokoban. It's quite old. It's included in the Gforth implementation, and it's, well, Sokoban, if you're familiar with it. Quite a simple game. So the goal with this was to try and compile it into a Game Boy ROM. This didn't work at first, because we still had a lot of primitives missing, but the idea was to add primitives or functionality to the compiler until this program would compile into an actual game. I think we barely had to change anything, except for the fact that variables cannot be initialized, because again, they have to be written to the ROM and you want them to be in the RAM. So there's some initialization code that needed to be added, but apart from that we managed to get it working.

I think it's also interesting to mention that the original Forth implementation is written for a terminal. So rather than rewriting that, we created a terminal emulation layer on top of everything. So, I think, definitely for simple tile-based games that you could almost write with ASCII art.
It's almost trivial now to port terminal Forth games to the Game Boy. I will admit that we had to scrap half the levels because they didn't fit on the cartridge, so yeah.

So in the end we ended up with a pretty complete Forth implementation. I think we still have some tickets open to make it fully compliant with the ANS standard. Beyond what's missing there, there are a lot of bugs at the assembly level of the original Game Boy. If you've done any Game Boy programming, you might have heard of the increment bug, or the increment sprite bug, where incrementing a register in a certain range will result in the sprites being completely messed up. I think not even restorable. So these are the things that the compiler could automatically remove and refactor into something that will work.

There's a lot of room for optimizations. I mentioned tail recursion. We ended up not being able to include it because, I think, there were some edge conditions that we didn't get to think about yet. But there's other stuff we can do, like inlining of primitives. There are certain words that are used all the time, and right now they're just calls to another address, which is reasonably expensive. If they can be inlined, of course it will be a lot faster, especially for code that's running in the VBlank sections. There are some peephole optimizations: with all these abstractions you will get pairs of assembly instructions that are basically a no-op when they're run together. These can be removed to save some space and time.

We don't have Game Boy Color support. The Game Boy Color is quite similar to the original Game Boy. The main difference is color. So there are some extra registers that handle the palette data for the color, and those things are accessible through assembly, but of course that's not the goal of this project.
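The peephole optimization mentioned above can be sketched as a single pass that drops adjacent instruction pairs known to cancel out. This is an illustration only: the `push`/`pop` pairs are an assumed example of such a pattern, not taken from the project, and the instruction strings and the `peephole` name are hypothetical.

```python
# Assumed example patterns: pushing a register pair and immediately popping it
# back leaves the registers and the stack pointer unchanged.
NOOP_PAIRS = {
    ("push hl", "pop hl"),
    ("push de", "pop de"),
}


def peephole(instructions):
    """Remove adjacent instruction pairs that cancel each other out.

    The output list doubles as a stack, so pairs that only become adjacent
    after an inner pair is removed are eliminated too. A real pass would also
    have to check that no jump target sits between the two instructions.
    """
    out = []
    for instruction in instructions:
        if out and (out[-1], instruction) in NOOP_PAIRS:
            out.pop()
        else:
            out.append(instruction)
    return out


program = ["ld a, b", "push hl", "push de", "pop de", "pop hl", "ret"]
optimized = peephole(program)
print(optimized)
```

Such pairs tend to appear when small primitives are inlined one after another, which is why this kind of pass pairs naturally with the inlining idea.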
Then there are memory bank controllers. David mentioned already that the cartridge often contains a lot of hardware. That hardware is managed by a chip in the cartridge, and you can write to the ROM addresses to access certain properties. There are rumble cartridges that vibrate the device. The camera is a kind of cartridge. So those can still be implemented in libraries. There are features like ROM bank switching. Basically, you cannot access the entire ROM at the same time in certain cartridges, so you have to bank it. More debugging tools, and actually we still have to write a tutorial on how to write a Forth game from scratch. And contributions are welcome if you want to take a look. Anything?

So thank you for your time. The project is at ams-hackers/gbforth on GitHub. We can also recommend these two talks. One is called The Ultimate Game Boy Talk, if you want to know more about, well, I guess Game Boys, because it covers pretty much everything. And there's a great talk about reverse engineering the hardware of the Game Boy, where chips are actually being, how do they call it, decapped to check the internals. And I think the idea is also to create a cycle-accurate emulator.

Yeah. I have a question. To replace the text from Hello World to Hello FOSDEM, what did you do about the headers and the checksums and all these things that are at the beginning and the end of the ROM file? How did you...

Right. Right. So the question is how we managed to go from replacing a string, basically, to replacing the whole file. I think the main point is, you mentioned checksums, and indeed I skipped over that. There are some checksum calculations going on. How they work is documented. So yes, checksums for sure. And the rest is just decompiling to assembler and trying to abstract it higher and higher.
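As an illustration of the documented checksum calculations, here is a sketch based on the publicly documented cartridge header layout: title at 0x0134 to 0x0143, header checksum at 0x014D, global checksum at 0x014E to 0x014F. The helper names are hypothetical, and `patch_title` is only an example of recomputing the checksums after a change, not the project's code.

```python
def header_checksum(rom: bytes) -> int:
    """8-bit checksum over header bytes 0x0134..0x014C, stored at 0x014D."""
    x = 0
    for address in range(0x0134, 0x014D):
        x = (x - rom[address] - 1) & 0xFF
    return x


def global_checksum(rom: bytes) -> int:
    """16-bit sum of all ROM bytes except the two checksum bytes themselves."""
    return (sum(rom) - rom[0x014E] - rom[0x014F]) & 0xFFFF


def patch_title(rom: bytes, title: str) -> bytes:
    """Write a new title into the header and refresh both checksums."""
    patched = bytearray(rom)
    patched[0x0134:0x0144] = title.encode("ascii")[:16].ljust(16, b"\x00")
    patched[0x014D] = header_checksum(patched)
    checksum = global_checksum(patched)
    patched[0x014E] = checksum >> 8    # stored high byte first
    patched[0x014F] = checksum & 0xFF
    return bytes(patched)


# An all-zero 32 KB image stands in for a real ROM (assumption for illustration).
blank = bytes(0x8000)
patched = patch_title(blank, "HELLO")
assert header_checksum(patched) == patched[0x014D]
```

The header checksum has to be written before the global one is computed, because the header checksum byte is itself part of the global sum.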
That's all the rest of it. Yeah.

So when translating Sokoban [inaudible], how do you do the mapping of the input, actually? Because I imagine this is all...

Yeah. So that's part of the terminal emulation, actually. We replace the word that gets one key from the input so that it just goes and looks into the keypad driver.

Actually making something that looks like Microsoft's integration but you need to map. If you already used that register or something else, did another map?

So the question is how do we map the registers in hardware to the different pointers that we need in Forth. Basically, that's a convention. We reserve one of the registers for the top of the stack. That's quite a common optimization. Then we have one of the registers point to the top of the stack. And then we have some scratch registers that you can use and change. So the compiler will always stick to the convention. It's also good because it means that you can actually mix assembly and Forth if you know the convention.

A question in the context of a native machine with limited RAM and speed: do you have an opinion, a strong opinion, against or in favor of full compliance with the ANS standard?

I think we reached a conclusion that, sorry, let me repeat. So, what the value is of complying fully with these standards. I think if most of the programs work well enough, I kind of feel that there's not really a need to be fully compliant. Apart from that, there are a lot of words in standard Forth that don't make sense on a Game Boy. Stuff that deals with the OS, obviously, but also floating point numbers, which we're not emulating.

Like double precision arithmetic?

Yes, double precision. So those words we skipped, and other ones are quite easy to add yourself. So I think as long as we support the common cases, it's usable. I think it's sufficient.

Yeah. So the question is how we test ROMs.
There are indeed special cartridges that use SD cards to load the ROM. They have a little OS built in as a file manager. Or you can use cartridge flashers, which you can buy online, for more persistent flashing.

Yeah. How long did it take you guys to write this?

That's a good one. So how long did it take? I think we managed to get to the assembler point in two months. And this was with two nights a week, working on it after office hours. And I think we had 80% of the project done in three months, with small optimizations later, but yeah, something like three months, I would say.

All right, thank you everyone.