So, the next talk. I'm really looking forward to this, and I know why most of you are here: because I'm excited about this too. I own a Game Boy and I'm really proud of it. I have it in my flat, and on special occasions I take it out. Well, nowadays I use an emulator, for sure, and I only have ROMs of cartridges I really own. Well, that is how it's supposed to work, right? Yeah, and I'm looking forward to this. By the way, my favorite games on the Game Boy, the first one, to make that clear, are Mega Man 2, Wario Land and Mystic Quest. And who does not agree with this? Well, you don't know me that well then.

So we go to the talk, The Ultimate Game Boy Talk. We have as a speaker Michael Steil. He held a really great talk at 25C3, The Ultimate Commodore 64 Talk, and now follows it with The Ultimate Game Boy Talk. Michael Steil: by day he works on operating system technology, by night he hacks obsolete systems. In a previous life he hacked game consoles. Please, a big round of applause. Michael, the stage is yours.

Hello, hello everyone. Again, my name is Michael Steil and this is The Ultimate Game Boy Talk. The idea of the talk is to talk as much as possible about all the different hardware details, everything that I can fit into 60 minutes. Everything about the Game Boy in 60 minutes. I have about 200 slides with over 800 individual builds, so maybe the information density is a little higher than normal. So let's get started real quick.

The Game Boy talk is in the context of a series of talks that I started eight years ago with The Ultimate Commodore 64 Talk. Other people have picked that up and talked about the Atari 2600, the Galaksija and the Amiga 500. And so it's my turn now again, and I picked the Game Boy. Why is the Game Boy so interesting?
Because they sold lots and lots of systems: Game Boy and Game Boy Color, 118 million alone. And if you count the Game Boy Advance models that are compatible, excluding the Game Boy Micro, in total it's almost 200 million systems. They have made about 1600 official games, and they produced it from 1989, just the 8-bit models, until 2003. And again, if you count the compatible Game Boy Advance models, they made 8-bit compatible gaming systems until 2009. So that's a 20-year run, which is pretty amazing.

Let's look at the competition back then. Just after the Game Boy, Atari released the Lynx, Sega the Game Gear and NEC the TurboExpress. All of these had one thing in common: they had a color screen, a pretty good color screen. But they also had one thing in common which was really bad: battery life of maybe three to five hours, while a Game Boy had 15 to 30 hours. But the compromise was that it had a screen that looked like this, and you could hardly make out anything as soon as it starts scrolling. This is not the case with all the models of the Game Boy, of which there are many.

The original Game Boy in the original design, which was produced for the longest time, is the DMG. DMG stands for Dot Matrix Game, which was the original codename. Then in '96 they released the Game Boy Pocket, which had a much better screen and was much smaller: the MGB. The Game Boy Light was only released in Japan; it had a backlight. And then the Game Boy Color, with twice the CPU speed, twice the RAM, twice the video RAM and color support. Then the Game Boy Advance series: they were a completely different architecture, they were based on an ARM CPU, but still they were 100% backwards compatible with Game Boy and Game Boy Color games. Of the Advance SP, two models exist. If you want to get one, you really need to get the AGS-101, which is the one with a backlight instead of the front light. And as you can see, Nintendo has always been a little ahead of its time: they didn't only make a rose gold version,
but also you need an adapter to plug in regular headphones.

If you want to play Game Boy games on a regular TV set, there are two options for that. Either you use a Super Nintendo and plug in a Super Game Boy (there are two versions of that, one only released in Japan, which had the proper timing and didn't run three percent too fast like the original one), or the Game Boy Player for the Nintendo GameCube. All of these had complete Game Boy hardware embedded inside, basically a normal Game Boy, and it just puts its pixels into the host system instead of on the screen.

So what is a Game Boy like? It has a 2.6 inch screen, a joypad and a mono speaker. You can get stereo over the headphones, and there's a link connector, a serial port to connect two Game Boys together, plus contrast and volume. And on the back: this is where the game goes, and this is where you put in your batteries. And this is what a game looks like. Basically, most games are really just ROM chips. There's nothing more to it.

Specifications. What are the specifications of the Game Boy? Let's compare the Game Boy to some other systems that may or may not be comparable. The CPU is a 1 MHz 8-bit CPU. Some people in the audience may complain at this point and say no, it's a 4 MHz CPU, but I'll explain later why I call it a 1 MHz CPU. There's eight kilobytes of RAM, which is plenty for a gaming system of that type. VRAM of 8 kilobytes is a little tight. The resolution of 160 by 144 is really poor, but at a screen this size you don't really notice that much. It can do four simultaneous colors, which is four shades of gray, and it supports up to ten sprites per line.

So if you compare all this to these other systems here, it's clear that the Game Boy is way more advanced than, for example, an Atari 2600, but it's not at all in the league of a Super Nintendo. It's more like an NES, a standard Nintendo Entertainment System, or a Commodore 64. But the fun thing here is that while the NES and the C64
were released in the early 80s, the Game Boy is from '89 and, as I said, compatible hardware was supported and released until 2009. So that's what makes it really interesting: it's an 8-bit system, but it's really the last 8-bit system that was in common use.

Let's look inside. The board on the right isn't very interesting if you look at it from the front. This is where the LCD is connected, the speaker and the buttons. The board on the back is much more interesting. This is where you can see three chips. So this is the DMG, the original Game Boy board: two identical RAM chips, one for CPU RAM and one for video RAM, and this one big chip here. It's called the DMG CPU, but it really is the SoC, the system on a chip. So what you would regularly expect to be lots of chips in a computer system like that, everything is integrated into just a single chip, which is the Game Boy chip.

Let's compare some other boards here. The Super Game Boy has a very similar chip; it's actually 99.9% identical. The Game Boy Pocket is a slightly optimized model that comes with only one RAM chip for both. The Game Boy Light, as you might have seen, is not actually any different. It's just another MGB from the MGB series, but it comes with a backlight. Then there's the Super Game Boy 2, which is based on the Game Boy Pocket. And this is what the Game Boy Color looks like on the inside. They all have this one gigantic chip that does everything.

But what is this one? This is a special one. You may not recognize it from the markings. This thing is called the GB Boy. There have been companies cloning the Game Boy all the time, and this particular model is a perfect clone, made by decapping and photographing and redoing the mask of the original. So see here. It's a Chinese Game Boy that you can still buy today for like 30 or 40 bucks on eBay. It's a shame that the quartz is 30% too fast.
So none of the games are playable.

Let's go back to the board. The DMG CPU is the thing that we're really interested in, the thing that does everything on the system. What is included in the DMG CPU? Well, a regular CPU core, interrupt controller, timer, memory, boot ROM, what you would expect from something like that, and then all the peripheral devices like the I/O: joypad input, serial data transfer for the link cable, a sound controller and a video controller, the pixel processing unit.

So let's talk about the CPU. Historically, the Game Boy from 1989 was released between the NES and the Super NES. The NES came with a 6502, the Super NES came with a 65816, which is the 16-bit version of that same CPU. So obviously the Game Boy comes with a Sharp LR35902.

What is the Sharp LR35902 core? It's nothing like a 6502. It's more like somewhere in between an Intel 8080 and a Zilog Z80, but it's neither. Both these architectures are interesting, because the 8080 was, for example, used in the Altair, which is the first computer that Bill Gates wrote and released software for, and the Z80 is in pretty much every 8-bit home computer system that did not contain a 6502. A very successful architecture.

So imagine this is the feature set of the Intel 8080. That's the feature set of the Zilog Z80, so it's a strict superset; it's perfectly backwards compatible. And that is the Game Boy CPU. The core architecture is the same as the one from the 8080, so all the core features, the register set, the instruction encoding, all that is the same. But there are some 8080 features that are not supported. It does support some Z80 features, but it does not support most of the Z80 features. And then they added a few features on top. Let's walk through that.

So let's talk about the core architecture of the 8080 first. It has these registers. There's the accumulator, which is special.
It's the one that can do all the arithmetic and logic; you cannot do these with the other registers. Then there's a flags register. It only has two useful flags, the zero and the carry flag, in the Game Boy model; the other two are only used for decimal adjust. And then there's B, C, D, E, H and L, another set of 8-bit registers. The fun thing about those is that you can combine B and C into BC, and the same thing for DE and HL. So you actually have 16-bit registers that you can use as pointers, for example. So in total there are four 16-bit registers that can do some things, and seven 8-bit registers, plus one special case here, which is the value at the memory location pointed to by the HL register, which can be used in place of any register in any instruction. Which is pretty nice.

So these are the instructions. There are the loads and stores, which are practically the same instruction: you can do indirect, immediate and direct. The stack is 16-bit; you can only push 16-bit registers. These are the arithmetic and logic ones. As I said, only the accumulator can do most of those, except for incrementing and decrementing, which works on any register, and that one also works on 16-bit registers. There are rotate instructions; control flow with jump, call and return, conditional and indirect; and then some miscellaneous functions: setting the carry, clearing the carry, NOP, HALT, and disabling and enabling the interrupts.

What is the interrupt model? On most systems of that time, you would expect there's an interrupt vector for all the interrupts, but here it's neither a vector nor is it just one. Instead of jumping through a vector, it jumps to fixed locations in RAM, at the beginning of RAM, and for the different kinds of interrupts it jumps to different locations.
So hex 40, hex 48 and so on. There's also the concept of software interrupts: you can jump to these locations with special instructions. RST 0 is a special one, because this is the same as a reset: when you turn on an 8080 CPU, it starts at location zero.

Let's talk about the few unsupported 8080 features. These are the flags of the Game Boy, and these are the actual flags of the 8080. It has two extra flags. One is the sign flag, which is kind of useful, and the parity flag, which is not. So none of these instructions are supported, and a few others; for some reason they just decided not to include those. And port I/O: you may remember from the 8086 that it has the port space. That one is in the 8080, but it's not supported on the Game Boy, because it uses memory-mapped I/O instead.

The Z80 has lots of extra rotate, bit shift, and bit testing, setting and resetting instructions. All of those are supported, as is the relative jump instruction, which is a more optimized version of a jump, and the return from interrupt. That's all that is supported. Everything that is not supported is all the interesting stuff from the Z80, which is the second register set, the extra two registers and lots and lots of features, including the auto-increment, auto-decrement and loop instructions, which are nice for copying memory.

But it has a few features of its own that take their place. For example, there's a post-increment and a pre-decrement. So if you want to access memory that is pointed to by HL, you can post-increment or you can pre-decrement HL. There's also the concept of a zero page, and it's a little confusing because it's not actually page zero in RAM, it's the topmost page in RAM. If you know the 6502, you can see this is a concept borrowed from the 6502. It just means that there's an optimized instruction encoding for memory accesses that are in the topmost page of RAM.
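The optimized encoding for the topmost page can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the opcode values 0xFA (load A from a 16-bit address) and 0xF0 (load A from 0xFF00 plus an 8-bit offset) are taken from commonly published opcode tables for this CPU, not from the talk.

```python
# Assumed opcode values, taken from common LR35902 opcode tables.
LD_A_ABSOLUTE = 0xFA   # LD A,(nn): opcode + 16-bit address, 3 bytes
LD_A_HIGH_PAGE = 0xF0  # LDH A,(n): opcode + 8-bit offset into 0xFF00-0xFFFF, 2 bytes


def encode_load_a(address):
    """Return the shortest encoding of 'load A from address' as bytes."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address >= 0xFF00:
        # Topmost page: only the low byte of the address is stored.
        return bytes([LD_A_HIGH_PAGE, address & 0xFF])
    # Anywhere else: full 16-bit address, low byte first.
    return bytes([LD_A_ABSOLUTE, address & 0xFF, address >> 8])


print(encode_load_a(0xFF40).hex())  # f040
print(encode_load_a(0xC000).hex())  # fa00c0
```

An assembler would make this choice automatically, which is why it pays to keep timing-critical data and registers in that last page.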
So instead of doing this, loading from FF40, which takes three bytes and four cycles, you can encode it like this, which takes two bytes and three cycles. And obviously you have to put something useful up there in the topmost page, something which is timing critical. They added a few stack instructions and another store-SP instruction. You can swap the low and the high nibble; they added eight instructions there. And there's an extra power-saving instruction.

So this is what the opcode table looks like. Just to get an idea from the color coding: you can see this is quite orthogonal, and there are a few opcodes that are not in use, and they all crash, which is a nice design. And this one is special, the opcode CB. It's a prefix which hosts another 256 opcodes. This is the complete opcode space for the rotate and shift instructions borrowed from the Z80, plus the additional swap instruction that was added on the Game Boy CPU.

Let's look at another instruction here, just as an example. This is a load; it loads from a fixed address. This means it takes three bytes and 16 clocks, 16 clocks at 4 MHz. But the internet disagrees on whether it should be looked at at 4 MHz or at 1 MHz, because all the clock times are divisible by four. And this is because the whole system is memory bound, the whole CPU is memory bound, so it can only calculate as fast as memory is providing the data. So basically you could just as well say this is a CPU that is clocked at 1 MHz, and it takes four clocks for this. And these are much saner numbers now, because now the numbers are actually comparable with similar systems, like 6502-based systems that also have 1 MHz memory with a 1 MHz CPU.

So in theory, yes, the CPU is clocked with 4 MHz, the RAM runs at 1 MHz, the PPU, the pixel processing unit, draws its pixels at 4 MHz, and it's connected to VRAM that is running at 2 MHz. So it's a little complicated here in the system, but
most of the time everything runs at, or most of the numbers can be expressed in terms of, 1 MHz. But to be exact, it's not exactly one megahertz. It's one mebihertz, as in 1024 times 1024 hertz. So they didn't base it on base ten, they based it on base two. Just pretty. And so from now on, if I speak about cycles, I mean machine cycles at 1 MHz.

So that's the CPU, and it's an 8-bit CPU that has a 16-bit address space because of the 16-bit pointers. So 64 kilobytes of address space is all it can see. 32 kilobytes of that is the ROM space that comes from the cartridge. It's just mapped in from the cartridge, and there's a boot ROM lying on top of that, which we'll talk about later. Video RAM is mapped in, then external RAM, which can also come from the cartridge, optionally. Then there's the internal RAM and some empty space, which is unassigned or is just mirroring other things. And if we zoom into this, we can see at the top there is another page of OAM RAM, which is a special-purpose video RAM that is distinct from the video RAM, and that we'll talk about later. And then the last page, which is the zero page, contains the I/O area, so all the registers for the peripherals like sound and video, plus another 127 bytes of HRAM, which is distinct from the other RAM.

So does this mean that games can only have 32 kilobytes? Well, some games only need 32 kilobytes. Tetris fits nicely into 32 kilobytes. There's a single chip there, pretty simple to manufacture. Other games can have more in practice.
I mean, there's no theoretical limit, but in practice games go up to, I think, two megabytes. This one is 128 kilobytes. The way it achieves that is by adding another chip to the cartridge, a memory bank controller, a special chip which can switch banks, and this is common on all kinds of systems. In practice, these memory bank controllers can be very different, but most of them work something like this: the lowest bank in memory always maps to bank zero of the ROM, and the upper 16K map to whatever they can map to, bank 1, bank 2 and so on. And all this is controlled by writing magic values into ROM locations, which will then go to the cartridge and be intercepted by the memory bank controller, which then can switch those banks. The same is true with the external RAM: if the cartridge wants to expose extra RAM, for example for save games, which are battery-backed (that's the only way to do save games), it can map in external RAM here. Same model.

So what's up with this boot ROM? The CPU starts running at location zero in memory, and the boot ROM is the thing that draws this and does the chime. This boot ROM is built into the Game Boy, and it took a while until this was extracted. It was a real pain. Not done by me. And this is the complete boot ROM. What it does is: it initializes RAM, initializes sound, sets up and decodes the logo that it puts on the screen, then scrolls in the logo and plays the chime. And then it gets interesting. It compares the logo, because the game has to have a copy of that Nintendo logo inside. If it doesn't match, the game does not boot. This was meant so that Nintendo could control which games are released for the platform, because all games had to contain the logo, which is not just a copyright violation but also a trademark violation if you include it and you don't have Nintendo's permission. After that it also checksums the header, just to make sure that you don't have to blow into the pins. And then it turns off the boot ROM and continues execution in the cartridge.
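The two checks the boot ROM performs, the logo comparison and the header checksum, can be sketched roughly as follows. The header offsets and the checksum formula are the ones commonly given in community documentation and are assumptions here, not taken from the talk; the function names are hypothetical, and the demo uses placeholder bytes rather than the real logo.

```python
# Assumed cartridge header layout, as commonly documented.
LOGO_START, LOGO_END = 0x0104, 0x0134      # 48 logo bytes
HEADER_START, HEADER_END = 0x0134, 0x014D  # 25 bytes covered by the checksum
CHECKSUM_ADDR = 0x014D                     # where the cartridge stores the result


def header_checksum(rom):
    """Compute the 8-bit header checksum over the header bytes."""
    x = 0
    for byte in rom[HEADER_START:HEADER_END]:
        x = (x - byte - 1) & 0xFF
    return x


def passes_boot_checks(rom, expected_logo):
    """Mimic the two checks: logo comparison, then header checksum."""
    logo_ok = bytes(rom[LOGO_START:LOGO_END]) == bytes(expected_logo)
    checksum_ok = rom[CHECKSUM_ADDR] == header_checksum(rom)
    return logo_ok and checksum_ok


# Demo with a made-up 32 KB image and a placeholder "logo".
placeholder_logo = bytes(range(48))
rom = bytearray(0x8000)
rom[LOGO_START:LOGO_END] = placeholder_logo
rom[HEADER_START:HEADER_START + 4] = b"DEMO"
rom[CHECKSUM_ADDR] = header_checksum(rom)

print(passes_boot_checks(rom, placeholder_logo))  # True
rom[HEADER_START] ^= 0xFF                         # corrupt one header byte
print(passes_boot_checks(rom, placeholder_logo))  # False
```

A single flipped header byte is enough to fail the check, which fits the idea of catching a badly seated cartridge before any game code runs.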
The Nintendo logo, though, is actually presented from the cartridge. So if you boot up a Game Boy without a cartridge, you will see this. But it doesn't mean that an application or a game can put any logo in there, because it does get compared, and at that point it wouldn't boot any further. And since there is no cleanup code, it doesn't reset the system or clear the screen or anything. So some games did something like this: they played with what's on the screen. Let's just continue with that. And of course demos also like to play with that: the Nintendo logo is on the screen, let's just do something with it. Which is pretty nice.

So the boot ROM runs until the last instruction, and the last instruction is the one that turns off the boot ROM. Then even the first page in ROM is mapped to show game data, and it continues running here. So it just continues running into the next instructions, from the game. And usually there's a jump there, because there's a header that is specified to be there at this location. That header contains the Nintendo logo and the header checksum, which are important. All the other metadata is not really important. It was important for the developers back then, for their hardware, but the Game Boy does not actually check any of this. And after that you can have the actual game data.

So one other thing that we haven't looked at is the I/O area and HRAM, which is the topmost page in the system. It's the zero page, the one that you can access quite efficiently. The top 127 bytes here are extra RAM, and sprinkled throughout all the rest here you can see the different devices. These are all the registers in total in the system: interrupt controller, sound controller, joypad, serial, timer and pixel processing unit. These are all the components that we'll talk about now.

Joypad input. It's really, really simple.
These are all the inputs the Game Boy has: four buttons and four directions. So you would think, let's have eight GPIOs and with that we can do this. But we can actually do it with six GPIOs, because it's two columns with four rows. You select the column that you want to test, and then from the rows you can read which button was pressed. So they could do with six instead of eight. That's all about the buttons. Pretty simple.

Serial data transfer. You can connect two Game Boys together with a link cable. All the link cable consisted of was one wire for data in the one direction, one wire for data in the other direction and a third wire for the clock. The two Game Boys had to decide which one is sending the clock and which one is receiving the clock. So these are the bits that control this. One sets the clock, and it's always 8 kHz. The receiving clock can really be anything; it can go up to half a megahertz and it will receive it. As soon as the transfer is started, it will clock the bytes through, and it always goes in both directions at the same time.

The timer. Any system has a timer; here there's only one timer. The TMA register, the modulo register, is where you put in the start value. Then you can select one of four different speeds, and then you start it. It counts up until it overflows, and at overflow time it reloads the modulo number and can optionally generate an interrupt.

Speaking of interrupts: the interrupt controller supports five different interrupts. V-Blank and LCD STAT deal with the pixel processing unit.
We'll talk about those later We've already seen the timer causing interrupt serial can cause an interrupt when a new byte has been received and joypad When a button has been pressed and this was the interrupt enable register and there's the interlop flag register Where you can see which interrupt is still pending and these are the addresses where those different interrupts jump So there's you don't have to find out which interrupt is was because the CPU jumps to different locations So the sound controller sound controller nothing has as many registers as the sound controller Because there's four voices and they all have their distinct registers, so this is a better way to look at them four voices and five registers each and Those registers have particular meanings But the meanings are rough because the four voices are not exactly the same there are two very similar ones that do a pulse and One that does that is called voice and one that is called noise If you look at the bits here, they are similar, but they're not the same So it depends on the voice what exactly those bits in those registers mean Let's Look at the ones that are in common and all the voices have a trigger bit which turns on the voice and You could just turn it off again at some point But there's a length bit and a length register, so you can just say it turned off off after a quarter second Let's look at the wave register, which is the simplest one in addition So the idea of the wave register is that you can play any sound wave it has these extra 16 bytes of register here, which is 32 entries of four bits each and These can describe any waveform that you like that fits into these 20 into these 32 Slots, so here are some examples and you could for example create a sawtooth Which is a pretty simple waveform where you could have a sign or you could just do anything custom So you're pretty flexible at that and then there's the frequency which controls the pitch just controls how fast that wave table is 
played, and two extra bits of volume, so you can play it at 100 percent, 50 percent or 25 percent, or you can mute it.

The other two here are the two pulse voices. They're very similar; all these bits are the same and behave the same. You cannot specify a waveform here; the waveform is kind of fixed. It's always a pulse, meaning low and high in different ratios, and these two bits determine the ratio between low and high: 12.5 percent high and the rest low, 25 percent, 50, and then 75, which should sound exactly like 25, it's just inverted. You can do this with either of these voices. And there's also the concept of a volume sweep, so the volume can go up or the volume can go down. Volume going down is the common case for emulating standard instruments, and the volume going up is an interesting effect. And only the first pulse voice also has the concept of a frequency sweep, so you can go up or you can go down. You can see this is mostly meant for sound effects, and these are some more examples of sound effects. That's what you can do with that.

And there's a fourth voice that can only do noise. This is a shift register that basically generates pseudorandom numbers. You can set it to 15-bit mode or 7-bit mode.
Depending on the mode, it will do one of two different waveforms. That's 15-bit mode, and that's 7-bit mode.

So these are all the registers again: the different voices and three general-purpose registers. This register has a volume for the left and a volume for the right channel. And, interestingly, cartridges can have their own audio controller that outputs an analog signal that can be hooked up into this, but no game ever did this. There is another register where you can say whether a voice should be on the left, on the right, on both or on neither. And then there's the power bit: if you turn off the power to audio, you will save like 13% or so of energy.

Game Boy sound is not only used for games. People still compose music on the Game Boy today, in tools like Little Sound Dj, and I'll show you a short example of that. So much for sound.

Let's talk about the pixel processing unit. The pixel processing unit is the thing that makes graphics. It has 12 registers, which is not a lot. But let's look at the specifications first. We talked about it before: 160 by 144 pixels, not that much. Four shades of gray, which is more like four bad shades of green; later Game Boys are much better. Everything on the screen is 8 by 8 tile based, and there are a certain number of tiles on the screen. There are sprites, and all this has to deal with eight kilobytes of video RAM.

What do I mean when I say 8 by 8 pixel tiles? Let's look at a game like Tetris.
You can see everything's blocky; everything's based on these blocks. Same with Zelda: if we put the grid over it, you can see there are repeating patterns and they're all aligned to a certain grid. With Super Mario Land it's also pretty obvious, especially because it doesn't have many different tiles. Even with something like Donkey Kong Land you can see it as soon as you put the grid over it: there are some repetitions and everything's aligned to that, even though they did a very good job at hiding it. And in some games you can see it because the game plays with that very concept. So here in Turrican you can see it fills the screen with tiles.

Let's look at what a tile is like. A tile consists of 8 times 8 pixels and has four colors, like everything in the system, and these colors are encoded as 00, 01, 10 and 11. So let's add that to these pixels and look at how the encoding is done. If I look at the first line here and read it as binary numbers, I get 02 and FF. So for every line of pixels I need two bytes, and in total I need 16 bytes to describe a whole tile.

You may have noticed that the ordering of the colors here doesn't necessarily make sense. This is because I can choose my own palette. It can be any palette; there's a two-bit to two-bit mapping in the system for these background tiles. The native colors are: 00 means white and 11 means black. So I can pick any palette like this. I can also reuse the same colors if I want that for some effect.

There are 256 tiles in the system. So this tile set, do you recognize what it is? It's Tetris. If you don't recognize those dancing people at the bottom, then you have not finished Tetris. This is Zelda. And that is Super Mario Land, which only uses 128 tiles; it can just deal with that. Anyone recognize what this is?
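As a rough illustration of the tile encoding described above, this Python sketch decodes a tile row from its two bytes and applies a two-bit to two-bit palette. Some details are assumptions from common hardware documentation rather than from the talk: the first byte carries the low bit of each pixel, the second byte the high bit, the leftmost pixel sits in bit 7, and the palette is one byte with two bits per color index, index 0 in the lowest bits. All function names are made up.

```python
def decode_tile_row(first, second):
    """Turn the two bytes of one tile row into eight color indices (0-3)."""
    row = []
    for bit in range(7, -1, -1):       # bit 7 is assumed to be the leftmost pixel
        low = (first >> bit) & 1       # assumed: first byte holds the low bits
        high = (second >> bit) & 1     # assumed: second byte holds the high bits
        row.append((high << 1) | low)
    return row


def decode_tile(data):
    """Decode a full 16-byte tile into 8 rows of 8 color indices."""
    if len(data) != 16:
        raise ValueError("a tile is 16 bytes: 8 rows of 2 bytes")
    return [decode_tile_row(data[i], data[i + 1]) for i in range(0, 16, 2)]


def apply_palette(color, palette):
    """Map a color index to a shade (0 = white ... 3 = black) via a palette byte."""
    return (palette >> (color * 2)) & 0b11


row = decode_tile_row(0x02, 0xFF)
print(row)                                    # [2, 2, 2, 2, 2, 2, 3, 2]
print([apply_palette(c, 0xE4) for c in row])  # identity palette: same values
print([apply_palette(c, 0x1B) for c in row])  # inverted palette
```

The bytes 02 and FF are the example row mentioned above; with the assumed bit order, every pixel in that row has its high bit set and one pixel also has the low bit set.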
Let's just puzzle something together and we'll see: it's a tennis game. It's All-Star Tennis. This is puzzled together from 20 tiles by 18 tiles, which fill up the whole screen. That is not the whole truth, because actually in video RAM there are 32 tiles by 32 tiles. This is the complete background map, and what you see on the screen is just a viewport into that. It's 256 by 256 pixels, which is nice and convenient, because this is how scrolling works: by just moving that viewport around. And we can see this in practice. There's a really nice emulator that lets you see what's going on in the background map and how the viewport is moving around. So this is basically really a camera that is moving around a bigger 32 by 32 map.

But this only works with games that are at most 32 by 32. What about games that scroll infinitely, like Super Mario Land? We have this many extra columns, and we can move the viewport over here. But what happens if we end up here at the end? Well, it will wrap around, and if we just draw columns fast enough, just before the viewport hits them, we can have an infinite world. In the emulator you can see this quite clearly: in the off-screen area of the viewport, it keeps putting in those new columns. This also works in two dimensions. This is Donkey Kong Land again. This looks pretty freaky: it just puts those extra columns and lines where it will go.

So this is the one layer that we've talked about so far, the background. On top of the background is another layer that you can optionally put on top: the window. It can cover the background fully, or it can start at any location. There's an X and a Y position for that, and it will draw from there to the right and to the bottom. There's no translucency, ever. So usually how this is used is: you put it to the very right, or you put it to the very bottom.
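The wrap-around of the viewport over the 32 by 32 background map comes down to modulo arithmetic. A minimal Python sketch, with hypothetical function and parameter names (`scroll_x` and `scroll_y` stand for the viewport position in pixels):

```python
MAP_TILES = 32                      # background map is 32 x 32 tiles
TILE_SIZE = 8                       # each tile is 8 x 8 pixels
MAP_PIXELS = MAP_TILES * TILE_SIZE  # 256 x 256 pixels
SCREEN_WIDTH = 160


def map_tile_for_pixel(scroll_x, scroll_y, screen_x, screen_y):
    """Return the (column, row) of the map tile shown at a screen pixel."""
    map_x = (scroll_x + screen_x) % MAP_PIXELS  # wraps at the right edge
    map_y = (scroll_y + screen_y) % MAP_PIXELS  # wraps at the bottom edge
    return map_x // TILE_SIZE, map_y // TILE_SIZE


def visible_columns(scroll_x):
    """List the map columns currently touched by the viewport, left to right."""
    first = (scroll_x % MAP_PIXELS) // TILE_SIZE
    # An unaligned viewport shows parts of 21 columns instead of 20.
    count = SCREEN_WIDTH // TILE_SIZE + (1 if scroll_x % TILE_SIZE else 0)
    return [(first + i) % MAP_TILES for i in range(count)]


print(visible_columns(0))    # columns 0 to 19
print(visible_columns(200))  # columns 25 to 31, then 0 to 12
```

A side-scrolling game would rewrite the column just past the right edge of this list before the viewport reaches it, which is the trick that makes the world appear endless.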
The window just overlays the background and does not respond to the other scrolling settings. And you can guess that this is necessary for something like a score that is shown at the bottom of the screen. This is very nice, easy and convenient for games. But you can also put it on the right. These are Game Boy Color games, but it wouldn't matter; it works on the Game Boy just as well.

And then there's another layer on top of the background and the window, which is the sprites. Sprites are objects on the screen that don't fit into the 8 by 8 raster. You can position them freely. So we have three sprites here. Nintendo calls them objects, OBJ, but I'll keep calling them sprites because everyone calls them sprites. Let's just look at this Goomba here. Every sprite in the system has attributes, and there's the OAM, which is the object attribute map. This is one OAM entry, and it has these values.

One of them is the X position. If we put the Goomba at the very left of the screen, you would expect it to have a horizontal position of zero, right? But no, it's eight. Why is that? Because if you put it at four, it's here, and if you put it at zero, it's here. Because it's eight pixels wide, you need a way to scroll it in. Something similar is true at the top of the screen, but the first Y position where you see it fully is 16, because sprites can be up to 16 pixels in height. So let's put it at its natural location here.

The next thing to look at is what it should look like. First of all, you can see it's an 8 by 8 grid. It's the same encoding, except that it also has translucent pixels, so the code 00 stands for translucency. And since it's the same encoding, it's the same kind of tiles, and there are also 256 tiles in the system, so the tile number is a byte, and we can see it here.
It's tile hex 90. The next thing is the flip bits, so you don't have to store two Goombas here, one that walks to the left and one that walks to the right. You just flip it horizontally, and you have one that walks to the right. You flip it vertically, and you have a dead Goomba. You flip it horizontally and vertically, and you have a dead Goomba walking in the other direction. So let's put it right side up again. The next bit is the palette. Because one bit combination means translucency, you only have three more colors for sprites, which is a shame. They didn't want to impose three specific colors for this, so for the sprites you can pick which three colors out of the four you want. And all the sprites don't have to share the same three colors, because there are two palettes in the system, which is why the sprite also has another flag: does it take palette zero or palette one? If it takes palette one, it would look like this; with palette zero, it looks like this. And one more bit is priority: how does the sprite draw in comparison with the background? If the priority is one, this is the interesting case: it will draw on top of all those white pixels here, but it will draw behind all the non-white pixels. But there's nothing special about white here. This is why there exists a background palette. It's about pixels that have the value zero in the background. So if you pick another palette, the Goomba could very well be drawn on top of any other color. If we set the priority to zero, it will draw on top of everything, except for the translucent pixels, of course. Sprite and sprite priority: there's nothing you can do about this.
This is always fixed. We have this rectangle here, and the Goomba is on top of the rectangle, which is because the Goomba's horizontal position is smaller than the rectangle's. As soon as they're at the same horizontal position, the sprite with the lower number wins: because the sprites are an array in memory, the one that comes earlier draws over the one that comes later. As soon as the white rectangle has a smaller X position than the Goomba, it will draw on top of it, and you can see this as flickering effects in some games when you walk through other things. There are 40 sprites in the system. You can have 40 sprites on the screen at the same time, but there's another limitation per line: you can only have 10 sprites. So I have an 11th sprite here, and it only counts per pixel line. So here these pixels would not be drawn, or the next line, or the next line. And this is not about the 11th one from the left. It's the 11th visible one in the list of sprites, in the ordering that the program decides. So this is the complete OAM entry.
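To make the sprite attributes above a bit more concrete, here is a small Python sketch that decodes one OAM entry and applies the sprite-against-sprite ordering rule. The field list (position, tile number, flips, palette, priority) comes from the talk; the exact byte order and bit positions used here are an assumption based on how the hardware is commonly documented, and all class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class OamEntry:
    y: int          # vertical position; 16 means fully visible at the top
    x: int          # horizontal position; 8 means fully visible at the left
    tile: int       # tile number, 0..255
    priority: int   # 0 = on top of background, 1 = behind non-zero background
    flip_y: bool
    flip_x: bool
    palette: int    # 0 or 1, selects one of the two sprite palettes


def decode_oam_entry(raw: bytes) -> OamEntry:
    """Decode four raw bytes into an OamEntry.

    Assumed layout (not spelled out in the talk): Y, X, tile, flags, with
    bit 7 = priority, bit 6 = flip Y, bit 5 = flip X, bit 4 = palette.
    """
    y, x, tile, flags = raw
    return OamEntry(
        y=y,
        x=x,
        tile=tile,
        priority=(flags >> 7) & 1,
        flip_y=bool(flags & 0x40),
        flip_x=bool(flags & 0x20),
        palette=(flags >> 4) & 1,
    )


def screen_position(entry: OamEntry):
    """Top-left corner on screen, undoing the 8 and 16 pixel offsets."""
    return entry.x - 8, entry.y - 16


def draws_on_top(a_index, a, b_index, b):
    """True if sprite a is drawn over sprite b where they overlap.

    The smaller X position wins; on a tie, the lower OAM index wins.
    """
    return (a.x, a_index) < (b.x, b_index)
```

A Goomba stored as `bytes([16, 8, 0x90, 0x30])` would decode to tile hex 90, flipped horizontally, using palette one, and sitting exactly in the top-left corner of the screen.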
The entry fits neatly into four bytes. So this is one OAM entry in memory. FE00 is where these entries are stored. There are 40 of them for the 40 sprites, and the whole thing is called the OAM RAM, which is a special-purpose RAM at this location in memory that is not part of video RAM. One more thing I should say about sprites: even the small Mario, this is the small one, is too big for 8 by 8 sprites. So you could do it as four sprites, and this is what the game actually does. But there's another mode where you can have sprites that are 16 pixels high. But this is global; the whole game would then have to deal with 16-pixel-high sprites. So we've seen the three different layers. There is one more thing: you can actually turn off the display completely, which gives you a fifth color that is a little lighter than white. It's not very useful, because you have to turn off the complete LCD. As soon as you turn on the LCD but don't draw a background, you get white. If you draw the background and say you want to draw it in light gray, it completely replaces that color. You can draw a window on top, again, no translucency here, and you can draw sprites on top. And notice that sprites don't distinguish whether a pixel is background or window, so there's no clipping going on with the window. So how does all this work with the memory map? We have four kilobytes of sprite tiles, four kilobytes of background tiles, and one kilobyte for the background map, the 32 by 32. And we also have a kilobyte for the window map. It's not the most efficient representation, but they did it that way because it was easier. And we only have eight kilobytes of video RAM. So if we put the sprite tiles here and the background tiles here, we have already run out of video RAM. Let's try it differently. Let's put the sprites here. Let's put the background and the window map here. And what do we do with the background tiles?
Let's have them overlap. And there are different configurations here, three bits: we can have them completely overlap or just partially, and we can swap around these or have them at the same location. What does this overlap mean? This is one configuration. The background tiles and the sprite tiles are in the same format, 8 by 8, two bits per pixel, so they could share exactly the same tiles. But you can also put them this way, so the first third, the first 128, would be sprite tiles exclusively, the last 128 would be background tiles exclusively, and the ones in the middle would be shared by both, or usable by both. And in the case of Super Mario Land here, you can see that the first two thirds are used for sprites and the last third is used for the background. The next step is vertical timing, as in CRT-based systems. This is how a normal old CRT system draws its picture, in a very slow-motion version. It draws it from the top to the bottom, from the left to the right. And the same is true on a Game Boy. It keeps drawing the picture 60 times a second, from top to bottom, line by line, left to right. This was not done because they were reusing some old components. They completely redid the very idea of how this was done. But still, an LCD wants to be refreshed 60 times a second, and they draw it like that. And this is important to know if you want to do certain effects that you cannot otherwise do. In this game, for example, you can see that different parts of the screen behave differently. Let's just look at the scrolling city line here. It would be easy to just have that scroll by itself on the full screen, but we only want to scroll a part of the screen, and the way this is done is with these extra registers. If you've seen 8-bit programming on something like a C64 before, you've seen all this there. You can read which line is currently being drawn, or which will be drawn in just a moment. And instead of just busy-waiting there, you can also set an interrupt. It will
wake you up as soon as a certain line is reached. So let's set the scroll X register to zero, and let's trigger here at line 8. So it will draw all this with a scroll offset of zero. At this point, our program will set the horizontal scroll offset to, let's say, 23. In the next frame, we'll set it to 24, so we'll scroll. And we'll set the compare register to 42, and it will continue drawing here with a certain offset. And then we set it to something else, set another LYC, and it keeps drawing like that. How exactly it draws the road here, we'll talk about later. That's a fun trick as well. And then we set it to zero again, because the dashboard doesn't do any kind of scrolling. And we do the same thing on the next screen again in this example. So it doesn't only have to do with the X scrolling register. Here's a different example, where the Mario on the top right is actually a window. We talked about this before: the window will draw from a certain position to the right and to the bottom of the screen. You cannot just have it draw halfway. But yes, you can. You trigger at line zero and turn on the window, and then trigger again at line 40 and turn off the window. So when it draws line 40, the PPU will think: window? What window?
Don't know about any window. So it won't draw any further. And you can see this in all kinds of games. Some of these tricks can be done with a window; others have to be done with screen-splitting techniques. If you don't just trigger on certain lines, but do something fun on every line, you can do a trick like this. On the left you can see what's on the screen; on the right you can see what's actually stored inside the video RAM. If you change SCX in every line, this is the curve that is used as a transform for that picture, and that curve changes on every frame. All the program has to do is write a different SCX value into the SCX register on every single line. And of course it has to be done in every line; this picture here doesn't show it for every line. The racing game effect is pretty much the same thing. This is what you can see in the video RAM: it's just a straight road, but it gets distorted at runtime while the picture is being drawn. So let's zoom into this. This is the source, and this is what we can see on the screen. If you ignore the sprites here, this is the curve that it has to use to distort it, and these are all the offsets of SCX. So if you keep updating SCX in every single line, you can warp it like that. Another thing you can see here is that the line in the middle is patterned, as well as the part outside of the road, and this is done by changing the palette every few lines in a certain way. And you can even go one step further. This rally game here can also do bumps in the road, which is done by changing not just the horizontal scroll register but also the vertical scroll register in every line. So you can duplicate lines and you can skip lines, and with some good math you'll get to that. And if you update your vertical scroll register in the middle of a line, you can do this wobble effect, which is two-dimensional. But for that we have to go a little deeper and go into horizontal timing. What happens while a line is being
drawn? So this is the pixel transfer mode of the PPU. It usually takes 43 clocks, and this is done for 144 lines. But you cannot just imagine that at the end of the first line it will immediately draw the first pixel of the next line. That's not what's happening, because there's an extra OAM search at the beginning of each line, which is 20 clocks (I'll talk about that in a minute), and an H-blank area of 51 clocks at the end of every line. In H-blank mode the PPU is idling; it doesn't do anything. And there's also a V-blank mode, when the PPU doesn't do anything between screens. So let's do the math. A line is 114 clocks at 1 MHz, and there are 154 lines, so this is this many clocks per screen. If you divide the base clock by this, you'll get a refresh rate of 59.7 Hz. These are the four different modes that the PPU can be in. You can read that out, so the CPU can know about it, and you can also trigger interrupts based on this. But why would the CPU have to know? Let's look at what's going on in these different modes. First, what is this OAM search of 20 cycles at the beginning of each line? For every line, the PPU has to decide which sprites are visible in that line. There are 40 sprites total in the system, and it has to filter those sprites: it has to find the sprites that are visible in that line and put them into an array of up to 10 sprites that are visible there. And the logic for that is: the X position cannot be zero, because then it would be invisible, and the current line that we're drawing must be between the first line of the sprite and the last line of the sprite. Then it gets added to the visible sprite array. And this takes 20 cycles. By the way, in the original Game Boy there was a fun bug here. If you do any 16-bit calculations with numbers between FE00 and FEFF, which is the range that points into OAM RAM, even if you're not accessing RAM at all, it will destroy the OAM RAM during OAM mode. So why else would you have to care about what's going on?
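The timing arithmetic and the OAM search rule above can be written down as a short Python sketch. The per-mode clock counts and the selection rule are taken from the talk; the exact base clock of 2**20 Hz (the talk just says 1 MHz) and the split of the 154 lines into 144 visible plus 10 V-blank lines are assumptions based on commonly documented values. Sprites are represented as hypothetical `(x, y)` tuples in OAM order.

```python
OAM_SEARCH_CLOCKS = 20
PIXEL_TRANSFER_CLOCKS = 43   # nominal; can be longer in practice
H_BLANK_CLOCKS = 51
CLOCKS_PER_LINE = OAM_SEARCH_CLOCKS + PIXEL_TRANSFER_CLOCKS + H_BLANK_CLOCKS

VISIBLE_LINES = 144
V_BLANK_LINES = 10           # assumption: 154 lines total per frame
LINES_PER_FRAME = VISIBLE_LINES + V_BLANK_LINES

BASE_CLOCK_HZ = 1_048_576    # assumption: "1 MHz" taken as exactly 2**20 Hz


def refresh_rate_hz():
    """Frames per second = base clock / clocks per frame."""
    return BASE_CLOCK_HZ / (CLOCKS_PER_LINE * LINES_PER_FRAME)


def oam_search(sprites, ly, height=8):
    """Return the OAM indices of the sprites drawn on line `ly`.

    sprites is a list of (x, y) positions in OAM order. A sprite qualifies
    if its X position is not zero and `ly` lies between its first and last
    line; y carries the 16-pixel offset, so the first line is y - 16.
    At most 10 sprites are kept, in OAM order.
    """
    visible = []
    for index, (x, y) in enumerate(sprites):
        first_line = y - 16
        if x != 0 and first_line <= ly < first_line + height:
            visible.append(index)
            if len(visible) == 10:
                break
    return visible
```

Under these assumptions a frame is 114 times 154 = 17,556 clocks, which gives roughly 59.7 frames per second. Note that the 10-sprite limit is applied in OAM order, so an 11th qualifying sprite is dropped regardless of where it sits horizontally.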
The CPU is connected to RAM, the PPU is connected to video RAM, and OAM RAM is special: the PPU is also connected directly to OAM RAM. The CPU could be connected to the video RAM as well, so it can write to video RAM, but this is not how it's done. You would need a double-speed video RAM here. The C64 does it like that. But on the Game Boy, it has to go through the PPU, and there's this one big switch where the PPU can say: you cannot access it right now. If the CPU wants to write, nothing happens. If it reads, it gets all FF. At least nothing bad can happen, but it's also not very useful. So the CPU has to make sure that the PPU is in the right mode so that it can access all this. During pixel transfer, you cannot access video RAM, but during OAM search, H-blank and V-blank, video RAM access is OK. If you want to access OAM RAM, you cannot do it during OAM search or during pixel transfer, because that's when the sprites are drawn and the PPU needs it. You can only access it during H-blank and V-blank. So you have to be very careful in those times while the screen is being drawn. Basically all this is a bad area for the CPU; it shouldn't do anything important at that time. For example, if you want to move new columns into the background map, you should do this in V-blank, where you have the most uninterrupted time. And all the game logic, all the game AI, can be done while the screen is being drawn. But there's a caveat here: you cannot write the new sprite positions at this point, because OAM access is going on. So what games usually do is write the updated sprite information into a shadow OAM, which is just a copy of the OAM, and then during V-blank they copy that into the real OAM. So they copy a block from any of those sources into the OAM as the destination. This is not to scale. The CPU doesn't have to do that by itself, because there's a DMA function. You just write the block that you want to be copied into this location. It takes 160 clocks, and while it's doing that, the CPU keeps running, but it cannot access any
of the source address space. So it has to wait. But since that code has to come from somewhere as well, the only place you can put it is HRAM, which is a nice use of that as well. So, the pixel pipeline. Let's dig in as deep as it gets into the pixel pipeline. This is cutting-edge research, and some of these things have previously not been known to the public. The pixel FIFO is the central concept of how the Game Boy draws its picture. Let's jump in somewhere in the middle. We have some pixels on the LCD already, so there are five pixels already shifted out and sent to the LCD, and the pixel FIFO, let's say, has a few pixels in it. Now in every step, in every 4 MHz step, it shifts out one pixel and sends it to the LCD. It shifts out a pixel, sends it to the LCD. Shifts out the next one, sends it to the LCD. You may have noticed here that a green button just became red, because the pixel FIFO has to contain more than eight pixels to be able to shift something out. Why that is, we'll get to. Now we should get new data and fill the FIFO, and that's what the fetcher is for. The fetcher fetches background tiles, and 9802 is right now the position in the map that it fetches from. So it reads the tile number from the background map, which takes one cycle. In the next cycles, it reads the first part of the data and the second part of the data from the tile RAM, because every line of a tile is 16 bits. From that it can construct eight new pixels. It starts over again, goes to the next location, and it can put those eight pixels into the upper half of the FIFO. And then it can just continue pushing those pixels out. But what is not happening is that it keeps pushing pixels and, when it's done, has to fetch some again. Of course this is all interleaved and running at the same time. So let's walk through this real quick. The FIFO is running at twice the speed, so it does two pushes until the fetcher can do one step. So push, push, and we read the first byte of data. Push, push, and the
second part of the data. And at that point, it cannot put the data into the FIFO yet, because the FIFO doesn't have space yet. So the fetcher switches to red, has to wait for two more cycles here, so it's idle for a while, and then it puts the data in. If you look at the memory access patterns, you can see three reads and an idle, three reads and an idle. So again: the FIFO pushes one pixel per clock at 4 MHz and pauses unless it contains more than eight pixels. And the fetcher runs at 2 MHz, takes three clocks to fetch those eight new pixels, and pauses in the fourth clock unless there's space available in the FIFO. Scrolling is done very simply. Let's say we scroll by three pixels, so everything is moved to the left by three pixels. The first three pixels are just discarded, and then the next pixel goes here on the LCD. At the end of the line this becomes interesting, because when we trigger on 160, the FIFO may contain the next few pixels that we won't actually draw, and the fetcher is already in the middle of fetching the next tile that we also don't care about. So at this point it'll just stop all this. It has done too much work, and it will be in H-blank mode, which is the reason why a line takes 43 clocks instead of the more logical 40 clocks. If we have a window, let's say the window triggers at position 26, and we're here at 26. The FIFO has some data in it, and the fetcher is somewhere in the middle. It will completely clear the FIFO, and the FIFO will be stopped, because we don't want those pixels that are already lined up anymore. The fetcher will switch over to the map of the window and will be restarted. Then we'll do the tile fetch, the data fetch, the data fetch, and we'll get the data from the window. We can put it into the pixel FIFO, so as soon as those are shifting out, window pixels will be drawn. With sprites, there are 10 comparators that are triggering on the X position. Let's say here at position 26 we have a sprite at that location. And again, the
pixel FIFO has lots of pixels, and the fetcher is somewhere in the middle. First, we temporarily suspend the pixel FIFO; it cannot push out any more pixels. We temporarily switch the fetcher to doing a sprite fetch and restart it. So we're getting the sprite information, and instead of putting it at the end, we overlay it on the first eight pixels and mix it onto those pixels. And this explains why the FIFO always has to have eight pixels in there: because that's how it mixes in the sprites. This is in stark contrast to other systems. So we just push out pixels at a constant rate until a window starts. When the window starts, the FIFO gets cleared, and we are not pushing out pixels for quite a while, until the FIFO has the window data again, at which point it is resumed. So it takes 43 clocks or more, depending on what's going on on the screen. With sprites, it can take even more. On an LCD-based system you can do that; you can suspend sending pixels. On a CRT-based system, you cannot. For example, on a C64 a line always has to have exactly 40 clocks, because any pixel that comes a clock too late will be shifted to the right, visually. So it's not completely accurate that we have 40 clocks for pixel transfer; it's more like 43 plus, and the H-blank area is just the remainder of the line. In practice, this is more like what it looks like, depending on what sprites or background you have. So I wasn't completely honest about how the pixel FIFO works. It does not actually store pixel colors. What it does store is the original information: the bit combinations and the source. Like here it says these are nine background pixels. And the same is true for the fetcher. It does not create pixel colors. It creates those bit combinations, plus the information which sprite palette it was, or what the source was. So let's mix these together. The sprite is priority zero, meaning it draws on top of the background. Let's go through these. Here the sprite's 00 means this is translucent, so
the background wins. In this case, the sprite wins, because it draws on top of the background, and this is true for most of those pixels. At the very last pixel, we can see this is a sprite with palette one, and it's translucent again, so here the background wins again. Let's do this one more time with another sprite that is at the exact same location. In the first case here, a sprite with palette zero draws on top of the background: the sprite wins. And in this case, the new sprite does not win over the old sprite. There's already a sprite at this pixel, so the old sprite wins. And this is true for everything. This is how sprites that draw farther to the right don't draw on top of existing sprites, and sprites with higher numbers don't draw on top of existing sprites. And this is the last one, where the sprite wins again. The applying of the palette is only done at the very end, when the pixel is shifted out. So we look it up here in the palette, we convert it to a color, and we put it on the LCD. Look it up, convert it, LCD. Another one: black. And this is how it's also done on the color system. Starting with the Super Game Boy, existing games that couldn't really deal with this could be colorized.
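The mixing rules walked through above can be condensed into a few lines of Python. This is only a sketch of the decision logic, not of the hardware: a FIFO slot is modeled as a hypothetical `(value, source)` pair, where `value` is the 2-bit combination and `source` is `"BG"`, `"S0"` or `"S1"`, and the function names are made up for this example.

```python
def mix(existing, sprite_value, sprite_palette, behind_bg):
    """Mix one sprite pixel onto one FIFO slot and return the new slot.

    existing       -- (value, source) currently in the FIFO
    sprite_value   -- 2-bit value of the sprite pixel (0 = translucent)
    sprite_palette -- "S0" or "S1"
    behind_bg      -- True if the sprite's priority bit is 1
    """
    value, source = existing
    if sprite_value == 0:
        return existing          # translucent sprite pixel: nothing changes
    if source != "BG":
        return existing          # an earlier sprite already owns this pixel
    if behind_bg and value != 0:
        return existing          # priority 1: only background value 0 loses
    return (sprite_value, sprite_palette)


def shift_out(pixel, palettes):
    """Apply the palette only at the very end, when the pixel leaves the FIFO.

    palettes maps "BG", "S0" and "S1" to a list of four colors each.
    """
    value, source = pixel
    return palettes[source][value]
```

Because the FIFO keeps the bit combination and its source rather than a finished color, the same `shift_out` step works unchanged whether the palette entries are four shades of gray or RGB colors.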
The existing three palettes can now be RGB palettes. Everything else in the system is the same, but as soon as we shift this out, it looks up an RGB color and puts a pink pixel on the screen. 11, S1, that's here: black pixel. One more example: 01 from S1, because that used to be a sprite with palette one; put it on the screen. That's the end of the technical part. We have five more minutes, so let's talk about development. In case you're now interested in doing Game Boy development, there are some really nice tools. The Rednex Game Boy Development System is a set of command-line tools that work really nicely with makefiles and your editor of choice. When you want to debug your code, there's the BGB emulator, which is meant for Windows but works really nicely with Wine on top of OS X or Linux as well. It has a built-in debugger with single-stepping and breakpoints, and it has this really nice video RAM viewer that shows you all the details about what's going on. It's also really nice to run demos in it and see what's going on inside. And if you want to run it on real hardware, there are devices like the EverDrive, where you can put in an SD card. And since we have another four minutes, let's talk about my favorite peripheral of the Game Boy, the Game Boy Camera, but not from a technical perspective, just how great a device that is. This is the Game Boy Camera. You put it in the back of the Game Boy. It shoots really nice pictures. You can print them on the Game Boy Printer on thermal paper, if you can still get that paper. And you can take awesome pictures like that. Let's zoom in a little. These are really nice pictures. Every picture is based on the CCD, which has 14 kilopixels at a bit depth of two. So next time you go on a trip, make sure to take a Game Boy. Don't forget your Game Boy Camera, and a PC with a parallel port, and that one link cable that you cannot get on eBay anymore. So thanks to those people who helped debug the Game Boy and helped me with the presentation, and those people who helped me in various other ways.
So in the series of the ultimate talks, this was now the fifth talk. What's next? There should be a talk next year, right? I'm suggesting two talks. For 34C3, I'm nominating Dominic Wagner to talk about the Acorn Archimedes, and I nominate Janis Harder for the Super Nintendo. It's your choice: you can do these talks, or you can put a bucket of ice water on top of your head. Thank you for your attention, and see you next year.