So, about our next speaker: he's a security researcher focused on embedded systems, secure communications and mobile security. He was nominated by Forbes for the 30 Under 30 in technology and has also won the OWASP AppSec CTF. He has also found and responsibly disclosed multiple vulnerabilities. And especially for you Nintendo aficionados, I want you to watch out for the next intro, which is really amazing and which you will all love. Thank you very much. Damn it. Oof, what a trip. Welcome to my talk on hacking the new Nintendo Game and Watch Super Mario Bros. My name is Thomas Roth, I'm a security researcher and trainer from Germany, and you can find me on Twitter at @ghidraninja and also on YouTube at stacksmashing. Now, this year marks the 35th anniversary of our favorite plumber, Super Mario. To celebrate that, Nintendo launched a new game console called the Nintendo Game and Watch Super Mario Bros. The console is lightweight and looks pretty nice, and it comes pre-installed with three games and also this nice animated clock. The three games are Super Mario Bros., the original NES game, Super Mario Bros.: The Lost Levels, and a reinterpretation of an old Game and Watch game called Ball. Now, as you probably know, this is not the first retro console that Nintendo has released. In 2016 they released the NES Classic and in 2017 they released the SNES Classic. These devices were super popular in the homebrew community because they make it really easy to add additional ROMs, to modify the firmware and so on. You can basically just plug them into your computer, install a simple piece of software, and do whatever you want with them.
The reason for that is that they run Linux and have a pretty powerful ARM processor on the inside, so they are really nice devices to play with. And so when Nintendo announced this new console, a lot of people were hoping for a similar experience: a nice mobile homebrew device. Now, if you were to make a Venn diagram of some of my biggest interests, you would have reverse engineering, hardware hacking and retro computing, and this new Game and Watch fits right in the middle of that. So when it was announced on the 3rd of September, I knew that I needed to have one. Given how hard the NES and SNES Classic were to buy for a while, I pre-ordered it on four or five different sites. A couple of those orders got cancelled, but I was pretty excited because I still had three pre-orders, and it was supposed to ship on the 13th of November. I was really looking forward to this, and I was having breakfast on the 12th of November when suddenly the doorbell rang and DHL delivered the new Game and Watch, one day before the official release. Now, at that point in time there was no technical information available about the device whatsoever. If you searched for Game and Watch on Twitter, you would only find the announcements, or maybe a picture of the box from someone who had also received it early, but there were no teardowns, no pictures of the insides and, most importantly, nobody had hacked it yet. This gave me, as a hardware hacker, the kind of unique opportunity to potentially be the first one to hack a new Nintendo console, and so I literally dropped everything else I was doing and started investigating the device. Now, I should say that normally I stay pretty far away from any new console hacking, mainly because of the piracy issues.
I don't want to enable piracy, I don't want to deal with piracy, and I don't want to build tools that enable other people to pirate stuff. But given that you cannot buy any more games for this device, and that all the games on it were already released over 30 years ago, I was not really worried about piracy and felt pretty comfortable sharing all the results of the investigation, including the issues we found that allowed us to customize the device. In this talk I want to walk you through how we managed to hack the device and how you can do it at home using relatively cheap hardware, and I hope you enjoy it. Now, let's start by looking at the device itself. The device is pretty lightweight and comes with a nicely sized case, so for me it sits really well in my hand. It has a nice 320 by 240 LCD display, a D-pad, A and B buttons, and also three buttons to switch between the different game modes. On the right side we also have the power button and the USB-C port. Now, before you get excited about the USB port, I can already tell you that unfortunately Nintendo decided not to connect the data lines of the USB port, so you can really only use it for charging. Also, because we are talking about Nintendo here, they use their proprietary tri-point screws on the device, so to open it up you need one of those special tri-point bits. Luckily, nowadays most bit sets should have them, but it would still suck if you order your unit and then can't open it up because you're missing a screwdriver. After opening it up, the first thing you probably notice is the battery, and if you've ever opened up a Nintendo Switch Joy-Con before, you might recognize it, because it's the exact same battery that's used in the Joy-Cons.
This is very cool because if, down the line, let's say in two or three years, the battery of your Game and Watch dies, you can just go and buy a Joy-Con battery, which you can get really cheaply almost anywhere. Next to the battery, on the right side, we have a small speaker, which is not very good, and underneath we have the main PCB with the processor, the storage and so on. Let's take a look at those. The main processor of the device is an STM32H7B0. This is a Cortex-M7 from STMicroelectronics with 1.3 megabytes of RAM and 128 kilobytes of flash. It runs at 280 megahertz and is a pretty beefy microcontroller, but it's much less powerful than the processor in the NES or SNES Classic. This processor is really just a microcontroller, so it can't run Linux and it can't run, let's say, super complex software. Instead it is programmed in a bare-metal way, and so we will have a bare-metal firmware on the device. To the right of it you can also find a 1 megabyte SPI flash, and so overall we have roughly 1.1 megabytes of storage on the device. Now, most microcontrollers, or basically all microcontrollers, have a debugging port, and if we take a look at the PCB you can see that there are five unpopulated contacts here. If you see a couple of unpopulated contacts close to your CPU, it's very likely that they are the debugging port. Luckily, the datasheet for the STM32 is openly available, so we can check the pinout in the datasheet and then use a multimeter to see whether these pins are actually the debugging interface. It turns out they are, and so we find the SWD debugging interface as well as VCC and ground exposed on these pins.
Now, this means that we can use a debugger, for example a J-Link or an ST-Link or whatever, to connect to the device, and because the contacts are really easy to access you don't even have to solder: you can just hook up a couple of test pins and they will allow you to easily connect your debugger. Now, the problem is that on most devices the debugging interface is locked during manufacturing. This is done to prevent people like us from doing whatever we want with the device, from dumping the firmware, potentially reflashing it and so on. So I was very curious to see whether we could actually connect to the debugging port. When starting up J-Link and trying to connect, we can see that it actually connects successfully, but when you take a closer look there's also a message that the device is active read protected. This is because the STM32 chip features something called RDP, the readout protection level. This is basically the security setting for the debugging interface, and it has three levels. Level 0 means no protection is active. Level 1 means that the flash memory is protected, so we can't dump the internal flash of the device; however, we can dump the RAM contents and we can also execute code from RAM. And then there's level 2, which means that all debugging features are disabled. Now, just because a chip is in level 2 doesn't mean that you have to give up. For example, in our talk wallet.fail a couple of years ago, we showed how to use fault injection to bypass the level 2 protection and downgrade a chip to level 1. However, on the Game and Watch we are lucky and the interface is not fully disabled. Instead it's in level 1, and so we can still dump the RAM, which is a pretty good entry point even though we can't dump the firmware yet.
Now, having dumped the RAM of the device, I was pretty curious to see what's inside of it. One of my suspicions was that the emulator that's hopefully running on the device loads the original Super Mario Bros. ROM into RAM, and so I was wondering whether we could find the ROM that the device uses in the RAM dump. So I opened the RAM dump in a hex editor, opened the original Super Mario Bros. ROM in a second hex editor window, and tried to find different parts of the original ROM in the RAM dump. It turns out that yes, the NES ROM is loaded into RAM, and it's always at the same address, so it probably gets copied into RAM during boot or something along those lines. This is pretty cool to know because it tells us a couple of things. First off, we now know that the debug port is enabled and working, but that it's unfortunately at RDP level 1, so we can only dump the RAM. We also know that the NES ROM is loaded into RAM, which means that the device runs a real NES emulator, and so if we get lucky we can, for example, just replace the ROM that is used by the device and play our own NES game. Next, it was time to dump the flash chip of the device. For this I'm using a programmer called the MiniPro together with one of these really useful SOIC-8 clips, which you can simply clip onto the flash chip to dump it. One warning, though: the flash chip on the device runs at 1.8 volts, so you want to make sure that your programmer also supports 1.8 volt operation. If you accidentally try to read it out at 3.3 volts, you will break your flash. Trust me, because it happened to me on one of my units.
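The hex-editor search for pieces of the ROM inside the RAM dump can also be automated. The following is only a minimal sketch under stated assumptions: the function names, the chunk size and the step are made up for illustration, and the demo uses synthetic random data instead of a real dump. With a real NES ROM file you would probably want to skip the 16-byte iNES header first, since only the ROM contents are likely to appear in RAM.

```python
import random


def find_rom_chunks(ram: bytes, rom: bytes, chunk: int = 64, step: int = 1024):
    """Search the RAM dump for fixed-size chunks of the known ROM.

    Returns a list of (rom_offset, ram_offset) pairs for every chunk found.
    """
    hits = []
    for rom_off in range(0, len(rom) - chunk + 1, step):
        ram_off = ram.find(rom[rom_off:rom_off + chunk])
        if ram_off != -1:
            hits.append((rom_off, ram_off))
    return hits


def guess_load_offset(hits):
    """If every hit implies the same displacement, the ROM sits in RAM as
    one contiguous copy; return that offset into the dump, else None."""
    deltas = {ram_off - rom_off for rom_off, ram_off in hits}
    return deltas.pop() if len(deltas) == 1 else None


if __name__ == "__main__":
    # Synthetic stand-ins for the real files (hypothetical data).
    rng = random.Random(0)
    rom = bytes(rng.randrange(256) for _ in range(8192))
    ram = bytes(4096) + rom + bytes(4096)
    hits = find_rom_chunks(ram, rom)
    print(f"{len(hits)} chunks found, ROM starts at dump offset {guess_load_offset(hits)}")
```

Running the same search on dumps taken after several boots is one way to confirm that the ROM always ends up at the same address.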
Now, with this flash dump from the device we can start to analyze it, and what I always like to do first is take a look at the entropy, or the randomness, of the flash dump. Using binwalk with the -E option we get a nice entropy graph, and in this case you can see that we have very high entropy over almost the whole flash contents, which mostly indicates that the flash contents are encrypted. It could also mean compression, but if it's compressed you would often see more dips in the entropy, and in this case it's one very high entropy stream. We also notice that there are no repetitions whatsoever, which tells us that it's probably not a simple XOR-based encryption but instead something like AES or similar. But just because the flash is encrypted doesn't mean we have to give up. On the contrary, I think now it starts to get interesting, because you actually have a challenge and it's not just plug and play, so to say. One of the biggest questions I had was: is the flash actually verified? Does the device boot even though the flash has been modified? Because if it does, this would open up a lot of attack vectors, as you will see. To verify this, I put zeros in random places in the flash image, some at address zero, some at hex 2000 and so on, and then I checked whether the device would still boot. With most flash modifications it would still boot just fine. This tells us that even though the flash contents are encrypted, they are not validated: they are not checksummed or anything, and so we can potentially trick the device into accepting a modified flash image. This is really important to know, as you will see in a couple of minutes.
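To illustrate what such an entropy graph measures, here is a minimal sketch that computes the Shannon entropy of a dump block by block, plus a helper for the zero-patching experiment. This is not binwalk's implementation; the function names and the block size are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 at most."""
    if not block:
        return 0.0
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())


def entropy_profile(data: bytes, block_size: int = 1024):
    """Entropy of each consecutive block; plotting this gives the graph."""
    return [shannon_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def zero_patch(image: bytes, offset: int, length: int = 4) -> bytes:
    """Return a copy of the flash image with `length` bytes at `offset` zeroed."""
    if offset < 0 or offset + length > len(image):
        raise ValueError("patch lies outside the image")
    patched = bytearray(image)
    patched[offset:offset + length] = bytes(length)
    return bytes(patched)


if __name__ == "__main__":
    sample = bytes(range(256)) * 4 + bytes(1024)   # one "random-looking" block, one empty block
    print(entropy_profile(sample))
```

Encrypted or compressed regions sit close to 8 bits per byte, while code, text and padding usually score noticeably lower.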
My next suspicion was that maybe the NES ROM we see in RAM is actually loaded from the external flash. To find out whether that's the case, I again took the flash image, inserted zeros at multiple positions, flashed that over, booted up the game, dumped the RAM, and then compared the NES ROM that I was now dumping from RAM with the one that I had dumped initially to check whether they were equal. My suspicion was that maybe I could overwrite a couple of bytes in the encrypted flash and thereby modify the NES ROM. After doing this for, I don't know, half an hour, I got lucky: I modified four bytes in the flash image, and four bytes in the RAM, sorry, in the ROM that was loaded into RAM, changed. This tells us quite a bit. It means that the ROM is loaded from flash into RAM and that the flash contents are not validated. What's also important is that we changed four bytes in the flash and exactly four bytes in the decrypted image changed. This is very important to know, because if we take a look at what we would expect to happen when we change the flash contents, there are multiple possible outcomes. For example, here we have the SPI flash contents on the left and the RAM contents on the right, so the RAM contents are basically the decrypted version of the SPI flash contents. Now let's say we change four bytes in the encrypted flash image to zeros: how would we expect the RAM contents to change? If we saw that 16 bytes in the RAM are changing, we would potentially be looking at an encryption algorithm such as AES in electronic codebook mode, because it's a block-based encryption, and so if we change four bytes in the input data, a whole block, in this case 16 bytes, of the output data would change. The next possibility is that we change four bytes in the SPI flash and all data afterwards changes, and in this case we would be looking at some kind of chaining cipher such as AES in CBC mode.
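The comparison of the two RAM dumps can be sketched as a small classifier that looks at how far a four-byte flash patch spreads in the decrypted data. This is only an illustration: the function name, the labels and the assumption that the patch lies inside a single 16-byte block are mine. Note also that with textbook CBC decryption the damage is limited to the patched block and the one after it, so the sketch reports that as a separate case from a change that spreads over many blocks.

```python
def classify_change(before: bytes, after: bytes, patch_len: int = 4, block: int = 16) -> str:
    """Classify how a small ciphertext patch shows up in the decrypted data.

    Assumes the patch lay inside a single cipher block and that `before`
    and `after` are the same decrypted region dumped before and after it.
    """
    diff = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    if not diff:
        return "unchanged"          # the patched area is not part of this region
    span = diff[-1] - diff[0] + 1
    blocks = sorted({i // block for i in diff})
    if span <= patch_len:
        return "bytewise"           # stream-like behaviour, e.g. a counter mode
    if len(blocks) == 1:
        return "single-block"       # one whole block garbled, e.g. ECB
    if blocks == [blocks[0], blocks[0] + 1]:
        return "two-blocks"         # patched block plus the next one, e.g. CBC decryption
    return "wide"                   # the change spreads over many blocks


if __name__ == "__main__":
    original = bytes(range(64))
    patched = bytearray(original)
    patched[20:24] = b"\xff" * 4    # only the four patched positions differ
    print(classify_change(original, bytes(patched)))
```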
However, if we change four bytes in the SPI flash and only four bytes in the RAM change, we are looking at something such as AES in counter mode, and to understand this, let's take a closer look at how AES in CTR mode works. AES-CTR works by taking your cleartext and XORing it with an AES keystream that is generated from a key, a nonce and a counter. Now, the AES stream that is used to XOR your cleartext will always be the same if the key and the nonce are the same. This is why it's super important that, if you use AES-CTR, you always select a unique nonce for each encryption: if you encrypt similar data with the same nonce twice, large parts of the resulting ciphertext will be the same. So the cleartext gets XORed with the AES-CTR stream and we get our ciphertext. Now, if we know the cleartext, which we do because the cleartext is the ROM that is loaded into RAM, and we know the ciphertext, which we do because it's the contents of the encrypted flash we just dumped, we can basically reverse the operation, and as a result we get the AES-CTR stream that was used to encrypt the flash. This means that we can take, for example, a custom ROM, XOR it with the AES-CTR stream we just calculated, and generate our own encrypted flash image, for example with a modified ROM. So I wrote a couple of Python scripts to try this, and after a while I was running a hacked Super Mario Bros. instead of Super Mario Bros. So we hacked the Nintendo Game and Watch one day before the official release and we can install modified Super Mario Bros. ROMs. You can find the scripts that were used for this on my GitHub, in a repository called game-and-watch-hacking. I was super excited, because it meant that I had succeeded and basically hacked a Nintendo console one day before the official release. I finished the level, but unfortunately Toad wasn't as excited: he told me that our firmware is still in another castle. And so, on the Monday after the
launch of the device, I teamed up with Konrad Beckmann, a hardware hacker from Sweden whom I met at the previous Congress, and we started chatting and throwing ideas back and forth. Eventually we noticed that the device has a special RAM area called ITCM RAM, which is a tightly coupled instruction RAM that is normally used for very high performance routines such as interrupt handlers, so it's a very fast RAM area. We realized that we had never actually looked at the contents of that ITCM RAM, and so we dumped it from the device using the debugging port, and it turns out that this ITCM RAM contains ARM code. So again the question is: where does this ARM code come from? Does it maybe, just like the NES ROM, come from the external flash? So I repeated the whole thing that we also did with the NES ROM: I just put zeros at the very beginning of the encrypted flash, rebooted the device and dumped the ITCM RAM. I got super lucky: on the first try the ITCM contents already changed. And because the ITCM contains code, not just data (earlier we only had the NES ROM, which is just data, but this time the RAM contains code), this means that with the same XOR trick we used before we could inject custom ITCM code into the external flash, which would then be loaded into RAM when the device boots. Because it's a persistent method, we can then reboot the device and let it run without the debugger connected, and so whatever code we load into this ITCM area will actually be able to read the flash. So we could potentially write some code that somehow gets called by the firmware and then copies the internal flash into RAM, from where we can retrieve it using the debugger. Now, the problem is: let's say we somehow have a custom payload in this ITCM area. We don't know which address of this ITCM code gets executed, so we don't know whether the firmware will jump to address 0 or address 200 or whatever. But there's a really simple
trick to still build a successful payload, and it's called a NOP slide. A NOP, or no-operation, is an instruction that simply does nothing, and if we fill most of the ITCM RAM with NOPs and put our payload at the very end, we build something that is basically a NOP slide. So when the CPU, indicated by Mario here, jumps to a random address in that whole NOP slide, it will start executing NOPs, NOPs, NOPs, slide down into our payload and execute it. Even if Mario jumps right into the middle of the NOP slide, he will always slide down the slide and end up in our payload. Konrad wrote this really, really simple payload, which is only around 10 instructions and basically just copies the internal flash into RAM, from where we can then retrieve it using the debugger. So, woohoo, super simple exploit: we have a full firmware backup and a full flash backup, and now we can really fiddle with everything on the device. We've actually released tools to do this yourself, so if you want to back up your Nintendo Game and Watch you can just go onto my GitHub and download the game-and-watch-backup repository, which contains a lot of information on how to back it up. It does checksumming and so on to ensure that you don't accidentally break your device, and you can easily back up the original firmware, install homebrew, and then always go back to the original software. We also have an awesome support community on Discord, so if you ever need help I think you will find it there. So far we haven't had a single bricked Game and Watch, so it looks to be pretty stable. And so I was pretty excited, because the quest was over. Or was it? If you ever claim on the internet that you successfully hacked an embedded device, there will be exactly one response and one response only: but does it run Doom? Literally my Twitter DMs, my YouTube comments and even my friends were spamming me with the challenge to get Doom running on the device. But to get Doom running we
first needed to bring up all the hardware, so we basically needed to create a way to develop and load homebrew onto the device. Luckily for us, most of the components on the board are very well documented and there are no NDA components: for example, the processor has an open reference manual and an open source library to use it, the flash is a well-known flash chip, and so on. There are only a couple of proprietary or custom components. For example, the LCD on the device is proprietary, and we had to sniff the SPI bus that goes to the display to decode the initialization of the display. After a while we had the full hardware running: we had LCD support, audio support, sleep support, buttons, and even flashing tools that allow you to simply use an SWD debugger to dump and rewrite the external flash. You can find all of these things on our GitHub. Now, if you want to mod your own Game and Watch, all you need is a simple debugging adapter such as a cheap three dollar ST-Link, a J-Link or a real ST-Link device, and then you can get started. We've also published a base project for anyone who wants to get started with building their own games for the Game and Watch. It's really simple: it's just a frame buffer you can draw to, input is really simple and so on, and as I said, we have a really helpful community. Now, with all the hardware up and running, I could finally start porting Doom. I started by looking around for other ports of Doom to an STM32, and I found this project by floppes called stm32doom. The issue is that stm32doom is designed for a board with 8 megabytes of RAM, and the data files for Doom were stored on an external USB drive. On our platform we only have 1.3 megabytes of RAM, 128 kilobytes of flash and only 1 megabyte of external flash, and we have to fit all the level information, all the code and so on in there.
Now, the Doom level information is stored in so-called WAD files (WAD stands for "Where's All my Data"), and these data files contain the sprites, the textures, the levels and so on. The WAD for Doom 1 is roughly 4 megabytes in size and the WAD for Doom 2 is 14 megabytes in size, but we only have 1.1 megabytes of storage, plus we have to fit all the code in there. So obviously we needed to find a very, very small Doom WAD, and as it turns out there's a thing called miniwad, which is a minimal Doom IWAD: basically all the bells and whistles are stripped from the WAD file and everything is replaced by simple outlines and so on. While it's not pretty, I was pretty confident that I could get it working, as it's only 250 kilobytes of storage, down from 14 megabytes. In addition to that, a lot of stuff in the Chocolate Doom port itself had to be changed. For example, I had to rip out all the file handling and add a custom file handler, I had to add support for the Game and Watch LCD and button input, and I also had to get rid of a lot of things to get it running somewhat smoothly: for example, the infamous wipe effect had to go, and I also had to remove sound support. The next issue was that once it was compiling, it simply would not fit into RAM and crashed all the time. On the device we have roughly 1.3 megabytes of RAM in different RAM areas, and just the frame buffer that we obviously need takes up 154 kilobytes of that. Then we have 160 kilobytes of initialized data, 320 kilobytes of uninitialized data and a ton of dynamic allocations that are done by Chocolate Doom. These dynamic allocations were a huge issue, because the Chocolate Doom source code does a lot of small allocations which are only used for temporary data, so they get freed again, and your dynamic memory gets very, very fragmented very quickly. Eventually there's just not enough space to, for example, initialize the level. To fix this, I took the Chocolate Doom code and changed a lot of the dynamic allocations to static
allocations, which also had the big advantage of making the compiler's error messages much more meaningful, because it would actually tell you: hey, this and this data does not fit into RAM. Eventually, after a lot of trial and error, and after copying as many of the original assets as possible into the minimal IWAD, I got it: I had Doom running on the Nintendo Game and Watch Super Mario Bros., and I hopefully calmed the internet gods that forced me to do it. Now, unfortunately the USB port is physically not connected to the processor, so it will not be possible to hack the device simply by plugging it into your computer. However, it's relatively simple to do using one of these USB debuggers. The most requested type of homebrew software was obviously emulators, and I'm proud to say that by now we actually have a fairly large collection of emulators running on the Nintendo Game and Watch. It all started with Konrad Beckmann discovering the retro-go project, which is an emulator collection for a device called the ODROID-GO. The ODROID-GO is a small handheld with similar input and size constraints as the Nintendo Game and Watch, so it's kind of cool to port this over, because it basically already did all of the hard work, so to say. retro-go comes with emulators for the NES, for the Game Boy and the Game Boy Color, and even for the Sega Master System and the Sega Game Gear. After a couple of days, Konrad was actually able to show off his NES emulator running Zelda and other games such as Contra on the Nintendo Game and Watch. This is super fun. Initially we only had a really basic emulator that could barely play: we had a lot of frame drops, and we didn't have nice scaling, V-sync and so on. But now, after a couple of weeks, it's really a nice device to use and to play with. We also have a Game Boy emulator running, so you can play your favorite Game Boy games such as Pokemon, Super Mario Land and so on on the Nintendo Game and
Watch, if you own the corresponding ROM backups. We also experimented with different scaling algorithms to make the most out of the screen, so you can change the scaling algorithm that is used for the display depending on what you prefer, and you can even change the palette for the different games. We also have a nice game chooser menu, which allows you to have multiple ROMs on the device that you can switch between. We have save state support, so if you turn off the device it will save wherever you left off, and you can even come back to your saved game after the battery has run out. You can find the source code for all of that in the retro-go repository from Konrad, and it's really, really awesome. Other people built, for example, emulators for the CHIP-8 system. The CHIP-8 emulator comes with a nice collection of small arcade games and so on, and it's really fun and really easy to develop for, so really give this a try if you own a Game and Watch and want to try homebrew on it.
Tim Schuerewegen is even working on an emulator for the original Game and Watch games, and this is really cool because it basically turns the Nintendo Game and Watch into an emulator for all Game and Watch games that were ever released. What was really amazing to me is how the community came together. We were pretty open about the progress on Twitter, Konrad was also streaming a lot of the process on Twitch, and we opened up a Discord where people who were interested in hacking on the device could join, and it was amazing to see what came out of the community. For example, we now have a working storage upgrade that works both with homebrew and with the original firmware, so instead of one megabyte of storage you can have 16 megabytes of flash, and you just need to replace a single chip, which is pretty easy to do. Then, for understanding the full hardware, Daniel Cuthbert and Daniel Padilla provided us with high resolution X-ray images, which allowed us to fully understand every single connection, even of the BGA parts, without desoldering anything. Then Jake Little of Upcycle Electronics traced every last trace on the PCB, on the X-rays and also using a multimeter, and he even created a schematic of the device, which gives you all the details you need when you want to program something, and that was really, really fun. Sander van der Wel, for example, even created a custom backplate, and now there are even projects that try to replace the original PCB with a custom PCB with an FPGA and an ESP32, so it's really exciting to see what people come up with. Now, I hope you enjoyed this talk, I hope to see you on our Discord if you want to join the fun, and thank you for coming. Hi, wow, that was a really amazing talk, thank you very much, Thomas. As announced in the beginning, we do accept questions from you, and we have quite a few. Let's see if we manage to make it through all of them.
The first one is: did you read the articles about Nintendo observing hackers, with private investigators etc., and are you somehow worried about this? Oh, what's going on with my camera? Looks like Luigi messed around with my video setup here. Aha. Yeah, I've read those articles, but I believe that in this case there is no piracy issue, right? I'm not allowing anyone to play any new games. If you wanted to dump a Super Mario ROM, you would have done it 30 years ago, or on the NES Classic, or on the Switch, or on any of the 100 consoles Nintendo launched in between, and so I'm really not too worried about it, to be honest. I also think that the aspect of the target audience is to be seen here. So, on to the next question, which is: do you think that there is a reason why an external flash chip has been used? Yeah, so the internal flash of the STM32H7B0 is relatively small, it's only 128 kilobytes, and so they simply couldn't fit everything in. Even just a single frame buffer picture is larger than the internal flash, and so I think that's why they did it, and I'm glad they did. Yeah, sure. And is the decryption done in software, or is it a feature of the microcontroller? So the microcontroller has an integrated feature called OTFDEC. Basically, the flash is directly mapped into memory, and they have this on-chip peripheral called OTFDEC that automatically provides the decryption, so it's all done in hardware. You can even retrieve the keys from hardware, basically. Oh, okay, very nice. And the next question is somehow related to that: in your opinion, is the encryption Nintendo has applied even worth the effort for them? It feels like it's just there to give shareholders a false sense of security. What would you think about that?
I think, from my perspective, they chose just the right encryption, because it was a ton of fun to reverse engineer and to try to bypass it, and so it was an awesome challenge. So I think they did everything right. But I also think that in the end it's such a simple device, and if you take a look at what people are building on top of it, with games and all that kind of stuff, I think they did everything right. Probably it was just a tick mark of "yeah, we totally locked down JTAG", but I think it's fun, because again, it doesn't open up any piracy issues. Sure. And the next one is related to the NOP slide, which you animated very, very well: wouldn't the starts of subroutines be suitable as well for that goal? The person asking says that a big PUSH of r4, r5, etc. is quite recognizable. Yeah, so absolutely. The time from finding the data in the ITCM RAM to actually exploiting it was less than an hour, and if we had tried to reverse engineer it, it would have been more work. Absolutely possible, and also not difficult, but just filling the RAM with NOPs took a couple of minutes, and so it was really the easiest and the fastest way, without fiddling around in Ghidra. Okay, cool, thanks. And this is more a remark than a question. The person says it's strange that ST's AN5281 does not mention a single time that the data is not verified during encryption. I think it's more a fault on ST's side than on Nintendo's. What would you think about that? Yeah, I would somewhat agree, because in this case, even if you don't have JTAG, an ARM Thumb instruction is two to four bytes, and so you have a relatively small space to brute force to potentially get an interesting branch instruction and so on. So I think it's, yeah, I mean, it's not perfect, but doing verification is also very expensive computationally, and so I think it should just be the firmware that actually verifies the contents of the external flash.
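The malleability this answer relies on, which is also what the talk used earlier to swap in a modified ROM, can be sketched in a few lines. Everything here is a hedged illustration: `toy_keystream` is a SHA-256 based stand-in so the example runs with the standard library only, it is not AES-CTR and not how OTFDEC derives its stream, and the key, nonce and ROM contents are made up. The only property that matters is that the same key and nonce always give the same stream and that nothing checks the ciphertext.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for a counter-mode keystream (assumption: NOT real AES-CTR)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keystream(known_plain: bytes, cipher: bytes) -> bytes:
    """Known plaintext XOR ciphertext yields the keystream for that region."""
    return xor_bytes(known_plain, cipher)


def forge_ciphertext(keystream: bytes, new_plain: bytes) -> bytes:
    """Encrypt replacement data of the same length without knowing the key."""
    return xor_bytes(keystream, new_plain)


if __name__ == "__main__":
    key, nonce = b"device-secret-key", b"fixed-nonce"       # hypothetical values
    rom = b"ORIGINAL ROM CONTENTS..........."               # what we read from RAM
    flash = xor_bytes(rom, toy_keystream(key, nonce, len(rom)))   # what we read from flash

    stream = recover_keystream(rom, flash)                  # the attacker never sees `key`
    custom = b"MODIFIED ROM CONTENTS..........!"[:len(rom)]
    forged = forge_ciphertext(stream, custom)

    # The device "decrypts" the forged flash with its unchanged keystream.
    print(xor_bytes(forged, toy_keystream(key, nonce, len(forged))))
```

An integrity check over the decrypted contents in firmware, as suggested in the answer, is what would make such a forged image fail.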
Okay, so I think we should take two more questions and then we can go back to the studio. There's a question about the AES encryption keys: have you managed to recover them? Yes, we did, but there is an application note by ST and they do some crazy shifting around with the keys. But I think even just today, like an hour before the talk or so, a guy, or sorry, I'm not sure it's a guy, a person on our Discord actually managed to rebuild the full encryption. But I personally was never interested in that, because after you've downgraded the device to RDP level 0 you can just access the memory-mapped flash and get the completely decrypted flash contents. Sure, thanks. And the last question is about the LCD controller: whether it's used by writing pixels over SPI, or if it has some extra features, maybe even backgrounds or sprites or something like that. So the LCD itself doesn't have any special features. It has one SPI bus to configure it and then a parallel interface, so it takes up a lot of pins, but the chip itself has a peripheral called LTDC, which is an LCD controller that provides two layers with alpha blending, some basic windowing and so on. Okay, cool. Then thank you very, very much for the great talk and the great intro, and with that, back to our main studio in the orbit. Thank you very much. Back to orbit.