To our last talk in this hall today. It's about console hacking, and I guess that's the reason why you are here. Console hacking has a long tradition at our great conference, and we have seen lots of funny things, people doing stuff with Xboxes, PlayStations and everything. Okay, today we've got a team which deals with the Nintendo 3DS. So give a warm applause for Plutoo, Derrek and Smea.

I'm Smea, this is Plutoo, this is Derrek, and today we're going to talk to you about our work on the Nintendo 3DS. The way this talk is going to be structured is that we first go over the hardware organization and the software, just to give you a basic overview of how the system works. After that we're going to go into basically every layer of security the system has, and break every one of them.

Okay, so as you probably know, the original Nintendo 3DS was released in 2011. It's a system that's kind of underpowered: it's got a dual-core ARM11 CPU at 268 MHz, a nice proprietary GPU, a bit of RAM, you know, the usual. It's also backwards compatible with DS games, which is nice. Then the New 3DS was released in 2014 and 2015, depending on the region, and it's basically the same console with some improvements in the hardware. You've got a better CPU: it's got more cores and it's faster. It's got more RAM, basically everywhere. So it's just the same thing. It runs exactly the same software. It's got some exclusive software, but not much.

So in terms of a hardware overview, this is what what we're going to talk about looks like in general. You've got the top part right here, which is what we're going to go into first: the ARM11 part. The ARM11 is the main CPU. It has two cores, as I just said, or four cores on the New 3DS. It runs the main operating system, the games, all the applications. Basically, think of anything you do on a 3DS that you can see happening.
It's happening on that CPU. It's got access to all of main memory. That includes FCRAM, which is 128 or 256 megabytes depending on the model. FCRAM is actually divided into three separate regions. First you've got the application region, which contains the currently running game or application. Then the system region, which contains applets, basically tiny applications which run in the background. That includes the home menu, which is actually always running in the background, and the web browser, which you can run at the same time as your game, so it has to live there. And then you've got the base region, which is more interesting: it contains all the system modules of the operating system as well as some kernel data such as handle tables and MMU tables, so it's kind of sensitive stuff. Then you've got WRAM, which is tiny and contains all the kernel code and most of the kernel structures as well, so it's also an interesting target.

Then we've got the lower part, which is the ARM9 part of the hardware. The ARM9 is an entirely separate CPU. It runs basically the same microkernel as the ARM11; it's mostly the same code, it's just got fewer features. Mostly it runs a single process, called Process9, which does everything the ARM9 does. Beyond that, the role of the ARM9 is to broker access to hardware that might be sensitive in terms of security. So one of the things it does is broker access to all storage media, which includes the permanent storage as well as the SD card. And then it does all sorts of crypto stuff, which is really important, and it does that using hardware. There's this hardware key scrambler which is used to store secrets in hardware. Basically the idea is that you feed it two separate keys, and it generates a normal key and feeds that directly into the hardware implementation of the AES algorithm.
So that way we never actually see the final keys, which is kind of annoying. Beyond that, what you can see is that the ARM9 has access to all of main memory without any restrictions. But it's also got its own internal memory, which the ARM11 does not have access to. The ARM9 internal memory is where the ARM9 stores all of its code and all of its data, and this way we can't take over the ARM9 just from the ARM11 without some kind of exploit. So it's basically a security CPU.

This leads us to having four layers of security. First you have ARM11 userland, which is your games, your applications, whatever. Below that you have the ARM11 kernel, which has full privileges on the ARM11. Then you have ARM9 userland, which is Process9, and beyond that you have ARM9 kernel mode. So that's in theory. In practice, the microkernel has a system call which we call svcBackdoor, because essentially you feed it a function pointer and it just executes that function in kernel mode. So you don't even need an exploit if you have access to it: you just call it. Of course, on the ARM11 no application or title ever has access to that. But on the ARM9, Process9 actually has access to it, which means that userland and kernel mode there are basically the same thing. If you've got userland on the ARM9, you've got kernel mode. So that's nice.

Beyond that, in terms of cryptography on the system, they basically went all out. Anything that can be signed is signed. That includes the firmware and every application, and signatures are checked not only at install time, but also at runtime.
So that's something to keep in mind. Same thing for encryption: anything that can be encrypted is encrypted. And anything that can be made console specific through cryptography or authentication, such as the internal permanent storage, the data that's stored on the SD card, save games or extra data for games, is made console specific, or game card specific in the case of save games. So that's kind of annoying as well. And of course all this is handled by the ARM9 using the crypto hardware, so we've got to get through that if we want to do interesting things.

So first we're going to go through the first layer, which is ARM11 userland: basically getting a foothold into the system. We first need to find some kind of entry point, and there are challenges there. One of them is that the system implements strict data execution prevention (DEP). Pages will never be read-write-executable; they're only going to be read-only, read-write or read-execute. There's no way for a standard application to reprotect pages or map new pages that are read-write-executable, because the relevant system calls are locked out except for higher-privilege system modules. Another thing is that there's no ASLR. So that's not a challenge.
That's actually kind of nice. The nice thing here is that it makes save game vulnerabilities totally fair game, because we don't need an actual scripting environment or any kind of exotic vulnerability to exploit them, as long as we can get past DEP somehow. Then of course, the fact that all save games are both encrypted and made specific either to the game card, or to the console in the case of eShop games, is really annoying for save game vulnerabilities. Basically you can't use those as an initial entry point in most cases, because you can't generate the right AES MAC, or you just don't know the right cryptography. So that's annoying.

Thankfully, the 3DS runs WebKit, so we can always use that. WebKit is used in a number of places. Obviously it's used in the main web browser, which you can access from the home menu. It's also used in the YouTube application, which is available for free on the eShop and doesn't do any kind of client-side authentication of the server, so you can just redirect traffic, through a DNS server for example. The Miiverse applet and other stuff also use it. Those are slightly more secure, but might be usable at some point, I don't know. Anyhow, the important part here is that it's not only using WebKit.
It's using a very old version of WebKit. They do cherry-pick some patches into the version of WebKit they use, but only after we've exploited and released those bugs, so it comes a little too late most of the time. This has been used by multiple people, most notably yellows8, and it's proven to be a very efficient, reliable entry point.

Beyond that, we've got Cubic Ninja as an initial entry point. Cubic Ninja is a game that was released in 2011 on the Nintendo 3DS. It's nice because it allows users to share levels that they make themselves through QR codes, and also it is really bad at parsing those levels. So what you can do is manufacture your own QR code that's going to crash the game and give you access. So these are nice initial entry points.

Once we've got this, what you have to remember is that we might be able to crash the game and control registers, but we don't actually have our own code running, because of DEP. The obvious solution to this is to use ROP. For those of you who are not familiar with ROP: you build your own fake stack that lets you return into code snippets that are located right before return instructions. In this example you can jump to an instruction like pop {r0, pc}, which lets you load your own register value and then jumps to the next address that you give it. So this is a way of executing code without actually executing your own code, and it's widely used. It's the obvious thing to do. Of course, ROP is annoying.
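To make the fake-stack idea concrete, here is a minimal toy model in Python. It is only a sketch and not 3DS code: the gadget addresses, the second and third gadgets, the `HALT` marker and the tiny interpreter are all made up for illustration. Only the `pop {r0, pc}` gadget comes from the example in the talk.

```python
# Toy model of return-oriented programming (ROP).
# Every address and every gadget except "pop {r0, pc}" is hypothetical.

HALT = 0xDEAD0000  # made-up address that ends the simulation


def pop_r0_pc(regs, stack):
    """pop {r0, pc}: load r0, then the next gadget address, from the stack."""
    regs["r0"] = stack.pop(0)
    regs["pc"] = stack.pop(0)


def pop_r1_pc(regs, stack):
    """pop {r1, pc}: same idea for r1."""
    regs["r1"] = stack.pop(0)
    regs["pc"] = stack.pop(0)


def add_r0_r1_pop_pc(regs, stack):
    """add r0, r0, r1 ; pop {pc}"""
    regs["r0"] = (regs["r0"] + regs["r1"]) & 0xFFFFFFFF
    regs["pc"] = stack.pop(0)


# Existing code snippets that we are allowed to return into.
GADGETS = {
    0x00100000: pop_r0_pc,
    0x00100010: pop_r1_pc,
    0x00100020: add_r0_r1_pop_pc,
}


def run_chain(first_gadget, fake_stack):
    """Run gadgets until pc hits HALT; the fake stack is the 'program'."""
    regs = {"r0": 0, "r1": 0, "pc": first_gadget}
    stack = list(fake_stack)
    while regs["pc"] != HALT:
        GADGETS[regs["pc"]](regs, stack)
    return regs


# Fake stack: values and return addresses, no code of our own.
chain = [40, 0x00100010, 2, 0x00100020, HALT]
regs = run_chain(0x00100000, chain)
print(regs["r0"])
```

Even adding two numbers takes three gadgets and five stack words here, and you can only do what the existing gadgets happen to offer.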
It's very limiting. It can be enough to execute an exploit to get higher privileges, but overall it's just annoying and very limiting, for homebrew for example. And as I mentioned earlier, we don't have access to any of the system calls that would let us map read-write-executable pages. Also, the system does support dynamically linked libraries, so that might be a way, but these are signed and checked in places that we can't access at this point.

So what we're going to look at next is the GPU, to see if we can use that to bypass DEP. What you can see here is that the GPU has access not only to video RAM, but also to FCRAM, which if you'll recall is the main memory. If we look at this with all the different memory regions, we've got the application region here, which is entirely contained within what the GPU can access within FCRAM. The GPU cannot actually access all of FCRAM, so that's kind of limiting, but what we can see here is that application code is within range of the GPU's DMA access. The reason the GPU has to access FCRAM and video RAM through DMA, by the way, is so that it can read things such as textures and vertex buffers, so it's actually kind of important. And the reason it can write there is because it has to render its data somewhere.

The point is that we can use this to render data into main memory, and main memory contains application code. The physical layout is completely deterministic, and even if it wasn't, we could just use the read capabilities of the GPU to search for what we're looking for. So we can use this to overwrite our current application's text section, and we get code execution that way in spite of DEP. This way we execute our own unsigned code, which is great. But we are still confined within the application sandbox. So we bypassed DEP, but we're inside the sandbox.
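The reason GPU DMA gets around DEP can be shown with a small toy model in Python. This is a sketch under stated assumptions, not the real hardware interface: the class, its method names, the addresses and the `gpu_limit` value are all hypothetical. The point it illustrates is that page permissions are checked on the CPU's virtual accesses, while the DMA engine writes physical memory directly and is only limited by its own address window.

```python
# Toy model: CPU accesses go through page permissions, GPU DMA does not.
# All names, sizes and addresses here are hypothetical.

PAGE = 0x1000


class ToySystem:
    def __init__(self, size, gpu_limit):
        self.phys = bytearray(size)   # physical memory
        self.pages = {}               # virtual page -> (physical page, perms)
        self.gpu_limit = gpu_limit    # GPU can only reach addresses below this

    def map_page(self, vaddr, paddr, perms):
        self.pages[vaddr // PAGE] = (paddr // PAGE, perms)

    def _translate(self, vaddr, needed):
        ppage, perms = self.pages[vaddr // PAGE]
        if needed not in perms:
            raise PermissionError(f"page lacks '{needed}' permission")
        return ppage * PAGE + vaddr % PAGE

    def cpu_write(self, vaddr, data):
        # Assumes the write does not cross a page boundary.
        off = self._translate(vaddr, "w")
        self.phys[off:off + len(data)] = data

    def cpu_fetch(self, vaddr, length):
        off = self._translate(vaddr, "x")
        return bytes(self.phys[off:off + length])

    def gpu_dma_write(self, paddr, data):
        # No page table involved: only the GPU's own cutoff applies.
        if paddr + len(data) > self.gpu_limit:
            raise ValueError("outside GPU-accessible range")
        self.phys[paddr:paddr + len(data)] = data


system = ToySystem(size=0x4000, gpu_limit=0x3000)
TEXT_VADDR, TEXT_PADDR = 0x00100000, 0x2000
system.map_page(TEXT_VADDR, TEXT_PADDR, "r-x")   # text: readable, executable

payload = b"\x01\x02\x03\x04"                     # stand-in for our code
try:
    system.cpu_write(TEXT_VADDR, payload)
    cpu_write_blocked = False
except PermissionError:
    cpu_write_blocked = True

system.gpu_dma_write(TEXT_PADDR, payload)         # lands in the text page
fetched = system.cpu_fetch(TEXT_VADDR, len(payload))
print(cpu_write_blocked, fetched == payload)
```

In this model the same physical page is unwritable from the CPU side and writable from the DMA side, which is the whole trick; the cutoff is what later limits which regions can be reached this way.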
Being inside the sandbox means that we can only access our current application's save data, so if you want to install some kind of secondary exploit, this is too limiting. We can only access certain services and system calls, which is also limiting and frustrating. And we can't alter the memory layout, so we can't allocate more executable pages, as I mentioned earlier. So we're still kind of limited at this point.

So what we're going to do is look at what else the GPU can access. What you can see is that there's this entirely separate memory region the GPU can modify: it can access most of the system region. The system region contains a few things. It contains the home menu, as I mentioned, because that's an applet. It contains the internet browser. And it contains a single system module called NS, which we think stands for Nintendo Shell; we don't really know.

So let's look at this. First, we've got NS code, well beyond the GPU cutoff. We've got menu code, which is also well beyond the GPU cutoff. But we've got the menu's heap right here (actually there are separate heaps), and these are well within the GPU's range, so that's good. NS, unfortunately, is still well beyond the cutoff, all of its data and all of its code, so we apparently can't get to that.

What's interesting here is that the cutoff is right before the end of the system region, which as we just saw has some interesting things, and it also excludes all of the base region, which has very interesting things too. So it seems likely that Nintendo knew about the nefarious capabilities of GPU DMA, but they didn't do anything about it. It seems that they probably didn't realize what we could do with it, which is a lot. So, basically, we've got the menu heaps.
So what we do is this: since we have a heap, and this is all C++ code, we just find an object inside the heap and overwrite it. It's pretty simple. You find an object that's going to be triggered through some kind of synchronization mechanism, in this case just returning to the menu, and you create a fake vtable and get it to run your own stack pivot. Then we get ROP execution under home menu, which is cool. We still don't have code execution in home menu, but that's okay, because we can do a bunch of stuff from ROP.

We can access a new service called ns:s, which is very helpful because it can kill any arbitrary process as well as create new ones. This also gives us access to the SD card, which most applications don't have, and it lets us decrypt and dump any title on the system. So any game, even if it uses the new cryptography that Nintendo introduced, you can dump, because for some reason home menu apparently needs access to that. And then we can also access and overwrite the extra data used by any application, which is great.

So we use this as a base for running homebrew. Our homebrew launcher is essentially just a service that runs in the background under the home menu process.
It's written in ROP, which is kind of disgusting, but it works. The service handles running homebrew, and the process is very simple: you kill the current application, you spawn a new one, and then you take it over using the GPU DMA access. Then we send all these new capabilities that we got, through handles, to the new process, and that gives us some higher-privilege homebrew. It also handles events such as the home button, the power button, all that good stuff.

This is nice because we can run code under any arbitrary application or game. So we can modify these games, we can run ROM hacks. There have been a bunch of translations that can be run through this for games that haven't come out outside of Japan, so that's pretty nice. It's the same principle: you launch the game, you take it over, you patch the code, and then you jump to it, all within the confines of userland.

The other thing is that we can access any game's or application's data, because we can run code under it. That includes save game data for any game, so we can install more convenient secondary entry points which do not rely on the browser, which can be patched at any moment, or on some old game. Some examples include menuhax by yellows8, which exploits faulty theme handling code that was introduced in firmware 9.0. That's really nice, because this way you can run homebrew right as home menu is opened, so right at boot time. Then you've got other games. Of course you've got a Zelda game that's vulnerable; this time it wasn't the horse's name, but it's pretty similar. We've got tons of entry points at this point, we're literally drowning in them.
So this is nice. But we forgot about Nintendo Shell, right? It's a very attractive target for a couple of reasons. For one thing, it has access to the am:u service, which can be used to downgrade any system title. It's not actually designed to downgrade titles. The thing is, it can both install and uninstall titles, so if you uninstall a title and then install an older version of it, you bypass the version check. So you can do that to downgrade any system title and bring back old exploits if that's necessary, assuming you have access to the service. And of course NS is in a region that we can partially modify, so it's an interesting target.

Unfortunately, we can't access its data right now, but maybe we can move it somewhere that we can. The idea is: if you were to kill NS, allocate something in its place and then run NS again, you could move it below the cutoff. Thanks. But unfortunately it's not that simple. That can't work, the reason being that we need NS to be running in order to launch NS again. So that kind of sucks. And we also can't run a second instance of NS at the same time, so we can't do that either.

But the 3DS has an interesting feature, which is called safe mode. Basically it's a second firmware, an old version of the regular one, and it comes with a bunch of copies of system titles, most of them anyway. Each copy gets a different title ID. So the idea is: if it's got a different ID, we might be able to run it at the same time, because PM might fail to notice that. Of course, PM actually does notice that, so we can't run a safe mode version of a title at the same time as the regular version of that title. But for some reason, in the case of NS... well, you might not be able to see this very well, but we've got NS, the regular title, right here, and then we've got safe mode NS right here. And for some reason they created a New 3DS version of
the safe mode version of NS, even though there is no New 3DS version of the original NS. So that creates a separate title ID, which we can run at the same time as regular NS. Then the exploit becomes very simple: you keep NS running, you just allocate enough data so that the new instance will land below the cutoff, and then you run New 3DS safe mode NS. It's within range of the GPU, so you can take it over and have access to everything. So this is nice. It's more of an oversight than a vulnerability or a proper exploit, but whatever.

This gives us access to a bunch of system calls, mostly service-handling system calls. So we can host our own service, which can be useful for other exploits that I won't get into, for impersonating other services to other system modules. And then we've got access to all of these services, which is great. So we can downgrade system titles arbitrarily, and this runs in the background, which is always helpful for homebrew. The only problem is that at this point it's still New 3DS only, because it relies on this New 3DS title, but there are ways around that.

So this was just to show that we can get fairly high levels of privilege while still staying in userland on the ARM11. There are other similar attacks; if you're interested you can look up rohax, which is a similar attack on a system module. So now Derrek is going to talk to you about exploiting the ARM11 kernel.

So, hi everyone. First I will give you a very short inside view of the kernel, and then I will explain how you can exploit the latest version of the ARM11 kernel. This is actually Nintendo's very first gaming console kernel. On all the older consoles there was no kernel; all games were just running on bare metal. There was a kernel for the Wii, a very small microkernel running on the security processor, but that wasn't written by Nintendo. So it's
their very first gaming console kernel. Also, the kernel is made to be thread safe, so it can run on multiple cores at the same time. There are around 130 system calls available, which is quite a lot in my opinion, but usually, when you have gained execution in ARM11 userland, you only have access to around 50 system calls. There's a reason for that, which I'm going to explain in a second.

Internally, the kernel works with C++ objects. Here are some examples of system calls. We have CreateSemaphore, for example, which just creates a semaphore object in the kernel and returns a handle to userland. When you want to do any operation on that semaphore, you have to pass the handle to the kernel, and it will look up the handle in a handle table to find the original C++ object.

Also, there are two different kinds of memory allocators. We have a memory allocator for the main memory, which is FCRAM, and there's also a slab heap, where all the C++ objects are stored. The slab heap is located in AXI WRAM, which is the ARM11 memory where all the kernel code and data lives.

There's also an IPC system. IPC is inter-process communication, and it basically allows you to talk to other processes such as services, for example the GSP service or FS.

So let's look at the security. The kernel is really small.
There are only around 200 kilobytes of code, which is pure ARM code, and only around a thousand functions. So they tried to keep the code size very low, and that makes it harder to find bugs: you don't have very much to choose from when looking for something to exploit. Also, there are no symbols included in the kernel. When you run strings on it, it will just give you some names of C++ objects, but there are no function names or anything like that. As we have seen earlier, it's physically isolated in its own memory, which of course turned out to be a good idea; otherwise it would eventually have been overwritable by the GPU.

All objects are reference counted. That's similar to the C++ shared pointer: every object has a small counter field, and every time the kernel wants to use an object this counter gets increased. When the reference is no longer needed, the counter is decreased, and when the counter reaches zero the object is automatically deleted from the slab heap. So they are basically trying to prevent use-after-frees. Also, I'm not sure if that's a security measure, but there are more than 100 panic calls in the kernel, which is every tenth function on average. And they have the syscall access restriction: as I said, you only have access to around 50 system calls and all the interesting ones are disabled. For example, you can't map executable pages.

On the other hand, there's no ASLR, but at least they try to change the memory mapping every time they do a larger kernel update. Also, there's no stack protection, and userland is always mapped. So once you've got control over the program counter, you can just jump to userland pages that are marked as executable, and you don't have to do ROP in the kernel. It's pretty nice. But they did try to have execution prevention in the kernel.
That is, they are marking the kernel's code pages as executable in the page table. So let's take a look. The parts highlighted in orange are the kernel code sections. Looking at the first highlighted line, it says that virtual address FF00 and so on is mapped to the physical address 0x1FF80000. It is marked as executable, and you only have access to it in kernel mode, of course, and only read access. So this is correct. But when you look at the second line of that page table dump, you will notice that there is another section which covers the entire AXI WRAM, and it's mapped as read-write. So that doesn't really make sense. The protection is basically completely useless, because we have read-write access to the same memory.

To summarize everything: there's actually no exploitation protection. Once we have found an exploitable bug, it's pretty likely that we will gain code execution in kernel mode.

So let's find that bug. I started by looking at the SVC table. This is kind of the interface between kernel land and userland, and it shows all the system calls that are available in the kernel. You have normal system calls for memory management: you can map read-write pages, you can mirror pages and do other memory management stuff. There's also some configuration for threads; you can choose which core should be used for executing a thread and all that stuff. You have a really large range of synchronization objects, like the kernel mutex. And of course you have IPC requests, so you can send messages to services. There's a more advanced section, which is mostly used by services because they have to respond to your IPC requests, and there's also kernel DMA, cache control and some other things. And they have a set of debug system calls. It's just basic debugging: you can set breakpoints, and read and write process memory. But you don't have access to them on retail.
They're not actually used there. The one last section is the privileged section, and here are all the very interesting system calls, which allow you to create processes, map executable memory and all that stuff.

Unfortunately, we can't use the advanced, debug and privileged system calls. That would require exploiting some service, and that's just more work for us. So this leaves us with the normal system calls. IPC sounds really interesting, but unfortunately it's full of panics. Also, there's not much to attack in the synchronization object system calls. So we only have the more interesting system call for local memory management, and in theory there's a lot that you can mess up there, a lot that can possibly go wrong. And we also have unchecked DMA access through the GPU, so maybe we can do something useful with that.

Okay, so let's have a look at the memory allocator. There are two types of memory allocators. The first is the regular one, which is just for mapping normal heap, for malloc in C for example. And you have the linear memory allocator, which is used for things like GPU textures: when memory has to be physically contiguous, you use the linear memory allocator. This is the FCRAM memory layout that we saw earlier. You have these three regions, and every region has its own set of free pages. So how are they keeping track of them?
You have a region descriptor, which tells us the dimensions of the region, that is, where it starts and its size, and you also get a pointer to the first free piece of memory in that region. Each free piece of memory, which we call a memchunk, has a memchunk header right at the beginning. The header tells the kernel how large that memchunk is, and it's also linked into a doubly linked list, so you have next and previous pointers pointing to the next and previous memchunk headers. It looks kind of like this: the red parts are the free memchunks, and the green parts are memory that is already allocated.

Allocation is pretty straightforward; it's not really complicated. The first thing the allocator function does is load the next-free pointer from the region descriptor. For regular memory it just goes through the list, following the pointers, and sums up the chunk sizes until the requested size is reached. For linear memory it would instead look for one suitable memchunk, to make sure that the memory is really contiguous. When it has found enough memory, it sets the next pointer of the very last memchunk to zero. It then updates the list and also the next-free pointer in the region descriptor, and finally it returns a pointer to the first memchunk.

So let's look at this from a security perspective, because there's a problem.
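The regular-memory allocation walk described above can be sketched as a small Python simulation. This is an illustration, not a reconstruction of the kernel: the class and field names are made up, headers are kept in a dictionary keyed by chunk address to stand in for "the header sits at the start of the free chunk itself", and splitting an oversized last chunk is my own assumption, since the talk does not say how leftover pages are handled. Linear allocation is not modelled.

```python
# Sketch of the regular allocator walk over free memchunks.
# Names and the chunk-splitting behaviour are assumptions for illustration.
from dataclasses import dataclass

PAGE = 0x1000


@dataclass
class MemChunkHeader:
    pages: int   # size of this free chunk in pages
    next: int    # address of the next free chunk, 0 = end of list
    prev: int    # address of the previous free chunk, 0 = start of list


class Region:
    def __init__(self, free_chunks):
        # free_chunks: list of (address, pages), sorted by address.
        self.headers = {}   # address -> header stored at that address
        prev = 0
        for addr, pages in free_chunks:
            self.headers[addr] = MemChunkHeader(pages, 0, prev)
            if prev:
                self.headers[prev].next = addr
            prev = addr
        self.first_free = free_chunks[0][0] if free_chunks else 0

    def alloc_regular(self, pages_wanted):
        """Return the address of the first allocated chunk, or 0 on failure."""
        if pages_wanted <= 0:
            raise ValueError("pages_wanted must be positive")
        first = cur = self.first_free
        got, last = 0, 0
        while cur and got < pages_wanted:
            hdr = self.headers[cur]
            need = pages_wanted - got
            if hdr.pages > need:
                # Assumption: split, the remainder stays on the free list.
                rest = cur + need * PAGE
                self.headers[rest] = MemChunkHeader(hdr.pages - need, hdr.next, 0)
                if hdr.next:
                    self.headers[hdr.next].prev = rest
                hdr.pages, hdr.next = need, rest
            got += hdr.pages
            last, cur = cur, hdr.next
        if got < pages_wanted:
            return 0                      # not enough free memory, list untouched
        self.headers[last].next = 0       # terminate the allocated chain
        if cur:
            self.headers[cur].prev = 0
        self.first_free = cur             # update the region descriptor
        return first

    def walk(self, first):
        """Follow next pointers from `first`, returning (address, pages) pairs."""
        out, cur = [], first
        while cur:
            out.append((cur, self.headers[cur].pages))
            cur = self.headers[cur].next
        return out


region = Region([(0x1000, 2), (0x5000, 1), (0x8000, 4)])
first = region.alloc_regular(4)
print(region.walk(first), region.walk(region.first_free))
```

Note that the caller only gets a pointer to the first header and has to follow `next` pointers stored inside the allocated memory itself to find the rest of the allocation.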
The problem is that they basically have kernel structures inside FCRAM, and we have DMA access to it through the GPU. There was an attack by yellows8 called memchunkhax. What he did was basically overwrite memchunk headers using the GPU DMA flaw, and then he gained an arbitrary kernel write when the kernel deallocates memory, because the next and previous pointers have been modified. Unfortunately, this was fixed by Nintendo in system update 9.3, about one year ago. The new kernel now verifies every memchunk header during allocation: its size and also its next and previous pointers. So in theory everything has been fixed, and invalid pointers or invalid sizes will just result in a kernel panic. Yeah, in theory.

So let's look at the ControlMemory system call. We have access to it; it's one of the normal system calls. It does basic stuff: you can map and free read-write pages, but not executable ones, of course. It takes an address and a size as arguments, and also an operation code which tells the kernel what to do, whether to map or free pages or whatever. First it does some basic checks on the address, and eventually it calls a very large function, which I just call kernel ControlMemory. What kernel ControlMemory does is call the allocator function, which returns a memchunk header pointer as we have seen earlier. Then it goes through all of the allocated memchunks, mapping them to user space, and it also updates some block information for the kernel process object.

So there's a problem: there's obviously a race condition. We can overwrite memchunk headers after they have been allocated. We could try using the GPU, but that's actually really slow, because we would have to ask the GSP service to read memory, and we'd have to go through this very large IPC kernel code, and that would probably be too slow. Allocation is really fast, so let's dig a
little bit deeper. I tried to reconstruct the source code in C. So this is the first step: it tries to allocate memory. For this example it will just allocate regular memory. When it has found a memchunk, which means that enough memory is available, it will then execute this really interesting do-while loop. I know it's a lot of code, and I'm not sure whether you can actually read it, so let's go quickly through it. The number of pages is read from the memchunk header, and the address gets converted to a physical address. That physical address gets mapped to userland by a memory-map function. Then it goes to the next memchunk, here, updates the userland virtual address, and clears that memory.

So what's wrong here? The problem is that they are mapping the memchunk into userland, and after it has been mapped they are accessing it again. What they access is the next pointer. So we can just overwrite it: when we have two threads running, we can try to overwrite that pointer from another CPU core. Our goal would be to map kernel pages to user space.

But there are some problems. It requires really, really perfect timing; there's only a very small time frame in which to do the overwrite. Also, we need a memchunk header structure at the address the next pointer points to. To make sure we get perfect timing, I came up with using the kernel address arbiter. It is actually used for thread synchronization, and we don't care about that, but it tries to read from an address and returns an error when that address is not accessible from userland. So we can use that system call to detect when the memory has been mapped to userland, and once it has been mapped, we try to overwrite it. So there is one last problem to solve.
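The ordering bug in that loop can be shown with a deterministic toy simulation in Python. It is a sketch, not the reconstructed kernel code: the function names, the dictionary standing in for memory, the `SLAB_OBJECT` address and the `on_mapped` callback (which plays the role of the second CPU core winning the race) are all hypothetical. The `read_next_first` variant is only there to show why the order of operations matters; it is not a claim about how the kernel was or should be fixed.

```python
# Toy model of the map-then-read race on the memchunk next pointer.
# Addresses and names are made up for illustration.

SLAB_OBJECT = 0xFFF70000  # hypothetical address of an attacker-shaped kernel object


def make_headers():
    """Fresh 'memory': chunk address -> header stored inside that chunk."""
    return {
        0x1000: {"pages": 1, "next": 0x3000},
        0x3000: {"pages": 1, "next": 0},
        # Fake header built out of a kernel object's member variables.
        SLAB_OBJECT: {"pages": 1, "next": 0},
    }


def map_chunks(headers, first, on_mapped, read_next_first=False):
    """Map every chunk in the chain to userland; return the mapped addresses."""
    mapped = []
    cur = first
    while cur:
        hdr = headers[cur]
        nxt = hdr["next"] if read_next_first else None
        mapped.append(cur)        # chunk (and its header) is now user-writable
        on_mapped(cur, hdr)       # another core may run right here
        # Vulnerable order: the header is read again AFTER it was mapped.
        cur = nxt if read_next_first else hdr["next"]
    return mapped


def racer(addr, hdr):
    """Second thread: as soon as the first chunk is mapped, redirect next."""
    if addr == 0x1000:
        hdr["next"] = SLAB_OBJECT


vulnerable = map_chunks(make_headers(), 0x1000, racer)
ordered = map_chunks(make_headers(), 0x1000, racer, read_next_first=True)
print([hex(a) for a in vulnerable], [hex(a) for a in ordered])
```

In the vulnerable order the kernel follows the attacker's pointer and maps the fake chunk, which is why the overwrite has to land in the short window between the mapping and the read.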
We have to inject a memchunk header into kernel memory. I did this by using the slab heap: we can just create some kernel objects and set their member variables to create a fake memchunk header. So this is the slab heap. We've got C++ objects with a vtable pointer and some attributes. The slab heap is basically just a really large area of C++ objects. What I did was change the attributes and use them as a memchunk header, and I redirect the next pointer to that object. It will then map multiple C++ objects to userland, and that's really nice because they have vtable pointers, so we can just overwrite them, and that means that we gain code execution.

So as a summary: we set up some kernel objects and change their attributes, request memory from the kernel, and once it becomes available, we patch the next pointer and overwrite the mapped slab heap pages. Then we call a system call which closes the handle for the kernel objects that we created in step one, so it will eventually call some vtable function, jump to our modified vtable function, and we've got ARM11 level-zero code execution. So now plutoo will tell us what nice things you can do once you've gained ARM11 code execution.

Hey guys. Okay, so the ARM9. The ARM9 is actually also used for executing old DS games. So what they do is, you could say, reuse the ARM9, which is their backwards-compatibility processor, as a security processor when executing 3DS code. Like smea said, it's running a stripped-down version of the ARM11 kernel. It basically only does threading, synchronization, things like that. There's no MMU; there's an MPU with eight regions you can configure. You could do no-execute within those regions, etc., but the granularity is not very nice and they only have eight, so they basically ran out of space. Data and stack are executable as long as you can jump to them, and text is writable. So that's bad.
Basically, whenever you can write to arbitrary memory, you can just overwrite code. These are features you don't want on a security processor.

Okay, so let's go. There have been a lot of exploits over the years; most of them are fixed, and most of them use the normal command interface. In this case, we're taking a different approach. On the 3DS, the memory-mapped I/O is split up into three regions. There's the ARM9-only I/O: it does crypto, it has a DMA engine, things like that. Then there's the shared I/O region, and finally there's the ARM11 I/O region, which contains the GPU and the video decoder. Thanks to derrek and smea, we have full ARM11 control; we execute in kernel mode. So the question is: can we use the shared I/O region somehow to own the ARM9?

It turns out the interface for reading old DS cartridges is actually in the shared I/O region. We're not sure why this is, but they have it there for some reason. It's only the ARM9 which is actually using this region, but the ARM11 still has access to it. When you insert the cartridge, it starts by reading the banner, and it does this by writing this magic value to the control register; basically it just asks for 0x200 bytes. Then there's this loop, and the code is on the right side. You can see it basically waits for some bits to clear or to be set, then reads four bytes, and then waits for another bit. There's no range check on the buffer, but it's always 0x200 bytes, so it should be fine. Well, what if we overwrite the control register from the ARM11, asking for 0x4000 bytes? Boom. Okay, so we have a nice buffer overrun. It's in the BSS segment, but it's still nice. Can we control the data?
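The missing range check can be illustrated with a toy model in Python. Everything here except the 0x200-byte banner size is an assumption made for illustration: the memory layout, the offsets, the `read_card` helper, and the idea of passing the length directly as an argument (the real code polls status bits and the transfer size comes from the control register).

```python
BANNER_SIZE = 0x200      # size the ARM9 code expects, per the talk
BUFFER_OFFSET = 0x100    # hypothetical location of the banner buffer
NEIGHBOUR_OFFSET = BUFFER_OFFSET + BANNER_SIZE  # whatever follows the buffer


def read_card(memory, requested_length, card_data):
    """Copy requested_length bytes from the card, four bytes per iteration.

    Like the loop described in the talk, this never compares the running
    offset against BANNER_SIZE; it simply trusts the requested length.
    """
    for offset in range(0, requested_length, 4):
        start = BUFFER_OFFSET + offset
        memory[start:start + 4] = card_data[offset:offset + 4]


def fresh_memory():
    """Toy address space with a marker placed directly after the buffer."""
    memory = bytearray(0x5000)
    memory[NEIGHBOUR_OFFSET:NEIGHBOUR_OFFSET + 4] = b"SAFE"
    return memory


card = b"A" * 0x4000     # bytes delivered by the cartridge

normal = fresh_memory()
read_card(normal, BANNER_SIZE, card)   # length as requested by the ARM9

overrun = fresh_memory()
read_card(overrun, 0x4000, card)       # length rewritten from the ARM11

if __name__ == "__main__":
    print(bytes(normal[NEIGHBOUR_OFFSET:NEIGHBOUR_OFFSET + 4]))
    print(bytes(overrun[NEIGHBOUR_OFFSET:NEIGHBOUR_OFFSET + 4]))
```

With the expected length the marker after the buffer survives; with the larger length it is replaced by cartridge data, which is why control over the cartridge contents matters next.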
So the data actually comes from the cartridge, and we need to make our own DS cartridge. There's this old device called a PassMe. It's for the original DS: you basically plug an old DS cartridge in, and it modifies the header as it's read. These are available online for five bucks. And then you add an FPGA. So I implemented this and it works, but it's very gimmicky and I don't recommend it. And here's my soldering; it's not very nice. Okay, so this gives us ARM9 code execution, and this works on the latest firmware. But we want something better.

So let's look at the chain of trust. The idea of a chain of trust is, of course, that you verify all the code that is running, basically by verifying everything at load time. The 3DS has the simplest chain of trust you can have: there's the bootrom at the start, then it loads the firmware binary from NAND and jumps to it. On the New 3DS they were a bit clever: they added an extra crypto layer on the ARM9 portion, but it's actually part of the firmware binary. We call this arm9loader. The theory Nintendo had was: let's add another layer of crypto; we change the keys, we introduce new keys, and they can't break it. But they don't have anywhere to put those keys, so they placed them in NAND, encrypted with a per-console key that's based on a hash of the OTP, which is unique for each console. Then OTP access is disabled early in boot, so later on you can't dump the OTP and you can't figure out the keys. So this looks safe in theory.

Here's their implementation. They calculate some hash of the OTP, they read the key sector from NAND, they decrypt the key, and they put it in a key slot, which is basically an isolated memory area. Then they generate a bunch of subkeys, and they verify that the key they loaded from NAND is the correct one.
So even if we were to switch the key, they would detect that and just panic. Then they decrypt the ARM9 binary and jump to the entry point. But they forgot to clear key slot 0x11, so we can just get code execution later on and regenerate all of those keys. So this implementation is useless.

Okay, and they fixed this, because they have more than one key hidden in the NAND, so they took their next key. It's basically the same idea. You calculate the same hash, you read the key sector from NAND, you generate all the previous keys for compatibility, and then you decrypt a new key; we call it key number two. Then you decrypt the ARM9 binary using the second key, you clear the key slot, and you jump to the entry point. But they forgot to verify the second key. This is an epic fail.

Okay, so let's exploit this: arm9loaderhax. We can change the second key, and arm9loader will just decrypt the binary to garbage and jump to it. If you look at the encoding of an ARM branch instruction, the probability is pretty high that just any random data will be a branch instruction; if you try enough keys, it will eventually become a branch instruction to some memory. So if we try a lot of keys, eventually we'll find some garbage that is useful.

This is the NAND, the flash memory, of an unmodified New 3DS. There's a small key section, marked in teal, like blue, and it contains those keys that we're talking about. Then there are two firmware partitions; one is used as a backup in case the other gets corrupted, so it doesn't break the device. So we install our custom key, we install the largest FIRM binary we have in the FIRM0 partition, and we keep the one with the vulnerability in the FIRM1 partition. Then we put our code payload on top of the FIRM0 binary. Then we reboot, and what will happen?
The bootrom is executed. It will load the first firmware partition, which actually has our code at the end, but it doesn't know about it. Then it decrypts it, and you see it looks okay: there's the arm9loader stub in the front, then comes the encrypted binary, and finally there's our payload. But the bootrom checks the hash, right, and it fails. So it thinks the partition got corrupted, and it will load the smaller one on top. You see, we have our payload in memory at boot. Then it decrypts FIRM1, which is smaller, and it still has arm9loader and another encrypted ARM9 binary. Then it jumps to arm9loader because the hash checks out, and arm9loader will decrypt our corrupted key from NAND, decrypt this binary to garbage, and jump to it, and hopefully it jumps to our code. So this gives us ARM9 code execution from cold boot, very early.

It turns out we can actually use this to get some keys that are not available later. They use a certain memory area for seeding the encryption engine to generate keys, and that memory is later cleared, so you can't regenerate the keys. But with this we can actually get those two keys. They're called the firmware 6 save key and the firmware 7 NCCH key. So that's a bonus.

Okay, so we talked a bit about the AES engine. It's used everywhere for the crypto, for everything basically. It supports all the usual block cipher modes. It has two security features.
It has write-only keys, which is really useful: you write a key and then you can never, ever read it back. This means that the keys can be filled in by the bootrom and we can't dump them later, so they can keep the keys secret. Even if we hack the ARM9, even if we get code execution, we'll never get the keys. And then there's the key scrambler. It's an optional thing where the actual key is hidden: it is calculated by a hardware function that we don't know about, so the actual key is never exposed to the CPU. We just feed it two values, two keys, and it generates a new key based on those, and we don't know what that key is. This creates a situation similar to the isolated SPUs on the PS3, where you can ask it to decrypt stuff but you don't get the keys. And we want the keys: we want to decrypt things on our PC, because we're lazy.

So there are two keys; key X and key Y, we call them. They're 128 bits each, and the normal key is derived as a function of those two. That function is unknown; it's implemented in hardware, in silicon. The point is that even if we know X and Y, we can't figure out the normal key, and we can't decrypt things without asking the 3DS first. But we can poke this hardware engine. The first thing you notice when you do this is that if you flip the nth bit of key X and the (n+2)th bit of key Y, you get the same result. In general, you find that the function we're looking for is actually just a function g of one variable, where that variable is X rotated by two (this is a rotation, not a shift) XORed with Y. But we still don't know g, and we want to know g. So let's step back a little bit. The key scrambler is used for Mii QR codes. It's used for everything, right?
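That first observation is easy to reproduce in software. The sketch below assumes the structure just described, normal key = g(rol(X, 2) XOR Y), and uses a truncated SHA-256 as a stand-in for the unknown g. The real g is not a hash; the point is only that the paired bit flip cancels out for any g. Bit n is taken to mean `1 << n`, with the rotation wrapping around at 128 bits, which is an assumption about the bit numbering, and all function names are made up for this example.

```python
import hashlib

MASK = (1 << 128) - 1


def rol128(value, count):
    """Rotate a 128-bit integer left by count bits."""
    count %= 128
    return ((value << count) | (value >> (128 - count))) & MASK


def opaque_g(t):
    """Placeholder for the unknown hardware function g (NOT the real one)."""
    digest = hashlib.sha256(t.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:16], "big")


def scramble(key_x, key_y):
    """Assumed scrambler structure: g applied to rol(X, 2) XOR Y."""
    return opaque_g(rol128(key_x, 2) ^ key_y)


def paired_flip_is_invisible(key_x, key_y, n):
    """Flip bit n of X and bit n+2 of Y, then compare against the original."""
    flipped_x = key_x ^ (1 << n)
    flipped_y = key_y ^ (1 << ((n + 2) % 128))
    return scramble(flipped_x, flipped_y) == scramble(key_x, key_y)


if __name__ == "__main__":
    x = 0x0123456789ABCDEF0123456789ABCDEF
    y = 0xFEDCBA9876543210FEDCBA9876543210
    print(all(paired_flip_is_invisible(x, y, n) for n in range(128)))
```

Because rotating X by two moves bit n to bit n+2, the two flips cancel inside the XOR before g ever sees the value. Seeing this behaviour from the outside is what suggests that the hardware computes a one-variable function of rol(X, 2) XOR Y.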
It's used for a network protocol called UDS, and it's used for Download Play, which is when you download temporary games over Wi-Fi. But the Wii U also supports all of these, and it doesn't have the key scrambler in hardware, so the Wii U must be using normal keys. Okay, so we make a table of the shared keys. Basically these are the three keys that are shared with the Wii U, and this shows where key X and key Y are set on the Wii, sorry, on the 3DS. Two of them have key Y set by firmware. We can't read the keys set by the bootrom, because it's locked away and we don't have it. But can we still figure out g? Let's see.

I'll give a shout-out to shuffle2 and to fail0verflow, who hacked the Wii U; shuffle2 helped us extract the Wii U keys. So thank you. So now we have a key Y and we know the normal key from the Wii U. However, key X is still unknown. And if g of t is bad, then a small change in key Y will only lead to a small change in the normal key. And yeah, it's bad.

Okay, so let's look at the data. When you flip one bit in key Y, we can brute-force all keys similar to the known normal key, that is, within a couple of bit flips, and we find that it always results in a normal key with bits flipped at position 87 or 88, sometimes 89, but never 86. This reminds me of an adder, where you have the carry bit being propagated to upper bits but never to lower ones. So let's guess that this is an adder with a rotation. We guess that g of t is t plus some constant C, which we don't know, rotated to the left by 87. Then we plug it into our original formula. We don't know key X, remember, because it's set by the bootrom and we don't have it. We don't know the constant C, because it's in silicon, in hardware. But if we look at our formula, we consider the inequality where we basically rotate right by 87.
We're basically undoing the outer rotation. Then we plug in our formula, our guess, and we get this. Then we subtract C from both sides and we end up with this. This is basically XORing two different keys with the same X value rotated to the left by two. If you stare at this for a bit, you'll see that if Y0 and Y1, which are two different key Ys, are equal except at one bit position, then the XOR is smallest for the one which shares the same bit value as X at the position where the two Ys differ. It's actually pretty simple, but it sounds difficult. XOR is zero if the inputs are the same and one if they're different; so if they're the same it's zero, and it's smaller. So we look at this bit by bit, we repeat it 128 times, and we recover all 128 bits of key X. And when we have key X, we can calculate the silicon constant C. So the end result is that the key scrambler is figured out, and we also have the secret bootrom key X for a couple of key slots as a bonus.

Okay, and also: I didn't include the constant in the slides, because I want this to be an exercise for the listener. When the New 3DS was released, they rushed it, we think, because they left some interesting commands in the ps:ps service. It included an early version of the NFC crypto used for the amiibo figurines. This first implementation uses a normal key, and the newer one changed it to a key Y. So they accidentally gave us one of these pairs in the firmware images; we don't need to use the Wii U at all. So anyone who can decrypt 3DS firmware binaries can perform this attack to get the constant. So anyone out there: good luck. And now back to smea for the summary.

All right, so I'm just going to conclude really quickly.
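The bit-by-bit recovery of key X described above can be replayed against a software model. This is only a sketch under stated assumptions: the scrambler is taken to be exactly the guessed form, normal key = rol((rol(X, 2) XOR Y) + C, 87) on 128-bit values, the secrets are random stand-ins (not real console values), and the oracle hands back the normal key directly, whereas in the talk each normal key had to be found by brute-forcing near a known one. It also compares modular differences rather than using the inequality from the slides, which is a substitution made here to keep wrap-around simple. All names are hypothetical.

```python
import secrets

MASK = (1 << 128) - 1


def rol128(value, count):
    """Rotate a 128-bit integer left by count bits."""
    count %= 128
    return ((value << count) | (value >> (128 - count))) & MASK


def ror128(value, count):
    """Rotate a 128-bit integer right by count bits."""
    return rol128(value, 128 - (count % 128))


def make_scrambler(key_x, const_c):
    """Model of the guessed scrambler with a fixed (secret) X and C."""
    def scramble(key_y):
        t = rol128(key_x, 2) ^ key_y
        return rol128((t + const_c) & MASK, 87)
    return scramble


def recover(scramble, key_y0):
    """Recover (X, C) up to the top-bit ambiguity explained below."""
    base = ror128(scramble(key_y0), 87)        # equals (rol(X,2) ^ Y0) + C
    x_rot = 0                                  # will hold rol(X, 2)
    for i in range(127):
        y1 = key_y0 ^ (1 << i)                 # Y1 differs from Y0 in bit i only
        diff = (ror128(scramble(y1), 87) - base) & MASK
        y0_bit = (key_y0 >> i) & 1
        if diff == (1 << i):                   # XOR grew: Y0 shared X's bit
            bit = y0_bit
        elif diff == ((1 << 128) - (1 << i)):  # XOR shrank: Y1 shares X's bit
            bit = y0_bit ^ 1
        else:
            raise ValueError("oracle does not match the guessed model")
        x_rot |= bit << i
    # Bit 127 is left as 0: flipping it and adding 2**127 to C gives the
    # same scrambler in this model, so these queries cannot tell them apart.
    const_c = (base - (x_rot ^ key_y0)) & MASK
    return ror128(x_rot, 2), const_c


if __name__ == "__main__":
    oracle = make_scrambler(secrets.randbits(128), secrets.randbits(128))
    found_x, found_c = recover(oracle, secrets.randbits(128))
    clone = make_scrambler(found_x, found_c)
    probe = secrets.randbits(128)
    print(clone(probe) == oracle(probe))
```

The recovered pair reproduces the modelled scrambler for every key Y, even though the top bit of rol(X, 2) and the top bit of C are only determined jointly here. Whether the same ambiguity applies to the real hardware is not something the talk states.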
So, some takeaways from what we talked about today. These are all pretty obvious lessons, but bear with me. Giving any application access to physical memory, through the GPU or whatever, is dangerous. You should always be careful about that: even if you think you've protected stuff, there's probably going to be stuff that you forgot. So either don't do it, or do it right. Another thing is that shared I/O is dangerous: if you don't know who can actually control the I/O, then again, you should be very careful. Also, only checking your data before decryption is dangerous, and both that and not checking the key when you know it could possibly be modified by an attacker are a bad idea. And finally, secrets in hardware are great unless you give them away. So don't do that.

Beyond that, we just wanted to talk about the state of homebrew really quickly. You might recall that during the Wii U talk here two years ago, fail0verflow said they didn't necessarily think there was much of a future for console homebrew. There's definitely an argument for that, with the rise of, well, phones mostly: anyone can make an app or a game for any number of devices and sell it to millions of people. But you know what, we disagree. It's been a year since we started releasing 3DS homebrew. Okay, so this is supposed to be moving, but let's imagine it's moving. There's been a bunch of 3DS homebrew, and it's been awesome. We've been working on this really hard, and a lot of people have been joining us; it's a great community effort. Basically, what I want to say is that we want more developers. So if you would like to join us, there is an SDK; it's not very mature, but it's maturing. And you know what, reverse engineering hardware is fun when we don't have any documentation, and reverse engineering software is fun. We can always use more reverse engineers and just people who want to make cool shit.
So yeah. Oh, right, just one more thing. Lately there has been a wave of patches by Nintendo for known exploits, which has been really annoying: our browserhax, well, yellows8's browserhax, menuhax, stuff like that. So yellows8 has been working pretty hard, and he actually brought back browserhax. It should have been released about 10 minutes ago. Not only that, but we also had ironhax, for a free eShop game, so you could just download it. That was patched. The thing is, there's actually a way to download the old version from the eShop application with some patches, so we're also releasing that right now. Basically, if you can get homebrew running and get onto the eShop with a modified patch, you can get it; that should also be released in about, well, whenever this is done. So get it as soon as possible. This is a free game, and it'll get you homebrew forever, so just do that. Also, yellows8 just released a new version of menuhax, which works on the latest firmware version. This was also patched a couple of weeks or months ago. So this is all out right now. If you have a 3DS, get it; if you have friends who have 3DSes, tell them to get it, because it might not last super long.

We would like to thank yellows8, who unfortunately cannot be here tonight but has been super helpful and has been doing a ton of work on the 3DS. Honestly, a ton of this could not have been done without him. And thanks to everyone on the 3dsdev homebrew channel and everyone who's attending tonight. Thanks for this. If you have any questions, I don't think we have a lot of time, but we'll accommodate. Thanks.

Thank you for your patience. If you've got questions, please come up front to these guys, because we have no more time for structured Q&A. Thank you.