Hey, thanks everybody for showing up so early. I will talk about Xtensa-based Qualcomm Wi-Fi chips.

So why even look at Wi-Fi chips? In my opinion these Wi-Fi chips are pretty powerful. Even though they are just intended to do your Wi-Fi stuff, they are still general-purpose CPUs, so in theory you can run any software on them. The problem is the proprietary binaries which come with those chips: they make it difficult to run your own code.

And why would we even want to run our own code? One thing we could do is enable additional functionality; I have a few examples in a few seconds. We could also make security research easier: in case we manage to enable dynamic analysis, security researchers have it easier to identify bugs and report them to the vendors.

Some examples of additional features we might want to enable: we can enable monitor mode on chips where it is not available by default. For example, this is monitor mode on a Broadcom-based chip running on a Nexus 5, and then you can use your standard Wi-Fi tools like airodump and so on. We could also build a complete state machine for Berkeley Packet Filtering inside the firmware, then compile our packet filtering rules in user space, upload them via the kernel to the firmware and run them directly in the firmware. That way we could save some power when we do monitor mode. And if we have access to an even deeper layer of the firmware code, we can also implement stuff which is closer to the physical layer of Wi-Fi. Here is an example of a pilot tone jammer, which only jams the pilot tones of OFDM to be very power efficient.

Now that we have all these great goals in mind, let's take a few steps back and look at the bigger picture. In general there are two kinds of Wi-Fi chips: FullMAC and SoftMAC chips.
We will look at FullMAC chips in this talk. The biggest difference is that FullMAC chips run the MAC layer inside the firmware. This also means that if you want to change something in the MAC layer, you have to change the firmware directly; it is not possible to do these changes in the driver itself.

The examples I have shown in the previous slides of stuff which we might want to enable were based on the Nexmon framework. I developed this together with a colleague of mine back in my time at the university. This framework is used to modify Broadcom chips, and we even managed to get deep modifications working, such as the jammer I showed earlier. You can get it at nexmon.org. Apart from Broadcom, there was also work presented at Black Hat 2020 about Intel-based chips, and there was a talk about Hexagon-based Qualcomm chips at DEF CON 27 and Black Hat 2019. But this talk is the first one to explain Xtensa-based Qualcomm Wi-Fi firmware.

Some background on how Wi-Fi SoCs are structured. Normally you have an application core and, separately, your Wi-Fi core. Optionally, in some cases, you also have a real-time core which handles time-critical stuff like the distributed coordination function.

For the chip which we will look at, there are several drivers available. There is ath10k by Qualcomm directly, with a corresponding firmware. Then there is ath10k-ct by Candela Technologies. This is interesting: Candela Technologies bought the rights from Qualcomm some time ago, and they have their own driver and their own firmware. So they got the source code from Qualcomm and added additional features, and basically you have a completely separate driver and firmware in addition to the original Qualcomm driver and firmware. Lastly, there is also the qcacld driver by Qualcomm.
The qcacld driver is used for factory processes. What you can also do is use the Candela Technologies driver and run the Qualcomm firmware; this is also possible.

This is the chip and the board I have looked at. It is the IPQ4019, sitting on a development board by 8devices. This board is called Habanero, and in the middle of it you can see the chip we are interested in. This is basically a development board for Wi-Fi-enabled home routers.

So what does this chip look like from the inside? As I said, it is used for Wi-Fi-enabled home routers, and one of the big vendors in Germany is also using it: AVM, for the FRITZ!Box. Apart from this, the chip consists, as I explained before, of an application core and multiple Wi-Fi cores. The application core runs OpenWrt in this case, with a pretty old kernel. For Wi-Fi there is one core for 2.4 GHz and one for 5 GHz, and they use PCIe to communicate with each other.

Now to the firmware itself. As I said, it is an Xtensa-based firmware. Xtensa was initially developed by Tensilica, which has since been bought by Cadence. It is a little-endian firmware and it consists of a ROM part and a RAM part. The RAM part is stored in the file system of OpenWrt. It contains multiple segments, more on that later, and it is LZ77 compressed. The ROM can be patched, and there is also a code swap mechanism which allows the Wi-Fi core to evict code from its own memory space to the host memory space. And of course there is no security enabled by default: no secure boot, no stack canaries, no address space layout randomization.

What is also nice is that by default there is debugfs support, which you can compile into your driver. This gives you direct memory access via the mem_value file, so you can directly read and write memory on the chip.
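As a rough illustration of the debugfs memory access just described, here is a minimal Python sketch. It assumes that the mem_value file treats the file offset as the chip address, and the path below is an assumption about where the ath10k debugfs directory usually lives (the phy name differs between systems). The function names are made up for this example.

```python
# Assumed debugfs location; adjust the phy name for the system at hand.
MEM_VALUE = "/sys/kernel/debug/ieee80211/phy0/ath10k/mem_value"


def read_target_memory(address, length, path=MEM_VALUE):
    """Read `length` bytes, using `address` as the offset into the file."""
    with open(path, "rb") as f:
        f.seek(address)
        return f.read(length)


def hexdump(data, base=0):
    """Format bytes as lines of up to 16 hex values, prefixed by the address."""
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        lines.append(f"{base + off:08x}  {chunk.hex(' ')}")
    return "\n".join(lines)
```

The path parameter exists so the helper can be tried against an ordinary file before pointing it at a real device.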
There is also a debug mask, which is very useful. You can set it, for example, during initialization of the kernel module to increase the verbosity of the driver itself.

Now the interfaces which are used between the application core and the Wi-Fi core. There are two of them. BMI, the bootloader messaging interface, is used for boot-up and is implemented in ROM; so basically it is used to start up the chip. After boot-up, WMI is used to communicate with the chip. This is mostly done to send commands like "please start Wi-Fi scanning now", channel configuration, stuff like that.

This is how the loading looks. The driver offers two methods for loading firmware: either via BMI or via a copy engine. Because in our case we have a compressed firmware, we need to use the BMI method. What basically happens is that the firmware from the file system is passed through the driver via BMI to the Wi-Fi core, and it is then decompressed on the Wi-Fi core itself.

This is the structure of the firmware file. Basically there are two big parts, the upper one and the lower one. Each is identified by an IE header. After this comes the segment header, which is identified by these magic values here. It tells you whether the segment is compressed or not, at which address it should be loaded into the chip, and what the decompressed size of the segment is. And as you may have noticed already, in this case the first part and the second part have the same load address.
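The outer IE layer of such a firmware file can be walked with a few lines of Python. This is only a sketch under assumptions: it follows the container layout as I recall it from the upstream ath10k driver (a NUL-terminated magic string padded to four bytes, then elements made of a little-endian 32-bit id, a 32-bit length and a payload padded to four bytes). The element ids used in the usage example are placeholders, and the segment header inside each payload is not modelled.

```python
import struct

MAGIC = b"QCA-ATH10K\x00"  # assumed magic string, NUL-terminated


def align4(n):
    """Round n up to the next multiple of four."""
    return (n + 3) & ~3


def walk_ies(blob):
    """Yield (ie_id, payload) for each information element in the container."""
    if not blob.startswith(MAGIC):
        raise ValueError("unexpected magic")
    pos = align4(len(MAGIC))
    while pos + 8 <= len(blob):
        ie_id, ie_len = struct.unpack_from("<II", blob, pos)
        pos += 8
        if pos + ie_len > len(blob):
            raise ValueError("truncated element")
        yield ie_id, blob[pos:pos + ie_len]
        pos += align4(ie_len)  # skip the payload and its padding


def build_container(elements):
    """Inverse of walk_ies, used here only to create test input."""
    out = bytearray(MAGIC.ljust(align4(len(MAGIC)), b"\x00"))
    for ie_id, payload in elements:
        out += struct.pack("<II", ie_id, len(payload))
        out += payload.ljust(align4(len(payload)), b"\x00")
    return bytes(out)
```

For example, list(walk_ies(build_container([(4, b"boot"), (3, b"main-fw")]))) gives back the two elements in order.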
Since the two parts share a load address, the first one gets overwritten by the second. Apart from the segment header there is also the metadata I just explained, and after this the real data starts, which is compressed. After that, the metadata of the next part starts, followed by its firmware data, which is also compressed.

Why these two separate parts? My assumption is that the first part is just needed for boot-up. You can also see that it is much smaller than the second part. After boot-up we can basically overwrite it with the real firmware code. If you look at the driver logs, you will see this whole process being done twice, and I assume this is done to load up the first Wi-Fi core and then the second Wi-Fi core.

If you look at the memory layout, you can see that it is repeating. If you just do a long enough read of your address space, you will see these patterns. My assumption is that this is used to realize different memory access rights: you can have the same memory at different offsets, and depending on the offset you get different access rights.

Then to the Xtensa architecture. There are two major things I want to point out about this architecture which were new to me and which also came with some problems. The first one is the use of literal pools.
A load instruction here, instead of being PC-relative, is basically independent of the program counter: you just calculate an offset to a fixed literal base to get the immediate value.

In addition to that there are windowed registers. How they work I will explain in a second, but what they are used for is basically to avoid having to save and restore your registers when you call a function. In this example it is using call8. It can be different values here, call4 or call12 or whatever, but my firmware uses call8. If you load a value into a10 and call a function, it will be available in a2 of the callee.

This is how these windowed registers look. Basically you have these overlapping sections in your register file, and you have way more registers than you need for one function. That way the hardware can just shift the window for each function call, if you have nested functions, and basically just rename the registers to make them accessible to the next function.

This is how literal pools work. The memory space is drawn horizontally here. Assume we have a function my_patch which wants to call a function wlan_main. The assembly for this first does a load instruction, which loads an immediate value into a register, and then the call just uses this register to call the function. To get this immediate value, we need to know the literal base; then we can look up the offset for the function we want to call, wlan_main in this case, and we get the actual address of the function. This is done for each load instruction.

What is also the case is that this literal base needs to be set up somewhere. This is usually done at the very beginning of the firmware, and it is a fixed value.
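The register window behaviour described above (a value written to a10 before a call8 shows up as a2 in the callee) can be modelled with a toy Python class. This is a simplified sketch: the size of the physical register file is an assumed value, window overflow and underflow handling are left out, and the window increment is kept in a Python list, whereas real hardware keeps track of it differently.

```python
class WindowedRegisters:
    """Toy model of a rotating register window with 16 visible registers."""

    def __init__(self, physical=64):
        self.phys = [0] * physical   # assumed size of the physical file
        self.base = 0                # physical index of the current a0
        self.saved = []              # window increments of active calls

    def _index(self, n):
        if not 0 <= n < 16:
            raise IndexError("only a0..a15 are visible")
        return (self.base + n) % len(self.phys)

    def read(self, n):
        return self.phys[self._index(n)]

    def write(self, n, value):
        self.phys[self._index(n)] = value

    def call(self, increment):
        """Model call4/call8/call12 by shifting the window."""
        if increment not in (4, 8, 12):
            raise ValueError("increment must be 4, 8 or 12")
        self.saved.append(increment)
        self.base = (self.base + increment) % len(self.phys)

    def ret(self):
        """Shift the window back by the increment of the matching call."""
        self.base = (self.base - self.saved.pop()) % len(self.phys)
```

Writing to a10, doing call(8) and reading a2 returns the same value, because both names refer to the same physical register.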
For me, the litbase value was 0x408001, and this is how the code looks. You basically have the start of the literal pool and its size; you store the result into a2 and write it to the special register litbase. The offsets are actually negative, which is why you add the whole size to the start, and a load instruction later on will have a negative offset.

So the existing firmware does expect that this literal pool is used, and this comes with some problems if you want to patch existing code.

Another problem was that the litbase is not supported by disassemblers. For example, IDA 7.7 added support for Xtensa but ignores the litbase. For Ghidra there is an Xtensa plugin, but it did not support the litbase either. radare2 has support for Xtensa but ignored the litbase. And there is also a Binary Ninja plugin, but it ignored the litbase as well. So what I needed to do was create some patches, for Binary Ninja in this case. In this patch I just ignore the dependency on the program counter and directly use the fixed litbase.
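What a disassembler has to compute for each literal load can be sketched in Python. The formulas below follow my reading of the Xtensa instruction set reference for the L32R instruction and should be treated as an assumption: bit 0 of the litbase register acts as an enable flag, the upper 20 bits give the base, and the 16-bit instruction field always encodes a negative, word-aligned offset. The function name is made up for this example.

```python
def l32r_literal_address(imm16, pc, litbase=0):
    """Address that a literal load with field `imm16` at `pc` reads from."""
    # The field is extended with ones, so the offset is always negative.
    offset = 0xFFFC0000 | ((imm16 & 0xFFFF) << 2)
    if litbase & 1:
        # Enable bit set: the offset is relative to the literal base.
        base = litbase & 0xFFFFF000
    else:
        # Otherwise the offset is relative to the word-aligned PC.
        base = (pc + 3) & 0xFFFFFFFC
    return (base + offset) & 0xFFFFFFFF
```

With the litbase value 0x408001 mentioned in the talk, the last slot before the base resolves to 0x407FFC regardless of where the instruction itself sits. With a litbase of zero, the same field resolves relative to the program counter.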
So this is the patch for Binary Ninja, and this is the patch for Ghidra, which works the same way. I have a link to a Git repository later on where you can find these patches.

Now, if you actually want to change some code in the firmware: I decided to use the Nexmon framework I have shown earlier, and I wanted to modify it to support Qualcomm-based Wi-Fi chips. This is how the framework looks. It is pretty complicated, but we only need a subset of it to make this work. What we could do with the original Nexmon was extract the ROM, the flash patches, the RAM and the ucode for Broadcom-based Wi-Fi chips. We could use C to write our patches, and of course we could call existing firmware functions. In the end, with this whole framework, we were able to create a firmware file which we could then run on Broadcom-based Wi-Fi chips.

So how does the Nexmon framework work on a high level? We have our own patches, which are in patch.c. We also have a wrapper.c.
The wrapper.c provides stubs for the functions which already exist in the firmware. We use GCC and a GCC plugin to compile the files. The plugin also helps to create the nexmon.pre file, which basically contains some metadata that we use later on. For example, we can feed this file to an awk script to create linker scripts, and then use the linker scripts and the .o files to create an ELF patch file. This patch file basically includes everything we need.

After this, we use another Makefile and the ELF file to create basically a blueprint for copying out the relevant sections of this ELF file. We might not need everything, just the patch we want to introduce and some stuff we might want to overwrite in the original firmware. So now we have this ELF file and this Makefile, which is basically the blueprint for objcopy. We copy out the relevant sections and then dd them into the firmware binary. This is how the original Nexmon worked.

I needed to make some changes to get this working for Qualcomm. First of all, there was no support for decompression. The Broadcom-based firmware had no compression; it was basically the whole RAM of the chip, uncompressed. I also needed to add support for multiple binaries: as we have seen earlier, the firmware consists of multiple segments. And we also needed support for the litbase I mentioned earlier.

If we make this happen, what we can do is compile and link our own patches, patch the second segment I showed earlier, compress it back, add some padding bytes and then write the firmware file. So this is the plan. I will not explain the decompression.
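The step of writing compiled patch bytes into a decompressed segment, followed by padding, can be sketched as follows. This is a minimal illustration, not the framework's actual code: the function names and the padding multiple are assumptions, and the recompression step is left out.

```python
def apply_patch(segment, load_address, target_address, code):
    """Return a copy of `segment` with `code` written at `target_address`.

    `load_address` is where the first byte of the segment ends up on the chip.
    """
    offset = target_address - load_address
    if offset < 0 or offset + len(code) > len(segment):
        raise ValueError("patch does not fit into this segment")
    patched = bytearray(segment)
    patched[offset:offset + len(code)] = code
    return bytes(patched)


def pad_to(data, multiple, fill=b"\x00"):
    """Append fill bytes until the length is a multiple of `multiple`."""
    remainder = len(data) % multiple
    if remainder == 0:
        return data
    return data + fill * (multiple - remainder)
```

The bounds check matters once several segments exist, because a patch address only makes sense for the segment whose address range contains it.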
The decompression is pretty straightforward to implement, so I will start with the support for multiple binaries. I needed to extend the GCC plugin which creates this metadata file, the nexmon.pre file. Basically, this GCC plugin looks for this attribute in the source code. The attribute tells it where the code that follows goes in the firmware, for which chip it is used and for which firmware version. I extended this to also include the target firmware file: after compiling, into which segment of the firmware file should this go?

After we compile everything, we have this new nexmon.pre file. It contains the address where our code should go, and it has the type. Here it is a patch; if it is a dummy, it is an existing function in the firmware. The next column is the .o file this is compiled into, then comes the name of the text section in the .elf file, and now we have also added the name of the .bin file into which the compiled code should later be copied.

With this I already wanted to compile some patches. My goal was to use this pretty simple patch.
The patch just hooks into this function, writes 0x1234 into this memory location and then jumps to the original code, which is wlan_main in this case. I could use the ESP32 GCC compiler to compile my stuff and load it into the chip, and I used the debugfs I have shown earlier to check, after the chip was up, whether the memory at that address had changed accordingly. But this did not work.

The reason is the litbase I explained earlier. All the load instructions expect this litbase to be set at the very beginning of the firmware code. This means the chip which runs the existing firmware code expects a load to honor the litbase. So I need some way to tell my code: there is this existing litbase, please use it, and it is already this filled up. Unfortunately there is no such parameter in GCC, so we need to do something else. I have highlighted here all the load instructions which are problematic. As I said, the offsets are not right; they need to use the litbase.

What we could do as an alternative is just avoid load instructions and handcraft our assembly ourselves. So we avoid all the references and have basically no immediate values. This is obviously not something we want to do in the long run.

So I came up with two possible solutions. Either we tell the linker somehow where the existing litbase is and how full it is, or we use our own litbase value. I decided to go with the second option; I felt that this is more flexible in the long run. What this implies is that I set the litbase to zero at the entry of every function.
Setting the litbase at function entry can be done using the GCC plugin, and at the function exit I need to reset it to its original value. I needed to do the reset in the assembler itself, because there is this target-dependent feature called relaxation: if you have a call to a function, the assembler will relax it into a load and a call, and then we again have a call which is problematic.

So how does this look in the assembly? As I said, relaxation is problematic. After relaxation, a call is expanded into two instructions, a load and then the call. The thing is that in between those two we need to reset the litbase to its original value, so that when I call the existing function, the remaining code can run as intended. It took me a while to find out that I cannot manipulate this behavior in GCC. It needs to be patched in the assembler directly, because it is target dependent.

How does this relaxation look in the assembler? Basically you have this sort of lookup table. On the left-hand side it is looking for a pattern like this, here a call8, and if it finds this kind of pattern, it will be relaxed to a load and a callx8. What I needed to do is find the place in the assembler where these build instructions (the right-hand side here is called build instructions) are iterated through, applied and pushed onto a stack. All I did is check: is the current opcode a callx, and was the previous opcode a load instruction? If this is the case, I add an additional instruction which resets the litbase with a wsr instruction.
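The effect of the relaxation step, and of the extra instruction added by the patched assembler, can be modelled on a list of instruction tuples. This is a toy sketch and not the assembler's real data structures: the mnemonics are written the way they commonly appear in Xtensa assembly, and the scratch register a8 and the saved-litbase register a15 are assumptions for illustration.

```python
def relax(instructions, restore_litbase=False, saved_reg="a15"):
    """Expand each ('call8', target) into a literal load plus an indirect call.

    With restore_litbase=True, a write to the litbase register is placed
    between the load and the call, mirroring the patch described in the talk.
    """
    out = []
    for op, arg in instructions:
        if op != "call8":
            out.append((op, arg))
            continue
        out.append(("l32r", f"a8, <literal for {arg}>"))
        if restore_litbase:
            # Put the firmware's original litbase back before leaving the patch.
            out.append(("wsr.litbase", saved_reg))
        out.append(("callx8", "a8"))
    return out
```

Without the flag, a single call8 becomes two instructions; with it, three, and the restore sits exactly between the load, which still uses the patch's own litbase, and the call into existing firmware code.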
Now that the assembler is patched, we can go back to our original patching code and use this GCC plugin and the patched assembler to compile this much-easier-to-look-at code, which is pretty nice. This is how the assembly looks. After we have saved our parameters, we save the current litbase into a15, write 0 into a14 and write a14 into the litbase register. Then we can do all of our load instructions. At the end we copy the value which we stored in a15 back into the litbase register before we do the final call out.

There are still some open problems with this. This implementation is obviously not very good: I take away two registers. It would be better to base this on a stack-based implementation. There is also the support in disassemblers which I mentioned earlier. I have these patches, but still, you cannot use IDA, for example, if you want to. And there is also no support for naked functions in the GCC I am using. So this is a problem.
Sometimes you need some way to override an existing call in the firmware to even get to your patch. Either you are lucky and the call goes through the literal pool; then you can just overwrite the value in the literal pool. But if it is a direct call, you need some way to compile just a call to your own code. You could easily do this with naked functions, but this GCC version, or GCC for Xtensa at least, does not have the option for naked functions. What is also missing right now is a text console, so you cannot just do a printf and dump the output on your host system for some basic debugging. But we could implement this ourselves now.

I have this very creative name for my framework now: it is called QCAmon instead of Nexmon. This is how the folder structure looks. At the very top you have the build tools, then the disassembler patches for Ghidra and Binary Ninja which I mentioned earlier, then a folder for the original firmwares and a folder for the patches we want to apply. The build tools contain the patched assembler and GCC, and also the GCC plugin. The firmware folder contains the original firmware, and it already comes with a structure which allows for different chips and different firmware versions. There is also a Makefile which extracts the relevant parts and decompresses them. And finally we have our patch folder, which we can use to store the source code we want to introduce into the firmware.

Now to a brief demo. This will show the code I had in my slides earlier. We just want to write 0x1234 into this memory address and then jump to the original code. There is one addition to what I have shown earlier.
We also need these two lines to have an entry point into our code. All I am doing here is overwriting an entry in the literal pool so that it points to my code.

Everything is prepared in the repository; you just need to run make. This will compile the C files to .o files, compile the wrapper, create the linker scripts and create an .elf file. It then uses the blueprint Makefile I mentioned earlier to copy out the relevant parts into files. Then we can start reassembling the whole firmware file: we compress it, add some padding bytes and create the firmware file in the way it is expected by the Wi-Fi chip. Here it is called firmware-5.bin.

I will now SSH into the board I have shown earlier. In order to access the debugfs we first need to bring up the wlan0 interface. Now we can go into the debugfs and look at the memory location without any modifications. For this I can just use dd to copy out the relevant section and a hex dump to look at it. Here we see that the first few bytes are not 0x1234.

Now we install our modified firmware. We just copy it over via scp, and we need to remove and re-add the PCIe driver to apply the changes. Now we can SSH back into the router and check the kernel log real quick. Here it is already setting up the regulatory domain, so this looks quite good; it did not crash on us. We bring up wlan0 first, go back into the debugfs, copy out the same memory location as a second ago and have a look at it via the hex dump. And as you can see here, we modified two whole bytes. Thanks.

So with this, a quick summary and future work.
I was able to modify the Nexmon framework to make it work with Qualcomm-based firmwares. The demo patch I have shown is in the repository; you can get it at this location. In there you will also find the patches for Binary Ninja and Ghidra, the GCC plugin, a precompiled version of GCC and the patched binutils file for the assembler. So this shows that it is possible to make modifications to Qualcomm-based firmwares. There are still some improvements I need to do to make modifications even easier.

I also want to look at the production software driver I mentioned at the very beginning, qcacld. This is basically using the QDART software on a PC to communicate with the chip, and you can get it from some sketchy Chinese servers. I want to see if I can enable some of the features which are used in this production software by myself. And I also want to look into the code swap feature a little bit more.

I want to thank Martin Korth, aka problemkaputt. He did some awesome Game Boy Advance reverse engineering and helped me out in some cases. It turns out that he did not only reverse engineer the main processor of the Game Boy Advance but also the Wi-Fi chip used there, which turns out to be an older Atheros-based Wi-Fi chip. Atheros was later bought by Qualcomm, and I think lots of knowledge from back then still went into their products. And I also want to thank Roku aka our cannibal; I used her script to do the unzipping of the firmware file.

That's it from my side. Thanks everybody. If you want to reach out, here are my contact details. And if you have any questions... I also have one more note.
I found a firmware file which contains all debug symbols for this IPQ4019 chip. I have no easy way to distribute this without the original repository being taken down, but if you want to poke at it yourself, reach out to me and I will send you the link.

Questions? No? Okay, then thanks everybody.