Good morning, and thank you for attending a Sunday morning talk. Today we're going to talk about Android packers. How do they work? We're going to analyze a few of them, and I'm going to show a method we developed: a tool that handles most of the packers on the market in a generic way.

So who are we? I'm Avi. I'm currently a founder at MyDRO, but I was formerly a mobile R&D team leader at Check Point for a few years, and this is where I did this research. Before that, I was a security researcher at Lacoon Mobile Security. Unfortunately, Slava, my co-presenter, could not attend today. Besides being a very talented researcher and a good friend, he also did a lot of the heavy lifting on this research. We miss you, Slava.

So let's get down to business: boxing apps. Malware authors use various boxing, or packing, techniques to prevent static code analysis and reverse engineering. A malware author invests a lot of time in developing this cool malware, and doesn't want the security guys to understand what's happening or to detect it, whether with their automatic tools or with manual reverse engineering. This is the same as in the PC world, where most malware today is packed.

So how can one protect their code? They can use a proprietary technique or third-party software that protects the app's code. What should this software do? It includes code protection, anti-debugging, anti-tampering, anti-dumping: all of the methods that prevent the security guys, the static code analysis engines, and the reverse engineers from understanding what's happening inside the app, the malware.

And what was our motive for this research? In the PC world, malware has been packed by different packers for years now, and we see this trend rising in the Android world as well. This is a snapshot of an analysis done in the Check Point systems from May.
And we saw there that almost 25% of the packed apps were detected by us as malware. We asked ourselves: can we improve the detection? Is this really the share of malicious apps among all of the packed apps? Or are we missing something because we're not doing full static code analysis for these apps, since they are packed? So this was our motive: to understand how packing works and to find a generic way to unpack these apps.

So what techniques exist to protect an app's code? Let's talk about the main three: obfuscators, packers, and protectors.

What is obfuscation? Obfuscation is adding redundant code to the app's main code that doesn't affect the functionality of the app, and changing the function names and the variable names. This is done to prevent a reverse engineer from understanding what's happening inside the app. Today in the Android world there's a default obfuscation tool called ProGuard, which comes with Android Studio and is used by most Android developers. But I have to say it's not the best obfuscator on the market, and even a not very experienced reverse engineer shouldn't have any problem understanding what's happening inside the app.

Another method, which we are going to concentrate on in our talk today, is packers. What does packing do? Let's say I have an APK, an Android app, and it contains a DEX file. What's a DEX file? It's the smali bytecode that the code becomes after it's compiled, and this is what Android executes. So I have the original code. What the packing process does is take this original code and encrypt it, pack it in some manner. And for opening it, it adds a packer loader, which will now be the entry point for the execution of the app.
And once the app is executed, the packer loader will take the bundled, encrypted original DEX file, unpack it, and load it into memory so it can be used by the app.

There are also protectors, which work in a slightly different way. They take the original DEX file, but they don't only encrypt it; they also modify it. While doing that, they want to add another layer of protection. For example, one of the protectors that we saw in the wild adds encryption at the class level, meaning that a class is decrypted only when it is initialized. What's surprising here is that we didn't see a lot of use of protectors in the wild. Our guess is that because a protector might affect the logic of the malware, malware authors don't use protectors as much as packers. So we decided to concentrate our research on packers.

In order to understand more about how packers work, we need to go back to basics a bit and understand some things about Android. So let's talk a bit about ART, the Android Runtime VM. This is a schematic of how things look in Android 6, which is good enough for us to understand this world.

So how does the Android Runtime VM work? It can work in two modes. One is interpreting the smali bytecode. The second is working with compiled code, ahead-of-time compilation, which was introduced in Android 4.4. What happens is that when you install an app, it goes through a process of compilation, and then the VM works on compiled ELF code. This brings a lot of improvements in RAM usage, battery performance, and startup time of the application. But it's important to remember that the VM can work both ways: interpreting smali code or running compiled native code.

So what happens when you load a DEX file, when you start an app?
You trigger the zygote process, which is an empty process that contains preloaded classes. It's there in order to shorten the startup time of the app. The zygote process forks itself into an empty app process and loads the app code, the OAT file. But what happens if the OAT file is missing? Then it will trigger dex2oat, the process that compiles the DEX file into an OAT file, and the OAT file will be used to execute the app.

I've talked a bit about OAT files, so let's explain what they are. An OAT file is basically an ELF file with some added sections. One of them is oatdata, which contains the original DEX file. Another is oatexec, which contains the compiled version of the DEX file. Both of these sections are used by Android, by the ART VM, when executing an app: one for the creation of different headers and the other for interpreting the app. But again, it's important to note that you don't have to have the native code, all of the oatexec, in order to execute an app. You can fall back to the smali version and interpret the code.

Now that we understand a bit more about Android, let's think about ways to unpack an app. The first could be finding the algorithm. We can analyze the different packers, work out how they pack an app and which algorithm they use, and reverse the steps in order to decrypt the app. The problem is that this doesn't scale. You need to understand how each packer works, and even if a packer makes only a minor modification to the packing algorithm, your script will break and you need to start your research again. So this is not the way we wanted to go.

Another method could be extracting the DEX file from the compiled OAT. As I said, we have the DEX file inside the OAT. But what we saw in different packers is that you don't have to have the whole DEX file inside the OAT; you need only part of it.
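As a rough illustration of the extraction idea just described, here is a minimal Python sketch that carves an embedded DEX image out of a larger blob, such as a compiled OAT file. It relies only on the published DEX header layout (8-byte magic, `file_size` at offset 0x20, `header_size` at 0x24, `endian_tag` at 0x28). The function names `carve_dex` and `build_minimal_dex` are hypothetical, and this is not the tool from the talk.

```python
import struct
from typing import List

DEX_MAGIC_PREFIX = b"dex\n"      # magic is "dex\n" + 3 version digits + NUL
HEADER_SIZE = 0x70               # fixed size of the DEX header_item
ENDIAN_CONSTANT = 0x12345678     # value of endian_tag in a little-endian DEX


def carve_dex(blob: bytes) -> List[bytes]:
    """Return every plausible embedded DEX image found in blob."""
    found = []
    pos = blob.find(DEX_MAGIC_PREFIX)
    while pos != -1:
        header = blob[pos:pos + HEADER_SIZE]
        # Require a full header with a well-formed magic ("dex\nNNN\0").
        if (len(header) == HEADER_SIZE
                and header[4:7].isdigit()
                and header[7:8] == b"\x00"):
            (file_size,) = struct.unpack_from("<I", header, 0x20)
            (header_size,) = struct.unpack_from("<I", header, 0x24)
            (endian_tag,) = struct.unpack_from("<I", header, 0x28)
            # Sanity-check the header fields before trusting file_size.
            if (header_size == HEADER_SIZE
                    and endian_tag == ENDIAN_CONSTANT
                    and HEADER_SIZE <= file_size <= len(blob) - pos):
                found.append(blob[pos:pos + file_size])
        pos = blob.find(DEX_MAGIC_PREFIX, pos + 1)
    return found


def build_minimal_dex(payload: bytes = b"") -> bytes:
    """Build a header-only stand-in for a DEX (checksum left unset), for demos."""
    header = bytearray(HEADER_SIZE)
    header[0:8] = b"dex\n035\x00"
    struct.pack_into("<I", header, 0x20, HEADER_SIZE + len(payload))
    struct.pack_into("<I", header, 0x24, HEADER_SIZE)
    struct.pack_into("<I", header, 0x28, ENDIAN_CONSTANT)
    return bytes(header) + payload


if __name__ == "__main__":
    fake_oat = b"\x7fELF" + b"\x00" * 100 + build_minimal_dex(b"A" * 16)
    print([len(d) for d in carve_dex(fake_oat)])
```

A carver like this only helps when the DEX is still embedded in the file it is given.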
So some of the packers delete the DEX file from the OAT, so you can't just take the DEX file and use it. Another method might be dumping the DEX from memory. But again, this does not always work, because the DEX file might be missing and the packer will use the OAT file.

So we wound up thinking about using a custom Android ROM, which is something we already do at Check Point: we have a dynamic analysis engine. Maybe we could introduce a few modifications to the custom Android ROM, placing a few hooks in interesting places. This would allow us to dump the DEX file and pass it to our static code analysis engine.

Before continuing, I want to talk about a few notable works that were done in this area. One of them is Android Hacker Protection Level 0. It was presented here at DEF CON a few years ago, and it's a very good talk about the different packers and protectors in the wild. They also released a set of scripts that work on some of the packers and dump the DEX file. Another very interesting talk is from the guys who released the DexHunter tool, which is a modified version of the Android Dalvik or ART VM, and it reconstructs a new DEX file from memory. While this is a very interesting project, it was not what we aimed for. We wanted to get the original DEX file as it was before the packing process began, in order to have the same hash as the original file. So we wanted to take a different path.

So what was our approach? We wanted to find a solution that would require minimal changes to the Android source code, so it would be portable, and that would work on most of the packers. How did we address the problem? We took the most popular packers that we witnessed in our systems and reversed them. We additionally analyzed the way Android loads a DEX file in order to fully understand how it works.
And the result was a patch of a few lines to the Android runtime that allows us to dump the files and analyze them in our static code analysis.

So which packers did we analyze? The most popular packers we encountered were Baidu, Bangcle, Tencent, Ali, and 360 Jiagu. Baidu is the same huge Chinese company that you know; they also have a packing service. It's a web service: you send an APK and you get the packed version of it. It was very surprising to see that they offer this kind of service. In this talk I'm going to cover Baidu and Bangcle. What's interesting about them is that they work in slightly different ways, but covering both of them allowed us to find a solution that works on almost all of the packers we encountered.

So let's think about the abstract way in which a packer should work. As I said, you have the packer loader. It loads the bundled packed DEX. This triggers libart, the ART VM, which opens the DEX file and maps it into memory, and then you have the DEX in memory. But something is missing here. Where does the unpacking take place? What we thought is that most of the unpacking logic would be inside a native library file. Why? Because reversing a native file is harder for reverse engineers, so it's a good place to put your unpacking logic in an obfuscated and protected manner. For that, you don't do it in the Java bytecode of the packer loader, but in a separate native file. And what does this native file do? It needs to inject itself somewhere so it can decrypt and unpack the packed file. It does this with hooking. It might hook libart or libc, so that when libart opens the file, the unpacking takes place, and in the end libart gets the original DEX file that it can execute.
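The abstract packer flow described above can be modeled in a few lines of Python. This is only a toy sketch under loud assumptions: `ToyRuntime` stands in for libart, a replaceable `open_file` attribute stands in for a hooked libc or libart function, and single-byte XOR stands in for whatever cipher a real packer uses. None of these names come from the talk or from Android.

```python
from typing import Callable, Dict

XOR_KEY = 0x5A  # hypothetical stand-in for a real packer's cipher


def xor_bytes(data: bytes, key: int = XOR_KEY) -> bytes:
    """Toy 'encryption': XOR every byte with a fixed key (its own inverse)."""
    return bytes(b ^ key for b in data)


class ToyRuntime:
    """Stands in for libart: it reads files through a replaceable open_file."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = dict(files)
        self.open_file: Callable[[str], bytes] = self._raw_open

    def _raw_open(self, name: str) -> bytes:
        return self.files[name]

    def load_dex(self, name: str) -> bytes:
        data = self.open_file(name)
        # The runtime only accepts something that looks like a DEX file.
        if not data.startswith(b"dex\n"):
            raise ValueError(f"{name} is not a DEX file")
        return data


def install_packer_hook(runtime: ToyRuntime, packed_name: str) -> None:
    """Model the native library: wrap open_file so the packed file is
    decrypted on the fly, and every other file passes through untouched."""
    original_open = runtime.open_file

    def hooked_open(name: str) -> bytes:
        data = original_open(name)
        return xor_bytes(data) if name == packed_name else data

    runtime.open_file = hooked_open


if __name__ == "__main__":
    original_dex = b"dex\n035\x00" + b"...app code..."
    runtime = ToyRuntime({"classes.packed": xor_bytes(original_dex)})
    install_packer_hook(runtime, "classes.packed")
    # With the hook in place the runtime receives the original bytes.
    print(runtime.load_dex("classes.packed") == original_dex)
```

In this model the original bytes exist only on the far side of the hooked open call, which is the point where the runtime receives them.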
So cool. Now that we have a hypothesis of how a packer should work, we want to verify that it's really what we see in the field. Let's look at the first packer, Bangcle. Identifying Bangcle is very easy. It has various classes that show up in every app packed by Bangcle, and different files. One of them is the native packer library, which is used to unpack the app, and another is the packed version of the DEX file, the Bangcle classes file.

This is a snippet of the Java loader implementation. We can see that Bangcle loads a native library file and calls functions from this native file through JNI, which is the bridge between Java and native functions. Then it loads the DEX file, which triggers libart.so. We wanted to understand what's happening inside the native file, so we tried to open it with IDA, and it crashed. And this is one step afterwards, after we fixed it; I'll explain in a second how we did that. What we noticed here is that we didn't see any mapping between the native functions and the function names in the Java interface, meaning something is missing. Where does this mapping happen? That's what we needed to understand.

I'll take a step back. I said that IDA crashed. Why did IDA crash? We know that this file is in use, so it should work. But when we dumped the file headers, we saw that some of the segments were missing. We didn't see the text segment. What we noticed is that it is defined in the dynamic section, so we had to manually reconstruct the different sections of the file from the info we got from the dynamic section. Then we could analyze it in IDA. But that wasn't enough, because even when we opened the file in IDA, the entry point was not valid. It didn't point to anything interesting. So something else is happening here: there's a call to an init function from the dynamic section. And what does this init function do?
The native file contains a compressed section of the code, and the init function decompresses this code and overwrites the text section. Now the entry point, the original entry point of the ELF, is valid. And what we saw is that one of the functions inside the native file is JNI_OnLoad, and JNI_OnLoad provides the mapping between the functions in the native file and the JNI, the Java side. So now we could understand what the functions do.

Okay, so let's see how Bangcle works. The first function extracts a file from the assets. The second one, which is the interesting one, forks three different processes. The first is just the app's process. The second is an anti-debugging process, which does different tricks to prevent us from understanding what's happening. The third executes only when the OAT file does not exist, and as we know, this is mostly the first time the dynamic DEX is loaded. But it doesn't execute dex2oat in the regular way; it uses LD_PRELOAD to hook some of the functions in dex2oat and create a special version of the OAT file. This OAT file will later be used by the ART VM when executing the app.

So what does the hooking in Bangcle do? It hooks eight different functions, and we have here an example of one of the function hooks and the way it is done. On the left we can see an open function without any hooking. On the right we can see the hooked version: the first bytes were overwritten, and what that does is change the PC register in order to redirect the flow of execution to the unpacking process.

So let's recap how Bangcle works. It creates a packer loader as a Java activity to load the native library. The native library is protected with different anti-research techniques that we had to bypass, and it hooks libc for the unpacking process.
When libart encounters the OAT file, the native library unpacks it and provides an unpacked version to the ART VM.

So we understand how Bangcle works. Let's look at Baidu. Again, classification is pretty straightforward. We can see that we have StubApplication and StubProvider, which are the classes used by Baidu, and again a native library and the packed original DEX. And again, the same thing: we couldn't see the mapping between the native functions used in the loader and the functions in the native library. You can see that, again, Baidu uses the init section to decompress some of the code, because we couldn't see it in IDA, and only after the decompression could we understand what's happening inside the file. And again, it uses the JNI_OnLoad function to provide the mapping and do some other interesting stuff.

These are the things it does. It has anti-debugging techniques, which I will elaborate on. It registers the native methods, meaning the mapping between the JNI and the native functions. It extracts the packed DEX from the assets and creates an empty DEX file, not an OAT file but a DEX file. And it sets up the hooking.

So what are the anti-debugging techniques used in Baidu? We have obfuscation and log disabling, it checks that GDB isn't running and that JDWP isn't in use, and there are a few more anti-debugging techniques. And we can see that the hooking in libart by Baidu is a bit different. It hooks the __android_log_print function in order to prevent any logging, so if you try to debug it, it will be harder for you to understand what's happening. It also hooks the execv function, so that when dex2oat is executed by Android, it prevents the compilation of the DEX file, meaning the oatexec section won't be empty, but Android won't use it to execute the logic of the app. It will fall back to the smali code.
And it hooks the open function, meaning that when Android tries to open the one.jar file, the hook instead decrypts the packed DEX file and supplies it to the ART VM.

So again, let's see what Baidu does. It creates a stub in the Java activity. The native library is protected with different anti-research techniques, and it hooks libart to handle the opening of the DEX file. Well, this looks a bit familiar, but this is a different packer, or almost a different packer. So what can we understand from this? That most of the unpacking process might be generic, with a few minor changes. We can see that the trigger for decryption in the different packers is when libc opens the file. In Bangcle, it's when the classes OAT file is opened, and in Baidu, it's when the DEX file is opened. So if we place hooks at the earliest places in the libart process where it tries to open an OAT file and a DEX file, and dump the files there, we should have the decrypted, unpacked version of the DEX file.

That's exactly what we did. We worked out the way an app is loaded by the ART VM and where the earliest places are in the loading flow of the VM code that we can place a hook, so we can dump the files: one function in the OAT loading process and one in the DEX loading process. And as you can see, it's only a few lines of code. One is three lines and the other is a few more. This allows us to dump the decrypted version of packed DEX files.

So let's see a demo. Okay, cool. This is a demo of a tool we created that generically unpacks most of the packers. What you see now in the background: we open an app which is packed, as you can see, by Bangcle, and you can see the packed DEX file, which you can't really make sense of because it's packed. Now we will execute our tool, which is a forked version of the AOSP ROM of Android.
I'll try to fast forward this, and unfortunately I can't, so we'll have to wait. What's happening here is that the Android emulator is loading. Once it's loaded, we will load the app, and our two different hooks will dump two versions of the DEX file. One should be valid and one should not; it depends on how the packer works. Some of the packers hook both places, and some of them hook only one of the loading flows. But this enables us to unpack the apps. Well, this will take a few more seconds, sorry. Okay, so how are you guys today?

What I can mention is that the hooks used by the packers are not persistent. They place the hooks during the loading process and then remove them. So we had to really understand when the hooks are placed so that we could dump the DEX file at the right time, because trying to connect later with GDB and dump the memory, or just dumping it after the app is already running, won't always work. So it was crucial for us to understand the DEX loading process. And, well, Android is so slow. Oh man.

Okay, you'll have to believe me that it works. But you don't need to believe me: you can download the tool for yourself from our repository. It's not a compiled version of Android, but a patch that you can apply and a script that wraps the unpacking process. You can go over them, execute them, see that it works, and enjoy it.

So, sorry about that. We understood how the packing process works in different packers, and we introduced only a few minor changes to the ART VM. This enabled us to work on something like 90% of the packers we encounter in our systems. And what was very interesting is that this change allowed us to send an unpacked version of the DEX files to our static code analysis systems, and we got a 50% increase in detection of maliciousness of packed apps from this feature, which was very good for us. Thank you, thank you, you're far too kind.
And that's it. If you have any questions, feel free to ask.