Without further ado, here are the speakers on AAPT. Hi. Something went wrong there, right? Okay. I'm Sergio de los Santos, I come from Spain, and I'm head of security and lab at ElevenPaths, which is Telefónica's cybersecurity unit. And I'm Sheila, I work as a security researcher at ElevenPaths as well, I'm 23 years old and I come from Argentina. Thank you. Well, let's start. In the world of threat intelligence, determining the attacker's geographical location is one of the most valuable pieces of data for attribution. However, in some cases, tracking a malware developer can turn into a pretty difficult thing, and we researchers start getting mad and even feel a little bit frustrated. That's why we are always paying attention to new techniques that might help to track malware developers and reach the potential origin of a malware campaign. In this presentation, we'll focus on Android. First, we'll talk about two new techniques that we found to track Android malware developers by getting their time zone. The first one has to do with a bug inside the Android SDK, in the packaging tool. This bug discloses the time zone of the computer where the developers compiled the malware. The second technique is based on comparing the creation times of the APK's certificate and some files inside it. So in our talk, we'll go deep into these two new techniques. And finally, we'll talk about how we can narrow down from the time zone we got to a specific country. So let's start with the Android bug. When we download the Android SDK, it comes with a tool named AAPT. We can find this tool inside the SDK folder, under build-tools and the API version. And if we run AAPT, we can see in the first line that this is the Android Asset Packaging Tool. We can use this program from the command line to add files to an APK, dump its contents, and see its information, among other things related to APK management.
Because an APK follows the PKZIP standard, every file inside it has a last-modified date and time. However, when we used AAPT to add some files to an APK, we noticed something strange in these date and time fields. They were not the real ones. Instead of the right date and time information, we usually saw something like 01-01-80 with 03:00 as the time. That three in the hour field caught our attention, because if we changed the time zone of the computer to, for example, GMT plus eight, that three turned into an eight. And if we changed the time zone to GMT plus four, the hour was four. So what the hell is happening here? Is this a kind of GMT offset? Well, after we noticed that, we started to analyze the source code of AAPT. It is published on Google Git, and inside the AAPT path there are several files that make up the source code of this program. So we focused on the files related to the zipping process. Inside ZipFile.cpp, there is a method named addCommon that is invoked for every file that will be added to an APK. As we can see, this method receives as parameters the file name, the size and some other things related to the file being stored in the APK. Analyzing the code of this method, we see that there is a call to another method named setModWhen, passing the modWhen variable as a parameter. If we look for this variable, we find that it is of type time_t, and it's initialized to zero. It is used inside the setModWhen method, so let's check that code. setModWhen is located in the ZipEntry.cpp file of the AAPT source code. There we have the method and its parameter, when. Remember that when is equal to zero. Inside this method, there is another variable of type time_t, its name is even, and the value of even is the value of when. So even is equal to zero too. But the question here is, is the even variable used for anything at all? Well, immediately after the lines that we saw before, even is used as the parameter for the localtime function.
ptm is a tm structure where localtime saves its result. After that, the result is used to assign the last-modified field for every file that will be added to an APK. So at this point we have identified a problem. In AAPT, the localtime function is receiving as its parameter the even value, which is zero, when the expected argument for localtime is a real timestamp. So let's analyze it at runtime. Here we are attached to AAPT. We have run the debugger with the parameters for adding a file. We put a breakpoint on the subroutine that we were analyzing in the source code. And at this moment, we can observe the even variable with value zero being passed to the localtime function as an argument. The result of this code is the one we were looking for: the time zone disclosure of the computer. But what will happen if we pass the expected argument to localtime, that is, a real timestamp? Well, now we have altered the value of even, putting in a real Unix epoch timestamp and, surprise, the last-modified date and time was the correct one. So let us show you a video demo with all this in action. The computer is in GMT plus three. We have run the debugger with two parameters for adding two files. We put a breakpoint, and we reach the breakpoint for the first file. This time we won't fix anything, we will just leave the even value at zero, as in a normal execution. But for the second file, we will fix this bug. We'll put in a real Unix epoch timestamp. We are changing the value of even from zero to the real timestamp. We continue the process and it finishes. Now we can use AAPT to extract the dates from any APK. So now we can see this APK, with the first file showing the bug and disclosing the time zone, and the second file with the bug fixed, showing the right date and time information. At this point, we are convinced that there is a bug inside AAPT, but why does it end in a disclosure of the computer's time zone?
Well, the localtime function makes a calculation to put the correct hour in the hours field. It takes the Unix epoch coming from the parameter, which is in UTC plus zero, and adds or subtracts the time zone of the computer. For example, if the time zone of the computer is GMT plus three, it computes the Unix epoch plus three hours. And if the time zone is GMT minus three, it does the subtraction, the Unix epoch minus three hours. With that, localtime gets the correct hour on the local computer. But in AAPT, where we found this bug, localtime is adding to or subtracting from zero. So GMT plus three is just zero plus three. That's why we see the number three in the hour field. In the case of GMT minus three it's the subtraction zero minus three, but this subtraction affects the date too. Now we'll see December 31 of '79, and 21 in the hour. That 21 is the result of the subtraction 24 minus three. So the offsets for GMT minus whatever might look a little bit confusing. For that reason, we made an offset table that we'll show you later. And there's a little detail: when we use AAPT, instead of seeing December 31 of '79, we see an 80 in the year field. That is because of a correction in the method that we were analyzing in the source code. There we have an if, which says that if the year is less than 80, the year is set to 80. It's a little detail, but it lets us know that we are analyzing it correctly. So here we have the offset table. The case of GMT plus something is very easy, because it's the same number. For example, GMT plus five will put a five in the hour field. In the case of GMT minus something we have to do the subtraction: 24 minus the offset of the computer. For example, 24 minus three for GMT minus three. That is 21, and 21 will be the value in the hour field. What we have done here is map the GMT offset that we guess to the file date in the APK itself.
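
The arithmetic behind the offset table can be sketched in a few lines of Python. This is only an illustration of the reasoning described above, not the speakers' tool: the function names are made up for the example, only whole-hour offsets are considered, and the clamp mirrors the "year less than 80" check mentioned in the talk.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

def leaked_fields(offset_hours):
    """Hypothetical helper: what localtime(0) would produce on a machine
    set to GMT+offset_hours, with the year clamped to 80 as in the talk."""
    local = EPOCH_UTC.astimezone(timezone(timedelta(hours=offset_hours)))
    year = max(local.year - 1900, 80)  # tm_year counts years since 1900
    return (year, local.month, local.day, local.hour)

def offset_from_fields(month, day, hour):
    """Hypothetical helper: recover the GMT offset from a leaked entry date.
    Jan 1 means GMT+hour; Dec 31 means GMT-(24 - hour)."""
    if (month, day) == (1, 1):
        return hour
    if (month, day) == (12, 31):
        return hour - 24
    return None  # not the pattern produced by localtime(0)

if __name__ == "__main__":
    for off in (3, 8, -3, -7):
        year, month, day, hour = leaked_fields(off)
        print(off, (year, month, day, hour), offset_from_fields(month, day, hour))
```

Running it for GMT plus three and GMT minus three reproduces the 3 and the 21 in the hour field discussed above.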
So you can check the file and then get back to the GMT offset and the local time zone of the one who compiled it. After all this, a good question is: should localtime return this? Well, in several pieces of documentation we can see that localtime should return a null in this case instead of this information disclosure, because zero is not a valid argument. However, we can see the return value of this function in the IDA screenshot, to be sure that it is localtime that is not handling the errors correctly. There in red we can see the GMT offset, three; in this case it was GMT plus three. We have to know that this bug is present on Windows, Linux and OS X. So Android developers using AAPT will be leaking their time zone regardless of the platform they are working on. Okay, now that we know a technique based on a bug in AAPT, let's talk about another technique that has nothing to do with a bug, but with the way attackers or application creators usually work. As we have said, APKs are basically zip files, and every file in a zip file has a date inside, a date and an hour. It is taken from the last-modified field in the user's local file system and it stays permanently inside the zip file. On the other hand, an APK application has to be signed with a certificate. Most of the time the certificates are created with no CA, they're self-signed, and you create them just a few minutes or a few seconds before you compile. In other words, you create disposable certificates for signing the APK you're creating. Certificates are in X.509 format, which means that for the creation time field in the certificate itself, they take the time from the system as well. So if you compile on this date, the certificate will take the date from the system, but it stores it in UTC time, with no time zone at all. So let's think about it. There is a signature file inside the APK, and it is the last file to get into the zip file, the APK.
It's the last one, and it takes its last-modified field from the local file system. And we have this certificate: if we assume that the certificate is created at basically the same moment, a few seconds before the compilation, you get the same time, but in UTC. And the files have the time zone included. So in these samples, you can see that the certificate was created less than a minute before the compilation, because the dates are the same, the hours are the same, and the minutes differ by basically a few seconds. Here you can do the math between both dates and times and you will get a possible GMT offset or local time zone. For example, here it is GMT minus seven. Let's look at another example. Imagine that this certificate was created four seconds before the compilation. The last file in the APK gets the date on the left and the certificate has the date on the right. You have UTC on one side, and on the left side the local time, with the time zone of the person. So that means that this person is maybe in GMT plus one. So in a nutshell, assuming the minutes and seconds are close, that is, that the certificate and the application are created together when you compile, we have enough information to deduce the GMT offset or time zone of the person compiling it, because we can do the math between the UTC time in the certificate and the time of the file that is created last when you compile. For example, here it would be eight hours or whatever. So it works. We have created a little Python tool that, on one hand, checks the certificate creation date in UTC and, on the other hand, checks the signature file date. If we assume they were created at the same moment, the developer's time zone would be UTC plus three, because the signature file was created one second after the certificate. So the result seems quite accurate.
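
The comparison just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the speakers' Python tool: the helper names and the five-minute tolerance are invented for the example, the certificate's creation time is assumed to have been parsed elsewhere into a naive UTC datetime, and the signature file is assumed to be the META-INF entry ending in .RSA, .DSA or .EC.

```python
import zipfile
from datetime import datetime, timedelta

def signature_file_time(apk):
    """Hypothetical helper: last-modified time (local, no zone) of the
    signature block file in META-INF. `apk` is a path or file-like object."""
    with zipfile.ZipFile(apk) as zf:
        for info in zf.infolist():
            name = info.filename.upper()
            if name.startswith("META-INF/") and name.endswith((".RSA", ".DSA", ".EC")):
                return datetime(*info.date_time)
    return None

def estimate_offset(cert_created_utc, sig_file_local, max_gap=timedelta(minutes=5)):
    """Estimate the GMT offset in whole hours, assuming the certificate was
    created within `max_gap` of the build. Returns None if that assumption
    does not hold for this pair of timestamps."""
    delta = sig_file_local - cert_created_utc
    hours = round(delta.total_seconds() / 3600)
    residual = delta - timedelta(hours=hours)
    if not -12 <= hours <= 14 or abs(residual) > max_gap:
        return None
    return hours

if __name__ == "__main__":
    # Illustrative timestamps only, not taken from a real sample.
    cert = datetime(2017, 5, 10, 10, 0, 0)   # certificate creation, UTC
    sig = datetime(2017, 5, 10, 13, 0, 2)    # signature file, local time
    print(estimate_offset(cert, sig))
```

Returning None when the gap is large is a deliberate choice: a certificate reused for months says nothing about the time zone, so only pairs that look like a disposable certificate are scored.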
We thought this was a fun example, because in the email you can see that it is .er, which is Eritrea, and that is UTC plus three, but this is basically a coincidence, and as you can see in the certificate, it seems to come from Russia, which is UTC plus three as well. So now we have these two techniques, the bug in AAPT and this certificate technique, to find out the GMT offset or time zone of the person compiling the applications. Let's do some statistics. We have a 10-million-application database that we checked for both techniques. For the time zone leakage by AAPT we have 2,000 APKs, more or less, and for the time zone leakage by dates and certificates we have almost half a million of them in our database. As you can imagine, many of them share results. For example, with UTC plus seven we have around 3,000 applications that have both problems and leak the same UTC plus seven. So this more or less confirms that these techniques complement each other. Once we had all this information, what we did was take 1,000 samples for each leak, 1,000 of UTC plus zero, 1,000 of UTC plus one, with the AAPT time zone disclosure bug. For some of them we didn't have enough. For example, for UTC minus eleven we only had six of them, because this is American Samoa, in the middle of the Pacific, and we think there are not too many applications created there. Then we checked them against different antiviruses, detected by one, two, or three or more antiviruses, and checked how much malware was there. This is what we got. We got GMT plus four, which is Russia; well, Russia is usually plus three, but a part of Russia is plus four. GMT plus eight, which is China, and GMT minus seven, which is the USA West Coast. GMT plus 11 and GMT minus eight are not good enough, because we didn't have enough samples. We did the same again with the file and certificate date-time disclosure technique.
We took 1,000 samples for every time zone and checked them against different antivirus engines. And this is what we got. We got that GMT plus five, GMT plus eight and GMT minus six were the ones with the most malware in there. Why this little difference from the other technique? Well, we think that this is because of DST, daylight saving time: this technique relies on the local time of the computer, so it may change. But if you think about it, GMT plus eight, which is China and does not use DST changes, remains the same. So we can conclude that we should have done this better. We should have taken into account the period of the year when the DST changes, okay? But basically what we can conclude is that Russia, that is GMT plus three, plus four and plus five, GMT plus eight, which is China, and the West or Middle USA, or the West Coast, are the ones creating the most malware. And with this technique, they're creating disposable certificates as well. You have to take this into account. They're creating disposable certificates. So this makes sense, because maybe, and it's just a theory, there are many cloud computers in the USA, on the West Coast, and they create disposable certificates there, and that's why we have so much malware in there. We did it the other way around too. We took all the malware we had with these leakages and checked its UTC or GMT time zone. This is what we got. Again, with one technique, the file and certificate date-time technique, we got that UTC minus six and UTC plus eight were the ones with the most malware. And with the other technique, the AAPT bug time zone disclosure, we have basically the same: UTC plus eight and UTC plus four. And what is it useful for? What we did as well was check our database. We have 10 million applications and we took several sets of a thousand applications. And we have a rate of 6% of malware in there.
And we took a set of a thousand APK samples with these leakages or disclosures and compared them with each other. And we concluded, comparing with our standard rate, a thousand applications with 6% of malware, that a random set of applications with UTC plus eight is six times more likely to be malware than the standard rate in our database. Let's see some examples with real-life malware. For example, DeathRing, which came preloaded on some phones, had this file and certificate date problem that was leaking the time zone. And it was Korea, UTC plus nine. And so did the Judy malware, which had this AAPT time zone disclosure bug, and it was Korea as well. This is a malware we found a few years ago. It was a very interesting malware: once the mobile was infected, it took a user and password from the attacker's database, brought the user and password to the infected phone, registered them with this phone and email and everything in Google Play, and sent the token back to the attacker. And with this token, associated with the phone, he was able to give five stars and download fake applications. Fake users registered to real phones, voting for and downloading fake applications just to get them up high in the Google Play Store. This is called Shuabang. Well, we found it by focusing on this bug, GMT plus eight, which is China, and some other things like connecting to a PHP command and control, having the GET_ACCOUNTS permission and hiding behind wallpaper applications. Focusing on that, we were able to find and define this malware. We alerted Google Play and they removed it, and it was quite a nice piece of research. That's malware that comes from China. Now we have, for example, Hidalp; we took a few samples of it. We checked, and this malware leaked through both techniques. You could check with both techniques that it came from GMT plus three, which is Russia again.
And as an aside, here you can guess in a way that the certificates are always created about two or three minutes before the compilation, which leads us to think that they were automated. Disposable certificates created in an automated way and coming from Russia. Okay, with the techniques that we were analyzing, we will probably get the time zone of the Android malware developer. Now, let's see quickly how we can narrow down from the time zone we got to a specific country. Inside an APK, there are some pretty common files. One kind is RTF documents. Usually they are used for agreements inside Android applications. Regarding this kind of file, we had an inspiration some months ago when WannaCry occurred, because this ransomware shows its message in multiple languages using several RTF documents. This is a funny thing, a trick, with Microsoft Word and RTF files. When you create an RTF file with Word, you get something inside the RTF metadata. It's called \deflang. This is the default language in your Word or text editor. Every one of us has a default language in our Word, so it leaks through RTF files. And it's quite possible that the default language in your Word is your native language. So it maybe leaks your native language into the RTF files that you create with Word. Yeah, we did our research to get information about WannaCry and, among other things, we checked those RTF documents for metadata. As a result, we found that Korean is the default language configured in the text editor of WannaCry's developer. Well, here we have an example of an Android malware which has an RTF document among its files. We have to know that neither Android Studio nor other IDEs remove the metadata from the media files added by the developer to an APK. So we can check this kind of media file to get some interesting information. In this case, we found that Arabic is the default language in the author's text editor.
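
As a rough illustration of the RTF check, the \deflang control word can be pulled out of a document with a regular expression. This is a sketch, not the speakers' tool: the function name is invented for the example, and the language table is a deliberately tiny subset of the Windows language identifiers.

```python
import re

# Small illustrative subset of Windows language identifiers (LCIDs).
LCID_NAMES = {
    1025: "Arabic",
    1033: "English (United States)",
    1042: "Korean",
    1049: "Russian",
    2052: "Chinese (Simplified)",
    3082: "Spanish",
}

# Matches \deflangNNNN but not \deflangfeNNNN, since digits must follow directly.
DEFLANG_RE = re.compile(rb"\\deflang(\d+)")

def rtf_default_language(data):
    """Hypothetical helper: return (lcid, name) for the first \\deflang
    control word in raw RTF bytes, or None if there is none."""
    match = DEFLANG_RE.search(data)
    if match is None:
        return None
    lcid = int(match.group(1))
    return lcid, LCID_NAMES.get(lcid, "unknown")

if __name__ == "__main__":
    sample = rb"{\rtf1\ansi\deff0\deflang1042\deflangfe1042{\fonttbl}}"
    print(rtf_default_language(sample))
```

The bytes of each .rtf entry in an APK can be obtained with the standard zipfile module and passed to this function; an "unknown" result still gives a numeric identifier that can be looked up by hand.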
There is another trick to pin down the country: we can get the strings typed manually by the developer. It may be helpful for finding out their native language. We can use AAPT to extract the strings of an APK, but there's a problem, because even if the APK is extremely simple, there will be lots of strings added automatically by the IDE just for translation purposes. In the screenshot, we can see thousands of strings. All of them were added automatically by the IDE in a very, very simple APK. So we can do a little bit of magic, using AAPT to extract all the resources and filtering by strings, using grep together with a regex. A pretty nice command. And in the output of this command, we found a way to differentiate the strings added automatically by the IDE from those written by the developer. Basically, we check the origin of the strings, removing those that come from the resources where the translation strings are. After that, the only thing we have to do is check what the native language of the developer could be, just by looking at the strings typed manually by them. And we created a tool as well, which is not online yet. It's not the best-looking in the world, sorry, but you can drag and drop an application and it will try to deduce the possible GMT offset and check other techniques to get the country and the languages. The country, basically, sometimes comes from the techniques we have explained, or maybe from the certificate itself, or maybe from analyzing the strings.xml that we have shown, or sometimes even from the TLDs, the dot-whatever domains that the application has inside. So you can deduce with these different techniques which is the GMT time zone or the native language of the person. This tool takes all these techniques together, and once you drag and drop whatever APK, it will check the different techniques to get that information. Even the RTF one that we have just talked about. So I hope this tool will be online soon.
The other one is already online. This one, for example, says that it comes from Russia and has some domains with TLDs, and the certificate is the standard one for debugging, so it's not useful. So, what are the conclusions here? We have presented different ways of not just leaking the time zone, but also of possibly detecting automated malware creation, because of these disposable certificates that are created a few seconds or minutes before the compilation is done. Possibly better machine learning features for detecting APK malware. Remember the statistics we showed; once you create a machine learning algorithm, it is useful to have good features to get a better understanding. So I think this is a pretty good one for machine learning and detecting malware: the ones that are created with these disposable certificates, and whether it comes from one time zone or another. And a tool for a quick view of all this information around an APK's metadata. As future work, as we said, we should take DST into account, so these techniques are more accurate, and maybe have more samples, more than a thousand samples for each of these disclosure techniques. And this is pretty much all. Thank you. Do you have any questions? Not too difficult, please. Not with WordPad and that kind of thing, it doesn't work. It just works with Word; we checked, because Word has the default language defined there. With WordPad, for example, if you create an RTF file, you don't have a language. Take into account that this language is the one you have defined in your text editor for spell checking. So I don't know about any other office package, but with Word, with Office, it happens. Another question? Thank you, hope to see you at some other event.