Hello and welcome to Malware Analysis for Hedgehogs. Today's topic is YARA and how we can use it. YARA is a tool, basically a signature scanner. Last week I found lots of ransomware using YARA, and people asked me how I did it, so that's the answer: I used YARA.

What is it actually? Let's take a look at the website. It says YARA is a tool aimed at helping malware researchers to identify and classify malware samples. With YARA you can create descriptions of malware families, or whatever you want to describe, based on textual or binary patterns. Each description, aka rule, consists of a set of strings and a boolean expression which determines its logic. Let's see an example. Here's an example.

Today's topic is a small introduction into using this signature scanner. You can do more than find samples with it. Big websites like VirusTotal and others have a YARA scanner that they run on the files, so that way you can find the samples you are looking for. But you can also use it to read descriptions of malware families. For instance, you might search for YARA rules on the internet and then find things like these repositories. What was his name? I can't pronounce it. Here's a large repository of YARA rules for malware families. You can take a look at one, like Black Worm: if a file has all of these strings, it's considered Black Worm malware. In the same way you could provide information about files you find to other malware researchers. That's a great tool and I really think you should see it.

Let's get started. We already know that, so let's get started with writing rules. This is the simplest YARA rule you can have: a condition that's false, so it will never match. That's easy. Let's just do that. We start with the rule keyword and the name. In this case I would like to detect this ransomware family here. Both files are part of a ransomware I found, the IFN643 ransomware, so we might just name the rule after that one. And then the condition.
That's the minimum you need. I make the condition true. So that's IFN643, and now let's take a look at how it works. This is the tool, and the usage is simple: you just call yara, then you tell it where the signature file is. You can have lots of signatures in that file, not only one rule but several. Then you give it, for instance, the current folder, and you get the matches. Right now it matches all files in the current folder.

Now we might want to add some metadata like the author. That's the author, that's us. And the description: our ransomware YARA rule. That's a bad description of course, but you can add anything you like here, for instance the malware type or whatever you want. You can also add tags. In this case I would add ransomware, because that's the main type of the malware we are matching.

And we might want to check for certain strings in the file instead of just saying it matches everything. So that's the second most important section of a rule: the strings section. Let's say we want to match this IFN643 string. Then I would say... oh no, there's actually a good one here. That's the name of the ransom note. We will just match that one. Ransom note. And it will complain now because of the backslash here. Let's check it. It doesn't complain. I didn't save it, that's why. Now it says it's an unreferenced string, of course. I want the condition to be true when the ransom note string is found, so the ransom note goes into the condition. And now it complains about an illegal escape sequence. That's because the backslash is a special character and you need to escape it first. And then it works. Now it matches both of our IFN ransomware files by just searching for the ransom note.

Now there's obviously a problem with using only that. If you search for malware families you usually run into trouble with anti-malware programs. Anti-malware programs also search for these strings, and for extensions that are typical for the malware like .lucky or .ifn643, so your rule also matches the anti-malware program's file.
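To make the shape of the rule built so far easier to see, here is a minimal Python sketch that renders such a rule as text. The helper names (`yara_quote`, `render_rule`), the rule name, and the ransom note name are all assumptions for illustration; the real note name from the video is not known here. The point is the layout (name, tag, meta, strings, condition) and that a backslash inside a quoted YARA string has to be doubled.

```python
def yara_quote(text: str) -> str:
    """Escape a literal for use inside a double-quoted YARA text string."""
    # Backslashes first, so the backslashes added for quotes are not doubled again.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def render_rule(name, tags, meta, strings, condition):
    """Render one YARA rule as text (hypothetical helper, not part of YARA)."""
    header = f"rule {name} : {' '.join(tags)}" if tags else f"rule {name}"
    lines = [header, "{"]
    if meta:
        lines.append("    meta:")
        lines += [f'        {key} = "{yara_quote(value)}"' for key, value in meta.items()]
    if strings:
        lines.append("    strings:")
        lines += [f'        ${ident} = "{yara_quote(value)}"' for ident, value in strings.items()]
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


rule_text = render_rule(
    name="IFN643_Ransomware",                  # assumed rule name
    tags=["ransomware"],
    meta={"author": "us", "description": "Our ransomware YARA rule"},
    strings={"ransom_note": r"\RANSOM_NOTE.txt"},  # made-up note name with a backslash
    condition="$ransom_note",                  # referencing the string avoids the "unreferenced string" error
)
print(rule_text)
```

Saved to a rule file (say `rules.yar`, a made-up name), the printed text could then be run against the current folder with `yara rules.yar .`.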
So it's a good idea to add some code or something else that's not in the anti-malware program. So what can we add? We could add the extension as well. I think extension. This one. We can say we want these to be in there too. Or part of the ransom note, like "your most critical files have been encrypted" or "send bitcoins" or whatever. You can add all of these strings and then you could say all of them have to match. That's done like this. And that's the case. I could also say please match just one of them; that would be enough. And then we also match our own file, the signature file, of course.

YARA rules are not only for Portable Executable files; they will also match on text files or anything else. As a solution, you could check the type of the file first. Lots of people use the IsPE rule for that. I found it in here somewhere. I'm not sure where. IsPE. Yeah, there it is. Let's just copy it. It's a rule that will match all PE files. It does so by looking for the MZ signature at offset 0: it checks that the 16 bits at 0 have the value 0x5A4D, that's MZ. And then it checks the PE signature itself. It looks up the address of the PE signature, which is always stored at 0x3C, then it takes the value at that location and compares it to PE\0\0. So that's a PE file.

If you do that, you will get an IsPE match for every PE file. That's a bit annoying. You don't want that. So just make this rule private. And you can add it here now: one of them and IsPE. And now you get what you want: only your ransomware match. The IsPE match is not reported, but you can use it in the rule here, so only the PE files are checked.

What else can we do to improve this? There is a debug string. Debug strings start with RSDS in our case, and 20 bytes later the actual debug path starts. That's a good candidate to look for. We might create... These are plain strings, obviously, but you can also use regular expressions.
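The IsPE check described above boils down to two little-endian reads. Here is a minimal Python sketch of the same logic; the function name and the synthetic test bytes are assumptions, and the bounds checks are an addition so the sketch does not read past the end of short files.

```python
import struct


def is_pe(data: bytes) -> bool:
    """Mirror of the YARA condition:
    uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550
    """
    if len(data) < 0x40:
        return False
    # uint16(0): little-endian 16-bit value at offset 0; 0x5A4D is the bytes "MZ".
    (mz,) = struct.unpack_from("<H", data, 0)
    if mz != 0x5A4D:
        return False
    # uint32(0x3C): the offset of the PE signature is stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if pe_offset + 4 > len(data):
        return False
    # uint32(pe_offset): 0x00004550 is the bytes "PE\0\0".
    (signature,) = struct.unpack_from("<I", data, pe_offset)
    return signature == 0x00004550


# Tiny synthetic header for illustration, not a real executable.
fake_pe = bytearray(0x84)
fake_pe[0:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x80)
fake_pe[0x80:0x84] = b"PE\x00\x00"

print(is_pe(bytes(fake_pe)))        # True
print(is_pe(b"rule IsPE { ... }"))  # False: a text file, such as the rule file itself
```

This is why adding the check to the condition stops the rule from matching its own signature file: a text file does not start with MZ.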
I like to use a regular expression for the debug path. Like this: RSDS, then the dot is for any character, and this any character can be repeated in a range of 20 to 300 bytes, which is denoted like this. So now it's a regular expression. And I would say ransomware... IFN ransomware. Yeah, okay. That means somewhere in the debug path there has to be the string IFN ransomware. And I think it was in those... Yeah, yeah. We could make that part, okay.

And now let's check if it matches. Of course it matches; I said one of them. We can add the -s option to see which strings match. And this one is not a match, so we did something wrong here. Oh yeah, of course... I forgot that one. Check again. Yeah, now it works. Every string that matches is printed out, along with the location where it was found in the file, so you can check that it's correct. And it works now.

I don't think it's a good idea to say one of them, though. If you have a cleaning tool that cleans files with the IFN extension or that cleans this ransom note, it might have one of these strings in there too. So we might say you should have two of them. Two of them does not mean exactly two. It means two or more; it's a minimum. So it can also be all three. Yeah, all of them, simply. And that works well. So that's a nice one, I guess, for this specific family.

You could also do some general matches. You could say ransomware generic. Generic means you match a lot of samples with your signature. You could say: I want all files that are PE files and that have ransomware in the debug path. It's just one example, and of course it's a bad example, because you will match all anti-ransomware tools as well, because they have ransomware in the debug path. So really just take this as an example that you can use as a basis to work on everything else. And it matches. I have a folder with some more ransomware samples; let's see if that works.
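Two ideas from this part can be sketched in a few lines of Python: the debug-path regular expression (RSDS, then 20 to 300 arbitrary bytes, then a family-specific fragment) and the fact that "two of them" is a minimum rather than an exact count. The path fragment, the plain strings, and the sample bytes below are made up for illustration; only the RSDS marker and the 20 to 300 range come from the walkthrough.

```python
import re

# Rough Python counterpart of a YARA regex string such as /RSDS.{20,300}IFN643/.
# DOTALL lets "." match newline bytes too, since the bytes after RSDS are arbitrary.
DEBUG_PATH = re.compile(rb"RSDS.{20,300}IFN643", re.DOTALL)

# Hypothetical plain strings standing in for the ransom note name and the extension.
PLAIN_STRINGS = [b"RANSOM_NOTE.txt", b".ifn643"]


def count_matching_strings(data: bytes) -> int:
    """Count how many of the patterns occur at least once in data."""
    hits = sum(1 for needle in PLAIN_STRINGS if needle in data)
    if DEBUG_PATH.search(data):
        hits += 1
    return hits


def at_least(n: int, data: bytes) -> bool:
    """Like 'n of them' in a YARA condition: n or more, not exactly n."""
    return count_matching_strings(data) >= n


# Synthetic sample: RSDS, 20 filler bytes, a made-up debug path, then the extension.
sample = b"RSDS" + bytes(20) + rb"C:\src\IFN643\ransomware.pdb" + b"\x00.ifn643"
# A cleaning tool that only mentions the extension.
cleaner = b"removes files ending in .ifn643"

print(count_matching_strings(sample))   # 2
print(at_least(2, sample))              # True
print(at_least(1, cleaner))             # True: "one of them" would flag the cleaner
print(at_least(2, cleaner))             # False: "two of them" does not
```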
And our ransomware generic also matches, because these ransomware samples also have ransomware in the debug path. There are some keywords that you can use to add some more things. For example, you can say nocase. It means we can now match RSDS in any case. Well, it's actually not so good to have that as nocase. But you can match the ransomware part with nocase. Okay. And then we have another match here in crypto, because it probably has a lowercase ransomware in the debug path. So just some experiments here. Yeah. Last one. We make a last one just for fun. Well, I think you've got the idea of how you use the tool. For everything else, like specific functions and so on, you can check the documentation a little bit. And I might leave some links below the video that also explain some cool features of YARA.

But let's check another sample. That's this one: Beautiful picture JPEG. If you look into it with a hex editor, at first everything looks all right. But if you look further down: "This program..." Yeah. Here's the start of a PE file. That's the MZ magic number, here's the PE\0\0 magic number, and that's the DOS stub message. So we could make a generic signature to detect hidden PE files. Let's say hidden PE in JPEG, something like that. In the strings section we could say we want the DOS stub message. We could include the other one, DOS stub message. Second one, "This program cannot be run in DOS mode".

And we could say we want the magic number of the JPEG. Now, I checked it out. There are several magic numbers, and none of them matches our file. That's a bit weird. See this: our file starts with FF D8 FF E2. None of the entries in this list has that. I didn't check what's wrong. Maybe it's just another version, because that's E0 and that's E1. So whatever happens there. We might just check that it starts with FF D8 FF. If you wanted all of them, you could do that as well. Just to show you how: a way to check for hex patterns in a file is this. You add those hex patterns in curly brackets.
And you use wildcards for any position where there can be anything you like. In this case we might just remove that and check only for the three bytes here. We need a condition. The condition says we want one of the DOS stub messages; it doesn't matter which one. So we say one of, and then the DOS stub strings. And we want the magic at position zero. All right. Let's see if it works. Is it in samples? Yeah. Now it complains that it slows down scanning, but it matches our JPEG file. It's critical because that's a very short pattern. I'm not sure if it always scans for this pattern, although I only check for it at position zero. So I'm not so sure. But we can do that a bit differently, just like it's done here, using the uint values and checking them directly. Then it won't complain. So that's the way out of it. But for scanning only five files, it's okay.

So that's my introduction to YARA; have fun using it. If you are interested, check out the links below the video and the documentation on the main website, because it's good documentation. It's pretty much all you need.

There are a few more keywords that are worth mentioning. One of them is the... Let's check wide. I use the wide keyword very often. If you want to match Unicode strings, you say wide. And if you want to match both Unicode and non-Unicode strings, you say wide ascii. So that's a possibility to include that. And there's another interesting thing, the global keyword. We used private already. Now, global means that all of the rules in this file require this global rule to match. So we say private global, for instance. Now every file that we check has to be a PE file. That means our hidden PE in JPEG does not work anymore. Let's check that. Yeah, that's correct. The IsPE rule is global now, so this signature here can't work anymore. That's it. All right.
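As a rough illustration of the hidden-PE-in-JPEG idea, here is a small Python sketch that compares the first three bytes directly (in the spirit of the uint-style check rather than scanning for a short pattern) and then looks for a DOS stub message anywhere in the file. The second stub message is an assumption about "the other one" mentioned in the walkthrough, and the function names and sample bytes are made up. The `wide` helper shows what a wide string looks like on disk for plain ASCII text.

```python
DOS_STUB_MESSAGES = [
    b"This program cannot be run in DOS mode",
    # Assumed to be "the other one"; some older compilers emit this stub text.
    b"This program must be run under Win32",
]

JPEG_MAGIC = b"\xff\xd8\xff"  # FF D8 FF; the fourth byte (E0, E1, E2, ...) varies


def hidden_pe_in_jpeg(data: bytes) -> bool:
    """Magic at position zero, plus at least one DOS stub message anywhere."""
    starts_like_jpeg = data[:3] == JPEG_MAGIC  # direct header compare, no pattern scan
    has_dos_stub = any(message in data for message in DOS_STUB_MESSAGES)
    return starts_like_jpeg and has_dos_stub


def wide(text: str) -> bytes:
    """Two bytes per character, which is what 'wide' matches for ASCII text."""
    return text.encode("utf-16-le")


# Synthetic samples for illustration only.
jpeg_with_pe = (
    b"\xff\xd8\xff\xe2" + bytes(16) + b"MZ" + bytes(62)
    + b"This program cannot be run in DOS mode"
)
plain_jpeg = b"\xff\xd8\xff\xe0" + b"JFIF" + bytes(32)
bare_pe = b"MZ" + bytes(62) + b"This program cannot be run in DOS mode"

print(hidden_pe_in_jpeg(jpeg_with_pe))  # True
print(hidden_pe_in_jpeg(plain_jpeg))    # False
print(hidden_pe_in_jpeg(bare_pe))       # False: does not start with FF D8 FF
print(wide("MZ"))                       # b'M\x00Z\x00'
```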
Have fun using YARA and see you next time.