Hello everyone. Good morning, good afternoon, good evening, good night, from the different parts of the world. Welcome to my talk, "YARA Rules to Rule Them All". Thank you so much, Blue Team Village and DEF CON, for giving me the opportunity to present. So today we're going to be talking about writing YARA rules which can rule them all.

My name is Saurabh Chaudhary. I'm currently doing my master's in cybersecurity. I'm a cyber threat researcher with around five years of experience in the banking and financial domain. I have multiple published research papers in IEEE and Scopus. I've been a speaker and trainer at multiple conferences like BSides San Antonio, Texas, BSides Budapest, EpsiC Indonesia, Texas Cybercon, etc. I have a background in red teaming, malware analysis and threat intelligence, and I'm trying to pivot my career more into threat intelligence. I love motorcycles and all types of adventure sports, which give me an adrenaline rush. That's my Twitter handle: 4w4r44.

So let's dive into the talk. Today we're going to be talking about: What are YARA rules? How can we write them? What are string-based and code-based YARA rules, and why are code-based rules better? And what kind of rule can we write which will really rule them all? In case you don't know about YARA, don't worry about it. I'll try to take it from scratch.

So malware comes in different forms. It can come as a macro embedded in your Word document, or as a VBScript, or as a PDF file, and the list goes on and on. Malware is deceptive, and you never know what's coming: which file is clean and which is malware, you don't know. So is there a one-stop-shop tool to deal with them all? Yes, there is. YARA is your answer.

So what is YARA? It's a tool to identify and classify malware samples, or really anything. It's a Swiss Army knife for the malware analyst and the threat researcher. It is free and open source. It was created by Victor M.
Alvarez and is maintained by VirusTotal. Modern-day IPS, IDS, intrusion prevention systems, firewalls and SIEMs all support ingestion of YARA rules to find unusual activity in networks, and you can do almost anything with YARA rules. Each rule consists of a set of strings and a boolean expression which determines its logic.

So what can you do with YARA? There are a lot of things. You can identify and classify files and malware. You can find new malware samples. You can scan live data streams and live network streams. You can help speed up the incident response process. You can track future malware as well, based on the code reuse pattern, which we're going to be talking about later in the talk. You can use YARA rules to track APT groups. You can build your own anti-malware product with the help of YARA rules. So it's a one-stop shop if you are into malware analysis or threat hunting.

So how is a YARA rule run, in case you have never seen one? This is how it's run: the rule file comes first, then the directory against which it needs to be searched, and there are a lot of different optional arguments which you can use, like searching recursively, multiple threads, etc.

And in case you have never seen a YARA rule, this is what it looks like: the name of the rule; the metadata, which is important so that you can keep track of what's what; the strings (there are different kinds of strings which we can use, and we'll talk about that later); and the conditions. So what's happening here in this rule? It is trying to find the word "Defcon" in any case, jumbled case, uppercase or lowercase, and if any file has this as ASCII, it will flag it.

So a YARA rule consists of three things, as I told you: metadata to keep things on track, strings, and conditions. For the strings, you can use different types here.
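The rule described above can be sketched roughly as follows. This is a minimal sketch, not the rule from the slide: the rule name and the metadata values are hypothetical placeholders.

```yara
rule Find_Defcon_Sketch
{
    meta:
        // Metadata is free-form; it does not affect matching and only
        // helps you keep track of what each rule is for.
        author      = "example"
        description = "Flags any file containing the word Defcon in any case"

    strings:
        // ascii:  match the string as single-byte characters
        // nocase: ignore letter case (Defcon, DEFCON, dEfCoN, ...)
        $word = "Defcon" ascii nocase

    condition:
        // The rule fires if the string is found anywhere in the file.
        $word
}
```

Saved to a file, it is run as described above: the rule file first, then the directory to scan.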
You can use text strings to identify things. You can use hexadecimal strings; hex, well, that is the fun part. And you can use regex, which is not really recommended. And at the end of the rule there are the conditions, which you define for matching based on the strings you have written.

Writing a YARA rule based on a unique string identifier doesn't take more than 10 minutes. A good YARA rule will consist of bytes and not just strings, and you will find those juicy bytes through critical thinking about the functions, the code reuse and the program flow. If you want your YARA rules to last for generations, write rules based on code reuse. A rule which only matches a single malware file is no better than a hash. For example, GandCrab has five different versions, and with one rule based on function reuse you can find all those different versions of GandCrab. So when a malware mutates, or the threat actor writes a new version of the malware, they reuse code and functions, and that is what we're going to leverage to write YARA rules. Strings can change, but reused code is more likely to hit.

Okay, you cannot use these keywords as identifiers, because they are reserved. Most of them are reserved for providing the boolean logical expressions and modifiers, like ascii, int32, int16, wide, xor, etc.

Okay, how do you use comments? Commenting in YARA rules is just like commenting in C: both single-line and multi-line comments are supported. Basically, it's C-style comments that are supported. You can see this one here, a single-line comment, and here a multi-line comment.

YARA is case sensitive, so you have to take care of the case. Say you are providing a word in a rule and you want both cases to be supported.
For that, you have to write nocase.

This is how you add text strings. They can contain the following subset of the escape sequences available in the C language: double quotes, backslash, horizontal tab, new line, and \xdd for adding a byte in hexadecimal notation.

As we said, there is the nocase modifier. For example, here you can see the rule trying to identify "foobar". Adding nocase here will identify "foobar", "FOOBAR" in uppercase and "foobar" in jumbled case.

Coming to wide-character strings. A wide character is a computer character data type that generally has a greater size than the traditional 8-bit character. The increased data type size allows for the use of larger coded character sets. So using wide will solve your problem here.

Coming to the XOR strings. You can see these are two rules which are basically the same. If you write xor beside the string which you have written over here, it will be equivalent to listing all of these XORed strings.

So remember, for an efficient rule, write small and write logically, for better and long-term detection. If with your rule you can only detect one single file, then it's no better than using a hash. Write it in such a way that it detects logically.

Next, we have base64 strings.
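Before moving on, the comment styles and text-string modifiers covered so far can be put together in one rule. This is a minimal sketch with placeholder strings and a hypothetical rule name, not a rule taken from the slides.

```yara
/*
    Multi-line comment, C style.
    All names and strings below are placeholders.
*/
rule Text_String_Modifiers_Sketch
{
    strings:
        // nocase: matches foobar, FOOBAR, fOoBaR, ...
        $a = "foobar" nocase

        // wide: matches the string encoded with two bytes per character.
        // "ascii wide" together matches both the one-byte and two-byte forms.
        $b = "foobar" ascii wide

        // xor: matches the string XORed with any single-byte key.
        $c = "This program cannot" xor

        // Escape sequences: \" \\ \t \n and \xdd (a byte in hex notation).
        $d = "key\tvalue\x2e"

    condition:
        // Fire if at least one of the strings above is found.
        any of them
}
```

Note that without nocase, `$b`, `$c` and `$d` only match the exact letter case written in the rule.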
The base64 modifier is used in a rule just like xor: it will detect the encoded form of the string at the byte level as well. So let's say there is a C&C address that is base64 encoded; you can use an expression like this, and the rule will detect all permutations of the base64-encoded string.

Hexadecimal strings. This is the fun part. Hexadecimal, as you know, is 0 to 9 and A to F. Hexadecimal strings allow three special constructions that make them more flexible and let us accommodate more and more logic in the rule: wildcards, jumps and alternatives. So we will try to focus our rule writing more on hexadecimal strings than on normal ASCII strings.

For writing an efficient YARA rule, regex is not recommended, because it comes with a lot of false positives. So we'll try to avoid regex and write rules on strings or bytes. I mean, write rules, but do not include much regex, because it will give you a lot of false positives. An efficient rule doesn't use regex.

Coming to the conditions. Most malware targets Windows systems. These are PE files, and if you dissect them you'll find they have headers and sections. So you can give conditions to your YARA rules based on the executable entry point, the string offset or the virtual offset, the file size, and a lot of other things. For example, if there is a PE file which is less than XYZ KB, you can say: this is the entry point, so match the file which has an entry point like this and a file size less than XYZ KB. Or you can match based on the magic number. There are a lot of conditions which you can implement.

So why code-based rules?
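Before answering that, here is a rough sketch of the base64 modifier, the three hex-string constructions and a simple PE condition described above. Everything specific in it is an assumption made for illustration: the rule names, the placeholder C&C address, the byte values and the 200KB size limit.

```yara
rule Base64_C2_Sketch
{
    strings:
        // Placeholder address. The base64 modifier makes YARA search for
        // the base64-encoded forms of this string, not the plain text.
        $c2 = "http://c2.example.invalid/gate" base64

    condition:
        $c2
}

rule Hex_And_Conditions_Sketch
{
    strings:
        // ??        wildcard: any single byte
        // [2-4]     jump: between 2 and 4 arbitrary bytes
        // ( A | B ) alternatives: either byte sequence
        // The byte values themselves are arbitrary.
        $hex = { F4 23 ?? 62 [2-4] ( B4 | 56 ) 45 }

    condition:
        uint16(0) == 0x5A4D and   // "MZ" magic number at offset 0
        filesize < 200KB and      // size limit chosen arbitrarily
        $hex
}
```

Checking the magic number and the file size first keeps the rule from firing on files that could not be the kind of PE sample you are looking for.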
As you can see, malware mutates, and with every mutation its hash changes, and so do its strings. These malicious threat actors work just like software development companies: they reuse code, functions, logic and program flows. So writing one efficient code-based rule will also detect future malware.

You can see these two programs here. Both are printing "hello". In the first picture, the "hello" is written directly in the program. In the second, the "hello" is broken down into different strings, which are then added together and printed. So if I write a rule based on strings, it will only detect the first program, not the second. That is why we need to focus more on writing YARA rules based on code reuse.

For example, GandCrab has five versions. If you had made a YARA rule when GandCrab version one came out, based on the code reuse and the function reuse of GandCrab, you could detect GandCrab one, two, three, four and five with one single rule. So you can find future malware as well with your rules, if you write them efficiently. You can write one rule which will rule them all on the basis of the code reuse pattern.

Code-based rules require a little bit of understanding: you'll need some familiarity with a disassembler and a debugger to write a code-based rule. But code-based rules last for generations, so that's what is recommended. Rules that match only specific samples are not much better than a hash value. The goal is making efficient YARA rules which can last, so that you can track future malware from the same malware creators: writing one rule which will rule them all.

So, things to take care of while creating a code-based rule. Compilation flags: different compilers work differently, so use wildcards wherever you can. And one thing to remember is that the same instruction, such as XOR on EAX, can produce different opcodes.
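A minimal sketch of what a code-based string can look like, under stated assumptions: the bytes are a generic x86 sequence chosen for illustration, not code from GandCrab or any real sample, and the rule name is hypothetical. It shows the two points above: wildcards over bytes that change between builds, and alternatives for an instruction that has more than one encoding (`xor eax, eax` can be assembled as `31 C0` or `33 C0`).

```yara
rule Hypothetical_Code_Reuse_Sketch
{
    strings:
        // 55                 push ebp
        // 8B EC              mov  ebp, esp
        // ( 31 | 33 ) C0     xor  eax, eax   (two valid encodings)
        // E8 ?? ?? ?? ??     call <rel32>    (target differs per build)
        // 85 C0              test eax, eax
        $func = { 55 8B EC ( 31 | 33 ) C0 E8 ?? ?? ?? ?? 85 C0 }

    condition:
        // Only consider files that start with the "MZ" magic number.
        uint16(0) == 0x5A4D and $func
}
```

A sequence this short and this common would match plenty of clean software. In practice the bytes would come from a function that a disassembler shows to be distinctive and shared across versions of the same family.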
So write it accordingly.

Coming to testing YARA rules. You should definitely do the following checks to reduce false positives: scan the malware samples, and then scan a big goodware sample set. The rule should match the malicious samples and should not generate a match on the goodware archive. If it does match the goodware, your rule is not good enough to go into production, because it has false positives. So how do you test this? Kaspersky's GReAT team has a project named KLara. This is a very nice project if you want to test your rules before going to production.

This is the life cycle of a YARA rule: analyze, identify, write rules, test rules, deploy your YARA rules and enjoy.

So where can you hunt with YARA? Hybrid Analysis, Malpedia and VirusTotal. VirusTotal will need a premium license for that, but on Hybrid Analysis you can do it for free. And there's a new one which came into the picture recently: YARAify, from the abuse.ch project. This is really good, and it is where you can learn.

So what can you do with YARA, again? Find next-generation malware, hunt for APTs and zero-days, monitor APT groups, make your own AV, or combine it with Zeek to make an intrusion detection system. There are numerous cases in the past where people have identified zero-days with the help of YARA rules.

So, in conclusion: writing code-based rules needs an understanding of a debugger, but it's the efficient way. Strings are not the best thing to build a YARA rule on, because often enough, after some time, the rule will become obsolete. So always look for code reuse, and build your YARA rules on that.

Thank you so much, guys. I'll take questions and feedback. Thank you so much, Blue Team Village, for this opportunity. Please let me know if you have any questions or feedback. Thank you so much.