In Part I of the series, Building Blocks of Success with YARA, we introduced YARA and some of its capabilities. That introduction ended with a very short discussion on the distribution of logic across multiple rules and the use of constraints to ensure accuracy.
Let’s expand on that.
While in many cases it’s appropriate to keep all the logic you need in one rule, there are many edge cases where doing so costs you accuracy, efficiency, matching, or all three. Let’s consider one of those cases. Say you have written a bundle of YARA rules to match on a binary: (1) in its wild state, (2) compiled with a keylogger, (3) compiled with embedded malicious resources. Viewed standalone, each rule contains logic to detect the core of the binary as well as the characteristics unique to its own case, such as the keylogger or the malicious resources. A small amount of comparative analysis (say, using a program like Groom-Porter) will quickly show that you are repeating the same logic in each rule. This type of situation is a great candidate for optimization by distributing the logic across more rules. Since the situation deals only with executable files, extracting that logic and building a global rule that matches only Windows binaries makes sense.
[minti_table style=”1″]
Table 1: Global PE rule |
---|
global rule isPE { |
condition: |
uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550 |
} |
[/minti_table]
That lets us write the PE check once and have it apply to every rule in the file. It becomes the first constraint we use to improve accuracy: any candidate file that fails it is not of interest to us for this detection. A little comparative analysis also shows heavy duplication among the rules, because the core of the program is the same in each one. Since we are only interested in that family of binaries, the same thought process applies. Table 2 shows what that rule looks like once the core logic has been separated from the version-specific logic.
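As a rough sketch of how a global rule gates the other rules in the same file, consider the fragment below. The rule name `example_marker` and its string are hypothetical and exist only for illustration; the `isPE` rule is the one from Table 1, repeated here with comments.

```yara
// Sketch only: example_marker and its string are hypothetical.
global rule isPE
{
    condition:
        // "MZ" at offset 0, and "PE\0\0" at the offset stored at 0x3C
        uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550
}

rule example_marker
{
    strings:
        $s = "example marker string"
    condition:
        // No explicit isPE check here: the global rule must be
        // satisfied before any other rule in the file can match.
        $s
}
```

With this layout, a non-PE file containing the marker string should not produce a match, because the global rule fails first.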
[minti_table style=”1″]
Table 2: Core Detection rule |
---|
rule isCore { |
strings: |
$a = "oprat=2&uid=%I64u&uinfo=%s&win=%d.%d&vers=%s" nocase |
$b = "sve%d324ia%d" nocase |
$c = {25 27 44 24 2D 31 2E } |
$d = {21 31 41 24 2D 2E 2E 44 } |
condition: |
all of them |
} |
[/minti_table]
You’ll note that this rule is not a global rule. The core logic is consistent across all of the versions, but in another rule we are going to trap for a binary that deviates from this standard, so we’ll make do by referencing isCore in the condition line of each rule that needs it.
Now that we have isolated detection to executables and to the core of the program, let’s work through the use cases. The binaries discovered in the wild carried a PDB path string that was consistent in all but the keylogger-compiled versions, so let us isolate that into a separate rule. These versions were also consistent in file size, so we’ll use that as a constraint. Table 3 shows that rule.
[minti_table style=”1″]
Table 3: ITW PDB matches |
---|
rule ITW_pdb { |
strings: |
$a = "\\Work\\Release\\Finder\\Palopalin.pdb" nocase |
$b = "\\Findfith\\Release\\shenc.pdb" nocase |
$c = "\\Current\\Release\\eigersen.pdb" nocase |
condition: |
filesize < 1MB and isCore and ($a or $b or $c) |
} |
[/minti_table]
You might have noted that we used a constraint to apply this rule only to files smaller than 1MB. PDB matches varied in size but were consistently under 1MB, so the 1MB constraint keeps us focused on only what needs to match. With that defined, all that remains is the versions covering the remaining cases. Keylogger versions matched the core rule, but consistently lacked the PDB strings and ranged between 1MB and 2MB in size.
[minti_table style=”1″]
Table 4: Keylogger matches |
---|
rule a_keylogger { |
strings: |
$a = "__fa%d_%d_%dsaclo%d" |
condition: |
(filesize >= 1MB and filesize < 2MB) and (isCore and $a) and not ITW_pdb |
} |
[/minti_table]
To simplify detection of the malicious-resource versions, we are going to use the PE module to walk the resources and match the malicious one, then check for the PDB strings by referencing the ITW_pdb rule in the condition line. That means we’ll need import "pe" at the top of our YARA rule file, so keep that in mind.
[minti_table style=”1″]
Table 5: Malicious Resource |
---|
rule mal_resource { |
condition: |
for any i in (0..pe.number_of_resources - 1): (pe.resources[i].name_string == "22dde9d6155d1bcdffb6f96e1427ffcb") and ITW_pdb |
} |
[/minti_table]
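To make the import requirement concrete, here is a minimal sketch of the top of a rule file that uses the PE module. The rule name `has_resources` is hypothetical and only shows the module in use; it is not one of the detection rules from this post.

```yara
// The import goes at the top of the file, before any rule
// that references pe.* fields.
import "pe"

// Hypothetical rule, for illustration only: true for any PE
// file that has at least one entry in its resource table.
rule has_resources
{
    condition:
        pe.number_of_resources > 0
}
```

One thing worth checking against the PE module documentation before relying on a resource-name comparison: resource names inside PE files are stored as UTF-16, so the literal on the right-hand side of a `name_string` comparison may need to be written in that encoding to match.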
Lastly, let’s cover some use cases where the file doesn’t match on isCore, but does contain either the keylogger string or the PDB strings.
[minti_table style=”1″]
Table 6: Negative Matches |
---|
rule negPE_combo { |
condition: |
(ITW_pdb or a_keylogger) and not isCore |
} |
[/minti_table]
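One caveat: ITW_pdb and a_keylogger both include isCore in their own conditions, so a condition that requires one of them and also requires `not isCore` cannot be satisfied as written. A possible way around this, sketched below, is to move the string checks into helper rules that do not depend on isCore. The names `pdb_strings`, `keylogger_string`, and `negPE_combo_sketch` are hypothetical, and the sketch assumes isCore is defined earlier in the same file.

```yara
// Sketch: string-only helper rules that do not reference isCore.
// "private" keeps the helpers out of the reported matches.
private rule pdb_strings
{
    strings:
        $a = "\\Work\\Release\\Finder\\Palopalin.pdb" nocase
        $b = "\\Findfith\\Release\\shenc.pdb" nocase
        $c = "\\Current\\Release\\eigersen.pdb" nocase
    condition:
        any of them
}

private rule keylogger_string
{
    strings:
        $a = "__fa%d_%d_%dsaclo%d"
    condition:
        $a
}

// Matches files that carry either set of strings
// but do not match the core rule.
rule negPE_combo_sketch
{
    condition:
        (pdb_strings or keylogger_string) and not isCore
}
```

ITW_pdb and a_keylogger could then reference these helpers instead of declaring the strings themselves, which would keep each string defined in only one place.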
We could match plenty more combinations, but this covers the major cases.