RTMA Part 10 - YARA and Snort

Damn it

Dec 17, 2022

Hi all. I have put this off for long enough. I will be completely honest. I HATE IDS/IPS rule writing. It is just plain awful. There are too many syntactic finaglings to deal with, and the structure changes depending on the type of rule you have to write. It’s awful. BUT, it’s necessary to learn if you want to be a malware or SOC analyst. Here goes.

What

To first understand YARA and Snort, you need to understand the premise of Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). This should be pretty straightforward. The names are a bit of a giveaway.

IDSs are systems that inspect traffic headed into a network. They protect systems on a network by alerting network administrators, in real time, if malicious activity is detected. There are 2 types: network-based (NIDS), which sit at specific, strategic points within a network, and host-based (HIDS), which sit on individual devices or “hosts”.

IPSs operate exactly the same as IDSs. However, instead of only alerting network admins, IPSs actively stop the threats by blocking the traffic or quarantining infected devices. They even have the same types, NIPS and HIPS.

Combine them together, now you have an IDPS.

Now that you know what an IDPS is, you can understand where YARA and Snort come in.

YARA is a tool for writing “rules” that identify and classify malicious software. These rules describe patterns in the code (strings, byte sequences) or other heuristics that can be used to identify specific files or samples.

Snort, on the other hand, is an actual network intrusion detection and prevention system that uses the aforementioned rules in its protection system. Snort protection can lie within the network, host, and even at the application level.

BowTiedCyber has a few posts on Suricata, which is an IDPS like Snort. I think the rule syntax is largely the same across both, and Suricata is more user-friendly. However, I am more familiar with Snort, sadly.

Here is his post:

Zero to Hoodie Substack

Introduction to Suricata

Hello frens! I’m unbelievably excited to be FINALLY bringing you this topic. Suricata is THE SINGLE GREATEST open source software for cybersecurity and I’ll argue that to the grave. This will be the first of many posts explaining how to use Suricata to make a simple NDR. The truth is that to create the multimillion dollar NDR that I made, it would be more harm than good for those learning to try and get a job, so we’ll hit all the major parts - enough to be efficient with your time studying but enough knowledge to really knock it out of the park in an interview…

Why

YARA is an excellent tool that can classify malware (see MITRE ATT&CK). With an extensive rule framework, any malicious software that is caught by YARA can be classified depending on the rule(s) that caught the program.

Snort is a powerful tool that detects and prevents attacks and other threats. Snort can identify attacks and prevent them from occurring if the system is properly set up to do so.

Know this: An IDPS is only as powerful as its ruleset.

Okay, I made that up on the spot. But it’s probably true. A plain, fresh install of Snort with its community rules will be able to catch the majority of poorly written, recycled programs. But, a consistently updated one based on the latest rulesets and datasets will be able to handle the “wild” malware that could target your solutions.

How

YARA

First, we shall start with YARA.

YARA rules start with the keyword rule followed by an identifier. Within the context of the rule lies a condition section. This condition dictates how the rule behaves. Let’s look at a rule right quick.

rule example_rule {
    strings:
        $string1 = "42.22.69.22"
        $string2 = "42.22.69.23"
    condition:
        any of them
}

As you can see, there is also a strings section. The strings section is where you define the strings to look for within a program. Should the program contain the ASCII string “42.22.69.22” and/or “42.22.69.23”, the respective string variables will be set to true.

Then there is the condition section. It states “any of them” which means that if a program has any of the listed strings (any are true), the rule will trigger. Think of it like the any keyword in Python.
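Since the Python comparison came up, here is a minimal sketch of the same idea in plain Python. The function name any_of_them is made up for illustration, and this is only an analogy for the condition, not how YARA works internally.

```python
# Hypothetical stand-in for YARA's "any of them" condition (an analogy only).
def any_of_them(data: bytes, patterns: list[bytes]) -> bool:
    """Return True if at least one pattern occurs anywhere in data."""
    return any(p in data for p in patterns)


patterns = [b"42.22.69.22", b"42.22.69.23"]
sample = b"connect 42.22.69.23:443"

# Only the second string is present, which is enough for "any".
print(any_of_them(sample, patterns))
```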

The strings section is the main rule-creating section for YARA. However, it can be omitted depending on the complexity of the rules. Speaking of complexity, we can collapse the two strings into a single pattern. One catch: YARA’s ?? wildcards only work inside hex strings, so in a quoted text string they would be matched as literal question marks. For text, a regular expression does the job.

rule example_rule {
    strings:
        $string1 = /42\.22\.69\.\d{1,3}/
    condition:
        $string1
}

The regex allows any IP address under the 42.22.69 subnet to be caught by the rule. The condition section has been updated to reflect this.

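Now, let’s say you only wanted to capture a range of IP addresses, rather than the whole subnet. YARA has what are called “jumps”, written like [0-127], but they only exist in hex strings and mean “skip between 0 and 127 arbitrary bytes”, not “a number from 0 to 127”. For a numeric range in text, spell it out in a regex.

rule example_rule {
    strings:
        $string1 = /42\.22\.69\.(\d{1,2}|1[01]\d|12[0-7])\b/
    condition:
        $string1
}

In this rule, the IP addresses ranging from 42.22.69.0-42.22.69.127 will be caught by the rule. Think of it like a (X <= N <= Y) statement. The trailing \b keeps 42.22.69.128 from matching on its first two digits.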
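If you want to sanity-check the (X <= N <= Y) idea before trusting a pattern, a quick Python sketch helps. This one assumes the range is expressed as a regular expression over the last octet; RANGE_RE and in_range are names made up for this example.

```python
import re

# Last octet limited to 0-127. The trailing \b stops "42.22.69.12"
# from matching inside "42.22.69.128".
RANGE_RE = re.compile(rb"42\.22\.69\.(\d{1,2}|1[01]\d|12[0-7])\b")


def in_range(data: bytes) -> bool:
    """Return True if data contains an address from 42.22.69.0 to 42.22.69.127."""
    return RANGE_RE.search(data) is not None


for candidate in (b"42.22.69.0", b"42.22.69.127", b"42.22.69.128"):
    print(candidate.decode(), in_range(candidate))
```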

YARA also supports hex values (duh).

rule example_rule {
    strings:
        $my_hex = { FF FF FF 7? ?? 3B }
    condition:
        $my_hex
}

What if you don’t want to wildcard and perhaps only need to check a certain selection of bytes?

rule example_rule {
    strings:
        $my_hex = { FF FF FF 7? (3C | 7C | 77) 3B }
    condition:
        $my_hex
}

With this rule, any of these 3 hex patterns will be identified:
FFFFFF7?3C3B
FFFFFF7?7C3B
FFFFFF7?773B
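To see what the nibble wildcard and the alternatives actually accept, here is a rough Python equivalent using a bytes regex. It is a sketch for building intuition, not YARA’s matching engine, and HEX_RE and hex_match are hypothetical names.

```python
import re

# 7?               -> any byte from 0x70 to 0x7F (high nibble fixed, low nibble wild)
# (3C | 7C | 77)   -> exactly one of these three bytes
HEX_RE = re.compile(rb"\xFF\xFF\xFF[\x70-\x7F](?:\x3C|\x7C|\x77)\x3B", re.DOTALL)


def hex_match(data: bytes) -> bool:
    """Return True if data contains the byte pattern anywhere."""
    return HEX_RE.search(data) is not None


print(hex_match(bytes.fromhex("FFFFFF753C3B")))  # 0x75 fits 7?, 0x3C is allowed
print(hex_match(bytes.fromhex("FFFFFF75003B")))  # 0x00 is not one of the alternatives
```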

YARA can even handle base64, wchar, even regex. That’s enough examples for now. You can read the rest here.

Let’s move over to the so-far unchanging condition section. Referencing a string variable requires the $ prefix, similar to PHP. Condition syntax is almost Pythonic in the way you write it. There’s or, and, at, not… plus a lot more. Here are all of YARA’s keywords.

all	and	any	ascii	at
base64	base64wide	condition	contains	endswith
entrypoint	false	filesize	for	fullword
global	import	icontains	iendswith	iequals
in	include	int16	int16be	int32
int32be	int8	int8be	istartswith	matches
meta	nocase	none	not	of
or	private	rule	startswith	strings
them	true	uint16	uint16be	uint32
uint32be	uint8	uint8be	wide	xor
defined

Let’s complexify our conditions a bit.

You can count strings with #…

rule example_rule {
    strings:
        $string1 = /42\.22\.69\.\d{1,3}/
    condition:
        #string1 == 7
}

…Look for strings at a specific offset…

rule example_rule {
    strings:
        $string1 = /42\.22\.69\.\d{1,3}/
    condition:
        $string1 at 40211 or $string1 at 52112 and #string1 == 7
}
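Counting and offset checks are easy to picture in Python. The sketch below uses a fixed byte prefix instead of a pattern to keep things short; match_offsets and matches_at are made-up helper names, and whether the counting lines up with YARA in every edge case is an assumption.

```python
def match_offsets(data: bytes, needle: bytes) -> list[int]:
    """Return every offset where needle starts inside data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def matches_at(data: bytes, needle: bytes, offset: int) -> bool:
    """Rough analogue of '$string at offset'."""
    return data.startswith(needle, offset)


data = b"xx42.22.69.1yy42.22.69.2"
hits = match_offsets(data, b"42.22.69.")
print(len(hits), hits)                    # the count plays the role of #string1
print(matches_at(data, b"42.22.69.", 2))  # the "at" check
```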

…Extract integer values from an offset…

rule block_logan {
    condition:
        uint16(0) == 0xD8FF
}

Pop quiz: What file does this rule catch?
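A hint without spoiling it: uint16 reads two bytes as a little-endian integer, so the bytes sit in the file in the opposite order from how the number is written. A minimal Python sketch of that read, where uint16_le is a hypothetical helper name:

```python
import struct


def uint16_le(data: bytes, offset: int) -> int:
    """Read an unsigned 16-bit little-endian integer at the given offset."""
    return struct.unpack_from("<H", data, offset)[0]


# On disk the file starts with FF D8, which reads back as 0xD8FF.
header = bytes([0xFF, 0xD8, 0xFF, 0xE0])
print(hex(uint16_le(header, 0)))  # prints 0xd8ff
```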

…Reference other rules…

rule rule1 {
    condition:
        uint32(uint32(0x3c)) == 0x00004550
}

rule rule2 {
    strings:
        $funcsig = {55 89 E5 57 56 53 83 EC 2C 8B 75 0C 8B 5D 08 85 F6 74 ?? 8B 83 9C 03 00 00}
        $ip = /42\.137\.(\d{1,2}|1[01]\d|12[0-7])\.\d{1,3}/
    condition:
        $funcsig and #ip > 3 and rule1
}
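The nested uint32 in rule1 is worth unpacking. It reads a 4-byte offset stored at 0x3C, jumps to that offset, and checks for the bytes “PE\0\0” there. A rough Python sketch of the same check, with looks_like_pe as a made-up name and bounds checks added so it doesn’t blow up on short input:

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Mimic uint32(uint32(0x3c)) == 0x00004550 on a byte buffer."""
    if len(data) < 0x40:
        return False
    # The 4 bytes at 0x3C hold the offset of the PE signature.
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    if pe_offset + 4 > len(data):
        return False
    # "PE\0\0" read as a little-endian uint32 is 0x00004550.
    return struct.unpack_from("<I", data, pe_offset)[0] == 0x00004550


# Build a fake buffer: pointer at 0x3C says the signature lives at 0x80.
fake = bytearray(0x100)
fake[0x3C:0x40] = struct.pack("<I", 0x80)
fake[0x80:0x84] = b"PE\x00\x00"
print(looks_like_pe(bytes(fake)))
```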

Lastly, there is the meta section. This simply describes metadata for the rule. This section serves no functional purpose, yet it can provide data about a rule that may be relevant for other researchers.

rule xord_ip {
    meta:
        in_the_wild = true
        notes = "this ip is xor'd by a random byte with each iteration of the malware"
    strings:
        // xor only applies to text strings, so match the fixed prefix of the IP
        $ip = "42.137." xor
    condition:
        $ip
}
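To picture what a single-byte XOR search involves, here is a brute-force sketch in Python. It assumes the keys to try are 0 through 255 and that only the fixed prefix of the IP is searched for; xor_bytes and find_xor_keys are hypothetical names, and this is an illustration rather than YARA’s implementation.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_keys(blob: bytes, plaintext: bytes) -> list[int]:
    """Return every single-byte key under which plaintext appears in blob."""
    return [key for key in range(256) if xor_bytes(plaintext, key) in blob]


# Pretend the specimen hid the prefix with key 0x5A.
blob = b"junk" + xor_bytes(b"42.137.", 0x5A) + b"junk"
print([hex(k) for k in find_xor_keys(blob, b"42.137.")])
```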

Don’t forget comments! They are equivalent to C-style comments.

rule weird_domain {
    // The domain name is evildoohickies.com but the case is randomized in each iteration of the specimen
    strings:
    /*
        $domain = "evildoohickies.com" nocase
    */

    // They rebranded
        $domain = "evilchickens.xyz" nocase
    condition:
        $domain
}

Snort

I actually hate Snort. Doing this for all of you is like pulling teeth, and you’re about to see why.

alert tcp any any -> any 80 (msg:"Suspicious User-Agent header detected"; flow:established,to_server; content:"User-Agent|3A|"; nocase; content:"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"; nocase; reference:url,www.microsoft.com/security/portal/threat/encyclopedia/entry.aspx?Name=HTTP+User-Agent+Header+Field+Containing+a+Remote+File+Inclusion+Attack; classtype:attempted-recon; sid:10000001; rev:1;)

Now, how do you expect me to explain that to you?

Seriously?

I’ll try my best.

Let’s use a simpler rule first.

alert tcp any any -> 42.137.0.0/16 any (msg: "IP captured";)

And we break it down into keywords:

alert
This dictates the action that Snort will take. You can alert, log, pass, drop, reject, or sdrop the packet.
tcp
TCP is the protocol the rule matches. Traffic over other protocols, like UDP or ICMP, will not be captured by the rule.
any (1)
The first any is the source IP address of the packet, which can be anything.
any (2)
The second any is the source port, which can also be anything.
->
The arrow signifies that the traffic is going from the source (the two anys on the left) to the destination (everything on the right).
IP address and mask
This is the destination IP address. With the 16-bit mask, any IP from 42.137.0.0 - 42.137.255.255 will be caught by the rule.
any (3)
This is the destination port, which can be anything.
Lastly is the options section which is inside of the parentheses. Within this lies only one keyword, which is msg and its parameter “IP captured”. msg is the message to display when the alert goes off. In practice you would also give the rule a sid, which shows up in the longer rule below.

That sums up this very basic rule, which alerts when a destination IP address is within the aforementioned range and displays the message “IP captured” upon doing so.
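If it helps, the header of that rule is just seven whitespace-separated fields, and the /16 check is plain CIDR math. Here is a small Python sketch of both. parse_header and ip_matches are made-up names, and this toy only handles “any” and a single CIDR block, not Snort variables, lists, or negation.

```python
import ipaddress


def parse_header(rule: str) -> dict:
    """Split the part of a Snort rule before '(' into its seven fields."""
    header = rule.split("(", 1)[0].split()
    keys = ["action", "protocol", "src_ip", "src_port",
            "direction", "dst_ip", "dst_port"]
    if len(header) != len(keys):
        raise ValueError("expected 7 header fields")
    return dict(zip(keys, header))


def ip_matches(field: str, ip: str) -> bool:
    """Return True if ip satisfies an 'any' or CIDR address field."""
    if field == "any":
        return True
    return ipaddress.ip_address(ip) in ipaddress.ip_network(field)


rule = 'alert tcp any any -> 42.137.0.0/16 any (msg: "IP captured";)'
fields = parse_header(rule)
print(fields)
print(ip_matches(fields["dst_ip"], "42.137.255.255"))
print(ip_matches(fields["dst_ip"], "42.138.0.1"))
```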

Back to the unholy rule. I will make it slightly syntactically prettier.

alert tcp any any -> any 80 \
( \
    msg:"Suspicious User-Agent header detected"; \
    flow:established,to_server; \
    content:"User-Agent|3A|"; nocase; \
    content:"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"; nocase; \
    reference:url,www.microsoft.com/security/portal/threat/encyclopedia/entry.aspx?Name=HTTP+User-Agent+Header+Field+Containing+a+Remote+File+Inclusion+Attack; \
    classtype:attempted-recon; \
    sid:10000001; rev:1; \
)

alert tcp any any -> any 80
“Alert on any TCP traffic headed to port 80…”
msg
Send alert message “Suspicious User-Agent header detected”.
flow
This option tells the rule to only apply to established TCP connections, and only to packets going from client → server.
content (1)
Will capture any traffic that contains “User-Agent:”. The |3A| encodes a colon “:” as a hex byte. The following nocase ignores any case differences in the string.
content (2)
Same thing as before, but the string is instead “Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)”. The following nocase ignores any case differences in the string.
reference
This option points to a URL that provides context for the rule’s behavior.
classtype
This specifies the class of the alert. The classtype is set to attempted-recon, which means the dangerous packet is a form of reconnaissance.
sid
Uniquely identifies a Snort rule. Should be followed by rev.
rev
Stands for revision, which is 1.

That was a bit of a mouthful. Essentially, this rule captures any packets that have suspicious User-Agent headers. See here for examples.
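The content and nocase pair is the part that trips people up, so here is a toy Python version. It decodes the |3A| style hex escapes into bytes and then does a case-insensitive search. decode_content and content_match are hypothetical names, and the sketch ignores escaped pipes and every other content modifier.

```python
def decode_content(content: str) -> bytes:
    """Turn a Snort content string such as 'User-Agent|3A|' into raw bytes."""
    out = bytearray()
    # Pieces between pipes (odd indexes) are hex bytes; the rest is literal text.
    for i, piece in enumerate(content.split("|")):
        out += bytes.fromhex(piece) if i % 2 else piece.encode("ascii")
    return bytes(out)


def content_match(payload: bytes, content: str, nocase: bool = False) -> bool:
    """Return True if the decoded content appears in the payload."""
    needle = decode_content(content)
    if nocase:
        return needle.lower() in payload.lower()
    return needle in payload


payload = b"GET / HTTP/1.1\r\nuser-agent: Mozilla/4.0 (compatible; MSIE 6.0)\r\n"
print(decode_content("User-Agent|3A|"))
print(content_match(payload, "User-Agent|3A|", nocase=True))
print(content_match(payload, "User-Agent|3A|"))
```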

A good mindset to have here:

If Snort, think packets.

If YARA, think programs.

And vice-versa.

There’s a slight overlap, but you shouldn’t be writing YARA rules for capturing network traffic; maybe the content of the packets, but not the packets themselves. On the other hand, you don’t want to be writing Snort rules trying to capture obfuscated malware. Speaking of obfuscated malware, I should write about packers soon…

I tell you what. That was a lot to write. I probably should have split this up into 2 posts as I was unaware how much stuff is in these 2 tools. Oh well. Enjoy the extra long post.

Have a great weekend, and Merry Christmas! 🎅🍪🥛☃️❄️

Go!

-BowTiedCrawfish

Shellfish Systems and Security
