Find Relevant Falcon EDR Data Using Events Data Dictionary

A Common Problem With CrowdStrike EDR Logs

Threat hunting, SOC detection engineering, DFIR, and similar work all typically require sifting through logs to find information. If you use CrowdStrike Falcon as your EDR, you may have found that it generates a TON of logs from endpoints. This is both awesome (because we get a lot of useful data) and cumbersome (because we have lots of data we don’t need). During an investigation, that log volume can make broad searches not only slow, but also likely to return a sea of useless information that buries what you’re actually looking for.

The Solution

The Events Data Dictionary: https://falcon.crowdstrike.com/documentation/26/events-data-dictionary. This is a reference hosted by CrowdStrike that provides a dictionary of their event types. If you are looking through Falcon logs, this is your best friend. The field that holds the event type value is called event_simpleName.

Here’s an example of using it in a search:

event_simpleName=NetworkConnectIP4 RemoteAddressIP4="malicious ip"
| stats count by ComputerName 
| fields *

By using that event_simpleName filter, we’re able to specify exactly what event types we want to filter and speed up the search significantly. The above search is looking for IPv4 connections to some “malicious IP” (fill in the blank here), and then counting and grouping those events by ComputerName.

We can replicate this by searching through this dictionary and identifying the event type that we’re interested in, and filtering by that event to identify relevant data.
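As a rough sketch of that workflow, suppose we want DNS lookups instead of IP connections. This assumes the dictionary entry for DNS activity is DnsRequest and that it carries a DomainName field (confirm both in the dictionary for your environment); the domain is a placeholder:

```spl
event_simpleName=DnsRequest DomainName="malicious.example.com"
| stats count by ComputerName
```

The shape is the same as the earlier search: pick the event type from the dictionary, filter on the field that event type documents, then summarize.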

Takeaway

Use the Falcon Events Data Dictionary to drill down into relevant data, rather than using broad searches that not only take longer but also return a large amount of noise.

Posted 11/23/2022

Using subsearches to improve your Splunk queries

What is a subsearch?

A subsearch in Splunk is pretty much what it sounds like: a search that runs “separately” from the main search. These will also generate their own job in the Splunk scheduler. A subsearch always runs before the outer “main” search, and returns its results as arguments to the main search. All field:value results from a subsearch are formatted and applied to the main search in the form of “OR” statements (if there are multiple fields/values).

Benefits

Using a subsearch can reduce search time and complexity significantly for certain searches.

By using a subsearch to gather parameters/filters first, then applying those to what would be a very broad search, we can drastically improve our search performance.

As an example, let’s say we want to find all events where bash history has been cleared on a host, and search for that ProcessID to see other relationships and actions performed to map out a kill chain:

index="endpoints" event_name=*
    [ search index="endpoints" event_name="env_variable_change" env_variable IN (HISTFILE,HISTSIZE,HISTFILESIZE) env_variable_value IN ("/dev/null","0")
    | fields process_id ]
| stats values(*) by computername

In the above search we first run a subsearch to find process_ids associated with environment variable changes (in this case wiping bash history). The subsearch returns the PIDs as arguments to the outer search (outside of the square brackets). The main search then searches the endpoints index again, but specifically for events associated with those PIDs, which we’ve established have performed this suspicious activity, and builds a stats table grouped by computername.
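To see what the subsearch actually hands to the main search, one option is to run it on its own and append the format command, which is what Splunk applies implicitly at the end of a subsearch. This is only a sketch using the same example index and field names as above:

```spl
index="endpoints" event_name="env_variable_change" env_variable IN (HISTFILE,HISTSIZE,HISTFILESIZE) env_variable_value IN ("/dev/null","0")
| table process_id
| format
```

The output is a single field named search holding the expanded filter, along the lines of ( ( process_id="4321" ) OR ( process_id="8765" ) ), where the PIDs shown here are made-up values.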

The benefit of using a subsearch here is being able to take specific activity/events (suspicious variable changes) and link them to a broader scope via their PID. Without a subsearch we wouldn’t know in advance which PIDs to search for, so our search would likely need a long chain of “OR” statements to piece event types together and achieve the same result.

Bonus: Subsearch formatting

Sometimes we want to run a subsearch, but the main search fields may not match up to the fields from our subsearch. This may be because we’re using a different index or sourcetype between the two, or something along those lines. This can cause problems because the results returned by the subsearch specify the fieldname, i.e. process_id="value". To make our subsearch results fieldname agnostic, we can rename the field at the end of the subsearch to “query”, and this forces Splunk to drop the fieldname and only use the value as the argument to the main search. If we use the example from above:

index="endpoints" event_name=*
    [ search index="endpoints" event_name="env_variable_change" env_variable IN (HISTFILE,HISTSIZE,HISTFILESIZE) env_variable_value IN ("/dev/null","0")
    | fields process_id
    | rename process_id as query ]
| stats values(*) by computername

This will add the values for process_id to the main search, but not include “process_id=”.
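As a sketch of the mismatched-field case described above, assume a second, hypothetical index named linux_audit that records the same activity but stores the PID in a field called pid and the event type in a field called action (all three names are invented for illustration). Renaming to query lets the values flow into the endpoints search despite the mismatch:

```spl
index="endpoints" event_name=*
    [ search index="linux_audit" action="env_variable_change"
    | fields pid
    | rename pid as query ]
| stats values(*) by computername
```

Because the field name is dropped, each value is matched as a bare search term against the whole event, so a short numeric PID can also match unrelated events that happen to contain the same number. Keeping the main search narrow helps limit that noise.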

Posted 11/23/2022