Gabriel Vasseur

RBA: a better way to dedup risk events

Updated: Jul 1

In this post we’re discussing an advanced way to dedup risk events in your risk alerts (RIRs) while at the same time having the RIR results paint an accurate picture of what happened.


Say yes to deduplication, and no to loss of information!


A basic risk alert (RIR)

A good way to start an RIR rule could be with the following code:
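It could look something like this (a sketch only: the exact field names depend on your environment, and summariesonly should match your data model acceleration setup):

```spl
| tstats summariesonly=t count min(_time) as _time
    from datamodel=Risk.All_Risk
    by All_Risk.risk_object All_Risk.risk_object_type All_Risk.calculated_risk_score All_Risk.risk_message source
| rename All_Risk.* as *, calculated_risk_score as risk_score
| stats sum(eval(risk_score*count)) as risk_score values(source) as source by risk_object risk_object_type
```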


An actual RIR would then have some alerting logic, such as | search risk_score > 200 for instance, but whatever that logic is, it's irrelevant to this article.


Of course, we could combine the tstats and stats in just one line but here we want an opportunity to maybe massage what we’ll call the “raw events” before we group them by risk objects with this final stats command.


Please note the risk_score*count, so we really take into account every single risk event. That adds complexity, and it's counterproductive in the context of a conversation about deduplication, but at this stage we want to take everything into account, including risk events that fired multiple times.


Dummy data

In order to have something to play with, let’s assume that the first line (the tstats command) when run on its own gives us this:


From now on, we’ll replace the tstats command with this inputlookup. All the SPL on this page is copy-pastable and you are encouraged to download this lookup so that you can play along:
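For example (the lookup name here is a placeholder for whatever you saved the download as):

```spl
| inputlookup risk_dedup_demo.csv
```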



So our basic RIR rule stub would be:
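In other words, something along these lines (the lookup name is a placeholder):

```spl
| inputlookup risk_dedup_demo.csv
| stats sum(eval(risk_score*count)) as risk_score values(source) as source by risk_object risk_object_type
```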


Note the scores: 290 for Gabs and 770 for Hax0r.


The very basic dedup

Even with proper throttling, it's quite common for risk events to fire repeatedly, and this can inflate the total risk scores disproportionately. In Haylee's guide, we're encouraged to dedup them:
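A sketch of that basic dedup, keeping only one event per distinct risk_message:

```spl
| inputlookup risk_dedup_demo.csv
| dedup risk_object risk_object_type source risk_message
| stats sum(risk_score) as risk_score values(source) as source by risk_object risk_object_type
```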


The scores have slightly reduced: from 290 to 200 for Gabs and from 770 to 700 for Hax0r.


Note how I’ve now got rid of the risk_score*count logic, so that each event is counted at most once.


I love this deduplication and found it extremely valuable in reducing risk alert noise. If you haven't already, and if you do nothing else today, I highly recommend you implement it. But hopefully you'll keep reading and go further with the idea.


Our goal: a better dedup and more information

Now that we’ve established our starting point, let’s take a look at our finishing line:



The SPL for this is in the final section of this article.

As we will see in detail, there are several levels of deduplication going on there. The scores have reduced significantly: from 290 to 180 for Gabs and from 770 to 340 for Hax0r.


On top of that, the results paint an accurate story of what really happened before the deduplication. This obviously doesn’t replace the need for a proper risk investigation dashboard, but it’s still helpful.


Let’s focus on risk_message first


Let's go back to our "raw" events, remove some fluff and limit ourselves to just one of our objects:
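For example (the object name and field list are illustrative):

```spl
| inputlookup risk_dedup_demo.csv
| search risk_object="gabs"
| table risk_object risk_object_type source risk_message risk_score count
```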


Our goal for the RIR is to eventually group these "by risk_object_type risk_object", deduplicating where we can as far as the risk score is concerned, but not losing any information along the way. Let's start with:
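A sketch of that first step, recording both the total score contributed by repeats and a human-readable summary of each contribution (field names are illustrative):

```spl
| eval total_score = risk_score * count
| eval contribution = count." x ".risk_score
```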

This is building basic information about the amount and score of each contribution, which we can now group "by risk_object risk_object_type source risk_message":
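Something like this, assuming the total_score and contribution fields built in the previous step:

```spl
| stats sum(count) as count max(risk_score) as risk_score sum(total_score) as total_score
    values(contribution) as contribution
    by risk_object risk_object_type source risk_message
| eval risk_message = risk_message." (".count." events: ".mvjoin(contribution, " + ")."; kept ".risk_score.", dedup'd ".(total_score - risk_score).")"
```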


Notice the first improvement to the deduplication here: in the (admittedly probably rare) case of multiple events with the same risk_message but different scores, the deduplication will retain the highest score, not just the first one.


Also, risk_message now carries a lot more information: exactly how many events triggered, how the score breaks down, and how much of it was dedup’d.


We’re now ready to group by risk_object:
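A sketch:

```spl
| stats sum(risk_score) as risk_score values(source) as source values(risk_message) as risk_message
    by risk_object risk_object_type
```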


Let’s do the same for sources


We want a similar report for sources. So let’s go back to the first step and add this:
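One way to do it is to compute per-source totals while the events are still raw, assuming the total_score field from the first step (a sketch):

```spl
| eventstats sum(total_score) as source_score sum(count) as source_count
    by risk_object risk_object_type source
```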


Let’s re-add the logic for risk_message, but keeping our new breakdown per source available:
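A sketch, carrying the per-source fields through the risk_message grouping:

```spl
| stats sum(count) as count max(risk_score) as risk_score sum(total_score) as total_score
    max(source_score) as source_score max(source_count) as source_count
    by risk_object risk_object_type source risk_message
| eval risk_message = risk_message." (".count." events; kept ".risk_score." of ".total_score.")"
```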


Now we need another intermediate step:
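Namely, grouping by source to build a per-source summary (a sketch):

```spl
| stats sum(risk_score) as risk_score values(risk_message) as risk_message
    max(source_score) as source_score max(source_count) as source_count
    by risk_object risk_object_type source
| eval source = source." (".source_count." events; kept ".risk_score." of ".source_score.")"
```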


Before we can tie things together:
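A sketch of the final grouping:

```spl
| stats sum(risk_score) as risk_score values(source) as source values(risk_message) as risk_message
    by risk_object risk_object_type
```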


One thing I don’t like is how there is no consistency in the order between the sources and risk messages, so it’s hard to correlate the two. We could use the list() function instead:
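The same final grouping, with list() preserving the row order so sources and messages line up (a sketch):

```spl
| stats sum(risk_score) as risk_score list(source) as source list(risk_message) as risk_message
    by risk_object risk_object_type
```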

Much better!


It’s worth noting that the list() stats function, unlike the values() function, is limited to a maximum of 100 values. But we’ll deal with this next.


Ultimate deduplication


We want to protect against the situation where an RR (risk rule) generates tons of risk events with distinct risk_messages, defeating our deduplication efforts so far. Let’s impose a maximum of 10 risk contributions per source. We’ll use the hax0r risk_object for the demo data.

We’ll start with our risk_message based deduplication logic unchanged:


To which we add some logic to defang any contribution past the first 10:
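One way to sketch it: rank each source's contributions by score and zero out anything past the tenth (the cap of 10 and the message annotation are illustrative choices):

```spl
| sort 0 risk_object risk_object_type source -risk_score
| streamstats count as contribution_rank by risk_object risk_object_type source
| eval risk_message = if(contribution_rank > 10, risk_message." [over the 10-per-source cap: score not counted]", risk_message)
| eval risk_score = if(contribution_rank > 10, 0, risk_score)
```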


And then we can resume what we were doing before:

Now we’re really on to something!


Final SPL


To keep things simpler we didn’t bother with _time and the mitre stuff, so let’s add that back in and we’re done:
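Putting it all together, a sketch of the full pipeline (the lookup name, the cap value, and the mitre annotation field name are illustrative; adapt them to your environment):

```spl
| inputlookup risk_dedup_demo.csv
| eval total_score = risk_score * count
| eval contribution = count." x ".risk_score
| stats sum(count) as count max(risk_score) as risk_score sum(total_score) as total_score
    values(contribution) as contribution min(_time) as _time
    values(annotations.mitre_attack) as annotations.mitre_attack
    by risk_object risk_object_type source risk_message
| eval risk_message = risk_message." (".count." events: ".mvjoin(contribution, " + ")."; kept ".risk_score.", dedup'd ".(total_score - risk_score).")"
| sort 0 risk_object risk_object_type source -risk_score
| streamstats count as contribution_rank by risk_object risk_object_type source
| eval risk_score = if(contribution_rank > 10, 0, risk_score)
| stats sum(risk_score) as risk_score sum(count) as count list(risk_message) as risk_message
    min(_time) as _time values(annotations.mitre_attack) as annotations.mitre_attack
    by risk_object risk_object_type source
| eval source = source." (".count." events; score ".risk_score.")"
| stats sum(risk_score) as risk_score list(source) as source list(risk_message) as risk_message
    min(_time) as _time values(annotations.mitre_attack) as annotations.mitre_attack
    by risk_object risk_object_type
```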


Conclusion


I hope you learned something today. Although I’ve shown the evolution of the SPL fairly progressively, I haven’t gone into the details of what each command does. Either you already know, and it would have been superfluous; or you don’t, and hopefully this gave you a prompt to learn, play with the searches, and read the documentation for the various commands.

