One of the areas we research heavily at WitFoo is how to reduce the number of investigations our customers have to perform each day. Internally, we call this the “n” problem. Another area of focus is how to reduce the amount of time our customers spend on each investigation. We refer to this as the “t” problem. The lower we drive n and t, the more work our customers can accomplish each day.
Graph Theory
In working as an investigator, and later in consulting with investigators, it became clear that collecting and establishing pivot relationships could greatly help with reducing both n and t. Early in our research we were inspired by law enforcement linkboards like the one below.
In computer science, graph theory is the study of relationships (called edges) between objects (called nodes). We were lucky to find a library with roots in bioinformatics called CytoscapeJS (http://js.cytoscape.org/) that we used to frame tests to see whether graph theory could help with our n and t challenges.
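As a minimal sketch of the idea (in plain Python rather than CytoscapeJS, and with made-up host and file names), a graph can be held as a map from each node to the set of nodes it has a relationship with:

```python
from collections import defaultdict

def build_graph(edges):
    """Return an adjacency map {node: set(neighbors)} from (a, b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        # Relationships are treated as undirected for this illustration.
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Hypothetical observations: each pair is one relationship (an edge).
edges = [
    ("laptop-17", "malware.exe"),
    ("laptop-17", "file-server"),
    ("file-server", "203.0.113.9"),
]
graph = build_graph(edges)
print(sorted(graph["laptop-17"]))  # ['file-server', 'malware.exe']
```

Pivoting from one node to its neighbors is then a single lookup, which is the property we hoped would help with t.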
Six Degrees of Kevin Bacon
Our goal was to create a concise, clear picture of a security incident that would combine many security events (leads), reducing both n and t. What we envisioned when we started research a couple of years ago was similar to the screenshot from our modern interface below.
The image shows a machine, its malware, its staging target and where it sent the data. It’s a clear picture. Unfortunately, when we started research (and for months to follow) we kept running into what we called “The Cloud of Death” as pictured below.
Using network connections as the relationship edges between host nodes, we started observing a “Six Degrees of Kevin Bacon” (see: https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon) problem. Kevin Bacon was normally a core service like a DNS or email server. Since most computers had a first- or second-degree connection to a “Kevin Bacon”, virtually all computers were stitched into a single incident.
While this approach had a positive impact on the number of incidents to investigate (the n problem) by consolidating them all into the context of the network, it badly increased the amount of time it took to understand and investigate an incident (the t problem).
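The effect is easy to reproduce with a toy example. The sketch below is an illustration only (the host names are invented, and treating each connected component as one incident is a simplifying assumption). It counts incidents as a stand-in for n and uses the size of the largest incident as a rough stand-in for t:

```python
from collections import defaultdict

def connected_components(edges):
    """Group nodes into components; each component stands in for one incident."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first walk
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        components.append(group)
    return components

# Two small, unrelated incidents.
incident_edges = [("pc-1", "pc-2"), ("pc-3", "pc-4")]
# Routine lookups against a shared core service (the "Kevin Bacon").
dns_edges = [(pc, "dns-server") for pc in ("pc-1", "pc-2", "pc-3", "pc-4", "pc-5")]

without_hub = connected_components(incident_edges)
with_hub = connected_components(incident_edges + dns_edges)
print(len(without_hub), max(len(c) for c in without_hub))  # 2 2
print(len(with_hub), max(len(c) for c in with_hub))        # 1 6
```

Adding the hub drops the incident count from two to one, but the single remaining incident now contains every host, including pc-5, which had nothing to do with either original incident.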
Finding Relevant Relationships
After several months of a tug of war between n and t, we were getting close to throwing in the towel. When we improved n, we hurt t, and vice versa. We finally had a breakthrough when speaking with Detective Bill Ritch (see: https://www.witfoo.com/blog/importance-investigative-mindset/). He brought up the importance of establishing relevance in a relationship. The relevance is determined by human motivation and action. We began evaluating the nature of each edge relationship against the modus operandi (MO) of the attacker. The relationship can only be relevant if it is connected to a crime. This led to our first patent filing, “Temporal Link Analysis.”
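To make the idea of relevance concrete, here is a deliberately simplified sketch. It is not the method in the patent filing. It assumes each observed connection carries an evidence label, and it keeps only edges whose label matches a stage of a hypothetical attacker MO. The stage names, hosts, and labels are all invented for illustration.

```python
# Hypothetical stages of an attacker's modus operandi.
MO_STAGES = {"infection", "staging", "exfiltration"}

def relevant_edges(edges):
    """Keep only (src, dst) pairs whose evidence ties them to an MO stage."""
    return [(src, dst) for src, dst, evidence in edges if evidence in MO_STAGES]

# (source, destination, evidence label); None means routine traffic.
observed = [
    ("pc-1", "dns-server", None),
    ("pc-2", "dns-server", None),
    ("pc-1", "file-server", "staging"),
    ("file-server", "203.0.113.9", "exfiltration"),
    ("pc-3", "mail-server", None),
]

print(relevant_edges(observed))
# [('pc-1', 'file-server'), ('file-server', '203.0.113.9')]
```

The routine DNS and mail connections fall away, so the core services no longer stitch unrelated hosts into the incident.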
Prescription to Presumption
We certainly learned many things in our study of “Clouds of Death” and Kevin Bacon, but the biggest lesson was understanding how far the tool should go in automatically establishing relationships. We started down the path by being overly prescriptive. We told investigators what they should be looking at, even if they thought it was overly complex. We believed we were making things better by showing all the relationships. It turned out that our presumption created noise and confusion. Learning from that, we now show only the relationships we are certain are relevant, then allow the investigator to evaluate other relationships to determine whether they are actually relevant. As I wrote in my People > Machine (https://www.witfoo.com/blog/people-machines-part-five) series, it is critical to allow the investigator to do the work with the help of the tools. The screenshot below shows an incident with a graph relationship to another incident that MAY be relevant. This has proven effective because it recognizes the power of the human investigator.
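One way to picture this split is a small data structure that keeps confirmed relationships apart from ones that are merely offered to the investigator. This is a sketch under our own assumptions. The class, method names, and incident names are hypothetical and do not describe the product's internals.

```python
from dataclasses import dataclass, field

@dataclass
class Incident:
    name: str
    confirmed: set = field(default_factory=set)   # shown by default
    candidates: set = field(default_factory=set)  # shown as "may be relevant"

    def suggest(self, edge):
        """The tool proposes a relationship without asserting it."""
        if edge not in self.confirmed:
            self.candidates.add(edge)

    def accept(self, edge):
        """The investigator decides the relationship is relevant."""
        self.candidates.discard(edge)
        self.confirmed.add(edge)

    def dismiss(self, edge):
        """The investigator decides the relationship is noise."""
        self.candidates.discard(edge)

incident = Incident("incident-A", confirmed={("pc-1", "file-server")})
incident.suggest(("incident-A", "incident-B"))  # possible link to another incident
incident.suggest(("pc-1", "dns-server"))        # routine traffic

incident.accept(("incident-A", "incident-B"))
incident.dismiss(("pc-1", "dns-server"))
print(sorted(incident.confirmed))
# [('incident-A', 'incident-B'), ('pc-1', 'file-server')]
```

The tool never moves an edge into the confirmed set on its own. Only the investigator's accept call does that.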
Summary
Establishing relevant graph relationships between hosts, users and files in an incident has the capacity to reduce the number of investigations to perform, increase clarity and reduce investigative time. If done incorrectly or insufficiently, it will hinder instead of help. The devil is in the details.