The Threat Research group sat in on a talk yesterday by HBGary CEO Greg Hoglund, a regular speaker, who discussed research he’s been doing over the past year that he hopes will help connect malware samples to known groups of malware creators. While that sounds promising for law enforcement, it’s actually less helpful for tracking down the originators of malware for prosecution than it is for helping security researchers make a preliminary grouping and classification of the masses of outwardly dissimilar Trojans we see every day.
In most conventional methods of classification, researchers look for programmatic similarities or behavioral characteristics as a way to group similar pieces of malware into definitions, which in turn simplify an antivirus tool’s task of cleaning up an infection. In his talk, Hoglund proposed another set of criteria antimalware researchers can use to make these kinds of classifications: the “tool marks” left behind inside malware samples by the compiling tools, languages, and even sloppy coding habits of malware creators.
On a technical level, Webroot’s Threat Research team has been using these “tool marks” as guides for some time when performing manual analysis of malicious files. Hoglund’s talk introduced a tool he created, called Fingerprint, which can process a malware file and, in an automated fashion, provide malware researchers with simplified output they can then add to a database. With a sufficiently large sample set, surprisingly good clusters seem to emerge, as shown in the photograph above, which is a snapshot of one of Hoglund’s slides.
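To make the clustering idea concrete, here is a minimal sketch of how samples might be grouped once their tool marks sit in a database. This is not how Fingerprint itself works, which the talk did not spell out for us at this level. The mark strings, the Jaccard similarity measure, the 0.5 threshold, and the function names are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of tool marks two samples share (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_tool_marks(samples, threshold=0.5):
    """Single-linkage grouping: a sample joins a cluster if it is
    similar enough to at least one sample already in it."""
    parent = {name: name for name in samples}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every sufficiently similar pair.
    for a, b in combinations(samples, 2):
        if jaccard(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in samples:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Hypothetical tool marks; a real extractor would pull these from the file.
samples = {
    "sample_a": {"compiler:msvc6", "packer:upx", "lang:ru", "pdb:bot_project"},
    "sample_b": {"compiler:msvc6", "packer:upx", "lang:ru"},
    "sample_c": {"compiler:delphi7", "packer:none", "lang:zh"},
}
clusters = cluster_by_tool_marks(samples)
```

Here `sample_a` and `sample_b` share three of four distinct marks and end up together, while `sample_c` shares none and stays on its own. Comparing every pair is fine for a sketch but would not scale to the volume of samples a research team sees daily.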
While the characteristic “tool marks” alone are probably not sufficient to establish that an arbitrary, unknown file is malicious, they can be a good indicator that the unknown file is related, possibly in several significant ways, to files that have been established to be malicious. This predictive ability may be the fingerprint’s greatest strength…at least until the malware authors catch on and strip this identifiable information out of their files. In the meantime, however, laziness on the part of malware creators, and the difficulty of completely re-coding new malware, mean identifiable tool marks should persist for a while, so this fingerprinting method may remain effective for some time.
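The "related to known-bad files" check described above could look something like the following. Again, this is a rough sketch under stated assumptions rather than anything shown in the talk: the tool mark strings, the sample names, the `closest_known` helper, and the 0.6 cutoff are all hypothetical.

```python
def jaccard(a, b):
    """Fraction of tool marks two samples share (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_known(unknown_marks, known):
    """Return (name, score) for the known-malicious sample whose
    tool marks overlap most with the unknown file's."""
    best = max(known, key=lambda name: jaccard(unknown_marks, known[name]))
    return best, jaccard(unknown_marks, known[best])


# Hypothetical database of tool marks from files already judged malicious.
known_malicious = {
    "trojan_x": {"compiler:msvc6", "packer:upx", "lang:ru"},
    "trojan_y": {"compiler:delphi7", "packer:none", "lang:zh"},
}

# Hypothetical marks pulled from a file nobody has classified yet.
unknown = {"compiler:msvc6", "packer:upx", "lang:ru", "timestamp:zeroed"}

name, score = closest_known(unknown, known_malicious)
# A high score suggests a relationship worth a closer look, not a verdict.
related = score >= 0.6
```

A match like this only flags the file for a researcher's attention. As noted above, shared tool marks suggest common origins but do not by themselves prove the file is malicious.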