Husna, Husain; Phithakkitnukoon, Santi; Palla, Srikanth and Dantu, Ram
(2008).
DOI: https://doi.org/10.1109/COMSWA.2008.4554418
Abstract
Compromised computers, known as bots, are the major source of spam, and detecting them greatly improves control of unwanted traffic. In this work we investigate the behavior patterns of spammers based on their underlying similarities in spamming. To our knowledge, no work has been reported on identifying spam botnets based on spammers' temporal characteristics. Our study shows that the relationships among spammers exhibit a highly clustered structure based on features such as content length, time of arrival, frequency of email, active time, inter-arrival time, and content type. Although the dimensionality of the collected feature set is low, we perform principal component analysis (PCA) on the feature set to identify the features that account for the maximum variance in the spamming patterns. Further, we calculate the proximity between different spammers and classify them into groups, each containing spammers of similar proximity. Spammers in the same group share similar patterns of spamming a domain. For classification into botnet groups, we use clustering algorithms such as hierarchical clustering and K-means. We assign botnet spammers to a particular group with a precision of 90%.
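The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the spammer names and feature values are hypothetical and invented for illustration, the PCA step is omitted for brevity, Euclidean distance on z-scored features stands in for the unspecified proximity measure, and the deterministic farthest-point initialization is an assumption made only to keep the example reproducible.

```python
import math

# Hypothetical per-spammer feature vectors (made-up values, not data from the paper):
# [mean content length (bytes), mean inter-arrival time (s), emails per day]
features = {
    "spammer_a": [1200.0, 30.0, 400.0],
    "spammer_b": [1150.0, 28.0, 420.0],
    "spammer_c": [1250.0, 35.0, 390.0],
    "spammer_d": [5400.0, 600.0, 20.0],
    "spammer_e": [5600.0, 640.0, 25.0],
    "spammer_f": [5300.0, 580.0, 18.0],
}


def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=50):
    """Plain Lloyd's K-means with Euclidean distance as the proximity measure."""
    # Assumed initialization: start from the first point, then repeatedly add
    # the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each spammer joins its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


labels = kmeans(standardize(list(features.values())), k=2)
groups = dict(zip(features, labels))
```

Here `groups` maps each hypothetical spammer to a cluster index. In this toy data the short-message, high-frequency senders and the long-message, low-frequency senders fall into separate groups.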