A Case Study of User-Level Spam Filtering

Bajaj, K. and Pieprzyk, J.

A number of anti-spam filters have reduced the amount of spam reaching the inbox, but the problem persists because spammers circumvent these techniques. The problem needs to be addressed from several angles. A major problem arises, for instance, when anti-spam techniques misclassify legitimate emails as spam (false positives) or fail to block spam on the SMTP server (false negatives), causing a staggering cost in lost time, effort and money. Although false positives are especially harmful because they cost the user important information, false negatives defeat the purpose of spam filtering. This paper proposes another angle from which to address the problem. It discusses some of these anti-spam techniques, especially the filtering technologies designed to prevent spam and the enhancements to their capabilities, and offers analytical recommendations that will be subject to further research. Beyond applying anti-spam techniques, training a spam control tool with relevant user preferences can reduce false positives, false negatives and the spam that lands in the inbox. We identify the need for training the filter with domain-specific data. This paper shows the decline in false negatives through the results of a case study in which the SpamBayes tool was trained on a carefully collected, domain-specific, user-preferred dataset over a period of 12 months.
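The effect of user-level training on false positives and false negatives can be illustrated with a toy example. The sketch below is a minimal multinomial naive Bayes filter with add-one smoothing; it is not the scoring method SpamBayes itself uses, and the class name, helper functions and all sample messages are hypothetical, chosen only to show how a generically trained filter can misjudge one user's mail until it sees that user's own labelled messages.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real filters also use headers, URLs, etc.
    return text.lower().split()


class NaiveBayesFilter:
    """Toy multinomial naive Bayes filter (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.msg_counts[label] += 1

    def is_spam(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_msgs = sum(self.msg_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior, so an untrained class never gives log(0).
            score = math.log((self.msg_counts[label] + 1) / (total_msgs + 2))
            for word in tokenize(text):
                # Add-one (Laplace) smoothing for unseen words.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return scores["spam"] > scores["ham"]


def error_counts(flt, labelled):
    """Return (false_positives, false_negatives) on labelled messages."""
    fp = sum(1 for text, label in labelled if label == "ham" and flt.is_spam(text))
    fn = sum(1 for text, label in labelled if label == "spam" and not flt.is_spam(text))
    return fp, fn


# Hypothetical generic training data, unrelated to any particular user.
generic = [
    ("win free money now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch tomorrow at noon", "ham"),
]
# Hypothetical mail from one user's domain (an academic, say).
user_mail = [
    ("free conference registration", "ham"),
    ("call for papers journal", "spam"),
]

flt = NaiveBayesFilter()
for text, label in generic:
    flt.train(text, label)
before = error_counts(flt, user_mail)

# Train on the user's own labelled mail, then re-check.
for text, label in user_mail:
    flt.train(text, label)
after = error_counts(flt, user_mail)

print("before user training (FP, FN):", before)
print("after user training  (FP, FN):", after)
```

For brevity the sketch re-checks the same messages it was trained on; a real evaluation would measure errors on held-out mail collected after training.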

Cite as: Bajaj, K. and Pieprzyk, J. (2014). A Case Study of User-Level Spam Filtering. In Proc. Twelfth Australasian Information Security Conference (AISC 2014), Auckland, New Zealand. CRPIT, 149. Parampalli, U. and Welch, I. Eds., ACS. 67-75.
