Anti-spam techniques | searchivarius.org

A Bayesian Approach to Filtering Junk E-Mail  Mehran Sahami, Susan Dumais, David Heckerman, Eric Horvitz
Detecting and Characterizing Social Spam Campaigns  Hongyu Gao, Jun Hu, Christo Wilson, Zhichun Li, Yan Chen, Ben Y. Zhao
Efficient and Effective Spam Filtering and Re-ranking for Large Web Datasets  Gordon V. Cormack, Mark D. Smucker, Charles L. A. Clarke
Learning Rules that Classify E-Mail (1996)  William W. Cohen - Two methods for learning text classifiers are compared on classification problems that might arise in filtering and filing personal e-mail messages: a "traditional IR" method based on TF-IDF weighting, and a new method for learning sets of "keyword-spotting rules" based on the RIPPER rule learning algorithm. It is demonstrated that both methods obtain significant generalizations from a small number of examples; that both methods are comparable in generalization performance on problems of this type; and that both methods are reasonably efficient, even with fairly large training sets.
Spam Filtering for Short Messages  Cormack G.V., Hidalgo J.M.G., Sanz E.P.