A Bayesian Approach to Filtering Junk E-Mail Mehran Sahami, Susan Dumais, David Heckerman, Eric Horvitz
Detecting and Characterizing Social Spam Campaigns Hongyu Gao, Jun Hu, Christo Wilson, Zhichun Li, Yan Chen, Ben Y. Zhao
Efficient and Effective Spam Filtering and Re-ranking for Large Web Datasets Gordon V. Cormack, Mark D. Smucker, Charles L. A. Clarke
Learning Rules that Classify E-Mail (1996) William W. Cohen - Two methods for learning text classifiers are compared on classification problems that might arise in filtering and filing personal e-mail messages: a "traditional IR" method based on TF-IDF weighting, and a new method for learning sets of "keyword-spotting rules" based on the RIPPER rule learning algorithm. It is demonstrated that both methods obtain significant generalizations from a small number of examples; that both methods are comparable in generalization performance on problems of this type; and that both methods are reasonably efficient, even with fairly large training sets.
Spam Filtering for Short Messages Cormack G.V., Hidalgo J.M.G., Sanz E.P.