Computer Science – Computation and Language
Scientific paper
2000-08-22
Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, N.J. B
The growing problem of unsolicited bulk e-mail, also known as "spam", has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in "encrypted" form, contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures and, at the same time, investigate the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the performance of the Naive Bayesian filter is compared with that of a keyword-pattern filter that is part of a widely used e-mail reader.
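The idea of a Naive Bayesian filter with a cost-sensitive decision can be sketched as follows. This is a generic illustration, not the authors' implementation: the function names, the binary word-presence attributes, the Laplace smoothing, and the `cost_ratio` threshold rule are all assumptions, since the abstract does not specify these details. It also assumes the training data contains at least one message of each class.

```python
import math
from collections import Counter

LABELS = ("spam", "legit")

def train(messages):
    """Count, per class, how many messages contain each word.

    messages: list of (tokens, label) pairs, label in LABELS (hypothetical format).
    """
    doc_counts = Counter()
    word_counts = {label: Counter() for label in LABELS}
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(set(tokens))  # presence, not frequency
    vocab = set(word_counts["spam"]) | set(word_counts["legit"])
    return doc_counts, word_counts, vocab

def spam_probability(tokens, model):
    """Estimate P(spam | message) under the naive independence assumption."""
    doc_counts, word_counts, vocab = model
    total = sum(doc_counts.values())
    present = set(tokens) & vocab
    log_score = {}
    for label in LABELS:
        n = doc_counts[label]
        score = math.log(n / total)  # class prior
        for word in vocab:
            # Laplace-smoothed estimate of P(word present | class)
            p = (word_counts[label][word] + 1) / (n + 2)
            score += math.log(p if word in present else 1.0 - p)
        log_score[label] = score
    # Normalize in log space to avoid underflow.
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["legit"])

def is_spam(tokens, model, cost_ratio=9.0):
    """Flag a message only if P(spam) exceeds cost_ratio / (1 + cost_ratio).

    cost_ratio (assumed parameter): how many times worse it is to block a
    legitimate message than to let a spam message through.
    """
    threshold = cost_ratio / (1.0 + cost_ratio)
    return spam_probability(tokens, model) > threshold
```

Raising `cost_ratio` makes the filter more conservative: a message that is flagged at a ratio of 9 may be let through at a ratio of 999, reflecting the higher cost of losing legitimate mail.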
Androutsopoulos Ion
Chandrinos Konstantinos V.
Koutsias John
Spyropoulos Constantine D.
An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-mail Messages