CEAS
2004
Springer

Word Stemming to Enhance Spam Filtering

A content-based spam filter generally works on the words and phrases of an email's text: when it finds offensive content, it assigns the email a numerical score that depends on that content. Once the score crosses a certain threshold, the email may be classified as spam. This technique works well only if the offensive words are lexically correct, that is, valid words with correct spelling. Otherwise, most content-based spam filters fail to detect them. In this paper, we show that a word stemming or word hashing technique that extracts the base, or stem, of a misspelled or modified word can significantly improve the efficiency of any content-based spam filter. We present a simple rule-based word stemming algorithm designed specifically for spam detection, along with experimental results that corroborate this claim.
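The abstract does not list the paper's actual stemming rules, so the following is only a minimal sketch of the general idea: reduce an obfuscated token to a canonical stem before keyword scoring. The substitution table, the keyword weights, the threshold, and the function names `stem`, `spam_score`, and `is_spam` are all illustrative assumptions, not the authors' algorithm.

```python
import re

# Hypothetical look-alike substitutions used to disguise words (assumption).
SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)

# Hypothetical keyword weights; a real filter would learn or curate these.
KEYWORD_WEIGHTS = {"viagra": 5.0, "mortgage": 3.0, "free": 1.0}


def stem(token: str) -> str:
    """Reduce a possibly obfuscated token to a crude canonical stem."""
    token = token.lower().translate(SUBSTITUTIONS)  # undo look-alike characters
    token = re.sub(r"[^a-z]", "", token)            # drop separators such as '.' or '-'
    token = re.sub(r"(.)\1+", r"\1", token)         # collapse runs of a repeated letter
    return token


# Keywords go through the same rules so both sides compare in stemmed form.
STEMMED_WEIGHTS = {stem(word): weight for word, weight in KEYWORD_WEIGHTS.items()}


def spam_score(text: str) -> float:
    """Sum the weights of all whitespace-separated tokens whose stem is a keyword."""
    return sum(STEMMED_WEIGHTS.get(stem(tok), 0.0) for tok in text.split())


def is_spam(text: str, threshold: float = 5.0) -> bool:
    """Flag the text once its score reaches the (assumed) threshold."""
    return spam_score(text) >= threshold


if __name__ == "__main__":
    print(stem("V1@gra"), stem("v.i.a.g.r.a"))      # both reduce to "viagra"
    print(is_spam("Buy V1@gra now, FREEEE!!!"))     # True
    print(is_spam("Meeting at noon"))               # False
```

Because the keywords pass through the same rules as the message tokens, a spelling such as "FREEEE" and the plain word "free" end up with the same stem. This sketch does not handle letters separated by spaces ("v i a g r a"), since it tokenizes on whitespace.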
Shabbir Ahmed, Farzana Mithun
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where CEAS