Web page classification is important to many tasks in information retrieval and web mining. However, applying traditional textual classifiers to web data often produces unsatisfyi...
Recently, a number of studies have demonstrated that search engine log files are an important resource for determining the relevance relation between URLs and query terms. We hypothes...
Max Hinne, Wessel Kraaij, Stephan Raaijmakers, Suz...
Image spam is a new obfuscation method that spammers invented to bypass conventional text-based spam filters more effectively. In this paper, we extract local invariant features ...
Haiqiang Zuo, Weiming Hu, Ou Wu, Yunfei Chen, Guan...
To date, support vector machines (SVMs) achieve state-of-the-art performance on text classification (TC) tasks. Due to the complexity of TC problems, it becomes a ch...
We propose a simple yet effective approach to context-sensitive synonym discovery for Web search queries based on co-click analysis; i.e., analyzing queries leading to clicking sam...