Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
Abstract. We aim to develop a technique to detect search engine optimization (SEO) spam websites. Specifically, we propose four methods for extracting the SEO spam entries from a ...
The development of test cases is an important issue for testing software, communication protocols and other reactive systems. A number of methods are known for the development of ...
This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a r...
David Newman, Jey Han Lau, Karl Grieser, Timothy B...
Work on evaluating and improving the relevance of web search engines typically use human relevance judgments or clickthrough data. Both these methods look at the problem of learni...
Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek ...