The world wide web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...
In this paper we propose a method for predicting the ranking position of a Web page. Assuming a set of successive past top-k rankings, we study the evolution of Web pages in terms...
Michalis Vazirgiannis, Dimitris Drosos, Pierre Sen...
Establishing relationships within a dataset is one of the core objectives of data mining. In this paper a method of correlating behaviour profiles in a continuous dataset is presen...
In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engin...
Alexandros Ntoulas, Marc Najork, Mark Manasse, Den...