Collaborative Filtering (CF) algorithms, used to build web-based recommender systems, are often evaluated in terms of how accurately they predict user ratings. However, current eva...
Neal Lathia, Stephen Hailes, Licia Capra, Xavier A...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Knowledge representation (KR) is used to store and retrieve meaningful data. This data is saved using dynamic data structures that are suitable for the style of KR being implemente...
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
The automatic generation of back-of-the-book indexes seems to be out of sight of the Information Retrieval and Natural Language Processing communities, although the increasingly la...