We describe a new approach for evaluating page segmentation algorithms. Unlike techniques that rely on OCR output, our method is region-based: the segmentation output, described a...
In information retrieval, relevance of documents with respect to queries is usually judged by humans, and used in evaluation and/or learning of ranking functions. Previous work ha...
Jingfang Xu, Chuanliang Chen, Gu Xu, Hang Li, Elbi...
This paper connects two research areas: automatic tagging on the web and statistical keyphrase extraction. First, we analyze the quality of tags in a collaboratively created folks...
The work described here initially formed part of a triangulation exercise to establish the effectiveness of the Query Term Order algorithm. The methodology produced subsequently p...
Wikipedia is becoming ever more popular. Linking between documents is typically provided in similar environments in order to achieve collaborative knowledge sharing. However, this ...
Darren Wei Che Huang, Yue Xu, Andrew Trotman, Shlo...