Improving the precision of information retrieval has been a challenging issue on the Chinese Web. As exemplified by Chinese recipes on the Web, it is not easy or natural for people to us...
The dimensionality curse has profound effects on the effectiveness of high-dimensional similarity indexing from the performance perspective. One of the well-known techniques for im...
The World-Wide Web consists not only of a huge number of unstructured texts, but also of a vast amount of valuable structured data. Web tables [2] are a typical type of structured in...
Cindy Xide Lin, Bo Zhao, Tim Weninger, Jiawei Han,...
The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The most efficient algorithm proposed so far for answering to...
This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation...