The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...
We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a multi-...
AnHai Doan, Haixun Wang, Hao He, Jun Yang 0001, Ri...
Document clustering has been used for better document retrieval, document browsing, and text mining. In this paper, we investigate whether the biomedical ontology MeSH improves the cluster...
In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distr...
MiDiLiB is a six-year research project on digital music libraries funded by the German Research Foundation (DFG) as a part of the Distributed Processing and Delivery of D...