Sciweavers

EMNLP
2009
13 years 2 months ago
Polylingual Topic Models
Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive colle...
David M. Mimno, Hanna M. Wallach, Jason Naradowsky...
APWEB
2006
Springer
13 years 8 months ago
The Case of the Duplicate Documents: Measurement, Search, and Science
Many of the documents in large text collections are duplicates and versions of each other. In recent research, we developed new methods for finding such duplicates; however, as the...
Justin Zobel, Yaniv Bernstein
CIKM
2000
Springer
13 years 9 months ago
Collection Selection and Results Merging with Topically Organized U.S. Patents and TREC Data
We investigate three issues in distributed information retrieval, considering both TREC data and U.S. Patents: (1) topical organization of large text collections, (2) collection r...
Leah S. Larkey, Margaret E. Connell, James P. Call...
SAC
2005
ACM
13 years 10 months ago
Mining concept associations for knowledge discovery in large textual databases
In this paper, we describe a new approach for mining concept associations from large text collections. The concepts are short sequences of words that occur frequently together acr...
Xiaowei Xu, Mutlu Mete, Nurcan Yuruk