Sciweavers

» Using the Structure of HTML Documents to Improve Retrieval
CSL
2004
Springer
Contemporaneous text as side-information in statistical language modeling
We propose new methods to exploit contemporaneous text, such as on-line news articles, to improve language models for automatic speech recognition and other natural language proce...
Sanjeev Khudanpur, Woosung Kim
BMCBI
2007
PubMed related articles: a probabilistic topic-based model for content similarity
Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document ...
Jimmy J. Lin, W. John Wilbur
IPM
2006
Automatic extraction of bilingual word pairs using inductive chain learning in various languages
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system...
Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi
TREC
2008
York University at TREC 2008: Blog Track
York University participated in the TREC 2008 Blog track, by introducing two opinion finding features. By initially focusing solely on the sentiment terms found in a document, usi...
Mladen Kovacevic, Xiangji Huang
TREC
2007
Parsimonious Language Models for a Terabyte of Text
The aims of this paper are twofold. Our first aim is to compare results of the earlier Terabyte tracks to the Million Query track. We submitted a number of runs using different ...
Djoerd Hiemstra, Rongmei Li, Jaap Kamps, Rianne Ka...