Sciweavers

735 search results - page 26 / 147
» Corpora and data preparation
APVIS
2006
14 years 11 months ago
Generation of relevance maps and navigation in a digital book
This paper describes how to design a digital book for a new science called 'Knowledge Science'. We prepare several types of navigation facilities for browsing the book. Speci...
Katsuhiro Ikeda, Kozo Sugiyama, Isamu Watanabe, Ka...
ICASSP
2009
IEEE
15 years 4 months ago
Resampling auxiliary data for language model adaptation in machine translation for speech
Performance of n-gram language models depends to a large extent on the amount of training text material available for building the models and the degree to which this text matches...
Sameer Maskey, Abhinav Sethy
LREC
2010
14 years 11 months ago
Design and Data Collection for the Accentological Corpus of the Russian Language
An accentological corpus gives a researcher the opportunity to study word stress and stress variation, which are very important for the Russian language. Moreover, Accentological c...
Elena Grishina, Svetlana Savchuk, Alexej Poljakov
SIGIR
2009
ACM
15 years 4 months ago
Identifying the original contribution of a document via language modeling
One major goal of text mining is to provide automatic methods to help humans grasp the key ideas in ever-increasing text corpora. To this effect, we propose a statistica...
Benyah Shaparenko, Thorsten Joachims
ACL
2008
14 years 11 months ago
Language Dynamics and Capitalization using Maximum Entropy
This paper studies how written language variations affect the capitalization task over time. A discriminative approach, based on maximum entropy models, ...
Fernando Batista, Nuno J. Mamede, Isabel Trancoso