Sciweavers

129 search results - page 19 / 26
» A Corpus of Scope-disambiguated English Text
CASCON
2007
112 views, Education
14 years 11 months ago
Removing manually generated boilerplate from electronic texts: experiments with Project Gutenberg e-books
Collaborative work on unstructured or semistructured documents, such as in literature corpora or source code, often involves agreed-upon templates containing metadata. These templ...
Owen Kaser, Daniel Lemire
CEC
2010
IEEE
14 years 11 months ago
Evolving natural language grammars without supervision
Unsupervised grammar induction is one of the most difficult tasks in language processing. Its goal is to extract a grammar representing the language structure using texts without a...
Lourdes Araujo, Jesus Santamaria
KDD
2009
ACM
211 views, Data Mining
15 years 10 months ago
Address standardization with latent semantic association
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented corporations...
Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang,...
MT
2002
297 views
14 years 9 months ago
MARS: A Statistical Semantic Parsing and Generation-Based Multilingual Automatic tRanslation System
We present MARS (Multilingual Automatic tRanslation System), a research prototype speech-to-speech translation system. MARS is aimed at two-way conversational spoken language trans...
Yuqing Gao, Bowen Zhou, Zijian Diao, Jeffrey S. So...
CIKM
2009
Springer
15 years 4 months ago
Combining labeled and unlabeled data with word-class distribution learning
We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...