Sciweavers

Corpora and data preparation
IJSI
2008
14 years 10 months ago
Managing the Acronym/Expansion Identification Process for Text-Mining Applications
This paper deals with an acronym/definition extraction approach from textual data (corpora) and the disambiguation of these definitions (or expansions). Both steps of our global pr...
Mathieu Roche, Violaine Prince
COLING
2002
14 years 9 months ago
Study of Practical Effectiveness for Machine Translation Using Recursive Chain-link-type Learning
A number of machine translation systems based on the learning algorithms are presented. These methods acquire translation rules from pairs of similar sentences in a bilingual text...
Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi, ...
SIGIR
2002
ACM
14 years 9 months ago
Unsupervised document classification using sequential information maximization
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
Noam Slonim, Nir Friedman, Naftali Tishby
JAIR
2010
14 years 8 months ago
Using Local Alignments for Relation Recognition
This paper discusses the problem of marrying structural similarity with semantic relatedness for Information Extraction from text. Aiming at accurate recognition of relations, we ...
Sophia Katrenko, Pieter W. Adriaans, Maarten van S...
ICDAR
2009
IEEE
14 years 7 months ago
Language Model Integration for the Recognition of Handwritten Medieval Documents
Building recognition systems for historical documents is a difficult task. Especially, when it comes to medieval scripts. The complexity is mainly affected by the poor quality and...
Markus Wüthrich, Marcus Liwicki, Andreas Fisc...