Sciweavers

2929 search results - page 39 / 586
» Models of English Text
ECAI
2006
Springer
15 years 1 month ago
Text Sampling and Re-Sampling for Imbalanced Authorship Identification Cases
Authorship identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts at least for some of the candida...
Efstathios Stamatatos
ICDM
2010
IEEE
14 years 7 months ago
Learning Preferences with Millions of Parameters by Enforcing Sparsity
We study the retrieval task that ranks a set of objects for a given query in the pairwise preference learning framework. Recently researchers found out that raw features (e.g. word...
Xi Chen, Bing Bai, Yanjun Qi, Qihang Lin, Jaime G....
ICDAR
2011
IEEE
13 years 9 months ago
A Handwritten Character Extraction Algorithm for Multi-language Document Image
In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, ...
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yan...
EMNLP
2004
14 years 11 months ago
A New Approach for English-Chinese Named Entity Alignment
Traditional word alignment approaches cannot come up with satisfactory results for Named Entities. In this paper, we propose a novel approach using a maximum entropy model for nam...
Donghui Feng, Yajuan Lü, Ming Zhou
TREC
1997
14 years 11 months ago
Conceptual Indexing Using Thematic Representation of Texts
We present the thesaurus-based indexing technology developed by the Center for Information Research under the Information System RUSSIA project. The technology is based on using b...
Boris V. Dobrov, Natalia V. Loukachevitch, Tatyana...