Sciweavers

» A Lexicon-Constrained Character Model for Chinese Morphologi...
DOCENG
2005
ACM
13 years 7 months ago
Injecting information into atomic units of text
This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common ...
Yannis Haralambous, Gábor Bella
EMNLP
2010
13 years 3 months ago
A New Approach to Lexical Disambiguation of Arabic Text
We describe a model for the lexical analysis of Arabic text, using the lists of alternatives supplied by a broad-coverage morphological analyzer, SAMA, which include stable lemma ...
Rushin Shah, Paramveer S. Dhillon, Mark Liberman, ...
KDD
2006
ACM
14 years 5 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee