Sciweavers

DOCENG 2005, ACM
Injecting information into atomic units of text
This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common ...
Yannis Haralambous, Gábor Bella
EMNLP 2010
A New Approach to Lexical Disambiguation of Arabic Text
We describe a model for the lexical analysis of Arabic text, using the lists of alternatives supplied by a broad-coverage morphological analyzer, SAMA, which include stable lemma ...
Rushin Shah, Paramveer S. Dhillon, Mark Liberman, ...
KDD 2006, ACM
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee