Sciweavers

24 search results - page 4 / 5
» Classifying Chinese Texts in Two Steps
WWW
2008
ACM
14 years 7 months ago
Enhanced hierarchical classification via isotonic smoothing
Hierarchical topic taxonomies have proliferated on the World Wide Web [5, 18], and exploiting the output space decompositions they induce in automated classification systems is an...
Kunal Punera, Joydeep Ghosh
CORR
2002
Springer
13 years 6 months ago
Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation a...
Rie Kubota Ando, Lillian Lee
JMLR
2006
13 years 6 months ago
Spam Filtering Using Statistical Data Compression Models
Spam filtering poses a special problem in text categorization, of which the defining characteristic is that filters face an active adversary, which constantly attempts to evade fi...
Andrej Bratko, Gordon V. Cormack, Bogdan Filipic, ...
LREC
2010
13 years 7 months ago
Towards a Learning Approach for Abbreviation Detection and Resolution
The explosion of biomedical literature and, with it, the uncontrolled creation of abbreviations present some special challenges for both human readers and computer applications. ...
Klaar Vanopstal, Bart Desmet, Véronique Hos...
ICML
2004
IEEE
14 years 7 months ago
Leveraging the margin more carefully
Boosting is a popular approach for building accurate classifiers. Despite the initial popular belief, boosting algorithms do exhibit overfitting and are sensitive to label noise. ...
Nir Krause, Yoram Singer