Sciweavers

» Metric Learning for Text Documents
ICDAR
2007
IEEE
15 years 3 months ago
Identification of Latin-Based Languages through Character Stroke Categorization
This paper presents a language identification technique that detects Latin-based languages of imaged documents without OCR. The proposed technique detects languages through the wo...
S. J. Lu, L. Li, Chew Lim Tan
AMT
2006
Springer
15 years 3 months ago
Semi-Supervised Text Classification Using Positive and Unlabeled Data
Text classification using positive and unlabeled data refers to the problem of building a text classifier using positive documents (P) of one class and unlabeled documents (U) of man...
Shuang Yu, Xueyuan Zhou, Chunping Li
EMNLP
2007
15 years 22 days ago
Incremental Text Structuring with Online Hierarchical Ranking
Many emerging applications require documents to be repeatedly updated. Such documents include newsfeeds, webpages, and shared community resources such as Wikipedia. In this paper ...
Erdong Chen, Benjamin Snyder, Regina Barzilay
ESANN
2007
15 years 22 days ago
Kernel PCA based clustering for inducing features in text categorization
We study dimensionality reduction or feature selection in the text document categorization problem. We focus on the first step in building text categorization systems, that is the cho...
Zsolt Minier, Lehel Csató
ICASSP
2009
IEEE
15 years 3 months ago
Improved lattice-based spoken document retrieval by directly learning from the evaluation measures
Lattice-based approaches have been widely used in spoken document retrieval to handle the speech recognition uncertainty and errors. Position Specific Posterior Lattices (PSPL) an...
Chao-hong Meng, Hung-yi Lee, Lin-shan Lee