Sciweavers

Document Classification Using Multiword Features
ICDE
2007
IEEE
15 years 3 months ago
Document Representation and Dimension Reduction for Text Clustering
Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...
M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...
CIKM
2010
Springer
14 years 7 months ago
Fast dimension reduction for document classification based on imprecise spectrum analysis
This paper proposes an algorithm called Imprecise Spectrum Analysis (ISA) to carry out fast dimension reduction for document classification. ISA is designed based on the one-sided...
Hu Guan, Bin Xiao, Jingyu Zhou, Minyi Guo, Tao Yan...
ICDAR
2011
IEEE
13 years 9 months ago
Identification of Indic Scripts on Torn-Documents
Questioned Document Examination processes often encompass analysis of torn documents. To aid a forensic expert, automatic classification of content type in torn documents might ...
Sukalpa Chanda, Katrin Franke, Umapada Pal
TAL
2010
Springer
14 years 8 months ago
Summarization as Feature Selection for Document Categorization on Small Datasets
Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Emmanuel Anguiano-Hernández, Luis Villase&n...
ICDAR
2007
IEEE
15 years 1 months ago
On the Use of Lexeme Features for Writer Verification
Document examiners use a variety of features to analyze a given handwritten document for writer verification. The challenge in the automatic classification of a pair of documents ...
A. Bhardwaj, A. Singh, Harish Srinivasan, Sargur N...