Sciweavers

168 search results - page 15 / 34
» Document Classification Using Multiword Features
WWW
2004
ACM
15 years 10 months ago
Using URLs and table layout for web classification tasks
We propose new features and algorithms for automating Web-page classification tasks such as content recommendation and ad blocking. We show that the automated classification of We...
L. K. Shih, David R. Karger
IFIP12
2004
14 years 11 months ago
Impact on Performance of Hypertext Classification of Selective Rich HTML Capture
Hypertext categorization is the automatic classification of web documents into predefined classes. It poses new challenges for automatic categorization because of the rich inform...
Houda Benbrahim, Max Bramer
ICDAR
2009
IEEE
14 years 7 months ago
Invariant Primitives for Handwritten Arabic Script: A Contrastive Study of Four Feature Sets
The choice of relevant features is very decisive in handwriting recognition rate. Our aim is to present some useful structural and statistical features and see their degree of var...
Sofiene Haboubi, Samia Maddouri, Noureddine Ellouz...
DRR
2009
14 years 7 months ago
Using synthetic data safely in classification
When is it safe to use synthetic data in supervised classification? Trainable classifier technologies require large representative training sets consisting of samples labeled with...
Jean Nonnemaker, Henry Baird
DAWAK
2008
Springer
14 years 11 months ago
Is a Voting Approach Accurate for Opinion Mining?
In this paper, we focus on classifying documents according to opinion and value judgment they contain. The main originality of our approach is to combine linguistic pre-processing,...
Michel Plantié, Mathieu Roche, Gérar...