Sciweavers

483 search results - page 34 / 97
» Sampling the Web as Training Data for Text Classification
ICPR
2010
IEEE
15 years 1 month ago
Enhancing Web Page Classification via Local Co-training
In this paper we propose a new multi-view semisupervised learning algorithm called Local Co-Training (LCT). The proposed algorithm employs a set of local models with vect...
Youtian Du, Xiaohong Guan, Zhongmin Cai
ICML
2007
IEEE
16 years 3 months ago
Support cluster machine
For large-scale classification problems, the training samples can be clustered beforehand as a downsampling pre-process, and then only the obtained clusters are used for training....
Bin Li, Mingmin Chi, Jianping Fan, Xiangyang Xue
IJCAI
2003
15 years 4 months ago
Web Page Cleaning for Web Mining through Feature Weighting
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
Lan Yi, Bing Liu
JMLR
2006
15 years 3 months ago
Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems
Parallel software for solving the quadratic program arising in training support vector machines for classification problems is introduced. The software implements an iterative dec...
Luca Zanni, Thomas Serafini, Gaetano Zanghirati
ERCIMDL
2005
Springer
15 years 8 months ago
Importance of HTML Structural Elements and Metadata in Automated Subject Classification
The aim of the study was to determine how significance indicators assigned to different Web page elements (internal metadata, title, headings, and main text) influence automated cl...
Koraljka Golub, Anders Ardö