Sciweavers

483 search results - page 42 / 97
» Sampling the Web as Training Data for Text Classification
PKDD
2007
Springer
143 views, Data Mining
15 years 9 months ago
Using the Web to Reduce Data Sparseness in Pattern-Based Information Extraction
Textual patterns have been used effectively to extract information from large text collections. However they rely heavily on textual redundancy in the sense that facts have to be m...
Sebastian Blohm, Philipp Cimiano
PAKDD
2004
ACM
96 views, Data Mining
15 years 8 months ago
Spectral Energy Minimization for Semi-supervised Learning
The use of unlabeled data to aid classification is important as labeled data is often available in limited quantity. Instead of utilizing training samples directly into semi-super...
Chun Hung Li, Zhi-Li Wu
SDM
2008
SIAM
135 views, Data Mining
15 years 4 months ago
A Spamicity Approach to Web Spam Detection
Web spam, which refers to any deliberate actions bringing to selected web pages an unjustifiable favorable relevance or importance, is one of the major obstacles for high quality ...
Bin Zhou 0002, Jian Pei, ZhaoHui Tang
TAL
2010
Springer
15 years 1 month ago
Summarization as Feature Selection for Document Categorization on Small Datasets
Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Emmanuel Anguiano-Hernández, Luis Villase...
JCDL
2006
ACM
140 views, Education
15 years 9 months ago
Exploring erotics in Emily Dickinson's correspondence with text mining and visual interfaces
This paper describes a system to support humanities scholars in their interpretation of literary work. It presents a user interface and web architecture that integrates text minin...
Catherine Plaisant, James Rose, Bei Yu, Loretta Au...