Sciweavers

» Sampling the Web as Training Data for Text Classification
KDD
1998
ACM
15 years 7 months ago
Joins that Generalize: Text Classification Using WHIRL
WHIRL is an extension of relational databases that can perform "soft joins" based on the similarity of textual identifiers; these soft joins extend the traditional operation of j...
William W. Cohen, Haym Hirsh
KDD
2009
ACM
15 years 7 months ago
Improving data mining utility with projective sampling
Overall performance of the data mining process depends not just on the value of the induced knowledge but also on various costs of the process itself such as the cost of acquiring...
Mark Last
SAC
2008
ACM
15 years 2 months ago
Exploring social annotations for web document classification
Social annotation via so-called collaborative tagging describes the process by which many users add metadata in the form of unstructured keywords to shared content. In this paper,...
Michael G. Noll, Christoph Meinel
TASLP
2010
14 years 9 months ago
Active Learning With Sampling by Uncertainty and Density for Data Annotations
To solve the knowledge bottleneck problem, active learning has been widely used for its ability to automatically select the most informative unlabeled examples for human annotation...
Jingbo Zhu, Huizhen Wang, Benjamin K. Tsou, Matthe...
KDD
2006
ACM
16 years 3 months ago
Training linear SVMs in linear time
Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for high-dimensional sparse data commonly encountered in applications like t...
Thorsten Joachims