Sciweavers

CEAS
2006
Springer
15 years 1 month ago
Fast Uncertainty Sampling for Labeling Large E-mail Corpora
One of the biggest challenges in building effective anti-spam solutions is designing systems to defend against the ever-evolving bag of tricks spammers use to defeat them. Because ...
Richard Segal, Ted Markowitz, William Arnold
TREC
2004
14 years 10 months ago
Feature Generation, Feature Selection, Classifiers, and Conceptual Drift for Biomedical Document Triage
We approached the problem of classifying papers for the TREC 2004 Genomics Track triage task as a four step process: feature generation, feature selection, classifier training, an...
Aaron M. Cohen, Ravi Teja Bhupatiraju, William R. ...
ESWA
2006
14 years 9 months ago
An effective refinement strategy for KNN text classifier
Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received an...
Songbo Tan
CIKM
2004
Springer
15 years 1 month ago
InfoAnalyzer: a computer-aided tool for building enterprise taxonomies
In this paper we study the problem of collecting training samples for building enterprise taxonomies. We develop a computer-aided tool named InfoAnalyzer, which can effectively as...
Li Zhang, Shixia Liu, Yue Pan, Liping Yang
KDD
2009
ACM
15 years 2 months ago
Improving data mining utility with projective sampling
Overall performance of the data mining process depends not just on the value of the induced knowledge but also on various costs of the process itself such as the cost of acquiring...
Mark Last