Experience of Using SVM for the Triage Task in TREC 2004 Genomics Track

This paper reports our knowledge-ignorant machine learning approach to the triage task in the TREC 2004 Genomics Track, which is essentially a text categorization problem. We applied a Support Vector Machine (SVM) and found that information-gain-based feature selection is helpful. Although we achieved decent performance in leave-one-out cross-validation experiments, the evaluation result on the test data turned out to be surprisingly poor. Further experiments revealed a chasm between the training and test data distributions. More aggressive feature selection appears to partially alleviate the trouble caused by this distribution change.
Keywords: Text Categorization, Machine Learning, Support Vector Machine, Feature Selection, Distribution Change.
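The information-gain feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_top_k`), the representation of documents as token sets, and the toy data are all assumptions made for illustration. The sketch scores each term by how much knowing its presence or absence reduces the entropy of the binary class label, then keeps the top k terms; a smaller k corresponds to more aggressive selection.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def class_counts(labels):
    """Counts of the negative (0) and positive (1) class."""
    c = Counter(labels)
    return [c[0], c[1]]


def information_gain(docs, labels, term):
    """Information gain of a binary 'term present' feature.

    docs   -- list of token sets, one per document (assumed representation)
    labels -- list of 0/1 class labels, aligned with docs
    """
    n = len(docs)
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    h_prior = entropy(class_counts(labels))
    h_cond = (len(present) / n) * entropy(class_counts(present)) \
        + (len(absent) / n) * entropy(class_counts(absent))
    return h_prior - h_cond


def select_top_k(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Toy data, invented for illustration only (not from the TREC collection).
docs = [
    {"gene", "mouse", "expression"},
    {"gene", "mutant", "mouse"},
    {"protein", "structure"},
    {"protein", "binding", "gene"},
]
labels = [1, 1, 0, 0]

if __name__ == "__main__":
    print(select_top_k(docs, labels, 2))
```

In a pipeline of the kind the abstract describes, the selected terms would define the feature vectors passed to an SVM classifier; that step is omitted here.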
Dell Zhang, Wee Sun Lee
Added 31 Oct 2010
Updated 31 Oct 2010
Type: Conference
Year: 2004
Where: TREC
Authors: Dell Zhang, Wee Sun Lee