Sciweavers

KDD
2008
ACM

Scaling up text classification for large file systems

Download www.hpl.hp.com

Abstract: We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifier that can scale to very large document corpora. We investigate the effect of different methods of formulating the query from the training set, as well as varying the query size. In empirical tests on the Reuters RCV1 corpus of 806,000 documents, we find runtime was easily reduced by a factor of 27x, with a somewhat surprising gain in F-measure compared with traditional text classification.

External Posting Date: June 21, 2008. Approved for External Publication.
Internal Posting Date: June 21, 2008.
To be presented and published in the 14th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD'08), August 2008.
© Copyright 2008 the 14th ACM SIGKDD International Conference
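The two-phase idea in the abstract (a cheap retrieval step that narrows the corpus, followed by a learned classifier applied only to the retrieved documents) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy documents, the helper names (`build_index`, `form_query`, `retrieve`, `train_nb`, `classify`, `two_phase`), the document-frequency-difference heuristic for choosing query terms, and the use of a small naive Bayes model as the second phase.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def build_index(docs):
    """Inverted index: term -> set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in set(tokenize(text)):
            index[term].add(doc_id)
    return index


def form_query(positives, negatives, size):
    """Pick up to `size` terms that are more common in positive than in
    negative training documents (an illustrative heuristic only)."""
    pos_df = Counter(t for d in positives for t in set(tokenize(d)))
    neg_df = Counter(t for d in negatives for t in set(tokenize(d)))
    score = {t: pos_df[t] / len(positives) - neg_df[t] / len(negatives)
             for t in pos_df}
    ranked = sorted(score, key=lambda t: (-score[t], t))
    return [t for t in ranked[:size] if score[t] > 0]


def retrieve(index, query):
    """Phase 1: union of the posting lists of the query terms."""
    hits = set()
    for term in query:
        hits |= index.get(term, set())
    return hits


def train_nb(positives, negatives):
    """Per-term log-odds weights from a Laplace-smoothed naive Bayes model."""
    pos_tf = Counter(t for d in positives for t in tokenize(d))
    neg_tf = Counter(t for d in negatives for t in tokenize(d))
    vocab = set(pos_tf) | set(neg_tf)
    pos_total = sum(pos_tf.values()) + len(vocab)
    neg_total = sum(neg_tf.values()) + len(vocab)
    weights = {t: math.log((pos_tf[t] + 1) / pos_total)
               - math.log((neg_tf[t] + 1) / neg_total)
               for t in vocab}
    prior = math.log(len(positives) / len(negatives))
    return weights, prior


def classify(weights, prior, text):
    """Phase 2: positive if the summed log-odds exceed zero.
    Terms unseen in training contribute nothing."""
    return prior + sum(weights.get(t, 0.0) for t in tokenize(text)) > 0


def two_phase(docs, positives, negatives, query_size=3):
    index = build_index(docs)
    query = form_query(positives, negatives, query_size)
    candidates = retrieve(index, query)          # only these are scored
    weights, prior = train_nb(positives, negatives)
    predicted = {i for i in candidates if classify(weights, prior, docs[i])}
    return query, candidates, predicted


# Hypothetical toy data, invented for this sketch.
positives = ["quarterly earnings report profit",
             "earnings profit rose this quarter"]
negatives = ["football match final score",
             "weather forecast rain tomorrow"]
docs = [
    "company earnings beat profit forecast",
    "rain delayed the football match",
    "profit warning issued by retailer",
    "holiday travel tips",
    "football match profit rain forecast",
]

query, candidates, predicted = two_phase(docs, positives, negatives)
print(query, sorted(candidates), sorted(predicted))
```

In this toy run the classifier scores only the documents returned by the query, which is where a runtime saving of the kind the abstract reports would come from. The query size is the `query_size` parameter, matching the abstract's mention of varying the query size; the selection heuristic itself is only a stand-in for whichever formulation methods the paper evaluates.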

George Forman, Shyamsundar Rajaram

Related Content

» Scaling Up Explanation Generation: Large-Scale Knowledge Bases and Empirical Studies

» AMP: An Affinity-Based Metadata Prefetching Scheme in Large-Scale Distributed Storage Systems

» Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems

» Mining Console Logs for Large-Scale System Problem Detection

» Scalable association-based text classification

» Self-organizing Classification on the Reuters News Corpus

» Using Error-Correcting Codes for Text Classification

» File Server Scaling with Network-Attached Secure Disks

» Scalable Text Retrieval for Large Digital Libraries

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2008
Where	KDD
Authors	George Forman, Shyamsundar Rajaram
