Search Sciweavers | Sciweavers

» Sampling the Web as Training Data for Text Classification

ICML
2005
IEEE

126 views | Machine Learning

Hierarchical Dirichlet model for document classification

16 years 7 months ago

Download www.machinelearning.org

The proliferation of text documents on the web as well as within institutions necessitates their convenient organization to enable efficient retrieval of information. Although tex...

Sriharsha Veeramachaneni, Diego Sona, Paolo Avesan...

EACL
2006
ACL Anthology

143 views | Natural Language Processing

Web Text Corpus for Natural Language Processing

15 years 8 months ago

Download www.cs.usyd.edu.au

Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpu...

Vinci Liu, James R. Curran

EMNLP
2009

141 views | Natural Language Processing

Semi-Supervised Learning for Semantic Relation Classification using Stratified Sampling Strategy

15 years 4 months ago

Download www.aclweb.org

This paper presents a new approach to selecting the initial seed set using a stratified sampling strategy in bootstrapping-based semi-supervised learning for semantic relation class...

Longhua Qian, Guodong Zhou, Fang Kong, Qiaoming Zh...

KDD
2004
ACM

160 views | Data Mining

Boosting for Text Classification with Semantic Features

16 years 7 months ago

Download www.aifb.uni-karlsruhe.de

Current text classification systems typically use term stems for representing document content. Semantic Web technologies allow the usage of features on a higher semantic...

Stephan Bloehdorn, Andreas Hotho

ML
2000
ACM

124 views | Machine Learning

Text Classification from Labeled and Unlabeled Documents using EM

15 years 6 months ago

Download www.kamalnigam.com

This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...

Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
