Sciweavers

196 search results - page 31 / 40
» Text Classification Using Word-Based PPM Models
ANLP
1994
15 years 1 month ago
Modeling Content Identification from Document Images
A new technique to locate content-representing words for a given document image using representation of character shapes is described. A character shape code representation define...
Takehiro Nakayama
KDD
2009
ACM
Data Mining
16 years 6 days ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
NLE
2008
14 years 11 months ago
Active learning and logarithmic opinion pools for HPSG parse selection
For complex tasks such as parse selection, the creation of labelled training sets can be extremely costly. Resource-efficient schemes for creating informative labelled material mu...
Jason Baldridge, Miles Osborne
ACL
1994
15 years 1 month ago
A Corpus-Based Approach to Automatic Compound Extraction
An automatic compound retrieval method is proposed to extract compounds within a text message. It uses n-gram mutual information, relative frequency count and parts of speech as t...
Keh-Yih Su, Ming-Wen Wu, Jing-Shin Chang
EMNLP
2010
14 years 9 months ago
Translingual Document Representations from Discriminative Projections
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a pr...
John Platt, Kristina Toutanova, Wen-tau Yih