Sciweavers

NIPS
1994
13 years 5 months ago
From Data Distributions to Regularization in Invariant Learning
Ideally pattern recognition machines provide constant output when the inputs are transformed under a group G of desired invariances. These invariances can be achieved by enhancing...
Todd K. Leen
NAACL
1994
13 years 5 months ago
Tree-Based State Tying for High Accuracy Modelling
The key problem to be faced when building a HMM-based continuous speech recogniser is maintaining the balance between model complexity and available training data. For large vocab...
S. J. Young, J. J. Odell, Philip C. Woodland
ISMB
1993
13 years 5 months ago
Knowledge-Based Generation of Machine-Learning Experiments: Learning with DNA Crystallography Data
Though it has been possible in the past to learn to predict DNA hydration patterns from crystallographic data, there is ambiguity in the choice of training data (both in terms of th...
Dawn M. Cohen, Casimir A. Kulikowski, Helen Berman
AAAI
2000
13 years 5 months ago
Self-Supervised Learning for Visual Tracking and Recognition of Human Hand
Due to the large variation and richness of visual inputs, statistical learning gets more and more concerned in the practice of visual processing such as visual tracking and recogn...
Ying Wu, Thomas S. Huang
NAACL
2003
13 years 5 months ago
A Web-Trained Extraction Summarization System
A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is a very tedious task, especially because...
Liang Zhou, Eduard H. Hovy
NAACL
2003
13 years 5 months ago
Example Selection for Bootstrapping Statistical Parsers
This paper investigates bootstrapping for statistical parsers to reduce their reliance on manually annotated training data. We consider both a mostly-unsupervised approach, co-tra...
Mark Steedman, Rebecca Hwa, Stephen Clark, Miles O...
NAACL
2003
13 years 5 months ago
Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web...
Ivan Bulyko, Mari Ostendorf, Andreas Stolcke
FLAIRS
2004
13 years 5 months ago
Transductive LSI for Short Text Classification Problems
This paper presents work that uses Transductive Latent Semantic Indexing (LSI) for text classification. In addition to relying on labeled training data, we improve classification ...
Sarah Zelikovitz
SDM
2007
SIAM
13 years 5 months ago
Kernel Based Detection of Mislabeled Training Examples
The problem of identifying mislabeled training examples has been examined in several studies, with a variety of approaches developed for editing the training data to obtain better...
Hamed Valizadegan, Pang-Ning Tan
NAACL
2007
13 years 5 months ago
Detection of Non-Native Sentences Using Machine-Translated Training Data
Training statistical models to detect non-native sentences requires a large corpus of non-native writing samples, which is often not readily available. This paper examines the exte...
John Lee, Ming Zhou, Xiaohua Liu