Sciweavers

» Learning to Classify Text from Labeled and Unlabeled Documen...
ECIR
1998
Springer
14 years 10 months ago
Coupled Hierarchical IR and Stochastic Models for Surface Information Extraction
We present in this paper a combination of Machine Learning based Information Retrieval (IR) techniques and stochastic language modelling in a hierarchical system that extracts sur...
Hugo Zaragoza, Patrick Gallinari
AAAI
2010
14 years 11 months ago
Assisting Users with Clustering Tasks by Combining Metric Learning and Classification
Interactive clustering refers to situations in which a human labeler is willing to assist a learning algorithm in automatically clustering items. We present a related but somewhat...
Sumit Basu, Danyel Fisher, Steven M. Drucker, Hao ...
CIKM
2008
Springer
14 years 11 months ago
Identifying table boundaries in digital documents via sparse line detection
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Ying Liu, Prasenjit Mitra, C. Lee Giles
ICDM
2010
IEEE
14 years 7 months ago
Location and Scatter Matching for Dataset Shift in Text Mining
Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed ...
Bo Chen, Wai Lam, Ivor W. Tsang, Tak-Lam Wong
KDD
2002
ACM
15 years 10 months ago
Topic-conditioned novelty detection
Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...