We present in this paper a combination of Machine Learning based Information Retrieval (IR) techniques and stochastic language modelling in a hierarchical system that extracts sur...
Interactive clustering refers to situations in which a human labeler is willing to assist a learning algorithm in automatically clustering items. We present a related but somewhat...
Sumit Basu, Danyel Fisher, Steven M. Drucker, Hao ...
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed ...
Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...