Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform class...
The Machine Learning and Pattern Recognition communities are facing two challenges: solving the normalization problem, and solving the deep learning problem. The normalization pro...
Extractive summarization techniques cannot generate document summaries shorter than a single sentence, something that is often required. An ideal summarization system would unders...
Michele Banko, Vibhu O. Mittal, Michael J. Witbroc...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
In this paper we study the problem of collecting training samples for building enterprise taxonomies. We develop a computer-aided tool named InfoAnalyzer, which can effectively as...