The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the unlabeled data is merely weakly related to the target classes, it becomes quest...
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...
Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...
Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysi...
Ordering information is a difficult but important task for natural language generation applications. A wrong order of information not only makes it difficult to understand, but a...
This paper reports on an approach which maps documents onto an ontology-based information space in order to provide support for machine-mediated communication. First, a composite ...