Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we formulate extractive summariza...
In many topic identification applications, supervised training labels are indirectly related to the semantic content of the documents being classified. For example, many topical...
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
In this paper we explore the effectiveness of three clustering methods used to perform word image indexing. The three methods are: the Self-Organazing Map (SOM), the Growing Hiera...
In this paper, we model the pair-wise similarities of a set of documents as a weighted network with a single cutoff parameter. Such a network can be thought of an ensemble of unwe...