This paper presents two sentence retrieval methods. We adopt the task definition done in the TREC Novelty Track: sentence retrieval consists in the extraction of the relevant sente...
In this paper, we present our online summarization system of web topics. The user defines the topic by a set of keywords. Then the system searches the Web for the relevant documen...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Abstract. Latent Semantic Indexing(LSI) has been proved to be effective to capture the semantic structure of document collections. It is widely used in content-based text retrieval...
Analyzing the author and topic relations in email corpus is an important issue in both social network analysis and text mining. The AuthorTopic model is a statistical model that id...