Multi-document summarization using cluster-based link analysis

The Markov Random Walk model has recently been exploited for multi-document summarization by making use of the link relationships between sentences in the document set, under the assumption that all the sentences are indistinguishable from each other. However, a given document set usually covers a few topic themes, with each theme represented by a cluster of sentences. The topic themes are usually not equally important, and the sentences in an important theme cluster are deemed more salient than the sentences in a trivial theme cluster. This paper proposes the Cluster-based Conditional Markov Random Walk Model (ClusterCMRW) and the Cluster-based HITS Model (ClusterHITS) to fully leverage the cluster-level information. Experimental results on the DUC2001 and DUC2002 datasets demonstrate the effectiveness of the proposed summarization models. The results also demonstrate that the ClusterCMRW model is more robust than the ClusterHITS model with respect to different cluster numbers. C...
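The general idea of a random walk over sentence links, optionally biased by cluster importance, can be illustrated with a minimal sketch. This is not the paper's ClusterCMRW formulation, which the abstract does not spell out. The function name `random_walk_scores`, the `cluster_weight` parameter, the damping value, and the rule of scaling each link by the importance of the target sentence's cluster are all assumptions made for illustration.

```python
def random_walk_scores(sim, cluster_weight=None, damping=0.85, iters=100):
    """Score sentences by a damped random walk over a similarity graph.

    sim            -- square matrix (list of lists) of sentence similarities
    cluster_weight -- hypothetical per-sentence importance of the theme
                      cluster each sentence belongs to (assumption, not
                      taken from the paper); defaults to uniform weights
    """
    n = len(sim)
    if cluster_weight is None:
        cluster_weight = [1.0] * n

    # Link weight i -> j: similarity scaled by the assumed importance of
    # the cluster containing sentence j. Self-links are dropped.
    weights = [
        [0.0 if i == j else sim[i][j] * cluster_weight[j] for j in range(n)]
        for i in range(n)
    ]

    # Row-normalize into transition probabilities; a sentence with no
    # outgoing links jumps uniformly to any sentence.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)

    # Power iteration with damping, starting from a uniform distribution.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Three mutually similar sentences; sentence 0 sits in a cluster that is
# assumed to be twice as important as the cluster of the other two.
sim = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
uniform = random_walk_scores(sim)
biased = random_walk_scores(sim, cluster_weight=[2.0, 1.0, 1.0])
```

With uniform weights the sketch treats all sentences as indistinguishable, which is the baseline assumption the abstract criticizes. Passing non-uniform cluster weights shifts score toward sentences in the clusters assumed to be more important.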
Xiaojun Wan, Jianwu Yang
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008