
FSKD
2009
Springer

Chinese Web Comments Clustering Analysis with a Two-phase Method

A popular web topic often attracts tens of thousands of comments. Grouping these comments into clusters to identify the mainstream opinions is valuable, but such analysis faces two difficulties. First, unlike web pages or blog comments, web comments have no explicit link relationships between them. Second, most comments are very short, some only one or two words long. As a result, traditional clustering algorithms such as CURE and DBSCAN do not work when applied to these comments directly. In this paper we propose a two-phase algorithm that first combines highly synonymous comments into longer ones based on a connected graph model, and then applies improved clustering methods to the resulting collection. Experimental results on two real data sets show that our algorithm performs better than traditional algorithms such as CURE.
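The first phase described above, merging highly synonymous comments through a connected graph, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how synonymy is measured or how the graph is built, so Jaccard overlap on whitespace-separated tokens, the `threshold` value, and the names `jaccard` and `merge_similar_comments` are all hypothetical stand-ins. Real Chinese comments would also need a word segmenter, since they are not whitespace-delimited.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two token sets (assumed stand-in for a synonymy measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def merge_similar_comments(comments, threshold=0.5):
    """Merge comments that fall in the same connected component of a similarity graph.

    Nodes are comments; an edge joins two comments whose token overlap is at
    least `threshold` (a hypothetical cutoff). Each connected component is
    concatenated into one longer pseudo-document.
    """
    # Whitespace tokenization is an assumption; Chinese text needs segmentation.
    tokens = [set(c.lower().split()) for c in comments]
    parent = list(range(len(comments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Attach the larger root to the smaller so roots stay stable.
                parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i, comment in enumerate(comments):
        groups.setdefault(find(i), []).append(comment)
    return [" ".join(group) for group in groups.values()]


comments = [
    "great phone",
    "great phone indeed",
    "too expensive",
    "really too expensive",
    "battery",
]
merged = merge_similar_comments(comments)
for doc in merged:
    print(doc)
```

The merged pseudo-documents are longer than the original comments, which is the property the second phase relies on; they could then be handed to whichever clustering method is used in that phase. The pairwise comparison here is quadratic in the number of comments, so a collection of tens of thousands would likely need an index or blocking step that this sketch omits.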
Yexin Wang, Li Zhao, Yan Zhang
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where FSKD
Authors Yexin Wang, Li Zhao, Yan Zhang