Enhancing clustering blog documents by utilizing author/reader comments


Download www.cs.uky.edu

Blogs are a new internet phenomenon and a vast, ever-increasing information resource. Mining blog files for information is a very new research direction in data mining. We propose to include the title, body, and comments of blog pages when building clustering datasets from blog documents. In particular, we argue that the author/reader comments of blog pages may have a stronger discriminating effect in clustering blog documents. We constructed a word-page matrix by downloading blog pages from a well-known website and experimented with a k-means clustering algorithm, assigning different weights to the title, body, and comment parts. Our experimental results show that assigning a larger weight to the blog comments helps the k-means algorithm produce better clustering solutions. The experimental results confirm our hypothesis that the author/reader comments of blog files are very useful in discriminating blog files.

Categories and Subject Descriptors H.3.3. [Information Search and Re...
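The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the weights enter the word-page matrix, how rows are normalised, or how k-means is initialised. The sketch assumes that each part's word counts are multiplied by that part's weight, that rows are L2-normalised, and that centroids are seeded by a deterministic farthest-point rule. The function names, the weight values, and the four toy pages are all hypothetical.

```python
from collections import Counter
import math


def weighted_term_vector(title, body, comments,
                         w_title=1.0, w_body=1.0, w_comment=2.0):
    """Sum word counts over the three parts, scaling each part by its weight.

    Assumption: weights multiply raw counts; the paper may weight differently.
    """
    vec = Counter()
    for text, weight in ((title, w_title), (body, w_body), (comments, w_comment)):
        for word in text.lower().split():
            vec[word] += weight
    return vec


def build_matrix(pages, **weights):
    """Return (vocabulary, rows) with one L2-normalised row per page."""
    vectors = [weighted_term_vector(*page, **weights) for page in pages]
    vocab = sorted(set().union(*vectors))
    rows = []
    for vec in vectors:
        row = [vec.get(term, 0.0) for term in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: sq_dist(row, centroids[c]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy pages as (title, body, comments); invented purely for illustration.
pages = [
    ("pasta recipe", "boil pasta add sauce", "great sauce recipe"),
    ("soup recipe", "simmer soup add salt", "tasty soup recipe"),
    ("football match", "team won the match", "great team goal"),
    ("football league", "league team standings", "team goal match"),
]
vocab, rows = build_matrix(pages, w_title=1.0, w_body=1.0, w_comment=2.0)
labels = kmeans(rows, k=2)
print(labels)
```

Raising `w_comment` relative to `w_title` and `w_body` makes the words in the comments dominate each row, which is the knob the abstract describes varying. Whether that improves clustering on real blog data is the paper's experimental claim and is not demonstrated by this toy example.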

Beibei Li, Shuting Xu, Jun Zhang


ACMSE 2007 | Blog | Blog Files | Blog Pages | Theoretical Computer Science


Post Info

Added	12 Aug 2010
Updated	12 Aug 2010
Type	Conference
Year	2007
Where	ACMSE
Authors	Beibei Li, Shuting Xu, Jun Zhang
