Automatically clustering web pages into semantic groups promises improved search and browsing on the web. In this paper, we demonstrate how user-generated tags from largescale soc...
Daniel Ramage, Paul Heymann, Christopher D. Mannin...
Using SQL has not been considered an efficient and feasible way to implement data mining algorithms. Although this is true for many data mining, machine learning and statistical a...
Abstract. With information proliferation on the Web, how to obtain highquality information from the Web has been one of hot research topics in many fields like Database, IR as well...
Story clustering is a critical step for news retrieval, topic mining, and summarization. Nonetheless, the task remains highly challenging owing to the fact that news topics exhibit...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...