Abstract. This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
Taking advantage of the well-known cluster hypothesis that “closely associated documents tend to be relevant to the same request”, we can use inter-document similarity to prov...
In this paper, we argue that the agglomerative clustering with vector cosine similarity measure performs poorly due to two reasons. First, the nearest neighbors of a document belo...
Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web related tasks such as relation extraction, informati...
The design of efficient textual similarities is an important issue in the domain of textual data exploration. Textual similarities are for example central in document collection s...