Clustering using the Hilbert-Schmidt independence criterion (CLUHSIC) is a recent clustering algorithm that maximizes the dependence between cluster labels and data observations ac...
Wenliang Zhong, Weike Pan, James T. Kwok, Ivor W. ...
Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang, Sarah M. Taylor, Jonathan L. Smit...
HyPursuit is a new hierarchical network search engine that clusters hypertext documents to structure a given information space for browsing and search activities. Our content-link...
This paper describes a new bipartite formulation for word-document co-clustering such that hyperclique patterns, strongly affiliated documents in this case, are guaranteed not to ...
Tianming Hu, Chao Qu, Chew Lim Tan, Sam Yuan Sung,...
In this paper, we focus on the process of extracting and evaluating ontological concepts from HTML documents. To improve this process, we propose an unsupervised hierarchical...