Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang 0002, Sarah M. Taylor, Jonathan L. Smit...
A new distance measure between probability density functions (pdfs) is introduced, which we refer to as the Laplacian pdf distance. The Laplacian pdf distance exhibits a remarkabl...
We present a theoretical study on the discriminative clustering framework, recently proposed for simultaneous subspace selection via linear discriminant analysis (LDA) and cluster...
An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if two data points belong to the same cluster, the corresponding entry would be 1; otherwise ...
Principal component analysis (PCA) is a widely used statistical technique for unsupervised dimension reduction. K-means clustering is a commonly used data clustering for unsupervi...