We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han
Many high level representations of time series have been proposed for data mining, including Fourier transforms, wavelets, eigenwaves, piecewise polynomial models etc. Many researc...
Jessica Lin, Eamonn J. Keogh, Li Wei, Stefano Lona...
The rapid growth of the web has been noted and tracked extensively. Recent studies have however documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
The input to an algorithm that learns a binary classifier normally consists of two sets of examples, where one set consists of positive examples of the concept to be learned, and ...
Citation matching, or the automatic grouping of bibliographic references that refer to the same document, is a data management problem faced by automatic digital libraries for sci...
Isaac G. Councill, Huajing Li, Ziming Zhuang, Sand...