Studies show that programs contain much similar code, commonly known as clones. One of the main reasons for introducing clones is programmers' tendency to copy and paste code...
In this paper, we define and study a novel text mining problem, which we refer to as Comparative Text Mining (CTM). Given a set of comparable text collections, the task of compara...
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Biomedical images and captions are one of the major sources of information in online biomedical publications. They often contain the most important results to be reported, and pro...
Xin Chen, Caimei Lu, Yuan An, Palakorn Achananupar...
Background: The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of m...
David M. Blei, K. Franks, Michael I. Jordan, I. Sa...