String kernels which compare the set of all common substrings between two given strings have recently been proposed by Vishwanathan & Smola (2004). Surprisingly, these kernels...
Some models of textual corpora employ text generation methods involving n-gram statistics, while others use latent topic variables inferred using the "bag-of-words" assu...
Recently, a very appealing approach was proposed to compute the entire solution path for support vector classification (SVC) with very low extra computational cost. This approach ...
We present a directed Markov random field (MRF) model that combines n-gram models, probabilistic context free grammars (PCFGs) and probabilistic latent semantic analysis (PLSA) fo...
Shaojun Wang, Shaomin Wang, Russell Greiner, Dale ...
Many clustering algorithms fail when dealing with high dimensional data. Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it assumes a ...