We present a method to discover robust and interpretable sociolinguistic associations from raw geotagged text data. Using aggregate demographic statistics about the authors’ geo...
This paper introduces a new algorithm, namely the EquiCorrelation Network (ECON), to perform supervised classification and regression. ECON is a kernelized LARS-like algorithm, b...
Manuel Loth, Philippe Preux, Samuel Delepoulle, Ch...
Background: Most genomic data have ultra-high dimensions with more than 10,000 genes (probes). Regularization methods with L1 and Lp penalty have been extensively studied in survi...
Zhenqiu Liu, Dechang Chen, Ming Tan, Feng Jiang, R...
With very noisy data, having plentiful samples eliminates overfitting in nonlinear regression, but not in nonlinear principal component analysis (NLPCA). To overcome this problem...
We propose a fully Bayesian methodology for generalized kernel mixed models (GKMMs), which are extensions of generalized linear mixed models in the feature space induced by a repr...