We show that the log-likelihood of several probabilistic graphical models is Lipschitz continuous with respect to the p-norm of the parameters. We discuss several implications ...
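In symbols, the Lipschitz property described here can be sketched as follows (the constant \(L\), the choice of \(p\), and the parameterisation are assumptions for illustration, since the abstract leaves them unspecified):

\[
\bigl|\ell(\theta; x) - \ell(\theta'; x)\bigr| \;\le\; L\,\lVert \theta - \theta' \rVert_p
\quad \text{for all } \theta, \theta',
\]

where \(\ell(\theta; x) = \log p(x \mid \theta)\) is the model's log-likelihood as a function of the parameters \(\theta\).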
In real-world applications, “what you saw” during training is often not “what you get” during deployment: the distribution and even the type and dimensionality of features...
For supervised learning, feature selection algorithms attempt to maximise a given function of predictive accuracy. This function usually considers the ability of feature vectors t...
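A minimal sketch of the kind of objective described here, assuming a simple filter-style score (absolute correlation with the label) as a stand-in for the predictive-accuracy function; the synthetic data and index layout are invented for illustration:

```python
import numpy as np

# Filter-style feature selection: score each feature by how well it
# relates to the labels, then keep the highest-scoring one. This is a
# cheap proxy for an accuracy-based selection objective.

rng = np.random.default_rng(1)
n = 200
y = rng.integers(0, 2, size=n)                    # binary labels
informative = y + rng.normal(scale=0.3, size=n)   # correlates with y
noise = rng.normal(size=(n, 3))                   # irrelevant features
X = np.column_stack([noise[:, 0], informative, noise[:, 1:]])

# Score: absolute correlation between each feature and the label.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
best = int(np.argmax(scores))
print(best)  # the informative feature sits at index 1
```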
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
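A minimal sketch of linear value-function approximation, \(V(s) \approx \phi(s)^\top w\). Here the basis \(\phi\) is a hand-picked polynomial and the value targets are synthetic, purely for illustration; the abstract concerns constructing such bases automatically:

```python
import numpy as np

# Linear value-function approximation: V(s) ≈ phi(s)^T w.

def phi(s):
    """Hand-picked polynomial basis features for a scalar state s."""
    return np.array([1.0, s, s**2])

# Sampled states and value targets from a known quadratic "value function".
states = np.linspace(0.0, 1.0, 20)
targets = 2.0 + 3.0 * states - states**2

# Fit the weights by least squares on the feature matrix.
Phi = np.stack([phi(s) for s in states])
w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

print(np.round(w, 3))  # ≈ [2., 3., -1.]
```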
Dimensionality reduction is an important pre-processing step in many applications. Linear discriminant analysis (LDA) is a classical statistical approach for supervised dimensiona...
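A minimal sketch of the two-class Fisher discriminant behind LDA, assuming synthetic Gaussian data: project onto \(w = S_W^{-1}(\mu_1 - \mu_0)\), where \(S_W\) is the within-class scatter matrix.

```python
import numpy as np

# Fisher's linear discriminant for two classes: project the data onto
# w = S_W^{-1} (mu1 - mu0), reducing 2-D inputs to a 1-D representation.

rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
X1 = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(50, 2))

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
S_w = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)

w = np.linalg.solve(S_w, mu1 - mu0)  # discriminant direction
z0, z1 = X0 @ w, X1 @ w              # 1-D projections of each class

# The projected class means should be well separated.
print(z0.mean() < z1.mean())
```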