Partially Observable Markov Decision Processes (POMDPs) have succeeded in planning domains that require balancing actions that increase an agent's knowledge and actions that ...
Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...
Recently, many applications for Restricted Boltzmann Machines (RBMs) have been developed for a large variety of learning problems. However, RBMs are usually used as feature extrac...
We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...
The infinite hidden Markov model is a nonparametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite Hidden Markov...
Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, Zoubi...