Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately all of the theory and much ...
Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...
Genetic programming may be seen as a recent incarnation of a long-held goal in evolutionary computation: to develop actual computational devices through evolutionary search. Geneti...
We present a vision based, adaptive, decision theoretic model of human facial displays in interactions. The model is a partially observable Markov decision process, or POMDP. A POM...
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperati...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
We show that states of a dynamical system can be usefully represented by multi-step, action-conditional predictions of future observations. State representations that are grounded...
Michael L. Littman, Richard S. Sutton, Satinder P....