Utile distinction hidden Markov models

11 years 7 months ago
Utile distinction hidden Markov models
This paper addresses the problem of constructing good action selection policies for agents acting in partially observable environments, a class of problems generally known as Partially Observable Markov Decision Processes. We present a novel approach that uses a modification of the well-known Baum-Welch algorithm for learning a Hidden Markov Model (HMM) to predict both percepts and utility in a non-deterministic world. This enables an agent to make decisions based on its previous history of actions, observations, and rewards. Our algorithm, called Utile Distinction Hidden Markov Models (UDHMM), handles the creation of memory well in that it tends to create perceptual and utility distinctions only when needed, while it can still discriminate states based on histories of arbitrary length. The experimental results in highly stochastic problem domains show very good performance.
Daan Wierstra, Marco Wiering
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2004
Where ICML
Authors Daan Wierstra, Marco Wiering
Comments (0)