Sciweavers

1176 search results - page 2 / 236
» Sparse reward processes
IJCAI
2001
13 years 6 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
FLAIRS
2003
13 years 6 months ago
Learning from Reinforcement and Advice Using Composite Reward Functions
Reinforcement learning has become a widely used methodology for creating intelligent agents in a wide range of applications. However, its performance deteriorates in tasks with s...
Vinay N. Papudesi, Manfred Huber
MDAI
2005
Springer
13 years 11 months ago
Perceptive Evaluation for the Optimal Discounted Reward in Markov Decision Processes
We formulate a fuzzy perceptive model for Markov decision processes with discounted payoff in which the perception for transition probabilities is described by fuzzy sets. Our aim...
Masami Kurano, Masami Yasuda, Jun-ichi Nakagami, Y...
ALT
2007
Springer
14 years 2 months ago
Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are we...
Ronald Ortner
COLT
2007
Springer
13 years 11 months ago
Bounded Parameter Markov Decision Processes with Average Reward Criterion
Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, t...
Ambuj Tewari, Peter L. Bartlett