Sciweavers

7 search results - page 1 / 2
» General Discounting Versus Average Reward
Sort
View
ALT
2006
Springer
14 years 1 months ago
General Discounting Versus Average Reward
Consider an agent interacting with an environment in cycles. In every interaction cycle the agent is rewarded for its performance. We compare the average reward U from cycle 1 to ...
Marcus Hutter
ML
2002
ACM
168views Machine Learning» more  ML 2002»
13 years 4 months ago
On Average Versus Discounted Reward Temporal-Difference Learning
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
John N. Tsitsiklis, Benjamin Van Roy
ICML
2001
IEEE
14 years 5 months ago
Continuous-Time Hierarchical Reinforcement Learning
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...
Mohammad Ghavamzadeh, Sridhar Mahadevan
IJCAI
2001
13 years 6 months ago
Complexity of Probabilistic Planning under Average Rewards
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...
Jussi Rintanen
ICML
2002
IEEE
14 years 5 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan