Sciweavers

1176 search results for "Sparse reward processes" (page 4 of 236)
ICML
2009
IEEE
15 years 10 months ago
Piecewise-stationary bandit problems with side observations
We consider a sequential decision problem where the rewards are generated by a piecewise-stationary distribution. However, the different reward distributions are unknown and may c...
Jia Yuan Yu, Shie Mannor
CORR
2011
Springer
14 years 4 months ago
Mean-Variance Optimization in Markov Decision Processes
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomiz...
Shie Mannor, John N. Tsitsiklis
ICRA
2010
IEEE
14 years 8 months ago
Robot reinforcement learning using EEG-based reward signals
Reinforcement learning algorithms have been successfully applied in robotics to learn how to solve tasks based on reward signals obtained during task execution. These r...
Iñaki Iturrate, Luis Montesano, Javier Ming...
AAAI
2011
13 years 9 months ago
Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents
Planning agents often lack the computational resources needed to build full planning trees for their environments. Agent designers commonly overcome this finite-horizon approxima...
Jonathan Sorg, Satinder P. Singh, Richard L. Lewis
ICML
2004
IEEE
15 years 10 months ago
Apprenticeship learning via inverse reinforcement learning
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Pieter Abbeel, Andrew Y. Ng