Sciweavers

» Learning from actions not taken: a multiagent learning algor...
KCAP
2009
ACM
15 years 4 months ago
Interactively shaping agents via human reinforcement: the TAMER framework
As computational learning agents move into domains that incur real costs (e.g., autonomous driving or financial investment), it will be necessary to learn good policies without n...
W. Bradley Knox, Peter Stone
ECML
2005
Springer
15 years 3 months ago
Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes
Partially Observable Markov Decision Processes (POMDP) provide a standard framework for sequential decision making in stochastic environments. In this setting, an agent takes actio...
Masoumeh T. Izadi, Doina Precup
IJRR
2010
14 years 8 months ago
Non-parametric Learning to Aid Path Planning over Slopes
This paper addresses the problem of closing the loop from perception to action selection for unmanned ground vehicles, with a focus on navigating slopes. A new non-parametric l...
Sisir Karumanchi, Thomas Allen, Tim Bailey, Steve ...
COLT
2008
Springer
14 years 11 months ago
Regret Bounds for Sleeping Experts and Bandits
We study on-line decision problems where the set of actions that are available to the decision algorithm varies over time. With a few notable exceptions, such problems remained larg...
Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...
PODC
2009
ACM
15 years 10 months ago
Load balancing without regret in the bulletin board model
We analyze the performance of protocols for load balancing in distributed systems based on no-regret algorithms from online learning theory. These protocols treat load balancing a...
Éva Tardos, Georgios Piliouras, Robert D. K...