Sciweavers

AIPS
2008
15 years 2 months ago
HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot
Flexible general purpose robots need to tailor their visual processing to their task, on the fly. We propose a new approach to this within a planning framework, where the goal is ...
Mohan Sridharan, Jeremy L. Wyatt, Richard Dearden
TR
2010
14 years 6 months ago
Optimal Maintenance Strategies for Wind Turbine Systems Under Stochastic Weather Conditions
We examine optimal repair strategies for wind turbines operated under stochastic weather conditions. In-situ sensors installed at wind turbines produce useful information...
Eunshin Byon, Lewis Ntaimo, Yu Ding
ATAL
2003
Springer
15 years 5 months ago
Optimizing information exchange in cooperative multi-agent systems
Decentralized control of a cooperative multi-agent system is the problem faced by multiple decision-makers that share a common set of objectives. The decision-makers may be robots...
Claudia V. Goldman, Shlomo Zilberstein
GECCO
2009
Springer
14 years 9 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
ECML
2007
Springer
15 years 6 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber