Sciweavers

17 search results - page 3 / 4
» APRICODD: Approximate Policy Construction Using Decision Dia...
IJCAI
2007
13 years 7 months ago
A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
Janusz Marecki, Sven Koenig, Milind Tambe
CDC
2008
IEEE
Control Systems
14 years 23 days ago
Approximate abstractions of discrete-time controlled stochastic hybrid systems
This work proposes a procedure to c...
Alessandro D'Innocenzo, Alessandro Abate, Maria D. Di Benedetto
ATAL
2009
Springer
14 years 26 days ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
WIOPT
2011
IEEE
12 years 10 months ago
Network utility maximization over partially observable Markovian channels
This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known...
Chih-Ping Li, Michael J. Neely
RSS
2007
Robotics
13 years 7 months ago
Active Policy Learning for Robot Planning and Exploration under Uncertainty
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...