Sciweavers

17 search results - page 3 / 4
» APRICODD: Approximate Policy Construction Using Decision Dia...
IJCAI
2007
15 years 2 months ago
A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
Janusz Marecki, Sven Koenig, Milind Tambe
CDC
2008
IEEE
15 years 7 months ago
Approximate abstractions of discrete-time controlled stochastic hybrid systems
This work proposes a procedure to c...
Alessandro D'Innocenzo, Alessandro Abate, Maria D. Di Benedetto
ATAL
2009
Springer
15 years 7 months ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
WIOPT
2011
IEEE
14 years 4 months ago
Network utility maximization over partially observable Markovian channels
This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known...
Chih-Ping Li, Michael J. Neely
RSS
2007
15 years 2 months ago
Active Policy Learning for Robot Planning and Exploration under Uncertainty
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...