Sciweavers

» Approximate Policy Iteration with a Policy Language Bias
CP
2004
Springer
15 years 5 months ago
Heuristic Selection for Stochastic Search Optimization: Modeling Solution Quality by Extreme Value Theory
The success of stochastic algorithms is often due to their ability to effectively amplify the performance of search heuristics. This is certainly the case with stochastic sampling ...
Vincent A. Cicirello, Stephen F. Smith
NIPS
1998
15 years 29 days ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
APLAS
2003
ACM
15 years 3 months ago
Resource Usage Verification
We investigate how to automatically verify that resources such as files are not used improperly or unsafely by a program. We employ a mixture of compile-time analysis and run-time ...
Kim Marriott, Peter J. Stuckey, Martin Sulzmann
IJCAI
2007
15 years 1 months ago
A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
Janusz Marecki, Sven Koenig, Milind Tambe
ICML
2008
IEEE
16 years 13 days ago
Reinforcement learning in the presence of rare events
We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...
Jordan Frank, Shie Mannor, Doina Precup