Sciweavers

829 search results - page 15 / 166
» A time aggregation approach to Markov decision processes
IJCAI
1997
15 years 1 month ago
Aggregating Features and Matching Cases on Vague Linguistic Expressions
Decision making based on the comparison of multiple criteria of two or more alternatives is the subject of intensive research. In many decision making situations, a single criter...
Alfons Schuster, Werner Dubitzky, Philippe Lopes, ...
COLT
2000
Springer
15 years 4 months ago
Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process (POMDP), and focus on gradient ascent approache...
Peter L. Bartlett, Jonathan Baxter
AAAI
2011
13 years 11 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole
ICALP
2009
Springer
16 years 2 days ago
Reachability in Stochastic Timed Games
We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al.), and which extend in a natural way continuous-...
Patricia Bouyer, Vojtech Forejt
ML
2002
ACM
14 years 11 months ago
Near-Optimal Reinforcement Learning in Polynomial Time
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...
Michael J. Kearns, Satinder P. Singh