Sciweavers

829 search results - page 119 / 166
» A time aggregation approach to Markov decision processes
Sort
View
AAAI
2007
15 years 4 months ago
Authorial Idioms for Target Distributions in TTD-MDPs
In designing Markov Decision Processes (MDP), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...
David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...
NIPS
2001
15 years 3 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
ENTCS
2008
110views more  ENTCS 2008»
15 years 2 months ago
Game-Based Probabilistic Predicate Abstraction in PRISM
ion in PRISM1 Mark Kattenbelt Marta Kwiatkowska Gethin Norman David Parker Oxford University Computing Laboratory, Oxford, UK Modelling and verification of systems such as communi...
Mark Kattenbelt, Marta Z. Kwiatkowska, Gethin Norm...
JAIR
2010
115views more  JAIR 2010»
15 years 13 days ago
An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs
Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Parti...
Raghav Aras, Alain Dutech
GLOBECOM
2010
IEEE
14 years 12 months ago
Need-Based Communication for Smart Grid: When to Inquire Power Price?
In smart grid, a home appliance can adjust its power consumption level according to the realtime power price obtained from communication channels. Most studies on smart grid do not...
Husheng Li, Robert C. Qiu