Sciweavers

280 search results - page 51 / 56
» Planning for Markov Decision Processes with Sparse Stochasti...
Sort
View
QEST
2006
IEEE
15 years 3 months ago
Compositional Performability Evaluation for STATEMATE
Abstract— This paper reports on our efforts to link an industrial state-of-the-art modelling tool to academic state-of-the-art analysis algorithms. In a nutshell, we enable timed...
Eckard Böde, Marc Herbstritt, Holger Hermanns...
WCNC
2010
IEEE
15 years 1 months ago
Dynamic Control of Data Ferries under Partial Observations
—Controlled mobile helper nodes called data ferries have recently been proposed to bridge communications between disconnected nodes in a delay-tolerant manner. While existing wor...
Chi Harold Liu, Ting He, Kang-won Lee, Kin K. Leun...
NIPS
2001
14 years 11 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
ENTCS
2008
110views more  ENTCS 2008»
14 years 9 months ago
Game-Based Probabilistic Predicate Abstraction in PRISM
ion in PRISM1 Mark Kattenbelt Marta Kwiatkowska Gethin Norman David Parker Oxford University Computing Laboratory, Oxford, UK Modelling and verification of systems such as communi...
Mark Kattenbelt, Marta Z. Kwiatkowska, Gethin Norm...
JAIR
2006
101views more  JAIR 2006»
14 years 9 months ago
Resource Allocation Among Agents with MDP-Induced Preferences
Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute acti...
Dmitri A. Dolgov, Edmund H. Durfee