EOR
2006
A parallelizable dynamic fleet management model with random travel times
In this paper, we present a stochastic model for the dynamic fleet management problem with random travel times. Our approach decomposes the problem into time-staged subproblems by...
Huseyin Topaloglu
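
The abstract above is truncated, but it names the key idea: decompose the fleet problem into time-staged subproblems and handle travel-time randomness by sampling. The toy sketch below illustrates only that generic idea for a single vehicle, estimating stage values backward in time from sampled travel times; the locations, rewards, and travel-time model are invented for illustration and are not the paper's model.

```python
import random

# Toy illustration (not the paper's model): one vehicle, a finite horizon, and random
# integer travel times. Stage values are estimated backward in time by sampling,
# which mirrors the idea of decomposing the horizon into time-staged subproblems.
T = 10                      # planning horizon
LOCATIONS = range(3)
REWARD = {(i, j): 1.0 + abs(i - j) for i in LOCATIONS for j in LOCATIONS if i != j}
N_SAMPLES = 50

def sample_travel_time(i, j):
    """Random travel time between locations i and j (illustrative distribution)."""
    return 1 + abs(i - j) + random.randint(0, 2)

# value[t][loc]: estimated value of the vehicle being free at location loc at time t
value = [[0.0 for _ in LOCATIONS] for _ in range(T + 1)]

for t in range(T - 1, -1, -1):
    for loc in LOCATIONS:
        best = value[t + 1][loc]            # option: wait one period at this location
        for dest in LOCATIONS:
            if dest == loc:
                continue
            est = 0.0
            for _ in range(N_SAMPLES):      # sample random travel times
                arrival = t + sample_travel_time(loc, dest)
                future = value[arrival][dest] if arrival <= T else 0.0
                est += (REWARD[loc, dest] + future) / N_SAMPLES
            best = max(best, est)
        value[t][loc] = best

print([round(v, 2) for v in value[0]])      # stage-0 value estimates per location
```
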
AAMAS
2007
Springer
Parallel Reinforcement Learning with Linear Function Approximation
In this paper, we investigate the use of parallelization in reinforcement learning (RL), with the goal of learning optimal policies for single-agent RL problems more quickly by us...
Matthew Grounds, Daniel Kudenko
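
As a rough illustration of combining linear function approximation with parallel learners (the paper's exact synchronization scheme is not shown in the snippet above), the sketch below runs several independent linear TD(0) workers on a toy chain and periodically averages their weight vectors; the chain, features, and averaging step are assumptions made for the example.

```python
import numpy as np

# Toy chain MDP: states 0..N-1, a random policy moves left/right, reward 1 at the right end.
N, GAMMA, ALPHA = 8, 0.95, 0.1
rng = np.random.default_rng(0)

def feature(s):
    """One-hot features; any linear feature map would do."""
    phi = np.zeros(N)
    phi[s] = 1.0
    return phi

def step(s):
    s2 = min(N - 1, s + 1) if rng.random() < 0.5 else max(0, s - 1)
    r = 1.0 if s2 == N - 1 else 0.0
    return s2, r

def td0_worker(w, n_steps=500):
    """One worker running linear TD(0) from its own copy of the shared weights."""
    w = w.copy()
    s = 0
    for _ in range(n_steps):
        s2, r = step(s)
        delta = r + GAMMA * feature(s2) @ w - feature(s) @ w
        w += ALPHA * delta * feature(s)
        s = 0 if s2 == N - 1 else s2        # restart episodes at the left end
    return w

# "Parallel" learners simulated sequentially: run 4 workers, then average their weights.
w_shared = np.zeros(N)
for sync_round in range(20):
    w_shared = np.mean([td0_worker(w_shared) for _ in range(4)], axis=0)

print(np.round(w_shared, 2))                # approximate state values under the random policy
```
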
ICML
2010
IEEE
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
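
The paper's contribution is a finite-sample analysis; the estimator it analyzes, LSTD(0), is easy to state in code. A minimal sketch, with a one-hot feature map and a small ridge term added for numerical stability (the ridge term and toy data are assumptions, not part of the paper):

```python
import numpy as np

def lstd(transitions, feature, gamma, reg=1e-6):
    """
    LSTD(0): given transitions (s, r, s2) collected under the fixed policy being
    evaluated, solve A w = b with A = sum phi(s) (phi(s) - gamma*phi(s2))^T and
    b = sum r * phi(s). `reg` adds a small ridge term for numerical stability.
    """
    d = len(feature(transitions[0][0]))
    A = reg * np.eye(d)
    b = np.zeros(d)
    for s, r, s2 in transitions:
        phi, phi2 = feature(s), feature(s2)
        A += np.outer(phi, phi - gamma * phi2)
        b += r * phi
    return np.linalg.solve(A, b)            # value estimate: V(s) ~ feature(s) @ w

# Illustrative use on a two-state toy chain with one-hot features (assumed setup).
feat = lambda s: np.eye(2)[s]
data = [(0, 0.0, 1), (1, 1.0, 0), (0, 0.0, 1), (1, 1.0, 0)]
print(lstd(data, feat, gamma=0.9))
```
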
ICML
2010
IEEE
Bayesian Multi-Task Reinforcement Learning
We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...
Alessandro Lazaric, Mohammad Ghavamzadeh
NIPS
1993
Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming
Dynamic programming provides a methodology to develop planners and controllers for nonlinear systems. However, general dynamic programming is computationally intractable. We have ...
Christopher G. Atkeson
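
The truncated abstract does not show how the local optimizers are combined with the global computation, so the sketch below covers only the global half it alludes to: tabular value iteration on a discretized one-dimensional toy problem. The grid, dynamics, and cost are illustrative assumptions; the paper's local trajectory optimizers are not shown.

```python
import numpy as np

# Minimal value iteration on a discretized 1-D toy control problem. This is only the
# "global dynamic programming" piece the abstract refers to; local trajectory
# optimizers (not shown) would refine the solution between grid points.
POS = np.linspace(-1.0, 1.0, 21)            # discretized state grid
ACTIONS = [-0.1, 0.0, 0.1]
GAMMA = 0.95

def next_index(i, a):
    """Deterministic toy dynamics: move along the grid, clipped at the ends."""
    x = np.clip(POS[i] + a, POS[0], POS[-1])
    return int(np.argmin(np.abs(POS - x)))

V = np.zeros(len(POS))
for _ in range(200):                        # value iteration sweeps
    V_new = np.empty_like(V)
    for i in range(len(POS)):
        # cost = distance from the goal at x = 0 plus a small action penalty
        V_new[i] = min(abs(POS[i]) + 0.01 * abs(a) + GAMMA * V[next_index(i, a)]
                       for a in ACTIONS)
    V = V_new

print(np.round(V, 2))                       # cost-to-go over the grid
```
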
NIPS
1998
Risk Sensitive Reinforcement Learning
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...
Ralph Neuneier, Oliver Mihatsch
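
The abstract defines risk with respect to entering error states but is cut off before the definition. One common way to make that concrete, shown below purely as an illustration and not necessarily the paper's formulation, is to learn, next to the usual action values, the probability of ever reaching an error state and to penalize it at action-selection time.

```python
import random
from collections import defaultdict

# Toy chain: states 0..5, state 0 is an "error" state, state 5 is the goal.
# We learn Q (expected return) and RHO (probability of ever entering the error state)
# and act greedily with respect to Q(s,a) - XI * RHO(s,a). XI and the update rule are
# assumptions made for illustration; the paper's exact risk definition is truncated above.
ACTIONS, GAMMA, ALPHA, XI, EPS = [-1, +1], 0.95, 0.2, 2.0, 0.2
Q = defaultdict(float)
RHO = defaultdict(float)

def greedy(s):
    return max(ACTIONS, key=lambda a: Q[s, a] - XI * RHO[s, a])

for episode in range(2000):
    s = 3
    while s not in (0, 5):
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2 = max(0, min(5, s + a if random.random() < 0.9 else s - a))   # noisy move
        r = 1.0 if s2 == 5 else 0.0
        error = 1.0 if s2 == 0 else 0.0
        done = s2 in (0, 5)
        a2 = greedy(s2) if not done else None
        q_next = Q[s2, a2] if not done else 0.0
        rho_next = RHO[s2, a2] if not done else 0.0
        Q[s, a] += ALPHA * (r + GAMMA * q_next - Q[s, a])
        RHO[s, a] += ALPHA * (error + rho_next - RHO[s, a])
        s = s2

print({(s, a): round(RHO[s, a], 2) for s in range(1, 5) for a in ACTIONS})
```
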
AAAI
1997
Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes
Partially observable Markov decision processes (POMDPs) allow one to model complex dynamic decision or control problems that include both action outcome uncertainty and imperfect ...
Milos Hauskrecht
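
Hauskrecht's paper develops incremental upper and lower bounds; for concreteness, the sketch below shows one standard MDP-based upper bound from this literature (often called QMDP): solve the underlying fully observable MDP, then bound the value at a belief b by max_a sum_s b(s) Q(s, a). The two-state model is invented for the example.

```python
import numpy as np

# Tiny two-state, two-action POMDP core; the numbers in T[a, s, s'] and R[s, a] are
# illustrative. The QMDP bound ignores partial observability in the lookahead: solve
# the underlying MDP, then bound the POMDP value at belief b by max_a sum_s b(s) Q[s, a].
GAMMA = 0.95
T = np.array([[[0.9, 0.1], [0.2, 0.8]],     # transitions under action 0
              [[0.5, 0.5], [0.5, 0.5]]])    # transitions under action 1
R = np.array([[1.0, 0.0],                   # R[s, a]
              [0.0, 2.0]])

# Value iteration on the underlying (fully observable) MDP.
Q = np.zeros((2, 2))                        # Q[s, a]
for _ in range(500):
    V = Q.max(axis=1)
    Q = R + GAMMA * np.einsum('asj,j->sa', T, V)

def qmdp_upper_bound(belief):
    """Upper bound on the optimal POMDP value at `belief` (a distribution over states)."""
    return float(np.max(belief @ Q))

print(qmdp_upper_bound(np.array([0.5, 0.5])))
```
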
AAAI
1998
Applying Online Search Techniques to Continuous-State Reinforcement Learning
In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boo...
Scott Davies, Andrew Y. Ng, Andrew W. Moore
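
A minimal sketch of the general idea of boosting a learned value function with online search: pick actions by a sampled one-step lookahead through a generative model. The toy dynamics, value function, and sample counts are assumptions; the paper's search procedures may go deeper.

```python
import numpy as np

# One-step lookahead search on top of an approximate value function. The model, the
# placeholder value function, and the sampling counts below are illustrative assumptions.
GAMMA, N_SAMPLES = 0.95, 20
ACTIONS = np.linspace(-1.0, 1.0, 5)
rng = np.random.default_rng(1)

def model(state, action):
    """Generative model of a noisy 1-D point mass; returns (next_state, reward)."""
    next_state = state + 0.1 * action + rng.normal(0.0, 0.02)
    reward = -abs(next_state)               # reward for staying near the origin
    return next_state, reward

def approx_value(state):
    """Placeholder learned value function (here just a crude quadratic)."""
    return -2.0 * state ** 2

def lookahead_action(state):
    """Pick the action whose sampled one-step backup r + gamma * V(s') is largest."""
    def backup(a):
        samples = [model(state, a) for _ in range(N_SAMPLES)]
        return np.mean([r + GAMMA * approx_value(s2) for s2, r in samples])
    return max(ACTIONS, key=backup)

print(lookahead_action(0.7))                # should push the state back toward the origin
```
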
NIPS
2003
Applying Metric-Trees to Belief-Point POMDPs
Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
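
Point-based methods repeatedly ask which stored belief point is closest to a newly generated belief, and the paper organizes the points in a metric-tree to answer that quickly. As a stand-in, the sketch below uses SciPy's KD-tree (a related spatial index) over a synthetic set of belief vectors.

```python
import numpy as np
from scipy.spatial import cKDTree

# Point-based POMDP methods keep a finite set of belief points and repeatedly query
# for the stored belief nearest to a new one. The paper uses metric-trees; here a
# KD-tree stands in for the spatial index. The belief set is random and illustrative.
rng = np.random.default_rng(0)
n_points, n_states = 1000, 4

beliefs = rng.random((n_points, n_states))
beliefs /= beliefs.sum(axis=1, keepdims=True)   # normalize rows to distributions

tree = cKDTree(beliefs)                         # build the spatial index once

query = np.full(n_states, 1.0 / n_states)       # e.g. the uniform belief
dist, idx = tree.query(query, k=3)              # 3 nearest stored belief points
print(idx, np.round(dist, 3))
```
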
NIPS
2001
Multiagent Planning with Factored MDPs
We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication be...
Carlos Guestrin, Daphne Koller, Ronald Parr
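
The planner exploits a factored value function so that agents only coordinate with their neighbors; the core operation is maximizing a sum of local payoff functions by variable elimination. The sketch below runs that elimination on a hand-made three-agent chain (the payoff tables are illustrative, not from the paper) and checks the result against brute force.

```python
import itertools

# Three agents in a chain with binary actions; the joint value decomposes as
# Q(a1, a2, a3) = q12(a1, a2) + q23(a2, a3). Variable elimination maximizes this
# without enumerating joint actions all at once: eliminate a1, then a2, then choose a3.
q12 = {(a1, a2): float(a1 == a2) for a1 in (0, 1) for a2 in (0, 1)}
q23 = {(a2, a3): float(a2 != a3) + 0.5 * a3 for a2 in (0, 1) for a3 in (0, 1)}

# Eliminate agent 1: for each a2, record the best value and the a1 that achieves it.
e1 = {a2: max((q12[a1, a2], a1) for a1 in (0, 1)) for a2 in (0, 1)}

# Eliminate agent 2: for each a3, the best a2 given the message from agent 1.
e2 = {a3: max((e1[a2][0] + q23[a2, a3], a2) for a2 in (0, 1)) for a3 in (0, 1)}

# Choose a3, then backtrack to recover a2 and a1.
best_a3 = max((e2[a3][0], a3) for a3 in (0, 1))[1]
best_a2 = e2[best_a3][1]
best_a1 = e1[best_a2][1]
print("joint action:", (best_a1, best_a2, best_a3))

# Sanity check against brute force over all joint actions.
brute = max(itertools.product((0, 1), repeat=3),
            key=lambda a: q12[a[0], a[1]] + q23[a[1], a[2]])
print("brute force :", brute)
```
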