Sciweavers

473 search results - page 85 / 95
» Optimal policy switching algorithms for reinforcement learni...
Sort
View
CPAIOR
2007
Springer
15 years 3 months ago
Solving a Stochastic Queueing Control Problem with Constraint Programming
In a facility with front room and back room operations, it is useful to switch workers between the rooms in order to cope with changing customer demand. Assuming stochastic custome...
Daria Terekhov, J. Christopher Beck
72
Voted
ICML
2005
IEEE
15 years 10 months ago
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...
H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...
ICML
1996
IEEE
15 years 10 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore
103
Voted
ATAL
2006
Springer
15 years 1 months ago
Learning to cooperate in multi-agent social dilemmas
In many Multi-Agent Systems (MAS), agents (even if selfinterested) need to cooperate in order to maximize their own utilities. Most of the multi-agent learning algorithms focus on...
Jose Enrique Munoz de Cote, Alessandro Lazaric, Ma...
ECML
2005
Springer
15 years 3 months ago
Active Learning in Partially Observable Markov Decision Processes
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
Robin Jaulmes, Joelle Pineau, Doina Precup