
238 search results - page 30 / 48
Search: Value-Function Approximations for Partially Observable Marko...
ACL
2010
Towards Relational POMDPs for Adaptive Dialogue Management
Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly...
Pierre Lison
CORR
2006
Springer
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
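For context on the setting this abstract refers to, the following is a minimal, generic sketch of TD(λ) policy evaluation with linear function approximation and eligibility traces, not the paper's own Full-Gradient or Equi-Gradient variants. The `env_step` and `phi` interfaces, the initial state, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def td_lambda(env_step, phi, n_features, n_episodes=100,
              alpha=0.05, gamma=0.99, lam=0.9):
    """Generic TD(lambda) policy evaluation with linear function approximation.

    env_step(state) -> (reward, next_state, done) samples one transition under
    the fixed policy being evaluated; phi(state) returns a feature vector of
    length n_features. Both are assumed, illustrative interfaces.
    """
    w = np.zeros(n_features)           # weights: V(s) is approximated by w . phi(s)
    for _ in range(n_episodes):
        s = 0                          # assumed initial state
        z = np.zeros(n_features)       # eligibility trace
        done = False
        while not done:
            r, s_next, done = env_step(s)
            v = w @ phi(s)
            v_next = 0.0 if done else w @ phi(s_next)
            delta = r + gamma * v_next - v      # TD error
            z = gamma * lam * z + phi(s)        # accumulate trace
            w += alpha * delta * z              # semi-gradient update
            s = s_next
    return w
```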
ATAL
2009
Springer
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
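As a rough illustration of the baseline this abstract builds on, here is a generic tabular Sarsa(λ) sketch that treats each observation as an estimated state; it is not the SarsaLandmark algorithm itself. The `env_reset`/`env_step` interfaces and all hyperparameters are assumptions for the sketch.

```python
import numpy as np

def sarsa_lambda(n_obs, n_actions, env_reset, env_step, episodes=500,
                 alpha=0.1, gamma=0.95, lam=0.9, epsilon=0.1, rng=None):
    """Tabular Sarsa(lambda) over observations (a memoryless baseline).

    env_reset() -> obs and env_step(action) -> (obs, reward, done) are
    assumed, illustrative interfaces.
    """
    rng = rng or np.random.default_rng(0)
    Q = np.zeros((n_obs, n_actions))

    def eps_greedy(o):
        # Explore with probability epsilon, otherwise act greedily on Q.
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax(Q[o]))

    for _ in range(episodes):
        E = np.zeros_like(Q)           # eligibility traces
        o = env_reset()
        a = eps_greedy(o)
        done = False
        while not done:
            o2, r, done = env_step(a)
            a2 = eps_greedy(o2)
            target = r if done else r + gamma * Q[o2, a2]
            delta = target - Q[o, a]
            E[o, a] += 1.0             # accumulating trace
            Q += alpha * delta * E     # update all traced pairs
            E *= gamma * lam           # decay traces
            o, a = o2, a2
    return Q
```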
AAAI
2004
Stochastic Local Search for POMDP Controllers
The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their ...
Darius Braziunas, Craig Boutilier
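To make the controller-search setting concrete, the following is a toy sketch of stochastic local search over the parameters of a finite-state controller, not the specific algorithm proposed in this paper. The controller parameterisation, the black-box `evaluate` function (e.g. Monte Carlo rollouts in a POMDP simulator), and the acceptance rule are all assumptions.

```python
import numpy as np

def local_search_fsc(n_nodes, n_actions, n_obs, evaluate,
                     iters=200, noise=0.1, rng=None):
    """Toy stochastic local search over a finite-state controller (FSC).

    The FSC is parameterised by an action distribution per node and a
    node-transition distribution per (node, observation). evaluate(fsc) is an
    assumed black box returning an estimate of expected return.
    """
    rng = rng or np.random.default_rng(0)

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    # Unconstrained logits; softmax turns them into valid distributions.
    act_logits = rng.normal(size=(n_nodes, n_actions))
    trans_logits = rng.normal(size=(n_nodes, n_obs, n_nodes))

    def make_fsc(a_log, t_log):
        return softmax(a_log), softmax(t_log)

    best = evaluate(make_fsc(act_logits, trans_logits))
    for _ in range(iters):
        # Propose a random perturbation of the controller parameters.
        a_new = act_logits + noise * rng.normal(size=act_logits.shape)
        t_new = trans_logits + noise * rng.normal(size=trans_logits.shape)
        value = evaluate(make_fsc(a_new, t_new))
        if value > best:               # greedily accept improving moves
            act_logits, trans_logits, best = a_new, t_new, value
    return make_fsc(act_logits, trans_logits), best
```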
HICSS
2003
IEEE
Formalizing Multi-Agent POMDP's in the context of network routing
This paper uses partially observable Markov decision processes (POMDP’s) as a basic framework for Multi-Agent planning. We distinguish three perspectives: the first is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasie...