Sciweavers

56 search results - page 10 / 12
ICML
2002
IEEE
16 years 13 days ago
Reinforcement Learning and Shaping: Encouraging Intended Behaviors
We explore dynamic shaping to integrate our prior beliefs of the final policy into a conventional reinforcement learning system. Shaping provides a positive or negative artificial...
Adam Laud, Gerald DeJong
JMLR
2006
14 years 11 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
ANOR
2005
14 years 11 months ago
Entropic Penalties in Finite Games
The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...
Sjur Didrik Flåm, E. Cavazzuti
SIGDIAL
2010
14 years 9 months ago
Modeling Spoken Decision Making Dialogue and Optimization of its Dialogue Strategy
This paper presents a spoken dialogue framework that helps users in making decisions. Users often do not have a definite goal or criteria for selecting from a list of alternatives...
Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chi...
NETCOOP
2007
Springer
15 years 5 months ago
Load Shared Sequential Routing in MPLS Networks: System and User Optimal Solutions
Recently Gerald Ash has shown through case studies that event dependent routing is attractive in large scale multi-service MPLS networks. In this paper, we consider the application...
Gilles Brunet, Fariba Heidari, Lorne Mason