Search Sciweavers | Sciweavers

1233 search results - page 132 / 247

» Reinforcement Learning in MirrorBot

ACL 2010, Computational Linguistics

Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems

We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to unde...

Srinivasan Janarthanam, Oliver Lemon

NIPS 2004, Information Technology

Multi-agent Cooperation in Diverse Population Games

We consider multi-agent systems whose agents compete for resources by striving to be in the minority group. The agents adapt to the environment by reinforcement learning of the pr...

K. Y. Michael Wong, S. W. Lim, Zhuo Gao

NIPS 2003, Information Technology

Policy Search by Dynamic Programming

We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...

J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...

ANOR 2005

Entropic Penalties in Finite Games

The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...

Sjur Didrik Flåm, E. Cavazzuti

ALIFE 2002, Modeling And Simulation

Ant Colony Optimization and Stochastic Gradient Descent

In this paper, we study the relationship between the two techniques known as ant colony optimization (ACO) and stochastic gradient descent. More precisely, we show that some empir...

Nicolas Meuleau, Marco Dorigo
