Sciweavers

2108 search results - page 216 / 422
» Tracking in Reinforcement Learning
ANOR
2005
15 years 15 days ago
Entropic Penalties in Finite Games
The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...
Sjur Didrik Flåm, E. Cavazzuti
ALIFE
2002
15 years 14 days ago
Ant Colony Optimization and Stochastic Gradient Descent
In this paper, we study the relationship between the two techniques known as ant colony optimization (ACO) and stochastic gradient descent. More precisely, we show that some empir...
Nicolas Meuleau, Marco Dorigo
AROBOTS
2008
15 years 5 days ago
Active audition using the parameter-less self-organising map
This paper presents a novel method for enabling a robot to determine the position of a sound source in three dimensions using just two microphones and interaction with its environm...
Erik Berglund, Joaquin Sitte, Gordon Wyeth
AGI
2011
14 years 4 months ago
Measuring Agent Intelligence via Hierarchies of Environments
Under Legg’s and Hutter’s formal measure [1], performance in easy environments counts more toward an agent’s intelligence than does performance in difficult environments. An ...
Bill Hibbard
CORR
2006
Springer
15 years 18 days ago
Nearly optimal exploration-exploitation decision thresholds
While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...
Christos Dimitrakakis