Sciweavers

1799 search results - page 209 / 360
» Filtered Reinforcement Learning
ANOR
2005
15 years 4 months ago
Entropic Penalties in Finite Games
The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...
Sjur Didrik Flåm, E. Cavazzuti
ALIFE
2002
15 years 4 months ago
Ant Colony Optimization and Stochastic Gradient Descent
In this paper, we study the relationship between the two techniques known as ant colony optimization (ACO) and stochastic gradient descent. More precisely, we show that some empir...
Nicolas Meuleau, Marco Dorigo
AGI
2011
14 years 8 months ago
Measuring Agent Intelligence via Hierarchies of Environments
Under Legg’s and Hutter’s formal measure [1], performance in easy environments counts more toward an agent’s intelligence than does performance in difficult environments. An ...
Bill Hibbard
CORR
2006
Springer
15 years 4 months ago
Nearly optimal exploration-exploitation decision thresholds
While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...
Christos Dimitrakakis
ML
2010
ACM
14 years 11 months ago
Multi-domain learning by confidence-weighted parameter combination
State-of-the-art statistical NLP systems for a variety of tasks learn from labeled training data that is often domain specific. However, there may be multiple domains or sources o...
Mark Dredze, Alex Kulesza, Koby Crammer