Sciweavers

2011 search results - page 187 / 403
» Universal Reinforcement Learning
AIIDE
2008
15 years 7 months ago
Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games
We introduce the ALeRT (Action-dependent Learning Rates with Trends) algorithm that makes two modifications to the learning rate and one change to the exploration rate of traditio...
Maria Cutumisu, Duane Szafron, Michael H. Bowling,...
ALT
2005
Springer
16 years 1 month ago
Monotone Conditional Complexity Bounds on Future Prediction Errors
We bound the future loss when predicting any (computably) stochastic sequence online. Solomonoff finitely bounded the total deviation of his universal predictor M from the true ...
Alexey V. Chernov, Marcus Hutter
BMCBI
2007
15 years 5 months ago
Constrained hidden Markov models for population-based haplotyping
Niels Landwehr¹, Taneli Mielikäinen², Lauri Eronen², Hannu Toivonen¹,², and Heikki Mannila². ¹ Machine Learning Lab, Dept. of Comp. Science, University of Freiburg, G...
Niels Landwehr, Taneli Mielikäinen, Lauri Ero...
ICML
2000
IEEE
16 years 5 months ago
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
ECML
2004
Springer
15 years 10 months ago
Experiments in Value Function Approximation with Sparse Support Vector Regression
Abstract. We present first experiments using Support Vector Regression as a function approximator for an on-line, SARSA-like reinforcement learner. To overcome the batch nature of S...
Tobias Jung, Thomas Uthmann