Sciweavers

2011 search results - page 268 / 403
» Universal Reinforcement Learning
AI
2002
Springer
15 years 4 months ago
Programming backgammon using self-teaching neural nets
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD...
Gerald Tesauro
ML
2002
ACM
15 years 4 months ago
Finite-time Analysis of the Multiarmed Bandit Problem
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...
Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...
COLT
2010
Springer
15 years 3 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
JMLR
2010
14 years 11 months ago
Pinview: Implicit Feedback in Content-Based Image Retrieval
This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that...
Peter Auer, Zakria Hussain, Samuel Kaski, Arto Kla...
CHI
2009
ACM
16 years 5 months ago
Enhancing input device evaluation: longitudinal approaches
Jens Gerken, HCI Group, University of Konstanz, Box D-73, 78457 Konstanz, Germany, jens.gerken@uni-konstanz.de; Hans-Joachim Bieg, HCI Group, University of Konstanz, Box D-73, 78457 Konsta...
Jens Gerken, Hans-Joachim Bieg, Stefan Dierdorf, H...