Sciweavers

1233 search results - page 191 / 247
» Reinforcement Learning in MirrorBot
Sort
View
ML
2002
ACM
133views Machine Learning» more  ML 2002»
14 years 9 months ago
Finite-time Analysis of the Multiarmed Bandit Problem
Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...
Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...
COLT
2010
Springer
14 years 7 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
JMLR
2010
141views more  JMLR 2010»
14 years 4 months ago
Pinview: Implicit Feedback in Content-Based Image Retrieval
This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that...
Peter Auer, Zakria Hussain, Samuel Kaski, Arto Kla...
ROBOCUP
2007
Springer
167views Robotics» more  ROBOCUP 2007»
15 years 3 months ago
Cooperative/Competitive Behavior Acquisition Based on State Value Estimation of Others
The existing reinforcement learning approaches have been suffering from the curse of dimension problem when they are applied to multiagent dynamic environments. One of the typical...
Kentarou Noma, Yasutake Takahashi, Minoru Asada
KI
2007
Springer
15 years 3 months ago
Making a Robot Learn to Play Soccer Using Reward and Punishment
In this paper, we show how reinforcement learning can be applied to real robots to achieve optimal robot behavior. As example, we enable an autonomous soccer robot to learn interce...
Heiko Müller, Martin Lauer, Roland Hafner, Sa...