Sciweavers

» Exploiting Myopic Learning
IUI
2006
ACM
15 years 5 months ago
Who's asking for help?: a Bayesian approach to intelligent assistance
Automated software customization is drawing increasing attention as a means to help users deal with the scope, complexity, potential intrusiveness, and ever-changing nature of mod...
Bowen Hui, Craig Boutilier
JACM
2006
14 years 11 months ago
Combining expert advice in reactive environments
"Experts algorithms" constitute a methodology for choosing actions repeatedly, when the rewards depend both on the choice of action and on the unknown current state of t...
Daniela Pucci de Farias, Nimrod Megiddo
JSAC
2007
14 years 11 months ago
Non-Cooperative Power Control for Wireless Ad Hoc Networks with Repeated Games
One of the distinctive features in a wireless ad hoc network is lack of any central controller or single point of authority, in which each node/link then makes its own decision...
Chengnian Long, Qian Zhang, Bo Li, Huilong Yang, X...
CEC
2010
IEEE
15 years 14 hours ago
Coevolutionary Temporal Difference Learning for small-board Go
In this paper we apply Coevolutionary Temporal Difference Learning (CTDL), a hybrid of coevolutionary search and reinforcement learning proposed in our former study, to evolve s...
Krzysztof Krawiec, Marcin Szubert
CORR
2012
Springer
13 years 7 months ago
Fractional Moments on Bandit Problems
Reinforcement learning addresses the dilemma between exploration to find profitable actions and exploitation to act according to the best observations already made. Bandit proble...
Ananda Narayanan B., Balaraman Ravindran