Sciweavers

CORR
2008
Springer
13 years 4 months ago
Multi-Armed Bandits in Metric Spaces
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of n trials so as to maximize the total payoff of the chosen strategies. While ...
Robert Kleinberg, Aleksandrs Slivkins, Eli Upfal
CEC
2005
IEEE
13 years 10 months ago
XCS with computed prediction for the learning of Boolean functions
Computed prediction represents a major shift in learning classifier system research. XCS with computed prediction, based on linear approximators, has been applied so far to functi...
Pier Luca Lanzi, Daniele Loiacono, Stewart W. Wils...
LICS
2007
IEEE
13 years 10 months ago
Limits of Multi-Discounted Markov Decision Processes
Markov decision processes (MDPs) are controllable discrete event systems with stochastic transitions. The payoff received by the controller can be evaluated in different ways, dep...
Hugo Gimbert, Wieslaw Zielonka