Sciweavers

64 search results - page 8 / 13
» *-Minimax Performance in Backgammon
IJCAI
2007
14 years 11 months ago
Heuristic Selection of Actions in Multiagent Reinforcement Learning
This work presents a new algorithm, called Heuristically Accelerated Minimax-Q (HAMMQ), that allows the use of heuristics to speed up the well-known Multiagent Reinforcement Learni...
Reinaldo A. C. Bianchi, Carlos H. C. Ribeiro, Anna...
COLT
2010
Springer
14 years 7 months ago
Nonparametric Bandits with Covariates
We consider a bandit problem which involves sequential sampling from two populations (arms). Each arm produces a noisy reward realization which depends on an observable random cov...
Philippe Rigollet, Assaf Zeevi
JSAC
2011
14 years 4 months ago
An Anti-Jamming Stochastic Game for Cognitive Radio Networks
Various spectrum management schemes have been proposed in recent years to improve the spectrum utilization in cognitive radio networks. However, few of them have considered the ...
Beibei Wang, Yongle Wu, K. J. Ray Liu, T. Charles ...
COLT
2006
Springer
15 years 1 month ago
Online Learning with Variable Stage Duration
We consider online learning in repeated decision problems, within the framework of a repeated game against an arbitrary opponent. For repeated matrix games, well-known results esta...
Shie Mannor, Nahum Shimkin
AUSAI
2009
Springer
15 years 4 months ago
MML Invariant Linear Regression
This paper derives two new information theoretic linear regression criteria based on the minimum message length principle. Both criteria are invariant to full rank affine...
Daniel F. Schmidt, Enes Makalic