Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
This paper proposes the β-WoLF algorithm for multiagent reinforcement learning (MARL) in the stochastic games framework that uses an additional “advice” signal to inform agen...
Software-based thread-level parallelization has been widely studied for exploiting data parallelism in purely computational loops to improve program performance on multiprocessors...
My research attempts to address on-line action selection in reinforcement learning from a Bayesian perspective. The idea is to develop more effective action selection techniques b...
Data items archived in data warehouses or those that arrive online as streams typically have attributes which take values from multiple hierarchies (e.g., time and geographic loca...
Graham Cormode, Flip Korn, S. Muthukrishnan, Dives...