Sciweavers

43 search results - page 9 / 9
» A Game of Prediction with Expert Advice
JMLR
2010
12 years 11 months ago
Regret Bounds and Minimax Policies under Partial Monitoring
This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...
Jean-Yves Audibert, Sébastien Bubeck
SIGECOM
2006
ACM
ECommerce
13 years 10 months ago
Controlling a supply chain agent using value-based decomposition
We present and evaluate the design of Deep Maize, our entry in the 2005 Trading Agent Competition Supply Chain Management scenario. The central idea is to decompose the problem by...
Christopher Kiekintveld, Jason Miller, Patrick R. ...
LAMAS
2005
Springer
13 years 10 months ago
Multi-agent Relational Reinforcement Learning
In this paper we report on using a relational state space in multi-agent reinforcement learning. There is growing evidence in the Reinforcement Learning research community that a r...
Tom Croonenborghs, Karl Tuyls, Jan Ramon, Maurice ...