Sciweavers

CORR
2010
Springer
175views Education» more  CORR 2010»
12 years 11 months ago
On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards
We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of M users and N M resources. F...
Yi Gai, Bhaskar Krishnamachari, Mingyan Liu
CORR
2010
Springer
152views Education» more  CORR 2010»
12 years 11 months ago
Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards
In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...
Yi Gai, Bhaskar Krishnamachari, Rahul Jain
AIPS
2006
13 years 6 months ago
Probabilistic Planning with Nonlinear Utility Functions
Researchers often express probabilistic planning problems as Markov decision process models and then maximize the expected total reward. However, it is often rational to maximize ...
Yaxin Liu, Sven Koenig
COLT
2006
Springer
13 years 8 months ago
Online Learning with Constraints
In this paper, we study a sequential decision making problem. The objective is to maximize the total reward while satisfying constraints, which are defined at every time step. The...
Shie Mannor, John N. Tsitsiklis