Sciweavers

271 search results - page 55 / 55
» Identifying Optimal Sequential Decisions
AMAI
2011
Springer
14 years 1 month ago
Multi-armed bandits with episode context
A multi-armed bandit episode consists of n trials, each allowing selection of one of K arms, resulting in payoff from a distribution over [0, 1] associated with that arm. We assum...
Christopher D. Rosin
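
The episode structure described in the abstract (n trials, one of K arms chosen per trial, payoff in [0, 1]) can be illustrated with a minimal simulation sketch. This is not the paper's algorithm: the Bernoulli payoffs, the uniformly random arm choice, and the names `run_episode`, `uniform_policy`, and `arm_means` are all assumptions made for illustration.

```python
import random


def run_episode(arm_means, n_trials, choose_arm, rng):
    """Play one episode of n_trials pulls and return (total payoff, history).

    Assumption: each arm pays 1.0 with probability arm_means[arm], else 0.0.
    This is one simple payoff distribution over [0, 1], not the only one.
    """
    total = 0.0
    history = []  # (arm, payoff) pairs observed so far in this episode
    for _ in range(n_trials):
        arm = choose_arm(len(arm_means), history, rng)
        payoff = 1.0 if rng.random() < arm_means[arm] else 0.0
        history.append((arm, payoff))
        total += payoff
    return total, history


def uniform_policy(num_arms, history, rng):
    """Hypothetical placeholder policy: pick an arm uniformly at random."""
    return rng.randrange(num_arms)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is repeatable
    total, history = run_episode([0.2, 0.5, 0.8], 100, uniform_policy, rng)
    print(f"total payoff over {len(history)} trials: {total}")
```

Any selection rule with the same signature as `uniform_policy` can be passed in its place, which keeps the episode loop separate from the choice of strategy.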