Sciweavers

3 search results - page 1 / 1
AAAI
2007
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...
Colin McMillen, Manuela M. Veloso
IJCAI
2001
R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Ronen I. Brafman, Moshe Tennenholtz
INFOCOM
2006
IEEE
An Optimal Dynamic Pricing Framework for Autonomous Mobile Ad Hoc Networks
In autonomous mobile ad hoc networks (MANET) where each user is its own authority, fully cooperative behaviors, such as unconditionally forwarding packets for each other or, ho...
Zhu Ji, Wei Yu, K. J. Ray Liu