Sciweavers

AAAI
2007
13 years 6 months ago
Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games
In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...
Colin McMillen, Manuela M. Veloso
IJCAI
2001
13 years 5 months ago
R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Ronen I. Brafman, Moshe Tennenholtz
INFOCOM
2006
IEEE
13 years 10 months ago
An Optimal Dynamic Pricing Framework for Autonomous Mobile Ad Hoc Networks
In autonomous mobile ad hoc networks (MANET) where each user is its own authority, fully cooperative behaviors, such as unconditionally forwarding packets for each other or, ho...
Zhu Ji, Wei Yu, K. J. Ray Liu