Sciweavers

» Online model learning in adversarial Markov decision process...
ICML
2009
IEEE
15 years 10 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
NIPS
1998
14 years 11 months ago
Risk Sensitive Reinforcement Learning
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...
Ralph Neuneier, Oliver Mihatsch
NOMS
2008
IEEE
15 years 4 months ago
Autonomic QoS optimization of real-time internet audio using loss prediction and stochastic control
Quality of Internet audio is highly sensitive to packet loss caused by congestion in the links. Packet loss for audio is normally rectified by adding redundancy using Forward ...
Lopa Roychoudhuri, Ehab S. Al-Shaer
COLT
2008
Springer
14 years 11 months ago
Adapting to a Changing Environment: the Brownian Restless Bandits
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Aleksandrs Slivkins, Eli Upfal
AGENTS
2001
Springer
15 years 2 months ago
Adjustable autonomy in real-world multi-agent environments
Through adjustable autonomy (AA), an agent can dynamically vary the degree to which it acts autonomously, allowing it to exploit human abilities to improve its performance, but wi...
Paul Scerri, David V. Pynadath, Milind Tambe