Sciweavers

771 search results - page 45 / 155
» Markov Decision Processes with Arbitrary Reward Processes
SOCO
2010
Springer
14 years 8 months ago
Using evolution strategies to solve DEC-POMDP problems
The decentralized partially observable Markov decision process (DEC-POMDP) is an approach to modeling multi-robot decision-making problems under uncertainty. Since it is NEXP-complete, the...
Baris Eker, H. Levent Akin
CORR
2010
Springer
15 years 2 months ago
Finite Optimal Control for Time-Bounded Reachability in CTMDPs and Continuous-Time Markov Games
We establish the existence of optimal scheduling strategies for time-bounded reachability in continuous-time Markov decision processes, and of co-optimal strategies for continuous-...
Markus Rabe, Sven Schewe
AAAI
2006
15 years 3 months ago
Action Selection in Bayesian Reinforcement Learning
My research attempts to address on-line action selection in reinforcement learning from a Bayesian perspective. The idea is to develop more effective action selection techniques b...
Tao Wang
ICASSP
2011
IEEE
14 years 5 months ago
Logarithmic weak regret of non-Bayesian restless multi-armed bandit
We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...
Haoyang Liu, Keqin Liu, Qing Zhao
CAV
2010
Springer
15 years 5 months ago
Measuring and Synthesizing Systems in Probabilistic Environments
Often one has a preference order among the different systems that satisfy a given specification. Under a probabilistic assumption about the possible inputs, such a preference order...
Krishnendu Chatterjee, Thomas A. Henzinger, Barbar...