Sciweavers

» Markov Decision Processes with Arbitrary Reward Processes
ATAL
2007
Springer
15 years 5 months ago
Combinatorial resource scheduling for multiagent MDPs
Optimal resource scheduling in multiagent systems is a computationally challenging task, particularly when the values of resources are not additive. We consider the combinatorial ...
Dmitri A. Dolgov, Michael R. James, Michael E. Sam...
ECML
2007
Springer
15 years 5 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ROBOCUP
2007
Springer
15 years 5 months ago
Instance-Based Action Models for Fast Action Planning
Two main challenges of robot action planning in real domains are uncertain action effects and dynamic environments. In this paper, an instance-based action model is lear...
Mazda Ahmadi, Peter Stone
GLOBECOM
2006
IEEE
15 years 5 months ago
Adaptive Learning of Transmission Control Policies for MIMO Fading Channels under Delay Constraint
This paper addresses learning-based adaptive resource allocation for wireless MIMO channels with Markovian fading. The problem is posed as a Constrained Markov Decision Process w...
Dejan V. Djonin, Vikram Krishnamurthy
QEST
2006
IEEE
15 years 5 months ago
Compositional Performability Evaluation for STATEMATE
This paper reports on our efforts to link an industrial state-of-the-art modelling tool to academic state-of-the-art analysis algorithms. In a nutshell, we enable timed...
Eckard Böde, Marc Herbstritt, Holger Hermanns...