Sciweavers

2005 search results - page 323 / 401
» Decisive Markov Chains
ICMLA
2009
14 years 9 months ago
Multiagent Transfer Learning via Assignment-Based Decomposition
We describe a system that successfully transfers value function knowledge across multiple subdomains of real-time strategy games in the context of multiagent reinforcement learning....
Scott Proper, Prasad Tadepalli

Publication
273 views
14 years 7 months ago
Monte Carlo Value Iteration for Continuous-State POMDPs
Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algo...
Haoyu Bai, David Hsu, Wee Sun Lee, and Vien A. Ngo
IANDC
2011
84 views
14 years 6 months ago
Teaching randomized learners with feedback
The present paper introduces a new model for teaching randomized learners. Our new model, though based on the classical teaching dimension model, allows one to study the influence of...
Frank J. Balbach, Thomas Zeugmann
JSAC
2011
82 views
14 years 6 months ago
Optimal Cognitive Access of Markovian Channels under Tight Collision Constraints
The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-...
Xin Li, Qianchuan Zhao, Xiaohong Guan, Lang Tong
JMLR
2010
189 views
14 years 6 months ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...