Sciweavers

238 search results - page 36 / 48
» Value-Function Approximations for Partially Observable Marko...
GLOBECOM
2010
IEEE
14 years 9 months ago
Maximize Secondary User Throughput via Optimal Sensing in Multi-Channel Cognitive Radio Networks
In a cognitive radio network, the full spectrum is usually divided into multiple channels. However, due to hardware and energy constraints, a cognitive user (also called second...
Shimin Gong, Ping Wang, Wei Liu, Wei Yuan
ICML
2008
IEEE
16 years 17 days ago
Reinforcement learning in the presence of rare events
We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...
Jordan Frank, Shie Mannor, Doina Precup
ATAL
2007
Springer
15 years 6 months ago
Graphical models for online solutions to interactive POMDPs
We develop a new graphical representation for interactive partially observable Markov decision processes (I-POMDPs) that is significantly more transparent and semantically clear t...
Prashant Doshi, Yifeng Zeng, Qiongyu Chen
ECAI
2008
Springer
15 years 26 days ago
A hybrid approach to multi-agent decision-making
In the aftermath of a large-scale disaster, agents’ decisions derive from self-interested (e.g. survival), common-good (e.g. victims’ rescue) and teamwork (e.g. fire...
Paulo Trigo, Helder Coelho
NIPS
1996
15 years 1 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies