Sciweavers

238 search results - page 42 / 48
» Value-Function Approximations for Partially Observable Marko...
MOBICOM
2009
ACM
15 years 6 months ago
Interference management via rate splitting and HARQ over time-varying fading channels
The coexistence of two unlicensed links is considered, where one link interferes with the transmission of the other, over a time-varying, block-fading channel. In the absence of fa...
Marco Levorato, Osvaldo Simeone, Urbashi Mitra
GECCO
2008
Springer
15 years 25 days ago
Emergent architecture in self organized swarm systems for military applications
Many sectors of the military are interested in Self-Organized (SO) systems because of their flexibility, versatility and economics. The military is researching and employing auto...
Dustin J. Nowak, Gary B. Lamont, Gilbert L. Peters...
AAAI
2007
15 years 2 months ago
Optimizing Anthrax Outbreak Detection Using Reinforcement Learning
The potentially catastrophic impact of a bioterrorist attack makes developing effective detection methods essential for public health. In the case of an anthrax attack, a delay of ho...
Masoumeh T. Izadi, David L. Buckeridge
IJCAI
2001
15 years 1 months ago
Complexity of Probabilistic Planning under Average Rewards
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...
Jussi Rintanen
ICML
1996
IEEE
16 years 17 days ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore