Sciweavers search: Acting Optimally in Partially Observable Stochastic Domains

92 search results - page 11 / 19

GLOBECOM
2009
IEEE

Communications

Dogfight in Spectrum: Jamming and Anti-Jamming in Multichannel Cognitive Radio Systems

14 years 11 months ago

Download web.eecs.utk.edu

Primary user emulation attack in multichannel cognitive radio systems is discussed. An attacker is assumed to be able to send primary-user-like signals during spectrum sensing peri...

Husheng Li, Zhu Han

ICML
1996
IEEE

Machine Learning

Learning Evaluation Functions for Large Acyclic Domains

16 years 2 months ago

Download www.ri.cmu.edu

Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...

Justin A. Boyan, Andrew W. Moore

NIPS
2008

Information Technology

Multi-Agent Filtering with Infinitely Nested Beliefs

15 years 3 months ago

Download www.cs.washington.edu

In partially observable worlds with many agents, nested beliefs are formed when agents simultaneously reason about the unknown state of the world and the beliefs of the other agen...

Luke S. Zettlemoyer, Brian Milch, Leslie Pack Kael...

AI
2011
Springer

Artificial Intelligence

Decentralized MDPs with sparse interactions

14 years 5 months ago

Download www.inesc-id.pt

In this work, we explore how local interactions can simplify the process of decision-making in multiagent systems, particularly in multirobot problems. We review a recent decision-...

Francisco S. Melo, Manuela M. Veloso

ECML
2007
Springer

Machine Learning

Policy Gradient Critics

15 years 8 months ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber
